
<p>Introduction to Biosystematics - Zool 575</p>
<p>Introduction to Biosystematics Lecture 25 - Confidence - Assessment 2</p>
<p><strong>Confidence - Assessment of the Strength of the Phylogenetic Signal - part 2</strong></p>
<p>1. Consistency Index<br>2. g1 statistic, PTP test<br>3. Consensus trees<br>4. Decay index (Bremer Support)<br>5. Bootstrapping / Jackknifing<br>6. Statistical hypothesis testing (frequentist)<br>7. Posterior probability (see lecture on Bayesian)</p>
<p>“Quantifying the uncertainty of a phylogenetic estimate is at least as important a goal as obtaining the phylogenetic estimate itself.”<br>- Huelsenbeck &amp; Rannala (2004)</p>
<p>Derek S. Sikes, University of Calgary, Zool 575</p>
<p><strong>Multiple optimal trees</strong></p>
<p>• Many methods can yield multiple equally optimal trees<br>• We can further select among these trees with additional criteria, but<br>• Typically, relationships common to all the optimal trees are summarized with <em>consensus trees</em></p>
<p>• If multiple optimal trees are found, we know that <em>all of them are wrong</em> except, possibly (hopefully), one<br>• Some have argued against consensus tree methods for this reason<br>• Debate over the quest for the true tree (point estimate) versus quantification of uncertainty</p>
<p><strong>Consensus methods</strong></p>
<p>• A consensus tree is a summary of the agreement among a set of fundamental trees<br>• There are many consensus methods that differ in:<br>1. the kind of agreement<br>2. the level of agreement<br>• Consensus methods can be used with multiple trees from a <em>single analysis</em> or from multiple analyses</p>
<p><strong>Strict consensus methods</strong></p>
<p>• Strict consensus methods require agreement across all the fundamental trees<br>• They show only those relationships that are unambiguously supported by the data<br>• The commonest method (<em>strict component consensus</em>) focuses on clades/components/full splits<br>• This method produces a consensus tree that includes all and only those full splits found in all the fundamental trees<br>• Other relationships (those in which the fundamental trees disagree) are shown as unresolved polytomies<br>• Can be less optimal than any of the optimal trees<br>• Simplest to interpret</p>
<p>[Figure: two fundamental trees for taxa A to G and their strict consensus tree]</p>
<p><strong>Majority rule consensus</strong></p>
<p>• Majority-rule consensus methods require agreement across a majority of the fundamental trees<br>• May include relationships that are not supported by the most parsimonious interpretation of the data<br>• The commonest method focuses on clades/components/full splits<br>• This method produces a consensus tree that includes all and only those full splits found in a majority (&gt;50%) of the fundamental trees<br>• Other relationships are shown as unresolved polytomies<br>• Of particular use in bootstrapping and Bayesian inference (best not to use for single searches)<br>• Implemented in PAUP* and MrBayes</p>
<p>[Figure: three fundamental trees for taxa A to G]</p>
<p>Majority-rule consensus trees are used for:</p>
<p>1. Summarizing multiple equally optimal trees from one search (but they shouldn’t be!)</p>
<p>2. 
Summarizing the results of a bootstrapping analysis (multiple searches)</p>
<p>3. Summarizing the results of a Bayesian analysis</p>
<p>Don’t confuse these! <em>The numbers on the branches mean very different things in each case</em></p>
<p>[Figure: majority-rule consensus tree for taxa A to G; numbers indicate the frequency of clades in the fundamental trees (branch values shown: 66, 100, 66, 66, 66)]</p>
<p><strong>Reduced consensus methods</strong></p>
<p>[Figure: two fundamental trees for taxa A to G; the strict component consensus is completely unresolved; the agreement subtree (PAUP*) excludes taxon G]</p>
<p><strong>Consensus methods</strong></p>
<p>[Figure: three fundamental trees for nine taxa (Ochromonas, Symbiodinium, Prorocentrum, Loxodes, Tetrahymena, Spirostomum, Euplotes, Tracheloraphis, Gruberia) with their agreement subtree (Euplotes excluded), strict consensus, and majority-rule consensus (branch values shown: 100, 100, 66, 100, 66, 100)]</p>
<p>• Use strict methods to identify those relationships unambiguously supported by parsimonious interpretation of the data<br>• Use reduced methods where consensus trees are poorly resolved<br>• Avoid methods which have ambiguous interpretations. Prevent possible confusion between an MR consensus for an optimal tree search and an MR consensus for a bootstrapping search</p>
<p><strong>Recall</strong></p>
<p>• Stochastic error vs systematic error<br>• These assessment methods help identify stochastic error<br>– How repeatable are the results?<br>– How strongly do the data support them?<br>– This is a measure of <strong>precision</strong> (which is <em>hopefully</em> related to <strong>accuracy</strong>)</p>
<p><strong>Accuracy and Precision</strong></p>
<p>• <strong>Accuracy</strong><br>– Accuracy is correctness: how close a measurement is to the true value (unless we know the “true tree” in advance we cannot measure this)</p>
<p>• <strong>Precision</strong><br>– Precision is reproducibility: how closely two or more measurements agree with one another (this we can measure!)
</p>
<p><strong>Branch Support</strong></p>
<p>• Several methods have been proposed that attach numerical values to internal branches in trees that are intended to provide some measure of the <em>strength of support</em> for those branches and the corresponding groups<br>• These methods include:<br>- The bootstrap (BS) and jackknife<br>- Decay analyses (aka Bremer support)<br>- Bayesian posterior probabilities (PP or BPP)</p>
<p><strong>Decay analysis</strong></p>
<p>• In parsimony analysis, a way to assess support for a group is to see if the group also occurs in slightly less parsimonious trees<br>• The length difference between the shortest trees that include the group and the shortest trees that exclude the group (the extra steps required to collapse the group) is the <strong>decay index</strong> or <strong>Bremer support</strong></p>
<p><strong>Decay analyses - in practice</strong></p>
<p>• Decay indices for each clade can be determined by:<br>- Using PAUP* to search for the shortest tree that lacks the branch of interest using reverse topological constraints<br>- with the Autodecay or TreeRot programs (in conjunction with PAUP*)<br>- MacClade 4 will also help prepare for a decay analysis<br>- An excellent use for the Parsimony Ratchet, because finding the shortest tree length is all that matters (not finding multiple shortest trees)</p>
<p><strong>Decay analysis - example</strong></p>
<p>[Figure: trees with decay indices on the branches for the ciliate SSUrDNA data and for randomly permuted data; values shown: +27, +1, +1, +45, +3, +8, +15, +7, +10]</p>
<p><strong>Decay indices - interpretation</strong></p>
<p>• Generally, the higher the decay index the better the relative support for a group<br>• Like bootstrap values (BS), decay indices may be misleading if the data are misleading<br>• Magnitudes of decay indices and BS are generally correlated (i.e. they tend to agree)<br>• Only groups found in all most parsimonious trees have decay indices &gt; zero</p>
<p>• Unlike BS, decay indices are not scaled (0-100)<br>– This has the advantage that the value can exceed 100, whereas BS “tops out” at 100, meaning that we cannot distinguish between the support of two branches with BS values of 100 although one might have a far greater decay index than the other<br>• It is even less clear what is an acceptable decay index than what is an acceptable BS value…<br>– Unlike the BS value, very little work has examined the properties and behavior of decay indices</p>
<p>One key study is that of DeBry (2001)<br>– He showed that decay indices should be interpreted in light of branch lengths<br>– That the same values, <em>even within the same tree</em>, do not represent the same support if the branch lengths differ<br>- i.e. decay indices are not easily comparable as measures of branch support<br>- Values &lt; 4 should be considered weak regardless of branch length</p>
<p>DeBry, R.W. (2001) Improving interpretation of the Decay Index for DNA sequence data. Systematic Biology 50: 742-752.</p>
<p>[Figure: decay values versus bootstrap and jackknife values from one empirical study]</p>
<p>Norén, M. &amp; U. Jondelius. 1999. Phylogeny of the Prolecithophora (Platyhelminthes) inferred from 18S rDNA sequences. Cladistics 15: 103-112.</p>
<p><strong>Bootstrapping (non-parametric)</strong></p>
<p>• Bootstrapping is a modern statistical technique that uses computer-intensive random resampling of data to determine sampling error or confidence intervals for some estimated parameter<br>• Introduced to phylogenetics by Felsenstein in 1985<br>• Based on an idea of Efron (1979)</p>
<p>1. Characters are sampled with replacement to create many (100-1000) bootstrap replicate data sets</p>
<p>(think shuffle vs random play of music)</p>
<p>2. Each bootstrap replicate data set is analysed (e.g. with parsimony, distance, ML)</p>
<p>3. 
Agreement among the resulting trees is summarized with a majority-rule consensus tree</p>
<p>• Frequency of occurrence of groups, bootstrap support (BS), is a measure of support for those groups<br>• Additional information is given in partition tables (for groups below 50% support)<br>• Can ask PAUP* to create an MR consensus tree with a higher cut-off, e.g. 80%; all weaker branches collapse</p>
<p><strong>Bootstrapping</strong></p>
<p><strong>Original data matrix</strong></p>
<pre>
Characters  1 2 3 4 5 6 7 8
A           R R Y Y Y Y Y Y
B           R R Y Y Y Y Y Y
C           Y Y Y Y Y R R R
D           Y Y R R R R R R
Outgp       R R R R R R R R
</pre>
<p><strong>Randomly resample characters from the original data with replacement to build many bootstrap replicate data sets of the same size as the original - analyse each replicate data set</strong></p>
<p><strong>Resampled data matrix</strong></p>
<pre>
Characters  1 2 2 5 5 6 6 8
A           R R R Y Y Y Y Y
B           R R R Y Y Y Y Y
C           Y Y Y Y Y R R R
D           Y Y Y R R R R R
Outgp       R R R R R R R R
</pre>
<p><strong>Summarize the results of multiple analyses with a majority-rule consensus tree. Bootstrap values (BS) are the frequencies with which groups are encountered in analyses of replicate data sets</strong></p>
<p>[Figure: trees for taxa A to D plus outgroup from the original and resampled matrices, and the majority-rule consensus tree with bootstrap values 96% and 66%]</p>
<p><strong>Bootstrapping - an example</strong></p>
<p><strong>Ciliate SSUrDNA - parsimony bootstrap</strong></p>
<p>Taxon numbering: Ochromonas (1), Symbiodinium (2), Prorocentrum (3), Loxodes (4), Tracheloraphis (5), Spirostomum (6), Gruberia (7), Euplotes (8), Tetrahymena (9)</p>
<p><strong>Partition Table</strong></p>
<pre>
123456789    Freq
-----------------
.**......  100.00
...**....  100.00
.....**..  100.00
...****..  100.00
...******   95.50
.......**   84.33
...****.*   11.83
...*****.    3.83
.*******.    2.50
.**....*.    1.00
.**.....*
</pre>
<p>[Figure: majority-rule consensus tree; bootstrap values shown: 100, 100, 100, 100, 96, 84]</p>
<p><strong>Bootstrapping - random data</strong></p>
<p><strong>Randomly permuted data - parsimony bootstrap</strong></p>
<p><strong>Partition Table</strong></p>
<pre>
123456789    Freq
-----------------
.*****.**   71.17
..**.....   58.87
....*..*.   26.43
.*......*   25.67
.***.*.**   23.83
...*...*.   21.00
.*..**.**   18.50
.....*..*   16.00
.*...*..*   15.67
.***....*   13.17
....**.**   12.67
....**.*.   12.00
..*...*..   12.00
.**..*..*   11.00
.*...*...   10.80
.....*.**   10.50
.***.....   10.00
</pre>
<p>[Figure: 50% majority-rule consensus tree (with minority components); values shown: 71, 71, 59, 59, 26, 21, 16, 16]</p>
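<p>The resampling and summarizing steps of the bootstrap can be sketched in Python. This is a minimal illustration, not the PAUP* implementation: the function names are invented for this example, the tree-building step (step 2) is left out, and the three trees at the end are hypothetical, written simply as the sets of clades they contain. The data matrix is the toy matrix from the bootstrapping slide.</p>

```python
import random
from collections import Counter


def bootstrap_replicate(matrix, rng):
    """Resample characters (columns) with replacement.

    The replicate has the same number of characters as the original.
    """
    n_chars = len(next(iter(matrix.values())))
    cols = [rng.randrange(n_chars) for _ in range(n_chars)]
    return {taxon: [states[c] for c in cols] for taxon, states in matrix.items()}


def clade_frequencies(trees):
    """Percentage of trees in which each clade (a frozenset of taxa) occurs."""
    counts = Counter(clade for tree in trees for clade in tree)
    return {clade: 100.0 * n / len(trees) for clade, n in counts.items()}


def majority_rule(freqs, cutoff=50.0):
    """Keep only clades found in more than `cutoff` percent of the trees."""
    return {clade: f for clade, f in freqs.items() if f > cutoff}


# Toy data matrix from the bootstrapping slide (taxa A-D plus outgroup).
matrix = {
    "A": list("RRYYYYYY"),
    "B": list("RRYYYYYY"),
    "C": list("YYYYYRRR"),
    "D": list("YYRRRRRR"),
    "Outgp": list("RRRRRRRR"),
}

# Step 1: one bootstrap replicate data set (seeded so the run is repeatable).
replicate = bootstrap_replicate(matrix, random.Random(1))

# Step 3: hypothetical trees standing in for the results of analysing
# three replicates; each tree is the set of clades it contains.
trees = [
    {frozenset("AB"), frozenset("ABC")},
    {frozenset("AB"), frozenset("ABC")},
    {frozenset("AB"), frozenset("CD")},
]
freqs = clade_frequencies(trees)
consensus = majority_rule(freqs)
```

<p>Calling <code>majority_rule(freqs, cutoff=80.0)</code> would mimic asking for a consensus tree with a higher cut-off: clades below that frequency drop out, which corresponds to the weaker branches collapsing.</p>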