Handout Lec. 25

<p>Introduction to Biosystematics - Zool 575 </p><p>Introduction to Biosystematics Lecture 25 - Confidence - Assessment 2 </p><p>Derek S. Sikes University of Calgary Zool 575 </p>
<p><strong>Confidence - Assessment of the Strength of the Phylogenetic Signal - part 2 </strong></p><p>1. Consistency Index <br>2. g1 statistic, PTP test <br>3. Consensus trees <br>4. Decay index (Bremer Support) <br>5. Bootstrapping / Jackknifing <br>6. Statistical hypothesis testing (frequentist) <br>7. Posterior probability (see lecture on Bayesian) </p>
<p>“Quantifying the uncertainty of a phylogenetic estimate is at least as important a goal as obtaining the phylogenetic estimate itself.” <br>- Huelsenbeck &amp; Rannala (2004) </p>
<p>Multiple optimal trees </p><p>• Many methods can yield multiple equally optimal trees <br>• We can further select among these trees with additional criteria, but <br>• Typically, relationships common to all the optimal trees are summarized with <em>consensus trees </em></p>
<p>• If multiple optimal trees are found, we know that <em>all of them are wrong </em>except, possibly (hopefully), one <br>• Some have argued against consensus tree methods for this reason <br>• Debate over the quest for the true tree (point estimate) versus quantification of uncertainty </p>
<p>Consensus methods </p><p>• A consensus tree is a summary of the agreement among a set of fundamental trees <br>• There are many consensus methods that differ in: <br>1. the kind of agreement <br>2. the level of agreement <br>• Consensus methods can be used with multiple trees from a <em>single analysis </em>or from multiple analyses </p>
<p>Strict consensus methods </p><p>• Strict consensus methods require agreement across all the fundamental trees <br>• They show only those relationships that are unambiguously supported by the data <br>• The commonest method (<em>strict component consensus</em>) focuses on clades/components/full splits <br>• This method produces a consensus tree that includes all and only those full splits found in all the fundamental trees <br>• Other relationships (those in which the fundamental trees disagree) are shown as unresolved polytomies <br>• Can be less optimal than any of the optimal trees <br>• Simplest to interpret </p>
<p><strong>[Figure: TWO FUNDAMENTAL TREES for taxa A-G and the resulting STRICT CONSENSUS TREE] </strong></p>
<p>Majority rule consensus </p><p>• 
Majority-rule consensus methods require agreement across a majority of the fundamental trees <br>• The commonest method focuses on clades/components/full splits <br>• This method produces a consensus tree that includes all and only those full splits found in a majority (&gt;50%) of the fundamental trees <br>• Other relationships are shown as unresolved polytomies <br>• May include relationships that are not supported by the most parsimonious interpretation of the data <br>• Of particular use in bootstrapping and Bayesian inference (best not to use for single searches) <br>• Implemented in PAUP* and MrBayes </p>
<p>Majority-rule consensus trees are used for: <br>1. Summarizing multiple equally optimal trees from one search (but they shouldn’t be!) <br>2. Summarizing the results of a bootstrapping analysis (multiple searches) <br>3. Summarizing the results of a Bayesian analysis </p><p>Don’t confuse these! <em>The numbers on the branches mean very different things in each case </em></p>
<p><strong>[Figure: THREE FUNDAMENTAL TREES for taxa A-G and the resulting MAJORITY-RULE CONSENSUS TREE; numbers (66 or 100) indicate frequency of clades in the fundamental trees] </strong></p>
<p>Reduced consensus methods </p><p><strong>[Figure: TWO FUNDAMENTAL TREES for taxa A-G; strict component consensus completely unresolved; AGREEMENT SUBTREE (PAUP*), taxon G is excluded] </strong></p>
<p>Consensus methods </p><p><strong>[Figure: three fundamental trees for the ciliate taxa (Ochromonas, Symbiodinium, Prorocentrum, Loxodes, Tetrahymena, Spirostomum, Euplotes, Tracheloraphis, Gruberia) with their strict consensus, majority-rule consensus (clade frequencies of 100 and 66) and agreement subtree (Euplotes excluded)] </strong></p>
<p><strong>Consensus 
methods </strong></p><p>• Use strict methods to identify those relationships unambiguously supported by parsimonious interpretation of the data <br>• Use reduced methods where consensus trees are poorly resolved <br>• Avoid methods which have ambiguous interpretations. Prevent possible confusion between an MR consensus for an optimal tree search and an MR consensus for a bootstrapping search </p>
<p>Recall </p><p>• Stochastic error vs systematic error <br>• These assessment methods help identify stochastic error <br>– How repeatable are the results? <br>– How strongly do the data support them? <br>– This is a measure of <strong>precision </strong>(which is <em>hopefully </em>related to <strong>accuracy</strong>) </p>
<p>Accuracy and Precision </p><p>• <strong>Accuracy </strong><br>– Accuracy is correctness: how close a measurement is to the true value (unless we know the “true tree” in advance, we cannot measure this) </p><p>• <strong>Precision </strong><br>– Precision is reproducibility: how closely two or more measurements agree with one another (this we can measure!) 
</p><p><strong>Branch Support </strong></p><p>• Several methods have been proposed that attach numerical values to internal branches in trees that are intended to provide some measure of the <em>strength of support </em>for those branches and the corresponding groups <br>• These methods include: <br>- The bootstrap (BS) and jackknife <br>- Decay analyses (aka Bremer support) <br>- Bayesian posterior probabilities (PP or BPP) </p>
<p>Decay analysis </p><p>• In parsimony analysis, a way to assess support for a group is to see if the group also occurs in slightly less parsimonious trees <br>• The length difference between the shortest trees that include the group and the shortest trees that exclude the group (the extra steps required to collapse a group) is the <strong>decay index </strong>or <strong>Bremer support </strong></p>
<p>Decay analyses - in practice </p><p>• Decay indices for each clade can be determined by: <br>- Using PAUP* to search for the shortest tree that lacks the branch of interest using reverse topological constraints <br>- with the Autodecay or TreeRot programs (in conjunction with PAUP*) <br>- MacClade 4 will also help prepare for a decay analysis <br>- An excellent use for the Parsimony Ratchet, because finding the shortest tree length is all that matters (not finding multiple shortest trees) </p>
<p>Decay analysis - example </p><p><strong>[Figure: trees with decay indices for the Ciliate SSUrDNA data and for randomly permuted data; the indices shown range from +1 to +45] </strong></p>
<p>Decay indices - interpretation </p><p>• Generally, the higher the decay index the better the relative support for a group <br>• Like bootstrap values (BS), decay indices may be misleading if the data are misleading <br>• Magnitude of decay indices and BS generally correlated (i.e. they tend to agree) <br>• Only groups found in all most parsimonious trees have decay indices &gt; zero </p>
<p>• Unlike BS, decay indices are not scaled (0-100) <br>– This has the advantage that the value can exceed 100, whereas BS “tops out” at 100, meaning that we cannot distinguish between the support of two branches with BS values of 100 although one might have a far greater decay index than the other <br>• It is even less clear what is an acceptable decay index than a BS value… <br>– Unlike the BS value, very little work has examined the properties and behavior of decay indices </p>
<p>One key study is that of DeBry (2001) <br>– He showed that decay indices should be interpreted in light of branch lengths <br>– That the same values, <em>even within the same tree</em>, do not represent the same support if the branch lengths differ <br>- i.e. decay indices are not easily comparable as measures of branch support <br>- Values &lt; 4 should be considered weak regardless of branch length </p><p>DeBry, R.W. (2001) Improving interpretation of the Decay Index for DNA sequence data. Systematic Biology 50: 742-752. </p>
<p><strong>[Figure: decay values versus bootstrap and jackknife values from one empirical study] </strong></p><p>Norén, M. &amp; U. Jondelius. 1999. Phylogeny of the Prolecithophora (Platyhelminthes) inferred from 18S rDNA sequences. Cladistics 15: 103-112. </p>
<p>Bootstrapping (non-parametric) </p><p>• Bootstrapping is a modern statistical technique that uses computer-intensive random resampling of data to determine sampling error or confidence intervals for some estimated parameter <br>• Introduced to phylogenetics by Felsenstein in 1985 <br>• Based on the idea of Efron (1979) </p>
<p>1. Characters are sampled with replacement to create many (100-1000) bootstrap replicate data sets (think shuffle vs random play of music) </p><p>2. Each bootstrap replicate data set is analysed (e.g. with parsimony, distance, ML) </p><p>3. 
Agreement among the resulting trees is summarized with a majority-rule consensus tree </p>
<p>Bootstrapping </p><p>• Frequency of occurrence of groups, bootstrap support (BS), is a measure of support for those groups <br>• Additional information is given in partition tables (for groups below 50% support) <br>• Can ask PAUP* to create an MR consensus tree with a higher cut-off, e.g. 80%; all weaker branches collapse </p>
<p><strong>Original data matrix </strong></p>
<pre>
        Characters
Taxa    1 2 3 4 5 6 7 8
A       R R Y Y Y Y Y Y
B       R R Y Y Y Y Y Y
C       Y Y Y Y Y R R R
D       Y Y R R R R R R
Outgp   R R R R R R R R
</pre>
<p><strong>Randomly resample characters from the original data with replacement to build many bootstrap replicate data sets of the same size as the original - analyse each replicate data set </strong></p>
<p><strong>Resampled data matrix </strong></p>
<pre>
        Characters
Taxa    1 2 2 5 5 6 6 8
A       R R R Y Y Y Y Y
B       R R R Y Y Y Y Y
C       Y Y Y Y Y R R R
D       Y Y Y R R R R R
Outgp   R R R R R R R R
</pre>
<p><strong>Summarize the results of multiple analyses with a majority-rule consensus tree. Bootstrap values (BS) are the frequencies with which groups are encountered in analyses of replicate data sets </strong></p>
<p><strong>[Figure: trees for taxa A-D and the outgroup with character numbers on the branches, and the majority-rule consensus tree with bootstrap values of 96% and 66%] </strong></p>
<p><strong>Taxa in the partition tables: Ochromonas (1), Symbiodinium (2), Prorocentrum (3), Loxodes (4), Tracheloraphis (5), Spirostomum (6), Gruberia (7), Euplotes (8), Tetrahymena (9) </strong></p>
<p>Bootstrapping - an example </p><p><strong>Ciliate SSUrDNA - parsimony bootstrap </strong></p><p><strong>Partition Table </strong></p>
<pre>
123456789    Freq
-----------------
.**......  100.00
...**....  100.00
.....**..  100.00
...****..  100.00
...******   95.50
.......**   84.33
...****.*   11.83
...*****.    3.83
.*******.    2.50
.**....*.    1.00
.**.....*
</pre>
<p><strong>[Figure: majority-rule consensus tree for the ciliate data with bootstrap values of 100, 96 and 84] </strong></p>
<p>Bootstrapping - random data </p><p><strong>Randomly permuted data - parsimony bootstrap </strong></p><p><strong>Partition Table </strong></p>
<pre>
123456789    Freq
-----------------
.*****.**   71.17
..**.....   58.87
....*..*.   26.43
.*......*   25.67
.***.*.**   23.83
...*...*.   21.00
.*..**.**   18.50
.....*..*   16.00
.*...*..*   15.67
.***....*   13.17
....**.**   12.67
....**.*.   12.00
..*...*..   12.00
.**..*..*   11.00
.*...*...   10.80
.....*.**   10.50
.***.....   10.00
</pre>
<p><strong>[Figure: 50% majority-rule consensus (with minority components) for the randomly permuted data, with values of 71, 59, 26, 21 and 16] </strong></p>
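<p>The resampling step of the non-parametric bootstrap can be sketched in a few lines of Python. This is only an illustration, assuming the small five-taxon, eight-character matrix from the bootstrapping slide; the function names and the random seed are hypothetical, and the tree search that would follow each replicate (parsimony, distance, ML) is not shown.</p>

```python
import random

# Small character matrix from the handout (taxa x 8 characters, states R/Y).
MATRIX = {
    "A": "RRYYYYYY",
    "B": "RRYYYYYY",
    "C": "YYYYYRRR",
    "D": "YYRRRRRR",
    "Outgp": "RRRRRRRR",
}


def resample_columns(matrix, columns):
    """Build a replicate matrix from a list of 1-based character numbers."""
    return {
        taxon: "".join(states[c - 1] for c in columns)
        for taxon, states in matrix.items()
    }


def bootstrap_replicate(matrix, rng):
    """Draw as many characters as the original has, with replacement."""
    n_chars = len(next(iter(matrix.values())))
    # Sorting only mirrors how the handout displays the replicate;
    # character order does not matter to the analysis.
    columns = sorted(rng.choices(range(1, n_chars + 1), k=n_chars))
    return columns, resample_columns(matrix, columns)


# The replicate shown in the handout uses characters 1 2 2 5 5 6 6 8.
handout_replicate = resample_columns(MATRIX, [1, 2, 2, 5, 5, 6, 6, 8])

# One random replicate; the seed is arbitrary and only makes the run repeatable.
rng = random.Random(575)
columns, replicate = bootstrap_replicate(MATRIX, rng)

for taxon, states in handout_replicate.items():
    print(f"{taxon:6s}{' '.join(states)}")
```

<p>Because characters are drawn with replacement, some characters appear more than once in a replicate and others not at all, while the replicate keeps the same size as the original matrix.</p>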

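<p>How clade frequencies behind a majority-rule consensus might be tallied can also be sketched in Python. The three trees below are hypothetical (they are not the trees in the handout figures), each tree is reduced to the set of clades it contains, and the function names are made up for illustration. The same tally applies whether the input trees are equally optimal trees, bootstrap replicate trees, or a Bayesian sample, which is why the numbers must be interpreted according to their source.</p>

```python
from collections import Counter


def clade_frequencies(trees):
    """Percentage of trees in which each clade (a set of taxa) occurs."""
    counts = Counter(clade for tree in trees for clade in tree)
    return {clade: 100.0 * n / len(trees) for clade, n in counts.items()}


def majority_rule(trees, cutoff=50.0):
    """Keep clades found in more than `cutoff` percent of the trees."""
    return {
        clade: freq
        for clade, freq in clade_frequencies(trees).items()
        if freq > cutoff
    }


def strict(trees):
    """Keep only clades found in every tree."""
    return {c for c, f in clade_frequencies(trees).items() if f == 100.0}


# Hypothetical rooted trees on taxa A-E, written as sets of clades.
# frozenset("AB") is the clade {A, B}; this works for one-letter taxon names.
TREES = [
    {frozenset("AB"), frozenset("ABC"), frozenset("DE")},    # (((A,B),C),(D,E))
    {frozenset("AB"), frozenset("ABC"), frozenset("ABCD")},  # ((((A,B),C),D),E)
    {frozenset("BC"), frozenset("ABC"), frozenset("DE")},    # ((A,(B,C)),(D,E))
]

for clade, freq in sorted(majority_rule(TREES).items(), key=lambda kv: -kv[1]):
    print("".join(sorted(clade)), round(freq, 1))
```

<p>With three input trees a clade found in two of them has a frequency of two thirds, which corresponds to the value 66 on the handout's three-tree example. Raising the cut-off, for example <code>majority_rule(TREES, cutoff=80.0)</code>, collapses the weaker groups, as described for PAUP* above.</p>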