<<

Systematics - Bio 615

Confidence - Assessment of the Strength of the Phylogenetic Signal - part 3

Branch lengths and support:

Farris S. J. (2001) Branch lengths do not indicate support - even in maximum likelihood. Cladistics 17: 298-299

“As a simple example, suppose that the data comprise four terminals A–D and 500 informative characters, of which half split the terminals AB/CD and the rest AC/ BD. The (undirected) consensus of most parsimonious trees is unresolved, so that the Bremer support (Bremer, 1988, 1994; cf. Farris, 1996) for either split is zero, but the internal branch of either tree has length 250 according to parsimonious optimization (Farris,1970).” [italics added]

Derek S. Sikes University of Alaska

Confidence - Assessment of the Strength of Confidence - Assessment of the Strength of the Phylogenetic Signal - part 3 the Phylogenetic Signal - part 3

1. Consistency Index 1. Consistency Index

2. g1 statistic, PTP - test 2. g1 statistic, PTP - test

3. Consensus trees 3. Consensus trees

4. Decay index (Bremer Support) [pp.190-193 - Text] 4. Decay index (Bremer Support) [pp.190-193 - Text]

5. Bootstrapping / Jackknifing 5. Bootstrapping / Jackknifing

6. Statistical hypothesis testing (frequentist) 6. Statistical hypothesis testing (frequentist) - Independent Dataset Congruence - Independent Dataset Congruence

7. Posterior probability - Bayesian Inference 7. Posterior probability - Bayesian Inference

Comparing competing phylogenetic Comparing competing phylogenetic hypotheses - tests of two trees hypotheses - tests of two trees

• Particularly useful techniques are those designed to 1. Winning sites test (MP & ML) allow evaluation of alternative phylogenetic hypotheses (frequentist, not Bayesian) 2. Templeton test (MP & ML)

• Several such tests allow us to determine if one tree is statistically significantly worse than another: 3. Kishino-Hasegawa test (MP & ML)

Winning sites test, Templeton test, Kishino-Hasegawa test, 4. Shimodaira-Hasegawa test (ML only) Shimodaira-Hasegawa test, parametric bootstrapping*

*(we won’t cover but you can read about this method in Hillis, D. M., B. K. Mable, and C. Moritz. 1996. Applications of molecular systematics: The state of the field and a look to the future. Chapter 12 in Hillis, D. M., C. Moritz, & B. K. Mable (eds). Molecular Systematics (2nd ed). Sinaeur Associates, Inc. Massachusetts. xvi + 655 pp.) 5. Bayesian Posterior Probabilities (BI only)

1 Systematics - Bio 615

Tests of two trees The Templeton test • Test the null hypothesis that the differences between two trees (A and B) are no greater than expected from sampling error • A non-parametric Wilcoxon signed ranks test of the differences in fits of characters to two • The simplest test is the ‘winning sites’ test trees – sums the number of sites supporting tree A over tree B and vice versa (those having fewer steps on, and better fit to, one of the trees) • Like the ‘winning sites’ test but also takes into account the magnitudes of differences in the • If the null hypothesis is true: characters are equally likely to support tree A or tree B support of characters for the two trees – a binomial distribution gives the probability of the observed difference in numbers of winning sites

Templeton’s test - an example Comparing competing phylogenetic Recent studies of the relationships of hypotheses - tests of two trees 1 turtles using morphological data have produced very different results with turtles grouping either within the parareptiles (H1) or within the diapsids 1. Winning sites test (MP & ML) Seymouriadae Paleothyris ! Diadectomorpha Lepidosauriformes Araeoscelidia (H2) the result depending on the Archosauromorpha Parareptilia Captorhinidae Claudiosaurus ! Younginiformes ! Placodus Eosauropterygia Synapsida 2 morphologist This suggests there may be: - problems with the data 2. Templeton test (MP & ML) - special problems with turtles - weak support for turtle relationships 3. Kishino-Hasegawa test (MP & ML) Parsimony analysis of the most recent data favoured H2 However, analyses constrained by H1 produced trees that required only 3 extra steps (<1% tree length) 4. Shimodaira-Hasegawa test (ML only) The Templeton test was used to evaluate these trees and showed that the slightly longer H1 tree found in the constrained analyses was not significantly worse than the unconstrained H2 tree The morphological data do not allow choice between H1 and 5. Bayesian Posterior Probabilities (BI only) H2

Kishino-Hasegawa test Kishino-Hasegawa test

Sites favouring tree A Sites favouring tree B • Uses differences in the support provided by individual sites If the difference between trees for two trees to determine if the overall differences between (tree lengths or likelihoods) is the trees are significantly greater than expected from attributable to sampling error, then

random sampling error Expected characters will randomly support Mean

0 tree A or B and the total difference Distribution of Step/Likelihood differences at each will be close to zero • A parametric test - depends on assumptions: site The observed difference is – that the characters are independent Under the null hypothesis the significantly greater than zero if it – and identically distributed (the same assumptions underlying the mean of the differences in is greater than 1.95 standard statistical interpretation of bootstrapping) parsimony steps or likelihoods for deviations each site is expected to be zero, This allows us to reject the null and the distribution normal hypothesis and declare the sub- • It can be used with parsimony and maximum likelihood - optimal tree significantly worse implemented in PHYLIP and PAUP* From observed differences we than the optimal tree (p < 0.05) calculate a standard deviation

2 Systematics - Bio 615

Kishino-Hasegawa test - an example Kishino-Hasegawa test - an example

Ochromonas! SSUrDNA Maximum likelihood tree Ciliate SSUrDNA data ! Ochromonas 99-100 Symbiodinium Most parsimonious tree Prorocentrum! 95-100 Parsimony analysis yields a 11! Prorocentrum ! very similar tree ! Parsimonious character 7! Sarcocystis n! 81-86 Theileria - in particular, parsimonious optimization of the 3! 100 Plagiopyla n Plagiopyla f! 100 33! character optimization Trimyema c! Plagiopyla f presence and absence of 48! 100 Trimyema s! 15-0 Trimyema c indicates four separate origins hydrogenosomes suggests 27! Trimyema s Cyclidium p! 3! 69-99 of hydrogenosomes within Cyclidium g! 11-0 100 Glaucoma four separate origins of 6! Colpodinium Cyclidium l! 3! 75! Glaucoma! hydrogenosomes within 35-17 Colpodinium! the ciliates 3! 7! Cyclidium p Tetrahymena! 41-30 12! Paramecium! 78-99 Cyclidium g 3! 89-91 Cyclidium l Discophrya! 100-99 Trithigmostoma! 46-26 Discophryal Opisthonecta! 23! Trithigmostoma ! 3! Opisthonecta 18-0 100-98 Dasytrichia! Questions 50-53 Dasytrichia Entodinium! 1! 67-99 17! Entodinium 3! Spathidium Spathidium! - how reliable is this result? 100 3! Loxophylum! 53-45 56! Homalozoon - in particular how well Loxophylum Decay indices and BS for parsimony Homalozoon! 69-78 5! 100 42 c! 4! Metopus c and distance analyses indicate relative supported is the idea of 3! Metopus p Metopus p! 45-72 83-82 ! multiple origins? Stylonychia support for Onychodromous! Onychodromous 100 27! Oxytrichia Oxytrichia! - how many origins can we 96-100 Differences between the ML, MP and ! Colpoda 10! 100 Loxodes ! confidently infer? distance trees generally reflect the less ! 100 63! Tracheloraphis 100 Spirostomum well supported relationships Gruberia! 26! ! anaerobic ciliates with hydrogenosomes 18! Gruberia 80-50 Blepharisma 3!

Kishino-Hasegawa test - an example Kishino-Hasegawa test - an example

Parsimony analyses with Ochromonas! Ochromonas! Symbiodinium! Symbiodinium! topological constraints - to Prorocentrum! Prorocentrum! Test summary and results - origins of ciliate hydrogenosomes Sarcocystis! Sarcocystis! Theileria! Theileria! find the shortest trees that Plagiopyla n! Plagiopyla n! Plagiopyla f! Plagiopyla f! forced hydrogenosomal Constrained analyses used to find Trimyema c! Trimyema c! N!o.!! C!on!s!t!ra!!i!n!t!! Ex!t!ra!!! D!iff!!e!r!e!n!c!e!! Trimyema s! Trimyema s! ciliate lineages together and S!ig!!n!if!i!c!!a!n!tl!y!! most parsimonious trees with less Cyclidium p! Cyclidium p! O!rig!!i!n!s!! tre!!e!! St!e!p!s!! an!d!!SD!! Cyclidium g! Metopus c! w!o!rs!!e!?! than four separate origins of Cyclidium l! Metopus p! thereby reduced the number Dasytrichia! Dasytrichia! 4! M!L! +1!0!! -! -! hydrogenosomes Entodinium! Entodinium! of separate origins of 4 M P - -13± 18 N!o! Loxophylum! Cyclidium g! Homalozoon! Cyclidium l! Hydrogenosomes. 3 (cp,p!t)! +13 -21± 22 N!o! Tested against ML tree Spathidium! Loxophylum! Metopus c! Spathidium! 3 (cp,r!c) +113 -337± 40 Y!e!s! Metopus p! Homalozoon! 3 (cp,m!) +47 -147± 36 Y!e!s! Trees with 2 or 1 origin are all Loxodes! Loxodes! Each of the constrained Tracheloraphis! Tracheloraphis! 3 (pt,r!!c) +96 -279± 38 Y!e!s! significantly worse than the ML tree Spirostomum! Spirostomum! Gruberia! Gruberia! parsimony trees were 3 (pt,m!!) +22 -68± 29 Y!e!s! Blepharisma! Blepharisma! Discophrya! Discophrya! compared to the ML tree 3 (rc,m!) +63 -190± 34 Y!e!s! We can confidently conclude that Trithigmostoma! Trithigmostoma! 2 (pt,c!!p,r!c) +123 -432 40 Stylonychia! Stylonychia! with the KH test to ± Y!e!s! there have been at least three Onychodromous! Onychodromous! 2 (pt,r!!c,m!) +100 -353± 43 Y!e!s! separate origins of Oxytrichia! Oxytrichia! determine if any of these Colpoda! Colpoda! 2 (pt,c!!p,m!) +40 -140± 37 Y!e!s! hydrogenosomes within the Paramecium! Paramecium! Glaucoma! Glaucoma! trees were significantly 2 (cp,r!c,m!) +124 -466± 49 Y!e!s! sampled ciliates Colpodinium! Colpodinium! 2 (pt,c!!p)(rc,m!) +77 Tetrahymena! Tetrahymena! worse than the ML tree -222± 39 Y!e!s! Opisthonecta! Opisthonecta! 2 (pt,m!!)(rc,c!p) +131 -442± 48 Y!e!s! Two examples of the topological constraint 2 (pt,r!!c)(cp,m!) +140 -414± 50 Y!e!s! trees 1 (pt,c!!p,m!,r!c) +131 -515± 49 Y!e!s!

PAUP* Comparison of shortest Parsimony tree (L=90) to UPGMA tree (L=93) Comparing competing phylogenetic Tree # 1 3! Length 90 93! hypotheses - tests of two trees Kishino-Hasegawa test:! Length! Tree Length diff s.d.(diff) t P*! ------! 1. Winning sites test (MP & ML) 1 90 (best)! 3 93 3 1.68184 1.7838 0.0831 ! Not Sig.

* Probability of getting a more extreme T-value under the null hypothesis of no! difference between the two trees (two-tailed test). Asterisked values in table (if! any) indicate significant difference at P < 0.05.! 2. Templeton test (MP & ML)

Templeton (Wilcoxon signed-ranks) and winning-sites (sign) tests:!

------Templeton ------Winning-sites --! Tree Length Rank sums* N z P** counts P**! ------! 3. Kishino-Hasegawa test (MP & ML) 1 90 (best)! 3 93 6.0 3 -1.7321 0.0833 3 0.2500 ! Not Sig. -0.0 0!

* Wilcoxon signed-ranks test statistic is the smaller of the absolute values of the two! rank sums.! 4. Shimodaira-Hasegawa test (ML only) ** Approximate probability of getting a more extreme test statistic under the null! hypothesis of no difference between the two trees (two-tailed test). Asterisked! values in table (if any) indicate significant difference at P < 0.05. Consult a table! for critical values of Wilcoxon rank sum when N is 25 or less.! 5. Bayesian Posterior Probabilities (BI only) This dataset cannot reject the UPGMA tree as significantly worse!

3 Systematics - Bio 615

Shimodaira-Hasegawa Test (ML only) SH Test - PAUP* output • To be statistically valid, the Kishino-Hasegawa test should be of trees that are selected a priori * Tree 1 16383! -ln L 9935.26295 10206.72132!

• However, most people have used trees selected a Time used to compute likelihoods = 1.50 sec! posteriori on the basis of the phylogenetic analysis Shimodaira-Hasegawa test:! • When we test the ‘best’ tree against some other tree SH test using RELL bootstrap (one-tailed test)! the KH test will be biased towards rejection of the null Number of bootstrap replicates = 1000! hypothesis (so it is a conservative result when it Tree -ln L Diff -ln L P ! doesn’t reject) ------! 1 9935.26295 (best)! • The SH test is a similar but more statistically correct 16383 10206.72132 271.45838 0.000*! technique in these circumstances and should be * P < 0.05! preferred -but is limited to comparing likelihood scores This ML tree is significantly better than the alternative * Goldman, N., Anderson, J.P. and Rodrigo, A.G. (2000) Likelihood-based tests of topologies in phylogenetics. Syst Biol 49: 652-670.

Independent Dataset Congruence Independent Dataset Congruence

• Trees inferred from different data sets (different genes, Example - DNA & Morphology yield different trees morphology) should agree if they are accurate Morphology - preferred tree (best Bayes tree) vs DNA [COII] - preferred tree (best Bayes tree) • Congruence between trees is best explained by their accuracy Parsimony - Kishino-Hasegawa test - Templeton & winning-sites test • Congruence can be investigated using consensus methods, bootstrapping, and KH / SH testing - use Likelihood - Kishino-Hasegawa test dataset1 on the best tree of dataset2 (& vice versa) - Shimodaira-Hasegawa test to see if the different datasets significantly reject the best trees of the other dataset * COII data reject morphology tree at P<0.001

• Incongruence requires further work to explain or * Morphology data reject COII tree at P<0.005 resolve disagreements Rare result - strongly supported incongruence

Bayesian Hypothesis testing - Simple & straightforward Bayesian Hypothesis Testing [ID: 6885249107]! ID = Partition ID number! PART = Description of partition in .* format! NUM = Number of trees sampled with the partition! In MRBayes - check PROB = Posterior probability of the partition! the output file Bayesian Inference yields Posterior Probabilities for clades BLEN = Mean branch length! VAR = Variance of branch length! name.trprobs for (branches) ID !PART ! NUM !PROB BLEN! VAR! the posterior 1 !...... *.... 15001 1.000 0.006 (0.000)! 2 !...... **.... 15001 1.000 0.053 (0.000)! probabilities of all 3 !..*...... 15001 1.000 0.002 (0.000)! likely bipartitions (if Posterior Probabilities (BPP, or PP) are interpreted quite 4 !...... *..... 15001 1.000 0.002 (0.000)! 5 !...... *...... 15001 1.000 0.001 (0.000)! bipartition of interest simply as: 6 !...... *...... 15001 1.000 0.001 (0.000)! isn’t listed, its P is 7 !...... **...... 15001 1.000 0.180 (0.002)! 8 !.....*...... 15001 1.000 0.006 (0.000)! lower than the 9 !..****************** 15001 1.000 0.080 (0.000)! lowest listed, usually THE PROBABILITY THE HYPOTHESIS (branch / tree) ... ! 47 !...... **.....**** 346 0.023 0.017 (0.000)! P<0.000) IS CORRECT 48 !...... ***....**.. 204 0.014 0.037 (0.000)! 49 !...... *****...... 99 0.007 0.007 (0.000)! (Given the model, priors, & data) 50 !...... *....**** 49 0.003 0.008 (0.000)! 51 !..*******.....**.... 39 0.003 0.018 (0.000)! 52 !..*****...... **.... 28 0.002 0.008 (0.000)! 53 !.....***...... 12 0.001 0.002 (0.000)! These probabilities can be used to compare competing 54 !.....**.*...... 11 0.001 0.001 (0.000)! 55 !...... **.....**.. 10 0.001 0.015 (0.000)! hypotheses instead of using the KH or SH tests 56 !...... ***...... ** 10 0.001 0.015 (0.000)! This is a 57 !...... **....** 9 0.001 0.016 (0.000)! hypothesis that is 58 !..***.*..*********** 7 0.000 0.001 (0.000)! 59 !..***....***..****** 6 0.000 0.012 (0.000)! rejected at P<0.000 Miller, R. E., Buckley, T. R. and Manos, P. S. (2002) An examination of the monophyly of morning glory taxa using Bayesian phylogenetic inference. Systematic Biology 51(5): 740-753

4 Systematics - Bio 615

Confidence - Assessment of the Strength of Character evolution: the Phylogenetic Signal - part 3 Melanism - homology or homoplasy? 1. Consistency Index Monophyly rejected PP = 0.000067% 2. g1 statistic, PTP - test

3. Consensus trees

4. Decay index (Bremer Support) [pp.190-193 - Text]

5. Bootstrapping / Jackknifing

6. Statistical hypothesis testing (frequentist) - Independent Dataset Congruence

7. Posterior probability - Bayesian Inference

Bayesian Phylogenetic Inference Bayesian Statistics

Outline Bayesian statistics is fundamentally different from the more commonly taught and used frequentist statistics 1. Introduction, history Based on the work of Reverend Thomas Bayes (1764) 2. Advantages over ML Probability theory was Bayesian until ca. 1920s 3. Bayes’ Rule Fisherian / Frequentist statistics developed 4. The Priors 1929 & have since dominated

5. Marginal vs Joint estimation Logical problems with Bayesian Statistics were fixed in 1955 - re-emergence of 6. Posteriors vs Bootstrap values Bayesian methods - not used for phylogenetics until 1996 7. MCMC

The Classical Hypothesis Testing Setup The Bayesian Approach Step 1. Define the Research Hypothesis A Research or Alternative Hypothesis is a statement Bayesians, in contrast, try to do the following: derived from theory about what the researcher expects to find in the data. 1) Make inferences based on all information at our disposal 2) See how new data effects our (old) inferences Step 2. Define the Null Hypothesis 3) Need to identify all hypotheses (or states of nature) that The Null Hypothesis is a statement of what you would not may be true expect to find if your research or alternative hypothesis 4) Need to know what each hypothesis (or state of nature) was consistent with reality. predicts that we will observe 5) Need to know how to compute the consequences. i.e. we Step 3. Conduct an analysis of the data to determine need to know how to update our old inferences in light of whether or not you can reject the null hypothesis with our observations some pre-determined probability If you can reject the null hypothesis with some probability, then the data are consistent with the model In sum, their statistics matches how scientists actually If you cannot reject the null hypothesis with some probability, then think the data are not consistent with the model

5 Systematics - Bio 615

Bayesian vs Frequentist Bayesian vs Frequentist Coin - Flipping example - Flip a coin 50 times, get 28 heads Coin - Flipping example - Flip a coin 50 times, get 28 heads

“Is this a fair coin?” - Frequentist approach: test - “Is this a fair coin?” - Bayesian approach

H0: the probability of heads = 0.5 H1: the probability of heads ≠ 0.5 Bayesian statistics allows us to answer a more useful question than is “P = 0.05?” namely… Bayesians would say “This approach is flawed. There is virtually no chance that the coin is perfectly fair (P = “What is the probability that P is between 0.4 and 0.6?” 0.5000000000…) one only need collect enough data to demonstrate this. So this isn’t a test of whether the coin is Bayesians specify in advance the range they feel would fair, it is a test of whether we have collected enough data represent a “fair coin” [ the hypothesis ] to prove it isn’t!” Can calculate the probability of a hypothesis being (there is a difference between biological significance true (give the data and some other stuff..) and statistical significance)

Bayesian Phylogenetic Inference Bayesian Phylogenetic Inference Outline Bayesian methods build on maximum likelihood - Can use the same models of evolution 1. Introduction, history - The likelihood is incorporated into the equation

2. Advantages over ML - Major advantages over ML: 1. The likelihood is not maximized so 3. Bayes’ Rule considerable time is saved 2. Branch support values analogous to bootstrap 4. The Priors values are obtained in ~ 10+ x less time 5. Marginal vs Joint estimation 3. More realistic mixed-model analysis is possible 6. Posteriors vs Bootstrap values 4. Does not return a point-estimate of phylogeny but a quantification of 7. MCMC the strength of the signal in the data

Bayesian Phylogenetic Inference Bayesian Phylogenetic Inference Maximum log-Likelihood score:

2001 The probability of the data given the hypothesis (sub. model, tree, branch lengths) - this is neither intuitive nor desirable

Bayesian Posterior Probability score:

The probability of the hypothesis (tree or branch) being correct given the data, priors, sub. model, and branch lengths - this is intuitive and what we want!

6 Systematics - Bio 615

Bayesian Phylogenetic Inference Bayesian Phylogenetic Inference Outline Bayesian inference

1. Introduction, history - optimal topology is the one that maximizes the posterior probability (PP) 2. Advantages over ML - PP is proportional to the prior probability and the 3. Bayes’ Rule likelihood 4. The Priors

5. Marginal vs Joint estimation - We use Bayes’ Rule to calculate the PP

6. Posteriors vs Bootstrap values

7. MCMC

Bayesian Phylogenetic Inference Four Components of Bayes’ Rule Bayes’ Rule - converts likelihood into Posterior Prob. 1. Posterior Probability = probability of hypothesis given (…) after we have observed the data Pr = probability; D = data; tree = parameter (hypothesis) prior 2. Prior Probability = probability of hypothesis before the probability likelihood data are seen (industry of thought on this Pr (tree | D) = Pr (tree) Pr (D | tree) component!) - based on belief –––––––––––––– Pr (D) 3. Likelihood = probability of the data given the hypothesis

posterior probability 4. Pr (D) = marginal probability of the data - a constant scaling factor to get PP marginal probability of data given model - a to range from 0.0 to 1.0 normalizing constant

Bayesian Phylogenetic Inference Bayesian Phylogenetic Inference Four components to Bayes’ Rule Outline

Since Pr(D) is constant the equation can be written 1. Introduction, history

Pr (p | D) ≈ Pr (p) Pr ( D | p ) 2. Advantages over ML

or in words 3. Bayes’ Rule 4. The Priors the posterior probability is proportional to the prior probability times the likelihood 5. Marginal vs Joint estimation

6. Posteriors vs Bootstrap values

7. MCMC

7 Systematics - Bio 615

Subjectivity of the Prior is a common Bayesian Phylogenetic Inference criticism of Bayesian Inference Four components to Bayes’ Rule Flat prior Informative prior And if we use a completely flat, uninformative prior, such that all values are equally likely a priori…

Pr (p | D) ≈ Pr ( D | p )

or in words

the posterior probability is proportional to the likelihood! 1. Influence of the prior weakens as the signal in the data But what are those priors anyway? strengthens - few or very weak data can be dangerous 2. Priors are explicit and must be justified 3. Try different priors to see how they influence the posteriors

Bayesian Phylogenetic Inference Bayesian Phylogenetic Inference Outline Bayesian Inference differs subtly from Maximum Likelihood in how parameter values are obtained 1. Introduction, history

2. Advantages over ML - BI uses marginal estimation (volume of dist) - ML uses joint estimation (maximum of dist) 3. Bayes’ Rule Holder, M. and Lewis, P.O. (2003) Phylogeny estimation: traditional and Bayesian approaches. Nature Reviews Genetics 4: 275-284 4. The Priors

5. Marginal vs Joint estimation

6. Posteriors vs Bootstrap values

7. MCMC

Bayesian Phylogenetic Inference Bayesian Phylogenetic Inference BI & ML will prefer different trees when the data do Outline not strongly prefer one tree (like comparing a bootstrap tree to a most parsimonious tree) 1. Introduction, history

ML will prefer tree A BI will prefer tree B 2. Advantages over ML

3. Bayes’ Rule

Y = axis likelihood or PP Bayesian ML 4. The Priors

5. Marginal vs Joint estimation

6. Posteriors vs Bootstrap values - (slide)

7. MCMC X axis = parameter values

8 Systematics - Bio 615

Terms - from lecture & readings Study questions

winning sites test Explain how a branch length (under parsimony) of 250 steps can Templeton test have zero branch support? Kishino-Hasegawa test Shimodaira-Hasegawa test The Templeton test, KH, and SH test all test what null hypothesis? Bayesian Statistics Reverend Thomas Bayes How must one select the trees to be compared for the KH to be Frequentist Statistics statistically valid? What alternative test can one use if this posterior probability requirement of the KH cannot be met? prior probability What are four advantages of Bayesian Phylogenetic Inference over Bayes' Rule Maximum Likelihood? flat prior marginal estimation Although BI & ML are expected to prefer the same tree if the same joint estimation model of evolution is used for both - when might BI & ML prefer different trees & why?

9