O'Reilly, J., Puttick, M., Pisani, D., & Donoghue, P. (2017). Probabilistic methods surpass parsimony when assessing clade support in phylogenetic analyses of discrete morphological data. Palaeontology. https://doi.org/10.1111/pala.12330

Publisher's PDF, also known as Version of record License (if available): CC BY Link to published version (if available): 10.1111/pala.12330

Link to publication record in Explore Bristol Research PDF-document

This is the final published version of the article (version of record). It first appeared online via Wiley at http://onlinelibrary.wiley.com/doi/10.1111/pala.12330/abstract. Please refer to any applicable terms of use of the publisher.

University of Bristol - Explore Bristol Research General rights

This document is made available in accordance with publisher policies. Please cite only the published version using the reference above. Full terms of use are available: http://www.bristol.ac.uk/red/research-policy/pure/user-guides/ebr-terms/ [Palaeontology, 2017, pp. 1–14]

PROBABILISTIC METHODS SURPASS PARSIMONY WHEN ASSESSING CLADE SUPPORT IN PHYLOGENETIC ANALYSES OF DISCRETE MORPHOLOGICAL DATA by JOSEPH E. O’REILLY1 ,MARKN.PUTTICK1,2 , DAVIDE PISANI1,3 and PHILIP C. J. DONOGHUE1 1School of Earth Sciences, University of Bristol, Life Sciences Building, Tyndall Avenue, Bristol, BS8 1TQ, UK; [email protected]; [email protected]; [email protected]; [email protected] 2Department of Earth Sciences, The Natural History Museum, Cromwell Road, London, SW7 5BD, UK 3School of Biological Sciences, University of Bristol, Life Sciences Building, Tyndall Avenue, Bristol, BS8 1TQ, UK

Typescript received 28 April 2017; accepted in revised form 13 September 2017

Abstract: Fossil taxa are critical to inferences of historical 50% support. Ignoring node support, Bayesian inference is diversity and the origins of modern biodiversity, but realizing the most accurate method in estimating the tree used to sim- their evolutionary signiﬁcance is contingent on restoring fossil ulate the data. After assessing clade support, Bayesian and species to their correct position within the tree of life. For Maximum Likelihood exhibit comparable levels of accuracy, most fossil species, morphology is the only source of data for and parsimony remains the least accurate method. However, phylogenetic inference; this has traditionally been analysed Maximum Likelihood is less precise than Bayesian phylogeny using parsimony, the predominance of which is currently estimation, and Bayesian inference recaptures more correct challenged by the development of probabilistic models that nodes with higher support compared to all other methods, achieve greater phylogenetic accuracy. Here, based on simu- including Maximum Likelihood. We assess the effects of these lated and empirical datasets, we explore the relative efﬁcacy of ﬁndings on empirical phylogenies. Our results indicate proba- competing phylogenetic methods in terms of clade support. bilistic methods should be favoured over parsimony. We characterize clade support using bootstrapping for parsi- mony and Maximum Likelihood, and intrinsic Bayesian pos- Key words: phylogenetic analysis, morphology, parsimony, terior probabilities, collapsing branches that exhibit less than Maximum Likelihood, Bayesian, Mk model.

T HE goal of reconstructing an holistic Tree of Life has majority rule consensus tree constructed from a sample been envisaged since the inception of evolutionary theory. of trees from the posterior distribution (hereafter the This entails not only the use of molecular phylogenetics Bayesian tree) as optimality trees (following Holder et al. to determine the inter-speciﬁc relationships between 2008) and compared them directly to the optimal Maxi- extant taxa, but also the restoration of extinct branches to mum Likelihood and Maximum Parsimony trees. How- the Tree of Life. For the majority of extinct species, this ever, because Bayesian trees have intrinsic support values can only be achieved through phylogenetic analysis of (posterior probabilities) some have argued that they morphological data. Parsimony has dominated the phylo- should be compared to bootstrapped Maximum Likeli- genetic analysis of morphological data but its hegemony hood trees (Huelsenbeck et al. 2001), even though post- is now challenged by model-based phylogenetic methods erior probabilities and bootstrap proportions are neither that attempt to approach the realism of models of evolu- interchangeable nor directly comparable (Douady et al. tion developed for molecular evolution (Lewis 2001). 2003). Brown et al. (2017) showed that, when clade Simulation-based studies have shown that parsimony is support is considered, Maximum Likelihood and Bayesian less accurate than Bayesian analysis for phylogenetic infer- implementations of the Mk model are effectively indistin- ence with morphological data (Wright & Hillis 2014; guishable in terms of topological accuracy. O’Reilly et al. 2016; Puttick et al. 2017). Previous studies Here we develop upon previous studies to incorporate (O’Reilly et al. 2016; Puttick et al. 2017) treated the 50% measures of clade support in the investigation of the

© 2017 The Authors. doi: 10.1111/pala.12330 1 Palaeontology published by John Wiley & Sons Ltd on behalf of The Palaeontological Association. This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. 2 PALAEONTOLOGY performance of parsimony relative to Maximum Likeli- and the shape of the gamma distribution of character- hood and Bayesian implementations of the Mk model in wise rate heterogeneity were sampled independently from the phylogenetic analyses of morphological data. An ini- an exponential distribution of mean 1, ensuring a reason- tial comparison of the accuracy and resolution of differ- able level of rate heterogeneity between replicate datasets ent methods is achieved through the use of simulated and between the constituent characters of each replicate data. We take a simulation-based, rather than an empiri- matrix. The choice of rate parameters in our simulation cal approach since the performance of phylogenetic meth- framework is validated by our ability to easily simulate ods in tree estimation can only be assessed when the tree matrices with empirical levels of homoplasy. To ensure is known; this is never the case for empirical data. Empir- data did not collapse into the Mk model and therefore ical analyses are performed to demonstrate the inﬂuence unintentionally provide a beneﬁt when applying the Mk of methods on the testing of phylogenetic hypotheses. We model for phylogenetic inference, we applied an unequal compare clade support using standard methods, with pos- stationary distribution of p = [0.2, 0.2, 0.3, 0.3] in the terior probabilities obtained from posterior samples of HKY model. Fixing the stationary distribution in this Bayesian trees and non-parametric bootstrap proportions manner enforces violation of the assumptions of the Mk calculated for most-parsimonious and Maximum Like- model that the limiting distribution of character states is lihood trees, collapsing poorly-supported (nodes with equal and that transitions between any two character states < 0.5 posterior probability, or present in < 50% of boot- are equally probable. A mixture of binary and multistate strap replicates) nodes in trees constructed with each characters was simulated, with binary characters obtained method. We ﬁnd that parsimony has low accuracy com- through the R/Y (purine/pyrimidine) recoding of molecu- pared to both Bayesian and Maximum Likelihood imple- lar data as character states of 0 or 1, resulting in the elimi- mentations of the Mk model. However, the Bayesian nation of the transition–transversion ratio for these implementation still achieves higher support values on characters and an effective stationary distribution of correct nodes than does the Maximum Likelihood imple- p = [0.4, 0.6], which is a violation of the assumption of mentation. Based on these results, we conclude that a the Mk model that character states exhibit an even limiting Bayesian implementation of the Mk model should be pre- distribution. Multistate characters were obtained by recod- ferred over the Maximum Likelihood implementation ing nucleotide data as character states of 0, 1, 2, or 3, when analysing categorical morphological data. Neverthe- depending on the nucleotide present at each terminal. The less, both implementations are superior to parsimony ﬁnal ratio of binary to multistate characters in each matrix when analysing categorical morphological data. was 55:45, based on the mean ratio observed in empirical data (Guillerme & Cooper 2016). For each of the 100, 350, and 1000 character sets we generated 1000 matrices that, MATERIAL AND METHOD in total, exhibited a distribution of homoplasy approxi- mating that reported by Sanderson & Donoghue (1996). Simulated data Goloboff et al. (2017) have criticized this approach to simulating morphology-like datasets on the basis that our For all analyses, we estimated topologies and support val- generating trees encompass only contemporaneous taxa, ues using datasets of 100, 350, and 1000 characters from assume that evolutionary rates are constant across time Puttick et al. (2017), simulated on both the asymmetrical and the tree, and that our measure of biological realism, and symmetrical phylogenies; these variables allowed us the spread of homoplasy exhibited by datasets, is inade- to explore the impact of both character matrix size and quate. However, our experiments do not attempt to sim- tree symmetry on phylogeny estimation. We generated ulate non-contemporary taxa or address the problem of 1000 independent datasets for each tree and character set missing data, qualities of palaeontological data that are of size. These data were simulated to ensure that they a level of complexity that is beyond the current debate. matched an empirical distribution of homoplasy. To Goloboff et al. (2017, ﬁg. 1A) demonstrated that our sim- reduce potential biases in favour of the use of the Mk ulated data broadly achieve their preferred measure of model for phylogenetic inference (regarding Parsimony, biological realism. Our review of their datasets indicates see O’Reilly et al. 2016) we simulated our data using a that, while Goloboff et al. (2017) drew characters from an model with more parameters than the Mk model: the empirically realistic global distribution of homoplasy,

HKY + Γcontinuous model of molecular evolution (Hase- their simulated datasets are not individually empirically gawa et al. 1985), with the transition to transversion ratio realistic, with many matrices dominated by characters parameter k ﬁxed to a value of 2. This approach to simu- with very high consistency and an unrealistically small lation enforces the violation of the assumption of the Mk proportion of characters exhibiting high levels of homo- model that a transition between any two character states plasy. The datasets simulated by Goloboff et al. (2017) is equally probable. The substitution rate of each dataset have qualities that strongly bias in favour of parsimony O’REILLY ET AL.: MORPHOLOGICAL TREE SUPPORT 3 phylogenetic inference, and implied-weights parsimony in in ≥ 50% of bootstrap replicates. The trees estimated in particular, as the presence of large numbers of characters the Bayesian framework already represent the 50% majority that are congruent with the tree allows implied weights to rule consensus of the posterior distribution of trees, and so increase the power of these ‘true’ congruent characters. these trees are identical before and after assessing support. This effect will not be possible when increased levels of For the Maximum Likelihood and parsimony analyses, we homoplasy are present or when the true tree is unknown estimated clade support using non-parametric bootstrap- (as is the case for all empirical datasets). ping (Felsenstein 1985). We obtained 250 bootstrap repli- For each of our datasets, we estimated trees using the cates for each simulated dataset in both parsimony and Maximum Likelihood and Bayesian implementations of Maximum Likelihood frameworks, using TNT (Goloboff the Mk model, and equal-weights parsimony, in RAxML et al. 2008) and RAxML (Stamatakis 2014), respectively. (Stamatakis 2014), MrBayes (Ronquist et al. 2012) and We collapsed branches on the parsimony and Maximum TNT (Goloboff et al. 2008), respectively. Bayesian phylo- Likelihood trees with < 50% bootstrap support. Support genetic methods that use MCMC sampling result in a dis- for clades in trees estimated from Bayesian analysis was tribution of trees by design, Maximum Likelihood assessed using posterior probability. invariably recovers a single tree that maximizes the likeli- hood function, and parsimony recovers one or several equally most-parsimonious trees. In the event of the Empirical data recovery of multiple equally most-parsimonious trees, a majority rule consensus tree was constructed from this Puttick et al. (2017) re-analysed four published morphologi- set. Majority rule consensus trees were constructed from cal matrices (Hilton & Bateman 2006; Sutton et al. 2012; the posterior sample of trees obtained with the Bayesian Nesbitt et al. 2013; Luo et al. 2015) using Bayesian, Maxi- implementation of the Mk model. Alternative consensus mum Likelihood and parsimony frameworks to identify the tree methods are available, each with their attendant inﬂuence that each method has on the support for published advantages and disadvantages (Heled & Bouckaert 2013; hypotheses. We analysed these datasets again using the same Holder et al. 2008); we selected the majority rule consen- three phylogenetic inference frameworks in addition to esti- sus method to maintain comparability among inference mating non-parametric bootstrap support for clades in these frameworks as other consensus tree construction methods four estimated topologies with both parsimony and the are not universally applicable. The Robinson–Foulds Maximum Likelihood implementation of the Mk model (1981) distance relative to the generating tree was calcu- (support being obtained intrinsically within this Bayesian lated for each of the trees estimated across all three infer- framework) and collapsed nodes with less than 50% support ence frameworks (parsimony, Maximum Likelihood and on the Maximum Likelihood and parsimony trees. As with Bayesian analysis). The Robinson–Foulds distance is well the simulated datasets, non-parametric bootstrapping was understood and rooted in set theory, as it represents the performed in TNT and RAxML for parsimony and Maxi- symmetric difference between the sets of all the clades in mum Likelihood, respectively, with 250 replicates obtained. each considered tree. The Robinson–Foulds distance Our aim is to determine whether the phylogenetic conclu- between two trees is calculated as the sum of the clades sions drawn in the original studies were contingent on the found in the ﬁrst tree but not in the second plus the sum phylogenetic method employed. of clades found in the second tree but not the ﬁrst one. Accordingly, smaller Robinson–Foulds distances are char- acteristic of more accurate phylogenetic trees (i.e. trees RESULTS that do not disagree with the generating tree). A draw- back of the Robinson–Foulds metric is that small dis- Simulations tances are expected for unresolved trees; that is, they can be achieved without precision, by consensus trees lacking In all of our analyses, competing phylogenetic methods resolution. Accordingly, to qualify whether accuracy is exhibited greater accuracy when reconstructing data from achieved with or without precision, we also measure reso- the symmetrical tree compared to data from the asym- lution (i.e. number of nodes in the recovered tree) and metrical tree. For the datasets derived from the asymmet- consider it in our interpretations. rical generating tree, the Bayesian and Maximum For each method, we integrated the effect of support on Likelihood implementations of the Mk model out- phylogenetic accuracy collapsing nodes that had < 50% performed parsimony in terms of accuracy (Table 1). support and re-calculating the Robinson–Foulds distances Support values for nodes were generally higher when relative to the generating tree (Robinson & Foulds 1981). nodes were accurately reconstructed, and this was more In these analyses, ≥ 50% support is deﬁned by a node with pronounced in analyses of datasets generated from the a posterior probability of ≥ 0.5, or a node that is present asymmetrical tree compared to those derived from the 4 PALAEONTOLOGY

TABLE 1. The median and 95% quantile of Robinson–Foulds distances for the asymmetrical and symmetrical phylogenies with 100, 350 and 1000 characters.

Bayesian ML Parsimony Bayesian ML Parsimony asymmetrical asymmetrical asymmetrical symmetrical symmetrical symmetrical

100 majority rule 29 (23–35) 47 (31–59) 35 (27–48.02) 7 (1–25) 7 (1–43.05) 7 (1–26) 100 ≥ 50% branch support 29 (23–35) 30 (24–34) 30 (26–32) 7 (1–25) 7 (2–24.02) 9 (3–27) 350 majority rule 20 (12–30) 25 (13–55) 28 (18–37) 1 (1–15.02) 1 (1–23) 1 (1–15.02) 350 ≥ 50% branch support 20 (12–30) 20 (14–31) 24 (19–30) 1 (1–15.02) 1 (1–15) 1 (1–17) 1000 majority rule 9 (3–26) 11 (3–43) 18 (9–30) 1 (1–3.02) 1 (1–5) 1 (1–3.02) 1000 ≥ 50% branch support 9 (3–26) 10 (4–27) 17 (10–28) 1 (1–3.02) 1 (1–4) 1 (1–4)

ML, Maximum Likelihood

asymmetrical symmetrical asymmetrical >50% nodes symmetrical >50% nodes 42 620 4 10 42 3 210.5 20 1 20 5 20 1 20 1 20 0.9 20 0.2 100

80

60

40

20 100 characters

0 10 3 20 10 10 9 10 3 10 2 6 0.9 30 0.1 30 0.9 30 0.1 30 0.1 30 0.1 30 0.06 100

80

60

40 support

20 350 characters

0 20 1 20 6 20 7 20 1 20 2 10 1 30 6e−3 30 0.3 30 9e−3 30 6e−2 30 2e−2 30 6e−2 100

80

60

40

20 1000 characters 0 ML Correct ML Correct ML Correct ML Correct ML Incorrect ML Incorrect ML Incorrect ML Incorrect Bayesian Correct Bayesian Correct Bayesian Correct Bayesian Correct Parsimony Correct Parsimony Correct Parsimony Correct Parsimony Correct Bayesian Incorrect Bayesian Incorrect Bayesian Incorrect Bayesian Incorrect Parsimony Incorrect Parsimony Incorrect Parsimony Incorrect Parsimony Incorrect FIG. 1. The distribution of support values on accurate and inaccurate nodes using all methods with simulated datasets summarizing support for all nodes, and nodes that have above 50% accuracy only. The boxplots show the support for accurate and inaccurate nodes across all trees, and the numbers above show the mean resolution (number of nodes) of the 1000 trees. ML, Maximum Likelihood. Colour online. O’REILLY ET AL.: MORPHOLOGICAL TREE SUPPORT 5 symmetrical generating tree (Fig. 1). Bayesian methods accuracy after nodes with < 50% support were collapsed. reconstructed the highest number of accurate nodes and had For the asymmetrical tree, the accuracy of the Maximum higher support on these nodes compared to inaccurate Likelihood and Bayesian implementations of the Mk model nodes, and accurate nodes recovered by alternative methods. overlap, but parsimony is the least accurate method (Figs 2–4; Table 1). Similar results were obtained from analysis of the data derived from the symmetrical generat- Effects of considering support values ing tree (Table 1). For both the symmetrical and asymmet- rical trees, accuracy increases with dataset size, and the Both the Maximum Likelihood implementation of the Mk Bayesian and Maximum Likelihood Mk implementations model and parsimony produced trees with increased achieve very high accuracy at datasets with 1000 characters.

Maximum EW Bayesian Likelihood Parsimony 60 50 increasing precision low density (more resolved nodes) 40 high density 30

20 increasing accuracy

Maj. Rule 10 (smaller RF distance) 0 50 40

100 Characters 30 20 10 50% support 0 50 40 30 20

Maj. Rule 10 0 50 40

350 Characters 30 20 10 50% support Robinson Foulds distance Robinson Foulds 0 50 40 30 20

Maj. Rule 10 0 50 40 30 1000 Characters 20 10 50% support 0 0 5 10 15 20 25 0 5 10 15 20 25 0 5 10 15 20 25 30 Resolution (resolved nodes) FIG. 2. Contour plots of resolution against Robinson–Foulds distance for data (100, 350, 1000 characters) simulated on the asym- metrical tree before and after nodes with less than 50% support are collapsed. Performance of Maximum Likelihood and Parsimony improves after these nodes are collapsed, but both Maximum Likelihood and Bayesian surpass parsimony in accuracy. EW, equal weights. Colour online. 6 PALAEONTOLOGY

Bayesian majority rule consensus trees were generally the Bayesian implementation resolves more inaccurate more resolved than the 50% support trees obtained from nodes than does either the Maximum Likelihood Mk Maximum Likelihood topology estimates; the 50% sup- implementation or parsimony. Median support for correct port trees estimated using parsimony were the most con- nodes is higher in Bayesian, as opposed Maximum Likeli- servative (Table 2). This trend was also observed in the hood analyses, across all dataset sizes, and this trend is par- results of analyses of 100 character datasets derived from ticularly evident on the asymmetric trees (Fig. 1). The the symmetrical generating tree. However, all methods median support for correct nodes is higher than for incor- yielded more fully resolved 50% support trees based on rect nodes in trees derived from the Bayesian Mk imple- 350 and 1000 character datasets. mentation, and over all methods (Fig. 1). Bayesian support The Bayesian Mk implementation resolves a higher num- is higher for incorrect nodes compared to Maximum Likeli- ber of correct nodes across all analyses (Fig. 1), and these hood and parsimony (Fig. 1). There is a clear difference in have higher support compared to the Maximum Likelihood the levels of support for correct and incorrect nodes in the implementation and parsimony (Fig. 5). However, overall, Bayesian trees, with correct nodes generally achieving

majority rule consensus nodes with > 50% support 0.13 0.73

0.09 0.48

0.04 0.24

asymmetric 0.00 0.00 0.09 0.09

0.06 0.06 100 characters 0.03 0.03 symmertric 0.00 0.00 0.08 0.10 0.05 0.07 0.03 0.03 asymmetric 0.00 0.00 1.38 1.41 density density

0.92 0.94 350 characters 0.46 0.47 symmetric 0.00 0.00 0.09 0.10

0.06 0.07

0.03 0.03 asymmetric 0.00 0.00 0.74 Bayesian 0.74 Maximum Likelihood 0.49 EW Parsimony 0.49

1000 characters 0.25 0.25 symmetric 0.00 0.00 01020304050600 102030405060 Robinson–Foulds distance FIG. 3. Distribution of Robinson–Foulds distance for all datasets on the asymmetrical and symmetrical phylogenies. After collapsing nodes with < 50% support Maximum Likelihood and Bayesian achieve similar values, and both methods out-perform parsimony. EW, equal weights. Colour online. O’REILLY ET AL.: MORPHOLOGICAL TREE SUPPORT 7

characters 100 350 1000 350 characters 1000 characters 100 characters Bayesian Maximum Likelihood Parsimony

5075 100 5075 100 5075 100

support values of correct nodes 0 0 0 100 100 100 correct node found (%)

FIG. 4. The degree of accuracy (barplots at tips) and support (density plots at nodes) for each method and all datasets on the asym- metrical phylogeny. Performance of all methods increases with dataset size, and Bayesian methods tend to ﬁnd higher support for cor- rect nodes with datasets containing 100 characters. Colour online. greater than 80% support and incorrect nodes exhibiting implementation resolves only correct nodes (Fig. 1); the less than 80% support. Applying an 80% threshold for Maximum Likelihood Mk implementation and parsimony dataset sizes of 350 and above, the Bayesian Mk do not match this level of support. 8 PALAEONTOLOGY

TABLE 2. The median and 95% quantile of resolution for the asymmetrical and symmetrical phylogenies with 100, 350, and 1000 characters.

Bayesian ML Parsimony Bayesian ML Parsimony asymmetrical asymmetrical asymmetrical symmetrical symmetrical symmetrical

100 majority rule 8 (1–15) 30 (30–30) 16 (2–28) 27 (7–30) 30 (30–30) 27 (6.98–30) 100 ≥ 50% branch support 8 (1–15) 6 (1–12) 3 (1–7) 27 (7–30) 26 (7.98–30) 22 (4.98–28) 350 majority rule 19 (4–25) 30 (30–30) 24 (4–30) 30 (16–30) 30 (30–30) 30 (16–30) 350 ≥ 50% branch support 19 (4–25) 15 (4–23) 9 (2–15) 30 (16–30) 30 (16–30) 30 (14–30) 1000 majority rule 26.5 (8–30) 30 (30–30) 28 (7.98–30) 30 (28–30) 30 (30–30) 30 (28–30) 1000 ≥ 50% branch support 26.5 (8–30) 25 (8–29) 17 (4.98–24) 30 (28–30) 30 (28–30) 30 (27.98–30)

ML, Maximum Likelihood

Empirical data (Fig. 7). However, there were changes in the certainty of placement of Nyasasaurus after support was assessed, col- There were substantial differences between the empirical lapsing from membership of Theropoda to Saurischia in topologies derived from the Maximum Likelihood Mk the Maximum Likelihood Mk analyses, and from Thero- implementation and the parsimony framework, before poda to Dinosauria in parsimony analyses. Neither the and after the collapse of nodes with < 50% support Maximum Likelihood Mk implementation nor parsimony (Figs 6, 7; O’Reilly et al. 2017, ﬁgs S1, S2). Smaller data- recovered Saurischia, Ornithischia, Theropoda or Sau- sets showed the largest differences in topology and the ropoda, though these clades were resolved by the Baye- placement of key taxa whether support is accounted for sian Mk implementation (Fig. 7). A similar pattern is by collapsing poorly supported nodes or not. seen in the re-analysis of the dataset from Luo et al. For the smaller datasets, there was congruence between (2015). All trees, before and after accounting for support topologies estimated with different methods after poorly in the ﬁnal topology, resolved Haramiyavia outside supported nodes were collapsed. Kulindroplax, from Sut- crown-Mammalia and the multiturberculates (O’Reilly ton et al. (2012), was not supported as a crown-mollusc et al. 2017, ﬁg. S2). with any method when < 50% supported nodes are col- Analyses of the smallest dataset (34 characters, 48 taxa), lapsed (Fig. 6). This contrasts with the crown-mollusc from Sutton et al. (2012), recovered only seven nodes in afﬁnity of Kulindroplax in the Maximum Likelihood Mk the Bayesian majority rule consensus tree, but more resolu- estimated and most-parsimonious trees (Fig. 6). A similar tion was achieved in the Maximum Likelihood Mk estimate pattern was observed in the topologies estimated from the (32 nodes) and most-parsimonious trees (17 nodes). After data of Hilton & Bateman (2006). Both the Maximum collapsing poorly-supported nodes, a similar level of reso- Likelihood implementation of the Mk model and the par- lution was achieved by all methods: Maximum Likelihood simony framework supported the anthophyte hypothesis Mk implementation (8 nodes) and parsimony (6 nodes). in the respective optimal trees. After collapsing nodes This contrasts with the results obtained from analyses of with < 50% support, both methods yielded topologies the other, larger empirical datasets where, after collapsing that are more congruent with the Bayesian majority rule nodes with less than 50% support, the Bayesian Mk imple- consensus tree, with a polytomy uniting, but not differen- mentation consistently yielded trees with the greatest reso- tiating between, gymnosperms, seed ferns and angios- lution: a pattern opposite to that seen in comparison of the perms (O’Reilly et al. 2017, ﬁg. S1). optimal trees derived from the three methods of phyloge- Collapsing poorly supported nodes in trees estimated netic inference (Table 3). from the empirical datasets had less impact on the place- ment of key taxa in datasets with larger numbers of char- acters. Within the majority rule consensus tree obtained DISCUSSION from the Bayesian implementation of the Mk model, Nyasaurus was resolved in a polytomy with the major After incorporating estimates of node support, Parsimony clades of Dinosauria (Saurischia, Ornithischia; but see is outperformed by both Maximum Likelihood and Baye- Baron et al. 2017), as was shown by Nesbitt et al. (2013). sian implementations of the Mk model, providing further Nyasasurus was also resolved as a member of Dinosauria support for the use of stochastic models of character in the Maximum Likelihood Mk estimate and most-parsi- change in morphological data analyses (Wright & Hillis monious trees; this conclusion was not impacted by col- 2014; O’Reilly et al. 2016; Puttick et al. 2017). As shown by lapsing nodes with less than 50% bootstrap support Brown et al. (2017), Bayesian and Maximum Likelihood O’REILLY ET AL.: MORPHOLOGICAL TREE SUPPORT 9

Asymmetrical Symmetrical Correct Incorrect Correct Incorrect

0.25 0.20 0.15 0.10 100 0.05 0.00 0.4 0.3 0.2 350 0.1 0.0 Bayesian 0.5 0.4 0.3 0.2

1000 0.1 0.0 0.25 0.20 0.15 0.10 100 0.05 0.00 0.4 0.3 0.2 350 0.1 0.0 Frequency 0.5 0.4 0.3 0.2 Maximum Likelihood Maximum

1000 0.1 0.0 0.25 0.20 0.15 0.10 100 0.05 0.00 0.4 0.3 0.2 350 0.1 0.0

Parsimony 0.5 0.4 0.3 0.2

1000 0.1 0.0 90 60 90 50 80 50 80 50 70 60 70 80 90 70 60 70 90 60 80 50 100 100 100 clade support (%)

FIG. 5. Frequency of clade support for accurate and inaccurate branches for all methods and dataset sizes. Bayesian and Maximum Likelihood methods show higher clade support for correct nodes, and lower support for inaccurate nodes with increasing dataset size. Bayesian trees have highest support compared to all other methods on the asymmetrical and symmetrical phylogenies. Colour online.

implementations of the Mk model achieve similar levels of values on correct nodes are highest for the Bayesian imple- accuracy after collapsing weakly-supported branches on mentation. Though bootstrapping increases the accuracy of Maximum Likelihood trees (Fig. 2). However, after node the Maximum Likelihood Mk implementation to a level support is considered, the Bayesian Mk implementation similar to the Bayesian implementation (Brown et al. recovers more correct nodes with higher support than does 2017), Bayesian posterior probabilities are still higher for the Maximum Likelihood implementation, and support correct nodes. 10 PALAEONTOLOGY

A Bayesian BCMaximum Likelihood parsimony Mickwitzia Mickwitzia Lingula Lingula Lingula Mickwitzia Neopilina Orthrozanclus Halkieria Wiwaxia 0.08 Neocrania Neocrania Rhombichiton Odontogriphus 1 Orthrozanclus Permochiton 0.5 0.46 Wiwaxia Wiwaxia Pedanochiton Halkieria Odontogriphus 0.31 0.03 0.2 Orthrozanclus 0.08 Neopilina Neopilina Odontogriphus Babinka Babinka 0.4 0.29 Pojetaia Matthevia 0.04 0.53 Fordilla 0.58 Leptochiton 0.83 Pojetaia 0.52 Fordilla Hanleya Echinochiton Chaetoderma Halkieria 0.12 Phthipodochiton Matthevia Gryphochiton Chaetoderma Epimenia 0.06 0.1 Glaphurochiton 0.51 Epimenia Heloplax Epimenia Kulindroplax * Enetoplax 0.24 0.06 0.03 0.54 Echinochiton Matthevia 0.73 Acaenoplax Cryptoplax 0.12 Heloplax Phthipodochiton 0.45 Enetoplax Kulindroplax * Chaetoderma 0.91 Acutichiton 0.1 0.15 Acaenoplax Rhombichiton Acanthopleura Pedanochiton 0.02 Permochiton all nodes Phthipodochiton Permochiton Pedanochiton Kulindroplax * Acutichiton 0 Hanleya 0.06 0.02 G1yphochiton Neocrania 0.02 Robustum Septemchiton 0.68 Septemchiton Cryptoplax 0.84 Robustum 0.02 G1yphochiton 0.04 Acutichiton Strobilepis Leptochiton Acanthopleu1a 0.98 Polysaccos 0.04 0.3 Glaphurochiton Septemchiton Babinka Hanleya 0.54 Robustum Pojetaia 0.04 Cryptoplax Leptochiton 0.63 0.1 0.7 Fordilla Acanthopleu1a 0.14 Glaphurochiton Heloplax 0.07 Rhombichiton Echinochiton Enetoplax 0.1 Strobilepis Strobilepis 0.82 0.27 0.18 DEAcaenoplax 0.99 Polysaccos F 0.83 Polysaccos Mickwitzia Mickwitzia Neopilina Lingula Lingula Wiwaxia Neopilina Neopilina Odontogriphus Wiwaxia Kulindroplax * Orthrozanclus Rhombichiton Matthevia Leptochiton Permochiton Phthipodochiton Glaphurochiton Pedanochiton Leptochiton Echinochiton Orthrozanclus Glaphurochiton Rhombichiton Odontogriphus Hanleya Permochiton Matthevia Cryptoplax Pedanochiton Leptochiton Rhombichiton Hanleya Hanleya Acanthopleu1a G1yphochiton Halkieria G1yphochiton Cryptoplax Gryphochiton Acutichiton Acutichiton Glaphurochiton Permochiton Acanthopleu1a Kulindroplax * Epimenia 0.5 Pedanochiton 0.54 Echinochiton Echinochiton Phthipodochiton Cryptoplax Halkieria Matthevia Chaetoderma Odontogriphus Epimenia Acutichiton Wiwaxia Chaetoderma Acanthopleura Orthrozanclus Halkieria Phthipodochiton Neocrania Neocrania Kulindroplax * Chaetoderma Mickwitzia Neocrania 0.51 Epimenia Lingula Septemchiton Strobilepis Septemchiton 0.84 Robustum 0.99 Polysaccos 0.54 Robustum Strobilepis Robustum Strobilepis 0.98 Polysaccos 0.68 Septemchiton 0.83 Polysaccos Babinka Babinka Babinka Pojetaia Fordilla Pojetaia 0.63 0.53 Pojetaia 0.58 nodes < 0.5 support collapsed 0.7 Fordilla 0.83 0.52 Fordilla Heloplax Enetoplax Heloplax Enetoplax Acaenoplax Enetoplax 0.82 0.91 0.73 GHAcaenoplax Heloplax I Acaenoplax Pojetaia Mickwitzia Pojetaia Fordilla Babinka Fordilla Babinka Neopilina Babinka Neopilina Kulindroplax * Neopilina Wiwaxia Matthevia Wiwaxia Rhombichiton Chaetoderma Odontogriphus Permochiton Epimenia Orthrozanclus Pedanochiton Phthipodochiton Heloplax Orthrozanclus Leptochiton Enetoplax Odontogriphus Glaphurochiton Acaenoplax Matthevia Hanleya Septemchiton Leptochiton Cryptoplax Robustum Hanleya Rhombichiton Leptochiton Halkieria Acanthopleu1a Glaphurochiton Gryphochiton G1yphochiton Echinochiton Glaphurochiton Acutichiton Rhombichiton Epimenia Robustum Permochiton Echinochiton Septemchiton Pedanochiton Cryptoplax Permochiton Hanleya Chaetoderma Pedanochiton G1yphochiton Acutichiton Echinochiton Cryptoplax Acanthopleura Halkieria Acutichiton Phthipodochiton Odontogriphus Acanthopleu1a Kulindroplax * Wiwaxia Kulindroplax * Neocrania Orthrozanclus Phthipodochiton Mickwitzia Neocrania Matthevia Lingula Lingula Epimenia Septemchiton Fordilla Chaetoderma 0.84 Robustum 0.83 Pojetaia Halkieria Strobilepis Strobilepis Neocrania Polysaccos Polysaccos nodes < 0.8 support collapsed 0.98 0.99 Mickwitzia Heloplax Enetoplax Lingula Enetoplax Acaenoplax Strobilepis 0.82 0.91 Acaenoplax Heloplax 0.83 Polysaccos FIG. 6. Bayesian, Maximum Likelihood and parsimony topologies of the data from Sutton et al. (2012). The optimal tree with node support is shown for each method (A–C) as well as the topologies that are produced after nodes with < 0.5 support are collapsed to polytomies (D–F), and when nodes with < 0.8 support are collapsed (G–I). After these nodes are collapsed, no method supports a crown-group mollusc afﬁliation for Kulindroplax. Colour online.

Incorporating support, probabilistic methods out-perform Bayesian phylogenetics) achieve higher accuracy than does parsimony, and the accuracy of ML improves parsimony (Wright & Hillis 2014; O’Reilly et al. 2016; Puttick et al. 2017). Similar observations have been made In line with previous results, probabilistic methods that in analyses of molecular data using probabilistic versus implement the Mk model (Maximum Likelihood and parsimony methods (Felsenstein 1978). O’REILLY ET AL.: MORPHOLOGICAL TREE SUPPORT 11

A Bayesian BCMaximum Likelihood parsimony Nyasasaurus * Pisanosaurus Pisanosaurus

Pisanosaurus 0.91 Heterodontosaurus 0.7 Heterodontosaurus

0.81 Heterodontosaurus 0.43 Lesothosaurus 0.4 Scutellosaurus 0.53 0.82 Scutellosaurus Eocursor 0.27 Lesothosaurus 0.2 0.82 Lesothosaurus Scutellosaurus Eocursor Eocursor Saturnalia Saturnalia

Saturnalia 0.89 Plateosaurus 0.93 Efraasia 0.88 0.75 0.99 Efraasia Efraasia Plateosaurus 0.74 Plateosaurus 0.51 Herrerasaurus 0.21 Herrerasaurus 0.73 Herrerasaurus 0.33 Staurikosaurus 0.16 Staurikosaurus

all nodes Eoraptor Eoraptor 0.72 Staurikosaurus 0.33 0.1 Eoraptor Nyasasaurus * 0.53 0.38 Tawa 0.15 Coelophysis 0.74 Tawa 0.4 0.19 Tawa

0.77 Coelophysis 0.32 Dilophosaurus 0.39 Coelophysis Nyasasaurus * 0.79 Dilophosaurus 0.3 0.42 Dilophosaurus

0.82 Velociraptor 0.14 Velociraptor 0.42 Velociraptor 0.97 0.94 0.77 D Allosaurus E Allosaurus F Allosaurus Nyasasaurus * Pisanosaurus Dilophosaurus Pisanosaurus Heterodontosaurus Coelophysis 0.91 0.0081 Heterodontosaurus Eocursor Tawa Nyasasaurus * 0.0082 Scutellosaurus 0.53 Scutellosaurus Eoraptor 0.0082 Lesothosaurus Lesothosaurus Eocursor Coelophysis Staurikosaurus Saturnalia Nyasasaurus * Herrerasaurus

0.0099 Efraasia Dilophosaurus Velociraptor 0.0074 Plateosaurus Tawa 0.77 Allosaurus 0.0073 Herrerasaurus Eoraptor Saturnalia 0.51 0.93 0.0072 Staurikosaurus Staurikosaurus Efraasia Eoraptor 0.75 0.0053 Herrerasaurus Plateosaurus Velociraptor 0.0074 Tawa Scutellosaurus 0.94 0.0077 Coelophysis Allosaurus Lesothosaurus Dilophosaurus 0.0079 Saturnalia 0.7 Eocursor nodes < 0.5 support collapsed 0.0082 Velociraptor 0.89 Plateosaurus Heterodontosaurus 0.0097 Allosaurus 0.88 Efraasia Pisanosaurus G H I Nyasasaurus * Coelophysis Velociraptor Coelophysis Nyasasaurus * Allosaurus Tawa Dilophosaurus Dilophosaurus Eoraptor Tawa Coelophysis Staurikosaurus Eoraptor Tawa Herrerasaurus Staurikosaurus Nyasasaurus * 0.51 Efraasia Herrerasaurus Eoraptor Velociraptor 0.99 Plateosaurus Staurikosaurus Saturnalia 0.94 Allosaurus Herrerasaurus Dilophosaurus Saturnalia Scutellosaurus

0.82 Velociraptor 0.89 Plateosaurus Lesothosaurus 0.97 Allosaurus 0.88 Efraasia Eocursor Pisanosaurus Pisanosaurus Heterodontosaurus

0.81 Heterodontosaurus Heterodontosaurus Pisanosaurus

0.82 Scutellosaurus 0.91 Eocursor Efraasia nodes < 0.8 support collapsed 0.82 Lesothosaurus Scutellosaurus 0.93 Plateosaurus Eocursor Lesothosaurus Saturnalia FIG. 7. Bayesian, Maximum Likelihood, and parsimony topologies of the data from Nesbitt et al. (2013). The optimal tree with node support is shown for each method (A–C) as well as the topologies that are produced after nodes with < 0.5 support are collapsed to polytomies (D–F), and when nodes with < 0.8 support are collapsed (G–I). Only Bayesian (A) recovers the major clades of Dinosauria (Ornithischia, Saurischia, Sauropoda, Theropoda). Colour online.

After considering support values, the Bayesian and Maximum Likelihood in a number of ways. The Bayesian Maximum Likelihood implementations of the Mk model implementation recovers more nodes with higher support achieve similar levels of accuracy (Brown et al. 2017) but compared to the Maximum Likelihood implementation, the Bayesian implementation remains superior to but the Bayesian method also recovers more incorrect 12 PALAEONTOLOGY

TABLE 3. Resolution of the three methods on empirical trees were analysed with the Mk model. However, it could be before and after nodes with less than 50% support are collapsed. argued that the use of a stochastic Markov process to Bayesian Maximum Parsimony generate data will bias results toward the preference for Likelihood the use of a stochastic Markov model for phylogenetic inference. Further, the stochastic process used for simula- Sutton Optimal 73217 tion assumes that the evolutionary rates of different char- (34 taxa, tree acters are independent, in addition to assuming that the 48 characters) Collapsed 78 6 rates of evolution at different characters across the tree tree are proportional. However, the stochastic model we Hilton Optimal 28 46 38 employ to generate data does produce realistic distribu- (48 taxa, tree 82 characters) Collapsed 28 15 13 tions of morphological characters at the tips of the tree, tree demonstrating that this approach to simulation is valid. Nesbitt Optimal 72 80 72 Also, there is no obvious model for the evolution of dis- (82 taxa, 413 tree crete morphological characters. characters) Collapsed 72 63 47 tree Luo (114 taxa, Optimal 92 112 109 Comparison of support on Bayesian and Maximum 497 characters) tree Likelihood phylogenies Collapsed 92 69 69 tree We used results from simulation analyses to re-assess the empirical phylogenies: we presented topologies on which nodes overall than does Maximum Likelihood (Fig. 5). nodes with only 80% posterior probabilities or bootstrap Thus, the Maximum Likelihood implementation also pro- support are presented (Figs 6, 7). For most analyses, the duces a more topologically conservative tree, as it recov- consensus trees bring congruence between the Bayesian ers fewer nodes overall (Fig. 1), which inevitably equates and Maximum Likelihood Mk implementations, which is to improved Robinson–Foulds scores (Fig. 2). Support on similar to the pattern seen with the simulated data (Brown the remaining correct nodes is not as high in trees et al. 2017). For one of the larger matrices analysed (Luo derived using the Maximum Likelihood Mk implementa- et al. 2015), resolving only nodes with over 80% support tion as it is for the Bayesian implementation; this trend has little inﬂuence on the overall conclusions as Haramiya- is particularly pronounced at posterior probabilities of via is still recovered outside multiturberculates in the 0.8 and above in which Bayesian only recovers correct analyses performed using the Bayesian and Maximum nodes. Likelihood implementations of the Mk model. These clades As the number of analysed characters increases, the cannot be distinguished in the parsimony analyses. Similar accuracy and resolution of all methods also increase, results are seen in the analyses of the dataset from Hilton irrespective of tree symmetry. Despite this, the Maximum & Bateman (2006): no method supports the anthophyte likelihood and Bayesian implementations of the Mk model hypothesis, but parsimony does not separate pteridosper- still outperform Parsimony when a large number of char- mous taxa from gymnosperms, seed ferns and angiosperms acters are analysed. This general trend of improved accu- (O’Reilly et al. 2017, ﬁg. S1). Presenting only nodes with racy and resolution should be expected from the stochastic 80% support brings congruence between all methods for models as they are statistically consistent, but the relatively the re-analyses of Nesbitt et al. (2013) and Sutton et al. accurate performance of Parsimony is less expected. These (2012). Nyasasurus from Nesbitt et al. (2013) is placed in a results suggest that no matter which method is applied to polytomy with basal dinosaurs in Bayesian, Maximum a dataset, it should be a goal for morphological datasets to Likelihood and parsimony analyses (Fig. 7). Bayesian and include as many characters as possible if the most accurate Maximum Likelihood can only resolve three nodes, and estimates of topology are to be obtained. parsimony only one node, from the reanalysis of the data- Here, a stochastic Markov process was used to simulate set from Sutton et al. (2012) in which no method can categorical character data, facilitating a comparison of the resolve the position of Kulindroplax (Fig. 6). efﬁcacy of probabilistic (Bayesian, Maximum Likelihood) and non-probabilistic models (parsimony) of evolution when morphological data are analysed. To ensure that Differences between the Maximum Likelihood and Bayesian our data generation process did not favour the probabilis- Mk implementations tic Mk model we enforced violation of the assumptions of this model in the simulation procedure, resulting in a Difference in performance between Maximum Likelihood suitable level of model misspeciﬁcation when those data and Bayesian inference, when poorly supported nodes are O’REILLY ET AL.: MORPHOLOGICAL TREE SUPPORT 13 collapsed, is surprising given that the primary difference DATA ARCHIVING STATEMENT between these methods is the scaling of the likelihood by the prior distribution in the Bayesian implementation. It Data for this study are available in the Dryad Digital Repository: is therefore possible that the prior distribution placed on https://doi.org/10.5061/dryad.8dd39 the branch lengths may be a cause of difference in perfor- mance. As topology and branch lengths are jointly Editor. Imran Rahman estimated in the probabilistic framework it is entirely pos- sible that the prior distribution placed on the branch REFERENCES lengths is inﬂuencing the estimation of topology. A fur- ther potential cause of discrepancies between Maximum BARON, M. G., NORMAN, D. B. and BARRETT, P. M. Likelihood and Bayesian inference is the partitioning of 2017. A new hypothesis of dinosaur relationships and early characters by the number of possible states they exhibit. dinosaur evolution. Nature, 543, 501–506. MrBayes automatically partitions characters by the num- BROWN, J. W., PARINS-FUKUCHI, C., STULL, G. W., ber of implied character states, constructing a transition VARGAS, O. M. and SMITH, S. A. 2017. Bayesian matrix of the appropriate dimensions for those characters. and likelihood phylogenetic reconstructions of morphologi- Conversely, RAxML appears to construct a single transi- cal traits are not discordant when taking uncertainty into con- tion matrix that is applied to all characters, which may be sideration. A comment on Puttick et al. Proceedings of the a contributing factor to differences between these meth- Royal Society B, published online 11 October. https://doi.org/ 10.1098/rspb.2017.0986 ods. These differences may also be caused by errors intro- DOUADY, C. J., DELSUC, F., BOUCHER, Y., DOOLIT- duced by the efﬁciency of the algorithm applied to search TLE, W. F. and DOUZERY, E. J. P. 2003. Comparison of for the Maximum Likelihood tree or the rapid bootstrap Bayesian and Maximum Likelihood bootstrap measures of method used to calculate support for nodes in the Maxi- phylogenetic reliability. Molecular Biology & Evolution, 20, mum Likelihood topology estimate. 248–254. FELSENSTEIN, J. 1978. Cases in which parsimony or com- patibility methods will be positively misleading. Systematic CONCLUSIONS Zoology, 27, 401–410. -1985. Conﬁdence limits on phylogenies: an approach using 39 – When estimating phylogenetic relationships from mor- the bootstrap. Evolution, , 783 791. GOLOBOFF, P. A., FARRIS, J. S. and NIXON, K. C. 2008. phological data, the parsimony criterion is not as accurate TNT, a free program for phylogenetic analysis. Cladistics, 24, as the stochastic Mk model, whether clade support values 774–786. are considered or not. In contrast to most previous analy- -TORRES, A. and ARIAS, J. S. 2017. Weighted parsi- ses, and following Brown et al. (2017), we ﬁnd that when mony outperforms other methods of phylogenetic inference accounting for clade support values the Maximum Likeli- under models appropriate for morphology. Cladistics, pub- hood implementation of the Mk model achieves similar lished online 4 June. https://doi.org/10.1111/cla.12205 overall accuracy to the Bayesian implementation of the GUILLERME, T. and COOPER, N. 2016. Effects of missing Mk model, albeit with Maximum Likelihood producing data on topological inference using a Total Evidence less-resolved phylogenies. Our simulations indicate that approach. Molecular Phylogenetics & Evolution, 94, 146–158. the Bayesian implementation of the Mk model estimates HASEGAWA, M., KISHINO, H. and YANO, T. A. 1985. higher support for correct nodes compared to Maximum Dating of the human ape splitting by a molecular clock of mitochondrial-DNA. Journal of Molecular Evolution, 22, 160– Likelihood. Therefore, we advocate the Maximum Likeli- 174. hood or Bayesian implementations of the Mk model, in HELED, J. and BOUCKAERT, R. R. 2013. Looking for trees place of parsimony, for phylogenetic analyses based on in the forest: summary tree from posterior samples. BMC Evo- discrete morphological data. lutionary Biology, 13, 221. HILTON, J. and BATEMAN, R. M. 2006. Pteridosperms are Acknowledgements. We thank the other members of the Bristol the backbone of seed-plant phylogeny. Journal of the Torrey Palaeobiology Research Group for discussion. Our work is sup- Botanical Society, 133, 119–168. ported by BBSRC (BB/N000919/1), NERC (NE/P013678/1; NE/ HOLDER, M. T., SUKUMARAN, J. and LEWIS, P. O. N002067/1), and a Royal Society Wolfson Merit Award (PCJD). 2008. A justiﬁcation for reporting the majority-rule consensus We would like to thank three anonymous reviewers for com- tree in Bayesian phylogenetics. Systematic Biology, 57, 814– ments that improved this manuscript. 821. HUELSENBECK, J. P., RONQUIST, F., NIELSEN, R. Author contributions. JEO’R and MNP analysed all data; JEO’R, and BOLLBACK, J. P. 2001. Bayesian inference of phy- MNP, DP, and PCJD interpreted results and wrote the manu- logeny and its impact on evolutionary biology. Science, 294, script. 2310–2314. 14 PALAEONTOLOGY

LEWIS, P. O. 2001. A likelihood approach to estimating phy- Uncertain-tree: discriminating among competing approaches logeny from discrete morphological character data. Systematic to the phylogenetic analysis of phenotype data. Proceedings of Biology, 50, 913–925. the Royal Society B, 284, 20162290. LUO, Z. X., GATESY, S. M., JENKINS, F. A., AMARAL, ROBINSON, D. F. and FOULDS, L. R. 1981. Comparison W. W. and SHUBIN, N. H. 2015. Mandibular and dental of phylogenetic trees. Mathematical Biosciences, 53, 131–147. characteristics of Late Triassic mammaliaform Haramiyavia RONQUIST, F., TESLENKO, M., VAN DER MARK, P., and their ramiﬁcations for basal mammal evolution. Proceed- AYRES, D. L., DARLING, A., HOHNA,€ S., LARGET, ings of the National Academy of Sciences, 112, E7101–E7109. B., LIU, L., SUCHARD, M. A. and HUELSENBECK, J. NESBITT, S. J., BARRETT, P. M., WERNING, S., P. 2012. MrBayes 3.2: efﬁcient Bayesian phylogenetic inference SIDOR, C. A. and CHARIG, A. J. 2013. The oldest dino- and model choice across a large model space. Systematic Biol- saur? A Middle Triassic dinosauriform from Tanzania. Biology ogy, 61, 539–542. Letters, 9, 20120949. SANDERSON, M. J. and DONOGHUE, M. J. 1996. The O’ REILLY, J. E., PUTTICK, M. N., PARRY, L. A., TAN- relationship between homoplasy and conﬁdence in a phyloge- NER, A. R., TARVER, J. E., FLEMING, J., PISANI, D. netic tree. 67–89. In SANDERSON, M. J. and HUF- and DONOGHUE, P. C. J. 2016. Bayesian methods outper- FORD, L. (eds). Homoplasy: the recurrence of similarity in form parsimony but at the expense of precision in the estima- evolution. Academic Press. tion of phylogeny from discrete morphological data. Biology STAMATAKIS, A. 2014. RAxML version 8: a tool for phylo- Letters, 12, 20160081. genetic analysis and post-analysis of large phylogenies. Bioin- -- PISANI, D. and DONOGHUE, P. C. J. 2017. formatics, 30, 1312–1313. Data from: Probabalistic methods surpass parsimony when SUTTON, M. D., BRIGGS, D. E., SIVETER, D. J., SIV- assessing clade support in phylogenetic analyses of discrete ETER, D. J. and SIGWART, J. D. 2012. A Silurian morphological data. Dryad Digital Repository. https://doi.org/ armoured aplacophoran and implications for molluscan phy- 10.5061/dryad.8dd39 logeny. Nature, 490,94–97. PUTTICK, M. N., O’ REILLY, J. E., TANNER, A. R., WRIGHT, A. M. and HILLIS, D. M. 2014. Bayesian analysis FLEMING, J. F., CLARK, J., HOLLOWAY, L., using a simple likelihood model outperforms parsimony for LOZANO-FERNANDEZ, J., PARRY, L. A., TARVER, estimation of phylogeny from discrete morphological data. J. E., PISANI, D. and DONOGHUE, P. C. 2017. PLoS One, 9, e109210.