
Chapter 3

The Impact of Unstable Taxa in Coelurosaurian Phylogeny and Resampling Support Measures for Parsimony Analyses

DIEGO POL¹ AND PABLO A. GOLOBOFF²

¹ CONICET, Museo Paleontológico Egidio Feruglio, Trelew, Argentina.
² CONICET, Unidad Ejecutora Lillo (UEL), S.M. Tucumán, Argentina.

ABSTRACT

Paleontological datasets often have large amounts of missing entries that result in multiple most parsimonious trees. Highly incomplete and conflictive taxa produce a collapsed strict consensus and several methods have been developed for identifying these unstable or rogue taxa in optimal trees derived from phylogenetic analyses. In addition to decreasing consensus resolution, incomplete or conflictive taxa can also severely affect the support values of phylogenetic analysis in paleontological datasets. Here, we explore a protocol for the identification of taxa that decrease jackknife support values in parsimony analysis. The taxa identified are excluded from majority rule jackknife trees, revealing nodes that have either low or high support irrespective of the uncertainties in the placement of unstable taxa. A recently published dataset of coelurosaurian relationships based on 164 taxa and 853 characters is explored using this protocol; our protocol detects a total of 40 unstable taxa as the most detrimental for node supports. Major clades that are well supported in the reduced jackknife tree include Coelurosauria, Maniraptoriformes, Compsognathidae, Ornithomimosauria, Alvarezsauroidea, Therizinosauria, Oviraptorosauria. Clades with moderate support instead include Maniraptora, Pennaraptora, Paraves, Dromaeosauridae, Troodontidae, Anchiornithinae, and early-diverging clades of Avialae.

INTRODUCTION

Morphological datasets that include a large number of extinct taxa are usually characterized by copious amounts of missing entries. The abundance of missing data in these datasets has been regarded as problematic for phylogenetic analyses, since the early days of quantitative cladistics (Gauthier, 1986; Wilkinson and Benton, 1995). The presence of taxa with abundant missing entries has been linked to searches that find multiple optimal trees in parsimony analyses and the related computational difficulties of dealing with thousands of trees (which were very problematic for early phylogenetic software). A subsequent problem in these cases is how the multiple optimal trees can be efficiently summarized given that the strict consensus is usually highly collapsed due to the alternative positions of wildcard or rogue taxa. The role of reduced consensus methods (Wilkinson, 1994) has become increasingly important in recent years, and several methods have been proposed and implemented for detecting rogue taxa in a collection of optimal trees (Goloboff et al., 2008; Pol and Escapa, 2009; Goloboff and Szumik, 2015).

98 BULLETIN AMERICAN MUSEUM OF NATURAL HISTORY NO. 440

A second level of problems introduced by the presence of copious missing entries is related to their influence in support values. It is commonly the case that highly incomplete taxa can be more easily placed in alternative positions than more complete taxa. This obviously affects not only the support with which a taxon is placed in the phylogenetic tree but also the support values of many adjacent nodes (Wilkinson, 1996; Wilkinson et al., 2000). The empirical outcome of this is that paleontological datasets are also characterized by the presence of low support values for most of the nodes recovered in the consensus tree. Some methods exist for assessing the role of rogue taxa for parsimony measures such as Bremer or decay support (e.g., double decay; Wilkinson et al., 2000). Also, alternative resampling methods that do not eliminate characters and thus produce lower estimates of support only in the presence of actual character conflict have been explored (e.g., the “nozeroweight” option for resampling in TNT, used by Pei et al., in press). Recently, however, most efforts have been focused on the development of ways to detect wildcard taxa that decrease bootstrap support (Pattengale et al., 2011; Aberer and Stamatakis, 2011; Aberer et al., 2013).

Here, we explore the application of a more detailed protocol that identifies unstable taxa that decrease support measures, based on resampling procedures (i.e., jackknife or bootstrap) implemented with a script for TNT (Goloboff et al., 2008), that combines several of the options for identifying rogue taxa that already exist in that program. We employ this procedure for a comprehensive phylogenetic analysis of Coelurosauria, using the Theropod Working Group (TWiG) matrix published by Pei et al. (in press). This dataset has an extensive taxon sampling (164 taxa), and therefore provides an ideal case for testing the impact of fragmentary taxa on support measures. In particular, the TWiG dataset includes a dense sampling of pennaraptoran coelurosaurians that have been the focus of recent systematic debates, including the interrelationships of some of its major clades as well as the affinities of some small but conflictive groups such as scansoriopterygids (Xu et al., 2011, 2015, 2017; Turner et al., 2012; Agnolín and Novas, 2013; O’Connor and Sullivan, 2014) and unenlagiines (Turner et al., 2007; Turner et al., 2012; Agnolín and Novas, 2013; Brusatte et al., 2014). The application of the new protocol allows analyzing the varying degrees of clade support within this group and distinguishing between low support caused by fragmentary taxa, and low support due to underlying character conflict and/or lack of sufficient phylogenetic data.

MATERIALS AND METHODS

Phylogenetic Analysis

The phylogenetic dataset (Pei et al., in press) has 164 taxa scored across 853 characters and its parsimony analysis is best carried out using the “New Technology Searches” option in TNT (Goloboff et al., 2008). This strategy was applied in a first phase until 50 hits to minimum length were achieved (command: xmult = hits 50;), resulting in trees of 3424 steps. The application of traditional heuristic searches (multiple replicates of Wagner trees followed by TBR branch swapping) is possible, but then finding optimal trees requires longer search times than new technology searches. The strict consensus of this analysis is well resolved, with two relatively large polytomies at the base of Avialae and in Dromaeosauridae (fig. 1). These polytomies are caused by four unstable taxa that take multiple positions among the MPTs (i.e., Yurgovuchia, Acheroraptor, Velociraptor osmolskae, and Archaeopteryx Haarlem). Ten other taxa are also identified as unstable by IterPCR (Pol and Escapa, 2009) applied to the MPTs, as implemented in TNT (see Goloboff and Szumik, 2015). Ignoring all the unstable taxa results in a well-resolved, strict reduced-consensus tree (see resolved nodes in fig. 1).

It is important to note that the unstable taxa are pruned from the trees, but they are not eliminated from the matrix or the tree searches at any point.
2020 POL AND GOLOBOFF: UNSTABLE TAXA IN COELUROSAURIAN PHYLOGENY 99

The elimination of taxa from the trees amounts to representing those parts of the results that are more useful, while eliminating taxa from the matrix amounts to ignoring the evidence (in the form of character combinations) provided by those taxa (see discussion in Goloboff and Szumik, 2015: 100–101, and fig. 6).

Resampling Support Analysis

A resampling procedure (e.g., jackknife or bootstrap) normally involves conducting at least 100 pseudoreplicates. In each of these replicates the characters of the original matrix are resampled at random so that a modified (perturbed) matrix is obtained and a tree search is conducted on this modified matrix (fig. 3). The difference between alternative resampling support measures is simply how the resampling of characters is performed (e.g., bootstrap, jackknife, symmetric; Farris et al., 1996; Goloboff et al., 2003). Regardless of how this is done, the tree search conducted for each pseudoreplicate results in a set of most parsimonious trees and, therefore, after finishing the 100 pseudoreplicates, there are 100 sets of most parsimonious trees.

As noted previously (Goloboff et al., 2003; Simmons and Freudenstein, 2011), some phylogenetic software (e.g., PAUP) calculates the bootstrap/jackknife [...]

[...] of interest are collapsed in the majority rule tree derived from the resampling replicates, is in fact a common result in paleontological datasets (in particular when they are constructed using an extensive taxon-sampling regime).

Identifying Unstable Taxa for Resampling Support Measures

In this paper we compare jackknife support values obtained using the default settings in TNT (including all taxa and using group frequencies on the majority rule consensus) with jackknife frequencies obtained with the same procedure but on a reduced majority rule consensus tree (ignoring the alternative position of unstable taxa).

We opted for this comparison for the sake of simplicity to highlight the effect of unstable taxa on resampling support measures. However, we note that there are alternative ways for summarizing resampling measures, such as GC frequencies (rather than raw frequencies) and/or measuring frequencies for the nodes present in the strict consensus of the MPTs (rather than those appearing in the majority rule consensus of the resampling procedure).

The frequencies of the majority rule consensus obtained after performing a default jackknife analysis on the dataset of Pei et al. (in press) are [...]
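The two ideas described under Resampling Support Analysis and Identifying Unstable Taxa for Resampling Support Measures can be illustrated with a small Python sketch. It is not the TNT script used in this chapter, and it does not perform any tree search. Several things are assumptions made only for illustration: the function names, the toy taxa (A, B, C, D and an unstable taxon X), the deletion probability of 0.36, and the simplification that each pseudoreplicate is summarized by a single tree, represented as a set of clades. The sketch shows (1) how a jackknife pseudoreplicate deletes characters at random and (2) how raw group frequencies change when an unstable taxon is pruned from the replicate trees rather than removed from the matrix.

```python
import random
from collections import Counter


def jackknife_sample(n_chars, p_delete, rng):
    """Return the indices of the characters kept in one pseudoreplicate.

    Each character is deleted independently with probability p_delete.
    """
    return [i for i in range(n_chars) if rng.random() >= p_delete]


def prune(clades, unstable):
    """Remove unstable taxa from every clade of one replicate tree.

    Clades left with fewer than two taxa carry no grouping information
    and are dropped; clades that become identical are merged.
    """
    pruned = {frozenset(c) - unstable for c in clades}
    return {c for c in pruned if len(c) >= 2}


def group_frequencies(replicates, unstable=frozenset()):
    """Raw frequency of each group across replicate trees."""
    counts = Counter()
    for clades in replicates:
        counts.update(prune(clades, unstable))
    return {c: n / len(replicates) for c, n in counts.items()}


# Hypothetical replicate trees (D is the outgroup, X changes position).
REPLICATES = [
    [{"A", "B"}, {"A", "B", "C"}, {"A", "B", "C", "X"}],
    [{"A", "X"}, {"A", "B", "X"}, {"A", "B", "C", "X"}],
    [{"A", "B"}, {"A", "B", "X"}, {"A", "B", "C", "X"}],
    [{"B", "X"}, {"A", "B", "X"}, {"A", "B", "C", "X"}],
]

if __name__ == "__main__":
    kept = jackknife_sample(853, 0.36, random.Random(0))
    print(f"characters kept in one pseudoreplicate: {len(kept)} of 853")

    full = group_frequencies(REPLICATES)
    reduced = group_frequencies(REPLICATES, frozenset({"X"}))
    group = frozenset({"A", "B"})
    print("A+B, all taxa included:", full[group])
    print("A+B, X pruned from trees:", reduced[group])
```

In this toy case the group A+B appears in only two of the four replicate trees when X is included (frequency 0.5), but in all four once X is pruned from the trees (frequency 1.0). The characters scored for X would still have contributed to every tree search, which is the distinction drawn above between pruning taxa from trees and eliminating them from the matrix.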