
Copyedited by: AV MANUSCRIPT CATEGORY: Systematic Biology

Syst. Biol. 69(3):502–520, 2020 © The Author(s) 2019. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For permissions, please email: [email protected] DOI:10.1093/sysbio/syz062 Advance Access publication September 24, 2019

Interrogating Genomic-Scale Data for Squamata (Lizards, Snakes, and Amphisbaenians) Shows no Support for Key Traditional Morphological Relationships

FRANK T. BURBRINK1, FELIPE G. GRAZZIOTIN2, R. ALEXANDER PYRON3, DAVID CUNDALL4, STEVE DONNELLAN5,6, FRANCES IRISH7, J. SCOTT KEOGH8, FRED KRAUS9, ROBERT W. MURPHY10, BRICE NOONAN11, CHRISTOPHER J. RAXWORTHY1, SARA RUANE12, ALAN R. LEMMON13, EMILY MORIARTY LEMMON14, AND HUSSAM ZAHER15,16,∗

1Department of Herpetology, The American Museum of Natural History, 79th Street at Central Park West, New York, NY 10024, USA; 2Laboratório de Coleções Zoológicas, Instituto Butantan, Av. Vital Brasil, 1500 - Butantã, São Paulo - SP 05503-900, Brazil; 3Department of Biological Sciences, The George Washington University, Washington, DC 20052, USA; 4Department of Biological Sciences, 1 W. Packer Avenue, Lehigh University, Bethlehem, PA 18015, USA; 5South Australian Museum, North Terrace, Adelaide SA 5000, Australia; 6School of Biological Sciences, University of Adelaide, SA 5005 Australia; 7Department of Biological Sciences, Moravian College, 1200 Main St, Bethlehem, PA 18018, USA; 8Division of Ecology and Evolution, Research School of Biology, The Australian National University, Canberra, ACT 2601, Australia; 9Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA; 10Department of Natural History, Royal Ontario Museum, 100 Queens Park, Toronto, ON M5S 2C6, Canada; 11Department of Biology, University of Mississippi, Oxford, MS 38677, USA; 12Department of Biological Sciences, 206 Boyden Hall, Rutgers University, 195 University Avenue, Newark, NJ 07102, USA; 13Department of Scientific Computing, Florida State University, Dirac Science Library, Tallahassee, FL 32306-4102, USA; 14Department of Biological Science, Florida State University, 319 Stadium Drive, Tallahassee, FL 32306-4295, USA; 15Museu de Zoologia da Universidade de São Paulo, São Paulo, Brazil CEP 04263-000, Brazil; and 16Centre de Recherche sur la Paléobiodiversité et les Paléoenvironnements (CR2P), UMR 7207 CNRS/MNHN/Sorbonne Université, Muséum national d’Histoire naturelle, 8 rue Buffon, CP 38, 75005 Paris, France

∗Correspondence to be sent to: Museu de Zoologia da Universidade de São Paulo, São Paulo, Brazil CEP 04263-000, Brazil; E-mail: [email protected].

Received 15 January 2019; reviews returned 5 September 2019; accepted 10 September 2019 Associate Editor: Robert Thomson

Abstract.—Genomics is narrowing uncertainty in the phylogenetic structure for many amniote groups. For one of the most diverse and species-rich groups, the squamates (lizards, snakes, and amphisbaenians), an inverse correlation between the number of taxa and loci sampled still persists across all publications using DNA sequence data, and reaching a consensus on the relationships among them has been highly problematic. In this study, we use high-throughput sequence data from 289 samples covering 75 families of squamates to address phylogenetic affinities, estimate divergence times, and characterize residual topological uncertainty in the presence of genome-scale data. Importantly, we address genomic support for the traditional taxonomic groupings Scleroglossa and Macrostomata using novel machine-learning techniques. We interrogate genes using various metrics inherent to these loci, including parsimony-informative sites (PIS), phylogenetic informativeness, length, gaps, number of substitutions, and site concordance to understand why certain loci fail to find previously well-supported molecular clades and how they fail to support species-tree estimates. We show that both incomplete lineage sorting and poor gene-tree estimation (due to a few undesirable gene properties, such as an insufficient number of PIS) may account for most gene and species-tree discordance. We find overwhelming signal for Toxicofera, and also show that none of the loci included in this study supports Scleroglossa or Macrostomata. We comment on the origins and diversification of Squamata throughout the Mesozoic and underscore remaining uncertainties that persist in both deeper parts of the tree (e.g., relationships between Dibamia, Gekkota, and remaining squamates; among the three toxicoferan clades Iguania, Serpentes, and Anguiformes) and within specific clades (e.g., affinities among gekkotan, pleurodont iguanian, and colubroid families).
[Neural network; gene interrogation; lizards; snakes; genomics; phylogeny.]

Well-supported phylogenies inferred using both thorough taxon-sampling and genome-scale sequence data are paramount for understanding phylogenetic structure and settling debates about higher-level taxonomy. Phylogenomic analyses can provide reliable trees for downstream use in comparative biology (Garland et al. 2005; Wortley et al., 2005; Heath et al., 2008; Ruane et al., 2015; Burbrink et al., 2019) and help unravel evolutionary complexity (Philippe et al., 2011), such as deep-time phylogenetic reticulation (Burbrink and Gehara, 2018). In recent years, well-resolved phylogenies of birds and mammals used both large numbers of genes and taxa (Prum et al., 2015; Liu et al., 2017). Unfortunately, among amniotes, squamates have fallen behind and all comparative and taxonomic studies still rely on phylogenetic structure estimated from a handful of genes and a large number of species (Pyron et al., 2013), a small number of lineages with phylogenomic data (Streicher and Wiens 2017), or a few studies of intermediate range with ∼50 loci and 161 taxa (Wiens et al., 2012; Reeder et al., 2015). Furthermore, phylogenetic thinking about such groups often reflects historical morphological hypotheses that are weakly congruent or incongruent with recent phylogenomic estimates (Conrad 2008; Gauthier et al., 2012; Losos et al., 2012).

The order Squamata comprises almost 10,800 extant lizards, snakes, and amphisbaenians (Uetz et al. 2018) showing continuous diversification since the […], with many groups surviving the Cretaceous/Tertiary mass extinction (Evans 2003; Evans and Jones 2010; Jones et al., 2013). Extant squamates occur in nearly all habitats globally and show huge variation in body size, body shape, limb types (including repeated complete limb loss), oviparous and viviparous reproductive modes, complex venoms, and extremely varied diets that include plants, invertebrates and vertebrates (Vitt and Pianka 2005; Vitt and Caldwell 2009;
Colston et al., 2010; Pyron and Burbrink, 2014; Zaher et al., 2014; Fry 2015). Squamates are important research organisms in the fields of behavior, ecology, and macroevolution, for which major studies on their speciation, biogeography, and latitudinal richness gradients have contributed to the basic understanding of how diversity accumulates across the earth (O’Connor and Shine 2004; Ricklefs et al., 2007; Pyron and Burbrink 2012; Pyron, 2014; Burbrink et al., 2015; Esquerré and Scott Keogh 2016; Esquerré et al., 2017).

Attempts to understand relationships among squamates over the last 250 years reflect the input from hundreds of researchers spanning morphological, small-scale molecular, and now phylogenomic data sets, marked by several key milestones (Oppel 1811; Camp 1923; Underwood 1967; Estes et al., 1988; Townsend et al., 2004; Vidal and Hedges 2005, 2009; Conrad 2008; Wiens et al., 2010, 2012; Gauthier et al., 2012; Pyron et al., 2013, 2014; Reeder et al., 2015; Streicher and Wiens 2016). Although research from both morphological and molecular studies has converged toward similar content of most family-level groups, relationships among these groups and, more intriguingly, the deepest divisions within squamates remain highly contentious (Losos et al., 2012). Squamate relationships inferred from molecular sequence-based […] that are fundamentally at odds with historical interpretations of morphological data include: 1) monophyly of Toxicofera (snakes, iguanians, and anguiforms; Vidal and Hedges 2005), 2) polyphyly of Anilioidea (pipe-snakes) and Macrostomata (large-gaped snakes including Acrochordidae, Boidea, Bolyeriidae, Colubriformes, Pythonidae, Tropidophiidae, and Ungaliophiidae), 3) paraphyly of Scolecophidia (worm-snakes), 4) phylogenetic affinities of Dibamia and […] within squamates (Hallermann 1998; Rieppel and Zaher 2000; Conrad 2008; Gauthier et al., 2012), and 5) phylogenetic affinities of Heloderma and Shinisaurus within anguiformes (Gao and Norell 1998).

In brief, all cladistic morphological studies of extant and extinct taxa since the landmark analysis of Estes et al. (1988), which itself tested the original divisions of Camp (1923), have supported a basal split between Iguania, represented by Pleurodonta and Acrodonta, and Scleroglossa, which includes gekkotans and all remaining (“autarchoglossan”) squamate groups (see Conrad [2008] and Gauthier et al. [2012] for reviews). This arrangement at the root of crown-Squamata is supported by a suite of morphological characters that range from 2 to 10 unambiguous synapomorphies uniting Scleroglossa as the sister group of Iguania (Conrad 2008; Gauthier et al., 2012). Alternatively, all molecular phylogenetic studies since Saint et al. (1998), Townsend et al. (2004), and Vidal and Hedges (2005) have demonstrated that snakes, anguimorphs, and iguanians share a more-recent common ancestor to the exclusion of other squamates; this clade is collectively referred to as Toxicofera (Vidal and Hedges 2005). Importantly, re-analysis of combined molecular and morphological data have found quantitatively similar hidden morphological support for Toxicofera as well as for Scleroglossa (Reeder et al., 2015), which suggests that most traits supporting a basal Iguania/Scleroglossa split are the result of convergent ecological adaptations. Moreover, Simões et al. (2018) recently rejected this basal split into Scleroglossa using a new morphological data matrix containing characters that support the molecular phylogeny (at least in part). This suggests that the traditional coding of some of these morphological traits may have been in error or reflected homoplasy.

Many morphological studies recovered a single origin for large-gaped snakes, referred to as Macrostomata (Cundall et al., 1993; Rieppel et al., 2003; Conrad 2008; Wilson et al., 2010; Gauthier et al., 2012; Zaher and Scanferla 2012; Simões et al., 2018). This group has been defined by a large number of features, all involving the dentigerous upper and lower jaws, palatal, and suspensorium bones, which contribute to an increase of gape size (Rieppel 1988; Cundall and Irish 2008). However, Macrostomata was found to be paraphyletic based on taxa and morphological data alone (Lee and Scanlon 2002; Rieppel et al., 2003; Scanlon 2006; Rieppel, 2012). Similarly, molecular studies using mtDNA and/or single-copy nuclear genes also demonstrate paraphyly in Macrostomata, where Aniliidae (non-Macrostomata) and Tropidophiidae (Macrostomata) represent sister taxa to the exclusion of other macrostomatan families (Vidal and Hedges, 2004; Pyron and Burbrink, 2012; Wiens et al., 2012; Streicher and Wiens, 2016); this has also been reinforced by some unconventional morphological traits (Siegel et al., 2011). Finally, a recent study combining morphological and molecular data from extant and fossil species using tip dating methods showed that macrostomatan morphological features evolved early in snakes and subsequently reversed multiple times (Harrington and Reeder, 2017).

Some authors have argued that because morphological data are subjectively coded and demonstrably susceptible to convergence among key traits, they should be considered biased against estimating correct relationships when compared with more voluminous and objectively coded molecular data (Wiens et al., 2010; Reeder et al., 2015). Molecular data, therefore, should not be as influenced by ontogeny, environment, convergence, or user-coded bias when compared with morphological traits. However, a few examples of molecular convergence within mtDNA in squamates exist (Castoe et al., 2009) and among particular genes in other organisms (Parker et al., 2013; Projecto-Garcia et al., 2013; Zou and Zhang, 2016). Although instances of genome-wide convergence occur (Foote et al., 2015), it nevertheless seems unlikely to expect convergence across nuclear and mtDNA genomes while yielding identical topologies with respect to groups like Toxicofera and Alethinophidia.

Although genome-scale data should thus be optimal for resolving squamate relationships, most molecular studies have been limited by either having many taxa
for few genes (Pyron et al., 2013) or having few taxa and many genes (Streicher and Wiens, 2017). For many studies, understanding how many genes support a species tree and what signal exists for alternative arrangements is currently unknown for most genomic-scale data sets. Thus, a study using genome-scale molecular data and dense sampling of extant lineages is needed to confirm the robustness of the molecular signal, particularly with respect to methods that can interrogate phylogenetic signal across loci with respect to the species tree (e.g., Arcila et al., 2017).

Here, we sample 289 species from nearly all squamate families for 394 anchored phylogenomic loci (AHE; Lemmon et al., 2012) to investigate species-tree relationships within Squamata, assess support for all nodes, and estimate divergence dates. We then examine the influence of each locus on the overall topology. Specifically, we use machine-learning techniques (Lek et al., 1996; Zhang, 2010) to quantify and understand why particular genes do not support the standard molecular relationships and determine if there is hidden support for Scleroglossa and Macrostomata. Given past morphological evidence for these relationships, we expect at least some loci in any phylogenetic context (concatenated or species trees) should support these relationships. Thus, the genomic distribution of congruence and incongruence should highlight the underlying biological mechanisms for these topological disputes and potentially provide resolution. Results from our research solidify relationships among most extant squamates, further clarify their taxonomy, and highlight remaining problems.

METHODS AND MATERIALS

Data set

Using anchored phylogenomics (Lemmon et al., 2012), we generated a genomic data set for 289 species representing 75 families of squamates and one Rhynchocephalian (see Supplemental material available on Dryad at http://dx.doi.org/10.5061/dryad.6392n3s). Our taxonomic arrangement followed Vidal and Hedges (2005, 2009), Conrad (2008), and Vidal et al. (2010) for higher-level squamatan clades; Pyron et al. (2013) and Barker et al. (2015) for Booidea and Pythonoidea familial and generic levels; Zaher et al. (2009) and Kelly et al. (2009), including nomenclatural suggestions of Savage (2015) and Rhodin et al. (2015) for higher and familial caenophidian clades. Whole genomic DNA was extracted from tissue using the Qiagen DNeasy kit following manufacturer’s protocols at the Center for Anchored Phylogenomics at Florida State University and data were assembled using Anchored Phylogenomics (www.anchoredphylogeny.com). Following Qubit fluorometer quantification using a dsDNA HS Assay kit (Invitrogen™), we sonicated up to 1 μg of the extracted DNA to a size range of 150–400 bp using a Covaris E220 Focused-ultrasonicator. We then prepared libraries following Lemmon et al. (2012) using a Beckman-Coulter FXp liquid-handling robot. After ligating 8 bp indexes, we pooled libraries in groups of 16 and enriched the library pools using the AHE probes developed for squamates by Ruane et al. (2015) and Tucker et al. (2016). After verifying the quality and quantity of the enriched libraries by bioanalysis and qPCR, we sequenced the libraries at the Florida State University translational laboratory on eight lanes of Illumina HiSeq 2500 with paired-end 150-bp protocol (∼395 Gb of total data).

Following sequencing, we demultiplexed reads passing the Casava high-chastity filter, allowing for no index mismatches. Next, we merged overlapping read pairs to remove library adapters and correct for sequencing errors (Rokyta et al., 2012). We then assembled the reads using the quasi-de novo approach described by Ruane et al. (2015) and Prum et al. (2015), with Calamaria pavimentata and Anolis carolinensis as references. We avoided sequences derived from low-level contamination or misindexing by removing assembly clusters containing fewer than 500 reads. We also verified the identity of some tissue samples by comparing mitochondrial genes to previously published mitochondrial sequences (mainly cytb and the rRNA genes). Mitochondrial sequences were assembled through the map to reference approach using the Geneious mapper in Geneious R9 (Biomatters Ltd.; Kearse et al., 2012). We established orthology by clustering consensus sequences using pairwise-distances, as described by Hamilton et al. (2016). For some loci, we removed aberrant sequences resulting from low-level contamination or a low-divergent paralog that passed by our primary filters by using a modified version of the orthology assessment method described by Hamilton et al. (2016). In this modified approach, we iteratively clustered the sequences using pairwise-distances for each nested taxonomic level. We then aligned orthologous sequences using MAFFT v7.023b (Katoh and Standley, 2013). We trimmed and masked the alignments following Hamilton et al. (2016), but with MINGOODSITES = 13, MINPROPSAME = 0.3, and MISSINGALLOWED = 82, then inspected the alignments manually in Geneious R9 to identify and remove remaining aberrant sequences.

Phylogeny

We estimated models of substitution for each locus using ModelFinder (Chernomor et al., 2016; Kalyaanamoorthy et al., 2017), which uses maximum likelihood to fit 22 substitutional models including up to 6 free-rate gamma categories. We first estimated phylogeny and tree support using the ultrafast nonparametric bootstrap approximation (n = 1000; Minh et al., 2013) over the partitioned concatenated data sets. Support was also estimated using the Shimodaira–Hasegawa-like approximate likelihood ratio test (SH; Shimodaira and Hasegawa, 1999; Anisimova et al., 2011). With the locus-partitions and substitution models, we generated gene trees with support for each locus in IQ-TREE v1.6.6 (Nguyen et al., 2015). We then generated
a species tree using ASTRAL III with IQ gene trees as inputs and support estimated using local posterior probabilities on quadripartitions with default hyperparameter inputs (Yule prior for branch lengths and the species tree set to 0.5; Mirarab and Warnow, 2015). We also ran ASTRAL using 1000 bootstrapped IQ trees under the multilocus bootstrapping feature. We compared these concatenated and species trees using Robinson–Foulds (RF) distances (Robinson and Foulds, 1981), which provide a measure of topological dissimilarity, and then determined if all measures of support were correlated among shared branches using Spearman rank correlation (Supplementary material available on Dryad). The Squamata tree was rooted with their closest living relative, the rhynchocephalian Sphenodon punctatus; outgroup status of this taxon has been discussed in other recent papers that used morphological and molecular data (Gauthier et al., 2012; Jones et al., 2013; Chen et al., 2015; Harrington et al., 2016).

Because bootstraps, SH, or posterior probabilities do not provide comprehensive measures of underlying agreements or disagreement among sites and genes for supporting any topological arrangement, we examined site and gene concordance factors (sCF and gCF, respectively). Here, gCF indicated the percentage of gene trees showing a particular branch from a species tree (Ané et al., 2007; Baum, 2007), whereas sCF shows the number of sites supporting that branch. Values of sCF have a lower bound of 33% given the three possible quartets for each node, whereas gCF values are calculated from a full gene tree and, therefore, may not resolve a particular node, yielding a lower bound of 0%. We estimated gCF and sCF values across all nodes of the ASTRAL species tree and concatenated IQ trees using IQ-TREE v1.6.6 (Nguyen et al., 2015; Minh et al., 2018) and associated R code (http://www.robertlanfear.com/blog/files/concordance_factors.html).

Because discordance as estimated here between gene trees and the species tree may be caused by either incomplete lineage sorting (ILS) and/or poorly estimated gene trees, we attempted to isolate these issues across all nodes. If ILS is driving gene and species-tree differences, then the two discordant topologies at a particular node (i.e., not the primary node from the species-tree topology) should be equivalent for gCF and for sCF, though sites may be linked within single loci and provide unreliable estimates. Using a χ² test for genes and sites separately, we sum the number of genes and the number of sites supporting the two discordant topologies at each node to determine if they deviate significantly from being evenly represented. If they were not significant, this is a reasonable expectation that discordance among gene trees and among sites was due to ILS (Huson et al., 2005; Green et al., 2010; Martin et al., 2015). We estimated χ² between genes and sites using script provided here: http://www.robertlanfear.com/blog/files/concordance_factors.html.

Divergence Dates

We estimated divergence dates (with error) using the penalized-likelihood approach in TreePL (Smith and O’Meara, 2012) with three phylogenetic datasets. A full MCMC-based relaxed phylogenetics method was not computationally feasible for a dataset of this size. The first dataset used for dating was composed of 1000 concatenated bootstrapped (pseudoreplicated) trees generated from the IQTREE rapid bootstrap function (UFBoots). Second, because tree space may not have been widely explored using this method of generating bootstrapped trees (Smith et al., 2018), we also generated 100 pseudoreplicates in RAxML 1.6.7 (Stamatakis, 2014), and estimated phylogeny for each in IQTREE. Third, we also generated a test tree by fitting the concatenated data to the ASTRAL topology producing a species-tree topology with branch lengths in substitution rates. We then re-estimated the phylogeny and generated dated trees using TreePL; this method produced dates given tree uncertainty from the two bootstrap replicates and the ASTRAL tree. For TreePL, we chose the best smoothing parameter to balance a tradeoff between rates across the tree being clock-like or completely saturated by using a cross-validation approach. This approach sequentially removed terminal taxa, produced an estimate of these branches from the remaining data given an optimal smoothing parameter, and then produced an appropriate smoothing parameter from the fit of the real branch and the pruned branch (Sanderson, 2002). We iterated this 14 times over a range of smoothing parameters from 1 × 10⁻⁷ to 1 × 10⁴ and implemented the thorough option to ensure that the run iterated until convergence. To calibrate these trees, we followed Jones et al. (2013), Alencar et al. (2016), and Zaher et al. (2018), and added other taxa, to utilize 26 […] and locations on the phylogeny for estimating divergence dates (see Supplementary material available on Dryad).

Locus Support for Morphological Topology

We examined how often individual loci recovered the traditional Scleroglossa/Iguania division and a monophyletic Macrostomata (including Acrochordidae, Boidea, Bolyeriidae, Calabariidae, Candoiidae, Charinidae, Erycidae, Loxocemidae, Pythonidae, Sanziniidae, Tropidophiidae, Ungaliophiidae, Xenopeltidae, and all 17 families of Colubroides recognized herein). For Scleroglossa, we tested for monophyly of all squamates excluding Iguania. For Macrostomata, we tested for monophyly of Amerophidia (Aniliidae + Tropidophiidae), a group that rejects Macrostomata, as aniliids are canonical non-macrostomatans, and tropidophiids are canonical macrostomatans (Vidal and Hedges, 2004). This grouping in particular is an example of conflict between molecular, osteological, and soft-tissue characters (see Siegel et al., 2011; Hsiang et al., 2015).
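
The equal-frequency check on the two discordant topologies, described under Phylogeny, can be illustrated with a short Python sketch. This is not the R script the authors link to: the function names, the 0.05 threshold, and the example counts are assumptions made for illustration. For one node, the two inputs are the numbers of gene trees (or sites) supporting each of the two discordant topologies.

```python
import math


def chi_square_equal_counts(count_a, count_b):
    """Chi-square test (1 d.f.) that two counts are equally frequent.

    Returns (statistic, p_value). The expected value for each count is
    half of the total, as in a test for even representation.
    """
    total = count_a + count_b
    if total == 0:
        raise ValueError("at least one observation is required")
    expected = total / 2.0
    statistic = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


def consistent_with_ils(count_a, count_b, alpha=0.05):
    """True when the two discordant topologies are not significantly uneven.

    alpha = 0.05 is an assumed threshold, not a value taken from the text.
    """
    _, p_value = chi_square_equal_counts(count_a, count_b)
    return p_value >= alpha


# Hypothetical node: 52 vs. 48 gene trees support the two discordant topologies.
print(consistent_with_ils(52, 48))
# Hypothetical node with strongly uneven support: 80 vs. 20 gene trees.
print(consistent_with_ils(80, 20))
```

A non-significant result is only compatible with ILS; as the text notes, linked sites within a locus can make the site-based version of the test unreliable.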

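As a rough illustration of the two tree-comparison quantities used in the Phylogeny section, the following Python sketch computes a Robinson–Foulds distance and a simple gene concordance percentage from bipartitions. Several things here are assumptions of the sketch rather than the authors' workflow (which used IQ-TREE and R packages): trees are written as nested tuples, the five taxa are hypothetical, and every gene tree is assumed to contain all taxa.

```python
def leaf_sets(tree):
    """Return (leaves, clades): all leaves below `tree` and the leaf set of
    every internal node, for a tree written as nested tuples of taxon names."""
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaves, clades = frozenset(), []
    for child in tree:
        child_leaves, child_clades = leaf_sets(child)
        leaves |= child_leaves
        clades.extend(child_clades)
    clades.append(leaves)
    return leaves, clades


def splits(tree):
    """Non-trivial bipartitions of the unrooted tree. Each split is stored as
    the side that lacks a fixed reference taxon, so equal splits compare equal."""
    taxa, clades = leaf_sets(tree)
    reference = min(taxa)
    found = set()
    for clade in clades:
        side = taxa - clade if reference in clade else clade
        if 1 < len(side) < len(taxa) - 1:
            found.add(side)
    return found


def rf_distance(tree_a, tree_b):
    """Number of bipartitions present in exactly one of the two trees."""
    return len(splits(tree_a) ^ splits(tree_b))


def gene_concordance(split, gene_trees):
    """Percentage of gene trees that contain the given bipartition."""
    hits = sum(1 for tree in gene_trees if split in splits(tree))
    return 100.0 * hits / len(gene_trees)


# Hypothetical five-taxon species tree and three gene trees.
species_tree = ((("A", "B"), "C"), ("D", "E"))
gene_trees = [
    ((("A", "B"), "C"), ("D", "E")),
    ((("A", "C"), "B"), ("D", "E")),
    ((("A", "B"), "D"), ("C", "E")),
]

for gene_tree in gene_trees:
    print(rf_distance(species_tree, gene_tree))
print(gene_concordance(frozenset({"D", "E"}), gene_trees))
```

Dedicated tools handle cases this sketch ignores, such as gene trees with missing taxa, where only decisive trees should be counted.
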

To test a more difficult phylogenetic relationship, we also examined the placement of Dibamia, here inferred in the species tree as sister to Gekkota.

We first estimated the probability of Toxicofera and Amerophidia monophyly, respectively, using the measures of support in the species tree (quadripartition and bootstrapped IQ gene trees) and asked how many loci also recover these relationships. For the remaining loci that did not show these relationships, we asked how often they find the basic Scleroglossa/Iguania split and monophyly of Macrostomata and whether these outlier loci together recovered these traditional relationships using species-tree methods. Similarly, we also examined key placements for Dibamia as the sister group to: 1) all other squamates, 2) Gekkota, and 3) all other squamates excluding Gekkota.

Genomic Interrogation Using Neural Networks

Using a custom R script (R Core Team, 2015; script provided as Supplementary material available on Dryad), we examined the frequency of genes that support each node of the dated species tree, similar to Smith et al. (2018). We assessed this support over all dated nodes to understand where and when particular phylogenetic relationships were poorly supported by the density of gene trees that disagree with the species tree. We then attempted to investigate why certain loci did not show the same relationships as the species tree.

Using RF distances (Robinson and Foulds, 1981) between each gene tree and the species tree (RFgtst) as the response variable, we chose characteristics of genes known to affect resolution, support, or topological accuracy such as gene length, base-pair composition, alignment gappiness, and variable sites (Rosenberg and Kumar, 2003; Felsenstein, 2004; Wortley et al., 2005; Nagy et al., 2012; López-Giráldez et al., 2013; Som, 2015; Duchêne et al., 2017), which here included the following parameters as a standard set of predictor variables: 1) number of sites, 2) base-pair content, 3) number of parsimony-informative sites (PIS), 4) number of gaps, 4) maximum gap length, 5) mean gap length, 6) number of segregating sites with gaps, 7) standard deviation of gap length, 8) sites with more than one substitution, 9) number of bases observed at each site (1–4), 10) maximum or minimum phylogenetic informativeness, and 11) gene and site concordance (gCF and sCF). Metrics 1–9 were generated with a custom script using the Ape package in R (Paradis et al., 2004). RF distances were estimated using a custom script based on the package phangorn. Metric 10 was estimated using a custom script based on the package phyloinformR (Dornburg et al., 2016), which estimates rates per site from the web server Phydesign (Lopez-Giraldez and Townsend, 2011) and metric 11 was generated in IQ-TREE v1.6.6 (Nguyen et al., 2015).

To understand if these variables can predict RFgtst, we used artificial neural network (NN) regressions in caret (Kuhn, 2008). Artificial NNs are a type of deep-learning method used as an alternative to likelihood techniques to address complex questions with many predictor variables in population genomics (Libbrecht and Noble, 2015; Sheehan et al., 2016). As such, they are a powerful tool for understanding non-linear interactions among numerous parameters to predict responses resulting in a range of simple to complex models regardless of the statistical distribution of those variables or the relationships among them (Lek et al., 1996; Zhang, 2010). These methods have been successfully used to address phylogenetic and comparative tree-based tests previously (Burbrink et al., 2017; Burbrink and Gehara, 2018).

In brief, in the typical NN with multilayer feed-forward back-propagation used here, the basic structure began with genetic input variables (input neurons), joined by weighted synapses to hidden neurons, and then finally ending in an output neuron. Every node was connected to every other node in the previous layer, and each node was provided with an activation value, defined as the difference between the weighted sum of all inputs and a bias (threshold) parameter; connected hidden neurons activate an output node that is compared with a known (dependent) value.

Specific to our NN, we scaled all predictor variables by the minimum and maximum range for each data category. We separated the data into 70% standard training and 30% test. We also tested accuracy at other training and test percentages, respectively: 50/50, 60/40, 70/30, 80/20, and 90/10. Each of these was run using 1000 maximum iterations, which ensured convergence. We resampled the data using the default 25 bootstrap replicates to reach convergence with the following tuning parameters: weight decay, root mean squared error (RMSE), r², and mean absolute error (MAE). We examined the power of these variables and models to predict the response (RFgtst) by recomposing the test and training sets over 100 iterations and comparing the resulting test statistics (RMSE, r², and MAE) to those from randomized response variables for each of these 100 estimated models. Over these replicates, we also identified the top five most important model variables (Supplementary Fig. S1 available on Dryad).

Multicollinearity among variables may be problematic for constructing model inferences if there is a linear dependence among these independent variables (De Veaux and Ungar, 1994; Hastie et al., 2009; Dormann et al., 2013). Although overparameterization is often not a problem for NN given that they are used for prediction of the system and not necessarily for interpretation, previous studies have demonstrated that machine-learning techniques in general may be sensitive to changes in variables over collinear data (Shan et al., 2006; Dormann et al., 2013). Our NN models contain 26 variables that mostly describe properties of genes; therefore, we used variance inflation factors (VIF; Montgomery et al., 2012) to filter collinearity data in the R package “Faraway” (Faraway, 2002). Here, we chose standard VIF >10 and removed all but one variable greater than this value.
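
The data preparation described above (min–max scaling, a 70/30 training and test split, and the RMSE and MAE statistics) can be sketched in a few lines of Python. This is a minimal illustration and not the caret workflow used in the study: the helper names, the random seed, and the toy gene lengths are all assumptions, and no neural network is fitted here.

```python
import math
import random


def min_max_scale(values):
    """Rescale values to the range [0, 1] using their minimum and maximum."""
    low, high = min(values), max(values)
    if high == low:
        # A constant predictor carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(value - low) / (high - low) for value in values]


def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle rows reproducibly and split them into training and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted responses."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(observed, predicted):
    """Mean absolute error between observed and predicted responses."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Toy predictor: hypothetical alignment lengths (bp) for three loci.
print(min_max_scale([1200, 1500, 1800]))

# Ten hypothetical locus indices split 70/30.
train_rows, test_rows = train_test_split(range(10))
print(len(train_rows), len(test_rows))
```

Repeating the split with different seeds and comparing RMSE and MAE against models fitted to shuffled responses would mirror the randomization comparison described in the text.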

[07:45 3/4/2020 Sysbio-OP-SYSB190063.tex] Page: 506 502–520 Copyedited by: AV MANUSCRIPT CATEGORY: Systematic Biology

2020 BURBRINK ET AL.—GENOMIC RELATIONSHIPS OF SQUAMATES 507

then repeated the NN analyses as with all of the variables showed 0.90, 0.92, 0.93, and 0.95 percent of nodes described above. were supported above 0.95 for each method, respect- Similarly, we used NN to understand if the same ively (Supplementary material available on Dryad). variables used to describe genes, but now also including For shared nodes, support was significantly correlated − RFgtst, could be used to classify why loci do or do among methods ( = 0.55–0.66; P= 2.2 × 10 16 ). not support the monophyly of Toxicofera, Amerophidia All phylogenies generally showed relationships (which implies a lack of support for Macrostomata), and among groups that have been commonly recovered the Gekkota and Dibamia sister relationship (GD). Our in previous genomic or combined molecular- binary predictions were classified as “1” if the locus morphological studies (Figs. 1 and 2; Vidal and Hedges, supported the relationship and “0” if it did not. We again 2005; Wiens et al., 2012; Pyron et al., 2013; Reeder et al., used the “train” function with the identical set up as 2015; Streicher and Wiens, 2017). We found unambiguous above but replaced the tuning parameters with weighted support (i.e., support values of 1.0) for all main groups,

decay and model accuracy. This was replicated 500 times including Unidentata, Episquamata, Toxicofera, Downloaded from https://academic.oup.com/sysbio/article-abstract/69/3/502/5573126 by Rutgers University user on 21 May 2020 for recomposing test and training data sets, estimating Gekkota, Scincomorpha, Laterata, Anguiformes, accuracy, and then testing this against the accuracy Iguania, and Serpentes (Fig. 1). The tree supported of randomized responses. For each of these replicates, an early division between Gekkota/Dibamia and the we also tested performance using a confusion matrix remainder of Squamata, followed by a Scincomorpha (Townsend, 1971; Kuhn, 2008) comparing predicted with and Episquamata sister relationship, and within actual classification from the training data. Episquamata a sister relationship between Laterata and Toxicofera. Resolution and support within each of these groups was generally unambiguous (Supplementary RESULTS material available on Dryad). For instance, most of the nodes within Colubroides received probability Assemblies support values of 1.0, except for sister relationships We were able to recover 99.0% (SD = 2.3%) of the 394 between Colubridae + Grayiidae and Lamprophiidae + target AHE loci (Supplementary material available on Pseudoxyrhophiidae, and the placement of Natricidae Dryad). An average of 23.5% (SD = 10.0%) of the reads and Elapidae within Colubroidea and Elapoidea, mapped to the target region. The resulting consensus respectively. Likewise, the remainder of Serpentes was sequences averaged 1775 bp (SD = 180 bp). We obtained well supported except for the placement of Candoiidae for each locus an average of 1.7 consensus sequences and Bolyeriidae, and the sister relationship between (assembly clusters, SD = 0.67), indicating that the loci the two clades containing the most recent common were low copy but not all loci had single copies. 
The ancestors (MRCA) of 1) Pythonidae and Bolyeriidae average coverage of these consensus sequences was 218 and 2) Calabariidae and Boidae. We also did not find (SD = 83). strong support for relationships among some families in Pleurodonta (Iguania) and Gekkota, the placement of Dibamia as sister to Gekkota, and placement of Data set Anguiformes or Iguania (or their MRCA) as sister to Loci had on average retained 92.4% (SD = 10.8) Serpentes within Toxicofera (Fig. 2). of all taxa represented in the species tree (n = 289). Mean length and number of PIS were 1302 bp (range: 170–2075, SD = 282.72) and 749 bp (range: 61–1455; Tree Support SD = 282.7167), respectively. We found that 18 models provided a best fit for substitutions across all loci, with Our estimates of sCF and gCF were correlated across the TVM (transversion model, AG = CT, unequal base the species tree of squamates (Fig. 3) but we note that frequencies) fitted to 27% of loci and GTR fitted to 18.1 both of these measures fell well below standard meas- % of models, with either a five or six free-rate parameter ures of Pp support. For many of the standard squamate (R) model describing rate heterogeneity. relationships, including Toxicofera and Amerophidia, gCF were above 50%, whereas sCF remained low. This indicated that estimating support from sites alone, such as in bootstraps, may not provide credible estimates of Phylogeny support with genomic-scale data. We also demonstrated Tree estimates using either concatenated or species- a significant relationship between gCF and sCF and tree methods produced similar topologies (Figs. 1 and 2), branch length (Fig. 3). with RF distances of 44 between these trees being only Most of the gene-to-species-tree discordance can 7.7% of the maximum RF distance. All methods showed be described by ILS, where 82.6% of nodes did not strong nodal support. 
All measures of support—which show significant differences between the two discord- included ASTRAL with local posterior probabilities, ant topologies. Although likely not as reliable given ASTRAL with 1000 IQ tree UF bootstraps, concaten- non-independence among sites, sCF shows 39.7% of ated partitioned IQ trees with 1000 bootstraps, and nodes not showing significance among discordance sites. concatenated partitioned IQ trees with SH likelihoods Finally, logistic regression testing for the presence or


[Figure 1 image not reproduced: dated species tree with family-level tips from Sphenodontidae to Colubridae, labeled by higher taxa; time axis spans the Jurassic, Cretaceous, Paleogene, Neogene, and Quaternary.]

FIGURE 1. Dated species tree for Squamata with all major taxonomic categories indicated. Dark circles represent areas of local posterior probabilities on quadripartitions <95%. Black stars represent the location of fossils. Tip labels and dating error estimates are available in the Supplementary material available on Dryad.


[Figure 2 image not reproduced: circular family-level tree with clade labels Squamata, Unidentata, Episquamata, Toxicofera, Serpentes, Iguania, Anguiformes, Laterata, Scincomorpha, Gekkota, and Dibamia; legend: PP support <0.95.]

FIGURE 2. Reduced, dated circle phylogeny showing family-level relationships and higher. Low support (<95%) from local posterior probabilities on quadripartitions, bootstraps, and SH tests is indicated on nodes with a small gray circle.
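The comparison of topologies summarized in Figures 1 and 2 rests on the Robinson-Foulds (RF) distance and its scaling by the maximum possible value. The following Python sketch is illustrative only and is not the authors' script, which was written in R around phangorn. It assumes trees are supplied as nested tuples of tip names (a hypothetical input format), that both trees share the same tips, and that the maximum RF distance is 2(n − 3) for two unrooted, fully resolved trees with n tips. All function names are invented for illustration.

```python
def tip_set(tree):
    """Collect all tip names in a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(tip_set(child) for child in tree))


def bipartitions(tree, taxa):
    """Return the non-trivial splits of a tree as canonical tip sets."""
    reference = min(taxa)  # each split is stored as the side lacking this tip
    splits = set()

    def walk(node):
        if isinstance(node, str):  # a tip
            return frozenset([node])
        tips = frozenset().union(*(walk(child) for child in node))
        if 1 < len(tips) < len(taxa) - 1:  # skip trivial splits and the root
            splits.add(taxa - tips if reference in tips else tips)
        return tips

    walk(tree)
    return splits


def max_rf(n_tips):
    """Assumed maximum RF distance for two unrooted binary trees."""
    return 2 * (n_tips - 3)


def scaled_rf(tree_a, tree_b):
    """RF distance (splits found in only one tree) divided by its maximum."""
    taxa = tip_set(tree_a)
    rf = len(bipartitions(tree_a, taxa) ^ bipartitions(tree_b, taxa))
    return rf / max_rf(len(taxa))


# Hypothetical five-tip trees that disagree on one internal branch.
gene_tree = ((("A", "B"), "C"), ("D", "E"))
species_tree = ((("A", "C"), "B"), ("D", "E"))
print(scaled_rf(gene_tree, species_tree))
print(round(44 / max_rf(289), 3))
```

Under the stated assumption, an RF distance of 44 among 289 tips gives 44/572, or about 7.7%, which agrees with the proportion reported in the text.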

absence of Toxicofera and Amerophidia given sCF was significant (P = 0.012 and 3.99 × 10⁻⁵), whereas the presence of the Dibamia/Gekkota node showed no relationship with sCF (P = 0.45).

For each node of the species tree, more genes recovered a particular node than did not. This does not indicate, however, that the majority of gene topologies were the same as the species tree (Fig. 4); 72% of nodes were supported by the majority of gene trees. Most of the main groups showed high species-tree and individual-locus agreement, though we note that at 90–95 Ma, relationships among Pleurodonta families (Iguanians) and the placement of Candoiidae and Bolyeriidae among Pythonoidea and Booidea, respectively, showed strong discordance between the number of gene trees showing the species-tree relationship. Most loci yielded strong phylogenetic informativeness over substantially large substitution rates to infer the origin and diversification throughout the history of Squamata sampled in our date trees (Supplementary material available on Dryad).

Neither concatenated analyses nor species trees supported the traditional Scleroglossa or Macrostomata. In addition, no loci recovered those groups either. In contrast, Toxicofera and Amerophidia were well supported (100%) in concatenated and species-tree estimates (Supplementary material available on Dryad). Among individual genes, 75% and 69% of loci recovered monophyletic Toxicofera and Amerophidia, respectively. However, the sister relationship between Gekkota/Dibamia (GD) was inferred in all concatenated and species-tree estimates but with low support (only 42.5% of individual loci).

We also generated species trees using ASTRAL III from the loci not supporting Toxicofera, Amerophidia, and GD. We found that among these smaller sets of loci, species trees did not support Scleroglossa, but rather placed Laterata sister to Serpentes, followed by Iguania and then Anguimorpha (Supplementary material available on Dryad). When forcing a sister relationship between Laterata and Serpentes, estimating a best supporting IQ-TREE tree for this constraint, and


[Figure 3 image not reproduced. Top panel: site concordance factors (y-axis, 0–100) against gene concordance factors (x-axis, 0–100), with labeled clades shaded by quadripartition support. Bottom panel: gene and site concordance factors against branch length (coalescent units).]

FIGURE 3. Plot showing the relationship between site and gene concordance factors (sCF and gCF) relative to quadripartition support from ASTRAL (top) and concordance factors regressed against branch length in coalescent units (bottom).
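Figure 3 relies on gene concordance factors (gCF). As a rough illustration of the idea, and not of the IQ-TREE implementation used in this study, the Python sketch below scores a clade as the percentage of gene trees that contain it. It assumes rooted gene trees written as nested tuples with complete taxon sampling, whereas gCF in IQ-TREE is defined on branches of unrooted trees and counts only gene trees that are decisive for the branch. The toy trees and function names are hypothetical.

```python
def clades(tree):
    """Collect the tip set below every internal node of a nested-tuple tree."""
    found = set()

    def walk(node):
        if isinstance(node, str):  # a tip
            return frozenset([node])
        tips = frozenset().union(*(walk(child) for child in node))
        found.add(tips)
        return tips

    walk(tree)
    return found


def supports(clade, gene_tree):
    """Return 1 if the gene tree contains the clade, otherwise 0."""
    return int(frozenset(clade) in clades(gene_tree))


def gene_concordance(clade, gene_trees):
    """Percentage of gene trees that recover the clade (simplified gCF)."""
    return 100.0 * sum(supports(clade, t) for t in gene_trees) / len(gene_trees)


# Four hypothetical gene trees; only the last one breaks up the target clade.
gene_trees = [
    ("Gekkota", ("Scincidae", ("Lacertidae", ("Iguania", ("Anguidae", "Serpentes"))))),
    ("Gekkota", ("Scincidae", ("Lacertidae", (("Iguania", "Anguidae"), "Serpentes")))),
    ("Gekkota", ("Scincidae", ("Lacertidae", (("Iguania", "Serpentes"), "Anguidae")))),
    ("Iguania", ("Gekkota", ("Scincidae", ("Lacertidae", ("Anguidae", "Serpentes"))))),
]
toxicofera = {"Iguania", "Anguidae", "Serpentes"}

print([supports(toxicofera, t) for t in gene_trees])
print(gene_concordance(toxicofera, gene_trees))
```

The per-locus value returned by `supports` is the same kind of binary 1/0 response that the NN classification described in the Methods uses for each focal relationship.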

calculating gCF and sCF for that nodal constraint, we found very little support, essentially random, for this arrangement: gCF = 4.32 and sCF = 33.1. For those loci not supporting Amerophidia, we found Aniliidae as sister to Alethinophidia, but still not supporting Macrostomata, given the inclusion of Uropeltoidea in Alethinophidia (Supplementary material available on Dryad). Finally, we found that 73% of the loci did not recover the GD relationship but instead resolved Dibamia as sister to all squamates (43% of loci) or Dibamia as sister to the remaining Squamata after Gekkota (30% of loci; Supplementary material available on Dryad).

Divergence Dating

Our divergence dates were largely congruent with recent studies focused on estimating divergence times in Squamata (e.g., Mulcahy et al., 2012; Jones et al., 2013; Pyron, 2017). Our analysis, though, provided an unprecedented taxonomic and genomic coverage at lower hierarchical levels of the squamate tree that enabled new estimates of divergence dates among extant families. Between the different sets of the bootstrapped phylogeny, we found that the absolute mean difference for dated nodes was only 0.668 Ma, and all dates at all shared nodes were correlated (ρ = 0.925, P = 2.2 × 10⁻¹⁶). Our estimates of divergence dates (Figs. 1 and 2; Supplementary material available on Dryad) suggested the origin of crown-Squamata to be in the Early Jurassic (190 Ma), though recent fossil evidence suggested this may have occurred earlier in the Late Triassic (206 Ma; Simões et al., 2018). Deep divergences within the tree of Squamata occurred between the Early and Middle Jurassic, with the divergence among Gekkota–Dibamia, Scincomorpha–Episquamata, Laterata–Toxicofera, Lacertibaenia–Teiioidea, and Iguania–Anguiformes occurring between 190 and 155 Ma.

A large number of modern groups diverged within the Cretaceous: gekkotan, cordyloid, teiioid, anguiform, acrodontan, pleurodontan, and typhlopoid extant families. Divergence of crown-pleurodont iguanian families occurred within a short interval during the early Late Cretaceous, between ∼98 and 79 Ma, following the end of the opening of the South Atlantic and during a period of isolation of the western Gondwanan landmasses (McLoughlin, 2001). Similarly, Chamaeleonidae and Agamidae diverged at ∼100 Ma. Afrophidian stem-Uropeltoidea, Pythonoidea, Booidea, and Caenophidia also diverged within the Late Cretaceous, although most of their extant families diverged after the K/Pg boundary. Notable exceptions are the Cylindrophiidae, Bolyeriidae, Xenopeltidae, Calabariidae, Acrochordidae, and Xenodermidae, which diverged in the Late Cretaceous. All extant families of the most diverse group of squamates, the Colubriformes (>3600 extant species), diverged within the Paleogene (Pareidae, Viperidae, Homalopsidae) or throughout the Eocene (all remaining families).

Gene Interrogation via NN: Scleroglossa and Macrostomata

To understand discordance between gene trees and species trees beyond ILS, we used two NN analyses. For the first, we used a regression approach and modeled predictions for the RFgtst discordance. All NN analyses converged and accuracy did not vary among training data sets ranging in size from 0.5 to 0.9 (mean and SD r² = 0.87, 0.015). We estimated an average RMSE, r², and MAE (SD in parentheses) of 0.06 (0.013), 0.86 (0.07), and 0.05 (0.009), respectively. These strongly suggested that the NN was accurate, particularly relative to random responses (test between real and random accuracy metrics; P < 2.2 × 10⁻¹⁶), which were on average 0.24 (RMSE), 0.01 (r²), and 0.17 (MAE; Fig. 5). The top five most important variables for predicting RFgtst by each locus were: mean and SD of sCF, PIS, phylogenetic informativeness, frequency of sites with three observable alternate bases, and frequency of sites with four observable alternate bases (Fig. 6). Individually, Bayesian correlation (BayesFirstAid; https://github.com/rasmusab/bayesian_first_aid) between each of these variables and RFgtst showed strong negative correlation (Fig. 6). We also tested the efficacy of the NN approach by filtering for multicollinearity using VIF, which removed all but five variables (mean and SD sCF, PIS, number of tips, and number of gaps). This essentially produced the same prediction as using all 26 variables (accuracy = RMSE = 0.06, r² = 0.87, MAE = 0.047) with variable importance ranked at 1.0 for all uncorrelated variables: PIS, number of tips, number of gaps, and standard deviation and mean sCF.

For the second machine-learning approach, we used NN classification to understand if any of the properties of these loci along with RFgtst could predict why particular genes failed to recover Toxicofera, Amerophidia, and Dibamia/Gekkota. We determined that accuracy (scaled between 0 and 1) when running these models over the training data set only had a modest increase over randomizing the responses (Fig. 5; increase in accuracy for Toxicofera = 0.08, Amerophidia = 0.08, Dibamia/Gekkota = 0.13), suggesting difficulty determining why certain loci fail to find these three relationships. Similarly, area under the ROC curve was low (0.60–0.70), and confusion matrices were not significant (P = 0.16–0.44). The top-ranked importance variables were RFgtst (0.96–1.0) and phylogenetic informativeness (0.92–0.96) for Toxicofera and Dibamia/Gekkota. For Amerophidia, we found that the variables RFgtst (0.94) and frequency of adenosine bases (0.75) ranked highest.

DISCUSSION

Using a genome-scale data set with a thorough and diverse sampling of taxa, we corroborate nearly all recent molecular studies that estimate strong support for several fundamental groupings in Squamata: Unidentata, Scincomorpha, Episquamata, Laterata, Toxicofera, and Amerophidia. Using gene-interrogation techniques, we do not find any support in the genome for the traditional morphology-based Scleroglossa/Iguania division nor for monophyly of large-gaped snakes, Macrostomata. Although the former has mainly fallen out of use in the literature after the rise of DNA sequence-based phylogenies, the latter has remained in use given analyses of morphological data that strongly support it (Conrad, 2008; Gauthier et al., 2012; Zaher and Scanferla, 2012; Hsiang et al., 2015).

Artificial NNs

Artificial NNs show that loci yielding discontinuity with the species tree, as measured by scaled RF distance, can be characterized by a few general properties. Although traditional support remains high across most parts of this phylogeny, both gCF and sCF provide another view where in several cases loci fail to infer key nodes (Figs. 3 and 4). We developed this NN approach to better understand if particular properties of genes or simple ILS can account for discordance (Figs. 5 and 6). Overall, most nodes reveal a pattern consistent with ILS. When using NN and filtering for multicollinearity to predict the degree of gene and species-tree discordance, we found the following properties of genes and samples were important for resolving species-tree phylogenies: the number of PIS, mean site concordance across all nodes in the tree, the number of terminals sampled, and, importantly, variance in concordance sites across all nodes of each gene tree.

As expected, genes with higher mean sCF across all nodes are better at producing trees concordant with the


[Figure 4 image not reproduced. Horizontal axis: number of loci (−300 to 300); vertical axis: node date (0–250 Ma); legend: does not support species tree, supports species tree. Labeled nodes: Candoiidae + Boidae, Pleurodonta relationships, Colubroides, Amerophidia, Bolyeriidae(Xenopeltidae(Loxocemidae, Pythonidae)), Serpentes, Anguiformes, Iguania, Laterata, Toxicofera, Scincomorpha, Episquamata, Unidentata, Dibamia/Gekkota, and Squamata.]

FIGURE 4. Density of loci supporting or not supporting nodes in the species tree scaled against dates of nodes (Ma). Major taxonomic groupings are indicated along these locus densities by particular node.

species trees. This gene property is correlated with the number of PIS (ρ = 0.400, P < 1.1 × 10⁻¹⁵). In turn, with our tests of multicollinearity, PIS is correlated with a large number of other properties including phylogenetic informativeness over time (related to substitution rate over time), length of the gene, base-pair frequencies, and number of segregating sites. Interestingly, high variance in sCF predicts higher discordance between the gene and species-tree topologies as measured by RF. This suggests that, for the main topology estimated using species-tree techniques, genes with high sCF variance are concordant with some nodes and not others, again likely associated with the number of PIS per locus. In addition, both gCF and sCF are correlated with branch lengths, which indicates that the areas of a phylogeny that are difficult to infer with credible support will more likely be those where the time between divergences was short. This has been known from the phylogenetic literature using both concatenated and multispecies coalescent-based methods (Philippe et al., 1994; Xu and Yang, 2016).

In summary, gene-tree and species-tree discordance is a mix of ILS, which is easy to ameliorate given coalescent-based phylogenetic inference, various properties of the genes, like site concordance and PIS, and short times between divergences, which may be difficult to resolve with any amount of data. It is likely that this NN method could serve to filter loci prior to final species-tree estimation. Care should be taken in cases where a majority of loci feature poor properties for tree inference, such as low PIS; preliminary species trees would likely be poorly estimated and supported, thus providing unusable estimates of sCF and gCF.

Squamate Phylogenomics

Some authors have suggested that molecular convergence, such as that found in previous phylogenetic studies (e.g., Castoe et al., 2009), may account for an erroneous estimation of Toxicofera (see Losos et al., 2012). However, this suggestion, even when using a handful of loci, seems improbable given the overwhelming support for the group estimated across many individual markers (including nuclear, mitochondrial, and structural [SINE] loci) and from concatenated and species trees. In addition, there appears to be very little unambiguous evidence for Scleroglossa in most morphological data sets (Reeder et al., 2015). Molecular convergence has been found within loci, for instance within burrowing squamates in mtDNA (Castoe et al., 2009), and even across the genome in some limited examples (Foote et al., 2015). However, expecting convergence across most or all heritable markers including ultraconserved elements (UCE) (Streicher and Wiens, 2017), AHE loci, and mtDNA is unlikely given independence and function of these loci and the extremely low probability that taxa would randomly show the same relationships across these genes.

We do not find a single locus supporting Scleroglossa, whereas a majority of all possible loci containing all taxa recover Toxicofera (Figs. 3 and 4). Moreover, both concatenated and species-tree estimates support Toxicofera at 100%. The loci that do not recover Toxicofera are also problematic for most relationships, yielding higher gene-to-species-tree RF values. Interestingly, species-tree estimates from these loci also do not support Scleroglossa and still show some support for Toxicofera (Supplementary material available on Dryad). Previous research with a smaller data set also failed to find gene-support for Scleroglossa and supported Toxicofera


[Figure 5 image not reproduced. Top panel: RFgtst. Lower panels: Amerophidia, Toxicofera, and Dibamia/Gekkota. Horizontal axis: accuracy; vertical axis: density; legend: random, real.]

FIGURE 5. Accuracy of neural networks. The top graph shows accuracy predicting the RF distances between gene and species tree (RFgtst) using a neural network regression approach. The bottom three graphs show results from a neural-network classification analysis to determine why particular loci fail to find the indicated three phylogenetic groups. The figures represent the density of accuracy from known (real) classifications (0, does not support the group; 1, does support the group) and randomly shuffled classifications.
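Figure 5 contrasts accuracy under the real responses with accuracy after the responses are shuffled. The Python sketch below illustrates that comparison under loudly stated assumptions: a one-variable cut-off rule stands in for the neural network (the study fitted NNs with caret in R), the data are simulated, and the 80/20 split, replicate count, and all names are hypothetical.

```python
import random


def accuracy(pred, truth):
    """Fraction of predictions that match the true labels."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


def fit_cutoff(xs, ys):
    """Stand-in classifier: cut-off c for which 'x <= c -> 1' fits best."""
    return max(set(xs), key=lambda c: accuracy([int(x <= c) for x in xs], ys))


def mean_test_accuracy(xs, ys, reps, rng, shuffle_labels=False):
    """Average held-out accuracy over repeated random 80/20 splits."""
    scores = []
    for _ in range(reps):
        labels = ys[:]
        if shuffle_labels:
            rng.shuffle(labels)  # break the link between predictor and response
        idx = list(range(len(xs)))
        rng.shuffle(idx)  # recompose the training and test sets
        cut = int(0.8 * len(idx))
        train, test = idx[:cut], idx[cut:]
        c = fit_cutoff([xs[i] for i in train], [labels[i] for i in train])
        pred = [int(xs[i] <= c) for i in test]
        scores.append(accuracy(pred, [labels[i] for i in test]))
    return sum(scores) / reps


rng = random.Random(1)
# Simulated loci: a scaled RF-like predictor and a noisy 1/0 response that is
# mostly 1 when the predictor is small (about 10% of labels are flipped).
rf = [rng.random() for _ in range(200)]
supports = [int(x < 0.4) if rng.random() > 0.1 else int(x >= 0.4) for x in rf]

real = mean_test_accuracy(rf, supports, 50, rng)
null = mean_test_accuracy(rf, supports, 50, rng, shuffle_labels=True)
print(f"real = {real:.2f}, shuffled = {null:.2f}")
```

A large gap between the two means, as in this simulation, indicates that the predictor carries information about the response; a small gap, as reported for the three focal relationships, indicates that it carries little.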

(Reeder et al., 2015). Given that characters supporting Scleroglossa (Conrad, 2008; Gauthier et al., 2012) may be problematic due to potential convergence arising from independent adaptation to burrowing (Reeder et al., 2015), we discourage any further use of Scleroglossa as a nomen outside of a historical context. On the other hand, sister-group relationships within Toxicofera remain unresolved given that the clade formed by Iguania and Anguiformes received low support values in one method (quadripartition support), despite the large number of loci used in our study. This relationship has been inferred previously using genomic data with greater support but using fewer taxa (Streicher and Wiens, 2017). Therefore, given independent support between UCEs and AHE data, it may be likely that the node subtending Iguania/Anguiformes is correct.

It is important to note that in our study Serpentes shows a long unbranched interval of almost 50 Ma separating it from the remaining two toxicoferan clades (Fig. 1). The lack of representation of a number of key extinct lineages capable of resolving the placement of snakes by filling this geochronological gap may also confound phylogeny. These remaining two toxicoferan clades are also subtended by an unusually small branch, <1.4 my, where 98% of branching times in this tree exceed this length. Unsolved relationships within Toxicofera and other difficult regions in this tree (e.g., relationships within Pleurodonta), therefore, may be due to


[Figure 6 image not reproduced. Panel a: variable importance (frequency, 0–100) for each predictor variable. Panel b: RF distance (GT-ST) plotted against SD of site concordance factors, sites with 4 bases, parsimony informative sites, and phylogenetic informativeness, each with the median ρ and 95% HDI of the Bayesian correlation; N = 370.]

FIGURE 6. A) Variable importance (top five from each of 100 replicates) from a neural-network regression analysis designed to predict the scaled response of Robinson-Foulds (RF) distances for each gene tree (GT) against the species tree (ST). B) Bayesian correlations associating RF distances with each of the four most important predictor variables.
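Two of the predictors shown in Figure 6, parsimony-informative sites (PIS) and the number of bases observed per site, can be computed directly from an alignment. The Python sketch below is illustrative only (the study computed these metrics with a custom R script using ape). It assumes an alignment given as a list of equal-length uppercase sequences, ignores gaps and ambiguity codes, and treats a site as parsimony-informative when at least two bases each occur in at least two sequences. The toy alignment and function names are hypothetical.

```python
from collections import Counter


def column_base_counts(alignment):
    """Yield a Counter of unambiguous bases (A, C, G, T) for each column."""
    for column in zip(*alignment):
        yield Counter(base for base in column if base in "ACGT")


def parsimony_informative_sites(alignment):
    """Count columns where at least two bases each occur at least twice."""
    return sum(
        1
        for counts in column_base_counts(alignment)
        if sum(n >= 2 for n in counts.values()) >= 2
    )


def sites_by_bases_observed(alignment):
    """Map k (distinct bases seen at a site) to the number of such sites."""
    return dict(Counter(len(counts) for counts in column_base_counts(alignment)))


# Hypothetical four-sequence alignment with one gap.
toy = ["ACGTA", "ACGAA", "ATGAC", "ATG-C"]
print(parsimony_informative_sites(toy))
print(sites_by_bases_observed(toy))
```

In the toy alignment, columns 2 and 5 are parsimony-informative; column 4 is variable but not informative because only one base occurs more than once.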


excessively short internodes, which may require data from the entire genome to estimate with any credible support.

Similarly, we find strong support for a deep sister relationship between Aniliidae ("Microstomata") and Tropidophiidae (Macrostomata) using both concatenated and species-tree methods (bootstrap and Pp support = 100%) and among loci (present in 69% of gene trees), indicating that the taxon Macrostomata defined by snakes with expanded gapes (Zaher, 1998; Conrad, 2008; Wilson et al., 2010; Gauthier et al., 2012; Hsiang et al., 2015) is invalid. This suggests that the origin of large gapes has either evolved or been lost multiple times and is consistent with Harrington and Reeder (2017), and it may indicate that previous hypotheses on the origin of a "macrostomatan" diet in Serpentes require reanalysis (e.g., Rodriguez-Robles et al., 1999). Unfortunately, the phylogenetic affinities of a number of extinct alethinophidian lineages with key macrostomatan features (such as Pachyrhachis, Haasiophis, Eupodophis, Yurlunggur, Wonambi, and Sanajeh) remain in dispute (Conrad, 2008; Gauthier et al., 2012; Zaher and Scanferla, 2012; Reeder et al., 2015). A more accurate placement of these fossils at the base of the alethinophidian tree might help clarify higher-level affinities between amerophidian and afrophidian lineages.

Within Anguiformes, both concatenated and species-tree estimates support Shinisaurus as the sister taxon to varanids and Helodermatidae as the sister taxon to Anguioidea (Diploglossidae, Anniellidae, Anguidae, Xenosauridae). This topology conflicts with the one suggested by morphological data, including those with expanded fossil sampling, where helodermatids and Shinisaurus are traditionally recognized as the sister groups of varanids and Xenosaurus, respectively (McDowell and Bogert, 1954; Gao and Norell, 1998; Gauthier, 1998; Gauthier et al., 2012). However, Conrad (2008) has shown that Shinisaurus and Xenosaurus were not closely related, the former being the sister group of varanoids (including Helodermatidae) and the latter the sister group of Anguidae. Conrad's (2008) morphological tree closely matches molecular estimates of anguiform affinities, including ours, with the exception of the phylogenetic position of Helodermatidae. The distinct and strongly supported evidence provided by morphological and molecular data regarding the phylogenetic affinities of Helodermatidae within Anguiformes is another point of conflict that still awaits a solution.

Finally, the placement of Dibamia within Squamata has been difficult to resolve; molecular studies place Dibamia as the sister to Gekkota, sister to the remaining Squamata, or sister to Unidentata (Townsend et al., 2004; Vidal and Hedges, 2005; Pyron et al., 2013; Reeder et al., 2015; Streicher and Wiens, 2017). Most morphological studies place Dibamia with other limbless, burrowing taxa such as Amphisbaenia or Serpentes (Evans and Barbadillo, 1998, 1999; Hallermann, 1998; Lee, 1998; Rieppel and Zaher, 2000; Evans et al., 2005; Conrad, 2008; Gauthier et al., 2012), though authors typically acknowledge that many characters showing this relationship may be due to convergence reflecting shared ecology. Our analyses inferred Dibamia as the sister to Gekkota, but with poor support among methods and loci, where both gCF and sCF were split evenly among primary and discordant nodes for both metrics. The next-most-probable placement is sister to all other Squamata, which was also found using UCEs in Streicher and Wiens (2017), and then sister to Episquamata. Importantly, our results never find Dibamia closely related to other limbless or burrowing taxa. In general, other methods characterizing genomic data sets have also had difficulty confidently estimating deep relationships, for which particular genes may have biased results (Brown and Thomson, 2016).

Phylogenetic Structure and the Origin of Squamates

The structure of our phylogeny is extremely similar between concatenated and species-tree methods, with RF distances being only 7.7% of the maximum RF. Support among all nodes for all measures of support is high, with >90% of nodes having 95% support or higher. Although we infer a tree similar to those published previously (Pyron et al., 2013; Reeder et al., 2015; Streicher and Wiens, 2017), where the tree is largely structured as (Unidentata(Episquamata(Toxicofera))), we note that several key nodes remain poorly supported (Figs. 1 and 2). For example, several deep relationships such as the sister to Serpentes (Anguiformes or Iguania, or both) within Toxicofera remain uncertain.

Similarly, support is low for resolving relationships among pleurodont families (primarily the New World iguanians), which is similar to results from previous molecular and morphological attempts to understand these relationships (Etheridge and Frost, 1989; Townsend et al., 2004; Reeder et al., 2015; Streicher et al., 2015). Comparable with Streicher and Wiens (2016), we found excessively short branch lengths subtending relationships among iguanian families, here ranging from 0.3 to 1.6 my, which are in the lower 0.4–1.7% of shortest internodes on our dated tree. It is likely, given their rapid divergence in the Upper Cretaceous, that signal across the genome to confidently estimate this area of the tree may remain difficult to extract (Rokas and Carroll, 2008). We also find poor support for the placement of Candoiidae within Booidea and Bolyeriidae with respect to Booidea and Pythonoidea, relationships also dating to the Upper Cretaceous. However, dense taxon sampling within these relictual families is not possible. Finally, support for relationships among some of the families of the rapidly diverging and diverse Colubroidea and Elapoidea is lacking, mirroring numerous previous studies (Lawson et al., 2005; Zaher et al., 2009; Pyron et al., 2011). We are presently sampling Colubroidea and Elapoidea more densely to better estimate those interfamilial relationships. As with all regions in the


Squamate Tree of Life with poorly supported, short nodes, we are hopeful that imminent whole genome studies may be capable of increasing support in that particular region of the tree, though strong signal for a specific arrangement may always remain elusive.

Our results provide a solid foundation for the origins of all groupings of Squamata, with major pulses of diversification occurring in the Jurassic, Cretaceous, and Paleogene (Figs. 1 and 2). These results generally agree with previous studies using genetic and morphological data (Mulcahy et al., 2012; Pyron and Burbrink, 2014; Harrington and Reeder, 2017), though disparities in the sampling of loci and taxa make direct comparisons among studies difficult. For example, the root times for Squamata in a recent paper were found to be older, with crown-Squamata originating in the Late Triassic (∼206 Ma; Simões et al., 2018). However, Simões et al. (2018) expanded the concept of Squamata by including Megachirella and […] from the […] and Early Jurassic, respectively; these two poorly preserved taxa have been identified previously as stem-lepidosaurs (Evans and Jones, 2010). Our results are concordant with divergence dates for the origin of Squamata and for the major divisions giving rise to higher-level groups within squamates given by Pyron (2017) and Jones et al. (2013), but we note many of our calibrations were taken from those studies. Unidentata, Episquamata, and Toxicofera all arose in the Early to Middle Jurassic. In addition, within these groups, diversification producing the primary taxonomic divisions for the following taxa also occurred in the Jurassic: Scincomorpha, Laterata, Iguania, Cordylioidea, Pleurodonta, Acrodonta, and the root of Dibamia and Gekkota.

Similar to recent combined fossil, morphological, and molecular studies (Jones et al., 2013; Pyron, 2017; Simões et al., 2018), a large number of diverse groups subsequently originated in the Cretaceous, including the crown Serpentes and their major divisions into Typhlopoidea, Amerophidia, Alethinophidia, Afrophidia, and Caenophidia. Within these divisions, well-known and widely distributed groups such as Pythonoidea, Booidea, and Uropeltoidea diversified as well. Within lizards, we see origins and diversification within Gekkota, Teiioidea, Pleurodonta, and Acrodonta. Although understanding how the K/Pg boundary affected rates of diversification within Squamata requires additional information from the fossil record, it is clear that the groups originating in the Mesozoic rapidly diversified into all major families during the Cenozoic, ultimately producing the 10,800 currently known extant species. Most of these extant families of squamates diversified massively throughout the Paleogene and Neogene, underscoring the origins and diversification of the many hyperdiverse families of Colubriformes.

CONCLUSION

We provide a robust, dated phylogenomic estimate of phylogenetic relationships among Squamata, sampling widely across almost all major […] and […] groups. This research provides a solid framework for understanding the relationships and dates of origins of all extant squamates, showing that most major families diversified prior to the K/Pg boundary. We also provide a novel framework for interrogating genes using artificial-intelligence techniques to understand how particular loci differ from species-tree estimates. Importantly, we show that both ILS and poor tree estimation, given properties of genes such as the number of informative sites, may produce significant discordance among gene and species trees. All analyses fail to support the two traditional groupings of squamates into Scleroglossa and Macrostomata, but rather support a consistent pattern grouping the squamates into Unidentata, Scincomorpha, Episquamata, Laterata, and Toxicofera. We also highlight areas of topological uncertainty within particular groups, such as relationships among pleurodont families and deep toxicoferan relationships, that represent potential avenues of novel research using whole genomes and denser taxon sampling to properly infer these relationships.

SUPPLEMENTARY MATERIAL

Data available from the Dryad Digital Repository: https://doi.org/10.5061/dryad.sm6jb0p.

ACKNOWLEDGMENTS

We thank the following curators who kindly provided tissue samples for our study: G. Schneider (UMMZ), L. Densmore, L. Grismer, A. Bauer, T. Jackman, R. Brown and C. K. Onn (KU), C. Austin, R. Brumfield and D. Dittman (LSUMNS), J. Rosado (MCZ), M. Hagemann (BPBM), A. Wynn (USNM), J. Vindum (CAS), J. McGuire and C. Spencer (MVZ), D. Kizirian (AMNH), A. Resetar (FMNH), K. Krysko and T. Lott (UF). We thank M. Kortyna, Al. Bigelow, S. Holland, J. Cherry at Florida State University's Center for Anchored Phylogenomics for assistance with data collection and analysis.

FUNDING

This research was supported by Fundação de Amparo à Pesquisa do Estado de São Paulo [grant number BIOTA-FAPESP 2011/50206-9 to H.Z.], National Science Foundation [grant numbers DEB-1257926 to F.T.B., DEB-1441719 to R.A.P., DEB-1257610 to C.J.R.], and Australian Research Council Discovery [grant number DP120104146 to J.S.K. and S.C.D.]. F.G.G. benefited from a Postdoctoral grant from Fundação de Amparo à Pesquisa do Estado de São Paulo [FAPESP grant number 2012/08661-3].

REFERENCES

Alencar L.R.V., Quental T.B., Grazziotin F.G., Alfaro M.L., Martins M., Venzon M., Zaher H. 2016. Diversification in vipers: phylogenetic relationships, time of divergence and shifts in speciation rates. Mol. Phylogenet. Evol. 105:50–62.


Ané C., Larget B., Baum D.A., Smith S.D., Rokas A. 2007. Bayesian estimation of concordance among gene trees. Mol. Biol. Evol. 24:412–426.
Anisimova M., Gil M., Dufayard J.-F., Dessimoz C., Gascuel O. 2011. Survey of branch support methods demonstrates accuracy, power, and robustness of fast likelihood-based approximation schemes. Syst. Biol. 60:685–699.
Arcila D., Ortí G., Vari R., Armbruster J.W., Stiassny M.L.J., Ko K.D., Sabaj M.H., Lundberg J., Revell L.J., Betancur-R R. 2017. Genome-wide interrogation advances resolution of recalcitrant groups in the tree of life. Nat. Ecol. Evol. 1:20.
Barker D.G., Barker T.M., Davis M.A., Schuett G.W. 2015. A review of the systematics and taxonomy of Pythonidae: an ancient serpent lineage. Zool. J. Linn. Soc. 175:1–19.
Baum D.A. 2007. Concordance trees, concordance factors, and the exploration of reticulate genealogy. Taxon 56:417–426.
Brown J.M., Thomson R.C. 2016. Bayes factors unmask highly variable information content, bias, and extreme influence in phylogenomic analyses. Syst. Biol. 66:517–530.
Burbrink F.T., Gehara M. 2018. The biogeography of deep time phylogenetic reticulation. Syst. Biol. 67:743–755.
Burbrink F.T., Lorch J.M., Lips K.R. 2017. Host susceptibility to snake fungal disease is highly dispersed across phylogenetic and functional trait space. Sci. Adv. 3:e1701387.
Burbrink F.T., McKelvy A.D., Pyron R.A., Myers E.A. 2015. Predicting community structure in snakes on Eastern Nearctic islands using ecological neutral theory and phylogenetic methods. Proc. R. Soc. B Biol. Sci. 282:20151700.
Burbrink F.T., Ruane S., Kuhn A., Rabibisoa N., Randriamahatantsoa B., Raselimanana A.P., Andrianarimalala M.S.M., Cadle J.E., Lemmon A.R., Lemmon E.M., Nussbaum R.A., Jones L., Pearson R.G., Raxworthy C.J. 2019. The origins and diversification of the exceptionally rich gemsnakes (Colubroidea: Lamprophiidae: Pseudoxyrhophiinae) in Madagascar. Syst. Biol. https://doi.org/10.1093/sysbio/syz026.
Camp C.L. 1923. Classification of the lizards. Bull. Am. Mus. Nat. Hist. 48:289–480.
Castoe T.A., de Koning A.P.J., Kim H.M., Gu W.J., Noonan B.P., Naylor G., Jiang Z.J., Parkinson C.L., Pollock D.D. 2009. Evidence for an ancient adaptive episode of convergent molecular evolution. Proc. Natl. Acad. Sci. USA 106:8986–8991.
Chen M.-Y., Liang D., Zhang P. 2015. Selecting question-specific genes to reduce incongruence in phylogenomics: a case study of jawed vertebrate backbone phylogeny. Syst. Biol. 64:1104–1120.
Chernomor O., von Haeseler A., Minh B.Q. 2016. Terrace aware data structure for phylogenomic inference from supermatrices. Syst. Biol. 65:997–1008.
Colston T.J., Costa G.C., Vitt L.J. 2010. Snake diets and the deep history hypothesis. Biol. J. Linn. Soc. 101:476–486.
Conrad J.L. 2008. Phylogeny and systematics of squamata (Reptilia) based on morphology. Bull. Am. Mus. Nat. Hist. 310:1–182.
Cundall D., Irish F.J. 2008. The snake skull. In: Gans C., Gaunt A.S., Adler K., editors. Biology of the reptilia. Ithaca: Cornell University Press. p. 349–692.
Cundall D., Wallach V., Rossman D.A. 1993. The systematic relationships of the snake genus Anomochilus. Zool. J. Linn. Soc. 109:275–299.
De Veaux R.D., Ungar L.H. 1994. Multicollinearity: a tale of two nonparametric regressions. New York (NY): Springer. p. 393–402.
Dormann C.F., Elith J., Bacher S., Buchmann C., Carl G., Carré G., Marquéz J.R.G., Gruber B., Lafourcade B., Leitão P.J., Münkemüller T., McClean C., Osborne P.E., Reineking B., Schröder B., Skidmore A.K., Zurell D., Lautenbach S. 2013. Collinearity: a review of methods to deal with it and a simulation study evaluating their performance. Ecography (Cop.). 36:27–46.
Dornburg A., Fisk J.N., Tamagnan J., Townsend J.P. 2016. PhyInformR: phylogenetic experimental design and phylogenomic data exploration in R. BMC Evol. Biol. 16:262.
Duchêne D.A., Duchêne S., Ho S.Y.W. 2017. New statistical criteria detect phylogenetic bias caused by compositional heterogeneity. Mol. Biol. Evol. 34:1529–1534.
Esquerré D., Scott Keogh J. 2016. Parallel selective pressures drive convergent diversification of phenotypes in pythons and boas. Ecol. Lett. 19:800–809.
Esquerré D., Sherratt E., Keogh J.S. 2017. Evolution of extreme ontogenetic allometric diversity and heterochrony in pythons, a clade of giant and dwarf snakes. Evolution (NY) 71:2829–2844.
Estes R., Pregill G.K., Camp C.L., Charles L. 1988. Phylogenetic relationships of the lizard families: essays commemorating Charles L. Camp. In: Estes R., Pregill G.K., editors. Phylogenetic relationships of the lizard families: essays commemorating Charles L. Camp. Stanford: Stanford University Press. p. 119–282. Camp Memorial Symposium on the Phylogenetic Relationships of the Lizard Families 1982: Knoxville T.
Etheridge R.E., Frost D.R. 1989. A phylogenetic analysis and taxonomy of iguanian lizards (Reptilia, Squamata). Univ. Kansas Mus. Nat. Hist. Misc. Publ. 81:1–76.
Evans S., Wang Y., Li C. 2005. The early Cretaceous Chinese lizard, Yabeinosaurus: resolving an enigma. J. Syst. Palaeontol. 3:319–335.
Evans S.E. 2003. At the feet of the dinosaurs: the early history and radiation of lizards. Biol. Rev. Camb. Philos. Soc. 78:513–51.
Evans S.E., Barbadillo L.J. 1998. An unusual lizard (Reptilia: Squamata) from the Early Cretaceous of Las Hoyas, Spain. Zool. J. Linn. Soc. 124:235–265.
Evans S.E., Barbadillo L.J. 1999. A short-limbed lizard from the Lower Cretaceous of Spain. Syst. Paleontol. 60:73–85.
Evans S.E., Jones M.E.H. 2010. The origin, early history and diversification of Lepidosauromorph reptiles. Berlin, Heidelberg: Springer. p. 27–44.
Faraway J.J. 2002. Practical regression and anova using R. Bath: University of Bath.
Felsenstein J. 2004. Inferring phylogenies. Sunderland (MA): Sinauer Associates.
Foote A.D., Liu Y., Thomas G.W.C., Vinař T., Alföldi J., Deng J., Dugan S., van Elk C.E., Hunter M.E., Joshi V., Khan Z., Kovar C., Lee S.L., Lindblad-Toh K., Mancia A., Nielsen R., Qin X., Qu J., Raney B.J., Vijay N., Wolf J.B.W., Hahn M.W., Muzny D.M., Worley K.C., Gilbert M.T.P., Gibbs R.A. 2015. Convergent evolution of the genomes of marine mammals. Nat. Genet. 47:272–5.
Fry B.G. 2015. Venomous reptiles and their toxins: evolution, pathophysiology, and biodiscovery. Oxford: Oxford University Press.
Gao K.-Q., Norell M.A. 1998. Taxonomic revision of Carusia (Reptilia: Squamata) from the Late Cretaceous of the Gobi Desert and phylogenetic relationships of anguimorphan lizards. Am. Mus. Novit. 3230:1–51.
Garland T., Bennett A.F., Rezende E.L. 2005. Phylogenetic approaches in comparative physiology. J. Exp. Biol. 208:3015–3035.
Gauthier J.A. 1998. Fossil xenosaurid and anguid lizards from the Early Eocene Wasatch Formation, southeast Wyoming, and a revision of the Anguioidea. Rocky Mt. Geol. 21:7–54.
Gauthier J.A., Kearney M., Maisano J.A., Rieppel O., Behlke A.D.B. 2012. Assembling the squamate tree of life: perspectives from the phenotype and the fossil record. Bull. Peabody Mus. Nat. Hist. 53:3–308.
Green R.E., Krause J., Briggs A.W., Maricic T., Stenzel U., Kircher M., Patterson N., Li H., Zhai W., Fritz M.H.-Y., Hansen N.F., Durand E.Y., Malaspinas A.-S., Jensen J.D., Marques-Bonet T., Alkan C., Prüfer K., Meyer M., Burbano H.A., Good J.M., Schultz R., Aximu-Petri A., Butthof A., Höber B., Höffner B., Siegemund M., Weihmann A., Nusbaum C., Lander E.S., Russ C., Novod N., Affourtit J., Egholm M., Verna C., Rudan P., Brajkovic D., Kucan Ž., Gušic I., Doronichev V.B., Golovanova L.V., Lalueza-Fox C., de la Rasilla M., Fortea J., Rosas A., Schmitz R.W., Johnson P.L.F., Eichler E.E., Falush D., Birney E., Mullikin J.C., Slatkin M., Nielsen R., Kelso J., Lachmann M., Reich D., Pääbo S. 2010. A draft sequence of the Neandertal genome. Science 328:710–722.
Hallermann J. 1998. The ethmoidal region of Dibamus taylori (Squamata: Dibamidae), with a phylogenetic hypothesis on dibamid relationships within Squamata. Zool. J. Linn. Soc. 122:385–426.
Hamilton C.A., Lemmon A.R., Lemmon E.M., Bond J.E. 2016. Expanding anchored hybrid enrichment to resolve both deep and shallow relationships within the spider tree of life. BMC Evol. Biol. 16:212.


Harrington S.M., Leavitt D.H., Reeder T.W. 2016. Squamate phylogenetics, molecular branch lengths, and molecular apomorphies: a response to McMahan et al. Copeia 104:702–707.
Harrington S.M., Reeder T.W. 2017. Phylogenetic inference and divergence dating of snakes using molecules, morphology and fossils: new insights into convergent evolution of feeding morphology and limb reduction. Biol. J. Linn. Soc. 121:379–394.
Hastie T., Tibshirani R., Friedman J. 2009. The elements of statistical learning. New York (NY): Springer New York.
Heath T.A., Hedtke S.M., Hillis D.M. 2008. Taxon sampling and the accuracy of phylogenetic analyses. J. Syst. Evol. 46:239–257.
Hsiang A.Y., Field D.J., Webster T.H., Behlke A.D., Davis M.B., Racicot R.A., Gauthier J.A. 2015. The origin of snakes: revealing the ecology, behavior, and evolutionary history of early snakes using genomics, phenomics, and the fossil record. BMC Evol. Biol. 15:87.
Huson D.H., Klöpper T., Lockhart P.J., Steel M.A. 2005. Reconstruction of reticulate networks from gene trees. Springer, Berlin, Heidelberg. p. 233–249.
Jones M.E., Anderson C., Hipsley C.A., Müller J., Evans S.E., Schoch R.R. 2013. Integration of molecules and new fossils supports a Triassic origin for Lepidosauria (lizards, snakes, and tuatara). BMC Evol. Biol. 13:208.
Kalyaanamoorthy S., Minh B.Q., Wong T.K.F., von Haeseler A., Jermiin L.S. 2017. ModelFinder: fast model selection for accurate phylogenetic estimates. Nat. Methods. 14:587–589.
Katoh K., Standley D.M. 2013. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30:772–80.
Kearse M., Moir R., Wilson A., Stones-Havas S., Cheung M., Sturrock S., Buxton S., Cooper A., Markowitz S., Duran C., Thierer T., Ashton B., Meintjes P., Drummond A. 2012. Geneious basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics 28:1647–1649.
Kelly C.M.R., Barker N.P., Villet M.H., Broadley D.G. 2009. Phylogeny, biogeography and classification of the snake superfamily Elapoidea: a rapid radiation in the late Eocene. Cladistics Int. J. Willi Hennig Soc. 25:38–63.
Kuhn M. 2008. Building Predictive Models in R Using the caret Package. J. Stat. Softw. 28:1–26.
Lawson R., Slowinski J.B., Crother B.I., Burbrink F.T. 2005. Phylogeny of the Colubroidea (Serpentes): new evidence from mitochondrial and nuclear genes. Mol. Phylogenet. Evol. 37:581–601.
Lee M.S.Y. 1998. Convergent evolution and character correlation in burrowing reptiles: towards a resolution of squamate relationships. Biol. J. Linn. Soc. 65:369–453.
Lee M.S.Y., Scanlon J.D. 2002. Snake phylogeny based on osteology, soft anatomy and ecology. Biol. Rev. 77:333–401.
Lek S., Delacoste M., Baran P., Dimopoulos I., Lauga J., Aulagnier S. 1996. Application of neural networks to modelling nonlinear relationships in ecology. Ecol. Modell. 90:39–52.
Lemmon A.R., Emme S.A., Lemmon E.M. 2012. Anchored hybrid enrichment for massively high-throughput phylogenomics. Syst. Biol. 61:727–744.
Libbrecht M.W., Noble W.S. 2015. Machine learning applications in genetics and genomics. Nat. Rev. Genet. 16:321–332.
Liu L., Zhang J., Rheindt F.E., Lei F., Qu Y., Wang Y., Zhang Y., Sullivan C., Nie W., Wang J., Yang F., Chen J., Edwards S.V., Meng J., Wu S. 2017. Genomic evidence reveals a radiation of placental mammals uninterrupted by the KPg boundary. Proc. Natl. Acad. Sci. USA 114:E7282–E7290.
López-Giráldez F., Moeller A.H., Townsend J.P. 2013. Evaluating phylogenetic informativeness as a predictor of phylogenetic signal for metazoan, fungal, and mammalian phylogenomic data sets. Biomed. Res. Int. 2013:621604.
Lopez-Giraldez F., Townsend J.P. 2011. PhyDesign: a webapp for profiling phylogenetic informativeness. BMC Evol. Biol. 11:152.
Losos J., Hillis D., Greene H. 2012. Evolution. Who speaks with a forked tongue? Science 338:1428–1429.
Martin S.H., Davey J.W., Jiggins C.D. 2015. Evaluating the use of ABBA–BABA statistics to locate introgressed loci. Mol. Biol. Evol. 32:244–257.
McDowell S.B., Bogert C.M. 1954. The systematic position of Lanthanotus and the affinities of the anguinomorphan lizards. Bull. Am. Mus. Nat. Hist. 105:1–142.
McLoughlin S. 2001. The breakup history of Gondwana and its impact on pre-Cenozoic floristic provincialism. Aust. J. Bot. 49:271.
Minh B.Q., Hahn M.W., Lanfear R. 2018. New methods to calculate concordance factors for phylogenomic datasets. bioRxiv 487801.
Minh B.Q., Nguyen M.A.T., von Haeseler A. 2013. Ultrafast approximation for phylogenetic bootstrap. Mol. Biol. Evol. 30:1188–1195.
Mirarab S., Warnow T. 2015. ASTRAL-II: coalescent-based species tree estimation with many hundreds of taxa and thousands of genes. Bioinformatics 31:i44–i52.
Montgomery D.C., Peck E.A., Vining G.G. 2012. Introduction to linear regression analysis. Hoboken (NJ): Wiley.
Mulcahy D.G., Noonan B.P., Moss T., Townsend T.M., Reeder T.W., Sites J.W., Wiens J.J. 2012. Estimating divergence dates and evaluating dating methods using phylogenomic and mitochondrial data in squamate reptiles. Mol. Phylogenet. Evol. 65:974–991.
Nagy L.G., Kocsubé S., Csanádi Z., Kovács G.M., Petkovits T., Vágvölgyi C., Papp T. 2012. Re-mind the gap! Insertion-deletion data reveal neglected phylogenetic potential of the nuclear ribosomal internal transcribed spacer (ITS) of fungi. PLoS One 7:e49794.
Nguyen L.-T., Schmidt H.A., von Haeseler A., Minh B.Q. 2015. IQ-TREE: a fast and effective stochastic algorithm for estimating Maximum-Likelihood phylogenies. Mol. Biol. Evol. 32:268–274.
O'Connor D.E., Shine R. 2004. Parental care protects against infanticide in the lizard Egernia saxatilis (Scincidae). Anim. Behav. 68:1361–1369.
Oppel M. 1811. Die ordnungen, familien, und gattungen der reptilien, als prodrom einer naturgeschichte derselben. Munich: Joseph Lindauer.
Paradis E., Claude J., Strimmer K. 2004. APE: analyses of phylogenetics and evolution in R language. Bioinformatics 20:289–290.
Parker J., Tsagkogeorga G., Cotton J.A., Liu Y., Provero P., Stupka E., Rossiter S.J. 2013. Genome-wide signatures of convergent evolution in echolocating mammals. Nature 502:228–31.
Philippe H., Brinkmann H., Lavrov D.V., Littlewood D.T.J., Manuel M., Wörheide G., Baurain D. 2011. Resolving difficult phylogenetic questions: why more sequences are not enough. PLoS Biol. 9:e1000602.
Philippe H., Chenuil A., Adoutte A. 1994. Can the Cambrian explosion be inferred through molecular phylogeny? Development 1994:15–25.
Projecto-Garcia J., Natarajan C., Moriyama H., Weber R.E., Fago A., Cheviron Z.A., Dudley R., McGuire J.A., Witt C.C., Storz J.F. 2013. Repeated elevational transitions in hemoglobin function during the evolution of Andean hummingbirds. Proc. Natl. Acad. Sci. USA 110:20669–74.
Prum R.O., Berv J.S., Dornburg A., Field D.J., Townsend J.P., Lemmon E.M., Lemmon A.R. 2015. A comprehensive phylogeny of birds (Aves) using targeted next-generation DNA sequencing. Nature 526:569–573.
Pyron R.A. 2014. Temperate extinction in squamate reptiles and the roots of latitudinal diversity gradients. Glob. Ecol. Biogeogr. 23:1126–1134.
Pyron R.A. 2017. Novel approaches for phylogenetic inference from morphological data and total-evidence dating in squamate reptiles (lizards, snakes, and amphisbaenians). Syst. Biol. 66:38–86.
Pyron R.A., Burbrink F.T. 2012. Extinction, ecological opportunity, and the origins of global snake diversity. Evolution (NY) 66:163–178.
Pyron R.A., Burbrink F.T. 2014. Early origin of viviparity and multiple reversions to oviparity in squamate reptiles. Ecol. Lett. 17:13–21.
Pyron R.A., Burbrink F.T., Colli G.R., Montes de Oca A.N., Vitt L.J., Kuczynski C.A., Wiens J.J., de Oca A.N.M. 2011. The phylogeny of advanced snakes (Colubroidea), with discovery of a new subfamily and comparison of support methods for likelihood trees. Mol. Phylogenet. Evol. 58:329–342.
Pyron R.A., Burbrink F.T., Wiens J.J. 2013. A phylogeny and revised classification of Squamata, including 4161 species of lizards and snakes. BMC Evol. Biol. 13:93.
Pyron R.A., Hendry C.R., Chou V.M., Lemmon E.M., Lemmon A.R., Burbrink F.T. 2014. Effectiveness of phylogenomic data and coalescent species-tree methods for resolving difficult nodes in the phylogeny of advanced snakes (Serpentes: Caenophidia). Mol. Phylogenet. Evol. 81:221–231.


Pyron R.A., Hsieh F.W., Lemmon A.R., Lemmon E.M., Hendry C.R. 2016. Integrating phylogenomic and morphological data to assess candidate species-delimitation models in brown and red-bellied snakes (Storeria). Zool. J. Linn. Soc. 177:937–949.
R Core Team. 2015. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2013. Document freely available on the internet [Online]. Available: http://www.r-project.org.
Reeder T.W., Townsend T.M., Mulcahy D.G., Noonan B.P., Wood P.L., Sites J.W., Wiens J.J. 2015. Integrated analyses resolve conflicts over squamate reptile phylogeny and reveal unexpected placements for fossil taxa. PLoS One 10:e0118199.
Rhodin A.G.J., Kaiser H., van Dijk P.P., Wüster W., O'Shea M., Archer M., Auliya M., Boitani L., Bour R., Clausnitzer V., Contreras-MacBeath T., Crother B.I., Daza J.M., Driscoll C.A., Flores-Villela O., Frazier J., Fritz U., Gardner A.L., Gascon C., Georges A., Glaw F., Grazziotin F.G., Groves C.P., Haszprunar G., Hava P., Hero J.-M., Hoffmann M., Hoogmoed M.S., Horne B.D., Iverson J.B., Jäch M., Jenkins C.L., Jenkins R.K.B., Kiester A.R., Keogh J.S., Lacher Jr. T.E., Lovich J.E., Luiselli L., Mahler D.L., Mallon D.P., Mast R., McDiarmid R.W., Measey J., Mittermeier R.A., Molur S., Mosbrugger V., Murphy R.W., Naish D., Niekisch M., Ota H., Parham J.F., Parr M.J., Pilcher N.J., Pine R.H., Rylands A.B., Sanderson J.G., Savage J.M., Schleip W., Scrocchi G.J., Shaffer H.B., Smith E.N., Sprackland R., Stuart S.N., Vetter H., Vitt L.J., Waller T., Webb G., Wilson E.O., Zaher H., Thomson S. 2015. Comment on Spracklandus Hoser, 2009 (Reptilia, Serpentes, ELAPIDAE): request for confirmation of availability of the generic name and for the nomenclatural validation of the journal in which it was published (Case 3601; BZN 70: 234–237; 71: 30–38, 133–135, 181–182, 252–253).
Ricklefs R.E., Losos J.B., Townsend T.M. 2007. Evolutionary diversification of clades of squamate reptiles. J. Evol. Biol. 20:1751–1762.
Rieppel O. 1988. A review of the origin of snakes. Evolutionary biology. Boston (MA): Springer US. p. 37–130.
Rieppel O. 2012. "Regressed" macrostomatan snakes. Fieldiana Life Earth Sci. 5:99–103.
Rieppel O., Zaher H. 2000. The intramandibular joint in squamates, and the phylogenetic relationships of the fossil snake Pachyrhachis problematicus Haas. Fieldiana Geol. 43:1–69.
Rieppel O., Zaher H., Tchernov E., Polcyn M.J. 2003. The anatomy and relationships of Haasiophis terrasanctus, a fossil snake with well-developed hind limbs from the Mid-Cretaceous of the Middle East. J. Paleontol. 77:536–558.
Robinson D.F., Foulds L.R. 1981. Comparison of phylogenetic trees. Math. Biosci. 53:131–147.
Rodriguez-Robles J.A., Bell C.J., Greene H.W. 1999. Gape size and evolution of diet in snakes: feeding ecology of erycine boas. J. Zool. 248:49–58.
Rokas A., Carroll S.B. 2008. Frequent and widespread parallel evolution of protein sequences. Mol. Biol. Evol. 25:1943–1953.
Rokyta D.R., Lemmon A.R., Margres M.J., Aronow K. 2012. The venom-gland transcriptome of the eastern diamondback rattlesnake (Crotalus adamanteus). BMC Genomics 13:312.
Rosenberg M.S., Kumar S. 2003. Heterogeneity of nucleotide frequencies among evolutionary lineages and phylogenetic inference. Mol. Biol. Evol. 20:610–621.
Ruane S., Raxworthy C.J., Lemmon A.R., Lemmon E.M., Burbrink F.T. 2015. Comparing species tree estimation with large anchored phylogenomic and small Sanger-sequenced molecular datasets: an empirical study on Malagasy pseudoxyrhophiine snakes. BMC Evol. Biol. 15:221.
Saint K.M., Austin C.C., Donnellan S.C., Hutchinson M.N. 1998. C-mos, a nuclear marker useful for squamate phylogenetic analysis. Mol. Phylogenet. Evol. 10:259–263.
Sanderson M.J. 2002. Estimating absolute rates of molecular evolution and divergence times: a penalized likelihood approach. Mol. Biol. Evol. 19:101–109.
Savage J.M. 2015. What are the correct family names for the taxa that include the snake genera Xenodermus, Pareas, and Calamaria? Herpetol. Rev. 46:664–665.
Scanlon J.D. 2006. Skull of the large non-macrostomatan snake Yurlunggur from the Australian Oligo-Miocene. Nature 439:839–842.
Shan Y., Paull D., McKay R.I. 2006. Machine learning of poorly predictable ecological data. Ecol. Modell. 195:129–138.
Sheehan S., Song Y.S., Buzbas E., Petrov D., Boyko A., Auton A. 2016. Deep learning for population genetic inference. PLOS Comput. Biol. 12:e1004845.
Shimodaira H., Hasegawa M. 1999. Multiple comparisons of log-likelihoods with applications to phylogenetic inference. Mol. Biol. Evol. 16:1114–1116.
Siegel D.S., Miralles A., Aldridge R.D. 2011. Controversial snake relationships supported by reproductive anatomy. J. Anat. 218:342–348.
Simões T.R., Caldwell M.W., Tałanda M., Bernardi M., Palci A., Vernygora O., Bernardini F., Mancini L., Nydam R.L. 2018. The origin of squamates revealed by a Middle Triassic lizard from the Italian Alps. Nature 557:706–709.
Smith S.A., Brown J.W., Yang Y., Bruenn R., Drummond C.P., Brockington S.F., Walker J.F., Last N., Douglas N.A., Moore M.J. 2018. Disparity, diversity, and duplications in the Caryophyllales. New Phytol. 217:836–854.
Smith S.A., O'Meara B.C. 2012. treePL: divergence time estimation using penalized likelihood for large phylogenies. Bioinformatics 28:2689–2690.
Som A. 2015. Causes, consequences and solutions of phylogenetic incongruence. Brief. Bioinform. 16:536–548.
Stamatakis A. 2014. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30:1312–1313.
Streicher J.W., Schulte J.A., Wiens J.J. 2015. How should genes and taxa be sampled for phylogenomic analyses with missing data? An empirical study in iguanian lizards. Syst. Biol. 65:128–45.
Streicher J.W., Wiens J.J. 2016. Phylogenomic analyses reveal novel relationships among snake families. Mol. Phylogenet. Evol. 100:160–169.
Streicher J.W., Wiens J.J. 2017. Phylogenomic analyses of more than 4000 nuclear loci resolve the origin of snakes among lizard families. Biol. Lett. 13:20170393.
Townsend J.T. 1971. Theoretical analysis of an alphabetic confusion matrix. Percept. Psychophys. 9:40–50.
Townsend T.M., Larson A., Louis E., Macey J.R. 2004. Molecular phylogenetics of squamata: the position of snakes, amphisbaenians, and dibamids, and the root of the squamate tree. Syst. Biol. 53:735–757.
Tucker D.B., Colli G.R., Giugliano L.G., Hedges S.B., Hendry C.R., Lemmon E.M., Lemmon A.R., Sites J.W., Pyron R.A. 2016. Methodological congruence in phylogenomic analyses with morphological support for teiid lizards (Sauria: Teiidae). Mol. Phylogenet. Evol. 103:75–84.
Uetz P., Freed P., Hošek J. 2009. The reptile database. Available from: http://www.reptile-database.org.
Underwood G. 1967. A contribution to the classification of snakes. London: British Museum.
Vidal N., Hedges S.B. 2004. Molecular evidence for a terrestrial origin of snakes. Proc. R. Soc. London Ser. B-Biological Sci. 271:S226–S229.
Vidal N., Hedges S.B. 2005. The phylogeny of squamate reptiles (lizards, snakes, and amphisbaenians) inferred from nine nuclear protein-coding genes. C. R. Biol. 328:1000–1008.
Vidal N., Hedges S.B. 2009. The molecular evolutionary tree of lizards, snakes, and amphisbaenians. C. R. Biol. 332:129–139.
Vidal N., Marin J., Morini M., Donnellan S., Branch W.R., Thomas R., Vences M., Wynn A., Cruaud C., Hedges S.B. 2010. Blindsnake evolutionary tree reveals long history on Gondwana. Biol. Lett. 6:558–561.
Vitt L.J., Caldwell J.P. 2009. Herpetology. Burlington (MA): Elsevier.
Vitt L.J., Pianka E.R. 2005. Deep history impacts present-day ecology and biodiversity. Proc. Natl. Acad. Sci. USA 102:7877–7881.
Wiens J.J., Hutter C.R., Mulcahy D.G., Noonan B.P., Townsend T.M., Sites J.W., Reeder T.W. 2012. Resolving the phylogeny of lizards and snakes (Squamata) with extensive sampling of genes and species. Biol. Lett. 8:1043–1046.
Wiens J.J., Kuczynski C.A., Townsend T., Reeder T.W., Mulcahy D.G., Sites J.W. 2010. Combining phylogenomics and fossils in higher level squamate reptile phylogeny: molecular data change the placement of fossil taxa. Syst. Biol. 59:674–688.


Wilson J.A., Mohabey D.M., Peters S.E., Head J.J. 2010. Predation upon hatchling dinosaurs by a new snake from the Late Cretaceous of India. PLoS Biol. 8:e1000322.
Wortley A.H., Rudall P.J., Harris D.J., Scotland R.W. 2005. How much data are needed to resolve a difficult phylogeny? Case study in Lamiales. Syst. Biol. 54:697–709.
Xu B., Yang Z. 2016. Challenges in species tree estimation under the multispecies coalescent model. Genetics 204:1353–1368.
Zaher H. 1998. The phylogenetic position of Pachyrhachis within snakes (Squamata, Lepidosauria). J. Vertebr. Paleontol. 18:1–3.
Zaher H., de Oliveira L., Grazziotin F.G., Campagner M., Jared C., Antoniazzi M.M., Prudente A.L. 2014. Consuming viscous prey: a novel protein-secreting delivery system in neotropical snail-eating snakes. BMC Evol. Biol. 14:58.
Zaher H., Grazziotin F.G., Cadle J.E., Murphy R.W., de Moura J.C., Bonatto S.L. 2009. Molecular phylogeny of advanced snakes (Serpentes, Caenophidia) with an emphasis on South American xenodontines: a revised classification and descriptions of new taxa. Pap. Avulsos Zool. 49:115–153.
Zaher H., Scanferla C.A. 2012. The skull of the Upper Cretaceous snake Dinilysia patagonica Smith-Woodward, 1901, and its phylogenetic position revisited. Zool. J. Linn. Soc. 164:194–238.
Zaher H., Yánez-Muñoz M.H., Rodrigues M.T., Graboski R., Machado F.A., Altamirano-Benavides M., Bonatto S.L., Grazziotin F.G. 2018. Origin and hidden diversity within the poorly known Galápagos snake radiation (Serpentes: Dipsadidae). Syst. Biodivers. 16:614–642.
Zhang W. 2010. Computational ecology: artificial neural networks and their applications. Singapore: World Scientific.
Zou Z., Zhang J. 2016. Morphological and molecular convergences in mammalian phylogenetics. Nat. Commun. 7:12758.
