<<

Downloaded from http://rsbl.royalsocietypublishing.org/ on September 13, 2017

Evolutionary biology Phylogenomic analyses of more than rsbl.royalsocietypublishing.org 4000 nuclear loci resolve the origin of among families

Research Jeffrey W. Streicher1,2 and John J. Wiens1 1Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ 85721-0088, USA Cite this article: Streicher JW, Wiens JJ. 2017 2Department of Life Sciences, The Natural History Museum, London SW7 5BD, UK Phylogenomic analyses of more than 4000 JWS, 0000-0002-3738-4162; JJW, 0000-0003-4243-1127 nuclear loci resolve the origin of snakes among lizard families. Biol. Lett. 13: Squamate ( and snakes) are the most diverse group of terres- 20170393. trial vertebrates, with more than 10 000 species. Despite considerable effort http://dx.doi.org/10.1098/rsbl.2017.0393 to resolve relationships among major squamates clades, some branches have remained difficult. Among the most vexing has been the placement of snakes among lizard families, with most studies yielding only weak sup- port for the position of snakes. Furthermore, the placement of iguanian Received: 17 June 2017 lizards has remained controversial. Here we used targeted sequence capture Accepted: 18 August 2017 to obtain data from 4178 nuclear loci from ultraconserved elements from 32 squamate taxa (and five outgroups) including representatives of all major squamate groups. Using both concatenated and species-tree methods, we recover strong support for a sister relationship between iguanian and angu- imorph lizards, with snakes strongly supported as the sister group of these Subject Areas: two clades. These analyses strongly resolve the difficult placement of snakes and systematics within squamates and show overwhelming support for the contentious pos- ition of iguanians. More generally, we provide a strongly supported Keywords: hypothesis of higher-level relationships in the most species-rich tetrapod clade using coalescent-based species-tree methods and approximately 100 lizards, phylogenomics, snakes, species-tree times more loci than previous estimates.

Author for correspondence: Jeffrey W. Streicher 1. Introduction e-mail: [email protected] Squamates, lizards and snakes, are a particularly important group. They include more than 10 000 extant species [1], surpassing birds as the most diverse major tetrapod clade. They are used as study systems in many areas of research, including (among many) the evolution of sex, viviparity, herbivory, body form and toxins [2]. Venomous squamates also pose a serious hazard to humans in some regions [3], whereas their venoms offer many resources for medical research [4]. There has been considerable recent research on higher-level squamate phylogeny, but some branches have remained contentious. Molecular analyses have supported a very different set of relationships relative to those suggested by morphology. Morphological analyses have placed iguanian lizards at the base of the squamate tree [5,6]. By contrast, molecular analyses have placed gekkotans and dibamids at the base, and iguanians in a more recent clade with snakes and anguimorphs [7–11]. Combined analyses of molecular and morphological data have also supported the molecular trees [12,13], and have shown that the support for the morphological tree is problematic. Nevertheless, some authors have continued to question the placement of iguanians by Electronic supplementary material is available molecular data [6,14,15]. online at https://dx.doi.org/10.6084/m9. Further, some key aspects of higher-level squamate phylogeny have remained uncertain, even in molecular analyses. One is the placement of figshare.c.3865459. snakes. Many molecular analyses have placed snakes as the sister to a clade

& 2017 The Author(s) Published by the Royal Society. All rights reserved. Downloaded from http://rsbl.royalsocietypublishing.org/ on September 13, 2017

uniting iguanians and anguimorphs, but the support for the analyses used the CIPRES server [24] and branch support was 2 latter clade has been consistently weak [8–11]. Furthermore, assessed with bootstrapping. Species-trees analyses used the pro- rsbl.royalsocietypublishing.org previous studies generally used concatenated analyses, gram ASTRAL 4.10.11 [25,26], with branch support estimated rather than species-tree analyses. Concatenated analyses using local posterior probabilities [27]. may yield weak or even positively misleading results when We generally included only taxa with more than 1000 UCEs. However, we included two more incomplete taxa that were there is extensive conflict among gene trees, whereas potentially important for the tree’s base: the dibamid Anelytropsis species-tree methods can yield well-supported and accurate (90 UCEs; electronic supplementary material, table S1) and the results [16]. Such conflicts among gene trees are well docu- closest squamate outgroup (Sphenodon; 596 UCEs, electronic mented for many major branches of squamate phylogeny supplementary material, table S1). Although Anelytropsis is [9]. Another key area of uncertainty is whether the sister highly incomplete, its inclusion or exclusion had no impact on group to all other squamates is dibamids [7,10,11] or a the estimated topologies (see the electronic supplementary clade consisting of dibamids and gekkotans [9,12,13]. material, figures S1 and S2). Lett. Biol. Here, we analyse higher-level squamate relationships using phylogenomic data and species-tree methods. We obtain nuclear data from 4178 loci, approximately 100 times 13 more than previous analyses of higher-level squamate phylo- 3. Results 20170393 : geny [9,11,13]. We strongly support snakes as the sister group The concatenated and species-tree analyses provided strong of a well-supported clade uniting Iguania þ Anguimorpha. and concordant support for most higher-level squamate Thus, our analyses simultaneously resolve these two persistent relationships (figure 1; electronic supplementary material, controversies in squamate phylogeny. Overall, we provide a figure S3). There was strong support for a clade uniting angu- generally well-supported estimate of higher-level squamate imorphs, iguanians, and snakes (Toxicofera) and strong phylogeny using phylogenomic data and species-tree methods. support for a clade linking anguimorphs and iguanians. Thus, these results strongly resolve the controversial place- ment of snakes and iguanians with species-tree methods and a dataset of unprecedented size. There was strong, 2. Material and methods concordant support for most other relationships across the We sampled 32 squamate species and five outgroup taxa, which tree, such as the clade uniting scincids, xantusiids, gerrho- included representatives of all major amniote clades (i.e. rhyncho- saurids and cordylids (Scincoidea), and the clade uniting cephalians, crocodilians, birds, turtles and mammals). Sampling gymnophthalmids, teiids, lacertids, and amphisbaenians within squamates included all major groups and most families (out- (Lacertoidea). The backbone relationships among these side of snakes, iguanians and amphisbaenians). Details of sampling major clades were also strongly supported. are given in the electronic supplementary material, table S1. There were also a few remaining areas of uncertainty. We used anchored enrichment to sequence ultraconserved Both approaches agreed that dibamids and gekkotans are at elements (UCEs, [17]) and obtained an average of 2738.7 UCEs the base of the squamate tree, but differ as to whether diba- per species. Laboratory methods (following [18,19]) are listed mids are sister to all other squamates (ASTRAL), or whether in the electronic supplementary material, appendix S1. Data for dibamids and gekkotans are sister taxa (concatenated). The most outgroups (all but Sphenodon) and nine ingroup taxa were from previous studies (electronic supplementary material, conflicting relationships were moderately supported by table S1). We obtained new UCE data for 24 taxa. In theory, each approach. There were also conflicts (and weak support we could have included additional published data for iguanian from ASTRAL) among some gekkotan families. Finally, lizards [18] and snakes [19]. However to make the analyses there were conflicting results and weak support regarding more computationally tractable, we included only three species monophyly of amphisbaenians relative to lacertids. from each clade, representing major lineages (i.e. scolecophidian, pythonid, and colubroid snakes; pleurodont and acrodont iguanians). We included the most complete taxon available within each major lineage. 4. Discussion We cleaned, assembled, identified and aligned UCE data In this study, we assembled a large phylogenomic dataset to using the software packages velvet 1.2.10 [20] and PHYLUCE address higher-level squamate phylogeny, using species-tree 2.0.0 [17,21]. We included UCEs with up to 50% missing taxa, which appears to give optimal results [18]. The final dataset methods and sampling approximately 100 times more loci had 4178 UCEs (1 530 015 total aligned base pairs), with 52.4% than previous studies. Our results strongly resolve the phylo- missing data overall (including gaps). The average number of genetic placement of snakes among lizard families, as the variable sites per UCE was 108.3 (range 17–455) and mean sister group to anguimorphs and iguanians. Concomitantly, UCE length was 380.0 base pairs (range 108–878). Raw FASTQ they provide overwhelming evidence to conclusively resolve files for new samples were uploaded to the NCBI sequence the contentious placement of iguanians. We acknowledge read archive (electronic supplementary material, table S1). Align- that some readers might consider the placement of snakes ments and gene trees for all analyses are available via the Dryad and iguanians as already known, but previous analyses had Digital Repository [22]. only weak support and generally used concatenated analyses Phylogenetic analyses were conducted using two [7–13]. Some authors have also argued for a þ angui- approaches: (i) concatenated maximum likelihood, and morph clade instead [28]. More generally, we provide (ii) species-tree analysis using gene trees as input (i.e. one best likelihood tree per UCE). To perform the concatenated analysis strong support for most higher-level relationships across and infer gene trees, we used RAXML 8.2.9 [23] with default set- squamates. We note that may be among the first tings and the GTRCAT model applied to the entire alignment major groups of vertebrates to have their higher-level (partitions are not generally appropriate for UCE data, given relationships largely resolved by species-tree analyses of that most loci are not protein coding [17]). The concatenated thousands of loci (along with birds [29]). Downloaded from http://rsbl.royalsocietypublishing.org/ on September 13, 2017

Homo sapiens 3 Chrysemys picta

Gallus gallus rsbl.royalsocietypublishing.org Alligator mississippiensis Sphenodon punctatus novaeguineae Anelytropsis papillosus Coleonyx variegatus Gonatodes albogularis Gekko gecko 0.38/100 Strophurus ciliaris Gekkota Saltuarius cornutus 0.52/NS Lialis burtonis Acontias meleagris Plestiodon fasciatus Scincidae Scincoidea Tiliqua scincoides 0.85/NS Cricosaura typica Lett. Biol. Lepidophyma flavimaculatum Xantusiidae Cordylosaurus subtesselatus Smaug mossambicus + Cordylidae

Pholidobolus macbrydei 13 coalescent units

Tupinambis teguixin Gymnophthalmidae + Teiidae 20170393 : Lacertoidea Aspidoscelis tigris 0.6 Bipes biporus Lacerta viridis Lacertidae + Amphisbaenia 0.47/NS Rhineura floridana jamaicensis Micrurus fulvius Serpentes Python molurus Hydrosaurus sp. Anolis carolinensis lguania Toxicofera Uta stansburiana 1.0/99 Lanthanotus borneensis Varanus exanthematicus Heloderma suspectum Anguimorpha congruent with concatenated ML Anniella pulchra incongruent with concatenated ML 0.86/100 Xenosaurus platyceps

Figure 1. Phylogeny of squamate reptiles inferred from a species-tree analysis (ASTRAL) of 4178 nuclear loci and 1 530 015 aligned base pairs. Tree is rooted with Homo. Light blue highlighting indicates clades congruent with the concatenated maximum-likelihood analysis (see the electronic supplementary material, figure S3). Grey highlighting indicates incongruent clades. Support values are: local posterior probabilities (species tree)/bootstrap support (concatenated). NS indicates ‘no support’ from the concatenated analysis (incongruent branch). When no numbers are shown, support values are equal to 1.0/100 (maximum support from both methods).

Nevertheless, we acknowledge three areas of remaining contentious aspects of squamate phylogeny using appro- uncertainty in our tree. Two involve lower-level relationships, ximately 100 times more loci than in previous studies, and and can probably be resolved with increased taxon sampling with species-tree methods. Although some areas of uncer- (e.g. relationships among gekkotan families and among lacer- tainty remain, squamates may now be among the most tids and amphisbaenians). Thus, adding more taxa within phylogenetically well-resolved of the major tetrapod clades. these groups should help subdivide long branches associated with these problematic lower-level relationships. The third area (gekkotans, dibamids, other squamates) is more difficult. It is unclear if adding more gekkotans or dibamids can Data accessibility. All data are accessible via Dryad (http://dx.doi.org/ significantly shorten the relevant branches. Resolving these 10.5061/dryad.66gt5) [22]. relationships might require major methodological advances, Authors’ contributions. J.W.S. and J.J.W. designed the study and wrote the or even more loci. Intriguingly, we note that the relationships paper, J.W.S. collected data and performed analyses, all authors revised the paper, approved the final version to be published, and for these taxa from ASTRAL are consistent with concatenated ana- agree to be held accountable for the content therein. lyses including thousands of taxa [10,11]. The results from the Competing interests. The authors declare no competing interests. þ concatenated analysis (dibamids gekkotans) are consistent Funding. Funding was provided by the University of Arizona. with concatenated analyses of fewer taxa and loci [9,13]. Acknowledgements. We thank the many individuals and institutions that In summary, our results here provide strong support for provided tissues used here (listed in the electronic supplementary most higher-level squamate relationships. We resolve two material, table S1), but especially T. Reeder.

References

1. Uetz P, Freed P, Hosˇek J. (eds) 2017 The 3. Kasturiratne A et al. 2008 Estimation of the global Pharm. Des. 13, 2927–2934. (doi:10.2174/ Database. See http://www.reptile-database.org burden of snakebite. PLoS Med. 5, e218. (doi:10. 138161207782023739) (accessed 28 May 2017). 1371/journal.pmed.0050218) 5. Conrad JL. 2008 Phylogeny and systematics of 2. Sites Jr JW, Reeder TW, Wiens JJ. 2011 Phylogenetic 4. Fox JW, Serrano SMT. 2007 Approaching the Squamata (Reptilia) based on morphology. Bull. Am. insights on evolutionary novelties in lizards and golden age of natural product pharmaceuticals Mus. Nat. Hist. 310, 1–182. (doi:10.1206/310.1) snakes: sex, birth, bodies, niches, and venom. Annu. from venom libraries: an overview of toxins 6. Gauthier JA, Kearney M, Maisano JA, Rieppel O, Rev. Ecol. Evol. Syst. 42, 227–244. (doi:10.1146/ and toxin-derivatives currently involved in Behike ADB. 2012 Assembling the squamate tree of annurev-ecolsys-102710-145051) therapeutic or diagnostic applications. Curr. life: perspectives from the phenotype and the fossil Downloaded from http://rsbl.royalsocietypublishing.org/ on September 13, 2017

record. Bull. Peabody Mus. Nat. Hist. 53, 3–308. analyses resolve conflicts over squamate reptile Bioinformatics 32, 786–788. (doi:10.1093/ 4 (doi:10.3374/014.053.0101) phylogeny and reveal unexpected placements for bioinformatics/btv646) rsbl.royalsocietypublishing.org 7. Townsend T, Larson A, Louis EJ, Macey JR. 2004 fossil taxa. PLoS ONE 10, e0118199. (doi:10.1371/ 22. Streicher JW, Wiens JJ. 2017 Data from: Phylogenomic Molecular phylogenetics of Squamata: the position journal.pone.0118199) analyses of more than 4000 nuclear loci resolve the of snakes, amphisbaenians, and dibamids, and the 14. Losos JB, Hillis DM, Greene HW. 2012 Who speaks origin of snakes among lizard families. Dryad Digital root of the squamate tree. Syst. Biol. 53, 735–757. with a forked tongue? Science 338, 1428–1429. Repository. (http://dx.doi.org/10.5061/dryad.66gt5) (doi:10.1080/10635150490522340) (doi:10.1126/science.1232455) 23. Stamatakis A. 2014 RAxML version 8: a tool for 8. Vidal N, Hedges SB. 2005 The phylogeny of 15. McMahan CD, Freeborn LR, Wheeler WC, Crother BI. phylogenetic analysis and post-analysis of large squamate reptiles (lizards, snakes, and 2015 Forked tongues revisited: molecular phylogenies. Bioinformatics 30, 1312–1313. amphisbaenians) inferred from nine nuclear protein- apomorphies support morphological hypotheses of (doi:10.1093/bioinformatics/btu033) coding genes. C. R. Biol. 328, 1000–1008. (doi:10. squamate evolution. Copeia 103, 525–529. (doi:10. 24. Miller MA, Pfeiffer W, Schwartz T. 2010 Creating the

1016/j.crvi.2005.10.001) 1643/CH-14-015) CIPRES science gateway for inference of large Lett. Biol. 9. Wiens JJ, Hutter CR, Mulcahy DG, Noonan BP, 16. Edwards SV. 2009 Is a new and general theory phylogenetic trees. In Proc. Gateway Comp. Environ. Townsend TM, Sites Jr JW, Reeder TW. 2012 of molecular systematics emerging? Evolution 63, Workshop, 14 November 2010, New Orleans, USA,

Resolving the phylogeny of lizards and snakes 1–19. (doi:10.1111/j.1558-5646.2008.00549.x) pp. 1–8. 13 (Squamata) with extensive sampling of genes and 17. Faircloth BC, McCormack JE, Crawford NG, Harvey 25. Mirarab S, Bayzid MS, Zimmermann T, Swenson MS, 20170393 : species. Biol. Lett. 8, 1043–1046. (doi:10.1098/rsbl. MG, Brumfield RT, Glenn TC. 2012 Ultraconserved Warnow T. 2014 ASTRAL: genome-scale coalescent- 2012.0703) elements anchor thousands of genetic markers for based species tree estimation. Bioinformatics 30, 10. Pyron RA, Burbrink FT, Wiens JJ. 2013 A phylogeny target enrichment spanning multiple evolutionary i541–i548. (doi:10.1093/bioinformatics/btu462) and revised classification of Squamata, including timescales. Syst. Biol. 61, 717–726. (doi:10.1093/ 26. Mirarab S, Warnow T. 2015 ASTRAL-II: coalescent- 4161 species of lizards and snakes. BMC Evol. Biol. sysbio/sys004) based species-tree estimation with many 13, 93. (doi:10.1186/1471-2148-13-93) 18. Streicher JW, Schulte II JA, Wiens JJ. 2016 hundreds of taxa and thousands of genes. 11. Zheng Y, Wiens JJ. 2016 Combining phylogenomic How should genes and taxa be sampled for Bioinformatics 31, i44–i52. (doi:10.1093/ and supermatrix approaches, and a time-calibrated phylogenomic analyses with missing data? An bioinformatics/btv234) phylogeny for squamate reptiles (lizards and empirical study in iguanian lizards. Syst. Biol. 65, 27. Sayyari E, Mirarab S. 2016 Fast coalescent-based snakes) based on 52 genes and 4,162 species. Mol. 128–145. (doi:10.1093/sysbio/syv058) computation of local branch support from quartet Phylogenet. Evol. 94, 537–547. (doi:10.1016/j. 19. Streicher JW, Wiens JJ. 2016 Phylogenomic analyses frequencies. Mol. Biol. Evol. 33, 1654–1668. ympev.2015.10.009) reveal novel relationships among snake families. (doi:10.1093/molbev/msw079) 12. Wiens JJ, Kuczynski CA, Townsend T, Reeder TW, Mol. Phylogenet. Evol. 100, 160–169. (doi:10.1016/ 28. Lee MSY. 2009 Hidden support from unpromising Mulcahy DG, Sites Jr JW. 2010 Combining j.ympev.2016.04.015) datasets strongly unites snakes and anguimorph phylogenomics and fossils in higher-level squamate 20. Zerbino DR, Birney E. 2008 Velvet: algorithms for lizards. J. Evol. Biol. 22, 1308–1316. (doi:10.1111/j. reptile phylogeny: molecular data change the de novo short read assembly using de Bruijn 1420-9101.2009.01751.x) placement of fossil taxa. Syst. Biol. 59, 674–688. graphs. Genome Res. 18, 821–829. (doi:10.1101/ 29. Jarvis ED et al. 2014 Whole genome analyses (doi:10.1093/sysbio/syq048) gr.074492.107) resolve early branches in the tree of life of modern 13. Reeder TW, Townsend TM, Mulcahy DG, Noonan BP, 21. Faircloth BC. 2016 PHYLUCE is a software package birds. Science 346, 1321–1331. (doi:10.1126/ Wood PL, Sites Jr JW, Wiens JJ. 2015 Integrated for the analysis of conserved genomic loci. science.1253451)