Phylogenetic Perspectives on Viviparity, Gene-Tree Discordance, and Introgression in Lizards (Squamata)

Item Type text; Electronic Dissertation

Authors Lambert, Shea Maddock

Publisher The University of Arizona.

Rights Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction, presentation (such as public display or performance) of protected items is prohibited except with permission of the author.

Download date 07/10/2021 08:50:17

Link to Item http://hdl.handle.net/10150/630229

PHYLOGENETIC PERSPECTIVES ON VIVIPARITY, GENE-TREE DISCORDANCE, AND INTROGRESSION IN LIZARDS (SQUAMATA).

by

Shea Maddock Lambert

Copyright © Shea Maddock Lambert 2018

A Dissertation Submitted to the Faculty of the

DEPARTMENT OF ECOLOGY AND EVOLUTIONARY BIOLOGY

In Partial Fulfillment of the Requirements

For the Degree of

DOCTOR OF PHILOSOPHY

In the Graduate College

THE UNIVERSITY OF ARIZONA

THE UNIVERSITY OF ARIZONA GRADUATE COLLEGE

As members of the Dissertation Committee, we certify that we have read the dissertation prepared by Shea M. Lambert, titled "Phylogenetic perspectives on viviparity, gene-tree discordance, and introgression in lizards (Squamata)" and recommend that it be accepted as fulfilling the dissertation requirement for the Degree of Doctor of Philosophy.

John Wiens   Date: May 21, 2018

Michael Barker   Date: May 21, 2018

Michael Sanderson   Date: May 21, 2018

Noah Whiteman   Date: May 21, 2018

Final approval and acceptance of this dissertation is contingent upon the candidate's submission of the final copies of the dissertation to the Graduate College.

I hereby certify that I have read this dissertation prepared under my direction and recommend that it be accepted as fulfilling the dissertation requirement.

Dissertation Director: John Wiens   Date: May 21, 2018

STATEMENT BY AUTHOR

This dissertation has been submitted in partial fulfillment of the requirements for an advanced degree at the University of Arizona and is deposited in the University Library to be made available to borrowers under rules of the Library.

Brief quotations from this dissertation are allowable without special permission, provided that an accurate acknowledgement of the source is made. Requests for permission for extended quotation from or reproduction of this manuscript in whole or in part may be granted by the head of the major department or the Dean of the Graduate College when in his or her judgement the proposed use of the material is in the interests of scholarship. In all other instances, however, permission must be obtained from the author.

SIGNED: Shea M. Lambert

ACKNOWLEDGEMENTS

I first thank my doctoral committee, John Wiens, Michael Barker, Michael Sanderson, and Noah Whiteman, for their invaluable assistance and mentorship. I would also like to thank members of the UA EEB staff who have been incredibly helpful over the years, including Pennie Rabago, Lilian Schwartz, Lauren Harrison, Barry McCabe, and Vincente Leon. I would also like to thank all fellow scientists who have been friends, mentors, collaborators, and colleagues over the years, each of whom I am grateful to have known and learned from, including Anthony Baniaga, Evan Forsythe, Anthony Geneva, Julienne Ng, Daniel Scantlebury, Richard Glor, Tim O’Connor, Justin Shaffer, Kelsey Yule, Ryan Ruboyianes, Sarah Richman, Ben Weinstein, Leone Brown, Laurel Yohe, James Herrera, Carl Hutter, Uri Garcia, Tod Reeder, Noelle Bittner, Tereza Jezkova, Liz Miller, Cristian Roman-Palacios, Li Zheng, Alison Harrington, Caitlin Fisher-Reid, Xia Hua, Jeff Streicher, Doug Futuyma, John True, Peter Reinthal, Heather Lynch, and Adrian Nieto Montes de Oca.

I would not be in this fortunate position without the support of my friends and family over the years. In particular, my parents Patrick and Joan Lambert, and sister, Nora Lambert, have always fostered and inspired my curiosity, and always been there when I needed them. I must also thank long-time friends from Rochester, NY for providing so much support and inspiration over so many years: Matthew Agone, Patrick Connelly, Zack Davis, Andrew Detty, Brian Kraftschik, Tom Segerson, and Robert Smith. Last but not least, I would like to thank close friends from Tucson: Evan Forsythe, Anthony Baniaga, Alex Ralston, Florence Durney, and Emily Xander.

Finally, I would like to acknowledge my sources of funding, including an NSF Graduate Research Fellowship (2012), and research awards from the Society for the Study of Evolution (Rosemary Grant Student Research Award, 2012) and the Society of Systematic Biologists (Graduate Student Research Award, 2012).

TABLE OF CONTENTS

ABSTRACT

INTRODUCTION

REFERENCES

APPENDIX A. EVOLUTION OF VIVIPARITY: A PHYLOGENETIC TEST OF THE COLD CLIMATE HYPOTHESIS IN PHRYNOSOMATID LIZARDS

APPENDIX B. WHEN DO SPECIES-TREE AND CONCATENATED ESTIMATES DISAGREE? AN EMPIRICAL ANALYSIS WITH HIGHER-LEVEL SCINCID LIZARD PHYLOGENY

APPENDIX C. INFERRING INTROGRESSION USING RADSEQ AND DFOIL: POWER AND PITFALLS REVEALED IN A CASE STUDY OF SPINY LIZARDS (SCELOPORUS)

ABSTRACT

This dissertation consists of three works of phylogenetic biology. In each case, my collaborators

and I contribute to the field by providing novel empirical results germane to each topic and study

system, but also in a broader sense by facilitating the use and understanding of modern phylogenetic

methods.

In appendix A, I investigate the evolution of viviparity in a phylogenetic test of the cold-climate

hypothesis. This long-standing hypothesis states that the evolution of viviparity is an adaptation that

prevents egg and/or juvenile mortality in cold climates. I apply a suite of cutting-edge phylogenetic

comparative methods to phylogenetic, climatic, and life history data from the lizard family

Phrynosomatidae. The results strongly support the cold-climate hypothesis, and help to explain two

counter-intuitive patterns in .

In appendix B, I examine discordance in phylogenetic reconstruction in order to answer an

unresolved question with significant practical implications for phylogeneticists: can disagreement

between concatenated and multi-species coalescent estimates be predicted? Using an empirical

example in higher-level scincid lizard relationships, I find that discordance between these methods is

related to short, weakly supported branches, and the presence of conflicting gene trees. These results

suggest that concatenation may provide a reasonable approximation of theoretically preferable species-

tree methods under most circumstances. Moreover, standard methods for incorporating phylogenetic

uncertainty may be sufficient to account for disagreements with species-tree estimates.

In appendix C, I demonstrate a novel workflow for the detection of nuclear introgression in an

empirical case study of closely related spiny lizard (Sceloporus) species. I first show that mitochondrial

data independently suggests a repeated history of introgression between the focal taxa. I then show that

an exhaustive application of the DFOIL method to double-digest RADseq data from many individuals is

effective at identifying introgression in the nuclear genome. This novel approach also reveals

intra-specific geographic variation in patterns of introgression, without the need to determine population

structure. Finally, I find that batch effects in ddRADseq data may mislead D-statistics (and DFOIL)

under certain circumstances.

INTRODUCTION

Phylogenetic trees and comparative methods have become the backbone of a large and growing

body of research in biology (reviewed in Baum & Smith 2012; Pennell & Harmon 2013; Garamszegi

2014). Methodological advances beginning in the 1970s and 1980s (e.g., Felsenstein 1985), together

with rapidly expanding computational power and access to genetic data, have helped drive the growth

of phylogenetic biology. Nevertheless, significant room for growth exists for the understanding and

application of this rapidly expanding toolkit of methods by researchers more broadly. Increasing the

visibility and accessibility of phylogenetic methods is a major goal of modern research in phylogenetic

biology, and the unifying concept of this dissertation.

Explanation of the dissertation format

This dissertation contains three appendices, each of which is a collaborative scientific work on which I

am the lead author. Two of these are already published (Lambert & Wiens 2013, Evolution; Lambert et

al. 2015, Molecular Phylogenetics and Evolution), while the third is currently in review at Molecular

Ecology Resources. This chapter provides a summary of each appendix and its contributions to the

literature (Dissertation aims and outcomes), and the appendices themselves follow. Supporting or

supplemental information for each appendix is available elsewhere (e.g., Dryad, GitHub), with

accessibility described within each appendix.

Contributions

I am the first and corresponding author of each appendix presented herein, reflecting my primary role

in the design, analyses, and writing of each appendix. My major advisor John J. Wiens (JJW) was a

collaborator on all appendices, with particular influence on the design and revision of appendix B. I

conducted all of the statistical analyses and created all figures for every appendix. I collected the data

on reproductive mode from the literature for appendix A, and all previously unpublished sequence data

(RADseq and mtDNA) in appendix C.

Dissertation aims and outcomes

This dissertation consists of three studies of phylogenetic biology. In each chapter, I applied cutting-

edge phylogenetic methods to investigate a topic of general interest in evolutionary biology, using

lizards (Squamata) as a study system. In addition to providing empirical results relevant to each topic

and study system, the dissertation is aimed at furthering phylogenetic biology in a practical sense by

providing clear use examples of cutting-edge phylogenetic methods (in appendices A, B, and C),

explicit guidelines for the application of these methods (in appendices B and C), and reusable scripts

for novel methodology (in appendix C).

1. Appendix A: “Evolution of viviparity: a phylogenetic test of the cold-climate hypothesis in

phrynosomatid lizards”. Shea M. Lambert and John J. Wiens, Evolution, 2013.

The evolution of live birth is a major life history transition seen in many clades of vertebrates,

including fish, amphibians, non-avian reptiles, and mammals (Amoroso 1968; Blackburn 1999). A

long-standing hypothesis for the origin of viviparity in ectotherms is the cold-climate hypothesis (Mell

1929, Weekes 1935, Shine 1985), which states that viviparity is an adaptation to cold temperatures

during the egg-laying season. The primary objective of appendix A was to provide the first test of the

cold-climate hypothesis using phylogenetic comparative methods and climatic data.

Our study system was the lizard family Phrynosomatidae, which contains ~140 species, and six

putative origins of live-birth. To test for associations between climate and reproductive mode while

accounting for phylogenetic relatedness, I used phylogenetic logistic regression (Ives & Garland 2010)

and a recently developed implementation of Wright’s threshold model (Wright 1934; Revell 2012). I

also conducted ancestral reconstructions of environmental data and reproductive mode. These

reconstructions formed the basis of a permutation test I designed which asked whether conditions under

which viviparity initially evolved were indeed colder than conditions under which it had not evolved.
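The logic of such a permutation test can be sketched in a few lines of Python. This is a minimal illustration and not the code used in appendix A: the function name, the temperature values, and the choice of the difference in means as the test statistic are all assumptions made for the example.

```python
import random
from statistics import mean

def permutation_test_colder(origin_temps, other_temps, n_perm=10000, seed=1):
    """One-sided permutation test: are origin_temps lower on average than other_temps?

    Returns the observed difference in means (origin minus other) and the
    proportion of label shufflings whose difference is at least as negative.
    """
    rng = random.Random(seed)
    observed = mean(origin_temps) - mean(other_temps)
    pooled = list(origin_temps) + list(other_temps)
    k = len(origin_temps)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign the "origin" label
        diff = mean(pooled[:k]) - mean(pooled[k:])
        if diff <= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value

# Hypothetical reconstructed temperatures (degrees C), invented for illustration.
origin_branches = [14.2, 15.1, 13.8, 16.0, 14.9, 15.5]
other_branches = [21.3, 19.8, 23.1, 20.4, 22.7, 18.9, 24.0, 21.1]
observed, p = permutation_test_colder(origin_branches, other_branches)
```

A small p-value here would indicate that the reconstructed conditions at origins of viviparity are colder than expected if the "origin" labels were assigned at random.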

Each of our analyses supports the cold-climate hypothesis. We also find that low-elevation,

tropical viviparous phrynosomatid species have recently invaded these climates following the evolution

of viviparity in a colder ancestral environment. Unexpectedly, we find that viviparity may be favored in

the tropics, particularly at high elevations. We suggest this may be due to more extreme climatic

stratification on tropical mountains (Janzen 1967), potentially leading to the evolution of cold-climate

specialists. Finally, we find that viviparous phrynosomatid lineages have higher diversification rates

than their oviparous counterparts.

This study provides the first phylogenetic test of the cold-climate hypothesis that makes use of

climatic data, and the first to apply comparative methods specifically designed to test associations with

a binary trait (reproductive mode). The study also provides a clear example of how modern

phylogenetic comparative methods can be used to improve tests of long-standing hypotheses in

evolutionary ecology. Furthermore, it demonstrates how phylogenetic context can be critical for

explaining potentially counter-intuitive patterns that arise during evolutionary history.

2. Appendix B: “When do species-tree and concatenated estimates disagree? An empirical analysis

with higher-level scincid lizard phylogeny.” Shea M. Lambert, Tod W. Reeder, and John J. Wiens,

Molecular Phylogenetics and Evolution, 2015.

Recent years have seen rapid advancement in the methods available for reconstructing phylogenetic

relatedness among species. Newer methods have focused on modeling conflict between gene trees and

species trees (Edwards, 2009; Maddison, 1997). In particular, species-tree methods based on the

multi-species coalescent have been shown to perform well in simulations (e.g., Heled and Drummond

2010). Although multi-species coalescent methods are theoretically preferable, they are

computationally much more expensive than existing approaches, such as concatenation, where all

available data are combined into a single “super-gene”. This creates a conundrum for practicing

phylogeneticists that leads to a critical, but unresolved question: when do species-tree and concatenated

estimates disagree?

The main objective of appendix B was to see if disagreement between these methods was

predictable by measures of branch length, statistical support, and gene-tree concordance. If

disagreement can be predicted, this would be quite useful in cases where multi-species coalescent

species-tree methods are computationally intractable. Another major objective of our study was to

assess the behavior of multispecies coalescent methods in cases where multi-individual sampling is not

available, and divergence times are relatively ancient, as in our scincid lizard dataset.

We quantified discordance between all our phylogenetic analyses (individual gene-tree, species-

tree, and concatenated) through manual comparison of each topology. We were primarily interested in

comparing the branch lengths, support values, and underlying gene-tree concordance for nodes that

were concordant in species-tree and concatenated estimates, against nodes that were discordant in these

estimates. We also tested for correlations between branch length, support, and indices of gene-tree

congruence, predicting that shorter branches should have weaker support and greater incongruence.
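A correlation of this kind can be illustrated with a rank-based measure. The sketch below assumes Spearman's rank correlation, which is a choice made for this example rather than a statement of the statistic used in appendix B, and the per-node values are invented.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical per-node values, invented for illustration.
branch_length = [0.002, 0.010, 0.004, 0.031, 0.018, 0.001]
support = [0.55, 0.97, 0.71, 1.00, 0.99, 0.48]
rho = spearman(branch_length, support)
```

A positive rho corresponds to the prediction in the text: shorter branches tend to have weaker support.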

As predicted, we found that disagreement between species-tree and concatenated estimates is

related to short, weakly supported branches with greater underlying gene-tree discordance. In addition,

shorter branches in the concatenated trees were associated with lower indices of support and

congruence. These results support our theoretical expectation that disagreement between species-tree

and concatenated estimates should be concentrated in areas where incomplete lineage sorting is more

prevalent.

The results of this study provide a set of expectations for situations in which multi-species

coalescent and concatenation approaches may differ. Overall, we find that these methods are more

likely to disagree on short and/or weakly-supported nodes. Consequently, researchers who are unable

to apply species-tree methods may simply be able to rely on standard methods for incorporating

phylogenetic uncertainty, but should be especially cautious of any short and/or weakly supported

branches. We also found that multi-species coalescent methods may perform well, even with one

individual sampled per species and deep divergences. Our multi-species coalescent estimate supports

the monophyly of all four traditionally recognized scincid subfamilies, matching a recent estimate

using extensive sampling of species (Pyron et al. 2013). Since the publication of appendix B, another

study using hundreds of loci supported this same higher-level topology (Linkem et al. 2016).

3. Appendix C: “Inferring introgression using RADseq and DFOIL: power and pitfalls revealed in a case

study of spiny lizards (Sceloporus).” Shea M. Lambert, Jeffrey W. Streicher, M. Caitlin Fisher-Reid,

Fausto R. Mendez de la Cruz, Norberto Martinez-Mendez, Uri Omar Garcia Vazquez, Adrian Nieto

Montes de Oca, and John J. Wiens. Currently in review at Molecular Ecology Resources (2018).

Introgression is increasingly recognized as a common phenomenon across the Tree of Life

(Mallet, 2005; Harrison & Larson, 2014). A burgeoning set of methods and access to genomic data have

helped to spur research on introgression (reviewed in Payseur & Rieseberg, 2016). Nevertheless, it can

remain difficult to characterize historical introgression (in terms of e.g., its strength, timing, direction),

especially in taxa lacking suitable reference genomes.

In appendix C, we demonstrate a novel workflow for the detection of nuclear introgression. This

workflow involves the exhaustive application of a recently-developed method for detecting introgression (DFOIL,

Pease & Hahn 2015) to all suitable four-taxon combinations in a given dataset. This approach is well-

suited to reduced-representation datasets (e.g., RADseq), where many individuals have been sampled,

and requires only that a phylogeny has been estimated for the individuals in question.

We demonstrated this workflow in an empirical case study of spiny lizard (Sceloporus) species.

First, we showed that evidence from mitochondrial data suggests a history of repeated introgression

between the focal taxa, S. oberon and S. ornatus. We then used double-digest RADseq (Peterson et al.

2012) data from ~180 individuals to resolve phylogeny in the S. torquatus species group. Using the

resulting topology, we executed a recently developed extension of Patterson’s D-statistics (Green et al.

2010, Patterson et al. 2011), DFOIL (Pease & Hahn 2015), using all suitable four-taxon combinations

containing representatives of both focal species. We then summarized the results of this exhaustive

application of DFOIL (termed “ExDFOIL”) over phylogenetic and geographic space.
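The enumeration step of this exhaustive approach can be sketched as follows. This is an illustration only and is not taken from the ExDFOIL scripts: the sample labels, the species assignments, and the function names are hypothetical, and the suitability check shown here (both focal species present) stands in for whatever topological criteria a real analysis would apply.

```python
from itertools import combinations

def four_taxon_sets(individuals, is_suitable=lambda quartet: True):
    """Yield every set of four ingroup individuals that passes is_suitable."""
    for quartet in combinations(sorted(individuals), 4):
        if is_suitable(quartet):
            yield quartet

# Hypothetical sample labels mapped to species, invented for illustration.
species_of = {
    "ind1": "sp_A", "ind2": "sp_A",
    "ind3": "sp_B", "ind4": "sp_B",
    "ind5": "sp_C", "ind6": "sp_C",
}

def has_both_focal(quartet, focal=("sp_A", "sp_B")):
    """Example filter: keep quartets that include both focal species."""
    present = {species_of[ind] for ind in quartet}
    return all(sp in present for sp in focal)

all_quartets = list(four_taxon_sets(species_of))
focal_quartets = list(four_taxon_sets(species_of, has_both_focal))
```

Each retained quartet, together with an outgroup, would then be passed to the introgression test, and the per-quartet results summarized by individual, lineage, or locality.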

We found evidence of ancient and repeated introgression between the focal taxa, consistent with

the evidence presented from mitochondrial data. Furthermore, we found stronger evidence of

introgression in localities of S. ornatus for which recent mitochondrial introgression was inferred, and

weaker evidence in more geographically distant sampling locations, where no evidence of recent

mitochondrial introgression exists.

Our study provides an example pipeline for detecting introgression that requires minimal

assumptions, and can be readily applied to any group for which a phylogeny and a sufficiently large

scan of genomic variation are available. Beyond simple detection, the approach can reveal geographic

and/or phylogenetic variation in introgression, without the need to determine the number of populations

or their composition. Our study also provides empirical confirmation of an expectation by Pease &

Hahn (2015) that the inclusion of singleton site-pattern counts may be problematic for DFOIL.

Importantly, we also find the first empirical evidence that batch effects in RADseq data (error specific

to a given library preparation and/or sequencing effort) can mislead D-statistics under certain

circumstances. Finally, we provide scripts that will facilitate the use of our pipeline by other

researchers (https://github.com/SheaML/ExDFOIL).
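For readers unfamiliar with the D-statistics referred to above, the classic four-taxon form of Patterson's D compares counts of two discordant site patterns, ABBA and BABA, which are expected to be equally frequent when discordance is due to incomplete lineage sorting alone. The sketch below shows only that basic calculation; it is not DFOIL, which extends the idea to a five-taxon tree, and the counts are invented for illustration.

```python
def patterson_d(n_abba, n_baba):
    """Patterson's D from counts of ABBA and BABA site patterns."""
    total = n_abba + n_baba
    if total == 0:
        raise ValueError("D is undefined when no ABBA or BABA sites are observed")
    return (n_abba - n_baba) / total

# Hypothetical site-pattern counts, invented for illustration.
d_balanced = patterson_d(120, 120)  # equal counts give D = 0
d_excess = patterson_d(150, 100)    # an excess of ABBA sites gives D > 0
```

Because D depends only on the relative counts of the two patterns, any systematic error that inflates one pattern in a subset of samples, such as a batch effect, can shift D away from zero without any introgression.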

REFERENCES

Amoroso EC (1968) The evolution of viviparity. Proc. R. Soc. Med. 61:1188–1200.

Baum DA, Smith SD (2012) Tree Thinking: An Introduction to Phylogenetic Biology. Roberts and Co.,

Greenwood Village, CO.

Blackburn DG (1999) Viviparity and oviparity: evolution and reproductive strategies. Pp. 994–1003 in

TE Knobil and JD Neill, eds. Encyclopedia of Reproduction. Academic Press, New York.

Edwards SV (2009) Is a new and general theory of molecular systematics emerging? Evolution

63:1–19.

Felsenstein J (1985) Phylogenies and the comparative method. Am. Nat. 125:1–15.

Garamszegi LZ (2014) Modern phylogenetic comparative methods and their application in

evolutionary biology: concepts and practice. Springer, Berlin.

Harrison RG, Larson EL (2014) Hybridization, introgression, and the nature of species boundaries. J.

Hered. 105:795–809.

Heled J, Drummond AJ (2010) Bayesian inference of species trees from multilocus data. Mol. Biol.

Evol. 27:570–580.

Ives AR, Garland T (2010) Phylogenetic logistic regression for binary dependent variables. Syst. Biol.

59:9–26.

Janzen DH (1967) Why mountain passes are higher in the tropics. Am. Nat. 101:233–249.

Lambert SM, Wiens JJ (2013) Evolution of viviparity: a phylogenetic test of the cold-climate

hypothesis in phrynosomatid lizards. Evolution 67:2614–2630.

Lambert SM, Reeder TW, Wiens JJ (2015) When do species tree and concatenated estimates disagree?

An empirical analysis with higher-level scincid lizard phylogeny. Mol. Phylogenet. Evol.

82:146–155.

Linkem CW, Minin VN, Leaché AD (2016) Detecting the anomaly zone in species trees and evidence

for a misleading signal in higher-level skink phylogeny (Squamata: Scincidae). Syst. Biol.

65:465–477.

Maddison WP (1997) Gene trees in species trees. Syst. Biol. 46:523–536.

Mallet J (2005) Hybridization as an invasion of the genome. Trends Ecol. Evol. 20:229–237.

Mell R (1929) Grundzüge einer Ökologie der chinesischen Reptilien. Walter De Gruyter, Berlin.

Payseur BA, Rieseberg LH (2016) A genomic perspective on hybridization and speciation. Mol. Ecol.

25:2337–2360.

Pease JB, Hahn MW (2015) Detection and polarization of introgression in a five-taxon phylogeny.

Syst. Biol. 64:651–662.

Peterson BK, Weber JN, Kay EH, Fisher HS, Hoekstra HE (2012) Double digest RADseq: an

inexpensive method for de novo SNP discovery and genotyping in model and non-model species.

PLoS One 7:e37135.

Pennell MW, Harmon LJ (2013) An integrative view of phylogenetic comparative methods:

connections to population genetics, community ecology, and paleobiology. Ann. N.Y. Acad. Sci.

1289:90–105.

Pyron RA, Burbrink FT, Wiens JJ (2013) A phylogeny and revised classification of Squamata, including

4161 species of lizards and snakes. BMC Evol. Biol. 13:93.

Revell LJ (2012) Phytools: an R package for phylogenetic comparative biology (and other things).

Methods Ecol. Evol. 3:217–223.

Shine R (1985) The evolution of viviparity in reptiles: an ecological analysis. Pp. 607–681 in C. Gans

and F. Billett, eds. Biology of the Reptilia 15. John Wiley & Sons, NY.

Weekes HC (1935) A review of placentation among reptiles with particular regard to the function and

evolution of the placenta. Proc. Zool. Soc. Lond. 1935:625–645.

Wright S (1934) An analysis of variability in the number of digits in an inbred strain of guinea pigs.

Genetics 19:506–536.

Appendix A

ORIGINAL ARTICLE

doi:10.1111/evo.12130

EVOLUTION OF VIVIPARITY: A PHYLOGENETIC TEST OF THE COLD-CLIMATE HYPOTHESIS IN PHRYNOSOMATID LIZARDS

Shea M. Lambert(1,2) and John J. Wiens(1)

(1) Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, Arizona 85721

(2) E-mail: [email protected]

Received November 30, 2012. Accepted March 18, 2013. Data Archived: Dryad doi:10.5061/dryad.5nt3r

The evolution of viviparity is a key life-history transition in vertebrates, but the selective forces favoring its evolution are not fully understood. With >100 origins of viviparity, squamate reptiles (lizards and snakes) are ideal for addressing this issue. Some evidence from field and laboratory studies supports the “cold-climate” hypothesis, wherein viviparity provides an advantage in cold environments by allowing mothers to maintain higher temperatures for developing embryos. Surprisingly, the cold-climate hypothesis has not been tested using both climatic data and phylogenetic comparative methods. Here, we investigate the evolution of viviparity in the lizard family Phrynosomatidae using GIS-based environmental data, an extensive phylogeny (117 species), and recently developed comparative methods. We find significant relationships between viviparity and lower temperatures during the warmest (egg-laying) season, strongly supporting the cold-climate hypothesis. Remarkably, we also find that viviparity tends to evolve more frequently at tropical latitudes, despite its association with cooler climates. Our results help explain this and two related patterns that seemingly contradict the cold-climate hypothesis: the presence of viviparous species restricted to low-elevation tropical regions and the paucity of viviparous species at high latitudes. Finally, we examine whether viviparous taxa may be at higher risk of extinction from anthropogenic climate change.

KEY WORDS: Climate, comparative methods, life-history evolution, phylogeny, reproductive mode, squamates.

The transition from oviparity (egg laying) to viviparity (live bear- yet fully understood (Hodges 2004; Blackburn 2005; Sites et al. ing) is a major evolutionary shift in life history with many poten- 2011). tial consequences for the morphology, physiology, behavior, and Squamate reptiles (lizards and snakes) offer one of the best ecology of an organism (Shine and Bull 1979; Guillette 1993; model systems for understanding the evolution of viviparity in Blackburn 2006; Thompson and Blackburn 2006; Thompson and vertebrates. This is because in squamates, viviparity is estimated Speake 2006). The evolution of viviparity may have disadvan- to have originated at least 100 times, more than twice the number tages (e.g., loss of the entire brood if the mother dies; Neill 1964; in all other clades combined (Blackburn 2000). These Tinkle and Gibbons 1977). However, the repeated evolution of repeated origins facilitate the use of phylogenetic analyses to iden- viviparity across many vertebrate clades (e.g., sharks, ray-finned tify the ecological correlates of these transitions in reproductive fish, frogs, salamanders, caecilians, mammals, lizards, snakes; mode. As a counterexample, determining the selective forces that Amoroso 1968; Vitt and Caldwell 2009) suggests that viviparity led to the evolution of viviparity is difficult or impossible in mam- can provide a strong selective advantage under some conditions mals, in which viviparity seems to have evolved only once among (Tinkle and Gibbons 1977; Blackburn 1999a). Nevertheless, the extant lineages and >100 million years ago (Blackburn 2005). particular conditions that favor the evolution of viviparity are not The single origin makes statistical analysis difficult, whereas the

C 2013 The Author(s). Evolution C 2013 The Society for the Study of Evolution. 2614 Evolution 67-9: 2614–2630 18 EVOLUTION OF VIVIPARITY

ancient timing makes it problematic to infer the ecological con- mates, remains untested using the necessary combination of phy- ditions that were associated with this transition. logenetic and climatic information. Shine and Berry (1978) exam- Over a century of research has focused on viviparity in squa- ined the distributions of viviparous squamates in North America mates, and numerous hypotheses have been proposed to explain and Australia, and used stepwise multiple regression to test for its evolution (reviews in Tinkle and Gibbons 1977; Shine 1985; an association with broad-scale climatic variables. Surprisingly, Sites et al. 2011). The “cold-climate” hypothesis is one of the first they found that their measures of temperature (including mean hypotheses proposed (Mell 1929; Weekes 1935; Sergeev 1940), temperatures of individual summer months) were no more corre- and has emerged as one of the most important (e.g., Tinkle and lated with the proportion of live-bearing species than were other Gibbons 1977; Shine 1985; Sites et al. 2011). The cold-climate variables such as precipitation and evaporation. Shine and Bull hypothesis posits that viviparity evolves as an adaptation to cold (1979) found that “recent” origins of viviparity (i.e., within gen- temperatures during the egg-laying season (Tinkle and Gibbons era, subgenera, or species groups) are associated with relatively 1977; Shine 1985). Cold conditions are problematic because they cold environments, but did not use a phylogeny or quantitative can slow development (Packard et al. 1977), and are potentially measures of climate. In a landmark paper, Shine (1985) used avail- lethal to developing eggs. According to this hypothesis, reten- able phylogenetic information to estimate the number of origins of tion of eggs in utero and behavioral thermoregulation by pregnant viviparity in squamates and tested for associations with ecological females allows for faster development and avoids lethally cold factors. 
temperatures for developing embryos (Shine 1985). The cold-climate hypothesis is also appealing because intermediate stages on the path toward viviparity (i.e., longer egg retention) should still provide fitness benefits (Shine 1985; Andrews 2000). Specifically, offspring that stay within the mother longer should hatch sooner, giving them an advantage in the form of increased time to build up energy reserves and seek shelter before temperatures reach lethally low levels in autumn and winter (Tinkle and Gibbons 1977; Shine and Bull 1979; Shine 1985).

A number of field and laboratory studies support many of the predictions of the cold-climate hypothesis. For example, studies have shown that the body temperatures of gravid lizards are higher than relevant soil temperatures (Shine 1983), that viviparous populations and species occur in colder climates relative to closely related oviparous populations and species (Shine 1985, 1987; Qualls and Shine 1998), that higher-elevation populations are associated with longer egg retention in oviparous species (Calderón-Espinosa et al. 2006; Rodríguez-Díaz and Braña 2012), and that offspring of oviparous species experimentally incubated at cooler temperatures are less fit and take longer to develop (e.g., Qualls and Andrews 1999; Shine 2002; Parker and Andrews 2007; for an example in a viviparous species, see Li et al. 2009). However, not all studies have found support for the cold-climate hypothesis. For example, Andrews (2000) and Shine et al. (2003) both found that nests of oviparous species and body temperatures of pregnant females show only slight differences in average temperature, and Li et al. (2009) found that females of the viviparous lacertid Eremias prezwalskii actually selected lower body temperatures when pregnant. Also, García-Collazo et al. (2012) found that in the oviparous lizard Sceloporus aeneus, females from lower and warmer environments unexpectedly retain their eggs for longer than those from higher and colder localities.

Nevertheless, the most basic prediction of the cold-climate hypothesis, that viviparity is more likely to evolve in cold cli-

However, the statistical analyses did not account for phylogeny, and only qualitative measures of climate (e.g., “colder” vs. “warmer”) were used. Other studies have shown that within a single genus (Pseudechis; Shine 1987) and species (Lerista bougainvillii; Qualls and Shine 1998), viviparous species and populations inhabit significantly colder environments than their oviparous counterparts, but neither study’s statistics accounted for phylogeny. Hodges (2004) used phylogenetic comparative methods and found that viviparity is associated with high elevations in the phrynosomatid lizard genus Phrynosoma but did not include explicit data on climate. Similarly, Schulte et al. (2000) used the phylogeny-based concentrated changes test (Maddison 1990), and did not find a significant relationship between viviparity and occurrence in high elevations (>2500 m) or latitudes (>40° south) in the lizard genus Liolaemus, but again did not include data on climate. Finally, Schulte and Moreno-Roark (2010) used a time-calibrated phylogeny to estimate the number of origins of viviparity in iguanian lizards and showed that the majority of these origins likely occurred prior to Pliocene-Pleistocene glaciations, but they did not test for associations between transitions to viviparity and climate.

Here we investigate the evolution of viviparity in the lizard family Phrynosomatidae. We use a relatively complete, time-calibrated phylogeny, GIS-based climatic data (Hijmans et al. 2005), and recently developed comparative methods that are explicitly designed to test for a relationship between an independent continuous variable (e.g., climate) and a dependent discrete variable (e.g., reproductive mode). These methods are phylogenetic logistic regression (Ives and Garland 2010) and a Bayesian implementation of Wright’s threshold model from quantitative genetics (Wright 1934; Revell 2012; Felsenstein 2012). We also estimate ancestral values for reproductive mode and environmental variables to better understand the ecological context(s) in which viviparity originates. Without inferring the ecological conditions at the time of origin of viviparity, it is difficult to know if, for

EVOLUTION SEPTEMBER 2013 2615 19 S. M. LAMBERT AND J. J. WIENS

example, viviparity evolved in response to cold climates, or evolved in a very different environment and promoted later invasion of cold climates (Blackburn 2005).

The family Phrynosomatidae is an excellent model system to address the evolution of viviparity, for several reasons. Phrynosomatidae includes nine genera and approximately 138 species (review in Wiens et al. 2013) distributed from Canada to Panama (Vitt and Caldwell 2009) with two genera containing both viviparous and oviparous species (Phrynosoma and Sceloporus), and the other seven genera being entirely oviparous. Previous work suggests that there were four separate origins of viviparity in Sceloporus (Méndez-de la Cruz et al. 1998) and two in Phrynosoma (Hodges 2004). Thus, there should be enough origins of viviparity to test their environmental correlates with sufficient statistical power. Also, a detailed phylogeny and climatic data for the group are now available (Wiens et al. 2013). Furthermore, because viviparity varies among relatively closely related phrynosomatid species (i.e., congeners) it should be far easier to address the ecological conditions that favor its evolution than in a group where the origins of viviparity are more ancient (e.g., mammals).

The evolution of viviparity in phrynosomatid lizards is particularly interesting because of two patterns seen in Sceloporus that seem contradictory to the cold-climate hypothesis. First, a number of lowland, tropical species are viviparous (Guillette et al. 1980; Méndez-de la Cruz et al. 1998). Second, many Sceloporus species that inhabit high elevation, cold climates in North America are oviparous (Guillette et al. 1980; Méndez-de la Cruz et al. 1998; this study). We attempt to explain these patterns here.

Finally, a recent analysis has suggested that viviparous species of Sceloporus may have a higher extinction risk than oviparous species in the face of recent, anthropogenic climate change (Sinervo et al. 2010). This pattern was suggested because many viviparous species are restricted to high-elevation “islands” in (with the implicit assumption that these species have narrower elevational ranges), and because climate is changing faster at higher elevation sites. However, the elevational and latitudinal ranges of viviparous and oviparous species were not directly compared. Here we use phylogenetic comparative methods to address how anthropogenic climate change might differently impact the survival of oviparous and viviparous species in Phrynosomatidae, based on their latitudinal and elevational range extents and temperature niche breadths.

Materials and Methods

DATA COLLECTION

For all analyses, we used a time-calibrated phylogenetic tree of 117 species with associated climatic data (from Quintero and Wiens 2013; Wiens et al. 2013). The phylogenetic tree was estimated based on a data matrix combining the data of Wiens et al. (2010) and Leaché (2010), and includes data from eight nuclear and five mitochondrial loci (but not all species sampled for all genes). Divergence times were estimated using the Bayesian uncorrelated lognormal method in BEAST (Drummond et al. 2006; Drummond and Rambaut 2007) with four fossil calibration points. Environmental data (climatic variables and elevation from the WORLDCLIM database; Hijmans et al. 2005) were extracted from georeferenced museum localities at a ∼1 km² resolution (Appendix S1; 117 species, mean of ∼54 localities per species, range from 1 to 1230). Sampled localities were carefully selected and spanned most of each species’ range, based on comparison to published range maps (e.g., Sites et al. 1992; Conant and Collins 1998; Grismer 2002; Stebbins 2003). Although some species are known from few localities, this appears to be related to these species having very restricted geographic ranges (Wiens et al. 2013), especially because phrynosomatid species tend to be conspicuous and common at localities where they occur. Detailed methods for the collection of climatic data and estimation of the time-calibrated phylogeny are provided elsewhere, along with detailed justification for the species-level taxonomy used (Wiens et al. 2013; see also Quintero and Wiens 2013).

We searched the literature to collect information on reproductive mode, and found information for 105 of the 117 species represented in our phylogeny (Appendix S1). For 12 species, definitive information on reproductive mode was not found. We addressed this problem in two ways.

First, parity mode was assumed based on the species’ closest relatives (e.g., many unknown species were only recently recognized as being distinct). Thus, species were assigned to viviparity if they were nested within viviparous clades and to oviparity if they were nested within oviparous clades. In eight out of twelve cases, these unknown species were nested within viviparous clades, and evidence suggests that reversals from viviparity to oviparity are rare in squamates (Shine 1985; Lee and Shine 1998; Shine and Lee 1999; Blackburn 1999b; but see Lynch and Wagner 2009 for an exception in snakes), making viviparity an even safer assumption for these cases.

Second, we repeated all phylogenetic logistic regression, threshold model, and phylogenetic generalized least squares (PGLS) analyses (see below) after removing all 12 species with uncertain reproductive modes from the tree. The results from this dataset were consistent with the larger dataset, and here we present results only from the dataset that used all 117 species and assumed parity mode based on closest relatives (results for the reduced dataset are provided in supplemental Tables S3–S5).

ANCESTRAL STATE RECONSTRUCTIONS: REPRODUCTIVE MODE

We used the binary state speciation and extinction (BiSSE) model (Maddison et al. 2007) from the diversitree package (FitzJohn


Table 1. Comparison of BiSSE models using AICc. The six estimated parameters in the full model include transition rates between oviparity (0) and viviparity (1), including both gains of viviparity (0–1) and reversals (1–0), and estimated speciation and extinction rates for each parity mode. Other models hold various parameters constant. The best-fitting model (bold-faced) has different rates of speciation in oviparous and viviparous lineages, equal extinction rates, and no reversals from viviparity to oviparity.

Parameters constrained                               Log-likelihood   Parameters   AICc       ΔAICc
Extinction rates equal, no reversals                 −448.3487        4            905.0545    0.0000
Extinction and transition rates equal                −449.3546        4            907.0662    2.0117
Extinction rates equal                               −448.3487        5            907.2379    2.1834
No reversals                                         −448.3487        5            907.2379    2.1834
Transition rates equal                               −449.3564        5            909.2533    4.1988
None (full model)                                    −448.3487        6            909.4610    4.4065
Speciation and extinction rates equal, no reversals  −452.7899        3            911.7923    6.7377
Speciation, extinction, and transition rates equal   −453.8471        3            913.9065    8.8519
Speciation rates equal, no reversals                 −452.7748        4            913.9067    8.8522
Speciation and extinction rates equal                −452.7888        4            913.9346    8.8801
Speciation and transition rates equal                −453.7789        4            915.9149   10.8604
Speciation rates equal                               −452.7737        5            916.0880   11.0335
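As a hedged illustration of how the AICc and ΔAICc columns of Table 1 relate to the log-likelihoods and parameter counts, the sketch below applies the standard small-sample correction, AICc = −2 ln L + 2k + 2k(k + 1)/(n − k − 1). The sample size n = 117 (the number of species in the tree) is an assumption; with it, the formula matches the tabulated values to within rounding. The function names are illustrative and are not taken from diversitree.

```python
def aicc(log_lik, k, n):
    """Sample-size corrected AIC for a model with k parameters and n observations."""
    return -2.0 * log_lik + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)


def delta_aicc(models, n):
    """Map each model name to its AICc difference from the lowest-AICc model."""
    scores = {name: aicc(log_lik, k, n) for name, (log_lik, k) in models.items()}
    best = min(scores.values())
    return {name: score - best for name, score in scores.items()}


# Three rows of Table 1: name -> (log-likelihood, number of parameters).
table1_subset = {
    "Extinction rates equal, no reversals": (-448.3487, 4),
    "Extinction rates equal": (-448.3487, 5),
    "None (full model)": (-448.3487, 6),
}
N_SPECIES = 117  # assumed sample size: number of tips in the tree

deltas = delta_aicc(table1_subset, N_SPECIES)
for name, delta in sorted(deltas.items(), key=lambda item: item[1]):
    print(f"{name}: {delta:.4f}")
```

Because the three models share the same log-likelihood, their ΔAICc values differ only through the parameter penalty, which is why the four-parameter model is preferred.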

2012) in R (R Core Team 2012) to reconstruct ancestral states using maximum-likelihood optimization while accounting for the potential impact of parity mode on speciation and extinction (and the potential impacts of speciation and extinction on ancestral reconstructions). To infer possible impacts of parity mode on diversification, we tested the fit of 12 models of discrete trait evolution (including state-dependent speciation and extinction rates) ranging from 3 to 6 parameters and compared models using the differences in the sample-size corrected Akaike information criterion (AICc; Sugiura 1978; Burnham and Anderson 2002). These models are listed explicitly in Table 1. We supplied the BiSSE models with the estimated proportion of oviparous (∼87%) and viviparous (∼84%) species included in the tree and assumed random sampling, rather than placing unsampled taxa into unresolved clades (Table S6). We reconstructed ancestral states using the BiSSE model with the minimum (best) AICc score in diversitree using subplex optimization. Apart from the best-fitting model, no other tested models had a ΔAICc of < 2, which would indicate “substantial support” for the other model, following Burnham and Anderson (2002). To further determine whether speciation and extinction parameters were significantly different for oviparous and viviparous lineages, we used diversitree to run a 10,000 generation Bayesian Markov chain Monte Carlo (MCMC) analysis with all six parameters of the BiSSE model unconstrained, and examined the 95% credible intervals of the posterior distributions for the speciation and extinction parameters. Using the maximum-likelihood estimates of each parameter as starting values and tuning parameters for each parameter derived from a preliminary MCMC of 1000 generations, several replicate analyses appeared to converge immediately, and we used the entire posterior distribution from one MCMC of 10,000 generations for calculating the credible intervals.

ENVIRONMENTAL VARIABLES

We focused on four environmental variables that are potentially relevant to the cold-climate hypothesis: midpoint of the elevational range, midpoint of the latitudinal range, and the mean across all localities for a given species for the climatic variables Bio1 (annual mean temperature; a standard climatic variable in ecological studies) and Bio10 (mean temperature of the warmest quarter of the year). We used the mean temperature of the warmest quarter because phrynosomatid lizards typically lay their eggs during the warmest months of the year (e.g., Smith 1995; Jones and Lovich 2009), making Bio10 the seemingly most relevant variable for testing the cold-climate hypothesis. For all variables, the mean and midpoint were highly correlated (> 0.95) using Pearson product–moment correlations, implying that our decisions to use mean or midpoint should not influence the results.

PHYLOGENETIC LOGISTIC REGRESSION

Logistic regression allows predictions for a binary-dependent variable (i.e., reproductive mode) to be made from continuous independent variables (i.e., climate). We use the method of Ives and Garland (2010) for phylogenetic logistic regression, implemented using MATLAB version 2009a. The method uses a two-state Markov process for the evolution of discrete traits, and treats continuous variables as known properties of extant species. We generated the necessary variance–covariance matrix from the time-calibrated tree using the PDAP package version 1.16 (Midford et al. 2010) in MESQUITE version 2.75 (Maddison and Maddison 2011). We used 2000 nonparametric bootstrap replicates and a significance threshold of 0.05 to determine significance and estimate confidence intervals. All independent variables were standardized to have a mean of 0 and a variance of 1 prior to analyses, to optimize the performance of the nonparametric bootstrapping


procedure (A. Ives, pers. comm.). We initially performed analyses including all species of Phrynosomatidae, and then conducted separate analyses for the genera Phrynosoma and Sceloporus, to identify any deviations from the family-level patterns.

CORRELATIONS USING WRIGHT’S THRESHOLD MODEL

The threshold model for discrete traits was first introduced by Wright (1934). The model assumes that an unobserved quantitative character, termed the liability, determines the state of discrete characters based on whether the liability exceeds a threshold value. Compared to the Markov process used by Ives and Garland (2010), this is potentially a more biologically realistic way to model the evolution of discrete traits, especially those that are complex and likely polygenic, such as viviparity. An important difference is the way each model treats the probability of a state transition in the binary variable. The Markov process used for discrete trait evolution in the approach of Ives and Garland (2010) is “memoryless,” such that the probability of a state transition is the same at any given time, regardless of prior states. In contrast, using the threshold model, transitions are more likely when the liability is closer to the threshold. For example, a reversal from viviparity to oviparity would be more likely if it occurred shortly after the original transition to viviparity.

Under the threshold model, underlying liabilities of discrete traits and continuous characters are assumed to evolve through covarying Brownian motion, allowing correlation between discrete and continuous characters. The threshold model has been recently implemented using MCMC sampling under maximum likelihood (Felsenstein 2012) and Bayesian (Revell 2012) frameworks. We used the Bayesian method implemented in the R package phytools. We examined the same independent variables as in the phylogenetic logistic regression, and again performed analyses at both the family (Phrynosomatidae) and genus (Sceloporus and Phrynosoma) levels. We ran several MCMC simulations of 2 million generations each to ensure consistent estimation of parameters and likelihoods, followed by one simulation of 10 million generations for each variable. We report the mean of r (the correlation between the liability of reproductive mode and the continuous character[s]) from the posterior probability distribution and used the quantile function in R to obtain 95% confidence intervals.

ANCESTRAL RECONSTRUCTIONS OF ENVIRONMENTAL VARIABLES

We used ancestral reconstructions to address the environmental conditions under which viviparity evolved, a key question that is not directly addressed by phylogenetic logistic regression or the threshold model. We reconstructed ancestral values for the three environmental variables that had the strongest statistical relationships with viviparity (midpoints of the elevational and latitudinal ranges, mean temperature of the warmest quarter; see Results) using maximum likelihood in the R package APE (Paradis et al. 2004). The distribution-based traits we reconstruct here are not technically heritable in the genetic sense, but should be closely related to intrinsic, heritable properties such as physiological tolerances and behavioral habitat choice. Furthermore, each of the reconstructed environmental variables in this study has phylogenetic signal (estimated with Pagel’s λ; Pagel 1999) significantly different from 0 according to likelihood-ratio tests completed using the phylosig function of the R package phytools (Table S2). Thus, the presence of phylogenetic signal suggests that it should be appropriate to do phylogenetic reconstructions of ancestral values for these variables. We compared models of trait evolution using the R package GEIGER version 1.3-1 (Harmon et al. 2008) and selected models based on AICc scores as described earlier for reconstructions of reproductive mode. In the case of multiple models having ΔAICc of < 2 (i.e., statistically indistinguishable support), we performed reconstructions using each model and report results from each. We considered a model using the maximum-likelihood estimate of Pagel’s λ (Pagel 1999) for branch-length transformation, a Brownian motion model (equivalent to λ = 1), a “white noise” model (equivalent to λ = 0, or a star phylogeny) and an Ornstein–Uhlenbeck (OU) model (Hansen and Martins 1996) with a single optimum.

The key prediction we tested using ancestral reconstructions is whether the conditions in viviparous clades at the time of their origin are significantly different (e.g., colder) than conditions in lineages that did not evolve viviparity. This prediction is distinct from those based on extant conditions experienced by viviparous and oviparous species (i.e., as in our threshold model and phylogenetic logistic regression analyses). This prediction also focuses more directly on the conditions under which viviparity evolves, and to our knowledge, has not been tested before.

To test this prediction, we took a nonparametric approach, sampling from the distribution of ancestral values for all nodes reconstructed as oviparous. Using this distribution enabled us to account for the distribution of conditions experienced by oviparous phrynosomatid species. For example, if a large proportion of oviparous phrynosomatid lineages inhabit cold environments, then the origin of viviparity in such conditions does not offer strong support for the cold-climate hypothesis. For each variable that produced significant relationships with reproductive mode, we first calculated the mean of the ancestral values from the nodes at which viviparity originated. For Sceloporus aeneus and S. bicanthalis, the immediately ancestral node is reconstructed as oviparous; we addressed this by using: (1) the extant values for these species and (2) the mean of the immediate ancestral node and the extant value. Next, we randomly sampled six values (equal to the number of origins of viviparity in Phrynosomatidae; Results) from the distribution of reconstructed values for oviparous nodes.


We repeated this sampling 100,000 times using R, and for each sample, took the mean of the six values. We then calculated 95% one-sided confidence intervals from the resulting distribution of sample means, and compared the mean value from the nodes associated with origins of viviparity to these confidence intervals. If the mean from nodes where viviparity originated fell outside of the one-sided confidence intervals in the predicted direction, we concluded that there was a significant difference between reconstructed environmental conditions at the origins of viviparity and those for oviparous lineages.

We also used the ancestral reconstructions to address the origins of viviparous species living in lowland tropical environments. We first identified viviparous species that live primarily in low-elevation habitats (here arbitrarily defined as having an elevational midpoint of < 500 m) and at latitudes below 23.4°. For each such species, we plotted the estimated values for elevational midpoint and mean summer temperature for each ancestral node in a straight-line path from the root of Phrynosomatidae to the extant values for that species. In addition, we plotted the reconstruction of reproductive mode at each node, allowing for a straightforward visualization of the estimated environmental conditions at the evolution of viviparity and any subsequent adaptation to differing climates (Fig. 2).

VIVIPARITY AND RANGE SIZE

We first quantified for each species their elevational and latitudinal range sizes (obtained by subtracting the minimum from the maximum recorded value for each species) and temperature niche breadth (defined here as the difference between the minimum value of Bio6, the minimum temperature of the coldest month, and the maximum value of Bio5, the maximum temperature of the warmest month, across all the sampled localities within the range of each species, following Quintero and Wiens 2013). We then compared these values for oviparous and viviparous species using PGLS regression (Grafen 1989) implemented using the R package NLME (Pinheiro et al. 2012). For these analyses, parity mode was considered to be a categorical independent variable with two possible states, and range sizes and niche breadth as the dependent variables. For each variable, we compared GLS models using AICc as described earlier. We evaluated GLS models with phylogenetic correlation structures (one using the maximum likelihood estimate of λ, one using Brownian motion [λ = 1], and one using the OU correlation structure of Hansen and Martins 1996), as well as a model that lacked phylogenetic correlation structure (λ = 0). In the case of a GLS model that lacks phylogenetic correlation structure, the model gives the same results as a t-test comparing the means for viviparous and oviparous species. As with phylogenetic logistic regression, we performed analyses at the family (Phrynosomatidae) and genus (Sceloporus and Phrynosoma) levels.

Results

The preferred model for the evolution of reproductive mode contains four parameters, including different speciation rates for oviparous and viviparous clades (Table 1). In this preferred model, extinction rates for oviparous and viviparous clades were constrained as equal, and state transitions were constrained to happen unidirectionally from oviparity to viviparity (i.e., no reversals). Our reconstructions using this model support six origins of viviparity in Phrynosomatidae: four in Sceloporus and two in Phrynosoma (Fig. 1), in agreement with previous studies (Méndez-de la Cruz et al. 1998; Hodges 2004). Bayesian MCMC analysis of the unconstrained BiSSE model recovered 95% credible intervals for the extinction parameter that overlapped for oviparous and viviparous lineages, but the credible intervals for the speciation parameter did not, with viviparous lineages having a significantly higher estimated speciation rate (maximum likelihood estimate = 0.12 lineages/million years) than oviparous lineages (maximum likelihood estimate = 0.06).

Using phylogenetic logistic regression, reproductive mode is best predicted by the mean temperature of the warmest quarter (estimated slope coefficient B1 = −1.06, P = 0) and the midpoint of the elevational range (B1 = 0.91, P = 0) across Phrynosomatidae (Table 2). The signs of these relationships are as predicted by the cold-climate hypothesis: cooler climates and higher elevations predict viviparity. Reproductive mode is also predicted by midpoint of the latitudinal range (B1 = −0.72, P = 0.02), but this relationship is opposite to what might be expected under the cold-climate hypothesis: viviparity is associated with lower latitudes, rather than the higher latitudes that are typically correlated with cooler temperatures (Tinkle and Gibbons 1977; Hodges 2004). This pattern might be partially explained by the positive correlation between the midpoint of the latitudinal range and the mean temperature of the warmest quarter (Bio10) across all species of Phrynosomatidae (nonphylogenetic Pearson product–moment correlation r = 0.23, P = 0.01). This indicates that higher-latitude phrynosomatid species inhabit areas with higher maximum summer temperatures, such as deserts (for Sceloporus, see also Oufiero et al. 2011). Interestingly, in no cases did Bio1, the mean annual temperature, significantly predict reproductive mode.

Results from Wright’s threshold model are consistent with the results from phylogenetic logistic regression, in terms of the rank order of the importance of the independent variables and statistical significance, at least for the family-level analyses (Tables 2 and S3). For the genus-level analyses, the threshold model provided wider confidence intervals, all of which overlapped with 0 (indicating nonsignificance), perhaps owing to the smaller sample sizes of these analyses.
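The contrast drawn in the Methods between a memoryless Markov process and the threshold model can be made concrete with a small simulation. The sketch below is illustrative only and is not the phytools implementation: it follows a single lineage rather than a phylogeny, lets a liability evolve by Brownian motion, and assigns state 1 (viviparous) when the liability exceeds a threshold of 0. All function names, starting values, and step sizes are assumptions chosen for illustration.

```python
import random


def simulate_liability(start, n_steps, step_sd, rng):
    """Brownian motion along one lineage: add a Gaussian increment per time step."""
    path = [start]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, step_sd))
    return path


def to_states(liabilities, threshold=0.0):
    """Discrete state under the threshold model: 1 if the liability exceeds the threshold."""
    return [1 if value > threshold else 0 for value in liabilities]


def change_frequency(start, n_steps, step_sd, n_reps, seed):
    """Fraction of simulated lineages whose final state differs from their initial state."""
    rng = random.Random(seed)
    changed = 0
    for _ in range(n_reps):
        states = to_states(simulate_liability(start, n_steps, step_sd, rng))
        changed += states[0] != states[-1]
    return changed / n_reps


# A lineage starting just below the threshold versus one starting far below it.
near = change_frequency(start=-0.1, n_steps=20, step_sd=0.1, n_reps=2000, seed=1)
far = change_frequency(start=-2.0, n_steps=20, step_sd=0.1, n_reps=2000, seed=1)
print(f"near threshold: {near:.3f}, far from threshold: {far:.3f}")
```

Under these assumed settings, lineages that start close to the threshold change state far more often than those that start far from it, which is the property that distinguishes the threshold model from a process with a constant transition probability.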


[Figure 1 image: phylogeny of Phrynosomatidae with species names as tip labels. Legend, reconstructed values of Bio10: 19.0–21.9 °C, 21.9–23.4 °C, 23.4–25.9 °C, 25.9–29.2 °C. Horizontal axis: millions of years before present (50 to 0).]

Figure 1. Time-calibrated phylogeny of Phrynosomatidae based on Bayesian analysis of nuclear and mitochondrial genes (from BEAST). Tip labels indicate reproductive mode (black = oviparous, white = viviparous). Node labels show the maximum-likelihood estimate of reproductive mode under the best-fitting model of binary trait evolution (Table 1). Branch colors indicate the reconstructed value for mean temperature of the warmest quarter (Bio10) for the ancestral node of each branch, based on maximum likelihood reconstruction as a continuous character. For illustrative purposes in this figure, these continuous reconstructions are then binned into four categories according to the quantiles of the reconstructed values (blue = 0–25%, green = 26–50%, orange = 51%–75%, red = 76%–100%; see figure legend for exact values). Some species are treated as separate lineages here that were traditionally recognized as subspecies. In these cases, the traditional species designation is indicated with parentheses.
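The quantile binning described in the caption can be sketched as follows. This is a minimal illustration, not the plotting code used for the figure: the cut points come from Python's `statistics.quantiles` with its default method, which may differ slightly from the quantile definition used by the authors, and the input values are hypothetical rather than the actual reconstructions.

```python
import bisect
import statistics

# Colors from lowest to highest quartile, following the Figure 1 caption.
COLORS = ["blue", "green", "orange", "red"]


def quartile_colors(values):
    """Assign each value a color according to the quartile it falls in."""
    cuts = statistics.quantiles(values, n=4)  # three cut points between quartiles
    return [COLORS[bisect.bisect_left(cuts, value)] for value in values]


# Hypothetical reconstructed Bio10 values (degrees C) for eight nodes.
example_bio10 = [19.0, 20.5, 22.0, 23.0, 24.0, 25.0, 27.0, 29.2]
example_colors = quartile_colors(example_bio10)
```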


Table 2. Results of statistical tests of the relationships between environmental variables and reproductive mode. Estimates of B1 and P are from phylogenetic logistic regression and estimates of r are from Wright’s threshold model. Bolded results are significantly different from 0 (i.e., 95% confidence intervals do not overlap with 0). Bio10.mean is the mean across all localities for a given species for Bio10, the mean temperature of the warmest quarter; Elev.mid is the midpoint of the elevational range for each species (calculated using the minimum and maximum values in our dataset); Lat.mid represents the midpoint of the latitudinal range; Bio1.mean is the mean across all localities for a given species for Bio1, the mean annual temperature.

Independent variable   B1      P      B1 95% CI         Mean r   r 95% CI

Phrynosomatidae
Bio10.mean             −1.06   0.00   (−1.89, −0.40)    −0.44    (−0.72, −0.09)
Elev.mid                0.91   0.00   (0.38, 1.72)       0.38    (0.08, 0.63)
Lat.mid                −0.72   0.02   (−1.66, −0.12)    −0.30    (−0.56, −0.01)
Bio1.mean              −0.07   0.50   (−0.73, −0.23)    −0.18    (−0.51, −0.21)

Sceloporus
Bio10.mean             −0.80   0.01   (−1.82, −0.10)    −0.38    (−0.71, 0.07)
Elev.mid                0.93   0.01   (0.23, 2.01)       0.31    (−0.03, 0.60)
Lat.mid                −0.80   0.03   (−2.03, −0.06)    −0.38    (−0.69, 0.01)
Bio1.mean              −0.09   0.75   (−0.84, 0.36)     −0.14    (−0.53, 0.32)

Phrynosoma
Bio10.mean             −1.47   0.01   (−3.65, −0.31)    −0.45    (−0.92, 0.42)
Elev.mid                1.39   0.02   (0.20, 3.02)       0.43    (−0.13, 0.84)
Lat.mid                −0.29   0.67   (−1.65, 1.05)     −0.10    (0.67, 0.54)
Bio1.mean              −0.58   0.26   (−1.98, 0.43)     −0.23    (−0.82, 0.53)
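Because the independent variables were standardized before the regressions, each slope B1 in Table 2 is expressed per standard deviation of the predictor rather than per degree or per meter. A minimal sketch of that standardization is shown below. The use of the sample (n − 1) standard deviation is an assumption, since the text specifies only a mean of 0 and a variance of 1, and the input values are hypothetical.

```python
import statistics


def standardize(values):
    """Rescale values to mean 0 and (sample) variance 1."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # assumption: sample standard deviation (n - 1)
    return [(value - mean) / sd for value in values]


# Hypothetical species means of Bio10 (degrees C).
example_bio10_means = [18.5, 21.0, 23.4, 25.1, 27.9]
z_scores = standardize(example_bio10_means)
```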

Table 3. Results of model comparisons for the evolution of continuous characters used in ancestral reconstructions. Values shown are ΔAICc, with 0 being the best model and models having ΔAICc < 2 also considered to have strong support following Burnham and Anderson (2002). Bolded values indicate models with strong support.

Model             Elevation (midpoint)   Latitude (midpoint)   Bio10 (mean)
Lambda                    0.00                  1.75                0.00
Brownian motion          61.57                  5.29               46.56
White noise             363.90                 62.31               27.71
OU                      136.20                  0.00                8.92

Table 4. Results from phylogenetic generalized least squares (PGLS) comparisons of elevational and latitudinal ranges of viviparous and oviparous species. Under the best-fitting model, the correlation structure from the model with the lowest AICc is given (with “no correlation structure” referring to a standard GLS model without any phylogenetic correlation structure). The intercept is interpretable as the estimated mean of oviparous species, with the slope coefficient representing the difference in the means of oviparous and viviparous species. Bolded P-values indicate statistically significant differences between oviparous and viviparous species using a threshold of 0.05.

Taxon             Variable                    Best-fitting model         Intercept   Slope coefficient   P
Phrynosomatidae   Latitudinal range           Lambda                        6.36         −2.63           0.03
                  Elevational range           Lambda                     1257.08       −143.78           0.39
                  Temperature niche breadth   Lambda                       37.52         −5.91           0.03
Sceloporus        Latitudinal range           Lambda                        4.60         −2.80           0.01
                  Elevational range           No correlation structure   1269.45        −31.11           0.90
                  Temperature niche breadth   Lambda                       30.73         −6.74           0.02
Phrynosoma        Latitudinal range           No correlation structure      7.12         −0.89           0.74
                  Elevational range           No correlation structure   1499.50       −302.83           0.54
                  Temperature niche breadth   No correlation structure     40.99         −1.87           0.75
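The dependent variables compared in Table 4 follow directly from per-locality records, as defined in the Methods. The sketch below shows those definitions, together with the range midpoint used as a predictor elsewhere in the paper. The function names and the locality values are hypothetical, chosen only to illustrate the arithmetic.

```python
def range_size(values):
    """Range size: maximum minus minimum recorded value (elevation or latitude)."""
    return max(values) - min(values)


def range_midpoint(values):
    """Midpoint of the range: halfway between the minimum and maximum recorded values."""
    return (max(values) + min(values)) / 2.0


def temperature_niche_breadth(bio5, bio6):
    """Maximum Bio5 (warmest month) minus minimum Bio6 (coldest month) across localities."""
    return max(bio5) - min(bio6)


# Hypothetical records for one species at three localities.
elevations_m = [1200.0, 1850.0, 2400.0]
bio5_c = [24.1, 27.3, 22.8]  # maximum temperature of the warmest month
bio6_c = [-3.2, 1.5, -6.0]   # minimum temperature of the coldest month

elev_range = range_size(elevations_m)                   # 2400 - 1200
elev_mid = range_midpoint(elevations_m)                 # (2400 + 1200) / 2
breadth = temperature_niche_breadth(bio5_c, bio6_c)     # 27.3 - (-6.0)
```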

For ancestral reconstructions, model-testing for the environmental variables favors the model using the maximum-likelihood estimate of λ for the midpoint of the elevational range and for the mean of Bio10 (Table 3). For the midpoint of the latitudinal range, the OU and maximum-likelihood estimate of λ models were indistinguishable (Table 3). Ancestral reconstructions support the hypothesis that viviparity generally evolves in cold climates and high elevations (Fig. 1; Table 5); we find no origins of viviparity in ancestors reconstructed as occurring in relatively warm or low-elevation environments. Although the two origins in Phrynosoma are supported as occurring in the upper 50% of all reconstructed Bio10 values across Phrynosomatidae, both origins occur in


Table 5. Results from nonparametric test comparing the mean ancestral values for the six origins of viviparity with the distribution of means from 100,000 random samples of six nodes from all oviparous nodes. “Test statistic” refers to the relevant tail of the one-sided 95% confidence interval from the distribution of sampled means from the oviparous reconstructions. The values used for Sceloporus bicanthalis and Sceloporus aeneus are the midpoint between the extant values and the immediately ancestral (oviparous) nodes. Results from two models of continuous character evolution are presented for latitude because both had ΔAICc values of < 2.

Origin                Mean of Bio10  Midpoint latitude (λ)  Midpoint latitude (OU)  Midpoint elevation
S. poinsetti group    21.58          23.14                  23.33                   1556.36
P. orbiculare group   23.83          30.66                  30.21                   1177.13
P. braconnieri group  24.50          23.06                  23.54                   1122.78
S. formosus group     20.54          18.03                  18.51                   1537.56
S. bicanthalis        16.91          20.53                  20.41                   2297.14
S. aeneus             16.75          22.02                  21.97                   2559.47
Mean (origins)        20.86          22.91                  23.00                   1708.41
Test statistic        23.15          23.31                  23.52                   1324.72
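The resampling test summarized in Table 5 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function name `resampling_test`, the fixed seeds, and the pool of oviparous node values are hypothetical, and only the six elevational midpoints at the origins of viviparity are taken from Table 5. The sketch draws random sets of six oviparous nodes, records the mean of each set, and uses the 95th percentile of those means as the one-sided threshold (for temperature and latitude, where lower values are expected, the 5th percentile would be used instead).

```python
import random
import statistics

# Elevational midpoints (m) at the six origins of viviparity, from Table 5.
ORIGIN_ELEVATIONS = [1556.36, 1177.13, 1122.78, 1537.56, 2297.14, 2559.47]


def resampling_test(origin_values, oviparous_values, n_draws=100_000,
                    upper_tail=True, seed=1):
    """Compare the mean at the origins with means of random oviparous node sets."""
    rng = random.Random(seed)
    k = len(origin_values)
    # Mean of each random sample (without replacement) of k oviparous nodes.
    sampled_means = sorted(
        statistics.fmean(rng.sample(oviparous_values, k)) for _ in range(n_draws)
    )
    # One-sided 95% threshold: 95th percentile for an upper-tail test,
    # 5th percentile for a lower-tail test.
    index = int(0.95 * n_draws) if upper_tail else int(0.05 * n_draws)
    threshold = sampled_means[min(index, n_draws - 1)]
    observed = statistics.fmean(origin_values)
    significant = observed > threshold if upper_tail else observed < threshold
    return observed, threshold, significant


# Hypothetical reconstructed elevations (m) for oviparous nodes; NOT the study's data.
pool_rng = random.Random(0)
hypothetical_oviparous = [pool_rng.uniform(200.0, 1600.0) for _ in range(80)]

observed, threshold, significant = resampling_test(
    ORIGIN_ELEVATIONS, hypothetical_oviparous, n_draws=10_000
)
print(round(observed, 2), significant)
```

Sampling nodes without replacement keeps each random set comparable to the six distinct origins; whether the original analysis sampled with or without replacement is not stated here, so that choice is also an assumption.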

relatively low temperatures when compared to the reconstructed values within Phrynosoma (third and fifth lowest out of the 14 reconstructed values within the genus, with the first and second coldest reconstructed values also found within viviparous clades). In Sceloporus, each of the four origins is reconstructed to have occurred in environments with reconstructed summer temperatures in the lowest 25% of all reconstructed values (Fig. 1). The nonparametric test we used to compare estimated conditions at the six origins of viviparity with the estimated conditions in oviparous lineages across Phrynosomatidae indicates that the conditions at the origins of viviparity were significantly colder, at higher elevations, and at lower latitudes than those reconstructed for oviparous lineages (Table 5).

Another pattern we addressed using ancestral reconstructions is the presence of viviparous species that are restricted to tropical, low-elevation environments. Three species in particular fit this pattern: Sceloporus macdougalli (maximum recorded elevation 53 m, latitudinal midpoint ∼16°), Sceloporus (serrifer) serrifer (maximum recorded elevation 32 m, latitudinal midpoint ∼21°), and Sceloporus lundelli (maximum recorded elevation 92 m, latitudinal midpoint ∼20°). Ancestral reconstructions indicate that these species invaded their warm, low-elevation environments after their ancestors evolved viviparity in high-elevation, cool environments (Fig. 2; because the reproductive mode of S. lundelli is unknown and assumed based on closest relatives, we present results only for the former two species). In contrast, the two most recent origins of viviparity in Phrynosomatidae (i.e., in S. bicanthalis and S. aeneus of the scalaris group) are associated with the first and third coldest extant mean summer temperatures and the first and second highest elevational midpoints, respectively, in the entire dataset (Appendix S1).

Our comparisons of the elevational range sizes, latitudinal range sizes, and temperature niche breadths of oviparous and viviparous species resulted in several instances where multiple GLS models had ΔAICc scores of < 2 (Table S3). However, the GLS results using these models never disagree regarding the statistical significance of the difference between oviparous and viviparous species (using P < 0.05; Table S4), and for simplicity, we report the estimated means and P-values from the model with the minimum AICc in these cases. The elevational range sizes of oviparous and viviparous species are not significantly different at any taxonomic level examined (Table 4). Across Phrynosomatidae and within Sceloporus, we find that the latitudinal range sizes and temperature niche breadths of viviparous species are significantly smaller than those of oviparous species. Within Phrynosoma, we find no significant differences in any of the examined variables.

Discussion

The origin of viviparity is a major transition in life-history evolution in vertebrates, but the selective factors that lead to this transition are not fully understood. In this study, we provide the first direct test of the most basic prediction of the cold-climate hypothesis (that viviparity evolves in colder climates) by combining explicit data on climate and phylogeny. Our results support the cold-climate hypothesis, but also suggest an intriguing twist: cold climates favor the evolution of viviparity, but mostly in the tropics.

We support the cold-climate hypothesis in that we find that low summer temperatures (Bio10) are strongly related to viviparity, according to both phylogenetic logistic regression and correlations from Wright’s threshold model (Table 2). We find no significant relationships between viviparity and mean annual temperature, but this result is also consistent with the cold-climate hypothesis. The selective pressures that favor viviparity should result from the temperature during the egg-laying and incubation season, which occurs during the summer for oviparous phrynosomatid lizards (e.g., Smith 1995; Jones and Lovich 2009). High and low temperatures during other parts of the year may be of limited relevance. In fact, it might be more accurate to call the


[Figure 2 panels: Sceloporus serrifer and Sceloporus macdougalli, each showing estimated mean of Bio10 (°C) and estimated elevational midpoint (m) plotted against millions of years ago (50 to 0).]

Figure 2. Diagram showing the reconstructed values of temperature and elevation for two viviparous species that inhabit exclusively tropical, low-elevation sites (< 100 m elevational midpoint), showing the values at each node in a straight-line path from the root of the Phrynosomatidae to the extant values. Temperature is the mean temperature of the warmest quarter (Bio10) and elevation is the midpoint of the elevational range. The figure shows that the warm, low-elevation climates inhabited by these species represent secondary invasions after the evolution of viviparity in an ancestor that inhabited relatively cool, high-elevation environments. Colors of each circle indicate the reconstruction of reproductive mode (black = oviparity, white = viviparity) at the corresponding node.

“cold climate hypothesis” the “cool summer hypothesis,” especially given the demonstrated prevalence of viviparity in tropical regions.

Our results are generally concordant with those of Hodges (2004), who examined the evolution of viviparity in Phrynosoma only. In agreement with Hodges (2004), we find that elevation significantly predicts reproductive mode within Phrynosoma. However, Hodges (2004) further suggested that high-elevation environments have characteristics beyond cold temperatures that also favor viviparity. Our results show that summer temperatures generally have a stronger relationship with parity mode than elevation (Table 1), but summer temperatures and elevation are highly correlated (Table S7), making their effects difficult to distinguish. Hodges (2004) highlighted three other factors at high elevations that might favor viviparity. First, lower oxygen concentrations at higher elevations can slow embryonic development (Andrews 2002), and placental structures can allow viviparous mothers to better supply their young with oxygen. Second, diel temperature fluctuations at high elevations may present another difficulty that viviparous females can mitigate through behavioral thermoregulation. Unfortunately, these hypotheses are difficult to address with the data available for phrynosomatids. Third, Hodges (2004) stated that higher elevations are drier, and more arid conditions are problematic for developing eggs. However, we find that elevation and annual precipitation (Bio12) are actually positively correlated for the environments inhabited by Phrynosomatidae (Table S7).


Our analyses also help to explain three patterns seen in Phrynosomatidae that seem to contradict the cold-climate hypothesis: the presence of viviparous species in lowland tropical habitats, the scarcity of viviparous species at high latitudes, and the tendency of viviparity to evolve in tropical regions. First, our ancestral reconstructions indicate that the viviparous species of Sceloporus inhabiting primarily tropical, low-elevation environments (e.g., S. [serrifer] serrifer and S. macdougalli) secondarily invaded these areas and are descended from high-elevation, cool-climate dwelling viviparous ancestors (Fig. 2). These results highlight the importance of considering phylogenetic history when testing hypotheses of adaptation: in phrynosomatids, viviparity appears to have originated exclusively in cold climates, even if the descendant species have subsequently invaded other environments. Similar information is needed for other viviparous squamates that occur in warm environments to determine if they represent in situ origins of viviparity or secondary invasions of such environments. For example, the exclusively viviparous and lowland Neotropical skink genus Mabuya (sensu lato) potentially challenges the generality of the cold-climate hypothesis (Shine 1985, 2004).

Furthermore, our results suggest that viviparous species are able to rapidly adapt to warm climates without reverting to oviparity. This pattern might occur because these lineages have lost the genetic and developmental (and possibly behavioral) features needed for oviparity to re-evolve. Alternatively, viviparity might still provide advantages even in environments that are not cold. The maternal manipulation hypothesis (Shine 1995) posits that viviparous females can enhance offspring fitness by maintaining conditions for the embryo that are advantageous relative to those in nesting sites. For example, in warm environments, viviparous females might provide more stable temperatures for developing embryos (Webb et al. 2006; Ji et al. 2007). However, it is unclear whether such an advantage could facilitate evolution of viviparity in lowland tropical climates, invasion of these environments by viviparous species, or allow maintenance of viviparity under these conditions.

Our results may also help to explain the absence of viviparous Sceloporus at the highest latitudes and elevations in temperate regions, despite our overall support for the cold-climate hypothesis. Although there are phrynosomatid species with ranges that extend to relatively high latitudes (to 47.7°, Phrynosoma douglassi) and high elevations (to 4067 m, Sceloporus graciosus) in temperate North America (Appendix S1; Smith 1995; Jones and Lovich 2009), these belong to one of two categories. First, there are viviparous “invaders” to high latitudes (i.e., Phrynosoma douglassi and P. hernandesi), which are reconstructed as having evolved from an ancestor that evolved viviparity at lower latitudes (∼30°; Table 5). Second, there are oviparous species of Sceloporus and oviparous species of other genera (e.g., Holbrookia, Urosaurus, Uta). We find that viviparous Sceloporus do not occur above a maximum of ∼33° latitude in North America, even though 28 oviparous phrynosomatid species do.

We suggest that the scarcity of high-latitude viviparous species may be explained by the same factor that underlies the tendency for viviparity to evolve at tropical latitudes. We find that five of the six origins of viviparity in the family seemingly occurred below 24° latitude (using the midpoint of the latitudinal range of species, Table 5), and the relationship between low latitudes and viviparity is also supported by our phylogenetic comparative analyses (Table 2) and nonparametric test of latitudinal origins (Table 5). Although other studies have noted higher numbers of viviparous species in the tropics (e.g., Tinkle and Gibbons 1977), they have not suggested that viviparity is more likely to originate in the tropics, and they actually found a greater proportion of viviparous species at high latitudes. In contrast, we find that both the proportion and absolute number of viviparous species in Phrynosomatidae are highest at tropical latitudes (Fig. 4).

We propose that latitudinal variation in temperature seasonality might explain the paradoxical tendency for viviparity to evolve in the tropics and be largely absent at high latitudes in Phrynosomatidae. Specifically, seasonal temperature stability in tropical regions might facilitate the evolution of cool-climate, high-elevation specialists. These cool-climate specialists may then be more likely to evolve viviparity, as the selective pressure (cool summer temperatures) is present throughout their geographic range, whereas climatic generalists may have ongoing gene flow with warm-climate populations where oviparity is favored instead. This hypothesis recalls that of Janzen (1967), who hypothesized that seasonal stability in temperature at tropical localities will result in selection for narrower physiological tolerances. In turn, these narrow physiological tolerances will reduce the dispersal ability of individuals across elevations and may facilitate the evolution of species that are high-elevation versus low-elevation specialists (Ghalambor et al. 2006). At more northern latitudes, each locality will experience a relatively broad range of temperatures regardless of elevation, potentially reducing specialization to both high and low elevations.

In support of this hypothesis, we find that many high-latitude phrynosomatids occur at both low and high elevations (e.g., S. consobrinus, S. graciosus, S. occidentalis, S. tristichus, Urosaurus ornatus, Uta stansburiana, each with elevational ranges of > 2200 m). Of the 28 oviparous species found above 33° latitude, 17 have elevational ranges > 2000 m, and only six have elevational ranges of < 1000 m. Importantly, those species with ranges < 1000 m are primarily confined to lower elevations and latitudes in temperate North America (the maximum recorded elevation among these species is 1249 m for Sceloporus arenicolus, and the average midpoint of their elevational ranges is ∼340 m). We hypothesize that


[Figure 3 plot: temperature niche breadth (°C) against latitudinal midpoint (decimal degrees); legend: viviparous, oviparous.]

Figure 3. Scatter plot of temperature niche breadths (y-axis) and midpoint of the latitudinal range (x-axis) for oviparous and viviparous phrynosomatid species (dark circles are viviparous species, crosses are oviparous species). The viviparous species at high latitudes are Phrynosoma hernandesi and Phrynosoma douglassi.
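The temperature niche breadth plotted in Figure 3 is described in the Discussion as the difference between the maximum value of Bio5 (maximum temperature of the warmest month) and the minimum value of Bio6 (minimum temperature of the coldest month) across a species' sampled localities. A minimal Python sketch of that calculation follows; the function name `thermal_niche_breadth` and both sets of locality values are hypothetical and chosen only for illustration.

```python
def thermal_niche_breadth(localities):
    """Return max Bio5 minus min Bio6 (°C) across a species' localities.

    localities: list of (bio5, bio6) tuples, one per sampled locality.
    """
    if not localities:
        raise ValueError("at least one locality is required")
    max_bio5 = max(bio5 for bio5, _ in localities)
    min_bio6 = min(bio6 for _, bio6 in localities)
    return max_bio5 - min_bio6


# Hypothetical (Bio5, Bio6) pairs in °C; NOT values from Appendix S1.
tropical_montane = [(22.0, 4.0), (24.5, 6.0), (21.0, 3.5)]
temperate_generalist = [(38.0, -12.0), (29.5, -18.0), (34.0, -6.5)]

tropical_breadth = thermal_niche_breadth(tropical_montane)
temperate_breadth = thermal_niche_breadth(temperate_generalist)
print(tropical_breadth, temperate_breadth)
```

Because the maximum and minimum may come from different localities, the breadth reflects both within-locality seasonality and between-locality differences, which is why the two sources of variation need to be separated when evaluating Janzen's (1967) hypothesis.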

viviparity fails to evolve in high-latitude species because of gene flow between low-elevation populations (where viviparity may not be favored because high summer temperatures allow for rapid egg development) and high-elevation populations (where viviparity might otherwise be favored). However, intrinsic physiological constraints of certain species or clades might also be a factor. For instance, it has been shown that species of Sceloporus vary in their ability to continue embryonic development during egg retention, with in utero oxygen availability implicated as an important constraint (Andrews and Rose 1994; Andrews 2002; Parker and Andrews 2006). Nevertheless, the high-latitude, oviparous, climatic generalists do belong to diverse clades across phrynosomatids, arguing against obvious clade-specific constraints.

In further support of the idea that there are fewer cool-climate specialists at temperate latitudes, there is a strong positive relationship between thermal niche breadth and latitude (Fig. 3; R² = 0.712, P < 0.001 from PGLS: Quintero and Wiens 2013), where thermal niche breadth is the difference between the minimum value of Bio6 (the minimum temperature of the coldest month) and the maximum value of Bio5 (the maximum temperature of the warmest month) across sampled localities. Quintero and Wiens (2013) also found that wide thermal niche widths in temperate phrynosomatids are caused primarily by within-locality seasonal variation, rather than between-locality differences in temperature values, an important assumption in Janzen’s (1967) hypothesis.

This hypothesis of cool-climate, high-elevation specialization favoring the evolution of viviparity in the tropics will require further testing, perhaps including analyses of gene flow and local adaptation to different elevations in tropical versus temperate regions (with higher gene flow and less local adaptation over similar elevational ranges predicted for temperate species). This hypothesis also predicts broader physiological tolerances to temperature in temperate versus tropical species (e.g., Janzen 1967; Ghalambor et al. 2006), a prediction that has been confirmed using several species of Sceloporus (van Berkum 1988). Future studies should also address why viviparous Phrynosoma species were able to invade high-latitude regions from the south, whereas viviparous Sceloporus have not done so. Another challenge is to compare these patterns in temperate North American phrynosomatids to those in other clades and regions (e.g., Eurasia, austral South America, temperate Australia). For instance, the Eurasian lacertid Zootoca vivipara contains multiple viviparous lineages that appear to have originated recently at high latitudes, possibly


[Figure 4 plot: species richness by latitude (decimal degrees) in 5° bins from 5–10 to 45–50; legend: viviparous, oviparous.]

Figure 4. Species richness of oviparous species (black) and viviparous species (white) within latitudinal bins encompassing the entire range of Phrynosomatidae (based on data in Appendix S1). Any species whose latitudinal range overlaps a bin was counted toward the species richness of that bin (even if there were no localities sampled in that particular range of latitudinal values).
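The counting rule in the Figure 4 caption (a species contributes to every bin that its latitudinal range overlaps) can be sketched as follows. This is an illustrative Python sketch rather than the authors' procedure: the function name `richness_by_bin`, the half-open treatment of bin boundaries, and the three example ranges are all assumptions. To reproduce the two series in the figure, the function would be called once for oviparous and once for viviparous species.

```python
def richness_by_bin(lat_ranges, start=5, stop=50, width=5):
    """Count species whose latitudinal range overlaps each bin.

    lat_ranges: list of (min_latitude, max_latitude) tuples, one per species.
    Bins are treated as half-open intervals [low, high); this boundary
    convention is an assumption, not something stated in the caption.
    """
    counts = {}
    for low in range(start, stop, width):
        high = low + width
        counts[(low, high)] = sum(
            1 for lat_min, lat_max in lat_ranges
            if lat_min < high and lat_max >= low
        )
    return counts


# Hypothetical latitudinal ranges (decimal degrees); NOT taken from Appendix S1.
hypothetical_ranges = [(16.0, 17.5), (19.0, 33.0), (31.0, 47.7)]
counts = richness_by_bin(hypothetical_ranges)
print(counts[(15, 20)], counts[(30, 35)], counts[(45, 50)])
```

Under this rule the second hypothetical species is counted in every bin from 15–20 through 30–35, even if no locality was sampled in some of them, which matches the parenthetical note in the caption.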

related to climate change during the Pleistocene (Surget-Groba et al. 2001, 2006).

In summary, despite our support for the cold-climate hypothesis, we find that viviparity evolves more often at tropical latitudes than in temperate regions, and that viviparity is relatively uncommon in high-latitude phrynosomatids. This pattern seems to be explained by the presence of high-elevation, cool-climate specialists in tropical regions, and the dominance of elevational generalists and low-elevation specialists in temperate regions. These patterns are in turn explained by greater temperature seasonality in the temperate zone, leading to wider climatic niche breadths and a lack of elevational specialists.

VIVIPARITY, RANGE SIZE, AND CLIMATE CHANGE

A previous study suggested that viviparous species of Sceloporus have a higher risk of extinction from anthropogenic climate change because they tend to be restricted to high-elevation habitats, where climate is changing most rapidly (Sinervo et al. 2010). Our results from phylogenetic logistic regression and Wright’s threshold model clearly support the assumption that viviparous species inhabit higher elevations (Table 2). To further explore the hypothesis that viviparous species may be more at risk from climate change, we compared the latitudinal range sizes, elevational range sizes, and temperature niche breadths of viviparous and oviparous phrynosomatid species. Across Phrynosomatidae and within Sceloporus, we find that viviparous species have significantly smaller latitudinal ranges and temperature niche breadths, as expected. However, we find similar elevational range sizes in oviparous and viviparous species. Climate change may modify local conditions such that populations will go extinct if they cannot disperse and/or adapt (e.g., Holt 1990). A larger range size should allow a species to persist despite some localized extinctions. The smaller latitudinal ranges and temperature niche breadths of viviparous species place them at a disadvantage in this respect, but their relatively large elevational range sizes may help them persist.

VIVIPARITY AND DIVERSIFICATION RATE

Our results also suggest that viviparous phrynosomatid lineages have a significantly higher speciation rate than oviparous lineages. A previous study (Lynch 2009) also found that viviparous clades of vipers (Viperidae) had higher speciation rates than their oviparous counterparts, suggesting the possibility that this might be a relatively broad evolutionary pattern. However, Lynch (2009)


made the important caveat that an increased speciation rate in viviparous lineages could be due to traits that are associated with viviparity, rather than viviparity itself.

One such trait for viviparous phrynosomatid species might be their presence in tropical montane regions. As climate changes through time, tropical montane species may be more likely to become isolated on mountaintops because of the increased climatic zonation in tropical regions (Janzen 1967), potentially leading to increased rates of allopatric speciation (e.g., Ghalambor et al. 2006). An ancillary prediction is that viviparous species will show signatures of speciation via climatic niche conservatism (Wiens 2004; Hua and Wiens 2013), with similar climatic niches in allopatric sister species that differ markedly from the lowlands that separate them (e.g., Kozak and Wiens 2006). This isolation at high elevations through time may ultimately help explain both the evolution of viviparity itself and the faster rate of diversification estimated for viviparous lineages.

However, viviparity itself might increase diversification rates. One proposed mechanism is the viviparity-driven conflict hypothesis (Zeh and Zeh 2000, 2008). Briefly, this hypothesis suggests that the intensified physiological interactions between a viviparous mother and her offspring create greater selective pressure for genomic compatibility that is not present in oviparous species. This selection then leads to a faster build-up of postzygotic isolation in viviparous species. One potential prediction of this hypothesis is that pairs of viviparous species that show sympatry without introgression will be younger than similar pairs of oviparous species because of more rapid evolution of postzygotic isolation. This hypothesis has been proposed as an explanation for the apparent faster speciation rate of other viviparous lineages (e.g., mammals; Zeh and Zeh 2000). However, the effects of viviparity-driven conflict may be absent in squamate reptiles, where interactions between mother and embryo are typically more limited than in mammals (Yaron 1985; Stewart and Blackburn 1988; Blackburn 1998; Stewart and Thompson 2000). Nevertheless, both the climate and conflict-based hypotheses should be explored in future studies.

Conclusions

Our study is the first to integrate phylogeny and climatic data to address the origin of viviparity in squamate reptiles, and we strongly support the cold-climate hypothesis. Using phylogenetic comparative methods, we find that the parity mode of extant phrynosomatid species is most strongly related to the mean temperature of the warmest (egg-laying) season. Using ancestral reconstructions, we found that viviparity evolves in cooler climates, and that viviparous species inhabiting exclusively tropical, low-elevation areas represent secondary invasions of these environments. In addition, we found that reproductive mode is also significantly related to latitude within Phrynosomatidae, with viviparity more likely to evolve in lower (tropical) latitudes. We hypothesize that this surprising latitudinal pattern could be due to greater seasonal temperature stability in the tropics facilitating the evolution of cool-climate, high-elevation specialists. This hypothesis also explains the paradoxical scarcity of viviparous phrynosomatid species at high latitudes. Finally, our results suggest that viviparous lineages have higher speciation rates than oviparous lineages.

ACKNOWLEDGMENTS

The authors thank A. Ives for advice regarding phylogenetic logistic regression, and D. Blackburn, J. Mank, and two anonymous reviewers for helpful comments on the manuscript.

LITERATURE CITED

Andrews, R. M. 2000. Evolution of viviparity in squamate reptiles (Sceloporus spp.): a variant of the cold-climate model. J. Zool. 250:243–253.
———. 2002. Low oxygen: a constraint on the evolution of viviparity in reptiles. Physiol. Biochem. Zool. 75:145–154.
Andrews, R. M., and B. R. Rose. 1994. Evolution of viviparity: constraints on egg retention. Physiol. Zool. 67:1006–1024.
Amoroso, E. C. 1968. The evolution of viviparity. Proc. R. Soc. Med. 61:1188–1200.
Blackburn, D. G. 1998. Structure, function, and evolution of the oviducts of squamate reptiles with special reference to viviparity and placentation. J. Exp. Zool. 282:560–617.
———. 1999a. Viviparity and oviparity: evolution and reproductive strategies. Pp. 994–1003 in T. E. Knobil and J. D. Neill, eds. Encyclopedia of reproduction. Academic Press, New York.
———. 1999b. Are viviparity and egg-guarding evolutionarily labile in squamates? Herpetologica 55:556–573.
———. 2000. Reptilian viviparity: past research, future directions, and appropriate models. Comp. Biochem. Physiol. A Comp. Physiol. 127:391–409.
———. 2005. Amniote perspective on the evolutionary origins of viviparity and placentation. Pp. 301–322 in H. J. Grief and M. C. Uribe, eds. Viviparous fishes. New Life Publications, Homestead, FL.
———. 2006. Squamate reptiles as model organisms for the evolution of viviparity. Herp. Mon. 20:121–146.
Burnham, K. P., and D. R. Anderson. 2002. Model selection and multimodel inference: a practical information-theoretic approach. 2nd ed. Springer-Verlag, New York.
Calderón-Espinosa, M. L., R. M. Andrews, and F. R. Méndez de la Cruz. 2006. Evolution of egg retention in the Sceloporus spinosus group: exploring the role of physiological, environmental, and phylogenetic factors. Herp. Mon. 20:147–158.
Conant, R., and J. T. Collins. 1998. A field guide to reptiles and amphibians of Eastern and Central North America. 4th ed. Houghton Mifflin Harcourt, Boston.
Drummond, A. J., and A. Rambaut. 2007. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol. Biol. 7:214.
Drummond, A. J., S. Y. W. Ho, M. J. Phillips, and A. Rambaut. 2006. Relaxed phylogenetics and dating with confidence. PLoS Biol. 4:e88.
Felsenstein, J. 2012. A comparative method for both discrete and continuous characters using the threshold model. Am. Nat. 179:145–156.
FitzJohn, R. G. 2012. Diversitree: comparative phylogenetic analyses of diversification in R. Methods Ecol. Evol. 3:1084–1092.


García-Collazo, R., M. Villagrán-Santa Cruz, E. Morales-Guillaumin, R. N. M. Lázaro, and F. R. Méndez-de la Cruz. 2012. Egg retention and intrauterine embryonic development in Sceloporus aeneus (Reptilia: Phrynosomatidae): implications for the evolution of viviparity. Rev. Mex. Biodivers. 83:802–808.
Ghalambor, C. K., R. B. Huey, P. R. Martin, J. J. Tewksbury, and G. Wang. 2006. Are mountain passes higher in the tropics? Janzen’s hypothesis revisited. Integr. Comp. Biol. 46:5–17.
Grafen, A. 1989. The phylogenetic regression. Philos. Trans. Roy. Soc. B. 326:119–157.
Grismer, L. L. 2002. Amphibians and reptiles of Baja California. University of California Press, Berkeley, CA.
Guillette, L. J. Jr. 1993. The evolution of viviparity in lizards. BioScience 43:742–751.
Guillette, L. J. Jr., R. E. Jones, K. T. Fitzgerald, and H. M. Smith. 1980. Evolution of viviparity in the lizard genus Sceloporus. Herpetologica 36:201–215.
Hansen, T. F., and E. P. Martins. 1996. Translating between microevolutionary process and macroevolutionary patterns: the correlation structure of interspecific data. Evolution 18:1404–1417.
Harmon, L. J., J. T. Weir, C. D. Brock, R. E. Glor, and W. Challenger. 2008. GEIGER: investigating evolutionary radiations. Bioinformatics 24:129–131.
Hijmans, R. J., S. E. Cameron, J. L. Parra, P. G. Jones, and A. Jarvis. 2005. Very high resolution interpolated climate surfaces for global land areas. Int. J. Climatol. 25:1965–1978.
Hodges, W. L. 2004. Evolution of viviparity in horned lizards (Phrynosoma): testing the cold climate hypothesis. J. Evol. Biol. 17:1230–1237.
Holt, R. D. 1990. The microevolutionary consequences of climate change. Trends Ecol. Evol. 5:311–315.
Hua, X., and J. J. Wiens. 2013. How does climate influence speciation? Am. Nat. In press.
Ives, A. R., and T. Garland. 2010. Phylogenetic logistic regression for binary dependent variables. Syst. Biol. 59:9–26.
Janzen, D. H. 1967. Why mountain passes are higher in the tropics. Am. Nat. 101:233–249.
Ji, X., C.-X. Lin, L.-H. Lin, Q.-B. Qiu, and Y. Du. 2007. Evolution of viviparity in warm-climate lizards: an experimental test of the maternal manipulation hypothesis. J. Evol. Biol. 20:1037–1045.
Jones, L. L. C., and R. E. Lovich, eds. 2009. Lizards of the American Southwest. Rio Nuevo Publishers, Tucson, AZ.
Kozak, K. H., and J. J. Wiens. 2006. Does niche conservatism drive speciation? A case study in North American salamanders. Evolution 60:2604–2621.
Leaché, A. D. 2010. Species trees for spiny lizards (genus Sceloporus): identifying points of concordance and conflict between nuclear and mitochondrial data. Mol. Phylogenet. Evol. 54:162–171.
Lee, M. S. Y., and R. Shine. 1998. Viviparity and Dollo’s law. Evolution 52:1441–1450.
Li, H., Y.-F. Qu, R.-B. Hu, and X. Ji. 2009. Evolution of viviparity in cold-climate lizards: testing the maternal manipulation hypothesis. Evol. Ecol. 23:777–790.
Lynch, V. J. 2009. Live-birth in vipers (Viperidae) is a key innovation and adaptation to global cooling during the Cenozoic. Evolution 63:2457–465.
Lynch, V. J., and G. P. Wagner. 2009. Did egg-laying boas break Dollo’s law? Phylogenetic evidence for reversal to oviparity in sand boas (Eryx: Boidae). Evolution 64:207–216.
Maddison, W. P. 1990. A method for testing the correlated evolution of two binary characters: are gains or losses concentrated on certain branches of a phylogenetic tree? Evolution 44:539–557.
Maddison, W. P., and D. R. Maddison. 2011. Mesquite: a modular system for evolutionary analysis. Version 2.75. Available via http://mesquiteproject.org.
Maddison, W. P., P. E. Midford, and S. P. Otto. 2007. Estimating a binary character’s effect on speciation and extinction. Syst. Biol. 18:701–710.
Mell, R. 1929. Grundzüge einer Ökologie der chinesischen Reptilien. Walter De Gruyter, Berlin.
Méndez-de la Cruz, F. R., M. Villagrán-Santa Cruz, and R. M. Andrews. 1998. Evolution of viviparity in the lizard genus Sceloporus. Herpetologica 54:521–532.
Midford, P. E., T. Garland Jr., and W. P. Maddison. 2010. PDAP Package of Mesquite. Version 1.16.
Neill, W. T. 1964. Viviparity in snakes: some ecological and zoogeographical considerations. Am. Nat. 98:35–55.
Oufiero, C. E., G. E. A. Gartner, S. C. Adolph, and T. Garland Jr. 2011. Latitudinal and climatic variation in scale counts and body size in Sceloporus lizards: a phylogenetic perspective. Evolution 65:3590–3607.
Packard, G. C., C. R. Tracy, and J. J. Roth. 1977. The physiological ecology of reptilian eggs and embryos and the evolution of viviparity within the class Reptilia. Biol. Rev. 52:71–105.
Pagel, M. 1999. The maximum likelihood approach to reconstructing ancestral character states of discrete characters on phylogenies. Syst. Biol. 48:612–622.
Paradis, E., J. Claude, and K. Strimmer. 2004. APE: analyses of phylogenetics and evolution in R language. Bioinformatics 20:289–290.
Parker, S. L., and R. M. Andrews. 2006. Evolution of viviparity in sceloporine lizards: in utero Po2 as a developmental constraint during egg retention. Physiol. Biochem. Zool. 79:581–592.
———. 2007. Incubation temperature and phenotypic traits of Sceloporus undulatus: implications for the northern limits of distribution. Oecologia 151:218–231.
Pinheiro, J., D. Bates, S. DebRoy, D. Sarkar, and the R Development Core Team. 2012. nlme: linear and nonlinear mixed effects models. R package version 3.1-104.
Qualls, F. J., and R. Shine. 1998. Lerista bougainvillii, a case study for the evolution of viviparity in reptiles. J. Evol. Biol. 11:63–78.
Qualls, F. J., and R. M. Andrews. 1999. Cold climates and the evolution of viviparity in reptiles: cold incubation temperatures produce poor-quality offspring in the lizard, Sceloporus virgatus. Biol. J. Linn. Soc. 67:353–376.
Quintero, I., and J. J. Wiens. 2013. What determines the climatic niche width of species? The role of spatial and temporal climatic variation in three vertebrate clades. Global. Ecol. Biogeogr. 22:422–432.
R Core Team. 2012. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3–900051-07-0. Available via http://www.R-project.org.
Revell, L. J. 2012. Phytools: an R package for phylogenetic comparative biology (and other things). Methods Ecol. Evol. 3:217–223.
Rodríguez-Díaz, T., and F. Braña. 2012. Altitudinal variation in egg retention and rates of embryonic development in oviparous Zootoca vivipara fits predictions from the cold-climate model on the evolution of viviparity. J. Evol. Biol. 25:1877–1887.
Sergeev, A. M. 1940. Researches on the viviparity of reptiles. Moscow Soc. Nat. 1:1–34.
Schulte, J. A. II, and F. Moreno-Roark. 2010. Live birth among iguanian lizards predates Pliocene–Pleistocene glaciations. Biol. Lett. 6:216–218.
Schulte, J. A. II, J. R. Macey, R. E. Espinoza, and A. Larson. 2000. Phylogenetic relationships in the iguanid lizard genus Liolaemus: multiple origins of viviparous reproduction and evidence for recurring Andean vicariance and dispersal. Biol. J. Linn. Soc. 69:75–102.

Shine, R. 1983. Reptilian viviparity in cold climates: testing the assumptions of an evolutionary hypothesis. Oecologia 57:397–405.
———. 1985. The evolution of viviparity in reptiles: an ecological analysis. Pp. 607–681 in C. Gans and F. Billet, eds. Biology of the Reptilia 15. John Wiley & Sons, NY.
———. 1987. The evolution of viviparity: ecological correlates of reproductive mode within a genus of Australian snakes (Pseudechis: Elapidae). Copeia 1987:551–563.
———. 1995. A new hypothesis for the evolution of viviparity in reptiles. Am. Nat. 145:809–823.
———. 2002. Reconstructing an adaptationist scenario: what selective forces favor the evolution of viviparity in montane reptiles? Am. Nat. 160:582–593.
———. 2004. Does viviparity evolve in cold climate reptiles because pregnant females maintain stable (not high) body temperatures? Evolution 58:1809–1818.
Shine, R., and J. F. Berry. 1978. Climatic correlates of live-bearing in squamate reptiles. Oecologia 33:261–268.
Shine, R., and J. J. Bull. 1979. The evolution of live-bearing in lizards and snakes. Am. Nat. 113:905–923.
Shine, R., and M. S. Y. Lee. 1999. A reanalysis of the evolution of viviparity and egg-guarding in squamate reptiles. Herpetologica 55:538–549.
Shine, R., M. Elphick, and E. G. Barrott. 2003. Sunny side up: lethally high, not low, temperature may prevent oviparous reptiles from reproducing at high elevations. Biol. J. Linn. Soc. 78:325–334.
Sinervo, B., F. Méndez-de la Cruz, D. B. Miles, B. Heulin, E. Bastiaans, M. Villagrán-Santa Cruz, R. Lara-Resendiz, N. Martínez-Méndez, M. L. Calderón-Espinosa, R. N. Meza-Lázaro, et al. 2010. Erosion of lizard diversity by climate change and altered thermal niches. Science 328:894–899.
Sites, J. W. Jr., J. W. Archie, C. J. Cole, and O. Flores-Villela. 1992. A review of phylogenetic hypotheses for lizards of the genus Sceloporus (Phrynosomatidae): implications for ecological and evolutionary studies. Bull. Am. Mus. Nat. Hist. 213:1–110.
Sites, J. W. Jr., T. W. Reeder, and J. J. Wiens. 2011. Phylogenetic insights on evolutionary novelties in lizards and snakes: sex, birth, bodies, niches, and venom. Annu. Rev. Ecol. Evol. Syst. 42:227–244.
Smith, H. M. 1995. Handbook of lizards: lizards of the United States and Canada. Cornell Univ. Press, Ithaca, NY.
Stebbins, R. C. 2003. A field guide to Western reptiles and amphibians. 3rd ed. Houghton Mifflin Harcourt, Boston.
Stewart, J. R., and D. G. Blackburn. 1988. Reptilian placentation: structural diversity and terminology. Copeia 1988:839–852.
Stewart, J. R., and M. B. Thompson. 2000. Evolution of placentation among squamate reptiles: recent research and future directions. Comp. Biochem. Phys. A. 127:411–431.
Sugiura, N. 1978. Further analysis of the data by Akaike’s information criterion and the finite corrections. Comm. Stat. A-Theor. 7:13–26.
Surget-Groba, Y., B. Heulin, C.-P. Guillaume, R. S. Thorpe, L. Kupriyanova, N. Vogrin, R. Maslak, S. Mazzotti, M. Venczel, I. Ghira, et al. 2001. Intraspecific phylogeography of Lacerta vivipara and the evolution of viviparity. Mol. Phylogenet. Evol. 18:449–459.
Surget-Groba, Y., B. Heulin, C.-P. Guillaume, M. Puky, B. Smajda, D. Semenov, V. Orlova, L. Kupriyanova, and I. Ghira. 2006. Multiple origins of viviparity, or reversal from viviparity to oviparity? The European common lizard (Zootoca vivipara, Lacertidae) and the evolution of parity. Biol. J. Linn. Soc. 87:1–11.
Thompson, M. B., and D. G. Blackburn. 2006. Evolution of viviparity in reptiles: introduction to the symposium. Herpetol. Monogr. 20:129–130.
Thompson, M. B., and B. K. Speake. 2006. A review of the evolution of viviparity in lizards: structure, function and physiology of the placenta. J. Comp. Physiol. B. 176:179–189.
Tinkle, D. W., and J. W. Gibbons. 1977. The distribution and evolution of viviparity in reptiles. Misc. Publ., Mus. Zool., Univ. Mich. 154:1–47.
van Berkum, F. H. 1988. Latitudinal patterns of the thermal sensitivity of sprint speed in lizards. Am. Nat. 132:327–343.
Vitt, L. J., and J. P. Caldwell. 2009. Herpetology, third edition: an introductory biology of amphibians and reptiles. Elsevier, Burlington, MA.
Webb, J. K., R. Shine, and K. A. Christian. 2006. The adaptive significance of reptilian viviparity in the tropics: testing the maternal manipulation hypothesis. Evolution 60:115–122.
Weekes, H. C. 1935. A review of placentation among reptiles with particular regard to the function and evolution of the placenta. Proc. Zool. Soc. Lond. 1935:625–645.
Wiens, J. J. 2004. Speciation and ecology revisited: phylogenetic niche conservatism and the origin of species. Evolution 58:193–197.
Wiens, J. J., C. A. Kuczynski, S. Arif, and T. W. Reeder. 2010. Phylogenetic relationships of phrynosomatid lizards based on nuclear and mitochondrial data, and a revised phylogeny for Sceloporus. Mol. Phylogenet. Evol. 54:150–161.
Wiens, J. J., K. H. Kozak, and N. Silva. 2013. Diversity and niche evolution along aridity gradients in North American lizards (Phrynosomatidae). Evolution. In press.
Wright, S. 1934. An analysis of variability in the number of digits in an inbred strain of guinea pigs. Genetics 19:506–536.
Yaron, Z. 1985. Reptilian placentation and gestation: structure, function, and endocrine control. Pp. 607–681 in C. Gans and F. Billet, eds. Biology of the Reptilia. John Wiley & Sons, New York, NY.
Zeh, D. W., and J. A. Zeh. 2000. Reproductive mode and speciation: the viviparity-driven conflict hypothesis. BioEssays 22:938–946.
———. 2008. Viviparity-driven conflict: more to speciation than meets the fly. Ann. N.Y. Acad. Sci. 1133:126–148.

Associate Editor: J. Mank

Supporting Information

Additional Supporting Information may be found in the online version of this article at the publisher’s website:

Appendix S1. List of taxa used in this study with information on reproductive mode (0 = oviparous, 1 = viviparous) and the continuous variables used in comparative analyses.
Table S2. Results of likelihood ratio tests for significant phylogenetic signal for all continuous traits used in ancestral reconstructions.
Table S3. Results from phylogenetic logistic regression and the threshold model, using the dataset with taxa having uncertain reproductive modes removed.
Table S4. AICc scores for all models tested for phylogenetic generalized least squares comparisons of range sizes and temperature niche breadths.
Table S5. Full results from phylogenetic generalized least squares comparisons of range sizes and temperature niche breadths, for all models having AICc scores of < 2.
Table S6. Reproductive mode for species not included in the phylogenetic tree (0 = oviparous, 1 = viviparous).
Table S7. Pearson product-moment correlations between independent variables at each taxonomic scale, calculated using the “cor.test” function in R.

Appendix B

Molecular Phylogenetics and Evolution 82 (2015) 146–155

When do species-tree and concatenated estimates disagree? An empirical analysis with higher-level scincid lizard phylogeny

Shea M. Lambert a,⇑, Tod W. Reeder b, John J. Wiens a

a Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ 85721, USA
b Department of Biology, San Diego State University, San Diego, CA 92182, USA

Article info

Article history: Received 17 April 2014. Revised 9 September 2014. Accepted 2 October 2014. Available online 12 October 2014.

Keywords: Concatenated analysis; Phylogenetic methods; Scincidae; Species-tree analysis; Taxon sampling

Abstract

Simulation studies suggest that coalescent-based species-tree methods are generally more accurate than concatenated analyses. However, these species-tree methods remain impractical for many large datasets. Thus, a critical but unresolved issue is when and why concatenated and coalescent species-tree estimates will differ. We predict such differences for branches in concatenated trees that are short, weakly supported, and have conflicting gene trees. We test these predictions in Scincidae, the largest lizard family, with data from 10 nuclear genes for 17 ingroup taxa and 44 genes for 12 taxa. We support our initial predictions, and suggest that simply considering uncertainty in concatenated trees may sometimes encompass the differences between these methods. We also found that relaxed-clock concatenated trees can be surprisingly similar to the species-tree estimate. Remarkably, the coalescent species-tree estimates had slightly lower support values when based on many more genes (44 vs. 10) and a small (30%) reduction in taxon sampling. Thus, taxon sampling may be more important than gene sampling when applying species-tree methods to deep phylogenetic questions. Finally, our coalescent species-tree estimates tentatively support division of Scincidae into three monophyletic subfamilies, a result otherwise found only in concatenated analyses with extensive species sampling.

© 2014 Elsevier Inc. All rights reserved.

⇑ Corresponding author.
E-mail addresses: [email protected] (S.M. Lambert), twreeder@mail.sdsu.edu (T.W. Reeder), [email protected] (J.J. Wiens).

http://dx.doi.org/10.1016/j.ympev.2014.10.004
1055-7903/© 2014 Elsevier Inc. All rights reserved.

1. Introduction

A major conundrum has arisen in the phylogenetic analysis of multi-locus sequence data. In recent years, there has been growing appreciation for the idea that the evolutionary histories of species (species trees) and the genes that they carry (gene trees) can often differ (e.g. Edwards, 2009; Maddison, 1997). To address this problem of discordance between gene and species trees, a variety of methods for explicit species-tree reconstruction have been developed (e.g. Edwards et al., 2007; Heled and Drummond, 2010; Kubatko et al., 2009; Larget et al., 2010; Liu, 2008). Simulation studies show that these methods, particularly those employing Bayesian coalescent models (i.e. BEST, *BEAST), are more accurate than other approaches for inferring species trees, especially traditional concatenated analysis (e.g. Bayzid and Warnow, 2012; Edwards et al., 2007; Heled and Drummond, 2010; Hovmöller et al., 2013; Leaché and Rannala, 2011). The use of *BEAST (Heled and Drummond, 2010), which coestimates gene and species trees using the multispecies coalescent, has become particularly widespread (e.g. according to Google Scholar April 12th, 2014: 580 citations for the paper proposing *BEAST as compared to <200 each for BEST [Liu, 2008], STEM [Kubatko et al., 2009] and BUCKy [Larget et al., 2010]). However, it is widely known (if not explicitly documented) that *BEAST and similar methods can become computationally intractable with large numbers of genes or species. Unfortunately, such large matrices may be the most useful for accurately resolving phylogenies (e.g. Rannala et al., 1998) and for macroevolutionary analyses (e.g. Davis et al., 2013). Other species-tree methods, while computationally more efficient (e.g. Kubatko et al., 2009; Larget et al., 2010), may require complete datasets (all species with data for all genes), making them impractical in many cases. Furthermore, it is unclear whether coalescent species-tree methods (e.g., *BEAST) are preferable to concatenation for deep phylogenetic questions, especially given limited sampling of species and individuals. These and other issues make choosing between concatenated and species-tree analyses an increasingly common problem for empirical studies (McVay and Carstens, 2013).

The current impracticality of coalescent species-tree methods for many phylogenetic studies leaves systematists with several important and unresolved questions: how similar are phylogenetic estimates between concatenated and coalescent species-tree approaches? In other words, do concatenated analyses offer a reasonable proxy? Can we make accurate predictions about when estimates from concatenated and species-tree analyses will differ? Specifically, can we examine a tree from concatenated analysis and predict which branches are most likely to be resolved differently using coalescent species-tree methods? Here, we make specific predictions regarding these questions and test them empirically using data from scincid lizards.

Species-tree methods generally rest on the assumptions that gene trees and species trees disagree, that gene trees disagree with one another, and that species-tree methods deal with this discordance better than concatenation (e.g. Edwards, 2009; Edwards et al., 2007; Maddison and Knowles, 2006). Further, methods such as BEST and *BEAST assume that this genealogical discordance is due to incomplete lineage sorting (e.g. Edwards et al., 2007; Heled and Drummond, 2010). Theory suggests incomplete lineage sorting will be more prevalent on shorter branches in the true species tree (all other things being equal; e.g. Degnan and Rosenberg, 2009; Maddison, 1997; Pamilo and Nei, 1988). Empirical multi-locus analyses (e.g. Wiens et al., 2008, 2010a, 2012) show that shorter branches in concatenated trees tend to have weaker support (e.g. bootstrap proportions) and greater disagreement with their underlying gene trees (e.g. a smaller proportion of genes will support the short branch in the concatenated tree, and the other genes will support trees that conflict with this branch). Therefore, we predict that estimates from concatenated analysis and species-tree methods will tend to disagree most often on those branches of the concatenated tree that have: (1) shorter estimated branch lengths (assuming that short branches in the concatenated tree reflect short branches in the underlying species tree, and short branches in concatenated trees do seem to reflect short branches in comparable gene trees; Wiens et al., 2008), (2) weaker support (i.e. low bootstraps and/or Bayesian posterior probabilities), and (3) greater disagreement with the underlying gene trees. Although these assumptions may seem simple and uncontroversial, they do have important implications. For example, if discordance with species-tree estimates occurs primarily on weakly supported branches of concatenated trees, then comparative studies that use concatenated trees can incorporate the likely impact of using species-tree methods simply by incorporating uncertainty in the concatenated estimate (e.g. repeating analyses on trees with alternate resolutions of the poorly supported branches).

Another important, unresolved issue is whether coalescent-based species-tree methods are effective for resolving deep phylogenetic splits, especially with limited sampling of individuals and species. So far, *BEAST is most frequently applied to relatively closely related species, and with >1 individual sampled per species (but see for example Perez et al., 2012; Townsend et al., 2011; Wiens et al., 2012; Williams et al., 2013). Both simulations and empirical analyses have shown that reduced sampling of individuals per species can decrease the accuracy of *BEAST (Camargo et al., 2012; Heled and Drummond, 2010; Hovmöller et al., 2013). Specifically, the lack of multiple individuals per species prevents *BEAST from accurately estimating population sizes for extant taxa (i.e. terminal branches), which may negatively impact estimates of divergence times and possibly topology (Heled and Drummond, 2010). Previous work suggests that sampling additional individuals has a greater effect on accuracy at shallow phylogenetic levels and sampling additional genes is more beneficial for deeper relationships (Heled and Drummond, 2010; Maddison and Knowles, 2006). However, it remains unclear whether limited sampling of individuals may make *BEAST estimates problematic, and whether it is even worth applying to typical higher-level empirical studies in which one tries to estimate deep divergences in species-rich groups with limited sampling of both individuals and species. The impact of limited species sampling (and the relative importance of sampling genes versus species) on species-tree analyses of higher-level relationships is especially poorly known.

Here we compare multispecies coalescent species-tree (i.e. *BEAST) and concatenated estimates using empirical datasets for higher-level scincid lizard phylogeny. Skinks are the most species-rich family of lizards, containing >1560 species (Uetz et al., 2013), and are relatively ancient (crown group 80–110 million years old; Mulcahy et al., 2012). Skink phylogeny and higher-level classification remain uncertain, despite recent studies based on concatenated analyses of multiple nuclear and mitochondrial genes (e.g. Brandley et al., 2012; Pyron et al., 2013; Wiens et al., 2012). For example, Hedges and Conn (2012) divided Scincidae into seven families (without estimating a large-scale phylogeny), but Pyron et al. (2013) found some of these families to be non-monophyletic. Traditional classifications recognized four subfamilies in Scincidae (Acontinae, Feylininae, Lygosominae, Scincinae; Greer, 1970 and subsequent authors). Brandley et al. (2012) analyzed 56 scincid species and found Scincinae to be paraphyletic with respect to Lygosominae (with Acontinae as sister to the other subfamilies), but did not include Feylininae. In a broad analysis of squamate phylogeny, Wiens et al. (2012) analyzed 44 nuclear loci and included 12 skink species and found Scincinae to be paraphyletic with respect to Feylininae and Lygosominae, with Acontinae as sister to all other scincids. In another large-scale analysis of squamate phylogeny, Pyron et al. (2013) included 683 scincid species for multiple nuclear and mitochondrial genes (but with relatively incomplete gene sampling) and supported monophyly of Acontinae (as sister to all other scincids), Lygosominae, and Scincinae (including Feylininae). We follow this latter classification here. However, as of yet, no studies have used species-tree methods to address higher-level relationships in Scincidae.

In this study, we analyze higher-level scincid phylogeny and test our predictions about concordance of species-tree and concatenated estimates using a dataset of 17 ingroup species and 10 nuclear protein-coding genes (combining published and new data). We also compare concatenated and species-tree estimates from a previously published dataset (Wiens et al., 2012) containing 44 nuclear protein-coding genes for 12 scincid species.

2. Materials and methods

2.1. Sequence data and alignment

We assembled data from 10 nuclear protein-coding loci from 17 species representing all four traditionally recognized subfamilies of Scincidae (Acontinae, Feylininae, Lygosominae, Scincinae), including multiple representatives of the subfamilies Lygosominae and Scincinae, which together include most species and genera of skinks (Uetz et al., 2013). These 17 species spanned many levels of divergence, including multiple species in the species-rich genus Sphenomorphus. As outgroups, we included six species representing the families Cordylidae (Cordylus, Platysaurus), Gerrhosauridae (Cordylosaurus, Zonosaurus), and Xantusiidae (Lepidophyma, Xantusia). Molecular and combined-data analyses of higher-level squamate phylogeny have repeatedly shown that these three families form a strongly supported clade that is the sister group of Scincidae (e.g. Mulcahy et al., 2012; Pyron et al., 2013; Wiens et al., 2010b, 2012). Voucher specimens used are listed in online Appendix A.

For our primary dataset (17 ingroup taxa, 10 genes) we first utilized published sequence data from all 12 scincid taxa included by Wiens et al. (2012) for a subset of 10 relatively well-sampled genes. The 10 genes are (with aligned lengths): AHR = 453 base pairs, BDNF = 669, DNAH3 = 646, ECEL = 480, FSTL5 = 621, GPR = 509, NGFB = 579, PTGER = 492, RAG1 = 996, ZFP36L1 = 606
(total concatenated length = 6051). These 10 gene segments were carefully selected to be: (1) single copy, to avoid paralogy, (2) within a single exon, to facilitate alignment, (3) short enough to be amplified and sequenced with single pairs of reactions (450–1000 bp), and (4) evolving at an appropriate rate for squamate phylogenetics (see Townsend et al., 2008). All 10 genes have been used successfully in higher-level squamate phylogenetics (e.g. Mulcahy et al., 2012; Townsend et al., 2011; Wiens et al., 2008, 2010b, 2012). For these 10 genes, we then obtained sequences from five additional scincid species (Egernia whitii, Mabuya unimarginatus, virens, Sphenomorphus nigriventris, Tropidophorus grayi), all from the subfamily Lygosominae, which contains most species and genera of scincids (Uetz et al., 2013). We also obtained a few additional sequences from the initial set of 12 species. We used standard methods of DNA extraction, amplification, and Sanger sequencing. Primers are given in Townsend et al. (2008), Mulcahy et al. (2012), and Wiens et al. (2012). Given the protein-coding sequences used, alignment was straightforward and done by eye, after translating nucleotide sequences to amino acid sequences with MacClade version 4.0 (Maddison and Maddison, 2000). GenBank numbers are provided in online Appendix B, and the data matrix will be deposited in Dryad upon acceptance.

For each gene, we conducted preliminary phylogenetic analyses using parsimony (PAUP* 4.0b10; Swofford, 2002) to look for any possible contamination or sequencing errors (i.e. different species with identical sequences). Any such erroneous sequences were deleted. However, we did not exclude sequences merely because they conflicted with trees from other genes, to avoid biasing our analyses with regards to congruence between genes.

For the entire dataset, we were not able to obtain sequences for all species for all genes (missing data = 14.6% across the data matrix), despite repeated attempts using multiple primer combinations. However, both empirical and simulation studies suggest that even large amounts of missing data do not necessarily impede accurate estimation with model-based methods for concatenated data (review in Wiens and Morrill, 2011), and recent simulation studies suggest that *BEAST may also be resilient to missing data (Bayzid and Warnow, 2012; Hovmöller et al., 2013).

2.2. Model and partition selection

We used PartitionFinder v.1.1.1 (Lanfear et al., 2012) to select both data partitioning schemes and models of sequence evolution for all concatenated and single-gene datasets. Models and partitions selected are described in Appendix C. PartitionFinder uses maximum likelihood and the information theoretic criterion to select partitioning schemes and models. We changed the models that PartitionFinder compared based on the software used for subsequent phylogenetic analyses (e.g. only GTR+Γ for RAxML analyses). We did not evaluate models that include both a parameter for among-site rate heterogeneity and a parameter for invariant sites (i.e. GTR+I+Γ model), because correlation between these two parameters may make it difficult to estimate both accurately (e.g. Sullivan et al., 1999; Yang, 2006). We used the sample-size corrected Akaike Information Criterion (AICc; Sugiura, 1978) to select models and employed the greedy heuristic algorithm in all cases.

2.3. Likelihood analysis

Maximum likelihood analyses of individual genes and the concatenated data were conducted using RAxML v7.4.2 (Stamatakis, 2006). For each analysis, we conducted 1000 rapid bootstrap replicates combined with 200 replicate searches for the optimal tree, using the "-f a" option. We used the partitions identified above.

2.4. Bayesian concatenated and gene tree analyses

We conducted Bayesian analyses of the concatenated data and separate genes using a standard (non-clock) approach (MrBayes v3.2.1; Ronquist et al., 2012) and using a relaxed molecular clock approach (BEAST v1.7.5; Drummond et al., 2012).

For MrBayes analyses, we applied the preferred models of molecular evolution (estimated as described above) for each partition. We also unlinked priors across all partitions and set the rate prior to "variable", allowing site-specific rates of evolution to vary across partitions (Marshal et al., 2006). We used the default temperatures for chain heating, and the default of four Metropolis-coupled Markov chain Monte Carlo (MCMCMC) chains for the two replicate runs for each analysis. For analyses of each gene, we ran 2 × 10^6 generations sampled every 1,000. For the concatenated analysis, we used 1 × 10^7 generations sampled every 1,000. We assessed convergence using diagnostics from the sump command. Specifically, we ensured adequate effective sample sizes (ESS) of each parameter (>200), that chains were mixing appropriately, and that the average standard deviation of split frequencies between independent runs reached <0.05. A conservative burn-in fraction of 50% was used to obtain the consensus phylogram and posterior probabilities for each bipartition, using the sumt command.

We used BEAST v1.7.5 (Drummond and Rambaut, 2007) to perform relaxed clock Bayesian analyses of the concatenated data and each gene. Input files contained partitions for each codon position of each gene. We linked site models in accordance with partitions identified by PartitionFinder, unlinked clock models across all partitions, and linked topology across all partitions. We tested the null hypothesis of a strict molecular clock for each gene (data not partitioned) using likelihood-ratio tests in PAUP* v4.0b10 (Swofford, 2002), and found that all genes rejected the strict molecular clock hypothesis. Thus, we used uncorrelated lognormal relaxed molecular clock models for all partitions, with mean rates estimated from a gamma distribution relative to a partition with an arbitrary fixed rate of 1. For a tree prior, we used a Yule (pure-birth) model with random starting trees. We used 1 × 10^8 generations (sampled every 10,000) for each individual gene tree, with the number of generations increased to 2.5 × 10^8 for the concatenated analysis.

We assessed convergence of BEAST analyses using the programs Tracer v1.5 (Drummond and Rambaut, 2007) and Are We There Yet? (AWTY; Nylander et al., 2008). In Tracer, we examined plots of likelihood and other parameters for stationarity and for effective sample sizes >200. In some instances, ESS for the prior and posterior were lower than 200, and this appeared alongside low ESS for particular transition or transversion types in a given partition, suggesting possible model overparameterization. However, likelihood plots appeared stable and with high ESS in all cases. We repeated these analyses to ensure that topologies were consistent between independent MCMC chains. In AWTY, we used the "compare" function to compare the posterior probabilities of bifurcations across duplicate MCMC analyses, and the "cumulative" function to examine stability of topological support throughout the MCMC chains. To ensure consistency of these convergence diagnostics with those used for MrBayes, we also used Tracer and AWTY as described above to diagnose convergence using the two replicate runs of each MrBayes analysis.

We used TreeAnnotator v1.7.5 (Drummond et al., 2012) to summarize the posterior distribution of trees from individual MCMC analyses into maximum clade credibility trees using a conservative burn-in fraction of 50%. Importantly, we did not use BEAST to estimate divergence times per se; thus, estimated branch lengths are in units of substitutions per site and are not ultrametric.
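
As an illustration of the split-frequency convergence check described in Section 2.4 for the replicate MrBayes runs, the following is a minimal Python sketch, not the MrBayes implementation. It assumes each sampled tree has already been reduced to a set of splits (each split a frozenset of taxon names), discards a burn-in fraction of 0.5 as in the text, and averages the per-split standard deviation of frequencies across runs. The function names, the `min_freq` cutoff of 0.1, and the use of the population standard deviation are assumptions made for illustration only.

```python
from statistics import pstdev


def split_frequencies(trees, burnin_fraction=0.5):
    """Frequency of each split among the post-burn-in trees of one run.

    `trees` is a list of sampled trees in sampling order; each tree is a
    set of splits, and each split is a frozenset of taxon names.
    """
    kept = trees[int(len(trees) * burnin_fraction):]
    counts = {}
    for tree in kept:
        for split in tree:
            counts[split] = counts.get(split, 0) + 1
    return {split: n / len(kept) for split, n in counts.items()}


def asdsf(runs, burnin_fraction=0.5, min_freq=0.1):
    """Average standard deviation of split frequencies across runs.

    Assumptions (hypothetical choices, not taken from the source):
    splits are included if their frequency reaches `min_freq` in at
    least one run, and the population standard deviation is used.
    """
    freqs = [split_frequencies(run, burnin_fraction) for run in runs]
    all_splits = set().union(*freqs)  # union of the dict keys
    deviations = []
    for split in all_splits:
        values = [f.get(split, 0.0) for f in freqs]
        if max(values) >= min_freq:
            deviations.append(pstdev(values))
    return sum(deviations) / len(deviations) if deviations else 0.0


# Toy example with four sampled trees per run (taxa A-D).
AB, AC, CD = frozenset("AB"), frozenset("AC"), frozenset("CD")
run1 = [{AC}, {AC}, {AB, CD}, {AB, CD}]
run2 = [{AC}, {AB, CD}, {AB, CD}, {AC}]
value = asdsf([run1, run2])  # 0.25 for this toy input
converged = value < 0.05     # threshold used in the text
```

In this toy input the two runs disagree strongly after burn-in, so the statistic is far above the 0.05 threshold; identical runs would give 0.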

2.5. Species tree estimation

We used *BEAST (Heled and Drummond, 2010) as implemented in BEAST v1.7.5 to infer species trees using the multispecies coalescent. Tree models were linked across partitions within each gene. As in BEAST analyses, substitution models were linked within partitions identified by PartitionFinder, and clock models were unlinked across all codon positions of each gene. We used lognormal relaxed molecular clocks with relative rates estimated and a Yule process species tree prior. Convergence assessment and maximum clade credibility tree construction followed the methods described above for BEAST. The interpretation of branch lengths in *BEAST is not straightforward, and so these values are treated as unitless.

2.6. Analyses of 44 genes

We inferred concatenated and species-tree estimates using a dataset of 44 genes (Wiens et al., 2012), using BEAST and *BEAST. We excluded all taxa except the 12 scincid species and three outgroup species (Cordylus, Cordylosaurus, Zonosaurus). Methods followed those described above, except that each gene was treated as its own partition, with substitution models, clock models, and tree models linked within each gene. We chose not to evaluate separate substitution models for each codon position of each gene, given that preliminary analyses showed that these extra parameters made the species-tree analysis computationally intractable. Substitution models for each gene were selected using jModelTest2 (Darriba et al., 2012). We ran the analysis for 1 × 10^9 generations sampled every 10,000.

2.7. Analyses of congruence and support

We tested whether those branches in the concatenated tree that differ from the species-tree estimate are short, weakly supported, and in conflict with the underlying gene trees. Therefore, we compared the concatenated trees from the likelihood and Bayesian analyses to the estimated species tree from *BEAST. For each concatenated tree, we classified all 16 nodes within the ingroup as being concordant or discordant with the estimated species tree from *BEAST. We then compared values of these two classes of nodes (for length, support, and gene congruence) using one-sided, nonparametric Mann–Whitney U tests in R v2.15.1 (R Core Team, 2012). We used the package exactRankTests (Hothorn and Hornik, 2002), which implements the shift algorithm (Streitberg and Röhmel, 1986) to calculate exact P-values, even in the case of tied data. We used a nonparametric approach because Kolmogorov-Smirnov tests (in R) indicated that many of the relevant sampling distributions were non-normal (results not shown). The raw data for all analyses (here and below) are presented in Appendix D.

For branch lengths, we used estimated lengths from the concatenated analysis. A reasonable concern is whether these adequately reflect the true branch lengths in the underlying species tree (although our analyses do not assume this). Previous analyses show that shorter branch lengths in concatenated trees are related to shorter branches in the corresponding gene trees for comparable clades (Wiens et al., 2008), strongly suggesting that these short branches generally reflect limited time between splitting events, rather than artifacts of concatenation. Furthermore, previous analyses show that shorter branches in concatenated trees are associated with weaker support and greater incongruence among genes (e.g. Wiens et al., 2008, 2010a, 2012).

used bootstrap values for likelihood analyses and posterior probabilities (Pp) for Bayesian analyses.

To assess whether nodes in the concatenated trees are concordant or discordant with the underlying estimated gene trees we used several measures (following Wiens et al., 2008), including the proportions of concordant genes, strongly concordant genes, and strongly discordant genes. For a given clade in the concatenated tree, we counted a gene as concordant if all the species included in the gene tree from that clade form a monophyletic group. However, not every species had data for every gene. In these cases, we counted a gene as supporting a given clade if the basal species of the clade were present in the separate gene tree. Using the example from Wiens et al. (2008), say that clade (A (B + C)) was present in the concatenated-data tree. If a given gene included taxa A and C, but lacked data for B, and the clade A + C was supported, then this gene was considered as supporting (A (B + C)). If taxon A lacked data for that gene instead, we would consider this to be ambiguous. We then calculated the proportion of concordant genes for each node (the number of concordant genes divided by the number of genes that have relevant sampling for that node, given that not every species had data for every gene, and that some genes were therefore ambiguous for a given node).

We also tallied whether genes that were concordant (or discordant) with a given clade in the concatenated tree showed strong or weak support for that clade. This distinction is important because weakly supported discordance might be the result of stochastic error in estimating the gene tree, whereas strongly supported discordance may indicate distinct genealogical histories (e.g. due to incomplete lineage sorting, introgression, or paralogy). For likelihood analyses, we used a threshold of ≥70% bootstrap support for classifying a relationship as strongly supported. We acknowledge that a criterion of ≥95% would be more conservative (and potentially more appropriate) for considering a clade to be strongly supported in a standard phylogenetic analysis, but this criterion was used only for assessing congruence, and is based on the well-established observation that bootstrap values are often biased but conservative (Felsenstein, 2004). For Bayesian analyses, we used the standard threshold Pp ≥ 0.95 as indicating strong support, as posterior probabilities are less conservative compared to bootstrap values (e.g. Rannala and Yang, 1996; Douady et al., 2003). For each node in the concatenated tree, we evaluated the proportion of genes that strongly supported that clade (among the genes relevant for that clade) and the proportion of genes that strongly supported an alternative relationship.

In the case of discordant relationships, selection of the appropriate support value is not always straightforward (especially when the gene tree was locally very different from the concatenated and/or *BEAST tree). Following Wiens et al. (2008), when a discordant relationship was found, we used the support value from the deepest node in the gene tree that rejected the clade, given the taxon sampling for that gene. For example, if the concatenated or *BEAST tree supported the relationships (A (B (C + D))), and a gene supported the relationships (D (B (A + C))), we considered that gene’s support for the clade uniting A + B + C rather than the clade A + C. If two nodes at equal depth formed incongruent clades, we used the larger support value. Overall, we found that in these more ambiguous cases, all the discordant clades involved were usually weakly supported, making the selection of a particular support value less critical (as found by Wiens et al., 2008).

We also quantified strongly supported discordance among genes for a given clade. We used the index of gene conflict from Wiens et al. (2008). This index has its highest values (1) when the number of genes strongly supporting a clade is equal to the
We also tested these latter number strongly rejecting it, and when most genes belong to these two hypotheses on the 10 gene, 17-taxa skink dataset (see below). two types. In contrast, the index will take low values (0) when For our measure of branch support in the concatenated trees, we the concordant and discordant clades are weakly supported or if 38

150 S.M. Lambert et al. / Molecular Phylogenetics and Evolution 82 (2015) 146–155

the genes all strongly support the relevant clade in the concatenated or *BEAST species tree.

We primarily tested whether branch lengths, support, and concordance in the concatenated tree predicted congruence with the estimated species tree from *BEAST (since in many cases a concatenated tree will be available but a species-tree estimate will not be). However, we also tested whether these same properties of the species tree from *BEAST predicted incongruence with the concatenated trees.

Finally, we assessed correlations between branch lengths, support, and indices of gene congruence and conflict using one-sided Spearman's nonparametric rank correlations with R. Specifically, we predicted that shorter branches in the concatenated tree would have weaker support and greater incongruence with their underlying gene trees. We then tested similar predictions on the estimated species tree.

Overall, we performed only standard statistical analyses rather than phylogenetically corrected analyses (e.g. using independent contrasts; Felsenstein, 1985). In general, we do not expect characteristics of branches (such as their length, support values, and congruence among genes) to show phylogenetic autocorrelation. Furthermore, most phylogenetic comparative methods utilize characteristics of terminal species, and are not designed for internal nodes. Indeed, most phylogenetic comparative methods rely heavily on branch length information, but branch length is one of our key variables.

3. Results

3.1. Comparison of topologies

Maximum likelihood and standard Bayesian analyses of the concatenated 10 gene dataset yielded identical topologies (Fig. 1). These trees show strong support for placing Acontinae as sister group to all other scincids, as in previous studies. These trees also support monophyly of Lygosominae, but show Lygosominae as nested inside a paraphyletic Scincinae. The two branches underlying non-monophyly of Scincinae are weakly supported in the likelihood analysis (bootstrap values <50%) and the Bayesian analysis (Pp = 0.87 and 0.77).

The topology estimated in the Bayesian relaxed clock analysis of the concatenated data (Fig. 2) differs notably from the other concatenated analyses (Fig. 1). Most importantly, this analysis shows strong support (Pp = 0.99) for monophyly of Scincinae (including Feylinia), which is paraphyletic in the other concatenated analyses. Within Scincinae, there is strong support (Pp = 1.00) for placing Plestiodon with the Eumeces + Scincus clade, whereas the other concatenated analyses suggest a very different placement for Plestiodon, and place Brachymeles with the Eumeces + Scincus clade. Intriguingly, there are differences between the two Bayesian concatenated analyses that are strongly supported by each (Pp > 0.95), specifically regarding the placement of scincine genera. However, all three concatenated analyses agree regarding the placement of Acontinae, and the relationships among the sampled lygosomine species.

The topology from the Bayesian species-tree analysis (*BEAST; Fig. 3) is almost identical to that from the Bayesian concatenated relaxed-clock analysis (BEAST). These topologies differ only in their placement of Brachymeles, and the conflicting nodes are weakly supported in both analyses. Thus, as in the relaxed-clock concatenated Bayesian analysis, the *BEAST analysis shows strong support for monophyly of Scincinae, in contrast to the non-clock concatenated analyses.

The 44-gene dataset contains a similar set of taxa, but lacks five lygosomine species present in the 10 gene, 17-species datasets. The previously published concatenated analyses of this dataset (RAxML and MrBayes; Fig. 4ab) also show Lygosominae nested inside a paraphyletic Scincinae (Wiens et al., 2012), although the relationships among scincine genera differ between the 10 gene and 44 gene analyses. The BEAST analysis of the 44 gene dataset (Fig. 4c) supports a monophyletic Scincinae with the exception of Brachymeles, which is strongly placed as sister to Lygosominae. The *BEAST analysis (Fig. 4d) again supports a monophyletic Scincinae, but this time with weak support. Again, the standard non-clock likelihood and Bayesian analyses give topology estimates that are identical to each other, but differ from the relaxed clock and species-tree estimates to the same extent (2 of 10 nodes). In this case, the BEAST and *BEAST trees also differ by 2 of 10 nodes. There are conflicting nodes between the relaxed clock and standard Bayesian estimates that are strongly supported by each.
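Node counts such as "2 of 10 nodes" amount to comparing the sets of clades implied by two rooted topologies. The following is a minimal sketch of such a comparison, assuming both trees contain the same tips and are encoded as nested tuples. The function names `clades` and `discordant_nodes`, the tuple encoding, and the four-taxon trees are hypothetical illustrations (the topologies reuse the A to D example from Section 2.7); this is not the procedure or the data used in the study.

```python
def clades(tree):
    """Collect every non-root clade of a rooted tree as a frozenset of tip names.

    A tree is either a tip name (str) or a tuple of subtrees (hypothetical encoding).
    """
    found = set()

    def tips(node):
        if isinstance(node, str):
            return frozenset([node])
        # An internal node's clade is the union of the tips below it.
        members = frozenset().union(*(tips(child) for child in node))
        found.add(members)
        return members

    root = tips(tree)
    found.discard(root)  # the root clade is trivially shared by trees on the same tips
    return found


def discordant_nodes(reference, other):
    """Return the clades of `reference` that are absent from `other`."""
    return clades(reference) - clades(other)


# Toy topologies (not data from the study): (A (B (C + D))) versus (D (B (A + C))).
concatenated = ("A", ("B", ("C", "D")))
gene_tree = ("D", ("B", ("A", "C")))

missing = discordant_nodes(concatenated, gene_tree)
print(f"{len(missing)} of {len(clades(concatenated))} internal nodes are discordant")
```

With trees that differ in taxon sampling, each clade would first need to be pruned to the shared tips, which this sketch does not attempt.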

[Figure 1: tree image not reproduced. Clade labels: Acontiinae, Scincinae, Lygosominae. Scale bar: 0.01 substitutions/site.]

Fig. 1. Estimated tree from standard concatenated analyses of 10 genes for higher-level scincid lizard relationships. Both non-clock maximum likelihood (RAxML) and Bayesian analysis (MrBayes) estimate the same topology. Branch lengths shown are from maximum likelihood. Values below each branch are likelihood bootstrap values; values above each branch are Bayesian posterior probabilities. For illustrative purposes, only one outgroup taxon (Cordylus) is shown, but all six were included in the analyses.


[Figure 2: tree image not reproduced. Clade labels: Acontiinae, Lygosominae, Scincinae. Scale bar: 0.01 substitutions/site.]

Fig. 2. Estimated tree from concatenated analyses of 10 genes for higher-level scincid lizard relationships based on a relaxed clock Bayesian analysis (BEAST). Values at branches indicate posterior probabilities of clades. For illustrative purposes, only one outgroup taxon (Cordylus) is shown, but all six were included in the analyses.

[Figure 3: tree image not reproduced. Clade labels: Acontiinae, Lygosominae, Scincinae. Scale bar: 0.01.]

Fig. 3. Estimated tree from Bayesian coalescent species-tree analysis (*BEAST). Values at branches indicate posterior probabilities of clades. For illustrative purposes, only one outgroup taxon (Cordylus) is shown, but all six were included in the analyses.

Relationships among scincine genera differ somewhat between the two species-tree analyses (Fig. 3 vs. Fig. 4d), but these differences are only weakly supported in each. Surprisingly, the analysis of 44 genes and 12 skink species has slightly lower mean support values for *BEAST relative to that using 10 genes and 17 species (mean Pp = 0.924 with 44 genes, mean Pp = 0.943 with 10 genes). Although this decrease is not significant (P = 0.368 for Mann–Whitney test), the expectation is that increasing the number of genes from 10 to 44 should significantly increase support.

3.2. Analyses of congruence

Our overall goal was to determine what properties of branches from concatenated trees predict whether they will be concordant or discordant with estimates from species-tree methods. We predicted that branches of the concatenated tree that are most likely to differ from the coalescent species-tree estimate are those that are short, weakly supported, and contested by the underlying gene trees. These predictions are generally supported (Table 1). For the concatenated trees from non-clock likelihood and Bayesian analyses, those branches that disagree with the estimated species tree are significantly shorter, more weakly supported, and have a lower proportion of genes that are concordant and strongly concordant with them. However, the proportion of genes that are strongly discordant with the concatenated tree is significantly higher for the likelihood results but not for the non-clock Bayesian results, and the measure of strong conflict among genes is not significant for either method.

The branches in the species tree from *BEAST that conflict with the non-clock concatenated trees (RAxML and MrBayes) tend to be shorter (although the trend is only marginally significant; P = 0.057), with significantly weaker support (P = 0.013). Separate non-clock analyses of genes show that nodes that are concordant between the concatenated and *BEAST trees do have higher proportions of genes that are concordant and strongly concordant with these branches (P = 0.002–0.013; using both Bayesian and likelihood-estimated gene trees), while the proportion of strongly discordant genes is significant for non-clock Bayesian estimated trees (P = 0.045) but not likelihood estimated trees (P = 0.786).

Branch lengths and support values were significantly, positively correlated among branches within each of the three concatenated


[Figure 4: tree images not reproduced. Panels: (a) RAxML, (b) MrBayes, (c) BEAST, (d) *BEAST. Clade labels in each panel: Acontiinae, Scincinae, Lygosominae.]

Fig. 4. Estimates of scincid phylogeny based on a dataset of 44 nuclear protein-coding genes, including estimates from (a) concatenated non-clock maximum likelihood analysis (RAxML), (b) concatenated non-clock Bayesian analysis (MrBayes), (c) concatenated relaxed clock Bayesian analysis (BEAST) and (d) Bayesian coalescent species-tree analysis (*BEAST). Trees in (a) and (b) are from the larger squamate-wide analyses in Wiens et al. (2012). The three outgroup taxa included are not shown.

Table 1. Results from one-sided Mann–Whitney U-tests comparing the characteristics of branches in the concatenated trees that are discordant or concordant with the estimated species tree from *BEAST. For each variable, medians for the groups of discordant and concordant nodes are given. Asterisks indicate statistically significant results using one-sided tests and a threshold of P < 0.05. Proportions of concordant/discordant genes refer to the gene trees estimated using MrBayes, RAxML, or BEAST depending on the method used for the concatenated tree. The conflict index is from Wiens et al. (2008). Note that the common ancestor of Scincidae was included as a node.

Non-clock concatenated trees (medians; discordant nodes n = 3, concordant nodes n = 13):

Variable                              MrBayes disc.  MrBayes conc.  MrBayes P  RAxML disc.  RAxML conc.  RAxML P
Branch length                         0.0017         0.0151         0.002*     0.0011       0.0166       0.002*
Support (concatenated)                0.8730         1.0000         0.002*     36           100          0.002*
Support (*BEAST)                      0.9593         1.0000         0.011*     0.9593       1.000        0.011*
Proportion concordant genes           0.0000         0.8354         0.002*     0.0000       0.8333       0.002*
Proportion strongly concordant genes  0.0000         0.5714         0.002*     0.0000       0.7143       0.002*
Proportion strongly discordant genes  0.1000         0.0000         0.250      0.1111       0.0000       0.011*
Conflict index                        0.0000         0.0000         0.589      0.0000       0.0000       0.688

Relaxed-clock concatenated tree (medians; discordant nodes n = 1, concordant nodes n = 15):

Variable                              BEAST disc.  BEAST conc.  BEAST P
Branch length                         0.0011       0.0077       0.0625
Support (concatenated)                0.8460       1.0000       0.0625
Support (*BEAST)                      0.8044       0.9527       0.1875
Proportion concordant genes           0.0000       0.8333       0.0625
Proportion strongly concordant genes  0.0000       0.6000       0.188
Proportion strongly discordant genes  0.0000       0.0000       1.000
Conflict index                        0.0000       0.0000       0.625
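To make the exact one-sided test behind Table 1 concrete, here is a minimal sketch that enumerates every way of assigning the pooled ranks to the smaller group. It assumes tied values receive average ranks, and it uses brute-force enumeration rather than the shift algorithm of the exactRankTests package. The function names and the branch lengths are hypothetical placeholders, not the authors' code or the values in Appendix D.

```python
from itertools import combinations


def midranks(values):
    """Return 1-based ranks for `values`, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def exact_one_sided_p(low_group, high_group):
    """Exact P-value for the hypothesis that `low_group` has the smaller values.

    Counts the fraction of equally likely rank assignments whose rank sum for
    the first group is no larger than the observed one.
    """
    pooled = list(low_group) + list(high_group)
    ranks = midranks(pooled)
    n = len(low_group)
    observed = sum(ranks[:n])
    hits = total = 0
    for subset in combinations(range(len(pooled)), n):
        total += 1
        if sum(ranks[i] for i in subset) <= observed + 1e-12:
            hits += 1
    return hits / total


# Hypothetical branch lengths: 3 discordant nodes, all shorter than 13 concordant nodes.
discordant = [0.001, 0.002, 0.003]
concordant = [0.010 + 0.001 * k for k in range(13)]
print(round(exact_one_sided_p(discordant, concordant), 4))
```

With 3 discordant and 13 concordant nodes there are only C(16, 3) = 560 possible rank assignments, so the smallest attainable one-sided exact P-value is 1/560 ≈ 0.0018; with 1 versus 15 nodes it is 1/16 = 0.0625. This appears consistent with the 0.002 and 0.0625 entries in Table 1, and it suggests that a single discordant node cannot reach P < 0.05 under this test.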

trees (MrBayes: rho = 0.68, P = 0.002; RAxML: rho = 0.87, P < 0.0001; BEAST: rho = 0.64, P = 0.004), but surprisingly, not the species-tree estimated from *BEAST (rho = 0.30, P = 0.131).

Overall, levels of support in the estimated Bayesian species tree are related to concordance and discordance among the separately estimated gene trees (Table 2), except for the proportion of strongly discordant genes estimated using relaxed clock methods (BEAST). The proportions of concordant and strongly concordant genes are significantly related to branch lengths in the species tree and the concatenated trees (Table 2). However, the proportion of


Table 2. Results of one-sided non-parametric Spearman's rank correlations for gene concordance versus support and branch lengths, in the format: method used to estimate gene trees vs. method used to estimate reference tree (for measures of support and branch length). Asterisks indicate statistically significant results using a threshold of P < 0.05. Note that the common ancestor of Scincidae was included as a node.

Variable 1                            Variable 2  MrBayes vs. *BEAST                RAxML vs. *BEAST                  BEAST vs. *BEAST
Proportion of concordant genes        Support     rho = 0.56, P = 0.0119*           rho = 0.47, P = 0.0315*           rho = 0.51, P = 0.0211*
Proportion strongly concordant genes  Support     rho = 0.51, P = 0.0215*           rho = 0.44, P = 0.04237*          rho = 0.51, P = 0.0223*
Proportion strongly discordant genes  Support     rho = -0.79, P = 0.0001*          rho = -0.58, P = 0.0094*          rho = -0.23, P = 0.1996
Proportion of concordant genes        Length      rho = 0.67, P = 0.0024*           rho = 0.60, P = 0.0066*           rho = 0.68, P = 0.0020*
Proportion strongly concordant genes  Length      rho = 0.73, P = 0.0006*           rho = 0.60, P = 0.0066*           rho = 0.60, P = 0.0067*
Proportion strongly discordant genes  Length      rho = -0.22, P = 0.2048           rho = -0.034, P = 0.5487          rho = -0.51, P = 0.0217*

Variable 1                            Variable 2  MrBayes vs. MrBayes concatenated  RAxML vs. RAxML concatenated      BEAST vs. BEAST concatenated
Proportion of concordant genes        Support     rho = 0.68, P = 0.0017*           rho = 0.52, P = 0.0205*           rho = 0.51, P = 0.0207*
Proportion strongly concordant genes  Support     rho = 0.68, P = 0.0018*           rho = 0.59, P = 0.0084*           rho = 0.60, P = 0.0070*
Proportion strongly discordant genes  Support     rho = -0.20, P = 0.2328           rho = -0.46, P = 0.0348*          rho = 0.13, P = 0.6868
Proportion of concordant genes        Length      rho = 0.72, P = 0.0008*           rho = 0.68, P = 0.0019*           rho = 0.77, P = 0.0003*
Proportion strongly concordant genes  Length      rho = 0.81, P = 7.197e-5*         rho = 0.74, P = 0.0005*           rho = 0.82, P = 5.062e-5*
Proportion strongly discordant genes  Length      rho = -0.29, P = 0.135            rho = -0.62, P = 0.0049*          rho = 0.18, P = 0.7535
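The rho values in Table 2 are Spearman rank correlations, which can be computed as the Pearson correlation of the (tie-averaged) ranks of the two variables. The sketch below shows only that calculation, assuming neither variable is constant; it does not compute the one-sided P-values. The function names and the five paired values are hypothetical, not data from the study.

```python
from math import sqrt


def midranks(values):
    """Return 1-based ranks for `values`, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the ranks of x and y."""
    rx, ry = midranks(x), midranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)  # undefined if either variable is constant


# Hypothetical per-node values: branch length versus proportion of concordant genes.
branch_length = [1, 2, 3, 4, 5]
concordance = [5, 6, 7, 8, 7]
print(round(spearman_rho(branch_length, concordance), 4))
```

A positive rho here would correspond to the prediction in Section 2.7 that longer branches have more concordant genes.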

strongly discordant genes is only related to branch lengths in the RAxML concatenated tree (Table 2).

4. Discussion

4.1. Differences between concatenated and coalescent species trees

Species-tree methods may be more accurate than concatenated approaches for estimating phylogenies from multi-locus sequence data under a variety of conditions (e.g. Heled and Drummond, 2010; Leaché and Rannala, 2011), but concatenated approaches remain a widely used approach for many large-scale multi-locus studies (e.g. Blanga-Kanfi et al., 2009; Crawford et al., 2012; Wiens et al., 2012). Here, we address a critical but neglected question related to this issue: are there properties of concatenated trees that will predict their concordance with species-tree estimates?

Our results from scincid lizards demonstrate that the answer is yes. We find that branches of the standard non-clock Bayesian and likelihood concatenated trees that are short, weakly supported, and have few genes that are concordant with them are those that tend to have a different resolution in the estimated species tree from *BEAST. Although our results are based on only one group of organisms, these results are broadly congruent with theoretical predictions and related results from previous studies. For example, theory suggests that discordance among genes will tend to be greater on shorter branches due to incomplete lineage sorting (e.g. Maddison, 1997), and previous empirical studies show that short branches will tend to be weakly supported in concatenated analyses and will have greater disagreement with their underlying gene trees (e.g. Wiens et al., 2008, 2010a, 2012). Furthermore, without discordance between gene trees (and gene and species trees), estimated trees from species tree and concatenated analyses are expected to generally be identical (e.g. Edwards, 2009).

4.2. Practical implications

Our results have a particularly important practical application. We find that weakly supported relationships in concatenated trees may be those most likely to have a different resolution when species-tree methods are applied. Thus, incorporating uncertainty in the concatenated analysis may also help make phylogenetic results robust to the future application of species-tree methods. For example, the simple practice of performing comparative analyses on multiple trees sampled from the posterior distribution (from a Bayesian analysis) or from bootstrapping (from a likelihood analysis) should help ensure that the overall results are not compromised by differences between trees from concatenated and species-tree analyses. Nevertheless, it is important to point out that we do find cases where concatenated analysis strongly suggests a relationship that is strongly contradicted by species-tree analysis (i.e. MrBayes versus *BEAST for monophyly of Scincinae), but there are also strongly supported conflicts between different methods for estimating concatenated trees (MrBayes versus BEAST).

4.3. Similarity of relaxed clock concatenated and coalescent species trees

An unexpected result of our study was that in the 10-gene analyses, the relaxed clock concatenated analysis (BEAST) provided an estimate that was almost identical to that from the species-tree method (*BEAST), differing by only one node (which was weakly supported in both trees). The explanation for this pattern is not entirely clear. For example, the individual gene trees estimated by BEAST do not show greater congruence with the *BEAST species-tree estimate than do gene trees estimated by standard Bayesian analysis (Table 2). This suggests that other factors are at work besides the concordance of genes alone. One possibility is that the simple use of a relaxed molecular clock model by both BEAST and *BEAST may explain their similar topological estimates. However, analyses using an alternative relaxed clock method on the 10-gene data set did not support this idea (see Appendix E).

Furthermore, the pattern of greater congruence between relaxed-clock concatenated and species-tree estimates was not repeated in the 44 gene case (Fig. 4). Instead, the standard concatenated, relaxed-clock concatenated, and coalescent species-tree


estimates were all equally dissimilar (Fig. 4), based on the number of shared nodes. Nevertheless, we again found that relaxed clock and standard concatenated analyses gave quite different results, and these differences were actually strongly supported by each method for the Bayesian analyses.

These observations should also be considered in future simulation studies. Specifically, it is important to ensure that differences between concatenated and species-tree estimates are actually due to the use of these two classes of approaches, and not to use of a relaxed clock approach when estimating species trees vs. a non-clock approach when estimating concatenated trees. We note that previous analyses have suggested that relaxed-clock methods may provide more accurate estimates of phylogeny than standard concatenated analyses (using empirical datasets; Drummond et al., 2006) but some results from simulations suggest that standard concatenated analyses are more accurate instead (Wertheim et al., 2010).

4.4. Species-tree methods, deep phylogenies, and sampling of genes and taxa

Our results also offer some insights on the application of coalescent-based species-tree methods to relatively deep divergences and with limited sampling of species and individuals. Species-tree methods (especially BEST and *BEAST) are typically applied to closely related species (e.g. Camargo et al., 2012). Further, *BEAST is explicitly recommended for cases in which >1 individual is sampled per species (Heled and Drummond, 2010). Here, we explored a case where some divergences were very deep (~100 myr), species sampling was limited (relative to the >1560 species in the family), and only a single individual was sampled per species. Even though the true species phylogeny is not known in this case, we nevertheless found the results to be encouraging. Specifically, the species tree results were largely concordant with traditional taxonomy and with a recent (concatenated) analysis based on extensive species sampling, especially with regards to basal placement of Acontinae, monophyly of Lygosominae, and monophyly of Scincinae (including Feylininae). The first two results have also appeared in other studies based on concatenated analyses (e.g. Brandley et al., 2012; Wiens et al., 2012). However, the latter result (Scincinae monophyly) is concordant with traditional taxonomy but among recent molecular analyses has so far appeared only in one with extensive species sampling (683 scincid species; Pyron et al., 2013). Intriguingly, we found that the results from relaxed clock concatenated and species-tree analyses of 10 genes were more concordant with results from extensive species sampling than concatenated analyses based on 44 genes (e.g. concatenated analysis with 44 genes and 12 ingroup taxa does not support monophyly of Scincinae).

In addition, we found that many clades remained weakly supported in the estimated species trees after increasing the sampling of genes from 10 to 44 (but with reduced taxon sampling), and that mean Pp was actually slightly lower with 44 genes and reduced taxon sampling (12 species) than with 10 genes and more extensive taxon sampling (17 species). These results raise the interesting possibility that sampling more species may be more important for the accuracy of coalescent species-tree analyses than sampling more loci, at least for studies of higher-level phylogeny (possibly because of increased errors in estimating the underlying gene trees with poor taxon sampling, leading to errors in the estimated species tree). The relative importance of sampling taxa versus genes has been a major debate in systematics (e.g. Graybeal, 1998; Heath et al., 2008; Poe and Swofford, 1999; Rannala et al., 1998; Rosenberg and Kumar, 2001; Zwickl and Hillis, 2002), but not in reference to species-tree methods. Clearly, the impact of limited taxon sampling on species-tree analyses of higher-level phylogeny should be an urgent area for future research.

4.5. Skink phylogeny

Finally, our results are relevant to the higher-level phylogeny and classification of the largest family of lizards, which have been uncertain and controversial in the recent literature (e.g. Hedges and Conn, 2012; Pyron et al., 2013). Our relaxed-clock concatenated and species-tree results are concordant with a higher-level phylogeny and classification of skinks suggested by fewer loci but more extensive species sampling (Pyron et al., 2013), as noted above. We argue that the subdivided classification of the clearly monophyletic Scincidae into multiple families (Hedges and Conn, 2012) is unnecessary and disrupts traditional classification (e.g. changing the long-standing definition of Scincidae without any phylogenetic justification). Our results also confirm some aspects of previous concatenated analyses (e.g. Brandley et al., 2012), including placement of Acontinae as sister to other scincids, monophyly of Lygosominae, and placement of Feylininae within Scincinae. However, further resolution of generic and species-level relationships within the major clades of scincids will clearly require more extensive taxon sampling.

5. Conclusions

Using data from scincid lizards, we found that differences between concatenated and coalescent-based species trees are predictable and associated with branches in the concatenated tree that are short, weakly supported, and have fewer concordant gene trees. Importantly, our results suggest that in cases where coalescent-based species trees cannot be estimated (e.g. too many taxa), incorporating uncertainty in the concatenated tree may also incorporate differences between the concatenated tree and estimated species tree. We also found that relaxed-clock concatenated trees provided a better approximation of the coalescent-based species tree in our 10-gene, 17-taxon analyses than standard concatenated analyses. In addition, we found that coalescent-based species trees have lower mean support values when estimated using a dataset with a more than four-fold increase in the number of genes, but a 30% reduction in taxon sampling. This suggests that taxon sampling could be more important than sampling large numbers of genes when applying coalescent-based species-tree methods to deep phylogenetic questions. Finally, our results from species-tree analyses support division of Scincidae into three monophyletic subfamilies, a result otherwise found only in concatenated analyses with extensive species sampling.

Acknowledgments

We thank Kate Kuczynski for her assistance in generating new sequences for this study, and Nathan Butler for help with preliminary analyses. For generously providing tissue samples we thank A.M. Bauer, M.C. Brandley, A.E. Hebert, J.A. McGuire, R.A. Nussbaum, and J.Q. Richmond, and the following institutions (whose curators and/or collection managers we gratefully thank): J. Gauthier (Yale Peabody Museum), M.S.Y. Lee (South Australian Museum), R. Nussbaum (University of Michigan), J. Vindum (California Academy of Sciences). We are grateful to C. Linkem for helping to clarify the identity of our Sphenomorphus simus sample (see Appendix B). We thank J. Schulte II and two anonymous reviewers for helpful comments that improved the manuscript. This project was supported by a US National Science Foundation-AToL grant, with awards to T.W.R. (EF 0334967) and J.J.W. (EF

S.M. Lambert et al. / Molecular Phylogenetics and Evolution 82 (2015) 146–155 155

0334923). S.M.L. was supported by an N.S.F. Graduate Research Fellowship.

Appendix A. Supplementary material

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.ympev.2014.10.004.

References

Bayzid, M.D.S., Warnow, T., 2012. Estimating optimal species trees from incomplete gene trees under deep coalescence. J. Comput. Biol. 19, 591–605.
Blanga-Kanfi, S., Miranda, H., Penn, O., Pupko, T., DeBry, R.W., Huchon, D., 2009. Rodent phylogeny revised: analysis of six nuclear genes from all major rodent clades. BMC Evol. Biol. 9, 71.
Brandley, M.C., Wang, Y., Guo, X., Nieto Montes de Oca, A., Feria Ortiz, M., Hikida, T., Ota, H., 2012. The phylogenetic systematics of blue-tailed skinks (Plestiodon) and the family Scincidae. Zool. J. Linn. Soc. 165, 163–189.
Camargo, A., Avila, L.J., Morando, M., Sites Jr., J.W., 2012. Accuracy and precision in species tree estimation: an empirical evaluation of performance in lizards of the Liolaemus darwinii group (Squamata: Liolaemidae) under varying sub-sampling designs. Syst. Biol. 61, 272–288.
Crawford, N.G., Faircloth, B.C., McCormack, J.E., Brumfield, R.T., Winker, K., Glenn, T.C., 2012. More than 100 ultraconserved elements provide evidence that turtles are the sister group of archosaurs. Biol. Lett. 8, 783–786.
Darriba, D., Taboada, G.L., Doallo, R., Posada, D., 2012. JModelTest 2: more models, new heuristics and parallel computing. Nat. Methods 9, 772.
Davis, M.P., Midford, P.E., Maddison, W., 2013. Exploring power and parameter estimation of the BiSSE method for analyzing species diversification. BMC Evol. Biol. 13, 38.
Degnan, J.H., Rosenberg, N.A., 2009. Gene tree discordance, phylogenetic inference and the multispecies coalescent. Trends Ecol. Evol. 24, 332–340.
Douady, C.J., Delsuc, F., Boucher, Y., Doolittle, W.F., Douzery, E.J.P., 2003. Comparison of Bayesian and maximum likelihood bootstrap measures of phylogenetic reliability. Mol. Biol. Evol. 20, 248–254.
Drummond, A.J., Ho, S.Y.W., Phillips, M.J., Rambaut, A., 2006. Relaxed phylogenetics and dating with confidence. PLoS Biol. 4, 699–710.
Drummond, A.J., Rambaut, A., 2007. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol. Biol. 7, 214.
Drummond, A.J., Suchard, M.A., Xie, D., Rambaut, A., 2012. Bayesian phylogenetics with BEAUti and the BEAST 1.7. Mol. Biol. Evol. 29, 1969–1973.
[...] emerging? Evolution 63, 1–19.
Edwards, S.V., Liu, L., Pearl, D.K., 2007. High-resolution species trees without concatenation. Proc. Natl. Acad. Sci. USA 104, 5936–5941.
Felsenstein, J., 1985. Phylogenies and the comparative method. Am. Nat. 125, 1–15.
Felsenstein, J., 2004. Inferring Phylogenies. Sinauer Associates, Sunderland, MA.
Graybeal, A., 1998. Is it better to add taxa or characters to a difficult phylogenetic problem? Syst. Biol. 47, 9–17.
Greer, A.E., 1970. A subfamilial classification of scincid lizards. Bull. Mus. Comp. Zool. Harvard 139, 151–183.
[...] phylogenetic analyses. J. Syst. Evol. 46, 239–257.
Hedges, S.B., Conn, C.E., 2012. A new skink fauna from Caribbean islands (Squamata, Mabuyidae, Mabuyinae). Zootaxa 3288, 1–244.
Heled, J., Drummond, A.J., 2010. Bayesian inference of species trees from multilocus data. Mol. Biol. Evol. 27, 570–580.
Hothorn, T., Hornik, R., 2002. Exact nonparametric inference in R. In: Hardle, W., Ronz, B. (Eds.), Compstat: Proceedings in Computational Statistics 2002. Springer-Verlag, Berlin, pp. 355–360.
Hovmöller, R., Knowles, L.L., Kubatko, L.S., 2013. Effects of missing data on species tree estimation under the coalescent. Mol. Phylogenet. Evol. 69, 1057–1062.
Kubatko, L.S., Carstens, B.C., Knowles, L.L., 2009. STEM: species tree estimation using maximum likelihood for gene trees under coalescence. Bioinformatics 7, 971–973.
Lanfear, R., Calcott, B., Ho, S.Y.W., Guindon, S., 2012. PartitionFinder: combined selection of partitioning schemes and substitution models for phylogenetic analyses. Mol. Biol. Evol. 29, 1695–1701.
Larget, B.R., Kotha, S.K., Dewey, C.N., Ané, C., 2010. BUCKy: gene tree/species tree reconciliation with Bayesian concordance analysis. Bioinformatics 26, 2910–2911.
Leaché, A.D., Rannala, B.R., 2011. The accuracy of species tree estimation under simulation: a comparison of methods. Syst. Biol. 60, 126–137.
Liu, L., 2008. BEST: Bayesian estimation of species trees under the coalescent model. Bioinformatics 24, 2542–2543.
Maddison, W.P., 1997. Gene trees in species trees. Syst. Biol. 46, 523–536.
Maddison, W.P., Knowles, L.L., 2006. Inferring phylogeny despite incomplete lineage sorting. Syst. Biol. 55, 21–30.
Maddison, W.P., Maddison, D.R., 2000. MacClade 4.0. Sinauer Associates, Sunderland, MA.
Marshal, D.C., Simon, S., Buckley, T.R., 2006. Accurate branch length estimation in partitioned Bayesian analyses requires accommodation of among-partition rate variation and attention to branch length priors. Syst. Biol. 55, 993–1003.
McVay, J.D., Carstens, B.C., 2013. Phylogenetic model choice: justifying a species tree or concatenation analysis. J. Phylogen. Evol. Biol. 1, 114.
Mulcahy, D.G., Noonan, B.P., Moss, T., Townsend, T.M., Reeder, T.W., Sites Jr., J.W., Wiens, J.J., 2012. Estimating divergence dates and evaluating dating methods using phylogenomic and mitochondrial data in squamate reptiles. Mol. Phylogenet. Evol. 65, 974–991.
Nylander, J.A.A., Wilgenbusch, J.C., Warren, D.L., Swofford, D.L., 2008. AWTY (are we there yet?): a system for graphical exploration of MCMC convergence in Bayesian phylogenetics. Bioinformatics 24, 581–583.
Pamilo, P., Nei, M., 1988. Relationships between gene trees and species trees. Mol. Biol. Evol. 5, 568–583.
Perez, S.I., Klaczko, J., dos Reis, S.F., 2012. Species tree estimation for a deep phylogenetic divergence in the New World monkeys (Primates: Platyrrhini). Mol. Phylogenet. Evol. 65, 621–630.
Poe, S., Swofford, D.L., 1999. Taxon sampling revisited. Nature 398, 299–300.
Pyron, R.A., Burbrink, F.T., Wiens, J.J., 2013. A phylogeny and revised classification of Squamata, including 4161 species of lizards and snakes. BMC Evol. Biol. 13, 93.
Rannala, B., Huelsenbeck, J.P., Yang, Z., Nielsen, R., 1998. Taxon sampling and the accuracy of large phylogenies. Syst. Biol. 47, 702–710.
Rannala, B., Yang, Z., 1996. Probability distribution of molecular evolutionary trees: a new method of phylogenetic inference. J. Mol. Evol. 43, 304–311.
Ronquist, F., Teslenko, M., van der Mark, P., Ayres, D.L., Darling, A., Hohna, S., Larget, B., Liu, L., Suchard, M.A., Huelsenbeck, J.P., 2012. MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space. Syst. Biol. 61, 539–542.
Rosenberg, M.S., Kumar, S., 2001. Incomplete taxon sampling is not a problem for phylogenetic inference. Proc. Natl. Acad. Sci. USA 98, 10751–10756.
Stamatakis, A., 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22, 2688–2690.
Streitberg, B., Röhmel, J., 1986. Exact distributions for permutation and rank tests: an introduction to some recently published algorithms. Stat. Software Newslett. 12, 10–17.
Sugiura, N., 1978. Further analysis of the data by Akaike’s information criterion and the finite corrections. Commun. Stat. Theory Meth. 7, 13–26.
Sullivan, J., Swofford, D.L., Naylor, G.J.P., 1999. The effect of taxon sampling on estimating rate-heterogeneity parameters of maximum-likelihood models. Mol. Biol. Evol. 16, 1347–1356.
Swofford, D.L., 2002. PAUP*: Phylogenetic Analysis Using Parsimony*, Version 4.0b10. Sinauer, Sunderland, MA.
Team RC, 2012. R: A Language and Environment for Statistical Computing. .
Townsend, T.M., Alegre, E.R., Kelley, S.T., Wiens, J.J., Reeder, T.W., 2008. Rapid development of multiple nuclear loci for phylogenetic analysis using genomic resources: an example from squamate reptiles. Mol. Phylogenet. Evol. 47, 129–142.
Townsend, T.M., Mulcahy, D.G., Sites Jr., J.W., Kuczynski, C.A., Wiens, J.J., Reeder, T.W., 2011. Phylogeny of iguanian lizards inferred from 29 nuclear loci, and a comparison of concatenated and species tree approaches for an ancient, rapid radiation. Mol. Phylogenet. Evol. 61, 363–380.
Uetz, P., Goll, J., Hallermann, J., 2013. The . (accessed 27.11.13).
Wertheim, J., Sanderson, M.J., Worobey, M., Bjork, A., 2010. Relaxed molecular clocks, the bias-variance tradeoff, and the quality of phylogenetic inference. Syst. Biol. 59, 1–8.
Wiens, J.J., Hutter, C.R., Mulcahy, D.G., Noonan, B.P., Townsend, T.M., Sites Jr., J.W., Reeder, T.W., 2012. Resolving the phylogeny of lizards and snakes (Squamata) with extensive sampling of genes and species. Biol. Lett. 8, 1043–1046.
Wiens, J.J., Kuczynski, C.A., Smith, S.A., Mulcahy, D., Sites Jr., J.W., Townsend, T.M., Reeder, T.W., 2008. Branch lengths, support, and congruence: testing the phylogenomic approach with 20 nuclear loci in snakes. Syst. Biol. 57, 420–431.
Wiens, J.J., Kuczynski, C.A., Stephens, P.R., 2010a. Discordant mitochondrial and nuclear gene phylogenies in emydid turtles: implications for speciation and conservation. Biol. J. Linn. Soc. 99, 445–461.
Wiens, J.J., Kuczynski, C.A., Townsend, T., Reeder, T.W., Mulcahy, D.G., Sites Jr., J.W., 2010b. Combining phylogenomics and fossils in higher level squamate reptile phylogeny: molecular data change the placement of fossil taxa. Syst. Biol. 59, 674–688.
Wiens, J.J., Morrill, M.C., 2011. Missing data in phylogenetic analysis: reconciling results from simulations and empirical data. Syst. Biol. 60, 719–731.
Williams, J.S., Niedzwiecki, J.H., Weisrock, D.W., 2013. Species tree reconstruction of a poorly resolved clade of salamanders (Ambystomatidae) using multiple nuclear loci. Mol. Phylogenet. Evol. 68, 671–682.
Yang, Z., 2006. Computational Molecular Biology. Oxford University Press, New York, New York.
Zwickl, D.J., Hillis, D.M., 2002. Increased taxon sampling greatly reduces phylogenetic error. Syst. Biol. 51, 588–589.

Molecular Phylogenetics and Evolution 97 (2016) 242–243


Corrigendum

Corrigendum to “When do species-tree and concatenated estimates disagree? An empirical analysis with higher-level scincid lizard phylogeny” [Mol. Phylogen. Evol. 82A (2015) 146–155]

Shea M. Lambert a,*, Tod W. Reeder b, John J. Wiens a

a Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ 85721, USA
b Department of Biology, San Diego State University, San Diego, CA 92182, USA

The authors regret an error in the taxon labeling of Fig. 3. In the published figure, the taxon labels for Brachymeles gracilis and Eumeces schneideri are erroneously switched. Below is a corrected figure:

Additionally, we regret that the DOI for the Dryad submission was omitted from the main text. The DOI is . Finally, we regret that GenBank accession numbers for newly generated sequences used in this study were omitted from the supplemental material. Below is a table detailing these accessions:

DOI of original article: http://dx.doi.org/10.1016/j.ympev.2014.10.004
* Corresponding author.
E-mail addresses: [email protected] (S.M. Lambert), [email protected] (T.W. Reeder), [email protected] (J.J. Wiens).

http://dx.doi.org/10.1016/j.ympev.2016.02.006
1055-7903/© 2016 Elsevier Inc. All rights reserved.


Accession ID  Organism                    Gene
KP843142      Prasinohaema virens         GPR37
KP843143      Egernia whitii              FSTL5
KP843144      Eumeces schneideri          FSTL5
KP843145      Mabuya unimarginata         FSTL5
KP843146      Plestiodon skiltonianus     FSTL5
KP843147      Prasinohaema virens         FSTL5
KP843148      Sphenomorphus nigriventris  FSTL5
KP843149      Tropidophorus grayi         FSTL5
KP843150      Egernia whitii              BDNF
KP843151      Eumeces schneideri          BDNF
KP843152      Mabuya unimarginata         BDNF
KP843153      Plestiodon skiltonianus     BDNF
KP843154      Prasinohaema virens         BDNF
KP843155      Sphenomorphus nigriventris  BDNF
KP843156      Tropidophorus grayi         BDNF
KP843157      Egernia whitii              RAG1
KP843158      Eumeces schneideri          RAG1
KP843159      Mabuya unimarginata         RAG1
KP843160      Prasinohaema virens         RAG1
KP843161      Scincus scincus             RAG1
KP843162      Sphenomorphus nigriventris  RAG1
KP843163      Egernia whitii              NGFB
KP843164      Mabuya unimarginata         NGFB
KP843165      Prasinohaema virens         NGFB
KP843166      Sphenomorphus nigriventris  NGFB
KP843167      Tropidophorus grayi         NGFB
KP843168      Egernia whitii              AHR
KP843169      Mabuya unimarginata         AHR
KP843170      Prasinohaema virens         AHR
KP843171      Sphenomorphus nigriventris  AHR
KP843172      Tropidophorus grayi         AHR
KP843173      Egernia whitii              ECEL1
KP843174      Mabuya unimarginata         ECEL1
KP843175      Prasinohaema virens         ECEL1
KP843176      Sphenomorphus nigriventris  ECEL1
KP843177      Tropidophorus grayi         ECEL1
KP843178      Egernia whitii              PTGER4
KP843179      Mabuya unimarginata         PTGER4
KP843180      Sphenomorphus nigriventris  PTGER4
KP843181      Tropidophorus grayi         PTGER4
KP843182      Egernia whitii              DNAH3
KP843183      Prasinohaema virens         DNAH3
KP843184      Sphenomorphus nigriventris  DNAH3
KP843185      Tropidophorus grayi         DNAH3
KP843186      Eumeces schneideri          ZFP
KP843187      Prasinohaema virens         ZFP

Two other FSTL5 sequences listed as ‘TBA’ in the supplemental material were already accessioned to GenBank as part of a previous study. They are accessions JN654839 (Sphenomorphus cf. simus) and JN654849 (Zonosaurus ornatus). The authors thank John Gatesy for pointing out the error in Fig. 3. The authors apologise for any inconvenience caused.

Appendix C

Inferring introgression using RADseq and DFOIL: power and pitfalls revealed in a case study of spiny lizards (Sceloporus)

Running Title: Inferring introgression with RADseq and DFOIL

Shea M. Lambert1*, Jeffrey W. Streicher1,2, M. Caitlin Fisher-Reid3, Fausto R. Méndez de la Cruz4, Norberto Martínez-Méndez5, Uri Omar García Vázquez6, Adrián Nieto Montes de Oca7, and John J. Wiens1

1. Department of Ecology and Evolutionary Biology, University of Arizona. Tucson, AZ 85721, USA.
2. Department of Life Sciences, The Natural History Museum, London SW7 5BD, UK.
3. Department of Biological Sciences, Bridgewater State University. Bridgewater, MA 02324, USA.
4. Laboratorio de Herpetología, Instituto de Biología, Universidad Nacional Autónoma de México, Ciudad de México 04510, D.F., México.
5. Laboratorio de Bioconservación y Manejo, Departamento de Zoología, Escuela Nacional de Ciencias Biológicas del Instituto Politécnico Nacional, Ciudad de México 11340, D.F., México.
6. Unidad Multidisciplinaria de Investigación, Facultad de Estudios Superiores Zaragoza, Universidad Nacional Autónoma de México, Batalla 5 de Mayo s/n, Ejercito de Oriente, Iztapalapa, Ciudad de México 09230, D.F., México.
7. Departamento de Biología Evolutiva, Facultad de Ciencias, Universidad Nacional Autónoma de México, Ciudad de México 04510, D.F., México.

*Corresponding author, email: [email protected]

Abstract

Introgression is now commonly reported in studies across the Tree of Life, aided by recent advancements in data collection and analysis. Nevertheless, researchers working with non-model species lacking reference genomes may be stymied by a mismatch between available resources and methodological demands. In this study, we demonstrate a fast and simple approach for inferring introgression using RADseq data, and apply it to a case study involving spiny lizards (Sceloporus) from northeastern México. First, we find evidence for recurrent mtDNA introgression between the two focal species based on patterns of mito-nuclear discordance. We then test for nuclear introgression by exhaustively applying the “five-taxon” D-statistic (DFOIL) to all relevant individuals sampled for RADseq data. In our case, this exhaustive approach (dubbed “ExDFOIL”) entails testing up to ~250,000 unique four-taxon combinations of individuals across species. To facilitate use of this ExDFOIL approach, we provide scripts for many relevant tasks, including the selection of appropriate four-taxon combinations, execution of DFOIL tests in parallel, and visualization of introgression results in phylogenetic and geographic space. Using ExDFOIL, we find evidence for ancient introgression between the focal species. Furthermore, we reveal geographic variation in patterns of introgression that is consistent with patterns of mito-nuclear discordance and with recurrent introgression. Overall, our study demonstrates that the combination of DFOIL and RADseq data can effectively detect introgression under a variety of sampling conditions (for individuals, populations, and loci). Importantly, we also find evidence that batch-specific error in RADseq data may mislead inferences of introgression under certain conditions.

Keywords: D-statistics, ABBA-BABA, batch effects, hybridization, mito-nuclear discordance

1 | INTRODUCTION

Introgressive hybridization is increasingly recognized as a common and influential force of evolution (Mallet, 2005; Schwenk, Brede, & Streit, 2008; Harrison & Larson, 2014). The availability of genomic datasets and new methods for identifying introgression has facilitated a wide variety of new research in this area (reviewed in Twyford & Ennos, 2012; Payseur & Rieseberg, 2016). Nevertheless, it remains difficult to describe the timing, strength, directionality, genomic patterns, and geographic context of introgression, especially for taxa lacking suitable reference genomes.

A variety of methods for inferring introgression using genomic data are now available (reviewed in Sousa & Hey, 2013; Payseur & Rieseberg, 2016). In theory, many of these methods can be applied to “reduced-representation” datasets (e.g., RADseq, genotyping-by-sequencing) without needing a reference genome. The lack of a reference genome remains a common scenario, especially for the many researchers studying introgression in non-model organisms. However, in these cases, mismatches between methodological demands and available data or resources may be more common. For example, many methods assume unlinked sites (e.g., isolation with migration models; Hey, 2010), leading researchers to discard all but a single variant per locus. Other methods rely on well-resolved gene trees from many independent loci (e.g., most phylogenetic species networks; Than, Ruths, & Nakhleh, 2008; Solis-Lemus & Ané, 2016), which can be difficult to obtain using the relatively short sequences characteristic of reduced-representation datasets (e.g., Eaton & Ree, 2013). In other cases, methods require a pre-defined set of hypotheses or demographic models to compare (e.g., Gutenkunst, Hernandez, Williamson, & Bustamante, 2009; Cornuet et al., 2014). Such methods may also become computationally intractable as populations or hypotheses are added. Finally, many methods that are otherwise suitable may require that the number of populations and assignment of samples to these populations are known (e.g., f-statistics, Reich, Thangaraj, Patterson, Price & Singh, 2009). These tools also rely on estimates of allele frequencies that could be biased by the number of individuals sampled for each population and by the bioinformatic processing of the data.

D-statistics (i.e., ABBA-BABA; Green et al., 2010; Durand, Patterson, Reich, & Slatkin, 2011) and related methods offer a simple yet powerful framework for detecting introgression. These statistics typically require only a species-tree hypothesis and site-pattern counts for a sufficient number of biallelic sites (e.g., Martin et al., 2013; Eaton & Ree, 2013), and are computationally efficient to calculate. Pease and Hahn (2015) described DFOIL, a system of D-statistics applicable to a symmetric four-taxon tree (four ingroup taxa [P1-P4], plus an optional outgroup [O]) that can detect introgression and potentially infer its direction. Using simulations, Pease and Hahn (2015) demonstrated that DFOIL has high power and low type-I error under a wide range of conditions. By combining the results of four D-statistic “components”, DFOIL can infer either “ancestral” or “intergroup” introgression. Each component is a four-taxon D-statistic comparing three of the ingroup taxa. Ancestral introgression is that between the ancestor of the younger pair of ingroup taxa (P1 and P2) and one other ingroup taxon (P3 or P4). However, the direction of ancestral introgression cannot be inferred with DFOIL. In contrast, intergroup introgression can be inferred between any two ingroup taxa, and its direction can be estimated. Thus far, DFOIL has typically been applied across linkage groups of whole genomes or exomes, to summarize the genomic “landscape” of introgression (e.g., Kumar et al., 2016; Schumer, Cui, Powell, Rosenthal, & Andolfatto, 2016; Sarver et al., 2017). In theory, DFOIL will also work with reduced-representation markers (e.g., RADseq), allowing for the recovery of a single “genome-wide” signature of introgression (Pease & Hahn, 2015). However, this particular combination of methods (DFOIL and RADseq) has rarely been used (but see Huang, 2016, and Eaton & Ree, 2013 for a similar combination of approaches).
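As a minimal illustration of the logic behind these statistics, the sketch below computes the classic four-taxon D-statistic (Green et al., 2010; Durand et al., 2011) from counts of the two discordant site patterns. It is not the DFOIL implementation of Pease and Hahn (2015); the function name and the example counts are hypothetical.

```python
def d_statistic(n_abba: int, n_baba: int) -> float:
    """Classic four-taxon D for the tree (((P1, P2), P3), O).

    n_abba: sites where P2 and P3 share the derived allele (pattern ABBA).
    n_baba: sites where P1 and P3 share the derived allele (pattern BABA).
    """
    total = n_abba + n_baba
    if total == 0:
        raise ValueError("no informative (ABBA or BABA) sites")
    return (n_abba - n_baba) / total


# Hypothetical counts, for illustration only.
print(d_statistic(120, 80))  # 0.2
```

Without introgression, incomplete lineage sorting is expected to produce the two patterns in equal numbers, so D should be near zero. A significant excess of ABBA (D > 0) suggests gene flow between P2 and P3, and an excess of BABA (D < 0) suggests gene flow between P1 and P3.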

Here, we develop a novel application of DFOIL that involves exhaustively testing relevant four-taxon combinations of sampled individuals among species, and then summarizing the results over phylogenetic and geographic space. In the present study, this approach (referred to as “ExDFOIL”, https://www.github.com/SheaML/ExDFOIL) is paired with sequence data from double-digest RADseq (“ddRAD” hereafter; Peterson, Weber, Kay, Fisher, & Hoekstra, 2012). However, it is well-suited for sequence data from any reduced-representation genomic dataset with multiple individuals per population or species. We demonstrate this approach in a case study involving two lizard species from northeastern México. We also provide scripts that allow this approach to be readily applied by other researchers.
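The exhaustive step can be sketched as follows. This is only an illustration under simplifying assumptions, not the ExDFOIL code itself: it assumes that each test draws P1 and P2 from one group of individuals and P3 and P4 from another, it ignores any tree-based selection of valid combinations, and it uses hypothetical sample names.

```python
from itertools import combinations, product


def four_taxon_combinations(group_a, group_b):
    """Yield (P1, P2, P3, P4) with P1, P2 from group_a and P3, P4 from group_b."""
    for (p1, p2), (p3, p4) in product(combinations(group_a, 2),
                                      combinations(group_b, 2)):
        yield p1, p2, p3, p4


# Hypothetical sample names, for illustration only.
ornatus = ["orn1", "orn2", "orn3"]
oberon = ["obe1", "obe2", "obe3", "obe4"]

combos = list(four_taxon_combinations(ornatus, oberon))
print(len(combos))  # 18 = C(3, 2) * C(4, 2)
```

Because the number of combinations grows with the product of the pairwise counts in each group, even moderate sampling yields many tests, which is one reason to run them in parallel.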

We also address how batch-specific errors and the inclusion of singleton site-pattern counts from RADseq data might influence these analyses of introgression. Sequence error is known to affect D-statistics, particularly when it is uneven between samples or populations (Green et al., 2010; Durand et al., 2011). Batch-specific error can arise in RADseq or other genomic data from a wide variety of sources (Mastretta-Yanes et al., 2015), and might be expected to produce non-random patterns of sequence error when combining samples from different batches. However, the potential impact of RADseq batch effects on D-statistics (and related approaches) is unexplored with empirical data, to our knowledge. Therefore, we used our exhaustive analyses here to examine the influence of batch identity. Similarly, the inclusion of singleton site-pattern counts (e.g., ABAAA, “singleton counts” hereafter) in DFOIL is predicted to generate false positives when counts are highly uneven between samples (Pease & Hahn, 2015). However, this possibility also remains unexplored (both in silico and empirically). Therefore, we used our analyses here to empirically evaluate the impact of including singleton counts.

Our study system is a small group of spiny lizard species (genus Sceloporus) from eastern and central México. Sceloporus is a well-studied genus distributed from Canada to Central America containing ~100 species (Uetz, Freed, & Hošek, 2017). Sceloporus research includes many studies focused on hybridization and introgression (e.g., Hall & Selander, 1973; Sites, Davis, Hutchinson, Maurer, & Lara, 1993; Sites, Barton, & Reed, 1995; Leaché & Cole, 2007; Leaché, 2011; Leaché, Harris, Maliska, & Linkem, 2013; Grummer et al., 2015). The present study focuses on a subset of the torquatus species group, one of the youngest and most species-rich groups of Sceloporus (~17 nominal species, sensu Leaché, Banbury, Linkem, & Nieto Montes de Oca, 2016). To our knowledge, this group has not been the target of previous research on hybridization or introgression. The two focal species of this study are the small-bodied, desert-dwelling S. ornatus (panel A of Figure 1) and the large-bodied, alpine-dwelling S. oberon (sensu Wiens, Reeder, & Nieto Montes de Oca, 1999; panels F and G of Figure 1). These species were synonymized by Martínez-Méndez and Méndez-de la Cruz (2007) using mtDNA alone (although this was inconsistent with other analyses based on mtDNA; Wiens et al., 1999; Wiens, Kuczynski, Arif, & Reeder, 2010; Wiens, Kozak, & Silva, 2013). Here, using nuclear DNA, we find they are reciprocally monophyletic and not sister species.

In this study, we use this new ExDFOIL approach to evaluate nuclear introgression between these Sceloporus species. We apply this approach to new RADseq data for the torquatus group, using two distinct strategies for de novo ddRAD assembly and variant filtering (but identical individual-level sampling for downstream analyses). One strategy is aimed at phylogeny estimation and divergence dating across the clade (“clade-wide” dataset hereafter). The other is aimed at maximizing power to detect introgression in the two targeted species (“targeted” dataset hereafter). We first develop the hypothesis of introgression from patterns of mito-nuclear discordance, based on new mtDNA data and the RADseq phylogeny from the clade-wide dataset. We then test for the expected genome-wide signatures of introgression, using both the clade-wide and targeted datasets. The results demonstrate that DFOIL can detect introgression with RADseq data under a wide range of sampling conditions for loci, individuals, and populations. Furthermore, the ExDFOIL approach reveals intra-specific geographic variation in the degree of introgression. Finally, our analysis reveals that batch-specific error in RADseq data may mislead inferences of introgression under certain circumstances, and that inclusion of singleton counts may also be problematic.

2 | MATERIALS AND METHODS

2.1 | Geographic and taxonomic sampling

Our sampling focused on a subclade of the torquatus group formerly referred to as the poinsettii group (sensu Wiens et al., 2010). We emphasized widespread geographic sampling of S. ornatus and S. oberon. We used samples from Wiens et al. (1999), supplemented with additional fieldwork by NM, UOG, and SML. In order to resolve species-level phylogeny using ddRAD data, we also included representatives of seven other species of the torquatus group. These species were selected on the basis of their close relationships to the two focal species in previous phylogenetic studies (e.g., Wiens et al., 2013; Leaché et al., 2016) and tissue availability. Many S. minor individuals were sampled, as this species will be studied in future work. Vouchers, taxonomic identities, and geo-referenced locality information for all samples are provided in Table S2.

There are two parapatric forms of S. oberon (Wiens et al., 1999). We refer to them as “oberon-black” and “oberon-red” based on their distinctive male dorsal coloration (Figure 1, panels F and G, respectively). We excluded ddRAD and mtDNA data from putatively hybrid individuals from the contact zone separating them (see sampling gap in Figure 1). This zone will be examined in future work.

2.2 | mtDNA data collection

To better examine mito-nuclear discordance, we supplemented previously published data for the ND4 mtDNA region (Wiens et al., 1999; Martínez-Méndez & Méndez-de la Cruz, 2007) with new ND4 data. Given that S. ornatus was represented by two or fewer individuals in previous studies, we added multiple individuals and localities for this species. Clade-wide sampling of the torquatus group (and outgroups) relied primarily on sequences from Wiens et al. (1999) and Martínez-Méndez and Méndez-de la Cruz (2007). GenBank accession numbers and taxonomic identities for all sequences used are in Table S1. We focused on ND4 because it contains more informative sites than other mtDNA regions used in this group (e.g., 12S, see Table 3 of Wiens et al., 1999).

For most samples, we extracted genomic DNA (gDNA) from liver or tail tissue using the Qiagen DNeasy Blood and Tissue kit and its mini spin-column protocol. For others, we used an alternative paramagnetic bead protocol (M. Fujita, pers. comm.) using “Serapure” beads prepared following the protocol of Faircloth & Glenn (2014), modified from Rohland & Reich (2012). To amplify ND4 and adjacent tRNAs, we used the primers ND4 (5’-TGACTACCAAAAGCTCATGTAGAAGC-3’) and LUE (5’-TRCTTTTACTTGGATTTGCACCA-3’), from Forstner et al. (1995). Each reaction totaled 25µL, using 12.5µL of GoTaq® Green Master Mix (Promega), 1–2µL of template DNA, either 4µL of each primer at 3µM or 1µL of each primer at 10µM, and nuclease-free water to 25µL. Reaction conditions generally followed Wiens et al. (1999), but using 35 cycles. Sequencing was performed at the University of Chicago Comprehensive Cancer Center DNA Sequencing & Genotyping Facility and the University of Arizona Genetics Core using Applied Biosystems 3730 DNA Analyzers.
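The reaction volumes given for the ND4 amplifications imply the following water volumes and final primer concentrations. The short calculation below is a convenience sketch; the function name is ours, and it assumes that the two primers are added in equal volumes from stocks of equal concentration.

```python
def pcr_mix(template_ul, primer_ul, primer_stock_um,
            total_ul=25.0, master_mix_ul=12.5):
    """Return (water volume in µL, final concentration of each primer in µM)."""
    water_ul = total_ul - master_mix_ul - template_ul - 2 * primer_ul
    if water_ul < 0:
        raise ValueError("components exceed the total reaction volume")
    final_primer_um = primer_ul * primer_stock_um / total_ul
    return water_ul, final_primer_um


# 1 µL template with 4 µL of each primer at 3 µM
print(pcr_mix(1.0, 4.0, 3.0))   # (3.5, 0.48)
# 1 µL template with 1 µL of each primer at 10 µM
print(pcr_mix(1.0, 1.0, 10.0))  # (9.5, 0.4)
```

Under these assumptions the two primer options give similar final concentrations (0.48 µM and 0.4 µM per primer), and differ mainly in how much water is needed to reach 25 µL.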

2.3 | ddRAD data collection

We prepared ddRAD libraries generally following Peterson et al. (2012), with departures described here and in Appendix S1. We used the enzymes SbfI (5'-CCTGCA*GG-3') and MspI (5'-C*CGG-3', New England Biolabs) for restriction digest. We extracted gDNA as described above. We used Serapure beads for paramagnetic bead clean-up steps. For each sample, we used 250–500ng of starting DNA. We used a combinatorial barcode scheme to label individuals, with one barcode sequence on the 5’ end of the ligated adapter, and another on the 5’ end of the PCR primer. PCR amplification was performed on pooled libraries of 5’ barcoded samples. Full barcoding schemes and sequences are provided in Table S2.
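To illustrate why a combinatorial scheme is economical, the sketch below assigns samples to pairs of adapter and PCR-primer barcodes: m adapter barcodes and n primer barcodes can label up to m × n individuals. The barcode sequences and sample names are hypothetical and are not those in Table S2.

```python
from itertools import product

# Hypothetical barcodes and samples, for illustration only.
adapter_barcodes = ["ACGTA", "TGCAT", "GATCG"]
primer_barcodes = ["ATCACG", "CGATGT"]
samples = ["s1", "s2", "s3", "s4", "s5"]

# Each (adapter, primer) pair identifies at most one sample.
pairs = product(adapter_barcodes, primer_barcodes)
barcode_to_sample = dict(zip(pairs, samples))


def assign_sample(adapter_bc, primer_bc):
    """Return the sample for a barcode pair, or None if the pair is unused."""
    return barcode_to_sample.get((adapter_bc, primer_bc))


print(assign_sample("ACGTA", "CGATGT"))  # s2
print(assign_sample("GATCG", "CGATGT"))  # None (the sixth pair is unused)
```

In this toy example, five barcode oligos (three adapters and two primers) are enough to distinguish up to six samples.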

We prepared two separate “batches” of ddRAD libraries, sequenced at different times and sequencing facilities (hereafter “batch 1” and “batch 2”). Although the ddRAD protocol was similar in both cases, there were several technical differences in library preparation and sequencing. Full details of library preparation and sequencing for both batches are described in Appendix S1.

Differences in library preparation and sequencing for our batches are expected to cause batch-specific error (or “batch effects”; for specific sources of error see Table 1 of Mastretta-Yanes et al., 2015). With this in mind, we included 20 biological replicates (i.e., separate aliquots from the same gDNA stock included in each batch; Table S2). This allowed us to estimate (and optimize) error rates and (together with replicates at the population level) mitigate batch effects bioinformatically (see Appendix S1). In general, we tried to include samples from the same localities (or regions) in both batches to minimize introduction of real biological differences between the batches, which could be confounded with batch-specific error. However, samples of S. ornatus from the four western-most localities are restricted to batch 2. Additionally, several newly sampled localities for oberon-black and oberon-red were included in batch 2 only. Nevertheless, these sampling differences do not seem to explain any of our results (see sections 3.5 and 4.4).


2.4 | ddRAD de-novo assembly and variant calling

We demultiplexed data using the function process_radtags in Stacks v1.42 (Catchen, Hohenlohe, Bassham, Amores, & Cresko, 2013). We discarded reads for which the expected cut sites and/or barcodes were not found. We filtered PCR duplicates from our data using the clone_filter function in Stacks, specifying the use of 8bp unique molecular identifiers located upstream of the barcode on the 5’ adapter end. We filtered data for adapters and low-quality sequence using the quality-filtering step of dDocent v2.24 (Puritz et al., 2014), which uses Trimmomatic v0.3.33 (Bolger, Lohse, & Usadel, 2014). We ran this step before all subsequent assembly, read mapping, or variant-calling steps.

We used two executions of the dDocent pipeline for de novo assembly, read mapping, and variant calling. The resulting datasets (referred to as “targeted” and “clade-wide”) differed primarily in the samples used to create the de novo reference and in the parameters used to filter variants. Importantly, both datasets ultimately include the same set of individuals for use in DFOIL tests. We created a single de novo reference for each dataset, using data from multiple species. The reference samples for the clade-wide assembly included eleven total representatives (Table S2) from six species. These samples were selected on the basis of phylogenetic breadth and sequencing depth, and include representatives of S. torquatus (n=1), S. poinsettii (n=1), S. ornatus (n=2), S. oberon (n=2), S. cyanogenys (n=1), S. cyanostictus (n=1), and S. minor (n=3). The assembly used a minimum-read count (k1) of 3 and minimum-individual count (k2) of 4. The reference samples for the targeted assembly included all available individuals of S. oberon and S. ornatus from batch 2, and used values of k1=3 and k2=3. As we did not include individuals of S. cyanogenys or S. cyanostictus in the targeted reference panel, an increased rate of mapping error may occur for these taxa. In theory, this could lead to inflated type-I error rates for tests involving these taxa (Durand et al., 2011). However, because relatively few individuals are available for these taxa, this bias would likely persist even if the samples were included in the reference panel. Consequently, we interpreted with caution those tests using the targeted dataset and one individual of S. cyanostictus or S. cyanogenys (see section 3.5).

For both de novo assemblies, we mapped reads using BWA v0.7.15-r1142-dirty (Li & Durbin, 2010), with a match score of 1, mismatch score of 3, and gap score of 5. We called variants using the population-informed model of freebayes v1.0.2-33-gdbb6160 (Garrison & Marth, 2012), with populations defined by sampling locality (Table S2). For the clade-wide assembly, we mapped reads and called variants for all available individuals, including those from batch 1. These individuals had better overall sequencing depth but shorter read length (Table S2). However, these individuals were not included in either reference panel, as we observed that “mixed” reference panels resulted in higher error rates for biological replicates. For the targeted assembly for DFOIL analyses, we called variants for all available individuals (including both batches) of S. oberon, S. ornatus, S. cyanogenys, and S. cyanostictus. Based on our mtDNA analyses (Figure 2, Fig S1), one sample of S. minor was also included to use as the outgroup for our DFOIL analyses. We selected an individual of S. minor rather than a more distantly related taxon to maximize the number of comparable sites for analysis.

We applied distinct variant-filtering pipelines to the raw clade-wide variants and targeted variants (contained in the TotalRawSNPs.vcf files produced by dDocent). The clade-wide variants underwent relatively stringent filtering, which greatly reduced between-batch error rates (described in Appendix S1). In contrast, the targeted variants underwent relatively minimal filtering. Using vcftools v0.1.15 (Danecek et al., 2011), we removed all sites with quality scores <30, and all genotypes with coverage <3. We then removed any variants that were not bi-allelic SNPs or that had missing data for >99% of individuals. We converted the resulting vcf file to fasta format using the vcf_to_tab function of vcftools and a publicly available perl script by CM Bergey (https://code.google.com/archive/p/vcf-tab-to-fasta/). To characterize the effect of sampling additional individuals, we further split the data into “reduced” (Figure 2) and “full” (Fig S1) sets of individuals. For reduced sampling, we kept a small number of the best-sequenced individuals from each sampled population. For full sampling, we used these individuals plus any remaining individuals of S. cyanostictus, S. cyanogenys, S. oberon, or S. ornatus that had <50% missing sites. This threshold was applied to a minimally filtered version of the clade-wide TotalRawSNPs.vcf file (see Appendix S1 for filtering parameters).
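The minimal filtering applied to the targeted variants can be illustrated with a short Python sketch. This is not the vcftools invocation used in the study; it only restates the four thresholds given above (site quality <30, genotype coverage <3, bi-allelic SNPs only, >99% missing data) on a simplified, hypothetical in-memory record format. The function name `filter_sites` and the dictionary keys are assumptions made for illustration.

```python
def filter_sites(sites, min_qual=30, min_depth=3, max_missing=0.99):
    """Apply the four 'minimal' filters to a list of simplified site records.

    Each record is a dict with hypothetical keys:
      qual      - site quality score
      ref, alts - reference allele and list of alternate alleles
      genotypes - one genotype string per individual (None if missing)
      depths    - one read depth per individual
    """
    kept = []
    for site in sites:
        # 1. Drop sites with low quality scores.
        if site["qual"] < min_qual:
            continue
        # 2. Mask (set to missing) genotypes with low coverage.
        genotypes = [
            gt if (gt is not None and dp >= min_depth) else None
            for gt, dp in zip(site["genotypes"], site["depths"])
        ]
        # 3. Keep only bi-allelic SNPs: two alleles, each a single base.
        alleles = [site["ref"]] + list(site["alts"])
        if len(alleles) != 2 or any(len(a) != 1 for a in alleles):
            continue
        # 4. Drop sites missing in more than max_missing of individuals.
        missing = sum(gt is None for gt in genotypes) / len(genotypes)
        if missing > max_missing:
            continue
        kept.append({**site, "genotypes": genotypes})
    return kept


example_sites = [
    {"qual": 50, "ref": "A", "alts": ["G"],
     "genotypes": ["0/0", "0/1", "1/1"], "depths": [10, 2, 5]},
    {"qual": 20, "ref": "A", "alts": ["G"],
     "genotypes": ["0/0", "0/1", "1/1"], "depths": [10, 9, 5]},
    {"qual": 60, "ref": "A", "alts": ["G", "T"],
     "genotypes": ["0/1", "0/2", "1/2"], "depths": [10, 9, 5]},
    {"qual": 60, "ref": "AT", "alts": ["A"],
     "genotypes": ["0/0", "0/1", "1/1"], "depths": [10, 9, 5]},
    {"qual": 60, "ref": "C", "alts": ["T"],
     "genotypes": ["0/1", "0/1", "0/1"], "depths": [1, 1, 2]},
]
kept_sites = filter_sites(example_sites)
```

In this toy input only the first record survives, with its second genotype masked because of low coverage. The 99% missing-data threshold is deliberately permissive, so in practice it removes only sites that are missing in nearly every individual.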


2.5 | Phylogenetic analyses and divergence dating

For mtDNA data, we aligned sequences using the CLUSTAL W (Thompson, Higgins, & Gibson, 1994) plugin in Geneious v6 (Kearse et al., 2012). We collapsed identical haplotypes using the “Find Duplicates” function of Geneious. Collapsed sequences are indicated in Table S1. We manually inspected the protein-coding section of the ND4 alignment to ensure open reading frames for all sequences. In one case, we replaced an apparent stop-codon early in the protein-coding sequence of an outgroup sample (S. jarrovii, #AF154210) with three ambiguous base pairs (“NNN”).

We estimated phylogenetic relationships and divergence times with the ND4 alignment using a relaxed-clock Bayesian framework with BEAST2 v2.4.6 (Bouckaert et al., 2014), executed using the CIPRES Scientific Gateway v3.3 (Miller, Pfeiffer, & Schwartz, 2010). We used bModelTest (Bouckaert & Drummond, 2017) to infer and marginalize models of substitution and rate heterogeneity. As bModelTest has only been tested with sequences as short as 500bp, we chose not to use separate partitions for each codon position and the tRNA region, which would all be <250bp. Instead we treated the entire sequence as a single partition (but using models that incorporated rate heterogeneity among sites). Given the scarcity of fossils within the torquatus group, we used one secondary calibration point with a range of dates from two recent studies (Bayesian relaxed-clock trees of Wiens et al., 2013; Leaché et al., 2016). We calibrated the node corresponding to the most recent common ancestor of S. grammicus and the torquatus group, the root of the tree in the present study. We used a uniform prior from 12.9–18.0 million years ago (Mya), bracketed using these two point estimates for this node. We selected a single uncorrelated lognormal relaxed clock model (Drummond, Ho, Phillips, & Rambaut, 2006), and a calibrated Yule-tree prior (Heled & Drummond, 2012). We ran 40 million generations, sampling once every 10,000 generations, and ran three replicate analyses. We assessed convergence using Tracer v1.6 (Rambaut, Suchard, Xie, & Drummond, 2014). We first checked that all parameter and prior estimates had effective sample sizes (ESS) >200 for each replicate analysis. We then compared replicate analyses, ensuring that each provided similar estimates for all parameters. Using the results of one analysis, we generated a maximum clade-credibility tree using common ancestor heights, after discarding the first 10% of generations as burn-in, using TreeAnnotator v.2.4.5.

For phylogenetic analyses of ddRAD data, we used RAxML v8.2.9 (Stamatakis, 2014) on the University of Arizona High Performance Computing (UA HPC) cluster. We treated each concatenated matrix as one partition, using the GTRCAT model of evolution. We used the -f a option to search for an optimal tree and to assess support using 100 rapid bootstrap replicates. We conducted two analyses with the clade-wide alignment, one with full- and one with reduced-individual sampling. We did not include biological replicate sample pairs in phylogenetic analyses. Instead, we retained the replicate with more overall sequence data. Several samples were excluded from these analyses after being identified as having strongly discordant signal, putatively owing to introgression. Removal of these samples allowed for consistent placement of S. cyanogenys, but had no effect on placement of S. cyanostictus, S. oberon, or S. ornatus (see Appendix S1 for details).

We did not use the targeted dataset for phylogeny estimation, for two major reasons. First, invariant sites were not retained. Their exclusion can negatively influence topological and branch-length inference (Lewis, 2001; Bertels, Silander, Pachkov, Rainey, & van Nimwegen, 2014), even when corrections for acquisition bias are applied (Leaché, Banbury, Felsenstein, Nieto Montes de Oca, & Stamatakis, 2015). Second, only S. oberon and S. ornatus individuals were used to create the de novo reference assembly. This should bias the retained loci towards those with variants within or between these particular species. This is useful for detecting introgression between these species using DFOIL, but not for estimating branch-lengths or topologies.

For divergence dating using ddRAD data, we used penalized likelihood (Sanderson, 2002) implemented using treePL v1.0 (Smith & O'Meara, 2012). This approach estimates divergence times based on an existing topology and set of branch lengths (here from the concatenated likelihood analysis). Use of BEAST to simultaneously estimate the topology and divergence times was not practical for the ddRADseq data, given the large number of loci. We used a fixed-point calibration on the root, representing the crown-node of the torquatus group (sensu Leaché et al., 2016). We set this age to 11.8 Mya, based on the Bayesian estimate from Leaché et al. (2016). We used the leave-one-out cross-validation procedure (Sanderson, 2002) to choose the optimal smoothing parameter (10), comparing 6 smoothing values between 0.1–10,000.


2.6 | Exhaustive application of DFOIL (‘ExDFOIL’)

Our approach to assessing introgression with DFOIL involved applying this test exhaustively to all sets of individuals that matched the assumptions of DFOIL. To do this, we first wrote a custom function in R v3.4.1 (R Core Team, 2017) that accepts a phylogenetic tree and a list of taxa, and returns all unique sets of taxa {P1, P2, P3, P4} such that subsets {P1,P2} and {P3,P4} are reciprocally monophyletic, and the most recent common ancestor of subset {P1,P2} is younger than that of subset {P3,P4}. The latter assumption is met by all pairs of clades with non-identical clade age, but DFOIL expects that the younger clade is listed first. In cases of identical clade ages, the function will still accept the combination as valid and return the clades in an arbitrary order. However, no such cases exist for our tree. This function depends on the R packages ape (Paradis, Claude, & Strimmer, 2004), phytools (Revell, 2012), and combinat (Chasalow, 2012), and is available as Supplementary File S1. We applied this function using each of our two ddRAD-based phylogenetic hypotheses (reduced-individual sampling, Figure 2; and full-individual sampling, Fig S1). We then filtered these sets of taxa, retaining only sets with representatives of both S. oberon and S. ornatus, resulting in 32,368 unique sets (reduced) and 237,600 (full) unique sets. For every test, we used the same outgroup, a sample of S. minor from batch 2 with highly complete data (voucher EPR743, Table S2). We used the default significance cutoff of 0.01 for each DFOIL component, and defaults for all other parameters.
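The enumeration step described above can be sketched in Python as follows. This is not the R function from Supplementary File S1; it is a minimal re-expression of the stated criteria (two reciprocally monophyletic pairs, younger pair listed first) under the assumption of a rooted, bifurcating, dated tree. The tree encoding (a `parent` map and an `age` map) and all function names are hypothetical choices made for illustration.

```python
from itertools import combinations


def ancestors(node, parent):
    """Return the path from a node up to the root, inclusive."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def mrca(tips, parent):
    """Most recent common ancestor of a collection of tips."""
    tips = list(tips)
    shared = set(ancestors(tips[0], parent))
    for tip in tips[1:]:
        shared &= set(ancestors(tip, parent))
    # The first shared node on a tip-to-root path is the MRCA.
    for node in ancestors(tips[0], parent):
        if node in shared:
            return node
    raise ValueError("tips do not share a common ancestor")


def dfoil_sets(tips, parent, age):
    """List all (P1, P2, P3, P4) tuples with a symmetric four-taxon topology.

    {P1, P2} and {P3, P4} are reciprocally monophyletic within the quartet,
    and the pair with the younger MRCA is returned first. Ties in age are
    kept in the order in which they are encountered (arbitrary).
    """
    results = []
    for a, b, c, d in combinations(sorted(tips), 4):
        root4 = mrca((a, b, c, d), parent)
        pairings = (((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c)))
        for left, right in pairings:
            m_left, m_right = mrca(left, parent), mrca(right, parent)
            # Symmetric topology: both pair MRCAs lie below the quartet MRCA.
            if m_left == root4 or m_right == root4:
                continue
            if age[m_left] > age[m_right]:
                left, right = right, left  # list the younger pair first
            results.append(left + right)
    return results


# Toy tree: (((A,B),(C,D)),E), with node ages in arbitrary units.
toy_parent = {"A": "n1", "B": "n1", "C": "n2", "D": "n2",
              "n1": "n3", "n2": "n3", "n3": "root", "E": "root"}
toy_age = {"n1": 1.0, "n2": 2.0, "n3": 3.0, "root": 4.0}
toy_sets = dfoil_sets(["A", "B", "C", "D", "E"], toy_parent, toy_age)
```

For the toy tree, only the quartet A, B, C, D has the required symmetric topology, and A and B are listed first because their common ancestor is younger. Filtering the returned tuples for particular species (as done here for S. oberon and S. ornatus) would be a separate step applied to this list.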

We ran the DFOIL pipeline on each list of unique sets, for both the clade-wide and targeted datasets, on the UA HPC cluster. We executed the DFOIL pipeline using custom shell scripts and a publicly available perl script (selectSeqs.pl), parallelized across 28 processors with GNU parallel (Tange, 2011). We then used custom shell and R scripts to collate the DFOIL test results, associate sample information with each test result, calculate summary statistics, and visualize results. Example scripts and input files for these steps are available in the Dryad Digital Repository (available upon acceptance) and at https://www.github.com/SheaML/ExDFOIL.

For our primary analyses, we used the dfoilalt option of dfoil.py to ignore singleton-site pattern counts (e.g., ABAAA, “singleton counts” hereafter). These counts can be included in DFOIL to reflect the fact that introgression could transfer either the derived or ancestral allele (Pease & Hahn, 2015). Their inclusion is predicted to be potentially problematic when error or substitution rates are skewed (Pease & Hahn, 2015), but this prediction has not been formally tested. Therefore, we repeated all tests while including singleton counts using the default dfoil flag (see section 2.8).

DFOIL uses Chi-square tests of significance, rather than Z-scores from bootstrap or block-jackknife resampling, as in previous applications of D-statistics (e.g., Green et al. 2010; Eaton & Ree, 2013). An assumption of the Chi-square test is that the sites considered are independent. This assumption is clearly violated in most data sets (including our own). However, using simulation, Pease and Hahn (2015) showed that D-statistics will nevertheless approximate a Chi-square distribution when a sufficient number of sites are considered. They concluded that the number of sites required can be estimated using the parameter $\rho = 4N_e rL$, with $N_e$ = effective population size, $r$ = recombination rate, $L$ = the number of sites considered, and a minimum value of $\rho$ of ~4000. Using values of $N_e = 10^6$ and $r = 10^{-8}$, Pease and Hahn (2015) showed that the number of sites required is ~100kb. Although we lack formal estimates of effective population sizes or recombination rates for our focal taxa, we believe that we are likely sampling sufficient sites to avoid serious issues. Our clade-wide dataset samples from >400kb, and our targeted dataset samples from >5Mb. Moreover, while a recombination rate of $10^{-8}$ may be reasonable for sites within a single genomic window, the per-site recombination rate in our ddRAD dataset should be much higher, as ddRAD loci are scattered across the genome. Given these considerations, we do not believe that linkage within or between our RAD loci is likely to inflate our type-I error rate, particularly for the targeted dataset.
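The site-number argument above reduces to simple arithmetic, sketched here in Python. The values of effective population size and recombination rate are the illustrative ones attributed to Pease and Hahn (2015), not estimates for Sceloporus, and the function names are hypothetical.

```python
def rho(n_e, r, n_sites):
    """Population-scaled recombination parameter, rho = 4 * Ne * r * L."""
    return 4 * n_e * r * n_sites


def sites_required(rho_min, n_e, r):
    """Solve rho_min = 4 * Ne * r * L for the number of sites L."""
    return rho_min / (4 * n_e * r)


# Illustrative values from the text (assumed, not estimated for these taxa).
N_E = 1e6
R = 1e-8
RHO_MIN = 4000

min_sites = sites_required(RHO_MIN, N_E, R)   # approximately 1e5 sites (~100kb)
rho_clade_wide = rho(N_E, R, 404_966)         # clade-wide sites after filtering
rho_targeted = rho(N_E, R, 5_538_162)         # targeted reference assembly sites
```

Under these assumed values, both datasets give a value of rho above the ~4000 minimum. A smaller effective population size or lower recombination rate would raise the number of sites required proportionally.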


2.7 | Examination of batch-specific effects

To examine the influence of batch identity, we first compared proportions of positive tests for introgression for each of the four primary datasets (using Supplemental Tables S3–S6). We did this for tests that included: (1) only batch 2 samples for the four ingroup individuals (P1–P4); (2) only batch 1 samples for these individuals; and (3) any combination of batch 1 and batch 2 samples for ingroup individuals. We then used stacked bar plots to visualize test results for our targeted dataset with reduced-individual sampling, grouped by “batch signature.” We defined the batch signature as a string giving the batch of origin for each of the four ingroup individuals (P1–P4). For example, we used “1122” for a test with P1 and P2 from batch 1 and P3 and P4 from batch 2.
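As a small illustration of the batch-signature definition, the following Python sketch builds the four-character string and assigns a test to one of the three comparison groups listed above. The function names and group labels are hypothetical; the study's own collation used custom shell and R scripts.

```python
def batch_signature(batches):
    """Concatenate the batch of origin (1 or 2) for P1..P4, in that order."""
    if len(batches) != 4:
        raise ValueError("expected one batch label for each of P1, P2, P3, P4")
    if any(b not in (1, 2) for b in batches):
        raise ValueError("batch labels must be 1 or 2")
    return "".join(str(b) for b in batches)


def batch_group(signature):
    """Classify a signature as all batch 1, all batch 2, or mixed."""
    if signature == "1111":
        return "batch 1 only"
    if signature == "2222":
        return "batch 2 only"
    return "mixed"


example_signature = batch_signature([1, 1, 2, 2])   # P1, P2 from batch 1; P3, P4 from batch 2
example_group = batch_group(example_signature)
```

With two batches and four ingroup positions there are 16 possible signatures, of which 14 fall in the mixed group.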


2.8 | Examination of singleton-count effects

Pease and Hahn (2015) predicted that DFOIL may return false positives when including singleton counts if sample-specific error, or substitution rate, is high enough. The predicted mechanism for these false positives is an inflated distance from all other taxa for a taxon that has a relatively high error (or substitution rate), leading to false inference of introgression for that taxon’s sister lineage. For this reason, DFOIL warns if either of the singleton count ratios P1:P2 or P3:P4 is >1.25 or <0.75 for a given test.
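The ratio check described above can be expressed as a short Python sketch. This is not the code used inside dfoil.py; it only restates the 0.75 and 1.25 limits given in the text, and the function name and arguments are hypothetical.

```python
def singleton_ratio_flag(count_a, count_b, low=0.75, high=1.25):
    """Return (ratio, flagged) for a pair of singleton counts.

    count_a and count_b are the singleton site-pattern counts for the two
    members of a pair (P1 and P2, or P3 and P4). The pair is flagged when
    the ratio count_a / count_b falls outside the interval [low, high].
    """
    if count_b == 0:
        raise ValueError("second singleton count is zero; ratio undefined")
    ratio = count_a / count_b
    return ratio, (ratio < low or ratio > high)


balanced = singleton_ratio_flag(100, 100)   # equal counts, not flagged
skewed_high = singleton_ratio_flag(130, 100)  # ratio above the upper limit
skewed_low = singleton_ratio_flag(70, 100)    # ratio below the lower limit
```

A flagged ratio does not by itself show that a result is a false positive; it marks tests for which the singleton counts are uneven enough that the predicted bias could apply.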

To examine the influence of including singleton counts on our DFOIL results, we first compared proportions of positive tests within each dataset, both for tests excluding singleton counts (executed with dfoilalt flag) and tests including them (using dfoil flag). We also used a grouped bar plot to examine ratios of singleton counts in P3 to singleton counts in P4 (using the test results from our targeted dataset with reduced-individual sampling; Figure 6). This plot compares the average P3:P4 ratio for tests with singleton counts excluded vs. included, across each possible DFOIL result. Highly uneven ratios may cause erroneous inference of introgression for the taxon with a lower singleton count (Pease & Hahn, 2015). We chose to examine the P3:P4 ratio because introgression signatures returned by DFOIL can involve only one or the other of these taxa, allowing for a clearer examination of the relationship between singleton count and involvement in introgression.


3 | RESULTS

3.1 | mtDNA alignment and phylogenetic results

The final ND4 alignment contained 122 unique haplotypes, 915 total sites, and 310 parsimony-informative sites. We recovered a mostly well-resolved topology (Figure 2) for our species of interest (i.e., those sequenced for ddRAD): ((cyanostictus+cyanogenys),(minor(oberon+ornatus))). As we were unable to successfully sequence samples of S. serrifer for ddRAD, we do not focus on S. serrifer here. However, in agreement with Martínez-Méndez and Méndez-de la Cruz (2007), we find that S. serrifer is polyphyletic, with one clade related to S. minor and another to S. cyanogenys. Sceloporus cyanogenys and S. cyanostictus were each strongly supported as monophyletic, with posterior probabilities (PP) of 1.00. Sceloporus minor was supported as monophyletic with PP=0.89, and weakly supported (PP=0.72) as the sister group to a well-supported clade (PP=0.92) containing all S. oberon and S. ornatus samples. This clade is composed of two strongly supported subclades (PP=1.00). One contains only S. ornatus samples, including five individuals from localities near the contact zone with S. oberon (see Figure 1) and all samples from localities further west (see Figure 4). The other contains all S. oberon individuals, and all remaining S. ornatus individuals from localities near the contact zone. Within this “mixed” clade, oberon-red is strongly supported as monophyletic (PP=1.00), but oberon-black and S. ornatus are paraphyletic with respect to each other (with strong support for relevant nodes).


3.2 | ddRAD sequencing, assembly, and alignment results

Read counts for all samples are in Table S2 (batch 1: mean=1,145,317, range=11,973–3,712,752, standard deviation=814,844; batch 2: mean=983,548, range=1,546–6,765,507, standard deviation=1,306,692). The clade-wide reference assembly totaled 3,728 loci and 922,077 sites. After all filtering steps (see Appendix S1) we retained 1,712 loci and 404,966 sites. The clade-wide alignments used for phylogenetic analysis contained 2,950 (reduced individual) and 3,389 (full individual) parsimony-informative sites. For full and reduced-individual sampling of clade-wide variants, we retained 10,095 biallelic sites, including heterozygous sites (DFOIL derives site-pattern counts from biallelic sites with fixed differences only). The targeted reference assembly totaled 22,433 loci and 5,538,162 sites. After filtering based on quality and depth, we retained 205,101 biallelic sites from 21,818 loci, for both full and reduced-individual sampling. We note that we did not retain any loci that were entirely invariant, considering all individuals, for either dataset.


3.3 | ddRAD phylogenetic results

Using our clade-wide ddRAD dataset with reduced individual sampling, we recovered a strongly supported species-level hypothesis: (minor(oberon(cyanogenys(cyanostictus+ornatus)))). Bootstrap support was >90% for each inter-specific split (Figure 2). All nominal species were strongly supported as monophyletic, excepting S. minor (monophyletic in the best tree, but with bootstrap <50%) and S. cyanogenys (only one individual included). When the full set of individuals was used in tree estimation, bootstrap support for the placement of S. cyanogenys fell to 83% (Figure S1). However, the species-level topology remained identical and otherwise strongly supported (Fig S1). Notably, the topology from nuclear data shares none of the four interspecific splits from mtDNA data for the same five species (Figure 2).


3.4 | Exhaustive DFOIL results

Summaries for all DFOIL test results using the dfoilalt method (excluding singleton counts) are provided in Tables S3–S6. We found that dfoilalt tests with representatives of oberon-red resulted in very few introgression signatures (~1% of such tests for the targeted dataset with reduced-individual sampling, calculated from Table S3). This is consistent with the monophyly of oberon-red in our mtDNA dataset (Figure 2). Therefore, in Tables 1–3, we report proportions of positive tests considering only tests that included at least one representative of oberon-black. For each dataset, we provide the percentage of positive tests for ancestral and inter-group introgression in Table 1, raw numbers for each unique DFOIL signature in Table 2, the percentage of introgression signatures categorized by the species involved in Table 3, and the median, range, and standard deviation of non-singleton site-pattern counts in Table 4.

Overall, introgression was inferred much more frequently using the targeted datasets than the clade-wide datasets (21.2% vs 0.7% for reduced-individual sampling; Table 1), but this is unsurprising given that the targeted dataset had roughly five-fold more site-pattern counts per test (Table 4). Most positive results were for signatures of ancestral introgression, particularly involving the ancestors of pairs of S. ornatus (as P1 and P2) and oberon-black. Intergroup signatures (between terminal taxa) were recovered less frequently, and most often indicated introgression from S. ornatus into S. oberon, opposite the direction inferred from mtDNA (Table 3). However, we show in the next section that these intergroup signatures may be spurious and attributable to batch effects.

Phylogenetic and geographic patterns of introgression for the targeted dataset with reduced individual sampling are visualized in Figures 3 and 4, respectively. We stress that the proportions provided merely indicate the proportion of positive tests, and do not provide information on (for example) the quantity of introgression. These visualizations demonstrate that introgression signatures involving S. ornatus and oberon-black were detected in similar proportions under multiple phylogenetic and geographic contexts, as expected if introgression was ancient. However, tests involving the western-most populations of S. ornatus recover introgression slightly less often, potentially indicating recurrent introgression involving the eastern-most populations of S. ornatus.

We expected that for our targeted dataset, individuals of S. cyanostictus and S. cyanogenys could have increased rates of mapping error relative to individuals of S. ornatus and S. oberon, as they were not used in the reference panel (see section 2.4). A bias in rates of mapping error could theoretically lead to false positives for D-statistics (Durand et al., 2011). However, we observed the opposite pattern, with tests involving S. cyanostictus or S. cyanogenys recovering very few positive results (Fig. 3). This result could be caused by an increased rate of allelic dropout (or failure to map) for these taxa. Consistent with this idea, tests involving S. cyanostictus or S. cyanogenys had on average fewer counts than tests without these taxa. For the targeted dataset with reduced individual sampling, tests involving these taxa had an average of 918.3 non-singleton counts, while tests without these taxa had an average of 1203.3 non-singleton counts (calculated using Table S7).


3.5 | Batch-specific effects

Results for each dataset separated by batch identity (all batch 1 samples, all batch 2 samples, or a mix) are presented in Table 5. We focus here on the targeted dataset with reduced-individual sampling. For ancestral introgression signatures (Table 5), the proportion of positive tests was higher for tests using only batch 2 samples (38.7%) than tests using only batch 1 samples (18.2%), or a combination of batch 1 and 2 samples (17.3%). Intergroup signatures of introgression were also inferred in a larger proportion using only batch 2 samples (2.9%), compared to batch 1 (0.4%), or mixed samples (0.2%; Table 6). When applying the full-individual sampling to the targeted dataset, there was a similar overall reduction in the proportion of positive tests when using batch 1 or mixed samples. For the clade-wide datasets, batch 1 had a higher proportion of positive tests than batch 2, but the proportions were very low for both.

A visual comparison of the test results for each possible “batch signature” is provided in Figure 5, using the targeted dataset with reduced individual sampling. These plots show the apparent reduction in power when using only batch 1 samples or mixed-batch samples, as in Table 5. They also show several instances of over-representation of a particular introgression signature for one or more batch signatures. Most notably, the introgression signature “P1→P4” is over-represented in tests with a batch signature of “1211” or “1221” and the introgression signature “P2→P4” is over-represented in tests with a batch signature of “2111” or “2121”. We argue that each of these patterns is the result of type I error induced by batch effects (see Discussion). Also evident in Figure 5 is the over-representation of “P1/P2←→P3” introgression in tests with a batch signature “1112”. In this case, however, the over-representation is more likely due to a sampling difference between the batches for the reduced-individual datasets (see Discussion).

The inclusion of some sampled populations in only a single batch may raise concerns that our results are confounded by batch effects. In the case of S. oberon, several localities were restricted to batch 2 only (oberonS1, oberonS3, oberonS4, oberonS6; Figure 4). However, there is no apparent pattern in the introgression results for these populations vs. populations included in both batches. In the case of S. ornatus, the four western-most localities were exclusively from batch 2. These localities also show a reduction in the percentage of tests returning introgression (Figure 4). However, this geographic pattern is still apparent when results from western and eastern S. ornatus localities are compared using only batch 2 (Figure S5). Given this result, we think this geographic pattern is not a product of batch effects.


3.6 | Singleton-count effects

For simplicity, results in this section are only from the targeted dataset with reduced-individual sampling (results for all datasets in Table 6). Importantly, we included tests that sampled from only oberon-red populations (i.e., no oberon-black) in Table 6 and Figure 6. The inclusion of singleton counts massively increased the proportion of positive tests using only oberon-red samples (Table 6). The effects of including singleton counts were dramatic: 44.9% of tests returned introgression signatures when singleton counts were included and 21.2% when they were not (Table 6). We found many more signatures of directional introgression when including singleton counts (8.1% vs. 1.3%), and a much higher proportion of introgression signatures for tests that lacked representatives of oberon-black (42.9% vs. 1.0%).

Comparison of singleton-count ratios for taxa P3 and P4 across introgression signatures (Figure 6) showed that when singleton counts were included, average count ratios were often well beyond the limits suggested by the DFOIL authors (>0.75 and <1.25, indicated by dotted red lines), which may lead to type-I error (Pease & Hahn, 2015). In comparison, tests excluding these counts typically had ratios much closer to 1.


4 | DISCUSSION

4.1 | Advantages of combining ddRAD and ExDFOIL

The combination of ddRAD and the exhaustive DFOIL (ExDFOIL) approach developed here for detecting introgression has several attractive properties. These include cost-efficiency and ease of execution for laboratory and computational aspects alike. DFOIL also allows for maximal use of data from ddRAD markers by concatenating all biallelic sites from all loci, as long as a sufficient number of sites are sampled for the D-statistics to approximate a Chi-square distribution (Pease & Hahn, 2015). Many popular methods have assumptions about linkage that are potentially problematic for ddRAD and similar data types. For instance, some methods assume that each variant is effectively unlinked (e.g., Pickrell & Pritchard, 2012), which could require the sub-sampling of only one variant per ddRAD locus. Other methods may assume that variants within each locus are completely linked, but loci are unlinked (e.g., Hey 2010). Either condition could be violated in ddRAD or similar datasets, and these assumptions can be difficult to assess, especially without a reference genome.

The ExDFOIL approach is also free of several assumptions or intermediate steps required by “population-based” methods for inferring introgression (e.g., f-statistics: Reich et al., 2009; TreeMix: Pickrell & Pritchard, 2012). These methods require the number of populations, assignment of samples to these populations, and allele frequencies. This makes the ExDFOIL approach more readily applicable to datasets with highly uneven sampling between taxa and/or uncertainty about the number and composition of populations, as in our dataset. Moreover, by virtue of its exhaustive nature, ExDFOIL provides a richer view of the overall signal and noise in the data, as compared to a relatively small number of population-based hypothesis tests.

Despite its exhaustive nature, ExDFOIL remains computationally efficient, especially if multiple processors are available for use. The independent execution of a set of DFOIL tests is a readily parallelizable problem, with no communication required between processors. This means that computational time should decrease roughly in inverse proportion to the number of available processors (i.e., a near-linear speedup). This also means that the computational load of ExDFOIL could easily be split across entirely independent machines, and the results later collated. The calculation of D-statistics scales easily with the number of sites. Therefore, the major source of computational load for most ExDFOIL analyses will be the number of tests considered. This number will depend on the total number of taxa and the structure of the phylogenetic tree used. In our case, an increase in sampling of 25 individuals, from 36 (reduced-individual sampling) to 62 (full-individual sampling), generated 236,110 additional tests, or roughly 10^4 tests per individual added.
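The parallel structure described above can be sketched as follows. This is an illustration only, not the ExDFOIL scripts themselves: `placeholder_dfoil_test` and `run_exhaustive` are hypothetical names, the placeholder performs no real computation, and for simplicity the sketch enumerates every four-individual combination rather than only those compatible with a guide tree.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
from itertools import combinations


def placeholder_dfoil_test(quartet):
    """Stand-in for one independent test on a four-individual combination.

    A real analysis would compute the D-statistics for this quartet here;
    this placeholder only echoes its input.
    """
    return {"taxa": quartet, "signature": None}


def run_exhaustive(individuals, workers=4, executor_cls=ProcessPoolExecutor):
    """Run the placeholder test on every four-individual combination.

    Each test is independent, so the work is simply mapped over a pool of
    workers with no communication between them. Pass ThreadPoolExecutor
    as executor_cls for a quick check without spawning processes.
    """
    quartets = list(combinations(individuals, 4))
    with executor_cls(max_workers=workers) as pool:
        # map() returns results in the same order as the input quartets
        return list(pool.map(placeholder_dfoil_test, quartets, chunksize=64))
```

When using the default process pool from a script, the call to `run_exhaustive` would normally sit under an `if __name__ == "__main__":` guard. The same list of quartets could equally be split into chunks and sent to separate machines, with the result lists concatenated afterwards.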

Our results show that the combination of ddRAD and DFOIL can recover signatures of introgression using only a small fraction of genome-wide variation. Given an estimate of 2.73 pg for the genome size of a congener (S. magister; De Smet, 1981), and a conversion factor of 978 × 10^6 base pairs per picogram (Doležel, Bartos, Voglmayr, & Greilhuber, 2003), our targeted dataset sampled only ~0.2% of the genome (~5.5 × 10^6 bp). We detected introgression between the expected taxa with as few as 167 site-pattern counts for the clade-wide dataset, and 577 for the targeted dataset (Table 4). Furthermore, we were able to detect introgression even when including individual samples with large amounts of missing data (as in the full-individual sampling). For example, tests including a sample of S. ornatus with >75% missing data in the targeted dataset (NM170p1) still recovered introgression in >20% of tests (calculated from Table S4). This result suggests that DFOIL may perform well despite missing data or limited intra-population sampling.
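The genome-fraction estimate in this paragraph follows from simple arithmetic, reproduced here as a short sketch. The variable names are illustrative; the three input values are those quoted in the text.

```python
# Inputs quoted in the text
PG_TO_BP = 978e6      # base pairs per picogram (Dolezel et al., 2003)
genome_pg = 2.73      # genome size estimate for S. magister (De Smet, 1981)
sampled_bp = 5.5e6    # approximate number of base pairs in the targeted dataset

# Convert the genome size to base pairs, then take the sampled fraction
genome_bp = genome_pg * PG_TO_BP
fraction = sampled_bp / genome_bp

print(f"estimated genome size: {genome_bp:.3g} bp")
print(f"fraction of genome sampled: {fraction:.2%}")
```

The estimated genome size comes out at roughly 2.7 × 10^9 bp, so the targeted dataset covers about two-tenths of one percent of it.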

The signals of introgression that we recovered between our two focal species (S. oberon, S. ornatus) were phylogenetically ancient and geographically widespread (Figures 3, 4). However, we did discover subtle but sensible geographic variation in introgression within S. ornatus (discussed below). Presumably, the exhaustive application of DFOIL demonstrated here could reveal stronger variation in the phylogenetic, geographic, and/or temporal evidence of introgression in other cases. Nevertheless, one shortcoming of ExDFOIL is that the quantity of introgression, in terms of genomic contribution from each population, is not estimated. This could instead be addressed using population-based approaches (e.g., Reich et al., 2009; Pickrell & Pritchard, 2012).

Finally, we note that we used ExDFOIL here to investigate a single a priori hypothesis of introgression. However, we believe the approach may be particularly well suited as a tool for de novo discovery of introgression, as it produces a richly informative view of the signal (and noise) in a given dataset with relatively few assumptions.

4.2 | Mito-nuclear discordance indicates recurrent introgression

We argue that the rampant non-monophyly of mtDNA from S. ornatus and S. oberon that we observe here is best explained by repeated introgression between these species (Figure 2). This introgression may have been influenced by Pliocene climate change and/or Pleistocene climate cycles and associated range shifts or population-size fluctuations (although this would require detailed analyses to test, beyond the scope of the present study). We think it is unlikely that incomplete lineage sorting alone could produce these patterns, a conclusion that is in agreement with similar studies of mtDNA discordance in related lizard clades (e.g., McGuire et al., 2007; Jezkova, Leal, & Rodríguez-Robles, 2013). First, it is less likely that a species will be non-monophyletic in mtDNA due to incomplete lineage sorting, given the much smaller effective population size and rapid evolution of mtDNA (e.g., Wiens & Penkrot, 2002; Hudson & Turelli, 2003; Zink & Barrowclough, 2008). Second, there is a geographically biased distribution of putatively introgressed haplotypes, centered on the contact zone between oberon-black and S. ornatus (Figure 1). This pattern is not expected under incomplete lineage sorting alone (Funk & Omland, 2003).

We suggest that mtDNA introgression most likely occurred from oberon-black into S. ornatus, based on the co-occurrence of two distinct haplogroups within single S. ornatus localities near the range of oberon-black (Figure 1, panel D). Under this proposal, one haplogroup represents “native” S. ornatus haplotypes, more closely related to S. ornatus samples from localities farther west (depicted in blue in panels C and D of Figure 1). The other haplogroup represents introgressed haplotypes, more closely related to geographically proximate oberon-black (depicted in gray in panels C and D of Figure 1).

If introgression occurred in the opposite direction (from S. ornatus into oberon-black), we would need an alternative explanation for the retention of two distinct ND4 haplogroups in single localities of S. ornatus for at least ~4.64 Myr (the age of the crown-node ancestor of all S. oberon and S. ornatus, 95% highest posterior density [HPD] 2.21–6.49 Mya). Several selection-based scenarios could explain ancient mtDNA polymorphism, including negative frequency-dependent selection (Kazancıoğlu & Arnqvist, 2014) or sex-ratio distorting bacterial infection (e.g., Jiggins & Tinsley, 2005). Nevertheless, because introgression from oberon-black into S. ornatus does not require the invocation of any additional evolutionary forces, we consider it the more likely direction of mtDNA introgression.

Our analyses suggest that mtDNA introgression between S. oberon and S. ornatus occurred at least once, in an event as old as 2.29 Mya (crown-node ancestor of all S. oberon mtDNA samples, 95% HPD interval 1.31–3.38 Mya). One or two additional instances of mtDNA introgression are indicated by younger discordant relationships (grouping S. ornatus and oberon-black) as young as 0.61 Mya (youngest strongly-supported node with both S. oberon and S. ornatus descendants, 95% HPD 0.24–1.04 Mya; Figure 2). More ancient mtDNA introgression, involving the ancestors of S. oberon and S. ornatus, is also suggested by the discordance between mtDNA and ddRAD estimates for the phylogenetic position and age of the most recent oberon–ornatus ancestor (Figure 2). If the oberon–ornatus mtDNA clade is caused by introgression, then introgression may have occurred as long ago as ~5.29 Mya (stem-node of S. oberon and S. ornatus in the mtDNA tree, 95% HPD 3.34–7.26 Mya; Figure 2).

4.3 | Historical introgression revealed using RADseq data and ExDFOIL

Our exhaustive DFOIL approach (ExDFOIL) revealed phylogenetic and geographic patterns of introgression between S. oberon and S. ornatus. Introgression occurred anciently, between the ancestors of S. ornatus and oberon-black, and is broadly detectable across the range of both taxa. This includes populations from the western extent of the range of S. ornatus, >150 km from the range of S. oberon (Figures 3, 4). However, a slight reduction in the frequency of positive tests is evident for these western-most populations. This may indicate that these populations harbor less introgressed ancestry than their eastern counterparts, which may have participated in more recent introgression. This result is consistent with the geographic patterns of mito-nuclear discordance we observed in S. ornatus, where only eastern-most populations contain recently-introgressed mtDNA haplotypes (see above). Although the western-most S. ornatus localities are exclusively from batch 2, we do not believe that this result is compromised by batch effects, as the geographic pattern is still apparent when comparing only batch 2 samples (Supplemental Figure S5).

Introgression involving S. ornatus and oberon-black is rarely inferred in tests that include one representative of S. cyanostictus or S. cyanogenys and one of S. ornatus (Figure 3). As discussed in section 3.4, this result may be caused by higher rates of allelic dropout for S. cyanostictus and S. cyanogenys, at least for the targeted dataset. Closer examination of these tests, however, reveals that individual DFOIL components frequently indicate introgression involving S. ornatus and oberon-black, but also introgression between oberon-red and S. cyanostictus or S. cyanogenys (Tables S3–S6). It may be that the combination of introgression between, e.g., P1↔P3 and P2↔P4 prevents DFOIL from inferring either introgression type in these cases.

We were able to detect introgression using the clade-wide dataset, but only rarely (<1% overall; Table 1), despite sampling >1,700 loci and >400,000 bp. This result is not necessarily surprising, however. Many tests using the clade-wide data may not have sufficient count data (e.g., zero observations for one or more site patterns; Table 4, Tables S5, S6). Clearly, our clade-wide dataset has low power to detect historical introgression. Nevertheless, this dataset is more appropriate for resolving phylogeny using ddRAD data, as it contains more accurate variants (see Appendix S1, Tables S12–S16), and does not share the same potential for mapping error bias of the targeted dataset (see section 2.4).

We did not detect intergroup introgression signatures at high frequencies overall when excluding singleton counts (Tables 1–3). Moreover, many of these signatures may be compromised by batch-specific error (see next section). Simulations show that DFOIL may fail to polarize introgression when directional introgression is weak, occurs close to the divergence time of P1 and P2 (Figure 5 of Pease & Hahn, 2015), or is bi-directional (Schumer et al., 2016). The level of asymmetry in nuclear introgression between oberon-black and S. ornatus is therefore considered unknown.

4.4 | Batch effects can mislead D-statistics and DFOIL

The potential impact of batch effects on analyses of RADseq data is rarely discussed in the literature (but see Deagle, Faux, Kawaguchi, Meyer, & Jarman, 2015; Mastretta-Yanes et al., 2015). We found that batch effects apparently reduced power for detecting introgression, at least for the targeted datasets designed for analyzing introgression (Table 5, Figure 5). For the clade-wide datasets, batch 1 recovered slightly higher numbers of positive tests overall. This may be due to the greater number of individuals sequenced in batch 1 (especially for oberon-black and S. ornatus from near the contact zone), but the power of the clade-wide data was also consistently low (Table 5). As batch 1 had a shorter read length (100 bp vs. 125 bp), some reduction in power for tests using batch 1 samples was expected a priori.

We also found clear associations between particular introgression signatures and batch signatures (Figure 5, see Results). In some cases, these associations seem likely to reflect some level of type-I error induced by batch effects. In the case of “P1/P2↔P3” introgression and batch signature “1112”, the potential for batch effects to influence the results seems obvious, as introgression is inferred to involve the three taxa from batch 1, while the representative from batch 2 is not involved. However, the majority of these signatures correspond to introgression involving two representatives of oberon-black (~75%; Table S3). As batch 2 retained only one individual of oberon-black in the reduced-individual sampling, this type of introgression was not detectable using only batch 2 samples. In this case, confounding of batch and sampling effects prevents us from making a firm interpretation.

To understand how batch effects might influence “P1→P4” and “P2→P4” introgression signatures, we must consider the unique DFOIL signatures (sensu Pease & Hahn, 2015) underlying these results. Briefly, DFOIL signatures are defined by the individual results for each of the four DFOIL components. Each component is a four-taxon D-statistic that can be significantly different from zero in either the positive (+) or negative (-) direction (each implying a different taxon is involved in introgression), or not significantly different from zero (0). For example, one potential signature would be “--+0”.
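The way four component results combine into a signature string can be sketched as follows. This is a simplified illustration under stated assumptions, not the DFOIL implementation: each component is reduced to a pair of pooled counts, significance is assessed with a one-degree-of-freedom chi-square test, the threshold `alpha=0.01` is an assumed default, and the component order (DFO, DIL, DFI, DOL) is assumed from the method's name, with DOL in the final position. All function names and counts are hypothetical.

```python
import math

# Assumed component order; the final position corresponds to DOL
COMPONENTS = ("DFO", "DIL", "DFI", "DOL")


def component_sign(left, right, alpha=0.01):
    """Return '+', '-' or '0' for one D-statistic component."""
    total = left + right
    if total == 0:
        return "0"  # no informative sites, so no evidence either way
    chi2 = (left - right) ** 2 / total
    # Chi-square survival function for 1 degree of freedom
    p = math.erfc(math.sqrt(chi2 / 2))
    if p >= alpha:
        return "0"
    return "+" if left > right else "-"


def dfoil_signature(counts, alpha=0.01):
    """Concatenate the four component signs into a signature string.

    counts maps each component name to a (left, right) pair of counts.
    """
    return "".join(component_sign(*counts[name], alpha=alpha) for name in COMPONENTS)


# Hypothetical counts for a single test
example = {"DFO": (10, 60), "DIL": (12, 55), "DFI": (30, 31), "DOL": (50, 20)}
signature = dfoil_signature(example)  # "--0+"
```

Because each position is decided independently, a single spurious result in one component is enough to change the signature and therefore the inferred type of introgression.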

Importantly, both the “P1→P4” and “P2→P4” DFOIL signatures (“--0+” and “--0-”) differ from a DFOIL signature of ancestral introgression (“--00”) by only one DFOIL component (“DOL”). This means that a false positive result for DOL will imply directional introgression when placed against the background of a “true” signature of ancestral introgression. The DOL component compares the site-pattern counts for P1, P2, and P4. If batch-specific error is generating false positives, this should occur when P1 and P2 come from different batches, and P4 comes from the same batch as the donor taxon, while the batch identity of P3 should not matter. As expected, “P1→P4” introgression is overrepresented in tests with batch signatures of “1211” and “1221”, and “P2→P4” introgression is overrepresented in tests with batch signatures “2111” and “2121” (Figure 5). Specifically, we find that P1 and P2 are from different batches in 86.2% (332/385) of tests with “--0+” or “--0-” introgression signatures, and that P4 is from the same batch as the donor taxon in 95.8% of these cases (318/332), which is 82.6% of the total instances of directional introgression obtained using a mix of batch 1 and 2 samples (318/385, proportions calculated from Table S3). Unlike the situation described above for “P1/P2↔P3” introgression, no difference in sampling regime between the batches could fully explain this bias. We therefore conclude that some intergroup signatures we recovered are spuriously produced by batch effects, although we cannot say for sure which (or how many). This finding has important implications for other RADseq studies that focus on introgression using data from multiple batches.
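The batch-signature check described in this paragraph can be expressed compactly. The sketch below assumes that a batch signature is a four-character string giving the batch of P1, P2, P3 and P4 in that order, as in the examples in the text; the function name `fits_batch_pattern` is hypothetical. The three counts are those reported in the text.

```python
def fits_batch_pattern(batch_signature, donor):
    """Check whether a test matches the pattern expected under batch error.

    The pattern is: P1 and P2 come from different batches, and P4 comes
    from the same batch as the donor taxon ('P1' or 'P2'). The batch of
    P3 is ignored.
    """
    p1, p2, _p3, p4 = batch_signature
    donor_batch = {"P1": p1, "P2": p2}[donor]
    return p1 != p2 and p4 == donor_batch


# Counts reported in the text (calculated from Table S3)
directional_tests = 385   # tests with a directional signature and mixed batches
p1_p2_differ = 332        # of those, P1 and P2 from different batches
p4_matches_donor = 318    # of those, P4 from the same batch as the donor

share_differ = p1_p2_differ / directional_tests
share_match_given_differ = p4_matches_donor / p1_p2_differ
share_match_overall = p4_matches_donor / directional_tests
```

For example, `fits_batch_pattern("1211", "P1")` and `fits_batch_pattern("2121", "P2")` both match the pattern, whereas a single-batch test such as `"1111"` does not.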

4.5 | Does the inclusion of singleton-counts mislead DFOIL?

Pease and Hahn (2015) highlighted the theoretical potential for the inclusion of singleton counts to mislead DFOIL, but this issue has not been explored using simulated or empirical data (to our knowledge). We found that the inclusion of singleton counts in DFOIL tests dramatically increased the total proportion of positive tests recovered, and substantially altered the phylogenetic composition of the recovered introgression signatures (Table 6, Tables S7–S10). Specifically, adding singleton counts greatly increased the proportion of positive introgression results for tests that did not include any representative of oberon-black (from ~1% to >40% for the targeted dataset), and increased the overall proportion of intergroup introgression signatures recovered (from ~1% to ~6–8% for the targeted dataset). We consider these results suspect, independent of the actual singleton counts, for two reasons. First, most of the remaining intergroup signatures inferred when excluding singleton counts are potentially compromised by batch effects (see section 4.4). Second, although we cannot rule out historical introgression between oberon-red and S. ornatus, they are not geographically adjacent, and oberon-red is monophyletic with respect to S. ornatus in our mtDNA tree (Figure 2). These results contrast with those for S. ornatus and oberon-black. Thus, we suspect that inferred introgression between S. ornatus and oberon-red is artifactual.

Our ExDFOIL analyses also allow a direct appraisal of singleton-count ratios and associated results for thousands of tests. Using the targeted dataset with reduced individual sampling, we compared average P3:P4 count ratios for tests excluding vs. including singleton counts separately for each DFOIL result (Figure 6). This comparison demonstrates that average count ratios when including singleton counts are typically much more extreme than count ratios when excluding singleton counts. In many cases, the average count ratio when including singleton counts falls outside the recommended bounds of 0.75 and 1.25 (Figure 6). Furthermore, count ratios were consistently skewed such that the taxon with a lower singleton count was inferred to be involved in introgression, matching the predictions of Pease and Hahn (2015). These results provide empirical support for the idea that the inclusion of singleton counts may be problematic (Pease & Hahn, 2015).
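A check against the recommended bounds could look like the sketch below. It assumes that the count ratio is a simple quotient of the counts attributed to P3 and P4 and that the bounds are inclusive; the function name `count_ratio_within_bounds` and the example counts are hypothetical, and only the bounds of 0.75 and 1.25 come from the text.

```python
def count_ratio_within_bounds(p3_count, p4_count, lower=0.75, upper=1.25):
    """Return (ratio, ok) for a P3:P4 count ratio.

    ok is True when the ratio lies within the recommended bounds
    (treated here as inclusive, which is an assumption).
    """
    if p4_count == 0:
        raise ValueError("P4 count is zero; the ratio is undefined")
    ratio = p3_count / p4_count
    return ratio, lower <= ratio <= upper


# Hypothetical counts: a balanced pair and a strongly skewed pair
balanced = count_ratio_within_bounds(90, 100)   # ratio 0.9, within bounds
skewed = count_ratio_within_bounds(150, 100)    # ratio 1.5, outside bounds
```

Flagging tests whose ratio falls outside the bounds, rather than discarding them silently, keeps the effect of singleton counts visible when results are summarized across thousands of tests.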

5 | CONCLUSIONS

We demonstrate a novel application of the DFOIL method (Pease & Hahn, 2015) for detection of introgression. The approach (‘ExDFOIL’) involves exhaustively applying DFOIL to hundreds of thousands of unique four-taxon combinations of individuals, here sequenced using a reduced-representation protocol (ddRAD). We demonstrate ExDFOIL in an empirical system in which mito-nuclear discordance independently suggests recurrent introgression. We find that DFOIL can detect introgression under a broad range of genomic and geographic sampling conditions. Furthermore, the ExDFOIL approach reveals subtle intra-specific geographic variation in introgression that is also consistent with observed patterns of mito-nuclear discordance and a hypothesis of recurrent introgression. Our results may also provide the first empirical evidence that batch effects in RADseq data can mislead inferences of introgression. Finally, our results provide empirical support for the predictions of Pease and Hahn (2015) that the inclusion of singleton counts in DFOIL analyses may yield problematic results. We provide scripts to apply our approach to other datasets at https://www.github.com/SheaML/ExDFOIL.

ACKNOWLEDGEMENTS

We thank Harsimran Brar, Liz Miller, and Tereza Jezkova for help in the lab, and Anthony Baniaga, Tim O’Connor, and Tod W. Reeder for help in the field. We thank N. Barton and an anonymous reviewer for helpful comments that improved the manuscript. We thank Kathryn Chenard for the artwork of S. oberon and S. ornatus in Figures 2, 3, and 4. This work was partially funded by awards from the Society for the Study of Evolution (Rosemary Grant Award) and Society of Systematic Biologists (Graduate Student Research Award) to SML. This material is based upon work supported by the National Science Foundation Graduate Research Fellowship under Grant No. 2012117663 to SML and NSF DEB 1655690 to JJW. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

746

747 REFERENCES 748 Bertels, F., Silander, O. K., Pachkov, M., Rainey, P. B., & van Nimwegen, E. (2014). Automated 749 reconstruction of whole-genome phylogenies from short-sequence reads. Molecular Biology 750 and Evolution, 31, 1077–1088. doi: 10.1093/molbev/msu088 751 752 Bolger, A. M., Lohse, M., & Usadel, B. (2014). Trimmomatic: A flexible trimmer for Illumina sequence 753 data. Bioinformatics 30, 2114–2120. doi: 10.1093/bioinformatics/btu170 754 755 Bouckaert, R., Heled, J., Kühnert, D., Vaughan, T., Wu, C.-H., Xie, D., Suchard, M. A., Rambaut, A., & 756 Drummond, A.J. (2014). BEAST 2: a software platform for Bayesian evolutionary analysis. 757 PLoS Computational Biology, 10, e1003537. doi: 10.1371/journal.pcbi.1003537 758 759 Bouckaert, R. R., & Drummond, A. J. (2017). Bmodeltest: Bayesian phylogenetic site model averaging 760 and model comparison. BMC Evolutionary Biology, 17, 42. doi: 10.1186/s12862-017-0890-6 761 762 Catchen, J., Hohenlohe, J. P., Bassham, S., Amores, A., & Cresko, W. (2013). Stacks: an analysis tool 763 set for population genomics. Molecular Ecology, 22, 3124–3140. doi: 10.1111/mec.12354 764 765 Chasalow, S. (2012). combinat: combinatorics utilities. R package version 0.0-8. 766 https://CRAN.R-project.org/package=combinat 767 768 Cornuet, J. M., Pudlo, P., Veyssier, J., Dehne-Garcia, A., Gautier, M., Leblois, R., Marin, J. M., & 769 Estoup, A. (2014). DIYABC v2.0: a software to make approximate Bayesian computation 770 inferences about population history using single nucleotide polymorphism, DNA sequence and 771 microsatellite data. Bioinformatics, 30, 1187–1189. doi: 10.1093/bioinformatics/btt763 772 773 Danecek, P., Auton, A., Abecasis, G., Albers, C., Banks, E., DePristo, M. A., Handsaker, R., Lunter, G., 774 Marth, G., Sherry, S.T., McVean, G., Durbin, R., & 1000 Genomes Project Analysis Group. 775 (2011). The variant call format and VCFtools. Bioinformatics, 27, 2156–2158. doi: 776 10.1093/bioinformatics/Fbtr330 777 778 Deagle, B. 
E., Faux, C., Kawaguchi, S., Meyer, B., & Jarman, S. N. (2015). Antarctic krill population 779 genomics: apparent panmixia, but genome complexity and large population size muddy the 780 water. Molecular Ecology, 24, 4943–4959. doi: 10.1111/mec.13370 781 782 De Smet, W. H. O. (1981). The nuclear Feulgen-DNA content of the vertebrates (especially reptiles), as 783 measured by fluorescence cytophotometry, with notes on the cell and chromosome size. Acta 784 Zoologica et Pathologica Antverpiensia, 76, 119–167. 80

35

785 786 Doležel, J., Bartos, J., Voglmayr, H., & Greilhuber, J. (2003). Nuclear DNA content and genome size of 787 trout and human. Cytometry Part A, 51, 127–128. doi: 10.1002/cyto.a.10013 788 789 Drummond, A. J., Ho, S. Y. W., Phillips, M. J., & Rambaut, A. (2006). Relaxed phylogenetics and 790 dating with confidence. PLoS Biology, 4, e88. doi: 10.1371/journal.pbio.0040088

791 Durand, E. Y., Patterson, N., Reich, D., & Slatkin, M. (2011). Testing for ancient admixture between 792 closely related populations. Molecular Biology and Evolution, 28, 2239–2252. doi: 793 10.1093/molbev/msr048

794 Eaton, D. A. R., & Ree, R. H. (2013). Inferring phylogeny and introgression using RADseq data: an 795 example from flowering plants (Pedicularis: Orobanchaceae). Systematic Biology, 62, 689–706. 796 doi: 10.1093/sysbio/syt032 797 798 Faircloth, B. C., & Glenn, T. C. (2014). Protocol: Preparation of an AMPure XP substitute (AKA 799 Serapure). doi: 10.6079/J9MW2F26 800 801 Funk, D. J., & Omland, K. E. (2003). Species level paraphyly and polyphyly: frequency, causes, and 802 consequences, with insights from mitochondrial DNA. Annual Review of Ecology 803 Evolution and Systematics, 34, 397–423. doi: 10.1146/annurev.ecolsys.34.011802.132421 804 805 Garrison, E., & Marth, G. (2012). Haplotype-based variant detection from short-read sequencing. arXiv 806 preprint arXiv:1207.3907 [q-bio.GN] 807 808 Green, R. E., Krause, J., Briggs, A. W., Maricic, T., Stenzel, U., Kircher, M., … & Pääbo, S. (2010). A 809 draft sequence of the Neandertal genome. Science, 328, 710–722. doi: 10.1126/science.1188021 810 811 Grummer, J. A., Calderón-Espinosa, M. L., Nieto Montes de Oca, A.., Smith, E. N., Méndez-de la 812 Cruz, F. R., & Leaché, A. D. (2015). Estimating the temporal and spatial extent of gene flow 813 among sympatric lizard populations (genus Sceloporus) in the southern Mexican highlands. 814 Molecular Ecology, 24, 1523–1542. doi: 10.1111/mec.13122 815 816 Gutenkunst, R. N., Hernandez, R. G., Williamson, S. H., & Bustamante, C. D. (2009). Inferring the 817 joint demographic history of multiple populations from multidimensional SNP frequency data. 818 PLoS Genetics, 5: e10000695. doi: 10.1371/journal.pgen.1000695 819 820 Hall, W. P., & Selander, R. K. (1973). Hybridization of karyotypically differentiated populations in the 821 Sceloporus grammicus complex (Iguanidae). Evolution, 27, 226–242. doi: 10.2307/2406963 822 823 Harrison, R. G. & Larson, E. L. (2014). Hybridization, introgression, and the nature of species 824 boundaries. Journal of Heredity, 105, 795–809. 
doi: 10.1093/jhered/esu033 825 826 Heled, J., & Drummond, A. J. (2012). Calibrated tree priors for relaxed phylogenetics and divergence 827 time estimation. Systematic Biology, 61, 138-49. doi: 10.1093/sysbio/syr087 81

36

828 829 Hey, J., & Sousa, V. (2013) Understanding the origin of species with genome-scale data: modelling 830 gene flow. Nature Reviews Genetics, 14, 404–414. doi:10.1038/nrg3446 831 832 Hey, J. (2010). Isolation with migration models for more than two populations. Molecular Biology and 833 Evolution, 27, 905–920. doi: 10.1093/molbev/msp296 834 835 Huang, J.-P. (2016). Parapatric genetic introgression and phenotypic assimilation: testing conditions for 836 introgression between Hercules beetles (Dynastes, Dynastinae). Molecular Ecology, 25, 5513– 837 5526. doi: 10.1111/mec.13849 838 839 Hudson, R. R., & Turelli, M. (2003). Stochasticity overrules the ‘‘three-times rule’’: genetic drift, 840 genetic draft, and coalescence times for nuclear loci versus mitochondrial DNA. Evolution, 57, 841 182–190. doi: 10.1111/j.0014-3820.2003.tb00229.x 842 843 Jezkova, T., Leal, M., & Rodríguez-Robles, J. A. (2013). Genetic drift or natural selection? 844 Hybridization and asymmetric mitochondrial introgression in two Caribbean lizards (Anolis 845 pulchellus and Anolis krugi). Journal of Evolutionary Biology, 26, 1458–1471. doi: 846 10.1111/jeb.12149 847 848 Jiggins, F. M., & Tinsley, M. C. (2005). An ancient mitochondrial polymorphism in Adalis bipunctata 849 linked to a sex-ratio-distorting bacterium. Genetics, 171, 1115–1124. doi: 850 10.1534/genetics.105.046342 851 852 Kazancıoğlu, E., & Arnqvist, G. (2014). The maintenance of mitochondrial genetic variation by 853 negative frequency-dependent selection. Ecology Letters, 17, 22–27. doi: 10.1111/ele.12195 854 855 Kearse, M., Moir, R., Wilson, A., Stones-Havas, S., Cheung, M., Sturrock, S., Buxton, S., Cooper, A., 856 Markowitz, S., Duran, C., Thierer, T., Ashton, B., Mentjies, P., & Drummond, A. (2012). 857 Geneious Basic: an integrated and extendable desktop software platform for the organization 858 and analysis of sequence data. Bioinformatics, 28, 1647–1649. 
doi: 859 10.1093/bioinformatics/bts199 860 861 Kumar, V., Lammers, F., Bidon, T., Pfenninger, M., Kolter, L., Nilsson, M. A., & Janke, A. (2017). The 862 evolutionary history of bears is characterized by gene flow across species. Scientific Reports, 7, 863 46487. doi: 10.1038/srep46487 864 865 Leaché, A. D. (2011). Multi-locus estimates of population structure and migration in a fence lizard 866 hybrid zone. PLoS ONE, 6, e25827. doi: 10.1371/journal.pone.0025827 867 868 Leaché, A. D., Banbury, B. L., Felsenstein, J., Nieto Montes de Oca, A., & Stamatakis, A. (2015). Short 869 tree, long tree, right tree, wrong tree: new acquisition bias corrections for inferring SNP 870 phylogenies. Systematic Biology, 64, 1032–1047. doi: 10.1093/sysbio/syv053 871 82

37

872 Leaché, A. D., Banbury, B. L., Linkem, C. W., & Nieto Montes de Oca, A. (2016). Phylogenomics of a 873 rapid radiation: is chromosomal evolution linked to increased diversification in North American 874 spiny lizards (Genus Sceloporus)? BMC Evolutionary Biology, 16, 63. doi: 10.1186/s12862- 875 016-0628-x 876 877 Leaché, A. D., & Cole, C. J. (2007). Hybridization between multiple fence lizard lineages in an 878 ecotone: locally discordant variation in mitochondrial DNA, chromosomes, and morphology. 879 Molecular Ecology, 16, 1035–1054. doi: 10.1111/j.1365-294X.2006.03194.x 880 881 Leaché, A. D., Harris, R. B., Maliska, M. E., & Linkem, C. W. (2013). Comparative species divergence 882 across eight triplets of spiny lizards (Sceloporus) using genomic sequence data. Genome 883 Biology and Evolution, 5, 2410–2419. doi: 10.1093/gbe/evt186 884 885 Leigh, J. W., & Bryant, D. (2015). PopART: Full-feature software for haplotype network construction. 886 Methods in Ecology and Evolution, 6, 1110–1116. doi: 10.1111/2041-210X.12410 887 888 Lewis, P. O. (2001). A likelihood approach to estimating phylogeny from discrete morphological 889 character data. Systematic Biology, 50, 913–925. doi: 10.1080/106351501753462876 890 891 Li, H., & Durbin, R. (2010). Fast and accurate long-read alignment with Burrows-Wheeler Transform. 892 Bioinformatics. Epub 2010 Jan 15. doi: 10.1093/bioinformatics/btp698 893 894 Mallet, J. (2005). Hybridization as an invasion of the genome. Trends in Ecology and Evolution, 20, 895 229–237. doi: 10.1016/j.tree.2005.02.010 896 897 Martin, S. H., Dasmahapatra, K. K., Nadeau, N. J., Salazar, C., Walters, J. R., Simpson, F., Blaxter, M., 898 Manica, A., Mallet, J., & Jiggins, C. D. (2013). Genome-wide evidence for speciation with gene 899 flow in Heliconius butterflies. Genome Research, 23, 1817–1828. doi: 10.1101/gr.159426.113

Martínez-Méndez, N., & Méndez de la Cruz, F. R. (2007). Molecular phylogeny of the Sceloporus torquatus species-group (Squamata: Phrynosomatidae). Zootaxa, 1609.1, 53–68.

Mastretta-Yanes, A., Arrigo, N., Alvarez, N., Jorgensen, T. H., Piñero, D., & Emerson, B. C. (2015). Restriction site-associated DNA sequencing, genotyping error estimation and de novo assembly optimization for population genetic inference. Molecular Ecology Resources, 15, 28–41. doi: 10.1111/1755-0998.12291

McGuire, J. A., Linkem, C. W., Koo, M. S., Hutchison, D. W., Lappin, A. K., Orange, D. I., Lemos-Espinal, J., Riddle, B. R., & Jaeger, J. R. (2007). Mitochondrial introgression and incomplete lineage sorting through space and time: phylogenetics of crotaphytid lizards. Evolution, 61, 2879–2897. doi: 10.1111/j.1558-5646.2007.00239.x

Miller, M. A., Pfeiffer, W., & Schwartz, T. (2010). Creating the CIPRES Science Gateway for inference of large phylogenetic trees. Proceedings of the Gateway Computing Environments Workshop (GCE), 14 Nov. 2010, New Orleans, LA, pp. 1–8.

Paradis, E., Claude, J., & Strimmer, K. (2004). APE: analyses of phylogenetics and evolution in R language. Bioinformatics, 20, 289–290. doi: 10.1093/bioinformatics/btg412

Payseur, B. A., & Rieseberg, L. H. (2016). A genomic perspective on hybridization and speciation. Molecular Ecology, 25, 2337–2360. doi: 10.1111/mec.13557

Pease, J. B., & Hahn, M. W. (2015). Detection and polarization of introgression in a five-taxon phylogeny. Systematic Biology, 64, 651–662. doi: 10.1093/sysbio/syv023

Peterson, B. K., Weber, J. N., Kay, E. H., Fisher, H. S., & Hoekstra, H. E. (2012). Double digest RADseq: an inexpensive method for de novo SNP discovery and genotyping in model and non-model species. PLoS ONE, 7, e37135. doi: 10.1371/journal.pone.0037135

Pickrell, J. K., & Pritchard, J. K. (2012). Inference of population splits and mixtures from genome-wide allele frequency data. PLoS Genetics, 8, e1002967. doi: 10.1371/journal.pgen.1002967

Puritz, J. B., Hollenbeck, C. M., & Gold, J. R. (2014). dDocent: a RADseq, variant-calling pipeline designed for population genomics of non-model organisms. PeerJ, 2, e431. doi: 10.7717/peerj.431

R Core Team. (2017). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.

Rambaut, A., Suchard, M. A., Xie, D., & Drummond, A. J. (2014). Tracer v1.6. Available from http://tree.bio.ed.ac.uk/software/tracer/

Reich, D., Thangaraj, K., Patterson, N., Price, A. L., & Singh, L. (2009). Reconstructing Indian population history. Nature, 461, 489–494.

Revell, L. J. (2012). phytools: An R package for phylogenetic comparative biology (and other things). Methods in Ecology and Evolution, 3, 217–223. doi: 10.1111/j.2041-210X.2011.00169.x

Rohland, N., & Reich, D. (2012). Cost-effective, high-throughput DNA sequencing libraries for multiplexed target capture. Genome Research, 22, 939–946. doi: 10.1101/gr.128124.111

Sanderson, M. J. (2002). Estimating absolute rates of molecular evolution and divergence times: a penalized likelihood approach. Molecular Biology and Evolution, 19, 101–109. doi: 10.1093/oxfordjournals.molbev.a003974

Sarver, B. A. J., Keeble, S., Cosart, T., Tucker, P. K., Dean, M. D., & Good, J. M. (2017). Phylogenomic insights into mouse evolution using a pseudoreference approach. Genome Biology and Evolution, 9, 726–739. doi: 10.1093/gbe/evx034

Schumer, M., Cui, R., Powell, D. L., Rosenthal, G. G., & Andolfatto, P. (2016). Ancient hybridization and genomic stabilization in a swordtail fish. Molecular Ecology, 25, 2661–2679. doi: 10.1111/mec.13602

Schwenk, K., Brede, N., & Streit, B. (2008). Extent, processes and evolutionary impact of interspecific hybridization in animals. Philosophical Transactions of the Royal Society of London B: Biological Sciences, 363, 2805–2811. doi: 10.1098/rstb.2008.0055

Sites, J. W. Jr., Davis, S. K., Hutchinson, D. W., Maurer, B. A., & Lara, G. (1993). Parapatric hybridization between chromosome races of the Sceloporus grammicus complex (Phrynosomatidae): structure of the Tulancingo transect. Copeia, 1993, 373–398. doi: 10.2307/1447137

Sites, J. W. Jr., Barton, N. H., & Reed, K. M. (1995). The genetic structure of a hybrid zone between two chromosome races of the Sceloporus grammicus complex (Sauria, Phrynosomatidae) in central Mexico. Evolution, 49, 9–36. doi: 10.1111/j.1558-5646.1995.tb05955.x

Smith, S. A., & O’Meara, B. C. (2012). treePL: divergence time estimation using penalized likelihood for large phylogenies. Bioinformatics, 28, 2689–2690. doi: 10.1093/bioinformatics/bts492

Solís-Lemus, C., & Ané, C. (2016). Inferring phylogenetic networks with maximum pseudolikelihood under incomplete lineage sorting. PLoS Genetics, 12, e1005896. doi: 10.1371/journal.pgen.1005896

Than, C., Ruths, D., & Nakhleh, L. (2008). PhyloNet: a software package for analyzing and reconstructing reticulate evolutionary relationships. BMC Bioinformatics, 9, 322. doi: 10.1186/1471-2105-9-322

Tange, O. (2011). GNU Parallel - The Command-Line Power Tool. ;login: The USENIX Magazine, February 2011, 42–47.

Thompson, J. D., Higgins, D. G., & Gibson, T. J. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research, 22, 4673–4680. doi: 10.1093/nar/22.22.4673

Twyford, A. D., & Ennos, R. A. (2012). Next-generation hybridization and introgression. Heredity, 108, 179–189. doi: 10.1038/hdy.2011.68

Uetz, P., Freed, P., & Hošek, J. (eds.). “The Reptile Database”. http://www.reptile-database.org, accessed October 2nd, 2017.

Wiens, J. J., Kuczynski, C. A., Arif, S., & Reeder, T. W. (2010). Phylogenetic relationships of phrynosomatid lizards based on nuclear and mitochondrial data, and a revised phylogeny for Sceloporus. Molecular Phylogenetics and Evolution, 54, 150–161. doi: 10.1016/j.ympev.2009.09.008

Wiens, J. J., & Penkrot, T. L. (2002). Delimiting species based on DNA and morphological variation and discordant species limits in spiny lizards (Sceloporus). Systematic Biology, 51, 69–91. doi: 10.1080/106351502753475880

Wiens, J. J., Kozak, K. H., & Silva, N. (2013). Diversity and niche evolution along aridity gradients in North American lizards (Phrynosomatidae). Evolution, 67, 1715–1728. doi: 10.1111/evo.12053

Wiens, J. J., Reeder, T. W., & Nieto Montes de Oca, A. (1999). Molecular phylogenetics and evolution of sexual dichromatism among populations of the Yarrow’s spiny lizard (Sceloporus jarrovii). Evolution, 53, 1884–1897. doi: 10.2307/2640448

Zink, R. M., & Barrowclough, G. (2008). Mitochondrial DNA under siege in avian phylogeography. Molecular Ecology, 17, 2107–2121. doi: 10.1111/j.1365-294X.2008.03737.x

FIGURE AND TABLE LEGENDS

FIGURE 1. A. Adult male S. ornatus. B. Representative habitat of S. ornatus: rocky slopes with Chihuahuan desert-scrub vegetation. C. A median-joining haplotype network of ND4 sequences for S. ornatus (in blue and gray), S. oberon “black” (in black) and S. oberon “red” (in red) made using PopART (Leigh & Bryant, 2015). The number of samples for each haplotype is indicated by the circle size (see scale in top-left). D. Map of the contact zone between S. oberon and S. ornatus in northeastern Mexico. Black circles indicate sampled localities of oberon-black, and red circles indicate sampled localities of oberon-red. Localities for S. ornatus indicate the proportion of putatively introgressed (gray) and putatively native (blue) ND4 haplotypes, as in the haplotype network. The map is a composite of digital elevation models and satellite imagery made in QGIS v2.18 (http://qgis.osgeo.org/). Landsat 7 imagery courtesy of the U.S. Geological Survey. Note that this map encompasses the entire range of S. oberon, but only the eastern portion of the range of S. ornatus. E. Representative habitat of S. oberon “black”: high-elevation pine-oak forest. F. Adult male S. oberon “black”. G. Adult male S. oberon “red”. H. Inset map and legend for the map in panel D. Photo A by John Wiens, photos B and E by Anthony Baniaga, photos F and G by Shea Lambert.

FIGURE 2. Time-calibrated trees inferred from the ddRAD data (clade-wide dataset, reduced-individual sampling) using RAxML and treePL (above), and from the mtDNA gene ND4 using BEAST2 (below). Blue shading indicates groups of samples assigned to S. ornatus, gray shading indicates groups of samples assigned to S. oberon “black”, and red shading indicates groups of samples assigned to S. oberon “red”. For the ND4 tree, support values indicated on nodes are posterior probabilities, and for the ddRAD tree, support values are the percentage of bootstrap replicates supporting that bipartition. For both trees, support values of less than 50 / 0.5 are not shown, and asterisks indicate support values of 100 / 1.

FIGURE 3. Visualization of the proportions of positive tests for introgression over phylogenetic space, using the targeted dataset with reduced-individual sampling. The tree is based on the clade-wide dataset (Figure 2). Pies above nodes indicate results when that node was the most recent common ancestor of taxa P3 and P4 (the older pair of taxa); pies below nodes indicate results when that node was the most recent common ancestor of taxa P1 and P2 (the younger pair of taxa). Tests using only representatives of S. oberon “red” are not included here. Gray indicates no introgression, blue indicates ancestral introgression involving two S. ornatus and one S. oberon, red indicates introgression from S. ornatus into S. oberon, orange indicates ancestral introgression involving two S. oberon and one S. ornatus, yellow indicates introgression from S. oberon into S. ornatus, and green represents ancestral introgression involving two S. oberon and one S. cyanostictus. Batch identities are indicated at the end of sample names (b1 = batch 1, b2 = batch 2). Prefixes indicate the species and locality for each sample, as seen in Figure 4 and Table S2.

FIGURE 4. Visualization of the proportions of positive tests for introgression over geographic space, using the targeted dataset with reduced-individual sampling. Localities of oberon-red are in red, localities of oberon-black are in black, and localities of S. ornatus are in blue. For simplicity, we do not include tests that involved only oberon-black or oberon-red representatives in this visualization. Furthermore, we only consider tests where P1 and P2 were drawn from S. ornatus, and combine test results for individual localities regardless of whether the locality was used for P1 or P2. Red indicates ancestral introgression involving two S. ornatus and one S. oberon, pink indicates introgression from S. ornatus into S. oberon, and blue indicates no introgression.

FIGURE 5. Stacked bar plots displaying test results for the targeted dataset with reduced-individual sampling, comparing tests with each possible “batch signature”, defined simply as the batch identity of samples P1, P2, P3, P4 in that order (e.g., “1111” indicates that all four taxa belong to batch 1). Test results are color coded. For example, “P4 → P2” represents intergroup introgression from taxon P4 into taxon P2.

FIGURE 6. Comparison of the ratio of singleton counts in taxa P3:P4 for tests including (gray) or excluding (black) singleton counts, for the targeted dataset with reduced-individual sampling. Dashed lines are drawn at values of 1.25 and 0.75 for the P3:P4 ratio, as tests with count ratios that exceed these bounds are considered to be potentially problematic (i.e., DFOIL prints a warning). Bar plots are grouped by test results, indicated on the x-axis. For example, “123” represents ancestral introgression involving P1 / P2 and P3, and “13” indicates intergroup introgression from P1 into P3. We note that count ratios for introgression results of “23” are not compared here, as there were such results when singleton counts were included.
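The ratio check described in the Figure 6 legend can be illustrated with a minimal Python sketch. This is not taken from DFOIL or from the scripts accompanying this study: the function names are hypothetical, and the use of strict inequalities at the 0.75 and 1.25 bounds is an assumption, since the legend does not state how values exactly on a bound are treated.

```python
# Bounds taken from the Figure 6 legend (dashed lines at 0.75 and 1.25).
LOWER_BOUND = 0.75
UPPER_BOUND = 1.25


def p3_p4_ratio(p3_singletons: int, p4_singletons: int) -> float:
    """Return the P3:P4 singleton-count ratio (hypothetical helper)."""
    if p4_singletons == 0:
        raise ValueError("P4 singleton count is zero; the ratio is undefined")
    return p3_singletons / p4_singletons


def is_potentially_problematic(p3_singletons: int, p4_singletons: int) -> bool:
    """Flag a test whose P3:P4 ratio falls outside the 0.75-1.25 band.

    Assumption: values exactly on a bound are not flagged.
    """
    ratio = p3_p4_ratio(p3_singletons, p4_singletons)
    return ratio < LOWER_BOUND or ratio > UPPER_BOUND
```

Under these assumptions, a test with 130 singleton counts in P3 and 100 in P4 (ratio 1.3) would be flagged, whereas one with 110 and 100 (ratio 1.1) would not.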

TABLE 1 Coarse-scale summaries of the proportions of positive tests for introgression using DFOIL, for each of the four primary datasets. “Ancestral” signatures involve the ancestor of taxa P1 and P2, and one of either P3 or P4, and do not have directionality inferred. Inter-group signatures involve any two non-sister terminal taxa, with directionality inferred.

Dataset                Number of   Proportion of    Proportion of   Proportion of
                       tests       positive tests   “ancestral”     “inter-group”
                                                    signatures      signatures
Targeted, Full         176,400     0.217            0.204           0.012
Targeted, Reduced      21,658      0.212            0.192           0.019
Clade-wide, Full       176,400     0.004            0.004           0
Clade-wide, Reduced    21,658      0.007            0.007           0

TABLE 2 Raw counts of DFOIL signatures recovered for each of the four primary datasets. Signatures are indicated by the taxa involved (e.g., “4” = P4) with directionality indicated by the arrow; “12” indicates the ancestor of P1 and P2. The DFOIL signature (sensu Pease & Hahn, 2015) is indicated in parentheses, and corresponds to the results of each of the four DFOIL components (DFO, DIL, DFI, and DOL, respectively; + indicates significantly positive, - indicates significantly negative, and 0 indicates not significantly different than zero).

Signatures        Targeted,   Targeted,   Clade-wide,   Clade-wide,
                  Full        Reduced     Full          Reduced
12 ↔ 3 (++00)     34,625      237         664           158
12 ↔ 4 (--00)     1,411       3,929       0             0
1 → 3 (+++0)      1,266       10          0             0
1 → 4 (--0+)      6           217         0             0
2 → 3 (++-0)      885         0           0             0
2 → 4 (--0-)      2           187         0             0
3 → 1 (+0++)      20          1           0             0
3 → 2 (0+--)      7           2           0             0
4 → 1 (-0++)      1           3           0             0
4 → 2 (0---)      2           0           0             0
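The correspondence between DFOIL component signs and inferred signatures in Table 2 can be written as a simple lookup. The following Python sketch is illustrative only and is not part of DFOIL or of the scripts used in this study: the dictionary is transcribed from Table 2, while the function names and the labels “none” (for “0000”) and “unlisted” (for any sign pattern not shown in Table 2) are hypothetical.

```python
# Sign patterns transcribed from Table 2; component order is DFO, DIL, DFI, DOL.
SIGNATURES = {
    "++00": "12 <-> 3",
    "--00": "12 <-> 4",
    "+++0": "1 -> 3",
    "--0+": "1 -> 4",
    "++-0": "2 -> 3",
    "--0-": "2 -> 4",
    "+0++": "3 -> 1",
    "0+--": "3 -> 2",
    "-0++": "4 -> 1",
    "0---": "4 -> 2",
}


def interpret(dfo: str, dil: str, dfi: str, dol: str) -> str:
    """Map the four component results ('+', '-' or '0') to a Table 2 label."""
    components = (dfo, dil, dfi, dol)
    if any(c not in ("+", "-", "0") for c in components):
        raise ValueError("each component must be '+', '-' or '0'")
    pattern = "".join(components)
    if pattern == "0000":
        return "none"  # no component differs significantly from zero
    # Hypothetical label for sign patterns that Table 2 does not list.
    return SIGNATURES.get(pattern, "unlisted")


def is_ancestral(label: str) -> bool:
    """True for signatures involving the ancestor of P1 and P2 ('12')."""
    return label.startswith("12")
```

For example, `interpret("+", "+", "+", "0")` returns `"1 -> 3"`, an inter-group signature, whereas `interpret("+", "+", "0", "0")` returns `"12 <-> 3"`, an “ancestral” signature without inferred directionality.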

TABLE 3 Proportions of positive tests for introgression recovered by DFOIL, collated by the species involved, for each primary dataset.

Dataset                S. ornatus /    S. oberon /     S. ornatus     S. oberon      other
                       S. ornatus      S. oberon       → S. oberon    → S. ornatus
                       ↔ S. oberon     ↔ S. ornatus
Targeted, Full         0.188           0.012           0.012          < 0.001        0.004
Targeted, Reduced      0.177           0.012           0.019          < 0.001        0.003
Clade-wide, Full       0.004           0               0              0              0
Clade-wide, Reduced    0.007           0               0              0              0

TABLE 4 Median, range, and standard deviation of the number of non-singleton counts for the targeted and clade-wide datasets. These values are compared for all tests, tests that recovered no introgression, and tests that did infer introgression.

Dataset                Tests considered    Median of       Range of        Standard deviation
                                           non-singleton   non-singleton   of non-singleton
                                           counts          counts          counts
Targeted, Reduced      All tests           1026            463–2802        295
                       No introgression    1004            462–2802        293
                       Introgression       1252            577–2657        269

Clade-wide, Reduced    All tests           197             111–365         41
                       No introgression    197             111–365         41
                       Introgression       234             167–281         20

Targeted, Full         All tests           894             267–2972        251
                       No introgression    882             267–2972        254
                       Introgression       952             321–2657        228

Clade-wide, Full       All tests           200             76–419          44
                       No introgression    200             76–419          44
                       Introgression       228             147–281         22

TABLE 5 Proportions of positive DFOIL tests for each dataset, comparing tests using only batch 1 samples, tests using only batch 2 samples, and tests using a mix of samples from batches 1 and 2.

Dataset                Batch identity   Number of   Proportion of    Proportion of   Proportion of
                                        tests       positive tests   “ancestral”     “inter-group”
                                                                     signatures      signatures
Targeted, Reduced      Batch 1 only     2,340       0.186            0.182           0.004
                       Batch 2 only     315         0.416            0.387           0.029
                       Mixed            19,003      0.212            0.19            0.002

Targeted, Full         Batch 1 only     26,999      0.214            0.211           0.004
                       Batch 2 only     1,672       0.278            0.266           0.011
                       Mixed            147,729     0.216            0.202           0.014

Clade-wide, Reduced    Batch 1 only     2,340       0.008            0.008           0
                       Batch 2 only     315         0.003            0.003           0
                       Mixed            19,003      0.007            0.007           0

Clade-wide, Full       Batch 1 only     26,999      0.006            0.006           0
                       Batch 2 only     1,672       0.001            0.001           0
                       Mixed            147,729     0.003            0.003           0
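The “batch signature” of Figure 5 and the batch categories of Table 5 can both be derived from sample names, which end in “b1” or “b2”. The Python sketch below is a hypothetical illustration rather than the code used in this study: the function names are invented, and the quartet of sample names in the usage note is chosen only to demonstrate the naming convention, not to represent a test that was actually run.

```python
import re

# Sample names end in "b1" or "b2" (b1 = batch 1, b2 = batch 2).
_BATCH_SUFFIX = re.compile(r"b([12])$")


def batch_of(sample_name: str) -> str:
    """Return '1' or '2' from the batch suffix of a sample name."""
    match = _BATCH_SUFFIX.search(sample_name)
    if match is None:
        raise ValueError(f"no batch suffix in sample name: {sample_name!r}")
    return match.group(1)


def batch_signature(p1: str, p2: str, p3: str, p4: str) -> str:
    """Concatenate the batch identities of P1, P2, P3, P4 in that order."""
    return "".join(batch_of(name) for name in (p1, p2, p3, p4))


def batch_category(signature: str) -> str:
    """Collapse a batch signature into the three categories of Table 5."""
    if signature == "1111":
        return "Batch 1 only"
    if signature == "2222":
        return "Batch 2 only"
    return "Mixed"
```

For instance, the names ornatus5_NM173b1, ornatus2_JJW672b2, oberon10_JJW684b1, and oberon2_JJW541b1, taken in the order P1 to P4, give the signature “1211”, which falls in the “Mixed” category.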

TABLE 6 Proportions of positive DFOIL tests for each dataset, comparing tests with singleton counts excluded against tests with singleton counts included. Unlike Tables 1–4, tests that sampled representatives of S. oberon from only oberon-red localities are included here, allowing the large effect of singleton counts on these tests to be seen.

Dataset                Singleton   Proportion of    Proportion   Proportion    Proportion of positive
                       counts      positive tests   ancestral    inter-group   results for tests
                                                                               without oberon-black
Targeted, Reduced      included    0.442            0.361        0.081         0.429
                       excluded    0.145            0.132        0.013         0.010

Targeted, Full         included    0.407            0.343        0.064         0.408
                       excluded    0.163            0.154        0.009         0.009

Clade-wide, Full       included    0.091            0.086        0.004         0.102
                       excluded    0.003            0.003        0             0

Clade-wide, Reduced    included    0.089            0.087        0.003         0.115
                       excluded    0.005            0.005        0             0

DATA ACCESSIBILITY

All alignments and tree files are available in the Dryad Digital Repository (available upon acceptance) and at https://www.github.com/SheaML/ExDFOIL. Scripts and example files used to select taxa for DFOIL, execute DFOIL in parallel, collate results, calculate summary statistics, and visualize results are found in the Dryad Digital Repository (available upon acceptance) and at https://www.github.com/SheaML/ExDFOIL. GenBank accession numbers for all previously published ND4 sequences are found in Table S1 (accession numbers for newly generated sequences will be made available upon acceptance). Raw sequence reads and quality scores for all samples used will be made available on the NCBI Sequence Read Archive upon acceptance.

AUTHOR CONTRIBUTIONS

SML designed the study and conducted fieldwork, laboratory work, bioinformatic processing, and downstream analyses. JWS conducted laboratory work and provided technical supervision in the lab. MCF-R provided technical supervision in the lab. UOG, ANMO, NM, and JJW conducted fieldwork and provided tissue samples. SML wrote the manuscript and created figures. All authors contributed to editing the manuscript.

Supporting Information

Appendix S1. Description of ddRAD library preparation and sequencing, variant filtering, haplotyping, error-rate estimation, mitigation of batch effects, and removal of discordant phylogenetic signal related to introgression.

Supplemental File S1. Custom R function for selecting taxa from a list that meet the assumptions of DFOIL, given a phylogenetic hypothesis containing all of the taxa.

Supplemental File S2. Meta-data file for Supplemental Tables S3–S10.

Figure S1. Chronogram made using the clade-wide alignment with full individual sampling. Rapid bootstrap support values from RAxML are provided as node labels.

Figure S2. Maximum-likelihood tree obtained using RAxML and the clade-wide dataset with reduced individual sampling, before removing two populations of S. minor (“minor6” and “minor7”) with a potential history of introgression with S. cyanogenys. Rapid bootstrap support values are provided as node labels.

Figure S3. Maximum-likelihood tree obtained using RAxML and the clade-wide dataset with reduced individual sampling, after removing all samples from localities “minor6” and “minor7”. Rapid bootstrap support values are provided as node labels.

Figure S4. Maximum-likelihood tree obtained using RAxML and the clade-wide dataset with reduced individual sampling, after removing the sample of S. cyanogenys from within the range of S. oberon in Nuevo Leon, JJW593. Rapid bootstrap support values are provided as node labels.

Figure S5. Visualization of introgression results over geographic space, for the targeted dataset with reduced individual sampling, and only batch 2 samples considered. For simplicity, we do not include tests that involved only S. oberon “black” or S. oberon “red” representatives in this visualization. Furthermore, we only consider tests where P1 and P2 were drawn from S. ornatus, and combine test results for individual localities regardless of whether the locality was used for P1 or P2. Red indicates ancestral introgression involving two S. ornatus and one S. oberon, pink indicates introgression from S. ornatus into S. oberon, and blue indicates no introgression. As in Figure 4, a geographic pattern is apparent, where western-most populations of S. ornatus recover introgression less often.

Table S1. GenBank accession numbers for all previously published and newly generated ND4 sequences (Excel spreadsheet).

Table S2. ddRAD sample information and read counts after quality filtering with dDocent (Excel spreadsheet).

Table S3. Full results of targeted, reduced-individual sampling DFOIL tests, with singleton counts excluded (.csv file).

Table S4. Full results of targeted, full-individual sampling DFOIL tests, with singleton counts excluded (.csv file).

Table S5. Full results of clade-wide, reduced-individual sampling DFOIL tests, with singleton counts excluded (.csv file).

Table S6. Full results of clade-wide, full-individual sampling DFOIL tests, with singleton counts excluded (.csv file).

Table S7. Full results of targeted, reduced-individual sampling DFOIL tests, with singleton counts included (.csv file).

Table S8. Full results of targeted, full-individual sampling DFOIL tests, with singleton counts included (.csv file).

Table S9. Full results of clade-wide, reduced-individual sampling DFOIL tests, with singleton counts included (.csv file).

Table S10. Full results of clade-wide, full-individual sampling DFOIL tests, with singleton counts included (.csv file).

Table S11. Error rates for the clade-wide dataset, before any filtering steps (using the TotalRawSNPs.vcf file).

Table S12. Error rates for the clade-wide dataset, after filtering sites for a minimum quality of 30 (Phred33) and a minimum depth of 3 (Excel format spreadsheet).

Table S13. Error rates for the clade-wide dataset, after all filtering steps, except for the removal of batch effects, minor allele count filter, and haplotyping with rad_haplotyper.pl (Excel format spreadsheet).

Table S14. Error rates for the clade-wide dataset, after removal of batch effects and filtering of sites with minor allele counts less than three (Excel format spreadsheet).

Table S15. Error rates for the clade-wide dataset, after haplotyping and filtering of potential paralogs with rad_haplotyper.pl (Excel format spreadsheet).

Table S16. Error rates for the targeted dataset, after filtering sites for a minimum quality of 30 and a minimum depth of 3 (Excel format spreadsheet).

[Figure: tree with tip labels for S. ornatus and S. oberon samples (e.g., ornatus5_NM173b1, oberon10_JJW684b1); image not reproduced.]

[Figure 5 plot: “DFOIL results by batch signature (targeted dataset, reduced individual sampling)”. x-axis: Batch Signature (1111 through 2222); y-axis: Proportion (0.0 to 1.0). Legend: P1 / P2 <-> P3, P1 / P2 <-> P4, P1 -> P3, P1 -> P4, P2 -> P4, P3 -> P1, P3 -> P2, P4 -> P1, P4 -> P2, no introgression.]

[Figure 6 plot. y-axis: Ratio of P3:P4 Single Counts (0.0 to 1.5); x-axis categories: 123, 124, 13, 14, 24, 31, 32, 41, 42, none. Legend: With Singleton Counts (dfoil), No Singleton Counts (dfoil_alt).]