
Multispecies coalescent delimits structure, not species

Jeet Sukumaran a,1,2 and L. Lacey Knowles a,1

a Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109-1079

Edited by David M. Hillis, University of Texas at Austin, Austin, TX, and approved December 29, 2016 (received for review May 23, 2016)

PNAS | February 14, 2017 | vol. 114 | no. 7 | 1607–1612
www.pnas.org/cgi/doi/10.1073/pnas.1607921114

The multispecies coalescent model underlies many approaches used for species delimitation. In previous work assessing the performance of species delimitation under this model, speciation was treated as an instantaneous event rather than as an extended process involving distinct phases of speciation initiation (structuring) and completion. Here, we use data from simulations that explicitly model speciation as an extended process rather than an instantaneous event and carry out species delimitation inference on these data under the multispecies coalescent. We show that the multispecies coalescent diagnoses genetic structure, not species, and that it does not statistically distinguish structure associated with population isolation vs. species boundaries. Because of the misidentification of population structure as putative species, our work raises questions about the practice of genome-based species discovery, with cascading consequences in other fields. Specifically, all fields that rely on species as units of analysis, from conservation biology to studies of macroevolutionary dynamics, will be impacted by inflated estimates of the number of species, especially as genomic resources provide unprecedented power for detecting increasingly finer-scaled genetic structure under the multispecies coalescent. As such, our work also represents a general call for systematic study to reconsider a reliance on genomic data alone. Until new methods are developed that can discriminate between structure due to population-level processes and that due to species boundaries, genomic-based results should only be considered a hypothesis that requires validation of delimited species with multiple data types, such as phenotypic and ecological information.

multispecies coalescent | species delimitation | coalescent theory

Significance

Despite its widespread application to the species delimitation problem, our study demonstrates that what the multispecies coalescent actually delimits is structure. The current implementations of species delimitation under the multispecies coalescent do not provide any way for distinguishing between structure due to population-level processes and that due to species boundaries. The overinflation of species due to the misidentification of general genetic structure for species boundaries has profound implications for our understanding of the generation and dynamics of biodiversity, because any ecological or evolutionary studies that rely on species as their fundamental units will be impacted, as well as the very existence of this biodiversity, because conservation planning is undermined due to isolated populations incorrectly being treated as distinct species.

Author contributions: J.S. and L.L.K. designed research, performed research, analyzed data, and wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
1 J.S. and L.L.K. contributed equally to this work.
2 To whom correspondence should be addressed. Email: [email protected].

Major advances in understanding the patterns of biodiversity and the processes that generate those patterns are being made with increasing rapidity, scope, and scale, as we study species across the tree of life (e.g., refs. 1–3). These advances take the boundaries of species as known. Identification of the fundamental units that constitute species and their boundaries—that is, species delimitation—currently is making increasing use of genomic data under statistical model-based approaches, typically using the multispecies coalescent model (4), as opposed to traditional taxonomic work based on phenotypic data. Coupled with technological advances for collecting genomic datasets across many individuals per putative species, the multispecies coalescent model provides us with considerable power in identifying recently diverged taxon boundaries. For example, simulations show that we can delimit lineages with extremely short divergence times, as with the popular program BPP (Bayesian Phylogenetics and Phylogeography) (5). However, these remarkable gains have also paradoxically given rise to a new challenge—determining whether what we delimited as genetically isolated lineages represents within-species population structure or structure due to speciation. Population genetic structure is ubiquitous among all taxa in which gene flow is reduced by geographic and/or environmental barriers (6). However, the extent of population structure varies as the level of gene flow among populations results in more or less separation and evolutionary independence (7). With more loci, there is the possibility of detecting ever-more-fine population structure and potentially confounding population structure with species boundaries. As a consequence, the increased resolution of genomic data makes it possible not only to detect divergent species lineages, but also local population structure within them—that is, a fractal hierarchy of divergences.

Misidentification of population structure as putative species is therefore emerging as a key issue (8) that has received insufficient attention, especially with respect to methodologies for delimiting taxa based on genetic data alone. Because species delimitation is inextricably linked to patterns of species diversity, the models used to delimit species are not just limited to issues about species boundaries, but are also paramount to understanding the generation and dynamics of biodiversity (9–12). Consequently, when the lines become blurred—delimiting populations in some cases, but species in other cases—species delimitation becomes a critical issue with ramifications across multiple fields of study, such as, for example, macroecological studies (13–15), the estimation of diversity in analyses of food webs (16), or conservation, where oversplitting of small, isolated populations based on genetic data could be detrimental (17). More fundamentally, distinguishing between species- and population-level lineages is central to understanding the speciation process itself (18, 19).

Not all populations become species. Instead, speciation theory itself points to a continuum for the probability that a population lineage will evolve into a new species (20). Depending on the extent and duration of population isolation and the strength of selection, speciation becomes a more or less protracted process, with new lineages only gradually and stochastically evolving from initially isolated lineages into true species over time (18, 19, 21). This is in contrast to commonly applied birth–death models, in which speciation is abstracted as an instantaneous event, such that all divergent lineages are treated as immediately forming true species (22). Here, we investigate the robustness of species delimitation under the multispecies coalescent model when speciation

[Fig. 1 panel labels: "Splitting events such as this are initiation of speciation through, e.g., population isolation." | "Color change indicates completion of speciation and development of true species from incipient species." | "True speciation history" | "Tree inferred under the multispecies coalescent"]

Fig. 1. (Left) The tree represents the true history upon which the gene genealogies (shown by thin purple lines) are conditioned, with the colors representing species. Note that here we show a single representative gene genealogy, with four individuals sampled per lineage, whereas in practice multiple gene genealogies are sampled for delimiting species. This tree shows speciation as a process rather than speciation as an event. That is, the internal nodes of the containing tree do not represent instantaneous complete speciation events, but, rather, initiation of the speciation process due to microevolutionary processes that result in population isolation. Not all of the lineages that arise due to population isolation develop into true species. For example, some may merge back into the other lineages of the same species in the future if whatever barriers led to their isolation were to be removed (i.e., the evolutionary independence will be ephemeral because two populations belonging to the same species do not have an impediment to reproduction). Some of these isolated incipient species lineages, however, do stochastically develop into true species (indicated by a shift in color), so that they will remain distinct lineages with independent evolutionary trajectories that do not merge back into their parental species, even if the isolation barriers were to disappear. (Upper Right) The tree shows the results of inference under the multispecies coalescent using BPP, which includes the structuring due both to species boundaries and to lineage splitting as a result of population isolation. As such, the inference corresponds to the full structural history, but not the true speciation history, which is shown by the tree in Lower Right. That is, the multispecies coalescent does not distinguish between structuring due to population isolation vs. structuring due to speciation: It only identifies genetic structure.

is considered as an extended process rather than an instantaneous event. Specifically, we simulate genetic data under a protracted speciation model (21), such that the degree of genetic structure accumulating among lineages differs (7), depending on the duration of speciation (21), and investigate the accuracy of inferences from the popular model-based species delimitation program BPP (4, 23, 24). Unlike birth–death models, which treat all lineages equivalently, under the protracted speciation model, population lineages are distinguished from species lineages (Fig. 1). In other words, a lineage-splitting event does not correspond to an instantaneous speciation event, but, rather, an incipient speciation representing, for example, the initial isolation of a new population. This incipient species can go extinct or merge back into the parent population at a particular rate, or, alternatively, if remaining isolated for a long enough time, converts or develops into a true species. The phylogeny generated by the protracted speciation process, therefore, consists of a mosaic of population-level as well as species-level lineages (Fig. 1), representing the dynamic and chimeric nature of real-world speciation (7, 20, 25). For example, the opportunities for geographic isolation and/or rates of reproductive isolation differ among taxa (18, 19, 21). A large number of species might be generated under a short period either due to high rates of completion of speciation (e.g., the evolution of reproductive isolation; ref. 25) or many opportunities for initiation of species (e.g., many relatively isolated populations across a landscape that contribute to the initiation of speciation; ref. 26). However, depending on which of these two parameters predominate, the potential for inflating estimates of species diversity through the delimitation of populations may differ substantially among clades.

The objective here is to assess this potential degree of bias for inferred patterns of species diversity with the multispecies coalescent model to diagnose species given that (i) the speciation process is not instantaneous (18–21, 25, 26), and (ii) genetic structure is a chimeric pattern resulting from isolation due to both population divergence as well as species boundaries because of the extended nature of the speciation process (18, 19, 21, 26). Note that our study is not restricted to any particular species concept. Instead, it is based simply on the basic principle that there is a distinction between populations and species, which is recognized both empirically and theoretically (7). Different researchers need not agree on how the distinction between populations vs. species might be operationalized or how divergence might proceed (i.e., we model divergence in the absence of gene flow, but other demographic models might be applied; for a model of divergence with gene flow, see ref. 27). Unless the investigator explicitly assumes that all and any genetic structure of any kind is attributable to species boundaries, then distinguishing between these two sources of structure is fundamental to the species-delimitation analysis. Addressing this knowledge gap—can the multispecies coalescent distinguish population-level structure and species boundaries—is critical, given the advocacy for delimiting taxa under the multispecies coalescent (28).
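The protracted speciation process described above (splitting initiates incipient lineages, which may go extinct or convert to true species) can be illustrated with a minimal event-driven simulation. This is only a sketch under stated assumptions, not the DendroPy implementation used in the study: the function name `simulate_protracted` is hypothetical, a common initiation and extinction rate is applied to both lineage types, and the code merely counts lineages in each state without tracking tree topology or which incipient lineage belongs to which parent species.

```python
import random


def simulate_protracted(initiation_rate, extinction_rate, conversion_rate,
                        duration, rng):
    """Count (true, incipient) lineages alive after `duration` time units.

    Simplified Gillespie-style sketch (an assumption of this example):
    every lineage splits off a new incipient lineage at `initiation_rate`,
    every lineage goes extinct at `extinction_rate`, and every incipient
    lineage converts to a true species at `conversion_rate`.
    """
    n_true, n_incipient = 1, 0  # start from a single true species
    t = 0.0
    while n_true + n_incipient > 0:
        n_total = n_true + n_incipient
        rate_split = initiation_rate * n_total
        rate_extinct = extinction_rate * n_total
        rate_convert = conversion_rate * n_incipient
        total_rate = rate_split + rate_extinct + rate_convert
        if total_rate == 0.0:
            break  # nothing can happen any more
        t += rng.expovariate(total_rate)  # waiting time to the next event
        if t > duration:
            break
        u = rng.random() * total_rate  # choose which event occurs
        if u < rate_split:
            n_incipient += 1  # splitting only initiates speciation
        elif u < rate_split + rate_extinct:
            # the extinct lineage is chosen uniformly among all lineages
            if rng.random() * n_total < n_true:
                n_true -= 1
            else:
                n_incipient -= 1
        else:
            n_incipient -= 1  # completion of speciation
            n_true += 1
    return n_true, n_incipient


if __name__ == "__main__":
    rng = random.Random(1)
    for c in (0.001, 0.1, 1.0, 10.0, 1000.0):
        runs = [simulate_protracted(0.5, 0.0, c, 5.0, rng) for _ in range(200)]
        mean_true = sum(t for t, _ in runs) / len(runs)
        mean_lineages = sum(t + i for t, i in runs) / len(runs)
        print(f"c={c}: mean true species {mean_true:.2f}, "
              f"mean lineages {mean_lineages:.2f}")
```

Running the script prints, for each conversion rate, the mean number of true species next to the mean number of lineages, which gives a rough feel for how a low conversion rate leaves most structure at the population level.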

Results

Across all of the species conversion rates, BPP tends to overestimate the number of true species (Figs. 2 and 3). Moreover, at the lowest species conversion rates (0.001 and 0.1), the species delimited by BPP are essentially random with respect to the actual number and specific boundaries of species, averaging 5–13 times more estimated species than are actually present in the data (Table 1). It is also worth noting that there is a systematic bias in the inferred number of species (i.e., the errors are all positive: BPP never underestimates the number of true species and only overestimates them). Only at the highest species conversion rates (10 or 1,000) do the error rates become more moderate, but the inferred number of species is still almost double the actual number of species (i.e., three to six more species inferred than the five used in simulations; Table 1).

In contrast, if we focus our assessment on how much structure is recovered by BPP (i.e., the delimitation of lineages, regardless of whether they are true species or incipient divergences reflecting population structure), we see that BPP does very well (Fig. 2). Although BPP tends to generally underestimate the number of lineages, the error values are relatively low: rms errors (rmses) range from 1.76 to 3.17, or a factor of 0.15 to 0.20, across all the speciation conversion rates.

Our results show that, given the realities of speciation as an extended process, the phylogeny that conditions the coalescent structure is a mosaic of population-level as well as species-level lineages (Fig. 1; e.g., ref. 29 vs. ref. 30), and the number of true species associated with a phylogeny will be much lower than the number of lineages. The degree of discrepancy between the number of true species and the number of lineages will vary as a function of the species conversion rate—that is, the rate at which isolated lineages develop into true species (Table 2).

Fig. 2. The performance of species delimitation under the multispecies coalescent when the data are generated under the protracted speciation model, from simulations run for a fixed duration of time (5 time units) under different species conversion rates. Speciation initiation rate was fixed at 0.5, while extinction rates were either 0.0 or 0.2 (the plot does not distinguish between these different extinction rates, because these had no meaningful effect on the main results or our argument). (A) Shown is the number of species per replicate inferred at a 0.95 probability vs. the number of true species on the input tree. Generally, across all conversion rates, BPP tends to overestimate the number of species. (B) Shown is the number of species per replicate vs. the number of lineages (i.e., both true species as well as lineages representing incipient species or population structure). However, what is striking is that BPP does not track true species, as seen in A, but, rather, tracks structure of any sort, whether incipient species or true species, as seen in B.

Discussion

The multispecies coalescent model diagnoses genetic structure and not species. Specifically, we show that the number of "species" identified by the multispecies coalescent actually reflects the genetic structure of the data, which includes both the population structure within species as well as the structure between species. Note that the focus of our argument is on the validity of species delimitation under the multispecies coalescent model in general, and not on the particular implementation of this model as exemplified by BPP. There are other programs that implement species delimitation under the multispecies coalescent, but they all use the same underlying model, and differences in results will be attributable to particularities of the implementation in terms of the accuracy and level of resolution of the genetic structure. Regardless of which program or implementation is used, as long as the implementation uses the multispecies coalescent, our central thesis holds: what is diagnosed is genetic structure, with no distinction between structure due to populations or structure due to species. What we present here are the conceptual, rather than statistical or computational, implications of using the multispecies coalescent model for species delimitation, and these implications are invariant with respect to the particular program or approach that implements inference under this model.

Likewise, we use the protracted speciation model (21) because it allows for the generation of data that include genetic structure due to both population isolation and species. Obviously, data might have been simulated under alternative demographies (e.g., divergence with gene flow; ref. 27). Irrespective of the divergence process used to generate the genetic data, the conclusions of the manuscript will not change radically—the multispecies coalescent does not distinguish between genetic structure due to population vs. species. It is possible that, under certain specific conditions, the erroneous inferences under the multispecies coalescent may be lower than those we document here. For


Fig. 3. The performance of species delimitation under the multispecies coalescent when the data are generated under the protracted speciation model, from simulations run until five true species (indicated with dashed line) were produced under different species conversion rates. Speciation initiation rate was fixed at 0.5, while extinction rates were either 0.0 or 0.2. (A) The number of inferred species at different speciation conversion rates is shown in the box plots. The multispecies coalescent tends to overestimate the number of species when conversion rates are small and there is a long lag between initiation and speciation, because it is diagnosing structure rather than species per se, and does not distinguish between structure due to incipient species or population processes and structure due to actual species. This is clearly shown in B, where the number of lineages (which include incipient or population-level lineages as well as true species lineages) is shown by the diagonal. The inference tends to track the number of lineages, rather than the number of true species (five; shown by the dashed horizontal line), and thus does not distinguish between populations and species.

example, the rate of gene flow between populations may be high enough so that the structure due to populations will not be detected under the multispecies coalescent, yet, conversely, gene flow between species may be low enough such that the structure due to species boundaries is strongly evident. However, in cases like this, the multispecies coalescent has not suddenly gained the ability to distinguish between genetic structure due to populations vs. structure due to species. Rather, this apparent discrimination is an artefactual one due to contriving the relative gene flow rates between populations so as to erode the signal due to population structure below the detection threshold of the multispecies coalescent, while at the same time maintaining the signal due to species structure above this threshold.

Consequently, if all the multispecies coalescent can identify is genetic structure, it is not a justifiable model for delimiting species, irrespective of which species concept an investigator might apply. Our results indicate that the only way that BPP in its current form can be validly used as a species delimitation approach is if external information is used to make the determination. This external information can be an a priori hypothesis or knowledge that all of the structure in the genetic data identified by BPP does indeed correspond to species rather than populations, or, equivalently, but perhaps much more unrealistically, that the rate of speciation completion is so rapid compared with the rate of population isolation that species instantaneously form. Alternatively, morphological, ecological, ethological, or other classes of data must be used to correctly attribute the elements of structure delimited by BPP to either species-level (31) or population-level processes.

Table 1. Sizes of species trees with different number of species generated under the protracted speciation process under different species conversion rates, c, the rate at which isolated lineages develop into true species

           Lineages              Species
c        Max   Min   Mean     Max   Min   Mean
0.001     47     3   17.35      2     1   1.01
0.1       38     4   13.29      7     1   1.83
1         36     3   11.43     12     1   3.75
10        39     3   10.43     17     1   4.87
1,000     39     3   10.67     19     1   5.46

At the lowest conversion rate (0.001, or 500 times slower than the speciation initiation rate), on average only one actual species was generated in each species phylogeny, even though the number of lineages ranged between 3 and 47. At the highest conversion rate (1,000 or 2,000 times faster than the speciation initiation rate), on average approximately five species were expected on phylogenies that ranged in size from 3 to 39 tips. Results here are based on a birth rate of 0.5 and extinction rates of 0.0 and 0.2 (see Materials and Methods for details).

Dense spatial sampling of individuals and genomes, which increasingly characterize many current studies, is leading more and more to datasets where the primary structuring is a mix between lineages that have undergone speciation as well as others that reflect spatial structuring of populations (32). We expect this to be particularly a problem in cases where species delimitation analysis is most needed—recent, rapid radiations, where the processes promoting divergence will result in both species and population structure. Even if it is recognized by researchers using programs that implement the multispecies coalescent for species delimitation that what is being delimited is not so much species but structure—which the researchers are assuming to be coincident with species boundaries—this distinction

may be lost when their results are used by other studies or research efforts (33). These efforts include not only other fields, such as ecology or evolutionary biology, but also more applied contexts, such as conservation development, reporting, management, and policy at either local or global levels.

When the results of species delimitation under the multispecies coalescent are used in secondary studies, the fact that the species identities follow from a postanalytical assumption of the original researchers that the genetic structure corresponds only to species boundaries and not populations, rather than a fundamental result of the analysis itself, is lost. Not only will the possible inflation of species numbers have consequences that go beyond the immediate reports, but an important source of error in the findings of these secondary studies may be overlooked by reviewers, readers, or other consumers of these secondary studies. As such, until we develop genomic-based species delimitation approaches that are able to discriminate between population- and species-level structuring, it is important for researchers not just to recognize, but to treat and report the units delimited under the multispecies coalescent at best as tentative hypotheses of species, to be confirmed or rejected through subsequent analysis or application of other data or information, rather than as true species as such.

Table 2. Performance of the multispecies coalescent model as implemented in BPP to delimit species under different species conversion rates, c

[Table 2 body: error statistics (Max, Min, Mean, RMSE; absolute and normalized) for c = 0.001, 0.1, 1, 10, and 1000; the cell values are garbled in the extracted text and are not reproduced here.]

"Absolute error" is the difference between the number of inferred species and the number of true species in the simulation. "Normalized error" is the absolute error divided by the number of true species simulated. rmse is the square root of the square of the normalized error.

Materials and Methods

Our approach consisted of the following steps:

1. Generation of trees with population-level (incipient species) and species-level (true species) lineages, under a model that explicitly models processes of speciation initiation and speciation completion.
2. Generation of gene genealogies conditioned by the structuring of the above trees under the multispecies coalescent, with multiple individuals per lineage.
3. Generation of sequence data alignments on the gene trees for each locus, for multiple independent loci.
4. Inference of species trees based on sequence data using a multispecies coalescent inference program.
5. Comparison of the inferred vs. actual number of species in the original species tree.

The simulation of the species trees was carried out by using the "ProtractedSpeciationProcess" class of DendroPy (34), which provides for sampling trees from the protracted speciation model. The protracted speciation model is described in detail in refs. 21, 35, and 36, and we refer the reader to those works for more information, because the focus of the current work is not on the protracted speciation model as such. Here, we use the protracted speciation model as a generative model that allows us to simulate speciation as an extended process rather than an event, with a lag between initial population isolation or divergence of a lineage from an ancestral species and its development into a true species. In the protracted speciation model terminology, a lineage on an isolated evolutionary track that has not yet developed into a true species is known as an "incipient species," whereas lineages that have developed into true species are known as "full" or "good" species (here we use the term "true species" for this concept). In contrast to the simple birth–death model, which simulates speciation as instantaneous events and has only two parameters, a birth rate and a death rate, the protracted speciation model has five parameters: the incipient species birth rate (the rate at which incipient species produce new incipient species), the true species birth rate (the rate at which true species produce new incipient species), the incipient species extinction rate (the rate at which incipient species lineages become extinct or merge back into their parent species), the true species extinction rate (the rate at which true species lineages go extinct), and the species conversion rate (the rate at which incipient species develop into true species).

In our study, we set both the incipient species birth rate as well as the true species birth rate to be equal (i.e., a common "species initiation rate"). Similarly, we set both the incipient species extinction rate as well as the true species extinction rate to be equal (i.e., a common "extinction rate"). Thus, our use of the protracted speciation model can be considered to be the conventional birth–death process, with the species initiation rate corresponding to the birth rate and the extinction rate corresponding to the death rate, to produce the trees, with lineages on the resulting tree transitioning to true species at the given conversion rate.

Two classes of simulation regimes were used: a "fixed duration" regime, where the simulations were run for a total of 5.0 time units (resulting in varying numbers of total lineages and true species), and the "fixed species number" regime, where the simulations were run until five true species were generated on each phylogeny (with varying numbers of total lineages per phylogeny).

For the fixed-duration regime, 20 phylogenies were simulated under each combination of the following parameters for 5.0 time units:

• Species initiation rate: 0.5
• Species extinction rate: 0.0, 0.2
• Species conversion rate: 0.001, 0.1, 1.0, 10.0, and 1,000.0.

For the fixed-species-number regime, 20 phylogenies consisting of five true species were simulated under each combination of the following parameters:

• Species initiation rate: 0.5
• Species extinction rate: 0.0, 0.2
• Species conversion rate: 0.1, 1.0, and 1,000.0.

For all subsequent stages of analysis, we use the phylogeny so produced (i.e., a mosaic consisting of both true species lineages, as well as the incipient species lineages) as the species tree.

Gene trees were simulated under the multispecies coalescent model by using the "ContainingTree" class of DendroPy (34). The phylogenies sampled from the protracted speciation process in the previous step were used as the containing tree, with haploid population sizes of each lineage fixed to 100,000, and 10 individual genes sampled from each lineage.

Sequence alignments were simulated on each gene tree produced in the previous step by using Seq-Gen. A total of 10 loci, each consisting of 1,000 characters, were simulated for each individual under the Jukes–Cantor model using two different mutation rates: 10^−6 and 10^−8 mutations per site per unit of time.

The set of alignments generated for each species tree were passed to BPP along with the corresponding (true) species tree as source data. BPP was set to use the species tree as a guide tree (i.e., mode "10"); consequently, any errors in the delimitation analyses were not due to errors in upstream species tree searches. Delimitation was carried out under algorithm 0 (37). Uniform rooted trees were used as the species model prior. Priors for the θ and τ parameters were set such that the mean corresponded to the true (simulated) values of each input dataset. A total of 10,000 samples from the posterior were used, with samples taken every 100 generations and a burn-in of 4,000 samples, after automatic fine-tuning of proposal parameters.

The BPP results were processed such that only clades with a 0.95 or greater posterior probability were interpreted as true species: The summary tree produced by BPP was traversed in preorder, and any internal node that had a posterior probability of 0.95 or greater in the original species tree was retained, whereas all other internal nodes were collapsed. The number of tips in the postprocessed tree was taken to be the number of species inferred by BPP with a posterior probability of 0.95 or greater.

All plots were made by using the R package ggplot2.

ACKNOWLEDGMENTS. We thank Dr. Mark T. Holder at the University of Kansas for his generosity with the high-performance computing resources of his lab, without which this work would not have been possible. We also thank four anonymous reviewers for thoughtful comments and suggestions, all of which considerably improved and strengthened this paper. This work was supported by National Science Foundation Grant DEB 11-45989 (to L.L.K.).

1. Jetz W, Thomas GH, Joy JB, Hartmann K, Mooers AO (2012) The global diversity of birds in space and time. Nature 491:1–5.
2. Jarvis ED, et al. (2014) Whole-genome analyses resolve early branches in the tree of life of modern birds. Science 346(6215):1320–1331.
3. Hinchliff CE, et al. (2015) Synthesis of phylogeny and taxonomy into a comprehensive tree of life. Proc Natl Acad Sci USA 112(41):12764–12769.
4. Yang Z, Rannala B (2010) Bayesian species delimitation using multilocus sequence data. Proc Natl Acad Sci USA 107(20):9264–9269.
5. Zhang C, Zhang DX, Zhu T, Yang Z (2011) Evaluation of a Bayesian coalescent method of species delimitation. Syst Biol 60:747–761.
6. Avise JC (2000) Phylogeography: The History and Formation of Species (Harvard Univ Press, Cambridge, MA).
7. Hey J, Pinho C (2012) Population genetics and objectivity in species diagnosis. Evolution 66(5):1413–1429.
8. Carstens B, Pelletier T, Reid N, Satler J (2013) How to fail at species delimitation. Mol Ecol 22:4369–4383.
9. Moritz C, Schneider C, Wake B (1992) Evolutionary relationships within the Ensatina eschscholtzii complex confirm the ring species interpretation. Syst Biol 41:273–291.
10. Agapow P, et al. (2004) The impact of species concept on biodiversity studies. Q Rev Biol 79:161–179.
11. de Queiroz K (2005) Different species problems and their solutions. Bioessays 26:67–70.
12. de Queiroz K (2005) Ernst Mayr and the modern concept of species. Proc Natl Acad Sci USA 102:6600–6607.
13. Isaac NJ, Mallet J, Mace GM (2004) Taxonomic inflation: Its influence on macroecology and conservation. Trends Ecol Evol 19(9):464–469.
14. Payo DA, et al. (2013) Extensive cryptic species diversity and fine-scale endemism in the marine red alga Portieria in the Philippines. Proc Biol Sci 280(1753):20122660.
15. Vodă R, Dapporto L, Dincă V, Vila R (2015) Cryptic matters: Overlooked species generate most butterfly beta-diversity. Ecography 38(4):405–409.
16. Hambäck PA, et al. (2013) Bayesian species delimitation reveals generalist and specialist parasitic wasps on Galerucella beetles (Chrysomelidae): Sorting by herbivore or plant host. BMC Evol Biol 13(1):92.
17. Frankham R, et al. (2012) Implications of different species concepts for conserving biodiversity. Biol Conserv 153:25–31.
18. Rosenblum EB, et al. (2012) Goldilocks meets Santa Rosalia: An ephemeral speciation model explains patterns of diversification across time scales. Evol Biol 39(2):255–261.
19. Dynesius M, Jansson R (2014) Persistence of within-species lineages: A neglected control of speciation rates. Evolution 68(4):923–934.
20. Nosil P, Harmon LJ, Seehausen O (2009) Ecological explanations for (incomplete) speciation. Trends Ecol Evol 24(3):145–156.
21. Etienne RS, Morlon H, Lambert A (2014) Estimating the duration of speciation from phylogenies. Evolution 68(8):2430–2440.
22. Nee S (2006) Birth-death models in macroevolution. Annu Rev Ecol Evol Syst 37:1–17.
23. Yang Z, Rannala B (2014) Unguided species delimitation using DNA sequence data from multiple loci. Mol Biol Evol 31:3125–3135.
24. Yang Z (2015) The BPP program for species tree estimation and species delimitation. Curr Zool 61(5):854–865.
25. Weir JT, Wheatcroft DJ, Price TD (2012) The role of ecological constraint in driving the evolution of avian song frequency across a latitudinal gradient. Evolution 66(9):2773–2783.
26. Botero CA, Dor R, McCain CM, Safran RJ (2014) Environmental harshness is positively correlated with intraspecific divergence in mammals and birds. Mol Ecol 23(2):259–268.
27. Heled J, Bryant D, Drummond AJ (2013) Simulating gene trees under the multispecies coalescent and time-dependent migration. BMC Evol Biol 13(1):44.
28. Fujita MK, Leaché AD, Burbrink FT, McGuire JA, Moritz C (2012) Coalescent-based species delimitation in an integrative taxonomy. Trends Ecol Evol 27(9):480–488.
29. Knowles LL, Carstens BC (2007) Estimating a geographically explicit model of population divergence. Evolution 61(3):477–493.
30. Carstens BC, Knowles LL (2007) Estimating species phylogeny from gene-tree probabilities despite incomplete lineage sorting: An example from Melanoplus grasshoppers. Syst Biol 56(3):400–411.
31. Solís-Lemus C, Knowles LL, Ané C (2015) Bayesian species delimitation combining multiple genes and traits in a unified framework. Evolution 69(2):492–507.
32. Moritz C, et al. (2016) Multilocus phylogeography reveals nested endemism in a gecko across the monsoonal tropics of Australia. Mol Ecol 25(6):1354–1366.
33. Huang JP, Knowles LL (2015) The species versus subspecies conundrum: Quantitative delimitation from integrating multiple data types within a single Bayesian approach in Hercules beetles. Syst Biol 65(4):685–99.
34. Sukumaran J, Holder MT (2010) DendroPy: A Python library for phylogenetic computing. Bioinformatics 26(12):1569–1571.
35. Etienne RS, Rosindell J (2012) Prolonging the past counteracts the pull of the present: Protracted speciation can explain observed slowdowns in diversification. Syst Biol 61(2):204–213.
36. Lambert A, Morlon H, Etienne RS (2015) The reconstructed tree in the lineage-based model of protracted speciation. J Math Biol 70(1-2):367–397.
37. Olave M, Solà E, Knowles LL (2014) Upstream analyses create problems with DNA-based species delimitation. Syst Biol 63(2):263–271.

1612 | www.pnas.org/cgi/doi/10.1073/pnas.1607921114 Sukumaran and Knowles