Syst. Biol. 61(1):22–43, 2012 c The Author(s) 2011. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For Permissions, please email: [email protected] DOI:10.1093/sysbio/syr075 Advance Access publication on August 11, 2011

Evaluating Fossil Calibrations for Dating Phylogenies in Light of Rates of Molecular Evolution: A Comparison of Three Approaches

1,2, 3 1 VIMOKSALEHI LUKOSCHEK ∗, J.SCOTT KEOGH , AND JOHN C.AVISE 1Department of Ecology and Evolutionary Biology, University of California at Irvine, Irvine, CA 92697, USA; 2Australian Research Council Centre of Excellence for Coral Reef Studies, James Cook University, Townsville, QLD 4811, Australia; and 3Division of Evolution, Ecology and Genetics, Research School of Biology, The Australian National University, Canberra, ACT 0200, Australia; ∗Correspondence to be sent to: ARC Centre of Excellence for Coral Reef Studies, James Cook University, Townsville, QLD 4811, Australia; E-mail: [email protected].

Received 17 November 2010; revises returned 21 January 2011; accepted 4 March 2011 Associate Editor: Frank E. Anderson

Abstract.—Evolutionary and biogeographic studies increasingly rely on calibrated molecular clocks to date key events. Although there has been significant recent progress in development of the techniques used for molecular dating, many issues remain. In particular, controversies abound over the appropriate use and placement of fossils for calibrating molec- ular clocks. Several methods have been proposed for evaluating candidate fossils; however, few studies have compared the results obtained by different approaches. Moreover, no previous study has incorporated the effects of nucleotide sat- uration from different data types in the evaluation of candidate fossils. In order to address these issues, we compared three approaches for evaluating fossil calibrations: the single-fossil cross-validation method of Near, Meylan, and Shaffer (2005. Assessing concordance of fossil calibration points in molecular clock studies: an example using turtles. Am. Nat. 165:137–146), the empirical fossil coverage method of Marshall (2008. A simple method for bracketing absolute divergence times on molecular phylogenies using multiple fossil calibration points. Am. Nat. 171:726–742), and the Bayesian multical- ibration method of Sanders and Lee (2007. Evaluating molecular clock calibrations using Bayesian analyses with soft and hard bounds. Biol. Lett. 3:275–279) and explicitly incorporate the effects of data type (nuclear vs. mitochondrial DNA) for identifying the most reliable or congruent fossil calibrations. We used advanced (Caenophidian) as a case study; however, our results are applicable to any taxonomic group with multiple candidate fossils, provided appropriate taxon sampling and sufficient molecular sequence data are available. We found that data type strongly influenced whichfos- sil calibrations were identified as outliers, regardless of which method was used. Despite the use of complex partitioned models of sequence evolution and multiple calibrations throughout the tree, saturation severely compressed basal branch lengths obtained from mitochondrial DNA compared with nuclear DNA. The effects of mitochondrial saturation were not ameliorated by analyzing a combined nuclear and mitochondrial data set. Although removing the third codon positions from the mitochondrial coding regions did not ameliorate saturation effects in the single-fossil cross-validations, it did in the Bayesian multicalibration analyses. Saturation significantly influenced the fossils that were selected as most reliable for all three methods evaluated. Our findings highlight the need to critically evaluate the fossils selected by data with different rates of nucleotide substitution and how data with different evolutionary rates affect the results of each method for evalu- ating fossils. Our empirical evaluation demonstrates that the advantages of using multiple independent fossil calibrations significantly outweigh any disadvantages. [Bayesian dating; Caenophidia; cross-validation; fossil calibrations; Hydrophi- inae; molecular clock; nucleotide saturation; snakes.]

Ideally, molecular clock calibrations are obtained from existed than the clade’s common ancestor. Fossils also accurately dated fossils that can be assigned to nodes may be incorrectly assigned to the crown rather than with high phylogenetic precision (Graur and Martin the stem of a clade (Doyle and Donoghue 1993; Maga- 2004), but reality is generally far from this ideal be- llon and Sanderson 2001). The most useful fossils are, cause of a number of important problems. The incom- therefore, geologically well dated, preserved with suffi- plete and imperfect nature of the fossil record means cient morphological characters to be accurately placed that fossils necessarily only provide evidence for the on a phylogenetic tree and temporally close to an ex- minimum age of a clade. Many clades will be consider- tant node rather than buried within a stem lineage (van ably older than the oldest known fossil; thus, nodes may Tuinen and Dyke 2004). However, the fossil records of be constrained to erroneously young ages (Benton and many, if not most, taxonomic groups fall far short of Ayala 2003; Donoghue and Benton 2007; Marshall 2008). these criteria. As such, several methods have been de- Incorrect fossil dates also arise from experimental errors veloped for evaluating candidate fossil calibrations in in radiometric dating of fossil-bearing rocks or incor- order to: determine their internal consistency and iden- rectly assigning fossils to a specific stratum. In addition, tify outliers (Near et al. 2005), identify lineages with misinterpreted character state changes can result in the the best fossil coverage and identify outliers (Marshall taxonomic misidentification of fossils or their incorrect 2008), and evaluate alternative placements of fossils placement on the phylogeny (Lee 1999). Ideally, a fossil (Rutschmann et al. 2007; Sanders and Lee 2007). would date the divergence of two descendant lineages However, fossil calibrations are not the only difficulty from a common ancestor. In reality, however, fossils in molecular dating. Other factors also contribute to rarely represent specific nodes but rather points along a inaccurately calibrated molecular clocks including: in- branch (Lee 1999; Conroy and van Tuinen 2003). Thus, correctly specified models of evolution (Brandley et al. although a fossil may appear to be ancestral to a clade, 2011), inappropriate modelling of rate heterogeneity it is impossible to determine how much earlier the fossil among lineages (Sanderson 1997; Drummond et al.

22 2012 LUKOSCHEK ET AL.—EVALUATING FOSSIL CALIBRATIONS 23

2006), and unbalanced taxon sampling potentially re- evaluating fossil calibrations: the single-fossil cross- sulting in node-density artifacts (Hugall and Lee 2007). validation method of Near et al. (2005), the empirical In addition, choice of genetic data or gene region can fossil coverage method of Marshall (2008), and the strongly affect estimated divergences (Benton and Ayala Bayesian multicalibration method of Sanders and Lee 2003). For example, in rapidly evolving genes, such as (2007) and explicitly evaluate the effects of nucleotide mitochondrial DNA, saturation has been shown to have saturation on the results of each method. Briefly, the the effect of compressing basal branches and artificially single-fossil cross-validation approach (Near et al. 2005) pushing shallow nodes toward basal nodes, resulting in evaluates candidate fossils, including the alternative overestimated divergence dates (Hugall and Lee 2004; ages or placements of fossils at some calibrated nodes, Townsend et al. 2004; Hugall et al. 2007; Phillips 2009). with the aim of identifying a number of plausible reli- However, the nature of the bias is complicated. For able calibration sets. The approach of Marshall (2008) example, underestimating the true rate of hidden sub- aims to identify candidate calibrations with the best stitution results in tree compression: However, if the fossil coverage and then tests whether these fossils are rate of hidden substitutions were to be overestimated, potential outliers. Finally, the Bayesian multicalibration the reverse would be true. These effects are further approach evaluates one or more alternative calibrations complicated by the calibration placement. For exam- in a set by comparing the Bayesian prior and posterior ple, if only deep splits are calibrated, then recent nodes probabilities (PPs) at fossil-calibrated nodes (Sanders will be biased to be younger under tree extension and and Lee 2007). We explicitly evaluate the effects of using older under tree compression. Slowly evolving genes, sequence data with different rates of molecular evo- as are typical for nuclear DNA, are less prone to such lution on the best fossils identified by each method saturation effects; however, nuclear DNA data are not using the same mitochondrial and nuclear sequence completely immune to these issues; problems of satu- data set (each with identical taxon sampling) for each ration also can emerge for slowly evolving nuclear loci method. In addition, we evaluate whether saturation ef- if deeper divergences are being investigated. More im- fects can be ameliorated by 1) removing the third codon portantly, although the effects of saturation have been position of the mitochondrial coding regions and 2) an- documented for estimating divergence times (Brandley alyzing a combined nuclear and mitochondrial data set. et al. 2011; Hugall and Lee 2004; Townsend et al. 2004; Our study focused on testing alternative placements or Hugall et al. 2007; Phillips 2009), the effects of saturation ages of controversial fossil calibrations (as is typical for on different approaches for evaluating candidate fossil groups with poor fossil records); however, our approach calibrations have yet to be explored. is relevant for any situation where numerous candidate Caenophidia (“advanced snakes” comprising acro- fossil calibrations exist. chordids, elapids, viperids, and colubrids) is a group with a controversial fossil record. Indeed, recent pa- MATERIALSAND METHODS pers using calibrated molecular clocks to date diver- Fossil Calibrations, Taxon Sampling, Molecular Data, gences among advanced clades highlight the Convergence Diagnostics, and Saturation Plots extent of controversy about the placements of certain fossils (Wuster et al. 2007, 2008; Sanders and Lee 2008; Colubroid classification is in flux (Vidal et al. 2007). Sanders et al. 2008; Kelly et al. 2009). In part, this con- We use the traditional colubroid classification as com- troversy exists because of the relatively poor nature of prising viperids, elapids, and colubrids, including colu- the snake fossil record. Well preserved and relatively brid subfamilies recently elevated to higher taxonomic complete caenophidian fossils date back no further than ranks (McDowell 1987; Rage 1987; Lawson et al. 2005). the Miocene (Rage 1984) and often belong to extant Forty-eight taxa (40 caenophidian and 8 henophidian genera (Rage 1988; Szyndlar and Rage 1990, 1999), thus taxa) were chosen based on the availability of nuclear are of little value as calibration points for most studies. and mitochondrial sequences (Appendix Table A1) and Earlier caenophidian fossils mostly comprise isolated to appropriately span the various fossil calibrations vertebrae, the taxonomic affinities of which have been tested. We specifically selected fossil calibrations that strongly debated (McDowell 1987; Rage 1987). Perhaps, often have been used to date recent caenophidian diver- the most controversial calibrations concern the origin gences (Nagy et al. 2003; Guicking et al. 2006; Burbrink of caenophidian snakes themselves, which has been as- and Lawson 2007; Wuster et al. 2007, 2008; Alfaro et al. signed dates of 38 (34–48) myr (Sanders and Lee 2008; 2008; Sanders and Lee 2008; Kelly et al. 2009), and for Kelly et al. 2009), 57 (47–140) myr (Wuster et al. 2008), which we could construct nuclear and mitochondrial and >65 myr (Noonan and Chippindale 2006a, 2006b), data sets with appropriate taxon sampling. Details of based on different interpretations of the fossil record the fossil calibrations evaluated are given in Table 1. (Table 1). As such, very different dates have been used We constructed nuclear and mitochondrial data sets, to calibrate the caenophidian molecular clock (Nagy each with identical taxon sampling, using >100 novel et al. 2003; Guicking et al. 2006; Burbrink and Lawson sequences generated for this study and published se- 2007; Wuster et al. 2007, 2008; Alfaro et al. 2008; Sanders quences obtained from GenBank (Appendix Table A1). and Lee 2008; Kelly et al. 2009). The mitochondrial data comprised 16S rRNA (454 bp), In this paper, we use advanced snakes as a test case NADH dehydrogenase subunit 4 (ND4) (672 bp), and to compare three previously published methods for cytochrome b (1095 bp), and the nuclear data comprised 24 SYSTEMATIC BIOLOGY VOL. 61 ) ); Rage ); 57 Continued 2009 ) and ( Kelly et al. 2009 Rage and Werner 1999 ; Gardner and Cifelli 1999 vertebrate from Utah from Noonan and Chippindale ). We tested the effects of Kelly et al. Noonan and Chippindale 75 Ma based on the earliest > Coniphis , p. 146). Based on these fossils, we fossils in geographically distant Sudan ence Head et al. 2005 1999 , p. 95) note that the approximately ( Sanders and Lee 2008 Refer 1987 ); and the oldest undisputed colubroid fossil , Coniophis ); and 65 (63–80) myr ( Rage et al. 2001 ) indicating that the Boinae were a separate phylogenetic ). The first vertebrae that are undoubtedly booids occur in Rage 1984 65 myr). (Nigeropheidae) vertebra found in Paleocene marine deposits > Gardner and Cifelli 2008 ). Rage 2001 ( ). trunk vertebrate from the Cenomanian (94–100 Ma) in Sudan ( Wuster et al. 2007 Rage 2001 ; ) used Nigerophis (Boinae) and occur contemporaneously with fossil vertebrae from several Coniphis 2006b calibrations 2006a ) dated the Henophidia–Caenophidia split at Albino 2000 Sanders and Lee The divergence between the Henophidiahas (boids) and been Caenophidia dated (advanced using snakes) the( fossils assigned to the Booidae. (47–140) myr ( probable boid fossils from theHowever, latest the Cretaceous taxonomic (65–85 affinities Ma) from( of these South older America. vertebrae werethe mid-Palaeocenenot (58.5–56.5 easy to assign Ma). TheseCorallus vertebrae are assigned to theother extant boine genus taxa ( entity by the mid-Palaeocene, andTertiary that or extant late boine Cretaceous lineages ( originatedconstrained early the in Henophidia–Caenophidia the split as occurring 68 (65–85) Ma. the oldest in Nigeria (56–65 Ma) ( from the late middle Eoceneconstraining (37–39 this Ma) node ( with thebased three on different these divergence fossils: dates 38 previously (34–48) used myr ( 2006a , and Werner 1999 contemporaneous occurrence of and Utah suggests that theCenomanian Alethenophidia–Scoleophidia (99 split Ma). occurred This prior calibration to also the was used by The MRCA of the Caenophidiaascribed (Acrochordidae a versus range Colubroidea) of has dates been affinities based onof certain different fossils. interpretations (93–96 ofThese fossils Ma) the in taxonomic include Sudan six thatvertebrae were assigned from the Cenomanian to the Colubroidea ( and six the upper Albanian/lowermost Cenomanian (97–102 Ma) ( The divergence between the Scolecophidiabased and on the the Alethinophidia earliest was alethinophidian calibrated fossils: two fset 62 Calibration priors (95% HPD) Ln mean (SD) Zero of 38 (34–48) 1.40 (0.75) 34 65 (63–80) 1.10 (1.10) 68 (65–85) 1.00 (1.20) 65 57 (47–140) 2.50 (1.25) 45 Mean Set A Set B Set C 2- 2- 2- 1 versus Details of fossils tested using three approaches for evaluating candidate fossil 1. calibrations Node BLE A T Henophidians versus caenophidians Acrochordids versus colubroids Acrochordids versus colubroids Acrochordids versus colubroids Fossil Scolecophidians alethinophidians Root 97 (92–100) 2.00 (0.85) 90 2012 LUKOSCHEK ET AL.—EVALUATING FOSSIL CALIBRATIONS 25 ; ), Continued ). We also Head et al. 2005 Sanders and Lee 2008 , p. 225). Based on these Parmley and Holman ). A third colubrid, Wuster et al. 2008 , extinct species that belong ), Pondaung ( Rage 1988 ) have also been used to constrain ) indicate that colubroids had started Holman 1984 ence 2005 , p. 249) and ( Natrix mlynarskii ). However, fossils with colubrine and Rage et al. 1992 Refer and ); and 47 (40–95) myr ( Head et al. ); however, the oldest putative colubroid fossils from the Rage and Werner 1999 Alfaro et al. 2008 Parmley and Holman 2003 ; , has been described from the early Orellan to Whitneyan ages of Coluber cadurci Wiens et al. 2006 ; Rage et al. 1992 Guicking et al. 2006 2003 , p. 6) argue that taxonomically and geographically divergent colubrid fossils tested a constraint of 40divergent (37–60) colubrid myr fossils based from on the the Late geographically Eocene. and taxonomically Kelly et al. 2009 the extant subfamilies Colubrinae andfrom Natricinae the respectively, early have Oligocene been (30–34 described Texasophis Ma) galbreathi in Europe ( the Oligocene (30–31 Ma) in North American ( and North America ( diverging pre-Late Eocene, possibly eventested in two the divergence early dates Paleogene previously (43–60 used: Ma). 34 We (31–43) myr ( fossils, the crown natricine–colubrine divergence( has been constrained at 35–45natricine myr morphology appear almost immediatelyintedeterminate after colubrids, the suggesting first that appearance these of appropriate primitive for fossils dating may the be stem more between natricine–colubrine the clade xenodontines, (in natricines, our colubrines).constraining case, the We the tested divergence stem the (node effect 4)(35–45) of and myr. crown (node 5) colubrine–natricine clades at 37 Fossils assigned to The MRCA of the Colubroideadated (viperids based versus on colubrids the and oldestThailand elapids) colubrid ( has fossils been from theCenomanian Late (93–96 Eocene Ma) (34–37 ( Ma) in the upper bound of this( clade. found the late Eocene in Krabi Basin ( fset 40 37 35 30 Calibration priors (95% HPD) Ln mean (SD) Zero of 47 (40–95) 2.00 (1.20) 40 (37–60) 1.10 (1.25) 36 (35–45) 0.50 (1.10) 34 (31–43) 1.40 (0.70) Mean Set B Set C Set A 3- 5 3- 3- 4 Node ) Continued ( 1. calibrations BLE A T iperids versus colubrids Viperids versus colubrids + elapids V Fossil Viperids versus colubrids + elapids Natricines versus colubrines (Crown) + elapids Natricines versus colubrines (Stem) 26 SYSTEMATIC BIOLOGY VOL. 61 as with Naja Naja ) Wuster et al. Bungarus ( Naja but differences from ). These fossils have been L. colubrina Szyndlar and Rage 1990 . Based on this taxonomic and the closely related 2003 , p. 579) suggested that this fossil ( Naja ence Laticauda , p. 1186). This vertebra is one of the oldest Refer species with apomorphies that distinguish Asian Scanlon et al. Szyndlar and Rage 1990 Naja ). However, these extinct fossil species display primitive ). We tested the effects of constraining the crown 2007 ) used a minimum age of 24 myr to calibrate the ( and, based on its similarity to Sanders and Lee 2008 occur 16 Ma ( and other elapids, Wuster et al. Naja Laticauda Kelly et al. 2009 Kuch et al. 2006 2008 ; dates of 19 (17–30) myr. the stem clade and explored the effects of constraining the crown and stem assignment, divergence between Laticauda and allHowever, other the hydrophiines taxonomic (crown affinity hydrophiines). and/orbeen stratigraphic questioned ( age of this fossilelapid fossil have recently known and might,extant therefore, elapid be group. basal Apart to from (ratherelapids this than in fossil, nested the the within) first earliest the fossil appearances20–23 record of Ma modern ( proteroglyphous are fangshydrophiines from andGermany crown dated at elapids with dates of 23 (21–30) myr. suggesting that they should beclade. used We to assigned calibrate the the divergence stem between rather than the crown L. laticaudata is nested within (not basal to) the genus 2007 , Fossils of 3 extinct European and African conditions that are very rare among living cobras ( A fossil vertebra from theassigned late to Oligocene/early Miocene (20–23 Ma) has been used to date the divergence between the crown African and Asian 1 . fset 16 Calibration priors (95% HPD) Ln mean (SD) Zero of 19 (17–30) 1.00 (1.00) Mean 9 8 Node ) Continued ( 1. versus hydrophines 6 23 (21–30) 1.00 (0.80) 20 Constraints are given as absolute values (millions of years before present) and the corresponding lognormal mean, standard deviation, and zero offset of the calibration calibrations BLE (Crown) (Stem) A T African versus Asian Naja Notes: Crown hydrophiines 7 African versus Asian Naja prior used in BEAST analyses. Phylogenetic placement of nodes is shown on Figure Fossil Elapines 2012 LUKOSCHEK ET AL.—EVALUATING FOSSIL CALIBRATIONS 27 the oocyte maturation factor gene (c-mos—864 bp) and as follows: mitochondrial—mtCode1-GTRig, mtCode2- the recombination activating gene 1 (RAG-1—2400 bp). GTRig, mtCode3-GTRig, mtRNA-GTRig; mtDNA ex- Novel cytochrome b, 16S rRNA, and ND4 fragments cluding third codon positions (mtDNA3rdExcl)—mt were amplified and sequenced using the primers pub- Code1-GTRig, mtCode2-GTRig, mtRNA-GTRig; and lished in Lukoschek and Keogh (2006), Palumbi (1996), nuclear—nDNA1-GTRig, nDNA2-GTRig, nDNA3-GT and Forstner et al. (1995), respectively, and the proto- Rig with model parameters allowed to vary indepen- cols of Lukoschek and Keogh (2006) and Lukoschek dently across partitions. However, MrBayes returned et al. (2007). Amplifications of RAG-1 and c-mos used unrealistic estimates of alpha for the nDNA1 gamma the primers and protocols of Groth and Barrowclough distribution of rate heterogeneity (66.74 4006.05), (1999) and Saint et al. (1998). Newly generated se- so we used the next best nDNA model (nDNA1+2-± quences were submitted to GenBank (Appendix Table GTRig, nDNA3-GTRig) and the best mtDNA model for A1). For some taxa, mitochondrial fragments and/or all Bayesian analyses (BEAST and MrBayes). We also nuclear genes were concatenated from two individuals conducted extensive preliminary analyses of all three or two congeneric species to minimize the amount of methods using a combined nDNA + mtDNA data set, missing sequence data, in which case, the highest com- but the results were virtually identical to those obtained mon taxon name was assigned (Appendix Table A1). Se- for the mtDNA data alone, so we do not present the quences were edited in SeqMan (Lasergene v.6; DNAS- results of the combined data set. TAR Inc.), aligned with Clustal W2 (default parameters) Bayesian relaxed molecular clocks, which assume (Labarga et al. 2007), and visually refined. Following rates of molecular evolution are uncorrelated but log- alignment, coding region sequences were translated into normally distributed among lineages (Drummond et al. amino acid sequences in MacClade v.4.06 (Sinauer Inc.) 2006), as implemented in BEAST v1.4.8 (Drummond using the vertebrate mitochondrial and nuclear genetic and Rambaut 2007) were used for all dating analyses. codes as appropriate. No premature stop codons were Yule and birth–death models performed similarly in all observed, so we are confident that the mitochondrial se- preliminary analyses, so the birth–death model (Gern- quences obtained were mitochondrial in origin, and that hard 2008) with a uniform prior was used to model the nuclear genes were not nonfunctional nuclear copies cladogenesis for all final analyses. We summarized the (pseudogenes). Saturation plots comparing uncorrected outputs of all MrBayes and BEAST Markov chain Monte “p” genetic distances with General Time Reversible Carlo (MCMC) analyses using TRACER (version 1.4) in plus invariant plus gamma (GTRig) distances were con- order to obtain parameter estimates, as well as evalu- structed for the nuclear and mitochondrial data sets. In ate effective sample sizes (ESSs) and convergence. ESS order to evaluate saturation in each of the mitochon- values greater than 100 are generally regarded as be- drial codon positions, we also constructed saturation ing sufficient to obtain a reliable posterior distribution plots for the first, second, and third codon positions of (Drummond et al. 2007), and we adjusted the numbers the ND4 and cytochrome b genes. of MCMC runs to ensure that ESSs were greater than The best-fit models of molecular evolution for the nu- 100 for all relevant parameters in each set of analy- clear and mitochondrial data sets were selected based ses conducted (numbers of MCMC runs for different on Akaike Information Criteria (AIC) implemented analyses are specified in relevant sections). ESS values in ModelTest 3.06 (Posada and Crandall 1998) using typically were much larger than 100 for most parame- model scores ( lnL) obtained from PAUP* (Swofford ters in each analysis. Graphical exploration of trace files 2000). We evaluated− alternative partitioning strate- for tree likelihoods and other tree-specific parameters gies using a modified version of the Akaike infor- using TRACER (version 1.4) indicated that convergence mation criterion for small sample sizes (AICc) and had been reached in all cases. Bayesian information criterion (BIC) (McGuire et al. 2007). AICc and BIC values incorporate a penalty for Single-Fossil Cross-Validations increasing the number of parameters in the model, thus potentially avoiding problems with model over- The agreement or consistency between single-fossil parameterization. Three partitioning strategies were calibration dates and other available fossil calibrations evaluated for the mitochondrial (mtCode, mtRNA; mt- for 10 calibrated nodes (Fig. 1—Tree Root and nodes 1– Code1+2, mtCode3, mtRNA; mtCode1, mtCode2, mt- 9) was evaluated using a modified version of the single- Code3, mtRNA) and nuclear data (nDNA; nDNA1+2, fossil cross-validations developed by Near et al. (2005). nDNA3; nDNA1, nDNA2, nDNA3). Bayesian analyses There were two main differences in our approach. First, (four incrementally heated chains run for 2,000,000 rather than using fixed points for each calibration, we generations sampled every 100th generation with all used lognormal distributions that placed a hard mini- substitution parameters and rates allowed to vary mum bound and soft maximum bound on each calibra- across partitions) were conducted in MrBayes (Ron- tion (Table 1), thereby allowing for uncertainty in the quist and Huelsenbeck 2003) and used to evaluate fossil dates (Yang and Rannala 2006; Ho and Phillips combinations of character partition and evolutionary 2009). For each single-fossil calibration (i), we calcu- model. AICc and BIC values were calculated using lated the metrics Dx, SSx, and s (Near et al. 2005) for the equations of (McGuire et al. 2007, p. 841). AICc the other nine fossil-calibrated nodes on the tree using and BIC criteria selected the same optimal partitions age estimates obtained from BEAST. We conducted the 28 SYSTEMATIC BIOLOGY VOL. 61

FIGURE 1. Bayesian chronograms from BEAST analyses with tree roots constrained to 97 (92–120) myr indicating the position of 10 candidate fossil-calibrated nodes (root and nodes 1–9) evaluated in this study. Solid black dots indicate nodes with >98% PPs. a) Chronogram from nuclear DNA. b) Chronogram from entire mitochondrial data set and c) Chronogram from mitochondrial data with third codon positions removed. 2012 LUKOSCHEK ET AL.—EVALUATING FOSSIL CALIBRATIONS 29 cross-validations using both the mean and median age MA, 95% confidence interval 85–135 MA) spanning estimates in order to evaluate whether the posterior a wide range of plausible dates= for this node (Table 1) age distributions (rather than point age estimates) in- in all single-fossil calibration analyses. BEAST runs for fluenced which fossil calibrations were identified asin- single-fossil cross-validations were conducted as fol- congruent. The difference between the molecular (MA) lows: nDNA—4,000,000 generations sampled every 100 and fossil age (FA) at each node was calculated as D generations, mtDNA—5,000,000 generations sampled i = (MAi FAi), where FAi is the fossil age and MAi is the every 100 generations, and mtDNA3rdExcl—10,000,000 mean− or median molecular age estimate for node i using generations sampled every 100 generations. the candidate fossil calibration at node x. The average difference Dx between the MA and FA across the nine other fossil-calibrated nodes for the fossil calibration at Evaluating Fossil Coverage and Identifying Outliers node x was then calculated as The approach of Marshall (2008) involves generating Di an ultrametric tree that is uncalibrated with respect to i x the fossil record and then mapping all candidate fossil Dx 6= . = nP 1 calibrations onto the tree to determine which of the cal- − ibrated lineages has the best temporal fossil coverage. The FA for each candidate fossil-calibrated node (x) was Specifically, the method aims to identify the lineage, for used as a single calibration prior in the BEAST analysis, which the oldest fossil (for that lineage) sits proportion- and its standard errors were calculated from the remain- ally closest to the node of its MRCA (true time of origin) ing nine candidate fossil-dated nodes. SS values were and therefore has the best temporal coverage. Marshall then calculated as the sum of the squared differences (2008) emphasizes two assumptions of the method: 1) between the MA and FA age estimates at all other fossil- the proportional branch lengths of the ultrametric tree dated nodes using the formula are accurate and 2) fossilization is random: However, the method also assumes that fossils are accurately SS D2. dated and assigned correctly to their respective lineages x = i i x (see below for further discussion). X6= The first and arguably most important step in the Finally, the average squared deviations, s, were calcu- approach of Marshall (2008) is to generate a reliable n 2 x 1 i x Di ultrametric phylogeny that is uncalibrated with respect lated using the formula s =n(n 16=) , where n is = P P− to the fossil record using an appropriate relaxed clock equal to the total number of observations of Di (i.e., algorithm. Given that obtaining accurate proportional the number of fossil calibrations remaining). For more branch lengths of the ultrametric tree is critical to the details about the single-fossil cross-validation analyses, success of this method, we generated a number of ultra- see Near et al. (2005). metric trees using different approaches and compared The second difference in our approach was that, the results. Specifically, we generated ultrametric trees rather than using the cross-validations to exclude spe- for the mtDNA and nDNA data sets in BEAST by con- cific fossils, we used them in a more exploratory fashion straining the tree root with a fixed value (arbitrarily to evaluate the alternative placements of three fossils as set to 100). However, MCMC runs of 20,000,000 gen- calibrations for their respective stem (nodes 4, 6, and 8) erations were needed to obtain ESSs >100 for the cali- and crown (nodes 5, 7, and 9) clades (Table 1). We also brated nodes using nDNA and convergence could not evaluated three different pairs (referred to as calibra- be achieved for mtDNA. As such, we followed the ap- tion sets) of fossil dates for two nodes, the most recent proach of Marshall (2008) and obtained ultrametric trees common ancestor (MRCA) of Caenophidia (Fig. 1— using r8s (Sanderson 2003). r8s requires user-specified node 2), and the MRCA of Colubroidea (Fig. 1—node3), input trees, so we used MrBayes (MCMC chains of based on their previous use in other studies (Table 1). 2,000,000 generations sampling every 100 generations Each alternative set of fossil dates for nodes 2 and 3 and all default settings) to obtain optimal Bayesian (Table 1: Sets A, B, C) was evaluated by conducting a phylogenies for the nDNA and mtDNA data sets us- separate iteration of the cross-validation exercise (i.e., ing the same partitioning strategies and models of three separate iterations). In each case, the calibration evolution used for the BEAST analyses. As there is set and the corresponding molecular dates from the evidence that branch lengths are more accurately es- single-fossil dating analyses were used to calculate Dx, timated by maximum-likelihood (ML) than Bayesian SSx, and s. The molecular and fossil dates for the other criteria (Schwartz and Mueller 2010), we also generated eight single–fossil-calibrated nodes were the same for ML trees for the nDNA, mtDNA, and mtDNA3rdExcl the three calibration sets. data sets in PAUP (Swofford 2000) under optimal mod- Preliminary analyses revealed that the shallower cal- els of sequence evolution obtained from AIC in Model- ibrations (Fig. 1, nodes 4–9) artificially inflated age es- Test (Posada and Crandall 1998). We generated rooted timates at deeper nodes to unrealistically high values. input trees (required by r8s) by adding sequences ob- In order to stabilize estimated ages at deeper nodes, we tained from GenBank (Appendix Table A1) for two constrained the root using a normal prior (mean 110 outgroup taxa (the lizard genera Varanus and Calotes) = 30 SYSTEMATIC BIOLOGY VOL. 61 to the data sets. The lizard taxa were pruned from the included nodes 2 and 3 but could not distinguish be- optimal ML and Bayesian trees and the resulting rooted tween the different possible ages assigned to these two trees used to obtain ultrametric trees in r8s, again fix- nodes (Table 1—Sets A, B, and C). In addition, the ESFi ing the root age to an arbitrary value of 100. We used values for the same six fossil-calibrated nodes indicated semiparametric penalized likelihood (PL) (Sanderson that none were outliers. However, ESFi values cannot 2002) and optimal smoothing parameters identified be used to evaluate alternative dates for the same node from the cross-validation procedure in r8s as follows: because the oldest date will inevitably have the high- MrBayes tree—smoothing parameter of 3200 with log est empirical coverage, even if that date is not correct. penalty function and ML tree—smoothing parameter of Moreover, ESFi values from different ultrametric trees 3200 with additive penalty function. Given that Smith identified different fossils as having the highest empir- et al. (2006) demonstrated that the log penalty func- ical coverage (see below for details). We evaluated the tion better estimated branch lengths than the additive alternative ages for nodes 2 and 3 using three sets of penalty function for calibrated ultrametric trees, we BEAST multicalibration analyses that incorporated the also generated an ML ultrametric tree using the log four congruent calibrations and the Set A, B, and C node penalty function and optimal smoothing parameter of 2 and 3 calibration ages in turn. For each analysis, we 320 (note, however, that the sum of squares obtained compared the prior and posterior distributions of all from the cross-validations for the log penalty func- six fossil-calibrated nodes, with the expectation that the tion were much higher than the additive penalty func- node 2 and 3 calibration set most consistent with the tion, suggesting that the additive penalty was more other four fossil-dated nodes would return posterior appropriate). distributions for all six calibrated nodes that were sim- We used the resultant ultrametric trees to calculate the ilar to their prior constraints (Sanders and Lee 2007). empirical scaling factor (ESF) for each candidate fossil We also conducted a fourth set of analyses using the calibration (including the three alternative fossil dates four congruent fossils with no constraints on nodes 2 for nodes 2 and 3 and the alternative placements of three and 3 (Set D) and compared the unconstrained and con- FAi strained node 2 and 3 age estimates. These four sets of fossils, Table 1) using the equation ESFi , where = NTL BEAST analyses were conducted for nDNA, mtDNA, FAi is the age of the oldest fossil of the lineage and NTLi is the relative node to tip length of the branch of that and mtDNA3rdExcl data sets, using the same lognor- lineage on the ultrametric phylogeny (Marshall 2008). mal priors, relaxed molecular clocks, and partitioned evolutionary models as the single-fossil dating analy- The fossil with the largest ESFi is regarded as having the best temporal coverage; however, fossils that have been ses. MCMC runs comprised 4,000,000 generations for incorrectly assigned and/or incorrectly dated may also the nuclear data and 10,000,000 generations for both have the highest ESF values and these outliers need to be mitochondrial data sets. In each case, MCMC runs were identified. We tested for possible fossil outliers bycom- sampled every 100 generations. Given that certain combinations of priors can interact paring the distribution of ESFi values to a uniform dis- tribution using the Kolmorgorov–Smirnov test, on the to generate unexpected effective joint priors, we also performed an analysis for each calibration set without assumption that ESFivalues for fossil outliers lie outside a uniform distribution (Marshall 2008). One limitation data (empty alignments) to ensure that the effective of this approach is that it is most effective if there is priors were similar to the original priors. We assessed just one outlier (Marshall 2008, p. 732). We were test- how informative the data were by comparing the effec- ing the alternative stem and crown placements of three tive priors with posteriors obtained using data (Drum- mond et al. 2006). These analyses indicated that the fossils. As such, the ESFi values for the crown place- effective priors were similar to the original priors and ments (that inevitably will be larger than the ESFi values for their stem placements) might potentially cluster to- the posteriors obtained from the data departed from the gether, thereby making it impossible to identify them as priors (indicating informative data). outliers. In order to address this issue, we modified the approach of Marshall (2008) to test the alternative place- ments of these fossils (see Results section for details). RESULTS The final nDNA alignment had 3264 characters of Bayesian Analyses to Evaluating Multicalibration Sets which 870 were variable and 421 were parsimony in- We used the method of Sanders and Lee (2007) to formative, whereas the mtDNA alignment had 2221 evaluate three alternative dates for two nodes with characters of which 1368 were variable and 1193 were controversial fossil calibrations in a Bayesian multical- parsimony informative, and the mtDNA3rdExcl had ibration framework. This method compares the prior 1632 characters of which 884 were variable and 578 were and posterior distributions of the 95% highest posterior parsimony informative. All tree topologies from PAUP* densities (HPD) intervals for each candidate calibra- ML analyses and Bayesian MCMC searches (MrBayes tion, particularly focusing on potentially controversial and BEAST) of the nuclear and mitochondrial data sets calibrations of interest. In our case, the single-fossil converged on a topology (Fig. 1) highly congruent with cross-validations identified plausible congruent cali- published molecular phylogenies for the the elapid taxa bration sets comprising six fossil-calibrated nodes that (Slowinski et al. 1997; Keogh 1998; Keogh et al. 1998; 2012 LUKOSCHEK ET AL.—EVALUATING FOSSIL CALIBRATIONS 31

Slowinski and Keogh 2000; Lukoschek and Keogh 2006; the mean age estimates. Nuclear DNA cross-validations Wuster et al. 2007; Sanders and Lee 2008; Sanders et al. produced similar results for each calibration set, with Dx 2008; Kelly et al. 2009; Pyron et al. 2011). Data matri- values indicating that four fossils consistently produced ces and relevant trees have been submitted to TreeBASE older molecular divergence estimates for other candi- (#11272). Eight of the 10 candidate calibration nodes had date fossil-calibrated nodes, whereas the other six fossils extremely high support with 99% PPs for all analyses ≥ produced younger divergence estimates; however, the conducted (Fig. 1). The two nodes with poor support relative magnitude of these tendencies differed between were node 5 (typically with 80% PPs for mtDNA and calibration sets (Fig. 3a). Specifically, the youngest fossil <50% PPs for nDNA) and node∼ 8 (typically with 55% ∼ dates for nodes 2 and 3 (set A) resulted in larger molec- PPs for mtDNA and <50% PPs for mtDNA). Other ular overestimates and smaller underestimates of fossil nodes with PPs >98% are also shown on the trees (Fig. dates than sets B and C, which returned similar mean 1). Saturation plots revealed an abundance of hidden differences (Dx) between the fossil and molecular dates substitutions in all three codon positions of the mito- (Fig. 3a). SS values ranked the four node calibrations chondrial data set (Fig. 2a–d) but particularly in the that consistently produced older molecular divergence third codon position (Fig. 2d). estimates for other FAs as the most incongruent fossils (Fig. 4a). Set A calibrations produced consistently larger Single-Fossil Cross-Validations SS values for all fossil calibrated nodes than sets B and In all cases, the results of single-fossil cross-validations C (Fig. 4a), reflecting the larger differences (Dx) between using mean and median age estimates from BEAST were the molecular and fossil dates using the younger set A highly consistent, so we present only the results from calibrations (Fig. 3a). By contrast, SS values for sets B

FIGURE 2. Saturation plots of genetic distances corrected for multiple substitutions versus uncorrected “p” distances. Corrected genetic distances were calculated using the estimated best-fit models of sequence evolution obtained from AIC criterion in ModelTest. a) Saturation plots of the entire mitochondrial DNA data set (black circles) versus nuclear DNA (gray diamonds). Note the different axis scales for the nuclear and mitochondrial data sets. Saturation plots are also shown for b) mtDNA first codon position, c) mtDNA second codon position, and d) mtDNA third codon position for the combined ND4 and cytochrome b genes. Note the different x-axis scales for b, c, and d. 32 SYSTEMATIC BIOLOGY VOL. 61

FIGURE 3. Histogram of the mean differences and standard errors between fossil and estimated MAs for each of three sets of 10 single– fossil-calibrated nodes from a) nuclear DNA; b) mitochondrial DNA; and c) mitochondrial DNA with third codon position removed. FAs for 8 of the 10 candidate nodes were identical for each set, differing only for nodes 2 and 3 (see Fig. 1). FAs used as constraints are given in Table 1. For a single node (x), the FA at node x was used as a single calibration prior. MA estimates were obtained for the nine other candidate nodes, for which FAs were available. and C were very similar (Fig. 4a). Sequential removal mtDNA data sets, whereas this node produced younger of fossil calibrations from most to least divergent, as dates for nuclear DNA. Given these differences, it is ranked by SS values (Fig. 4a), resulted in steep incre- not surprising that mitochondrial SS values ranked fos- mental declines in s values for the subsequent removal sils differently than nuclear SS values (Fig. 4b,c). In of nodes 7, 9, 5, and 4 for all calibration sets (Fig. 5a). addition, Dx values for the younger set A calibrations At this point, s values for sets B and C were small and (at nodes 2 and 3) did not follow the same pattern as subsequent removal of fossils did not markedly de- for sets B and C (Fig. 3b,c) and the mitochondrial rank crease s values (Fig. 5a). Starting s values for set A were order of candidate calibrations was different for set A much larger than for sets B and C and did not drop to calibrations than for sets B and C, which were similar low values until the fifth fossil calibration (node 2) was (Fig. 4b,c). Sets B andC had highest SS values at nodes removed and then remained low (Fig. 5a). 6 and 8; however, removing these nodes only slightly Mitochondrial DNA produced a markedly different decreased s values, which did not decline sharply un- pattern of mean differences (Dx) between the molecular til subsequent removals of the third and fourth ranked and fossil dates than nuclear DNA (Fig. 3). Most notably, fossils and then remained low (Fig. 5b,c). Interestingly, the four fossil calibrations (nodes 4, 5, 7, and 9) that re- node 1 was the most incongruent fossil for the younger turned much older nuclear DNA values for FAs at other set A calibrations for the entire mtDNA data set and s candidate calibration nodes either produced younger values dropped sharply when it was removed. Subse- or only slightly older estimates of FAs for mtDNA (Fig. quent removal of the three next most incongruent fossils 3b) and this remained the case even when the third did not produce further decreases in s, but s decreased codon positions were removed (Fig. 3c). In addition, the with the removal of the fifth and subsequent fossils tendency for nodes 6 and 8 to produce younger MAs (Fig. 5b). By contrast, node 8 was the most incongruent for fossil dates at other nodes was more extreme for fossil for all three calibration sets for the mtDNA data the mitochondrial than nuclear data, and this was true set with third codon position excluded, and s values did for both mitochondrial data sets (Fig. 3b,c). By contrast, not drop sharply until the first two most incongruent node 1 produced older ages at other nodes for both nodes were excluded in each case (Fig. 5c). 2012 LUKOSCHEK ET AL.—EVALUATING FOSSIL CALIBRATIONS 33

FIGURE 4. SS values for each candidate fossil calibration node when used as the single calibration prior in each of the three calibration sets for a) nuclear DNA; b) mitochondrial DNA; and c) mitochondrial DNA with third codon position removed.

Fossil Coverage and Fossil Outliers values (in decreasing order) for the ML and MrBayes The four ultrametric trees obtained from the nDNA ultrametric trees were for nodes 9, 7, 5, and 4 (Table 2), data set differed in their proportional branch lengths, the same nodes identified as least congruent by the resulting in differing ESFi values for the candidate fossil cross-validation analyses. These four nodes also had calibrations (Table 2). Nonetheless, the four highest ESFi the highest ESFi values for the BEAST ultrametric tree 34 SYSTEMATIC BIOLOGY VOL. 61

FIGURE 5. Effect of sequentially removing candidate fossil calibrated nodes on s, the average squared deviation of Di values for the remain- ing fossil calibrations in each set. a) Nuclear DNA s values for three calibration sets. Fossils were removed based on highest to lowest SS values calculated from all 10 fossil-calibrated nodes. Removal order (shown on the x-axis) of the first four most incongruent fossils was identical for each calibration set but then differed between sets. b) mtDNAs values for three calibration sets when fossils were removed based on highest to lowest SS values calculated for all 10 fossil-calibrated nodes. but in different decreasing order (Table 2). Lack of res- http://www.sysbio.oxfordjournals.org/) suggest that olution in the ML and Bayesian nDNA trees resulted it is not possible to accurately place them on the phy- in nodes 4 and 5 forming a polytomy: As such, it was logeny (despite their use to date caenophidian diver- not possible to evaluate the alternative placements of gences in previous studies: Guicking et al. 2006; Alfaro this fossil calibration (as the ESFi values for the stem et al. 2008). As such, we excluded them from the outlier and crown placement were identical). Moreover, is- analysis. sues regarding the taxonomic affinities of these fossils Nodes 7 and 9 were the shallower crown placements (Table 1; and Supplementary material A available from of the two candidate fossil calibrations, for which the 2012 LUKOSCHEK ET AL.—EVALUATING FOSSIL CALIBRATIONS 35

alternative deeper stem placements also were evalu- ated. Obviously, the candidate fossils cannot correctly be assigned to both the stem and crown nodes so, prior to testing whether the distributions of ESF values con- d 218 137 128 161 190 192 i formed to uniform distributions, we removed the ESFi 10 values for the corresponding stem placements of each fossil (nodes 6 and 8). The resulting distributions of 8 140 6 148 4 185 7 187 1 197 95 203 205 ESF values for the BEAST and ML ultrametric trees Set C Set B Set B

Set C i Set A Set A ML - log Root 97 3- 2- 3- 3- 2- 2- (under both the additive and log penalty functions) mtDNA noThir were strongly rejected as belonging to uniform distri- butions (BEAST P < 0.05; ML trees P < 0.005 in both cases); however, this was not the case for the MrBayes tree (0.20 < P > 0.10). These inconsistent results high- 193 165 227 247 267 281 light the sensitivity of this approach to differences in proportional branch lengths obtained from ultrametric trees obtained using different methods (see below for further discussion). Given that the weight of evidence Set B Set B Set C Set C Set A Set A suggested that crown placement of the Naja fossil was an outlier, we removed the ESFi values for node 9 and reinserted the ESFi values for the corresponding stem placement of the fossil (node 8). The resulting distribu- 178 3- 148 2- 209 3- 223 6 245 246 4 265 254 2- tions of ESFi values for the MrBayes and ML ultrametric

Mitochondrial DNA trees also were rejected as belonging to uniform dis- tributions, suggesting that the crown placement of the 1 213 8 232 8 221 1 241 6 235 2- 4 247 3- 9 373 9 404 57 271 299 5 7putative 286 324 Laticauda fossil at node 7 also is an outlier. Set B Set B Set C Set C Set A Set A Root 97 Root 97 MrBayes - log 1 ML - log 10 2- 3- 3- 2- 3- 2- However, this was not the case for the BEAST ultramet- ric tree (Table 2). We then removed the ESFi values for node 7 (from the ML and MrBayes ESFi distributions) and inserted the ESFi values for the stem placement of the fossil at node 6. The resulting distributions of 84 57 72 85 97 99

130 ESFi values were not rejected as belonging to uniform distributions. In terms of the MrBayes tree, the inclu- sion of ESFi values for both potential outliers (nodes 7

BEAST and 9) may have resulted in the artifact mentioned by Set B Set B Set C Set C Set A Set A Marshall (2008), whereby the larger ESFi values of out- 3- liers group together making it impossible to distinguish the resultant distribution from a uniform distribution (thereby failing to identify node 9 as an outlier). In or- 86 2- 130 3- 140 2- 148 6 85 165 8 88 193 Root 97 der to explore this possibility, we removed the ESFi for node 7 and retained the ESFi of the corresponding stem placement at node 6. The resulting distribution of ESFi values did not conform to a uniform distribu- Set B Set B Set C Set C Set A Set A tion, supporting node 9 as an outlier. Overestimation

DNA of shorter branches has recently been demonstrated for Bayesian approaches (Schwartz and Mueller 2010), and

79 2- the smaller difference between ESFi values for nodes 9 methods 115 2 - 118 1 138 1 85 135 3- 136 8 168 2- 159 6 224 Nuclear and 7 for the Bayesian than ML trees may reflect over- estimation of short branches in the crown Naja clade by MrBayes. 8 120 3- 1 128 2- 6 152 3- 4 194 4 250 7 579 194 242 370 5 7 9The 250 361 494 proportional 4 9 5 134 168 branch 171 lengths and corresponding Set B Set C Set C Set A Set A Set B ordering of ESFi values for the ultrametric trees ob- tained from optimal mtDNA ML and MrBayes and the

candidate fossil calibrations for nuclear and mitochondrial data sets calculated using proportional branch lengths obtained from uncalibrated mtDNA3rdExcl ML trees were different from those ob-

for tained from nDNA (Table 2). For the both mtDNA trees, i 76 2- 97 Root 97 Root 97 3- 114 2- 113 3- 122 130 2- 123 132 3- 137 156 3- 196 196 235 250 the crown nodes 5, 7, and 9 still had the highest ESFi ESF

- log 3200 ML - add 3200 ML - log 320 values, whereas for the mtDNA3rdExcl tree, the node

2. 2 Set C had the highest ESFi value (Table 2). However, The BEAST nDNA chronogram was obtained by fixing the root to an arbitrary value of 100. The remaining ultrametrictrees were produced in r8s using the optimal ML and no ESF Node no ESF Node no ESF Node no ESF Node no ESF Node no ESF Node no ESF BLE the distributions of ESFi values conformed to unifor- A T MrBayes mity for all three mitochondrial ultrametric trees (ML Set B Set A Set A Set C Set C Set B Notes: 2- 3- Bayesian (MrBayes) phylogenies. Uncalibrated ultrametricpenalty trees function were and obtained the byeach optimal fixing ultrametric smoothing the tree parameter indicate root the obtaineddetails. fossil from withto an arbitrary cross-validation the (shown highest in empiricalvalue coverage the after columnof 100 removing heading). fossils Nodes identifiedand andusingas outliers corresponding PL with (i.e.,the logarithmic ESF values not conforming highlighted (log) or additive (add) in to bolda uniform distribution). for text for more See Root 2- ultrametric trees produced using different Node 8 1 2- 3- 6 3- 4 5 7 9 and MrBayes), and this result was true for distributions 36 SYSTEMATIC BIOLOGY VOL. 61 including just one potential crown node outlier (and the mitochondrial node 1 age estimates for all calibration corresponding stem placement of the other fossil): Thus, sets were similar to the calibration prior and to nu- no outliers were identified. clear DNA age estimates, and this was true for both mitochondrial data sets (Fig. 6). Nonetheless, the ten- dency for mtDNA to return older age estimates at shal- low nodes and the tree root was much pronounced Evaluating Multicalibration Sets Using Bayesian Analyses when the third codon positions were excluded, with There were consistent differences in the plausible sets mtDNA3rdExcl age estimates for nodes 6 and 8 age in- of congruent fossil calibrations identified from the cross- termediate to the nDNA and mtDNA age estimates and validations from nuclear and mitochondrial DNA, and tending to converge on mean nDNA age estimates for the fossil outliers identified from nuclear but not mito- node 3 and the tree root (Fig. 6). Although mitochondrial chondrial data based on ESFi values. These differences age estimates for node 2 from the entire data set showed are almost certainly due to the effects of nucleotide sat- the same tendency as nuclear ages to converge on the uration for mtDNA (see Discussion section). As such, unconstrained set D age estimates (irrespective of the we conducted the multicalibration analyses using the calibration prior used), this was not the case for mtDNA six fossil-calibrated nodes selected by the nuclear data. with third codon positions excluded (Fig. 6). Indeed, Multicalibration analyses using nuclear DNA re- with the exception of set C, the node 2 mtDNA3rdExcl vealed similarities and differences between the esti- age estimates tended to converge on the calibration mated mean ages and 95% HPD intervals for the six prior resulting in age estimates that were younger than calibrated nodes across calibration sets A, B, C, and D. the corresponding nDNA estimates, and this was also The most striking similarities were for the four fossil cal- true for the node 3 set A age estimate (Fig. 6). ibrations common to each calibration set (tree root and nodes 1, 6, and 8), for which the means and minimum 95% HPD intervals were very similar to their respective calibration priors (<5% in all cases), whereas maximum DISCUSSION 95% HPD intervals invariably were smaller than the Increasing awareness of the importance of identifying calibrations (Fig. 6). By contrast, age estimates for nodes reliable fossils to calibrate molecular clocks has resulted 2 and 3 differed considerably between calibration sets, in the development of several methods for evaluating in part reflecting the influence of their calibration priors and employing fossil calibrations (reviewed by Ho and but also reflecting inconsistencies between these priors Phillips 2009). Each approach has advantages and limi- and the other four fossil calibrations (Fig. 6). Moreover, tations, as we demonstrate by comparing three different age estimates for nodes 2 and 3 tended to converge on approaches with particular emphasis on the impact of ages estimated by set D (Fig. 6), in which nodes 2 and nucleotide saturation on the fossils selected. 3 were not constrained. This tendency was most pro- The cross-validation method (Near et al. 2005) dis- nounced for node 2, for which the set A age estimate cards calibrations until an internally consistent set is was far more similar to the set D estimate than to the set obtained, and in the process, may discard calibrations A calibration prior. Indeed, the set A prior and posterior with the best temporal coverage because they are in- distributions barely overlapped (Fig. 6). Similarly, the consistent with the remaining calibrations (see Marshall set B estimated age for node 2 also was closer to the set 2008 for indepth discussion of this issue). Nonetheless, D estimate than to the set B calibration prior, with the set the method has been used in several recent studies B maximum age estimate 70 million years younger than (Near and Sanderson 2004; Noonan and Chippindale its calibration prior (Fig. 6). Set C returned a node 2 age 2006b; Rutschmann et al. 2007; Alfaro et al. 2008). By estimate that was similar to both its calibration prior contrast, the use of ESF aims to identify one fossil with and the set D age estimate for this node, although its the best empirical coverage (Marshall 2008); however, minimum 95% HPD interval was younger than the hard accurate results are highly dependant on meeting the as- minimum bound of the prior. The node 3 age nDNA es- sumptions of the method (see below). Unlike the cross- timates were more similar to their respective calibration validation approach, ESFs have only been used in one priors, but again, posterior distributions diverged from previous study (Davis et al. 2009). This study obtained priors towards the unconstrained set D age estimate. an ultrametric tree in r8s using PL with log penalty The set A estimated mean age was slightly older than its function (following the advice of Marshall 2008), based calibration prior, but posterior and prior distributions on empirical evidence that PL using the log penalty were identical, whereas the set C age estimate also was function produces the most reliable ultrametric trees identical to the mean and minimum bounds of the cali- (Smith et al. 2006). However, Davis et al. (2009) com- bration prior (Fig. 6). The set B estimated mean age and ment that their resultant dates were much older than minimum 95% HPD were younger than the calibration expected for several lineages. Our study demonstrated prior (Fig. 6). that ultrametric trees generated from ML and Bayesian Mitochondrial age estimates were invariably older nDNA phylogenies using the log penalty function were for the shallower nodes 3, 6, and 8 than their respec- incongruent in terms of the magnitude and order of tive calibration priors and, with one exception, also for the ESFs (Table 2) and the fossil outliers identified. By the corresponding nDNA age estimates. By contrast, contrast, results from the ML ultrametric tree using the 2012 LUKOSCHEK ET AL.—EVALUATING FOSSIL CALIBRATIONS 37

FIGURE 6. Bayesian multifossil calibration analyses showing fossil calibration priors and posterior distributions of MA estimates (mean and 95% HPD intervals) at six fossil-calibrated nodes using four calibration sets (A, B, C, D). Each calibration set comprised four calibration priors that were identical among sets (tree root, nodes 1, 6, and 8) and two priors that differed among sets (nodes 2 and 3). Lognormal calibration priors are shown as wider shaded bars with the lognormal mean shown as a black square on the bar. MA estimates for nuclear (black bars), mitochondrial (white bars), and mitochondrial DNA with third codon position removed (gray bars) are shown in pairs for each calibration set at each node. Bars indicate 95% HPDs with estimated mean ages indicated by black squares. At nodes 2 and 3, the respective calibration prior is shown immediately below the corresponding nuclear and mitochondrial age estimates. Calibration priors at the other four nodes are shown below all four sets of MA estimates. Prior and posterior distributions are shown on a diagrammatic chronogram depicting the backbone of the phylogeny; however, this chronogram does not represent the results of any specific analysis. 38 SYSTEMATIC BIOLOGY VOL. 61 additive penalty function were more similar to those (Fig. 1c). The addition of multiple calibrations amelio- obtained for the MrBayes tree. At the very least, these rated this effect (Fig. 6), presumably resulting in more results suggest that the findings of Smith et al. (2006) accurately estimated branch lengths (time) throughout are not universal and various approaches for obtaining the chronogram. Although the mtDNA3rdExcl ultra- uncalibrated ultrametric trees need to be evaluated for metric tree generated in r8s did not suffer from simi- reliability and consistency of results. larly stretched basal branches (results not shown), the These conflicting results highlight a major limitation approach of Marshall (2008) ultimately relies on just of ESFs, which is the reliance on accurate proportional one calibration to date the phylogeny, and our analyses branch lengths (which we do not know, or the entire demonstrated the highly variable results that could be dating process would be considerably easier). The final obtained using different methods to generate the ultra- step of Marshall’s (2008) approach uses the lineage with metric tree (Table 2). Moreover, although this approach the highest coverage to calibrate the tree and estimate might be realistic for groups with exceptionally good divergences. Our nuclear DNA results suggest that the fossil records (provided that the hurdle of obtaining a set B date for node 3 had the highest coverage (Table 2). reliable ultrametric tree can be overcome), on its own, However, we were evaluating several controversial FAs it is likely to produce highly misleading results in the for this node (Table 1) and, by default, the highest cov- majority of cases where the fossil record is less than erage will be assigned to the oldest fossil so ESFs cannot ideal. be used for this task. The third method we evaluated, which uses a Bayesian framework to evaluate several candidate fossils in a multicalibration framework (Sanders and Lee 2007), is Evaluating the Effects of Saturation on Identifying ideally suited for the task. However, one limitation of Reliable Calibrations this method is that at least some of the candidate cal- The differences in the plausible sets of congruent ibrations are assumed to be reliable, with just one or fossil calibrations identified from the cross-validations two calibrations being evaluated. In addition, multiple from nuclear and mitochondrial DNA, as well as fossil calibrations can interact with each other to generate dif- outliers identified from nuclear but not mitochondrial ferent effective priors; However, the extent of this effect data based on ESFi values, can be entirely accounted can be evaluated explicitly (Drummond et al. 2006), and for by saturation effects. The saturation plots revealed our analyses of priors with empty alignments indicated strong mitochondrial saturation in the data set (Fig. 2), that this was not an issue in our study. Nonetheless, particularly the third codon position (Fig. 2d). The sat- one limitation of our study was that the calibrations for uration effects on tree topology, and corresponding age nodes 2 and 3 were evaluated in pairs based on their estimates of fossil-calibrated nodes, are clearly evident previous use in other studies and, as such, the best com- in Figure 1. Compared with the nuclear chronogram bination may not have been included in our analyses. (Fig. 1a), the chronogram from the entire mitochon- Rutschmann et al. (2007) recently presented an alter- drial data set had compressed internal branches, which native approach for evaluating the internal consistency essentially reduced the total distance (time) between of fossil calibrations that compared s values from all nodes 1 and 9 on the chronogram (Fig. 1b). This result possible combinations of dates and nodes (72 combina- was also true for the nDNA and mtDNA ultrametric tions in our case) (Rutschmann et al. 2007). However, trees generated in r8s (not shown). this approach will be subject to the same saturation ef- In terms of the cross-validations, the three sets of nu- fects demonstrated in our study and, as such, the effects clear cross-validations identified the same four shallow of using rapidly and slowly evolving gene regions or fossil-calibrated nodes (4, 5, 7, and 9) as least congruent codon positions for evaluating the internal consistency with the six other candidate calibrations tested. These of calibrations will need to be considered. nodes also had the highest ESFi values (Table 2), with There is a growing consensus that the advantages of nodes 7 and 9 being identified as outliers by three of using multiple independent fossil calibrations signif- the four nuclear DNA ultrametric trees. By contrast, icantly outweigh any disadvantages (Ho and Phillips mitochondrial cross-validations identified nodes 6 and 2009). Multiple calibrations can ameliorate the effects of 8 as least congruent for sets B and C (and also set A errors in fossil dates and/or the assignment of fossils to when the third codon positions were removed). Thus, certain nodes (Conroy and van Tuinen 2003; van Tuinen for two fossils (Naja and Laticauda), nuclear DNA fa- and Dyke 2004), provided that errors are not biased in vored stem placement (nodes 8 and 8), whereas mtDNA the same direction. Moreover, the use of multiple cali- favored crown placement (nodes 7 and 9), directly as brations allows the explicit modeling of rate variation the result of saturation effects. Specifically, if a crown among lineages. The limitations of using just one cali- group is constrained with the same fossil calibration as bration in BEAST analyses for modeling rate variation its respective stem group, the placement of a fossil at are highlighted in the chronogram from the mitochon- the shallower crown node will return older estimates drial data set with third codon positions removed: The at other nodes than stem placement, irrespective of two basal branches extending from the tree root on data type. However, because mitochondrial distances the BEAST chronogram were massively stretched and were artificially shortened (due to compression ofin- the remaining internal branches overly compressed ternal branches resulting from nucleotide saturation), 2012 LUKOSCHEK ET AL.—EVALUATING FOSSIL CALIBRATIONS 39 the tendency for crown placement to produce much How Wrong Can We Be? older age estimates for other fossil-calibrated nodes, Pulquerio and Nichols (2006) explored the many fac- which was so strongly apparent for nuclear DNA, dis- tors that can contribute to the highly variable dates ob- appeared for mtDNA: Instead, stem placement resulted tained using calibrated molecular clocks and posed the in younger age estimates at deeper fossil-calibrated question “how wrong can we be?” Given that the choice nodes. Similarly, the compressed internal branches for of fossil calibrations is fundamental in obtaining accu- mtDNA resulted in smaller differences between the rate dates, it is vital that the methods used to evaluate larger ESFi values; thus, ESFi distributions did not de- viate from uniformity with the result that fossil outliers candidate fossils are used with a clear understanding were not identified. Evaluating these results in terms of of the advantages and disadvantages of each, as well as the actual fossils (Table 1 and Supplementary material the effects of data type and other factors on the results. A) further suggests that misleading results were ob- Our study highlighted the fact that nucleotide satura- tained from the mitochondrial data due to the effects of tion strongly influences which fossil calibrations are saturation. identified as outliers by the cross-validations and ESFs. The effects of mitochondrial saturation are also ev- Previous studies that used cross-validations to evaluate ident in many studies estimating divergence times in fossil calibrations have tended to use some combination snakes. Studies that have relied primarily or entirely of nuclear and mitochondrial DNA (Near and Sander- on mitochondrial data (Nagy et al. 2003; Guicking et son 2004; Near et al. 2005; Noonan and Chippindale al. 2006; Burbrink and Lawson 2007; Wuster et al. 2007, 2006b; Alfaro et al. 2008) or nuclear and plastid DNA 2008; Alfaro et al. 2008; Kelly et al. 2009) have recovered (Rutschmann et al. 2007), and this was also the case 2-fold older age estimates for some advanced snake for the fossil coverage approach (Marshall 2008; Davis clades from mitochondrial sequence data (see table 1 et al. 2009). We also conducted many of the Bayesian in Kelly et al. 2009) than from nuclear sequence data single- and multicalibration analyses using a combined (Sanders and Lee 2008), even when almost exactly the mitochondrial and nuclear data set (with appropri- same calibrations were used (Sanders and Lee 2008; ate partitioning), and the results were very similar to Kelly et al. 2009). Jiang et al. (2007) demonstrated ac- those of the mitochondrial data (results not shown), celerated rates of mitochondrial evolution in advanced indicating that combining nuclear and mitochondrial snakes, suggesting that the extent of nucleotide satura- data does not inevitably counteract the effects of mi- tion may be more pronounced than in other taxonomic tochondrial saturation (but see Brandley et al. 2011). groups. Nonetheless, the effects of mitochondrial satu- Given that nucleotide saturation typically has the ef- ration for estimating branch lengths and dating diver- fect of compressing basal branches, it is most likely that gences have been well documented for other vertebrate older calibrations at shallow nodes will be identified as groups such as agamid lizards (Hugall and Lee 2004), more congruent with candidate calibrations at deeper squamates (Townsend et al. 2004), tetrapods (Hugall nodes by cross-validations using sequence data with et al. 2007), rodents (Jansa et al. 2006), and across all high levels of saturation yet not be identified as outliers vertebrates (Phillips 2009). In addition, Brandley et al. based on the distribution of ESFs, as was the case in (2011) recently demonstrated the importance of data our study. If these calibrations subsequently are used partitioning for obtaining accurate divergence estimates in dating analyses that also rely partially or entirely on in lizards, particularly drawing attention to the effects of saturated DNA, the resultant age estimates will suffer highly saturated mitochondrial third codon positions. from the compounded effects of two sources of error We found similarly high levels of third codon mito- from nucleotide saturation. Recent studies have demon- chondrial saturation (Fig. 2d), yet removing third codon strated the potential benefits of appropriate data parti- nucleotides did not ameliorate saturation effects for tioning (Brandley et al. 2011) and the use of RY coding the single-fossil cross-validations (Figs. 2–4). However, for mitochondrial data (Phillips 2009) for ameliorating removing third codon nucleotides improved the mul- saturation effects on estimating divergence dates. We ticalibration BEAST analyses. Posterior distributions demonstrate that excluding third codon positions can for the six fossil-calibrated nodes (with third codon also ameliorate saturation effects in Bayesian multical- nucleotides excluded) typically converged on age esti- ibration analyses, with relevance both for evaluating mates from nuclear DNA (deeper nodes) or were inter- fossil calibrations and estimating divergences. mediate between the mitochondrial (entire) and nuclear To our knowledge, there has been no previous DNA results (shallower nodes). These results suggest evaluation of the effects of data type (saturation) on that removing highly saturated third codon positions approaches for evaluating fossil calibrations (Near and in multicalibration Bayesian analyses might provide a Sanderson 2004; Rutschmann et al. 2007; Sanders and way forward for dealing with mitochondrial saturation, Lee 2007). Given that these approaches are in their both for evaluating fossil calibrations and for estimating infancy, further exploration of the effects of using divergences. More importantly, our study demonstrates sequence data with different evolutionary rates for that saturation strongly influenced different approaches evaluating candidate fossils and their most appropri- for evaluating candidate calibrations and highlights the ate placement on a phylogeny is obviously needed. In need to carefully consider the effects of data type when the meantime, we urge researchers evaluating candi- evaluating fossils. date fossil calibrations to utilize several of the methods 40 SYSTEMATIC BIOLOGY VOL. 61 currently available and critically compare the results. Drummond A.J., Ho S.Y.W., Rawlence N., Rambaut A. 2007. A rough Moreover, we think it imperative that researches con- guide to BEAST 1.4. 1–42. duct these analyses using separate nuclear and mito- Drummond A.J., Rambaut A. 2007. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol. Biol. 7:214. chondrial data sets (rather than combining the data) Forstner M.R.J., Davis S.K., Arevalo E. 1995. Support for the and use one or more of the various approaches for hypothesis of anguimorph ancestry for the suborder serpentes from ameliorating mitochondrial saturation and compare the phylogenetic analysis of mitochondrial DNA sequences. Mol. Phy- results, particularly when evaluating fossils that span logenet. Evol. 4:93–102. Gardner J.D., Cifelli R.L. 1999. A primitive snake from the Cretaceous very different temporal depths on the tree. of Utah. Special Papers Palaeontology. Volume 60. p. 87–100. Gernhard T. 2008. The conditioned reconstructed response. J. Theor. Biol. 253:769–778. SUPPLEMENTARY MATERIAL Graur D., Martin W. 2004. Reading the entrails of chickens: molecular timescales of evolution and the illusion of precision. Trends Genet. Supplementary material, including data files and/or 20:80–86. online-only appendices, can be found at http://www. Groth J.G., Barrowclough G.F. 1999. Basal divergences in birds and the sysbio.oxfordjournals.org/. phylogenetic utility of the nuclear RAG-1 gene. Mol. Phylogenet. Evol. 12:115–123. Guicking D., Lawson R., Joger U., Wink M. 2006. Evolution and phy- logeny of the genus Natrix (Serpentes: ). Biol. J. Linn. FUNDING Soc. 87:127–143. This study was supported by funds from the Univer- Head J., Holroyd P.A., Hutchison J.H., Ciochon R.L. 2005. First report of snakes (Serpentes) from the late middle Eocene Pondaung for- sity of California, Irvine. mation, Myanmar. J. Vert. Paleontol. 25:246–250. Ho S.Y.W., Phillips M.J. 2009. Accounting for calibration uncertaintly in phylogenetic estimation of evolutionary divergence times. Syst. ACKNOWLEDGMENTS Biol. 58:367–380. Holman J.A. 1984. Texasophis galbreathi, new species, the earliest New We thank Matt Phillips for insightful comments and World colubrid snake. J. Vert. Paleontol. 3:223–225. for invaluable assistance in generating ultrametric trees Hugall A.F., Foster R., Lee M.S.Y. 2007. Calibration choice, rate in r8s. We thank Andrei Tatarenkov, Jim McGuire, Frank smoothing, and the pattern of tetrapod diversification according to the long nuclear gene RAG-1. Syst. Biol. 56:543–563. Anderson, and one anonymous reviewer for exten- Hugall A.F., Lee M.S.Y. 2004. Molecular claims of Gondwanan age for sive constructive comments that significantly improved Australian agamid lizards are untenable. Mol. Biol. Evol. 21:2102– the manuscript. J.S.K. thanks the Australian Research 2110. Council for ongoing support. V.L. thanks the ARC Cen- Hugall A.F., Lee M.Y.L. 2007. The likelihood node density effect and tre of Excellence for Coral Reef Studies at James Cook consequences for evolutionary studies of molecular rates. Evolu- tion. 61:2293–2307. University for support while finalizing the manuscript. Jansa S.A., Barker F.K., Heaney L.R. 2006. The pattern and timing of diversification of Philippine endemic rodents: evidence from mito- chondrial and nuclear gene sequences. Syst. Biol. 55:73–88. REFERENCES Jiang Z.J., Castoe T.A., Austin C.C., Burbrink F.T., Herron M.D., McGuire J.A., Parkinson C.L., Pollock D.D. 2007. Comparative mi- Albino A.M. 2000. New record of snakes from the Cretaceous of Patag- tochondrial genomics of snakes: extraordinary substitution rate dy- onia (Argentina). Geodiversitas. 22:247–253. namics and functionality of the duplicate control region. BMC Evol. Alfaro M.E., Karns D.R., Voris H.H., Brock C.D, Stuart B.L. 2008. Biol. 7:123 Phylogeny, evolutionary history, and biogeography of Oriental- Kelly C.M.R., Barker N.P., Villet M.H., Broadley D.G. 2009. Phylogeny, Australian rear-fanged water snakes (Colubroidea: Homalopsidae) biogeography and classification of the superfamily Elapoidea: a inferred from mitochondrial and nuclear DNA sequences. Mol. rapid radiation in the late Eocence. Cladistics. 25:38–63. Phylogenet. Evol. 46:576–593. Keogh J.S. 1998. Molecular phylogeny of elapid snakes and a consid- Benton M.J., Ayala F.J. 2003. Dating the tree of life. Science. 300:1698– eration of their biogeographic history. Biol. J. Linn. Soc. 63:177–203. 1700. Keogh J.S., Shine R., Donnellan S. 1998. Phylogenetic relationships Brandley M.C., Wang Y., Guo X., Montes de Oca A.N., Feria-Ortiz of terrestrial Australo-Papuan elapid snakes (Subfamily Hydrophi- M., Hikida T., Ota H. 2011. Accommodating heterogeneous rates inae) based on cytochrome b and 16S rRNA sequences. Mol. Phylo- of evolution in molecular divergence dating methods: an example genet. Evol. 10:67–81. using intercontinental dispersal of Plestiodon (Eumeces) lizards. Syst. Kuch U., Muller J., Modden C., Mebs D. 2006. Snake fangs from Biol. 60:3–15. the lower Miocene of Germany: evolutionary stability of perfect Burbrink F.R., Lawson R. 2007. How and when did Old World weapons. Naturewissenschaften. 93:84–87. ratsnakes disperse into the New World? Mol. Phylogenet. Evol. Labarga A., Valentin F., Andersson M., Lopez R. 2007. Web services 43:173–189. at the European bioinformatics institute. Nucleic Acids Res. (Web Conroy C.J., van Tuinen M. 2003. Extracting time from phylogenies: Services Issue):1–6. positive interplay between fossil and genetic data. J. Mammal. Lawson R., Slowinski J.B., Crother B.I., Burbrink F.R. 2005. Phylogeny 84:444–455. of the Colubroidea (Serpentes): new evidence from mitochondrial Davis R.B., Baldauf S.L., Mayhew P.J. 2009. Eusociality and the success and nuclear genes. Mol. Phylogenet. Evol. 37:581–601. of the termites: insights from a supertree of dictyopteran families. Lee M.S.Y. 1999. Molecular clock calibrations and metozoan diver- J. Evol. Biol. 22:1750–1761. gence dates. J. Mol. Evol. 49:385–391. Donoghue M.J., Benton M.J. 2007. Rocks and clocks: calibrating the Lukoschek V., Keogh J.S. 2006. Molecular phylogeny of sea snakes Tree of Life using fossils and molecules. Trends Ecol. Evol. 22:424– reveals a rapidly diverged adaptive radiation. Biol. J. Linn. Soc. 431. 89:523–539. Doyle J.A., Donoghue, M.J. 1993. Phylogenies and angiosperm diver- Lukoschek V., Waycott M., Marsh H. 2007. Phylogeographic structure sification. Paleobiology. 19:141–167. of the olive sea snake, Aipysurus laevis (Hydrophiinae) indicates re- Drummond A.J., Ho S.Y.W., Phillips M.J., Rambaut A. 2006. Relaxed cent Pleistocene range expansion but low contemporary gene flow. phylogenetics and dating with confidence. PLoS Biol. 4:699–710. Mol. Ecol. 16:3406–3422. 2012 LUKOSCHEK ET AL.—EVALUATING FOSSIL CALIBRATIONS 41

Magallon S., Sanderson M.J. 2001. Absolute diversification rates in an- Saint K.M., Austin C.C., Donnellan S., Hutchison M.N. 1998. C-mos, giosperm clades. Evolution. 55:1762–1780. a nuclear marker useful for squamate phylogenetic analysis. Mol. Marshall C.R. 2008. A simple method for bracketing absolute diver- Phylogenet. Evol. 10:259–263. gence times on molecular phylogenies using multiple fossil calibra- Sanders K., Lee M.S.Y. 2007. Evaluating molecular clock calibrations tion points. Am. Nat. 171:726–742. using Bayesian analyses with soft and hard bounds. Biol. Lett. McDowell J.R. 1987. Systematics. In: Seigel R.A., Collins J.T., Novak 3:275–279. S.S., editors. Snakes—ecology and evolutionary biology. New York: Sanders K.L., Lee M.Y.L. 2008. Molecular evidence for a rapid late- Macmillan. p. 3–50. Miocene radiation of Australasian venomous snakes (Elapidae: McGuire J.A., Witt C.C., Altshuler D.L., Remsen J.V. Jr. 2007. Phylo- Colubroidea). Mol. Phylogenet. Evol. 46:1180–1188. genetic systematics and biogeography of hummingbirds: Bayesian Sanders K.L., Lee M.Y.L., Leijs R., Foster R., Keogh J.S. 2008. Molec- and maximum likelihood analyses of partitioned data and selection ular phylogeny and divergence dates for Australasian elapids and of an appropriate partitioning strategy. Syst. Biol. 56:837–856. sea snakes (Hydrophiinae): evidence from seven genes for rapid Nagy Z.T., Joger U., Wink M., Glaw F., Vences M. 2003. Multiple evolutionary radiations. J. Evol. Biol. 21:682–695. colonisation of Madagascar and Socotra by colubrid snakes: evi- Sanderson M.J. 1997. A nonparametric approach to estimating di- dence from nuclear and mitochondrial gene phylogenies. Proc. R. vergence times in the absence of rate constancy. Mol. Biol. Evol. Soc. Lond. B. Biol. Sci. 270:2613–2621. 14:1218–1231. Near T.J., Meylan P.A., Shaffer H.B. 2005. Assessing concordance of Sanderson M.J. 2002. Estimating absolute rates of molecular evolution fossil calibration points in molecular clock studies: an example us- and divergence times: a penalized likelihood approach. Mol. Biol. ing turtles. Am. Nat. 165:137–146. Evol. 19:101–109. Near T.J., Sanderson M.J. 2004. Assessing the quality of molecular Sanderson M.J. 2003. r8s: inferring absolute rates of molecular evo- divergence time estimates by fossil calibrations and fossil-based lution and divergence times in the absence of a molecular clock. model selection. Philos. Trans. R. Soc. Lond. B. Biol. Sci. 359:1477– Bioinformatics. 19:301–302. 1483. Scanlon J.D., Lee M.S.Y., Archer M. 2003. Mid-Tertiary elapid snakes Noonan B.P., Chippindale P.T. 2006a. Dispersal and vicariance: the (, Colubroidea) from Riversleigh, northern Australia: complex evolutionary history of boid snakes. Mol. Phylogenet. early steps in a continent-wide adaptive radiation. Geobios. 36:573– Evol. 40:347–358. 601. Noonan B.P., Chippindale P.T. 2006b. Vicariant origin of Malagasy Schwartz R.S., Mueller R.L. 2010. Branch length estimation and diver- supports late Cretaceous Antarctic land bridge. Am. Nat. gence dating: estimates of error in Bayesian and maximum likeli- 168:730–741. hood frameworks. BMC Evol. Biol. 10:5 Palumbi S.R. 1996. Nucleic acids II: the polymerase chain reaction. In: Slowinski J.B., Keogh J.S. 2000. Phylogenetic relationships of elapid Hillis D.M., Moritz C., Mable B.K., editors. Molecular systematics. snakes based on cytochrome b mtDNA sequences. Mol. Phylogenet. Sunderland (MA): Sinauer Associates, Inc. p. 205–247. Evol. 15:157–164. Parmley D., Holman J.A. 2003. Nebraskophis HOLMAN from the Late Slowinski J.B., Knight A., Rooney A.P. 1997. Inferring species trees Eocene of Georgia (USA), the oldest known North American colu- from gene trees: a phylogenetic analysis of the Elapidae (Serpentes) brid snake. Acta Zool. Cracov. 46:1–8. based on the amino acid sequences of venom proteins. Mol. Phylo- Phillips M.J. 2009. Branch-length estimation bias misleads molecu- genet. Evol. 8:349–362. lar dating for a vertebrate mitochondrial phylogeny. Gene. 441: Smith A.B., Pisani D., Mackenzie-Dodds J.A., Stockley B., Webster B.L., 132–140. Littlewood D.T.J. 2006. Testing the molecular clock: molecular and Posada D., Crandall K.A. 1998. MODELTEST: testing the model of paleontological estimates of divergence times in the Echinoidea DNA substitutions. Bioinformatics. 14:817–818. (Echinodermata). Mol. Biol. Evol. 23:1832–1851. Pulquerio M.J.F., Nichols R.A. 2006. Dates from the molecular clock: Swofford D. 2000. PAUP*: phylogenetic analysis using parsimony how wrong can we be? Trends Ecol. Evol. 22:180–184. (*and other methods). 4th ed. Sunderland (MA): Sinauer Asso- Pyron R.A., Burbrink F.T., Colli G.R., Montes A.N., de Oca Vitt L.J., ciates, Inc. Kuczynski C.A., Wiens J.J. 2011. The phylogeny of advanced snakes Szyndlar Z., Rage J.C. 1990. West Palearctic cobras of the genus Naja (Colubroidea), with discovery of a new subfamily and compari- (Serpentes: Elapidae): interrelationships among extinct and extant son of support methods for likelihood trees. Mol. Phylogenet. Evol. species. Amphibia-Reptilia. 11:385–400. 58:329–342. Szyndlar Z., Rage J.C. 1999. Oldest fossil vipers (Serpentes: Viperidae) Rage J.C. 1984. Serpentes. In: Encyclopedia of paleoherpetology. from the Old World. Kaupia. 8:9–20. Stuttgart (Germany): Gustav Fischer Verlag. p. 1–79. Townsend T.M., Larson A., Louis E., Macey J.R. 2004. Molecular phy- Rage J.C. 1987. Fossil history. In: Seigel R.A., Collins J.T., Novak logenetics of squamata: the position of snakes, amphisbaenians, S.S., editors. Snakes—ecology and evolutionary biology. New York: and dibamids, and the root of the squamate tree. Syst. Biol. 53: Macmillan. p. 51–76. 735–757. Rage J.C. 1988. The oldest known colubrid snakes. State of the art. Acta van Tuinen M., Dyke G.J. 2004. Calibration of galliform molecular Zool. Cracov. 31:457–474. clocks using multiple fossils and genetic partitions. Mol. Phylo- Rage J.C. 2001. Fossil snakes from the Paleocene of Sao Jose de Itaborai, genet. Evol. 30:74–86. Brazil. Part II. Boidae. Palaeovertebrata. 30:111–150. Vidal N., Delmas A.S., David P., Cruaud C., Couloux A., Hedges S.B. Rage J.C., Buffetaut E., Buffetaut-Tong H., Chaimanee Y., Ducrocq S., 2007. The phylogeny and classification of caenophidian snakes in- Jaeger J.J., Suteethorn V. 1992. A colubrid in the late Eocene of Thai- ferred from seven nuclear protein-coding genes. C. R. Biol. 330: land: the oldest known Colubridae (Reptilia, Serpentes). C. R. Acad. 182–187. Sci. 314:1085–1089. Wiens J.J., Brandley, M.C., Reeder, T.W. 2006. Why does a trait evolve Rage J.C., Gupta S.S., Prasad G.V.R. 2001. Amphibians and squamates multiple times within a clade? Repeated evolution of snakelike from the Neogene Siwalik beds of Jammu and Kashmir, India. body form in squamate reptiles. Evolution. 60:123–141. Palaont. Z. 75:197–208. Wuster W., Crookes S., Ineich I., Mane Y., Pook C.E., Trape J.-F., Rage J.C., Werner C. 1999. Mid-Cretaceous (Cenomanian) snakes from Broadley D.G. 2007. The phylogeny of cobras inferred from mi- Wadi Abu Hashim, Sudan: the earliest snake assemblage. Palaeon- tochondrial DNA sequences: evolution of venom spitting and the tol. Afr. 35:85–110. phylogeography of the African spitting cobras (Serpentes: Elapi- Rambaut A., Bromham L. 1998. Estimating divergence times from dae: Naja nigricollis complex). Mol. Phylogenet. Evol. 45:437–453. molecular sequences. Mol. Biol. Evol. 15:442–448. Wuster W., Peppin L., Pook C.E., Walker D.E. 2008. A nesting of vipers: Ronquist F., Huelsenbeck J.P. 2003. MrBayes 3: Bayesian phylogenetic phylogeny and historical biogeography of the Viperidae (Squa- inference under mixed models. Bioinformatics. 19:1572–1574. mata: Serpentes). Mol. Phylogenet. Evol. 49:445–459. Rutschmann F., Eriksson T., Salim K.A., Conti E. 2007. Assessing cal- Yang Z., Rannala B. 2006. Bayesian estimation of species divergence ibration uncertainty in molecular dating: the assignment of fossils times under a molecular clock using multiple fossil calibrations to alternative calibration points. Syst. Biol. 56:591–608. with soft bounds. Mol. Biol. Evol. 23:212–226. 42 SYSTEMATIC BIOLOGY VOL. 61 Continued RAG-1 AY662608 EU366435 GenBank mos c- EU366456 b EF193687 DQ525172 NC010974 or subfamily 16S rRNA ND4 Cyt BoidaeBoidae AM236348 AM236347 AM236348 AM236347 AM236348 AM236347 AF471115/AF544676 AY099964 AY988064 AY988063 Varanidae PythonidaePythonidae PT EF545061 AF544828 — — U69741 AY099993 AF544730/AY444035 DQ465557 AY444061 DQ465560 TyphlopidaeTyphlopidae DQ343649 AM236345 DQ343649 AM236345 DQ343649 AM236345 AF544717 AF544733 AY444062 — Family Hydrophiinae (Marine)Hydrophiinae (Marine)Hydrophiinae (Marine) DQ233989Hydrophiinae (Marine) DQ234000Hydrophiinae EF506658 (Marine) DQ234046Hydrophiinae FJ593195 (Marine) DQ234047 DQ233913Hydrophiinae FJ593199 (Marine) DQ234005 DQ233939Hydrophiinae FJ593201 (Marine) DQ234008 DQ233948Hydrophiinae FJ593202 (Marine) DQ234017 DQ233974Hydrophiinae FJ593205 (Marine) DQ234021 DQ233924Hydrophiinae FJ593209 (Marine) DQ234035 FJ587169 DQ233927 FJ593216 DQ234040 DQ233936 FJ587171 FJ593224 DQ234051 DQ233950 FJ587175 FJ593227 DQ233963 FJ587177 FJ593233 DQ233968 FJ587178 DQ233977 FJ587181 FJ587185 FJ587190 FJ587087 FJ587196 FJ587093 FJ587199 FJ587097 FJ587201 FJ587099 FJ587100 FJ587103 FJ587107 FJ587113 FJ587119 FJ587122 FJ587124 Hydrophiinae (Terrestrial)Hydrophiinae (Terrestrial) EU547176Hydrophiinae (Terrestrial) EU547150 EU547030 EU547171Hydrophiinae (Terrestrial) EU547007 EU547078Hydrophiinae (Terrestrial) EU547025 EU547138 EU547052Hydrophiinae (Terrestrial) EU547073 FJ587203 EU546998 EU547140 EU547040 FJ593191 EU547000 EU546940 EU366446/AF544702/AY05893 EU547042 FJ587153 EU366451 EU366433/AY487404 EU546935 EU366449 FJ587158 EU546901 EU366440 EU546896 FJ587082 1. Taxa used in three approaches for evaluating candidate fossil calibrations with details of sequences generated or retrieved from A spp. Hydrophiinae (Terrestrial) EU547177 EU547031 EU547079 EU546941 EU546902 spp. Acrochordidae AB177879 AB177879 AB177879 EU366454/EU366443 AY487388/AY988070 spp. Hydrophiinae (Terrestrial) EU547161 EU547016 EU547063 EU546926 EU546887 PENDIX P A spp. spp. Pythonidae EF545049 — EF545099 AF544723 AY988069 spp. Iguanidae NC009683 NC009683 NC009683 AF137525 AY662584 spp. Pythonidae EF545066 73760200 73760200 AF544675/AY099968 EU624119 d Taxa Calotes Ramphotyphlops braminus Typhlops jamaicensis Boa constrictor Varanus Eunectes notaeus Aspidites melanocephalus Loxocemus bicolor Morelia Python Acrochordus Aipysurus laevis Emydocephalus annulatus Hydrelaps darwiniensis Parahydrophis mertoni Acalyptophis peronii Astrotia stokesii Disteira major Hydrophis elegans Hydrophis pacificus Lapemis curtis Pelamis platurus Acanthophis Austrelaps superbus Cacophis squamulosus Hemiaspis dameli Hoplocephalus Laticauda colubrina Laticauda laticaudata Micropechis ikaheka Lizar Scoliophidian Taxa Henophidian Taxa Caenophidian Taxa Elapidae 2012 LUKOSCHEK ET AL.—EVALUATING FOSSIL CALIBRATIONS 43 — — Y662614 RAG-1 JF357956 JF412633 AY662611 AY487378 EU366432 EU546880 EU546888 EU546877 mos c- JF357955 AY058939 AY058938 AF471163 AF544686 AY486937 EU366445 EU546919 EU546927 EU546916 b Cyt JF357949 DQ897689 DQ897732 JF357944 JF357926JF357948 EU547086/AF217830 JF357930 JF357954 JF357939 EU366438/AY487389 JF357950 JF357932 JF357940 JF357945JF357946JF357947 JF357927 JF357928 JF357929 JF357936 JF357937 JF357938 AF544735/FJ387197 AF544678/AY058930 EF137421/AF544708 AY487395 AY487373 AY487411 AF158517 U49308 AF471085 AF471126/AF544694 AY487376 EU624272 AY058983 AF217827 AB008539 AB008539 AB008539 DQ343648 DQ343648 DQ343648 or subfamily 16S rRNA ND4 Elapinae Elapinae Elapinae Elapinae Elapinae Elapinae Elapinae Elapinae Viperinae JF357952 JF357934 JF357942 AF471130 — Crotalinae AY223672 AY223636 AY223598/AF191587 DQ469788/AF544680 DQ469790/AY487374 Natricinae AF158530 AY487800 AY487756 AF471121/AF544697 — Colubrinae Colubrinae L01770/AY188081 AF138746 EU180489 Boodontinae Xenodontinae ) Family Pseudoxyrhophiinae JF357951 JF357933 JF357941 AY058933 AY487377 Hydrophiinae (Terrestrial) EU547153 EU547010 EU547055 Hydrophiinae (Terrestrial) EU547162 EU547017 EU547064 Hydrophiinae (Terrestrial) EU547149 EU547003 EU547051 Continued 1. ( A spp. scutellatus spp. spp. spp. Crotalinae JF357953 JF357935 JF357943 AF435019 A PENDIX spp. spp. P A Oxyuranus Naja melanoleuca Naja nivea Coluber Dinodon semicarinatus fuliginosus Leioheterodon modestus Natrix natrix Bothrops/Bothriechis Gloydius Alsophis Bitis arietans Bungarus fasciatus Dendroaspis Elapsoidea Micrurus Naja kaouthia Naja naja Vermicella intermedia Suta fasciata Elapidae Colubridae Viperidae