<<

Syst. Biol. 63(1):31–54, 2014 © The Author(s) 2013. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For Permissions, please email: [email protected] DOI:10.1093/sysbio/syt058 Advance Access publication August 20, 2013

Accelerated Rate of Molecular Evolution for Vittarioid Ferns is Strong and Not Driven by Selection

CARL J. ROTHFELS1,2,∗ AND ERIC SCHUETTPELZ3,4
1Department of Biology, Duke University, Box 90338, Durham, NC 27708, USA; 2Department of Zoology, University of British Columbia, #4200-6270 University Blvd., Vancouver, BC V6T 1Z4, Canada; 3Department of Biology and Marine Biology, University of North Carolina Wilmington, 601 South College Road, Wilmington, NC 28403, USA; and 4Department of Botany (MRC 166), National Museum of Natural History, Smithsonian Institution, PO Box 37012, Washington DC 20013-7012, USA
∗Correspondence to be sent to: Department of Zoology, University of British Columbia, #4200-6270 University Blvd., Vancouver, BC V6T 1Z4, Canada; E-mail: [email protected].

Received 16 January 2013; reviews returned 26 March 2013; accepted 15 August 2013 Associate Editor: Roberta Mason-Gamer

Abstract.—Molecular evolutionary rate heterogeneity, the violation of a molecular clock, is a prominent feature of many phylogenetic data sets. It has particular importance to systematists not only because of its biological implications, but also for its practical effects on our ability to infer and date evolutionary events. Here we show, using both maximum likelihood and Bayesian approaches, that a remarkably strong increase in substitution rate in the vittarioid ferns is consistent across the nuclear and plastid genomes. Contrary to some expectations, this rate increase is not due to selective forces acting at the protein level on our focal loci. The vittarioids bear no signature of the change in the relative strengths of selection and drift that one would expect if the rate increase was caused by altered post-mutation fixation rates. Instead, the substitution rate increase appears to stem from an elevated supply of mutations, perhaps limited to the vittarioid ancestral branch. This generalized rate increase is accompanied by extensive fine-scale heterogeneity in rates across loci, genomes, and taxa. Our analyses demonstrate the effectiveness and flexibility of trait-free investigations of rate heterogeneity within a model-selection framework, emphasize the importance of explicit tests for signatures of selection prior to invoking selection-related or demography-based explanations for patterns of rate variation, and illustrate some unexpected nuances in the behavior of relaxed clock methods for modeling rate heterogeneity, with implications for our ability to confidently date divergence events. In addition, our data provide strong support for the monophyly of Adiantum, and for the position of Calciphilopteris in the cheilanthoid ferns, two relationships for which convincing support was previously lacking.
[Adiantum; Calciphilopteris; codon models; divergence time dating; local clocks; model selection; molecular clock; mutation rate; nucleotide substitution rate; rate heterogeneity; relaxed clocks; trigenomic analyses.]

Molecular evolutionary rate heterogeneity can take many forms, ranging from variation among nucleotide substitution types (Kimura 1980) to variation among sites (Yang 1996), loci (Wolfe et al. 1989b; Small et al. 1998), genomic regions (Wolfe et al. 1989a), and genomic compartments (Wolfe et al. 1987; Baer et al. 2007). Perhaps most vexing, however, is lineage-specific rate heterogeneity, whereby some lineages have significantly different rates of molecular evolution than do their close relatives. Such violations of a molecular clock (Zuckerkandl and Pauling 1962, 1965) are ubiquitous across the tree of life and have been characterized within vertebrates (Bromham 2002; Hoegg et al. 2004; Bininda-Emonds 2007), invertebrates (Hebert et al. 2002; Schon et al. 2003; Shao et al. 2003; Thomas et al. 2006; Singh et al. 2009), fungi (Lutzoni and Pagel 1997; Moncalvo et al. 2000; Woolfit and Bromham 2003; Zoller and Lutzoni 2003; Lumbsch et al. 2008), algae (Zoller and Lutzoni 2003), bacteria (Woolfit and Bromham 2003), liverworts (Lewis et al. 1997), seed plants (Bousquet et al. 1992; Muse 2000; Davies et al. 2004; McCoy et al. 2008; Smith and Donoghue 2008; Xiang et al. 2008), and ferns (Soltis et al. 2002; Des Marais et al. 2003; Schneider et al. 2004; Schuettpelz and Pryer 2006; Korall et al. 2010; Li et al. 2011; Rothfels et al. 2012). Consequently, this phenomenon has important implications for our understanding of evolution, as well as for our ability to infer and date evolutionary events.

Much of the research into molecular evolutionary rate heterogeneity has focused on finding a correlation between the rate of molecular evolution and some natural history attribute of the organism in question, using multiple independent comparisons (reviewed in Lanfear et al. 2010). However, cases limited to a single potential rate change, or where obvious candidate correlated traits are lacking, are still tractable within a model selection framework. Using this approach one can ask if a potential rate discrepancy is significant by comparing models that permit particular groups to have individual rates (local clocks) to models that enforce a global clock. This approach provides an elegant solution to problems associated with the lack of independence among branches and differing taxon sampling density, does not require the a priori identification of a potential correlated trait of interest, and has enjoyed considerable popularity (Lutzoni and Pagel 1997, their “method 2”; Yoder and Yang 2000; Bromham and Woolfit 2004; Lanfear et al. 2007; Korall et al. 2010; Lanfear 2010; Neiman et al. 2010); however, note the caveats described by Lanfear (2010).

In this study, we use sequence data from all three genomic compartments (nuclear, mitochondrial, and plastid) to probe for signatures of rate heterogeneity in a clade of ferns. We first utilize a maximum likelihood (ML) method to look for rate heterogeneity at the nucleotide level, which requires the a priori selection of models to be compared from among a near-infinite array of possible models. To explore the impact of this requirement, we then employ a Bayesian framework, utilizing two relaxed clock models. These models explicitly incorporate rate variation across the tree without the a priori division of the tree into particular clades or classes. Finally, we conduct a series of codon-based ML analyses to distinguish between mutation-driven and fixation-driven causes of elevated substitution rate.

Our focal group consists of the cheilanthoid ferns (cheilanthoids), the genus Adiantum, and the vittarioid ferns (vittarioids) in the family Pteridaceae. Each of these subclades is fairly large (approximately 400, 200, and 100 species, respectively; Crane 1997; Schuettpelz et al. 2007; Lu et al. 2011b), and moderately old. The divergence between the cheilanthoids and the vittarioids + Adiantum (collectively referred to as the adiantoids) is estimated at approximately 90 Ma, and that between the vittarioids and Adiantum at about 70 Ma (Schuettpelz and Pryer 2009). Earlier molecular analyses inferred considerably longer branch lengths for the vittarioids than for the remainder of the Pteridaceae (Schuettpelz and Pryer 2007; Schuettpelz et al. 2007), suggesting an increase in the vittarioid rate of molecular evolution. This increase may well be coupled with other changes, as the vittarioids are very different from their relatives in terms of their ecology, population biology, and morphology. The vittarioids are obligate epiphytes in tropical habitats, have dramatically simplified leaf morphologies (many species are colloquially termed “shoestring ferns” and others do not exceed lengths of a centimeter or two or widths of more than two millimeters), and have long-lived gametophytes that are capable of asexual propagation via gemmae (Farrar 1974; Crane et al. 1995).

Here, we assess the nature and degree of molecular rate heterogeneity in our focal group and identify the pool of biological mechanisms that are tenable explanations for the suggested vittarioid rate increase. In doing so, we provide an example of a focused, trait-free analysis of molecular rates, demonstrate nuances in the behavior of recently developed models that accommodate rate variation, and show the utility and importance of associated analyses of selection. Our study aims to determine: (i) whether vittarioids have elevated rates of molecular evolution compared with those of their closest relatives; (ii) whether observed rate differences are comparable across loci and genomic compartments, and consistent with possible life-history based explanations; and (iii) whether selection drove the patterns observed.

MATERIALS AND METHODS

Taxon Sampling

We sampled 26 species, including 8 species from each focal clade within the Pteridaceae (cheilanthoids, vittarioids, and Adiantum), and 2 outgroup species (Cryptogramma crispa and Pityrogramma austroamericana; Appendix 1). The selected species span the phylogenetic diversity of each clade, with at least one representative included from each major subclade, and were selected to additionally capture the branch length variation in each clade (Crane et al. 1995; Schuettpelz et al. 2007; Windham et al. 2009; Lu et al. 2011b). Equal sampling across clades was adopted to avoid the potential for node-density effect artifacts (Fitch and Beintema 1990; Bromham 2002; Venditti et al. 2006; Hugall and Lee 2007). The likelihood of biases being introduced due to punctuated evolution associated with speciation events (Pagel et al. 2006) is also minimized, given that vittarioids, hypothesized to have the fastest rates, constitute the smallest clade.

DNA Extraction, Amplification, and Sequencing

DNA was extracted from silica-dried leaf tissue or herbarium fragments in the Lab Database (fernlab.biology.duke.edu) archive. Sequences were obtained for three plastid loci (atpA, atpB, rbcL), two mitochondrial loci (atp1, nad5), and one nuclear locus (gapCp); note that plastids and mitochondria are maternally inherited in this group of ferns (Gastony and Yatskievych 1992). The three plastid loci were amplified and sequenced using previously published protocols (Pryer et al. 2004; Schuettpelz et al. 2006).

Most mitochondrial atp1 sequences were obtained using primers F83-atp1 and R725-atp1 (Wikström and Pryer 2005; full data for all primers used are available in Supplementary Table S1), following the protocol of Wikström and Pryer (2005). When reactions failed, we amplified and sequenced atp1 with primers CRATP1F1 and CRATP1R1, using a standard reaction mix (Schuettpelz and Pryer 2007). Our thermal cycling program consisted of an initial denaturation step (94 °C for 3 min), followed by 35 denaturation, annealing, and elongation cycles (94 °C for 45 s, 55 °C for 30 s, 72 °C for 2 min), and a final elongation step (72 °C for 10 min). Taxa with a type II intron (Cryptogramma and Pityrogramma, see Results section) required additional sequencing primers F328-atp1, F411-atp1, and R348-atp1 (Wikström and Pryer 2005).

Initial amplifications of nad5 were performed as for atp1, but with primers K and L (Vangerow et al. 1999) and a modified thermal cycling program consisting of an initial denaturation step (94 °C for 3 min), followed by 40 denaturation, annealing, and elongation cycles (94 °C for 45 s, 45 °C for 30 s, 72 °C for 3 min), and a final elongation step (72 °C for 10 min). The nad5 amplifications tended to be weak and often resulted in multiple bands. These PCR products were therefore cloned, and resulting colonies amplified, following the protocol described by Schuettpelz et al. (2008). We sequenced the colony amplifications using primers K, L, KLEX, FLIN, LISEX, M13F, and M13R (V. Knoop, unpublished data; Invitrogen; Vangerow et al. 1999). Particularly recalcitrant taxa were amplified using primers CRNAD5F1 and CRNAD5R1 (using the
program described above, but with 35 cycles and an annealing temperature of 55 °C) and direct sequenced with CRNAD5F1, CRNAD5F2, CRNAD5F3, CRNAD5R1, CRNAD5R2, CRNAD5R3, and LISIN (Supplementary Table S1).

Nuclear gapCp sequences were obtained following the protocol of Schuettpelz et al. (2008). For hard-to-amplify vittarioid species, primer ESGAPCP11R1 was replaced with CJRGAPVITR1 for both amplification and sequencing. Although we attempted to target gapCp “short” sequences through visualization of the colony amplifications on agarose gels, we also obtained sequences from the gapC and gapCp “long” paralogs (Schuettpelz et al. 2008). Our total pool of sequences was thus filtered in a stepwise manner to arrive at a set of gapCp “short” sequences for subsequent analysis. From the sequences obtained for each taxon, we first removed any duplicates and all those sequences containing indels in the exons (exon length is highly conserved in this gene family and exons with indels are almost certainly indicative of pseudogenes; Peterson et al. 2003; Schuettpelz et al. 2008). We then manually screened for and removed chimaeric sequences, presumably resulting from PCR-mediated recombination (Cronn et al. 2002). Because of the considerable phylogenetic depth of our study (an ingroup sample of 24 species selected from across a clade with a crown age of approximately 90 myr; Schuettpelz and Pryer 2009), our goal was not to identify each of the segregating alleles, but rather to capture all of the major copy-types present in each accession. We therefore constructed an exon-only alignment of the sequences remaining from each sample (independently for each sample) and inferred from that alignment maximum parsimony trees through a branch-and-bound search in PAUP* v4.0b10 (Swofford 2002). From the unrooted most parsimonious trees (or, when multiple trees were found, consensus trees), we designated as distinct all clusters of sequences that differed from their nearest neighbors by more than 10 substitutions. The use of 10 substitutions was an ad hoc cut-off based on preliminary examinations of the trees; other cut-off values (from 5 to 20 substitutions) gave similar results (data not shown). From each cluster we selected the sequence that was closest to the geometric center of the cluster (the least apomorphic sequence) and discarded the others.

We then constructed a large exon-only alignment comprising our partially filtered set of new sequences, three well-characterized sequences from the fern genus Cystopteris, and a set of previously published sequences of known copy type from Martin et al. (1993), Meyer-Gauen et al. (1994), and Schuettpelz et al. (2008). We analysed this data set using ML in PAUP* (Swofford 2002), under a GTR+I+G model with parameters fixed at values obtained from jModeltest 0.1.1 (Posada 2008). This search utilized TBR branch swapping, and was repeated 100 times from independent random-addition-sequence starting trees. To assess support, we also performed 500 ML bootstrap replicates, with the same settings as above, but with each search performed from only two random-addition-sequence starting trees. This analysis allowed us to phylogenetically discriminate gapCp “short” sequences from gapCp “long” and gapC sequences (Supplementary Fig. S1). The originally targeted gapCp “short” locus was the best represented; sequences from the other loci were discarded, along with the gapCp “short” sequences from taxa outside our focal sample. Among the remaining gapCp “short” sequences were two copies from each of the four sampled members of the Adiantum raddianum clade, suggesting a duplication event had occurred on its stem branch (Supplementary Fig. S1). To yield a nuclear data set that was maximally comparable to the plastid and mitochondrial data sets, we removed one set of copies (the two copies have similar branch lengths and preliminary analyses indicated no effect of retaining one over the other). We were left with a single gapCp “short” sequence from each of 22 (of 26) sampled taxa (gapCp “short” sequences could not be obtained from four vittarioid species). For simplicity, these gapCp “short” sequences are hereafter referred to simply as gapCp sequences.

In total, 117 sequences were newly obtained for this study and deposited in GenBank (Appendix 1; Supplementary Table S2). An additional 78 previously published sequences were used to complete our data sets (Appendix 1; Supplementary Table S2).

Sequence Alignment and Data Set Construction

The plastid and mitochondrial loci were aligned individually, by eye, in Mesquite v2.72 (Maddison and Maddison 2011). Ambiguous portions of each alignment were excluded prior to subsequent analyses. For mitochondrial nad5, unambiguous indels were recoded following Simmons and Ochoterena’s (2000) simple gap recoding method, to yield an additional data set used in the topology searches, but not in subsequent analyses.

The intron sequences within our retained gapCp data were significantly divergent, making by-eye alignment unreliable. In order to infer an objective alignment for these regions, we used BAli-Phy v2.1.0 (Suchard and Redelings 2006; Redelings and Suchard 2007; e.g., Gaya et al. 2011), which coestimates alignment and phylogeny in a Bayesian framework. These analyses were run under a seven-partition scheme (one partition for each of the three intron and four exon regions included in the sequenced section; Schuettpelz et al. 2008), with topologies and branch lengths linked across partitions (a branch length scaling parameter was included to allow intron and exon sequences to differ in their global rates). Substitution parameters were estimated for the exon partitions independently from those for the intron partitions. All partitions were analysed under a GTR+gwF substitution model (Tavaré 1986; Goldman and Whelan 2002), with among-site rate heterogeneity modeled by a five-state Dirichlet process for the exons and a three-state process for the introns. The RS07 indel model (Redelings and Suchard 2007) was applied to the three intron partitions, which shared indel model parameters (there were no gaps within the exon partitions, so no indel model was applied to them; Gaya et al. 2011). Priors were set at their default values and seven independent chains were run, each for 100,000 generations, and sampled every 10 generations. Inspection of parameter traces (Suchard and Redelings 2006; Rambaut and Drummond 2007) indicated that each chain converged (to the same area of parameter space) by 8000 generations. To be conservative, we excluded the first 10,000 generations of each run, and pooled the remaining generations across runs before computing the posterior. For all parameters, the pooled effective sample sizes were >2500. The maximum posterior decoding alignment (the alignment that has the maximum sum of column posteriors, weighted by the number of nucleotides in that column; Redelings, personal communication) from each partition was used to produce an alignment for the complete included region of gapCp. Sites for which only one taxon had a character state other than a gap were excluded prior to subsequent analyses; all other sites were retained. The alignments, and associated trees, are available in TreeBASE (accession S14194).

Tree Reconstruction

We analysed each of the six single-gene data sets (atp1, atpA, atpB, gapCp, nad5, and rbcL), as well as the nad5 recoded indels data set, using ML and Bayesian approaches. ML analyses of the single-gene data sets were conducted with RAxML v7.2.6 (Stamatakis 2006), using the GTR+G model of sequence evolution and the option to conduct a rapid bootstrap analysis (1000 replicates) and a search for the best-scoring tree in a single program run (Stamatakis et al. 2008). The indel data set was analysed in a similar fashion but utilizing BINGAMMA, a likelihood model for binary data (Lewis 2001) that accommodates rate heterogeneity. Bayesian analyses were executed in MrBayes v3.1.1 (Ronquist and Huelsenbeck 2003), using the GTR+G model for genes and the STANDARD+G model for recoded indels. We ran 4 independent runs, each with 4 chains, for 10 million generations. We sampled trees every 1000 generations and assessed convergence by examining the standard deviation of split frequencies within the output and by plotting parameter values in Tracer v1.5 (Rambaut and Drummond 2007). We very conservatively excluded the first 2.5 million generations from each run as the burn-in and computed a majority rule consensus from the pooled trees using the “sumt” command. We analysed the combined data sets (plastid data, mitochondrial data, all genes, all data) as above, allowing for partition-specific parameter estimates.

Likelihood Analysis of Nucleotide Models

To investigate lineage-specific rate heterogeneity in our data, within and among loci and across genomic compartments, we adopted a model comparison approach using the program baseml of the PAML v4.4e package (Yang 2007). We selected eight models of particular biological interest for comparison (Table 1), each of which incorporates a GTR+G substitution model with independent parameter estimates for each included partition. Our models vary in whether branch rates are considered to be proportional (vs. independent) among partitions, the presence of a molecular clock, and (if present) how such a molecular clock is enforced. All of our analyses utilized a fixed topology, obtained via phylogenetic analysis of our combined data set (Fig. 1c). Because we were interested in relative rather than absolute rates, when a timescale was necessary, we fixed the stem age of the ingroup at 10 arbitrary time units (rate estimates are thus in units of substitutions per site per arbitrary time unit). We repeated each analysis 10 times, independently, to avoid results based on suboptimal peaks in the likelihood surface. Here, we report results from the runs with the highest likelihoods (results from all runs are available at the Dryad repository, doi:10.5061/dryad.c5m42), summarized from the PAML outputs using the Python script PAMLparser (Supplementary Appendix S1).

Three of our chosen models do not incorporate a molecular clock. The first, bas1, is the only model investigated that links substitution parameters across partitions; in addition, it requires that branch lengths be proportional across partitions (Table 1). Model bas2 is the most parameter-rich model and is equivalent to analysing each partition independently. Only the fixed topology is shared among partitions and branch rates are not required to be proportional. The third clockless model (bas4) adds a proportionality requirement. Here, there can be faster and slower partitions, but individual branch lengths must remain proportional.

We also examined two models in which a global molecular clock is enforced. The more complex of the two (bas3, Table 1) is, as in bas2, equivalent to analysing each partition independently, but here each partition has a global clock. The simpler model (bas5, Table 1) additionally requires a shared set of branch times while allowing the global clock rate to vary among partitions.

Finally, we selected three models that incorporate local molecular clocks. The most complex of these models (bas8, Table 1) treats each partition as fully independent, including branch times and rates. The simpler models both require a shared set of branch times, but differ in how local clock rates vary from partition to partition. One (bas7) allows each partition to have its own unique set of clock rates. The other (bas6) requires the local clock rates to be proportional among partitions (if a given local clock is fast in one partition, it must be fast in all). In our local clock analyses (models bas6, bas7, and bas8) we chose to compare four distinct regimes (Fig. 2). Each regime assigns a unique rate parameter (clock) to the outgroup branches and deep ingroup branches following the “nuisance parameter” arguments of Lanfear (2010). The first three regimes allocate a second
rate to the cheilanthoids, Adiantum, and the vittarioids, respectively, leaving the third rate to be shared by the remaining two clades. The fourth regime gives each of the major clades its own rate.

For each of the eight general models, the data were analysed according to an unpartitioned scheme, a three-partition scheme (one partition for each of the three genomic compartments) and a six-partition scheme (one partition for each locus). Model fit was evaluated using the small-sample correction for the Akaike Information Criterion (AICc: Akaike 1974; Hurvich and Tsai 1989), which converges to the AIC as sample size increases and has a reduced propensity for selecting unduly parameterized models when sample size is small (Burnham and Anderson 2004). Smaller AICc scores indicate better fit and, as a general rule of thumb, any model with an AICc score four or more points above the best-fitting model has considerably less support (Burnham and Anderson 2002). For all models, sample size for the AICc calculation was considered to be the number of nucleotide characters (sites) in the alignment.

Likelihood Analysis of Codon Models

To evaluate whether any lineage-specific rate heterogeneity might be explained by the effects of selection (at the protein level) on our focal loci, we also performed a series of analyses using the program codeml of the PAML v4.4e package (Yang 2007). We selected four codon models (Goldman and Yang 1994) of particular biological interest for comparison (Table 2), each with a parameter (κ) representing the transition:transversion ratio, a second parameter (ω) representing the nonsynonymous:synonymous substitution ratio (i.e., dn:ds), and codon frequencies calculated from the nucleotide frequencies at the three codon positions. This basic model forms the foundation for our models cod1 through cod4 (Table 2), all of which utilize the fixed (unrooted) topology obtained via phylogenetic analysis of our combined data set (Fig. 1c).

The first model (cod1, Table 2) is a basic branch model (Yang 1998; Yang and Nielsen 1998), in which the phylogeny is divided, a priori, into groups of branches that are each allocated their own ω parameter. We evaluated four versions of this model, each with a different clade regime (Fig. 2). Our second model (cod2) is equivalent to model M2a of Wong et al. (2004) and Yang et al. (2005) and is a basic site model (Nielsen and Yang 1998; Yang et al. 2000), in which the likelihood of the data is computed given a certain number of site classes (analogous to the site classes of the gamma distribution of rates in nucleotide models incorporating this parameter), each of which has its own ω parameter. Model cod2 has three site classes (with ω0 < 1; ω1 = 1; and ω2 > 1, respectively), and four ω-related parameters (a proportion of sites fitting ω0, a proportion of sites fitting ω1, and the ω0 and ω2 parameters themselves;

TABLE 1. Nucleotide models used

                                                                          baseml settings
Model  Clock(s)  Substitution   Branch        Branch        Branch        mgene  clock
                 parameters(a)  lengths(a)    rates(a)      times(a)
bas1   None      Shared         Proportional  NA            NA            0      0
bas2   None      Independent    Independent   NA            NA            1      0
bas4   None      Independent    Proportional  NA            NA            4      0
bas3   Global    Independent    NA            NA            Independent   1      1
bas8   Local     Independent    NA            Independent   Independent   1      2
bas5   Global    Independent    NA            NA            Shared        4      1
bas6   Local     Independent    NA            Proportional  Shared        4      2
bas7   Local     Independent    NA            Independent   Shared        4      3

[Parameter-count columns (substitution parameters: exchange, base frequency, gamma shape; time parameters: branch lengths, rate scalars; total) are not recoverable from this copy.]

Descriptions and parameter counts for the general nucleotide substitution models used in this study. NA, not applicable. For the parameter counts: n, number of tips (taxa); p, number of partitions in the analysis; and c, number of predefined local clocks.
(a) Among partitions.
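The AICc comparison used to rank the nucleotide models can be sketched in Python as follows. This is a minimal illustration, not the authors' script: the function names `aicc` and `delta_aicc` are hypothetical, and the log-likelihoods, parameter counts, and site count in the example are invented for demonstration. The sample size is taken to be the number of alignment sites, as stated in the text.

```python
def aicc(log_likelihood, n_params, n_sites):
    """Small-sample corrected AIC: -2 lnL + 2k + 2k(k + 1) / (n - k - 1)."""
    k, n = n_params, n_sites
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n_sites > n_params + 1")
    return -2.0 * log_likelihood + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)


def delta_aicc(scores):
    """Return each model's AICc difference from the best (lowest) score."""
    best = min(scores.values())
    return {name: score - best for name, score in scores.items()}


# Hypothetical values for illustration only (not results from the study).
example_scores = {
    "global_clock": aicc(log_likelihood=-50250.0, n_params=60, n_sites=10000),
    "local_clocks": aicc(log_likelihood=-50200.0, n_params=63, n_sites=10000),
}
example_deltas = delta_aicc(example_scores)
```

Under the rule of thumb quoted in the text, a model whose entry in `example_deltas` is four or more would be regarded as having considerably less support than the best-fitting model.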


FIGURE 1. ML phylograms. a) From the individual loci: (i) atpA; (ii) atpB; (iii) rbcL; (iv) gapCp; (v) atp1; and (vi) nad5. b) From the data combined by compartment: (i) plastid; and (ii) mitochondrion. c) All data. Branch lengths are in substitutions/site. Bold branches are strongly supported (≥70% ML bootstrap support and ≥0.95 posterior probability). Bootstrap support values are shown above the branch and posterior probabilities below. Asterisks (*) indicate 100% bootstrap support or 1.0 posterior probability.

since the proportions have to sum to 1, the proportion of sites fitting ω2 is not a free parameter, nor is the ω1 parameter, which is fixed at 1). The third model (cod3; model M3 in codeml) is very similar, but allows ω1 to vary between zero and infinity, rather than being fixed at 1, and thus has an additional free parameter.

The final model we considered (cod4) is model D of Bielawski and Yang (2004). This clade model is a


[Figure 2 panels: (i) regimes 1–4; (ii) regimes nu1–nu4. Tip groups in each panel: outgroup, C, V, and A. Shading key: local clocks 0–3.]

FIGURE 2. Clock/clade regimes. The four “clock” (for nucleotide analyses) or “clade” (for the codon analyses) regimes used for the plastid and mitochondrial data (i) and the nuclear data (ii). For each regime, branches of a given shade share rate or ω parameters (depending on the analysis), whereas branches with different shades get their own individual parameters. Regimes 1–3 each have three rate or ω parameters; regime 4 has four. Clade name abbreviations follow Figure 1c.
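The four regimes can be written down as a mapping from branch group to parameter class. The Python sketch below is illustrative only: the dictionary layout, the names `CLADES`, `build_regime`, and `n_classes`, and the integer class labels are assumptions rather than anything taken from the PAML control files. Class 0 stands for the parameter shared by the outgroup and deep ingroup branches, and the clade given its own class in regimes 1 to 3 follows the order stated in the text (cheilanthoids, Adiantum, vittarioids).

```python
CLADES = ("cheilanthoids", "Adiantum", "vittarioids")


def build_regime(regime):
    """Map each branch group to a rate (or omega) class for regimes 1-4.

    Class 0 is reserved for the outgroup and deep ingroup branches.
    Regimes 1-3 give one focal clade its own class and pool the other two;
    regime 4 gives every clade its own class.
    """
    if regime not in (1, 2, 3, 4):
        raise ValueError("regime must be 1, 2, 3, or 4")
    classes = {"outgroup_and_deep": 0}
    if regime == 4:
        for index, clade in enumerate(CLADES, start=1):
            classes[clade] = index
    else:
        focal = CLADES[regime - 1]
        for clade in CLADES:
            classes[clade] = 1 if clade == focal else 2
    return classes


def n_classes(regime):
    """Number of distinct rate or omega parameters in a regime."""
    return len(set(build_regime(regime).values()))
```

Counting the distinct classes reproduces the caption's statement that regimes 1 to 3 carry three parameters each and regime 4 carries four.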

TABLE 2. Codon models used

                                  codeml settings   Parameter counts
Model  ω variability              model  nssites    Exchange  Codon      Branch    Total
                                                              frequency  lengths
cod1   Among branches             2      NA         p(c+1)    9p         p(2n−3)   p(2n+c+7)
cod2   Among sites(a)             0      2          5p        9p         p(2n−3)   p(2n+11)
cod3   Among sites(b)             0      3          6p        9p         p(2n−3)   p(2n+12)
cod4   Among branches and sites   3      3          p(c+5)    9p         p(2n−3)   p(2n+c+11)

Descriptions and parameter counts for the general codon models used in this study. NA = not applicable. For the parameter counts: n, number of tips (taxa); p, number of partitions in the analysis; c, number of predefined “clades” for the branch-site analyses.
(a) With two free ω parameters (ω0 < 1; ω1 = 1; ω2 > 1).
(b) With three free ω parameters (ω0 < 1; ω1 ≥ 0; ω2 > 1).
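Each total in Table 2 is the sum of the three component counts for that model. A short Python check is given below as a sketch; the function name `codon_param_count` and its argument names are hypothetical, with n, p, and c used as defined in the table legend.

```python
def codon_param_count(model, n, p, c=0):
    """Total free parameters for cod1-cod4, following Table 2.

    n: number of tips; p: number of partitions; c: number of predefined clades.
    """
    exchange = {
        "cod1": p * (c + 1),
        "cod2": 5 * p,
        "cod3": 6 * p,
        "cod4": p * (c + 5),
    }
    if model not in exchange:
        raise ValueError(f"unknown model: {model}")
    codon_frequency = 9 * p
    branch_lengths = p * (2 * n - 3)
    return exchange[model] + codon_frequency + branch_lengths
```

For instance, with illustrative values n = 26, p = 1, and c = 3, `codon_param_count("cod1", 26, 1, 3)` gives 4 + 9 + 49 = 62, which matches the closed form p(2n+c+7).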

variation of the general branch-site family of models (Yang and Nielsen 2002) that allow heterogeneous dn:ds ratios across codons and also across branches of the phylogeny. With three site classes, cod4 incorporates two parameters corresponding to the proportion of sites optimized under the first two site classes (as in cod2, the third proportion is not a free parameter). The ω for the first site class is allowed to take any value between 0 and 1; the ω for the second class can take any value between zero and infinity. Both of these site classes are applied across the full phylogeny. However, the ω parameter for the third site class is independently optimized for each of the predesignated clades and can also take any value between zero and infinity. We ran four versions of this model, each under a different clade regime, as we did for cod1 (Fig. 2).

Because it is not possible to directly perform partitioned analyses in codeml, it was necessary to run each partition individually and sum log-likelihoods and parameter counts. To avoid basing conclusions on results from suboptimal likelihood peaks, we repeated each analysis 10 times independently and we report the run with the highest likelihood (results from all runs are available at the Dryad repository, doi:10.5061/dryad.c5m42). Data were again summarized from the PAML outputs using PAMLparser (Supplementary Appendix S1). Each model was run on the unpartitioned data, under a three-partition scheme (one partition for each of the three


genomic compartments) and a six-partition scheme (one partition for each locus). Model fit was evaluated as for the nucleotide models above.

Bayesian Analysis

To further dissect patterns of rate heterogeneity, while avoiding the need to assign local clocks or even a topology a priori, we analysed our data in a Bayesian framework using BEAST v1.6.1 (Drummond and Rambaut 2007; Drummond and Suchard 2010). BEAST offers two relaxed clock models that are of particular interest here: one with uncorrelated lognormally distributed branch rates and another with randomly assigned local clocks (Drummond et al. 2006; Drummond and Suchard 2010). The six loci (the nad5 indel data set was not included) were analysed individually and in combination using each of these two models. For the partitioned analyses, substitution parameters were unlinked across partitions, while clock parameters and topologies were linked. Monophyly was enforced for four taxon sets: (i) all taxa except for Cryptogramma (in effect, rooting the tree); (ii) all taxa except for Cryptogramma and Pityrogramma; (iii) the cheilanthoids; and (iv) Adiantum (it was never necessary to enforce the monophyly of the vittarioids). In all cases, we employed a GTR+G substitution model and a birth–death tree prior, with the average clock rate fixed at 1.0. Priors were left at their default values with the exception of birthDeath.meanGrowthRate, which was given a uniform prior between 1 and 100, and birthDeath.relativeDeathRate, which was given a uniform prior between 0 and 2. Convergence and effective sample sizes were assessed in Tracer v1.5 (Rambaut and Drummond 2007).

For each data set, the lognormal uncorrelated relaxed clock (LURC) analyses were run four times independently, for 50 million generations each, with the chain sampled every 2000 generations. These runs converged relatively rapidly. The first 10 million generations were discarded (very conservatively) as burn-in prior to summarizing the posterior. The random local clock (RLC) analyses generally took longer to converge and were run 7 times independently for 100 million generations, with chains sampled every 25,000 generations. Nevertheless, some individual runs failed to converge. Of the 7 runs per data set, at least 5 converged in all cases, and the burn-in period for the converged runs ranged from 20 million to 80 million generations. The total post-burn-in sample sizes ranged from 10,400 samples (for the concatenated mitochondrial data) to 19,000 samples (for the concatenated full data).

RESULTS

Character Data

The single-gene alignments ranged from 1342 bp (atpB) to 2544 bp (nad5), with the 6-gene data set comprising 10,487 bp (Supplementary Table S3). An additional 82 characters were obtained through the scoring of the nad5 indels, resulting in a combined data set of 10,569 bp. In atp1 there is a group II intron (studied by Wikström and Pryer (2005)) that is present in Cryptogramma and Pityrogramma but absent in all other sampled species (apparently due to a single loss on the branch leading to the adiantoids and cheilanthoids); this intron was excluded prior to analysis. The nad5 group II intron identified by Vangerow et al. (1999), however, is present in all members of our sample, and was included in our analyses.

Phylogeny

Among the phylogenies inferred from the six individual loci (the indel characters were included with the nad5 sequence characters), there are only two well-supported conflicts, both involving atp1 (Fig. 1a(v)). This gene places two species of Adiantum (A. peruvianum and A. tetraphyllum) as sister to the remainder of the ingroup, with strong support, rather than with the other species of Adiantum. This gene also places Rheopteris (rather than Anetium) as sister to Vittaria. These incongruences are not the result of a misidentification (the same extractions were used for all loci), or of a lab error (the sequences involved are clearly adiantoid and all sequenced members of that clade are uniquely represented in the tree). We therefore considered the incongruence to be due to stochastic variation in the atp1 signal (e.g., see Weisrock 2012; Rothfels et al. 2013b), and concatenated the loci to infer the compartment-specific (Fig. 1b) and global (Fig. 1c) ML phylograms (see Materials and Methods section). All nodes in the phylogeny resulting from ML analysis of the combined data set are well supported by both ML bootstrapping (≥ 70% bootstrap support) and by Bayesian inference (≥ 0.95 posterior probability; Fig. 1c). This includes strong support for each of the three major subclades of interest (cheilanthoids, vittarioids, and Adiantum; labeled C, V, and A, respectively; Fig. 1c), with Adiantum and the vittarioids sister to each other, and that clade sister to the cheilanthoids.

Likelihood Analysis of Nucleotide Models

Optimizations of branch lengths for the individual-locus data sets (Fig. 3a–c) and the individual-compartment data sets (Fig. 3c–e) on the combined topology reveal a clear trend toward longer branches for the vittarioid taxa. However, this trend is not absolute (Fig. 3a,e) and there is substantial branch length variation within each of the major clades.

Our model comparison analyses are more illuminating than a simple inspection of branch lengths. The best performing model, by far, is the most heavily parameterized (bas2), partitioned by locus (Table 3; Fig. 4), despite a strong penalty by the AICc. After bas2, the best fitting of the non-local clock


[Figure 3: phylograms in panels a–e for atpA, atpB, rbcL, atp1, nad5, and gapCp “short”; branches keyed as “outgroup”, vittarioids, cheilanthoids, or Adiantum.]

FIGURE 3. Branch lengths optimized on consensus topology. Branch lengths of the mitochondrial loci (a), the plastid loci (b), the nuclear locus (c), and the combined plastid (d) and combined mitochondrial (e) data, optimized on the topology from the combined data (Fig. 1c). Branch length bars each indicate 0.03 substitutions/site/arbitrary time unit.

TABLE 3. Fit of the full data on the nucleotide models

Model  Partitioning    Clock regime^a  Parameter count  lnL          AICc
bas1   None            NA              58               −41,813.15   83,743.19
bas3   None            NA              34               −42,229.79   84,527.90
bas1   By locus        NA              68               −41,199.80   82,536.84
bas2   By locus        NA              348              −40,515.34   81,766.80
bas3   By locus        NA              204              −41,156.48   82,734.52
bas4   By locus        NA              108              −40,995.03   82,209.16
bas5   By locus        NA              83               −41,436.10   83,040.04
bas1   By compartment  NA              62               −41,208.81   82,542.64
bas2   By compartment  NA              174              −40,722.08   81,804.69
bas3   By compartment  NA              102              −41,286.15   82,780.57
bas4   By compartment  NA              78               −41,058.03   82,273.67
bas5   By compartment  NA              53               −41,501.64   83,110.04
bas6   By compartment  1               56               −41,361.23   82,835.29
bas6   By compartment  2               56               −41,446.50   83,005.84
bas6   By compartment  3               56               −41,275.96   82,664.75
bas6   By compartment  4               57               −41,238.62   82,592.10
bas7   By compartment  1               60               −41,350.65   82,822.26
bas7   By compartment  2               60               −41,331.80   82,784.57
bas7   By compartment  3               60               −41,167.16   82,455.29
bas7   By compartment  4               63               −41,095.55   82,318.16
bas8   By compartment  1               108              −41,121.89   82,462.89
bas8   By compartment  2               108              −41,188.82   82,596.74
bas8   By compartment  3               108              −41,037.67   82,294.46
bas8   By compartment  4               111              −40,984.41   82,195.87

NA, not applicable. ^a Clock regimes correspond to those in Figure 2.

models is bas4, which is followed by bas1, bas3, and finally bas5 (Fig. 4). The differences in fit among these models are dramatic. On average, each model had an improvement of more than 300 points in AICc score over the next best performing model, nearly two orders of magnitude greater than the rule of thumb difference for a “considerable” AICc improvement (Burnham and Anderson 2002). The three best performing models are all clockless (tips are not constrained to be contemporaneous), differing from each other only in the degree to which parameters are shared across partitions. The best performing model (bas2) has no shared parameters, the next best (bas4) has no shared parameters except that branch lengths are constrained to be proportional across partitions, and the worst fitting of the clockless models (bas1) has proportional branch lengths and shared substitution parameters (Table 3; Fig. 4).

Under a given model, the unpartitioned runs had very poor fit, and partitioning by locus (six partitions) always resulted in better fit than did partitioning by compartment (three partitions; Fig. 4(i)). With regard to partitioning, the smallest AICc difference (5.80) was for bas1, which includes relatively few additional parameters for each new partition (Tables 1, 3) but is nonetheless above the “considerable” improvement cut-off of four AICc points. Overall, the effect of altering the partitioning scheme was much smaller than that of changing the model. For example, the six-partition version of a model never outperforms the three-partition version of the next best model (such models differ by an average of over 250 AICc points; Fig. 4(ii)).

Both of the local clock models (bas6 and bas7), partitioned by compartment, fit the combined data better than a comparable global clock model (bas5) and worse than a comparable clockless model (bas4; Table 3; Fig. 5). The more highly parameterized bas7 consistently outperformed bas6, which forces branch lengths to be proportional across partitions (Fig. 5). The effects of the different local clock regimes on the two models are very similar (Fig. 5), with only one notable deviation: clock regime 2 (cheilanthoids and vittarioids


[Figure 4 panels: (i) AICc of bas1–bas5 for a) unpartitioned data, b) data partitioned by compartment, and c) data partitioned by locus; (ii) AICc of bas1–bas5 partitioned by compartment and by locus. Y-axes: AICc (smaller for better models).]

FIGURE 4. Model fit of the full data on the general nucleotide models. Units are in AICc points; smaller values indicate better fit. i) Fit of unpartitioned data (a) versus the same data partitioned by compartment (three partitions; b), and by locus (six partitions; c). ii) Fit of the partitioned analyses. The small tree icons in part (ii) indicate whether the model is clockfree (unrooted), or has some form of molecular clock enforced.
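The model ranking and the size of the partitioning effect shown in Figure 4 can be recomputed directly from the AICc column of Table 3. The following is a minimal sketch rather than the authors' own script; the function names are illustrative, and only the bas1–bas5 scores from Table 3 are used.

```python
# AICc scores copied from Table 3 (general nucleotide models bas1 to bas5).
AICC = {
    "locus": {"bas1": 82536.84, "bas2": 81766.80, "bas3": 82734.52,
              "bas4": 82209.16, "bas5": 83040.04},
    "compartment": {"bas1": 82542.64, "bas2": 81804.69, "bas3": 82780.57,
                    "bas4": 82273.67, "bas5": 83110.04},
}


def rank_models(scores):
    """Return model names ordered from best (lowest AICc) to worst."""
    return sorted(scores, key=scores.get)


def mean_step(scores):
    """Mean AICc gap between consecutively ranked models."""
    ordered = sorted(scores.values())
    gaps = [worse - better for better, worse in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps)


ranking = rank_models(AICC["locus"])

# AICc points gained by moving from three partitions to six, per model.
partition_gain = {m: AICC["compartment"][m] - AICC["locus"][m]
                  for m in AICC["locus"]}

print(ranking)
print(round(mean_step(AICC["locus"]), 2))
print({m: round(g, 2) for m, g in partition_gain.items()})
```

With the Table 3 values, the ranking is bas2, bas4, bas1, bas3, bas5 under both partitioning schemes, the mean gap between consecutively ranked models exceeds 300 AICc points, and the smallest gain from finer partitioning is the 5.80 points reported for bas1.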

sharing a local clock) was a worse fit than clock regime 1 (Adiantum and vittarioids sharing a local clock) under bas6, but a better fit under bas7. That said, local clock regimes 1 and 2 are both very poorly performing for the combined data and there is a strong increase in model fit (decrease in AICc) moving to regime 3, the first to allow the vittarioids their own clock (Fig. 2). A smaller (but still considerable) improvement of fit is seen under clock regime 4, which gives each of the major clades its own clock.

Model fit for the compartment-level data, individually optimized, reveals that much of the difference in fit under the local clock regimes is driven by the strong improvement in fit of the plastid data under clock regimes 3 and 4 (Table 4; Fig. 6). The nuclear data show a similar pattern but the mitochondrial data are very different, fitting worst under clock regime 3 (which gives the vittarioids their own clock, lumping Adiantum and the cheilanthoids together), and best under regime 1 (giving cheilanthoids their own clock, Adiantum and the vittarioids together; Table 4; Fig. 6).

The local clock rate estimates under clock regime 4 (each major clade given its own clock) help explain the discrepancy in fit among the compartments. In the plastid and nuclear genomes, the cheilanthoids have the slowest rate, followed by Adiantum, and finally the vittarioids, with a particularly high rate (Table 4; Fig. 7). The vittarioids have a plastid rate that is 2.73 and 5.82 times greater than Adiantum and the cheilanthoids, respectively. For the nuclear data, the numbers are similar: the vittarioids are 2.46 times faster than Adiantum and 3.08 times faster than the cheilanthoids. The mitochondrial data, however, show nearly identical rates for Adiantum and the vittarioids, with both being faster than the cheilanthoids (3.16 and 3.07 times faster, respectively; Table 4; Fig. 7). The rate increase seen for the vittarioids in the plastid and nuclear data thus also includes Adiantum in the mitochondrial data, explaining the better fit of the mitochondrial data to the local clock regime that allows Adiantum and the vittarioids to share a clock (regime 1). These rate estimates are derived from data that, within a


[Figures 5 and 6 plots. Figure 5: AICc (smaller for better models) of bas4, bas5, bas6, and bas7 across clock regimes 1–4. Figure 6: improvement in fit (AICc) vs. global clock for the mitochondrial, plastid, and nuclear data across clock regimes 1–4.]

FIGURE 5. Comparison of the fit of the local clock models (bas6 and bas7) to the global clock (bas5) and clockfree (bas4) models. Units are in AICc points; smaller values indicate better fit. The data under all models are partitioned by compartment (three partitions). The clock regimes follow Figure 2.

FIGURE 6. Fit of the compartment data to the local clock models (under bas8). For each combination of compartment and clock regime, the fit scores are standardized against the fit of the same data under the global clock model. Units are in AICc points, with larger values indicating bigger improvements in fit for the local clock model. The clock regimes follow Figure 2.
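The comparison of local clock models against the global clock and clockless models can be reproduced from the compartment-partitioned rows of Table 3. This is an illustrative sketch only; the helper names are hypothetical, and the numbers are the AICc scores for bas4 to bas7 as tabulated.

```python
# AICc scores copied from Table 3 (all partitioned by compartment).
GLOBAL_CLOCK = 83110.04   # bas5
CLOCKLESS = 82273.67      # bas4
LOCAL_CLOCK = {
    "bas6": {1: 82835.29, 2: 83005.84, 3: 82664.75, 4: 82592.10},
    "bas7": {1: 82822.26, 2: 82784.57, 3: 82455.29, 4: 82318.16},
}


def gain_over_global(aicc):
    """AICc points gained relative to the global clock (positive = better fit)."""
    return GLOBAL_CLOCK - aicc


def regimes_best_to_worst(model):
    """Clock regimes for one local clock model, ordered by increasing AICc."""
    scores = LOCAL_CLOCK[model]
    return sorted(scores, key=scores.get)


for model, scores in LOCAL_CLOCK.items():
    gains = {regime: round(gain_over_global(a), 2) for regime, a in scores.items()}
    print(model, regimes_best_to_worst(model), gains)
```

Every local clock score falls between the clockless and global clock scores, and the two models order the regimes identically except for regimes 1 and 2, which swap places between bas6 and bas7.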

given compartment, were not partitioned. To ensure that heterogeneity among codon positions within a compartment is not biasing our estimates (Brandley et al. 2005; Shapiro et al. 2005; Lanfear et al. 2012; Rothfels et al. 2013a), we additionally ran these analyses with the data partitioned by codon position (one partition for each codon position, and one for noncoding characters, if present). The two model types (unpartitioned vs. partitioned by codon position) gave very similar rate estimates for all compartments (data not shown).

Likelihood Analysis of Codon Models

As with the nucleotide models, the introduction of partitions in the codon models had a strong effect on AICc scores (Table 5; Fig. 8(i)). However, three-partition schemes (by compartment) generally fit better for the codon models than do more complex six-partition schemes (by locus; Table 5; Fig. 8(i),(ii)). That said, the effect of the model (e.g., cod1 vs. cod2) was still much stronger than that of the partitioning scheme (three vs. six partitions).

The branch model (cod1), the only model that did not allow for differences in selection pressure among codons, performed very poorly (at least 1500 AICc points worse than the other models; Table 5; Fig. 8(ii)), even though it accommodates lineage-specific differences. The three remaining models each incorporate three site classes to accommodate selection differences among codons. The worst performing of these site models was the simplest (cod2), which was

TABLE 4. Fit of the compartment data and rate estimates under local clocks

                                                                          Local clock rates^a
Partition      Clock regime^b  Parameter count  lnL           AICc        clock 0   clock 1   clock 2   clock 3
Plastid        1               36               −24,722.411   49,517.480  0.006381  0.029328  0.007585  NA
Plastid        2               36               −24,762.292   49,597.241  0.006618  0.018745  0.010018  NA
Plastid        3               36               −24,629.441   49,331.538  0.006919  0.008387  0.025738  NA
Plastid        4               37               −24,612.159   49,299.012  0.006363  0.007547  0.016098  0.043965
Mitochondrion  1               36               −9271.300     18,615.601  0.003151  0.007748  0.002504  NA
Mitochondrion  2               36               −9295.614     18,664.228  0.003283  0.003935  0.006357  NA
Mitochondrion  3               36               −9305.914     18,684.829  0.003269  0.005141  0.006218  NA
Mitochondrion  4               37               −9271.283     18,617.623  0.003120  0.002544  0.008053  0.007822
Nuclear        1               36               −7128.177     14,331.477  0.027437  0.032859  0.017725  NA
Nuclear        2               36               −7130.909     14,336.942  0.028789  0.030989  0.019710  NA
Nuclear        3               36               −7102.319     14,279.761  0.027964  0.020127  0.051948  NA
Nuclear        4               37               −7100.970     14,279.240  0.027612  0.017868  0.022348  0.055020

All values obtained using the bas8 nucleotide model. NA, not applicable. ^a Rates are in number of substitutions per site per arbitrary time unit. ^b Clock regimes correspond to those in Figure 2.
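Two quantities quoted in the text follow directly from the regime 4 rows of Table 4: the combined fit of the three compartments (per-partition log-likelihoods and parameter counts are summed) and the between-clade rate ratios. The sketch below illustrates both under stated assumptions; the function names are hypothetical, and the clock-to-clade mapping is taken from the axis labels of Figure 7.

```python
# Regime 4 rows of Table 4: (parameter count, lnL, rates for clocks 0 to 3).
REGIME4 = {
    "plastid":       (37, -24612.159, (0.006363, 0.007547, 0.016098, 0.043965)),
    "mitochondrion": (37,  -9271.283, (0.003120, 0.002544, 0.008053, 0.007822)),
    "nuclear":       (37,  -7100.970, (0.027612, 0.017868, 0.022348, 0.055020)),
}

# Clock-to-clade mapping for regime 4, as labeled in Figure 7 (assumed here).
CLADES = ("outgroup", "cheilanthoids", "Adiantum", "vittarioids")


def combine(rows):
    """Sum parameter counts and log-likelihoods over independent partitions."""
    k = sum(row[0] for row in rows.values())
    lnl = sum(row[1] for row in rows.values())
    return k, lnl


def rate_ratio(partition, fast, slow):
    """Ratio of two clade rates within one partition."""
    rates = dict(zip(CLADES, REGIME4[partition][2]))
    return rates[fast] / rates[slow]


k_total, lnl_total = combine(REGIME4)
print(k_total, round(lnl_total, 2))

for partition in REGIME4:
    print(partition,
          round(rate_ratio(partition, "vittarioids", "Adiantum"), 3),
          round(rate_ratio(partition, "vittarioids", "cheilanthoids"), 3),
          round(rate_ratio(partition, "Adiantum", "cheilanthoids"), 3))
```

The summed parameter count (111) and log-likelihood match the bas8, clock regime 4 row of Table 3, and the ratios agree with those quoted in the text to within the rounding of the tabulated rates.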


[Figure 7 plot: substitution rate (per site per time unit), from 0 to 0.06, for the plastid, mitochondrial, and nuclear data in clade 0 (outgroup), clade 1 (cheilanth.), clade 2 (Adiantum), and clade 3 (vittarioids).]

FIGURE 7. Lineage-specific rates of evolution for each genomic compartment. Inferred substitution rates of the compartment data for each of the clades of clock regime 4 (Fig. 2), calculated under the bas8 model. Units are in substitutions per site per arbitrary time unit.

TABLE 5. Fit of the full data on the codon models

Model  Partitioning    Clade regime^a  Parameter count  lnL          AICc
cod1   By compartment  1               178              −31,125.31   62,617.53
cod1   By compartment  2               178              −31,124.32   62,615.55
cod1   By compartment  3               178              −31,123.62   62,614.16
cod1   By compartment  4               181              −31,121.95   62,617.20
cod1   By locus        1               364              −30,917.42   62,609.87
cod1   By locus        2               364              −30,917.68   62,610.39
cod1   By locus        3               364              −30,914.27   62,603.57
cod1   By locus        4               370              −30,911.70   62,612.04
cod1   None            1               62               −32,204.70   64,534.72
cod1   None            2               62               −32,198.30   64,521.90
cod1   None            3               62               −32,194.52   64,514.35
cod1   None            4               63               −32,193.72   64,514.80
cod2   By compartment  NA              181              −30,334.41   61,042.12
cod2   By locus        NA              370              −30,133.79   61,056.23
cod2   None            NA              63               −31,363.83   62,855.01
cod3   By compartment  NA              184              −30,264.73   60,909.14
cod3   By locus        NA              375              −30,077.03   60,954.07
cod3   None            NA              64               −31,331.96   62,793.31
cod4   By compartment  1               190              −30,256.20   60,904.87
cod4   By compartment  2               190              −30,255.01   60,902.49
cod4   By compartment  3               190              −30,254.78   60,902.02
cod4   By compartment  4               193              −30,255.59   60,910.05
cod4   By locus        1               388              −30,042.64   60,914.93
cod4   By locus        2               388              −30,056.63   60,942.92
cod4   By locus        3               388              −30,051.57   60,932.79
cod4   By locus        4               394              −30,051.41   60,946.21
cod4   None            1               66               −31,329.96   62,793.40
cod4   None            2               66               −31,330.26   62,794.02
cod4   None            3               66               −31,325.76   62,785.02
cod4   None            4               67               −31,324.74   62,785.01

NA, not applicable. ^a Clade regimes correspond to those in Figure 2.

over 100 AICc points poorer than either of the other two models. Those remaining models (cod3 and cod4) had very similar AICc scores on the full data, regardless of partitioning scheme (Table 5; Fig. 8).

Using the cod4 model, the three genomic compartments have approximately parallel responses in fit to the different clade regimes (Table 6; Fig. 9). In each case, clade regime 4 (the one that gives each clade its own ω parameter) was the worst fitting, with the other three regimes being very close in performance. This is yet another example of the data not supporting the most parameter-rich models (clade regime 4 has an additional parameter over the other three regimes; Fig. 2). The mitochondrial data, if analysed separately, showed considerable support for the inclusion of lineage effects under most of the clade regimes (most of the mitochondrial AICc improvements are greater than 4; Fig. 9). In contrast, the nuclear data showed considerable support for the simpler, lineage-effect free model (differences are greater than 4 AICc points). The plastid data did not show considerable support for either model over the other (Fig. 9).

For both the nuclear and plastid data, most sites were optimized by the software under site class 1 (the class constrained to have ω values < 1; see Methods section), indicating strong purifying selection. In fact, the ML estimates of ω for those data were near zero (Table 6; Fig. 10). For each of the three compartments, relatively few codon sites were optimized under site class 2 (ω between zero and infinity), additional evidence for a preponderance of purifying selection. However, those few mitochondrial and nuclear sites that were optimized under site class 2 had ML ω estimates considerably above 1, suggesting strong positive selection at a small number of sites across the tree (Table 6; Fig. 10(ii)).

The ω estimates for the parameters that were allowed to have different values for each clade (those for site class 3) are marked by two main patterns. The first is that the biggest ω differences are between the outgroup and the ingroup clades for both the mitochondrial data (a decrease in ω) and the plastid data (an increase in ω; Fig. 10(iii)). Presumably, a strong difference in selective pressure between the ingroup and outgroup is driving the high proportion of mitochondrial sites optimized under this site class (slightly over 50% of the codons; Fig. 10(i)). The second broad pattern is that there is little change in the signature of selection among the three ingroup clades. The mitochondrial and nuclear loci have ω estimates for site class 3 of approximately 0.3, regardless of the clade. The plastid data site class 3 ω estimate is approximately 1.25 (regardless of clade), but very few sites are allocated to this site class (Table 6; Fig. 10(i), (iii)).

Bayesian Analyses

The Bayesian analyses under the LURC and RLC models yielded results that were complementary to those obtained through the likelihood analyses. For each of the LURC compartmental analyses, the vittarioid branches are generally reconstructed as having higher rates of molecular evolution (Fig. 11(i)). However, this increase is against a very heterogeneous background of


[Figure 8 panels: (i) AICc of cod1–cod4 for a) unpartitioned data, b) data partitioned by compartment, and c) data partitioned by locus; (ii) AICc of cod1–cod4 partitioned by compartment and by locus. Y-axes: AICc (smaller for better models).]

FIGURE 8. Model fit of the full data on the general codon models. Units are in AICc points; smaller values indicate better fit. i) Fit under unpartitioned data (a) versus the same data partitioned by compartment (three partitions; b), and by locus (six partitions; c). ii) Zoomed-in view of the fit of the partitioned analyses.

rate variation (within each of the major clades there are both unusually fast branches, and particularly slow ones) and much of the rate increase in the vittarioids maps to a single branch (the stem branch of the clade). The RLC analyses reconstruct much more homogeneous patterns of rate variation (Fig. 11(ii)). Here, the vittarioids are uniformly fast, the cheilanthoids are generally slow, and Adiantum is somewhat intermediate. However, within that broad pattern, it is actually a cheilanthoid terminal branch (the one leading to Hemionitis) that has the highest rate. For the RLC analyses, the mean number of reconstructed rate shifts across the posterior sample is 8.06 ± 0.018, despite half the prior density for the number of rate changes being on 0.

DISCUSSION

Phylogenetic Relationships

The most significant phylogenetic result of this study is the very strong support for a monophyletic Adiantum (Fig. 1c). Adiantum has an unusual degree of morphological consistency compared to other large fern genera and is defined by a unique character state: sporangia borne on, and limited to, the false indusium (Tryon et al. 1990). However, molecular phylogenetic studies have struggled to find support for its monophyly with respect to the vittarioid ferns, which are strikingly different, morphologically (Gastony and Rollo 1995; Schuettpelz and Pryer 2007; Schuettpelz et al. 2007; Ruhfel et al. 2008; Bouma et al. 2010; Lu et al. 2011a,b). Our six-locus data set strongly supports the monophyly of Adiantum, with 94% ML bootstrap support and 1.0 posterior probability (Fig. 1c).

Other relationships within the adiantoids (Adiantum plus the vittarioids) were mostly in accordance with earlier studies (Crane et al. 1995; Crane 1997; Schuettpelz et al. 2007; Ruhfel et al. 2008; Lu et al. 2011a,b), but better supported. We find novel support for a clade uniting the enigmatic Rheopteris (the least morphologically reduced of the vittarioids) and Monogramma acrocarpa, one of the most reduced species, a result that suggests a pattern of convergent morphological simplification in the vittarioids (Crane et al. 1995; Schuettpelz et al. 2007; Ruhfel et al. 2008). Broad relationships within the cheilanthoid ferns are becoming increasingly well understood (Gastony and Rollo 1995; Gastony and Rollo 1998; Kirkpatrick 2007; Prado et al. 2007; Schuettpelz


[Figure 9 plot: improvement in fit (AICc) vs. site model (cod3), from −10 to 10, for the mitochondrial, plastid, and nuclear data across clade regimes 1–4.]

FIGURE 9. Fit of the compartment data to the branch-site (“clade”) models. The fit scores are standardized against the fit of the same data under the site model (values plotted are AICc score under cod3 minus AICc score under cod4). Units are thus in AICc points, but with larger (more positive) values indicating bigger improvements in fit for the branch-site model over the site model, and thus the magnitude of the influence of lineage effects. The entire line for the nuclear data is below zero because cod3 was a better fit for those data than was cod4 (and thus the AICc scores for cod4 were larger). The clade regimes follow Figure 2.

TABLE 6. Fit of the compartment data and parameter estimates for codon models

Model  Genome         Clade regime^a  Parameter count
cod3   Plastid        NA              64
cod3   Mitochondrial  NA              64
cod3   Nuclear        NA              56
cod4   Plastid        1               66
cod4   Plastid        2               66
cod4   Plastid        3               66
cod4   Plastid        4               67
cod4   Mitochondrial  1               66
cod4   Mitochondrial  2               66
cod4   Mitochondrial  3               66
cod4   Mitochondrial  4               67
cod4   Nuclear        1               58
cod4   Nuclear        2               58
cod4   Nuclear        3               58
cod4   Nuclear        4               59

[The remaining columns of Table 6 (lnL, AICc, the proportions of sites in site classes 1–3, and the ω estimates for site classes 1–3 in clades 0–3) are not legible in this copy.]

NA, not applicable. ^a Clade regimes correspond to those in Figure 2.

et al. 2007; Zhang et al. 2007; Rothfels et al. 2008; Windham et al. 2009; Eiserhardt et al. 2011; Johnson et al. 2012) and our results are consistent with these studies. A notable finding for cheilanthoids here, however, is the strong support for Calciphilopteris ludens as a member of this clade (Fig. 1c). Earlier studies that included this taxon typically resolved it as sister to the rest of the cheilanthoids, but without support (Schuettpelz et al. 2007; Zhang et al. 2007), making it unclear as to whether it was more closely related to the cheilanthoids, or to the adiantoids. Certainly, morphology would place Calciphilopteris in the cheilanthoids (Tryon 1942), and that is where it was treated in a recent classification of the genus (Yesilyurt and Schneider 2010).

Patterns of Substitution Rate Heterogeneity

In our analyses, we find rates of molecular evolution to vary among taxa, loci, and genomic compartments. Consistent with other studies of vascular plants (Wolfe et al. 1987; Wolfe et al. 1989b), our mitochondrial data are more slowly evolving than are our plastid data (Supplementary Fig. S2), despite the fact that the plastid loci are all coding and much of the nad5 alignment is intron sequence. In turn, the plastid loci are themselves much more slowly evolving than is our low-copy nuclear locus (Supplementary Fig. S2). However, these results are coarse: the nuclear rate, in particular, is inferred


[Figure 10 panels: (i) proportions of sites in site classes 1–3 for the plastid, mitochondrial, and nuclear data; (ii) and (iii) nonsynonymous to synonymous (dn/ds) substitution ratio, with (iii) plotted for clade 0 (outgroup), clade 1 (cheil.), clade 2 (Adiant.), and clade 3 (vittar.). Axis annotations: increasing purifying selection; increasing positive selection.]

FIGURE 10. Inferences of selection pressure on genomic compartments, across taxa. i) Proportion of sites allocated to each site class, for each genomic compartment, under cod4 (a “clade” branch-site model). Estimates for ω (the ratio of nonsynonymous to synonymous substitutions) for (ii) site classes 1 and 2, and (iii) site class 3. The ω values for site class 3 are permitted to vary across predefined “clades” in the tree, in this case, by the four clades of clade regime 4 (Fig. 2).

from a single short locus, with a high proportion of missing data in the intron portions (Supplementary Table S3).

The lineage-specific and partition-specific differences we observe are strong, and their intersection (the variation, across data sets, of the variation, across taxa, of the rate of molecular evolution) is especially interesting. Overall, the vittarioids show a dramatically elevated rate of molecular evolution, evolving 4.3 times faster on average than the cheilanthoids (under a local clock model with all data linked: bas6 with clock regime 4; Dryad doi:10.5061/dryad.c5m42). This rate acceleration is even more pronounced when the plastid data are considered alone (Table 4; Fig. 7). Adiantum is also faster than the cheilanthoids, but generally much slower than the vittarioids. Local clock models that force the cheilanthoids and Adiantum to share a rate (vittarioids get their own) have a much better fit than those that force the vittarioids to share a rate with either the cheilanthoids or Adiantum (by 367 and 329 AICc points, respectively; bas7; Fig. 5; Table 3). However, it is important to note that local clock models that give each of the three major clades its own rate are still preferred over those that force any two to share a rate (clock regime 4 vs. clock regimes 1, 2, or 3, Fig. 2), by a minimum of 137 AIC points (Fig. 5; Table 3). Although within-clade rate variation can result in the superior fit of local clock models even when there is no difference in expected rates among the clades in question (Lanfear 2010), this is not driving our results. Under the Local Clock Permutation Test (Lanfear 2010; 1000 permutations used) both the plastid and nuclear data reject the null model of no rate differences among the focal clades (p < 0.001 and p < 0.015, respectively), but the mitochondrial data do not.

From the Bayesian analyses, we reconstruct the vittarioid stem branch to have an average substitution rate that is 2.16 or 2.30 times that of the branch that gave rise to it (under the LURC and RLC models, respectively; Fig. 11). This acceleration for vittarioids is noteworthy in that for the subset of previous studies where polarity could be established, rate accelerations were less frequent than were slowdowns (e.g., Soltis et al. 2002; Schuettpelz and Pryer 2006; Bininda-Emonds 2007; Korall et al. 2010; Li et al. 2011).

The strong, broad lineage effects we see in our data set occur in the context of extensive fine-grained rate variation. In all cases, clockless (unrooted) models fit the data much better than do local clock models (Fig. 5; Table 3), despite the greatly increased parameter number of the former. The fine-grained effects are best exhibited by the Bayesian LURC analyses (Fig. 11(i)). Under this uncorrelated model of rate variation, which lacks any a priori division of the tree into focal clades, the data prefer many rate changes, spread across the tree. Most of the vittarioid branches are slower than at least one branch in each of the other clades (Fig. 11(i)) and the vittarioid rate increase, under this model, appears to be largely attributable to a single very fast branch: the stem branch of the clade. Even under the RLC model, which strongly favors very few rate changes (half the prior density is on zero rate changes), the mean number of rate changes across the tree is above eight, and three rate changes (the number necessary to give each of the major clades their own rates with no further changes) is outside the 95% credibility interval for the rateChangeCount parameter. Inferences under this model do, however, suggest rather uniformly elevated rates across the vittarioids, instead of a single fast branch at the base of the clade as inferred under


[Figure 11, panels i) LURC and ii) RLC. Median rate estimates are printed above the branches; in panel ii, vittarioid and Adiantum branches share a stated rate unless otherwise indicated. Arrows a and b mark the two highlighted discrepancies. Horizontal axis: arbitrary time units from present.]

FIGURE 11. Comparison of Bayesian relaxed clock models: rate and divergence time contrasts. Results from the full data, partitioned by compartment, analysed under a lognormal uncorrelated relaxed clock (LURC; i) and a random local clock (RLC; ii) model. Branch thickness and shading are both proportional to inferred median rate for the branch; median rate estimates are presented above the branches. Two discrepancies between the models are highlighted: the difference between the inferred ages of the ingroup (arrow a) and between the inferred crown age of the vittarioids (arrow b).
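The local clock comparisons reported in this section are stated as differences in AIC or AICc points between competing models. As a minimal sketch of how such differences are obtained, the snippet below computes the standard small-sample corrected criterion, AICc = -2 lnL + 2k + 2k(k + 1)/(n - k - 1), and the score of each model relative to the best one. The function names, the model labels, and every number in the example are hypothetical placeholders; they are not values from this study, and what to use as the sample size n (for instance, alignment length) is an assumption the analyst has to make.

```python
def aicc(log_likelihood, n_params, n_obs):
    """Small-sample corrected Akaike information criterion.

    log_likelihood: maximized log-likelihood (lnL) of the model
    n_params:       number of free parameters (k)
    n_obs:          sample size (n); which quantity to use is an assumption
    """
    if n_obs - n_params - 1 <= 0:
        raise ValueError("AICc is undefined when n_obs <= n_params + 1")
    aic = -2.0 * log_likelihood + 2.0 * n_params
    correction = (2.0 * n_params * (n_params + 1)) / (n_obs - n_params - 1)
    return aic + correction


def delta_scores(scores):
    """Return each model's score minus the best (lowest) score."""
    best = min(scores.values())
    return {name: value - best for name, value in scores.items()}


# Hypothetical example: two made-up clock regimes fitted to the same data.
N_SITES = 500  # hypothetical sample size
fits = {
    "two_rate_regime": {"lnL": -1000.0, "k": 10},    # hypothetical fit
    "three_rate_regime": {"lnL": -990.0, "k": 12},   # hypothetical fit
}
scores = {
    name: aicc(fit["lnL"], fit["k"], N_SITES) for name, fit in fits.items()
}
deltas = delta_scores(scores)

for name in sorted(deltas, key=deltas.get):
    print(f"{name}: AICc = {scores[name]:.2f}, delta = {deltas[name]:.2f}")
```

The model with a delta of zero is the preferred one; larger deltas correspond to the "points worse" figures quoted in the text.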

the LURC model. Under the RLC model, all vittarioid branches have faster median rates than do any other branches, save that of the cheilanthoid Hemionitis, which is reconstructed as fastest of all.

The pattern of variation across our data sets is similar to that seen across taxa, with some strong, coarse patterns in the context of considerable fine-scaled variation. In our analyses, the “variation in rate variation” is partitioned much more strongly among compartments than among loci (Fig. 4). Generally, the loci within a compartment behave similarly. This result contrasts with, for example, Moncalvo et al. (2000), who found significantly different patterns of rate variation among loci within compartments. The differences among compartments we observe compose perhaps the most difficult set of results to interpret. Both the plastid and the nuclear data strongly support an accelerated rate of evolution for the vittarioids compared


to either other major clade, but the degree of the increase differs (in the plastid data, vittarioids are 5.8 times faster than cheilanthoids and 2.7 times faster than Adiantum; in the nuclear data, those values are 3.1 and 2.5, respectively; Fig. 7; Table 4). The mitochondrial data, in contrast, show a rate increase for both Adiantum and the vittarioids, versus the cheilanthoids, but no subsequent increase from Adiantum to the vittarioids (vittarioids:cheilanthoids = 3.1; Adiantum:cheilanthoids = 3.2; vittarioids:Adiantum = 0.97; Fig. 7; Table 4).

The mitochondrial data therefore stand out, even by visual inspection (Figs. 1, 3). The two mitochondrial loci clearly differ from the other compartments, but also from each other, and both are evolving relatively slowly, on average (Supplementary Fig. S2). It is tempting to attribute the anomalous mitochondrial pattern to stochastic variation in the substitution process, given the generally weak signal present in the mitochondrial data sets. However, the model selection analyses demonstrate that the variation is nonetheless significant: the mitochondrial data fit much better under a local clock regime that forces Adiantum and the vittarioids to share a rate and gives the cheilanthoids their own clock, than under either of the alternatives (e.g., the clock regime that has a single rate for the cheilanthoids and Adiantum, and a different rate for the vittarioids, is 69 AIC points worse; Fig. 6; Table 4). In addition, unlike the plastid or nuclear data, the mitochondrial data do not favor the four-rate local clock model (clock regime 4, Fig. 2). Giving each of the major clades (plus the outgroup) its own rate results in a small reduction in fit (of 2.0 AIC points) over the three-rate clock regime 1 (in which Adiantum and the vittarioids share a rate; Fig. 6; Table 4).

One of the noteworthy results of the nucleotide model selection analyses was the strong preference of the data for the most highly parameterized model (bas2), which is not only clockless, with unlinked substitution parameters, but additionally lacks a proportional branch length constraint. In effect, bas2 treats each partition as coming from a separate evolutionary process. This model is a much better fit for the data than are any of the other general models, and bas2 partitioned by locus outperforms the same model partitioned by compartment (by 37.9 AIC points; Fig. 4; Table 3). Thus, fine-scale, locus-specific variation in the variation of rate is one of the dominant features of these data. Even among the local clock models, those which permit each clock to vary freely across partitions (e.g., if the rate for clock 1 is twice that of clock 2 in the first partition, it need not be so for the second partition) fit considerably better than do the corresponding local clock models where the clocks are constrained to be proportional across partitions (bas6 vs. bas7; Fig. 5; Table 3).

Overall, the results of our analyses eliminate the possibility that the apparent rate acceleration of the vittarioids is simply due to stochastic variation in the substitution process. The differences in fit among the local clock models under the different local clock regimes (Fig. 5), and the strongly elevated rates inferred for the vittarioids under those same models (Fig. 7) demonstrate that the rate acceleration of the vittarioids is a prominent feature of these data. The general consistency of this acceleration across the plastid and nuclear loci (Fig. 7) is most compatible with demographic (selection and drift related) and/or life-history based explanations. Changes in polymerase proofreading accuracy (cf. Cho et al. 2004; Parkinson et al. 2005), for example, can be discounted: since the compartments have their own polymerases, such a change would affect only a single compartment.

Patterns of Selection

Our results with codon models showed a pattern somewhat opposed to that of the nucleotide models. Whereas the fit of the nucleotide models improved with model complexity (the best-fitting model was the most highly parameterized one, with the finest grained partition scheme), codon model fit was maximized with models of intermediate complexity. The unpartitioned codon models fit very poorly (Fig. 8(i)), but the three-partition (by compartment) scheme outperformed the more complex six-partition (by locus) scheme (Fig. 8(ii)). This result is in contrast to earlier studies (Eyre-Walker and Gaut 1997; Muse and Gaut 1997; Muse 2000) that found selective effects (nonsynonymous substitution rate differences, in their case) to be largely locus specific.

The intermediate-is-best pattern is also seen among the main model types. Model cod1, which does not include site effects, fits very poorly. The site models (cod2 and cod3) fit much better, but the further addition of branch effects (in the branch-site model cod4) has a negligible effect (Fig. 8(ii); Table 5). The lack of a lineage effect in the fit of the codon models is among our most interesting results. In marked contrast to the nucleotide-based analyses, our codon-based analyses show no significant evidence of fine-scale differences across data partitions or strong differences among lineages (Figs. 9 and 10). Differences in selective signatures, when they do appear (i.e., the evidence for strong positive selection in a subset of the mitochondrial sites but not for those of the plastid; Fig. 10(ii)), extend across the tree, and thus cannot explain the heterogeneity in evolutionary rates that is so striking in this group. Finally, the only indications of lineage effects (e.g., ML estimates of ω for site class 3 of cod4) are weak, and occur between the outgroup and ingroup, rather than among ingroup clades (Fig. 10(iii)).

These results further limit the pool of biological mechanisms that might explain the vittarioid rate increase. While our nucleotide-based results are consistent with life-history or demography-based explanations, the subset of explanations that depend on altered post-mutation fixation rates, including those that invoke changes in the relative strength of selection and drift (e.g., reduced efficacy of selection in small populations; Woolfit and Bromham 2003; Bromham and


Leys 2005; Woolfit and Bromham 2005) are undermined by the absence of lineage-specific signatures of selection in our codon-based analyses. Selection may still be ultimately responsible for the elevated rates, but only indirectly (e.g., through selection for increased mutation rates in the vittarioids). It is possible that lineage-specific effects were undetectable by our methods (there are one-third as many codons as nucleotide sites, so the codon models may lack some of the power of their nucleotide counterparts). However, the strong contrast between the two model types (vittarioids having very strong lineage effects in the nucleotide models vs. none in the codon models) argues that selective effects, if present, are too weak to explain the observed rate discrepancy. The remaining subset of explanations, then, is limited to those that invoke changes in the supply of mutations, rather than changes in the rate of post-mutation fixation. Because only those explanations with organism-wide effects are tenable, given the bigenomic nature of the rate increase, life history or related environmental factors are considered to be the most likely agent.

Reports of molecular rate heterogeneity caused by changes in environment-driven mutation rates are rare in the literature. To our knowledge, such effects have been found only in lichen-forming fungi (Lutzoni and Pagel 1997) and halophilic crustaceans (Hebert et al. 2002). That said, it is important to note that the rate increase for the vittarioids may be driven substantially by a single elevated rate on the clade’s stem branch (Fig. 11(i)). If this one branch is responsible for the rate increase, looking for environmental correlates among extant vittarioids could be unproductive. Instead, one would need to ask what was so unusual about the environment of the ancestor of the vittarioids that could cause mutation rates high enough to yield a rate of substitution approaching six times that of the background, even when spread across all descendant branches?

Methods of Investigating Rate Heterogeneity

We found our model-fitting approach, using the AICc, to be an effective means of investigating rate heterogeneity. Model-fitting analyses allowed us to establish the pattern of rate variation without requiring multiple independent observations, and to evaluate the assumption that rate is an organism-level phenomenon influenced by some component of that organism’s life history and thus manifest across loci and genomic compartments (e.g., Bousquet et al. 1992; Nikolaev et al. 2007; Bromham 2009; Lanfear et al. 2010). In addition, our approach allowed us to avoid the challenges of phylogenetic non-independence, which have been prevalent in the history of rate heterogeneity research (reviewed in Bromham 2002). Typically, the issue of non-independence is accommodated by examining a series of independent pair-wise comparisons (informed by, for example, a particular phenotypic change of choice) evaluated with a sign test or similar approach (e.g., Sarich and Wilson 1967; Muse and Weir 1992; Bromham et al. 1996; Bromham and Woolfit 2004; Lanfear et al. 2010). Phylogenetic non-independence is much more difficult to account for in cases, like ours, that lack repeated examples of a rate change correlated with a plausible external factor. There has been a temptation in such cases to approach evolutionary rate itself as a trait and to reconstruct the evolution of that trait on a phylogeny (e.g., Lutzoni and Pagel 1997, their “method 1”). However, if one attempts to do so in a likelihood or Bayesian framework, then the quantity being reconstructed (rate) is itself part of the model used in the reconstruction (the branch lengths). Other approaches also potentially run subtly afoul of the issue of independence. Many analyses that explicitly model rate variation do so in a manner that assumes a degree of correlation among parent and child branches (e.g., r8s; Sanderson 2002, 2003), rendering branch rates dependent (cf. Hoegg et al. 2004; Korall et al. 2010). Finally, even if a model does not assume autocorrelation (e.g., some of the models in BEAST; Drummond et al. 2006; Drummond and Rambaut 2007) those branches are still embedded within the hierarchy of the phylogenetic tree. Different branches each have a role in accommodating the same data, and thus a rate reconstructed for a given branch is not independent of the rates reconstructed for the others. It remains to be seen whether these issues have practical effects in real data sets; preliminary comparisons suggest that they do not (e.g., Nabholz et al. 2008; Welch et al. 2008). Much future progress is likely to come from recent approaches that model rate variation explicitly as a trait, or at least, as “trait dependent” (Lartillot and Poujol 2010; Mayrose and Otto 2011). By coestimating rate and divergence time, these methods avoid the circularity and non-independence problems; related models can then be compared using likelihood ratio tests or model selection approaches, in a manner similar to this study.

Our model-fitting analyses, under likelihood, did require the a priori identification of models to compare, which was not trivial. The number of possible parameterizations of a partitioned analysis increases rapidly with the number of partitions. Our data set of six loci can be divided into groups of loci 203 different ways. In a simplified scenario, where for each of the 203 different partitioning regimes we allow parameters of each “type” (exchangeability; base frequency; gamma shape; clock regime: unrooted, global, or one of four local clock regimes) to be either shared across the full data set or allocated individually to each of the partitions, there are 9744 different possible models. However, even with the requirement of a priori identification of models, the ML model-fitting approach described here will remain attractive. In our case, it allowed us to elucidate complex patterns of rate variation, in the absence of any associated correlation-based hypotheses, and the same approach was readily extendable to our investigations of complex heterogeneous patterns of selection.
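The model counts quoted above can be checked with a few lines of arithmetic. The number of ways to divide six loci into groups is the sixth Bell number, and the sketch below computes it with the standard recurrence. The second step is only one possible reading of the "simplified scenario": it assumes that each of the three substitution parameter types is either shared or allocated per partition (two choices each) and that the clock component takes one of six settings. That reading is an assumption made here because it reproduces the stated total; the text does not spell out the decomposition, and the names in the code are illustrative.

```python
from math import comb


def bell_number(n):
    """Number of ways to partition a set of n labelled items into groups.

    Uses the recurrence B(m + 1) = sum over k of C(m, k) * B(k), with B(0) = 1.
    """
    bell = [1]
    for m in range(n):
        bell.append(sum(comb(m, k) * bell[k] for k in range(m + 1)))
    return bell[n]


N_LOCI = 6
partition_schemes = bell_number(N_LOCI)

# Assumed decomposition (not stated explicitly in the text):
# three parameter types, each either shared or allocated per partition ...
LINKABLE_TYPES = ("exchangeability", "base frequency", "gamma shape")
# ... and six clock settings: unrooted, global, or one of four local regimes.
CLOCK_SETTINGS = ("unrooted", "global", "local 1", "local 2", "local 3", "local 4")

settings_per_scheme = 2 ** len(LINKABLE_TYPES) * len(CLOCK_SETTINGS)
total_models = partition_schemes * settings_per_scheme

print(partition_schemes, settings_per_scheme, total_models)  # 203 48 9744
```

Adding a seventh locus would raise the number of partitioning schemes from 203 to 877, which illustrates how quickly the candidate set grows.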


Understanding the patterns of selection proved critical in this study, to evaluating the potential causes of the vittarioid rate increase. Given the vittarioids’ idiosyncratic biology, there are many potential correlates for their elevated rate of evolution (e.g., changes associated with their reduced morphologies, with their gametophyte generation, or induced by their epiphytic habit). However, any explanation for their elevated rates, based on these correlates, should be interpreted with caution to ensure that mutation, rather than fixation, is invoked. This danger of inferring causation from correlation is perhaps best illustrated, and most intriguing, in studies of rate heterogeneity in plants, where many studies have found a correlation between generation time and evolutionary rate (Bousquet et al. 1992; Gaut et al. 1992; Laroche et al. 1997; Smith and Donoghue 2008; Korall et al. 2010), a pattern well documented in animals. However, plants do not have a segregated germ line, and thus should show no correlation between the number of generations and the number of potentially mutation-inducing replications in the history of their gametes, which is the explanation typically proffered for the “generation time hypothesis” (Mooers and Harvey 1994; Li et al. 1996). As Muse (2000) notes, while such a correlation may exist in plants, the mechanism is unclear (but see Lanfear et al. 2013).

Implications for Topology Inference and Divergence Time Dating

Questions about the patterns and causes of rate variation aside, our results are also relevant to recent debates concerning the influence of rate heterogeneity, and the way it is modeled, on phylogenetic inference. Drummond et al. (2006) make the case that standard “unrooted” (or “no-clock”; Yang and Rannala 2006; Wertheim et al. 2010) models are flawed, in that they do not require contemporaneity of the tips. Given that we know that the tips are contemporaneous (in most cases), is it reasonable to ignore that information when inferring phylogeny? Might our inferences be more accurate if such data were included explicitly in the model via “relaxed clock” approaches (Drummond et al. 2006)? Recent studies failed to find any significant improvement of relaxed clock over unrooted models (Wertheim et al. 2010; Rothfels et al. 2012), but were generally inconclusive: “The question remains whether, in practice, modeling rate variation among branches can improve phylogenetic inference” (Wertheim et al. 2010). In our case, the inference of ingroup relationships was largely insensitive to the choice of model (at least, among those investigated, which included both unrooted and relaxed clock models). However, when unconstrained, both relaxed clock models failed to root the tree correctly, placing the root along the fastest branch (separating the vittarioids from the rest of the tree) rather than between the ingroup and outgroup (results not shown). This placement renders the adiantoids paraphyletic, and strongly conflicts with the results obtained from expanded taxon samples (e.g., Gastony and Rollo 1995; Schuettpelz and Pryer 2007; Schuettpelz et al. 2007; Bouma et al. 2010). At least for these data, then, the relaxed clock models (with our choices of priors and parameterization) are not yet fully capable of accurately modeling rate variation such that they are able to root this tree correctly without additional information.

While the effects of model choice on topology are perhaps modest, their effects on inferences of timing (the dating of evolutionary events) are strong. The two relaxed clock models (LURC and RLC) we employed not only made different inferences about the location and scale of rate changes, but also inferred correspondingly strong differences in the relative timing of divergences (Fig. 11). The RLC model inferred events to be generally younger than the LURC, by over 10% of the total tree height for the base of the ingroup (Fig. 11, arrow a), and by over 20% of the tree height for the crown clade of the vittarioids (Fig. 11, arrow b). These discrepancies are potentially the result of a worst-case scenario (very strong rate heterogeneity in a phylogeny without internal time constraints; our sole constraint is at the base of the tree) but are nonetheless dramatic and potentially worrisome. A closer inspection of the results of the two models shows that, as expected, the branch rates inferred under one model are strongly correlated with those inferred under the other (Supplementary Fig. S3). However, the stem branch for the vittarioids is an outlier: it is inferred to have a much higher rate under the LURC model than under the RLC model, whereas the other vittarioid branches have a much lower rate under the LURC model than under the RLC model (the vittarioid stem branch is below the diagonal in Supplementary Fig. S3, vs. the other vittarioid branches, which are above the diagonal). In effect, the tendency of the RLC model to penalize frequent rate changes causes it to prefer to distribute the substitutions that the LURC model loaded onto the single stem branch, and spread them across the entire clade. The LURC model correspondingly infers a shorter (but faster) stem branch, whereas the RLC model infers a longer, slower branch (Fig. 11), and thus very different divergence times. A similar complex model-mediated effect is apparent in Hemionitis, which is on a sufficiently long terminal branch that it is given its own rate under both models. The RLC model prefers to force all the neighboring branches to share a single rate (especially Mildella, the slow sister lineage to Hemionitis) and thus the rate change on the Hemionitis branch is particularly extreme (Fig. 11, Supplementary Fig. S3). As with the vittarioids, a single deeper branch is inferred to have a much higher rate under the LURC than RLC models (Supplementary Fig. S3, arrow), with the more distal tips showing the opposite pattern, for the overall result of a small number of ancestral branches below the diagonal in Supplementary Fig. S3, and many more (typically terminal) branches above the diagonal. In effect, the RLC prefers rate homogeneity, but when heterogeneity is necessary, the degree of change is unimportant. The LURC model, with its


tendency toward many smaller rate changes, is able to “accommodate” some of the additional substitutions on the deeper ancestral branches, resulting in rate changes that are less dramatic but more frequent (Fig. 11, Supplementary Fig. S3).

While all of our rate and divergence time inferences are model dependent (we do not know the “true rates”), the concordance between the pattern of rate variation in the LURC results and the strong preference of the data for ML models with the maximum number of permitted rate parameters (Fig. 4; Table 3) suggests that fine-scale rate heterogeneity may be dominant in our data. Contrary to the position of Drummond and Suchard (2010) that “in any given tree there exist a small number of rate changes … in general, the numerous small changes arise as a modeling consequence, and are not necessarily data-driven” (see also Yoder and Yang 2000), our data suggest the opposite: it may be the few, large changes inferred under the RLC model that are a model consequence.

An appealing avenue for future research is to compare the fit of complex relaxed clock models on diverse data sets, using Bayes factors. Unfortunately, Bayes factors are frequently calculated from the harmonic mean, which is an unreliable method (Lartillot and Philippe 2006; Fan et al. 2011; Xie et al. 2011); more reliable methods are just now becoming available (Lewis et al. 2010; Baele et al. 2012). Regardless of the ultimate truth, one conclusion is clear: inferences of rate, and thus of divergence times, are critically dependent on the model adopted.

SUPPLEMENTARY MATERIAL

Data files and/or other supplementary information related to this paper have been deposited at Dryad (http://datadryad.org/) under doi:10.5061/dryad.c5m42.

FUNDING

This work was supported by the Society of Systematic Biologists (Graduate Student Research Award to C.J.R.), the Natural Sciences and Engineering Research Council of Canada (a Julie Payette PGS M and PGS D to C.J.R.), and the National Science Foundation (DEB-1145925 to E.S.).

ACKNOWLEDGMENTS

Our great thanks to the individuals who made this project possible: Kathleen Pryer, who had a pivotal role in initiating and guiding this project; Layne Huiet, who assisted in the lab and provided critical materials and unpublished information regarding the phylogeny of Adiantum; Ben Redelings for guidance through (and execution of) the BAli-Phy analyses; Ester Gaya and Casey Dunn for assistance with PAMLparser; Volker Knoop for supplying unpublished nad5 primer sequences; and Joe Bielawski, Fay-Wei Li, Nimrod Rubinstein, and Dave Swofford, for assistance with the complexities of parameter counting, model selection, and codon models. Maarten Christenhusz, Jim Croft, Layne Huiet, Thomas Janssen, Nathalie Nagalingum, Tom Ranker, Harald Schneider, Alan Smith, and Paul Wolf contributed plant materials used in this study; we thank the staff of A, COLO, DUKE, GOET, P, TUR, UC, and UTC for curating the associated vouchers. This manuscript was greatly improved through the thoughtful comments of Rob Lanfear, Sally Otto, Editor-in-Chief Frank Anderson, Associate Editor Roberta J. Mason-Gamer, and an anonymous reviewer.

APPENDIX 1

List of accessions sampled in this study, presented in the following format: Species, Fern Lab database number (http://fernlab.biology.duke.edu/), Voucher (HERBARIUM), Provenance: GenBank numbers (with citations for previously published sequences) for atpA, atpB, rbcL, atp1, nad5, gapCp (in that order). Missing data are indicated by “–”. Herbarium acronyms follow Index Herbariorum (Thiers [continuously updated]).

Adiantum aethiopicum L., 3895, N.Nagalingum 24 (DUKE), Australia, New South Wales: KC984436, KC984441, KC984519, KC984409, KC984493, KC984450. Adiantum formosum R.Br., 4602, A.R.Smith s.n. (UC), Cult. (wild provenance unknown): KC984437, KC984442, KC984520, KC984410, KC984494, KC984453. Adiantum hispidulum Sw., 4603, L.Huiet 101 (UC), Cult. (wild provenance unknown): KC984438, KC984443, KC984521, KC984411, KC984495, KC984455. Adiantum malesianum J.Ghatak, 2506, L.Huiet 111 (UC), Cult. (wild provenance unknown): EF452068 (Schuettpelz et al. 2007), EF452011 (Schuettpelz et al. 2007), EF452132 (Schuettpelz et al. 2007), KC984412, KC984496, EU551257 (Schuettpelz et al. 2008). Adiantum peruvianum Klotzsch, 2507, L.Huiet 103 (UC), Cult. (wild provenance unknown): EF452070 (Schuettpelz et al. 2007), EF452013 (Schuettpelz et al. 2007), EF452133 (Schuettpelz et al. 2007), KC984413, KC984497, KC984456. Adiantum raddianum C.Presl, 638, P.G.Wolf 717 (UTC), Cult. (wild provenance unknown): EF452071 (Schuettpelz et al. 2007), U93840 (Wolf 1997), KC984522, KC984414, KC984498, KC984458. Adiantum tenerum Sw., 2504, L.Huiet 107 (UC), Cult. (wild provenance unknown): EF452072 (Schuettpelz et al. 2007), EF452014 (Schuettpelz et al. 2007), EF452134 (Schuettpelz et al. 2007), KC984415, KC984499, KC984459. Adiantum tetraphyllum Humb. & Bonpl. ex Willd., 2505, L.Huiet 105 (UC), Cult. (wild provenance unknown): EF452073 (Schuettpelz et al. 2007), EF452015 (Schuettpelz et al. 2007), EF452135 (Schuettpelz et al. 2007), KC984416, KC984500, KC984460. Anetium citrifolium (L.) Splitg., 3339, M.Christenhusz 4076 (TUR), France, Guadeloupe: EF452075 (Schuettpelz et al. 2007), EF452017 (Schuettpelz et al. 2007), KC984523, KC984417, KC984501, KC984461. Antrophyum latifolium


Blume, 3078, T.Ranker 1774 (COLO), Papua New Guinea: EF452076 (Schuettpelz et al. 2007), EF452018 (Schuettpelz et al. 2007), EF452138 (Schuettpelz et al. 2007), KC984418, KC984502, KC984462. Bommeria hispida (Mett. ex Kuhn) Underw., 3174, E.Schuettpelz et al. 467 (DUKE), USA, Arizona, Cochise Co.: EU268725 (Rothfels et al. 2008), EF452022 (Schuettpelz et al. 2007), EF452142 (Schuettpelz et al. 2007), KC984419, KC984503, KC984463. Calciphilopteris ludens (Wall. ex Hook.) Yesilyurt & H.Schneid., 3510, H.Schneider s.n. (GOET), Cult. (wild provenance unknown): EU268741 (Rothfels et al. 2008), EF452031 (Schuettpelz et al. 2007), EF452150 (Schuettpelz et al. 2007), KC984422, KC984506, KC984465. Cheilanthes covillei Maxon, 3150, E.Schuettpelz et al. 443 (DUKE), USA, Arizona, Maricopa Co.: EU268733 (Rothfels et al. 2008), KC984444, EU268782 (Rothfels et al. 2008), KC984420, KC984504, EU551267 (Schuettpelz et al. 2008). Cryptogramma crispa (L.) R.Br. ex Hook., 2949, M.Christenhusz & F.Katzer 3871 (DUKE), United Kingdom, Scotland: EU268740 (Rothfels et al. 2008), EF452027 (Schuettpelz et al. 2007), EF452148 (Schuettpelz et al. 2007), KC984421, KC984505, KC984464. Haplopteris elongata (Sw.) E.H.Crane, 2546, L.Huiet 112 (UC), New Caledonia: EF452096 (Schuettpelz et al. 2007), EF452035 (Schuettpelz et al. 2007), EF452153 (Schuettpelz et al. 2007), KC984423, KC984507, KC984466. Hecistopteris pumila (Spreng.) J.Sm., 3278, M.Christenhusz 3976 (TUR), France, Guadeloupe: EF452097 (Schuettpelz et al. 2007), EF452036 (Schuettpelz et al. 2007), KC984524, KC984424, KC984508, –. Hemionitis palmata L., 2557, E.Schuettpelz 297 (DUKE), Cult. (wild provenance unknown): EU268743 (Rothfels et al. 2008), EF452037 (Schuettpelz et al. 2007), KC984525, KC984425, KC984509, KC984467. Mildella intramarginalis (Kaulf. ex Link) Trevis, 3513, H.Schneider s.n. (GOET), Cult. (wild provenance unknown): EF452085 (Schuettpelz et al. 2007), EF452025 (Schuettpelz et al. 2007), EF452146 (Schuettpelz et al. 2007), KC984426, KC984510, KC984468. Monogramma acrocarpa (Holttum) D.L.Jones, 3375, T.Ranker 1778 (COLO), Papua New Guinea: KC984435, KC984439, EF452156 (Schuettpelz et al. 2007), KC984427, KC984511, –. Monogramma graminea (Poir.) Schkuhr, 3548, T.Janssen 2692 (P), France, Reunion: EF452102 (Schuettpelz et al. 2007), EF452040 (Schuettpelz et al. 2007), EF452157 (Schuettpelz et al. 2007), KC984428, KC984512, KC984469. Notholaena grayi Davenp., 3187, E.Schuettpelz et al. 480 (DUKE), USA, Arizona, Cochise Co.: EU268749 (Rothfels et al. 2008), JF832173 (Rothfels et al. 2012), EU268794 (Rothfels et al. 2008), KC984429, KC984513, KC984470. Pellaea atropurpurea (L.) Link, 2957, E.Schuettpelz 312 (DUKE), Cult. (originally collected from Virginia, USA): JQ855925 (Johnson et al. 2012), KC984440, EF452162 (Schuettpelz et al. 2007), KC984430, KC984514, KC984471. Pentagramma triangularis (Kaulf.) Yatsk., Windham & E.Wollenw., 3152, E.Schuettpelz et al. 445 (DUKE), USA, Arizona, Gila Co.: EU268768 (Rothfels et al. 2008), EF452049 (Schuettpelz et al. 2007), EF452165 (Schuettpelz et al. 2007), KC984431, KC984515, KC984472. Pityrogramma austroamericana Domin, 2561, E.Schuettpelz 301 (DUKE), Cult. (wild provenance unknown): EU268769 (Rothfels et al. 2008), EF452050 (Schuettpelz et al. 2007), EF452166 (Schuettpelz et al. 2007), KC984432, KC984516, KC984473. Rheopteris cheesemaniae Alston, 3373, Croft 1749 (A), Papua New Guinea: EF452126 (Schuettpelz et al. 2007), EF452063 (Schuettpelz et al. 2007), EF452176 (Schuettpelz et al. 2007), KC984433, KC984517, –. Vittaria graminifolia Kaulf., 2395, E.Schuettpelz 227 (DUKE), Ecuador, Zamora-Chinchipe Prov.: EF452128 (Schuettpelz et al. 2007), EF452064 (Schuettpelz et al. 2007), U21295 (Crane et al. 1995), KC984434, KC984518, –.

REFERENCES

Akaike H. 1974. A new look at the statistical model identification. IEEE T. Automat. Contr. 19:716–723.
Baele G., Lemey P., Bedford T., Rambaut A., Suchard M.A., Alekseyenko A.V. 2012. Improving the accuracy of demographic and molecular clock model comparison while accommodating phylogenetic uncertainty. Mol. Biol. Evol. 29:2157–2167.
Baer C.F., Miyamoto M.M., Denver D.R. 2007. Mutation rate variation in multicellular eukaryotes: Causes and consequences. Nat. Rev. Genet. 8:619–631.
Bielawski J., Yang Z. 2004. A maximum likelihood method for detecting functional divergence at individual codon sites, with application to gene family evolution. J. Mol. Evol. 59:121–132.
Bininda-Emonds O.R.P. 2007. Fast genes and slow clades: Comparative rates of molecular evolution in mammals. Evol. Bioinform. 3:59–85.
Bouma W., Ritchie P., Perrie L. 2010. Phylogeny and generic taxonomy of the New Zealand Pteridaceae ferns from chloroplast rbcL DNA sequences. Aust. Syst. Bot. 23:143–151.
Bousquet J., Strauss S., Doerksen A., Price R. 1992. Extensive variation in evolutionary rate of rbcL gene-sequences among seed plants. PNAS 89:7844–7848.
Brandley M., Schmitz A., Reeder T. 2005. Partitioned Bayesian analyses, partition choice, and the phylogenetic relationships of scincid lizards. Syst. Biol. 54:373–390.
Bromham L. 2002. Molecular clocks in reptiles: Life history influences rate of molecular evolution. Mol. Biol. Evol. 19:302–309.
Bromham L. 2009. Why do species vary in their rate of molecular evolution? Biol. Letters 5:401–404.
Bromham L., Leys R. 2005. Sociality and the rate of molecular evolution. Mol. Biol. Evol. 22:1393–1402.
Bromham L., Rambaut A., Harvey P. 1996. Determinants of rate variation in mammalian DNA sequence evolution. J. Mol. Evol. 43:610–621.
Bromham L., Woolfit M. 2004. Explosive radiations and the reliability of molecular clocks: Island endemic radiations as a test case. Syst. Biol. 53:758–766.
Bromham L., Woolfit M., Lee M., Rambaut A. 2002. Testing the relationship between morphological and molecular rates of change along phylogenies. Evolution 56:1921–1930.
Burnham K.P., Anderson D.R. 2002. Model selection and multimodel inference: A practical information-theoretic approach. New York, USA: Springer.
Burnham K.P., Anderson D.R. 2004. Multimodel inference: Understanding AIC and BIC in model selection. Sociol. Method. Res. 33:261–304.
Cho Y., Mower J., Qiu Y.L., Palmer J. 2004. Mitochondrial substitution rates are extraordinarily elevated and variable in a genus of flowering plants. PNAS 101:17741–17746.
Crane E.H. 1997. A revised circumscription of the genera of the fern family Vittariaceae. Syst. Bot. 22:509–517.
Crane E.H., Farrar D.R., Wendel J.F. 1995. Phylogeny of the Vittariaceae: Convergent simplification leads to a polyphyletic Vittaria. Am. Fern J. 85:283–305.


Cronn R., Cedroni M., Haselkorn T., Grover C., Wendel J.F. 2002. PCR-mediated recombination in amplification products derived from polyploid cotton. Theor. Appl. Genet. 104:482–489.
Davies T., Savolainen V., Chase M., Moat J., Barraclough T. 2004. Environmental energy and evolutionary rates in flowering plants. P. Roy. Soc. Lond. B Bio. 271:2195–2200.
Des Marais D.L., Smith A.R., Britton D.M., Pryer K.M. 2003. Phylogenetic relationships and evolution of extant horsetails, Equisetum, based on chloroplast DNA sequence data (rbcL and trnL-F). Int. J. Plant Sci. 164:737–751.
Drummond A.J., Ho S.Y.W., Phillips M.J., Rambaut A. 2006. Relaxed phylogenetics and dating with confidence. PLoS Biol. 4:699–710.
Drummond A.J., Rambaut A. 2007. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol. Biol. 7:214.
Drummond A.J., Suchard M.A. 2010. Bayesian random local clocks, or one rate to rule them all. BMC Biol. 8:1–12.
Eiserhardt W.L., Rohwer J.G., Russell S.J., Yesilyurt J.C., Schneider H. 2011. Evidence for radiations of cheilanthoid ferns in the Greater Cape Floristic Region. Taxon 60:1269–1283.
Eyre-Walker A., Gaut B. 1997. Correlated rates of synonymous site evolution across plant genomes. Mol. Biol. Evol. 14:455–460.
Fan Y., Wu R., Chen M.-H., Kuo L., Lewis P. 2011. Choosing among partition models in Bayesian phylogenetics. Mol. Biol. Evol. 28:523–532.
Farrar D. 1974. Gemmiferous fern gametophytes-Vittariaceae. Am. J. Bot. 61:146–155.
Fitch W.M., Beintema J. 1990. Correcting parsimonious trees for unseen nucleotide substitutions: The effect of dense branching as exemplified by ribonuclease. Mol. Biol. Evol. 7:438–443.
Gastony G.J., Rollo D.R. 1995. Phylogeny and generic circumscriptions of cheilanthoid ferns (Pteridaceae: Cheilanthoideae) inferred from rbcL nucleotide sequences. Am. Fern J. 85:341–360.
Gastony G.J., Rollo D.R. 1998. Cheilanthoid ferns (Pteridaceae: Cheilanthoideae) in the southwestern United States and adjacent Mexico–a molecular phylogenetic reassessment of generic lines. Aliso 17:131–144.
Gastony G.J., Yatskievych G. 1992. Maternal inheritance of the chloroplast and mitochondrial genomes in cheilanthoid ferns. Am. J. Bot. 79:716–722.
Gaut B., Muse S., Clark W., Clegg M. 1992. Relative rates of nucleotide substitution at the rbcL locus of monocotyledonous plants. J. Mol. Evol. 35:292–303.
Gaya E., Redelings B.D., Navarro-Rosinés P., Llimona X., De Cáceres M., Lutzoni F. 2011. Align or not to align? Resolving species complexes within the Caloplaca saxicola group as a case study. Mycologia 103:361–378.
Goldman N., Whelan S. 2002. A novel use of equilibrium frequencies in models of sequence evolution. Mol. Biol. Evol. 19:1821–1831.
Goldman N., Yang Z. 1994. A codon-based model of nucleotide substitution for protein-coding DNA sequences. Mol. Biol. Evol. 11:725–736.
Hebert P., Remigio E., Colbourne J., Taylor D., Wilson C. 2002. Accelerated molecular evolution in halophilic crustaceans. Evolution 56:909–926.
Hoegg S., Vences H., Brinkmann H., Meyer A. 2004. Phylogeny and comparative substitution rates of frogs inferred from sequences of three nuclear genes. Mol. Biol. Evol. 21:1188–1200.
Hugall A.F., Lee M.S.Y. 2007. The likelihood node density effect and consequences for evolutionary studies of molecular rates. Evolution 61:2293–2307.
Hurvich C.M., Tsai C.-L. 1989. Regression and time-series model selection in small samples. Biometrika 76:297–307.
Johnson A.K., Rothfels C.J., Windham M.D., Pryer K.M. 2012. Unique expression of a sporophytic character on the gametophytes of notholaenid ferns (Pteridaceae). Am. J. Bot. 99:1118–1124.
Kimura M. 1980. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16:111–120.
Kirkpatrick R.E.B. 2007. Investigating the monophyly of Pellaea (Pteridaceae) in the context of a phylogenetic analysis of cheilanthoid ferns. Syst. Bot. 32:504–518.
Korall P., Schuettpelz E., Pryer K.M. 2010. Abrupt deceleration of molecular evolution linked to the origin of arborescence in ferns. Evolution 64:2786–2792.
Lanfear R. 2010. The local-clock permutation test: A simple test to compare rates of molecular evolution on phylogenetic trees. Evolution 65:606–611.
Lanfear R., Calcott B., Ho S.Y.W., Guindon S. 2012. PartitionFinder: Combined selection of partitioning schemes and substitution models for phylogenetic analyses. Mol. Biol. Evol. 29:1695–1701.
Lanfear R., Ho S.Y.W., Davies T.J., Moles A.T., Aarssen L., Swenson N.G., Warmann L., Zanne A.E., Allen A.P. 2013. Taller plants have lower rates of molecular evolution. Nat. Commun. 4:1–29.
Lanfear R., Thomas J.A., Welch J.J., Brey T., Bromham L. 2007. Metabolic rate does not calibrate the molecular clock. PNAS 104:15388–15393.
Lanfear R., Welch J.J., Bromham L. 2010. Watching the clock: Studying variation in rates of molecular evolution between species. Trends Ecol. Evol. 25:496–503.
Laroche J., Li P., Maggia L., Bousquet J. 1997. Molecular evolution of angiosperm mitochondrial introns and exons. PNAS 94:5722–5727.
Lartillot N., Philippe H. 2006. Computing Bayes factors using thermodynamic integration. Syst. Biol. 55:195–207.
Lartillot N., Poujol R. 2010. A phylogenetic model for investigating correlated evolution of substitution rates and continuous phenotypic characters. Mol. Biol. Evol. 28:729–744.
Lewis P.O. 2001. A likelihood approach to estimating phylogeny from discrete morphological character data. Syst. Biol. 50:913–925.
Lewis P.O., Holder M.T., Swofford D.L. 2010. Phycas. Available from: URL www.phycas.org (last accessed January 1, 2012).
Lewis L., Mishler B.D., Vilgalys R. 1997. Phylogenetic relationships of the liverworts (Hepaticae), a basal embryophyte lineage, inferred from nucleotide sequence data of the chloroplast gene rbcL. Mol. Phylogenet. Evol. 7:377–393.
Li W.-H., Ellsworth D.L., Krushkal J., Chang B.H.J., Hewett-Emmett D. 1996. Rates of nucleotide substitution in primates and rodents and the generation-time effect hypothesis. Mol. Phylogenet. Evol. 5:182–187.
Li F.-W., Kuo L.-Y., Rothfels C.J., Ebihara A., Chiou W.-L., Windham M.D., Pryer K.M. 2011. rbcL and matK earn two thumbs up as the core DNA barcode for ferns. PLoS ONE 6:e26597.
Lu J.-M., Li D.-Z., Lutz S., Soejima A., Yi T., Wen J. 2011a. Biogeographic disjunction between eastern Asia and North America in the Adiantum pedatum complex (Pteridaceae). Am. J. Bot. 98:1680–1693.
Lu J.-M., Wen J., Lutz S., Wang Y.-P., Li D.-Z. 2011b. Phylogenetic relationships of Chinese Adiantum based on five plastid markers. J. Plant Res. 125:1–13.
Lumbsch H., Hipp A.L., Divakar P.K., Blanco O., Crespo A. 2008. Accelerated evolutionary rates in tropical and oceanic parmelioid lichens (Ascomycota). BMC Evol. Biol. 8:257.
Lutzoni F., Pagel M. 1997. Accelerated evolution as a consequence of transitions to mutualism. PNAS 94:11422–11427.
Maddison W.P., Maddison D.R. 2011. Mesquite: A modular system for evolutionary analysis. Available from: URL mesquiteproject.org (last accessed January 1, 2012).
Martin W., Lydiate D., Brinkmann H., Forkmann G., Saedler H., Cerff R. 1993. Molecular phylogenies in angiosperm evolution. Mol. Biol. Evol. 10:140–162.
Mayrose I., Otto S.P. 2011. A likelihood method for detecting trait-dependent shifts in the rate of molecular evolution. Mol. Biol. Evol. 28:759–770.
McCoy S.R., Kuehl J.V., Boore J.L., Raubeson L.A. 2008. The complete plastid genome sequence of Welwitschia mirabilis: An unusually compact plastome with accelerated divergence rates. BMC Evol. Biol. 8:130.
Meyer-Gauen G., Schnarrenberger C., Cerff R., Martin W. 1994. Molecular characterization of a novel, nuclear-encoded, NAD+-dependent glyceraldehyde-3-phosphate dehydrogenase in plastids of the gymnosperm Pinus sylvestris L. Plant Mol. Biol. 26:1155–1166.
Moncalvo J., Drehmel D., Vilgalys R. 2000. Variation in modes and rates of evolution in nuclear and mitochondrial ribosomal DNA in the mushroom genus Amanita (Agaricales, Basidiomycota): Phylogenetic implications. Mol. Phylogenet. Evol. 16:48–63.


Mooers A.Ø., Harvey P.H. 1994. Metabolic rate, generation time, and the rate of molecular evolution in birds. Mol. Phylogenet. Evol. 3:344–350.
Muse S.V. 2000. Examining rates and patterns of nucleotide substitution in plants. Plant Mol. Biol. 42:25–43.
Muse S., Gaut B. 1997. Comparing patterns of nucleotide substitution rates among chloroplast loci using the relative ratio test. Genetics 146:393–399.
Muse S., Weir B. 1992. Testing for equality of evolutionary rates. Genetics 132:269–276.
Nabholz B., Glémin S., Galtier N. 2008. Strong variations of mitochondrial mutation rate across mammals–the longevity hypothesis. Mol. Biol. Evol. 25:120–130.
Neiman M., Hehman G., Miller J.T., Logsdon J.M. Jr., Taylor D.R. 2010. Accelerated mutation accumulation in asexual lineages of a freshwater snail. Mol. Biol. Evol. 27:954–963.
Nielsen R., Yang Z. 1998. Likelihood models for detecting positively selected amino acid sites and applications to the HIV-1 envelope gene. Genetics 148:929–936.
Nikolaev S.I., Montoya-Burgos J.I., Popadin K., Parand L., Margulies E.H., Antonarakis S.E. 2007. Life-history traits drive the evolutionary rates of mammalian coding and noncoding genomic elements. PNAS 104:20443–20448.
Pagel M., Venditti C., Meade A. 2006. Large punctuational contribution of speciation to evolutionary divergence at the molecular level. Science 314:119–121.
Parkinson C.L., Mower J.P., Qiu Y.L., Shirk A.J., Song K.M., Young N.D., dePamphilis C.W., Palmer J.D. 2005. Multiple major increases and decreases in mitochondrial substitution rates in the plant family Geraniaceae. BMC Evol. Biol. 5:73.
Peterson J., Brinkmann H., Cerff R. 2003. Origin, evolution, and metabolic role of a novel glycolytic GAPDH enzyme recruited by land plant plastids. J. Mol. Evol. 57:16–26.
Posada D. 2008. jModelTest: Phylogenetic model averaging. Mol. Biol. Evol. 25:1253–1256.
Prado J., Rodrigues C.D.N., Salatino A., Salatino M.L.F. 2007. Phylogenetic relationships among Pteridaceae, including Brazilian species, inferred from rbcL sequences. Taxon 56:355–368.
Pryer K.M., Schuettpelz E., Wolf P.G., Schneider H., Smith A.R., Cranfill R. 2004. Phylogeny and evolution of ferns (monilophytes) with a focus on the early leptosporangiate divergences. Am. J. Bot. 91:1582–1598.
Rambaut A., Drummond A.J. 2007. Tracer. Available from: URL http://tree.bio.ed.ac.uk/software/tracer/ (last accessed January 1, 2012).
Redelings B.D., Suchard M.A. 2007. Incorporating indel information into phylogeny estimation for rapidly emerging pathogens. BMC Evol. Biol. 7:40.
Ronquist F., Huelsenbeck J.P. 2003. MRBAYES 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19:1572–1574.
Rothfels C.J., Larsson A., Kuo L.-Y., Korall P., Chiou W.-L., Pryer K.M. 2012. Overcoming deep roots, fast rates, and short internodes to resolve the ancient rapid radiation of eupolypod II ferns. Syst. Biol. 61:490–509.
Rothfels C.J., Larsson A., Li F.-W., Sigel E.M., Huiet L., Burge D.O., Ruhsam M., Graham S.W., Stevenson D.W., Wong G.K.-S., Korall P., Pryer K.M. 2013a. Transcriptome-mining for single-copy nuclear markers in ferns. PLoS ONE, in press.
Rothfels C.J., Windham M.D., Grusz A.L., Gastony G.J., Pryer K.M. 2008. Toward a monophyletic Notholaena (Pteridaceae): Resolving patterns of evolutionary convergence in xeric-adapted ferns. Taxon 57:712–724.
Rothfels C.J., Windham M.D., Pryer K.M. 2013b. A plastid phylogeny of the cosmopolitan fern family Cystopteridaceae (Polypodiopsida). Syst. Bot. 38:295–306.
Ruhfel B., Lindsay S., Davis C.C. 2008. Phylogenetic placement of Rheopteris and the polyphyly of Monogramma (Pteridaceae s.l.): Evidence from rbcL sequence data. Syst. Bot. 33:37–43.
Sanderson M.J. 2002. Estimating absolute rates of molecular evolution and divergence times: A penalized likelihood approach. Mol. Biol. Evol. 19:101–109.
Sanderson M.J. 2003. r8s: Inferring absolute rates of molecular evolution and divergence times in the absence of a molecular clock. Bioinformatics 19:301–302.
Sarich V.M., Wilson A.C. 1967. Immunological time scale for hominid evolution. Science 158:1200.
Schneider H., Smith A., Cranfill R., Hildebrand T., Haufler C., Ranker T. 2004. Unraveling the phylogeny of polygrammoid ferns (Polypodiaceae and Grammitidaceae): Exploring aspects of the diversification of epiphytic plants. Mol. Phylogenet. Evol. 31:1041–1063.
Schon I., Martens K., Van Doninck K., Butlin R.K. 2003. Evolution in the slow lane: Molecular rates of evolution in sexual and asexual ostracods (Crustacea: Ostracoda). Biol. J. Lin. Soc. 79:93–100.
Schuettpelz E., Grusz A.L., Windham M.D., Pryer K.M. 2008. The utility of nuclear gapCp in resolving polyploid fern origins. Syst. Bot. 33:621–629.
Schuettpelz E., Korall P., Pryer K.M. 2006. Plastid atpA data provide improved support for deep relationships among ferns. Taxon 55:897–906.
Schuettpelz E., Pryer K.M. 2006. Reconciling extreme branch length differences: Decoupling time and rate through the evolutionary history of filmy ferns. Syst. Biol. 55:485–502.
Schuettpelz E., Pryer K.M. 2007. Fern phylogeny inferred from 400 leptosporangiate species and three plastid genes. Taxon 56:1037–1050.
Schuettpelz E., Pryer K.M. 2009. Evidence for a Cenozoic radiation of ferns in an angiosperm-dominated canopy. PNAS 106:11200–11205.
Schuettpelz E., Schneider H., Huiet L., Windham M.D., Pryer K.M. 2007. A molecular phylogeny of the fern family Pteridaceae: Assessing overall relationships and the affinities of previously unsampled genera. Mol. Phylogenet. Evol. 44:1172–1185.
Shao R., Dowton M., Murrell A., Barker S. 2003. Rates of gene rearrangement and nucleotide substitution are correlated in the mitochondrial genomes of insects. Mol. Biol. Evol. 20:1612–1619.
Shapiro B., Rambaut A., Drummond A.J. 2005. Choosing appropriate substitution models for the phylogenetic analysis of protein-coding sequences. Mol. Biol. Evol. 23:7–9.
Simmons M.P., Ochoterena H. 2000. Gaps as characters in sequence-based phylogenetic analyses. Syst. Biol. 49:369–381.
Singh N.D., Arndt P., Clark A.G., Aquadro C.F. 2009. Strong evidence for lineage and sequence specificity of substitution rates and patterns in Drosophila. Mol. Biol. Evol. 26:1591–1605.
Small R., Ryburn J., Cronn R., Seelanan T., Wendel J. 1998. The tortoise and the hare: Choosing between noncoding plastome and nuclear Adh sequences for phylogeny reconstruction in a recently diverged plant group. Am. J. Bot. 85:1301–1315.
Smith S.A., Donoghue M.J. 2008. Rates of molecular evolution are linked to life history in flowering plants. Science 322:86–89.
Soltis P.S., Soltis D.E., Savolainen V., Crane P.R., Barraclough T.G. 2002. Rate heterogeneity among lineages of tracheophytes: Integration of molecular and fossil data and evidence for molecular living fossils. PNAS 99:4430–4435.
Stamatakis A. 2006. RAxML-VI-HPC: Maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22:2688–2690.
Stamatakis A., Hoover P., Rougemont J. 2008. A rapid bootstrap algorithm for the RAxML web servers. Syst. Biol. 57:758–771.
Suchard M.A., Redelings B.D. 2006. BAli-Phy: Simultaneous Bayesian inference of alignment and phylogeny. Bioinformatics 22:2047–2048.
Swofford D.L. 2002. PAUP*: Phylogenetic analysis using parsimony (*and other methods). Sunderland (MA): Sinauer.
Tavaré S. 1986. Some probabilistic and statistical problems in the analysis of DNA sequences. Am. Math. Soc. Lect. Math. Life Sci. 17:57–86.
Thiers B. [continuously updated]. Index Herbariorum: A global directory of public herbaria and associated staff. New York Botanical Garden’s Virtual Herbarium. Available from: URL http://sweetgum.nybg.org/ih/ (last accessed January 1, 2013).
Thomas J., Welch J.J., Woolfit M., Bromham L. 2006. There is no universal molecular clock for invertebrates, but rate variation does not scale with body size. PNAS 103:7366–7371.
Tryon R.M. 1942. A revision of the genus Doryopteris. Contrib. Gray Herb. 143:1–80.


Tryon R.M., Tryon A.F., Kramer K.U. 1990. Pteridaceae. In: Kramer K.U., Green P.S., editors. The families and genera of vascular plants. Berlin: Springer. p. 230–256.
Vangerow S., Teerkorn T., Knoop V. 1999. Phylogenetic information in the mitochondrial nad5 gene of pteridophytes: RNA editing and intron sequences. Plant Biol. 1:235–243.
Venditti C., Meade A., Pagel M. 2006. Detecting the node-density artifact in phylogeny reconstruction. Syst. Biol. 55:637–643.
Weisrock D.W. 2012. Concordance analysis in mitogenomic phylogenetics. Mol. Phylogenet. Evol. 65:194–202.
Welch J.J., Bininda-Emonds O.R.P., Bromham L. 2008. Correlates of substitution rate variation in mammalian protein-coding sequences. BMC Evol. Biol. 8:53.
Wertheim J.O., Sanderson M.J., Worobey M., Bjork A. 2010. Relaxed molecular clocks, the bias-variance trade-off, and the quality of phylogenetic inference. Syst. Biol. 59:1–8.
Wikström N., Pryer K.M. 2005. Incongruence between primary sequence data and the distribution of a mitochondrial atp1 group II intron among ferns and horsetails. Mol. Phylogenet. Evol. 36:484–493.
Windham M.D., Huiet L., Schuettpelz E., Grusz A.L., Rothfels C.J., Beck J. 2009. Using plastid and nuclear DNA sequences to redraw generic boundaries and demystify species complexes in cheilanthoid ferns. Am. Fern J. 99:68–72.
Wolf P.G. 1997. Evaluation of atpB nucleotide sequences for phylogenetic studies of ferns and other pteridophytes. Am. J. Bot. 84:1429–1440.
Wolfe K.H., Li W.-H., Sharp P.M. 1987. Rates of nucleotide substitution vary greatly among plant mitochondrial, chloroplast, and nuclear DNAs. PNAS 84:9054–9058.
Wolfe K.H., Sharp P., Li W.H. 1989a. Mutation rates differ among regions of the mammalian genome. Nature 337:283–285.
Wolfe K.H., Sharp P.M., Li W.-H. 1989b. Rates of synonymous substitution in plant nuclear genes. J. Mol. Evol. 29:208–211.
Wong W.S.W., Yang Z., Goldman N., Nielsen R. 2004. Accuracy and power of statistical methods for detecting adaptive evolution in protein coding sequences and for identifying positively selected sites. Genetics 168:1041–1051.
Woolfit M., Bromham L. 2003. Increased rates of sequence evolution in endosymbiotic bacteria and fungi with small effective population sizes. Mol. Biol. Evol. 20:1545–1555.
Woolfit M., Bromham L. 2005. Population size and molecular evolution on islands. P. Roy. Soc. B-Biol. Sci. 272:2277–2282.
Xiang Q.-Y., Thorne J.L., Seo T.-K., Zhang W., Thomas D.T., Ricklefs R.E. 2008. Rates of nucleotide substitution in Cornaceae (Cornales)–Pattern of variation and underlying causal factors. Mol. Phylogenet. Evol. 49:327–342.
Xie W., Lewis P., Fan Y., Kuo L., Chen M.-H. 2011. Improving marginal likelihood estimation for Bayesian phylogenetic model selection. Syst. Biol. 60:150–160.
Yang Z. 1996. Among-site rate variation and its impact on phylogenetic analyses. Trends Ecol. Evol. 11:367–372.
Yang Z. 1998. Likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution. Mol. Biol. Evol. 15:568–573.
Yang Z. 2007. PAML 4: A program package for phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24:1586–1591.
Yang Z., Nielsen R. 1998. Synonymous and nonsynonymous rate variation in nuclear genes of mammals. J. Mol. Evol. 46:409–418.
Yang Z., Nielsen R. 2002. Codon-substitution models for detecting molecular adaptation at individual sites along specific lineages. Mol. Biol. Evol. 19:908–917.
Yang Z., Nielsen R., Goldman N., Pedersen A.-M.K. 2000. Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics 155:431–449.
Yang Z., Rannala B. 2006. Bayesian estimation of species divergence times under a molecular clock using multiple fossil calibrations with soft bounds. Mol. Biol. Evol. 23:212–226.
Yang Z., Wong W.S.W., Nielsen R. 2005. Bayes empirical Bayes inference of amino acid sites under positive selection. Mol. Biol. Evol. 22:1107–1118.
Yesilyurt J.C., Schneider H. 2010. The new fern genus Calciphilopteris (Pteridaceae). Phytotaxa 7:52–59.
Yoder A.D., Yang Z. 2000. Estimation of primate speciation dates using local molecular clocks. Mol. Biol. Evol. 17:1081–1090.
Zhang G., Zhang X., Chen Z., Liu H., Yang W. 2007. First insights in the phylogeny of Asian cheilanthoid ferns based on sequences of two chloroplast markers. Taxon 56:369–378.
Zoller S., Lutzoni F. 2003. Slow algae, fast fungi: Exceptionally high nucleotide substitution rate differences between lichenized fungi Omphalina and their symbiotic green algae Coccomyxa. Mol. Phylogenet. Evol. 29:629–640.
Zuckerkandl E., Pauling L. 1962. Molecular disease, evolution, and genetic heterogeneity. In: Kasha M., Pullman B., editors. Horizons in biochemistry. New York: Academic Press. p. 189–225.
Zuckerkandl E., Pauling L. 1965. Evolutionary divergence and convergence in proteins. In: Bryson V., Vogel H.J., editors. Evolving genes and proteins. New York: Academic Press. p. 97–165.
