<<

bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

1 Disparity, Diversity, and Duplications in the 2 Stephen A. Smith*,1, Caroline Parins-Fukuchi1, Ya Yang2, Michael J. Moore3, Samuel F. 3 Brockington4, Riva Bruenn3,5, Chloe P. Drummond3,6, Joseph W. Brown1, Joseph F. Walker1, 4 Noah Last2, Norman A. Douglas3,7 5 6 1Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, 48103 7 2 Department of Biology, University of Minnesota-Twin Cities, 1445 Gortner Avenue, St. 8 Paul, MN 55108 9 3Department of Biology, Oberlin College, 119 Woodland St., Oberlin, Ohio 44074-1097 10 4Department of Plant Sciences, University of Cambridge, Cambridge CB2 3EA, United 11 Kingdom 12 5Current Address: Department of Plant & Microbial Biology, University of California, Berkeley, 13 Berkeley, California 94720-3102 14 6Current Address: Department of Botany, University of Wisconsin-Madison, 430 Lincoln Drive, 15 Madison, WI 53706 16 7Current address: Department of Biology, University of Florida, 618A Carr Hall, Gainesville, FL 17 32605 18 CORRESPONDING AUTHOR: Stephen A. Smith; 734-615-5510; [email protected] 19 20 WORD COUNT 21 Total: 5285 22 Summary: 197 23 Introduction: 1177 24 Materials and Methods: 1674 25 Results and Discussion: 2428 26 This manuscript also contains 3 Main Figures (two in color), 4 Main Tables, and 5 27 Supplementary Figures. 28 29 bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 2

30 Summary 31 • The role whole genome duplication (WGD) plays in the history of lineages is actively 32 debated. WGDs have been associated with advantages including superior colonization, 33 adaptations, and increased effective population size. However, the lack of a 34 comprehensive mapping of WGDs within a major plant clade has led to questions 35 regarding the potential association of WGDs and higher diversification rates. 36 • Using seven chloroplast and nuclear ribosomal genes, we constructed a phylogeny of 37 5,036 of Caryophyllales, representing nearly half of the extant species. We 38 phylogenetically mapped putative WGDs as identified from analyses on transcriptomic 39 and genomic data and analyzed these in conjunction with shifts in climatic niche and 40 lineage diversification rate. 41 • Thirteen putative WGDs and twenty-seven diversification shifts could be mapped onto 42 the phylogeny. Of these, four WGDs were concurrent with diversification shifts, with 43 other diversification shifts occurring at shallower nodes than WGDs. Five WGDs were 44 associated with shifts to colder climatic niches. 45 • While we find that many diversification shifts occurred after WGDs it is difficult to 46 directly associate these and consider diversification and duplication to be tightly 47 correlated. Our findings suggest that duplications may often along with shifts in either 48 diversification rate, climatic niche, or rate of evolution. 49 Keywords: Caryophyllales, duplications, climatic niche, diversification rates, phylogenomics 50 51 Introduction 52 Understanding the causes and correlates of diversification within flowering has been a 53 central goal of evolutionary biologists. Genomic and transcriptomic data have reinvigorated 54 hypotheses associating whole genome duplication (WGD) with lineage diversification rate 55 increases (e.g., Levin, 1983; Levin 2002; Barker et al. 2009; Soltis et al., 2014; Edger et al. 56 2015; Puttick et al. 2015; Tank et al., 2015; Barker et al. 2016; Huang et al. 2016; McKain et al. 57 2016). It is not self-evident why WGDs would be associated with increases in lineage 58 diversification. One hypothesis suggests that the additional genetic material provides a basis to 59 generate new adaptations (Edger et al., 2015), although this itself assumes a co-occurrence of 60 adaptation and lineage proliferation (Levin, 1983). The apparent lack of precise co-occurrence of bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 3

61 adaptation and lineage proliferation has been explained by the potential of a lag model (Tank et 62 al. 2015) where diversification may follow WGD events. In the absence of overwhelming 63 correlative signal, we are often unable to discern true ancient WGD events from aneuploidy 64 without advanced genomic information such as synteny mapping (Dohm et al., 2012). Because it 65 is often difficult to distinguish the two, for simplicity we will define WGD broadly to include 66 putative ancient WGD events (paleopolyploidy) and ancient aneuploidy events. WGD events are 67 thought to be a common occurrence and have been associated with an estimated 15% of 68 angiosperm speciation events (Wood et al., 2009). However, whether speciation by WGD is 69 correlated with higher diversification rates remains highly debated (Mayrose et al., 2011; Soltis 70 et al., 2014; Tank et al., 2015). Analyses based on recent WGD events have concluded that 71 immediate extinction rates are higher for polyploid plants (Mayrose et al., 2011; Arrigo and 72 Barker, 2012). This may result from small initial population sizes and an increased dependence 73 on selfing. Alternatively, despite the disadvantages of WGD, others have suggested that 74 polyploids may be superior colonizers (Soltis and Soltis, 2000). 75 Indeed, extreme environments are associated with high levels of WGD, with up to 87% 76 of species restricted to areas that were glaciated during the last ice age consisting of polyploids 77 (Brochmann, 2004). However, in the example from Arctic plants, the high level of WGD has 78 occurred post-glaciation representing a micro-evolutionary period whereas previous studies often 79 focus at much deeper macro-evolutionary time scales (Mayrose et al., 2011; Tank et al., 2015; 80 Soltis et al., 2014). From the perspective of a short timescale, polyploidy has the disadvantages 81 of higher error rates in mitosis (Storchová et al., 2006) and masking of deleterious mutations 82 allowing them to accumulate to higher frequencies in a population (Otto and Whitton, 2000). A 83 suite of advantages however may also arise, including gain of asexuality (Miller et al.. 2000) and 84 varying effects of heterosis (Comai, 2005). The net role these advantages and disadvantages play 85 on the macroevolutionary scale is difficult to determine from either the purely short-term or 86 purely long-term time scales previously used. 87 The long-term consequence of WGD is a central question in macroevolution and 88 comparative genomics. However, with a suite of advantages and disadvantages, much debate 89 surrounds the importance and patterns of correlation of WGD (Comai 2005). While 90 polyploidization events can cause instant speciation, there is no reason to assume that these 91 singular speciation events in themselves would influence large-scale diversification rate shifts bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 4

92 when considering lineage survivorship. Instead, there may be other factors, such as the increase 93 in genetic material, perhaps increasing genetic diversity or enabling adaptation, that cause long 94 term shifts in rates of diversification. Adaptations need not be associated with shifts in the tempo 95 of diversification and those adaptations and shifts in diversification may not co-occur on the 96 same branch (i.e., there may be a lag time; Donoghue 2005; Smith et al. 2011, Donoghue and

97 Sanderson 2015; Tank et al. 2015). In the broader context of plant evolution, there are several 98 possible outcomes of WGD in relation to the evolution and diversification of clades: 1) no 99 relationship between WGD and speciation rate or habitat shift/adaptation, 2) WGD coincides 100 with an increase of speciation rate, with or without a lag time, 3) WGD promotes dispersal and 101 habitat shifts, which has mixed relationship with speciation rate, and 4) a mixture (some 102 association, some not), similar to the previous hypothesis but without explicitly promoting 103 dispersal or habitat shift or speciation (e.g., adaptation could be more prominent than dispersal 104 and habitat shift). Here, we contribute to this discussion on diversification and WGDs with an in- 105 depth examination of the intersection of diversification and WGDs happening at a range of scales 106 within the hyperdiverse Caryophyllales. 107 The Caryophyllales contains ~12,500 species in 39 families (Thulin et al., 2016; APG IV: 108 Chase et al., 2016), representing approximately 6% of extant angiosperm species diversity. The 109 estimated crown age of Caryophyllales is approximately 67–121 millions of years ago (mega- 110 annum, Ma) (Bell et al., 2010; Moore et al., 2010) and species of the Caryophyllales exhibit 111 extreme life-history diversity, ranging from tropical trees to temperate annual herbs, and from 112 desert succulents (e.g., Cactaceae) to a diverse array of carnivorous plants (e.g., the sundews 113 Drosera and pitcher plants ). Such extraordinary diversity makes Caryophyllales a 114 particularly useful system for investigating the relationship between WGD vs. diversification and 115 niche evolution. Our previous analyses using 62 transcriptomes representing 60 species across 116 the Caryophyllales identified 13 well-supported ancient WGD events (Yang et al., 2015). We 117 have since nearly tripled the taxon sampling and assembled a data set comprising high-coverage 118 transcriptomes and genomes from 169 species across the Caryophyllales (Yang et al., 119 submitted), providing even greater power for resolving the number and phylogenetic locations of 120 WGD events. Moreover, the growth in the number of plant taxa on GenBank that are represented 121 by traditional targeted sequences (e.g., rbcL, matK, ITS, etc.) and the growth of publicly bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 5

122 available collections data (e.g., GBIF, iDigBio) provide excellent opportunities to apply 123 megaphylogeny and niche diversification approaches at fine scales in Caryophyllales. 124 By examining WGDs and diversification within the Caryophyllales, we present an 125 important example. Not only does the dataset examined have a high density of transcriptomic 126 sampling, the diversification of the bulk of Caryophyllales occurred during a time frame 127 intermediate to that of most published studies that have probed a link between WGD and 128 macroevolution. This time frame, between 10 and 100 Ma, is important for angiosperms as much 129 of the diversification that has led to the modern flora occurred during this period and most 130 modern angiosperm families appeared by this time. Discussion of speciation rate, niche shift, and 131 WGD would be flawed without accurate mappings of WGD events within this time scale. We 132 compiled a data set with extensive and precise mapping of WGD combined with a species-level 133 phylogeny. The megaphylogeny approach has been used extensively in the past to combine data 134 from many gene regions and across broad taxonomic groups to address evolutionary questions 135 (Smith et al., 2009). Here, we use this approach to help inform analyses from phylogenomic 136 studies, and provide a broad context in which to examine these genomic phenomena. With half 137 of the species sampled, this represents to date the largest and most exhaustive study of WGD vs. 138 rate and adaptive shift. 139 140 Materials and Methods 141 Sanger sequencing and assembly. —A total of 248 new matK sequences were included in this 142 study (Table 1). To generate these sequences, samples were collected in silica in the field or 143 from cultivated material, or were collected from herbarium sheets. DNA was isolated using 144 either the Nucleon Phytopure kit (GE Healthcare Life Sciences, Pittsburgh, PA, USA), using the 145 0.1 g protocol and following manufacturer’s instructions, or using the Doyle and Doyle (1987) 146 protocol, with the addition of 1% PVP-40. An approximately 950 bp region in the middle of the 147 matK gene was amplified and sequenced using custom-designed primers (Table 2). PCRs were 148 performed in 12.5 µL volumes with 0.5 µL of 5 mM primer for both primers, 5-20 ng of DNA 149 template, 0.1 µL of GoTaq (Promega, Madison, WI, USA), 6.25 µL of Failsafe Premix B 150 (Epicentre, Madison, WI, USA), and 4.7 µL of sterile, deionized water. Reactions were run on a 151 Bio-Rad PTC 200 thermocycler (Bio-Rad, Hercules, CA, USA) at Oberlin College. Individual 152 PCRs were cleaned in 16.5 µL reactions containing 10 U of Exonuclease I (Affymetrix, bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 6

153 ThermoFisher Scientific, Waltham, MA, USA), 2 U of shrimp alkaline phosphatase 154 (Affymetrix), 8 µL of PCR product, and 8.5 µL of sterile, deionized water. Sanger sequencing of 155 the resulting cleaned PCRs was conducted by Neogenomics (formerly SeqWright; Houston, TX, 156 USA) using an ABI 3730xl automated sequencer (Applied Biosystems, ThermoFisher 157 Scientific). The resulting forward and reverse sequences for each reaction were trimmed and de 158 novo assembled using default parameters of the Geneious assembler in Geneious versions 5-7 159 (Biomatters, Auckland, New Zealand). 160 161 Molecular Data for Phylogenetic Reconstruction. —Nucleotide data from the nuclear 162 ribosomal internal transcribed spacers (ITS) and phyC gene, and the plastid loci matK, ndhF, 163 rbcL, trnH-psbA spacer, and trnL-trnF spacer were used to reconstruct the phylogeny. These data 164 were gathered first using PHLAWD (Smith and Donoghue, 2008; Smith et al., 2009) and then 165 curated and combined with newly sequenced matK data for 124 additional species. This yielded 166 the following sampling: ITS 2,969 species, matK 2,270 species, ndhF 417 species, phyC 172 167 species, rbcL 947 species, trnH-psbA 240 species, and trnL-trnF 1,996 species. We used matK, 168 rbcL, and ndhF sequences from Aextoxicon, Apium, Berberidopsis, Campanula, Clethra, Coffea, 169 Echinops, Helwingia, Ilex, Ipomoea, Lamium, Lonicera, Nyssa, Polysoma, Primula, Santalum, 170 Valeriana, and Viburnum to represent outgroups. 171 172 Phylogenetic Reconstruction. —We conducted phylogenetic analyses with RAxML v7.2.8 173 (Stamatakis, 2014) using the full analysis command, -f a, which conducts a rapid bootstrap and 174 then a full maximum likelihood search. The combined bootstrap and maximum likelihood search 175 allows for a more thorough for maximum likelihood analysis where the initial rapid bootstrap 176 results prime the maximum likelihood analysis. However, we did not use the rapid bootstrap 177 trees from this analysis and instead, we conducted a full bootstrap, generating the bootstrap 178 dataset using phyx (Brown et al., 2017) and then conducting individual maximum likelihood 179 runs on each constructed bootstrap dataset. We conducted bootstraps within gene regions and we 180 retained the individual bootstrap alignments to conduct additional analyses (i.e., bootstrapped 181 alignments contained the same number of gene-specific sites as the empirical alignment). On 182 each of the resulting trees of the bootstrap and the maximum likelihood tree, we conducted SH- 183 like approximate likelihood ratio tests (SH-aLRT; Guindon et al., 2010) as implemented in bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 7

184 RAxML. These analyses calculate support for each edge while also finding the NNI-optimal 185 topology. RAxML completed the likelihood search for each of these bootstrap replicates, 186 however the SH-aLRT analyses often resulted in an improved maximum likelihood topology. 187 The trees that resulted from the SH-aLRT, ML and bootstrap samples, were used for further 188 analyses. Because several deep relationships within Caryophyllales are hard to resolve without 189 large amounts of molecular data that are unavailable for most of the taxa included in this analysis 190 (Yang et al., 2015), for all phylogenetic analyses we applied the following topological constraint: 191 (, (Microtea, (Stegnospermataceae, Limeaceae, (Lophiocarpaceae, (Barbeuiaceae, 192 Aizoaceae))))) as per previous analysis (Brockington et al. 2009; Yang et al., 2015). 193 194 Divergence Time Estimation. — Few tractable options for divergence time estimation exist for 195 datasets of the size presented here. We use the penalized likelihood approach (Sanderson, 2003) 196 as implemented in the program treePL (Smith and O’Meara, 2012), which can handle large-scale 197 phylogenies. The early fossil record of the Caryophyllales is sparse with only a few known 198 records (Friis et al., 2011; Arakaki et al., 2011): (1) fossil has been ascribed to 199 (Chenopodipollis) from the Paleocene of Texas (Nichols and Traverse, 1971); 200 (2) a putative fossil infructescence from within the in the Campanian has also 201 been reported (Cevallos-Ferriz et al., 2008), but this phylogenetic position has been disputed 202 (pers. comm. S. Manchester) and hence we excluded it; (3) Jordan and Macphail (2003) describe 203 a middle to late Eocene from the species Caryophylloflora paleogenica, ascribed to 204 Caryophyllaceae; (4) pollen from within the has been reported from the 205 middle Eocene (Zetter et al., 1999); and (5) (Degreef, 1997). The penalized likelihood method 206 performs better when a calibration is used at the root. For this calibration, and because there is no 207 fossil record for the earliest Caryophyllales, we use a secondary calibration from the 208 comprehensive angiosperm divergence time analyses of Bell et al. (2010). Several other 209 secondary calibrations were attached to major clades where fossils are not available (Ocampo 210 and Columbus 2010; Arakaki et al. 2011; Schuster et al. 2013; Valente et al. 2013; see Supp. 211 Table S1 for detail on placement and calibrations). We conducted a priming analysis to 212 determine the best optimization parameter values. We then performed a cross validation analysis 213 using the random cross validation setting to determine the optimal smoothing parameter value. 214 bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 8

215 Climate analyses. —We downloaded 6,592,700 georeferenced occurrences for the 216 Caryophyllales from GBIF (accessed on 6/1/2015; http://gbif.org). After removing samples 217 present in living collections, and therefore not necessarily representative of native climates, and 218 removing samples whose localities were over water, there were 6,009,552 samples. We extracted 219 bioclimatic values for each coordinate using the 2.5 arc-minute resolution data from WorldClim 220 (http://worldclim.org). We only included taxa that had at least three samples in these analyses to 221 reduce potential errors and to have the minimum number of samples required to calculate mean 222 and variance. The resulting overlap of the taxa represented in both the geographic and genetic 223 data was 2,843 taxa. We conducted principal component analyses (PCA) on these extracted 224 values. With both the bioclimatic values and the first two axes of the PCA, we conducted 225 ancestral state reconstruction analyses. 226 We also conducted contrast analyses and calculated Brownian motion rates of evolution 227 between sister clades (comparing duplicated lineages with their sisters) for mean annual 228 precipitation, mean annual temperature, and principal component axis 1. Contrasts were 229 calculated using phylogenetic independent contrasts. Brownian motion rates were calculated on

230 sister lineages independently using the analytical solution for rate: � = . 231 232 Diversification analyses. —To map diversification rate shifts, we conducted MEDUSA (Alfaro 233 et al., 2009; Pennell et al., 2014) analyses on the maximum likelihood tree and the bootstrap 234 trees. MEDUSA is far more computationally tractable than some other diversification estimation 235 methods (e.g., BAMM). Furthermore, we required the ability to feasibly integrate over the 236 phylogenetic uncertainty within the phylogenetic dataset because of both the nature of the larger 237 phylogenetic dataset and the inherent biological uncertainty within the Caryophyllales. 238 MEDUSA fits a birth-death model of diversification (with parameters r: net diversification (birth 239 - death), and ε: relative extinction (death / birth)) before using stepwise AIC (Burnham and 240 Anderson, 2002) to identify shifts in rates of diversification. These analyses allow 241 complementary analyses targeted at accommodating topological and branch length uncertainty. 242 We performed these diversification analyses using a birth-death model on 97 chronograms 243 generated from nonparametric bootstrapping of the original matrix, inferring ML trees in 244 RAxML, and estimating divergence times in treePL using the temporal constraints described bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 9

245 above. We discarded three trees based on poor fossil placement resulting from phylogenetic 246 uncertainty causing fossil placements to conflict. 247 248 Whole Genome Duplication Identification. — To identify WGDs (procedure described below), 249 we generated a tree based on transcriptomic data. For this tree, we used 178 ingroup data sets 250 (175 transcriptomes, 3 genomes) representing 169 species in 27 families and 40 outgroup 251 genomes (Table S1-S2 in Yang et al. submitted). We mapped putative WGD events using 252 multiple strategies: gene tree topology, plotting synonymous distance, and chromosome counts 253 (Yang et al. 2015; Yang et al., submitted). For gene tree topology analyses, we performed two 254 alternative strategies for mapping duplication events from gene trees to the species tree: mapping 255 to the most recent common ancestor (MRCA), or mapping to species tree only when gene tree 256 and species tree topologies are compatible. 257 To conduct synonymous distance analyses, we performed the following procedure. For 258 all ingroup Caryophyllales transcriptome data sets, we calculated the distribution of paralog 259 synonymous distance following the same procedure as (Yang et al., 2015). We reduced highly 260 similar peptide sequences with CD-HIT (-c 0.99 -n 5). We also carried out an all-by-all BLASTP 261 within each taxon using an E value cutoff of 10 and -max_target_seq set to 20. Resulting hits 262 with pident < 20% or niden < 50 amino acids were removed. We removed sequences with ten or 263 more hits to avoid overrepresentation of gene families that experienced multiple recent 264 duplications. We used the remaining paralog pairs and their corresponding CDS to calculate Ks 265 values using the pipeline https://github.com/tan ghaibao/bio- 266 pipeline/tree/master/synonymous_calculation (accessed November 29, 2014). The pipeline first 267 carries out pairwise protein alignment using default parameters in ClustalW (Larkin et al., 2007), 268 back-translates the alignment to a codon alignment using PAL2NAL (Suyama et al., 2006), and 269 calculates the synonymous substitution rate (Ks) using yn00 as part of the PAML package 270 (Yang, 2007), with Nei–Gojobori correction for multiple substitutions (Nei and Gojobori, 1986). 271 We obtained chromosome counts from the Chromosome Counts Database (CCDB; 272 http://ccdb.tau.ac.il accessed Oct 5, 2015). When multiple counts were reported from different 273 authors or different plants, we erred on the conservative estimate and recorded the lowest 274 number. For species that were not available in the database, we found counts from the literature bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 10

275 (e.g., Jepson eFlora http://ucjeps.berkeley.edu/eflora/ and Flora of North America 276 http://floranorthamerica.org) or by a consensus from species of the same genera. 277 278 Results and Discussion 279 Phylogenetic results. —Phylogenetic analyses showed strong support based on bootstrap and 280 SH-aLRT values for the monophyly of most Caryophyllales families (see Fig. S1). We find 281 strong support for the carnivorous clade including Droseraceae, Ancistrocladaceae, 282 Nepenthaceae, Drosophyllaceae, and . There is also strong support for this 283 clade as sister to a clade including Frankeniaceae, , , and 284 . However, relationships among the families show more varied support. There is 285 weak support for the placement of other families relative to other early diverging Caryophyllales 286 (see Fig. S1). There is strong support for Caryophyllaceae sister to Amaranthaceae. There is very 287 weak support for Aizoaceae sister to Phytolaccaceae+Nyctaginaceae. As with previously 288 published analyses, there was no support for the monophyly of Phytolaccaceae in the traditional 289 sense (i.e., including Phytolaccaceae s.s., Petiveriaceae, and Agdestis; APG IV) and very weak 290 support for the placement of Sarcobataceae. There is also weak support for the relationships 291 among Limeaceae, , and the Portulacineae. Many of these relationships have been 292 found to be strongly supported but conflicting in different analyses (Brockington et al., 2009; 293 Soltis et al., 2011; Yang et al., 2015; Smith et al., 2015; Walker et al., 2017). Here, we focus less 294 on the systematic resolution within the Caryophyllales and instead examine the potential 295 relationship of diversification and climate shifts to WGDs. Therefore, we place more emphasis 296 on including more taxa over that of more gene regions (i.e., transcriptomes) at the cost of more 297 missing data. Confident resolution of many of the systematic relationships will require genomic 298 and transcriptomic sampling, and more thorough taxon sampling (Yang et al., submitted). 299 300 Climate tolerance reconstruction results. —We performed climate ancestral reconstruction 301 analyses on the phylogeny of 2,843 taxa that included taxa with at least three sampled 302 geographic coordinates (Figs. 1-3). These analyses were conducted for visualization and for 303 comparison with diversification and WGD results (see below). Results for individual bioclimatic 304 variables and principal components can be found in Figs. S2-S4. Bioclimatic variable 1 (mean 305 annual temperature, Fig. 1) shows that there are several strong phylogenetic patterns of clades bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 11

306 with preferences for colder or warmer regions. For example, Polygonaceae, Caryophyllaceae, 307 and each are dominated by taxa with preferences for cold environments, although 308 each also contains early-diverging taxa with preferences to warm environments. In contrast, taxa 309 inhabiting warm environments predominate in Cactaceae, Amaranthaceae, Aizoaceae, the 310 carnivorous clade (Droseraceae, Drosophyllaceae, Nepenthaceae, Ancistrocladaceae, 311 Dioncophyllaceae), and the phytolaccoid clade (Nyctaginaceae, Phytolaccaceae, Petiveriaceae, 312 Sarcobataceae, and Agdestis). Bioclimatic variable 12 (mean annual precipitation) shows a 313 relatively consistent pattern of relatively dry to intermediately wet clades throughout the group. 314 Indeed, only a few clades inhabiting wet ecosystems (in this case, the wet tropics) exist in the 315 Caryophyllales, specifically small groups within the carnivorous clade, the phytolaccoids, early- 316 diverging Polygonaceae, and other small groups throughout the Caryophyllales. The principal 317 component loadings are presented in Fig. 2 and Fig. S5. Principal component 1, PCA1, shows 318 significant differentiation throughout the Caryophyllales, as for example, early-diverging 319 Polygonaceae vs the rest of Polygonaceae, early diverging Caryophyllaceae vs the rest of 320 Caryophyllaceae, phytolaccoids vs Aizoaceae, and Portulacineae + relatives vs Cactaceae, to 321 mention a few. These results generally reflect the extensive ecological diversification throughout 322 the group. They also reflect significant diversification in the temperate regions of the world 323 especially within the Caryophyllaceae and Polygonaceae contrasted with extensive 324 diversification in the succulent lineages (especially Aizoaceae and Cactaceae) found in relatively 325 dry and warm environments. 326 327 Diversification. —Significant shifts in diversification were detected in most major clades (Table 328 4, Fig. 1). The results from diversification analyses on the maximum likelihood tree and 329 bootstrap tree set are generally congruent with each other. However, there are discrepancies (Fig. 330 1). The bootstrap set recovered many shifts in Polygonaceae, the carnivorous clade, 331 Caryophyllaceae, some shifts within Cactaceae, phytolaccoids, and Amaranthaceae. 332 Disagreements on the existence and placement of shifts are primarily within Portulacineae, 333 Aizoaceae, and Amaranthaceae. Overall, MEDUSA detected 27 increases in diversification rate 334 using the ML tree and 16 increases using the bootstrap trees. Given the relative lack of support of 335 some of the branches in the phylogeny, we find the MEDUSA results on the set of bootstrapped bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 12

336 trees to be the most conservative while the ML results are suggestive but not definitive of 337 diversification shifts. 338 339 Duplications, diversification, and climate. —WGD analyses show thirteen putative WGDs that 340 can be mapped to clades (i.e., involve more than 1 taxon in the dataset; Table 3 and Figs. 1-3). 341 Many of these are found in early diverging lineages as opposed to nested deep within families, 342 though there are WGDs identified in Amaranthus and Claytonia. We also find evidence of nested 343 WGDs as within the phytolaccoids and Portulacineae. In addition to these deeper WGDs, there 344 are larger numbers of more recent WGDs that are present in Ks plots but cannot be mapped to a 345 clade (Yang et al., submitted). By sampling more extensively, Yang et al. (submitted) and 346 Walker et al. (2017) found additional WGD events within the Caryophyllales. While it is 347 possible, this is unlikely to be phenomenon specific to the Caryophylalles and we will likely find 348 additional WGDs events in other lineages as more effort is placed on denser taxon sampling 349 using genomes and transcriptomes. We do not explore WGDs that can only be mapped to one tip 350 any further and more discussion of specific results related to the WGDs themselves can be found 351 in Yang et al. (submitted) and Walker et al. (2017). 352 To better examine whether WGDs coincide with diversification rate shifts, increases and 353 decreases, or notable changes in climate tolerance, we mapped WGDs onto the large phylogenies 354 and summarized the number of species and climate information for each clade (Tables 3-4, Figs. 355 1-3). Some WGD events are associated with synchronous diversification events. For example, 356 within Nyctaginaceae, a WGD event occurs on the same branch (leading to Tribe Nyctagineae; 357 Douglas and Spellenberg, 2010) as an increase in diversification rate in both the ML tree and the 358 bootstrapped dataset (Fig. 1, dup:1 div:n). These events are further associated with a shift in life 359 history and niche from an ancestral woody habit in the tropics to the largely herbaceous, arid- 360 adapted temperate Nyctagineae. This is also the case for Amaranthus (Fig. 1, dup:5 div:x). Other 361 coincident diversification and WGD events in the Droseraceae and Nepenthaceae are only 362 supported by the ML tree. Although these correlated events may, in fact, be accurate, we will 363 reserve more comments for when these are more confidently resolved. Other than these 364 simultaneous shifts, and excepting one diversification shift at the base of the MRCA of 365 Nyctaginaceae+Cactaceae, all other shifts in diversification occur more recently than WGD 366 events. Tank et al. (2015) suggested that this lagging pattern may be common at the broader bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 13

367 angiosperm scale, though the expected distance of the diversification shift from the WGD event 368 was not specified (this is discussed more below). In the results presented here, some 369 diversification events occur shortly after the WGD event, such as within the Amaranthaceae 370 (dup: 6) and Portulacineae (dup: 4). For others, it is difficult to determine whether the 371 diversification events that occur after the WGD events are significantly close to the WGD to 372 warrant suggestion of an association (e.g., dup: 7, dup: 10, dup: 8). More description of a model 373 that would generate a null expectation would be necessary to determine what is “close enough” 374 (see discussion below). 375 Many of the other inferred lineage diversification rate shifts are associated with very 376 recent, rapid radiations within genera such as those documented within Commicarpus 377 (Nyctaginaceae), Dianthus (Caryophyllaceae), Cerastium (Caryophyllaceae), Arenaria 378 (Caryophyllaceae), and Salicornia (Amaranthaceae), to name a few (Table 4). Although 379 polyploids were reported in these clades, we are unable to pinpoint the phylogenetic location of 380 any WGD with our current taxon sampling (e.g., Dianthus; Carolin, 1954; Weiss et al. 2002). 381 Increased sampling of transcriptomes and genomes will shed more light in these areas. While we 382 only find a few WGDs that coincide well with diversification rate shifts, it is important to note 383 that the uncertainty in the phylogenies makes it difficult to map anything but the strongest 384 diversification signals. This discrepancy can be seen in the difference between the number of 385 events supported by the ML analyses and those supported by the bootstrap analyses. It is possible 386 that additional sequence data will improve phylogenetic resolution and confidence, and that 387 consequently additional diversification events will emerge. 388 Equally interesting to the few WGD events associated directly with diversification are the 389 WGD events associated with general shifts in climate tolerance. WGDs in the Polygonaceae, 390 Caryophyllaceae, Montiaceae, and the Tribe Nyctagineae appear to be associated with movement 391 into colder environments (Figs. 1-2 and Figs. S2-S3). Species arising after the WGD within the 392 Amaranthaceae occupy wetter environments than the sister clade. The WGDs within the 393 carnivorous plants are also associated with shifts in environment as Nepenthes are found in very 394 wet environments and the Droseraceae are found in somewhat drier environments, at least 395 comparatively. However, in these cases, perhaps the development of the wide array of 396 morphologies associated with carnivory, apart from , is more obviously associated 397 with the WGD (Walker et al., 2017). bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 14

398 While these qualitative assessments suggest potential correlations of shift in the climate 399 occupied and WGDs, more specific and direct comparisons are necessary to quantify the extent 400 of the shifts. For many of the clades experiencing WGD, a direct comparison with a sister clade 401 is difficult because the sister may consist of a single species, another clade with WGD, or 402 another complication. For example, there are WGDs at the base of both Polygonaceae and 403 Plumbaginaceae as well as Nepenthaceae and Droseraceae. However, we made direct 404 comparison of five duplicated lineages (see Fig. 3) in both means (i.e., character contrasts 405 between sister clades) and variances (rate of Brownian motion) of climatic variables. In each 406 case, the duplicated lineage occupies a colder mean annual temperature. This is also the case 407 with the nested WGDs of Portulacineae and the Tribe Nyctagineae. Of course, we do not suggest 408 that all WGDs are associated with a shift to a colder climate. While such a pattern may exist in 409 some groups such as Caryophyllaceae, we emphasize the observation that there is a shift in the 410 climate occupied rather than the direction of the shift. Mean annual precipitation is not as clear 411 with some clades occupying a higher precipitation and some occupying lower precipitation. 412 Perhaps the best summary of climatic niche is the principal components of all the climatic 413 variables. Here, while the shift in units is less easily interpreted, duplicated clades occupy 414 different niches than sister lineages. This generally supports the hypothesis that WGD events are 415 associated with adaptations, in this case, that are associated with shifts in climatic niches. This 416 necessitates further examination in other angiosperm clades. 417 The rates of niche evolution show more complicated patterns. While some clades, such as 418 the Portulacineae, show significant increase in a rate of niche evolution as compared to the sister 419 clade (e.g., MAT), no clear pattern emerges across all comparisons. There are other shifts in rate 420 such as with MAT and MAP in the Nyctaginaceae and Montiaceae, but these are not as strong as 421 the pattern of climate itself discussed above. 422 With each of these patterns presented here, it is important to consider them in the context 423 of uncertainty, both inherent in the biological processes that generate the phylogeny and in the 424 analyses associated with large scale datasets. These large phylogenies and datasets allow for 425 more thorough examination of the clades, but uncertainty makes precise mapping of weaker 426 signals difficult. As mentioned above, this is demonstrated by both the mapping of 427 diversification events and duplications. Furthermore, the comparisons of the sister clades for 428 climatic niche analyses assumes accurate identification of sister lineages. Increasing taxon bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 15

429 sampling may help, but additional sequence data and specimen data for phylogenetic analyses, 430 WGD mapping analyses, and climate niche characterization will surely improve our precision in 431 these investigations. 432 What emerges from these analyses of WGD, diversification, and climate? It would appear 433 as though, perhaps not unexpectedly, the patterns are complex and mixed. Some WGD are 434 associated directly with diversification events, some WGD are associated with shifts in climate 435 tolerance, some WGD are coincident with shifts in rates of niche evolution, and still other WGD 436 are associated with known adaptations (carnivory, habit shifts associated with montane habitats, 437 etc.). Some diversification shifts follow WGD events. However, it is unclear whether these 438 events are linked or correlated and, if so, if they are correlated more with diversification than an 439 additional adaptation or other evolutionary pattern or process. As data increase in these groups 440 and as confidence increases in the phylogenetic relationships as well as the placement of both 441 diversification and WGD events, we will be able to better address these questions. However, at 442 least for the Caryophyllales, it does not appear as though diversification is tightly linked with 443 WGD. Instead, for the clades that can be tested, we find shifts in climate correspond well to 444 WGD. 445 446 Suggestions for moving forward. —WGD are almost certainly one of the dominant processes 447 that contribute to major evolutionary events within plant lineages. This may be in the form of 448 increased diversification, development of novel traits, adaptation to new environments, and many 449 other events (e.g., Schubert and Vu, 2016; Clavijo et al. 2017). However, for several reasons, 450 these events (i.e., WGD and other evolutionary events) may not occur simultaneously. In fact, 451 there may be little to no expectation for the events to occur simultaneously (e.g., Donoghue, 452 2005; Donoghue and Sanderson, 2015; Tank et al., 2015). In any case, however, more precise 453 expectations and null models need to be developed to allow for reasonable tests of the 454 correlations among these events. For example, there may be shifts in diversification that follow a 455 WGD, but is it close enough, or frequent enough to infer that the two events are related? Is 456 correlation possible or identifiable if, as is expected, intervening lineages have gone extinct? 457 Furthermore, more precise connections should be made to the biology of speciation and genome 458 WGDs to better determine why, specifically, WGDs would be expected to correspond with any 459 diversification pattern instead of adaptations, which may or may not correspond with increases or bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 16

460 decreases in speciation. While still challenging, investigating the fate of and patterns of selection 461 within individual genes (e.g., subfunctionalization and neofunctionalization) may shed light into 462 the genomic basis of post-WGD and possibly allow for more concrete expectations for 463 diversification. With the availability of genomes and transcriptomes, this is now beginning to 464 become a possibility (e.g., Brockington et al., 2015, Walker et al., 2017). Only when these 465 suggestions are linked to more specific biological hypotheses will we be able to better 466 understand the ultimate impact of WGD in plant evolution. 467 468 Acknowledgements 469 We thank Gregory Stull, Oscar Vargas, Ning Wang, Sonia Ahluwalia, Jordan Shore, Lijun Zhao, 470 Alex Taylorm, and Drew Larson for helpful discussion on the manuscript. The authors thank 471 Hilda Flores, Helga Ochoterena, Tom Wendt and the staff at the Plant Resources Center at the 472 University of Texas at Austin, the Lyon Arboretum, David Anderson, John Brittnacher, Anna 473 Brunner, Joseph Charboneau, Arianna Goodman, Heather-Rose Kates, Patricia Herrnández 474 Ledesma, Lucas Majure, Nidia Mendoza, Michael Powell, Rick Ree, Carl Rothfels, Flora Samis, 475 Jeffrey Sanders, Elizabeth Saunders, Rich Spellenberg, Greg Stull, Mats Thulin, Erin Tripp, and 476 Sophia Weinmann for help with obtaining material. This work was supported by NSF DEB 477 awards 1352907 and 1354048. 478 479 Author contributions 480 S.A.S., J.F.W., Y.Y., M.J.M., C.P.-F., and S.F.B. designed research. S.A.S., J.W.B., C.P.F., and 481 Y.Y. analyzed the data. S.A.S. led the writing. All authors read and contributed to the 482 manuscript. 483 484 References 485 Alfaro ME, Santini F, Brock C, Alamillo H, Dornburg A, Rabosky DL, Carnevale G, 486 Harmon LJ. 2009. Nine exceptional radiations plus high turnover explain species 487 diversity in jawed vertebrates. Proceedings of the National Academy of Sciences, USA 488 106: 13410–13414. bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 17

489 Angiosperm Phylogeny Group. 2016. An update of the Angiosperm Phylogeny Group 490 classification for the orders and families of flowering plants: APG IV. Botanical Journal 491 of the Linnean Society 181: 1-20. 492 Arakaki M, Christin PA, Nyffeler R, Lendel A, Eggli U, Ogburn RM, Spriggs E, Moore 493 MJ, Edwards EJ. 2011. Contemporaneous and recent radiations of the world's major 494 lineages. Proceedings of the National Academy of Sciences, USA 108: 495 8379–8384. 496 Arrigo N, Barker MS. 2012. Rarely successful polyploids and their legacy in plants genomes. 497 Current Opinion in Plant Biology 15: 140–146. 498 Barker MS, Vogel H, Schranz ME. 2009. Paleopolyploidy in the Brassicales: Analyses of 499 the Cleome transcriptome elucidate the history of genome duplications in Arabidopsis 500 and other Brassicales. Genome Biology and Evolution 1: 391–399. 501 Barker MS, Husband BC, Pires JC. 2016. Spreading Winge and flying high: The evolutionary 502 importance of polyploidy after a century of study. Am. J. Bot. 103: 1139-1145. 503 Bell CD, Soltis DE, Soltis PS. 2010. The age and diversification of the angiosperms re-revisited. 504 American Journal of Botany 97: 1296–1303. 505 Brochmann C, Brysting AK, Alsos IG, Borgen L, Grundt HH, Scheen AC, Elven R. 2004. 506 Polyploidy in arctic plants. Biological Journal of the Linnean Society 82: 521–536. 507 Brockington SF, Alexandre R, Ramdial J, Moore MJ, Crawley S, Dhingra A, Hilu K, Soltis 508 DE, Soltis PS. 2009. Phylogeny of the Caryophyllales sensu lato: revisiting hypotheses 509 on pollination biology and perianth differentiation in the core Caryophyllales. 510 International Journal of Plant Sciences 170: 627–643. 511 Brockington SF, Yang Y, Gandia-Herrero F, Covshoff S, Hibberd JM, Sage RF, Wong GK, 512 Moore MJ, Smith SA. 2015. Lineage-specific gene radiations underlie the evolution of 513 novel betalain pigmentation in Caryophyllales. New Phytologist 207: 1170–1180. 514 Brown JW, Walker JF, Smith SA. 2017. phyx: Phylogenetic tools for Unix. Bioinformatics 515 btx063. 516 Burnham KP, Anderson DR. 2003. Model Selection and Multimodel Inference: A Practical 517 Information-Theoretic Approach. Springer, New York. 518 Carolin RC. 1954. Stomatal size, density and morphology in the Dianthus. Kew Bulletin 519 9: 251-258. bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 18

520 Cevallos-Ferriz SRS, Estrada-Ruiz E, Perez-Hernandez BR. 2008. Phytolaccaceae 521 infructescence from Cerro del Pueblo formation, upper Cretaceous (late Campanian), 522 Coahuila, Mexico. American Journal of Botany 95: 77–83. 523 Chase MW, Christenhusz MJM, Fay MF, Byng JW, Judd WS, Soltis DE, Mabberley DJ, 524 Sennikov AN, Soltis PS, Stevens PF. 2016. An update of the Angiosperm Phylogeny 525 Group classification for the orders and families of flowering plants: APG IV. Botanical 526 Journal Linnean Society 181: 1-20. 527 Clavijo BJ, Venturini L, Schudoma C, Accinelli GG, Kaithakottil G, Wright J, Borrill P, 528 Kettleborough G, Heavens D, Chapman H, Lipscombe J, Barker T, Lu FH, 529 McKenzie N, Raats D, Ramirez-Gonzalez RH, Coince A, Peel N, Percival-Alwyn L, 530 Duncan O, Trösch J, Yu G, Bolser DM, Namaati G, Kerhornou A, Spannagl M, 531 Gundlach H, Haberer G, Davey RP, Fosker C, Palma FD, Phillips A, Millar AH, 532 Kersey PJ, Uauy C, Krasileva KV, Swarbreck D, Bevan MW, Clark MD. 2017. An 533 improved assembly and annotation of the allohexaploid wheat genome identifies 534 complete families of agronomic genes and provides genomic evidence for chromosomal 535 translocations. Genome Research doi: 10.1101/gr.217117.116 536 Comai L. 2005. The advantages and disadvantages of being polyploid. Nature Review Genetics 537 6: 836–846. 538 Degreef JD. 1997. Fossil Aldrovanda. Newsletter 26: 93–97. Available at: 539 http://www.carnivorousplants.org/cpn/articles/CPNv26n3p93_97.pdf%5Cnpapers3://public 540 ation/uuid/A61A0327-61CF-4683-BF3F-EE5AFFD69F49. 541 Dodsworth S, Chase M, Leitch A. 2016. Is post-polyploidization diploidization the key to the 542 evolutionary success of angiosperms. Botanical Journal of the Linnean Society 180: 543 1095–8339. 544 Dohm JC, Lange C, Holtgräwe D, Sörensen TR, Borchardt D, Schulz B, Lehrach H, 545 Weisshaar B, Himmelbauer H. 2012. Palaeohexaploid ancestry for Caryophyllales 546 inferred from extensive gene‐based physical and genetic mapping of the sugar beet 547 genome (Beta vulgaris). The Plant Journal 70: 528–40. 548 Donoghue MJ. 2005. Key innovations, convergence, and success: macroevolutionary lessons 549 from plant phylogeny. Paleobiology 31:77-93. bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 19

550 Donoghue MJ, Sanderson MJ. 2015. Confluence, synnovation, and depauperons in plant 551 diversification. New Phyt. 207:260-274. 552 Douglas N, Spellenberg R. 2010. A new tribal classification of Nyctaginaceae. Taxon 59: 905– 553 910. 554 Doyle JJ, Doyle JL. 1987. A rapid DNA isolation procedure for small quantities of fresh leaf 555 tissue. Phytochemical Bulletin 19: 11-15. 556 Edger PP, Heidel-Fischer HM, Bekaert M, Rota J, Glöckner G, Platts AE, Heckel DG, Der 557 JP, Wafula EK, Tang M, Hofberger JA. 2015. The butterfly plant arms-race escalated 558 by gene and genome duplications. Proceedings of the National Academy of Sciences, 559 USA 112: 8362–8366. 560 Friis EM, Crane P, Pedersen KR. 2011. Early Flowers and Angiosperm Evolution. Cambridge. 561 Guindon S, Dufayard J, Lefort V, Anisimova M, Hordijk W, Gascuel O. 2010. New 562 algorithms and methods to estimate maximum-likelihood phylogenies: assessing the 563 performance of PhyML 3.0. Systematic Biology 59: 307–321. 564 Huang, CH, Zhang C, Liu M, Hu Y, Gao T, Qi J, Hong Ma. 2016. Multiple Polyploidization 565 Events across Asteraceae with Two Nested Events in the Early History Revealed by 566 Nuclear Phylogenomics. Mol Biol Evol. 33: 2820-2835. 567 Jordan GJ, Macphail MK. 2003. A middle-late Eocene inflorescence of Caryophyllaceae from 568 Tasmania, Australia. American Journal of Botany 90: 761–768. 569 Levin DA. 1983. Polyploidy and novelty in flowering plants. American Naturalist 122: 1–25. 570 Levin DA. 2002. The Role of Chromosomal Change in Plant Evolution. Oxford University 571 Press: New York. 572 McKain MR, Tang H, McNeal JR, Ayyampalayam S, Davis JI, dePamphilis CW, Givnish 573 TJ, Pires JC, Stevenson DW, Leebens-Mack JH. 2016. A Phylogenomic Assessment 574 of Ancient Polyploidy and Genome Evolution across the Poales. Genome Biol. Evol. 8: 575 1150-1164. 576 Mayrose I, Zhan SH, Rothfels CJ, Magnuson-Ford K, Barker MS, Rieseberg LH, Otto SP. 577 2011. Recently formed polyploid plants diversify at lower rates. Science 333: 1257. 578 Miller JS, Venable DL. 2000. Polyploidy and the evolution of gender dimorphism in plants. 579 Science 289: 2335–2338. bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 20

580 Moore MJ, Soltis PS, Bell CD, Burleigh JG, Soltis DE. 2010. Phylogenetic analysis of 83 581 plastid genes further resolves the early diversification of . Proceedings of the 582 National Academy of Sciences, USA 107: 4623–4628. 583 Nei M, Gojobori T. 1986. Simple methods for estimating the numbers of synonymous and 584 nonsynonymous nucleotide substitutions. Molecular Biology and Evolution 3: 418–426. 585 Nichols DJ, Traverse A. 1971. Palynology, petrology, and depositional environments of some 586 early Tertiary lignites in Texas. Geoscience and Man 3: 37–48. 587 Ocampo G, Columbus T. 2010. Molecular phylogenetics of suborder Cactineae 588 (Caryophyllales), including insights into photosynthetic diversification and historical 589 biogeography. Am. J. Bot. 97: 1827-1847. 590 Otto SP, Whitton J. 2000. Polyploid incidence and evolution. Annual Review of Genetics 34: 591 401–437. 592 Pennell MW, Eastman JM, Slater GJ, Brown JW, Uyeda JC, Fitzjohn RG, Alfaro ME, 593 Harmon LJ. 2014. geiger v2.0: an expanded suite of methods for fitting 594 macroevolutionary models to phylogenetic trees. Bioinformatics 30: 2216–2218. 595 Puttick, MN, Clark J., Donoghue P. 2015. Size is not everything: rates of genome size 596 evolution, not C-value, correlate with speciation in angiosperms. Proc. Roy. Soc. B. 282: 597 20152289. 598 Sanderson MJ. 2003. r8s: inferring absolute rates of molecular evolution and divergence times 599 in the absence of a molecular clock. Bioinformatics 19: 301–302. 600 Schubert I, Vu GTH. 2016. Genome Stability and Evolution: Attempting a Holistic View. 601 Trends in Plant Sciences 21: 749-757. 602 Schuster TM, Setaro SD, Kron KA. 2013. Age Estimates for the Buckwheat Family 603 Polygonaceae Based on Sequence Data Calibrated by Fossils and with a Focus on the 604 Amphi-Pacific Muehlenbeckia. Plos ONE 8(4): e61261. 605 Smith SA, Beaulieu JM, Donoghue MJ. 2009. Mega-phylogeny approach for comparative 606 biology: an alternative to supertree and supermatrix approaches. BMC Evolutionary 607 Biology 9: 37. 608 Smith SA, Beaulieu JM, Stamatakis A, Donoghue MJ. Understanding angiosperm 609 diversification using small and large phylogenetic trees. American Journal of Botany 98: 610 404–414. bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 21

611 Smith SA, Donoghue MJ. 2008. Rates of molecular evolution are linked to life history in 612 flowering plants. Science 322: 86–89. 613 Smith SA, O’Meara BC. 2012. treePL: divergence time estimation using penalized likelihood 614 for large phylogenies. Bioinformatics 28: 2689–2690. 615 Smith SA, Moore MJ, Brown JW, Yang Y. 2015. Analysis of phylogenomic datasets reveals 616 conflict, concordance, and gene duplications with examples from animals and plants. 617 BMC Evolutionary Biology 15: 150. 618 Soltis PS, Soltis DS. 2000. The role of genetic and genomic attributes in the success of 619 polyploids. Proceedings of the National Academy of Sciences, USA 97: 7051–7057. 620 Soltis DE, Smith SA, Cellinese N, Wurdack KJ, Tank DC, Brockington SF, Refulio- 621 Rodriguez NF, Walker JB, Moore MJ, Carlsward BS, Bell CD. 2011. Angiosperm 622 phylogeny: 17 genes, 640 taxa. American Journal of Botany 98: 704–730. 623 Soltis PS, Liu X, Marchant DB, Visger CJ, Soltis DE. 2014. Polyploidy and novelty: 624 Gottlieb’s legacy. Philosophical Transactions of the Royal Society of London B: 625 Biological Sciences 369: 20130351. 626 Stamatakis A. 2014. RAxML version 8: a tool for phylogenetic analysis and post-analysis of 627 large phylogenies. Bioinformatics 30: 1312–1313. 628 Storchová Z, Breneman A, Cande J, Dunn J, Burbank K, O'toole E, Pellman D. 2006. 629 Genome-wide genetic analysis of polyploidy in yeast. Nature 443: 541. 630 Suyama M, Torrents D, Bork P. 2006. PAL2NAL: robust conversion of protein sequence 631 alignments into the corresponding codon alignments. Nucleic Acids Research 34 (suppl 632 2): W609–W612. 633 Tank DC, Eastman JM, Pennell MW, Soltis PS, Soltis DE, Hinchliff CE, Brown JW, Sessa 634 EB, Harmon LJ. 2015. Nested radiations and the pulse of angiosperm diversification: 635 increased diversification rates often follow whole genome duplications. New Phytologist 636 207: 454–467. 637 Thulin M, Moore AJ, El-Seedi H, Larsson A, Christin P-A, Edwards EJ. 2016. Phylogeny 638 and generic delimitation in Molluginaceae, new pigment data in Caryophyllales, and the 639 new family Corbichoniaceae. Taxon 65: 775–793. bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 22

640 Valente LM, Britton AW, Powell MP, Papadopulos AST, Burgoyne PM, Savolainen V. 641 2014. Correlates of hyperdiversity in southern African ice plants (Aizoaceae). Botanical 642 Journal of the Linnean Society Linnean Society of London. 174:110-129. 643 Walker JF, Yang Y, Moore MJ, Mikenas J, Timoneda A, Brockington SF, Smith SA. 2017. 644 Widespread paleopolyploidy, gene tree conflict, and recalcitrant relationships among the 645 carnivorous Caryophyllales. bioRxiv: 115741. 646 Weiss H, Dobes C, Schneeweiss GM, Greimler J. 2002. Occurrence of tetraploid and 647 hexaploid cytotypes between and within populations in Dianthus sect. Plumaria 648 (Caryophyllaceae). New Phyto. 156: 85-94. 649 Wood TE, Takebayashi N, Barker MS, Mayrose I, Greenspoon PB, Rieseberg LH. 2009. 650 The frequency of polyploid speciation in vascular plants. Proceedings of the National 651 Academy of Sciences, USA 106: 13875–13879. 652 Yang Y, Moore MJ, Brockington SF, Soltis DE, Wong GK-S, Carpenter EJ, Zhang Y, 653 Chen L, Yan Z, Xie Y, Sage RF. 2015. Dissecting molecular evolution in the highly 654 diverse plant clade Caryophyllales using transcriptome sequencing. Molecular Biology 655 and Evolution 32: 2001–2014. 656 Yang Z. 2007. PAML 4: phylogenetic analysis by maximum likelihood. Molecular Biology and 657 Evolution 24: 1586–1591. 658 Zetter R, Hofmann CC, Draxler I, Durango de Cabrera J, Del MVergel M, Vervoorst F. 659 1999. A rich middle Eocene microflora at Arroyo de los Mineros, near Cañadón Beta, NE 660 Tierra del Fuego province, Argentina. Abhandlungen der Geologischen Bundesanstalt 661 56: 439–460. 662 bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 23

663 Tables 664 Table 1 Voucher information and GenBank accession numbers for newly reported plastid matK 665 sequences. Families follow APG IV (Angiosperm Phylogeny Group 2016). NCBI Voucher specimen Collection accession Family Taxon (Herbarium acronym) locality number Silvia H. Salas Achatocarpus gracilis Mexico: Achatocarpaceae Morales et al. 5608 KY952292 H.Walter Oaxaca (TEX) Phaulothamnus Michael J. Moore et : Achatocarpaceae KY952477 spinescens A.Gray al. 976 (OC) Texas Phaulothamnus William R. Carr United States: Achatocarpaceae KY952478 spinescens A.Gray 27176 (TEX) Texas Allenrolfea occidentalis Michael J. Moore 474 United States: Amaranthaceae KY952314 (S.Watson) Kuntze (OC) Texas Alternanthera Michael J. Moore United States: Amaranthaceae KY952319 caracasana Kunth 1808 (OC) Texas United States: Michael J. Moore 356 Amaranthaceae Amaranthus cruentus L. Ohio KY952320 (OC) (cultivated) Michael J. Moore United States: Amaranthaceae Amaranthus sp. KY952321 1801 (OC) Texas Michael J. Moore United States: Amaranthaceae Amaranthus sp. KY952322 2186 (OC) Ohio Michael J. Moore United States: Amaranthaceae Amaranthus sp. KY952323 2187 (OC) Illinois Atriplex prosopidum Hilda Flores Olvera Mexico: Amaranthaceae KY952340 I.M.Johnst. et al. 1658 (MEXU) Coahuila Michael J. Moore United States: Amaranthaceae Atriplex sp. KY952338 1689 (OC) Texas Michael J. Moore United States: Amaranthaceae Atriplex sp. KY952339 1699 (OC) Texas United States: Celosia argentea L. var. Michael J. Moore 359 Amaranthaceae Ohio KY952359 plumosa (OC) (cultivated) Flora K. Samis 7 Charpentiera ovata (Lyon Arboretum United States: Amaranthaceae KY952360 Gaudich. var. ovata living collection, accession 2011.0034) Flora K. Samis 6 Charpentiera tomentosa (Lyon Arboretum United States: Amaranthaceae Sohmer var. KY952361 living collection, Hawaii maakuaensis Sohmer accession 88.0141) Michael J. Moore 344 United States: Amaranthaceae Chenopodium album L. KY952362 (OC) Ohio Gossypianthus Michael J. Moore United States: Amaranthaceae KY952408 lanuginosus (Poir.) Moq. 1807 (OC) Texas Amaranthaceae Guilleminea densa Michael J. Moore et Mexico: KY952412 bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 24

(Humb. & Bonpl. ex al. 2445 (OC) Chihuahua Schult.) Moq. Michael J. Moore 453 United States: Amaranthaceae (L.) Scop. KY952506 (OC) Texas Flora K. Samis 3 Nototrichium (Lyon Arboretum United States: Amaranthaceae divaricatum living collection, KY952468 Hawaii D.H.Lorence accession 96.0036 #3) Flora K. Samis 2 Nototrichium humile (Lyon Arboretum United States: Amaranthaceae KY952469 Hillebr. living collection, Hawaii accession 2001-0254) Suaeda jacoensis Hilda Flores Olvera Mexico: Amaranthaceae KY952514 I.M.Johnst. et al. 1662 (MEXU) Coahuila Suaeda jacoensis Michael J. Moore et Mexico: Amaranthaceae KY952515 I.M.Johnst. al. 2617 (OC) Nuevo Leon Suaeda mexicana Hilda Flores Olvera Mexico: Amaranthaceae KY952516 (Standl.) Standl. et al. 1654 (MEXU) Coahuila Tidestromia lanuginosa Michael J. Moore United States: Amaranthaceae KY952521 (Nutt.) Standl. 1128 (OC) Texas Zuckia brandegeei (A.Gray) S.L.Welsh & Joseph L. M. United States: Amaranthaceae Stutz var. plummeri Charboneau 9672 KY952528 Colorado (Stutz & S.C.Sand.) (RM) Dorn Flora K. Samis 11 Leuenbergeria (Lyon Arboretum United States: Cactaceae quisqueyana (Alain) KY952473 living collection, Hawaii Lodé accession 2000.0281) Moehringia macrophylla Arianna Goodman 1 United States: Caryophyllaceae KY952464 (Hook.) Fenzl (OC) Oregon Paronychia lundellorum William R. Carr United States: Caryophyllaceae KY952472 Torr. & A.Gray 17607 (MEXU) Texas Michael J. Moore et United States: Caryophyllaceae Saponaria officinalis L. KY952507 al. 1819 (OC) Indiana Flora K. Samis 5 (Lyon Arboretum United States: Caryophyllaceae Schiedea kaalae Wawra KY952509 living collection, Hawaii accession 92.0513) Spergularia salina Michael J. Moore United States: Caryophyllaceae KY952512 J.Presl & C.Presl 1693 (OC) Texas Alluaudia ascendens Michael J. Moore United States Didiereaceae KY952318 (Drake) Drake 1645 (cultivated) Carel C. H. Jongkind Dioncophyllaceae peltatum (Hutch. & Liberia KY952524 et al. 7136 (WAG) Dalziel) Airy Shaw Drosera burmannii Vahl Michael J. Moore United States Droseraceae KY952400 cv. Pilliga Red 1814 (OC) (cultivated) Droseraceae Drosera peltata Thunb. Michael J. Moore Australia: KY952401 bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 25

1817 (OC) Tasmania (cultivated) Michael J. Moore United States Droseraceae Drosera regia Stephens KY952402 1812 (OC) (cultivated) Drosophyllum Michael J. Moore United States Drosophyllaceae KY952403 lusitanicum (L.) Link 1816 (OC) (cultivated) gypsophila Michael J. Moore et Mexico: Frankeniaceae KY952406 I.M.Johnst. al. 1880 (OC) Nuevo Leon Manuel Rimachi Microteaceae Microtea debilis Sw. Peru: Loreto KY952415 11128 (TEX) Arianna Goodman 2 United States: Montiaceae L. KY952363 (OC) Oregon Phemeranthus Michael J. Moore et United States: Montiaceae KY952479 parviflorus (Nutt.) Kiger al. 2214 (OC) New Mexico angustifolia Michael J. Moore et Mexico: Nyctaginaceae KY952281 Greene al. 2063 (OC) Coahuila Abronia angustifolia Michael J. Moore et United States: Nyctaginaceae KY952282 Greene al. 896 (OC) New Mexico Abronia bigelovii Michael J. Moore et United States: Nyctaginaceae KY952283 Heimerl al. 704 (OC) New Mexico Abronia elliptica Norman A. Douglas United States: Nyctaginaceae KY952284 A.Nelson 2039 (DUKE) Arizona Nutt. Billie L. Turner 20- United States: Nyctaginaceae KY952285 ex Hook. 22 (SRSC) Texas Abronia fragrans Nutt. Glenn Kroh et al. United States: Nyctaginaceae KY952286 ex Hook. 3021 (TEX) Texas Steve L. Orzell et al. United States: Nyctaginaceae KY952287 L.A.Galloway 6492 (TEX) Texas N. Elizabeth Abronia mellifera United States: Nyctaginaceae Saunders BP 19 KY952288 Douglas ex Hook. Wyoming (SIU) N. Elizabeth Abronia mellifera United States: Nyctaginaceae Saunders BP 20 KY952289 Douglas ex Hook. Wyoming (SIU) Abronia nana S.Watson Robert C. Sivinski et United States: Nyctaginaceae KY952290 var. nana al. 3108 (NMC) Arizona N. Elizabeth United States: Nyctaginaceae Lam. Saunders LU 45 KY952291 California (SIU) Acleisanthes acutifolia James Henrickson et Mexico: Nyctaginaceae KY952293 Standl. al. 22916 (TEX) Coahuila Acleisanthes Michael J. Moore 460 United States: Nyctaginaceae angustifolia (Torr.) KY952294 (OC) Texas R.A.Levin Acleisanthes cf. James Henrickson Mexico: Nyctaginaceae purpusiana (Heimerl) KY952309 23026 (TEX) Coahuila R.A.Levin Acleisanthes Michael J. Moore et United States: Nyctaginaceae chenopodioides KY952295 al. 733 (OC) Texas (A.Gray) R.A.Levin bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 26

Acleisanthes crassifolia Michael J. Moore et United States: Nyctaginaceae KY952296 A.Gray al. 569 (OC) Texas Acleisanthes diffusa Michael J. Moore et United States: Nyctaginaceae (A.Gray) R.A.Levin var. KY952297 al. 624 (OC) Texas diffusa Acleisanthes lanceolata Michael J. Moore et United States: Nyctaginaceae (Wooton) R.A.Levin KY952298 al. 870 (OC) New Mexico var. lanceolata Acleisanthes lanceolata Michael J. Moore et United States: Nyctaginaceae (Wooton) R.A.Levin KY952299 al. 903 (OC) Texas var. lanceolata Acleisanthes lanceolata (Wooton) R.A.Levin var. megaphylla Alfred T. Richardson Mexico: Nyctaginaceae KY952300 (B.A.Fowler & 1666 (TEX) Chihuahua B.L.Turner) Spellenb. & J.Poole Acleisanthes longiflora Michael J. Moore 435 United States: Nyctaginaceae KY952301 A.Gray (OC) Texas Acleisanthes longiflora Michael J. Moore et United States: Nyctaginaceae KY952302 A.Gray al. 571 (OC) Texas Acleisanthes nana Jackie Smith et al. Mexico: San Nyctaginaceae KY952303 I.M.Johnst. 798 (TEX) Luis Potosi Acleisanthes obtusa Michael J. Moore et United States: Nyctaginaceae KY952304 (Choisy) Standl. al. 984 (OC) Texas Acleisanthes palmeri George S. Hinton MexicoNuevo Nyctaginaceae KY952305 (Hemsley) R.A.Levin 28620 (TEX) Leon Acleisanthes parvifolia Michael J. Moore 452 United States: Nyctaginaceae KY952306 (Torr.) R.A.Levin (OC) Texas Acleisanthes purpusiana James Henrickson Mexico: Nyctaginaceae KY952307 (Heimerl) R.A.Levin 22709 (TEX) Coahuila Acleisanthes purpusiana Billie L. Turner 6205 Mexico: Nyctaginaceae KY952308 (Heimerl) R.A.Levin (TEX) Coahuila Acleisanthes undulata James Henrickson Mexico: Nyctaginaceae (B.A.Fowler & KY952310 23195 (TEX) Coahuila B.L.Turner) R.A.Levin Acleisanthes wrightii Michael J. Moore et United States: Nyctaginaceae (A.Gray) Benth. & KY952311 al. 620 (OC) Texas Hook. Norman A. Douglas Mexico: Nyctaginaceae Allionia choisyi Standl. KY952315 2187 (DUKE) Coahuila Michael J. Moore et Mexico: Nyctaginaceae Allionia incarnata L. KY952316 al. 1352 (OC) Nuevo Leon Michael J. Moore 424 United States: Nyctaginaceae Allionia sp. KY952317 (OC) Texas Andradea floribunda André M. Amorim Nyctaginaceae KY952324 Allemão 2294 (NY) Andradea floribunda Jacquelyn Ann Nyctaginaceae Brazil KY952325 Allemão Kallunki 701 (NY) Nyctaginaceae Anulocaulis annulatus Richard W. United States: KY952326 bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 27

(Coville) Standl. Spellenberg 3162 California (NMC) Anulocaulis eriosolenus James Henrickson et Mexico: Nyctaginaceae KY952327 (A.Gray) Standl. al. 23103 (TEX) Coahuila Anulocaulis eriosolenus Michael J. Moore et United States: Nyctaginaceae KY952328 (A.Gray) Standl. al. 611 (OC) Texas Anulocaulis hintoniorum Patricia Hernández Mexico: Nyctaginaceae KY952329 B.L.Turner Ledesma 52 (MEXU) Coahuila Anulocaulis leiosolenus (Torr.) Standl. var. Michael J. Moore 402 United States: Nyctaginaceae KY952330 gypsogenus (Waterf.) (OC) New Mexico Spellenb. & T.Wootten Anulocaulis leiosolenus (Torr.) Standl. var. Thomas Wootten et United States: Nyctaginaceae KY952331 howardii Spellenb. & al. s.n. (NMC) New Mexico T.Wootten Anulocaulis leiosolenus Michael J. Moore et United States: Nyctaginaceae (Torr.) Standl. var. KY952332 al. 610 (OC) Texas lasianthus I.M.Johnston Anulocaulis leiosolenus Michael J. Moore 493 United States: Nyctaginaceae (Torr.) Standl. var. KY952333 (OC) Texas leiosolenus Anulocaulis leiosolenus Michael J. Moore et United States: Nyctaginaceae (Torr.) Standl. var. KY952334 al. 825 (OC) Nevada leiosolenus Anulocaulis leiosolenus Michael J. Moore et United States: Nyctaginaceae (Torr.) Standl. var. KY952335 al. 853 (OC) Arizona leiosolenus Anulocaulis reflexus Michael J. Moore et Mexico: Nyctaginaceae KY952336 I.M.Johnst. al. 242 (TEX) Chihuahua Anulocaulis reflexus Michael J. Moore 483 United States: Nyctaginaceae KY952337 I.M.Johnst. (OC) Texas anisophylla Norman A. Douglas Mexico: Nyctaginaceae KY952341 Torr. 2194 (DUKE) Durango Boerhavia ciliata Norman A. Douglas United States: Nyctaginaceae KY952342 Brandegee 2145 (DUKE) Texas Boerhavia coccinea Michael J. Moore 366 United States: Nyctaginaceae KY952343 Mill. (OC) New Mexico Boerhavia coulteri Richard W. (Hook.f.) S.Watson var. United States: Nyctaginaceae Spellenberg 13273 KY952344 palmeri (S.Watson) Arizona (NMC) Spellenb. Australia: Boerhavia dominii Nyctaginaceae H. Smyth 42 (NY) South KY952345 Meikle & Hewson Australia Richard W. Boerhavia gracillima United States: Nyctaginaceae Spellenberg 12447 KY952347 Heimerl Texas (NMC) Boerhavia intermedia Richard W. United States: Nyctaginaceae KY952348 M.E.Jones Spellenberg 13279 Arizona bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 28

(NMC) Boerhavia lateriflora Norman A. Douglas Mexico: Nyctaginaceae KY952349 Standl. 2161 (DUKE) Sonora Boerhavia linearifolia Michael J. Moore et United States: Nyctaginaceae KY952350 A.Gray al. 581 (OC) Texas Richard W. Boerhavia purpurascens United States: Nyctaginaceae Spellenberg 13261 KY952351 A.Gray Arizona (NMC) United States: Nyctaginaceae Boerhavia repens L. J. S. Rose 2 KY952352 Hawaii Richard W. Nyctaginaceae Boerhavia repens L. Spellenberg 7183 Yemen: Sana KY952353 (NMC) Erin Tripp et al. 4090 Nyctaginaceae Boerhavia sp. Namibia KY952346 (OC) Boerhavia torreyana Michael J. Moore et United States: Nyctaginaceae KY952354 (S.Watson) Standl. al. 633 (OC) Texas Michael Nee 51257 Bolivia: Santa Nyctaginaceae KY952355 campanulata Heimerl (TEX) Cruz United States: Bougainvillea glabra Michael J. Moore 538 Nyctaginaceae Ohio KY952356 Choisy (OC) (cultivated) Bougainvillea spinosa J. Saunders et al. Argentina: Nyctaginaceae KY952357 (Cav.) Heimerl 3371 (TEX) San Juan Bougainvillea stipitata Michael Nee 50723 Bolivia: Santa Nyctaginaceae KY952358 Griseb. (TEX) Cruz Colignonia glomerata Michael Nee 52523 Nyctaginaceae Bolivia KY952364 Griseb. (NY) Martin Grantham 63 Colignonia scandens (SFBG living Nyctaginaceae Ecuador KY952365 Benth. collection, accession 1996-0202) Commicarpus ambiguus Mats Thulin 11015 Somalia: Nyctaginaceae KY952366 Meikle (UPS) Sanaag Commicarpus arabicus Mats Thulin et al. Nyctaginaceae Yemen: Taizz KY952367 Meikle 9294 (UPS) Richard W. Commicarpus arabicus Nyctaginaceae Spellenberg 7217 Yemen: Ibb KY952368 Meikle (NMC) Richard W. Commicarpus arabicus Nyctaginaceae Spellenberg 7297 Yemen: Ibb KY952369 Meikle (NMC) Richard W. Australia: Commicarpus australis Nyctaginaceae Spellenberg et al. Western KY952370 (Meikle) Govaerts 9469 (NMC) Australia Commicarpus boissieri Mats Thulin 11423 Nyctaginaceae Oman: Dhofar KY952371 (Heimerl) Cufod. (UPS) Commicarpus boissieri Carl J. Rothfels et al. Oman: Ash Nyctaginaceae KY952373 (Heimerl) Cufod. 4331 Sharqiyah Nyctaginaceae Commicarpus Patricia Hernández Mexico: Baja KY952372 bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 29

brandegeei Standl. Ledesma 55 (MEXU) California Sur Richard W. Commicarpus coctoris Mexico: Nyctaginaceae Spellenberg et al. KY952374 N.A.Harriman Oaxaca 12883 (NMC) Commicarpus Mats Thulin et al. Madagascar: Nyctaginaceae commersonii (Baill.) KY952380 11836 (UPS) Toliara Cavaco Commicarpus decipiens Erin Tripp et al. 4127 Nyctaginaceae Namibia KY952375 Meikle (NMC) Commicarpus Mats Thulin et al. Nyctaginaceae grandiflorus (A.Rich.) Yemen: Taizz KY952376 9311 (UPS) Standl. Commicarpus greenwayi Mats Thulin 606 Tanzania: Nyctaginaceae KY952377 Meikle (UPS) Iringa Commicarpus helenae Richard W. Yemen: Nyctaginaceae (Roem. & Schult.) Spellenberg et al. KY952378 Dhamar Meikle 7504 (NMC) Commicarpus hiranensis Mats Thulin et al. Ethiopia: Nyctaginaceae KY952379 Thulin 11225 (UPS) Harerge Commicarpus mistus Mats Thulin et al. Yemen: Nyctaginaceae KY952381 Thulin 9786 (UPS) Mahrah Commicarpus Mats Thulin 6318 Somalia: Nyctaginaceae KY952382 parviflorus Thulin (UPS) Banaadir Commicarpus Mats Thulin 1301 Ethiopia: Nyctaginaceae pedunculosus (A.Rich.) KY952383 (UPS) Arussi Cufod. Commicarpus Mats Thulin 10747 Somalia: Nyctaginaceae plumbagineus (Cav.) KY952384 (UPS) Togdheer Standl. Commicarpus Mats Thulin et al. Ethiopia: Nyctaginaceae plumbagineus (Cav.) KY952385 11330 (UPS) Harerge Standl. Commicarpus Richard W. Nyctaginaceae plumbagineus (Cav.) Spellenberg et al. Yemen: Ta'izz KY952386 Standl. 7374 (NMC) Commicarpus Richard W. Mexico: Nyctaginaceae praetermissus Spellenberg et al. KY952387 Michoacán N.A.Harriman 12905 (NMC) Commicarpus reniformis Mats Thulin 4200 Nyctaginaceae Somalia: Sool KY952388 (Chiov.) Cufod. (UPS) Commicarpus reniformis Mats Thulin et al. Yemen: Nyctaginaceae KY952389 (Chiov.) Cufod. 8337 (UPS) Hadramaut Commicarpus scandens Michael J. Moore United States: Nyctaginaceae KY952390 (L.) Standl. 1127 (OC) Texas Richard W. Commicarpus scandens Mexico: Nyctaginaceae Spellenberg et al. KY952391 (L.) Standl. Puebla 12887 (NMC) Somalia: Commicarpus sinuatus Mats Thulin 10737 Nyctaginaceae Woqooyi KY952392 Meikle (UPS) Galbeed bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 30

Richard W. Commicarpus sinuatus Yemen: Nyctaginaceae Spellenberg 7144 KY952393 Meikle Sana'a (NMC) Richard W. Commicarpus sinuatus Yemen: Nyctaginaceae Spellenberg 7506 KY952394 Meikle Dhamar (NMC) Commicarpus Erin Tripp et al. 4049 Nyctaginaceae squarrosus (Heimerl) Namibia KY952395 (NMC) Standl. var. squarrosus Commicarpus Mats Thulin et al. Yemen: Nyctaginaceae stenocarpus (Chiov.) KY952396 8062 (UPS) Hadramaut Cufod. Cuscatlania vulcanicola José L. Linares 12938 El Salvador: Nyctaginaceae KY952397 Standl. (MEXU) Sonsonate Cuscatlania vulcanicola José L. Linares 13440 El Salvador: Nyctaginaceae KY952398 Standl. (MEXU) Sonsonate Cyphomeris gypsophiloides Michael J. Moore et United States: Nyctaginaceae KY952399 (M.Martens & Galeotti) al. 582 (OC) Texas Standl. Grajalesia fasciculata José L. Linares 13416 El Salvador: Nyctaginaceae KY952409 (Standl.) Miranda (MEXU) Sonsonate Richard W. Guapira discolor United States: Nyctaginaceae Spellenberg 13294 KY952410 (Spreng.) Little Florida (NMC) Guapira eggersiana Scott A. Mori Nyctaginaceae French Guiana KY952411 (Heimerl) Lundell 25542/40 (NY) Mirabilis albida Norman A. Douglas United States: Nyctaginaceae KY952416 (Walter) Heimerl 2035 (DUKE) Arizona Mirabilis albida William R. Carr United States: Nyctaginaceae KY952417 (Walter) Heimerl 11075 (TEX) Texas Mirabilis alipes Arnold Tiehm 13461 United States: Nyctaginaceae KY952418 (S.Watson) Pilz (TEX) Nevada Mirabilis bigelovii James D. Morefield United States: Nyctaginaceae A.Gray var. retrorsa (A. KY952419 et al. 3780 (TEX) California Heller) Munz Mirabilis cf. glabrifolia Michael J. Moore et Mexico: San Nyctaginaceae KY952428 (Ortega) I.M.Johnst. al. 1244 (OC) Luis Potosi Mirabilis cf. nesomii George S. Hinton Mexico: Nyctaginaceae KY952449 B.L.Turner 25567 (TEX) Nuevo Leon Mirabilis coccinea Norman A. Douglas United States: Nyctaginaceae KY952420 (Torr.) Benth. & Hook.f. 2133 (DUKE) Arizona Steven P. Mirabilis coccinea United States: Nyctaginaceae McLaughlin et al. KY952421 (Torr.) Benth. & Hook.f. Arizona 9354 (ARIZ) Mirabilis comata Norman A. Douglas United States: Nyctaginaceae KY952422 (Small) Standl. 2084 (DUKE) Arizona Richard W. Mirabilis decumbens Mexico: Nyctaginaceae Spellenberg et al. KY952423 (Nutt.) Daniels Zacatecas 4073 (TEX) bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 31

Mirabilis donahooiana Alice Le Duc et al. Mexico: Nyctaginaceae KY952424 Le Duc 247 (TEX) Michoacán Mirabilis exserta Pedro Tenorio 10586 Nyctaginaceae Mexico KY952425 Brandegee (MEXU) Mirabilis gigantea J. Quayle et al. 752 United States: Nyctaginaceae KY952426 (Standl.) Shinners (TEX) Texas Mirabilis glabra Michael J. Moore et United States: Nyctaginaceae KY952446 (S.Watson) Standl. al. 674 (OC) New Mexico Mirabilis glabrifolia Guy Nesom et al. Mexico: Nyctaginaceae KY952427 (Ortega) I.M.Johnst. 7654 (TEX) Coahuila Mirabilis glabrifolia Michael J. Moore et Mexico: Nyctaginaceae KY952429 (Ortega) I.M.Johnst. al. 1325 (OC) Nuevo Leon Mirabilis gracilis Alice Le Duc et al. 71 Mexico: Nyctaginaceae KY952430 (Standl.) LeDuc (TEX) Jalisco Mirabilis grandiflora Nyctaginaceae EDL 1863 (MEXU) Mexico KY952431 (Standl.) Standl. Mirabilis greenei George E. Pilz 998 United States: Nyctaginaceae KY952432 S.Watson (TEX) California Mirabilis himalaica D. E. Boufford et al. China: Xizang Nyctaginaceae (Edgew.) Heimerl var. KY952433 32449 (F ) (Tibet) chinensis Heimerl Mirabilis himalaica D. E. Boufford et al. China: Xizang Nyctaginaceae (Edgew.) Heimerl var. KY952434 41198 (F) (Tibet) chinensis Heimerl Mirabilis himalaica D. E. Boufford et al. China: Xizang Nyctaginaceae (Edgew.) Heimerl var. KY952435 41435 (F ) (Tibet) chinensis Heimerl Patricia Hernández Mirabilis hintoniorum Mexico: Nyctaginaceae Ledesma 118 KY952436 Le Duc Michoacán (MEXU) United States Nyctaginaceae Mirabilis jalapa L. Michael J. Moore s.n. KY952437 (cultivated) Mirabilis laevis (Benth.) Andrew C. Sanders et United States: Nyctaginaceae KY952438 Curran al. 29410 (TEX) California Mirabilis latifolia Victor L. Cory 24549 United States: Nyctaginaceae (A.Gray) Diggs, KY952439 (GH) Texas Lipscomb & O'Kennon Mirabilis linearis Billie L. Turner 21- United States: Nyctaginaceae KY952440 (Pursh) Heimerl 854 (TEX) Texas Mirabilis linearis (Pursh) Heimerl var. Michael J. Moore et Mexico: Nyctaginaceae KY952441 decipiens (Standl.) al. 1984 (OC) Coahuila S.L.Welsh Michael J. Moore et Mexico: San Nyctaginaceae Mirabilis longiflora L. KY952442 al. 1230 (OC) Luis Potosi Mirabilis longiflora L. var. wrightiana (A.Gray Alice Le Duc 185 United States: Nyctaginaceae KY952443 ex Britton & Kearney) (TEX) New Mexico Kearney & Peebles Nyctaginaceae Mirabilis melanotricha Michael J. Moore et Mexico: San KY952444 bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 32

(Standl.) Spellenb. al. 1191 (OC) Luis Potosi Mirabilis melanotricha Norman A. Douglas United States: Nyctaginaceae KY952445 (Standl.) Spellenb. 2067 (DUKE) New Mexico Mirabilis multiflora Michael J. Moore United States: Nyctaginaceae KY952447 (Torr.) A.Gray 1110 (OC) Texas Mirabilis multiflora Norman A. Douglas United States: Nyctaginaceae KY952448 (Torr.) A.Gray 2037 (DUKE) Arizona Mirabilis nesomii Michael J. Moore et Mexico: Nyctaginaceae KY952450 B.L.Turner al. 2179 (NMC) Nuevo Leon Mirabilis nesomii Michael J. Moore et Mexico: Nyctaginaceae KY952451 B.L.Turner al. 2643 (NMC) Nuevo Leon Mirabilis nyctaginea William R. Carr United States: Nyctaginaceae KY952452 (Michx.) MacMill. 14590 (TEX) Texas Mirabilis oligantha José L. Panero 2816 Mexico: Baja Nyctaginaceae KY952453 (Standl.) Standl. (MEXU) California Mirabilis oxybaphoides George S. Hinton Mexico: Nyctaginaceae KY952454 (A.Gray) A.Gray 25572 (TEX) Nuevo Leon Alice Le Duc 259 Mexico: Nyctaginaceae Mirabilis polonii Le Duc KY952455 (MEXU) Nuevo Leon Mirabilis pringlei Alice Le Duc et al. 63 Mexico: Nyctaginaceae KY952456 Weath. (TEX) Jalisco Mirabilis pudica Arnold Tiehm 10971 United States: Nyctaginaceae KY952457 Barneby (TEX) Nevada Mirabilis texensis Billie L. Turner 22- United States: Nyctaginaceae KY952458 (J.M.Coult.) B.L.Turner 417 (TEX) Texas Ramón Cuevas G. et Mexico: Nyctaginaceae Mirabilis triflora Benth. KY952459 al. 3415 (MEXU) Jalisco Mirabilis urbani Mark Fishbein et al. Mexico: Nyctaginaceae KY952460 Heimerl 5107 (MEXU) Michoacan Mexico: Mirabilis violacea (L.) Patricia Hernández Nyctaginaceae Distrito KY952461 Heimerl Ledesma 63 (MEXU) Federal Michael J. Moore et Mexico: San Nyctaginaceae Mirabilis viscosa Cav. KY952462 al. 1824 (NMC) Luis Potosi Patricia Hernández Nyctaginaceae Mirabilis viscosa Cav. Mexico KY952463 Ledesma 13 (MEXU) Neea belizensis Cyrus L. Lundell Guatemala: Nyctaginaceae KY952465 Donn.Sm. 17692 (TEX) Petén Neea cauliflora Poepp. Schanke S15106 Nyctaginaceae Peru KY952466 & Endl. (NY) Neea psychotrioides Robert L. Wilbur Costa Rica: Nyctaginaceae KY952467 Donn.Sm. 63654 Heredia Nyctaginia capitata Michael J. Moore et United States: Nyctaginaceae KY952470 Choisy al. 617 (OC) Texas Thomas R. Van Okenia hypogaea Mexico: Nyctaginaceae Devender et al. 92- KY952471 Schltdl. & Cham. Sonora 1069 (NMC) C. Martínez 1209 Mexico: Nyctaginaceae Pisonia aculeata L. KY952483 (TEX) Oaxaca Nyctaginaceae Pisonia brunoniana J. S. Rose 3 United States: KY952484 bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 33

Endl. Hawaii Ana L. Reina Pisonia capitata Mexico: Nyctaginaceae Guerrero et al. 2000- KY952485 (S.Watson) Standl. Sonora 193 (NMC) Thomas R. Van Pisonia capitata United States: Nyctaginaceae Devender et al. 2003- KY952486 (S.Watson) Standl. Arizona 17 (TEX) Pisonia macranthocarpa Dennis E. Breedlove Mexico: Nyctaginaceae KY952487 (Donn.Sm.) Donn.Sm. et al. 30361 (TEX) Chiapas Flora K. Samis 1 Pisonia sandwicensis United States: Nyctaginaceae (Lyon Arboretum KY952488 Hillebr. Hawaii living collection) José L. Linares 13403 El Salvador: Nyctaginaceae Pisonia sylvatica Standl. KY952489 (MEXU) Sonsonate Flora K. Samis 12 Pisonia umbellifera (Lyon Arboretum United States: Nyctaginaceae (J.R.Forst. & G.Forst.) KY952490 living collection, Hawaii Seem. accession 68.0453) Israel G. Vargas et al. Bolivia: Santa Nyctaginaceae Pisonia zapallo Griseb. KY952491 2001 (TEX) Cruz Pisoniella arborescens Alice Le Duc et al. Mexico: Nyctaginaceae KY952492 (Lag. & Rodr.) Standl. 231 (NMC) Oaxaca Pisoniella arborescens William R. Anderson Mexico: Nyctaginaceae KY952493 (Lag. & Rodr.) Standl. 13522 (NY) Oaxaca Ramisia brasiliensis Jomar G. Jardim Nyctaginaceae Brazil KY952495 Oliv. 1507 (NY) Reichenbachia hirsuta Michael Nee 47813 Nyctaginaceae Bolivia KY952496 Spreng. (NY) Reichenbachia Maria Maguidaura paraguayensis Nyctaginaceae Hatschbach 49218 Brazil KY952497 (D.Parodi) Dugand & (NY) Daniel Richard W. Salpianthus arenarius Mexico: Nyctaginaceae Spellenberg 12903 KY952503 Bonpl. Michoacán (NMC) Thomas R. Van Salpianthus Mexico: Nyctaginaceae Devender et al. 91- KY952504 macrodontus Standl. Sonora 894 (NMC) Salpianthus Richard W. Mexico: Nyctaginaceae purpurascens (Cav. ex Spellenberg et al. KY952505 Oaxaca Lag.) Hook. & Arn. 12885 (NMC) carneus Norman A. Douglas United States: Nyctaginaceae KY952525 (Greene) L.A.Galloway 2060 (DUKE) New Mexico Tripterocalyx crux- Arnold Tiehm et al. United States: Nyctaginaceae KY952526 maltae (Kellogg) Standl. 12213 (TEX) Nevada Tripterocalyx B. MacLeod et al. United States: Nyctaginaceae micranthus (Torr.) KY952527 751 (TEX) Colorado Hook. Agdestis clematidea George S. Hinton Mexico: Phytolaccaceae KY952313 Moc. & Sessé ex DC. 25023 (TEX) Tamaulipas bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 34

Gallesia integrifolia Michael Nee et al. Bolivia: Santa Phytolaccaceae KY952407 (Spreng.) Harms 50072 (TEX) Cruz Hilleria latifolia (Lam.) Michael Nee 33807 Bolivia: Santa Phytolaccaceae KY952413 H.Walter (TEX) Cruz Lucas C. Majure United States: Phytolaccaceae Petiveria alliacea L. KY952476 4132 (FLAS) Florida Phytolacca americana Michael J. Moore 342 United States: Phytolaccaceae KY952480 L. (OC) Ohio Mark H. Mayfield et Mexico: Phytolaccaceae Phytolacca icosandra L. KY952481 al. 1001 (TEX) Guerrero Juan A. Encina et al. Mexico: Phytolaccaceae Phytolacca octandra L. KY952482 1545 (TEX) Nuevo Leon Michael J. Moore United States: Phytolaccaceae Rivina humilis L. KY952499 1129 (OC) Texas Elsa Zardini et al. Phytolaccaceae Seguieria aculeata Jacq. KY952510 22101 (TEX) Seguieria Michael Nee 48735 Bolivia: Santa Phytolaccaceae KY952511 paraguariensis Morong (TEX) Cruz Trichostigma octandrum Michael Nee 47094 Bolivia: Santa Phytolaccaceae KY952522 (L.) H.Walter (TEX) Cruz Flora K. Samis 10 Trichostigma (Lyon Arboretum United States: Phytolaccaceae peruvianum (Moq.) KY952523 living collection, Hawaii H.Walter accession 94.0377) Australia: Christopher T. Plumbaginaceae Aegialitis annulata R.Br. Western KY952312 Martine 4043 (OC) Australia Limonium limbatum Michael J. Moore et United States: Plumbaginaceae KY952414 Small al. 694 (OC) New Mexico Michael J. Moore et Mexico: San Plumbaginaceae Plumbago scandens L. KY952494 al. 1828 (OC) Luis Potosi Eriogonum longifolium Michael J. Moore United States: Polygonaceae KY952404 Nutt. var. longifolium 1796 (OC) Texas Eriogonum Michael J. Moore United States: Polygonaceae KY952405 rotundifolium Benth. 1769 (OC) New Mexico Flora K. Samis 9 Persicaria odorata (Lyon Arboretum United States: Polygonaceae KY952475 LaLlave living collection, Hawaii accession 88.0439) Michael J. Moore United States: Polygonaceae Persicaria sp. KY952474 1177 Ohio Reynoutria japonica Michael J. Moore United States: Polygonaceae KY952498 (Houtt.) Ronse Decr. 2188 (OC) Ohio Flora K. Samis 4 Rumex albescens (Lyon Arboretum United States: Polygonaceae KY952500 Hillebr. living collection, Hawaii accession 2008-0119) Michael J. Moore United States: Polygonaceae Rumex sp. KY952501 1800 (OC) Texas Polygonaceae Rumex sp. Michael J. Moore United States: KY952502 bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 35

1805 (OC) Texas Sarcobatus vermiculatus Michael J. Moore et United States: Sarcobataceae KY952508 (Hook.) Torr. al. 813 (OC) Utah Stegnosperma cubense Silvia H. Salas Mexico: Stegnospermataceae KY952513 A.Rich. Morales 2649 (NY) Oaxaca Talinum cf. aurantiacum Michael J. Moore et Mexico: Talinaceae KY952517 Engelm. al. 1985 (OC) Coahuila Flora K. Samis 8 Talinum fruticosum (L.) (Lyon Arboretum United States: Talinaceae KY952518 Juss. living collection, Hawaii accession 2012.0008) Talinum paniculatum Michael J. Moore United States Talinaceae KY952520 (Jacq.) Gaertn. 1789 (OC) (cultivated) Michael J. Moore et Mexico: Talinaceae Talinum sp. al. 1974 (MEXU) Coahuila KY952519 666 667 Table 2 List of primers used to amplify the matK sequences newly reported here. Within each 668 primer name, the number indicates the approximate position of the primer in nucleotides 669 downstream from the start of matK. 670 Primer name Sequence (5'à3') Notes works broadly across most of Caryophyllales, but generally fails matK.300F.Car TTG CAG TCA TTG TGG AAA TTC C in Caryophyllaceae and Frankeniaceae works broadly across most of matK.1350R.Car GCC AAA GTT CTA GCA CAA GAA AG Caryophyllales designed specifically for matK.210F.Car TTC GGC TAA TGA TTC TCA CCA A Caryophyllaceae designed specifically for matK.1345R.Car GAG CCA AAG TTC TAG CAC AAG AA Caryophyllaceae designed specifically for matK.1355R.Car TGT GTT TAC GAG CTA AAG TTC TAG Caryophyllaceae designed specifically for matK.300F.Fra TCG CTG TCT TTG CTG AAA TTC C Frankeniaceae 671 672 Table 3 Summary of WGD events at identified clades with distance to diversification shift and 673 climate information. Numbers correspond to those in Figs. 1 and 2.

Distance to Subtending Mean annual Mean annual Putative WGD diversification shift in species temp °C precip mm # nodes ML(BS) (sister) (sister) (sister) Tribe Nyctagineae within the 1 Nyctaginaceae 0 (0) 123 (40) 17.49 (20.08) 482.9 (997.08) 1007.58 2 Phytolaccoid clade 6 (6) 182 (407) 19.64 (18.36) (452.47) bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 36

3 Claytonia NA 38 (15) 5.28 (7.25) 790.5 (970.36) 4 Portulacineae 1 (1) 1600 (38) 16.19 (19.35) 699.87 (736.42) 5 Amaranthus 0 (0) 28 (1) 16.27 (27.09) 797.74 (117.63) Tribe Gomphrenoideae within 6 Amaranthaceae 7 (7) 172 (41) 17.91 (16.65) 871.95 (1289.5) in Caryophyllaceae (Alsinoideae + Caryophylloideae sensu Greenberg and 7 Donoghue 2011) 9 (9) 793 (13) 11.44 (12.06) 761.43 (720.00) 1084.17 8 Polygonaceae 13 (13) 670 (70) 16.3. (16.89) (794.28) 794.28 9 Plumbaginaceae NA 70 (670) 16.89 (16.3) (1084.17) 1280.57 10 Droseraceae 8 (NA) 67 (108) 16.3 (19.08) (1491.72) 2170.5 11 Nepenthaceae 4 (NA) 89 (19) 22.52 (20.05) (1611.63) 1899.13 12 Ancistrocladaceae 0 (NA) 15 (3) 24.17 (25.6) (2882.4) 13 Tamaricaceae NA 19 (3) 14.09 (16.21) 568.32 (469.61) 674 675 bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 37

676 Table 4 Summary of diversification shifts. Letters correspond to those in Figs. 1 and 2.

Mean Mean shift shift # Family Diversification shift (ML) (BS) a Cactaceae Echinops 1.7957 2.2008 b Cactaceae within Gymnocalycium 6.9152 c Cactaceae Gymnocalycium -0.001 0.0555 d Cactaceae Hylocereus+Selenicereus 0.1175 e Cactaceae Rhipsalis+Schlumbergera+Echinocereus+relatives 0.0514 f Cactaceae Stenocactus -0.057 -0.019 g Anacampserotaceae Anacampseros 0.2624 h Portulaca 0.0427 0.0447 i Montiaceae Montiopsis 0.9418 j Montiaceae Montiaceae 0.0325 k Aizoaceae Drosanthemum+Delosperma+Hereroa+relatives 0.1469 l Nyctaginaceae Boerhavia 0.0747 m Nyctaginaceae Commicarpus 0.9642 n Nyctaginaceae Tribe Nyctagineae 0.0484 0.0485 o Nyctaginaceae Abronia -0.084 p Nyctag.+Aizo+Cact.+relatives Nyctag.+Aizo+Cact.+relatives 0.0168 0.019 r Amaranthaceae Salicornia 0.2732 0.1649 s Amaranthaceae Suaeda clade 1 0.1027 t Amaranthaceae Suaeda clade 2 -0.036 -0.028 u Amaranthaceae Atriplex 0.0384 v Amaranthaceae Corispermum 0.1186 w Amaranthaceae Froelichia+Gomphrena+relatives 0.0217 0.0132 x Amaranthaceae Amaranthus 0.335 0.2049 y Caryophyllaceae Dianthus 0.0662 0.0409 z Caryophyllaceae Cerastium 0.7137 aa Caryophyllaceae Arenaria 0.4606 0.425 bb Caryophyllaceae Moehringia 1.0971 0.995 cc Caryophyllaceae Schiedea 0.2339 0.2767 dd Polygonaceae Fagopyrum -0.04 -0.034 ee Polygonaceae Eriogonum+relatives 0.0432 0.0364 ff Nepenthaceae within Nepenthes 0.042 gg Ancistrocladaceae 0.1426 hh Droseraceae within Drosera 1 0.2237 0.2076 ii Droseraceae within Drosera 2 0.1622 677 678 bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 38

679 Figures Amaranthaceae

5,x

v Polygonaceae u

t s ff r w

6

cc ee Plumbaginaceae

bb Caryophyllaceae aa

ii hh z Carnivores 7 8 13 10 9

y 11,ff 12,gg Bioclim 1

Diversification shift (MEDUSA ML) d p Diversification shift (MEDUSA set) e a 2 Putative duplication b

f 4 1,n o j

3 m h l Phytolaccoids g k Portulacineae

i 680 Aizoaceae 681 Fig. 1 Chronogram of the Caryophyllales with putative WGD mapped along with identified 682 diversification shifts. Diversification analyses were performed on the maximum likelihood tree 683 (ML) as well as the bootstrap tree set (set) and those shifts that were identified in both groups are 684 shown. The branches are colored based on Bioclim variable 1 (Mean Annual Temperature). 685 bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 39

Amaranthaceae Amaranthaceae

5,x 5,x

v v Polygonaceae Polygonaceae u u

t t s s ff ff r w r w

6 6

cc cc ee Plumbaginaceae ee Plumbaginaceae

bb bb Caryophyllaceae aa Caryophyllaceae aa

ii ii hh hh z Carnivores z Carnivores 7 8 7 8 13 10 13 10 9 9 y y 11,ff 12,gg 11,ff 12,gg Bioclim 12 PC Axis 1

Diversification shift (MEDUSA ML) d Diversification shift (MEDUSA ML) d p p Diversification shift (MEDUSA set) e Diversification shift (MEDUSA set) e a a 2 Putative duplication 2 Putative duplication b b

f f 4 4 1,n 1,n o j o j

3 3 m h m h l l Phytolaccoids Phytolaccoids g g k k Portulacineae Portulacineae i i 686 Aizoaceae Aizoaceae 687 Fig. 2 The chronograms and mapping of diversification and WGD are as in Fig. 1 (see caption 688 for details). A) The branches are colored based on Bioclim variable 12 (Mean Annual 689 Precipitation), and B) based on the principal component analyses (PCA) axis 1. 690 bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 40

a) taxon (est. species) taxon (est. species) b) Portulacineae (~2000) Molluginaceae (~100) c) Claytonia (~38) other Montiaceae (~10-15) Divers. 3 3 up events 1 down within

Dupl. id 4 3

MAT MAP PCA1 MAT MAP PCA1 0.4 − Mean Annual Mean Annual Princ. Comp. 1.0 Temperature Precipitation Axis 1 − 160 180 45 60 75 1.6 680 740 800 750 900 − 1.5 2.5

r(MAT) r(MAP) r(PCA1) r(MAT) r(MAP) r(PCA1)

Rate MAT Rate MAP Rate PCA1 0.4 0.8 0.5 1.5 50 200 350 200 800 5000 20000 20000 100000

d) Tribe Nyctag. (~200) other Nyctag. (~200) e) Tamaricaceae (~80) Frankeniaceae (~73) f) Caryophyllaceae (~2300) Tribe Sperguleae (~12) 1 up 4 up

1 up 1 13 7 2

MAT MAP PCA1 MAT MAP PCA1 MAT MAP PCA1 0.5 − 2.0 − 0.5 1.5 115 125 600 1000 180 200 130 150 680 720 760 3.5 450 550 650 0.90 1.05 1.20 −

r(MAT) r(MAP) r(PCA1) r(MAT) r(MAP) r(PCA1) r(MAT) r(MAP) r(PCA1) 50000 150000 0 2000 5000 0 150000 0 4 8 12 0 2000 5000 0 400 1000 0 10000 02468 691 0.0 1.0 2.0 692 Fig. 3 Summary of WGD events, with numbers corresponding to those in Table 3, climatic 693 variables, and diversification shifts. Numbers along branches denote WGD, with the numbers 694 corresponding to those in Fig. 1 and Table 3. Numbers inside clades denote the number of 695 diversification rate shifts. Estimated species numbers are listed beside clade names. Box plots 696 show the values estimated (ancestral values are listed in the top rows, rates in the bottom rows) 697 for both the left and right clades across bootstrap samples. Clades shaded grey denote a WGD. 698 b), c), and d) have nested WGD. 699 bioRxiv preprint doi: https://doi.org/10.1101/132878; this version posted May 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. 41

Supporting Information 700 Fig. S1 The cladogram with support mapped for the bootstrap replicates described in the 701 methods. 702 Fig. S2 The chronograms and mapping of temperature variables (bioclimatic variables 13-19) 703 that are not presented in Fig. 1. 704 Fig. S3 The chronograms and mapping of precipitation variables (bioclimatic variables 13-19) 705 that are not presented in Fig. 2. 706 Fig. S4 The chronograms and mapping of PCA axis 2 on the broader Caryophyllales. 707 Fig. S5 Principal component loadings for bioclimatic variables.