* Correspondence To: Yujing Yan, Email: [email protected] Or Charles C
Total Page:16
File Type:pdf, Size:1020Kb
Supplementary Materials for Phytogeographic history of the Tea family inferred through high-resolution phylogeny and fossils *Yujing Yan1,2, *Charles C. Davis2, Dimitar Dimitrov1,3, Zhiheng Wang4, Carsten Rahbek5,1,4,6,7, Michael Krabbe Borregaard1 1. Center for Macroecology, Evolution and Climate, GLOBE Institute, University of Copenhagen, Universitetsparken 15, 2100, Copenhagen, Denmark 2. Department of Organismic and Evolutionary Biology, Harvard University Herbaria, 22 Divinity Ave, Cambridge, MA 02138, USA 3. Department of Natural History, University Museum of Bergen, University of Bergen, P.O. Box 7800, 5020 Bergen, Norway 4. Institute of Ecology, College of Urban and Environmental Sciences, Key Laboratory of Earth Surface Processes of Ministry of Education, Peking University, Beijing 100871, China 5. Center for Global Mountain Biodiversity, GLOBE Institute, University of Copenhagen, Universitetsparken 15, 2100 Copenhagen, Denmark 6. Department of Life Sciences, Imperial College London, Silkwood Park campus, Ascot SL5 7PY, UK 7. Danish Institute for Advanced Study, University of Southern Denmark, Odense, Denmark. * Correspondence to: Yujing Yan, email: [email protected] or Charles C. Davis, email: [email protected] This PDF file includes: Appendix 1 (p.4) Supplementary Notes 1 (p. 2-4) Supplementary Tables S1 to S11 (p.7-42) Supplementary Figures S1 to S10 (p.43-52) 1 Supplementary Note 1. Inference of biogeographical patterns using a fully Bayesian method in RevBayes We applied an alternative biogeographic method proposed by Landis et al. (2020) to a subset of our Theaceae empirical dataset and compared its performance to our method. This method uses a hierarchical Bayesian approach to account for the uncertainty in the position of fossils (lack of characters), phylogenetic relationships, and the geological/biogeographic template at once. The method takes a combined evidence strategy which “favors evolutionary histories that are in harmony across phylogenetic, temporal, biogeographic, and ecological components, while penalizing those with dissonant components” (Landis et al., 2020). However, because the method is computationally intensive, we only applied it to the Stewartieae clade. Methods -Dataset We subset our Theaceae dataset to include the sampled 12 extant species and five fossil species of the tribe Stewartieae. The alignment length and partition scheme were the same as the full dataset. -Implementation of the RevBayes method This method combines a molecular substitution model to estimate the molecular variation, a fossilized birth- death model to account for the relationship between lineage diversification and fossil occurrence, and a DEC model to reconstruct the biogeographic history. We followed Landis et al. (2020) to set the molecular substitution model and the fossilized birth-death model. To shorten the computation, we first estimated the topology using the maximum likelihood method in RAxML with 1000 bootstraps. Then we pruned the estimated best tree to only include nodes that have a bootstrap value of 100. This tree was then used as a backbone tree to constrain the topology of extant species in the RevBayes analysis. We set the prior on the root age of Stewartieae clade as a normal distribution which centered at 27.8 Ma and had an upper and lower boundary of 8.1 Ma and 125 Ma respectively. The maximum range occupation was set to two. We have run two analyses. One used a time-stratified dispersal matrix which only allows the dispersal between Eurasia and Nearctic before 10Ma (corresponding to the M1 scenario in the BioGeoBEARS analysis), the other used the same dispersal matrix for all periods (corresponding to the M0 scenario in the BioGeoBEARS analysis). The analyses were running with a sampling frequency of one tree every 50 generations until the ESS for estimated parameters become > 300. -Implementation of the empirical Bayesian method Because BioGeoBEARS is a globally optimum model, we repeated our pipeline on the Stewartieae clade to eliminate potential differences associated with the use of different datasets. We used Gordonia as outgroup and reran the RAxML+TreePL step. We used two dating schemes. The first is assuming that the oldest fossil of the tribe with a detailed description, Stewartia terteria, is a crown member of the Stewartieae clade. Under this assumption, we used the species to calibrate the crown node of Stewartieae. The second dating scheme is the same as in the main text, assuming that this fossil can only be used to calibrate the stem of the Stewartieae clade. It is not certain whether Stewartia terteria belongs to the stem or crown of the group because its morphological characters are similar to Stewartia malacodendorn. We found that the two dating schemes resulted in similar divergence time estimates. We used the phylogenies estimated under the first dating scheme in the following biogeographic analysis. The rest steps were the same as in the main text. We used only the DEC model with maximum ranges set to two. We summarized the proportions of phylogenies with each ancestral state and compared the results with the summarized posterior distribution of biogeographic reconstructions obtained with RevBayes. Results and discussion The results of divergence time estimates are shown in Fig. N1. The estimates using the approach in Landis et al. (2020) were generally in agreement with that using a penalized method. The only exception was the age of crown Stewartieae estimated under a non-stratified dispersal matrix. This estimate (c.38Ma) was much older than any previous estimates (Yu et al. 2017; Lin et al. 2019) and is less likely (Fig. N1). It is possibly a result of placing fossil species Stewartia terteria on the stem of Stewartieae except Stewartia malacodendorn and employing the fossilized birth-death process. On the other hand, our penalized dating method is less sensitive to the placement of Stewartia terteria and steadily estimated a crown node age between 20-30 Ma (Fig. N2), similar to our result in the main text (Fig. 2) and previous estimations (Yu et al. 2017; Lin et al. 2019). The results of ancestral range reconstruction are shown in Fig. N3. The two methods are generally in congruence with each other, giving similar estimates of different ancestral states for the three key nodes. The only exception is still the crown node of Stewartieae under the non-stratified dispersal scenario. Our method estimated the states supported by most sampled phylogenies to be Eurasian, while RevBayes estimated the states to be Eurasian and Nearctic. 2 Figure N1. Dated phylogeny estimated in RevBayes using (a) a non-stratified dispersal matrix, (b) a stratified dispersal matrix. Bars indicate 95% highest posterior density. Red bars represent fossil species. Figure N2. Dated phylogeny estimated in TreePL using Stewartia teriaria to calibrate (a) the crown node of Stewartieae (b) the stem node of Stewartieae. Bars indicated 95% confidence intervals. Figure N3. Comparison of the posterior probabilities estimated in RevBayes and the conditional probabilities (proportions of sampled phylogenies) estimated using our method of the same ancestral states of the three key nodes (crown Stewartieae, crown Hartia, and crown of strict Stewartia). The results are under (a) non-stratified dispersal scenario, and (b) stratified dispersal scenario. The dotted line in each plot represents a 1:1 line. References 3 Landis M.J., Eaton D.A.R., Clement W.L., Park B., Spriggs E.L., Sweeney P.W., Edwards E.J., Donoghue M.J. 2020. Joint Phylogenetic Estimation of Geographic Movements and Biome Shifts during the Global Diversification of Viburnum. Syst. Biol. 70:67–85. Lin H.-Y., Hao Y.-J., Li J.-H., Fu C.-X., Soltis P.S., Soltis D.E., Zhao Y.-P. 2019. Phylogenomic conflict resulting from ancient introgression following species diversification in Stewartia s.l. (Theaceae). Mol. Phylogenet. Evol. 135:1– 11. Yu X.Q., Gao L.M., Soltis D.E., Soltis P.S., Yang J.B., Fang L., Yang S.X., Li D.Z. 2017b. Insights into the historical assembly of East Asian subtropical evergreen broadleaved forests revealed by the temporal history of the tea family. New Phytol. 215:1235–1248. Appendix 1. Distribution data sources Enquist, B. J. et al. (2009). The Botanical Information and Ecology Network (BIEN): cyberinfrastructure for an integrated botanical information network to investigate the ecological impacts of global climate change on plant biodiversity. – The iPlant Collaborative, <www.iplantcollaborative.org/sites/default/files/BIEN_White_paper.pdf > Global biodiversity information facility (GBIF): http //www.gbif.org/. doi: 10.15468/dl.bpj2hf IDigBio. (2018). Integrated digitized biocollections. National Science Foundation. https://www.idigbio.org/. Ma K, Xu Z (2019). NSII (China National Specimen Information Infrastructure). Chinese Academy of Sciences (CAS). http://www.nsii.org.cn/ Min T, Bartholomew B. (2007). Theaceae. In Flora of China, pp. 366–478. NSII (National Specimen Information Infrastructure): http://www.nsii.org.cn/ Supplementary tables Table S1. List of Genbank sequences used in the study Table S2. List of taxa sampled in this study with voucher, distribution and assembly information Table S3. Information of extracted coding regions of plastomes Table S4. Rogue taxa identified using RogueNaRok in the combined plastome-nrDNA dataset. Table S5. Node age calibration settings used in TreePL Table S6. Dispersal matrix settings used in BioGeoBEARS for the M0 and M1 scenarios Table S7. Information of fossils incorporated