For Peer Review
Journal of Biogeography

We thank the reviewers for their helpful comments. Our responses to each comment are included below in blue. We apologise for the length of time it has taken us to resubmit our revision. Properly addressing the reviewers' concerns required additional analyses that turned out to be quite time-consuming. Additionally, and more recently, an intensive campaign of fieldwork by the first author in Brazil became a more immediate and immovable priority. We appreciate the time taken by the reviewers to assess our manuscript, and we believe that they have helped us to provide a more robust and balanced manuscript.

--------------------------------------------
REVIEWER COMMENTS TO AUTHOR

Referee: 1

Comments to the Author

This manuscript by Faria and colleagues focuses on two genera of Collembola to test for signatures of Pleistocene persistence in glaciated vs. unglaciated areas of Great Britain. I believe that this is a compelling and well-written manuscript, which will be of broad interest to the readers of the Journal of Biogeography. However, I would like to express a few concerns and some suggestions that may help the authors to improve it further.

- OTU delineation: OTU delineation is a very important aspect of this study as all subsequent analyses are based on the delimited OTUs. Given that it is such a critical part of the analyses, I would expect a more thorough approach, rather than relying simply on a 4% pairwise p-distance threshold. It is not clear why the authors did not assess different threshold values and/or did not apply any phylogeny-based method (e.g., PTP or GMYC analyses for single-locus species delimitation).
At least I would appreciate a better justification and discussion on this issue, so that it does not sound like an arbitrary decision.

The reviewer’s concern is pertinent, particularly because the details of our approach to OTU delineation were not included in the manuscript. Our approach was based on prior data from several studies that have investigated the temporal-spatial aspects and biological significance of cryptic molecular lineage diversity in the genus Lepidocyrtus, one of the two genera investigated in the present study. The most relevant of these for the present study is Cicconardi et al. (2013), who defined a molecular lineage within a morphologically defined species of Collembola as “a group of individuals, or an individual, that are reciprocally defined as a monophyletic (group of individuals) or as a divergent lineage (single individuals) by unlinked loci.” They used Hardy-Weinberg equilibrium (HWE) and linkage-disequilibrium (LD) tests among lineages in sympatry to evaluate molecular lineages within morphospecies. As a result, they found 19 molecular lineages occurring sympatrically, with genetic isolation in sympatry congruent with assignment to biological species. We therefore based our OTU threshold on these prior data and on the thorough approach behind them, which makes explicit reference to the biological species concept. We used sequences from these 19 molecular lineages of Cicconardi et al., collected in sympatry, to calculate their pairwise distances, and used the minimum p-distance found between them (0.04) to define our 96% similarity threshold. This is now summarized in lines 180-190.
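To make the threshold derivation concrete, the calculation described above can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the software actually used for the manuscript: the function names, the toy sequences and lineage labels are hypothetical, and the treatment of gaps and ambiguous bases (pairwise deletion) is an assumption. It computes uncorrected p-distances between aligned sequences and takes the minimum distance observed between sequences assigned to different reference lineages.

```python
from itertools import combinations

VALID_BASES = set("ACGT")


def p_distance(seq_a, seq_b):
    """Uncorrected p-distance: proportion of differing sites between two
    aligned sequences. Sites with a gap or ambiguous base in either
    sequence are skipped (pairwise deletion, an assumption of this sketch)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in VALID_BASES and b in VALID_BASES:
            compared += 1
            if a != b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differences / compared


def min_between_lineage_distance(seqs, lineage_of):
    """Smallest p-distance between any two sequences that belong to
    different reference lineages. Returns None if there is no such pair."""
    best = None
    for id_a, id_b in combinations(seqs, 2):
        if lineage_of[id_a] == lineage_of[id_b]:
            continue  # within-lineage comparisons do not inform the threshold
        d = p_distance(seqs[id_a], seqs[id_b])
        if best is None or d < best:
            best = d
    return best


# Hypothetical toy alignment (25 sites); real input would be the aligned
# sequences of the reference lineages.
seqs = {
    "s1": "ACGTACGTACGTACGTACGTACGTA",
    "s2": "ACGTACGTACGTACGTACGTACGTA",
    "s3": "GCGTACGTACGTACGTACGTACGTA",
    "s4": "ACGTTCGTTCGTTCGTACGTACGTA",
}
lineage_of = {"s1": "A", "s2": "A", "s3": "B", "s4": "C"}

threshold = min_between_lineage_distance(seqs, lineage_of)
print(f"minimum between-lineage p-distance: {threshold:.2f}")
print(f"similarity threshold: {1 - threshold:.2f}")
```

In this sketch the minimum between-lineage distance plays the role of the 0.04 value reported above, and one minus that value gives the corresponding similarity threshold.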
Initially, we did not apply any phylogeny-based method because of the nature of Collembola data, which mostly consist of deeply divergent sequences that may have experienced high levels of saturation. Such data are not well suited to Bayesian or ML methods. Thus, we used an NJ method to build the tree and, as explained above, used prior data on species definition available for Collembola to define the 4% distance threshold. That is to say, our interest was to identify clusters, not higher-level relationships among clusters.

We are aware that NJ uses a genetic distance-clustering algorithm without an optimality criterion. Thus, we have taken on board the reviewer’s suggestion of a more thorough approach to defining OTUs, and explored Bayesian Inference (BI) and Maximum Likelihood (ML) to build trees on which to explore a range of OTU delimitation methods. Because of the limitations and caveats inherent to delimitation methods, we applied three different approaches and then assessed the congruence of inference across them (explained in lines 155-190).

Main limitations:
GMYC: performance is affected by the ratio of population sizes to species divergence times, varying population sizes, the number of species involved and the number of sampled singletons (it tends to over-split species).
bPTP and mPTP: branch lengths are proportional to the amount of genetic change rather than to time. They outperform GMYC when interspecific distances are small.

First, to infer an ultrametric tree, which is a requirement for the GMYC method, we ran BEAST analyses using all sequences but no outgroups, since outgroups tend to compress the coalescent events towards the tips of the trees, making it difficult for GMYC to distinguish closely related species.
We tested a combination of different parameters (tree and clock models) and priors (for the ucld.mean), but the BEAST analyses were somewhat problematic (there were convergence issues with poor ESS values). These problems are likely to be due to the type of data (species-level data with some variation within species), which cannot be fully resolved by a Yule tree prior or by a coalescent tree prior. As we were not confident in the resulting BI trees (which actually resulted in very different GMYC results), we explored ML analyses.

We inferred a gene tree using maximum likelihood in RAxML (Stamatakis et al., 2008). After rooting the tree and removing the outgroup sequence, we used the inferred tree for species delimitation with the bPTP and mPTP methods (Zhang et al., 2013; Kapli et al., 2017). However, bPTP failed to run on the RAxML tree, possibly because the branch lengths of the ML trees seemed to be set to the minimum. This could be a problem caused by the large number of identical sequences across both datasets. To take this into account, we collapsed our sequences into unique haplotypes and re-ran the ML analysis and the bPTP analysis with this smaller dataset. However, bPTP again failed to run, so we focused on mPTP, an improved model over bPTP that accounts for different levels of intraspecific genetic diversity. It also implements a more accurate heuristic algorithm than bPTP. Both the Entomobrya and Lepidocyrtus unique haplotype datasets were run successfully on the web server. Using multiple coalescent rates, mPTP delimited 12 Entomobrya and 17 Lepidocyrtus OTUs for the ML trees.

Finally, we re-ran the BEAST analyses using the unique haplotype dataset for both genera.
Optimal convergence and ESS values were achieved, and the resulting BI ultrametric trees were used for both GMYC and mPTP analyses. GMYC was run in R using the ‘splits’ package (Ezard et al., 2009). The number of GMYC ML entities was 22 for Entomobrya and 18 for Lepidocyrtus. Finally, mPTP on the BI tree defined 13 OTUs for Entomobrya and 22 OTUs for Lepidocyrtus. Details of similarities and differences among the OTUs defined by the different methods are in Appendix 2: Table S2.2.

In general, our previous NJ tree inference was similar to the BI haplotype trees and only slightly different from the ML haplotype trees (see Appendix 2: Table S2.1). The OTU delimitation methods differ slightly in the number of OTUs defined, so we have taken a conservative approach, counting as OTUs those that were recovered in the majority of the approaches employed and avoiding intra splits (Appendix 2: Table S2.2). The final number of OTUs is 12 for Entomobrya and 18 for Lepidocyrtus.

- NJ trees: On a related note, I do not find very appealing the fact that the only trees presented in the manuscript are distance-based neighbour-joining trees, which are however treated as if they were true phylogenetic trees (e.g., they are used to establish monophyletic sequence clusters - lines 271, 361 etc.). To establish monophyly, I think it would be preferable to include a phylogenetic tree (e.g., a Maximum Likelihood analysis using RAxML, which is particularly suitable for large datasets). Or you could just present the BEAST trees (if they included the full dataset, see next point below), which are currently not presented in the manuscript or supplementary material.