Knowledge Status and Sampling Strategies to Maximize Cost-Benefit
Total Page:16
File Type:pdf, Size:1020Kb
www.nature.com/scientificreports OPEN Knowledge status and sampling strategies to maximize cost- beneft ratio of studies in landscape genomics of wild plants Alesandro Souza Santos* & Fernanda Amato Gaiotto To avoid local extinction due to the changes in their natural ecosystems, introduced by anthropogenic activities, species undergo local adaptation. Landscape genomics approach, through genome– environment association studies, has helped evaluate the local adaptation in natural populations. Landscape genomics, is still a developing discipline, requiring refnement of guidelines in sampling design, especially for studies conducted in the backdrop of stark socioeconomic realities of the rainforest ecologies, which are global biodiversity hotspots. In this study we aimed to devise strategies to improve the cost-beneft ratio of landscape genomics studies by surveying sampling designs and genome sequencing strategies used in existing studies. We conducted meta-analyses to evaluate the importance of sampling designs, in terms of (i) number of populations sampled, (ii) number of individuals sampled per population, (iii) total number of individuals sampled, and (iv) number of SNPs used in diferent studies, in discerning the molecular mechanisms underlying local adaptation of wild plant species. Using the linear mixed efects model, we demonstrated that the total number of individuals sampled and the number of SNPs used, signifcantly infuenced the detection of loci underlying the local adaptation. Thus, based on our fndings, in order to optimize the cost-beneft ratio of landscape genomics studies, we suggest focusing on increasing the total number of individuals sampled and using a targeted (e.g. sequencing capture) Pool-Seq approach and/or a random (e.g. RAD- Seq) Pool-Seq approach to detect SNPs and identify SNPs under selection for a given environmental cline. We also found that the existing molecular evidences are inadequate in predicting the local adaptations to climate change in tropical forest ecosystems. Anthropogenic activities are transforming natural systems, drastically changing the environmental conditions in a way which poses major threat to global biodiversity1,2. Anthropogenic modifcations can lead to reduction and fragmentation of natural environments, or to changes in climatic conditions3–5. Species respond to such changes in the natural habitat, by (i) phenotypic plasticity, (ii) migrating from their natural habitats, in search of conditions, ft for survival and having ample resources, or (iii) adapting to the new environment to avoid local extinction (local adaptation)4,6–8. A species is said to exhibit local adaptation, when individuals on an average have a superior ftness in their home environment, compared to a transplanted individual8–11. Terefore, local adaptation is driven by the action of natural selection, which acts on the individual’s phenotypes, determining the characteristics that will be favored under certain environmental conditions9,12,13. Historically, local adaptation has been studied either through trans- location experiments (between environments) or trials in the greenhouse under controlled environmental con- ditions8,14,15. Te drawbacks of these two approaches include the requirement of ample fnancial resources and time, which is generally scarce for studies involving long-lived species such as trees14,15. Landscape genomics, a study that identifes genetic variations that confer local adaptation, is used to remedy these limitations16. In this approach, signifcant diferences in allele frequency between populations of the target species, indicates that individuals in the population are experiencing selection pressure; possibly in response to change in some envi- ronmental factor, such as changes in soil type, radiation, water stress, and temperature17–20. Landscape genomics studies analyze frequency distribution changes in molecular markers, such as single nucleotide polymorphism Applied Ecology & Conservation Lab, Santa Cruz State University - Rodovia Ilhéus-Itabuna, km 16, Ilheus, BA, ZIP Code: 45662-901, Brazil. *email: [email protected] SCIENTIFIC REPORTS | (2020) 10:3706 | https://doi.org/10.1038/s41598-020-60788-8 1 www.nature.com/scientificreports/ www.nature.com/scientificreports (SNPs)14,21 in relation to given environmental factors. Te SNPs are commonly used in wild local adaptation stud- ies because their location and functional annotations are known and they are widely distributed throughout the genome22,23. Additionally, the advent of high-throughput sequencing technology (HTS) has made the sequencing of millions of SNPs from the genome, possible, at moderate cost and it is not time intensive24,25. Tus, through the use of landscape genomics, it is now possible to fnd correlations between genomic regions and the variable envi- ronmental characteristics15,21. Terefore, this approach is used to pinpoint the environmental change in nature, that afects ecology (e.g.: climatic changes) of a species and can infuence its adaptive genetic potential26–28. Tus, landscape genomics approach is useful in predicting the responses of species to environmental heterogeneity and variable landscape factors29. In landscape genomics studies, outlier loci method has been used in tandem with genome–environment asso- ciation (GEA) method, to evaluate local adaptation in natural populations21. In the outlier loci method, changes in among-population frequency distribution of given allele(s)/loci, signifcantly diferent from that seen in the absence of natural selection, are identifed, and these alleles/loci are considered to be under natural selection pres- sure15,21. While, in the GEA analysis, occurrence of high correlation between the allele frequencies with one or more environmental variables is considered an indicator of local adaptation10,30. However, as the outlier method does not specify the environmental forces at work on the locus under selection, studies using landscape genomics approach use the GEA results31,32 to fx the causative environmental factor. For this reason, in this study we tried to systematize and discuss the results obtained from GEA in wild populations, sampled in situ. Considering that landscape genomics is a relatively new discipline, it requires refnement within the scope of sampling design, for GEA studies33–37. Studies, based on simulations, have predicted that the number of SNP markers used and the number of populations sampled infuence the inferences from landscape genomics analy- sis38,39. However, a recent review article, evaluated the impacts of sampling design on inferences from empirical studies on landscape genomics and it does not take into account the particularities of tropical regions37. In a way, our approach complements that of Ahrens et al., as they focused on the limitations of the diferent techniques, the lack of standardization and non-availability of information in conducting these studies. However, unlike others, our study, by reporting on empirically observed patterns in landscape genomics, aims to shape sampling design strategies for improving the cost-beneft ratio of future works, conducted in laboratories using population genomics for the frst time. Guidelines put forth in this study will be helpful, in particular, in subsidizing the costs of studies being carried out in the stark socioeconomic realities of tropical regions, which are global biodiversity hotspots under grave anthropogenic threat. Methodology To perform this work, all research articles, published to date (at the time of writing, September 2018), which evaluated local adaptations in populations of wild plants were pooled. Te papers were analyzed for (i) number of populations sampled, (ii) number of individuals sampled per population, (iii) total number of individuals sampled, and (iv) number of SNPs used. Care was taken to ensure that all data were obtained in empirical studies assessing local adaptation in wild plant populations. Te Scopus (https://www.scopus.com) and Google Scholar (http://scholar.google.com.br/) databases were searched for title, abstract, and articles using the following key- words: environmental variables and SNPs, landscape genetics and SNPs, spatial analysis and SNPs, landscape genomics and SNPs, population genomics and SNPs, adaptive genetic variation, and local adaptation and SNPs. Search results were fltered to remove papers and reviews from clinical, biomedical, veterinary, and immunology areas. Ten, a second flter was applied to eliminate the articles containing the following words: animal, fsh, ecol- ogy of freshwater, marine biology, entomology, and zoology. Finally, papers using simulations, or involving exotic and crop species in plantations, or using greenhouse experiments were removed and the remaining 35 empirical studies on in situ wild plants were selected for the present work. Using this data set and ArcGIS (10.2), a map was created depicting the distribution of the localities where researches were performed, in the diferent terrestrial biomes (Fig. 1). A double check was done at this stage to ensure if all the 35 papers chosen for this study, had eval- uated in situ local adaptation in wild plant populations, using SNPs markers. Te following information from the papers: (1) authors names; (2) publication year, (3) country of study and geographical coordinates; (4) botanical family studied; (5) species; (6) total number of individuals; (7) number of populations; (8) number of SNPs used; (9) number of SNPs under selection; and (10) method