A Novel Metagenomic Workflow for Biomonitoring across the Tree of Life using PCR-free Ultra-deep Sequencing of Extracellular eDNA

Shivakumara Manu¹ and Govindhaswamy Umapathy¹
¹Centre for Cellular and Molecular Biology CSIR
February 22, 2021

Abstract

Biodiversity is declining on a planetary scale at an alarming rate due to anthropogenic factors. Classical biodiversity monitoring approaches are time-consuming, resource-intensive, and not scalable enough to address the current biodiversity crisis. The environmental DNA-based next-generation biomonitoring framework provides an efficient, scalable, and holistic solution for evaluating changes in various ecological entities. However, its scope is currently limited to monitoring targeted groups of organisms using metabarcoding, which suffers from various PCR-induced biases. To utilise the full potential of next-generation biomonitoring, we aimed to develop PCR-free genomic technologies that can deliver unbiased biodiversity data across the tree of life in a single assay. Here, we describe a novel metagenomic workflow comprising a customised extracellular DNA enrichment protocol for large-volume filtered water samples, a completely PCR-free library preparation step, ultra-deep next-generation sequencing, and a pseudo-taxonomic assignment strategy using the dual lowest common ancestor algorithm. We demonstrate the utility of our approach in a pilot-scale, spatially replicated experimental setup in Chilika, a large hyper-diverse brackish lagoon ecosystem in India. Using incidence-based statistics, we show that biodiversity across the tree of life, from microorganisms to relatively low-abundance macroorganisms such as arthropods and fishes, can be effectively detected with about one billion paired-end reads using our reproducible workflow.
With decreasing costs of sequencing and the increasing availability of genomic resources from the Earth BioGenome Project, our approach can be tested in different ecosystems and adapted for large-scale rapid assessment of biodiversity across the tree of life.

RESOURCE ARTICLE
Running title: Biomonitoring across the tree of life
Shivakumara Manu and Govindhaswamy Umapathy#
Laboratory for the Conservation of Endangered Species, CSIR-Centre for Cellular and Molecular Biology, Hyderabad, India
Correspondence: [email protected] (G.U.#), [email protected] (S.M.)
ORCiD: 0000-0002-9114-8793 (S.M.); 0000-0003-4086-7445 (G.U.)

Posted on Authorea 22 Feb 2021 | The copyright holder is the author/funder. All rights reserved. No reuse without permission. | https://doi.org/10.22541/au.161401815.51766652/v1 | This is a preprint and has not been peer reviewed. Data may be preliminary.

Keywords: Environmental DNA, Extracellular DNA, Next-Generation Biomonitoring, Shotgun Sequencing, Metagenomics, Tree of Life

INTRODUCTION

The vast biodiversity on Earth is the result of billions of years of evolution. All the evolutionary lineages that make up the tree of life belong to three domains (Archaea, Bacteria, and Eukaryota), with Viruses forming a contested fourth category. Organisms across the tree of life have evolved and adapted to inhabit various environments on Earth. Widely accepted studies estimate that about 8.7 million (±1.3 million) species of Eukaryotes (Mora et al., 2011) and up to a trillion species of microbes (Locey & Lennon, 2016) exist on Earth. Given these estimates and the completeness of the Encyclopedia of Life database (Parr et al., 2014), the majority of eukaryotic diversity and most of the microbial diversity remain unknown to science despite 250 years of scientific exploration.
At the current rate of novel species discovery, it would take hundreds of years to catalogue the eukaryotic species alone (Costello et al., 2013). However, there is currently an impending threat of a sixth mass extinction due to various anthropogenic factors such as pollution, land-use change, habitat loss, poaching, and climate change (Ceballos et al., 2020). The population sizes of many species have dropped significantly, and species extinction rates have increased hundreds to thousands of times compared to the background rate (Ceballos et al., 2015, 2017). Extinction of species is irreversible and may have long-lasting effects on ecosystem functioning and services. A global assessment report by the U.N. estimated that up to a million eukaryotic species may be threatened with extinction and might go extinct in the next few decades (Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services, 2019). Under the current scenario, many species may go extinct even before being catalogued in the Encyclopedia of Life. Therefore, it has become imperative to assess and monitor biodiversity at a larger scale than ever before, to chart conservation policies and guide management strategies.

Classical biomonitoring techniques are time-consuming, resource-intensive, require manual identification of specimens, and are not easily scalable. Environmental DNA (eDNA)-based molecular methods offer several advantages over classical biomonitoring methods (Thomsen & Willerslev, 2015). eDNA-based biomonitoring techniques detect the presence of various taxa in an ecosystem using DNA extracted directly from whole environmental samples (e.g., water, soil, air) (Taberlet et al., 2012a). The last decade witnessed tremendous strides in the methodological development of eDNA-based biomonitoring techniques (Seymour, 2019).
Along with the technical advances, there has also been considerable effort to understand the ecology of eDNA (Barnes & Turner, 2016; Stewart, 2019), and to clearly define the term eDNA (Pawlowski et al., 2020; Rodriguez-Ezpeleta et al., 2020). By exploiting various sources of DNA in an environmental sample (Figure 1), eDNA-based biomonitoring has emerged as a powerful new technique that has revolutionised the way we survey ecological communities (Deiner et al., 2017). Numerous comparative studies have concluded that eDNA-based biomonitoring could complement, or potentially even replace, classical biomonitoring methods in the future (Leempoel et al., 2020; Piggott et al., 2020).

eDNA-based biomonitoring offers high scalability at four levels: physical, economic, biological, and ecological. First, collecting eDNA samples requires far less effort than most classical biomonitoring techniques. For example, filtering water samples to detect fish communities requires significantly less time and fewer resources than surveying with gill nets or electrofishing. Such physical scalability in sampling enables the collection of a large number of samples covering entire ecosystems with minimal effort (e.g., West et al., 2021). Second, the high-throughput nature of the molecular methods used in eDNA-based biomonitoring, such as quantitative PCR and next-generation sequencing, reduces the cost incurred per sample as the scale of the project increases. Third, eDNA samples typically contain multiple sources of DNA from many different organisms across the tree of life (Figure 1) (Barnes & Turner, 2016; Torti et al., 2015).
For instance, a bulk sediment sample can contain microorganisms, invertebrates, extracellular DNA, and other biological particles of various sizes from organisms across a wide range of taxa. Thus, eDNA samples are biologically scalable, in the