High-Throughput Sequencing and Unsupervised Analysis Of
Total Page:16
File Type:pdf, Size:1020Kb
fmicb-11-02066 August 25, 2020 Time: 17:40 # 1 METHODS published: 27 August 2020 doi: 10.3389/fmicb.2020.02066 High-Throughput Sequencing and Unsupervised Analysis of Formyltetrahydrofolate Synthetase (FTHFS) Gene Amplicons to Estimate Acetogenic Community Structure Edited by: Denis Grouzdev, Abhijeet Singh1*, Johan A. A. Nylander2,3, Anna Schnürer1*, Erik Bongcam-Rudloff4 and Federal Centre Research Bettina Müller1 “Fundamentals of Biotechnology” (RAS), Russia 1 Anaerobic Microbiology and Biotechnology Group, Department of Molecular Sciences, Swedish University of Agricultural Sciences, Uppsala, Sweden, 2 Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Reviewed by: Sweden, 3 National Bioinformatics Infrastructure Sweden, SciLifeLab, Uppsala, Sweden, 4 SLU-Global Bioinformatics Centre, Ye Deng, Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, Uppsala, Sweden Research Center for Eco-environmental Sciences (CAS), China The formyltetrahydrofolate synthetase (FTHFS) gene is a molecular marker of choice to Irena Maus, Bielefeld University, Germany study the diversity of acetogenic communities. However, current analyses are limited *Correspondence: due to lack of a high-throughput sequencing approach for FTHFS gene amplicons and Abhijeet Singh a dedicated bioinformatics pipeline for data analysis, including taxonomic annotation [email protected]; and visualization of the sequence data. In the present study, we combined the [email protected] Anna Schnürer barcode approach for multiplexed sequencing with unsupervised data analysis to [email protected] visualize acetogenic community structure. We used samples from a biogas digester Specialty section: to develop proof-of-principle for our combined approach. We successfully generated This article was submitted to high-throughput sequence data for the partial FTHFS gene and performed unsupervised Evolutionary and Genomic data analysis using the novel bioinformatics pipeline “AcetoScan” presented in this Microbiology, a section of the journal study, which resulted in taxonomically annotated OTUs, phylogenetic tree, abundance Frontiers in Microbiology plots and diversity indices. The results demonstrated that high-throughput sequencing Received: 18 June 2020 can be used to sequence the FTHFS amplicons from a pool of samples, while Accepted: 05 August 2020 Published: 27 August 2020 the analysis pipeline AcetoScan can be reliably used to process the raw sequence Citation: data and visualize acetogenic community structure. The method and analysis pipeline Singh A, Nylander JAA, described in this paper can assist in the identification and quantification of known or Schnürer A, Bongcam-Rudloff E and potentially new acetogens. The AcetoScan pipeline is freely available at https://github. Müller B (2020) High-Throughput Sequencing and Unsupervised com/abhijeetsingh1704/AcetoScan. Analysis of Formyltetrahydrofolate Synthetase (FTHFS) Gene Amplicons Keywords: FTHFS, acetogens, AcetoScan, AcetoBase, high-throughput sequencing to Estimate Acetogenic Community Structure. Front. Microbiol. 11:2066. Abbreviations: bp, base pairs; FTHFS, formyltetrahydrofolate synthetase; nt, nucleotides; OTU, operational taxonomic doi: 10.3389/fmicb.2020.02066 units; WLP, Wood–Ljungdahl pathway. Frontiers in Microbiology| www.frontiersin.org 1 August 2020| Volume 11| Article 2066 fmicb-11-02066 August 25, 2020 Time: 17:40 # 2 Singh et al. AcetoScan: Pipeline for FTHFS Sequence-Data INTRODUCTION acetogen-specific primers for the 16S rRNA gene is almost impossible (Drake, 1994; Drake et al., 2002, 2008, 2013; Lovell Acetogens are a group of bacteria that (1) use the Wood- and Leaphart, 2005). Enzymes involved in the WLP, especially Ljungdahl pathway (WLP) for energy conservation, (2) generate formyltetrahydrofolate synthetase (FTHFS), are structurally and acetyl-Coenzyme A (CoA) by reduction of two molecules of functionally conserved and the gene sequences of these enzymes carbon dioxide (CO2), (3) may or may not produce acetate have been used in numerous studies to target and identify as the main end product of carbon fixation by WLP and (4) potential acetogens in complex microbial communities (e.g., are obligate anaerobes, with some members having tolerance to Lovell and Hui, 1991; Ragsdale and Wood, 1991; Lovell, 1994; periods of aerobiosis (Wagner et al., 1996; Drake and Küsel, Leaphart and Lovell, 2001; Lovell and Leaphart, 2005; Ragsdale 2003; Küsel and Drake, 2011; Schuchmann and Müller, 2016). and Pierce, 2008; Hori et al., 2011; Müller et al., 2013, 2016; Shin WLP is one of the most ancient biological pathways known et al., 2016). Despite this, a high-throughput analysis pipeline (Pereló et al., 1999; Russell and Martin, 2004). It is used mainly for this group of organisms has not yet been established. Most, by acetogens, but also by some archaea, syntrophic acetate- if not all, previous studies using FTHFS gene amplicons have oxidizing bacteria and sulfate-reducing bacteria (Drake, 1994; been conducted by clone library construction, sequencing by Drake et al., 1997; Ragsdale and Pierce, 2008; Poehlein et al., low-throughput methods and manually evaluated against small 2012; Sakimoto et al., 2016). However, only acetogens use the reference datasets (Gagen et al., 2010; Henderson et al., 2010; complete WLP and meet the requirements for reductive or Hori et al., 2011; Planý et al., 2019). This process is time- true acetogenesis (Drake, 1994; Poehlein et al., 2012; Müller and resource-intensive and not suitable for rapid analysis of and Frerichs, 2013). Acetogens are phylogenetically very diverse a large number of samples, due to lack of a high-throughput and consist of more than 100 species in more than 23 genera solution for analysis of FTHFS gene amplicons (Leaphart and (Drake et al., 2013; Müller and Frerichs, 2013; Shin et al., Lovell, 2001; Xu et al., 2009; Gagen et al., 2010). Lack of a 2016). They are metabolically very dexterous and, in addition dedicated information resource/database for acetogen-specific to acetate, may produce ethanol, butyrate, lactate etc. (Das homology and taxonomic comparisons has also restricted the and Ljungdahl, 2003; Lovell and Leaphart, 2005; Hügler and development of high-throughput methods (Lovell and Hui, 1991; Sievert, 2011; Yang, 2018). They are also very competitive when Xu et al., 2009; Gagen et al., 2010; Planý et al., 2019). Therefore, co-existing with methanogenic archaea and sulfate-reducing we recently published AcetoBase, an information resource for bacteria, especially if the temperature is below 20◦C(Drake, FTHFS data, which can assist in high-throughput analysis of 1994; Kotsyurbenko et al., 1996; Dar et al., 2008; Drake et al., acetogenic community diversity and taxonomic assignments 2013; Fu et al., 2019; Palacios et al., 2019). Acetogens play a of sequence reads from high-throughput sequencing platforms very important role in biological carbon cycling and produce (Singh et al., 2019). approximately 1013 kg/annum of acetate in different anaerobic The goal of this study was to present a proof-of-principle environments worldwide (Müller, 2003; Lovell and Leaphart, for high-throughput sequencing of FTHFS gene amplicons. 2005; Ragsdale, 2007; Ragsdale and Pierce, 2008; Schuchmann More specifically the aim was to set up and validate our and Müller, 2016). For example, acetogens are one of the most bioinformatics pipeline “AcetoScan” for unsupervised data dominant microorganisms in the human gut (Sagheddu et al., analysis of the raw high-throughput sequence data using our 2017), where they generate approximately 1010 kg of acetate previously developed resource “AcetoBase” (Singh et al., 2019). annually worldwide by reductive acetogenesis. In the termite For the analyses, samples from a biogas digester were selected. gut, acetogenesis results in annual worldwide acetate turnover of Biogas/anaerobic digester environments are well-studied in terms approximately 1012 kg (Breznak and Kane, 1990; Drake et al., of overall bacterial community composition as well as dynamics 2013). A further approximately 1012 kg of acetate is produced and acetogens. Previous studies with the bioreactors used in in different terrestrial and marine environments like forest soil, the present study have also confirmed a diverse acetogenic marshes, peat and sediments etc. (e.g., King, 1991; Kusel and community, making it suitable for the evaluation (Müller et al., Drake, 1994; Drake et al., 2013). However, due to lack of sufficient 2013; Westerholm et al., 2015; Müller et al., 2016). data from other environments with prominent acetogenesis, the actual amount of acetate production globally cannot be resolved (Küsel and Drake, 2011; Drake et al., 2013). Acetogens are also MATERIALS AND METHODS of importance in constructed anaerobic digestion systems, where they are important key players for efficient and stable methane Sample Collection and DNA Extraction production (e.g., Xu et al., 2009; Weiland, 2010; Hori et al., 2011; Samples were taken on five different time-points (date: 150303, Wang et al., 2013; Moestedt et al., 2016; Müller et al., 2016; 150414, 150519, 150709, and 151117) from a continuous stirred- Westerholm et al., 2018; Fischer et al., 2019). tank biogas reactor (GR2) in the Anaerobic Microbiology and This physiological diversity and versatility enables acetogens Biotechnology Laboratory, Swedish University of Agricultural to survive in different types of habitats and compete with Sciences, Uppsala.