Compendium of 4,941 Rumen Metagenome-Assembled Genomes for Rumen Microbiome Biology and Enzyme Discovery
Total Page:16
File Type:pdf, Size:1020Kb
Edinburgh Research Explorer Compendium of 4,941 rumen metagenome-assembled genomes for rumen microbiome biology and enzyme discovery Citation for published version: Stewart, R, Auffret, MD, Warr, A, Walker, A, Roehe, R & Watson, M 2019, 'Compendium of 4,941 rumen metagenome-assembled genomes for rumen microbiome biology and enzyme discovery', Nature Biotechnology, vol. 37, pp. 953-961. https://doi.org/10.1038/s41587-019-0202-3 Digital Object Identifier (DOI): 10.1038/s41587-019-0202-3 Link: Link to publication record in Edinburgh Research Explorer Document Version: Publisher's PDF, also known as Version of record Published In: Nature Biotechnology General rights Copyright for the publications made accessible via the Edinburgh Research Explorer is retained by the author(s) and / or other copyright owners and it is a condition of accessing these publications that users recognise and abide by the legal requirements associated with these rights. Take down policy The University of Edinburgh has made every reasonable effort to ensure that Edinburgh Research Explorer content complies with UK legislation. If you believe that the public display of this file breaches copyright please contact [email protected] providing details, and we will remove access to the work immediately and investigate your claim. Download date: 11. Oct. 2021 RESOURCE https://doi.org/10.1038/s41587-019-0202-3 Compendium of 4,941 rumen metagenome- assembled genomes for rumen microbiome biology and enzyme discovery Robert D. Stewart1, Marc D. Auffret 2, Amanda Warr1, Alan W. Walker 3, Rainer Roehe 2 and Mick Watson 1* Ruminants provide essential nutrition for billions of people worldwide. The rumen is a specialized stomach that is adapted to the breakdown of plant-derived complex polysaccharides. The genomes of the rumen microbiota encode thousands of enzymes adapted to digestion of the plant matter that dominates the ruminant diet. We assembled 4,941 rumen microbial metagenome- assembled genomes (MAGs) using approximately 6.5 terabases of short- and long-read sequence data from 283 ruminant cattle. We present a genome-resolved metagenomics workflow that enabled assembly of bacterial and archaeal genomes that were at least 80% complete. Of note, we obtained three single-contig, whole-chromosome assemblies of rumen bacteria, two of which represent previously unknown rumen species, assembled from long-read data. Using our rumen genome collection we predicted and annotated a large set of rumen proteins. Our set of rumen MAGs increases the rate of mapping of rumen metage- nomic sequencing reads from 15% to 50–70%. These genomic and protein resources will enable a better understanding of the structure and functions of the rumen microbiota. uminants convert human-inedible, low-value plant biomass were not present in Stewart et al.8, and brings the number of rumen into products of high nutritional value, such as meat and dairy genomes assembled to date to 5,845. We also present a metage- Rproducts. The rumen, which is the first of four chambers of nomic assembly of nanopore (MinION) sequencing data (from one the stomach, contains a mixture of bacteria, archaea, fungi and pro- rumen sample) that contains at least three whole bacterial chro- tozoa that ferment complex carbohydrates, including lignocellulose mosomes as single contigs. These genomic and protein resources and cellulose, to produce short-chain fatty acids (SCFAs) that the will underpin future studies on the structure and function of the ruminant uses for homeostasis and growth. Rumen microbes are rumen microbiome. a rich source of enzymes for plant biomass degradation for use in biofuel production1–3, and manipulation of the rumen microbiome Results offers opportunities to reduce the cost of food production4. Metagenome-assembled genomes from the cattle rumen. We Ruminants are important for both food security and climate sequenced DNA extracted from the rumen contents of 283 beef cat- change. For example, methane is a byproduct of ruminant fermen- tle (characteristics of the animals sequenced are in Supplementary tation, released by methanogenic archaea, and an estimated 14% Data 1), producing over 6.5 terabytes of Illumina sequence data. of methane produced by humans has been attributed to ruminant We operated a continuous assembly-and-dereplication pipeline, livestock5. Methane production has been directly linked to the which means that newer genomes of the same strain (>99% aver- abundance of methanogenic archaea in the rumen6, offering pos- age nucleotide identity (ANI)) replaced older genomes if their com- sibilities for mitigating this issue through selection7 or manipulation pleteness and contamination statistics were better. All 4,941 RUGs of the microbiome. Two studies have reported large collections of we present here have completeness ≥80% and contamination ≤10% rumen microbial genomes. Stewart et al. assembled 913 draft MAGs (Supplementary Fig. 1). (named rumen-uncultured genomes (RUGs)) from the rumens of All the RUGs were analyzed using MAGpy10 and their assem- 43 cattle raised in Scotland8, and Seshadri et al. reported 410 refer- bly characteristics, putative names and taxonomic classifications ence archaeal and bacterial genomes from the Hungate collection9. are given in Supplementary Data 2. Sourmash11, DIAMOND12 and As isolate genomes, the Hungate genomes are generally higher qual- PhyloPhlAn13 outputs, which reveal genomic and proteomic simi- ity and, crucially, the corresponding organisms exist in culture and larity to existing public data, are given in Supplementary Data 3. so can be grown and studied in the lab. However, we found that A phylogenetic tree of the 4,941 RUGs, alongside 460 public addition of the Hungate genomes increased read classification by genomes from the Hungate collection, is presented in Fig. 1 and only 10%, as compared to an increase of 50–70% when the RUGs Supplementary Data 4. The tree is dominated by large numbers of were used, indicating large numbers of undiscovered microbes in genomes from the Firmicutes and Bacteroidetes phyla (dominated the rumen. by Clostridiales and Bacteroidales, respectively), but also con- We present a comprehensive analysis of more than 6.5 tera- tains many new genomes from the Actinobacteria, Fibrobacteres bases of sequence data from the rumens of 283 cattle. Our catalog and Proteobacteria phyla. Clostridiales (2,079) and Bacteroidales of rumen genomes (named RUG2) includes 4,056 genomes that (1,081) are the dominant orders, with Ruminoccocacae (1,111) and 1The Roslin Institute and the Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush, UK. 2Scotland’s Rural College, Edinburgh, UK. 3The Rowett Institute, University of Aberdeen, Aberdeen, UK. *e-mail: [email protected] NATURE BIOTECHNOLOGY | VOL 37 | AUGUST 2019 | 953–961 | www.nature.com/naturebiotechnology 953 RESOURCE NATURE BIOTECHNOLOGY Actinobacteria Bacteroidetes Bacteroidales bacterium WCE2008 Bacteroidales bacterium NLAE-zl-C104 Fibrobacteres um Proteobacteria Planctomycetes WCE2004 Spirochaetes Chloroflexi Porphyromonadaceae bacteri Treponema bryantii Bacteroides sp AR29 Bacteroidales bacterium Ga6A1 Firmicutes Tenericutes Succinivibrio dex B25 Desulfovibrio desulfuri C21a Unknown Olsenella umbon Enterobactertrinosolvens sp cans G11 DSM 7057 Prevotella bryantii Elusimicrobia ata A2 DSM 2261 22B Olsenella sp KH1P3 Denitrobacterium deto KPR-6 HUN156 Bifidobacterium rumina Euryarchaeota 9 xificans NPOH1 DSM Actinomyces ruminicola B71 ntium RU687 DS Prevotellaceae bacterium 21843 DSMM 27982 6489 Prevotella brevis P6B11 HUN048 Prevotella ruminicola AR32 Fusobacterium necrophorum Methanobrevibacter gottschalkii PG DSM 11978 Acidaminococcus fermentansB14 pGA-4DSM 11005 Succiniclasticum ruminis Anaerovibrio sp RM50 um AB3002 Methanobrevibacter olleyae Selenomonas ruminanti 1H5-1P DSM 16632 Mitsuokella jalaludinii M9 DSM 13811 Methanomicrobium mobile 1 DSM 1539 Selenomonas bovis 8-14-1 Bacillus cereus KPR 7A Bacillus licheniformis VTM3R78 Lactococcus garvieae M79 Streptococcus equinus 2B Sharpea azabuensis KH1P5 YE282 Kandleria vitulina Erysipelatoclostrid Ruminococcus bromii KH4T7 ium innocuum Peptostreptococcus anaerobius C Eubacterium callanderi NLAE-zl-C381 Eubacterium pyruvativo NLAE-zl-G225 AR67 rans Clostridiales bacter i6 Ruminococcus flavefaciens Y1 Ruminococcus albus Lachnoclostridium amino Lachnospiraceae Eubacterium ruminan Lachnoclostridium ium R-7 Lachnospira pe Dorea longicatena Eubacterium cellulosolvens Eubacterium oxido Lachnobacterium bovis AE2004 Pseudobutyrivibrio xylan Lachnospirac Butyrivibrio fibrisolvens PC13 Butyrivibrio sp YAB3001 bacterium AB4001 ctinoschiza philum clostridioform tium 2388 eae bacter A GR2136 NC2004 F DSM 10710 Oscillibacter sp reducens LD 64 DSM 11001 M83 LD2006 ivorans e ium AGR2157 AB2020 G41 DSM 3217 NLAE-zl-G231 Sarcina sp Bu21 DSM 10317 Ruminococcaceae bacterium Ruminococcaceae bacterium YRB3002 Fig. 1 | Phylogenetic tree of 4,941 RUGs from the cattle rumen, additionally incorporating rumen genomes from the Hungate collection. The tree was produced from concatenated protein sequences using PhyloPhlAn13, and subsequently drawn using GraPhlAn45. Labels show Hungate genome names, and were chosen to be informative but not overlap. Lachnospiraceae (640) constituting the dominant families within uncultured strains of Ruminococcus flavefaciens, 42 represented Clostridiales and Prevotellaceae (521) consituting the dominant genomes from uncultured strains of Fibrobacter succinogenes, family within