Microbiota Profiling with Long Amplicons Using Nanopore Sequencing: Full-Length 16S Rrna Gene and the 16S-ITS-23S of the Operon
Total Page:16
File Type:pdf, Size:1020Kb
F1000Research 2019, 7:1755 Last updated: 03 AUG 2021 RESEARCH ARTICLE Microbiota profiling with long amplicons using Nanopore sequencing: full-length 16S rRNA gene and the 16S- ITS-23S of the rrn operon [version 2; peer review: 2 approved, 3 approved with reservations] Anna Cuscó 1, Carlotta Catozzi 2,3, Joaquim Viñes1,3, Armand Sanchez3, Olga Francino3 1Vetgenomics, SL, Bellaterra (Cerdanyola del Vallès), Barcelona, 08193, Spain 2Dipartimento di Medicina Veterinaria, Università degli Studi di Milano, Milano, Italy 3Molecular Genetics Veterinary Service (SVGM), Universitat Autonoma of Barcelona, Bellaterra (Cerdanyola del Vallès), Barcelona, 08193, Spain v2 First published: 06 Nov 2018, 7:1755 Open Peer Review https://doi.org/10.12688/f1000research.16817.1 Latest published: 01 Aug 2019, 7:1755 https://doi.org/10.12688/f1000research.16817.2 Reviewer Status Invited Reviewers Abstract Background: Profiling the microbiome of low-biomass samples is 1 2 3 4 5 challenging for metagenomics since these samples are prone to contain DNA from other sources (e.g. host or environment). The usual version 2 approach is sequencing short regions of the 16S rRNA gene, which (revision) report fails to assign taxonomy to genus and species level. To achieve an 01 Aug 2019 increased taxonomic resolution, we aim to develop long-amplicon PCR-based approaches using Nanopore sequencing. We assessed two version 1 different genetic markers: the full-length 16S rRNA (~1,500 bp) and the 06 Nov 2018 report report report report report 16S-ITS-23S region from the rrn operon (4,300 bp). Methods: We sequenced a clinical isolate of Staphylococcus pseudintermedius, two mock communities and two pools of low- 1. Alfonso Benítez-Páez , Institute of biomass samples (dog skin). Nanopore sequencing was performed on Agrochemistry and Food Technology-Spanish MinION™ using the 1D PCR barcoding kit. Sequences were pre- National Research Council (IATA-CSIC), processed, and data were analyzed using EPI2ME or Minimap2 with rrn database. Consensus sequences of the 16S-ITS-23S genetic marker Valencia, Spain were obtained using canu. Results: The full-length 16S rRNA and the 16S-ITS-23S region of the 2. Rasmus H. Kirkegaard , Aalborg rrn operon were used to retrieve the microbiota composition of the University (AAU), Aalborg, Denmark samples at the genus and species level. For the Staphylococcus pseudintermedius isolate, the amplicons were assigned to the correct 3. Amanda Warr , University of Edinburgh, bacterial species in ~98% of the cases with the16S-ITS-23S genetic Edinburgh, UK marker, and in ~68%, with the 16S rRNA gene when using EPI2ME. Using mock communities, we found that the full-length 16S rRNA 4. Kon Chu, Seoul National University, Seoul, gene represented better the abundances of a microbial community; South Korea whereas, 16S-ITS-23S obtained better resolution at the species level. Page 1 of 29 F1000Research 2019, 7:1755 Last updated: 03 AUG 2021 Finally, we characterized low-biomass skin microbiota samples and detected species with an environmental origin. 5. Lee J. Kerkhof , Rutgers University, New Conclusions: Both full-length 16S rRNA and the 16S-ITS-23S of the rrn Brunswick, USA operon retrieved the microbiota composition of simple and complex microbial communities, even from the low-biomass samples such as Any reports and responses or comments on the dog skin. For an increased resolution at the species level, targeting article can be found at the end of the article. the 16S-ITS-23S of the rrn operon would be the best choice. Keywords microbiome, microbiota, 16S, rrn operon, nanopore, canine, low- biomass, skin, dog This article is included in the Nanopore Analysis gateway. Corresponding author: Anna Cuscó ([email protected]) Author roles: Cuscó A: Conceptualization, Formal Analysis, Investigation, Methodology, Validation, Visualization, Writing – Original Draft Preparation, Writing – Review & Editing; Catozzi C: Investigation, Validation, Visualization, Writing – Review & Editing; Viñes J: Investigation, Visualization, Writing – Review & Editing; Sanchez A: Conceptualization, Funding Acquisition; Francino O: Conceptualization, Funding Acquisition, Methodology, Supervision, Writing – Review & Editing Competing interests: No competing interests were disclosed. Grant information: This work was supported by two grants awarded by Generalitat de Catalunya (Industrial Doctorate program, 2013 DI 011 and 2017 DI 037). Copyright: © 2019 Cuscó A et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. How to cite this article: Cuscó A, Catozzi C, Viñes J et al. Microbiota profiling with long amplicons using Nanopore sequencing: full- length 16S rRNA gene and the 16S-ITS-23S of the rrn operon [version 2; peer review: 2 approved, 3 approved with reservations] F1000Research 2019, 7:1755 https://doi.org/10.12688/f1000research.16817.2 First published: 06 Nov 2018, 7:1755 https://doi.org/10.12688/f1000research.16817.1 Page 2 of 29 F1000Research 2019, 7:1755 Last updated: 03 AUG 2021 dermatitis presented an overrepresentation of Staphylococcus REVISE D Amendments from Version 1 genus13,14, but the species was only confirmed when comple- We hereby present a revised version of our manuscript, based menting the studies using directed qPCRs for the species of 13 on the comments made by the referees and more results since interest or using a Staphylococcus-specific database and V1-V3 version 1 was published. region amplification14. The main changes of the manuscript when compared to version 1 are those stated here: With the launching of single-molecule technology sequencers - We have assessed the performance of the mapping approach (e.g. PacBio or Oxford Nanopore Technologies), these short- with the uncorrected long-amplicons strategy presented here. length associated issues can be overcome by sequencing the - We have expanded the results adding a section of the de novo assembly of the 16S-ITS-23S genetic marker. full-length 16S rRNA gene (~1,500 bp) or even the nearly- - We have provided more detail on the methodology used from complete rrn operon (~4,300 bp), which includes the 16S rRNA the lab bench to the bioinformatics part, with supplementary file 1 gene, ITS region, and 23S rRNA gene. providing the commands used for the mapping approach. - The mapping results have been updated since in version 1 Several studies assessing the full-length 16S rRNA gene have chimeras were computed differently for each amplicon and led already been performed using Nanopore sequencing to: i) char- to confusing results (more chimera for 16S rRNA than for 16S- 15–17 ITS-23S). Now for both amplicons, we have performed base-level acterize artificial bacterial communities (mock community) ; 18 alignment using Minimap2 as the first step before yacrd (detailed ii) complex microbiota samples, from the mouse gut , in Supplementary File 1). wastewater19, microalgae20 and dog skin21; and iii) the pathogenic - The figures and tables have been simplified, re-ordered and agent in a clinical sample22–24. Some studies have been performed completed to be more informative and clear. We got rid of the using the nearly-complete rrn operon to characterize mock graphic bars and kept the heatmaps. communities25 and complex natural communities26. Finally, we want to thank all the referees for their time to evaluate our work and their invaluable feedback. Here, we aim to assess these two long-amplicon approaches See referee reports using MinIONTM (Oxford Nanopore Technologies), a single- molecule sequencer that is portable, affordable with a small budget and offers long-read output. Its main limitation is a higher error Introduction rate than massive sequencing. We will test our approaches by The microbiota profile of low-biomass samples such as skin is sequencing several samples with different degrees of complex- challenging for metagenomics. These samples are prone to contain ity: i) a clinical isolate of Staphylococcus pseudintermedius, DNA contamination from the host or exogenous sources, which ii) two bacterial mock communities; and iii) two complex skin can overcome the DNA of interest1,2. Thus, the usual approach microbiota samples. is amplifying and sequencing certain genetic markers that are ubiquitously found within the studied kingdom rather than per- Methods forming metagenomics. Ribosomal marker genes are a common Samples and DNA extraction choice: 16S rRNA and 23S rRNA genes to taxonomically We first sequenced a pure bacterial isolate of S. pseud- classify bacteria3,4; and ITS1 and ITS2 regions for fungi5,6. intermedius obtained from the ear of a dog affected by otitis. Until now, most studies of microbiota rely on massive paral- lel sequencing, and target a short fragment of the 16S rRNA Then, we used two DNA mock communities as simple and gene, which presents nine hypervariable regions (V1-V9) that well-defined microbiota samples: are used to infer taxonomy7,8. The most common choices for host-associated microbiota are V4 or V1-V2 regions, which • HM-783D, kindly donated by BEI resources, con- present different taxonomic coverage and resolution depending taining genomic DNA from 20 bacterial strains with on the taxa9,10. staggered ribosomal RNA operon counts (between 103 and 106 copies per organism per μl). Apart from the biases derived from the primer choice, short fragment strategies usually fail to assign taxonomy