An Enhanced Characterization of the Human Skin Microbiome
bioRxiv preprint doi: https://doi.org/10.1101/2020.01.21.914820; this version posted January 23, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

An enhanced characterization of the human skin microbiome: a new biodiversity of microbial interactions

Akintunde Emiola¹, Wei Zhou¹, Julia Oh¹*

¹The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, USA
*Corresponding author. [email protected]

ABSTRACT

The healthy human skin microbiome is shaped by skin site physiology and individual-specific factors, and is largely stable over time despite significant environmental perturbation. Studies identifying these characteristics used shotgun metagenomic sequencing for high-resolution reconstruction of the bacteria, fungi, and viruses in the community. However, these conclusions were drawn from a relatively small proportion of the total sequence reads analyzable by mapping to known reference genomes. ‘Reference-free’ approaches, based on de novo assembly of reads into genome fragments, are also limited in their ability to capture low-abundance species and small genomes, and to discriminate between highly similar genomes. To account for the large fraction of non-human unmapped reads on the skin (referred to as microbial ‘dark matter’), we used a hybrid de novo and reference-based approach to annotate a metagenomic dataset of 698 healthy human skin samples. This approach reduced the overall proportion of uncharacterized reads from 42% to 17%. With our refined characterization, we revisited assumptions about the skin microbiome, and demonstrated higher biodiversity and lower stability, particularly in dry and moist skin sites. To investigate hypotheses underlying stability, we examined growth dynamics and interspecies interactions in these communities.
Surprisingly, even though most skin sites were relatively stable, many dominant skin microbes, including Cutibacterium acnes and staphylococci, were actively growing in the skin, with poor or no relationship between growth rate and relative abundance, suggesting that host selection or interspecies competition may be important factors maintaining community homeostasis. To investigate other mechanisms facilitating adaptation to a specific skin site, we identified Staphylococcus epidermidis genes that are likely involved in stress response and provide mechanisms essential for growth in oily sites. Finally, horizontal gene transfer, another mechanism of competition by which strains may swap antagonistic or virulent coding regions, was relatively limited in healthy skin, but suggested exchange of different metabolic and environmental tolerance pathways. Altogether, our findings underscore the value of a combined reference-based and de novo approach to provide significant new insights into microbial composition, physiology, and interspecies interactions to maintain community homeostasis in the healthy human skin microbiome.

BACKGROUND

Deep metagenomic shotgun sequencing is a powerful tool to interrogate composition and function of complex microbial communities. Microbial communities offer the potential for discovery of a tremendous suite of previously unknown biological functions, for example, new bioactive compounds, antimicrobials, virulence factors, or metabolic pathways. Such discovery has relied on the ability to survey and deconvolute species from mixed microbial consortia.
Advances in next-generation sequencing and computational analyses have, in recent years, greatly furthered efforts to reconstruct microbial communities at the species [1,2], strain [2,3], and even single nucleotide polymorphism level [3,4], examining function, transmission, and stability of the resident microbes.

However, interpretations of many metagenomic datasets are limited by the inability to characterize a large fraction of the total microbial reads present in the original sample [5,6]. This uncharacterized sequence space, or microbial ‘dark matter’ [7], typically results from the inability to map a sequence read to a known microbial reference genome and can exceed 96% of sequence reads within a sample [5]. Such ‘reference-based’ approaches, whether mapping reads to complete genomes [8] or marker genes [9], have high sensitivity and discriminatory ability between even very similar genomes [8]. However, microbes with no representative reference, or those with significant pangenomic variation, which can account for considerable within-species diversity in gene content [10], are not captured. Conversely, reference-free approaches, which use de novo assembly to aggregate reads into longer stretches of contiguous DNA sequence, can aid in the identification and characterization of new genomes. However, de novo assembly-based approaches are less effective in capturing small genomes (e.g., viruses) and low-abundance microbes, and in discriminating between very similar genomes.

By combining both approaches into a holistic framework, we aimed to reduce the proportion of uncharacterized sequence space in a metagenomics dataset, and thus provide new insights into the biological function and interspecies interactions of these microbial communities. We used a hybrid de novo and reference-based approach aimed at characterizing microbial dark matter in the skin metagenome.
Our previous analyses of this dataset (698 samples), which were exclusively reference-based, showed that the skin microbiome is defined primarily by the physiological characteristics of the skin site (e.g., whether it was a sebaceous, moist, dry, or foot site), then by host-intrinsic factors that confer individuality in strain representation and the presence of low-abundance and transient organisms [5,6]. More intriguing was the observation that the skin microbiome is remarkably stable even over years, despite the exposure of skin to different hygiene practices and the external environment [6]. However, our conclusions were based on an incomplete portrait with, on average, half of each sample remaining uncharacterized by our reference-based analyses [5]. By incorporating additional information from microbial dark matter, we stood to gain significant new insights into the landscape of skin biodiversity and microbial stability.

Leveraging our integrated approach, we uncovered previously unaccounted-for biodiversity and reduced microbial stability in the skin microbiome. We used this refined characterization to more deeply probe interspecies interactions, identifying intra-genus diversity and mechanisms underlying stability and interspecies interactions in the skin, including new assessments of growth rate and horizontal gene transfer. Our results represent the highest-resolution analysis of the skin microbiome to date, and provide new hypotheses for how skin microbes interact and compete to maintain homeostatic community conditions.
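As a rough illustration of the bookkeeping behind the ‘dark matter’ fraction discussed above, the following Python sketch computes the share of non-human reads left uncharacterized under a reference-only, a de novo-only, and a combined scheme. It is a minimal sketch under stated assumptions, not the pipeline used in this study: the class `SampleReadCounts`, the functions `dark_matter_fraction` and `summarize`, all field names, and the toy counts are hypothetical, and it assumes that per-sample read counts, including the overlap between the two approaches, are already available.

```python
from dataclasses import dataclass


@dataclass
class SampleReadCounts:
    """Read counts for one sample. All field names are hypothetical."""
    non_human_reads: int   # reads remaining after human sequence is removed
    mapped_reference: int  # reads mapped to known reference genomes
    mapped_de_novo: int    # reads mapped to de novo assembled contigs
    mapped_both: int       # reads mapped by both approaches (included in both counts above)


def dark_matter_fraction(non_human_reads: int, characterized_reads: int) -> float:
    """Fraction of non-human reads that remain uncharacterized."""
    if non_human_reads <= 0:
        raise ValueError("non_human_reads must be positive")
    if not 0 <= characterized_reads <= non_human_reads:
        raise ValueError("characterized_reads must lie between 0 and non_human_reads")
    return 1.0 - characterized_reads / non_human_reads


def summarize(counts: SampleReadCounts) -> dict:
    """Uncharacterized fraction under each scheme for a single sample."""
    # Inclusion-exclusion: a read characterized by both approaches is counted once.
    hybrid = counts.mapped_reference + counts.mapped_de_novo - counts.mapped_both
    total = counts.non_human_reads
    return {
        "reference_only": dark_matter_fraction(total, counts.mapped_reference),
        "de_novo_only": dark_matter_fraction(total, counts.mapped_de_novo),
        "hybrid": dark_matter_fraction(total, hybrid),
    }


if __name__ == "__main__":
    # Toy numbers chosen for easy arithmetic; they are not data from this study.
    toy = SampleReadCounts(
        non_human_reads=1000,
        mapped_reference=500,
        mapped_de_novo=600,
        mapped_both=300,
    )
    for scheme, fraction in summarize(toy).items():
        print(f"{scheme}: {fraction:.2f}")
```

In this toy case the combined scheme leaves fewer reads uncharacterized than either approach alone because each approach captures some reads the other misses; the size of that gain depends entirely on how much the two sets of mapped reads overlap.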
RESULTS

A hybrid de novo and reference-based microbial community analysis

To address the significant uncharacterized sequence space (mean ± SD, 42% ± 24%) in our initial analysis of a 698-sample longitudinal skin metagenomic dataset (Supplementary Fig. 1), we used reference-independent approaches to reconstruct composition. With the improvement of de novo assembly algorithms to input large datasets [11], we concatenated our samples and assembled iteratively, resulting in 75% ± 19% of reads incorporated into the assemblies (Supplementary Fig. 2). The 1,037,465 resultant contigs >1 kb were then grouped into genome ‘bins’ based on co-abundance clustering and nucleotide composition. However, because this approach is limited in its ability to recover small genomes and low-abundance species, and to ascertain precise taxonomic classifications, we speculated that integrating reference-based analyses (Fig. 1A) would further reduce dark matter beyond the 33% ± 21% reduction observed by mapping reads to our de novo reference set only (Fig. 1B). Microbial reads unmappable to our de novo reference catalogue were annotated by mapping to a reference database of fungal, bacterial, and viral genomes. Reference-based and de novo classifications were integrated with a normalization step that took into consideration the total proportion of reads derived from each approach (Supplementary Fig. 2). While using de novo references significantly aided reconstruction of microbiota, our hybrid approach most considerably reduced the proportion of dark matter (16% ± 17%; Fig. 1B and C).

A new biodiversity of the human skin metagenome

Our new compositional analysis was largely concordant with our previous findings that the skin microbiome is dominated by Staphylococcus, Cutibacterium (formerly Propionibacterium), Corynebacterium, and Malassezia species [5,6]