Population Genetics in the Human Microbiome
Total Page:16
File Type:pdf, Size:1020Kb
Trends in Genetics Review Population Genetics in the Human Microbiome Nandita R. Garud1,* and Katherine S. Pollard2,3,4,* While the human microbiome’s structure and function have been extensively studied, its within- Highlights species genetic diversity is less well understood. However, genetic mutations in the microbiome Genetic variation in host-associ- can confer biomedically relevant traits, such as the ability to extract nutrients from food, metab- ated microbiomes can be assayed olize drugs, evade antibiotics, and communicate with the host immune system. The population in a high-throughput manner with a genetic processes by which these traits evolve are complex, in part due to interacting ecological variety of technologies. and evolutionary forces in the microbiome. Advances in metagenomic sequencing, coupled with Many bacterial species recombine bioinformatics tools and population genetic models, facilitate quantification of microbiome ge- extensively, although they asexu- netic variation and inferences about how this diversity arises, evolves, and correlates with traits ally reproduce. of both microbes and hosts. In this review, we explore the population genetic forces (mutation, recombination, drift, and selection) that shape microbiome genetic diversity within and between The genetic diversity of many spe- hosts, as well as efforts towards predictive models that leverage microbiome genetics. cies within and across hosts is spatially structured. A Population Genetic View of the Dynamic Microbiome Evidence for rapid adaptation The human microbiome comprises bacteria, archaea, viruses, and microbial eukaryotes living in our within hosts is starting to emerge. bodies. The taxonomic composition of these communities has been extensively studied and is signif- icantly associated with a variety of diseases and traits [1]. However, each species in the microbiome is Modeling efforts are connecting microbiome genetic variants with genetically heterogeneous, comprising individual cells whose genomes contain different mutations host phenotypes, highlighting the [2]. Widespread deployment of sequencing technologies (Box 1) has revealed that most microbiota biomedical importance of genetic harbor extensive genetic variation between hosts, within a host over time, and even within a host at a variationinthemicrobiome. giventime[2,3]. As in other species, this variation comprises single-nucleotide variants (SNVs) [2], short insertions and deletions (indels) [4], and larger structural variants (SVs) [5], which include dupli- cations, deletions, insertions, and inversions and can generate gene copy-number variants (CNVs) [6]. There has been substantial progress towards quantifying genetic diversity in the human microbiome [2,6–8]. By contrast, our knowledge of population genetic processes that shape the human microbiome is nascent. Population genetics is a discipline that makes statistical inferences about the evolutionary events that gave rise to patterns of genetic variation across individuals of the same species. The main processes that determine the fate of a new mutation are drift (see Glossary), selection,migra- tion (or transmission), and recombination [9]. These processes govern how new traits arise and spread among microbes, such as antibiotic resistance [10], drug side effects [11], pathogenic biofilm formation [12], and responses to diet, sanitation, and health status [13]. However, the relative contri- butions of these forces to microbiome diversity – as well as how they fluctuate over time and across microbiota, genes, and hosts with different phenotypes – remain to be fully understood. A population genetic view of the microbiome provides the opportunity to discover the genetic contri- bution to traits within microbial populations and to infer the processes creating and maintaining trait- associated genotypes. In this review, we focus on the population genetics of human-associated spe- cies, but note lessons learned from other environments. Much of the data discussed are from obser- 1Department of Ecology and Evolutionary Biology, University of California Los vational studies of natural populations, but we also discuss experimental methods (Box 2)andtheuse Angeles, Los Angeles, CA, USA of model systems to establish causality and to interpret inferences from observational studies of hu- 2Gladstone Institutes, San Francisco, CA, man-associated communities. USA 3Department of Epidemiology and Biostatistics, University of California San Evolution versus Ecology Francisco, San Francisco, CA, USA To what extent do microbiota respond to their environment by evolutionary versus ecological pro- 4Chan Zuckerberg Biohub, San Francisco, CA, USA cesses? Ecology is defined here as the presence or abundance of species or strains plus their inter- *Correspondence: actions with each other and their environment (e.g., strain replacements or shifts in species compo- [email protected], sition over time). Evolution, by contrast, refers to genetic modifications that accumulate on the same [email protected] Trends in Genetics, January 2020, Vol. 36, No. 1 https://doi.org/10.1016/j.tig.2019.10.010 53 ª 2019 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). Trends in Genetics Box 1. How Do We Obtain the Necessary Genomic Information to Study Population Genetics in the Mi- Glossary crobiome? m: mutation rate; measured in Genetic variants in the microbiome are assayed using a variety of approaches that vary on several dimensions units of number of mutations per including whether they require culturing, the number of species sampled concurrently, the length of the result- generation per locus. In bacteria, À ing haplotypes and reads, and cost. m is estimated to be 10 9 [20]. Biogeography: spatial structure in allele frequencies either along the Cultured Isolate Sequencing gut or across the Earth’s surface. Much of microbiome population genetics relies on identifying SNVs or SVs among sequenced and assembled Clonal interference: when two genomes from cultured isolates. Since variants are linked in assembled genomes, studies of recombination, genomes with identical fitness HGT, and nucleotide changes associated with evolution are feasible [34,35,76,117,127,149,152]. Throughput compete with each other until a and difficult-to-culture species are limitations, but targeted culturing efforts [58,153–156], multiplexed clear fitness difference arises via either a subsequent mutation or a sequencing [34], and microfluidics [157] are helping [158]. recombination event. Dispersal limitation: when the Metagenotyping range of migration or dispersal of a species is smaller than the over- Shotgun metagenomics is the sequencing of pooled DNA from microbial communities. By aligning reads to all global area in which the species genomes (or ‘pangenomes’ of all genes observed in a species) and applying statistical models, both SNVs and resides. SVs can be called [2,6,7,46,151,159]. Metagenotyping is relatively unbiased, does not require culturing, and dN/dS: ratio of nonsynonymous to can be used to study hundreds of species simultaneously for relatively low cost. However, alignment to a synonymous fixations between database misses novel genes and species, is sensitive to reads that align to multiple species or genes [5], and two lineages; dN/dS >1 indicates does not provide long-range linkage information, although probabilistic models can be used to computa- positive selection, < 1 indicates tionally phase variants and resolve short-range haplotypes for abundant species [33]. Metagenotyping is purifying selection, and 1in- distinct from methods that quantify gene or pathway abundance across all species from metagenomes [160]or dicates neutrality. use genes with covarying abundance to define metagenomic linkage groups as estimates of species Drift: change in population allele frequencies due to random [120,161,162]. sampling. Ecology: interactions between Metagenome-Assembled Genomes (MAGs) species and strains and their environment. Genomes can be recovered from metagenomes de novo using binning and assembly [120,161,162], which Evolution: a change in allele fre- allows the characterization of new species and variant discovery in both known and novel species [163–165]. quency in a population driven by a MAGs can rapidly expand the number of reference genomes and observed genetic diversity of a species, as for population genetic force such as Prevotella copri , which recently went from having one genome to >1000 genomes from four distinct clades mutation, drift, migration, recom- [54,133]. Chimeric and incomplete assemblies are limitations, especially for lower-abundance species. As- bination, or selection. sembly is improving with chromatin capture scaffolding [166] and longer reads. Improved methods for as- Fixation: the process by which an sessing MAG quality and completeness will be helpful. allele reaches 100% frequency in a population. Four-gamete test:testforrecom- Single-Cell and Read-Cloud Sequencing bination in which all four combi- Flow cytometry [167] and microfluidics [157,168] can be used to capture and barcode single cells, small nations of pairs of alleles at two numbers of cells, or small numbers of DNA fragments [169] from a sample without culturing. Coupled with loci are strong evidence for deep sequencing and specialized bioinformatics methods, these techniques assemble genome segments from recombination. F which genetic variants, including mobile