Exploring the application of ecological theory to the human gut microbiota using complex defined microbial communities as models

by

Kaitlyn Oliphant

A Thesis presented to The University of Guelph

In partial fulfillment of requirements for the degree of Doctor of Philosophy in Molecular and Cellular Biology

Guelph, Ontario, Canada

© Kaitlyn Oliphant, December, 2018

ABSTRACT

EXPLORING THE APPLICATION OF ECOLOGICAL THEORY TO THE HUMAN GUT MICROBIOTA USING COMPLEX DEFINED MICROBIAL COMMUNITIES AS MODELS

Kaitlyn Oliphant

University of Guelph, 2018

Advisor: Dr. Emma Allen-Vercoe

The ecosystem of microorganisms that inhabit the human gastrointestinal tract, termed the gut microbiota, critically maintains host homeostasis. Alterations in the species structure and metabolic behaviour of the gut microbiota are thus unsurprisingly exhibited in patients with gastrointestinal disorders when compared to the healthy population. Therefore, strategies that aim to remediate such gut microbiota through microbial supplementation have been attempted, with variable clinical success. Clearly, more knowledge of how to assemble a health-promoting gut microbiota is required, which could be drawn from the framework of ecological theory. Current theories suggest that the forces driving microbial community assembly include historical contingency, dispersal limitation, stochasticity and environmental selection. Environmental selection additionally encompasses habitat filtering, i.e., host-microbe interactions, and species assortment, i.e., microbe-microbe interactions. I propose to explore the application of this theory to the human gut microbiota, and I hypothesize that microbial ecological theory can be replicated utilizing complex defined microbial communities. To address my hypothesis, I first built upon existing methods to assess microbial community composition and behaviour, for example, marker gene sequencing and metabonomics, and then applied such tools to human fecal-derived defined microbial communities cultured in bioreactors. I determined that stochasticity is an important influencer of species structure within the gut microbiota, whereas dietary interventions greatly impacted the metabolic behaviour. Additionally, habitat filtering predominated over species assortment. This assertion was based on the lack of competitive exclusion observed when beneficial microbes were added to an ulcerative colitis-associated microbial community with and without prior antibiotic treatment. The few unique functionalities that were provided by these microbes upon integration are related to starch and mucin degradation. Also, there was no discernible overall difference between a microbial community and its non-coevolved, species-matched counterpart, in which each species was derived from a separate donor. However, the existence of species assortment was not precluded, since certain species that relied upon cross-feeding for polysaccharide utilization failed to integrate into the non-coevolved microbial community. Together, this work suggests that successful modulation of the gut microbiota would involve providing microbes as coevolved guilds that can colonize niches ubiquitous amongst the human population.

Acknowledgements

The completion of the work presented in this thesis would not have been possible without the mentorship, collaboration, consultation, instruction, funding and support I received from many individuals and institutions throughout my doctoral studies. In particular, I would like to acknowledge the following:

Ph.D. Supervisor: Dr. Emma Allen-Vercoe

Advisory Committee: Dr. Lucy Mutharia, Dr. Kari Dunfield, Dr. France-Isabelle Auzanneau

Dr. Emma Allen-Vercoe Laboratory Personnel: Dr. Julie McDonald, Mr. Ian Brown, Dr. Kathleen Schroeter, Ms. Erin Bolte, Dr. Mike Toh, Dr. Kyla Cochrane, Dr. Christian Carlucci, Dr. Rafael Peixoto, Dr. Valeria Parreira, Mr. Chris Ambrose, Ms. Michelle Daigneault, Ms. Sandi Yen, Ms. Avery Robinson, Ms. Caroline Ganobis, Ms. Simone Renwick, Mr. Jacob Wilde, Ms. Mbita Nakazwe, Mr. AJ Stirling, Mr. Joseph Ciufo, Mr. Keith Sherriff, Ms. Emily Mercer, Ms. Isra Hussein

Collaborators: Dr. Martin von Bergen, Dr. Elena Verdu, Dr. Oliver Kohlbacher

Dr. Martin von Bergen Laboratory Personnel: Dr. Dirk Wissenbach, Dr. Robert Starke, Dr. Nico Jehmlich, Dr. Ulrike Rolle-Kampczyk, Mr. Sven Haange, Ms. Stephanie Schäpe, Dr. Henrike Höke, Dr. Sven Baumann, Ms. Dominique Türkowsky, Mr. Hannes Petruschke, Mr. Geoffroy Saint-Genis, Mr. Patrick Lohmann, Dr. Matthias Bernt, Ms. Kathleen Eismann, Dr. Jean Froment, Ms. Oliva Pleßow

Consultants: Dr. Greg Gloor and Dr. Jean Macklaim, Dr. Marc Aucoin, Dr. Cezar Khursigara

Dr. Cezar Khursigara Laboratory Personnel: Ms. Sherise Charles, Ms. Mara Goodyear, Dr. Alison Berezuk, Mr. Mitch Demelo, Ms. Nicole Garnier, Ms. Erin Anderson, Ms. Madison Wright

Teaching Faculty: Dr. Wendy Keenleyside, Dr. Roselynn Stevenson, Dr. Lucy Mutharia (again), Ms. Debra Flett

Advanced Analysis Center Staff: Dr. Sameer Al-Abdul-Wahid, Mr. Jeff Gross

Funding Agencies: Canadian Institutes of Health Research, Ontario Ministry of Training, Colleges and Universities, Natural Sciences and Engineering Research Council of Canada, University of Guelph

Thank you.


Table of Contents

ABSTRACT
Acknowledgements
List of Tables
List of Figures
List of Abbreviations
Chapter 1 - Introduction
  1.1 The Human Gut Microbiota
    1.1.1 Strategies for modification of the human gut microbiota to enhance health
    1.1.2 Composition of the human gut microbiota
    1.1.3 Functions of the human gut microbiota
    1.1.4 Modulators of the human gut microbiota
  1.2 Ecological Theory and the Human Gut Microbiota
    1.2.1 Assembly
    1.2.2 Diversity and evolution of the human gut microbiota
  1.3 Study of the Human Gut Microbiota
    1.3.1 Model systems
    1.3.2 Gut microbial ecosystem analysis methods
  1.4 Overview of thesis work and overall hypothesis
Chapter 2 – 1H-NMR spectroscopy vs. LC-MS/MS
  2.1 Article Information
  2.2 Abstract
  2.3 Introduction
  2.4 Relevance for human health, potential for mechanistic insights, and feasibility define the strengths of model systems
  2.5 Metabolic interaction as a key feature of microbiome:host interaction
  2.6 NMR and MS
  2.7 Metabolomics detects many spectral features and 50 metabolites detected by NMR result in the same quality of group separation
  2.8 Targeted validation
  2.9 Perspectives for metabolic flux and for linking community activity to the composition of the consortium
  2.10 Conclusion
Chapter 3 – Protein-SIP utilizing Heavy Water
  3.1 Article Information
  3.2 Abstract
  3.3 Introduction
  3.4 Methods
    3.4.1 Validation of isotope detection and abiotic HD-exchange
    3.4.2 Growth of E. coli K12
    3.4.3 Bioreactor operation and batch culture
    3.4.4 Batch culture sample processing
    3.4.5 Direct infusion of D-labeled Angiotensin-II
    3.4.6 Sample preparation for proteomics
    3.4.7 Bioinformatics tool development
    3.4.8 Mass spectrometry and identification of stable isotope incorporation
    3.4.9 Assessment of microbial community composition
  3.5 Results
    3.5.1 Validation of isotope detection and activity measure with E. coli K12
    3.5.2 Incorporation of D and 18O into the metaproteome of a defined human fecal microbial community for the detection of active key players
  3.6 Discussion
Chapter 4 – Coevolution, Determinism & Stochasticity
  4.1 Article Information
  4.2 Abstract
  4.3 Introduction
  4.4 Methods
    4.4.1 Creation of defined microbial communities
    4.4.2 Bioreactor operation
    4.4.3 16S rRNA based compositional profiling
    4.4.4 1H-NMR based metabonomics
    4.4.5 Statistical analysis
  4.5 Results
    4.5.1 Determination of microbial community ‘steady-state’ stability and replicate reproducibility
    4.5.2 Microbial community response to dietary changes
    4.5.3 Effect of coevolution on microbial community structure and behaviour
  4.6 Discussion
Chapter 5 – Competition, Niche Processes & Redundancy
  5.1 Article Information
  5.2 Abstract
  5.3 Introduction
  5.4 Methods
    5.4.1 Bioreactor operation and defined microbial communities
    5.4.2 16S rRNA compositional profiling
    5.4.3 1H-NMR metabonomics
    5.4.4 Statistical analysis
    5.4.5 Functional and pathway analysis
  5.5 Results
    5.5.1 Distinct incorporated into rifaximin pretreated vs. untreated communities
    5.5.2 Metabonomic analysis of pretreated vs. untreated communities revealed differences in saccharolytic and proteolytic fermentation
    5.5.3 Predictive functional analysis suggested similarities in engraftment of allochthonous microbes into pretreated vs. untreated communities
  5.6 Discussion
Chapter 6 – Conclusions
Chapter 7 – References
Appendix
  A. Supplementary Figures
  B. Supplementary Tables


List of Tables

Table 1. Major genera present in the human gut microbiota and their metabolisms
Table 2. Major short-chain and branched-chain fatty acid products of fermentation
Table 3. Differential changes in mean concentration of metabolites after microbial replenishment by rifaximin
Table 4. Contribution to KEGG pathways by number of unique KEGG orthologies of each integrated allochthonous species


List of Figures

Figure 1. Definitions of microbial ecology terminology
Figure 2. Strategies of pyruvate catabolism
Figure 3. The four forces that drive assembly of the human gut microbiota
Figure 4. The graphical relationship between ecosystem functional efficiency and stability
Figure 5. Schematic of a single vessel bioreactor designed to replicate the human distal colon
Figure 6. Strengths and weaknesses of model systems
Figure 7. Metabolomics workflow
Figure 8. Identifications from global profiling and complementarity of targeted approaches
Figure 9. Global Profiling by Mass Spectrometry
Figure 10. Evaluation of NMR data
Figure 11. Validation by targeted MS Spectrometry and NMR
Figure 12. Linking metabolomic information with phylogenetic information
Figure 13. Validation of D- and 18O-incorporation
Figure 14. Relative abundances (RA) obtained from the 16S rRNA marker gene sequencing, metaproteomics and protein-SIP data
Figure 15. Difference in abundance of functional classes
Figure 16. Analysis of overall statistically significant differences in the 1H-NMR spectral binning data obtained for the control community
Figure 17. Concentrations of short-chain fatty acids determined by 1H-NMR metabolite profiling in bioreactor samples over time
Figure 18. Concentrations of select metabolites that were statistically significantly different between medium formulations or communities
Figure 19. Differential compositional changes in the ulcerative colitis associated microbial community after microbial replenishment by rifaximin
Figure 20. Differential unique KEGG orthologies contributed by integrated allochthonous microbes by rifaximin
Figure 21. Overall functional contribution of the integrated allochthonous microbes to the ulcerative colitis associated microbial community


List of Abbreviations

GI: Gastrointestinal
APC: Antigen presenting cell
IEC: Intestinal epithelial cell
IBD: Inflammatory bowel disease
IBS: Irritable bowel syndrome
FMT: Fecal microbiota transplantation
RCDI: Recurrent Clostridioides difficile infection
MET: Microbial ecosystem therapeutics
TLR: Toll-like receptor
CD: Crohn’s disease
SCFA: Short-chain fatty acid
BCFA: Branched-chain fatty acid
NGS: Next generation sequencing
GF: Germ-free
qRT-PCR: Quantitative real-time PCR
ddPCR: Droplet digital PCR
LC-MS/MS: Liquid chromatography tandem mass spectrometry
SIP: Stable-isotope probing
NMR: Nuclear magnetic resonance
GC-MS: Gas chromatography mass spectrometry
UC: Ulcerative colitis
DDA: Data dependent acquisition
PCA: Principal component analysis
PLS-DA: Partial least squares-discriminant analysis
SOM: Self-organizing map
RIA: Relative isotope abundance
LR: Labeling ratio
BSA: Bovine serum albumin
RA: Relative abundance
COG: Cluster of orthologous groups
ASV: Amplicon sequence variant
WTS: Wald-type statistic
MATS: Modified ANOVA-type statistic
PAM: Partitioning around medoids
ASW: Average silhouette width
ATS: ANOVA-type statistic
KO: KEGG orthology


Chapter 1 - Introduction

1.1 The Human Gut Microbiota

The human gut microbiota is a complex ecosystem of microorganisms that inhabits and critically maintains homeostasis of the gastrointestinal (GI) tract [1]. The myriad health benefits contributed both directly and indirectly by these microbes have garnered much attention in the research community over the past decade, even resulting in the reclassification of humans as ‘superorganisms’ [2]. Indeed, the contribution of microbially-derived genes to the total genome of the human superorganism is 500-fold larger than that of human-derived genes [3, 4], with the number of microbial cells at least equalling the number of mammalian cells [5]. The main contributions of the gut microbiota to host wellbeing are 1) nutrient and vitamin production through the degradation of otherwise indigestible food material and 2) acting as part of the innate immune system by both competitively excluding pathogens and interacting with host antigen presenting cells (APC)/intestinal epithelial cells (IEC) to regulate their response [1, 4, 6, 7]. Thus, it is no surprise that alterations in the gut microbiota are associated with a number of GI disorders, including inflammatory bowel disease (IBD), irritable bowel syndrome (IBS), celiac disease/allergies, diabetes, metabolic syndrome/obesity and antibiotic-associated infections [8–10]. The gut microbiota even has far-reaching implications for human health beyond the gut, as by-products of its fermentation cross the intestinal barrier and enter the bloodstream, giving rise to the so-called gut-liver [11], gut-kidney [12] and gut-brain [13] axes. Alterations in the gut microbiota are often described relative to the ‘normal population’ and may be in terms of the taxonomic structure and/or the functional capabilities/activity of the ecosystem. These patterns of deviation have been loosely and broadly designated as ‘dysbiosis’ [14]. It is not known in most cases if dysbiosis is a cause, an effect or merely an association of disease, but therapeutic interventions targeting the gut microbiota have been met with some success.

1.1.1 Strategies for modification of the human gut microbiota to enhance health

Historically, antibiotics or probiotics, i.e., ‘live microorganisms that, when administered in adequate amounts, confer a health benefit on the host’ (International Scientific Association for Probiotics and Prebiotics definition [15]), have been utilized [16, 17]. However, it is important to realize that the aim of these therapeutics is to remediate a dysfunctional ecosystem. Antibiotics may be able to diminish the numbers or downregulate the activity of opportunistic pathogens, but cannot restore the properties essential to a healthy gut microbiota that do not already exist in the underlying ecosystem. In the case of probiotics, typically only one to eight microbial strains are administered [17], which in the context of the 160+ bacterial species present in the GI tract [18] are unlikely to bestow a substantial benefit. Further, the majority of probiotic strains used to date have been relatively easy-to-culture food-associated microbes, such as Lactobacillus spp. or Bifidobacterium spp., and although several mechanisms for improving host wellness have been described for these taxa [17], the ecological parameters that are vital to disease prevention may not be provided by such species. Thus, a paradigm shift in medicine was necessary, from a ‘microbial warfare’- to a ‘parks management’-style approach [19], hallmarked by the first successful application of fecal microbiota transplantation (FMT).

FMT involves the transfer of fecal material from a healthy donor to a patient and has effectively cured cases of recurrent Clostridioides difficile infection (RCDI) [20]. Unfortunately, its use in other disorders has yielded more variable results, and several concerns have been raised regarding this treatment [21, 22]. In particular, its undefined nature, safety, ethical considerations with recruitment of donors and even the ‘ick’ factor have encumbered its value in the clinical setting. A treatment modality within the definition of a probiotic, yet with the efficacy of FMT, would be preferable, for example, a targeted assemblage of microbes with essential ecological factors. The potential of such a treatment, termed Microbial Ecosystem Therapeutics (MET), has already been demonstrated in a preliminary clinical trial which successfully cured two RCDI patients [23]. However, there are critical knowledge gaps that need to be addressed in order to refine such an approach and tailor it to specific diseases; for example, it is important to understand both what constitutes and what drives the assembly of a ‘healthy’ gut microbial ecosystem. This review will highlight our current knowledge of this topic by summarizing data on the composition, functions and modulators of the human gut microbiota, and will then describe the ecological theories and methodologies that can facilitate the study of this complex microbial ecosystem. As this review touches on many concepts central to the field of ecology, definitions of core ecological terminology are given for reference in Figure 1 and will be bolded at first mention. Finally, the central hypothesis and objectives of this thesis are laid out with respect to both the current information and these specific aims.


Figure 1. Definitions of microbial ecology terminology utilized throughout this review.

Community: A group of two or more species inhabiting the same geographical area within the same timeframe.
Ecosystem: A community of organisms and the abiotic environmental features of the geographical area that they inhabit.
Generalist: A species able to shift its metabolism in the presence of changing environmental conditions (e.g., substrate profiles) without sacrificing growth.
Specialist: A species limited to a narrow range of environmental conditions (e.g., substrate profiles) for optimal growth.
Primary Degrader: A species capable of extracting its own nutrients from the available food sources in their existing form (e.g., possesses glycoside hydrolases or proteases).
Cross-feeder: A species requiring microbial manipulation of the existing food sources (e.g., enzymatic degradation or fermentation) to utilize their nutrients.
Fitness: The ability of a species to survive and reproduce within an environment.
Niche: A role or position a species can have within an environment.
Succession: The process of acquiring and sequentially integrating species into an environment to form a community.
Resilience: The capacity of an ecosystem to recover from an environmental perturbation. The recovery is measured both in terms of completeness and rate, compositionally and behaviourally.
α-Diversity: The diversity of each site (i.e., local species pool). Diversity is calculated as a factor of richness (i.e., number of unique species) and evenness (i.e., relative equality of species abundance). A community with higher diversity is richer and/or more even.
β-Diversity: The difference in species composition among sites.
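The α- and β-diversity definitions in Figure 1 can be made concrete with a small calculation. The sketch below is illustrative only: Figure 1 does not name specific metrics, so the Shannon index, Pielou's evenness and the Bray-Curtis dissimilarity are assumed here as commonly used choices, and the species counts are hypothetical.

```python
import math


def richness(counts):
    """Number of unique species observed at a site (non-zero counts)."""
    return sum(1 for c in counts if c > 0)


def shannon_diversity(counts):
    """Shannon index H' = -sum(p_i * ln p_i), combining richness and evenness."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


def pielou_evenness(counts):
    """Evenness J' = H' / ln(S); 1.0 means all species are equally abundant."""
    s = richness(counts)
    if s <= 1:
        return 0.0
    return shannon_diversity(counts) / math.log(s)


def bray_curtis(site_a, site_b):
    """Bray-Curtis dissimilarity between two sites (same species order).

    0.0 means identical composition, 1.0 means no shared species.
    """
    numerator = sum(abs(a - b) for a, b in zip(site_a, site_b))
    denominator = sum(site_a) + sum(site_b)
    return numerator / denominator


# Hypothetical counts for four species at three sites.
even_site = [25, 25, 25, 25]    # rich and even
skewed_site = [97, 1, 1, 1]     # equally rich, but dominated by one species
sparse_site = [50, 50, 0, 0]    # even, but only two species

for name, site in [("even", even_site), ("skewed", skewed_site), ("sparse", sparse_site)]:
    print(name, richness(site), round(shannon_diversity(site), 3), round(pielou_evenness(site), 3))

print("beta (even vs sparse):", bray_curtis(even_site, sparse_site))
```

Under these assumptions, the even site scores highest on α-diversity because it is both as rich as and more even than the skewed site, which mirrors the statement that a community with higher diversity is richer and/or more even.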

1.1.2 Composition of the human gut microbiota

Microorganisms from all domains of life reside in the GI tract, including bacteria, archaea, fungi, other Eukarya (e.g., Blastocystis and Amoebozoa) and viruses [24, 25]. Bacteria, however, are by far the largest contributors to the functioning of the ecosystem in terms of relative genetic content [3]. They are also the most abundant and diverse of these microbes, apart from viruses [24, 26, 27]. The majority of viruses within the gut habitat are phages, and thus their main role is to modulate the bacterial population [25]. For these reasons, researchers have predominantly focused on bacteria when describing the composition and functioning of the gut microbiota. Because of the chronology of technological advances in ‘-omics’ methods to be discussed in section 1.3, the composition, or ‘who’s there’, of the gut microbiota has been more heavily studied than the functions, or ‘what they are doing’. The variation between the species composition of individuals’ gut microbiotas has been found to be vast, often likened to a fingerprint, whereas the metabolic modules present amongst the healthy are highly conserved [28]. Nevertheless, that does not mean the structure of a gut microbial ecosystem is entirely random, as trends in the occurrence of phylogenetic groups have not only been elucidated but have also been utilized to describe ‘dysbiosis’ [8–10, 24, 29]. The prominent bacterial taxa inhabiting the human colon are summarized in Table 1, which features representatives from the phyla Bacteroidetes, Firmicutes, Actinobacteria, Proteobacteria and Verrucomicrobia. The highlighted genera in this table represent a proposed ‘core microbiota’, based upon comparison of human fecal metagenomic data collected from several cohorts (n = 3948) [30].

Table 1. Major genera present in the human gut microbiota and their metabolisms. Taxa that are part of the core microbiota found by Falony et al. are in bold [30]. Those genera that were core components of exclusively the ‘Western’ cohorts are denoted with a ‘W’ superscript, whereas the exclusively ‘non-Western’ ones are denoted with a ‘NW’ superscript. If the core taxon could not be resolved to the genus level, the bacterial families are bolded. For the bacterial families that do not already contain a core genus, the most commonly described genus of the human gut microbiota for that family is listed as a representative. Additionally, genera found to be highly prevalent amongst the human population, yet typically present in low abundance, are underlined [31]. Finally, select genera are also included due to their discussion in this review. The possible substrates consumed, metabolisms and metabolites for each genus are listed. These metabolisms were inferred from the following articles [29, 32–62]. Note that many of these metabolisms are species specific, and only the substrates commonly utilized amongst species of the genus are listed. Further, only the most abundant metabolites produced from pyruvate catabolism (i.e., saccharolytic processes) are given. When a particular metabolic pathway is denoted with an ‘I’ superscript, the microorganisms do not possess the full enzymatic pathway, but rather produce the typical intermediate as an end-product instead. Likewise, an ‘I/A’ indicates species of that genus may possess either the full or half pathway.

PHYLUM | FAMILY | GENUS | SUBSTRATES | METABOLISM | END PRODUCTS
--- | --- | --- | --- | --- | ---
ARCHAEA: EURYARCHAEOTA | Methanobacteriaceae | Methanobrevibacter | Carbon dioxide and Hydrogen; Ethanol; Formate; Methanol | Methanogenesis | Methane
ACTINOBACTERIA | Bifidobacteriaceae | Bifidobacterium | Dietary carbohydrates; HMO; Mucin | Bifid Shunt Pathway | Acetate; Ethanol; Formate; Lactate
BACTEROIDETES | Bacteroidaceae | Bacteroides | Dietary carbohydrates; HMO; Mucin; Proteins; Succinate | 1,2-Propanediol Pathway^I; Acetate Production; Ethanol Production; Succinate Pathway | 1,2-Propanediol; Acetate; Carbon dioxide and Hydrogen; Ethanol; Formate; Propionate; Succinate
BACTEROIDETES | Porphyromonadaceae | Parabacteroides^W | Dietary carbohydrates; Proteins; Succinate | Acetate Production; Succinate Pathway | Acetate; Carbon dioxide; Hydrogen; Formate; Propionate; Succinate
BACTEROIDETES | Prevotellaceae | Prevotella^NW | Dietary carbohydrates; Proteins; Succinate | Acetate Production; Succinate Pathway^I/A | Acetate; Formate; Propionate; Succinate
BACTEROIDETES | Rikenellaceae | Alistipes^W | Dietary carbohydrates; Proteins; Succinate | Acetate Production; Succinate Pathway | Acetate; Carbon dioxide; Hydrogen; Formate; Propionate; Succinate
FIRMICUTES | Clostridiaceae | Clostridium (Clostridium cluster I) | Ethanol and Propionate; Lactate; Proteins; Saccharides | 1,2-Propanediol Pathway^I; Acetate Production; Acrylate Pathway; Butyrate Kinase Pathway; Ethanol Production; Lactate Production; Valerate Production | 1,2-Propanediol; Acetate; Carbon dioxide; Hydrogen; Ethanol; Formate; Lactate; Propionate; Butyrate; Valerate
FIRMICUTES | Eubacteriaceae | Eubacterium | Acetate; Carbon dioxide and Hydrogen; Formate; Lactate; Methanol; Proteins; Saccharides | Acetogenesis; Acetate Production; Butyryl CoA:Acetyl CoA Transferase; Ethanol Production; Lactate Production | Acetate; Butyrate; Carbon dioxide; Hydrogen; Ethanol; Formate; Lactate
FIRMICUTES | Erysipelotrichaceae | Erysipelatoclostridium | Proteins; Saccharides | Acetate Production; Lactate Production | Acetate; Carbon dioxide; Hydrogen; Formate; Lactate
FIRMICUTES | Lachnospiraceae | Blautia (Clostridium cluster XIVa) | 1,2-Propanediol; Carbon dioxide and Hydrogen; Dietary carbohydrates; Formate; Mucin | 1,2-Propanediol Pathway; Acetogenesis; Acetate Production; Ethanol Production; Lactate Production; Succinate Pathway^I | Acetate; Carbon dioxide; Hydrogen; Ethanol; Formate; Lactate; Propanol; Propionate; Succinate
FIRMICUTES | Lachnospiraceae | Coprococcus (Clostridium cluster XIVa) | Acetate; Dietary carbohydrates; Lactate | Acrylate Pathway; Butyrate Kinase Pathway; Butyryl CoA:Acetyl CoA Transferase; Ethanol Production; Lactate Production | Acetate; Butyrate; Ethanol; Carbon dioxide; Hydrogen; Formate; Lactate; Propionate
FIRMICUTES | Lachnospiraceae | Dorea (Clostridium cluster XIVa) | Dietary carbohydrates | Acetate Production; Ethanol Production; Lactate Production | Acetate; Carbon dioxide; Hydrogen; Ethanol; Formate; Lactate
FIRMICUTES | Lachnospiraceae | Lachnoclostridium (Clostridium cluster XIVa) | Proteins; Saccharides | Acetate Production; Butyrate Kinase Pathway; Ethanol Production; Lactate Production | Acetate; Butyrate; Carbon dioxide; Hydrogen; Ethanol; Formate; Lactate
FIRMICUTES | Lachnospiraceae | Roseburia (Clostridium cluster XIVa) | 1,2-Propanediol; Acetate; Dietary carbohydrates | 1,2-Propanediol Pathway; Acetate Production; Butyryl CoA:Acetyl CoA Transferase; Ethanol Production; Lactate Production | Acetate; Butyrate; Carbon dioxide; Hydrogen; Ethanol; Formate; Lactate; Propanol; Propionate
FIRMICUTES | Lactobacillaceae | Lactobacillus | 1,2-Propanediol; Saccharides | 1,2-Propanediol Pathway; Acetate Production; Ethanol Production; Lactate Production | Acetate; Ethanol; Formate; Lactate; Propanol; Propionate
FIRMICUTES | Peptostreptococcaceae | Clostridioides (Clostridium cluster XI) | Proteins; Saccharides; Succinate | Acetate Production; Butyrate Kinase Pathway; Ethanol Production; Lactate Production | Acetate; Butyrate; Carbon dioxide; Hydrogen; Ethanol; Formate; Lactate
FIRMICUTES | Ruminococcaceae | Faecalibacterium (Clostridium cluster IV) | Acetate | Butyryl CoA:Acetyl CoA Transferase | Butyrate; Carbon dioxide; Hydrogen; Formate
FIRMICUTES | Ruminococcaceae | Ruminiclostridium^W (specifically Clostridium cluster IV, which is currently grouped with Clostridium cluster III) | Dietary carbohydrates; Proteins | Acetate Production; Butyrate Kinase Pathway; Ethanol Production; Lactate Production | Acetate; Butyrate; Carbon dioxide; Hydrogen; Ethanol; Formate; Lactate
FIRMICUTES | Ruminococcaceae | Ruminococcus (Clostridium cluster IV) | Dietary carbohydrates | Acetate Production; Ethanol Production; Lactate Production; Succinate Pathway^I | Acetate; Ethanol; Formate; Lactate; Succinate
FIRMICUTES | Streptococcaceae | Streptococcus^NW | Mucin; Saccharides | Acetate Production; Ethanol Production; Lactate Production | Acetate; Ethanol; Formate; Lactate
FIRMICUTES | Veillonellaceae | Veillonella | 1,2-Propanediol; Lactate; Proteins; Saccharides; Succinate | 1,2-Propanediol Pathway; Acetate Production; Lactate Production; Succinate Pathway | Acetate; Carbon dioxide; Hydrogen; Formate; Lactate; Propanol; Propionate; Succinate
PROTEOBACTERIA | Desulfovibrionaceae | Desulfovibrio | Sulfate and Hydrogen; Lactate | Sulfate reducing | Acetate; Carbon dioxide; Hydrogen; Formate; Hydrogen sulfide
PROTEOBACTERIA | Enterobacteriaceae | Escherichia | Proteins; Saccharides | 1,2-Propanediol Pathway^I; 2,3-Butanediol; Acetate Production; Ethanol Production; Lactate Production; Succinate Pathway^I | 1,2-Propanediol; 2,3-Butanediol; Acetate; Carbon dioxide; Hydrogen; Ethanol; Formate; Lactate; Succinate
VERRUCOMICROBIA | Akkermansiaceae | Akkermansia | Mucin; Succinate | Acetate Production; Ethanol Production; Succinate Pathway | Acetate; Carbon dioxide; Hydrogen; Ethanol; Formate; Propionate; Succinate

The Bacteroidetes and Firmicutes are the most abundant phyla present within the human gut microbial ecosystem, together averaging 90% of the composition [8]. However, the relative contribution of each phylum to this value, often expressed as the Firmicutes/Bacteroidetes ratio, varies dramatically between individuals [9, 28, 63]. The utilization of this ratio as an indicator of dysbiosis was based upon early work indicating that it decreased in IBD patients and increased in obese individuals when compared to controls [3]. However, later studies yielded conflicting results [63, 64]. Although species from the Bacteroidetes phylum that inhabit the GI tract span only a few bacterial families, the representation from the Firmicutes phylum is considerably more diverse (Table 1; ref [24]). Firmicutes are thus often referenced by their Clostridium clusters or by bacterial families as relevant groupings. Clostridium clusters IV (Clostridium leptum group - Ruminococcaceae) and XIVa (Clostridium coccoides group - Lachnospiraceae) are the dominant groups, whereas opportunistic pathogens in Clostridium clusters I (e.g., Clostridium perfringens - Clostridiaceae) and XI (e.g., Clostridioides difficile - Peptostreptococcaceae) are only minor constituents, if present [29].
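As a brief illustration of why the combined share of the two phyla says little about an individual, the Firmicutes/Bacteroidetes ratio can be computed directly from phylum-level relative abundances. The sketch below is a minimal example under stated assumptions: the function name and both abundance profiles are hypothetical and are not taken from any cohort discussed here.

```python
def firmicutes_bacteroidetes_ratio(phylum_abundance):
    """Return the Firmicutes/Bacteroidetes ratio from relative abundances.

    `phylum_abundance` maps phylum names to relative abundances (fractions
    of the whole community). Raises ValueError if Bacteroidetes is absent,
    since the ratio is then undefined.
    """
    firmicutes = phylum_abundance.get("Firmicutes", 0.0)
    bacteroidetes = phylum_abundance.get("Bacteroidetes", 0.0)
    if bacteroidetes == 0:
        raise ValueError("Bacteroidetes abundance is zero; ratio is undefined")
    return firmicutes / bacteroidetes


# Two hypothetical individuals: in both, the two phyla together make up
# 90% of the community, yet the ratios differ four-fold.
individual_a = {"Firmicutes": 0.60, "Bacteroidetes": 0.30, "Other": 0.10}
individual_b = {"Firmicutes": 0.30, "Bacteroidetes": 0.60, "Other": 0.10}

ratio_a = firmicutes_bacteroidetes_ratio(individual_a)
ratio_b = firmicutes_bacteroidetes_ratio(individual_b)
print(f"Individual A: {ratio_a:.2f}")
print(f"Individual B: {ratio_b:.2f}")
```

Because the ratio collapses the whole community into a single number at the phylum level, it cannot distinguish which families or genera within each phylum have shifted, which is one plausible reason it performs inconsistently as a marker.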

1.1.2.1 Ratios of human gut microbiota members in health and disease

Elevation of Bacteroidetes relative abundance is associated with several inflammatory disorders, including celiac disease, type II diabetes and colitis [9, 65]. As Bacteroidetes is a phylum of Gram-negative bacteria, antigenic lipopolysaccharides are present on their outer membranes, which in turn are known to interact with toll-like receptor (TLR) 4, a pattern recognition receptor of innate immune cells [6, 65]. Therefore, greater amounts of Gram-negative bacteria could contribute to an overactive immune response in the host, with inflammatory events that are more frequent, longer lasting and more severe. Indeed, Bacteroidetes were shown to be necessary to trigger colitis in mice deficient in IL-10 or the inflammasome component NLRP6 [66, 67]. Firmicutes are Gram-positive bacteria, and species from Clostridium clusters IV and XIVa can produce butyrate, an anti-inflammatory compound [67, 68]. Dysbiosis is thus unsurprisingly often characterized for many of these GI disorders, especially IBD, by a specific deficit of these Firmicutes clusters [64, 69]. On the other hand, a reduction of the relative abundance of Bacteroidetes within the gut ecosystem is associated with metabolic syndrome [70, 71]. Bacteroidetes are known to be both primary degraders and complex polysaccharide utilizing generalists [72, 73] that possess an impressive array of carbohydrate-active enzymes (CAZymes) and proteases [54, 74–76]. Thus, the capacity of Bacteroidetes spp. to break down and take up substrates contributes to their ubiquity and survivability in the gut environment, but comes at the cost of lower efficiency compared to their Firmicutes counterparts. The Firmicutes are generally specialists [72, 73], acting either as primary degraders or cross-feeders, and are collectively capable of complete metabolism of saccharides and/or amino acids with recycling of their fermentation by-products (Table 1). In turn, their increased digestive capacity produces a greater availability of calories to the host [70, 71, 77].

Upon closer inspection, contradictions do, however, arise. For example, polysaccharide A present on the outer membrane surface of Bacteroides fragilis can interact with TLR2 and induce IL-10 production by intestinal T cells, which downregulates the immune response [78]. Studies on the compositional associations of both Celiac disease and Crohn’s Disease (CD – a subtype of IBD) patients’ gut microbiota have thus unsurprisingly yielded specific alterations in the profile of Bacteroides spp. [9]. Additionally, lower overall proportions of Bacteroidetes are conversely associated with IBD, but there is also a concomitant large increase in the abundance of another Gram-negative phylum, Proteobacteria [64]. Certainly, an increase in Proteobacteria is a commonly documented descriptor of dysbiosis, in which its dominant family in the GI tract, Enterobacteriaceae, encompasses many opportunistic pathogens, such as Escherichia coli/Shigella spp. and Salmonella spp. [79]. Likewise, other members of Clostridium cluster XIVa are actually positively associated with GI disorders, particularly the non-butyrate producing species, such as Ruminococcus gnavus in IBD [80]. Ru. gnavus is a mucolytic bacterium, and elevated abundances are thought to erode the protective mucus lining of the gut at a disproportionate rate [81]. This activity increases the exposure of the epithelium to the gut microbiota, inducing more events of inflammation [82].

In terms of metabolic syndrome, the differences in response to dietary fibers by the two most prevalent genera in the human gut microbiota, Bacteroides and Prevotella, are worth considering. Both genera are members of the Bacteroidetes phylum and have been synonymous with the concept of ‘enterotypes’. Early work had suggested that individuals could be divided based upon the composition of their gut microbiota into one of three enterotypes, characterized by high relative abundance of Bacteroides, Prevotella or Ruminococcaceae, respectively [83]. It was later determined that the Ruminococcaceae enterotype was fused with the Bacteroides enterotype, and that the two resulting enterotypes were associated with long-term dietary patterns. Individuals consuming a high fat, low fiber diet typical of ‘Western society’ were more likely to be of the Bacteroides enterotype, whereas individuals consuming a low fat, high fiber diet typical of ‘Agrarian society’ were more likely to be of the Prevotella enterotype [30, 84]. The concept has since been hotly disputed, with more recent studies suggesting that a gradient rather than fixed categories exists, and that previous results were an artifact of the sequencing technology, which is biased towards highly abundant taxa [85]. Nonetheless, the dietary associations remain, and Prevotella spp. have since been shown to utilize fiber to a higher capacity within a fecal-associated community than Bacteroides spp., as evidenced by accumulation of their major fermentation by-product, propionate, in combination with lower production of acetate and butyrate than is typically found in a Firmicutes-dominated ecosystem [86]. Further, obese individuals who harboured a higher proportion of Prevotella within their GI tract showed greater weight loss on a high fiber diet than individuals who harboured more Bacteroides [87]. Whether Prevotella spp. are simply more efficient or less permissive to cross-feeding than Bacteroides spp. has yet to be elucidated. However, this work highlights that the constitution of the Bacteroidetes phylum influences the availability of substrates to the Firmicutes, thus modulating the impact of the latter on the host.

Another factor found to promote weight loss is the stimulation of the farnesoid X receptor present on IECs, which regulates genes involved in cholesterol and triglyceride metabolism [88–90]. The receptor is activated by hydrophobic primary and secondary bile acids, which are produced via enzymatic degradation by the gut microbiota of the conjugated bile acids released from the gallbladder [90]. The initial deconjugation step is completed by bile salt hydrolases, and a metagenomic analysis of this enzyme activity revealed that the majority of fosmid clones were attributed to the Firmicutes [90, 91]. The Firmicutes also possessed clones capable of breaking down both types of conjugated bile acids, glycine (most common in humans) and taurine, whereas the Bacteroidetes clones only exhibited activity against the latter [90, 91]. These factors certainly blur the lines of an adequate Firmicutes/Bacteroidetes ratio.

In summary, these observations highlight three major problems in composition-based studies: 1) sequencing technology is of low sensitivity, biasing against taxa in small amounts that could be contributing important functionalities as discriminating factors [85]; 2) the inherently relative nature of sequencing data raises the question of whether a particular taxon has truly decreased in abundance or whether other taxa have instead increased in abundance [92]; and 3) determining trends even at the highest taxonomic resolution of species still allows for considerable doubt of functionality, due to the high dissimilarity between strains; for example, only 20% of the pan-genome of E. coli is thought to be shared by all strains [93]. Therefore, an emerging and critical next step in the field has been the move from using ‘groups of species’ to ‘groups of functions’ to better describe dysbiosis.
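The second problem can be illustrated with a toy calculation. The sketch below is only an illustration: the phylum names are borrowed from the text, but the counts are hypothetical and are not data from any cited study. It shows that when a single taxon blooms in absolute terms, the relative abundances of all other taxa fall even though their absolute counts are unchanged, while a ratio between two unaffected taxa stays the same.

```python
def relative_abundance(counts):
    """Convert absolute counts per taxon into proportions that sum to 1."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}

# Hypothetical absolute cell counts; illustrative values only.
before = {"Bacteroidetes": 400, "Firmicutes": 500, "Proteobacteria": 100}
# Only Proteobacteria changes (a bloom); the other phyla keep the same counts.
after = {"Bacteroidetes": 400, "Firmicutes": 500, "Proteobacteria": 1100}

rel_before = relative_abundance(before)
rel_after = relative_abundance(after)

# Bacteroidetes and Firmicutes appear to decrease although their counts are unchanged.
for taxon in before:
    print(f"{taxon}: {rel_before[taxon]:.2f} -> {rel_after[taxon]:.2f}")

# The Firmicutes/Bacteroidetes ratio is unaffected, as both are rescaled by the same total.
fb_before = rel_before["Firmicutes"] / rel_before["Bacteroidetes"]
fb_after = rel_after["Firmicutes"] / rel_after["Bacteroidetes"]
print(f"F/B ratio: {fb_before:.2f} -> {fb_after:.2f}")
```

A ratio between two taxa is therefore robust to changes in a third taxon, but it cannot reveal which of the two taxa changed in absolute terms.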

1.1.3 Functions of the human gut microbiota

Most of the contributions made by the gut microbiota to the physiology of the human superorganism are related to microbial metabolism [3, 7, 28], and thus a discussion of such metabolism will form a central part of this review. In general, microbial metabolism of both exogenous and endogenous substrates to nutrients usable by the host is the direct benefit, but metabolites can also act to modulate the immune system through impacting the physiology and gene expression of host cells [6, 7, 65]. The presence of diverse metabolic activity can allow the microbiota to maximally fill the available ecological niches and competitively inhibit colonization by pathogens [10, 94, 95]. Further, the elevated concentrations of the mostly acidic fermentation by-products reduce the pH to create a more inhospitable environment for these incoming invaders [95]. The utilization of these nutrients by IECs also promotes the integrity of these cells and can induce downregulation of the inflammatory response [68, 96, 97]. However, certain fermentation pathways carried out by gut bacteria can result in the formation of toxic compounds that have the potential to damage the host epithelium and cause inflammation [98–100]. Certain microbes within the human gut also have the capacity to degrade endogenous substrates: mucin, the degradation of which increases the exposure of the microbiota to pattern recognition receptors [81, 82], and bile acids, the by-products of which activate the farnesoid X receptor, resulting in the inhibition of the production of pro-inflammatory cytokines [101, 102].

The three macronutrients predominantly consumed in the human diet, carbohydrates, proteins, and fat, can reach the colon when in complex, unrefined forms that resist primary digestion in the stomach and small intestine [63, 103, 104]. The bioavailability of micronutrients can also be influenced by the gut microbiota; for example, exogenous plant derived polyphenols that have antioxidant, anti-cancer and/or anti-inflammatory properties can be biotransformed by the gut microbiota, which improves their uptake by the host [105]. Additionally, the gut microbiota can endogenously synthesize essential co-factors, such as B vitamins [106]. However, an extensive discussion of these processes is beyond the scope of this review, and thus only the predominant food sources that act as precursors for the most highly concentrated metabolites will be considered. As typically only a very small proportion of dietary fat reaches the colon [103, 104] and microorganisms are not thought to have the ability to catabolize free lipids in the anaerobic environment of the gut [107], the focus will be on the microbial metabolism of carbohydrates and proteins.

1.1.3.1 Fermentation

Dietary polysaccharides can be interlinked in complex ways through a diverse array of bonds between monosaccharide units, reflected by the sheer number of CAZymes reported to have been found in the human gut microbiome [74]. For example, Bacteroides thetaiotaomicron possesses 260 glycoside hydrolases in its genome alone [75], which emphasizes the evolutionary requirement for adaptation in order to maximize utilization of the assortment of fibers available in the human diet. However, once the rate limiting step of primary degradation is surpassed, the resulting monosaccharides can be rapidly consumed by the gut microbiota, with often little interconversion necessary for substrates to enter the Embden-Meyerhof-Parnas Pathway, Entner-Doudoroff Pathway or Pentose Phosphate Pathway for pyruvate and subsequent ATP production (Table 1; ref [108]). Conversely, dietary proteins are characterized by conserved peptide bonds that can be broken down by proteases; gut bacteria can produce aspartic, cysteine, serine and metalloproteases, but in a typical fecal sample these bacterial enzymes are far outnumbered by proteases arising from human cells [76]. However, the 20 proteinogenic amino acid building blocks require more interconversion steps for incorporation into biochemical pathways, in comparison to monosaccharide units, and thus it is not typical for a given gut microbial species to have the capacity to ferment all amino acids to produce energy [109]. Additionally, incorporating amino acids from the environment into anabolic processes would further conserve energy in comparison to when they are used catabolically, by relieving the necessity for amino acid biosynthesis [99]. Therefore, amino acids are generally not considered to be as efficient an energy source as carbohydrates for human gut-associated microbes. It is thus no surprise that the gut microbiota preferentially consume carbohydrates over proteins depending on the ratio presented to them [54, 110]. However, there are notable exceptions to this general rule, as certain species of bacteria have adopted an asaccharolytic lifestyle, likely as a strategy to evade competition (Table 1).

1.1.3.2 Pyruvate catabolism

Once pyruvate is produced, the human gut microbiota can use one of several fermentation strategies to generate further energy, which are depicted in Figure 2. Pyruvate can be catabolized into succinate, lactate or acetyl-CoA. However, these intermediates do not reach high concentrations in typical fecal samples, as they can be further metabolized by cross-feeders, producing the short-chain fatty acids (SCFAs) acetate, propionate and butyrate [56]. These fecal metabolites are the most abundant and well-studied microbial end-products, due to host IECs utilizing them as a source of fuel [111]. Indeed, SCFAs contribute approximately 10% of the caloric content required by the human body [96]. Butyrate is the most preferred, and its consumption improves the integrity of IECs by promoting tight junctions, cell proliferation and increased mucin production by goblet cells [96, 97]. Butyrate also exhibits anti-inflammatory effects, through stimulating both IECs and APCs to produce the cytokines TGFβ, IL-10 and IL-18, and inducing the differentiation of naive T cells to T regulatory cells [68]. Acetate and propionate can also be consumed by IECs (though to a much lesser degree than butyrate) and have some anti-inflammatory effects [56, 96]. Both acetate and propionate can dampen pro-inflammatory cytokine production mediated by TLR4 stimulation, and propionate, similar to butyrate, can induce the differentiation of T cells to T regulatory cells [36, 56]. Excess SCFAs that are not metabolized by IECs are transported via the hepatic portal vein to the liver, where they can be incorporated as precursors into gluconeogenesis, lipogenesis and cholesterol synthesis [111]. Specifically, propionate is gluconeogenic, whereas acetate and butyrate are lipogenic. The ratio of propionate to acetate is thought to be particularly important, as propionate can inhibit the conversion of acetate to cholesterol and fat [111, 112]. Indeed, propionate administration alone can reduce visceral and liver fat in obese humans [113]. The role of SCFAs in glucose homeostasis is not yet fully elucidated, although preliminary work has additionally suggested a beneficial effect [111]. In addition to SCFAs, small but significant amounts of alcohols, including ethanol, propanol and 2,3-butanediol, can be end-products of pyruvate fermentation (Figure 2). A further alcohol, methanol, is also produced by the gut microbiota as a result of pectin degradation, demethylation of proteins for regulation, or vitamin B12 synthesis [114], rather than fermentation. Alcohols are transported to the liver, where the detoxification process involves their conversion to SCFAs, although through pathways that yield toxic aldehydes as intermediates [114–116]. Higher concentrations of endogenous alcohols are thus thought to be a contributing factor to the development of non-alcoholic fatty liver disease (NAFLD) [115, 117]. Proteobacteria are known to be particularly capable of alcohol generation [114, 117], and are, interestingly, positively associated with dysbiosis in IBD [64], a disease in which patients are predisposed to developing NAFLD [118]. However, alcohols can also be detoxified by many members of the gut microbiota via the same pathway present in mammalian cells, regulating their concentration [114]. Additionally, methanol can be used as a substrate for methanogenesis or acetogenesis [62, 114, 119], and ethanol can be coupled to propionate for fermentation to valerate [32]. Valerate is a poorly studied metabolite, but it has been shown to inhibit growth of cancerous cells [120] and to prevent vegetative growth of Cl. difficile both in vitro and in vivo [32].

Figure 2. Strategies of pyruvate catabolism, with pathways of short-chain fatty acid production colour coded. Carbohydrates are first degraded to pyruvate. Pyruvate may then be converted to succinate, lactate, acetyl CoA + formate/carbon dioxide + hydrogen, ethanol, or 2,3-butanediol. Succinate may, however, also be a direct product of carbohydrate fermentation. Succinate and lactate do not typically reach high concentrations in fecal samples, as they can be further catabolized to produce energy, but certain species do secrete them as their final fermentation end-product, which enables cross-feeding. Acetate is produced by two pathways; 1) through direct conversion of acetyl CoA for the generation of energy (brown) or 2) acetogenesis (red). Formate/carbon dioxide + hydrogen can also be substrates for methanogenesis. Propionate is produced by three pathways, 1) the succinate pathway (orange), 2) the acrylate pathway (green) or 3) the 1,2-propanediol pathway (blue). 1,2-Propanediol is synthesized from lactaldehyde or dihydroxyacetone phosphate, which both are products of deoxy sugar fermentation (e.g., fucose, rhamnose). Alternatively, lactaldehyde can be produced from lactate, or 1,2-propanediol can be fermented to propanol. Propionate can be coupled with ethanol for fermentation to valerate (grey). The precursor for butyrate, butyryl CoA, is generated from either acetyl CoA or succinate. Butyrate is then produced by two pathways; 1) the butyrate kinase pathway (pink) or 2) the butyryl CoA:acetyl CoA transferase pathway (purple). Butyrate-producing bacteria may also cross-feed on lactate, converting it back to pyruvate. Lactate may also be catabolized as part of sulfate reduction. The figure was constructed from information provided in the following articles [32, 52, 53, 115, 121, 122] and with consultation of the KEGG database [119]. The figure was generated by the program ChemDraw Prime version 17.

1.1.3.3 Hydrogenotrophy

The human body may rapidly take up SCFAs and alcohols; however, the other fermentation by-products, carbon dioxide and hydrogen, must also be utilized by cross-feeders within the gut microbiota to continue favorable reaction kinetics [63, 123]. Three main strategies for this activity exist in the human gut: 1) acetogens convert carbon dioxide plus hydrogen to acetate, 2) methanogens convert carbon dioxide plus hydrogen to methane, and 3) sulfate reducing bacteria convert sulfate plus hydrogen to hydrogen sulfide [63]. A higher abundance of these cross-feeders may improve the overall efficiency of metabolism in the gut; for example, an increase in methanogens is observed in the GI tract of anorexia nervosa patients, which may be a coping strategy by the gut microbiota in response to a lack of food sources [124, 125]. Sulfate reducing bacteria are the most efficient of the hydrogenotrophs but require a source of sulfate; in the gut, the most prominent source of sulfate is sulfated glycans [126]. Although some of these glycans may be obtained from the diet, the most accessible source would be mucin produced by the host [47]. Sulfate reducing bacteria obtain sulfate from these substrates via cross-feeding with microbes such as Bacteroides, which produce sulfatases [126, 127]. Hydrogen sulfide is not only directly toxic to IECs through inhibition of mitochondrial cytochrome c oxidase, but is also pro-inflammatory via activation of Th17 cells [128, 129]. Hydrogen sulfide can additionally act directly on disulfide bonds in mucin to further facilitate mucin degradation [130]. Both elevated hydrogen sulfide concentrations and increased proportions of sulfate reducing bacteria are thus, perhaps unsurprisingly, reported in IBD [131].
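For reference, the three hydrogenotrophic strategies can be written as overall reactions. The equations below are the standard textbook net stoichiometries and are not taken from the references cited above; they are given only to make explicit that each strategy consumes four molecules of hydrogen per molecule of end-product.

```latex
\begin{align*}
\text{Acetogenesis:}      \quad & 4\,\mathrm{H_2} + 2\,\mathrm{CO_2} \longrightarrow \mathrm{CH_3COOH} + 2\,\mathrm{H_2O} \\
\text{Methanogenesis:}    \quad & 4\,\mathrm{H_2} + \mathrm{CO_2} \longrightarrow \mathrm{CH_4} + 2\,\mathrm{H_2O} \\
\text{Sulfate reduction:} \quad & 4\,\mathrm{H_2} + \mathrm{SO_4^{2-}} + 2\,\mathrm{H^+} \longrightarrow \mathrm{H_2S} + 4\,\mathrm{H_2O}
\end{align*}
```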

1.1.3.4 Degradation of amino acids

The extra steps of interconversion required for amino acid fermentation yield a large number of by-products. Protein catabolism in the gut generally has a negative connotation, as compounds that are cytotoxic, inflammatory and/or neuroactive can result from this process, including amines, phenols/indoles and sulfurous compounds [98–100]. However, it is important to note that not all amino acids are fermented to toxic products as a result of gut microbial activity; in fact, the most abundant end-products are SCFAs (Table 2) [99, 100]. Therefore, it may not be protein catabolism per se that negatively impacts the host, but instead specific metabolisms or overall increased protein fermentation activity. It is thus important to examine these subtleties. A microbe can exhibit one of two strategies for the initial step of amino acid catabolism, either deamination to produce a carboxylic acid plus ammonia or decarboxylation to produce an amine plus carbon dioxide [98]. Ammonia can inhibit mitochondrial oxygen consumption and decrease SCFA catabolism by IECs, which has led to the assumption that excess ammonia production can negatively impact the host [132–134]. However, the gut microbiota also rapidly assimilates ammonia into microbial amino acid biosynthetic processes [99], and host IECs can additionally control ammonia concentration through conversion to citrulline and glutamine or through slow release into the bloodstream [135, 136]. It is thus unclear how much protein catabolism is necessary to achieve toxic ammonia concentrations, and this may vary between hosts. This uncertainty, coupled with the multiple negative impacts amines can have on the host, has led to speculation that deamination would improve host outcomes. Fortunately, deamination appears to be the more common strategy of amino acid catabolism by the gut microbiota, given that high concentrations of SCFAs can be produced from amino acid degradation [98, 99]. The next steps depend on the class of amino acid, with most eventually resulting in tricarboxylic acid cycle intermediates, pyruvate or coenzyme A-linked SCFA precursors [53, 119]. An exception would be the series of Stickland reactions exhibited by certain Clostridium spp., in which a coupled oxidation and reduction of two amino acids occurs instead of using hydrogen ions as the electron acceptor [33, 34]. Phosphate is simultaneously added to the reduced amino acid in this case, and thus substrate-level phosphorylation for the production of adenosine triphosphate (ATP) can occur directly from the resultant acyl phosphate. In turn, branched-chain fatty acids (BCFAs), such as isovalerate and isobutyrate, can be produced as end-products. Additionally, some gut microbial species also possess a specialized branched-chain keto acid dehydrogenase complex to yield energy from the oxidized forms of the branched-chain amino acids directly, which also leads to BCFA production [99]. BCFAs are often used as a biomarker of protein catabolism, with the goal to reduce their concentration in order to improve health outcomes [100]. However, little is actually known about the impact of BCFAs on host health. In fact, preliminary work has shown that BCFAs are able to modulate glucose and lipid metabolism in the liver similarly to SCFAs [137], and isobutyrate can be used as a fuel source by IECs when butyrate is scarce [138].
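The two initial strategies of amino acid catabolism, and the coupled Stickland reaction, can be summarized schematically. The equations below are generic textbook forms and are not drawn from the references cited above: R denotes an arbitrary side chain, [H] denotes reducing equivalents, the deamination shown is the reductive variant (one of several possible routes), and the alanine/glycine pair is the classic Stickland example, chosen here only for illustration.

```latex
\begin{align*}
\text{Decarboxylation:} \quad & \mathrm{R{-}CH(NH_2){-}COOH} \longrightarrow \mathrm{R{-}CH_2{-}NH_2} + \mathrm{CO_2} \\
\text{Reductive deamination:} \quad & \mathrm{R{-}CH(NH_2){-}COOH} + 2\,[\mathrm{H}] \longrightarrow \mathrm{R{-}CH_2{-}COOH} + \mathrm{NH_3} \\
\text{Stickland (net):} \quad & \text{alanine} + 2\,\text{glycine} + 2\,\mathrm{H_2O} \longrightarrow 3\,\text{acetate} + \mathrm{CO_2} + 3\,\mathrm{NH_3}
\end{align*}
```

In the Stickland example, alanine serves as the electron donor (it is oxidized) and glycine as the electron acceptor (it is reduced), so no hydrogen gas needs to be released.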

Table 2. Major short-chain and branched-chain fatty acid products of amino acid fermentation. Listed are the compounds found to be above 1 mM concentration in in vitro fermentation experiments conducted by Smith and Macfarlane [139], in addition to the biogenic amines that can be produced by decarboxylation [98, 99]. Underlined are the products indicated as most abundant as reported in a review article by Fan et al. [98].

AMINO ACID       AMINO ACID CLASS     MAJOR PRODUCTS
ASPARTATE        Acidic               Propionate
GLUTAMATE        Acidic               Acetate, Butyrate
ALANINE          Aliphatic            Acetate, Propionate, Butyrate
GLYCINE          Aliphatic            Acetate, Methylamine
ISOLEUCINE       Aliphatic            2-Methylbutyrate or converted to Valine
LEUCINE          Aliphatic            Isovalerate
PROLINE          Aliphatic            Acetate
VALINE           Aliphatic            Isobutyrate
ASPARAGINE       Amidic               Converted to Aspartate
GLUTAMINE        Amidic               Converted to Glutamate
PHENYLALANINE    Aromatic             Phenolic SCFA, Phenylethylamine
TRYPTOPHAN       Aromatic             Indolic SCFA, Tryptamine
TYROSINE         Aromatic             4-Hydroxyphenolic SCFA, Tyramine
ARGININE         Basic                Converted to other amino acids (mainly Ornithine), Agmatine
HISTIDINE        Basic                Acetate, Butyrate, Histamine
LYSINE           Basic                Acetate, Butyrate, Cadaverine
SERINE           Hydroxylic           Butyrate
THREONINE        Hydroxylic           Acetate, Propionate, Butyrate
CYSTEINE         Sulfur-containing    Acetate, Butyrate, Hydrogen sulfide
METHIONINE       Sulfur-containing    Propionate, Butyrate, Methanethiol

1.1.3.5 Catabolism of endogenous substrates

Metabolism of exogenous substrates greatly affects the use of endogenous substrates by the gut microbiota. Dietary fiber reduces the degradation of mucin, and the utilization of mucin is thought to cycle daily depending on the availability of food sources [82, 140]. Mucin is a sulfated glycoprotein [47]; thus, the same concepts of carbohydrate and protein degradation from dietary sources discussed above apply. However, it should be noted that mucin turnover by the gut microbiota is a naturally occurring process, and only when it occurs in elevated amounts does it have negative connotations. For example, Akkermansia muciniphila is a mucin-utilizing specialist that is depleted in the GI tract of patients with IBD [81] and metabolic syndrome [141]. Ak. muciniphila has a demonstrated ability to cross-talk with host cells, promoting an increase in concentration of glucagon-like peptides, 2-arachidonoylglycerol and antimicrobial peptides that improve barrier function, reduce inflammation and induce proliferation of IECs [142]. Through this communication, Ak. muciniphila also, paradoxically, restored the thickness of the mucin layer in obese mice.

1.1.4 Modulators of the human gut microbiota

1.1.4.1 Age

Several factors have been associated with compositional and functional changes in the human gut microbiota, including age, host genetics, sanitation, geographical location, antibiotics and diet [24, 30]. The human gut microbial ecosystem begins as a low diversity consortium of microorganisms after birth, which continues to develop highly dynamically as microbes are sequentially integrated until adulthood [143, 144]. Whether a placental microbiome exists, or whether infants are indeed born sterile, is currently an unsettled topic [145]. However, the most prevalent theory is that the mode of delivery is the initial event that influences the profile of early colonizers, with the GI tract of vaginally delivered infants containing species from the maternal vaginal microbiota and that of caesarean section delivered infants containing species from the maternal skin microbiota [144, 146]. After birth, the infant gut microbiota exhibits alterations as the diet changes, from the subtle differences between formula and breastfed infants to the dramatic shifts after the introduction of solid food [143, 144]. The general trend is the succession from facultative anaerobes capable of utilizing milk oligosaccharides or their derived components, such as Bifidobacterium, Lactobacillus and Proteobacteria, to obligate anaerobes capable of primary degradation, such as Bacteroidetes, and finally to more specialized obligate anaerobes fulfilling specific dietary niches, such as Clostridium clusters IV and XIVa [143, 144]. Further, the immune system and physiology of the gastrointestinal tract progressively develop through the interactions of early colonizers with the host, which impacts the ability of later encountered putative colonizers to integrate into the ecosystem [143].

Upon adulthood, the composition of the gut microbiota is thought to be remarkably stable with approximately 60% of bacterial strains retained within a five-year window [147]. A final shift in gut microbiota structure towards a Bacteroidetes-predominated phenotype is apparent in the elderly, which could be a result of accumulating external factors such as a less varied and fiber rich diet, reduced mobility and medication [148].

1.1.4.2 Host genetics

Host genetics has been proposed to play a role in modulating the gut microbiota since the discovery that monozygotic twins have more similarities between the constitution of their fecal microbial communities than dizygotic twins [149]. The most consequential effects of genetics may be in how the host immune system is able to process microbial signals. For example, it has been demonstrated that a genetic predisposition is necessary for the development of IBD, and although over 200 separate loci have been identified in genome-wide association studies, they are all linked to interference of host immune homeostasis via several mechanisms [150–152]. Such mechanisms include disruption of IEC integrity, barrier function or signalling, reduction of microbial clearance by defective phagocytes or hyperinflammation/autoinflammation by dysregulation of T or B cells. Nonetheless, the concordance rate of IBD between monozygotic twins is less than 50%; thus, disease progression is additionally reliant on environmental factors [153]. Host genetics can also influence the dietary landscape available to the gut microbiota. This case is particularly exemplified by the synthesis of host mucins. For example, people who possess an inactive FUT2 gene, defined as ‘nonsecretors’, in whom fucose residues are not integrated into the glycan component of mucin [154], have an altered gut microbiota profile characterized by decreased diversity and an increase in Lachnospiraceae [155, 156]. Interestingly, nonsecretors have additionally been identified in genome-wide association studies as susceptible to CD [157].

1.1.4.3 Sanitation

Sanitation is synonymous with the ‘old friends’ hypothesis, in which overuse of antibiotics/antimicrobial cleaning detergents combined with a lack of outdoor activity is thought to restrict the exposure of individuals to beneficial microbes in ‘Western societies’, particularly in cities [158]. Accordingly, geographical location is also a contributing factor influencing which microbes an individual can acquire [24, 30, 159]. However, critics of the ‘old friends’ hypothesis have stated that it can set a dangerous precedent, as the rise in sanitation has also limited the spread of infectious diseases [160]. Therefore, it is perhaps prudent to not only prevent unnecessary exposures to substances that can affect the microbiota, but also to understand how to reacquire beneficial microbes after, e.g., administration of antibiotics. Additionally, regarding the impact of geographical location, it is difficult to separate the actual locational effects from the inherent genetic and cultural (especially diet) bias between communities [24, 159]. Clearly, more research is necessary to understand the determinants that drive microbial acquisition (i.e., is the environment itself a critical factor?), the functionalities that are key to a ‘healthy’ gut microbial ecosystem and how we can appropriately modify the gut microbiota.

1.1.4.4 Diet

Diet is perhaps unsurprisingly thought to be a dominant modulator of the gut microbiota, and thus presents the most promising target for intervention strategies [4]. Most of the effects of diet have already been discussed, with complex carbohydrates favoring taxa such as Prevotella, Ruminococcus, Roseburia and Coprococcus and proteins/fats favoring taxa such as Bacteroides, Alistipes, Clostridium and Escherichia (see Table 1) [84, 161, 162]. However, it should be noted that the gut microbiota is compositionally robust to moderate short-term dietary changes, and thus either extreme or long-term dietary patterns are required for a considerable impact [4]. Other more man-made strategies include microbial supplementation with, e.g., probiotics, FMT or MET [17, 20, 23]. However, in order to effectively support beneficial microbial behaviours and/or inhibit detrimental ones, a thorough appreciation of gut microbial community assembly is essential.

1.2 Ecological Theory and the Human Gut Microbiota

The explosion of research into the human gut microbiota over the past decade has led to a shift in the medical paradigm. The medical community is now moving from a ‘biological warfare’ approach, aiming to exterminate infectious agents, to a ‘parks management’ approach, aiming to remediate an overall ‘dysbiotic’ microbial ecosystem [19]. An unprecedented merger of the principles of macroecology and medicine is thus starting to take place. If we can draw upon an already implemented theoretical framework such as that established in ecological theory, we should be able to make predictions and design interventions more efficiently than we can through association-based research alone. However, a critical barrier remains before the full potential of this ambition can be realized, and this can be summed up by one fundamental question, ‘Do macroecological theories apply in the context of a microbial ecosystem?’ There are fundamental differences between eukaryotic species, which are the concern of macroecology, and microorganisms, such as those that represent the gut microbiota. For example, unlike macroorganisms, microorganisms such as bacteria reproduce asexually, undergo horizontal gene transfer at a relatively high rate and some members can sporulate. Because of their shorter time scales, the study of microbial ecosystems could also yield empirical evidence for certain macroecological theories that would not otherwise be observed. Key macroecological theories with relevance to the human gut microbiota will be discussed below.

1.2.1 Assembly

Determining the drivers of human gut microbial community assembly would allow us insights into strategies that could be used to support health-promoting ecosystems. Not only would this information be useful for the application of therapeutics that target the human gut microbiota, including probiotics, FMT and MET, it would also be useful for ensuring adequate succession in early life. The gut microbiota is known to play a critical role in the development of the host immune system, as determined by experiments in germ-free (GF) mice [163], and alterations in its composition and behaviour at this stage of life are associated with the later occurrence of immune disorders, such as allergies [164], asthma [165] and diabetes [166]. The proposed forces that drive community assembly, stochasticity, dispersal limitation, historical contingency and environmental selection, are displayed in Figure 3. In reality, a combination of these forces likely contributes to the assembly of a human gut microbial community rather than one in isolation. However, understanding the relative importance of these forces is essential to effectively manipulate the gut microbiota.

Figure 3. The four forces that drive assembly of the human gut microbiota, as proposed by Costello et al. [19]. For each force, key concepts behind their mechanism of action are highlighted. Environmental selection: The environmental conditions select for the microbial species [167]. In this case, the individual coloured in green selected for cocci and the individual coloured in blue selected for bacilli. This can result in either habitat filtering, in which resource availability selects (e.g., the individual coloured in green consumes a low amount of fiber that is the sole carbon source of the bacilli) or species assortment, in which the relative fitness of colonizers selects (e.g., the cocci secrete an antibiotic as a defense mechanism, against which they possess inherent resistance, decreasing the fitness of incoming bacilli). Historical contingency: The order of colonization influences the ability of microbial species to colonize [168, 169]. In this case, the identically coloured bacilli must be acquired before the cocci. This can result from vertical transmission, in which microbial species are passed from parent to offspring (e.g., the bacilli are vertically transmitted allowing for preliminary colonization) and horizontal transmission, in which microbes are acquired from the environment (e.g., the cocci are then able to colonize when acquired from the environment). Dispersal limitation: The ability of a microbial species to disperse in the environment influences its ability to colonize [24, 30, 158, 159]. In this case, the individual coloured in green can only acquire microbes from his environment on the left and the individual coloured in blue from his environment on the right (separated by a dashed line). This can result from geographical location (e.g., different locales support unique pools of microbes) or the ‘old friends’ hypothesis (e.g., the increased use of antimicrobials in the right environment selecting for microbes that can resist them). 
Stochasticity: Assembly of the microbial community is random [170]. This results from multistability, in which a microbial ecosystem can exist, and shift between, alternative stable states. The figure was generated in the program Microsoft PowerPoint version 10 with default graphics/icons.

1.2.1.1 Environmental selection

Environmental selection presents the most promising avenue of therapeutic exploitation. Previous work exploring the succession of the infant gut microbiota has indicated there exists a ‘checkerboard pattern’, in which pairs of taxa exclude each other from shared environments [167, 171]. This outcome would suggest that deterministic interactions and niche processes, i.e., environmental selection, are the dominant forces in shaping gut microbiota composition as opposed to neutral processes, i.e., randomness. Indeed, Jeraldo et al. quantified the relative role of each of these forces utilizing marker gene sequencing data and found a significant non-neutral contribution to gut microbial community assembly [172]. This effect is highly similar to what is observed in macroecological communities [173]. However, there are two different prevailing theories that can result in the observed checkerboard pattern, habitat filtering and species assortment. Habitat filtering emphasizes host-microbe interactions through proposing that species have affinities for non-overlapping niches [174]. The exception would be when microbial species actually create niches for cross-feeders through excretion of their fermentation by-products. Species assortment, on the other hand, emphasizes microbe-microbe interactions through proposing that competition between species leads to mutual exclusion [175]. Therefore, the crux of these theories is that for habitat filtering, resource availability selects, whereas for species assortment, the relative fitness of colonizers selects. Metabolic modeling conducted by Levy and Borenstein utilizing metagenomic data has shown that habitat filtering is the dominant driver [167]. The authors indicated, however, that habitat filtering is not mutually exclusive of species assortment, and in particular, adhesion, coaggregation, signaling and antibiotic tolerance were not factored into their model. 
In contrast to their work is the finding that microbes can form distinct ecological units based upon paired antimicrobial production and resistance [176]. More experimental studies are thus needed to confirm this result. Such work is important, because if habitat filtering is proven to be indeed dominant, it would indicate that therapeutic microbial strains could be chosen based upon the niches present, and thus knowledge of the pre-existing microbes would be unnecessary. It would also indicate that these strains would not necessarily require unique metabolisms to colonize the target GI tract and that dampening of the competitive interactions through, e.g., antibiotic pre-treatment would be less effective. Dietary interventions could thus be paired with microbial supplementation to enhance success.
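
The ‘checkerboard pattern’ discussed in this section can be quantified in several ways. The following is a minimal sketch of one commonly used option, the checkerboard units statistic of Stone and Roberts, averaged over all taxon pairs (the C-score). It is not necessarily the statistic used in [167, 171], and the presence/absence table and taxon names are hypothetical values chosen purely for illustration.

```python
from itertools import combinations

def checkerboard_units(a, b):
    """Checkerboard units for two taxa, given presence (1) / absence (0) per sample."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    # (samples with only taxon a) * (samples with only taxon b)
    return (sum(a) - shared) * (sum(b) - shared)

def c_score(table):
    """Mean checkerboard units over all pairs of taxa in the table."""
    pairs = list(combinations(table.values(), 2))
    return sum(checkerboard_units(a, b) for a, b in pairs) / len(pairs)

# Hypothetical presence/absence of three taxa across six samples (assumption).
presence = {
    "taxon_A": [1, 1, 1, 0, 0, 0],
    "taxon_B": [0, 0, 0, 1, 1, 1],
    "taxon_C": [1, 0, 1, 0, 1, 1],
}

print(checkerboard_units(presence["taxon_A"], presence["taxon_B"]))
print(c_score(presence))
```

A high score indicates that taxa tend to exclude each other from shared samples. In practice, the observed score would be compared against scores from randomized tables before a non-neutral pattern is inferred; that null model is omitted from this sketch.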

1.2.1.2 Historical contingency

Historical contingency is when the formation of a community is dependent on the order of colonization. This theory takes into account whether microbes are acquired individually from the environment or as groups of species. The best example of a theory that relies upon historical contingency is the multilayered hologenome hypothesis [177]. A holobiont, synonymous with ‘superorganism’, exists when the fitness of the host relies upon their microbiota and vice versa, and the collective genome thus evolves as a unit [178]. In the multilayered model, a set of ‘core’ taxa provide critical traits, while a flexible, environmentally acquired pool confers niche-specific adaptations [177]. Over time, if such flexible traits improve holobiont fitness, they will not only be retained, but will also lose mobility. This concept can be likened to the difference between the core genome and plasmids in bacteria. Therefore, it implies that the coevolved ‘core’ microbiota would be passed on to offspring early through vertical transmission and that the flexible pool would be acquired horizontally, perhaps later in life. The idea became popular from the evidence of a set of core genera [30], and the remarkable conservation of microbial metabolic modules between healthy individuals, which seems to supersede randomness [28]. Further, the abundance of certain bacterial families has been shown to be highly influenced by host genetics [149]. Indeed, not only does succession of the gut microbiota in healthy infants follow a strongly reproducible ordered pattern [143, 144], but there is now evidence of considerable microbial strain transfer from mother to infant within the first year of life, including species of Bifidobacterium, Escherichia, Bacteroides, Coprococcus and Ruminococcus [168, 169]. An average maternal transmission rate of 14-16% was observed, and these strains had a higher retention rate (70.5%) than the horizontally acquired strains (27%). However, these findings do not preclude that infants could also acquire sets of microbes from close contact with other caregivers; thus, these microbes could co-evolve amongst themselves and within the human species in general but are not host specific. Discovering the degree of coevolution among microbes versus between the host and its microbes is important to formulating therapeutic interventions.
Is matching a set of microbes to host genetic factors, such as obtaining fecal samples from relatives in the case of FMT, necessary? If microbes are spread from host-to-host as groups of species, then would recreating these groups by isolating them from the same source improve success? If order is important, then would providing therapeutic consortia in multiple, sequential doses be critical for the integration of targeted members, e.g., cross-feeders? Finally, horizontally transmitted microbial strains that exhibit a lower retention rate might present a more promising target to manipulate.

1.2.1.3 Dispersal limitation

Dispersal limitation goes hand-in-hand with the previously discussed influences of the ‘old friends’ hypothesis [158] and geographical location [24, 30, 159]. How we are able to naturally acquire our microbes may not only aid in determining how to optimally administer therapeutic consortia but will also allow us to understand how the composition of the human gut microbiota can shift over time, depending on the host’s environment. The latter point brings to the fore the concept of ecological drift (aka demographic stochasticity) [19]. If a species in low abundance (e.g., one that occupies a low-capacity niche, has recently immigrated, or was sensitive to a recently experienced environmental perturbation) is not able to adapt by acquiring a new competitive advantage, it can be easily lost. However, it may be continually rescued from the brink of extinction by high dispersal in the environment. If high dispersal is indeed an evolutionary strategy employed by certain species of microbes in the gut, then this factor would lead to natural fluctuations in the overall constitution of species within a microbial ecosystem over time, hence ecological drift. Each class of microbes that inhabits the GI tract has its own strategy for surviving in the environment, which could explain why geography and hygiene would affect microbial acquisition. For example, Bacteroides was shown to have species-dependent, variable persistence in river water depending on the temperature [179]. Understanding how microbes survive in the face of variable conditions of oxygen, light, temperature and antimicrobials enables us to determine their relative ability to spread. For Gram-positives, many Firmicutes are capable of sporulation, while Actinobacteria modify their membrane morphology to become more lipophilic and oxygen tolerant with shorter lipid chains and higher protein content [180, 181]. Gram-negatives are capable of entering a viable but non-culturable state, in which their membranes are modified to resist stress, with a higher degree of peptidoglycan cross-linking [180, 182]. Many Proteobacteria are facultative anaerobes, while the obligate anaerobic members of the Bacteroidetes, Firmicutes and Actinobacteria phyla can possess enzymes, such as superoxide dismutases, catalases, NADH oxidases or flavin-thiol electron shuttles, to persist in the presence of oxygen for a few hours [179–181]. Much work in this area is still needed, as survival mechanisms for all core genera are not yet understood, e.g., Roseburia, a member of the Lachnospiraceae family that is prevalent in the human gut, is highly sensitive to oxygen tension, but is not known to sporulate [180].
Further knowledge of these mechanisms would allow us to design appropriate therapeutics through prediction of gut microbiota development trajectories, based on, e.g., an individual person’s lifestyle.

1.2.1.4 Stochasticity

Stochasticity is perhaps the most difficult factor to harness in terms of artificially reprogramming gut microbial community assembly. Nonetheless, that does not preclude its previous exploitation in treating GI disorders. It is important to understand that complex dynamic systems, which possess over a hundred entities with nonlinear relationships and feedback loops, can possess a property termed ‘multistability’. As the name implies, multistability refers to the concept that an ecosystem may stabilize in multiple alternative states that can be switched back and forth as a result of gradual exposure to external influence or spontaneously after a perturbation [170]. Empirical evidence of this phenomenon exists, and in the case of the human gut microbiota, it is best shown by the variability in the species compositions exhibited after antibiotic treatment [170, 183]. Indeed, this factor has even been exploited, as antibiotics have been utilized in the past to induce remission in IBD patients by effectively ‘resetting’ the ecosystem and facilitating its assembly to the pre-inflammatory state [64]. However, there is a considerable risk involved in this technique, as by its nature it is highly unpredictable and could also lead to the development of antibiotic resistance, as well as the extinction of beneficial commensals.
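
The idea of alternative stable states can be illustrated with a deliberately simple toy model. The sketch below is not a model of the gut microbiota and is not taken from [170]; it assumes a single state variable x (for example, the relative abundance of one group) governed by dx/dt = r·x·(1 − x)·(x − a), which has stable states at 0 and 1 separated by an unstable threshold a. The function name, parameter values and starting points are all hypothetical.

```python
def simulate(x0, threshold=0.5, rate=1.0, dt=0.01, steps=5000):
    """Euler integration of dx/dt = rate * x * (1 - x) * (x - threshold).

    The system has two stable states (x = 0 and x = 1) and one unstable
    state at x = threshold, so the outcome depends on the starting point.
    """
    x = x0
    for _ in range(steps):
        x += dt * rate * x * (1.0 - x) * (x - threshold)
    return x

# Two hypothetical starting points on either side of the threshold.
low = simulate(0.4)   # settles towards the state near 0
high = simulate(0.6)  # settles towards the state near 1
print(round(low, 4), round(high, 4))
```

In this toy system, a perturbation that pushes x across the threshold moves the system into the other basin of attraction, where it remains after the perturbation ends. This is the sense in which an intervention such as antibiotic treatment could ‘reset’ an ecosystem, and also why the outcome is difficult to predict.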

1.2.2 Diversity and evolution of the human gut microbiota

In addition to trends in the composition of taxonomic classes, i.e., β-diversity, another frequently used descriptor is the overall diversity, i.e., α-diversity, evaluated in relation to host health status. The accuracy of α-diversity metrics calculated from data obtained by current next-generation sequencing (NGS) technology is controversial. Because of their inherently low sensitivity, these instruments cannot provide information on all species present within a community, and the measure is thus often highly correlated with sequencing depth [184]. Nonetheless, an intriguing pattern has emerged in that many GI disorders are associated with low α-diversity [185]. However, it is important to understand why an increased diversity improves health and how such a complex ecosystem evolves with the host.
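
To make the dependence of α-diversity on sequencing depth concrete, the following minimal sketch computes two common α-diversity descriptors, observed richness and the Shannon index, and then randomly subsamples a community to a shallower depth. The read counts, subsampling depth and random seed are hypothetical values, and the choice of these two metrics is an assumption made for illustration rather than a description of the methods in [184, 185].

```python
import math
import random

def shannon_index(counts):
    """Shannon index H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def observed_richness(counts):
    """Number of taxa detected at least once."""
    return sum(1 for c in counts if c > 0)

def subsample(counts, depth, seed=0):
    """Draw `depth` reads at random, without replacement, from the count vector."""
    reads = [taxon for taxon, c in enumerate(counts) for _ in range(c)]
    drawn = random.Random(seed).sample(reads, depth)
    out = [0] * len(counts)
    for taxon in drawn:
        out[taxon] += 1
    return out

# Hypothetical read counts for eight taxa, dominated by a few members.
community = [5000, 3000, 1500, 400, 80, 15, 4, 1]
shallow = subsample(community, depth=100)

print(observed_richness(community), round(shannon_index(community), 3))
print(observed_richness(shallow), round(shannon_index(shallow), 3))
```

Low-abundance taxa are the first to be missed when fewer reads are drawn, so richness-based measures can only decrease or stay the same as depth decreases.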

1.2.2.1 Relationship between functional efficiency and stability

In a low diversity ecosystem, ecological niches are not utilized to their full capacity, and its community members become highly interdependent. This, in turn, is a hallmark of low resilience that makes the ecosystem vulnerable to colonization by invasive species and inhibits recovery from an environmental disturbance [94]. Indeed, antibiotic-associated infections are a good example, as it has been found that the opportunistic pathogen Cl. difficile consumes nutrients such as simple sugars and free amino acids that are not otherwise found in high concentrations in fecal samples [10]. The relationship between functional efficiency, stability and the number of species within the ecosystem can be represented as a curve (Figure 4). At low numbers, adding species increases the functionality of the ecosystem up to a certain point, at which it plateaus. Afterwards, the addition of subsequent species elevates the functional redundancy of the ecosystem, which heightens stability. As species are added, the microbial interactions progress from strongly positive to weak and negative. Strongly positive interactions are destabilizing because, if the survival of a microbial species is critically linked to another and an environmental perturbation inhibits the growth of one, then the other is also necessarily depleted. Thus, competition within an ecosystem is, perhaps counterintuitively, stabilizing. In a healthy gut microbial ecosystem, it is thus unsurprising that a high degree of functional redundancy exists, with examples including polysaccharide utilization, butyrate production and hydrogen consumption [24, 186]. Further, Coyte et al. modeled a time-series metagenomics dataset and demonstrated that competition with weak interactions dominates [187]. However, that does not preclude the presence of microbe-microbe connectivity, with prior work indicating the presence of microbial guilds [188, 189]. Microbial guilds are small networks of species that share a metabolic process.
For example, a previous study found that a strain of Bacteroides ovatus expressed both membrane-bound and secreted forms of enzyme systems capable of degrading inulin [190]. When the secreted form of the enzyme system was knocked out, the fitness of the strain was not improved when grown in monoculture but was significantly diminished when grown in a community setting. The authors concluded that the strain produced the secreted form of the enzyme system to cross-feed other species that can, e.g., secrete growth factors for its benefit. Larsen and Claassen were able to apply graph theory and demonstrate that only six nodes (i.e., species) would be necessary to maximize the efficiency of a particular process [188]. However, it is most likely that such species networks within a gut microbial ecosystem would overlap with each other at several points, enhancing stability as previously stated. Understanding which functions are critical, and at which point the goal may instead be to enhance redundancy in order to improve health, is key to creating effective and perhaps personalized therapeutic microbial replenishment strategies for each GI disorder.
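
The distinction between adding new functions and adding redundancy can be sketched with a small bookkeeping example. The code below tracks, as species are added one at a time, how many distinct functions are covered and how many species on average perform each covered function. The species names and trait assignments are hypothetical; the three functions are those named above as examples of redundancy in the gut. This is a counting illustration only and does not reproduce the graph theory analysis of [188].

```python
def coverage_and_redundancy(species_functions, order):
    """Return (species, functions covered, mean species per function) per step."""
    counts = {}
    trajectory = []
    for name in order:
        for function in species_functions[name]:
            counts[function] = counts.get(function, 0) + 1
        n_functions = len(counts)
        mean_redundancy = sum(counts.values()) / n_functions
        trajectory.append((name, n_functions, mean_redundancy))
    return trajectory

# Hypothetical trait assignments (assumption, for illustration only).
traits = {
    "sp1": {"polysaccharide_utilization"},
    "sp2": {"butyrate_production"},
    "sp3": {"hydrogen_consumption"},
    "sp4": {"polysaccharide_utilization", "butyrate_production"},
    "sp5": {"hydrogen_consumption", "butyrate_production"},
}

trajectory = coverage_and_redundancy(traits, ["sp1", "sp2", "sp3", "sp4", "sp5"])
for name, n_functions, redundancy in trajectory:
    print(name, n_functions, round(redundancy, 2))
```

In this example the number of covered functions stops increasing after the third species, while the mean redundancy continues to rise, which mirrors the plateau followed by increasing redundancy described in the text.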

Figure 4. The graphical relationship between ecosystem functional efficiency and stability as proposed by Konopka [191]. As species are added, the functional efficiency increases up to a point then plateaus, which was determined to be at networks of 6 nodes (species) using graph theory by Larsen and Claassen [188]. At that point, no new functions are added and instead the functional redundancy increases, which heightens stability. Through this process, interactions become progressively more negative and weaker. The types of interactions are outlined by Coyte et al. [187]. The figure was generated in the program Microsoft PowerPoint version 10 with default graphics/icons. The graph portion was created in R version 3.5 with randomly generated data that recreated the proposed mathematical pattern.

1.2.2.2 Ecosystem on a leash

A promising new model of host-microbiota evolution has been proposed by Foster et al., termed the ‘ecosystem on a leash’ [192]. To explain this model, it is best to consider the three interactions that can occur in a gut microbial ecosystem separately, ‘microbe to host’, ‘host to microbe’ and ‘microbe to microbe’. The model suggests that the ‘microbe to host’ direction is not the main driver of holobiont evolution. The reasoning behind this statement is that the gut microbiota is subject to continual turnover, and if a particular species were to develop a trait that improved host health at the cost of its own growth, it would likely be quickly displaced. The only exceptions to this scenario would be 1) if the host depended on this function so critically that its loss resulted in immediate host death (which has not been demonstrated as germ-free mice are still viable [193, 194] and most GI disorders that exhibit dysbiosis have a low rate of lethality [195]) and 2) if vertical transmission increased the fitness of a species that elicits a host benefit, as it enables the host to have successful offspring. Although vertical transmission has been observed [143, 144], there is no current evidence on the reproducibility of this phenomenon. Are the same microbial species reliably transmitted in each subsequent generation? This question needs to be addressed to yield more definitive conclusions. It could be a process in place to help preserve a health promoting gut microbiota, but the host would then still likely need additional mechanisms to maintain these microbes in the face of environmental challenges; consequently, there would need to be a heavy ‘host to microbe’ component. It is possible that a microbial species that provisions a benefit to the host at no cost to its own fitness might be selected for in certain situations. For example, if a species supplied a vitamin that a human population could no longer obtain from food, because of, e.g., famine, then it may become more prevalent. However, this scenario seems unlikely as it would occur only under highly limiting conditions, especially considering the incredible diversity and functional redundancy that is observed between individuals [24, 28, 186]. Therefore, it is far more plausible that the ‘host to microbe’ direction is the driver, in which the host adapts to its microbes and evolves traits to encourage beneficial functionality within the ecosystem. Hence, it keeps its ‘ecosystem on a leash’. Examples include the immune system and selective feeding. The immune system is trained to monitor particular microbe-associated molecular patterns, such as lipopolysaccharides on the membrane surface of Gram-negative bacteria, which are ubiquitous on the common enteric pathogens of the family Enterobacteriaceae [6, 65, 79]. Further, the production of butyrate, a fuel source for IECs, downregulates the immune response [68].
The host also feeds its microbes through the biosynthesis of mucin [47], and the amount secreted can be regulated via cross-talk with Goblet cells [142]. The previously described bacterial species Ak. muciniphila is a good example, and its production of propionate in close proximity to the IECs could explain why the host would have adapted to retain it [196]. The host can also mediate the attachment of beneficial microbes to the mucus layer by the adaptive immunity-associated immunoglobulin A protein [197]. The existence of GI disorders with a genetic component, such as IBD [150–152], and the previously noted dominance of habitat filtering [167] also play into this hypothesis. However, controlling all microbial strains present within a highly diverse community is probably an impossible feat for the host to accomplish alone. Therefore, although the ‘host to microbe’ arm may predominate, there is also a significant ‘microbe to microbe’ component. The existence of these interactions would explain the presence of microbial guilds [188, 189] and the evolution of features to gain a competitive advantage, such as secretion of antibiotics [176]. This model thus successfully synthesizes many of the concepts discussed above. If true, research into understanding microbial coevolution and how the host environment selectively retains beneficial microbes would enable us to effectively tailor microbial replenishment strategies.

1.3 Study of the Human Gut Microbiota

In order to study the human gut microbiota to determine if macroecological theories are applicable, the utilization of both suitable models and accurate methodologies to probe the system for critical functionalities/interactions is essential. The former is necessary, because of the limitations of human and clinical studies, which are constrained to associations and specific treatments with respect to ethical considerations. Both in vitro and in vivo modelling strategies are currently in use, in which the choice of inoculum between undefined but appreciably diverse fecal samples or defined microbial consortia must be made [194, 198–200]. The latter is reliant upon modern ‘-omics’ technology that evaluates biomolecules of a specified class present in an ecosystem. It follows the central dogma of biology, in which microbial DNA (marker gene sequencing and metagenomics), RNA (metatranscriptomics), proteins (metaproteomics) or metabolites (metabonomics) can be observed [201]. As each methodology has a set of strengths and weaknesses, it is often best to combine approaches, termed ‘multi-omics’, in order to gain a better appreciation of the system [202]. The available model systems and ‘-omics’ techniques will be outlined in this review, with their advantages, drawbacks and future developments discussed.

1.3.1 Model systems

1.3.1.1 In vivo models

The most common in vivo model is the mouse, although rats and pigs have been utilized to a lesser extent [193, 203, 204]. Mice can be studied with their natural microflora but are most often ‘humanized’ via inoculation with human fecal samples or defined microbial consortia [193, 194]. To prevent cross-contamination with the environment, the latter mice are kept in strict GF conditions, and are referred to as gnotobiotic mice upon inoculation. Several genetic lines of mice can be utilized for research, which can be exploited to predispose mice to conditions such as IBD [205]. Mice provide a means of studying a complete immune system and physiology but present several weaknesses. In addition to the cost and skilled personnel required to upkeep a GF facility, there are obvious characteristic differences between humans and mice, such as diet and cell structure [193]. The latter leads to a failure of complete recapitulation of host-microbe cross-talk. It is for these reasons that the gut microbiota composition is altered and less diverse in gnotobiotic mice compared to their human donors [194, 206]. Particularly, gnotobiotic mice may lack specific taxonomic groups from the phylum Firmicutes that are typically in low abundance in humans. This lack of concordance can lead to deficiencies in the development of the host immune system [206]. Despite these limitations, mice are still a necessity for research probing, e.g., host effects on disease causation, and have been utilized with success to address knowledge gaps in the field.

1.3.1.2 In vitro models

Bioreactors can be fashioned to replicate the conditions of the human GI tract, enabling a useful in vitro model (Figure 5). Two main formats exist: multi-stage bioreactors such as the SHIME® (ProDigest), which comprises several connected units that model each compartment along the GI tract [207], and single-stage systems, which exclusively replicate a given compartment, usually the distal colon [198]. Naturally, these models cannot recapitulate the host immune system nor the spatial distribution that occurs between the lumen and epithelial lining. Because of this, bioreactor-based model systems can lead to alterations in composition of fecal-associated microbial communities, with proliferation of propionate producers from Bacteroidetes and Clostridium cluster IX at the expense of butyrate producers from Clostridium clusters IV and XIVa [208]. Fortunately, mucin can still be added to the medium to support mucus-degrading specialists, such as Ak. muciniphila [37, 198]. The limitations of the model are, however, not always a drawback; they allow researchers to isolate factors that are dependent upon the microbial community itself from factors that are dependent on the host. Bioreactors also offer the advantage of enabling the monitoring of community dynamics over tight time-courses, whereas in vivo models are limited to, e.g., the frequency of defecation. Further, bioreactor volume is scalable, which allows for increased sampling volume in comparison to, e.g., mouse fecal/cecal samples. Finally, experiments in bioreactors are more controllable than those carried out in mice, both between replicates and laboratories, translating to a heightened reproducibility [193, 198]. Nonetheless, not all research questions can be answered using bioreactors, especially when host-derived effects are to be studied. Certain ex vivo strategies are, however, attempting to address this issue.
Human IEC lines can be combined with bioreactor effluent to study host-microbial interactions. Recently, this approach has progressed to the use of colonoids or enteroids, i.e., self-renewing monolayers of primary cells, which exhibit fewer abnormalities and are capable of recapitulating multiple cell types within a single experiment, e.g., IECs, Goblet cells and Paneth cells [209, 210]. A bioreactor-host cell combination has also been developed as a miniaturized system in chip format [211]. The latest chip model of the human small intestine, for example, utilizes primary epithelial cells expanded as 3D organoids that undergo multi-lineage differentiation, coupled with an intestinal microvascular endothelium cell line cultured in parallel. This complex model was shown to more closely mimic the immune response to infection of the human duodenum. Improvements in ex vivo model systems that incorporate both host cells and their associated microbiota present a promising future avenue for human gut microbiota research.

Figure 5. Schematic of a single vessel bioreactor designed to replicate the human distal colon. Image produced by Judy Yen and commissioned by the Dr. Allen-Vercoe laboratory (original figure has been altered through the addition of labels).

1.3.1.3 Fecal versus defined microbial communities

Both types of models can be inoculated either directly with human fecal samples or with defined microbial consortia [194, 198–200]. Defined microbial communities can be derived from fecal samples via methodical isolation and subsequent cryopreservation of the component microbial strains, which can then be individually regrown as part of a given bioreactor inoculum [23, 199, 200]. Defined microbial communities have several advantages over a typical fecal sample inoculum, i.e., the number of replicates is no longer limited by the size of the sample and the isolates can continue to be examined even after the initial study takes place. For example, if a bioreactor experiment results in the identification of a specific protein associated with a given effect, the microbial strain that produces this protein effector can be regrown in order to characterize it. This is technically challenging to do if fecal material is used in the bioreactor experiment instead. Defined experimental communities also maximize the potential for manipulation, in which the community can be cultured with different combinations of species to specifically answer multiple hypotheses. Finally, defined microbial communities heighten the resolution that can be obtained from ‘-omics’ methodologies, as obtained data can be matched to the genome sequences of the strains and low-level noise (e.g., that presented through unavoidable sample cross-contamination, common to many massively parallel sequencing platforms) can be effectively separated from the pertinent information. The disadvantage of defined microbial communities is that they are less diverse than fecal samples, because not all species of microbes within a given ecosystem can be cultured and some are difficult to preserve cryogenically. However, culturing techniques are continuing to improve, especially since the availability of metagenomics has allowed isolation medium to be tailored to the metabolic modules present in a given microbial ecosystem [212]. Cell sorting has also been effectively coupled with metagenomics to extract information from microbial species of very low abundance within an ecosystem [213]. Thus, these model systems will progressively become more accurate representations of their parent ecosystems in the future.

1.3.2 Gut microbial ecosystem analysis methods

1.3.2.1 Marker gene sequencing and metagenomics

At the advent of NGS technologies in 2005, beginning with 454 Pyrosequencing, followed by Illumina sequencing (aka sequencing-by-synthesis) then Ion Torrent sequencing, much work began into uncovering the species structure of the human microbiome [214, 215]. Marker gene sequencing strategies (phylogenomics) makes use of orthologous gene groups that are suitable to differentiate and thus relatively quantify taxonomic lineages, typically using 16S rRNA genes for bacteria and archaea [216], 18S rRNA genes for eukaryotes and internal transcribed spacer regions for fungi [217]. PCR is often first used to amplify small quantities of DNA [214, 216]. Metagenomics relies on the sequencing of short, sheared fragments of DNA, which are then reconstructed in silico, in what is referred to as a shotgun approach [18, 28, 215]. Metagenomics enables researchers to determine not only the presence of certain taxa, but also their potential functional capabilities. Further, the highest level of taxonomic differentiation, i.e., between strains of species, is possible, whereas marker gene sequencing can only derive species groupings or genera at best [216, 218]. However, metagenomics exhibits several weaknesses. NGS technologies can boast a high accuracy rate of up to 99.9%, but in the context of the size of a bacterial genome, which averages at 5 million base pairs, that still allows an average error rate of 1000 base pairs per million [214, 219]. Therefore, considerable overlapping reads, i.e., relative species abundance, is required to confirm the accuracy and presence of a genome [220]. NGS technologies have a fixed capacity for the number of bases that can be sequenced per run, for example, an Illumina HiSeq can yield 1800 billion bases, and thus it is currently not possible to sequences all of the genomes present in a human gut microbial ecosystem [214, 220]. 
This limitation creates a considerable bias for microorganisms in higher abundance, and the achieved depth makes suitable quantification difficult [220]. Thus, marker gene sequencing is still often utilized to determine relative abundance of community members.
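
The arithmetic behind these depth constraints can be illustrated with a brief sketch. It uses the figures quoted above (99.9% accuracy, a 5 million base pair genome, 1800 billion bases per run); the function names, the relative abundance values and the simplifying assumption that reads are shared among taxa in proportion to their relative abundance are illustrative assumptions only, not values taken from the cited studies.

```python
# Illustrative sketch only: function names and abundance values are assumptions.

def expected_errors(genome_bp, per_base_accuracy):
    """Expected number of miscalled bases when each base of a genome is read once."""
    return genome_bp * (1.0 - per_base_accuracy)


def expected_coverage(run_bases, relative_abundance, genome_bp):
    """Mean fold-coverage of one genome, assuming reads are shared among taxa
    in proportion to their relative abundance (a simplification)."""
    return run_bases * relative_abundance / genome_bp


GENOME_BP = 5_000_000           # average genome size quoted in the text
RUN_BASES = 1_800_000_000_000   # 1800 billion bases per run, as quoted in the text

# Miscalled bases expected from a single pass over one genome.
print(round(expected_errors(GENOME_BP, 0.999)))

# Hypothetical relative abundances for a dominant, a minor and a rare taxon.
for abundance in (1e-1, 1e-3, 1e-6):
    print(abundance, expected_coverage(RUN_BASES, abundance, GENOME_BP))
```

Under these assumptions, a taxon present at a relative abundance of one in a million would receive less than one-fold coverage even if an entire run were devoted to a single sample, which is consistent with the bias toward abundant microorganisms described above.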

Marker gene sequencing is also not without limitations. Its inability to determine absolute numbers, coupled with the fixed capacity of the instruments, has made normalizing the data particularly challenging. Adequate solutions have now been presented, in which the data are either centered log-ratio transformed [221] or adjusted based on the total cell counts in addition to the obtained read counts [92]. The latter is reliant on technology that can accurately quantify cell counts, which can be achieved using techniques such as flow cytometry. Other considerations for the accuracy of DNA sequence-based ‘omics methods are the biases introduced by gDNA extraction, PCR, carriage of the marker gene and data processing [92, 216, 222, 223]. In particular, the 16S rRNA gene was chosen for characterization of bacteria and archaea due to database availability, but over time, as databases have grown, it has become clear that different species and even strains can carry variable copy numbers of this gene, introducing yet another bias [223]. Marker gene sequencing has a restricted threshold of detection as well, since a higher cut-off is required to discriminate real data from instrumental issues such as cross-contamination in the sequencer [224]. Thus, for accurate, sensitive quantification beyond relative trends, a method such as quantitative real-time PCR (qRT-PCR) [225, 226] or digital droplet PCR (ddPCR) [227] should be used. However, both of these techniques rely upon the design of probes, requiring some prior knowledge either of what is present in the microbial community or of a desired target. Despite the discussed issues with accuracy, marker gene sequencing is still a meritorious method as it allows researchers to obtain an untargeted overview of an ecosystem’s structure. The shortfalls of the method have been and continue to be addressed. For example, development of a standardized DNA preparation and sequencing protocol and PCR-free methods of marker gene sequencing have improved accuracy and lab-to-lab comparability [216, 228]. The Ribosomal RNA Operon Copy Number Database has also been initiated by the University of Michigan to facilitate rRNA gene copy number correction (although more data are necessary before it can be successfully integrated into mainstream studies) [223]. Data processing has also advanced from early strategies that grouped unique sequences into operational taxonomic units using their percentage similarity, to strategies that use quality scores generated by the sequencer to delineate true versus artificial amplicon sequence variants [222]. Additional improvements could be made to heighten the precision of taxonomic groupings and usability of subsequent data processing pipelines. Finally, future research into understanding the diversity of the gut microbiota, or the use of defined microbial communities, could expedite the design of probes for qRT-PCR or ddPCR technologies when accurate quantification is imperative to the research question.
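
The normalization and correction strategies discussed in this section can be sketched in a few lines. This is a minimal illustration rather than the procedure of the cited studies: the function names, the read counts, the copy numbers and the total cell count are hypothetical, and the fixed pseudocount is a simple stand-in for more careful handling of zero counts.

```python
import math

# Illustrative sketch only: all names and numbers below are assumptions.

def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform: log of each count minus the mean log count.
    A pseudocount is added so that zero counts can be log-transformed."""
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def scale_to_cell_counts(counts, total_cells):
    """Convert read counts to estimated cell numbers using a measured total
    cell count (e.g., obtained by flow cytometry)."""
    total_reads = sum(counts)
    return [total_cells * c / total_reads for c in counts]


def correct_copy_number(counts, copies_per_genome):
    """Divide each taxon's read count by its marker gene copy number."""
    return [c / k for c, k in zip(counts, copies_per_genome)]


reads = [700, 250, 50, 0]   # hypothetical read counts for four taxa
copies = [7, 5, 1, 2]       # hypothetical 16S rRNA gene copies per genome

print(clr(reads))
print(scale_to_cell_counts(reads, total_cells=1e9))
print(correct_copy_number(reads, copies))
```

In this invented example, the second and third taxa differ five-fold in read counts but are equal after copy number correction, which illustrates how variable carriage of the marker gene can distort relative abundance.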

1.3.2.2 Metatranscriptomics and metaproteomics

Metatranscriptomics and metaproteomics measure the active functions of a microbial ecosystem, enabling researchers to answer the question, ‘who is doing what’ [215, 229, 230]. These methodologies additionally enable researchers to connect the active functional biomolecules to the species from which they were derived and to observe functions beyond metabolism, both of which are advantages over metabonomics (described below). Metatranscriptomics is carried out in a very similar fashion to metagenomics or marker gene sequencing, with the exception that the RNA must first be reverse transcribed to cDNA [229]. As above, accurate quantification using qRT-PCR [231] or ddPCR [232] can also be conducted using cDNA. Transcription can be useful for measuring protein production rates. Since the turnover of RNA is rapid, it is not confounded by an existing pool of the biomacromolecule [201]. However, the method presents a few disadvantages. First, RNA is a much more labile molecule than DNA or proteins, and advanced preparation of samples is thus often required, using RNA stability matrices [233]. Reverse transcription also introduces an amplification bias akin to standard PCR [214]. Finally, the cell possesses several mechanisms downstream of transcription to regulate functional activity. Examples include post-translational modification and restriction of protein half-life [201]. In this light, metaproteomics provides a more accurate picture of the functional ‘end-product’.

In metaproteomics, proteins are first extracted from the cells, as well as from the environmental milieu [201, 230]. Samples are then subjected to analysis, usually using liquid chromatography tandem mass spectrometry (LC-MS/MS). Proteins must also be enzymatically digested, so that the resulting peptides are within the size range of favorable collision dynamics, thus enabling the masses of the fragmented amino acids to be measured. The full sequence of a protein can then be derived by assembling its peptide fragments, much as is done with fragments of DNA following metagenomics. The extraction technique can inherently favour certain classes of proteins. For example, the detergents used may not effectively solubilize specific types of membrane-bound proteins, protein precipitation methods select against small proteins and the use of trypsin for enzymatic digestion may bias for proteins that frequently contain the combination of amino acids that fit into its active site [234–237]. Quantification is also, in general, relative, as the acquired peptide intensities cannot be translated into meaningful concentrations. However, unlike NGS technologies, metaproteomics is not limited to a fixed capacity, as the peptides are first separated by chromatography and then read individually. If absolute quantification is necessary, an enzyme-linked immunosorbent assay can be employed, but akin to quantitative real-time PCR, prior knowledge of the target is necessary to design antibodies against appropriate epitopes [238]. LC-MS/MS also possesses high sensitivity. In fact, this sensitivity also represents a disadvantage because it erodes reproducibility between runs if background contamination (e.g., from air particulates) is a factor [239]. This drawback can fortunately be remedied bioinformatically through strict data filtration and log transformation [230], or experimentally by using isotope labelling to separate the sample material from debris. Isotopes can either be introduced chemically or by culturing the microbial community with a labelling reagent; the latter method is referred to as stable-isotope probing (SIP) [239, 240]. Protein-SIP particularly allows the capture of protein synthesized over a specific time-frame, depending upon how long the culture is incubated with the labelling reagent, in addition to time-course analysis, if sampling occurs over distinct time points. It should be noted that a newly synthesized protein may be present yet inactive. Fortunately, metaproteomics allows for detection of post-translational modifications through selected enrichment [241]. However, accurate interpretation of these data requires prior knowledge of the conditions in which a particular protein is active. Finally, the detected peptides also exhibit a lower resolution of information than DNA or RNA, as three nucleotide bases encode one amino acid and most amino acids can be translated from more than one codon [242]. As a result, proteins of similar function but derived from different organisms will be grouped together. Resolution of the instrumentation is a critical factor for metaproteomics, in order to differentiate between amino acids that are highly similar in chemical properties, with the Thermo Fisher Orbitrap being a paramount development [243]. Technological advancement will continue to improve the resolution of the obtained results; for example, an Orbitrap fusion mass spectrometer has been created to differentiate between leucine and isoleucine [244], which are of identical mass.
The use of defined microbial communities that are able to yield databases of high precision, combined with well-designed, e.g., protein-SIP studies for accuracy of quantification, will also enhance the quality of metaproteomic data in the future.
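
The digestion bias mentioned in this section can be illustrated with a small in silico digest. The sketch below assumes the commonly used simplified rule that trypsin cleaves on the C-terminal side of lysine (K) or arginine (R) unless the next residue is proline (P); the function names, the example sequences and the peptide length window are hypothetical and are not drawn from the cited work.

```python
# Illustrative sketch only: sequences and the length window are assumptions.

def tryptic_digest(protein):
    """Split a protein sequence after K or R, except when followed by P."""
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal remainder
    return peptides


def within_window(peptides, min_len=7, max_len=30):
    """Keep peptides whose length falls inside an assumed detectable window."""
    return [p for p in peptides if min_len <= len(p) <= max_len]


# Invented sequences: one with several cleavage sites, one with none.
for sequence in ("MKTAYRPLLKGASVLEDNIKR", "MAGSVLLDDEAAGGVVLLSSTTPPAAGGVVLLDDEE"):
    peptides = tryptic_digest(sequence)
    print(sequence, peptides, within_window(peptides))
```

A sequence lacking lysine and arginine remains a single long peptide, so under the assumed length window it would yield no detectable peptides, whereas a sequence rich in cleavage sites can be fragmented into peptides that are too short.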

1.3.2.3 Metabonomics

Metabonomics is unique in that it detects the ‘final output’ of a microbial ecosystem, which results from all of the metabolic processes that occurred, including intricate cross-feeding patterns [245]. Metabolites can also greatly affect the host, by causing cellular damage, by being metabolized by host cells as fuel or even by being transported to other parts of the body [68, 96–100]. The disadvantage is that metabolites cannot be easily traced back to the microorganisms from which they were derived; instead, at this time, only inferences based on pathway analysis can be made [119, 246]. Two principal methods can be utilized to detect metabolites: nuclear magnetic resonance (NMR) spectroscopy and mass spectrometry [200, 245, 247]. NMR spectroscopy presents two main advantages. First, the metabolites do not need to be extracted prior to analysis, which is extremely helpful in preventing bias. In contrast, mass spectrometry, requiring extraction of metabolites prior to analysis, presents a considerable bias because a combination of polar and non-polar extractions coupled to both positive and negative ionizations is often necessary to get a range of metabolic signatures [245, 248]. Additionally, LC-MS/MS alone does not easily capture the volatile compounds that can be particularly important in human gut microbiota research, namely the SCFAs [111, 249]; often, a further analysis using gas chromatography mass spectrometry (GC-MS) is required to capture such volatile components effectively. A second key advantage of NMR spectroscopy is that this technique is capable of absolute quantification, whereas mass spectrometry-derived data are relative because the acquired intensities cannot be meaningfully translated into concentrations. The exception to this latter rule, however, is if standards of known concentration are run alongside a sample; then, a concentration for that compound can be obtained from a calibration curve [250], although this circumvention relies on prior knowledge of the sample contents and the availability of appropriate standards. On the other hand, the chief disadvantage of NMR spectroscopy is its relatively low sensitivity when compared to mass spectrometry techniques [245]. The inability to detect compounds in low abundance can be a critical barrier depending on the research question.
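
The calibration curve approach mentioned above can be sketched as an ordinary least-squares fit of intensity against the known concentrations of a standard. The function names and all numerical values below are hypothetical, and a linear response over the calibrated range is assumed; this is an illustration, not the procedure used in the cited reference.

```python
# Illustrative sketch only: standards and intensities are invented values.

def fit_calibration(concentrations, intensities):
    """Least-squares line through (concentration, intensity) standard points."""
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(intensities) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, intensities))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def to_concentration(intensity, slope, intercept):
    """Invert the calibration line to estimate a sample concentration."""
    return (intensity - intercept) / slope


# Hypothetical dilution series of one standard (arbitrary units).
standard_conc = [1.0, 2.0, 4.0, 8.0]
standard_intensity = [150.0, 250.0, 450.0, 850.0]

slope, intercept = fit_calibration(standard_conc, standard_intensity)
print(to_concentration(650.0, slope, intercept))
```

The estimate is only meaningful for the compound that was calibrated and within the concentration range spanned by the standards, which reflects the reliance on prior knowledge of the sample contents noted above.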

Both NMR spectroscopy and mass spectrometry require databases to be populated with spectral information for compound identification [119, 251]. Otherwise, spectral signatures can only be compared between samples, via binned areas and intensities respectively, to determine if the overall metabolite landscape is different [252]. Even with databases existing, NMR spectroscopy can present difficulties in compound identification due to spectral convolution [253], i.e., overlap of compounds, and mass spectrometry is limited to the accuracy of mass quantified to discriminate between compounds [252, 254]. However, future developments in these methodologies will continue to improve the accuracy of results obtained. In addition to database population, strategies that enable a greater delineation of compounds detected via NMR spectroscopy, such as 2D 1H13C-NMR approaches, particularly with improvements in the ‘ultrafast’ single scanning method [255, 256], would increase the amount of compounds that can be confidently identified. Mass spectrometry depends upon instruments of high accuracy, such as quadrupole time-of-flight, and compounds are additionally fragmented, with quantities determined of both the base and fragment ions, to further probe the chemical structure [252, 254]. Technological advancements that enable a higher magnitude of decimal places or allow for double fragmentation to yield more structural information would thus lead to better identifications. A comparison between NMR spectroscopy and mass spectrometry is also warranted to enable researchers to select the most suitable method.
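
The dependence of mass spectrometric identification on mass accuracy can be illustrated by matching an observed mass against a small candidate list within a tolerance expressed in parts per million. The function names, the four-compound candidate table and the tolerances are assumptions chosen for illustration; the values are neutral monoisotopic masses calculated from the molecular formulae, and adduct formation is ignored for simplicity.

```python
# Illustrative sketch only: the candidate table and tolerances are assumptions.

CANDIDATES = {
    "leucine": 131.094629,     # C6H13NO2
    "isoleucine": 131.094629,  # C6H13NO2, isomer of leucine
    "lysine": 146.105528,      # C6H14N2O2
    "glutamine": 146.069142,   # C5H10N2O3
}


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def match(observed, tolerance_ppm, candidates=CANDIDATES):
    """Names of candidates whose mass lies within the tolerance of the observed mass."""
    return sorted(name for name, mass in candidates.items()
                  if abs(ppm_error(observed, mass)) <= tolerance_ppm)


print(match(146.1055, tolerance_ppm=500))  # low accuracy instrument
print(match(146.1055, tolerance_ppm=5))    # high accuracy instrument
print(match(131.0946, tolerance_ppm=5))    # isomers cannot be separated by mass
```

In this example, tightening the tolerance resolves lysine from glutamine, whereas leucine and isoleucine remain indistinguishable at any mass accuracy, which is why fragmentation or complementary methods are needed to probe chemical structure.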

1.4 Overview of thesis work and overall hypothesis

Three main issues need to be addressed in human gut microbiota research. First, the remarkable complexity of the ecosystem creates difficulties in dissecting the components that are key to maintaining host health. Second, without a theoretical framework, research progresses much more slowly, and it becomes challenging to make predictions in order to formulate suitable intervention strategies. Third, there are technical limitations to the current methodologies, which, upon improvement, would facilitate more definitive interpretations. A suitable model system, including the coupling of in vitro bioreactor culture and defined microbial communities derived from human fecal samples, allows deconvolution of the ecosystem, thus enabling a more precise uncovering of community structure and functioning. Experiments can then be designed to prove causative links, or rather, to determine the applicability of macroecological principles. Therefore, I hypothesize that microbial ecological theory can be replicated utilizing complex defined microbial communities. I propose to address this hypothesis via four main objectives, which are broadly divided into two components. The first is the development of ‘-omics’ tools to expedite the discovery of critical interactions within a microbial community, and the second is to then test ecological theories in vitro. The first two chapters address the first component, specifically involving 1) the comparison of 1H-NMR spectroscopy and LC-MS/MS based metabonomics analysis of bioreactor culture seeded with defined microbial communities derived from fecal samples and 2) a novel protein-SIP approach utilizing heavy water to elucidate the relative contribution of microbial ecosystem constituents to overall functional activity. The final two chapters then examine macroecological principles. In chapter 4, to test the theory that coevolution is a contributor to species assortment [192], an ‘artificial’ defined microbial community was created by matching species to a ‘natural’ defined microbial community, but where each species was sourced from a separate human subject.
Further, both communities were grown in two medium formulations representing a high fiber and a high protein diet respectively, to confirm the proposed relative roles of habitat filtering and species assortment in environmental selection [167], and several replicates of each condition were completed, to determine the contribution of environmental selection versus stochasticity to community assembly (Figure 3). In chapter 5, in order to test the theory that niche processes in habitat filtering dominate the competitive exclusion in species assortment [167], a fecal sample derived from a patient with ulcerative colitis (UC), a sub-type of IBD, was treated using MET both with and without antibiotic pre-treatment. It was also of interest to elucidate if the integrated microbes added unique functionalities or improved the redundancy of the low diversity, UC-associated microbial community to corroborate the proposed graphical relationship between diversity and functionality (Figure 4). These experiments were aimed at enhancing our understanding of human gut microbiota dynamics, which will ultimately lead to rationally designed therapeutic strategies that target gut microbial ecosystems exhibiting undesired behaviours, potentially improving outcomes of patients with GI disorders.

Chapter 2 – 1H-NMR spectroscopy vs. LC-MS/MS

1H-NMR spectroscopy and LC-MS/MS are two common methodologies utilized in metabonomics of human fecal samples [111, 200, 245, 247–249]. Each of these technologies has its own set of advantages and disadvantages. Briefly, 1H-NMR spectroscopy is higher throughput, as metabolites need not be extracted in order to be detected, and is capable of absolute quantification. LC-MS/MS, however, has a much higher sensitivity. Surprisingly few studies have compared the effectiveness and limitations of these techniques directly by measuring the same set of samples with each. I was particularly interested in how successfully each methodology was able to delineate a change in the behaviour of in vitro bioreactor-cultured defined microbial communities, and thus I measured samples before and after the application of a known perturbation, antibiotics [16]. The completion of metabonomics was a critical aspect of my research, as it was essential to not just observe the abundance of species present in the microbial communities but also to determine how they functioned, in order to properly assess ecological dynamics and microbial interactions, thus addressing my hypothesis. This study fits into the first component of my goals, i.e., development of ‘omics’ tools, and its completion enabled me to select the best method for my subsequent experiments. As I simply needed to evaluate when alterations in microbial community behaviour occurred, I chose to conduct 1H-NMR spectroscopy in chapter 4 and chapter 5. The following body of work is presented in the format of a mini review as published in the International Journal of Medical Microbiology (DOI: 10.1016/j.ijmm.2016.03.007) [259].

2.1 Article Information

Optimization of Metabolic profiling and characterization of defined in vitro gut microbial ecosystems

Dirk K. Wissenbach1,*, Kaitlyn Oliphant2,*, Ulrike Rolle-Kampczyk1, Sandi Yen2, Henrike Höke1,3, Sven Baumann1, Sven B. Haange1, Elena Verdu4, Emma Allen-Vercoe2, and Martin von Bergen1,5,6

Affiliations

1 Department of Molecular Systems Biology, Helmholtz – Centre for Environmental Research - UFZ, Permoserstrasse 15, D-04318 Leipzig, Germany
2 Department of Molecular and Cellular Biology, University of Guelph, Guelph, Ontario, Canada
3 Department of Pharmaceutical and Medicinal Chemistry, Institute of Pharmacy, University of Leipzig, Leipzig, Germany
4 Farncombe Family Digestive Health Research Institute, McMaster University, Hamilton, ON, Canada
5 Institute of , Faculty of Biosciences, Pharmacy and Psychology, University of Leipzig, Germany
6 Department of Chemistry and Biosciences, Aalborg University, 9000 Aalborg, Denmark
* These authors contributed equally to this work

Acknowledgements

The authors thank their colleagues Gabriele Heimpold and Brigitte Winkler for scientific discussion and technical assistance. The authors acknowledge funding support from Crohn’s and Colitis Canada, which allowed for culturing and NMR analysis of the UC-derived community. The authors also acknowledge funding of Sven Haange by DFG SPP165.

2.2 Abstract

The metabolic functionality of a microbial community is key to understanding its inherent ecological processes and its interaction with the host. However, the study of the human gut microbiota is hindered by the complexity of this ecosystem. One way to resolve this issue is to derive defined communities that may be cultured ex vivo in bioreactor systems and used to approximate the native ecosystem. Doing so has the advantage of experimental reproducibility and ease of sampling, and furthermore, in-depth analysis of metabolic processes becomes highly accessible. Here, we review the use of bioreactor systems for ex vivo modelling of the human gut microbiota with respect to analysis of the metabolic output of the microbial ecosystem and discuss the possibility of mechanistic insights using these combined techniques. We summarize the different platforms currently used for metabolomics and suitable for analysis of gut microbiota samples from a bioreactor system. With the help of representative datasets obtained from a series of bioreactor runs, we compare the outputs of both NMR and mass spectrometry-based approaches in terms of their coverage, sensitivity and quantification. We also discuss the use of untargeted and targeted analyses in mass spectrometry and how these techniques can be combined for optimal biological interpretation. Potential solutions for linking metabolomic and phylogenetic datasets with regard to active, key species within the ecosystem will be presented.

2.3 Introduction

Over the last decade, the relevance of the gut microbiota in many different human diseases, including inflammatory bowel disease and the metabolic syndrome, has become evident [260, 261]. At the same time, a need for increased understanding of host:microbiome interactions has driven the development of techniques assessing the molecular features of the microbiota, such as high throughput sequencing [220] and the use of gnotobiotic animal models [262]. Animal models remain essential to microbiota related research because they may be used to address microbial ecosystem functionality in vivo; in particular, the use of gnotobiotic animals colonized with selected, defined microbial community subsets has led to key insights in immunology and metabolism [262]. However, such animal models also have major drawbacks, including the expenses involved with husbandry and the low throughput nature of experiments. These, in turn, limit the number of conditions that can be experimentally tested and thus the mechanistic insight that can be gained. Additionally, differences between human and murine physiology (including the immune system) limit the relevance of murine models for human health research.

A useful alternative to animal models for the study of the gut microbiota is ex vivo culture of microbial ecosystems in bioreactor models [263]. Using such models, the gut microbiota can be cultured over time-frames and under conditions that can mimic those of the gut environment itself, and, furthermore, defined ecosystems may be reproducible to enable biological replication. Although ex vivo bioreactor systems are artificial (and results obtained using them must thus be carefully interpreted), they offer the key advantage of exquisite controllability of the ecosystem environment, and as such they are increasing in their popularity for mechanistic studies of the human gut microbiota associated with both health and disease.

Metagenomic studies of the human gut microbiome have provided a rough blueprint of possible metabolic reactions within this complex ecosystem [261]. The next step is a more functional understanding of the microbial ecology of the intestinal tract and its interaction with the host, but this requires detailed descriptions of the active metabolic processes. Potential metabolic pathways can be partially predicted on the basis of a metagenome [201] but there is a large gap between the genomic information and metabolomic phenotype for a given microbiome that needs to be addressed in greater experimental detail than metagenomics alone allows. The field of metabolomics aims to characterize and assess the collection of metabolites made by a given cell or tissue at a given time point [264], and two different platforms are commonly used, namely NMR and mass spectrometry. Both of these methods have drawbacks and merits: briefly, NMR is inherently quantitative but with limited sensitivity while mass spectrometry has a higher sensitivity but requires standards for quantification (and, additionally, due to the high diversity of physico-chemical polarities within the molecular class of metabolites, the chosen mass spectrometry technique, be it LC- or GC coupled, creates a strong bias in detected metabolites). Within the metabolomics field, there is an ongoing debate about the optimal relation of global and targeted analyses. Here we focus on how to obtain the most biologically meaningful information by using a combination of NMR and mass spectrometry and from the integration of global and targeted mass spectrometric and NMR based analyses.

The most dramatic drawback of isolated metabolomic data is the lack of a direct link of detected metabolites to the activity of specific species within the phylogenetically highly diverse microbiota. Several groups have tried to establish bioinformatic modelling of metabolic functions based on metagenomic data [265] or by comparing single culture activity with global metabolomic data [266]. Both methods yielded reasonable correlations between the model and experimental data, but it is important to understand that microbes grown in isolation, usually under highly artificial culture conditions, produce very different metabolic profiles compared to when they are growing as part of their native community [267, 268]. In order to partially overcome this problem, SIP techniques have been developed and are now well established in environmental microbiology. In stable isotope probing, isotopically labelled substrates are provided to a given ecosystem and their incorporation into biomolecules, such as fatty acids, proteins, or DNA/RNA [269, 270], is measured. The combination of metagenomic approaches with metabolomics/SIP is thus likely to become a central method for ascribing metabolic functionalities within a microbial ecosystem to the key species involved. While there are challenges involved in using SIP approaches in human and animal models of microbiota metabolism, such studies are ideally suited to ex vivo ecosystem models where the substrate administration and environmental conditions can be carefully controlled.

2.4 Relevance for human health, potential for mechanistic insights, and feasibility define the strengths of model systems

The utility of any model used for simulating human biology is built upon the strength of the potential mechanistic insights that can be obtained [271]. Although quantification of the level of such insights is difficult, it can help to visualize the various characteristics of different model systems using a semi-quantitative 3D plot (Figure 6). Use of human subject data has, of course, great relevance for human biology, but high experimental costs, along with ethical considerations that limit interventions and hence reduce experimental flexibility, minimize the use of human subjects in microbiome research. Animal models, for example, gnotobiotic mice colonized with components of the human gut microbiota, pose fewer ethical constraints compared to human subjects research, and thus offer more flexibility. However, experimental throughput is limited by cost and technical constraints. On the other hand, ex vivo culturing of the gut microbiota in bioreactors removes most ethical constraints and allows a higher throughput; however, this host-free approach is limited in that interpretation of host-microbiome interactions is heavily restricted. Thus, there is not yet any ideal model for studying the gut microbiota, and the most experimental strength will likely be drawn from combinatorial approaches.

Figure 6. Strengths and weaknesses of model systems. For judging model systems the criteria of relevance for human biology (z-axis), mechanistic insights (x-axis) and feasibility (y-axis) are used. A normalized scaling from 0 to 1 is used and the models are located within this plot according to their estimated values in respect to these criteria. The red droplines indicate for convenience the placement on the three axes. The black arrows indicate the ongoing and expected future efforts in order to improve the characteristic of the model system along a certain axis. The lines with a diamond at the end indicate an inherent limit of further improvement.

Most studies of the human gut microbiota are focused on microbial consortia from stool samples, as these are easily accessible. However, stool is not necessarily useful in the study of some gut diseases; for example, in the metabolic syndrome, metabolic interactions in the ileum are more relevant than those in the distal colon (which stool samples best represent), since bile acids within the ileum are highly abundant and central to the solubilisation and uptake of lipids and fatty acids, which in turn influence disease [272, 273]. The small intestine is also highly relevant for the etiology of inflammatory bowel diseases, particularly some forms of CD where the ileum is the site of inflammation [274]. Colonoscopy allows for sampling of the large intestine and the distal part of the small intestine, but ethical considerations limit this invasive procedure to disease cases or for colorectal cancer screening of (usually) older adults, and in addition, the requirement for bowel lavage prior to the procedure can influence microbiota composition and function. Although, in the future, the application of ingestible capsules may allow for some of the constraints of longitudinal sampling of the intestinal tract to be lifted [275], it is also important to appreciate the cross-sectional gradients of microbial community structure within the gut. For example, the mucus layer harbours a specific subset of the microbiota that provides a significant contribution to the immunological and metabolic interaction zone with the host organism [276, 277].

The problem of limited sampling can be somewhat negated by the use of gnotobiotic animal models, and as a result, functional studies that focus on the mechanistic links in microbiome:host interactions can be more easily explored. Of specific value is the ability to test hypotheses generated through observation of human patients; these studies range from proving that a transmissible microbiome can lead to obesity [70] to demonstrating the imprinting effect of hunger-induced effects in the microbiome on host metabolism [278]. However, differences in both host physiology and extrinsic factors such as diet between animals and humans will always limit the relevance of these models. In addition, the rising costs associated with animal research increasingly constrain experimental flexibility.

Ex vivo culturing of microbial ecosystems, whilst intrinsically reductionist in nature, is gaining importance in the human microbiota research field because it represents an experimental platform whereby the complexities of the gut microbial community alone can be studied, in the absence of, or with strict control of, specific host factors. The lack of host components can be seen as both a shortfall and an advantage of the model. On the one hand, a studied microbial ecosystem, normally shaped by the host environment, may encounter deficits in an ex vivo model, for example, if a host component that is important for the growth of one or more component species is absent. On the other hand, it can be argued that the output of a given gut microbial ecosystem is easier to measure ex vivo; for example, some metabolites produced by an animal-associated ecosystem may be readily absorbed by the host and thus difficult to measure, but within a bioreactor model the metabolites are more readily available for sampling. While, clearly, the bioreactor community is not a perfect facsimile of the native situation, it reduces the complexity of the system enough that major elements of ecosystem function can be resolved, such that predictions can be made about microbial interactions which can be tested later in the native animal if required [198]. A further refinement of bioreactor culture of gut microbial ecosystems is derivation of defined microbial ecosystems from fecal or gut biopsy specimens. With careful culture, a sizeable subset of microbial isolates can be obtained from a given sample and recombined back into a simplified community [199, 200, 279]. This approach has the advantage of allowing multiple experiments seeded with the same community and is not limited by the original sample size.
Although the derived community is usually much simpler than the community from which it was derived, its net function can share attributes that make it suitable as an experimental model [199].

2.5 Metabolic interaction as a key feature of microbiome:host interaction

The metabolic processes of the gut microbiome are complex and can be grouped into different classes: (1) metabolic pathways that merely support microbial metabolism; and (2) metabolic pathways that serve both the microbes and the host, and are relevant for microbiome:host interaction. The first group is exemplified by microbial digestion of protein derived from the diet in the distal gut, since this would otherwise not be utilized by the host [98]. A small fraction of the microbiota is able to degrade mucin, a host-derived molecule, and because it is unclear what the benefit to the host of this microbial activity may be, it is uncertain whether mucin-degrading microbiota species should be considered as part of this first group [280]. A good example of the second class of pathway is the microbial conversion of substrates that are indigestible to the host, such as complex starches, into metabolites which the host can easily utilize, e.g., SCFAs [281]. Through this synergistic process the bacteria gain energy through starch fermentation and the waste products of this metabolism, SCFAs, provide up to 70% of the energy uptake for colonic epithelial cells [282]. The potential importance of gut microbial metabolism to health is illustrated by the finding that there are differences in the fecal SCFA profiles of obese and lean individuals [283]. A further example of host:microbiome metabolic interaction is the production of B-vitamins, such as folate, cobalamin and riboflavin, by certain gut bacteria, which, as well as being important for bacterial metabolism, are absorbed for use by the host [284, 285].

IBD is associated with disturbances of the intestinal microbiota, often termed ‘dysbiosis’ [286, 287]. In UC, a subtype of IBD, the gut microbiota has been found to be both quantitatively and qualitatively abnormal in comparison to that of healthy individuals, although taxonomic differences vary widely across studies and are difficult to interpret [288]. On the other hand, metabolic pathways may represent a better target to study in order to understand dysbiosis in UC. For example, changes in metabolite profiles have been observed in fecal samples from IBD patients compared to healthy controls, including reductions in amino acid biosynthesis and carbohydrate metabolism [289], as well as differences in amino acids, microbiota-related SCFAs, and lactate abundance [290].

2.6 NMR and MS

For metabolomics there are two different technical platforms available, NMR and MS. While the first is inherently quantitative the latter technique is more sensitive, although mass spectrometry-based profiling has the integral disadvantage of imperfect quantification. Since the metabolism of the microbiota is highly complex, the best approach to its analysis is to first screen the global profile and then validate identified metabolic pathways with targeted assays that allow accurate quantification later. Metabolomics by LC–MS/MS uses high resolution MS systems with data dependent MS2 acquisition (DDA). Thus, accurate fullscan information and MS/MS spectra are available for identification of putative metabolites. Tools such as XCMS online [291] are designed to normalize spectra data and provide quantitative profile data from accurate fullscan [292]. This information can be used for compound identification using state-of-the-art databases such as Metlin [293], HMDB [251] and MassBank [294]. The qualitative information can be used for pathway enrichment analysis using software tools such as KEGG [119], MetaboAnalyst [295], or IPA [296]. Combining qualitative and quantitative information for global pathway analysis can be performed by software tools such as XCMS [297] and Ingenuity Pathway Analysis [298]. However, proposed effects on certain pathways should then be confirmed by methods that use targeted analysis to reveal key players for the corresponding pathways, as these methods provide more precise analytical (both qualitative and quantitative) information in contrast to the global LC–MS/MS DDA approach. This is especially true for identified pathways which are based solely on accurate fullscan identification/information.

1H-NMR represents an alternative method for analysis of metabolic signatures from the human gut [299] and has been previously used to measure metabolites produced by defined microbial ecosystems derived from the human gut in a bioreactor set-up [200]. The compounds detected by 1H-NMR are generally small molecules, some of which can be detected by both 1H-NMR and LC–MS/MS, although overall the two methods are complementary when used together, expanding the spectrum of detectable compounds.

Surprisingly few studies have used both NMR and LC–MS/MS in parallel in order to define the overlap or complementarity of these platforms. In order to allow the direct comparison and to show the merits and shortcomings of both techniques, we utilized a previously derived and characterized microbial ecosystem representative of active UC [199], and cultured it in bioreactor vessels simulating the environment of the human distal gut [200]. One subset of cultures was treated with rifaximin, an antibiotic which has been used for treatment of UC flares [300], another subset was treated with ethanol (the carrier for rifaximin) as a control, and a third and final subset was left untreated as a further control set. Samples were withdrawn from vessels, filtered to remove microbes, and aliquots subjected to both LC–MS/MS and 1H-NMR analysis, and the results from both techniques were interpreted and compared.

2.7 Metabolomics detects many spectral features and 50 metabolites detected by NMR result in the same quality of group separation

Metabolomics approaches are established for the different modes of mass spectrometry. In microbiome research so far, fecal samples are typically analyzed, and this approach is summarized and reviewed by Smirnov et al. [248] in this journal. Hence, we focus here on one aspect that is often overlooked, namely the different strengths of the platforms and how to combine them for optimal biological interpretation.

Samples from the bioreactor experiment sets described above were subjected to an LC–MS/MS workflow as shown in Figure 7. The semi-quantitative data from global mass spectrometric profiling were used for pathway detection by the XCMS platform, and some of the data were selected for validation through targeted MS or NMR data.

Figure 7. Metabolomics workflow. The analysis by the semiquantitative (shaded dark green) global mass spectrometry profiling allowed pathway detection by XCMS and some of them were validated by quantitative targeted analyses. Also, the quantitative NMR results (light green) were used for pathway enrichment as well as for validation.

Using DDA based on full scan MS survey scans, 23,117 “Total Aligned Features” were detected for positive ionization and 18,520 features for negative ionization. Of those, 7,235 were significantly regulated (p ≤ 0.01) using positive ionization (Figure 8A), while 5,478 regulated features were detected by negative ionization. In contrast to NMR, LC–MS based metabolomics using accurate fullscan and MS/MS acquisition reveals thousands of detected features in one run. However, even if large, state-of-the-art databases such as Metlin [293], HMDB [251] and MassBank [294] are available, data analysis is still a challenge. In principle, unambiguous identification of metabolites can only be achieved by retention time and the corresponding MS/MS spectrum or analyzing authentic standards of metabolite candidates. Depending on the MS settings used, this information is not always given (e.g., missing MS/MS spectrum according to the DDA approach). Thus, only a small fraction of the detected features, in our example only 332 out of over 30,000 features (about 1%), could be unambiguously identified (Figure 8A). With even higher mass resolution this proportion might be increased, but this ratio of spectral features to unambiguously identified metabolites is normally not reported in the literature. This small proportion of unambiguously identified metabolites might be improved in the future with metabolite specific spectral information. Consequently, an important demand is that the original global profiling data are stored in publicly available databases and can be used for database searches in the future, when more spectral data of single metabolites might be implemented in databases. Together with targeted analyses for bile acids, lipids, amino acids, and SCFAs we identified and quantified in total 523 metabolites by MS and 50 by NMR in this study. The number of metabolites in NMR is comparable with those obtained in other studies of fecal-derived microbial communities [200] and also other studies on fecal samples [301–303]. The types of metabolites are more closely analyzed in Figure 8C.
The largest overlap between NMR and MS exists in the group of carboxylic acids (11 out of 22). The mass spectrometry-based detection was achieved by using a targeted approach for detecting and quantifying SCFAs. Ten other metabolites were found in the global profiling by MS.

Figure 8. Identifications from global profiling and complementarity of targeted approaches. (A) The numbers of spectral features and the resulting identified metabolites in global screening are shown. (B) The complementarity of NMR and MS with respect to identified metabolites is shown in a proportional Venn diagram. (C) The redundancy of the targeted mass spectrometry-based approaches and NMR is depicted.

The next step often used in data evaluation is determining the group separation by principal component analysis (PCA). For this example, the PCA of MS data showed good separation in principle but an imperfect separation for one sample of run 40 (Figure 9B). In particular, the samples from bioreactors that underwent antibiotic treatment were well separated from the others, underpinning the expected strong effect. Despite the fact that MS provided over three orders of magnitude more data points than NMR, the PCA derived from MS data was not improved in comparison to the PCA based on 50 quantified metabolites detected by NMR (Figure 10A). This finding supports the assumption that the highest abundance metabolites (such as those detected by NMR) characterize and define the overall metabolism of microbiome cultures. PCA and other approaches, such as interactive PCA and partial least squares regression, are highly suitable tools for testing data quality in general and detecting correlations, but since the original information on single molecular features is not retained in PCA, it is of limited use for biological interpretation. In order to overcome this drawback, methods that preserve the identity of individual molecular features when comparing highly complex data, such as partial least squares-discriminant analysis (PLS-DA) and self-organizing maps (SOM), have been developed and introduced into the field of metabolomics [304, 305].
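The group-separation step can be illustrated with a minimal sketch. It is not the workflow used in this study: the function name, the power-iteration approach and the concentration values are all assumptions chosen for illustration, and only the first principal component is computed, using the standard library alone.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (direction, scores) of the first principal component.

    rows: list of samples, each a list of metabolite concentrations.
    """
    n, p = len(rows), len(rows[0])
    # Centre each metabolite (column) on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p).
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    v = [1.0] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Project every centred sample onto the component.
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centred]
    return v, scores

# Hypothetical concentrations: two untreated and two treated samples,
# three metabolites each (values invented for illustration).
samples = [
    [1.0, 2.0, 0.5],
    [1.1, 2.1, 0.4],
    [3.0, 6.2, 0.5],
    [3.2, 6.0, 0.6],
]
direction, scores = first_principal_component(samples)
```

In this toy data set the two groups fall on opposite sides of zero along the first component, which is the kind of separation a PCA score plot visualizes.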

Figure 9. Global Profiling by Mass Spectrometry. (A) A cloud plot for significantly (p-values ≤ 0.01) regulated features using positive ionization. (B) Discrimination of samples by PCA plot showing differences for treatment groups as well as for biological, technical replicates based on positive ionization data. (C) Pathway coverage [%] and significance level [−log (p-value)] of proposed pathways.

Figure 10. Evaluation of NMR data. (A) Discrimination of samples by PCA plot showing differences for treatment groups as well as for biological, technical replicates. (B) Enrichment analysis of metabolic pathways.

In order to derive biologically relevant information, the profiling spectra were analyzed using the XCMS platform with parameters reported by Ivanisevic et al. [292]; the identified pathways for positive and negative ionization are shown in Figure 9C. Based on fullscan and MS/MS data, regulations for several pathways were proposed by XCMS. Across positive and negative ionization, the proposed pathways (n = 39) showed high relative pathway coverage (median = 66.6%, range 33.3–100%) and highly significant regulation, with a median p-value of 0.00949 (0.00135–0.04591).

Data analysis of NMR data by metabolomics pathway analysis [306] revealed statistically relevant enrichments for similar pathways (Figure 10B) but with a much lower coverage in comparison to the analysis of mass spectrometry data. The second difference lies in the distribution of pathways. While the broad coverage of the metabolome by mass spectrometric profiling allows identifying a broad range of pathways, the pathways detected by NMR data focus on amino acids and carboxylic acid related pathways.

2.8 Targeted validation

Proposed pathways such as butyrate degradation, histidine degradation, l-carnitine biosynthesis and bile acid biosynthesis (neutral pathway), as well as dopamine degradation, l-cysteine degradation, and glycine were further investigated by targeted analysis. Here the quantification results of butyrate, glutamate, lysine, histidine, dopamine, cholic acid and deoxycholic acid are shown. The metabolites butyrate, glutamate and lysine were detected by both metabolomics platforms. The NMR measurements provide absolute quantifications, whereas MS provides only relative quantifications. While the measurements obtained for glutamate and lysine were similar, the results obtained for butyrate quantification differed for the last two samples. Our analysis as it stands does not indicate which measurement is more valid, but the use of two different analytical platforms may reveal specific analytes that require further methods for absolute quantification.

The proposed differences in abundance of amino acids were confirmed as shown in Figure 11B for histidine, and the biogenic amine, dopamine. The changes in the abundance of amino acids vary according to different pathways, suggesting pathway-specific effects of the treatments. Using the IDQ P180 kit [273], differences in carnitine abundance were found (not shown). Proposed changes in bile acid pathways such as “bile acids biosynthesis (neutral pathways)” were investigated by analysis of 16 human-specific bile acids. Strong effects were found for cholic acid and deoxycholic acid. Different gut microbial ecosystem-relevant classes of metabolites (specifically amino acids, SCFAs and bile acids) vary greatly in their physico-chemical properties, and thus it will be useful to establish recommendations for which platform to use for optimal analysis of each metabolite class.

Figure 11. Validation by targeted mass spectrometry and NMR. Quantitative analyses of the samples 39, 40 before and after treatment, and 41 before and after treatment are shown for different classes of metabolites. Samples treated with antibiotics or ethanol are linked by dotted lines. (A) Comparison between targeted MS analysis and NMR analysis for selected metabolites. Amino acids (B) and bile acids (C) were detected by commercially available kits.

2.9 Perspectives for metabolic flux and for linking community activity to the composition of the consortium

Despite the steady development of modelling approaches [265] there still remains the need for more detailed experimental analyses at a functional level. With respect to metabolomics this means going from abundance-based information to flux analysis [307], and this holds especially true for microbiome research where, for example, in studies of metabolic diseases it is important to understand how substrates are utilized. In such an experiment (Figure 12B) isotopically labelled substrates can be used and their metabolism within the ecosystem followed. For this type of work, both NMR and mass spectrometry would be useful platforms since both are capable of detecting either the change in mass or in the nuclear magnetic spin caused by 13C or 15N; these two isotopes could be particularly useful here, since many relevant substrates (including plant fibres and intermediates such as SCFAs and amino acids) are commercially available in a 13C or 15N labelled form.

Figure 12. Linking metabolomic information with phylogenetic information. Metagenomic data provide an overview of the metabolic potential but little information on the actual metabolic activity resulting in the measured meta-metabonomic profile (A). The application of 13C or 15N labelled substrates enables the analysis of their incorporation into metabolites, proteins and DNA (B). Protein-SIP and DNA-SIP then deliver the phylogenetic information of the active species.

The next step in biological interpretation of metabolome data from microbiome samples is determining the linkage between metabolic functionality and phylogenetic structure of the ecosystem under study. The application of SIP approaches to the field of microbiome research has proven to be technically difficult, in particular for delivery of the labelled substrates. The extremely high sensitivity of Nano-SIMS in terms of detection of incorporation allowed injection of labelled amino acids into the tail of mice and detection of labelled species in the mucus [308]. This technique requires establishing RNA-FISH probes for specific microbes within the ecosystem, and this is limited by available probes, and compatible hybridization conditions, thereby hindering a more global analysis of active component species. As a broader, non-targeted alternative to this approach, ex vivo cultures of gut microbial ecosystems have been incubated with heavy water (D2O) to allow active members of the consortium to
take up the label in contrast to the inactive fraction that does not; cells were then sorted through Raman microspectroscopy and single-cell sequencing was used to identify the microbial members of the active fraction [309]. A similar approach using isotopically labelled substrates could be applied to a bioreactor vessel system in order to allow global analysis of metabolically active species in the microbiota supported therein (Figure 12B).

Metabolomic analysis by MS and NMR will thus allow the metabolic flux within a microbial community to be assessed, and this can be linked to microbial phylogeny using specific sequencing technologies, although the latter part of the procedure depends on the quality of the available metagenome database [310]. Protein–SIP may offer the best resolution for these kinds of experiments: for 13C detection only 1% of additional incorporation is sufficient for detection compared to at least 25% of incorporation for DNA-SIP [269]. This higher sensitivity for protein–SIP is of specific relevance since it allows the utilisation of smaller amounts of label.

2.10 Conclusion

The combination of culturing microbiome-derived ecosystems and metabolomic analyses provides a powerful tool for investigating the effects of ecosystem perturbation (including assessment of treatments) at the functional level. For obtaining broad biological information, the combination of NMR and mass spectrometry is recommended. With respect to mass spectrometry, the integration of untargeted profiling and targeted validation is necessary to obtain quantitative data. The combination of platforms and integration of profiling and targeted analysis allows coverage of a wide range of metabolites, thereby capturing key pathways in microbiome biology.

Chapter 3 – Protein-SIP utilizing Heavy Water

Protein-SIP is a relatively new technique that has some clear advantages in terms of elucidating microbial community functional activity [201, 230, 239, 240]. Particularly, the measurement of proteins enables the delineation of functions beyond metabolism, and such functions can be traced back to the microorganisms from which they were derived. Further, not only are proteins the final actors of the central dogma, but SIP inherently only detects the actively synthesized proteins over a specific time frame. Therefore, time course analysis is possible. Currently popular protein-SIP protocols utilize 13C or 15N labelled substrates [311]. Although useful in illuminating how a food source or elemental cycling is handled in the context of a microbial ecosystem, this strategy does place a limit on the types of activities that are able to be quantified. The use of heavy water circumvents this shortfall, as water is universally taken up by microbes and incorporated into biomacromolecules via hydrolysis reactions [312, 313]. In this study, the aim was not only to demonstrate how heavy water-based protein-SIP allows for the measurement of global microbial activity, but also to determine the relative effectiveness of the different isotopes (D vs. 18O) and implement them into the readily available bioinformatics pipeline MetaProSIP [240]. This study fits into the first component of my goals, i.e., development of ‘-omics’ tools, and although not applied to address my hypothesis in subsequent chapters, such an approach is invaluable to the field of microbial ecology in general, as it is a more broadly applicable technique that elucidates ecological dynamics and microbial interactions. Further, for the demonstrative aspect, the fecal-derived defined microbial community was grown in two different medium formulations simulating a high fiber and high protein diet, respectively. The response of the microbial community to diet, a known modulator of the human gut microbiota [4] and a critical component of environmental selection [174], was informative for later experiments in chapter 4 that made use of the identical media conditions. The following body of work is presented in the format of an original research article to be submitted to the ISME Journal.

3.1 Article Information

Tracing species-specific general metabolic activity in complex communities by detecting isotopes from heavy water in proteins

Robert Starke1*, Kaitlyn Oliphant2*, Nico Jehmlich1*, Stephanie Schäpe1, Sven Baumann1,4, Timo Sachsenberg3, Oliver Kohlbacher3, Emma Allen-Vercoe2, and Martin von Bergen1,4

Affiliations

1Department of Molecular Systems Biology, Helmholtz-Centre for Environmental Research (UFZ), Leipzig, Germany

2Department of Molecular and Cellular Biology, University of Guelph, Guelph, ON, Canada

3Wilhelm Schickard Institut für Informatik, Applied Bioinformatics Group, University of Tübingen, Tübingen, Germany

4Institute of Biochemistry, Faculty of Biosciences, Pharmacy and Psychology, University of Leipzig, Leipzig, Germany

*These authors contributed equally to this work

Acknowledgements

We would like to acknowledge the Natural Sciences and Engineering Research Council of Canada scholarship and Ontario Ministry of Training, Colleges and Universities scholarship to KO for providing funding.

3.2 Abstract

Over the past decades, SIP approaches have risen in popularity as an elegant tool to highlight active organisms in ecosystems, through monitoring the incorporation of an isotopically labeled substrate into biomolecules to examine energy or nutrient flows. Here, we have validated and demonstrated a substrate-independent protein-SIP protocol using isotopically labeled water that captures the complete microbial activity of a community. We tested both D2O and H218O as markers for the detection of metabolic activity in a pure culture of E. coli K12 grown at both growth permissive and retardant temperatures. We found that 18O yielded a higher incorporation rate into peptides and thus was more sensitive. We then applied the method to an in vitro model of a human distal gut microbial ecosystem grown in two medium formulations, to evaluate changes in microbial activity between a high fiber and high protein diet. We showed that there is little change in both the abundance of species and functional groups between diets, providing evidence that the gut microbiota preferentially responds to carbohydrates rather than increased protein content. Through detecting general metabolic activity, we were additionally able to discriminate between the generalists and specialists in the microbial community, and identify key species involved in the dietary adaption process. Our approach can be applied to study any microbial ecosystem, and we have implemented the pipeline into the bioinformatics tool MetaProSIP, which is freely available to researchers through OpenMS.

3.3 Introduction

Culture-independent omic-techniques are deployed to gain deeper insights into the structure and function of microbial communities [314]. Among these commonly used omic-tools, metaproteomics has gained popularity as a central element in microbial ecology studies to decipher functional relationships between community members [269, 315]. However, the assessment of the relative metabolic activity of distinct community members can only be facilitated by SIP approaches [316]. The two possibilities for labeling proteins are (a) the utilization of an energy or nutrient source with a heavy isotope component, which can be biochemically incorporated into amino acids, or (b) the SILAC protocol, where specifically labeled amino acids are added to the medium [317]. The latter method is problematic for microbiome research, as it is restricted by the need to use defined culture medium, in which all amino acids can be replaced by the labeled amino acids. SILAC is also dependent on the studied microorganisms being auxotrophic for the labeled amino acids [318]. Thus, studies utilizing protein-SIP have favoured the isotopic labeling approach, with 13C and 15N being the most widely used elements [311, 312, 319, 320]. This previous work has successfully applied the method to determine the relative isotope abundance (RIA), which relates to the amount of substrate utilization of a microorganism either directly or indirectly, and the labeling ratio (LR), which relates to the microbial growth or protein turnover [319]. Careful cross-evaluation of all members in a studied community, especially in time course experiments, can provide both information on the mechanism of substrate use and the overall contribution of activity by each taxon [320]. However, the general drawback of these 13C and 15N-based SIP studies is that only the activity of the substrate-degrading microbes is assessed, and thus the activity of the microbial community in its entirety cannot be determined [311]. A possible solution was presented by Justice and colleagues, which involved the parallel usage of two different labels, one for the substrate specificity (15N) and one for determining the baseline metabolic activity (D2O), and this method was successfully used to elucidate the relative activity of the microbial constituents present in acid mine drainage biofilms under different conditions [312]. More recently, it was also shown that 18O-labeled water can be tracked in both the DNA [321–323] and RNA [324, 325] of active key players in microbial communities. However, the behavior of microbes metabolizing isotopically labeled water, either as D2O or H218O, has yet to be explored at the proteomic level.
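The two protein-SIP read-outs introduced above, RIA and LR, can be made concrete with a small sketch. The definitions coded here are simplifying assumptions for illustration only: LR is taken as the labeled share of a peptide's total intensity, and RIA as the intensity-weighted mean fraction of labelable sites carrying a heavy atom, ignoring natural isotope abundance. The function names and input values are hypothetical, and MetaProSIP uses its own, more elaborate pattern decomposition.

```python
def labeling_ratio(labeled_intensity, unlabeled_intensity):
    """Assumed definition: labeled share of the total peptide intensity."""
    total = labeled_intensity + unlabeled_intensity
    if total == 0:
        raise ValueError("peptide has no signal")
    return labeled_intensity / total

def relative_isotope_abundance(isotopologue_intensities, n_sites):
    """Assumed definition: mean fraction of labelable sites that are heavy.

    isotopologue_intensities[k] is the intensity of the labeled peptide
    species carrying k heavy atoms; n_sites is the number of positions
    that could carry the label.
    """
    total = sum(isotopologue_intensities)
    if total == 0:
        raise ValueError("labeled peptide has no signal")
    mean_heavy = sum(k * i for k, i in enumerate(isotopologue_intensities)) / total
    return mean_heavy / n_sites

# Hypothetical peptide: 30% of the signal is in the labeled population,
# which carries on average 3 heavy atoms across 8 labelable sites.
lr = labeling_ratio(30.0, 70.0)
ria = relative_isotope_abundance([0.0, 0.0, 1.0, 2.0, 1.0], n_sites=8)
```

Keeping the two quantities separate mirrors their interpretation in the text: LR tracks how much new protein was synthesized, while RIA tracks how heavily that new protein is labeled.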

In this study, we applied isotopically labeled water, as D2O and H218O, to both pure cultures of E. coli K12, and to defined microbial communities derived from a healthy human fecal donor, in order to examine the temporal dynamics of isotopically labeled water incorporation into the proteome. We verified the activity measure by incubating active and inactive cultures of E. coli K12 at growth permissive and retardant temperatures, respectively. The incorporation patterns were implemented into the bioinformatics tool MetaProSIP [240] to facilitate a high-throughput analysis of the LC-MS data. The pipeline was validated for D using commercially available D5-ring labeled Angiotensin-II as exemplary peptide with incorporation, and for 18O-labeled water using a proteolytic digestion of bovine serum albumin (BSA). Next, we utilized the pipeline in an in vitro bioreactor-based model of a human distal gut microbial ecosystem, to illustrate its applicability to addressing a biologically relevant question in a microbiome-related setting. We investigated the differential effects of two media formulations, designed to represent a high fiber and a high protein diet respectively, on microbial activity, since it is well known that diet is an important modulator of colonic microbial ecosystems [326]. The aims of this study thus were to 1) determine the most suitable heavy water (D2O or H218O) for protein-SIP experiments in microbial communities, 2) demonstrate how a protein-SIP approach can contribute to the analysis of key species and key pathways in a complex microbial community, and 3) provide a software tool for researchers to complete protein-SIP studies utilizing heavy water.

3.4 Methods

3.4.1 Validation of isotope detection and abiotic HD-exchange

In order to validate the MetaProSIP v2.0 pipeline, ring-D5 at phenylalanine labeled Angiotensin-II (Bachem, Bubendorf, Switzerland) was mixed with unlabeled Angiotensin-II (Bachem, Bubendorf, Switzerland) at different labeling ratios (20, 40, 60, 80 and 100%) in 0.1% formic acid, in triplicate. For 18O incorporation analysis, BSA was tryptically digested at different concentrations of H218O (99 atom%, Sigma-Aldrich) as previously described [327]. Briefly, 1 mg of BSA was incubated in a mixture of 10 µL trypsin solution (0.02 µg µL-1) and 10 µL water with descending percentages of label (100%, 75%, 50% and 0%), resulting in a final trypsin concentration of 0.01 µg µL-1 and labeling ratios of 50%, 38%, 25% and 0%, respectively, at 37 °C for 16 h. The solution was dried in a vacuum centrifuge and stored at -20 °C. To determine the impact of the abiotic HD-exchange, 1 µM Angiotensin-II (Bachem, Bubendorf, Switzerland) was incubated in 1 mL 100% D2O (99.9 atom%, Sigma-Aldrich) with shaking (500 rpm) at room temperature for 24 h (n=3). After 2, 4, 8 and 24 h, 100 µL samples were withdrawn, dried in a vacuum centrifuge and stored at -20 °C. The remaining solution of 600 µL containing D-labeled Angiotensin-II was additionally dried, resuspended in 600 µL unlabeled ddH2O and incubated with shaking (500 rpm) at room temperature for 72 h (n=3). After 24, 48 and 72 h, 100 µL samples were taken, dried in a vacuum centrifuge and stored at -20 °C. Dried protein pellets were resuspended in 0.1% formic acid and measured by mass spectrometry.
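What the mass spectrometer is expected to see for the labeled validation standards can be sketched as a simple mass-shift calculation. This is an illustrative aid rather than part of the MetaProSIP pipeline: the function name and the chosen charge states are assumptions, the isotope mass differences are standard monoisotopic values rounded to six decimals, and the example of two exchanged oxygen atoms per tryptic peptide is an assumption that the text does not state.

```python
# Mass gained per substituted atom, in Da (standard monoisotopic masses, rounded).
DELTA_MASS = {
    "D": 2.014102 - 1.007825,      # 2H replacing 1H
    "18O": 17.999160 - 15.994915,  # 18O replacing 16O
    "13C": 13.003355 - 12.000000,  # 13C replacing 12C
    "15N": 15.000109 - 14.003074,  # 15N replacing 14N
}

def mz_shift(isotope, n_atoms, charge):
    """Shift in m/z caused by n_atoms heavy atoms on an ion of the given charge."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return n_atoms * DELTA_MASS[isotope] / charge

# Ring-D5 Angiotensin-II carries five deuterium atoms.
d5_shift_z1 = mz_shift("D", 5, charge=1)   # about 5.03 m/z units
d5_shift_z2 = mz_shift("D", 5, charge=2)   # about 2.52 m/z units

# Assumed example: a tryptic peptide with two 18O atoms, doubly charged.
o18_shift_z2 = mz_shift("18O", 2, charge=2)
```

Because the shift scales inversely with charge, the same label produces a smaller spacing between the unlabeled and labeled peaks for more highly charged ions.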

3.4.2 Growth of E. coli K12

The strain E. coli K12 was obtained from the Leibniz-Institut DSMZ - Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSM 498). For the starter cultures, three replicates of 100 mL LB-Miller medium supplemented with 2 g L-1 glucose [326] were each inoculated with 10 µL of the thawed commercial culture and three replicates served as blanks. The cultures were incubated at 37 °C with 200 rpm agitation (KS 4000 I control, IKA, Staufen, Germany). The optical density at 600 nm was recorded over time using 1 mL of the culture (Novaspec II, Pharmacia, Uppsala, Sweden) to monitor growth. Due to the limited amount of 18O and D labeled water, experiments were then downscaled to 4 mL. The 8 mL reaction tubes contained 2 mL LB-Miller medium supplemented with 2 g L-1 glucose and 2 mL of ddH2O. For the protein-SIP experiments, the water was replaced by either 2 mL H218O (99 atom%, Sigma-Aldrich) or D2O (99.9 atom%, Sigma-Aldrich). Half of the cultures were then diluted 1:1, yielding the final total concentrations of 0, 25 and 50% labeled water with three replicates each (n=15). For inoculation, 50 µL of the starter culture (OD of 1.0) was added. Cultures were then incubated as described above, stopping after 2, 4 and 6 h for sample collection. The 1 mL sample volumes were
centrifuged at 13,200 rpm, -4 °C for 10 min (Eppendorf Centrifuge 5430 R, Eppendorf North America, New York, USA). Cell pellets were resuspended in 1 mL Tris buffer (20 mM Tris/HCl pH 6.8, 0.1% SDS) and ultrasonicated for 10 min. The cell suspension was then centrifuged a second time at 13,200 rpm, -4 °C for 10 min. The supernatant was subsequently collected and stored at -20 °C.

For validation of activity measurements, starter cultures of E. coli K12 were made in 10 mL BHI (Brain Heart Infusion Medium, Carl Roth, Karlsruhe, Germany) by inoculating with 1 mL of E. coli K12 and incubating aerobically with agitation for 24 h (37 °C, 700 rpm). Stock solutions of 50% D2O and 50% H218O were prepared by mixing sterile ddH2O with pure D2O and H218O (v/v). Subsequently, 1.5 mL (A, B) or 0.5 mL (C, D) of fresh BHI medium was mixed with 500 µL each of the E. coli starter culture and aliquoted in 15 mL volumes. Cultures were incubated (KS4000 I control, IKA, Staufen, Germany) with agitation at 37 °C (A, C) or at 4 °C (B, D). After 4 h, 2 mL of 100% D2O (AI, BI), 2 mL of 50% D2O (AII, BII), or 1 mL of 50% H218O (CII, DII) were added and incubation continued with agitation at 37 °C or at 4 °C for an additional 2 h. Samples were then collected and centrifuged (Universal 320 R, Hettich Zentrifugen, Tuttlingen, Germany) at 10000 rpm for 10 min to pellet cells. For bacterial cell lysis, pellets were incubated for 30 s in liquid nitrogen followed by 30 s in a 90 °C water bath. After three freeze-thaw cycles, a pipette tip of the pellet was mixed with 21 µL of digestion buffer (20 mM ammonium bicarbonate, 5% acetonitrile) containing 0.02 µg µL-1 trypsin (Promega) and incubated overnight at 37 °C. The digestion was stopped with 1 µL of 100% formic acid per reaction and after centrifugation (10000 rpm, 10 min), the supernatant was dried in a vacuum centrifuge. Samples were resuspended in 0.1% formic acid, then desalted and purified by ZipTip® treatment (Millipore, Billerica, MA, USA).

3.4.3 Bioreactor operation and batch culture

Two Multifors bioreactors (Infors, Basel/Bottmingen, Switzerland) were operated as in vitro models of the distal human gut with working volumes of 500 mL as previously described [200], but with custom medium formulations representing different diets adapted from Marzorati et al. [328] to accommodate a single vessel system as in McDonald et al. [198] (Table S1). Both bioreactors were inoculated with a defined microbial community isolated from a healthy human fecal sample and described in Yen et al. [200], comprising 63 bacterial strains in total from 6 phyla (Table S2). Bioreactor A was fed the high fiber medium and Bioreactor B was fed the high protein medium, and the microbial ecosystems were allowed two weeks to equilibrate. Batch cultures were then set up using harvested bioreactor material and the isotopically labeled waters, D2O and H218O. Each batch culture comprised 2 mL of the harvested bioreactor contents, 1 mL of the pre-reduced, double-strength respective medium used in the bioreactor, and 1 mL of the isotopically labeled water, with ddH2O used as a control. In total, 18 x 4 mL batch cultures were prepared, since triplicates were completed per medium formulation and per water type. The batch cultures were incubated in an anaerobic chamber (Baker Ruskinn, Sanford, ME, USA) at 37 °C with a gas mixture of 5:5:90 H2:CO2:N2 for 12 h, after which the culture contents were evenly divided into two aliquots and immediately cryopreserved at -80 °C. A sample of the bioreactor contents for each medium formulation was also collected upon batch culture preparation, and immediately cryopreserved at -80 °C.
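The batch culture design can be enumerated to confirm the count of 18 cultures and the 25% water fraction used in the Results. This is a minimal sketch; the variable names and condition labels are hypothetical, and it assumes that the labeled water was added undiluted.

```python
from itertools import product

media = ["high fiber", "high protein"]
waters = ["ddH2O (control)", "D2O", "H218O"]
replicates = 3

# Component volumes (mL) of a single batch culture, as described in the text.
volumes_ml = {"bioreactor contents": 2, "double-strength medium": 1, "water": 1}
total_ml = sum(volumes_ml.values())
water_fraction = volumes_ml["water"] / total_ml

# One entry per culture: (medium, water type, replicate number).
design = list(product(media, waters, range(1, replicates + 1)))

print(len(design), "cultures of", total_ml, "mL;", f"{water_fraction:.0%} water by volume")
```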

3.4.4 Batch culture sample processing

Samples were thawed and then centrifuged at 14,000 rpm (28,260 x g) for 15 min at 4 °C. The cell pellets were used for both DNA and protein extraction, with one of the sample aliquots being dedicated to each. The DNA extraction was completed via the QIAamp DNA Stool Mini Kit (Qiagen, Germantown, MD, USA) following the manufacturer’s directions with slight modifications: the cell pellet was resuspended in 200 μL of 100 mM Tris-HCl, 10 mM EDTA, pH 8.0 buffer prior to the first step, the initial high-temperature incubation was done at 95 °C for 15 min, and the final elution of DNA was carried out using only 50 μL of the elution buffer, which was pre-warmed at 50 °C. All DNA samples were stored at 4 °C after processing. The protein extraction was completed by first resuspending the cell pellet in 1 mL of 0.4% (w/v) sodium dodecyl sulfate in 100 mM Tris-HCl, 5 mM EDTA, 0.5 mM NaCl, pH 8.0 buffer with 0.1 μL of protease inhibitor cocktail set III (Calbiochem, Etobicoke, ON, Canada) added. The samples were then bead-beaten with 0.2 g of zirconia beads (Biospec Products Inc., Bartlesville, OK, USA) each using the Digital Disruptor Genie (Scientific Industries Inc., New York City, NY, USA) at 3000 rpm for 4 min, followed by incubation at 60 °C for 15 min, before being sonicated using the Sonicator Ultrasonic Processor XL2020 (Mandel Scientific, Guelph, ON, Canada) on ice for a total of 2 min with 10-s pulse, 10-s off intervals. Samples were subsequently centrifuged at 8,330 rpm (10,000 x g) for 10 min at 4 °C, and the supernatant containing the protein extract was collected and stored at -20 °C.
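The paired rpm and x g values in this section are linked through the rotor radius. The sketch below is not part of the original protocol: it uses the standard relation RCF = ω²r/g to back-calculate the radius implied by each pair. The function names are hypothetical, and the resulting radius is an inference from the reported numbers, not a specification given in the source.

```python
import math

G = 9.80665  # standard gravity, m s^-2

def rcf(rpm, radius_cm):
    """Relative centrifugal force (multiples of g) at a given speed and radius."""
    omega = 2 * math.pi * rpm / 60          # angular velocity, rad/s
    return omega ** 2 * (radius_cm / 100) / G

def implied_radius_cm(rpm, rcf_value):
    """Rotor radius (cm) that reconciles a reported rpm with a reported RCF."""
    omega = 2 * math.pi * rpm / 60
    return rcf_value * G / omega ** 2 * 100

r_pellet = implied_radius_cm(14000, 28260)   # initial pelleting step
r_extract = implied_radius_cm(8330, 10000)   # clarification of the protein extract

print(round(r_pellet, 2), round(r_extract, 2))
```

Both steps give nearly the same radius, which suggests that the two reported speed and force pairs are internally consistent with a single rotor.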

3.4.5 Direct infusion of D-labeled Angiotensin-II

Dried D-labeled Angiotensin-II pellets were resuspended in 40% acetonitrile plus 0.08% formic acid and measured with direct infusion by Orbitrap Q-Exactive-MS to ensure minimal exposure to unlabeled water. Continuous scanning of the infused peptide ions was carried out between 350 and 1,550 m/z at a resolution of 120,000. Mass spectra were collected for 30 s after 0.1, 2, 3, 5 and 8 min.

3.4.6 Sample preparation for proteomics

The supernatants were incubated with five-fold volumes of 100 mM ammonium acetate in methanol overnight. After centrifugation at 7,830 rpm, -4 °C for 10 min (7,000 x g, Sorvall RC 6 plus, Thermo Electron Corporation, Waltham, MA, USA), cell pellets were resuspended in 1 mL ice-cold acetone and centrifuged at 13,200 rpm (12,000 x g), -4 °C for 10 min. The entire volume of the samples was used for 1D gel electrophoresis without prior determination of the amount of proteins. Air-dried protein pellets were suspended in 30 µL 1x Laemmli buffer [329], dissolved via ultrasonication, and incubated with shaking at 500 rpm, 90 °C for 10 min. Samples were centrifuged at 13,200 rpm as described above to remove precipitates before loading on sodium dodecyl sulfate gels (4% stacking gel and 12% separating gel). Electrophoresis was performed at 10 mA per gel. Polypeptides were stained by colloidal Coomassie Brilliant Blue G-250 (Roth, Kassel, Germany). Gel lanes were cut into pieces for each sample, and an in-gel tryptic digestion was performed [330]. Excised gel bands were washed thrice with 200 µL of 10 mM ammonium bicarbonate in acetonitrile (40% v/v) for 10 min. Gel pieces were then dried with 200 µL acetonitrile for 5 min. After removal of acetonitrile, the dried gel pieces were reduced and alkylated by subsequent incubation with 30 µL of 10 mM dithiothreitol in 10 mM ammonium bicarbonate then 30 µL of 100 mM iodoacetamide in 10 mM ammonium bicarbonate for 30 min each. Afterwards, the solution was removed, and the gel pieces were dried in a vacuum centrifuge prior to incubation with acetonitrile as described above. After removal of the acetonitrile, the gel pieces were equilibrated with 200 µL of 10 mM ammonium bicarbonate for 10 min followed by a further incubation with acetonitrile as described above. Gel pieces were finally incubated with 30 µL of 5 mM ammonium bicarbonate containing 0.01 µg µL-1 trypsin (Promega) overnight at 37 °C. 
After proteolytic cleavage, trypsin solution was collected and the peptides from gel bands were extracted twice with 30 µL acetonitrile/formic acid (50%/5% v/v). Solutions were combined and dried in a vacuum centrifuge as described above. Samples were resuspended in 0.1% formic acid, then desalted and purified by ZipTip® treatment (Millipore, Billerica, MA, USA).

3.4.7 Bioinformatics tool development

We extended the OpenMS tool MetaProSIP with algorithms to extract ion chromatograms of D or 18O labeled peptides. For D labeled peptides, our novel algorithm uses a Theil-Sen estimator to fit a robust regression line through chromatographic apices and compensate for retention time shifts between isotopologues. Our novel method then derives single isotopic peak intensities by integrating over the corresponding extracted ion chromatogram. We further extended the methods to calculate isotope patterns of D and 18O labeled peptides and used the previously described decomposition algorithm to calculate RIA and LR of the labeled peptide species. These extensions to the MetaProSIP tool enabled, to the best of our knowledge for the first time, the performance of LC-MS-based protein-SIP analyses using isotopically labeled water. MetaProSIP workflows were next constructed in the integration platform KNIME, allowing for sophisticated downstream processing. We provide an analysis workflow to demonstrate how MetaProSIP now interfaces with novel OpenMS data processing tools that support precursor charge correction, mass recalibration, improved non-linear chromatographic retention time alignment, a novel feature detection algorithm, and extended support for peptide identification engines. Source code of the extended MetaProSIP tool and installers for all major platforms (Windows, Linux, and Mac) are freely available as part of the OpenMS framework (www.openms.de).
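To illustrate why a Theil-Sen estimator suits the retention time correction described in this section, the following Python sketch fits a line through hypothetical chromatographic apex positions. It is a generic textbook formulation (median of pairwise slopes), not the MetaProSIP source code, and the retention times are invented for demonstration.

```python
from itertools import combinations
from statistics import median

def theil_sen(x, y):
    """Return (slope, intercept) of the Theil-Sen line through the points (x, y)."""
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i, j in combinations(range(len(x)), 2)
              if x[j] != x[i]]
    if not slopes:
        raise ValueError("need at least two points with distinct x values")
    slope = median(slopes)
    # The intercept is the median residual offset under the fitted slope.
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept

# Hypothetical apex retention times (s) for isotopologue peaks 0..6 of one peptide.
# Heavier isotopologues drift to slightly earlier times; peak 4 is a mis-picked apex.
isotopologue = [0, 1, 2, 3, 4, 5, 6]
apex_rt = [1200.0, 1199.5, 1199.0, 1198.5, 1210.0, 1197.5, 1197.0]

slope, intercept = theil_sen(isotopologue, apex_rt)
print(slope, intercept)
```

Because the estimate is built from medians, the single outlying apex does not pull the fitted line, whereas an ordinary least-squares fit through the same points would be shifted toward it.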

3.4.8 Mass spectrometry and identification of stable isotope incorporation

Tryptic peptides were analyzed by UPLC-Orbitrap Q-Exactive-MS/MS. The peptides were eluted using a linear gradient of 125 min with 4-55% solvent B (80% acetonitrile, 0.08% formic acid) or, in case of the validation, using a linear gradient of 60 min with 4-55% solvent B (80% acetonitrile, 0.08% formic acid). Continuous scanning of eluted peptide ions was carried out between 350 and 1,550 m/z at a resolution of 120,000 and a maximum injection time of 120 ms, automatically switching to MS/MS HCD mode using a normalized collision energy of 30%. The obtained raw data was processed with database searches by Thermo Proteome Discoverer (v1.4.1.14; Thermo Fisher Scientific, Waltham, MA, USA). Searches were performed using the Sequest HT algorithm with the following parameters: tryptic cleavage with a maximum of two missed cleavages, a peptide tolerance threshold of ±10 ppm, an MS/MS tolerance threshold of ±0.1 Da, carbamidomethylation at cysteines as a static modification and oxidation of methionines as a variable modification. Searches were performed against the genome of E. coli K12 (Uniprot, 02/16/2016) or, in the case of bioreactor samples, a combined metagenome consisting of the species present in the defined microbial community (Table S2). Only contigs with at least one unique peptide and high confidence (false discovery rate <0.01) were considered. The false discovery rate was determined by decoy database searches. The MetaProSIP tool of OpenMS was used for the identification of stable isotopes in the proteins. To this end, *.raw data was converted into *.mzML files using MSConvert of ProteoWizard. The latter files were downsized into both *.mzML and *.featureXML files using a stricter fragment mass tolerance of ±0.02 Da. The generated files were then used for the MetaProSIP searches with an m/z tolerance of ±10 ppm, an intensity threshold of 1,000 and a correlation threshold of 0.7.
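The search parameters mix relative (ppm) and absolute (Da) tolerances. The short sketch below shows how a ppm tolerance translates into an m/z window and how an absolute tolerance compares at a given m/z. The function names and the example m/z of 500 are hypothetical choices for illustration and are not taken from the search software.

```python
def ppm_window(mz, ppm):
    """Lower and upper m/z bounds for a symmetric ppm tolerance."""
    delta = mz * ppm / 1e6
    return mz - delta, mz + delta

def da_to_ppm(delta_da, mz):
    """Express an absolute tolerance (Da) as ppm at a given m/z."""
    return delta_da / mz * 1e6

low, high = ppm_window(500.0, 10)      # precursor tolerance of ±10 ppm
fragment_ppm = da_to_ppm(0.02, 500.0)  # ±0.02 Da fragment tolerance at m/z 500

print(low, high, fragment_ppm)
```

At m/z 500 the ±10 ppm window is only ±0.005 wide, so the ±0.02 Da fragment tolerance is four times looser at that mass.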

3.4.9 Assessment of microbial community composition

PCR amplification of the 16S rRNA gene was completed using 1 μL of the DNA extraction as template and 400 ng of Nextera XT Index v2 sequences (Illumina Inc., Hayward, CA, USA) plus standard v4 region primers [331], in a 25 μL reaction volume with Invitrogen Platinum PCR SuperMix High Fidelity (Life Technologies, Burlington, ON, Canada). Cycler conditions included an initial melting step of 94 °C for 2 min, followed by 50 cycles of 94 °C for 30 s, annealing temperature for 30 s and 68 °C for 30 s, with a final extension step of 68 °C for 5 min. The annealing temperature comprised a 0.5 °C increment touch-down starting at 65 °C for 30 cycles, followed by 20 cycles at 55 °C. PCR products were purified using the Invitrogen PureLink Quick PCR Purification Kit (Life Technologies) according to the manufacturer’s directions, and then the subsequent normalization and Illumina MiSeq sequencing were carried out at the Advanced Analysis Center located at the University of Guelph, ON, Canada. Sequencing data processing was performed using Mothur (v.1.39.5), following the procedure published by Kozich et al. [332], with the exception that sequences were binned into phylotypes based on taxonomic classification. Sequences were aligned with the SILVA 16S rRNA reference database (v.132) and classified with an internally generated database containing 16S rRNA gene sequences of each component of the defined microbial community.

3.5 Results

3.5.1 Validation of isotope detection and activity measure with E. coli K12

The MetaProSIP software [240] was extended to trace the incorporation of D and 18O. To verify the accuracy of quantifying 18O incorporation, BSA was tryptically digested in 18O-labeled water at different concentrations. This digestion resulted in median RIA of 51.5%, 30.1% and 24.8% when the peptides were incubated with 50%, 38% and 25% of the 18O label, respectively, each showing small variation indicated by the lower and upper quartiles (Figure 13D). These measurements matched the theoretical values for 50% and 25% 18O label but had 8% lower RIA values when 38% 18O label was used. To validate D incorporation, D5-ring labeled Angiotensin-II was measured at different labeling ratios (Figure 13A). The median RIA of these measurements was 7.5%, each with small variation indicated by the lower and upper quartiles, which was highly similar to the theoretical value of 7.04%.
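The theoretical RIA of 7.04% for the D5-labeled standard can be reproduced as the fraction of hydrogen positions occupied by deuterium. The sketch below assumes the elemental formula C50H71N13O12 for Angiotensin-II (71 hydrogens), which is not stated in the text; the function name is hypothetical.

```python
def theoretical_ria(labeled_atoms, total_atoms):
    """Relative isotope abundance (%) when labeled_atoms of total_atoms carry the heavy isotope."""
    return 100.0 * labeled_atoms / total_atoms

ANGIOTENSIN_II_HYDROGENS = 71  # assumption: C50H71N13O12

d5_ria = theoretical_ria(5, ANGIOTENSIN_II_HYDROGENS)
measured_ria = 7.5  # median value reported above
deviation = measured_ria - d5_ria

print(round(d5_ria, 2), round(deviation, 2))
```

Under this assumption the five ring deuteriums give about 7.04%, and the measured median differs from it by less than half a percentage point.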

Further, unlabeled Angiotensin-II was incubated in D2O for 24 h and measured by direct infusion to ensure minimal exposure to unlabeled water. The abiotic incorporation of D greatly diminished after eight minutes (Figure S1). The cultivation of E. coli K12 in 4 mL volumes of LB medium resulted in higher amounts of labeled peptides with H218O (Figure 13E) as compared to D2O (Figure 13B), which increased in the exponential phase but decreased in the stationary phase. The RIA values were lower with D2O and stagnated in the early exponential growth phase when 25% label was applied, whereas increasing RIA values were found with 50% label of both 18O and D until the stationary phase. Thus, increasing the dosage did not improve the amount of incorporation, but H218O demonstrated an enhanced level of peptide integration compared to D2O. E. coli K12 grown at growth permissive and retardant temperatures in 25% or 50% of D2O or H218O over a short time period resulted in unlabeled protein identifications that ranged from 174 at 4 °C with 50% D2O to 1114 at 37 °C with 50% D2O (n=3) (Figure S2). Fewer than 10 peptides with incorporation were detected in the cold temperature-incubated cultures, whereas incubation at 37 °C resulted in 410 peptides with incorporation when 50% D2O was supplied, 90 with 25% D2O and 74 with 25% H218O. Therefore, our method discriminated active from inactive cultures. Typical incorporation patterns for D (Figure 13C) and 18O (Figure 13F) are shown for two exemplary peptides derived from E. coli K12.

Figure 13. Validation of D- and 18O-incorporation, using standards (A, D) and Escherichia coli K12 grown in 25% and 50% D2O (B) or H218O (E), and the incorporation pattern of D (C) and 18O (F). Angiotensin-II, ring-D5-labeled at phenylalanine (RIA=7.04%), was mixed with unlabeled Angiotensin-II at different labeling ratios (20, 40, 60, and 80%) (A). Bovine serum albumin (BSA) was tryptically digested in different concentrations of 18O-labeled water (D). Relative isotope abundances (RIA) of peptides extracted from E. coli K12 after incubation in 25% and 50% concentrations of D2O (B) or H218O (E). Exemplary incorporation pattern of D (C) and 18O (F) into peptides produced by E. coli. The number of labeled peptides is indicated for each boxplot. Median, lower and upper quartiles, lower and upper whiskers, 5th and 95th percentiles, and the detection limit are shown in the boxplots (n=3).

3.5.2 Incorporation of D and 18O into the metaproteome of a defined human fecal microbial community for the detection of active key players

A defined microbial community derived from a human fecal sample was incubated in two different medium formulations representing a high fiber and a high protein diet, respectively. Each was supplemented with 25% D2O or 25% H218O and harvested after 12 hours. Experiments were done in triplicate. Incubation with D resulted in median RIA values of 5.1% for high protein and 5.3% for high fiber diet with high reproducibility (Figure S3A). Similarly, 18O resulted in high reproducibility but higher RIA values of 13.8% for high protein and 14.8% for high fiber diet. The RIA values of the bioreactor samples were significantly (HSD-test, p-value<0.05) and up to two-fold higher than the respective RIA values from E. coli K12 with identical atom% of label (Figure S3B).

Among the 20 most active species in the defined ecosystem, 25% D2O incubation resulted in similar RIA ranging from 5 to 7%, but showed varying contributions of peptides with isotope incorporation, whereas an identical pattern was revealed when 25% H218O was supplied with the RIA ranging from 11 to 17% (Figure S4). The relative abundances (RAs) from the 16S rRNA profiling data, the metaproteome, and the protein-SIP approach, in addition to the RIA, of the most abundant species in the defined microbial community are depicted in Figure 14, which represents more than 90% coverage of the data for each method. The 16S rRNA gene based compositional data was dominated by Rarimicrobium hominis, Ak. muciniphila, Alistipes onderdonkii and Bacteroides uniformis in the high protein medium, and Ak. muciniphila, Al. onderdonkii, B. thetaiotaomicron, B. ovatus, Bacteroides eggerthii and B. uniformis in the high fiber medium, in descending order. The metaproteome showed similar abundances across diets (high fiber and high protein) and incubations (inoculum, unlabeled water, D2O and H218O), and was dominated by B. uniformis, B. thetaiotaomicron, B. ovatus, B. eggerthii, Bacteroides cellulosilyticus and Ak. muciniphila. In the protein-SIP approach, most labeled peptides were assigned to Ak. muciniphila, followed by Bacteroides vulgatus, B. ovatus, B. uniformis and B. thetaiotaomicron, whereas the RIA was similar among species with isotope incorporation. Therefore, most of the dominant species overlapped in all three methods.

Figure 14. Relative abundances (RA) obtained from the 16S rRNA marker gene sequencing, metaproteomics and protein-SIP data, in addition to the relative isotope abundances (RIA) from protein-SIP data, for the most abundant genera in the defined microbial community derived from a human fecal sample, with regard to medium (P – high protein, F – high fiber) and treatment (0 – inoculum, C – unlabeled water, D – D2O, O – H218O).

Next, we analyzed the difference in the relative functional contribution of the defined microbial community peptides between high fiber and high protein diets. Proteins were assigned to clusters of orthologous groups (COGs) and the RA of peptides in the metaproteome data and the SIP data (18O and D) were compared. On the metaproteomic level, the COG classes D, E, M, P, R, U and V were higher in the high fiber diet, whereas J, O and S were higher in the high protein diet (Figure 15). However, these differences were not significant (|fold change| < 1). The SIP data demonstrated an enhancement of nearly all COG classes in the high protein diet, but this effect was only significant (|fold change| > 1) with 18O and not D.

Figure 15. Difference in abundance of functional classes as clusters of orthologous groups (COG) comparing the defined microbial community derived from a human fecal sample grown in high fiber (HF) and high protein (HP) medium by metaproteomic and protein-SIP data, utilizing D2O (D-SIP) or H218O (18O-SIP).

3.6 Discussion

A critical next step in the study of most microbiomes, be it human, animal or environment associated, is to assess ecosystem functionality. Compositional surveys via metagenomics have yielded a useful baseline of knowledge; however, given the sheer volume of genes that could be expressed, it is impossible to determine with certainty which processes are occurring. Further, it has been demonstrated that shifts in microbial abundance can poorly correlate with alterations in relative functional contribution via both metatranscriptomics [332] and metaproteomics [333–335]. Both of these techniques are suitable for evaluating snapshots of functional distributions, but it is difficult to monitor fluxes or the activity of community members over a specific time frame, because of their inherent relative measurement, insensitivity, and the various regulatory mechanisms downstream that influence the final state, such as protein half-life [201, 309]. Protein-SIP approaches have the potential to fill this knowledge gap, and have been previously utilized with success, for example, through labelling substrates with 13C or 15N [309, 310, 317, 318]. However, a method not limited to the metabolism or elemental cycling of only one environmental feature would be able to address a much wider breadth of microbial function-related hypotheses. As such, deuterated water has been used previously to evaluate the protein turnover in both mammals and microorganisms [310, 311]. Here, we present a novel method for quantifying system-wide microbial activity through tracing of isotopically labeled water in proteins, and we have demonstrated its use in an in vitro model of a human distal gut microbial ecosystem. Further, we have incorporated this into an additional pipeline within the bioinformatics tool MetaProSIP [240].

Our pipeline was validated by first experimentally comparing the number of incorporated isotopes (as RIA) to the theoretical values of peptide standards labeled with D and 18O, respectively. Next, we cultured E. coli K12 in the presence of the respective isotopes to detect the difference in activity, via the RA of labeled peptides, upon application of a known growth deterrent, low temperature. For D, the experimental RIA was slightly higher than the theoretical value for the purchased peptide in which five hydrogens were replaced by D (an RIA of 7.50% was measured compared to the theoretical 7.04%); however, this deviation was within the expected 1% range of analytical inaccuracy and can thus be disregarded. For 18O, commercially available BSA was tryptically digested in different concentrations of 18O-labeled water, as an 18O-labeled peptide was unavailable because of the isotope’s interference with solid-phase peptide synthesis [336]. The applicability of our approach to introducing 18O by proteolytic digestion has been described elsewhere [325]. Briefly, when a protein is tryptically digested, a maximum of one oxygen atom from water is introduced into the resulting peptide at the C-terminus [337]. In our experiment, tryptic digestion in 50% and 25% H218O yielded peptides with a median RIA of 50% and 25%, respectively, and only 38% H218O resulted in a slightly lower median RIA of 30%. The isotope distribution of the measured labeled peptides thus matched expectation, and a lower isotope abundance with 38% H218O could be the result of stochasticity since only one oxygen is replaced during protein hydrolysis. As this variability could clearly be ascribed to the experimental rather than technical factors of the pipeline, especially in the case of D, we conclude that our pipeline accurately quantified the RIA of both D and 18O.

Next, the activity measure was validated by growing E. coli K12 in growth retardant or permissive temperatures. A suitable cell biomass was obtained from cells initially grown at 37 °C, then further incubated at either 4 °C (retardant) or 37 °C (permissive). Peptides had a significantly higher RA when cells were actively growing at permissive temperatures compared to cold exposure. The RIA was identical in both cases, indicating that the mechanism of protein synthesis was unchanged, as expected (growth conditions outside of temperature were identical). Thus, our pipeline clearly differentiated the two states of activity in a consistent manner and indicated that the RA was a suitable indicator of baseline activity in protein-SIP experiments utilizing heavy water.

After confirming the accuracy and precision of the pipeline, the method was next applied to a diverse, defined microbial community (63 strains) in an in vitro setting to demonstrate its usefulness in microbiome research. A dosage of 25% isotopically labeled water was chosen, as the RIA was not significantly different between this dose and the highest dosage tested in E. coli, suggesting that there was no substantial improvement in isotope incorporation after increasing the amount to 50% for stationary growth. A difference in growth medium formulation, representing a high fiber diet and a high protein diet,
was utilized to evaluate shifts in microbial activity. We chose to assess these diets because high protein, low carbohydrate interventions represent a popular weight-loss strategy; however, concerns have been raised over the impact of such dietary strategies on colonic health, as the gut microbiota is known to convert complex polysaccharides into beneficial nutrients for intestinal epithelial cells, e.g., short-chain fatty acids [338, 339], whereas proteolytic products include compounds which can behave as gut irritants, including nitrosamines and heterocyclic amines [339]. Examining if the lower carbohydrate availability impacted microbial functional activity was thus of relevant interest, while simultaneously demonstrating our protocol. The experiments demonstrated high reproducibility between replicates, with minor lower and upper quartiles for both RIA and RA. There was no significant difference in the RIA between diets; however, the values were roughly double the RIA of the E. coli experiments.

Additionally, the RIA for 18O was nearly three times higher than the RIA for D, in all cases (including the E. coli experiments). These observations can be attributed to the mechanism of D integration into proteins [313]. D will readily equilibrate with normal water and be taken up by a bacterial cell, after which it is incorporated into nonessential amino acids during specific enzymatic steps in their biosynthetic pathways. These labeled amino acids can then be subsequently added to proteins. The method of 18O integration is expected to be similar, but unlike D, in which only C-H bonds are reliable due to abiotic HD-exchange of acidic hydrogens when proteins are in contact with unlabeled water [333], all incorporated 18O atoms would be retained. We observed a higher RIA for 18O, consistent with this expectation.

Microbial species of different taxa possess a diversity of auxotrophy for amino acids, and alter their requirements for de novo amino acid biosynthesis depending on the concentration of these amino acids in the environment [99, 334]. In our experiments, batch cultures of E. coli in protein-rich media would have had access to abundant amino acids by protein hydrolysis, such that a relatively lower amount of de novo amino acid biosynthesis would be required (and thus a lower RIA). In contrast, members of a complex microbial community such as our derived fecal ecosystem compete for available amino acids in a less protein-rich medium, necessitating a higher degree of de novo amino acid biosynthesis (and thus a higher RIA). Most healthy human fecal samples indeed possess a relatively low concentration of amino acids in the range of 0-20 μg/mL [335], and the maintenance of these levels is thought of as an innate immune defense; conditions that give rise to excess amino acids in this environment, e.g., in the case of antibiotic treatment, can be exploited by opportunistic pathogens such as Cl. difficile [336].

We determined that the RIA was similar for most species within our defined microbial community, which we believe further validates our experimental approach. Amino acid biosynthetic pathways are generally highly conserved amongst the bacterial families present in the intestine [99], and
although differences in amino acid auxotrophy have been reported for, e.g., Lactobacillus [99, 334], our analysis suggests that this phenomenon is not a common feature. Our findings are in agreement with those of Price et al. [334], who recently tested the growth of a range of genera in minimal media and found that almost all of them grew without supplementation, despite only half of them being predicted to be auxotrophic for certain amino acids. It was concluded that our current knowledge of amino acid biosynthetic pathways is insufficient, and indeed most free-living microorganisms may be capable of synthesizing all 20 proteinogenic amino acids.

This finding, however, does not preclude exceptions, which became apparent when comparing our metaproteomic analysis to our compositional data. Although we did not expect the results to match based on previous work [337–339], the contrast between the presence and activity of Ra. hominis was particularly striking. Ra. hominis constituted almost 50% of the RA by 16S rRNA gene sequencing in the high protein medium yet contributed to only 2.4% of the relative metaproteome and had not incorporated any isotopically labelled water (18O and D). The gut microbe Ra. hominis is part of the phylum Synergistetes, and although species within it are generally poorly studied, it is known that its members (i) have compact genomes, with a mean size of 2.2 Mb, (ii) are amino acid fermenters, with the highest proportions of amino acid transport and metabolism genes in their genomes compared to any other phylum to date, and (iii) are minor constituents of their environments [340]. This information suggests that Ra. hominis is highly specialized for protein utilization and may also indicate auxotrophy for amino acids. Its lack of metabolic diversity could be the contributing factor to low detection rates in the metaproteome, as a higher proportion of genes may be attributed to housekeeping and are thus more likely to be conserved across the bacterial kingdom. Strict amino acid requirements would both limit growth in the environment and explain the absence of labeled peptides derived from Ra. hominis in our experiment. The high abundance of Ra. hominis obtained during in vitro bioreactor growth at first glance appears to be contradictory, but we suggest that it may be a result of the reduced diversity of our defined microbial community compared to the originating fecal ecosystem. Within a low diversity ecosystem, the Ra. 
hominis strain would have had access to increased amounts of amino acids because of reduced nutrient competition and/or from an enrichment of primary protein degraders that facilitate cross-feeding. If this is the case, then we also suggest that the RIA can be used as a predictor of the relative amount of de novo amino acid biosynthesis occurring within an ecosystem, that should be carefully considered prior to analyzing the RA.

One strength of our approach when applied to microbial ecosystems is that it can distinguish microbial generalists from specialists, in a non-predictive manner. We found that the majority of the most active microbes did not experience a shift in the ranking of RA between diets. Generalists are
able to shift their metabolism when substrates are changed without sacrificing growth. An example of a generalist genus is Bacteroides, members of which possess a wide variety of polysaccharide and protein degradative capabilities. Six Bacteroides spp. were components of our defined microbial community, and these displayed the most activity in our experiment. Interestingly, even after the switch in diets, the four Bacteroides isolates did not experience a rank change in RA amongst themselves. Previous work has indicated that Bacteroides spp. do not utilize polysaccharides in a random fashion, but rather have an inherent, strain-specific hierarchy of preference, which is retained in a community setting [341, 342]. We suspect that if a strain of Bacteroides was presented with an abundance of a given preferred substrate, it would increase its activity relative to the other species present in the community. Our media formulations were similar with respect to the ratio of the majority of component fiber sources and increasing protein content did not alter the activity ranking of the Bacteroides species, indicating that none of these species favoured protein degradation when carbohydrates were still available. This observation is in line with previous work that showed when there is an abundance of complex carbohydrates in the gut ecosystem environment, amino acids tend to be utilized anabolically [99]. Thus, our work adds compelling evidence to the premise that behavioral shifts of the gut microbiota in response to diet are primarily complex carbohydrate-driven [73, 82, 162].

In addition to Bacteroidetes, members of the Firmicutes and the Actinobacteria are capable of primary polysaccharide degradation in the gut. However, unlike the Bacteroidetes, members of the Firmicutes and Actinobacteria tend to be specialists, possessing a limited repertoire of fibers they can utilize, with a tendency to favor starch-derived polymers [72, 73]. Indeed, Ruminococcus bromii, a known starch-degrading specialist that has been previously shown to have low RA within a healthy gut ecosystem by 16S rRNA gene sequencing when starch is limiting [162, 343], experienced a drop in RA by protein-SIP upon introduction of the high protein medium. Therefore, it was ultimately the specialists that exhibited an alteration of activity upon dietary change. In contrast, Ak. muciniphila, a mucin-degrading specialist [82], did not alter its relative activity between shifts in growth media. This observation could simply be attributed to the equal amount of mucin in both media (Table S1); however, Ak. muciniphila has been previously reported to increase in RA within irritable bowel syndrome and obese subjects’ fecal microbial ecosystems by 16S rRNA sequencing after fiber [344] or oligofructose supplementation [345]. This is counterintuitive, given that Ak. muciniphila is not known to be able to utilize, e.g., inulin as a sole carbon source [142]. It is possible that Ak. muciniphila indirectly benefits from fiber intake, by partaking in cross-feeding and/or expanding after a decrease in competition. In our experiment, there was enough dietary substrate to prevent a notable difference in such proposed mechanisms, but future work utilizing our technique could evaluate if withholding complex carbohydrates indeed diminishes the activity of Ak. muciniphila.

Similar to our findings, multiple studies have also reported only minor alterations in microbial community composition after dietary change [84, 162, 346], and the stability of the gut microbiota over time is purportedly robust, with 60% retention of strains after 5 years [147]. Generalist Bacteroides spp. in particular are known to have a higher retention rate than more specialist species [147]. When more dramatic changes have been observed, this was usually associated with more extreme dietary intervention, e.g., completely withholding all plant sources of fiber [161], or with the maintenance of specific dietary patterns over the long-term [84]. Further, the context in which a particular dietary substrate, e.g., a prebiotic, is taken also appears to be important, as one study demonstrated that the common prebiotic, inulin, only sufficiently prevented microbial conversion of dietary nitrogen, e.g., protein fermentation, when digested with the test diet and not several hours afterwards [110]. Our future goals include tracing isotopically labeled water in an in vitro model to evaluate the differential carbohydrate ratios that drive gut microbiota metabolism.

Lastly, consistent with the minor changes in microbial community composition derived from unlabeled and isotopically labeled peptides, we found only slight differences in the RA of peptides by functional groupings (COG) between the diets. The fold changes for all COGs were less than one in absolute value (i.e., non-significant) in both the unlabeled inoculum and after incubation in D₂O. Notably, there was a bias of increased RA by COGs in the high protein medium after incubation in H₂¹⁸O. The discrepancy highlights the limitations of experimental design in that protein-SIP approaches necessitate the use of batch as opposed to continuous cultures. The elevated amount of fresh medium gave the defined microbial community access to an increased proportion of free amino acids and a less restricting carbohydrate content. These factors facilitated a higher degree of growth especially in the high protein medium, which was apparent in H₂¹⁸O because of its higher incorporation rates (i.e., RIA). We therefore recommend that researchers consider utilizing the metaproteome of the inoculum as a control for true environmental conditions and design their batch experiments carefully when applying this method.

In summary, we have validated two novel pipelines for evaluating microbial activity through tracing isotopically labeled heavy water, and successfully applied them to a defined microbial community cultured in vitro in a model emulating the distal human gut. Our pipelines are now a feature of the MetaProSIP bioinformatics tool [240], which is freely available as part of the OpenMS platform (www.openms.de). The methodology described here is applicable to a variety of both animal- and environment-associated microbiomes and in vitro models. The relative functional contribution of microbiota constituents is a key question, the answer to which will further our understanding of these complex ecosystems, and we suggest that our approach can help address critical hypotheses in complex ecosystem dynamics.

Chapter 4 – Coevolution, Determinism & Stochasticity

The research conducted within this chapter had a twofold aim in uncovering drivers of human gut microbial community assembly. First, within the factor of environmental selection, the relative contribution of habitat filtering and species assortment, specifically coevolution, to the assembly process was determined. Levy and Borenstein have suggested from metabolic modeling that habitat filtering is dominant [167]; however, I was interested in designing an experiment to evaluate this presumption more directly. Further, the aspect of coevolution is virtually unexplored [192]. Next, I endeavoured to statistically quantify whether environmental selection or stochasticity had the greater influence; human gut microbiota assembly is thought to be largely a deterministic process [172], but stochasticity still plays an important role, the extent of which has not yet been fully elucidated, due to the limited immediate observations of microbial reassembly within a human gut microbial ecosystem [170, 183, 347]. To accomplish these tasks, an ‘artificial’, non-coevolved defined microbial community was created through species matching to a natural, coevolved human fecal-derived defined microbial community control, but with each bacterial strain sourced from a different human donor fecal sample. Then, these communities were grown in vitro in two different medium formulations, representing a high fiber and a high protein diet, respectively, in several replicates. Diet is a host-driven factor, i.e., one associated with habitat filtering [174], whereas biological replication allows stochasticity to be observed. This study fits into the second component of my goals through directly addressing my hypothesis that microbial ecological theory can be replicated utilizing complex defined microbial communities. The following body of work is presented in the format of an original research article to be submitted to the ISME Journal.

4.1 Article Information

Drivers of human gut microbial community assembly: Coevolution, determinism and stochasticity

Kaitlyn Oliphant and Emma Allen-Vercoe

Affiliations

Department of Molecular and Cellular Biology, University of Guelph, Guelph, ON, Canada

Acknowledgements

We would like to acknowledge the Natural Sciences and Engineering Research Council of Canada scholarship and Ontario Ministry of Training, Colleges and Universities scholarship to KO for providing funding.

4.2 Abstract

Microbial community assembly is a complex process shaped by multiple factors, including habitat filtering, species assortment and stochasticity. Understanding the relative importance of these drivers would enable scientists to design strategies to initiate a desired reassembly for, e.g., remediating low diversity ecosystems. Here, we aimed to examine whether a defined microbial community derived from the feces of a healthy donor and cultured in bioreactors assembled deterministically or stochastically, by completing replicate experiments under two growth medium conditions characteristic of either high fiber or high protein diets. Then, we recreated this defined microbial community by matching strains of the same species, with each bacterial strain sourced from a different human donor, in order to elucidate whether coevolution of strains within a host influenced community dynamics. Each defined microbial ecosystem was evaluated for composition using marker gene sequencing and for behaviour using 1H-NMR based metabonomics. We found that stochasticity had the largest influence on the species structure when substrate concentrations merely varied, whereas habitat filtering greatly impacted the metabonomic output. Evidence of coevolution was elucidated from comparisons of the two communities; we found that the artificial community tended to exclude saccharolytic Firmicutes species and was enriched for metabolic intermediates, such as Stickland fermentation products, suggesting overall that polysaccharide utilization by Firmicutes is dependent on cooperation.

4.3 Introduction

A critical knowledge gap in the field of microbial ecology is understanding the relative contribution of the forces that drive microbial community assembly. Uncovering this information would facilitate the development of rationally-designed strategies to remediate microbial communities exhibiting undesirable functionality or successive progression after a perturbation. Such forces have been proposed to include environmental selection, historical contingency, dispersal limitation and stochasticity [19]. Environmental selection additionally encompasses several distinctive sub-factors that yield well to manipulation or measurement for predictions, including niche availability (i.e., habitat filtering) and microbial interactivity (i.e., species assortment and coevolution) [167, 188, 189]. Bioreactors present a promising strategy for studying the importance of such drivers, because culture conditions within them can be tightly controlled, and the use of defined microbial consortia can additionally serve to not only deconvolute the system but also to allow for manipulation to address each factor individually [348, 349]. Bioreactors are also currently utilized for both research and industrial processes [198, 207, 350–354], and thus observing and quantifying the contribution of these ecological forces on community assembly is in itself useful information for these applications.

The human gut microbial ecosystem (i.e., human gut microbiota) is a suitable testing ground for ecological theory. This ecosystem is known to be critical to health and well-being, as it degrades otherwise indigestible food material into usable nutrients for the host [99, 355] and modulates the immune system [356], with alterations in both community structure and function reported in several GI disorders [357, 358]. Proper succession of the human gut microbiota during infancy and childhood is also essential to the development and education of the immune system [163], with deviations from this succession associated with later onset of autoimmune conditions [164–166]. Thus, therapeutics targeting the human gut microbiota have been trialed in such cases, including probiotics and fecal microbiota transplantation. However, the results of such clinical trials have been mixed, from the > 90% cure rate of FMT in treating RCDI [359] to the 40% success rate of FMT inducing remission in IBD [360]. Meanwhile, probiotic supplementation has been shown to reduce infant mortality by 20%, but several factors were found to influence outcomes of these clinical trials, including dosage, number of strains, feeding and age [361]. Clearly, the availability of an ecological theoretical framework to contextualize the environmental and microbial constitution would improve the design process of these ecosystem interventions. Further, in vitro bioreactor-based models such as the SHIME system [208, 362] and single-vessel units [198] are a popular methodological approach for investigating hypotheses of human gut microbial ecology, due to their replicability, sample yields, cost and lack of ethical constraints.

Thus, in our study, we aimed to quantify the relative impact of two of the drivers of microbial community assembly, environmental selection and stochasticity, in terms of both compositional species structure and metabolic behaviour, through use of a defined microbial community derived from a human fecal sample and single-vessel bioreactor-based models. For the evaluation of environmental selection, we chose to utilize different medium formulations that replicate a high fiber, low protein diet and a high protein, low fiber diet, as diet has been proposed to be the dominant environmental influencer acting on the human gut microbiota [326]. We additionally scrutinized the two sub-factors of environmental selection, habitat filtering and species assortment, by including a second defined microbial community matching the species constitution of the first, in which each bacterial strain was sourced from a unique human donor. Therefore, the diets would be a representation of habitat filtering, whereas the distinctive communities would model species assortment or coevolution. To control for coadaptation to the dietary condition that could occur during the initial assembly, we additionally introduced a dietary change after allowing sufficient time for community equilibration to measure the response to a relevant perturbation. Finally, for the evaluation of stochasticity, we assessed the reproducibility of replicates using several multivariate statistical methods to explore steady-state community dynamics.

4.4 Methods

4.4.1 Creation of defined microbial communities

Two defined microbial communities were created to examine the effects of coevolution on the dynamics of community assembly. The first community served as a control, in which all bacterial strains were derived from the same fecal sample. The second community was constructed to match the species composition of the first (as determined by aligning the full length 16S rRNA genes from each respective pair), but with each bacterial strain sourced from a unique donor’s fecal sample. The isolation methods and donor description of the control community (CC) are described in Petrof et al. [23]; however, additional species from this isolation round were added to improve the diversity of the formulation (Table S3). The same isolation techniques were utilized to derive the microbial strains for the second, ‘artificial’ community (AC), except for those species obtained from international culture collections, which were simply resuscitated according to the recommended directions. The donors or international culture collections used to source each species of the AC are indicated in Table S3.

Genomic DNA (gDNA) isolated from each strain was individually 16S rRNA gene sequenced using an Illumina MiSeq instrument (Illumina Inc., Hayward, CA, USA) in order to use the high read count output as a way to interrogate the purity of each sample. Briefly, all strains were first cultured on fastidious anaerobe agar (Neogen Corporation, Lansing, MI, USA) in an anaerobic chamber (Baker Ruskinn, Sanford, ME, USA) under an atmosphere of 90:5:5 N2:H2:CO2 at 37 °C for 48 h. To extract gDNA, single colonies were suspended in a solution of 2.6 mg/mL lysozyme (Bio Basic Inc., Markham, ON, Canada), 0.1 mg/mL proteinase K (Bio Basic Inc.), 1% sodium dodecyl sulfate in 100 mM Tris-HCl, 10 mM EDTA, pH 8.0 buffer and incubated at 60 °C until visible clearing was observed. The lysate was subsequently transferred to MaXtract High Density tubes (Qiagen Inc., Germantown, MD, USA), to which 500 μL of 25:4:1 phenol:chloroform:isoamyl alcohol was added, and incubated overnight with rotation. The gDNA was then retrieved from the mixture through centrifugation at 14 000 rpm (28,260 x g) for 5 min at 4 °C to obtain the aqueous layer, followed by precipitation by the addition of 400 μL isopropanol. The isopropanol was removed after centrifugation at 14 000 rpm for 15 min at 4 °C, and the pellet was washed with 300 μL of 70% ethanol. The ethanol was next completely removed, first by centrifugation at 14 000 rpm for 5 min at 4 °C and then by allowing the residual amounts to evaporate. Finally, the gDNA was resuspended in 50 μL ddH2O. The subsequent library preparation, sequencing and data processing were conducted as described in the 16S rRNA based compositional profiling section below. An average read depth in the 10³ range was typically achieved when sequencing single strains. Strains were deemed pure when all amplicon sequence variants (ASVs) that could not be attributed to the target species were of low abundance (< 1%) and could be accounted for as sample cross-contamination through referencing sample blanks. Any strains that were not pure were subjected to serial dilution to extinction, as a method to improve isolation, and were then re-evaluated by the above technique.

4.4.2 Bioreactor operation

A 500 mL Multifors bioreactor system (Infors AG, Bottmingen/Basel, Switzerland) was inoculated with the defined microbial communities and operated as a model of the human distal colon as previously described [200]. The feed medium was designed to replicate two dietary conditions, high fiber, low protein (HF) and high protein, low fiber (HP), based upon the formulations in Marzorati et al. [328] but modified to accommodate a single-vessel system as in McDonald et al. [198] (Table S1). The experiment was designed such that each community was initially equilibrated in each medium for two weeks, upon which the media were swapped to simulate a dietary change. The communities were then allowed an additional two weeks to re-equilibrate. Three biological replicates were completed for each condition, with sampling conducted every two days for the duration of the experimental run. At sampling, 2 x 2 mL volumes were acquired and immediately cryopreserved at -80 °C.

4.4.3 16S rRNA based compositional profiling

The gDNA from bioreactor samples was extracted through use of the QIAamp Fast DNA Stool Mini Kit (Qiagen Inc., Germantown, MD, USA) according to the manufacturer’s directions, but with extra steps included to improve cell lysis. Prior to proceeding with the recommended protocol, cells were first pelleted through centrifugation at 14 000 rpm for 15 min at 4 °C. After resuspension in the lysis buffer, 0.2 g of zirconia beads (Biospec Products Inc., Bartlesville, OK, USA) were added, then the samples were bead-beat with a Digital Disruptor Genie (Scientific Industries Inc., New York City, NY, USA) at 3000 rpm for 4 min. The samples were subsequently incubated at 90 °C for 15 min, and finally, ultrasonicated at 120 V for 5 min (Branson Ultrasonics, Danbury, CT, USA). Libraries for sequencing were constructed by a one-step PCR amplification with 400 ng of Nextera XT Index v2 sequences (Illumina Inc.) plus standard 16S rRNA V4 region primers [331] and 2 μL of gDNA template in Invitrogen Platinum PCR SuperMix High Fidelity (Life Technologies, Burlington, ON, Canada). Cycler conditions included an initial melting step of 94 °C for 2 min, followed by 50 cycles of 94 °C for 30 s, annealing temperature for 30 s and 68 °C for 30 s, with a final extension step of 68 °C for 5 min. The annealing temperature consisted of a 0.5 °C increment touch-down starting at 65 °C for 30 cycles, followed by 20 cycles at 55 °C. The PCR products were subsequently purified using the Invitrogen PureLink PCR Purification Kit (Life Technologies) according to the manufacturer’s directions. Normalization and Illumina MiSeq sequencing were carried out at the Advanced Analysis Center located at the University of Guelph, ON, Canada.

The obtained sequencing data was processed using R software version 3.5 with the package DADA2 version 1.8, following its recommended standard protocol [222]. Classification to the genus level was additionally carried out via DADA2 using the SILVA database [363] version 132. Classification to the species level, however, was conducted by uploading the ASVs to NCBI BLAST (https://blast.ncbi.nlm.nih.gov) and selecting the identification with the highest percentage identity and lowest e-value, while cross-referencing with the known species constitution of the defined microbial communities. The data was then denoised by summing the ASVs that returned identical species classifications, after which the ASVs that equated to < 0.01% total abundance across all samples were removed. Finally, the data was normalized by centered log-ratio transformation through use of the package ALDEx2 version 1.12, taking the median of the Monte-Carlo instances as the value [221].
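The centered log-ratio transformation named above can be illustrated with a minimal Python sketch. This is not the ALDEx2 procedure used in this study, which draws Monte-Carlo instances from a Dirichlet distribution before transforming; the fixed pseudocount, the function name `clr` and the example counts are assumptions made purely for illustration.

```python
import math

def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's count vector.

    The pseudocount (an assumed value) avoids taking the log of zero.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [value - mean_log for value in logs]

# Hypothetical read counts for four taxa in a single sample
sample = [120, 30, 0, 850]
clr_values = clr(sample)
```

By construction the transformed values of a sample sum to zero, so each value expresses a taxon's abundance relative to the geometric mean of that sample rather than as an absolute count.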

4.4.4 1H-NMR based metabonomics

Sample preparation, 1H-NMR spectral acquisition and processing, and profiling of metabolites were conducted as previously described [200]. A Bruker Avance III 600 MHz spectrometer with a TCI 600 probe (Bruker, Billerica, MA, USA) and acquisition temperature of 298 K at the Advanced Analysis Center located at the University of Guelph, ON, Canada was utilized to obtain spectra. The data was analyzed using both an untargeted spectral binning and a targeted metabolite profiling approach with the Chenomx NMR suite 8.3 (Chenomx Inc., Edmonton, AB, Canada). For spectral binning, the recommended default parameters by Chenomx of 0.04 ppm sized bins along the 0.04 – 10 ppm region of the spectrum line with omission of water (4.44 – 5.50 ppm) and normalization by standardized area (fraction of the chemical shape indicator, DSS) were implemented. For metabolite profiling, target regions of interest in the spectra were selected by orthogonal PLS-DA of the spectral binning data, as described in the Statistical analysis section. Metabolite identifications were then based on the best fit for the peak regions with the available libraries of compounds. The libraries included both the compound set included with the Chenomx software suite, and the downloaded HMDB [251] set release 2.

4.4.5 Statistical analysis

All data sets could be divided by community (control versus artificial), starting medium formulation (HF versus HP), and treatment (before versus after the dietary change), in addition to time (days 2 – 28) and replicate number (1 – 3). The normalized sequencing and spectral binning data were utilized to determine if relevant groupings were overall statistically significantly different. Such groupings included: 1) time, in order to determine when a ‘steady-state’ equilibrium had been reached (here the analysis was conducted separately for each community-diet-treatment combination); 2) replicate, in order to determine the extent of reproducibility (here the analysis was conducted separately for each community-diet-treatment combination after removal of unequilibrated time points); 3) diet, in order to determine if this factor influenced the composition or behaviour of the microbial community (here the analysis was conducted separately for each community before treatment after removal of unequilibrated time points); 4) treatment versus starting diet, in order to determine if the state of the microbial community achieved after the dietary change was equivalent to the state of the microbial community when assembled in the starting diet (here the analysis was conducted separately for each community after removal of unequilibrated time points); and 5) community, in order to determine if the control and artificial defined microbial communities assembled or responded to the diets differently (here the analysis was conducted separately for each diet-treatment combination, when relevant based upon findings from analyses 3 and 4 above, after removal of unequilibrated time points).

Statistical analysis was conducted using R software. First, dimension reduction was required before statistical tests could be conducted. Three approaches were evaluated for this task: PCA, PLS-DA and Euclidean distance matrices with non-metric multidimensional scaling for plotting. The former two methods were conducted with use of the package ropls version 1.12, and the latter with use of the package vegan version 2.5.2. The R2 and Q2 quality metrics [364, 365] and permutation testing [366] were computed to validate the significance of the PLS-DA model, thus preventing overfitting, as is standard with the package. Plots were generated using package ggplot2 version 2.2.1; ellipses were drawn for the PCA and PLS-DA plots assuming a multivariate t distribution via this package. With the PCA (untargeted) and PLS-DA (targeted) data, the Wald-type statistic (WTS) for multivariate data [367] and the modified ANOVA-type statistic (MATS) [368] were computed with resampling by a parametric bootstrap approach. These functions are implemented in the package MANOVA.RM version 0.3.1, and enable statistical testing for arbitrary semi-parametric designs, even with unequal covariance matrices among groups and small sample sizes. Pairwise Mahalanobis distances were also computed to determine the group separation (e.g., fold change) [369] using the package HDMD version 1.2. With the Euclidean distance matrices, a PERMANOVA [370] was computed using the package vegan, with differences in variation (i.e., beta dispersion) determined beforehand by an ANOVA coupled with Tukey’s HSD. Additionally, an untargeted approach was evaluated by conducting partitioning around medoids (PAM) clustering and choosing the best solution from the average silhouette widths (ASWs) [371] using the package fpc version 2.1.11.
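To illustrate the group-separation measure, the following Python sketch computes the Mahalanobis distance between the centroids of two groups in a two-dimensional score space (e.g., the first two components of a PCA). It is not the HDMD implementation used in this study; the pooled within-group covariance, the restriction to two dimensions, the function names and the example scores are all assumptions made for illustration.

```python
import math

def centroid(points):
    """Mean of a list of (x, y) score pairs."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def scatter(points, mu):
    """Sums of squares and cross-products around the centroid mu."""
    sxx = sum((p[0] - mu[0]) ** 2 for p in points)
    syy = sum((p[1] - mu[1]) ** 2 for p in points)
    sxy = sum((p[0] - mu[0]) * (p[1] - mu[1]) for p in points)
    return sxx, sxy, syy

def mahalanobis_2d(group_a, group_b):
    """Distance between two centroids, scaled by the pooled covariance."""
    mu_a, mu_b = centroid(group_a), centroid(group_b)
    sa, sb = scatter(group_a, mu_a), scatter(group_b, mu_b)
    dof = len(group_a) + len(group_b) - 2
    sxx, sxy, syy = ((sa[i] + sb[i]) / dof for i in range(3))
    det = sxx * syy - sxy ** 2  # determinant of the pooled covariance
    dx, dy = mu_a[0] - mu_b[0], mu_a[1] - mu_b[1]
    # d^2 = diff^T S^-1 diff, with S^-1 written out for the 2x2 case
    d_squared = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d_squared)

# Hypothetical component scores for two groups of samples
group_1 = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
group_2 = [(4.0, 0.0), (5.0, 0.0), (4.0, 1.0), (5.0, 1.0)]
distance = mahalanobis_2d(group_1, group_2)
```

Because the centroid difference is scaled by the within-group spread, the distance grows both when centroids move apart and when groups become tighter, which is why it can be compared across groupings as a measure of separation.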

To compensate for the dependent nature of the initial category of time, samples were first divided into two groups, ‘early’ and ‘late’, sequentially with increasing time points, removing the previous ‘early’ group in each successive iteration. For the PLS-DA-based method, an orthogonal PLS-DA approach was then used to reduce dimensions, and the WTS and ANOVA-type statistic (ATS) were computed with resampling via a wild bootstrap using Rademacher weights on the singular predictive component [372]. For the Euclidean distance matrix method, permutations for the PERMANOVA were both toroidally shifted and restricted from occurring across the repeated measures within samples (code available at: https://thebiobucket.blogspot.com/2011/04/repeat-measure-adonis-lately-i-had-to.html#more accessed October 19, 2018) [370, 373]. Once the steady-state time point was established, spectral bins of interest were selected for metabolite profiling by variable importance in projection scores ≥ 1 on the orthogonal PLS-DA model [374]. To calculate the theoretical amount of time it would take for the bioreactors to adjust substrate concentrations to equal that of the new formulation after the medium change, the package deSolve version 1.21 was utilized. This value was used for comparison against the rate of response to dietary change of the microbial community.
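The theoretical washout calculation can be sketched as follows, under assumptions that go beyond what is stated here: a single well-mixed vessel of constant volume, a feed of fixed concentration, and no microbial consumption of the substrate, so that dC/dt = D (C_feed − C), where D is the dilution rate. The sketch is in Python rather than deSolve, and the dilution rate and concentrations are hypothetical values, not those of the bioreactor runs.

```python
import math

def time_to_reach(fraction_remaining, dilution_rate):
    """Analytical time for the excess substrate to fall to a given fraction.

    Follows from C(t) = C_feed + (C_0 - C_feed) * exp(-D * t).
    """
    return -math.log(fraction_remaining) / dilution_rate

def simulate_washout(c_start, c_feed, dilution_rate, t_end, dt=0.001):
    """Forward-Euler integration of dC/dt = D * (C_feed - C)."""
    c = c_start
    for _ in range(int(round(t_end / dt))):
        c += dilution_rate * (c_feed - c) * dt
    return c

# Hypothetical values: one vessel turnover per day, arbitrary concentration units
dilution_rate = 1.0   # per day (assumed)
c_old_medium = 10.0   # substrate concentration before the medium change
c_new_medium = 2.0    # substrate concentration of the new feed

half_time = time_to_reach(0.5, dilution_rate)
c_after_3_days = simulate_washout(c_old_medium, c_new_medium, dilution_rate, 3.0)
```

Any consumption of the substrate by the community would shorten the adjustment time relative to this dilution-only estimate, so the value serves as an upper bound for comparison against the observed metabolic response.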

The normalized sequencing and metabolite concentration data were used to determine which taxa and metabolites were significantly different between groupings. For these calculations, a series of Kruskal-Wallis tests with Benjamini-Hochberg correction and Dunn’s post-hoc analysis via the package dunn.test version 1.3.5 were computed, in addition to the calculation of effect size (η2). Only features that remained significant after post-hoc analysis (q-value < 0.05) and had an effect size above 50% were considered significant.
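The per-feature testing scheme can be illustrated with a short Python sketch that computes the Kruskal-Wallis H statistic, an effect size derived from it, and Benjamini-Hochberg adjusted q-values. This is not the code used in this study: the effect size formula η² = (H − k + 1)/(n − k) is one common definition and is assumed here, no tie correction is applied to H, Dunn's post-hoc test is omitted, and the example data and p-values are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def kruskal_h(groups):
    """Kruskal-Wallis H statistic (without tie correction)."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)

def eta_squared_h(h, n_groups, n_total):
    """Effect size from H (assumed definition, see lead-in)."""
    return (h - n_groups + 1) / (n_total - n_groups)

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted q-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values

# Hypothetical abundances of one feature in three groups of three samples
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h_stat = kruskal_h(groups)
effect = eta_squared_h(h_stat, n_groups=3, n_total=9)

# Hypothetical raw p-values for four features
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```

A feature would then be retained only when its q-value falls below 0.05 and its effect size exceeds 0.5, mirroring the two thresholds stated above.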

4.5 Results

4.5.1 Determination of microbial community ‘steady-state’ stability and replicate reproducibility

For this work, it was essential to first determine at which day the bioreactor-grown microbial communities had achieved a stable equilibrium through proliferation and interaction, referred to as ‘steady-state’, in order to make subsequent comparisons. We were additionally interested in whether the compositional steady-state was equivalent to the metabolic, or behavioural, steady-state. To accomplish this task, two repeated-measures approaches were utilized on both the normalized sequence count and 1H-NMR spectral binning data: 1) the building of an orthogonal PLS-DA model for completing the WTS and ATS with wild bootstrap resampling against the singular predictive component and 2) the building of a Euclidean distance matrix for completing a PERMANOVA. The CC reached steady-state both compositionally and metabolically by day 4 (Table S4). Upon dietary change from high fiber to high protein, the microbial community was already compositionally similar from the first time point (day 16) to the end of the run (day 28); however, metabolic stability was not reached until day 18. This latter observation is in line with the calculated amount of time it would take for the bioreactor to shed the excess 2000 mg/mL concentration of fiber from lingering fiber-rich medium following the medium change to the high protein composition, which is 4 days post medium change (Figure S5).

Results obtained running the AC under the same conditions were much more variable (Table S4). Steady-state was reached for this community both compositionally and metabolically by day 2 in the HF medium, with no significant differences observed at the first time point of day 16 after the change in medium formulation. In the HP medium, however, the AC followed the patterns of the CC more closely than in the HF medium, except for taking longer to reach initial metabolic stability (6 days compared to 4 days for the CC).

The two dimension reduction methods for steady-state calculation did not always match significance levels (Table S4). For the normalized sequence count data, the orthogonal PLS-DA approach was better at resolving variation, as there were no significant differences found in the Euclidean distance matrices by PERMANOVA. For the 1H-NMR spectral binning data, however, 43% of the significant measurements were determined as such by both methods, 43% by the Euclidean distance matrices alone, and 14% by the orthogonal PLS-DA alone. It would thus appear that the Euclidean distance matrix approach worked better for this type of data, although a combination of both methods may be prudent. Since the aim of calculating when steady-state occurs was to remove as much unnecessary, technical variation as possible, if either method determined there was a significant difference between a particular early time point and the rest of the data, the time point was not deemed to have been stabilized and thus was subsequently removed.

Next, the reproducibility between replicates of the same condition was evaluated for both the normalized sequence count and 1H-NMR spectral binning data using three different approaches: 1) PCA for completing the multivariate WTS and MATS with parametric bootstrap resampling and pairwise Mahalanobis distance determination, 2) PLS-DA for completing the same calculations as the PCA and 3) Euclidean distance matrices for completing a PERMANOVA, in addition to selecting an untargeted PAM clustering solution by the highest ASW. All three methods found statistically significant differences between replicates within all conditions (Table S5). The pairwise Mahalanobis distances ranged in magnitude from 4.3 to 10.6 for the compositional data and 9.5 to 19.1 for the metabolic data by PCA. For the PLS-DA model, these distances ranged from 5.7 to 12.3 for the compositional data and 10.1 to 29.5 for the metabolic data. The untargeted PAM clustering approach failed to recapitulate the expected patterns, i.e., clustering the samples by replicate. The ASWs ranged in magnitude from 0.23 to 0.28 for the compositional data and 0.13 to 0.24 for the metabolic data. Stochastic variation was thus clearly present across each replicate experiment. We therefore defined the changes induced by an introduced environmental pressure as overall statistically significant only if they exceeded the within-condition replicate separation (Figure 16). This could be evaluated by overlap of ellipses and magnitude of the Mahalanobis distances for the PCA and PLS-DA approaches, and by recapitulation of the expected clusters by untargeted PAM clustering with improved ASWs for the Euclidean distance matrix approach.
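
As an illustration of the distance-based criterion, the sketch below computes the Mahalanobis distance between two group centroids in a two-dimensional score space using a pooled covariance matrix, and then checks whether a between-condition distance exceeds the largest within-condition (replicate) distance. It assumes exactly two score components and a pooled covariance estimate; the function names are hypothetical and the code is not the implementation used to generate Table S5.

```python
import math


def centroid(points):
    """Mean of a list of (x, y) score pairs."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def scatter_matrix(points, centre):
    """Sums of squares and cross-products about the centroid: (sxx, sxy, syy)."""
    sxx = sum((p[0] - centre[0]) ** 2 for p in points)
    syy = sum((p[1] - centre[1]) ** 2 for p in points)
    sxy = sum((p[0] - centre[0]) * (p[1] - centre[1]) for p in points)
    return sxx, sxy, syy


def mahalanobis_between(group_a, group_b):
    """Mahalanobis distance between two group centroids (pooled covariance)."""
    ca, cb = centroid(group_a), centroid(group_b)
    axx, axy, ayy = scatter_matrix(group_a, ca)
    bxx, bxy, byy = scatter_matrix(group_b, cb)
    dof = len(group_a) + len(group_b) - 2
    sxx, sxy, syy = (axx + bxx) / dof, (axy + bxy) / dof, (ayy + byy) / dof
    det = sxx * syy - sxy * sxy  # zero if the pooled scores are collinear
    dx, dy = ca[0] - cb[0], ca[1] - cb[1]
    d_squared = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d_squared)


def exceeds_replicate_spread(between_distance, within_distances):
    """True if the treatment distance is larger than every replicate distance."""
    return between_distance > max(within_distances)
```

The second function encodes the decision rule only; the distances themselves would come from the PCA or PLS-DA scores of the relevant groups.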

Figure 16. Analysis of overall statistically significant differences in the 1H-NMR spectral binning data obtained for the control community. Panel A depicts samples from days 2 – 14 of the three bioreactors fed the high fiber medium formulation, grouped by replicate (1 – 3). Panel B depicts samples from days 2 – 14 of all six bioreactors grouped by medium formulation (high fiber and high protein). The left panel is the result of principal component analysis (PCA) conducted in R software version 3.5 by the package ropls version 1.12, with plots, including ellipses assuming the multivariate t distribution, drawn by the package ggplot2 version 2.2.1. The right panel is the result of non-metric multidimensional scaling (NMDS) of Euclidean distance matrices conducted by the package vegan version 2.5.2, with plots, including the best solution of untargeted partitioning around medoids clustering determined by average silhouette width using the package fpc version 2.1.11, drawn by the package ggplot2. Panel B shows a clearer separation than panel A, indicating that the effect of diet exceeds stochasticity.

4.5.2 Microbial community response to dietary changes

The response of both the CC and AC to changes in medium formulation from HF to HP was determined using both the normalized sequence count and 1H-NMR spectral binning data. It was also of interest to determine if the state of the microbial communities was the same when grown in one medium formulation throughout a run compared to when switched to that medium formulation after being initially grown in a different medium. This latter evaluation enabled us to assess whether any species exhibited a more permanent adaptation to the environment during the course of the assembly period. With the decided criteria from the above objective, i.e., the between-group Mahalanobis distance exceeding the within-group Mahalanobis distance and untargeted PAM clustering recapitulating the expected pattern of clustering samples by diet, neither community altered its composition in response to dietary change (Table S5). However, the dietary change did elicit a clear significant difference in the metabolite profile of both communities (Table S5; Figure 16). The untargeted PAM clustering approach reliably recapitulated two clusters, each containing only the samples of one specific diet, as the best solution. The ASWs also exceeded the values obtained from the clustering solutions within conditions (replicates), at 0.29 and 0.25 for the CC and AC respectively. The Mahalanobis distance derived from PCA was also larger than the maximum value obtained between replicates of the CC, i.e., 19.3 compared to 18.1. For the AC, however, the result was less clear-cut, as the Mahalanobis distance was less than the maximum value obtained between replicates, i.e., 17.6 compared to 19.1. In contrast, the Mahalanobis distances derived from PLS-DA did not meet the criteria for either the CC or AC. Finally, there was an absence of significant difference between the communities that were originally cultured in one medium when compared to being changed to that medium when initially grown in a different medium, suggesting that within-experiment adaptation was not a confounding factor (Table S5).

Based upon the above results, we conducted a series of Kruskal-Wallis tests with Benjamini-Hochberg correction and Dunn’s post-hoc analysis to determine which features were statistically significantly different between the medium formulations in which the communities were initially grown, in addition to calculating the effect size via the η². The normalized sequencing data was used to evaluate shifts in taxonomic abundance, and as expected, no statistically significant differences were found (data not shown). The 1H-NMR metabolite profiles, however, revealed > 15 metabolites that exhibited significant changes in concentration in both the CC and AC (Table S6). The concentrations of SCFAs and select metabolites of interest that were significantly different between growth medium conditions are additionally depicted in Figure 17 and Figure 18, respectively. Ten percent error bars were added to these plots based on the found median 1H-NMR metabolite profiling measurement error of 9.7% by Sokolenko et al. [253]. Most of these alterations were identical between communities, including increased concentrations of several amino acids and their specific fermentation by-products, a lower concentration of methanol and a higher concentration of uracil in the HP medium. Community specific deviations included increases in concentration of glycine, isoleucine, leucine, succinate and valine, and a decrease in concentration of valerate for the CC in the HP medium, whereas the AC had a higher concentration of isobutyrate and a lower concentration of glyoxylic acid.
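
The per-feature testing described above can be sketched as follows for the simplest case of two groups (e.g., one metabolite compared between the HF and HP media). This is a minimal illustration in plain Python, not the R workflow used in this work. It assumes the chi-square approximation with one degree of freedom for the p-value, and it assumes that the effect size is the commonly used H-based estimate η² = (H − k + 1)/(n − k); the thesis does not state which η² formula was applied. Dunn’s post-hoc test is omitted because it is only needed for more than two groups.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_two_groups(a, b):
    """Return (H, p-value, eta squared) for two independent groups."""
    values = list(a) + list(b)
    n = len(values)
    ranks = average_ranks(values)
    rank_sum_a, rank_sum_b = sum(ranks[:len(a)]), sum(ranks[len(a):])
    h = 12 / (n * (n + 1)) * (rank_sum_a ** 2 / len(a)
                              + rank_sum_b ** 2 / len(b)) - 3 * (n + 1)
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    # Tie correction; undefined if every value is identical.
    h /= 1 - sum(c ** 3 - c for c in counts.values()) / (n ** 3 - n)
    h = max(h, 0.0)
    p_value = math.erfc(math.sqrt(h / 2))  # chi-square survival function, 1 df
    eta_squared = (h - 2 + 1) / (n - 2)    # assumed formula, k = 2 groups
    return h, p_value, eta_squared


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In use, `kruskal_two_groups` would be applied to each taxon or metabolite in turn, and the resulting list of p-values passed once to `benjamini_hochberg` so that the correction accounts for every feature tested.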

Figure 17. Concentrations of short-chain fatty acids determined by 1H-NMR metabolite profiling in bioreactor samples over time. Replicates 1 – 3 are the communities that were initially grown in the high fiber medium formulation, whereas replicates 4 – 6 are the communities that were initially grown in the high protein medium formulation. Panel A depicts the control community, and panel B depicts the artificial community. Plots were drawn in R software version 3.5 by the package ggplot2 version 2.2.1, with 10% error bars representing the expected amount of technical measurement inaccuracy.

Figure 18. Concentrations of select metabolites that were statistically significantly different between medium formulations or communities as determined by 1H-NMR metabolite profiling in bioreactor samples over time. Replicates 1 – 3 are the communities that were initially grown in the high fiber medium formulation, whereas replicates 4 – 6 are the communities that were initially grown in the high protein medium formulation. Panel A depicts the control community, and panel B depicts the artificial community. Plots were drawn in R software version 3.5 by the package ggplot2 version 2.2.1, with 10% error bars representing the expected amount of technical measurement inaccuracy.

4.5.3 Effect of coevolution on microbial community structure and behaviour

Finally, the differences between the CC and AC in both media were evaluated overall and between individual taxa and metabolites as above, to determine if potential coevolution impacted community composition or behaviour. With the set criteria, there were no significant differences in either the overall compositional or metabolite landscape (Table S5). However, there were several significant differences between individual taxa and metabolites. For the taxa, [Eubacterium] rectale (HF/HP; p-value = 5E-6/4E-4; effect size = 74%/62%), Faecalicatena fissicatena (HF/HP; p-value = 2E-5/6E-4; effect size = 60%/54%) and Coprococcus comes (HF only; p-value = 1E-5; effect size = 65%) were significantly altered. It was observed that both E. rectale and Co. comes were virtually undetected in the AC, with the latter likely only reaching statistical significance in the HF condition due to its relatively higher abundance in the CC. On the other hand, Fa. fissicatena was present in both communities, but at a lower abundance in the CC. For the metabolites, several amino acids (aspartate, isoleucine, leucine, valine), organic acids (formate, lactate, succinate) and uracil had increased concentrations in the CC, whereas branched-chain fatty acids (isobutyrate, isovalerate) reached higher concentrations in the AC (Table S6; Figure 18). In the HF medium only, the AC additionally had higher concentrations of the amino acid derived metabolites 4-aminobutyrate and ornithine, and the SCFAs acetate and propionate (Table S6; Figure 17). Meanwhile in the HP medium only, the AC had increased 4-hydroxyphenylacetate, glutamate and valerate concentrations, while the CC had higher methionine and p-cresol concentrations (Table S6; Figure 18).

4.6 Discussion

Understanding the drivers of community succession is not only useful for researchers utilizing technologies to simulate microbial ecosystems, such as bioreactors, but can also translate to real world applications. For the human gut microbiota, it could assist in the design of therapeutic strategies aiming to ameliorate abnormalities exhibited in GI disorders or the building of predictive tools to project changes over time, including during infant development. Here, we examined the relative impact of environmental selection and stochasticity on succession through use of HF and HP medium formulations. Further, we differentiated the effects of habitat and member coevolution within environmental selection, by creating both a microbial community derived from a single fecal sample and a species-wise equivalent community sourced from individual donor fecal samples.

First, it was essential to determine the point at which the microbial communities had proliferated and interacted to form a stable equilibrium, known as ‘steady-state’. Removing time level variation is important for researchers utilizing bioreactor-based models, as it eliminates technical artifact bias that can confound results. We found that for the CC, steady-state was reached by day 4. This result is much earlier than what has been suggested by other studies using SHIME [208, 362] or single-vessel systems [198]. There are several reasons for the discrepancy. This previous work has used observational techniques, such as moving window analysis or UniFrac clustering, to determine the day at which the microbial community had stabilized. These methods do not statistically validate the variation, resulting in overestimation of the value. Additionally, steady-state tends to be measured within each replicate, rather than assessed across the dataset as a whole. Microbial community dynamics between replicates were not found to be 100% reproducible. Therefore, ‘steady-state’ can ultimately be defined as the point at which time level variation no longer exceeds the extent of replicability, as upon reaching this range statistical tests would no longer be confounded by this technical property. Finally, biological differences could also contribute to this low value, since our experimental communities were of a far reduced complexity compared to the typical fecal sample inoculum. For example, as there would be fewer interaction types and processes in our experimental communities compared to fecal inocula, equilibrium would be achieved more quickly according to relevant mathematical models, such as Lotka-Volterra [375]. Of note, we recommend our approach of utilizing whole sequence count and spectral binning data, as opposed to individual taxa and metabolites, to gauge steady-state for a complex microbial community. Not all metabolites stabilized in concentration at the same time, and the rate at which individual metabolites reached flatlined concentrations was not identical between replicates.
Particularly, the metabolites most frequently measured in human fecal-associated communities due to their importance and dominant concentrations, the SCFAs [56, 111], often stabilized faster than other metabolites, e.g., amino acid-derived fermentation by-products (Figure 17; Figure 18).

The property of significant variation between replicates of the same microbial community under identical conditions is not a new observation but has been rather frequently observed in bioreactors [350–354]. It has been termed ‘multistability’, which occurs when numerous entities have non-linear interactions or feedback loops, and thus multiple alternative stable states are able to exist [170]. This phenomenon has been observed both in environmental ecosystems in situ [376, 377] and the human gut microbiota, with the best example of it being reported after antibiotic treatment [170, 183, 347]. Therefore, multistability is not an artifact of the modeling technology but rather an accurate representation of this naturally inherent property. Switching between such states can occur either under a gradual external influence or after experiencing a substantial perturbation [170]. Apart from antibiotics, it is unclear which factors could drive these changes in the human gut microbiota. However, understanding this natural variation is not only important for scientists to consider in study design, but also has tangible real-world applicability. For the human gut microbiota, a good example would be IBD. IBD is defined as chronic, recurrent inflammation of the GI tract [378], and a well-documented descriptor of the altered state of these patients’ gut microbiota is a lack of butyrate [290]. Butyrate is known to be a fuel source for intestinal epithelial cells [111] and is anti-inflammatory via several potent mechanisms [68]. Here, we have shown in the CC that butyrate can vary by as much as 10 mM between replicates (Figure 17). Therefore, it might become more understandable how the gut microbial ecosystem of an IBD patient can seemingly switch between a remission of ‘health-promoting’ microbial behaviours and a pro-inflammatory state [378] and how antibiotic treatment has had some success in inducing remission [64]. Bioreactors thus present a useful tool for quantifying the extent of this stochasticity and determining which factors induce microbial community reassembly through examining within run variability over time. However, it is also important for researchers to ensure that the applied perturbation in their experiment exceeds this inherent variability in order to prove an overall change in microbial community composition or behaviour; simply obtaining a significant p-value is not enough, as such a result can be attained when comparing replicates of identical conditions. We therefore recommend calculating the differences between and within conditions in the Mahalanobis distance of groups after PCA and in the ASW plus cluster membership after PAM clustering of Euclidean distance matrices (Figure 16).
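
The clustering half of this recommendation can be illustrated with a short sketch that computes the average silhouette width (ASW) of a given cluster assignment under Euclidean distance and checks whether the clusters map one-to-one onto the expected groups. It assumes the cluster assignments have already been produced, for example by PAM, which is not implemented here; at least two clusters are required, and the function names are illustrative only.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_silhouette_width(samples, clusters):
    """Mean silhouette width over all samples for a given clustering."""
    n = len(samples)
    labels = sorted(set(clusters))
    widths = []
    for i in range(n):
        mean_dist = {}
        for lab in labels:
            members = [j for j in range(n) if clusters[j] == lab and j != i]
            if members:
                mean_dist[lab] = sum(euclidean(samples[i], samples[j])
                                     for j in members) / len(members)
        own = clusters[i]
        if own not in mean_dist:  # singleton cluster: width defined as 0
            widths.append(0.0)
            continue
        a = mean_dist[own]                                        # cohesion
        b = min(d for lab, d in mean_dist.items() if lab != own)  # separation
        widths.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return sum(widths) / n


def clusters_match_groups(clusters, groups):
    """True if each cluster holds exactly one group and vice versa."""
    pairs = set(zip(clusters, groups))
    return len(pairs) == len(set(clusters)) == len(set(groups))
```

A treatment effect would then be supported when `clusters_match_groups` is true for the treatment labels and the ASW is higher than the values obtained when clustering replicates of a single condition.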

The pairwise Mahalanobis distances after PLS-DA were contrastingly always large regardless of which conditions were being compared, but there are reasons to interpret such results with caution. Worley and Powers found that when they added increasing noise to statistically significantly separated NMR datasets, the PCA scores-space distance rapidly decreased, whereas the validity of the PLS-DA model diminished but the scores-space distance remained unaffected [379]. Goodpaster and Kennedy also observed a 666% increase in Mahalanobis distance by PLS-DA compared to PCA in a not significantly separated NMR dataset [369]. Finally, Westerhuis et al. demonstrated that the PLS-DA model aggressively overfits, revealing excellent separation even with random data [380]. They thus advocated the use of permutation testing to cross-validate the model and concluded that PLS-DA should not be implemented to determine the magnitude of class separation. We did complete this cross-validation, and once the model passed these parameters, there was never an instance at which the distance between clusters was not statistically significant. Therefore, our results add further evidence to these studies, and we concur with the recommendation that statistical testing and evaluation of cluster separation magnitude from PLS-DA models should not be conducted. Instead, model cross-validation can be used to confirm that the data can be separated, and PLS-DA can then be valuable in feature selection, as it is the only dimension reduction method of these three approaches that can quantify the contribution of individual variables [381].

We thus found that growth medium did significantly alter the microbial community metabolic behaviour but not the composition. The latter result is in line with previous observations that found only minor alterations in microbial community species structure after a dietary change [84, 162, 346], and the composition of the human gut microbiota is also reportedly robust in adulthood, with a 60% microbial strain retention rate in a five-year window [147]. Several findings from other groups contrast with our results. For example, Walker et al. found that certain taxa were diet responsive within an individual’s gut microbiota [162], Wu et al. found that the dietary trends must continue over a long-term period, i.e., within the span of months to years, rather than weeks in order to elicit a change in the composition of an individual’s gut microbiota [84], and David et al. found that the dietary change must be sufficiently extreme, such as the switch from an entirely animal-based to plant-based diet, to cause a shift in the species structure of an individual’s gut microbiota [161]. As our microbial communities were reduced in diversity compared to the fecal ecosystems from which they were derived, it is entirely possible that the diet responsive microbes had been omitted. For example, our communities lacked many of the species found to proliferate in specific diets by Walker et al., including Ru. bromii and Oscillibacter spp. [162]. However, the authors also concluded that not all individuals contained diet responsive microbes within their gut microbiota, and thus our observation is not necessarily a technical deficit. Our induced dietary change was also short-term, with the experiment run in two-week blocks, and our two media formulations contained the same substrates, but in different proportions (Table S1). Interestingly, David et al. found that the switch in microbial community structure between their extreme dietary states was reproducible within individuals [161]. Therefore, our resultant lack of difference between the assembled and switched microbial communities growing in the same medium is supported by both David et al. and Wu et al. [84, 161], indicating that a much longer time period is required for adaptation.
We would thus conclude that for the typical duration of bioreactor experiments, it is not necessary to consider microbial community adaptation as a confounding factor. One exception to this rule might be under conditions when a substrate that a diet responsive microbe requires is completely missing; then, the said microbe might not be able to integrate into the community during the turbulent initial assembly stage. An example of this is illustrated by Walker et al. and Ze et al.; when Ru. bromii lacks resistant starch, it is dramatically reduced in relative abundance within a human gut microbial community [162, 343]. However, Ru. bromii is also likely to have survival mechanisms in place to remain present in a human gut environment during the regular feeding-fasting oscillations; for example, isolates from human fecal samples are also able to consume galactose and fructose, and can sporulate [382]. Thus, this hypothesis would need to be tested in future work.

The ability of the healthy human gut microbiota to maintain a stable composition even after dietary change can be attributed to the wide diversity of primary degrading generalists that it contains, namely the Bacteroidetes [72, 73]. We would expect ecosystems with similar properties to behave in the same manner. The metabolic changes were as expected: both proteolysis (higher amino acid concentrations) and amino acid fermentation (higher concentrations of by-products specific to these metabolisms [99, 119]) increased in the HP medium (Figure 18). The lower methanol concentrations can be attributed to decreased pectin, as it is a side product of pectinolytic activity [114]. The higher concentrations of uracil, a pyrimidine catabolic intermediate, could be ascribed to the more unfavorable growth conditions (Figure 18) [383, 384]. In cultures of E. coli, the release of uracil was found to occur during perturbation of balanced growth conditions, particularly energy-source downshifts, at which there was a higher rate of cell death and lysis [383, 385]. Carbohydrates are the preferred substrate of the human gut microbiota, as found by experiments testing different carbohydrate:protein ratios [54, 110]. Carbohydrates typically exhibit a higher available energy content than proteins, since monosaccharides are more repetitive within the biomacromolecule and require fewer interconversion steps than amino acids [99, 109]. Further, uracil has been proposed to act as a secondary bacterial messenger promoting the fitness and survival of the secretors, such as via the enhancement of virulence factors [386], and was found to be a ligand for dual oxidase-dependent reactive oxygen species generation in the gut [383]. It has been reported that diets high in protein and low in fiber can promote colonic disease [346, 387], which is thought to be due to the production of toxic phenolic, biogenic amine and sulfur compounds from amino acid fermentation [98–100], but perhaps uracil secretion could be another factor.

The control and artificial, non-coevolved communities were similar, but not identical to each other, as indicated by the non-significant differences in overall composition and metabolic behaviour. In terms of species structure, both E. rectale and Co. comes failed to integrate into the AC. Both species are saccharolytic and capable of degrading fructans, e.g., inulins; E. rectale can also utilize starch and xylan-derived polymers [43, 388, 389]. The activity of fructan degradation is highly specific, however, as it is restricted to only certain types of inulins and is strain variable. Intriguingly, Lozupone et al. built a co-occurrence network from fecal metagenomic data collected from 124 unrelated adults and found that Co. comes co-occurred with E. rectale [390]. E. rectale in particular is known to require cooperation with other species to utilize resistant starches, as it is incapable of conducting this activity on its own [391, 392]. Therefore, most studies aiming to observe this process have cocultured E. rectale with either Ru. bromii [343], Bifidobacterium spp. [393] or Bacteroides spp. [394]. Since there was no Ru. bromii in our communities, and Bifidobacterium spp. had variable and low-level presence amongst the experimental replicates of both the CC and AC, we will restrict our discussion to B. ovatus. Further, Lozupone et al. found that Bacteroides spp. additionally co-occurred with E. rectale and Co. comes [390]. An elegant study conducted by Rakoff-Nahoum et al. discovered that a strain of B. ovatus produced both membrane-bound and secreted forms of a glycoside hydrolase capable of degrading inulin [190]. When the secreted form of the enzyme was knocked-out, the fitness of B. ovatus was not impacted when grown in monoculture but was significantly diminished when grown in a community setting. Tuncil et al. additionally demonstrated that different Bacteroides spp., namely B. thetaiotaomicron and B. ovatus, had reciprocal glycan substrate preferences that were maintained from monoculture to coculture [395]. The combination of these two studies would indicate potential mechanisms of coevolution within a gut microbial ecosystem, such that species would adapt to occupy unique niches or collaborate to exploit the same niche, at least in terms of polysaccharide consumption. Therefore, it is possible that the B. ovatus strain in the AC lacked secretory catabolic enzymes that would have assisted E. rectale and Co. comes in utilizing, e.g., fructans, or that the glycan substrate preference of the B. ovatus had shifted to consuming the fructans that E. rectale and Co. comes could have otherwise degraded themselves, thus effectively outcompeting them. Alternative hypotheses beyond metabolism might be a communication mismatch between strains through, e.g., quorum sensing [396] or uncomplementary antimicrobial defense mechanisms [176]. This result thus not only revealed the presence of a possible, cooperative microbial guild within the CC, but also demonstrated that strain level variation is an important property to consider in studies aiming to examine microbial community cooperation. In the case of Fa. fissicatena, another saccharolytic bacterium from the family Lachnospiraceae [397, 398], the strain may have simply expanded within the AC to partially fill niches that were vacated by the absence of E. rectale and Co. comes. These alterations in taxa did impact the metabolic output of the AC when compared to the CC. Both E. rectale and Co. comes produce lactate and occasionally formate as by-products of fermentation [43, 389], which were of significantly higher concentrations in the CC. Not only does Fa. fissicatena produce acetate as its major fermentation product [397, 398], but E. rectale also consumes acetate as part of its butyryl-CoA:acetyl-CoA transferase pathway for butyrate production [399], thus leading to the significantly decreased acetate in the CC in the HF medium (Figure 17).

Another intriguing observation of metabolic differences between the communities was the elevated amount of amino acid fermentation in the AC, as supported by decreased amino acid concentrations and increased concentrations of their specific fermentation by-products [99, 119]. Further, this heightened activity would explain the lower amount of separation between clusters by medium formulation when evaluating overall statistical significance in the AC compared to the CC. Particularly, there was evidence of Stickland fermentation occurring via higher concentrations of isobutyrate, isovalerate and valerate, which is a metabolism specific to the (Figure 18) [33, 34]. An interesting finding by Shoaie et al. was that when E. rectale was cocultured with B. thetaiotaomicron, it switched its gene expression from fermenting saccharides to amino acids [394]. The strain thus likely responded to the introduced competition by occupying a different niche. Together, these results would suggest that polysaccharide utilization by the Firmicutes is dependent on a collaborative effort due to their more highly specialized metabolisms [72, 73], and in the absence of cooperation, these species will become putrefactive instead. Many GI disorders are characterized by an inherent low diversity [185], and a loss of bacteria from Clostridium cluster XIVa, which includes E. rectale and Co. comes [64, 69]. Further, there is often a concurrent lack of butyrate, produced exclusively from fermentation by Firmicutes [399], and increased inflammation, which can be promoted by the products of protein fermentation [98–100]. Therefore, we suggest that an erosion of cooperative interactivity has occurred in these situations, resulting in extinction of particularly dependent species and altered behaviours from those that remain, which exacerbate symptoms.

Further evidence of a loss of cooperation between community members would include a lack of cross-feeding, the lower growth density achieved and an increased variability. During growth in HP medium, there was a significantly higher concentration of 4-hydroxyphenylacetate and lower concentration of p-cresol in the AC compared to the CC (Figure 18). Both are products of tyrosine catabolism; 4-hydroxyphenylacetate is produced from 4-hydroxyphenylpyruvate, which generates energy, and then 4-hydroxyphenylacetate can be converted by decarboxylation to p-cresol [400]. Valerate can additionally be a product of energy generation by Clostridia from a process that consumes both ethanol and propionate, which are main anaerobic fermentative metabolites in gut microbial ecosystems [32]. Although valerate is also derived from Stickland fermentation, we believe the former process to be occurring in the CC, as valerate is of a significantly increased concentration in the HF medium compared to the HP medium, and the AC thus only has a significantly higher concentration of valerate in the HP medium when compared to the CC (Figure 17). These two cross-feeding processes not only recycle metabolites to generate more energy, but also increase turn-over, which prevents build-up of end-products that slow metabolism, and thus the ecosystem becomes more efficient [63, 123]. The improved efficiency would elevate both the rate and capacity of growth. Uracil is derived from bacterial RNA and is thus thought to additionally serve as a proxy for relative cell counts, as it was found to be of significantly lowered concentrations in the fecal samples of patients receiving antibiotic treatment [384, 401, 402]. Uracil was of a significantly higher concentration in the CC compared to the AC, which could be evidence of enhanced growth (Figure 18). Further, the AC did not reach steady-state in the HP medium until day 6, two days later than the CC.
This effect was not observed in the HF medium; however, as steady-state is defined as the point at which time level variation no longer exceeds the extent of replicability, the discrepancy may stem from a heightened variability. Indeed, the AC had a significantly higher β-dispersion than the CC by species composition in the HF medium (Figure S6). As the community behaviours may be less deterministic rather than settling into cooperative roles in the AC, it could be more random as to which species are able to acquire a particular niche, since the fitness inequalities among microbial species inhabiting the gut are generally low [187, 349, 403]. As bacteria from the gut microbiota are known to prefer carbohydrates [54, 110], the higher substrate availability in the HF medium could be less selective and therefore promote chaos in the system, decreasing replicability [349, 354, 404]. Such factors should be important considerations for researchers designing communities or coculture assays to address hypotheses related to microbial interactivity.
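
The notion of β-dispersion referred to above can be illustrated as the average distance of each sample to its group centroid. The sketch below assumes Euclidean distance in the original feature space for simplicity; the distance measure and software settings behind Figure S6 are not restated here, and the function names are illustrative only.

```python
import math


def distances_to_centroid(samples):
    """Euclidean distance of every sample to the centroid of its group."""
    n = len(samples)
    dims = len(samples[0])
    centre = [sum(s[k] for s in samples) / n for k in range(dims)]
    return [math.sqrt(sum((s[k] - centre[k]) ** 2 for k in range(dims)))
            for s in samples]


def mean_dispersion(samples):
    """Mean distance to centroid: larger values mean less reproducible samples."""
    distances = distances_to_centroid(samples)
    return sum(distances) / len(distances)
```

Comparing `mean_dispersion` between the samples of two communities gives the direction of the difference; whether that difference is significant would still require a test on the individual distances, for example by permutation.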

One limitation to our study was the methods utilized to determine microbial strain purity. Despite completing rigorous dilutions to extinction and conducting 16S rRNA marker gene sequencing by an Illumina MiSeq for each strain, we found these techniques to be inadequate at detecting and removing all contaminants. In the CC, we found that Ak. muciniphila bloomed in the bioreactors, which was initially unexpected. Upon completing an Ak. muciniphila specific PCR [405] on the gDNA collected from each strain, we found it was present in the Acidaminococcus intestini stock. This result was in spite of being completely undetected by marker gene sequencing, and since an average of > 10 000 total reads were attained, its presence was indicated at a rate of less than one in every 10 000 cells (Table S7). When we completed a re-extraction and re-sequencing of gDNA obtained from Ac. intestini cultured on FAA supplemented with mucin, the preferred growth substrate of Ak. muciniphila [37], we were then able to detect it; at this point it achieved 10% of the total growth (Table S8). To compensate, we added a strain of Ak. muciniphila to the AC prior to conducting the bioreactor experiments for this community and then scrutinized the sequencing count data for any other outliers once completed (Table S6). We found an unexpected number of reads classified as Phascolarctobacterium, and when Phascolarctobacterium specific PCRs [406, 407] were conducted on the gDNA of the bioreactor samples and strains, we found Phascolarctobacterium faecium to be present in all six replicates and the E. rectale stock of the AC, but not in the CC. Again, it was present at a rate of 1 per 10 000 cells of E. rectale. Ph. faecium did not reach high numbers, however, as it was not statistically significantly increased in the AC when compared to the CC. The fact that it had amounted to above 1% total percentage contribution may be attributed to sequencing error or cross-contamination.
We examined the possibilities as to how its inclusion could confound our experiment. Interestingly, E. rectale did not integrate into the AC, so any interactions resulting from coevolution between it and Ph. faecium were absent. Ac. intestini was also able to colonize the CC with a contaminant present at an equivalent amount as E. rectale, so we doubt its inclusion would have negatively impacted the ability of E. rectale to incorporate into the AC either. Ph. faecium is asaccharolytic and incapable of Stickland fermentation, as it instead consumes succinate as a substrate to produce propionate [406, 408]. Therefore, the inclusion of Ph. faecium would not have altered any of our discussion points regarding polysaccharide utilization networks or increased putrefactive activity. The only influence it thus appeared to have had was the significant decrease in the concentration of succinate and increase in the concentration of propionate in the AC (Figure 17). The latter difference was only observed in the HF medium, providing more evidence that the CC was using propionate to make valerate [32]. We thus determined that our experiment remained of sufficient validity to address our hypothesis, and decided to continue due to a lack of viable alternatives for detecting contaminants. Hopefully, new technology that can improve sequencing accuracy, reduce cross-contamination and enhance the obtained number of reads will address this issue in the future; for a review of current next-generation sequencing developments, please refer to Goodwin et al. [409]. However, these drawbacks should presently be carefully considered by other researchers.

We have concluded that stochasticity is a property inherent to human gut microbial ecosystems, but is exceeded by forces of environmental selection, at least in terms of driving microbial community behaviour. Substrate availability also seems to dictate functionality over cooperative interactivity, but that does not preclude the existence of coadaptation. Our work is consistent with the observations of previous studies, which indicate that microbial community assembly in the human gut is a deterministic process [172], that habitat filtering predominates over species assortment [167], and that competitive interactions are more numerous than cooperative ones [187], although microbial guilds covering metabolic modules exist [188, 189]. We have also suggested a methodology to elucidate steady-state, shown that additional testing is required to determine overall statistically significant differences of a treatment, and proposed that microbial community alterations exhibited in GI disorders could result from a break-down of cooperation. Future work studying microbial interactions should consider strain level variation, and could, for example, compare interactions between strains derived from ‘healthy’ versus low diversity ecosystems.

Chapter 5 – Competition, Niche Processes & Redundancy

The research within this chapter aimed to explore how competitive exclusion, i.e., species assortment, influenced the ability of incoming allochthonous microbes to colonize. Further, when such microbes were able to integrate into a human fecal-associated microbial community, it was of interest to determine whether new functionalities were added or the redundancy was simply improved. To accomplish these tasks, a defined microbial community was derived from a UC patient fecal sample and cultured in vitro. This community was thus naturally low in diversity [185], and as expected in IBD patients, lacking in functionality that typically benefits the host [64, 69]. Therefore, I expected new functionalities to be added to a degree, as hypothesized by the graphical relationship between ecosystem functionality and stability (Figure 4). After stabilization, a subset of the bioreactors was treated with the clinically relevant antibiotic rifaximin [410–413] to dampen competition. A defined microbial community derived from a healthy donor fecal sample, simulating MET, was then added to the control and antibiotic treated bioreactors. The possible unique functionalities of the microbial species that incorporated in each case were then determined by subtracting the KEGG orthologous functional groups [119] of the base microbial community members from those obtained from the genomes of the incorporated species. This study fits into the second component of my goals through directly addressing my hypothesis that microbial ecological theory can be replicated utilizing complex defined microbial communities. The following body of work is presented in the format of an original research article and is under consideration for publication by the journal Microbiome.

5.1 Article Information

Antibiotic pre-treatment of an ulcerative colitis associated fecal microbial community does not improve integration of therapeutic bacteria in vitro

Kaitlyn Oliphant1, Valeria R. Parreira1, Kyla Cochrane1, Kathleen Schroeter1, Michelle C. Daigneault1, Sandi Yen1, Elena F. Verdu2 & Emma Allen-Vercoe1

Affiliations

1Department of Molecular and Cellular Biology, University of Guelph, 50 Stone Rd E, ON, Canada N1G 2W1

2Division of Gastroenterology, Department of Medicine, Farncombe Family Digestive Health Institute, McMaster University, 1280 Main St W, Hamilton, ON, Canada L8S 4K1

Acknowledgements

We would like to thank Dr. Marc Aucoin in the Department of Chemical Engineering at the University of Waterloo in Waterloo, Canada for his assistance in calculating bioreactor antibiotic concentrations. We would like to acknowledge the Crohn’s and Colitis Canada grant to EFV and EAV, and the Canadian Institutes of Health Research scholarship and Ontario Ministry of Training, Colleges and Universities scholarship to KO for providing funding.

5.2 Abstract

FMT is a proposed strategy for the treatment of UC through remediation of the patient gut microflora. However, its therapeutic success has varied, necessitating research to uncover mechanisms that improve patient response. Antibiotic pre-treatment has been proposed as one method to enhance the success rate in UC by increasing niche availability for introduced species that can alter the underlying ecosystem. The purpose of this study was to use an in vitro, bioreactor-based, colonic ecosystem model to determine how pre-treatment with rifaximin, an antibiotic shown to be efficacious for the treatment of UC flares, influenced engraftment of bacterial strains sourced from a healthy donor into a defined microbial community representative of a UC patient fecal ecosystem. Distinct species integrated in the treated and untreated conditions. Metabonomic analysis revealed that the untreated community was characterized by saccharolytic fermentation, whereas the treated community was associated with amino acid degradation. Predictive functional analysis indicated a surprisingly small number of introduced unique KEGG orthologies (KOs), suggesting most of the engrafted microbes were filling in overlapping niches. We conclude that the antibiotic pre-treatment did not improve incorporation of introduced microbes.

5.3 Introduction

The consortium of microorganisms that populate the human GI tract, referred to as the gut microbiota, is known to be a critical component of human health, contributing to essential processes such as the metabolism of otherwise indigestible foods to produce nutrients [99, 355] and immune system modulation [356]. Multiple GI disorders are associated with derangements in colonic ecosystem composition and function, loosely referred to as ‘dysbiosis’ [357, 358]. Whether dysbiosis is a cause or an effect of disease remains to be elucidated in most cases; however, procedures that aim to ameliorate dysbiosis through the replenishment of microbes are currently being studied. Fecal microbiota transplantation (FMT) is one such method, which entails the transfer of microbes directly from stool provided by a pre-screened donor into a patient [357, 358]. FMT gained traction as a therapeutic due to its success in curing RCDI [359], but its efficacy in treating other GI disorders, such as metabolic syndrome [414] and IBS [415], is not clear, and IBD in particular exemplifies the current insufficiencies of FMT as a therapy.

IBD is defined as chronic, reoccurring inflammation of the GI tract, and its incidence continues to increase in developed countries [378, 416]. IBD is an umbrella term for a group of diseases that include CD and UC [378, 416]. The potential of FMT to reduce symptoms or induce remission in IBD is indicated by the current microbiota-mediated model of IBD pathophysiology [150], in addition to dysbiosis being an often-described phenomenon [378]. UC is a particularly attractive candidate for FMT since inflammation is limited to the colon, which is more readily accessible for treatment. However, a recent meta-analysis of 18 FMT clinical trials has suggested an overall clinical response rate of only 46.18 ± 25.08% in UC patients [360]. Thus, focus has now advanced towards developing an understanding of the mechanisms and factors contributing to the effective instances of FMT.

Several aspects of FMT application have been proposed as deserving of further scrutiny, including route of delivery, dosing and patient-donor matching [378]. A meta-analysis conducted by Keshteli et al. concluded that antibiotic pre-treatment improved the clinical response rate of FMT in UC patients [417] and antibiotics, such as rifaximin, have been used with some success for the treatment of UC [410–413]. However, current guidelines do not recommend it as a treatment option because of the risk of antimicrobial resistance [418–420]. The use of rifaximin as a pre-treatment prior to FMT in UC cases has not yet been explored. Therefore, we conducted a pilot study to determine if applying rifaximin as a pre-treatment could increase the incorporation of allochthonous microbes sourced from a healthy donor into a UC-derived microbial ecosystem. To test this hypothesis, we utilized bioreactors as in vitro models of the distal human gut, populated with defined microbial communities. This approach not only allowed strict control over confounding parameters that are prevalent in human studies, but also enabled us to gain insight into the mechanistic determinants of integration through strain characterization and deeper analysis.

5.4 Methods

5.4.1 Bioreactor operation and defined microbial communities

A Multifors bioreactor system (Infors AG, Bottmingen/Basel, Switzerland) was operated as an in vitro model of the distal human gut with a working volume of 400 mL as described previously [200]. The bioreactor medium formulation was as stated for replicate one of this study. Thereafter, a single component of the formulation, xylan (Sigma Aldrich, Saint Louis, MO, USA), was replaced by xylooligosaccharide (BioNutrition, Laval, QC, Canada) because of discontinuation of the xylan product; however, this formulation change was not found to affect ecosystem function or dynamics by partial least squares discriminant analysis calculated through use of R package ropls version 3.7 (Figure S7). Bioreactor vessels were inoculated with a defined microbial community that comprised 24 bacterial strains originally derived from a UC patient fecal sample (Table S8) [199]. Vessels were allowed three weeks to equilibrate [200]. Three replicate vessels each were assigned to the conditions of control and treatment, for a total of six vessels. Control vessels were administered 10 mL of the MET in a relative ratio as in Petrof et al. [23], in which additional isolates were included to improve the metabolic diversity of the formulation (Table S8). Rifaximin resistance profiles for all strains were analyzed by the Kirby-Bauer disc diffusion test using discs impregnated with 40 μg rifaximin as in Huhulescu et al. [421], prior to the bioreactor experiments. Treatment vessels were perturbed with the clinically relevant dosage of 200 mg/d [411] rifaximin (Sigma Aldrich), which was used with equivalence in vitro since the antibiotic is not gut-permeable (< 0.4% systemically absorbed) [422]. As such, the vessels were pulsed with 100 mg using 5 mL ethanol as a carrier every 12 hr for 5 d, prior to the administration of MET. In this condition, MET was added twice, after a 2 d and 4 d wash-out period, respectively, to increase the probability of capturing the point at which the underlying microbial ecosystem had not fully recovered from the perturbation, yet enough of the residual antibiotic had been removed to not disturb the incoming microbes. The hypothetical amount of rifaximin remaining at these time points was calculated through use of R package deSolve version 1.21, at which roughly 23% of the maximum concentration would have been retained at the first time point and 3% at the second time point (Figure S8). Sampling was conducted at time points directly after equilibration, directly after the course of antibiotics, and 10-14 d post MET replenishment.
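
The reasoning behind such a wash-out estimate can be illustrated with a minimal sketch. It assumes an ideal, well-mixed continuous culture in which rifaximin is neither degraded nor bound, so that its concentration decays exponentially at the dilution rate between pulses, and it measures wash-out time from the final pulse. The dilution rate below (one working volume per day) is a hypothetical placeholder rather than the value used in this study, and the closed-form Python calculation merely stands in for a deSolve model; the fractions it prints are therefore illustrative and are not expected to reproduce Figure S8.

```python
import math


def pulsed_washout(dose_mg, volume_ml, dilution_per_day, n_doses, interval_days):
    """Concentration (mg/mL) immediately after the final pulse.

    Assumes instantaneous mixing and first-order wash-out between pulses.
    """
    decay = math.exp(-dilution_per_day * interval_days)
    conc = 0.0
    for _ in range(n_doses):
        # Carry over what survived the previous interval, then add a pulse.
        conc = conc * decay + dose_mg / volume_ml
    return conc


def fraction_remaining(dilution_per_day, days_since_last_pulse):
    """Fraction of the post-pulse concentration left after wash-out."""
    return math.exp(-dilution_per_day * days_since_last_pulse)


# Dosing taken from the text: 100 mg every 12 h for 5 d into 400 mL.
DILUTION_PER_DAY = 1.0  # hypothetical assumption, not a value from this study
peak = pulsed_washout(100.0, 400.0, DILUTION_PER_DAY, n_doses=10, interval_days=0.5)
for day in (2, 4):
    frac = fraction_remaining(DILUTION_PER_DAY, day)
    print(f"day {day}: {frac:.1%} of peak ({peak * frac:.4f} mg/mL)")
```

Under these assumptions the highest concentration occurs directly after the last pulse, which is why the remaining fraction is expressed relative to that value.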

5.4.2 16S rRNA compositional profiling

The QIAamp Fast DNA Stool Mini Kit (Qiagen Inc., Germantown, MD, USA) was utilized according to the manufacturer’s directions to extract gDNA from the cellular pellet of 2 mL of culture per bioreactor sample. 16S rRNA libraries were prepped with 400 ng of Nextera XT Index v2 sequences (Illumina Inc., Hayward, CA, USA) plus standard v4 region primers [331] and 2 μL of gDNA template in Invitrogen Platinum PCR SuperMix High Fidelity (Life Technologies, Burlington, ON, Canada) as a one-step PCR amplification. Cycler conditions included an initial melting step of 94 °C for 2 min, followed by 50 cycles of 94 °C for 30 s, annealing temperature for 30 s and 68 °C for 30 s, with a final extension step of 68 °C for 5 min. The annealing temperature comprised a 0.5 °C increment touch-down starting at 65 °C for 30 cycles, followed by 20 cycles at 55 °C. PCR products were purified using the Invitrogen PureLink PCR Purification Kit (Life Technologies) according to the manufacturer’s directions. Subsequent normalization and Illumina MiSeq sequencing were carried out at the Advanced Analysis Center located in the University of Guelph, ON, Canada. The raw sequencing data has been deposited in the publicly accessible NCBI database as a BioProject (https://www.ncbi.nlm.nih.gov/bioproject/488265), with BioSample accession numbers included in Additional file 8. Sequencing data was processed in R (version 3.4.4) following the recommended procedure of the package DADA2 [222] version 1.6, with classification to the genus level by SILVA database [363] version 132 (https://benjjneb.github.io/dada2/training.html). ASVs were next classified to the species level, by first identifying the top hits of NCBI BLAST searches (https://blast.ncbi.nlm.nih.gov) via percentage identity and e-value, then cross-referencing with the known members of the defined microbial communities to determine the correct identification. ASVs that classified to redundant species were amalgamated so that each ASV was attributed to a unique species, and those that did not represent at least 0.01% total abundance in at least one sample were removed. Similarly, a set of ASVs at the genus level was created by amalgamating ASVs that were classified to the same genus. Finally, the sequencing data was centre-log ratio transformed by package ALDEx2 [221] version 1.10, taking the median of the Monte Carlo instances as the value for relative abundance.
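
The amalgamation, abundance filter and log-ratio steps can be sketched as follows. This is a simplified illustration in Python rather than the R workflow used here: ALDEx2 draws Monte Carlo instances from a Dirichlet distribution and takes their median, whereas this sketch applies a single centred log-ratio transform with a pseudocount of 0.5, which is an assumption of the example. All function names, the toy counts and the placeholder label "Unplaced sp." are hypothetical.

```python
import math


def amalgamate(counts_by_asv, species_of_asv):
    """Sum ASV counts (one list per ASV, one entry per sample) by species."""
    merged = {}
    for asv, counts in counts_by_asv.items():
        species = species_of_asv[asv]
        previous = merged.get(species, [0] * len(counts))
        merged[species] = [a + b for a, b in zip(previous, counts)]
    return merged


def filter_by_abundance(counts_by_species, min_fraction=0.0001):
    """Keep species reaching min_fraction (0.01%) of reads in >= 1 sample."""
    n_samples = len(next(iter(counts_by_species.values())))
    totals = [sum(c[i] for c in counts_by_species.values()) for i in range(n_samples)]
    return {
        species: counts
        for species, counts in counts_by_species.items()
        if any(t > 0 and counts[i] / t >= min_fraction for i, t in enumerate(totals))
    }


def clr(sample_counts, pseudocount=0.5):
    """Centred log-ratio: each log value minus the mean log of the sample."""
    logs = [math.log(x + pseudocount) for x in sample_counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Toy data: four ASVs across two samples (hypothetical counts).
toy_counts = {"ASV1": [9000, 12000], "ASV2": [1000, 3000],
              "ASV3": [0, 1], "ASV4": [5000, 4000]}
toy_species = {"ASV1": "Escherichia coli", "ASV2": "Escherichia coli",
               "ASV3": "Unplaced sp.", "ASV4": "Bacteroides ovatus"}

merged = amalgamate(toy_counts, toy_species)
kept = filter_by_abundance(merged)
names = sorted(kept)
for i in range(2):
    values = clr([kept[name][i] for name in names])
    print({name: round(v, 3) for name, v in zip(names, values)})
```

The transformed values of each sample sum to zero by construction, so they describe each taxon relative to the geometric mean of that sample rather than as an absolute abundance.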

5.4.3 1H-NMR metabonomics

Sample preparation, 1H-NMR spectral acquisition and processing, and profiling of metabolites were conducted as previously described [200]. A Bruker Avance III 600 MHz spectrometer with a TCI 600 probe (Bruker, Billerica, MA, USA) and acquisition temperature of 295 K at the Advanced Analysis Center located in the University of Guelph, ON, Canada was utilized to obtain spectra. Metabolite profiling was conducted with Chenomx NMR suite 8.1 (Chenomx Inc., Edmonton, AB, Canada) utilizing the internal library of compounds. Metabolite identifications are based on the best fit for the peak regions with the available compounds.

5.4.4 Statistical analysis

To determine which species and metabolites were significantly different, a series of one-way tests was conducted for each data set by first rank transforming the data, due to its non-parametric nature, then converting it to a mixed linear model, to handle its dependent and unbalanced attributes, to complete an ANOVA. The p-values were subsequently adjusted via the Benjamini-Hochberg method to correct for multiple testing (i.e., obtain q-values). Effect sizes were calculated as the marginal R2 value from the model. Only features that both had an effect size of above 50% and were under a q-value threshold, such that the number of expected false positives was less than one, were considered significant. Groupings of significant features were determined from Tukey HSD post-hoc tests, which were verified by calculating pairwise effect sizes [423]. For compositional data, species unique to MET with high pairwise effect sizes were also considered even in the absence of statistical significance, which was a result of these species attaining a lower relative abundance in the final community structure but does not preclude their presence. All analysis was conducted in R, with use of the packages lme4 version 1.1.17, lmerTest version 3.0.1, MuMIn version 1.40.4, and lsmeans version 2.27.62. Figures were generated by the package ggplot2 version 2.2.1.
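
The multiple-testing step can be illustrated with a short sketch of the Benjamini-Hochberg adjustment. The Python below is not the R code used for the analysis, and the helper for the expected number of false positives reflects only one possible reading of the threshold described above (cutoff multiplied by the number of features called at that cutoff); both function names and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotonic.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values


def expected_false_positives(q_values, cutoff):
    """Assumed reading: cutoff times the number of features called at it."""
    n_called = sum(q <= cutoff for q in q_values)
    return cutoff * n_called


example_p = [0.01, 0.04, 0.03, 0.005]  # hypothetical p-values
example_q = benjamini_hochberg(example_p)
print([round(q, 4) for q in example_q])
print(expected_false_positives(example_q, cutoff=0.05))
```

Under this reading, a cutoff is acceptable when the second quantity stays below one for the set of features it selects.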

5.4.5 Functional and pathway analysis

To gain insight into the connection between condition, taxonomic, and metabolic changes, protein sequences derived from fully sequenced genomes were first obtained for each species present in the bioreactors, as defined by 0.01% compositional abundance in at least three replicates from 16S rRNA profiling via Illumina sequencing. The KEGG database (https://www.genome.jp/kegg) was first utilized, and if no genomes were available for a given species, the NCBI genome database (https://www.ncbi.nlm.nih.gov/genome) was used (Table S9); genome retrieval was automated with use of R packages KEGGREST version 1.20 and rentrez version 1.2.1, respectively. If protein data was available for a given species in the absence of a complete genome through NCBI (https://www.ncbi.nlm.nih.gov/protein), this deposited data was used instead (Table S9). For three of the bacterial strains, Lachnospiraceae sp., Ph. faecium and Pseudoflavonifractor sp., genomes of exact species could not be obtained from these databases. Draft genomes were thus de novo assembled from shotgun genomic Illumina sequencing data obtained from the Broad Institute (Cambridge, MA, USA) and deposited in NCBI (https://www.ncbi.nlm.nih.gov/taxonomy/), via the Shovill pipeline version 0.2 (https://github.com/tseemann/shovill), which uses the SPAdes algorithm [424] version 3.12 with subsequent annotation by Prokka [425] version 1.12 (Table S9). KO annotation was either provided directly if the genome could be acquired from KEGG or was conducted via the online tool GhostKOALA (https://www.kegg.jp/ghostkoala/). The KOs that were unique to the species that engrafted were then obtained for each condition, with and without rifaximin pre-treatment. Pathway mapping on the global ‘microbial metabolism in diverse environments’ KEGG map01120 was carried out with use of the tool iPath [426] version 3. 
KOs were then linked to their respective KEGG pathways, and the number of unique KOs attributed to each pathway by organism was tabulated.
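
The subtraction of base community KOs and the per-pathway tally can be sketched with plain set operations. The example below is a minimal illustration in Python, not the R workflow used here; the function names, species labels, KO labels and pathway labels are all hypothetical placeholders and do not correspond to real KEGG identifiers.

```python
from collections import Counter


def unique_kos(engrafted_kos, base_kos_by_species):
    """KOs of each engrafted species that no base community member carries."""
    base = set().union(*base_kos_by_species.values())
    return {species: kos - base for species, kos in engrafted_kos.items()}


def tally_by_pathway(unique_by_species, pathways_of_ko):
    """Count unique KOs per pathway for each engrafted species."""
    tallies = {}
    for species, kos in unique_by_species.items():
        counter = Counter()
        for ko in kos:
            # A KO may belong to several pathways, or to none that are mapped.
            for pathway in pathways_of_ko.get(ko, ()):
                counter[pathway] += 1
        tallies[species] = counter
    return tallies


# Hypothetical placeholder labels, not real KEGG identifiers.
base_community = {"base_1": {"KO_A"}, "base_2": {"KO_D"}}
engrafted = {"incoming_1": {"KO_A", "KO_B", "KO_C"}}
ko_pathways = {"KO_B": ["pathway_X"], "KO_C": ["pathway_X", "pathway_Y"]}

unique = unique_kos(engrafted, base_community)
tallies = tally_by_pathway(unique, ko_pathways)
print(sorted(unique["incoming_1"]))
print(dict(tallies["incoming_1"]))
```

Because a KO can map to more than one pathway, the per-pathway counts for a species may sum to more than its number of unique KOs.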

5.5 Results

5.5.1 Distinct bacteria incorporated into rifaximin pretreated vs. untreated communities

To elucidate which species engrafted in each tested condition (rifaximin pretreated or untreated), we carried out compositional analysis using profiling of sequenced 16S rRNA genes. Allochthonous species that remained present after a sufficient amount of time post therapeutic replenishment, i.e., at least 10 wash-outs, were actively replicating and thus considered to have stably engrafted into the microbial community. However, it was also of interest to determine how the antibiotic changed the community, as such alterations could provide insight into how the incoming microbes were influenced. For this analysis, gDNA was extracted from bioreactor samples that had been harvested (i) after the UC patient fecal-derived microbial community stabilized for three weeks, (ii) after three of the bioreactor vessels were perturbed with a physiologically relevant 5 d dosage of rifaximin, and (iii) 10-14 d post replenishment from MET. Notably, rifaximin treatment did not cause any significant shifts in the underlying species composition. At the genus level, however, Pseudoflavonifractor increased with a pairwise effect size of 1.23 (q-value < 0.0001) and Adlercreutzia decreased with a pairwise effect size of -2.90 (q-value < 0.01), indicating slight but not impactful alterations had occurred. A total of seven species engrafted in the control condition, and four species in the treatment condition, which was indicated by a pairwise effect size of ≥ 1, defined as the median of the differences in relative abundance between groups divided by the maximum of the median differences in relative abundance between samples within each group. The species with this substantial pairwise effect size, which includes both the allochthonous microbes and members of the base community that were compositionally altered, for each condition are depicted in Figure 19. Of these species, the following were significantly changed (q-value < 0.05): Ac. intestini, E. coli, [Eubacterium] fissicatena, Flavonifractor plautii, Klebsiella oxytoca, Lachnospiraceae sp., Parabacteroides distasonis, and Parabacteroides merdae. B. ovatus was also below the calculated q-value cut-off of 0.0647. Not only did distinct species integrate in the control versus treatment conditions, but the engraftment resulted in a pronounced decrease of select base community species, mostly from the phylum Proteobacteria.
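
The pairwise effect size defined above can be illustrated with a minimal sketch. It follows the verbal definition directly (median between-group difference divided by the larger of the two within-group median differences) and works on one already-transformed value per sample; it does not reproduce the Monte Carlo sampling that ALDEx2 applies. The function name and the toy values are hypothetical.

```python
from itertools import combinations, product
from statistics import median


def pairwise_effect_size(group_a, group_b):
    """Median between-group difference over the larger within-group spread.

    Each group needs at least two samples, and at least one group must show
    some within-group variation, otherwise the ratio is undefined.
    """
    between = median(b - a for a, b in product(group_a, group_b))
    within_a = median(abs(x - y) for x, y in combinations(group_a, 2))
    within_b = median(abs(x - y) for x, y in combinations(group_b, 2))
    return between / max(within_a, within_b)


# Hypothetical log-ratio values for one species in three replicate vessels.
before = [1.0, 1.2, 0.9]
after = [3.0, 3.4, 3.1]
effect = pairwise_effect_size(before, after)
print(round(effect, 2))
```

With this sign convention a positive value indicates a higher relative abundance in the second group, and an absolute value of at least one means the shift between groups is at least as large as the spread among replicates.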

Figure 19. Differential compositional changes in the ulcerative colitis associated microbial community after microbial replenishment by rifaximin. Pairwise effect sizes of processed and centre-log ratio transformed Illumina 16S rRNA profiling data generated from gDNA extracted from bioreactor samples seeded with a defined microbial community representing an ulcerative colitis patient fecal sample. Shown are the species with absolute effect sizes greater than one, when comparing the before treatment (Before) and after microbial ecosystem therapeutic replenishment conditions, both untreated (MET) and with prior rifaximin perturbation (Abx-MET). Dotted lines at the -1 and 1 effect sizes are shown for reference.

5.5.2 Metabonomic analysis of pretreated vs. untreated communities revealed differences in saccharolytic and proteolytic fermentation

Metabonomic profiles derived from 1H-NMR spectra provided a means of determining the behavioural changes of the defined microbial communities associated with the antibiotic perturbation and engrafted microbes, allowing a greater mechanistic insight. The filtered bioreactor waste utilized to obtain the spectra was matched to the periods of sampling for compositional analysis. An untargeted, manual profiling approach using a standard library of compounds was applied to the spectra, yielding a total of 84 compounds. The metabolite concentrations that were significantly different between groupings, including before and after rifaximin application, and MET replenishment in both treatment conditions, are summarized in Table 3. It was noted that rifaximin treatment resulted in the increase of several amino acids, particularly the aromatics tyrosine, tryptophan, and phenylalanine, alluding to a selective reduction of specific fermentations involving these compounds as substrates. The metabolite concentration changes from MET replenishment were unexpectedly consistent, regardless of rifaximin pre-treatment. Both resulted in a decrease of the fermentation by-products 2-hydroxyisovalerate and desaminotyrosine, a reduction of nitrogenous compounds such as glycine, isoleucine, N-acetylcysteine, betaine and carnitine, and a steep decline of methanol, with concomitant increases in the fermentation by-products pyruvate and valerate. However, the control condition could be distinguished from the treatment condition through the significant reduction of sugars compared to the before grouping, specifically fructose, fucose and galactose, suggesting a greater propensity for saccharolytic fermentation. On the other hand, the treatment condition resulted in a significant diminishing of the amino acids histidine, leucine, N-acetylglutamine, N6-acetyllysine, phenylalanine and valine, which could be related to more substantive amino acid degradation capabilities.

Table 3. Differential changes in mean concentration of metabolites after microbial replenishment by rifaximin. Mean concentrations with standard deviations in mM of 1H-NMR measured metabolites in 0.2 μm filtered bioreactor samples seeded with a defined microbial community representing an ulcerative colitis patient fecal sample that significantly changed after rifaximin perturbation (Abx) or microbial ecosystem therapeutic replenishment with (Abx-MET) and without (MET) prior antibiotic usage. Only metabolites identified by Tukey HSD post-hoc testing, with verification by pairwise effect sizes, from the before treatment (Before) samples after conducting a one-way rank-transformed repeated measures ANOVA are shown.

METABOLITE | MEAN ± SD (mM), LEFT TO RIGHT IN THE COLUMN ORDER BEFORE, ABX, MET, ABX-MET (REPORTED GROUPS ONLY) | Q-VALUE
2-HYDROXYVALERATE | 0.0218 ± 0.0074; 0.0105 ± 0.0034; 0.0068 ± 0.0011 | 0.0026
BETAINE | 0.2423 ± 0.0316; 0.0033 ± 0.0005; 0.0367 ± 0.0585 | 0.0090
CARNITINE | 0.0137 ± 0.0023; 0.0045 ± 0.0024; 0.0046 ± 0.0014 | 0.0103
DESAMINOTYROSINE | 0.0355 ± 0.0086; 0.0210 ± 0.0067; 0.0107 ± 0.0021; 0.0101 ± 0.0024 | 0.0031
FRUCTOSE | 0.2691 ± 0.0641; 0.1752 ± 0.0390 | 0.0103
FUCOSE | 0.0837 ± 0.0289; 0.0371 ± 0.0052 | 0.0246
GALACTOSE | 0.1075 ± 0.0264; 0.1661 ± 0.0223; 0.0636 ± 0.0133 | 0.0160
GLYCINE | 0.4420 ± 0.0959; 0.7561 ± 0.1397; 0.2064 ± 0.0507; 0.2412 ± 0.0134 | 0.0013
HISTIDINE | 0.0570 ± 0.0261; 0.1553 ± 0.0124; 0.0233 ± 0.0111 | 0.0077
ISOLEUCINE | 0.2216 ± 0.0580; 0.0985 ± 0.0261; 0.0611 ± 0.0247 | 0.0026
LEUCINE | 0.2124 ± 0.0531; 0.0696 ± 0.0418 | 0.0235
METHANOL | 3.8277 ± 0.6685; 0.4208 ± 0.5839; 0.4626 ± 0.3068 | 0.0014
METHYLAMINE | 0.0220 ± 0.0098; 0.0122 ± 0.0023; 0.0423 ± 0.0192 | 0.0092
N-ACETYLCYSTEINE | 0.1246 ± 0.0147; 0.0875 ± 0.0072; 0.0992 ± 0.0080 | 0.0103
N-ACETYLGLUTAMINE | 0.0306 ± 0.0052; 0.0093 ± 0.0036 | 0.0064
N6-ACETYLLYSINE | 0.0772 ± 0.0204; 0.0256 ± 0.0071 | 0.0130
PHENYLALANINE | 0.2052 ± 0.0390; 0.3170 ± 0.0871; 0.0953 ± 0.0270 | 0.0014
PYRUVATE | 0.0872 ± 0.0040; 0.0742 ± 0.0038; 0.1780 ± 0.0255; 0.1713 ± 0.0141 | 0.0012
TYROSINE | 0.1273 ± 0.0269; 0.2666 ± 0.0400 | 0.0023
VALERATE | 1.1631 ± 0.1373; 0.7961 ± 0.1121; 2.2932 ± 0.6921; 2.2423 ± 0.2121 | 0.0026
VALINE | 0.3448 ± 0.1423; 0.1042 ± 0.0430 | 0.0077

5.5.3 Predictive functional analysis suggested similarities in engraftment of allochthonous microbes into pretreated vs. untreated communities

A predictive functional analysis was conducted to not only determine which capabilities were provided by the engrafted microbes, but also provide a putative link for the compositional and metabonomic results. KOs were obtained from the available genomic data of all species present in the bioreactors, as defined by 0.01% compositional abundance in at least three replicates by 16S rRNA Illumina sequence profiling, which included bacteria from both the UC patient fecal-derived defined microbial community and MET. A total of 53.8% of the 26 species present had genomes obtained from KEGG, 34.6% from NCBI, and 11.6% had draft genomes constructed from shotgun genomic Illumina sequencing data of the strains (Table S9). A range of 1 – 26 genomes could be attained per species, with only one species utilizing deposited proteome data in the absence of a complete genome through NCBI (Table S9). The number of contributed unique KOs associated with species that engrafted in both rifaximin pre-treatment and untreated conditions are depicted in Figure 20. The top three KEGG pathways attributed to each species are described in Table 4. An overview of the uniquely contributed KOs is also displayed in Figure 21. We found that there was a general lack of unique KOs introduced to the UC patient fecal-derived community (Figure 21). A few added predicted functionalities were noted, particularly the heightened capacity for the utilization of amino sugars and nucleotide sugars, starch and sucrose, and select amino acids (Figure 20). The Lachnospiraceae spp. especially contributed these carbohydrate-utilizing functionalities, with a higher functional potential observed in the condition without prior rifaximin usage, due to the increased rate of incorporation from this family (Figure 20, Table 4). 
Another distinguishing feature between conditions was the preference of specific amino acid substrates, with the control condition (no prior rifaximin usage) favouring arginine and proline metabolism by B. ovatus and Pa. distasonis, and the rifaximin pretreated condition favouring lysine degradation by Fl. plautii (Figure 20, Table 4).

Figure 20. Differential unique KEGG orthologies contributed by integrated allochthonous microbes by rifaximin. The number of contributed KEGG orthologies by engrafted species from the healthy consortium of microbes (MET) that were unique from those KEGG orthologies linked to species present in the bioreactors from the ulcerative colitis patient fecal sample-derived microbial community. The KEGG orthologies are grouped by their associated KEGG pathways, and only pathways with at least three contributed KEGG orthologies are shown. The groupings were conducted for each experimental condition separately, both the control with no prior rifaximin usage (MET) and the rifaximin pretreated condition (Abx-MET).

Table 4. Contribution to KEGG pathways by number of unique KEGG orthologies of each integrated allochthonous species. The top three KEGG pathways related to specific metabolisms that were attributed to each engrafted species from the healthy consortium of microbes in either condition, determined by the number of KEGG orthologies contributed that were unique from those KEGG orthologies linked to the ulcerative colitis patient fecal-derived species present in the bioreactors. Each of these KEGG orthologies is indicated by a + symbol. Species: Acidaminococcus intestini, [Eubacterium] eligens, Bacteroides ovatus, Eubacterium ventriosum, Parabacteroides distasonis, Roseburia faecis, Roseburia inulinivorans, [Eubacterium] fissicatena, Coprococcus comes, and Flavonifractor plautii.

SPECIES (CONDITION), IN COLUMN ORDER: A. INTESTINI (All); E. ELIGENS (MET); B. OVATUS (MET); E. VENTRIOSUM (MET); P. DISTASONIS (MET); R. FAECIS (MET); R. INULINIVORANS (MET); E. FISSICATENA (Abx_MET); C. COMES (Abx_MET); F. PLAUTII (Abx_MET)
KEGG PATHWAY | + SYMBOLS PER CONTRIBUTING SPECIES, LEFT TO RIGHT IN COLUMN ORDER
AMINO SUGAR AND NUCLEOTIDE SUGAR METABOLISM | + + ++ + +
ARGININE AND PROLINE METABOLISM | + + ++ + + + +
BUTANOATE METABOLISM | + ++++ ++ +++
D-ARGININE AND D-ORNITHINE METABOLISM | ++++
FRUCTOSE AND MANNOSE METABOLISM | + +
GALACTOSE METABOLISM | + + ++ +
GLYCEROPHOSPHOLIPID METABOLISM | + + + + + + +
LYSINE DEGRADATION | ++++++
METHANE METABOLISM | ++++ +++ +
PROPANOATE METABOLISM | ++++ ++
RIBOFLAVIN METABOLISM | + ++
STARCH AND SUCROSE METABOLISM | + + + +++ ++ ++ ++
SULFUR METABOLISM | ++ ++

Figure 21. Overall functional contribution of the integrated allochthonous microbes to the ulcerative colitis associated microbial community. The KEGG ‘microbial metabolism in diverse environments’ pathway map (map01120) with KEGG orthologies in common to both the ulcerative colitis patient fecal-derived species present in the bioreactors and species engrafted from MET without prior rifaximin usage, mapped and coloured in blue. The KEGG orthologies unique to the engrafted species are coloured in green. The detected metabolites by 1H-NMR are also depicted as nodes, with those that increased after allochthonous bacterial engraftment coloured in orange and those that decreased coloured in purple. The figure was generated by the ipath3 online tool, version 3.

5.6 Discussion

There is currently great interest in using FMT strategies to treat UC, and gaining insight into how therapeutic microbes incorporate into patient microbial communities is therefore key to the development of this treatment modality. We chose to focus on microbial ecosystem changes mimicking the introduction of healthy donor-derived microbes into a defined UC-derived community, using an in vitro model of the colonic microbiota. This approach allowed us to determine species-level interactions in a reproducible and highly detailed way, which we hope may offer a foundation for the development of rationally designed therapeutic microbial consortia for the treatment of UC.

One of the considerations for the use of FMT strategies to treat UC is whether antibiotic therapy should be applied to a patient prior to the addition of beneficial microbes, with the rationale that incoming bacterial species may be better able to utilize a vacated rather than an occupied niche. In this study, we attempted to address the question of whether niche availability might influence the engraftment of introduced species by treating a UC-derived microbial community with rifaximin before introduction of our healthy donor-derived strains. Rifaximin was chosen because it is known to have beneficial effects as a treatment for reducing the severity of UC flares [410–413], and several studies have demonstrated that rifaximin treatment has the potential to disrupt colonic microbial communities in dysbiotic ecosystems, although any effects on overall species structure are temporary [427–429]. In this work, we found that the effect of the antibiotic on the composition of our UC patient fecal-derived microbial community was minimal. Although our experimental design used a defined community, perhaps limiting the effects of antibiotic perturbation, we were able to discern significant relative abundance shifts in a few taxa, such as Pseudoflavonifractor sp. and Adlercreutzia equolifaciens. These taxa did follow the trends expected from their antibiotic resistance profiles (Table S8); however, it is still surprising that more of the relatively resistant strains did not proliferate in this situation. As rifaximin is considered to have both broad-spectrum bactericidal and bacteriostatic properties, its latter ability to limit behaviours conducive to proper growth may be key to its beneficial attributes.

Indeed, in contrast to the small changes in ecosystem composition noted above, we observed a significant change in metabolite concentrations before and after rifaximin treatment. Rifaximin’s mechanism of action lies in its ability to inhibit RNA polymerase, and thus it is not unexpected that specific bacterial metabolisms are reduced through treatment with this drug. We detected both increases in certain amino acids and sugars and decreases in their fermentation by-products in the UC patient fecal-derived community treated with rifaximin compared to the untreated community. Particularly interesting were the increases in the aromatic amino acids histidine, phenylalanine and tyrosine, as compounds that can be produced from their fermentation, such as histamine, tyramine, phenol, and p-cresol, may have detrimental effects on the host [99]. Other groups have reported similar changes in amino acids that can yield harmful by-products after rifaximin treatment; Kang et al. determined that less glutamine was converted to glutamate and ammonia in mouse models of minimal hepatic encephalopathy [430], and Maccaferri et al. found that tyrosine increased in their bioreactor experiments modeling CD [427]. Together, our data fit the hypothesis that rifaximin has a greater effect as a modulator of microbial behaviour than as an antibiotic, and that its beneficial attributes as a treatment for UC flares may be more related to its ability to regulate host-detrimental metabolism or expression of virulence factors [431]. However, limitations of our model mean that any direct effects of rifaximin on the host immune system or on bacterial adhesion to the host epithelium could not be tested, and these may also be important in the management of disease [430–433].

When we used rifaximin as an antibiotic pre-treatment of our UC patient fecal-derived community prior to the introduction of healthy donor-derived bacterial species, we saw a pronounced effect on the engraftment of allochthonous microbes into the UC community compared to the untreated control. Our calculations estimated that only 3% of the maximum concentration of rifaximin, or 21 μg/mL, would have been retained at the second time point of MET delivery (Figure S8), and this value is likely inflated, as it does not account for light or microbial degradation of the antibiotic. Yet this small amount may still have been a driver of engraftment ability, as we found that members of the healthy donor-derived ecosystem that were very sensitive to rifaximin, for example Roseburia spp., [Eubacterium] eligens and Eubacterium ventriosum, indeed did not engraft. The notable exception to this pattern was that the relatively resistant strain of B. ovatus did not integrate into the treated community yet did integrate into the control community. KO analysis suggested that the contributions of B. ovatus to the UC + MET community were related to riboflavin, arginine and proline, and fructose metabolism. As Pseudoflavonifractor sp. (UC community member) and Fl. plautii (MET community member) are also capable of performing these metabolisms, and strains of these species were found to be abundant in the condition with rifaximin pre-treatment, B. ovatus may have been outcompeted in this situation.
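The residual rifaximin estimate quoted above can be sanity-checked with simple arithmetic. The sketch below uses only the two figures given in the text (3% and 21 μg/mL). The exponential washout model, C(t) = C_max · exp(−t/τ) for an ideal well-mixed, continuously fed vessel, is an assumption made here for illustration and is not necessarily the calculation behind Figure S8; the function names are likewise hypothetical.

```python
import math

# Values quoted in the text.
RESIDUAL_FRACTION = 0.03      # 3% of the maximum concentration retained
RESIDUAL_CONC_UG_ML = 21.0    # residual rifaximin in ug/mL

def implied_max_concentration(residual_conc, residual_fraction):
    """Maximum concentration implied by a residual value and its fraction of the maximum."""
    return residual_conc / residual_fraction

def retention_times_elapsed(residual_fraction):
    """Number of vessel retention times needed to fall to the given fraction.

    ASSUMPTION: ideal first-order washout, C(t) = C_max * exp(-t / tau),
    with no light-driven or microbial degradation of the antibiotic.
    """
    return -math.log(residual_fraction)

max_conc = implied_max_concentration(RESIDUAL_CONC_UG_ML, RESIDUAL_FRACTION)
n_tau = retention_times_elapsed(RESIDUAL_FRACTION)

print(f"implied maximum concentration: {max_conc:.0f} ug/mL")
print(f"retention times elapsed (ideal washout): {n_tau:.2f}")
```

With the quoted figures, the implied maximum concentration is about 700 μg/mL, and under the idealized washout assumption a 3% residual corresponds to roughly 3.5 retention times. Any degradation of the antibiotic would lower the true residual further, consistent with the remark that the estimate is likely inflated.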

Studies of human trials of FMT for UC treatment also suggest that this kind of therapy may be influenced by antibiotic pre-treatment. For example, Moayyedi et al. did not pre-treat UC patients with antibiotics prior to FMT and found that the bacterial family Lachnospiraceae and the genus Ruminococcus were associated with their successful donor [434], whereas Kump et al. pre-treated with vancomycin, paromomycin and nystatin and determined that the bacterial family ‘unclassified Ruminococcaceae’ and the genera Akkermansia and Ruminococcus were associated with their successful donors [435]. Similarly, the in vitro work we describe here demonstrated that many Lachnospiraceae spp. integrated under the control (no antibiotic pre-treatment) condition, and Fl. plautii, a Ruminococcaceae family member, significantly engrafted following antibiotic pre-treatment.

It is difficult to predict from our study whether antibiotic pre-treatment prior to microbial ecosystem-based therapeutics could have resulted in a better patient outcome. We found that both rifaximin pre-treatment and no pre-treatment resulted in engraftment of strains representing species from Clostridium clusters IV and XIVa, together with a concomitant decrease in Proteobacteria spp. Both of these outcomes (and in particular the increase of Clostridiales taxa associated with butyrate production) are associated with improved clinical response rates to FMT in UC patients [436–442]. While MET introduction changed UC patient fecal-derived community metabolism, we noted that most measured metabolite concentrations in our study were not influenced by rifaximin pre-treatment. For example, methanol, 2-hydroxyisovalerate, betaine, and carnitine, compounds that are associated with poor health outcomes when elevated [443–448], all significantly decreased following MET introduction into the UC community, either with or without rifaximin pre-treatment.

While we did see some subtle effects of rifaximin pre-treatment on functional outcomes in our experiment, the only notable findings were that a greater number of amino acids decreased in concentration following MET addition in the antibiotic pre-treatment condition, including the aromatic amino acids histidine and phenylalanine, and that a greater number of sugars decreased in concentration following MET addition in the condition without previous rifaximin usage, including fructose, galactose and fucose. It is possible that the former difference could be attributed to the multiple KOs of the rifaximin-resistant strain, Fl. plautii, that were related to the degradation of several amino acids, as determined by the predicted functional analysis. The latter heightened capacity for dietary carbohydrate utilization was linked by KOs to the primary degraders from the phylum Bacteroidetes and to Lachnospiraceae spp. Additionally, the decrease in fucose, a component of mucin [47], may have been associated with an enhanced ability for mucin glycan foraging provided by Lachnospiraceae members, more of which incorporated in the condition without rifaximin usage.

In terms of predicted microbial function, the somewhat surprising observation from our study was that the number of uniquely attributed KOs was quite small, with a few of the engrafted species seemingly not contributing any at all. There were, however, several limitations to our method. For example, predicted functions were based on genomes that were available in the KEGG and NCBI databases; thus, the strains used in this study may have possessed strain-level variations that could have been associated with enhanced functionality. Regardless, it is noteworthy that the allochthonous strains that integrated belonged to species that were closely related to species within our UC community. Examples of this pattern include the MET-derived engrafted species Ac. intestini, Fl. plautii, Pa. distasonis and B. ovatus, which are highly related to the UC patient fecal-derived strains Ph. faecium, Pseudoflavonifractor sp., Pa. merdae and Bacteroides dorei, respectively. Similar results have been previously reported; two studies that attempted to predict the donor strains that would engraft in RCDI and UC patients, respectively, found that functional potential or specific traits of the donor species were not robust indicators [449, 450]. Further, Li et al. completed metagenomic sequencing after conducting FMT in a group of metabolic syndrome patients and determined that, three months following this procedure, the proportion of species associated with the donor profile but absent in the patients prior to treatment was not significant, yet many donor strains of species already present in the patients engrafted [414], often co-existing with the patients’ own strains. It was also noted that colonization patterns differed amongst individuals receiving FMT from the same donor, stressing the potential importance of patient-donor matching [414, 449].
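The ‘uniquely attributed’ KO counts discussed above amount to a set difference: the KOs predicted for an engrafted species, minus the pooled KOs of the resident UC-derived species. The following is a minimal sketch of that bookkeeping, not the analysis pipeline used in this study. The species labels, the KO assignments and the function names are placeholders chosen for illustration only and do not represent results.

```python
def pooled_kos(ko_sets):
    """Union of the KO identifiers predicted for a collection of species."""
    pooled = set()
    for kos in ko_sets.values():
        pooled |= kos
    return pooled

def unique_contributions(engrafted, resident):
    """For each engrafted species, return the KOs absent from every resident species."""
    background = pooled_kos(resident)
    return {species: kos - background for species, kos in engrafted.items()}

# Placeholder KO sets for illustration only (hypothetical assignments);
# real input would come from genome annotations in KEGG.
resident_community = {
    "resident_species_1": {"K00001", "K00002", "K00003"},
    "resident_species_2": {"K00003", "K00004"},
}
engrafted_species = {
    "engrafted_species_A": {"K00002", "K00005", "K00006"},
    "engrafted_species_B": {"K00001", "K00004"},
}

unique = unique_contributions(engrafted_species, resident_community)
for species, kos in sorted(unique.items()):
    print(species, len(kos), sorted(kos))
```

A species whose set comes back empty, such as engrafted_species_B in this toy input, corresponds to an engrafted species that contributes no unique KOs, i.e., one that adds redundancy rather than new predicted function.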

One of the key findings of the Human Microbiome Project was that a degree of functional redundancy exists amongst gut microbial species of healthy individuals [451]. It is tempting to speculate that the niches present in the gut environment are of a fixed capacity and can accommodate strains that are best suited to them up to a certain limit. Our working hypothesis is that Proteobacteria, which are often associated with expansion or ‘blooming’ during dysbiosis and which, as a phylum, contain many highly functionally diverse species [79], are able to take advantage of a broad range of vacant niches but are easily displaced by more fastidious microbes that are specifically and highly adapted to these spaces. If this prediction is correct, then one strategy for microbial ecosystem-based therapeutics in the future might be to match certain keystone components of therapeutic ecosystems as closely as possible to the pre-existing recipient ecosystem. One exception to this strategy, however, might be the provision of species from the Lachnospiraceae family, which we observed to incorporate into our UC patient fecal-derived community without prior representation. From a functional standpoint, the unique KOs present among these engrafted Lachnospiraceae spp. suggested that they were involved in aspects of starch and mucin degradation. This result is also supported by our previous metatranscriptomic analysis after therapeutically replenishing this UC patient fecal-derived microbial community with MET in germ-free mice [199]. Both starch and mucin degradative ability are likely to create robust niches in the human intestine: mucin is found in abundance in the GI tract, and starch is estimated to be the largest diet-derived source of carbohydrate in most human societies [355]. Members of the Lachnospiraceae family are suggested to be core components of the human gut microbiota and are commonly represented in samples derived from healthy individuals [30]. Indeed, such core taxa have been shown to be more likely to incorporate during FMT treatment than rare species [449, 450]. In this light, the success of FMT may rely closely on donor species richness and the likelihood of there being component species that are a good fit for vacant niches [435, 439].

In conclusion, the recommendation of Keshteli et al. that antibiotic pre-treatment could improve FMT efficacy in UC patients was not supported by this study [417], at least from the perspective of using rifaximin as the pre-treatment. Antibiotic treatment is associated with a considerable number of risks to patient health, with the most pressing concern being the increase in antimicrobial resistance. Although antibiotic pre-treatment prior to FMT has been proposed as a mechanism to create new niches to improve the likelihood of treatment success, we would suggest a different approach. Instead of applying an additional selection criterion to the incoming microbes, i.e., the ability to resist a given antibiotic, an alternative method would be to enhance donor-patient matching of keystone species, perhaps with concomitant attempts to create new niches tailored to unmatched but desired allochthonous microbes. A good example of the latter might be the use of bespoke prebiotics; indeed, Wei et al. were able to increase the clinical response rate of FMT in UC patients with pectin supplementation [452]. Future research should focus on ways to predict and manipulate microbial ecosystem therapeutic success based on a given patient’s existing gut microbial repertoire and the incorporation of dietary supplementation.

Chapter 6 – Conclusions

The central aim of this thesis was to investigate how principles of microbial ecological theory are replicated in complex, human fecal-derived defined microbial communities cultured in vitro. To address this overarching hypothesis, two broad objectives were completed. First, ‘-omics’ techniques were further developed, which included both evaluating two common metabonomics technologies and validating a novel protein-SIP approach utilizing heavy water. Next, the individual factors that influence microbial community assembly, specifically stochasticity and environmental selection (including its sub-plot structure of habitat filtering and species assortment), were examined (Figure 3). Species assortment additionally encompasses microbial interactivity, such as competitive exclusion and coevolution, which were also investigated independently. Finally, the relationship between ecosystem functionality and redundancy was observed when allochthonous microbes were added to a low diversity microbial community (Figure 4).

The comparison between 1H-NMR spectroscopy and LC-MS/MS allowed for the evidence-based selection of the appropriate technology to conduct metabonomics. I found that although LC-MS/MS was more sensitive, as expected [245], it did not improve the separation between the metabolites produced by an antibiotic treated defined microbial community versus the control via PCA. I concluded that the metabolic signature of a microbial community is thus dictated by the compounds that are in highest abundance. Therefore, I chose 1H-NMR spectroscopy for my subsequent experiments, as I was more interested in determining when a change occurred rather than elucidating the fine details of a behavioural change. However, this result does not indicate that LC-MS/MS is without benefit. Indeed, the heightened sensitivity of LC-MS/MS enabled the detection of intermediates for pathway analysis or large compounds that were difficult to discriminate by 1H-NMR spectroscopy. LC-MS/MS extraction protocols for the detection of select classes of metabolites are continuing to be developed, such as for bile acids [454] or phenolic acids [455]. In future work, the desired compounds of interest should be carefully considered regarding the research question, and it is likely best to combine both methods, i.e., 1H-NMR spectroscopy to obtain the overall metabolic signature and SCFA concentrations and LC-MS/MS to obtain the concentrations of metabolites in targeted classes.

The protein-SIP approach utilizing heavy water was successfully demonstrated and implemented into the bioinformatics pipeline MetaProSIP [240]. A couple of findings from this study were notable. First, the incorporation of 18O did not significantly improve how the metaproteomic data were interpreted over D, despite abiotic H-D exchange leading to a lower RIA. As D is readily available and inexpensive, this protein-SIP protocol will be relatively more accessible to researchers. Additionally, protein-SIP was able to delineate the species acting as generalists and specialists within a microbial community when a change in environmental conditions was applied. This feature is useful for probing the roles of active key players in the context of a microbial ecosystem and allows researchers to predict which microbes may be particularly sensitive, i.e., prone to inhibition or, at the extreme, extinction. Although most microbial species did exhibit similar amounts of isotope incorporation, there were a few clear exceptions, such as Ra. hominis. Since the mechanism of heavy water isotope integration into proteins is interrelated with amino acid biosynthesis [313], I have suggested that Ra. hominis may lack such biosynthetic pathways and instead take up amino acids from the environment in order to synthesize its proteins. Such a microbial lifestyle has been observed previously in particularly nutrient-rich environments, including for the Lactobacillus spp. present in fermented foods [334, 456]. This hypothesis warrants further exploration, as does elucidating the pathways of amino acid biosynthesis in anaerobic gut bacteria, which are not entirely known [334]. Such information would enable a more thorough understanding of the mechanism behind this technique, so that researchers have added knowledge of its advantages and disadvantages.

Interestingly, stochasticity played a stronger role in microbial community assembly than environmental selection, in terms of species structure. This property was observed in the complex defined microbial community utilized in the protein-SIP demonstration and in the reduced control defined microbial community utilized in the study of coevolution, which were both cultured in identical medium formulations representing high fiber and high protein diets. Additionally, further confirmation was provided by external research examining dietary interventions [84, 162, 346]. These results appear to contradict earlier claims that diet is the most important component of environmental selection [4, 174]. However, they may be more indicative of the lack of knowledge of how diet influences human gut microbial community assembly. For example, David et al. noted a dramatic shift in the composition of individuals’ gut microbiota when their diet changed from being entirely plant-sourced to entirely animal-sourced [161]. It has also been suggested that the microbial response to diet is driven by carbohydrate content [82, 162, 355], as carbohydrates are the preferred substrate of most microbial species in the gut [54, 110]. Therefore, more work is needed to determine the extent to which fiber deprivation fails to alter the species structure of the human gut microbiota. It would also be of interest to observe how such microbial communities are able to adapt to and resist complete starvation, since such a state does naturally occur during, e.g., circadian rhythms or fasting [140]. Finally, varying the types of carbohydrate substrates available to the microbes could have a greater impact; however, this effect may need to be observed over a longer time period [84] or may be specific to certain microbial communities containing diet-responsive microbes [162].

I was able to experimentally confirm the result attained from the computational metabolic modeling conducted by Levy and Borenstein, namely that habitat filtering dominates species assortment [167]. There was a clear, statistically significant difference in metabolic behaviour when human fecal-derived defined microbial communities were grown separately in medium formulations representing a high fiber and a high protein diet, yet a lack of overall change between the ‘artificial’, non-coevolved community and the coevolved control. Additionally, competitive exclusion did not dramatically affect the integration of allochthonous microbes into the UC fecal-associated defined microbial community. In fact, these microbes appeared instead to be highly influenced by the underlying environmental conditions, i.e., habitat filtering, since the residual antibiotic concentrations prevented the particularly sensitive strains from colonizing, as evidenced by their antibiotic resistance profiles. Further, many of the allochthonous microbial species that did integrate inhabited niches similar to those of the base community members, suggesting that competition was weak and that the redundancy of the community was therefore improved by MET. Interestingly, the few microbial species that provided unique functionalities were of the core genera proposed by Falony et al. [30], and such functions were related to ubiquitous features of the human gut environment, including starch and mucin degradation [355]. Together, these results indicate that the underlying composition of a patient’s gut microbiota is not as important to the design of a targeted therapeutic, and as such, the application of antibiotics would be ineffective. Instead, it is imperative to contemplate the niches that would be present within the individual. Providing species that can fill common, expected niches may thus be more fruitful. It is important to note, however, that the UC fecal-associated microbial community was characteristically very low in diversity [185]. If the aim is to remediate the gut microbiota of a patient with a less obvious discrepancy in ecosystem diversity when compared to the healthy population, it may be best to test the limits of niche occupancy and thus redundancy. Creating a niche for a desired beneficial microbe through pairing it with a prebiotic, i.e., a specific fiber supplement [452], may instead be more successful in these cases. Such a hypothesis could be addressed in future work with the modelling strategy utilized in this thesis, i.e., defined microbial communities cultured in vitro.

However, the dominance of habitat filtering does not preclude the existence of species assortment altogether. Indeed, when compared to the control, the ‘artificial’, non-coevolved microbial community was more variable and less metabolically efficient, as evidenced by less cross-feeding, and certain microbial strains were unable to assemble into the final microbial community. These results suggest that simply providing a beneficial microbe in isolation may not be an effective strategy, especially considering that it was core, butyrate-producing species that failed to integrate [30, 399]. Indeed, the more specialist nature of the Firmicutes in utilizing polysaccharides [72, 73] appeared to cause them to be particularly reliant on cross-feeding. The obvious next step would be to further probe how a natural community has coevolved to promote beneficial behaviours. In particular, the B. ovatus, E. rectale and Co. comes microbial guild could be examined. Did the B. ovatus strain fail to cross-feed these strains, or did it competitively exclude them [190, 395]? Alternatively, unrelated to metabolism, a breakdown of communication by, e.g., quorum sensing [396] or the secretion of antimicrobials [176] could have additionally led to the rejection of E. rectale and Co. comes from the microbial community. Coculture experiments could answer such questions, and the overall design of microbial therapeutic strategies should carefully consider ecological dynamics, as revealed through such experimentation.

I thus conclude that my overarching hypothesis, that microbial ecological theory can be replicated in complex defined microbial communities, was correct. Such defined microbial communities can be paired with either in vivo animal, in vitro bioreactor or ex vivo organoid type culturing, and thus present useful models for further probing the factors of microbial community assembly and ecosystem dynamics [194, 198–200, 211]. An example of such work would be to investigate the succession of the human gut microbiota, and how the delayed acquisition of key microbes impacts the education of an infant’s immune system, with regard to its associations with allergy [164], asthma [165] and diabetes [166]. Further, it would be of interest to determine why such microbes were unable to integrate into the gut microbiota of an infant that develops poorly. Was the environment unfavorable because of antibiotic treatment or the lack of HMOs in infant formula? Did the initial set of colonizers exclude beneficial microbes through competition, a failure to promote cross-feeding, or a mismatch in communication, suggesting an adverse historical contingency? Was it simply dispersal limitation from increased hygiene, as in the ‘old friends’ hypothesis, or unfortunate stochastic effects? Such questions can be addressed in an iterative fashion with suitable models, progressing research in the field from associations to causative links, and thus enabling the creation of effective therapeutics that remediate undesirable, potentially disease-promoting human gut microbiota configurations.

Chapter 7 – References

1. Thursby E, Juge N. Introduction to the human gut microbiota. Biochem J 2017; 474: 1823–1836.

2. Dietert RR. Safety and risk assessment for the human superorganism. Hum Ecol Risk Assess Int J 2017; 23: 1819–1829.

3. Li J, Jia H, Cai X, Zhong H, Feng Q, Sunagawa S, et al. An integrated catalog of reference genes in the human gut microbiome. Nat Biotechnol 2014; 32: 834–841.

4. Sonnenburg JL, Bäckhed F. Diet–microbiota interactions as moderators of human metabolism. Nature 2016; 535: 56–64.

5. Sender R, Fuchs S, Milo R. Revised Estimates for the Number of Human and Bacteria Cells in the Body. PLoS Biol 2016; 14.

6. Belkaid Y, Hand T. Role of the Microbiota in Immunity and Inflammation. Cell 2014; 157: 121–141.

7. Turnbaugh PJ, Ley RE, Hamady M, Fraser-Liggett C, Knight R, Gordon JI. The human microbiome project: exploring the microbial part of ourselves in a changing world. Nature 2007; 449: 804–810.

8. Brown K, DeCoffe D, Molcan E, Gibson DL. Diet-Induced Dysbiosis of the Intestinal Microbiota and the Effects on Immunity and Disease. Nutrients 2012; 4: 1095–1119.

9. Clemente JC, Ursell LK, Parfrey LW, Knight R. The Impact of the Gut Microbiota on Human Health: An Integrative View. Cell 2012; 148: 1258–1270.

10. Theriot CM, Young VB. Interactions Between the Gastrointestinal Microbiome and Clostridium difficile. Annu Rev Microbiol 2015; 69: 445–461.

11. Tripathi A, Debelius J, Brenner DA, Karin M, Loomba R, Schnabl B, et al. The gut–liver axis and the intersection with the microbiome. Nat Rev Gastroenterol Hepatol 2018; 15: 397–411.

12. Meijers BKI, Evenepoel P. The gut–kidney axis: indoxyl sulfate, p-cresyl sulfate and CKD progression. Nephrol Dial Transplant 2011; 26: 759–761.

13. Carabotti M, Scirocco A, Maselli MA, Severi C. The gut-brain axis: interactions between enteric microbiota, central and enteric nervous systems. Ann Gastroenterol Q Publ Hell Soc Gastroenterol 2015; 28: 203–209.

14. Hooks KB, O’Malley MA. Dysbiosis and Its Discontents. mBio 2017; 8: e01492-17.

15. Hill C, Guarner F, Reid G, Gibson GR, Merenstein DJ, Pot B, et al. Expert consensus document: The International Scientific Association for Probiotics and Prebiotics consensus statement on the scope and appropriate use of the term probiotic. Nat Rev Gastroenterol Hepatol 2014; 11: 506–514.

16. Ianiro G, Tilg H, Gasbarrini A. Antibiotics as deep modulators of gut microbiota: between good and evil. Gut 2016; 65: 1906–1915.

17. Pandey KR, Naik SR, Vakil BV. Probiotics, prebiotics and synbiotics- a review. J Food Sci Technol 2015; 52: 7577–7587.

18. Qin J, Li R, Raes J, Arumugam M, Burgdorf KS, Manichanh C, et al. A human gut microbial gene catalog established by metagenomic sequencing. Nature 2010; 464: 59–65.

19. Costello EK, Stagaman K, Dethlefsen L, Bohannan BJM, Relman DA. The application of ecological theory towards an understanding of the human microbiome. Science 2012; 336: 1255–1262.

20. Cammarota G, Ianiro G, Gasbarrini A. Fecal microbiota transplantation for the treatment of Clostridium difficile infection: a systematic review. J Clin Gastroenterol 2014; 48: 693–702.

21. Choi HH, Cho Y-S. Fecal Microbiota Transplantation: Current Applications, Effectiveness, and Future Perspectives. Clin Endosc 2016; 49: 257–265.

22. Gupta S, Allen-Vercoe E, Petrof EO. Fecal microbiota transplantation: in perspective. Ther Adv Gastroenterol 2016; 9: 229–239.

23. Petrof EO, Gloor GB, Vanner SJ, Weese SJ, Carter D, Daigneault MC, et al. Stool substitute transplant therapy for the eradication of Clostridium difficile infection: ‘RePOOPulating’ the gut. Microbiome 2013; 1: 3.

24. Lozupone CA, Stombaugh JI, Gordon JI, Jansson JK, Knight R. Diversity, stability and resilience of the human gut microbiota. Nature 2012; 489: 220–230.

25. Scarpellini E, Ianiro G, Attili F, Bassanelli C, De Santis A, Gasbarrini A. The human gut microbiota and virome: Potential therapeutic implications. Dig Liver Dis 2015; 47: 1007–1012.

26. Ogilvie LA, Jones BV. The human gut virome: a multifaceted majority. Front Microbiol 2015; 6.

27. Shkoporov AN, Ryan FJ, Draper LA, Forde A, Stockdale SR, Daly KM, et al. Reproducible protocols for metagenomic analysis of human faecal phageomes. Microbiome 2018; 6.

28. Human Microbiome Project Consortium. Structure, function and diversity of the healthy human microbiome. Nature 2012; 486: 207–214.

29. Lopetuso LR, Scaldaferri F, Petito V, Gasbarrini A. Commensal Clostridia: leading players in the maintenance of gut homeostasis. Gut Pathog 2013; 5: 23.

30. Falony G, Joossens M, Vieira-Silva S, Wang J, Darzi Y, Faust K, et al. Population-level analysis of gut microbiome variation. Science 2016; 352: 560–564.

31. Lloyd-Price J, Abu-Ali G, Huttenhower C. The healthy human microbiome. Genome Med 2016; 8.

32. McDonald JAK, Mullish BH, Pechlivanis A, Liu Z, Brignardello J, Kao D, et al. Inhibiting Growth of Clostridioides difficile by Restoring Valerate, Produced by the Intestinal Microbiota. Gastroenterology 2018.

33. de Vladar HP. Amino acid fermentation at the origin of the genetic code. Biol Direct 2012; 7: 6.

34. Fischbach MA, Sonnenburg JL. Eating For Two: How Metabolism Establishes Interspecies Interactions in the Gut. Cell Host Microbe 2011; 10: 336–347.

35. Pokusaeva K, Fitzgerald GF, van Sinderen D. Carbohydrate metabolism in Bifidobacteria. Genes

Nutr 2011; 6: 285–306.

36. Macfarlane GT, Macfarlane S. Bacteria, colonic fermentation, and gastrointestinal health. J AOAC

Int 2012; 95: 50–60.

37. Derrien M, Vaughan EE, Plugge CM, de Vos WM. Akkermansia muciniphila gen. nov., sp. nov., a

human intestinal mucin-degrading bacterium. Int J Syst Evol Microbiol 2004; 54: 1469–1476.

38. Jumas-Bilak E, Carlier J-P, Jean-Pierre H, Teyssier C, Gay B, Campos J, et al. Veillonella

montpellierensis sp. nov., a novel, anaerobic, Gram-negative coccus isolated from human clinical

samples. Int J Syst Evol Microbiol 2004; 54: 1311–1316.

39. Paixão L, Oliveira J, Veríssimo A, Vinga S, Lourenço EC, Ventura MR, et al. Host Glycan Sugar-

Specific Pathways in Streptococcus pneumoniae: Galactose as a Key Sugar in Colonisation and

Infection. PLOS ONE 2015; 10: e0121042.

40. Duncan SH, Hold GL, Harmsen HJM, Stewart CS, Flint HJ. Growth requirements and fermentation

products of Fusobacterium prausnitzii, and a proposal to reclassify it as Faecalibacterium

prausnitzii gen. nov., comb. nov. Int J Syst Evol Microbiol 2002; 52: 2141–2146.

41. Charalampopoulos D, Pandiella SS, Webb C. Growth studies of potentially probiotic lactic acid

bacteria in cereal-based substrates. J Appl Microbiol 2002; 92: 851–859.

42. Taras D, Simmering R, Collins MD, Lawson PA, Blaut M. Reclassification of Eubacterium

formicigenerans Holdeman and Moore 1974 as Dorea formicigenerans gen. nov., comb. nov., and

description of Dorea longicatena sp. nov., isolated from human faeces. Int J Syst Evol Microbiol

2002; 52: 423–428.

43. Holdeman LV, Moore WEC. New Genus, Coprococcus, Twelve New Species, and Emended

Descriptions of Four Previously Described Species of Bacteria from Human Feces. Int J Syst Evol

Microbiol 1974; 24: 260–277.

44. Liu C, Finegold SM, Song Y, Lawson PA. Reclassification of Clostridium coccoides, Ruminococcus

hansenii, Ruminococcus hydrogenotrophicus, Ruminococcus luti, Ruminococcus productus and

Ruminococcus schinkii as Blautia coccoides gen. nov., comb. nov., Blautia hansenii comb. nov.,

Blautia hydrogenotrophica comb. nov., Blautia luti comb. nov., Blautia producta comb. nov.,

Blautia schinkii comb. nov. and description of Blautia wexlerae sp. nov., isolated from human

faeces. Int J Syst Evol Microbiol 2008; 58: 1896–1902.

45. Roh H, Ko H-J, Kim D, Choi DG, Park S, Kim S, et al. Complete Genome Sequence of a Carbon

Monoxide-Utilizing Acetogen, Eubacterium limosum KIST612. J Bacteriol 2011; 193: 307–308.

46. Polansky O, Sekelova Z, Faldynova M, Sebkova A, Sisak F, Rychlik I. Important Metabolic Pathways

and Biological Processes Expressed by Chicken Cecal Microbiota. Appl Environ Microbiol 2016; 82:

1569–1576.

47. Tailford LE, Crost EH, Kavanaugh D, Juge N. Mucin glycan foraging in the human gut microbiome.

Front Genet 2015; 6.

48. Sakamoto M, Benno Y. Reclassification of Bacteroides distasonis, Bacteroides goldsteinii and

Bacteroides merdae as Parabacteroides distasonis gen. nov., comb. nov., Parabacteroides

goldsteinii comb. nov. and Parabacteroides merdae comb. nov. Int J Syst Evol Microbiol 2006; 56:

1599–1605.

49. Rautio M, Eerola E, Väisänen-Tunkelrott M-L, Molitoris D, Lawson P, Collins MD, et al.

Reclassification of Bacteroides putredinis (Weinberg et al., 1937) in a new genus Alistipes gen.

nov., as Alistipes putredinis comb. nov., and description of Alistipes finegoldii sp. nov., from

human sources. Syst Appl Microbiol 2003; 26: 182–188.

50. Kaneuchi C, Miyazato T, Shinjo T, Mitsuoka T. Taxonomic Study of Helically Coiled,

Sporeforming Anaerobes Isolated from the Intestines of Humans and Other Animals: Clostridium

cocleatum sp. nov. and Clostridium spiroforme sp. nov. Int J Syst Evol Microbiol 1979; 29: 1–12.

51. Yutin N, Galperin MY. A genomic update on clostridial phylogeny: Gram-negative spore-formers

and other misplaced clostridia. Environ Microbiol 2013; 15: 2631–2641.

52. Liang K, Shen CR. Selection of an endogenous 2,3-butanediol pathway in Escherichia coli by

fermentative redox balance. Metab Eng 2017; 39: 181–191.

53. Louis P, Flint HJ. Formation of propionate and butyrate by the human colonic microbiota. Environ

Microbiol 2017; 19: 29–41.

54. Smith EA, Macfarlane GT. Enumeration of amino acid fermenting bacteria in the human large

intestine: effects of pH and starch on peptide metabolism and dissimilation of amino acids. FEMS

Microbiol Ecol 1998; 25: 355–368.

55. Chassard C, Delmas E, Robert C, Lawson PA, Bernalier-Donadille A. Ruminococcus

champanellensis sp. nov., a cellulose-degrading bacterium from human gut microbiota. Int J Syst

Evol Microbiol 2012; 62: 138–143.

56. Koh A, De Vadder F, Kovatcheva-Datchary P, Bäckhed F. From Dietary Fiber to Host Physiology:

Short-Chain Fatty Acids as Key Bacterial Metabolites. Cell 2016; 165: 1332–1345.

57. Mashima I, Liao Y-C, Miyakawa H, Theodorea CF, Thawboon B, Thaweboon S, et al. Veillonella

infantium sp. nov., an anaerobic, Gram-stain-negative coccus isolated from tongue biofilm of a

Thai child. Int J Syst Evol Microbiol 2018; 68: 1101–1106.

58. Samuel BS, Hansen EE, Manchester JK, Coutinho PM, Henrissat B, Fulton R, et al. Genomic and

metabolic adaptations of Methanobrevibacter smithii to the human gut. Proc Natl Acad Sci U S A

2007; 104: 10643–10648.

59. Wolf PG, Biswas A, Morales SE, Greening C, Gaskins HR. H2 metabolism is widespread and diverse

among human colonic microbes. Gut Microbes 2016; 7: 235–245.

60. Elshaghabee FMF, Bockelmann W, Meske D, de Vrese M, Walte H-G, Schrezenmeir J, et al.

Ethanol Production by Selected Intestinal Microorganisms and Lactic Acid Bacteria Growing under

Different Nutritional Conditions. Front Microbiol 2016; 7.

61. Kelly WJ, Henderson G, Pacheco DM, Li D, Reilly K, Naylor GE, et al. The complete genome

sequence of Eubacterium limosum SA11, a metabolically versatile rumen acetogen. Stand

Genomic Sci 2016; 11: 26.

62. Mountfort DO, Grant WD, Clarke R, Asher RA. Eubacterium callanderi sp. nov. That

Demethoxylates O-Methoxylated Aromatic Acids to Volatile Fatty Acids. Int J Syst Evol Microbiol

1988; 38: 254–258.

63. Krajmalnik-Brown R, Ilhan Z-E, Kang D-W, DiBaise JK. Effects of Gut Microbes on Nutrient

Absorption and Energy Regulation. Nutr Clin Pract 2012; 27:

201–214.

64. Lane ER, Zisman TL, Suskind DL. The microbiota in inflammatory bowel disease: current and

therapeutic insights. J Inflamm Res 2017; 10: 63–73.

65. Spiljar M, Merkler D, Trajkovski M. The Immune System Bridges the Gut Microbiota with Systemic

Energy Homeostasis: Focus on TLRs, Mucosal Barrier, and SCFAs. Front Immunol 2017; 8.

66. Henao-Mejia J, Elinav E, Jin C, Hao L, Mehal WZ, Strowig T, et al. Inflammasome-mediated

dysbiosis regulates progression of NAFLD and obesity. Nature 2012; 482: 179–185.

67. Bloom SM, Bijanki VN, Nava GM, Sun L, Malvin NP, Donermeyer DL, et al. Commensal Bacteroides

species induce colitis in host-genotype-specific fashion in a mouse model of inflammatory bowel

disease. Cell Host Microbe 2011; 9: 390–403.

68. Lee W-J, Hase K. Gut microbiota-generated metabolites in animal health and disease. Nat Chem

Biol 2014; 10: 416–424.

69. Vital M, Karch A, Pieper DH. Colonic Butyrate-Producing Communities in Humans: an Overview

Using Omics Data. mSystems 2017; 2.

70. Turnbaugh PJ, Ley RE, Mahowald MA, Magrini V, Mardis ER, Gordon JI. An obesity-associated gut

microbiome with increased capacity for energy harvest. Nature 2006; 444: 1027–1031.

71. Koliada A, Syzenko G, Moseiko V, Budovska L, Puchkov K, Perederiy V, et al. Association between

body mass index and Firmicutes/Bacteroidetes ratio in an adult Ukrainian population. BMC

Microbiol 2017; 17.

72. Rogowski A, Briggs JA, Mortimer JC, Tryfona T, Terrapon N, Lowe EC, et al. Glycan complexity

dictates microbial resource allocation in the large intestine. Nat Commun 2015; 6.

73. Flint HJ, Scott KP, Duncan SH, Louis P, Forano E. Microbial degradation of complex carbohydrates

in the gut. Gut Microbes 2012; 3: 289–306.

74. Bhattacharya T, Ghosh TS, Mande SS. Global Profiling of Carbohydrate Active Enzymes in Human

Gut Microbiome. PLOS ONE 2015; 10: e0142038.

75. Xu J, Bjursell MK, Himrod J, Deng S, Carmichael LK, Chiang HC, et al. A genomic view of the

human-Bacteroides thetaiotaomicron symbiosis. Science 2003; 299: 2074–2076.

76. Vergnolle N. Protease inhibition as new therapeutic strategy for GI diseases. Gut 2016; gutjnl-

2015-309147.

77. Rahat-Rozenbloom S, Fernandes J, Gloor GB, Wolever TM. Evidence for Greater Production of

Colonic Short Chain Fatty Acids in Overweight than Lean Humans. Int J Obes 2014; 38:

1525–1531.

78. Huang S, Rutkowsky JM, Snodgrass RG, Ono-Moore KD, Schneider DA, Newman JW, et al.

Saturated fatty acids activate TLR-mediated proinflammatory signaling pathways. J Lipid Res

2012; 53: 2002–2013.

79. Shin N-R, Whon TW, Bae J-W. Proteobacteria: microbial signature of dysbiosis in gut microbiota.

Trends Biotechnol 2015; 33: 496–503.

80. Hall AB, Yassour M, Sauk J, Garner A, Jiang X, Arthur T, et al. A novel Ruminococcus gnavus clade

enriched in inflammatory bowel disease patients. Genome Med 2017; 9: 103.

81. Png CW, Lindén SK, Gilshenan KS, Zoetendal EG, McSweeney CS, Sly LI, et al. Mucolytic bacteria

with increased prevalence in IBD mucosa augment in vitro utilization of mucin by other bacteria.

Am J Gastroenterol 2010; 105: 2420–2428.

82. Desai MS, Seekatz AM, Koropatkin NM, Kamada N, Hickey CA, Wolter M, et al. A dietary fiber-

deprived gut microbiota degrades the colonic mucus barrier and enhances pathogen

susceptibility. Cell 2016; 167: 1339-1353.e21.

83. Arumugam M, Raes J, Pelletier E, Paslier DL, Yamada T, Mende DR, et al. Enterotypes of the

human gut microbiome. Nature 2011; 473: 174–180.

84. Wu GD, Chen J, Hoffmann C, Bittinger K, Chen Y-Y, Keilbaugh SA, et al. Linking Long-Term Dietary

Patterns with Gut Microbial Enterotypes. Science 2011; 334: 105–108.

85. Gorvitovskaia A, Holmes SP, Huse SM. Interpreting Prevotella and Bacteroides as biomarkers of

diet and lifestyle. Microbiome 2016; 4: 15.

86. Chen T, Long W, Zhang C, Liu S, Zhao L, Hamaker BR. Fiber-utilizing capacity varies in Prevotella-

versus Bacteroides-dominated gut microbiota. Sci Rep 2017; 7.

87. Hjorth MF, Roager HM, Larsen TM, Poulsen SK, Licht TR, Bahl MI, et al. Pre-treatment microbial

Prevotella-to-Bacteroides ratio, determines body fat loss success during a 6-month randomized

controlled diet intervention. Int J Obes 2018; 42: 580–583.

88. Bilz S, Samuel V, Morino K, Savage D, Choi CS, Shulman GI. Activation of the farnesoid X receptor

improves lipid metabolism in combined hyperlipidemic hamsters. Am J Physiol Endocrinol Metab

2006; 290: E716-722.

89. Joyce SA, MacSharry J, Casey PG, Kinsella M, Murphy EF, Shanahan F, et al. Regulation of host

weight gain and lipid metabolism by bacterial bile acid modification in the gut. Proc Natl Acad Sci

U S A 2014; 111: 7421–7426.

90. Staley C, Weingarden AR, Khoruts A, Sadowsky MJ. Interaction of Gut Microbiota with Bile Acid

Metabolism and its Influence on Disease States. Appl Microbiol Biotechnol 2017; 101: 47–64.

91. Jones BV, Begley M, Hill C, Gahan CGM, Marchesi JR. Functional and comparative metagenomic

analysis of bile salt hydrolase activity in the human gut microbiome. Proc Natl Acad Sci U S A

2008; 105: 13580–13585.

92. Vandeputte D, Kathagen G, D’hoe K, Vieira-Silva S, Valles-Colomer M, Sabino J, et al. Quantitative

microbiome profiling links gut community variation to microbial load. Nature 2017; 551: 507–

511.

93. Lukjancenko O, Wassenaar TM, Ussery DW. Comparison of 61 Sequenced Escherichia coli

Genomes. Microb Ecol 2010; 60: 708–720.

94. Sommer F, Anderson JM, Bharti R, Raes J, Rosenstiel P. The resilience of the intestinal microbiota

influences health and disease. Nat Rev Microbiol 2017; 15: 630–638.

95. Stecher B, Hardt W-D. Mechanisms controlling pathogen colonization of the gut. Curr Opin

Microbiol 2011; 14: 82–91.

96. den Besten G, van Eunen K, Groen AK, Venema K, Reijngoud D-J, Bakker BM. The role of short-

chain fatty acids in the interplay between diet, gut microbiota, and host energy metabolism. J

Lipid Res 2013; 54: 2325–2340.

97. Ríos-Covián D, Ruas-Madiedo P, Margolles A, Gueimonde M, de los Reyes-Gavilán CG, Salazar N.

Intestinal Short Chain Fatty Acids and their Link with Diet and Human Health. Front Microbiol

2016; 7.

98. Fan P, Li L, Rezaei A, Eslamfam S, Che D, Ma X. Metabolites of Dietary Protein and Peptides by

Intestinal Microbes and their Impacts on Gut. Curr Protein Pept Sci 2015; 16: 646–654.

99. Portune KJ, Beaumont M, Davila A-M, Tomé D, Blachier F, Sanz Y. Gut microbiota role in dietary

protein metabolism and health-related outcomes: The two sides of the coin. Trends Food Sci

Technol 2016; 57: 213–232.

100. Yao CK, Muir JG, Gibson PR. Review article: insights into colonic protein fermentation, its

modulation and potential health implications. Aliment Pharmacol Ther 2016; 43: 181–196.

101. Gadaleta RM, van Erpecum KJ, Oldenburg B, Willemsen ECL, Renooij W, Murzilli S, et al. Farnesoid

X receptor activation inhibits inflammation and preserves the intestinal barrier in inflammatory

bowel disease. Gut 2011; 60: 463–472.

102. Vavassori P, Mencarelli A, Renga B, Distrutti E, Fiorucci S. The bile acid receptor FXR is a

modulator of intestinal innate immunity. J Immunol 2009; 183: 6251–6261.

103. Morales P, Fujio S, Navarrete P, Ugalde JA, Magne F, Carrasco-Pozo C, et al. Impact of Dietary

Lipids on Colonic Function and Microbiota: An Experimental Approach Involving Orlistat-Induced

Fat Malabsorption in Human Volunteers. Clin Transl Gastroenterol 2016; 7: e161.

104. Mu H, Høy C-E. The digestion of dietary triacylglycerols. Prog Lipid Res 2004; 43: 105–133.

105. Ozdal T, Sela DA, Xiao J, Boyacioglu D, Chen F, Capanoglu E. The Reciprocal Interactions between

Polyphenols and Gut Microbiota and Effects on Bioaccessibility. Nutrients 2016; 8.

106. Biesalski HK. Nutrition meets the microbiome: micronutrients and the microbiota. Ann N Y Acad

Sci 2016; 1372: 53–64.

107. Cândido FG, Valente FX, Grześkowiak ŁM, Moreira APB, Rocha DMUP, Alfenas R de CG. Impact of

dietary fat on gut microbiota and low-grade systemic inflammation: mechanisms and clinical

implications on obesity. Int J Food Sci Nutr 2018; 69: 125–143.

108. Wolfe AJ. Glycolysis for the Microbiome Generation. Microbiol Spectr 2015; 3.

109. Lin R, Liu W, Piao M, Zhu H. A review of the relationship between the gut microbiota and amino

acid metabolism. Amino Acids 2017; 49: 2083–2090.

110. Geboes KP, De Hertogh G, De Preter V, Luypaerts A, Bammens B, Evenepoel P, et al. The influence

of inulin on the absorption of nitrogen and the production of metabolites of protein fermentation

in the colon. Br J Nutr 2006; 96: 1078–1086.

111. Morrison DJ, Preston T. Formation of short chain fatty acids by the gut microbiota and their

impact on human metabolism. Gut Microbes 2016; 7: 189–200.

112. Nishina PM, Freedland RA. Effects of propionate on lipid biosynthesis in isolated rat hepatocytes.

J Nutr 1990; 120: 668–673.

113. Chambers ES, Viardot A, Psichas A, Morrison DJ, Murphy KG, Zac-Varghese SEK, et al. Effects of

targeted delivery of propionate to the human colon on appetite regulation, body weight

maintenance and adiposity in overweight adults. Gut 2015; 64: 1744–1754.

114. Dorokhov YL, Shindyapina AV, Sheshukova EV, Komarova TV. Metabolic methanol: molecular

pathways and physiological roles. Physiol Rev 2015; 95: 603–644.

115. Gkolfakis P, Dimitriadis G, Triantafyllou K. Gut microbiota and non-alcoholic fatty liver disease.

Hepatobiliary Pancreat Dis Int 2015; 14: 572–581.

116. O’Brien PJ, Siraki AG, Shangari N. Aldehyde sources, metabolism, molecular toxicity mechanisms,

and possible effects on human health. Crit Rev Toxicol 2005; 35: 609–662.

117. Zhu L, Baker SS, Gill C, Liu W, Alkhouri R, Baker RD, et al. Characterization of gut microbiomes in

nonalcoholic steatohepatitis (NASH) patients: A connection between endogenous alcohol and

NASH. Hepatology 2013; 57: 601–609.

118. Principi M, Iannone A, Losurdo G, Mangia M, Shahini E, Albano F, et al. Nonalcoholic Fatty Liver

Disease in Inflammatory Bowel Disease: Prevalence and Risk Factors. Inflamm Bowel Dis 2018;

24: 1589–1596.

119. Kanehisa M, Goto S. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res 2000;

28: 27–30.

120. Hinnebusch BF, Meng S, Wu JT, Archer SY, Hodin RA. The effects of short-chain fatty acids on

human colon cancer cell phenotype are associated with histone hyperacetylation. J Nutr 2002;

132: 1012–1017.

121. Köpke M, Mihalcea C, Liew F, Tizard JH, Ali MS, Conolly JJ, et al. 2,3-Butanediol Production by

Acetogenic Bacteria, an Alternative Route to Chemical Synthesis, Using Industrial Waste Gas.

Appl Environ Microbiol 2011; 77: 5467–5475.

122. Flint HJ, Scott KP, Louis P, Duncan SH. The role of the gut microbiota in nutrition and health. Nat

Rev Gastroenterol Hepatol 2012; 9: 577–589.

123. Miceli JF, Torres CI, Krajmalnik-Brown R. Shifting the balance of fermentation products between

hydrogen and volatile fatty acids: microbial community structure and function. FEMS Microbiol

Ecol 2016; 92.

124. Mack I, Cuntz U, Grämer C, Niedermaier S, Pohl C, Schwiertz A, et al. Weight gain in anorexia

nervosa does not ameliorate the faecal microbiota, branched chain fatty acid profiles, and

gastrointestinal complaints. Sci Rep 2016; 6.

125. Armougom F, Henry M, Vialettes B, Raccah D, Raoult D. Monitoring Bacterial Community of

Human Gut Microbiota Reveals an Increase in Lactobacillus in Obese Patients and Methanogens

in Anorexic Patients. PLoS ONE 2009; 4.

126. Rey FE, Gonzalez MD, Cheng J, Wu M, Ahern PP, Gordon JI. Metabolic niche of a prominent

sulfate-reducing human gut bacterium. Proc Natl Acad Sci U S A 2013; 110: 13582–13587.

127. Benjdia A, Martens EC, Gordon JI, Berteau O. Sulfatases and a radical S-adenosyl-L-methionine

(AdoMet) enzyme are key for mucosal foraging and fitness of the prominent human gut

symbiont, Bacteroides thetaiotaomicron. J Biol Chem 2011; 286: 25973–25982.

128. Nicholls P, Kim JK. Sulphide as an inhibitor and electron donor for the cytochrome c oxidase

system. Can J Biochem 1982; 60: 613–623.

129. Figliuolo VR, dos Santos LM, Abalo A, Nanini H, Santos A, Brittes NM, et al. Sulfate-reducing

bacteria stimulate gut immune responses and contribute to inflammation in experimental colitis.

Life Sci 2017; 189: 29–38.

130. Ijssennagger N, Belzer C, Hooiveld GJ, Dekker J, van Mil SWC, Müller M, et al. Gut microbiota

facilitates dietary heme-induced epithelial hyperproliferation by opening the mucus barrier in

colon. Proc Natl Acad Sci U S A 2015; 112: 10038–10043.

131. Ijssennagger N, van der Meer R, van Mil SWC. Sulfide as a Mucus Barrier-Breaker in Inflammatory

Bowel Disease? Trends Mol Med 2016; 22: 190–199.

132. Andriamihaja M, Davila A-M, Eklou-Lawson M, Petit N, Delpal S, Allek F, et al. Colon luminal

content and epithelial cell morphology are markedly modified in rats fed with a high-protein diet.

Am J Physiol-Gastrointest Liver Physiol 2010; 299: G1030–G1037.

133. Hughes R, Kurth MJ, McGilligan V, McGlynn H, Rowland I. Effect of colonic bacterial metabolites

on Caco-2 cell paracellular permeability in vitro. Nutr Cancer 2008; 60: 259–266.

134. Cremin JD, Fitch MD, Fleming SE. Glucose alleviates ammonia-induced inhibition of short-chain

fatty acid metabolism in rat colonic epithelial cells. Am J Physiol-Gastrointest Liver Physiol 2003;

285: G105–G114.

135. Eklou-Lawson M, Bernard F, Neveux N, Chaumontet C, Bos C, Davila-Gay A-M, et al. Colonic

luminal ammonia and portal blood l-glutamine and l-arginine concentrations: a possible link

between colon mucosa and liver ureagenesis. Amino Acids 2009; 37: 751–760.

136. Mouillé B, Robert V, Blachier F. Adaptative increase of ornithine production and decrease of

ammonia metabolism in rat colonocytes after hyperproteic diet ingestion. Am J Physiol-

Gastrointest Liver Physiol 2004; 287: G344–G351.

137. Heimann E, Nyman M, Pålbrink A-K, Lindkvist-Petersson K, Degerman E. Branched short-chain

fatty acids modulate glucose and lipid metabolism in primary adipocytes. Adipocyte 2016; 5: 359–

368.

138. Jaskiewicz J, Zhao Y, Hawes JW, Shimomura Y, Crabb DW, Harris RA. Catabolism of isobutyrate by

colonocytes. Arch Biochem Biophys 1996; 327: 265–270.

139. Smith EA, Macfarlane GT. Dissimilatory amino acid metabolism in human colonic bacteria.

Anaerobe 1997; 3: 327–337.

140. Liang X, FitzGerald GA. Timing the Microbes: The Circadian Rhythm of the Gut Microbiome. J Biol

Rhythms 2017; 32: 505–515.

141. Dao MC, Everard A, Aron-Wisnewsky J, Sokolovska N, Prifti E, Verger EO, et al. Akkermansia

muciniphila and improved metabolic health during a dietary intervention in obesity: relationship

with gut microbiome richness and ecology. Gut 2016; 65: 426–436.

142. Everard A, Belzer C, Geurts L, Ouwerkerk JP, Druart C, Bindels LB, et al. Cross-talk between

Akkermansia muciniphila and intestinal epithelium controls diet-induced obesity. Proc Natl Acad

Sci U S A 2013; 110: 9066–9071.

143. Laforest-Lapointe I, Arrieta M-C. Patterns of Early-Life Gut Microbial Colonization during Human

Immune Development: An Ecological Perspective. Front Immunol 2017; 8.

144. Bäckhed F, Roswall J, Peng Y, Feng Q, Jia H, Kovatcheva-Datchary P, et al. Dynamics and

Stabilization of the Human Gut Microbiome during the First Year of Life. Cell Host Microbe 2015;

17: 690–703.

145. Perez-Muñoz ME, Arrieta M-C, Ramer-Tait AE, Walter J. A critical assessment of the ‘sterile

womb’ and ‘in utero colonization’ hypotheses: implications for research on the pioneer infant

microbiome. Microbiome 2017; 5: 48.

146. Dominguez-Bello MG, Costello EK, Contreras M, Magris M, Hidalgo G, Fierer N, et al. Delivery

mode shapes the acquisition and structure of the initial microbiota across multiple body habitats

in newborns. Proc Natl Acad Sci U S A 2010; 107: 11971–11975.

147. Faith JJ, Guruge JL, Charbonneau M, Subramanian S, Seedorf H, Goodman AL, et al. The long-term

stability of the human gut microbiota. Science 2013; 341: 1237439.

148. O’Toole PW, Jeffery IB. Gut microbiota and aging. Science 2015; 350: 1214–1215.

149. Goodrich JK, Waters JL, Poole AC, Sutter JL, Koren O, Blekhman R, et al. Human genetics shape

the gut microbiome. Cell 2014; 159: 789–799.

150. Loddo I, Romano C. Inflammatory Bowel Disease: Genetics, Epigenetics, and Pathogenesis. Front

Immunol 2015; 6: 551.

151. Liu JZ, van Sommeren S, Huang H, Ng SC, Alberts R, Takahashi A, et al. Association analyses

identify 38 susceptibility loci for inflammatory bowel disease and highlight shared genetic risk

across populations. Nat Genet 2015; 47: 979–986.

152. Jostins L, Ripke S, Weersma RK, Duerr RH, McGovern DP, Hui KY, et al. Host-microbe interactions

have shaped the genetic architecture of inflammatory bowel disease. Nature 2012; 491: 119–

124.

153. Halfvarson J, Bodin L, Tysk C, Lindberg E, Järnerot G. Inflammatory bowel disease in a Swedish

twin cohort: a long-term follow-up of concordance and clinical characteristics. Gastroenterology

2003; 124: 1767–1773.

154. Kelly RJ, Rouquier S, Giorgi D, Lennon GG, Lowe JB. Sequence and Expression of a Candidate for

the Human Secretor Blood Group α(1,2)Fucosyltransferase Gene (FUT2): Homozygosity for an

Enzyme-Inactivating Nonsense Mutation Commonly Correlates with the Non-Secretor

Phenotype. J Biol Chem 1995; 270: 4640–4649.

155. Kashyap PC, Marcobal A, Ursell LK, Smits SA, Sonnenburg ED, Costello EK, et al. Genetically

dictated change in host mucus carbohydrate landscape exerts a diet-dependent effect on the gut

microbiota. Proc Natl Acad Sci U S A 2013; 110: 17059–17064.

156. Rausch P, Rehman A, Künzel S, Häsler R, Ott SJ, Schreiber S, et al. Colonic mucosa-associated

microbiota is influenced by an interaction of Crohn disease and FUT2 (Secretor) genotype. Proc

Natl Acad Sci U S A 2011; 108: 19030–19035.

157. McGovern DPB, Jones MR, Taylor KD, Marciante K, Yan X, Dubinsky M, et al. Fucosyltransferase 2

(FUT2) non-secretor status is associated with Crohn’s disease. Hum Mol Genet 2010; 19: 3468–

3476.

158. Bach J-F, Chatenoud L. The Hygiene Hypothesis: An Explanation for the Increased Frequency of

Insulin-Dependent Diabetes. Cold Spring Harb Perspect Med 2012; 2.

159. Yatsunenko T, Rey FE, Manary MJ, Trehan I, Dominguez-Bello MG, Contreras M, et al. Human gut

microbiome viewed across age and geography. Nature 2012; 486: 222–227.

160. Scudellari M. News Feature: Cleaning up the hygiene hypothesis. Proc Natl Acad Sci U S A 2017; 114:

1433–1436.

161. David LA, Maurice CF, Carmody RN, Gootenberg DB, Button JE, Wolfe BE, et al. Diet rapidly and

reproducibly alters the human gut microbiome. Nature 2014; 505: 559–563.

162. Walker AW, Ince J, Duncan SH, Webster LM, Holtrop G, Ze X, et al. Dominant and diet-responsive

groups of bacteria within the human colonic microbiota. ISME J 2011; 5: 220–230.

163. Renz H, Holt PG, Inouye M, Logan AC, Prescott SL, Sly PD. An exposome perspective: Early-life

events and immune development in a changing world. J Allergy Clin Immunol 2017; 140: 24–40.

164. Fujimura KE, Sitarik AR, Havstad S, Lin DL, Levan S, Fadrosh D, et al. Neonatal gut microbiota

associates with childhood multisensitized atopy and T cell differentiation. Nat Med 2016; 22:

1187–1191.

165. Arrieta M-C, Stiemsma LT, Dimitriu PA, Thorson L, Russell S, Yurist-Doutsch S, et al. Early infancy

microbial and metabolic alterations affect risk of childhood asthma. Sci Transl Med 2015; 7:

307ra152.

166. Kostic AD, Gevers D, Siljander H, Vatanen T, Hyötyläinen T, Hämäläinen A-M, et al. The dynamics

of the human infant gut microbiome in development and in progression toward type 1 diabetes.

Cell Host Microbe 2015; 17: 260–273.

167. Levy R, Borenstein E. Metabolic modeling of species interaction in the human microbiome

elucidates community-level assembly rules. Proc Natl Acad Sci U S A 2013; 110: 12804–12809.

168. Ferretti P, Pasolli E, Tett A, Asnicar F, Gorfer V, Fedi S, et al. Mother-to-Infant Microbial

Transmission from Different Body Sites Shapes the Developing Infant Gut Microbiome. Cell Host

Microbe 2018; 24: 133-145.e5.

169. Asnicar F, Manara S, Zolfo M, Truong DT, Scholz M, Armanini F, et al. Studying Vertical

Microbiome Transmission from Mothers to Infants by Strain-Level Metagenomic Profiling.

mSystems 2017; 2: e00164-16.

170. Pepper JW, Rosenfeld S. The emerging medical ecology of the human gut microbiome. Trends

Ecol Evol 2012; 27: 381–384.

171. Koenig JE, Spor A, Scalfone N, Fricker AD, Stombaugh J, Knight R, et al. Succession of microbial

consortia in the developing infant gut microbiome. Proc Natl Acad Sci U S A 2011; 108 Suppl 1:

4578–4585.

172. Jeraldo P, Sipos M, Chia N, Brulc JM, Dhillon AS, Konkel ME, et al. Quantification of the relative

roles of niche and neutral processes in structuring gastrointestinal microbiomes. Proc Natl Acad

Sci U S A 2012; 109: 9692–9698.

173. Horner-Devine MC, Silver JM, Leibold MA, Bohannan BJM, Colwell RK, Fuhrman JA, et al. A

comparison of taxon co-occurrence patterns for macro- and microorganisms. Ecology 2007; 88:

1345–1353.

174. Cornwell WK, Schwilk DW, Ackerly DD. A trait-based test for habitat filtering: convex hull

volume. Ecology 2006; 87: 1465–1471.

175. Cody ML, MacArthur RH, Diamond JM. Ecology and Evolution of Communities.

1975. Harvard University Press.

176. Cordero OX, Wildschutte H, Kirkup B, Proehl S, Ngo L, Hussain F, et al. Ecological populations of

bacteria act as socially cohesive units of antibiotic production and resistance. Science 2012; 337:

1228–1231.

177. Shapira M. Gut Microbiotas and Host Evolution: Scaling Up Symbiosis. Trends Ecol Evol 2016; 31:

539–549.

178. van de Guchte M, Blottière HM, Doré J. Humans as holobionts: implications for prevention and

therapy. Microbiome 2018; 6: 81.

179. Ballesté E, Blanch AR. Persistence of Bacteroides Species Populations in a River as Measured by

Molecular and Culture Techniques. Appl Environ Microbiol 2010; 76: 7608–7616.

180. Browne HP, Neville BA, Forster SC, Lawley TD. Transmission of the gut microbiota: spreading of

health. Nat Rev Microbiol 2017; 15: 531–543.

181. Ruiz L, Ruas-Madiedo P, Gueimonde M, de los Reyes-Gavilán CG, Margolles A, Sánchez B. How do

bifidobacteria counteract environmental challenges? Mechanisms involved and physiological

consequences. Genes Nutr 2011; 6: 307–318.

182. Signoretto C, Lleò MM, Tafi MC, Canepari P. Cell wall chemical composition of Enterococcus

faecalis in the viable but nonculturable state. Appl Environ Microbiol 2000; 66: 1953–1959.

183. Dethlefsen L, Relman DA. Incomplete recovery and individualized responses of the human distal

gut microbiota to repeated antibiotic perturbation. Proc Natl Acad Sci U S A 2011; 108 Suppl 1:

4554–4561.

184. Reese AT, Dunn RR. Drivers of Microbiome Biodiversity: A Review of General Rules, Feces, and

Ignorance. mBio 2018; 9.

185. Cho I, Blaser MJ. The Human Microbiome: at the interface of health and disease. Nat Rev Genet

2012; 13: 260–270.

186. Rakoff-Nahoum S, Coyne MJ, Comstock LE. An Ecological Network of Polysaccharide Utilization

among Human Intestinal Symbionts. Curr Biol 2014; 24: 40–49.

187. Coyte KZ, Schluter J, Foster KR. The ecology of the microbiome: Networks, competition, and

stability. Science 2015; 350: 663–666.

188. Larsen OFA, Claassen E. The mechanistic link between health and gut microbiota diversity. Sci Rep

2018; 8: 2183.

189. Zhang C, Yin A, Li H, Wang R, Wu G, Shen J, et al. Dietary Modulation of Gut Microbiota

Contributes to Alleviation of Both Genetic and Simple Obesity in Children. EBioMedicine 2015; 2:

968–984.

190. Rakoff-Nahoum S, Foster KR, Comstock LE. The evolution of cooperation within the gut

microbiota. Nature 2016; 533: 255–259.

191. Konopka A. What is microbial community ecology? ISME J 2009; 3: 1223–1230.

192. Foster KR, Schluter J, Coyte KZ, Rakoff-Nahoum S. The evolution of the host microbiome as an

ecosystem on a leash. Nature 2017; 548: 43–51.

193. Nguyen TLA, Vieira-Silva S, Liston A, Raes J. How informative is the mouse for human gut

microbiota research? Dis Model Mech 2015; 8: 1–16.

194. Turnbaugh PJ, Ridaura VK, Faith JJ, Rey FE, Knight R, Gordon JI. The effect of diet on the human

gut microbiome: a metagenomic analysis in humanized gnotobiotic mice. Sci Transl Med 2009; 1:

6ra14.

195. Peery AF, Crockett SD, Barritt AS, Dellon ES, Eluri S, Gangarosa LM, et al. Burden of

Gastrointestinal, Liver, and Pancreatic Diseases in the United States. Gastroenterology 2015; 149:

1731-1741.e3.

196. Belzer C, de Vos WM. Microbes inside—from diversity to function: the case of Akkermansia. ISME

J 2012; 6: 1449–1458.

197. McLoughlin K, Schluter J, Rakoff-Nahoum S, Smith AL, Foster KR. Host Selection of Microbiota via

Differential Adhesion. Cell Host Microbe 2016; 19: 550–559.

198. McDonald JAK, Schroeter K, Fuentes S, Heikamp-Dejong I, Khursigara CM, de Vos WM, et al.

Evaluation of microbial community reproducibility, stability and composition in a human distal

gut chemostat model. J Microbiol Methods 2013; 95: 167–174.

199. Natividad JM, Pinto-Sanchez MI, Galipeau HJ, Jury J, Jordana M, Reinisch W, et al. Ecobiotherapy

Rich in Firmicutes Decreases Susceptibility to Colitis in a Humanized Gnotobiotic Mouse Model.

Inflamm Bowel Dis 2015; 21: 1883–1893.

200. Yen S, McDonald JAK, Schroeter K, Oliphant K, Sokolenko S, Blondeel EJM, et al. Metabolomic

analysis of human fecal microbiota: a comparison of feces-derived communities and defined

mixed communities. J Proteome Res 2015; 14: 1472–1482.

201. Xiong W, Abraham P, Li Z, Pan C, Hettich RL. Microbial metaproteomics for characterizing the

range of metabolic functions and activities of human gut microbiota. Proteomics 2015; 15: 3424–

3438.

202. Pérez-Cobas AE, Gosalbes MJ, Friedrichs A, Knecht H, Artacho A, Eismann K, et al. Gut microbiota

disturbance during antibiotic therapy: a multi-omic approach. Gut 2013; 62: 1591–1601.

203. Wos-Oxley M, Bleich A, Oxley APA, Kahl S, Janus LM, Smoczek A, et al. Comparative evaluation of

establishing a human gut microbial community within rodent models. Gut Microbes 2012; 3: 234–

249.

204. Litten-Brown JC, Corson AM, Clarke L. Porcine models for the metabolic syndrome, digestive and

bone disorders: a general overview. Anim Int J Anim Biosci 2010; 4: 899–920.

205. Peloquin JM, Nguyen DD. The Microbiota and Inflammatory Bowel Disease: Insights from Animal

Models. Anaerobe 2013; 24.

206. Chung H, Pamp SJ, Hill JA, Surana NK, Edelman SM, Troy EB, et al. Gut Immune Maturation

Depends on Colonization with a Host-Specific Microbiota. Cell 2012; 149: 1578–1593.

207. Van de Wiele T, Van den Abbeele P, Ossieur W, Possemiers S, Marzorati M. The Simulator of the

Human Intestinal Microbial Ecosystem (SHIME®). In: Verhoeckx K, Cotter P, López-Expósito I,

Kleiveland C, Lea T, Mackie A, et al. (eds). The Impact of Food Bioactives on Health: in vitro and ex

vivo models. 2015. Springer, Cham (CH).

208. Van den Abbeele P, Grootaert C, Marzorati M, Possemiers S, Verstraete W, Gérard P, et al.

Microbial community development in a dynamic gut model is reproducible, colon region specific,

and selective for Bacteroidetes and Clostridium cluster IX. Appl Environ Microbiol 2010; 76: 5237–

5246.

209. Wang Y, DiSalvo M, Gunasekara DB, Dutton J, Proctor A, Lebhar MS, et al. Self-renewing

Monolayer of Primary Colonic or Rectal Epithelial Cells. Cell Mol Gastroenterol Hepatol 2017; 4:

165-182.e7.

210. Foulke-Abel J, In J, Kovbasnjuk O, Zachos NC, Ettayebi K, Blutt SE, et al. Human enteroids as an ex-

vivo model of host–pathogen interactions in the gastrointestinal tract. Exp Biol Med Maywood NJ

2014; 239: 1124–1134.

211. Bein A, Shin W, Jalili-Firoozinezhad S, Park MH, Sontheimer-Phelps A, Tovaglieri A, et al.

Microfluidic Organ-on-a-Chip Models of Human Intestine. Cell Mol Gastroenterol Hepatol 2018; 5:

659–668.

212. Gutleben J, Chaib De Mares M, van Elsas JD, Smidt H, Overmann J, Sipkema D. The multi-omics

promise in context: from sequence to microbial isolate. Crit Rev Microbiol 2018; 44: 212–229.

213. Podar M, Makarova KS, Graham DE, Wolf YI, Koonin EV, Reysenbach A-L. Insights into archaeal

evolution and symbiosis from the genomes of a nanoarchaeon and its inferred crenarchaeal host

from Obsidian Pool, Yellowstone National Park. Biol Direct 2013; 8: 9.

214. van Dijk EL, Auger H, Jaszczyszyn Y, Thermes C. Ten years of next-generation sequencing

technology. Trends Genet TIG 2014; 30: 418–426.

215. Franzosa EA, Hsu T, Sirota-Madi A, Shafquat A, Abu-Ali G, Morgan XC, et al. Sequencing and

beyond: integrating molecular ‘omics’ for microbial community profiling. Nat Rev Microbiol 2015;

13: 360–372.

216. Di Segni A, Braun T, BenShoshan M, Farage Barhom S, Glick Saar E, Cesarkas K, et al. Guided

Protocol for Fecal Microbial Characterization by 16S rRNA-Amplicon Sequencing. J Vis Exp JoVE

2018.

217. Nash AK, Auchtung TA, Wong MC, Smith DP, Gesell JR, Ross MC, et al. The gut mycobiome of the

Human Microbiome Project healthy cohort. Microbiome 2017; 5: 153.

218. Luo C, Knight R, Siljander H, Knip M, Xavier RJ, Gevers D. ConStrains identifies microbial strains in

metagenomic datasets. Nat Biotechnol 2015; 33: 1045–1052.

219. Land M, Hauser L, Jun S-R, Nookaew I, Leuze MR, Ahn T-H, et al. Insights from 20 years of

bacterial genome sequencing. Funct Integr Genomics 2015; 15: 141–161.

220. Wang W-L, Xu S-Y, Ren Z-G, Tao L, Jiang J-W, Zheng S-S. Application of metagenomics in the

human gut microbiome. World J Gastroenterol WJG 2015; 21: 803–814.

221. Fernandes AD, Reid JN, Macklaim JM, McMurrough TA, Edgell DR, Gloor GB. Unifying the analysis

of high-throughput sequencing datasets: characterizing RNA-seq, 16S rRNA gene sequencing and

selective growth experiments by compositional data analysis. Microbiome 2014; 2: 15.

222. Callahan BJ, McMurdie PJ, Rosen MJ, Han AW, Johnson AJA, Holmes SP. DADA2: High-resolution

sample inference from Illumina amplicon data. Nat Methods 2016; 13: 581–583.

223. Stoddard SF, Smith BJ, Hein R, Roller BRK, Schmidt TM. rrnDB: improved tools for interpreting

rRNA gene abundance in bacteria and archaea and a new foundation for future development.

Nucleic Acids Res 2015; 43: D593-598.

224. Tourlousse DM, Ohashi A, Sekiguchi Y. Sample tracking in microbiome community profiling assays

using synthetic 16S rRNA gene spike-in controls. Sci Rep 2018; 8: 9095.

225. Yang Y-W, Chen M-K, Yang B-Y, Huang X-J, Zhang X-R, He L-Q, et al. Use of 16S rRNA Gene-

Targeted Group-Specific Primers for Real-Time PCR Analysis of Predominant Bacteria in Mouse

Feces. Appl Environ Microbiol 2015; 81: 6749–6756.

226. Vázquez L, Guadamuro L, Giganto F, Mayo B, Flórez AB. Development and Use of a Real-Time

Quantitative PCR Method for Detecting and Quantifying Equol-Producing Bacteria in Human

Faecal Samples and Slurry Cultures. Front Microbiol 2017; 8.

227. Thomas AM, Jesus EC, Lopes A, Aguiar S, Begnami MD, Rocha RM, et al. Tissue-Associated

Bacterial Alterations in Rectal Carcinoma Patients Revealed by 16S rRNA Community Profiling.

Front Cell Infect Microbiol 2016; 6.

228. Costea PI, Zeller G, Sunagawa S, Pelletier E, Alberti A, Levenez F, et al. Towards standards for

human fecal sample processing in metagenomic studies. Nat Biotechnol 2017; 35: 1069–1076.

229. Franzosa EA, Morgan XC, Segata N, Waldron L, Reyes J, Earl AM, et al. Relating the

metatranscriptome and metagenome of the human gut. Proc Natl Acad Sci U S A 2014; 111:

E2329-2338.

230. Cerdó T, Ruiz A, Acuña I, Jáuregui R, Jehmlich N, Haange S-B, et al. Gut microbial functional

maturation and succession during human early life. Environ Microbiol 2018; 20: 2160–2177.

231. Duranti S, Turroni F, Lugli GA, Milani C, Viappiani A, Mangifesta M, et al. Genomic

characterization and transcriptional studies of the starch-utilizing strain Bifidobacterium

adolescentis 22L. Appl Environ Microbiol 2014; 80: 6080–6090.

232. Stauber J, Shaikh N, Ordiz MI, Tarr PI, Manary MJ. Droplet digital PCR quantifies host

inflammatory transcripts in feces reliably and reproducibly. Cell Immunol 2016; 303: 43–49.

233. Reck M, Tomasch J, Deng Z, Jarek M, Husemann P, Wagner-Döbler I, et al. Stool

metatranscriptomics: A technical guideline for mRNA stabilisation and isolation. BMC Genomics

2015; 16: 494.

234. Zhang X, Li L, Mayne J, Ning Z, Stintzi A, Figeys D. Assessing the impact of protein extraction

methods for human gut metaproteomics. J Proteomics 2018; 180: 120–127.

235. Tanca A, Palomba A, Pisanu S, Addis MF, Uzzau S. Enrichment or depletion? The impact of stool

pretreatment on metaproteomic characterization of the human gut microbiota. Proteomics 2015;

15: 3474–3485.

236. Leary DH, Hervey WJ, Deschamps JR, Kusterbeck AW, Vora GJ. Which metaproteome? The impact

of protein extraction bias on metaproteomic analyses. Mol Cell Probes 2013; 27: 193–199.

237. Choudhary G, Wu S-L, Shieh P, Hancock WS. Multiple enzymatic digestion for enhanced sequence

coverage of proteins in complex proteomic mixtures using capillary LC with ion trap MS/MS. J

Proteome Res 2003; 2: 59–67.

238. Song L, Zhao M, Duffy DC, Hansen J, Shields K, Wungjiranirun M, et al. Development and

Validation of Digital Enzyme-Linked Immunosorbent Assays for Ultrasensitive Detection and

Quantification of Clostridium difficile Toxins in Stool. J Clin Microbiol 2015; 53: 3204–3212.

239. Li Z, Adams RM, Chourey K, Hurst GB, Hettich RL, Pan C. Systematic comparison of label-free,

metabolic labeling, and isobaric chemical labeling for quantitative proteomics on LTQ Orbitrap

Velos. J Proteome Res 2012; 11: 1582–1590.

240. Sachsenberg T, Herbst F-A, Taubert M, Kermer R, Jehmlich N, von Bergen M, et al. MetaProSIP:

automated inference of stable isotope incorporation rates in proteins for functional

metaproteomics. J Proteome Res 2015; 14: 619–627.

241. Li Z, Wang Y, Yao Q, Justice NB, Ahn T-H, Xu D, et al. Diverse and divergent protein post-

translational modifications in two growth stages of a natural microbial community. Nat Commun

2014; 5: 4405.

242. Heyer R, Schallert K, Zoun R, Becher B, Saake G, Benndorf D. Challenges and perspectives of

metaproteomic data analysis. J Biotechnol 2017; 261: 24–36.

243. Graham C, McMullan G, Graham RLJ. Proteomics in the microbial sciences. Bioeng Bugs 2011; 2:

17–30.

244. Xiao Y, Vecchi MM, Wen D. Distinguishing between Leucine and Isoleucine by Integrated LC-MS

Analysis Using an Orbitrap Fusion Mass Spectrometer. Anal Chem 2016; 88: 10757–10766.

245. Vernocchi P, Del Chierico F, Putignani L. Gut Microbiota Profiling: Metabolomics Based Approach

to Unravel Compounds Affecting Human Health. Front Microbiol 2016; 7.

246. Krieger CJ, Zhang P, Mueller LA, Wang A, Paley S, Arnaud M, et al. MetaCyc: a multiorganism

database of metabolic pathways and enzymes. Nucleic Acids Res 2004; 32: D438–D442.

247. Chow J, Panasevich MR, Alexander D, Vester Boler BM, Rossoni Serao MC, Faber TA, et al. Fecal

metabolomics of healthy breast-fed versus formula-fed infants before and during in vitro batch

culture fermentation. J Proteome Res 2014; 13: 2534–2542.

248. Smirnov KS, Maier TV, Walker A, Heinzmann SS, Forcisi S, Martinez I, et al. Challenges of

metabolomics in human gut microbiota research. Int J Med Microbiol IJMM 2016; 306: 266–279.

249. García-Villalba R, Giménez-Bastida JA, García-Conesa MT, Tomás-Barberán FA, Espín JC,

Larrosa M. Alternative method for gas chromatography-mass spectrometry analysis of short-

chain fatty acids in faecal samples. J Sep Sci 2012; 35: 1906–1913.

250. Hou W, Zhong D, Zhang P, Li Y, Lin M, Liu G, et al. A strategy for the targeted metabolomics

analysis of 11 gut microbiota-host co-metabolites in rat serum, urine and feces by ultra high

performance liquid chromatography-tandem mass spectrometry. J Chromatogr A 2016; 1429:

207–217.

251. Wishart DS, Tzur D, Knox C, Eisner R, Guo AC, Young N, et al. HMDB: the Human Metabolome

Database. Nucleic Acids Res 2007; 35: D521-526.

252. Alonso A, Marsal S, Julià A. Analytical Methods in Untargeted Metabolomics: State of the Art in

2015. Front Bioeng Biotechnol 2015; 3.

253. Sokolenko S, Blondeel EJM, Azlah N, George B, Schulze S, Chang D, et al. Profiling Convoluted

Single-Dimension Proton NMR Spectra: A Plackett–Burman Approach for Assessing Quantification

Error of Metabolites in Complex Mixtures with Application to Cell Culture. Anal Chem 2014; 86:

3330–3337.

254. Andra SS, Austin C, Patel D, Dolios G, Awawda M, Arora M. Trends in the application of high-

resolution mass spectrometry for human biomonitoring: An analytical primer to studying the

environmental chemical space of the human exposome. Environ Int 2017; 100: 32–61.

255. Giraudeau P, Frydman L. Single-scan 2D NMR: An Emerging Tool in Analytical Spectroscopy. Annu

Rev Anal Chem Palo Alto Calif 2014; 7: 129–161.

256. Le Guennec A, Giraudeau P, Caldarelli S. Evaluation of fast 2D NMR for metabolomics. Anal Chem

2014; 86: 5946–5954.

257. Bokulich NA, Kaehler BD, Rideout JR, Dillon M, Bolyen E, Knight R, et al. Optimizing taxonomic

classification of marker-gene amplicon sequences with QIIME 2’s q2-feature-classifier plugin.

Microbiome 2018; 6: 90.

258. Wang Q, Garrity GM, Tiedje JM, Cole JR. Naive Bayesian classifier for rapid assignment of rRNA

sequences into the new bacterial taxonomy. Appl Environ Microbiol 2007; 73: 5261–5267.

259. Wissenbach DK, Oliphant K, Rolle-Kampczyk U, Yen S, Höke H, Baumann S, et al. Optimization of

metabolomics of defined in vitro gut microbial ecosystems. Int J Med Microbiol IJMM 2016; 306:

280–289.

260. Kau AL, Ahern PP, Griffin NW, Goodman AL, Gordon JI. Human nutrition, the gut microbiome, and

immune system: envisioning the future. Nature 2011; 474: 327–336.

261. Wang M-H, Achkar J-P. Gene-environment interactions in inflammatory bowel disease

pathogenesis. Curr Opin Gastroenterol 2015; 31: 277–282.

262. Yi P, Li L. The germfree murine animal: an important animal model for research on the

relationship between gut microbiota and the host. Vet Microbiol 2012; 157: 1–7.

263. Venema K, van den Abbeele P. Experimental models of the gut microbiome. Best Pract Res Clin

Gastroenterol 2013; 27: 115–126.

264. Dumas M-E, Kinross J, Nicholson JK. Metabolic phenotyping and systems biology approaches to

understanding metabolic syndrome and fatty liver disease. Gastroenterology 2014; 146: 46–62.

265. Shoaie S, Ghaffari P, Kovatcheva-Datchary P, Mardinoglu A, Sen P, Pujos-Guillot E, et al.

Quantifying Diet-Induced Metabolic Changes of the Human Gut Microbiome. Cell Metab 2015;

22: 320–331.

266. Walker A, Pfitzner B, Neschen S, Kahle M, Harir M, Lucio M, et al. Distinct signatures of host-

microbial meta-metabolome and gut microbiome in two C57BL/6 strains under high-fat diet.

ISME J 2014; 8: 2380–2396.

267. Marozava S, Röling WFM, Seifert J, Küffner R, von Bergen M, Meckenstock RU. Physiology of

Geobacter metallireducens under excess and limitation of electron donors. Part I. Batch

cultivation with excess of carbon sources. Syst Appl Microbiol 2014; 37: 277–286.

268. Marozava S, Röling WFM, Seifert J, Küffner R, von Bergen M, Meckenstock RU. Physiology of

Geobacter metallireducens under excess and limitation of electron donors. Part II. Mimicking

environmental conditions during cultivation in retentostats. Syst Appl Microbiol 2014; 37: 287–

295.

269. von Bergen M, Jehmlich N, Taubert M, Vogt C, Bastida F, Herbst F-A, et al. Insights from

quantitative metaproteomics and protein-stable isotope probing into microbial ecology. ISME J

2013; 7: 1877–1885.

270. Neufeld JD, Wagner M, Murrell JC. Who eats what, where and when? Isotope-labelling

experiments are coming of age. ISME J 2007; 1: 103–110.

271. Fritz JV, Desai MS, Shah P, Schneider JG, Wilmes P. From meta-omics to causality: experimental

models for human microbiome research. Microbiome 2013; 1: 14.

272. Flynn CR, Albaugh VL, Cai S, Cheung-Flynn J, Williams PE, Brucker RM, et al. Bile diversion to the

distal small intestine has comparable metabolic benefits to bariatric surgery. Nat Commun 2015;

6: 7715.

273. Herberth G, Offenberg K, Rolle-Kampczyk U, Bauer M, Otto W, Röder S, et al. Endogenous

metabolites and inflammasome activity in early childhood and links to respiratory diseases. J

Allergy Clin Immunol 2015; 136: 495–497.

274. Davies JM, Abreu MT. Host-microbe interactions in the small bowel. Curr Opin Gastroenterol

2015; 31: 118–123.

275. Amoako-Tuffour Y, Jones ML, Shalabi N, Labbe A, Vengallatore S, Prakash S. Ingestible

gastrointestinal sampling devices: state-of-the-art and future directions. Crit Rev Biomed Eng

2014; 42: 1–15.

276. Haange S-B, Oberbach A, Schlichting N, Hugenholtz F, Smidt H, von Bergen M, et al.

Metaproteome analysis and molecular genetics of rat intestinal microbiota reveals section and

localization resolved species distribution and enzymatic functionalities. J Proteome Res 2012; 11:

5406–5417.

277. Li H, Limenitakis JP, Fuhrer T, Geuking MB, Lawson MA, Wyss M, et al. The outer mucus layer

hosts a distinct intestinal microbial niche. Nat Commun 2015; 6: 8292.

278. Subramanian S, Blanton LV, Frese SA, Charbonneau M, Mills DA, Gordon JI. Cultivating healthy

growth and nutrition through the gut microbiota. Cell 2015; 161: 36–48.

279. Allen-Vercoe E. Bringing the gut microbiota into focus through microbial culture: recent progress

and future perspective. Curr Opin Microbiol 2013; 16: 625–629.

280. Derrien M, Belzer C, de Vos WM. Akkermansia muciniphila and its role in regulating host

functions. Microb Pathog 2017; 106: 171–181.

281. Sommer F, Bäckhed F. The gut microbiota--masters of host development and physiology. Nat Rev

Microbiol 2013; 11: 227–238.

282. Scheppach W. Effects of short chain fatty acids on gut morphology and function. Gut 1994; 35:

S35–S38.

283. Schwiertz A, Taras D, Schäfer K, Beijer S, Bos NA, Donus C, et al. Microbiota and SCFA in lean and

overweight healthy subjects. Obes Silver Spring Md 2010; 18: 190–195.

284. LeBlanc JG, Milani C, de Giori GS, Sesma F, van Sinderen D, Ventura M. Bacteria as vitamin

suppliers to their host: a gut microbiota perspective. Curr Opin Biotechnol 2013; 24: 160–168.

285. Magnúsdóttir S, Ravcheev D, de Crécy-Lagard V, Thiele I. Systematic genome assessment of B-

vitamin biosynthesis suggests co-operation among gut microbes. Front Genet 2015; 6: 148.

286. Huttenhower C, Kostic AD, Xavier RJ. Inflammatory bowel disease as a model for translating the

microbiome. Immunity 2014; 40: 843–854.

287. Manichanh C, Borruel N, Casellas F, Guarner F. The gut microbiota in IBD. Nat Rev Gastroenterol

Hepatol 2012; 9: 599–608.

288. Walters WA, Xu Z, Knight R. Meta-analyses of human gut microbes associated with obesity and

IBD. FEBS Lett 2014; 588: 4223–4233.

289. Morgan XC, Tickle TL, Sokol H, Gevers D, Devaney KL, Ward DV, et al. Dysfunction of the intestinal

microbiome in inflammatory bowel disease and treatment. Genome Biol 2012; 13: R79.

290. Bjerrum JT, Wang Y, Hao F, Coskun M, Ludwig C, Günther U, et al. Metabonomics of human fecal

extracts characterize ulcerative colitis, Crohn’s disease and healthy individuals. Metabolomics

2015; 11: 122–133.

291. Smith CA, Want EJ, O’Maille G, Abagyan R, Siuzdak G. XCMS: processing mass spectrometry data

for metabolite profiling using nonlinear peak alignment, matching, and identification. Anal Chem

2006; 78: 779–787.

292. Ivanisevic J, Zhu Z-J, Plate L, Tautenhahn R, Chen S, O’Brien PJ, et al. Toward ’omic scale

metabolite profiling: a dual separation-mass spectrometry approach for coverage of lipid and

central carbon metabolism. Anal Chem 2013; 85: 6876–6884.

293. Smith CA, O’Maille G, Want EJ, Qin C, Trauger SA, Brandon TR, et al. METLIN: a metabolite mass

spectral database. Ther Drug Monit 2005; 27: 747–751.

294. Horai H, Arita M, Kanaya S, Nihei Y, Ikeda T, Suwa K, et al. MassBank: a public repository for

sharing mass spectral data for life sciences. J Mass Spectrom JMS 2010; 45: 703–714.

295. Xia J, Sinelnikov IV, Han B, Wishart DS. MetaboAnalyst 3.0--making metabolomics more

meaningful. Nucleic Acids Res 2015; 43: W251-257.

296. Ganter B, Zidek N, Hewitt PR, Müller D, Vladimirova A. Pathway analysis tools and toxicogenomics

reference databases for risk assessment. Pharmacogenomics 2008; 9: 35–54.

297. Tautenhahn R, Patti GJ, Rinehart D, Siuzdak G. XCMS Online: a web-based platform to process

untargeted metabolomic data. Anal Chem 2012; 84: 5035–5039.

298. Baumann S, Rockstroh M, Barthel J, Krumsiek J, Otto W, Jungnickel H, et al. Subtoxic

concentrations of benzo[a]pyrene induce metabolic changes and oxidative stress in non-activated

and affect the mTOR pathway in activated Jurkat T cells. J Integr OMICS 2014; 4.

299. Xie G, Zhang S, Zheng X, Jia W. Metabolomics approaches for characterizing metabolic

interactions between host and its commensal microbes. Electrophoresis 2013; 34: 2787–2798.

300. Khan KJ, Ullman TA, Ford AC, Abreu MT, Abadir A, Marshall JK, et al. Antibiotic therapy in

inflammatory bowel disease: a systematic review and meta-analysis. Am J Gastroenterol 2011;

106: 661–673.

301. Amiot A, Dona AC, Wijeyesekera A, Tournigand C, Baumgaertner I, Lebaleur Y, et al. ¹H NMR

Spectroscopy of Fecal Extracts Enables Detection of Advanced Colorectal Neoplasia. J Proteome

Res 2015; 14: 3871–3881.

302. Kortman GAM, Dutilh BE, Maathuis AJH, Engelke UF, Boekhorst J, Keegan KP, et al. Microbial

Metabolism Shifts Towards an Adverse Profile with Supplementary Iron in the TIM-2 In vitro

Model of the Human Colon. Front Microbiol 2016; 6.

303. Lamichhane S, Yde CC, Schmedes MS, Jensen HM, Meier S, Bertram HC. Strategy for Nuclear-

Magnetic-Resonance-Based Metabolomics of Human Feces. Anal Chem 2015; 87: 5930–5937.

304. Rubingh CM, Bijlsma S, Derks EPPA, Bobeldijk I, Verheij ER, Kochhar S, et al. Assessing the

performance of statistical validation tools for megavariate metabolomics data. Metabolomics Off

J Metabolomic Soc 2006; 2: 53–61.

305. Wirth H, von Bergen M, Binder H. Mining SOM expression portraits: feature selection and

integrating concepts of molecular function. BioData Min 2012; 5: 18.

306. Xia J, Wishart DS. MetPA: a web-based metabolomics tool for pathway analysis and visualization.

Bioinforma Oxf Engl 2010; 26: 2342–2344.

307. Zamboni N, Saghatelian A, Patti GJ. Defining the metabolome: size, flux, and regulation. Mol Cell

2015; 58: 699–706.

308. Berry D, Stecher B, Schintlmeister A, Reichert J, Brugiroux S, Wild B, et al. Host-compound

foraging by intestinal microbiota revealed by single-cell stable isotope probing. Proc Natl Acad Sci

U S A 2013; 110: 4720–4725.

309. Berry D, Mader E, Lee TK, Woebken D, Wang Y, Zhu D, et al. Tracking heavy water (D₂O)

incorporation for identifying and sorting active microbial cells. Proc Natl Acad Sci U S A 2015; 112:

E194-203.

310. Kleindienst S, Herbst F-A, Stagars M, von Netzer F, von Bergen M, Seifert J, et al. Diverse sulfate-

reducing bacteria of the Desulfosarcina/Desulfococcus clade are the key alkane degraders at

marine seeps. ISME J 2014; 8: 2029–2044.

311. Jehmlich N, Vogt C, Lünsmann V, Richnow HH, von Bergen M. Protein-SIP in environmental

studies. Curr Opin Biotechnol 2016; 41: 26–33.

312. Justice NB, Li Z, Wang Y, Spaulding SE, Mosier AC, Hettich RL, et al. ¹⁵N- and ²H proteomic

stable isotope probing links nitrogen flow to archaeal heterotrophic activity. Environ Microbiol

2014; 16: 3224–3237.

313. Busch R, Kim Y-K, Neese RA, Schade-Serin V, Collins M, Awada M, et al. Measurement of protein

turnover rates by heavy water labeling of nonessential amino acids. Biochim Biophys Acta 2006;

1760: 730–744.

314. Herbst F-A, Lünsmann V, Kjeldal H, Jehmlich N, Tholey A, von Bergen M, et al. Enhancing

metaproteomics--The value of models and defined environmental microbial systems. Proteomics

2016; 16: 783–798.

315. Bastida F, Jehmlich N, Lima K, Morris BEL, Richnow HH, Hernández T, et al. The ecological and

physiological responses of the microbial community from a semiarid soil to hydrocarbon

contamination and its bioremediation using compost amendment. J Proteomics 2016; 135: 162–

169.

316. Jehmlich N, Schmidt F, von Bergen M, Richnow H-H, Vogt C. Protein-based stable isotope probing

(Protein-SIP) reveals active species within anoxic mixed cultures. ISME J 2008; 2: 1122–1133.

317. Chen X, Wei S, Ji Y, Guo X, Yang F. Quantitative proteomics using SILAC: Principles, applications,

and developments. Proteomics 2015; 15: 3175–3192.

318. Quijada JV, Schmitt ND, Salisbury JP, Auclair JR, Agar JN. Heavy Sugar and Heavy Water Create

Tunable Intact Protein Mass Increases for Quantitative MS in any Feed and Organism. Anal Chem

2016; 88: 11139–11146.

319. Seifert J, Taubert M, Jehmlich N, Schmidt F, Völker U, Vogt C, et al. Protein-based stable isotope

probing (protein-SIP) in functional metaproteomics. Mass Spectrom Rev 2012; 31: 683–697.

320. Taubert M, Vogt C, Wubet T, Kleinsteuber S, Tarkka MT, Harms H, et al. Protein-SIP enables time-

resolved analysis of the carbon flux in a sulfate-reducing, benzene-degrading microbial

consortium. ISME J 2012; 6: 2291–2301.

321. Blazewicz SJ, Schwartz E. Dynamics of ¹⁸O incorporation from H₂¹⁸O into soil microbial DNA.

Microb Ecol 2011; 61: 911–916.

322. Schwartz E. Characterization of Growing Microorganisms in Soil by Stable Isotope Probing with

H₂¹⁸O. Appl Environ Microbiol 2007; 73: 2541–2546.

323. Schwartz E. Analyzing microorganisms in environmental samples using stable isotope probing

with H₂¹⁸O. Cold Spring Harb Protoc 2009; 2009: pdb.prot5341.

324. Rettedal EA, Brözel VS. Characterizing the diversity of active bacteria in soil by comprehensive

stable isotope probing of DNA and RNA with H₂¹⁸O. MicrobiologyOpen 2015; 4: 208–219.

325. Angel R, Conrad R. Elucidating the microbial resuscitation cascade in biological soil crusts

following a simulated rain event. Environ Microbiol 2013; 15: 2799–2815.

326. Xu Z, Knight R. Dietary effects on human gut microbiome diversity. Br J Nutr 2015; 113: S1–S5.

327. Stewart II, Thomson T, Figeys D. ¹⁸O Labeling: a tool for proteomics. Rapid Commun Mass

Spectrom 2001; 15: 2456–2465.

328. Marzorati M, Vilchez-Vargas R, Vanden Bussche J, Truchado P, Jauregui R, El Hage RA, et al. High-fiber

and high-protein diets shape different gut microbial communities, which ecologically behave

similarly under stress conditions, as shown in a gastrointestinal simulator. Mol Nutr Food Res

2017; 61.

329. Laemmli UK. Cleavage of Structural Proteins during the Assembly of the Head of Bacteriophage

T4. Nature 1970; 227: 680–685.

330. Jehmlich N, Schmidt F, Hartwich M, von Bergen M, Richnow H-H, Vogt C. Incorporation of carbon

and nitrogen atoms into proteins measured by protein-based stable isotope probing (Protein-

SIP). Rapid Commun Mass Spectrom RCM 2008; 22: 2889–2897.

331. Caporaso JG, Lauber CL, Walters WA, Berg-Lyons D, Lozupone CA, Turnbaugh PJ, et al. Global

patterns of 16S rRNA diversity at a depth of millions of sequences per sample. Proc Natl Acad Sci

U S A 2011; 108 Suppl 1: 4516–4522.

332. Kozich JJ, Westcott SL, Baxter NT, Highlander SK, Schloss PD. Development of a dual-index

sequencing strategy and curation pipeline for analyzing amplicon sequence data on the MiSeq

Illumina sequencing platform. Appl Environ Microbiol 2013; 79: 5112–5120.

333. Englander SW, Kallenbach NR. Hydrogen exchange and structural dynamics of proteins and

nucleic acids. Q Rev Biophys 1983; 16: 521–655.

334. Price MN, Zane GM, Kuehl JV, Melnyk RA, Wall JD, Deutschbauer AM, et al. Filling gaps in

bacterial amino acid biosynthesis pathways with high-throughput genetics. PLOS Genet 2018; 14:

e1007147.

335. Gao X, Pujos-Guillot E, Martin J-F, Galan P, Juste C, Jia W, et al. Metabolite analysis of human

fecal water by gas chromatography/mass spectrometry with ethyl chloroformate derivatization.

Anal Biochem 2009; 393: 163–175.

336. Neumann-Schaal M, Hofmann JD, Will SE, Schomburg D. Time-resolved amino acid uptake of

Clostridium difficile 630Δerm and concomitant fermentation product and toxin formation. BMC

Microbiol 2015; 15.

337. Verberkmoes NC, Russell AL, Shah M, Godzik A, Rosenquist M, Halfvarson J, et al. Shotgun

metaproteomics of the human distal gut microbiota. ISME J 2009; 3: 179–189.

338. Tanca A, Abbondio M, Palomba A, Fraumene C, Manghina V, Cucca F, et al. Potential and active

functions in the gut microbiota of a healthy human cohort. Microbiome 2017; 5.

339. Kolmeder CA, de Been M, Nikkilä J, Ritamo I, Mättö J, Valmu L, et al. Comparative

Metaproteomics and Diversity Analysis of Human Intestinal Microbiota Testifies for Its Temporal

Stability and Expression of Core Functions. PLOS ONE 2012; 7: e29913.

340. Hugenholtz P, Hooper SD, Kyrpides NC. Focus: Synergistetes. Environ Microbiol 2009; 11: 1327–

1329.

341. Lau CH-F, Hughes D, Poole K. MexY-promoted aminoglycoside resistance in Pseudomonas

aeruginosa: involvement of a putative proximal binding pocket in aminoglycoside recognition.

mBio 2014; 5: e01068.

342. Rogers TE, Pudlo NA, Koropatkin NM, Bell JSK, Balasch MM, Jasker K, et al. Dynamic responses of

Bacteroides thetaiotaomicron during growth on glycan mixtures. Mol Microbiol 2013; 88: 876–

890.

343. Ze X, Duncan SH, Louis P, Flint HJ. Ruminococcus bromii is a keystone species for the degradation

of resistant starch in the human colon. ISME J 2012; 6: 1535–1543.

344. Halmos EP, Christophersen CT, Bird AR, Shepherd SJ, Gibson PR, Muir JG. Diets that differ in their

FODMAP content alter the colonic luminal microenvironment. Gut 2015; 64: 93–100.

345. Everard A, Lazarevic V, Derrien M, Girard M, Muccioli GG, Neyrinck AM, et al. Responses of gut

microbiota and glucose and lipid metabolism to prebiotics in genetic obese and diet-induced

leptin-resistant mice. Diabetes 2011; 60: 2775–2786.

346. Duncan SH, Belenguer A, Holtrop G, Johnstone AM, Flint HJ, Lobley GE. Reduced Dietary Intake of

Carbohydrates by Obese Subjects Results in Decreased Concentrations of Butyrate and Butyrate-

Producing Bacteria in Feces. Appl Environ Microbiol 2007; 73: 1073–1078.

347. Gonze D, Lahti L, Raes J, Faust K. Multi-stability and the origin of microbial community types.

ISME J 2017; 11: 2159–2166.

348. Großkopf T, Soyer OS. Synthetic microbial communities. Curr Opin Microbiol 2014; 18: 72–77.

349. Zhou J, Ning D. Stochastic Community Assembly: Does It Matter in Microbial Ecology? Microbiol

Mol Biol Rev 2017; 81: e00002-17.

350. Pagaling E, Strathdee F, Spears BM, Cates ME, Allen RJ, Free A. Community history affects the

predictability of microbial ecosystem development. ISME J 2014; 8: 19–30.

351. Graham DW, Knapp CW, Van Vleck ES, Bloor K, Lane TB, Graham CE. Experimental demonstration

of chaotic instability in biological nitrification. ISME J 2007; 1: 385–393.

352. Zhou J, Liu W, Deng Y, Jiang Y-H, Xue K, He Z, et al. Stochastic Assembly Leads to Alternative

Communities with Distinct Functions in a Bioreactor Microbial Community. mBio 2013; 4:

e00584-12.

353. Kohrs F, Heyer R, Bissinger T, Kottler R, Schallert K, Püttker S, et al. Proteotyping of laboratory-

scale biogas plants reveals multiple steady-states in community composition. Anaerobe 2017; 46:

56–68.

354. van der Gast CJ, Ager D, Lilley AK. Temporal scaling of bacterial taxa is influenced by both stochastic

and deterministic ecological factors. Environ Microbiol 2008; 10: 1411–1418.

355. Flint HJ, Scott KP, Duncan SH, Louis P, Forano E. Microbial degradation of complex carbohydrates

in the gut. Gut Microbes 2012; 3: 289–306.

356. Round JL, Mazmanian SK. The gut microbiota shapes intestinal immune responses during health

and disease. Nat Rev Immunol 2009; 9: 313–323.

357. Cho JA, Chinnapen DJF. Targeting friend and foe: Emerging therapeutics in the age of gut

microbiome and disease. J Microbiol Seoul Korea 2018; 56: 183–188.

358. Daliri EB-M, Tango CN, Lee BH, Oh D-H. Human microbiome restoration and safety. Int J Med

Microbiol IJMM 2018.

359. Quraishi MN, Widlak M, Bhala N, Moore D, Price M, Sharma N, et al. Systematic review with

meta-analysis: the efficacy of faecal microbiota transplantation for the treatment of recurrent

and refractory Clostridium difficile infection. Aliment Pharmacol Ther 2017; 46: 479–493.

360. Cao Y, Zhang B, Wu Y, Wang Q, Wang J, Shen F. The Value of Fecal Microbiota Transplantation in

the Treatment of Ulcerative Colitis Patients: A Systematic Review and Meta-Analysis.

Gastroenterol Res Pract 2018; 2018.

361. Sun J, Marwah G, Westgarth M, Buys N, Ellwood D, Gray PH. Effects of Probiotics on Necrotizing

Enterocolitis, Sepsis, Intraventricular Hemorrhage, Mortality, Length of Hospital Stay, and Weight

Gain in Very Preterm Infants: A Meta-Analysis. Adv Nutr 2017; 8: 749–763.

362. Liu L, Firrman J, Tanes C, Bittinger K, Thomas-Gahring A, Wu GD, et al. Establishing a mucosal gut

microbial community in vitro using an artificial simulator. PLOS ONE 2018; 13: e0197692.

363. Quast C, Pruesse E, Yilmaz P, Gerken J, Schweer T, Yarza P, et al. The SILVA ribosomal RNA gene

database project: improved data processing and web-based tools. Nucleic Acids Res 2013; 41:

D590–D596.

364. Eriksson L, Byrne T, Johansson E, Trygg J, Vikström C. Multi- and Megavariate Data Analysis Basic

Principles and Applications. 2013. Umetrics Academy.

365. Tenenhaus M. La régression PLS: théorie et pratique. 1998. Editions TECHNIP.

366. Szymańska E, Saccenti E, Smilde AK, Westerhuis JA. Double-check: validation of diagnostic

statistics for PLS-DA models in metabolomics studies. Metabolomics Off J Metabolomic Soc 2012;

8: 3–16.

367. Konietschke F, Bathke AC, Harrar SW, Pauly M. Parametric and nonparametric bootstrap methods

for general MANOVA. J Multivar Anal 2015; 140: 291–301.

368. Friedrich S, Pauly M. MATS: Inference for potentially singular and heteroscedastic MANOVA. J

Multivar Anal 2018; 165: 166–179.

369. Goodpaster AM, Kennedy MA. Quantification and statistical significance analysis of group

separation in NMR-based metabonomics studies. Chemom Intell Lab Syst

2011; 109: 162–170.

370. Anderson MJ. A new method for non-parametric multivariate analysis of variance. Austral Ecol

2001; 26: 32–46.

371. Kaufman L, Rousseeuw PJ. Finding Groups in Data: An Introduction to Cluster Analysis. 2005. John

Wiley & Sons, Inc., New York, NY, USA.

372. Kunert J, Götz T, Schach S. Mathematical Statistics with Applications to Biometry: Festschrift in

Honor of Prof. Dr. Siegfried Schach. 2001. Josef Eul Verlag, Cologne, Germany.

373. Efron B, Tibshirani RJ. An Introduction to the Bootstrap. 1994. Chapman and Hall/CRC, London, UK.

374. Wold S, Sjöström M, Eriksson L. PLS-regression: a basic tool of chemometrics. Chemom Intell Lab

Syst 2001; 58: 109–130.

375. Gavina MKA, Tahara T, Tainaka K, Ito H, Morita S, Ichinose G, et al. Multi-species coexistence in

Lotka-Volterra competitive systems with crowding effects. Sci Rep 2018; 8: 1198.

376. Caruso T, Chan Y, Lacap DC, Lau MCY, McKay CP, Pointing SB. Stochastic and deterministic

processes interact in the assembly of desert microbial communities on a global scale. ISME J

2011; 5: 1406–1413.

377. Dumbrell AJ, Nelson M, Helgason T, Dytham C, Fitter AH. Relative roles of niche and neutral

processes in structuring a soil microbial community. ISME J 2010; 4: 337–345.

378. Weingarden AR, Vaughn BP. Intestinal microbiota, fecal microbiota transplantation, and

inflammatory bowel disease. Gut Microbes 2017; 8: 238.

379. Worley B, Powers R. PCA as a practical indicator of OPLS-DA model reliability. Curr Metabolomics

2016; 4: 97–103.

380. Westerhuis JA, Hoefsloot HCJ, Smit S, Vis DJ, Smilde AK, van Velzen EJJ, et al. Assessment of

PLSDA cross validation. Metabolomics 2008; 4: 81–89.

381. Brereton RG, Lloyd GR. Partial least squares discriminant analysis: taking the magic away. J

Chemom 2014; 28: 213–225.

382. Mukhopadhya I, Moraïs S, Laverde‐Gomez J, Sheridan PO, Walker AW, Kelly W, et al. Sporulation

capability and amylosome conservation among diverse human colonic and rumen isolates of the

keystone starch‐degrader Ruminococcus bromii. Environ Microbiol 2018; 20: 324–336.

383. Lee K-A, Kim S-H, Kim E-K, Ha E-M, You H, Kim B, et al. Bacterial-derived uracil as a modulator of

mucosal immunity and gut-microbe homeostasis in Drosophila. Cell 2013; 153: 797–811.

384. Hong Y-S, Ahn Y-T, Park J-C, Lee J-H, Lee H, Huh C-S, et al. 1H NMR-based metabonomic

assessment of probiotic effects in a colitis mouse model. Arch Pharm Res 2010; 33: 1091–1101.

385. Rinas U, Hellmuth K, Kang R, Seeger A, Schlieker H. Entry of Escherichia coli into stationary phase

is indicated by endogenous and exogenous accumulation of nucleobases. Appl Environ Microbiol

1995; 61: 4147–4151.

386. Ueda A, Attila C, Whiteley M, Wood TK. Uracil influences quorum sensing and biofilm formation

in Pseudomonas aeruginosa and fluorouracil is an antagonist. Microb Biotechnol 2009; 2: 62–74.

387. Russell WR, Gratz SW, Duncan SH, Holtrop G, Ince J, Scobbie L, et al. High-protein, reduced-

carbohydrate weight-loss diets promote metabolite profiles likely to be detrimental to colonic

health. Am J Clin Nutr 2011; 93: 1062–1072.

388. Scott KP, Martin JC, Duncan SH, Flint HJ. Prebiotic stimulation of human colonic butyrate-

producing bacteria and bifidobacteria, in vitro. FEMS Microbiol Ecol 2014; 87: 30–40.

389. Rosero JA, Killer J, Sechovcová H, Mrázek J, Benada O, Fliegerová K, et al. Reclassification of

Eubacterium rectale (Hauduroy et al. 1937) Prévot 1938 in a new genus Agathobacter gen. nov.

as Agathobacter rectalis comb. nov., and description of Agathobacter ruminis sp. nov., isolated

from the rumen contents of sheep and cows. Int J Syst Evol Microbiol 2016; 66: 768–773.

390. Lozupone C, Faust K, Raes J, Faith JJ, Frank DN, Zaneveld J, et al. Identifying genomic and

metabolic features that can underlie early successional and opportunistic lifestyles of human gut

symbionts. Genome Res 2012; 22: 1974–1984.

391. Cockburn DW, Orlovsky NI, Foley MH, Kwiatkowski KJ, Bahr CM, Maynard M, et al. Molecular

details of a starch utilization pathway in the human gut symbiont Eubacterium rectale. Mol

Microbiol 2015; 95: 209–230.

392. Cockburn DW, Suh C, Medina KP, Duvall RM, Wawrzak Z, Henrissat B, et al. Novel carbohydrate

binding modules in the surface anchored α-amylase of Eubacterium rectale provide a molecular

rationale for the range of starches used by this organism in the human gut. Mol Microbiol 2018;

107: 249–264.

393. Rivière A, Gagnon M, Weckx S, Roy D, De Vuyst L. Mutual Cross-Feeding Interactions between

Bifidobacterium longum subsp. longum NCC2705 and Eubacterium rectale ATCC 33656 Explain

the Bifidogenic and Butyrogenic Effects of Arabinoxylan Oligosaccharides. Appl Environ Microbiol

2015; 81: 7767–7781.

394. Shoaie S, Karlsson F, Mardinoglu A, Nookaew I, Bordel S, Nielsen J. Understanding the

interactions between bacteria in the human gut through metabolic modeling. Sci Rep 2013; 3:

2532.

395. Tuncil YE, Xiao Y, Porter NT, Reuhs BL, Martens EC, Hamaker BR. Reciprocal Prioritization to

Dietary Glycans by Gut Bacteria in a Competitive Environment Promotes Stable Coexistence.

mBio 2017; 8.

396. Kerényi Á, Bihary D, Venturi V, Pongor S. Stability of Multispecies Bacterial Communities:

Signaling Networks May Stabilize Microbiomes. PLOS ONE 2013; 8: e57947.

397. Sakamoto M, Iino T, Ohkuma M. Faecalimonas umbilicata gen. nov., sp. nov., isolated from

human faeces, and reclassification of Eubacterium contortum, Eubacterium fissicatena and

Clostridium oroticum as Faecalicatena contorta gen. nov., comb. nov., Faecalicatena fissicatena

comb. nov. and Faecalicatena orotica comb. nov. Int J Syst Evol Microbiol 2017; 67: 1219–1227.

398. Taylor MM. Eubacterium fissicatena sp.nov. isolated from the alimentary tract of the goat. J Gen

Microbiol 1972; 71: 457–463.

399. Louis P, Flint HJ. Diversity, metabolism and microbial ecology of butyrate-producing bacteria from

the human large intestine. FEMS Microbiol Lett 2009; 294: 1–8.

400. Saito Y, Sato T, Nomoto K, Tsuji H. Identification of phenol- and p-cresol-producing intestinal

bacteria by using media supplemented with tyrosine and its metabolites. FEMS Microbiol Ecol

2018; 94.

401. Zhao Y, Wu J, Li JV, Zhou N-Y, Tang H, Wang Y. Gut microbiota composition modifies fecal

metabolic profiles in mice. J Proteome Res 2013; 12: 2987–2999.

402. Yap IKS, Li JV, Saric J, Martin F-P, Davies H, Wang Y, et al. Metabonomic and microbiological

analysis of the dynamic effect of vancomycin-induced gut microbiota modification in the mouse. J

Proteome Res 2008; 7: 3718–3728.

403. Pedruski MT, Fussmann GF, Gonzalez A. Predicting the outcome of competition when fitness

inequality is variable. R Soc Open Sci 2015; 2.

404. Chase JM. Drought mediates the importance of stochastic community assembly. Proc Natl Acad

Sci 2007; 104: 17430–17434.

405. Collado MC, Derrien M, Isolauri E, de Vos WM, Salminen S. Intestinal Integrity and Akkermansia

muciniphila, a Mucin-Degrading Member of the Intestinal Microbiota Present in Infants, Adults,

and the Elderly. Appl Environ Microbiol 2007; 73: 7767–7770.

406. Wu F, Guo X, Zhang J, Zhang M, Ou Z, Peng Y. Phascolarctobacterium faecium abundant

colonization in human gastrointestinal tract. Exp Ther Med 2017; 14: 3122–3126.

407. Watanabe Y, Nagai F, Morotomi M. Characterization of Phascolarctobacterium succinatutens sp.

nov., an Asaccharolytic, Succinate-Utilizing Bacterium Isolated from Human Feces. Appl Environ

Microbiol 2012; 78: 511–518.

408. Del Dot T, Osawa R, Stackebrandt E. Phascolarctobacterium faecium gen. nov, spec. nov., a Novel

Taxon of the Sporomusa Group of Bacteria. Syst Appl Microbiol 1993; 16: 380–384.

409. Goodwin S, McPherson JD, McCombie WR. Coming of age: ten years of next-generation

sequencing technologies. Nat Rev Genet 2016; 17: 333–351.

410. Gionchetti P, Rizzello F, Ferrieri A, Venturi A, Brignola C, Ferretti M, et al. Rifaximin in patients

with moderate or severe ulcerative colitis refractory to steroid-treatment: a double-blind,

placebo-controlled trial. Dig Dis Sci 1999; 44: 1220–1221.

411. Shen B, Remzi FH, Lopez AR, Queener E. Rifaximin for maintenance therapy in antibiotic-

dependent pouchitis. BMC Gastroenterol 2008; 8: 26.

412. Guslandi M, Petrone MC, Testoni PA. Rifaximin for active ulcerative colitis. Inflamm Bowel Dis

2006; 12: 335.

413. Guslandi M, Giollo P, Testoni PA. Corticosteroid-sparing effect of rifaximin, a nonabsorbable oral

antibiotic, in active ulcerative colitis: Preliminary clinical experience. Curr Ther Res Clin Exp 2004;

65: 292–296.

414. Li SS, Zhu A, Benes V, Costea PI, Hercog R, Hildebrand F, et al. Durable coexistence of donor and

recipient strains after fecal microbiota transplantation. Science 2016; 352: 586–589.

415. Mizuno S, Masaoka T, Naganuma M, Kishimoto T, Kitazawa M, Kurokawa S, et al.

Bifidobacterium-Rich Fecal Donor May Be a Positive Predictor for Successful Fecal Microbiota

Transplantation in Patients with Irritable Bowel Syndrome. Digestion 2017; 96: 29–38.

416. Kappelman MD, Moore KR, Allen JK, Cook SF. Recent Trends in the Prevalence of Crohn’s Disease

and Ulcerative Colitis in a Commercially Insured US Population. Dig Dis Sci 2013; 58: 519.

417. Keshteli AH, Millan B, Madsen KL. Pretreatment with antibiotics may enhance the efficacy of fecal

microbiota transplantation in ulcerative colitis: a meta-analysis. Mucosal Immunol 2017; 10: 565–566.

418. Kornbluth A, Sachar DB, Practice Parameters Committee of the American College of

Gastroenterology. Ulcerative colitis practice guidelines in adults: American College Of

Gastroenterology, Practice Parameters Committee. Am J Gastroenterol 2010; 105: 501–523; quiz

524.

419. Mowat C, Cole A, Windsor A, Ahmad T, Arnott I, Driscoll R, et al. Guidelines for the management

of inflammatory bowel disease in adults. Gut 2011; 60: 571–607.

420. Dignass A, Lindsay JO, Sturm A, Windsor A, Colombel J-F, Allez M, et al. Second European

evidence-based consensus on the diagnosis and management of ulcerative colitis part 2: current

management. J Crohns Colitis 2012; 6: 991–1030.

421. Huhulescu S, Sagel U, Fiedler A, Pecavar V, Blaschitz M, Wewalka G, et al. Rifaximin disc diffusion

test for in vitro susceptibility testing of Clostridium difficile. J Med Microbiol 2011; 60: 1206–1212.

422. Taylor DN, McKenzie R, Durbin A, Carpenter C, Haake R, Bourgeois AL. Systemic Pharmacokinetics

of Rifaximin in Volunteers with Shigellosis. Antimicrob Agents Chemother 2008; 52: 1179–1181.

423. Gloor GB, Macklaim JM, Fernandes AD. Displaying Variation in Large Datasets: Plotting a Visual

Summary of Effect Sizes. J Comput Graph Stat 2016; 25: 971–979.

424. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, et al. SPAdes: A New

Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing. J Comput Biol 2012;

19: 455–477.

425. Seemann T. Prokka: rapid prokaryotic genome annotation. Bioinformatics 2014; 30: 2068–2069.

426. Yamada T, Letunic I, Okuda S, Kanehisa M, Bork P. iPath2.0: interactive pathway explorer. Nucleic

Acids Res 2011; 39: W412–W415.

427. Maccaferri S, Vitali B, Klinder A, Kolida S, Ndagijimana M, Laghi L, et al. Rifaximin modulates the

colonic microbiota of patients with Crohn’s disease: an in vitro approach using a continuous

culture colonic model system. J Antimicrob Chemother 2010; 65: 2556–2565.

428. Soldi S, Vasileiadis S, Uggeri F, Campanale M, Morelli L, Fogli MV, et al. Modulation of the gut

microbiota composition by rifaximin in non-constipated irritable bowel syndrome patients: a

molecular approach. Clin Exp Gastroenterol 2015; 8: 309–325.

429. Ponziani FR, Scaldaferri F, Petito V, Paroni Sterbini F, Pecere S, Lopetuso LR, et al. The Role of

Antibiotics in Gut Microbiota Modulation: The Eubiotic Effects of Rifaximin. Dig Dis

2016; 34: 269–278.

430. Kang DJ, Kakiyama G, Betrapally NS, Herzog J, Nittono H, Hylemon PB, et al. Rifaximin Exerts

Beneficial Effects Independent of its Ability to Alter Microbiota Composition. Clin Transl

Gastroenterol 2016; 7: e187.

431. Jiang Z-D, Ke S, Dupont HL. Rifaximin-induced alteration of virulence of diarrhoea-producing

Escherichia coli and Shigella sonnei. Int J Antimicrob Agents 2010; 35: 278–281.

432. Vitali B, Perna F, Lammers K, Turroni S, Gionchetti P, Brigidi P. Immunoregulatory activity of

rifaximin associated with a resistant mutant of Bifidobacterium infantis. Int J Antimicrob Agents

2009; 33: 387–389.

433. Cianci R, Iacopini F, Petruzziello L, Cammarota G, Pandolfi F, Costamagna G. Involvement of

central immunity in uncomplicated diverticular disease. Scand J Gastroenterol 2009; 44: 108–115.

434. Moayyedi P, Surette MG, Kim PT, Libertucci J, Wolfe M, Onischi C, et al. Fecal Microbiota

Transplantation Induces Remission in Patients With Active Ulcerative Colitis in a Randomized

Controlled Trial. Gastroenterology 2015; 149: 102-109.e6.

435. Kump P, Wurm P, Gröchenig HP, Wenzl H, Petritsch W, Halwachs B, et al. The taxonomic

composition of the donor intestinal microbiota is a major factor influencing the efficacy of faecal

microbiota transplantation in therapy refractory ulcerative colitis. Aliment Pharmacol Ther 2018;

47: 67–77.

436. Angelberger S, Reinisch W, Makristathis A, Lichtenberger C, Dejaco C, Papay P, et al. Temporal

bacterial community dynamics vary among ulcerative colitis patients after fecal microbiota

transplantation. Am J Gastroenterol 2013; 108: 1620–1630.

437. Rossen NG, Fuentes S, van der Spek MJ, Tijssen JG, Hartman JHA, Duflou A, et al. Findings From a

Randomized Controlled Trial of Fecal Transplantation for Patients With Ulcerative Colitis.

Gastroenterology 2015; 149: 110-118.e4.

438. Nishida A, Imaeda H, Ohno M, Inatomi O, Bamba S, Sugimoto M, et al. Efficacy and safety of

single fecal microbiota transplantation for Japanese patients with mild to moderately active

ulcerative colitis. J Gastroenterol 2017; 52: 476–482.

439. Vermeire S, Joossens M, Verbeke K, Wang J, Machiels K, Sabino J, et al. Donor Species Richness

Determines Faecal Microbiota Transplantation Success in Inflammatory Bowel Disease. J Crohns

Colitis 2016; 10: 387–394.

440. Paramsothy S, Kamm MA, Kaakoush NO, Walsh AJ, van den Bogaerde J, Samuel D, et al.

Multidonor intensive faecal microbiota transplantation for active ulcerative colitis: a randomised

placebo-controlled trial. Lancet 2017; 389: 1218–1228.

441. Fuentes S, Rossen NG, van der Spek MJ, Hartman JH, Huuskonen L, Korpela K, et al. Microbial

shifts and signatures of long-term remission in ulcerative colitis after faecal microbiota

transplantation. ISME J 2017; 11: 1877–1889.

442. Mintz M, Khair S, Grewal S, LaComb JF, Park J, Channer B, et al. Longitudinal microbiome analysis

of single donor fecal microbiota transplantation in patients with recurrent Clostridium difficile

infection and/or ulcerative colitis. PLOS ONE 2018; 13: e0190997.

443. Gralka E, Luchinat C, Tenori L, Ernst B, Thurnheer M, Schultes B. Metabolomic fingerprint of

severe obesity is dynamically affected by bariatric surgery in a procedure-dependent manner. Am

J Clin Nutr 2015; 102: 1313–1322.

444. Garner CE, Smith S, de Lacy Costello B, White P, Spencer R, Probert CSJ, et al. Volatile organic

compounds from feces and their potential for diagnosis of gastrointestinal disease. FASEB J 2007; 21: 1675–1688.

445. Pallister T, Jackson MA, Martin TC, Glastonbury CA, Jennings A, Beaumont M, et al. Untangling

the relationship between diet and visceral fat mass through blood metabolomics and gut

microbiome profiling. Int J Obes 2017; 41: 1106–1113.

446. Moore SC, Matthews CE, Sampson JN, Stolzenberg-Solomon RZ, Zheng W, Cai Q, et al. Human

metabolic correlates of body mass index. Metabolomics 2014; 10: 259–269.

447. Velasquez MT, Ramezani A, Manal A, Raj DS. Trimethylamine N-Oxide: The Good, the Bad and the

Unknown. Toxins 2016; 8.

448. Heianza Y, Ma W, Manson JE, Rexrode KM, Qi L. Gut Microbiota Metabolites and Risk of Major

Adverse Cardiovascular Disease Events and Death: A Systematic Review and Meta-Analysis of

Prospective Studies. J Am Heart Assoc 2017; 6.

449. Lee STM, Kahn SA, Delmont TO, Shaiber A, Esen ÖC, Hubert NA, et al. Tracking microbial

colonization in fecal microbiota transplantation experiments via genome-resolved metagenomics.

Microbiome 2017; 5.

450. Smillie CS, Sauk J, Gevers D, Friedman J, Sung J, Youngster I, et al. Strain Tracking Reveals the

Determinants of Bacterial Engraftment in the Human Gut Following Fecal Microbiota

Transplantation. Cell Host Microbe 2018; 23: 229-240.e5.

451. Human Microbiome Project Consortium. Structure, Function and Diversity of the Healthy Human Microbiome. Nature 2012; 486: 207–214.

452. Wei Y, Gong J, Zhu W, Tian H, Ding C, Gu L, et al. Pectin enhances the effect of fecal microbiota

transplantation in ulcerative colitis by delaying the loss of diversity of gut flora. BMC Microbiol

2016; 16.

453. Quinn TP, Richardson MF, Lovell D, Crowley TM. propr: An R-package for Identifying

Proportionally Abundant Features Using Compositional Data Analysis. Sci Rep 2017; 7: 16252.

454. Alnouti Y, Csanaky IL, Klaassen CD. Quantitative-profiling of bile acids and their conjugates in

mouse liver, bile, plasma, and urine using LC–MS/MS. J Chromatogr B 2008; 873: 209–217.

455. Sánchez-Patán F, Monagas M, Moreno-Arribas MV, Bartolomé B. Determination of Microbial

Phenolic Acids in Human Faeces by UPLC-ESI-TQ MS. J Agric Food Chem 2011; 59: 2241–2247.

456. Matysik S, Le Roy CI, Liebisch G, Claus SP. Metabolomics of fecal samples: A practical

consideration. Trends Food Sci Technol 2016; 57: 244–255.

457. Human Microbiome Jumpstart Reference Strains Consortium, Nelson KE, Weinstock GM,

Highlander SK, Worley KC, Creasy HH, et al. A catalog of reference genomes from the human

microbiome. Science 2010; 328: 994–999.

Appendix A. Supplementary Figures

Figure S1. The abiotic HD-exchange in proteins. Unlabeled Angiotensin-II was incubated in 99% deuterated water for one day, dissolved in elution buffer and directly injected into MS/MS. The spectrum was recorded at different time points over eight minutes. The natural abundance peak [M+H+] is highlighted.

Figure S2. The activity of Escherichia coli K12 incubated in cold and warm temperatures. After reaching stationary phase, cultures of E. coli were transferred to 4 °C or 37 °C and dosed with 25% or 50% D₂O or H₂¹⁸O to validate activity. The number of labeled peptides detected is indicated in each boxplot. Median, lower and upper quartiles, lower and upper whiskers, 5th and 95th percentiles and the detection limit are shown in the boxplots (n=3).

Figure S3. Relative isotope abundances (RIA) of peptides extracted from a defined microbial community derived from a human fecal sample after incubation with 25% D₂O or H₂¹⁸O. The detected number of labeled peptides is indicated for each boxplot (a). RIA of peptides extracted from Escherichia coli and the defined microbial community grown in both media (HF – high fiber, HP – high protein) when incubated with 25% D₂O or H₂¹⁸O in the stationary phase (b). Median, lower and upper quartiles, lower and upper whiskers, 5th and 95th percentiles, and the detection limit are shown in the boxplots (n=3). Standard deviation of the average is shown in the bar charts (n=3).

Figure S4. Relative isotope abundance (RIA) of peptides from the top 20 genera (>90% relative abundance) in a defined microbial community derived from a human fecal sample incubated with high fiber (HF) or high protein (HP) medium and 25% D₂O or 25% H₂¹⁸O. Standard deviation of the average is shown in the bar charts (n=3).

Figure S5. The expected amount of time it would take for the bioreactors to shed the excess 2000 mg/mL of fiber when switched from a high fiber medium formulation to a high protein medium formulation, as determined in R software version 3.5 by the package deSolve version 1.21. The plot was drawn by the package ggplot2 version 2.2.1.
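The caption does not state the model behind this estimate. As a minimal sketch, assuming an ideal well-mixed vessel in which the excess substrate is only diluted (not consumed), the excess follows first-order washout, C(t) = C0·exp(−D·t). The dilution rate D = 1 per day is an assumption taken from the 400 mL vessel and 400 mL/d feed rate quoted in Figure S8, and the 1% threshold is purely illustrative. The sketch is written in Python, not the R/deSolve code used for the figure.

```python
import math

def washout_concentration(c0, dilution_rate, t):
    """Excess concentration left after time t under ideal first-order washout."""
    return c0 * math.exp(-dilution_rate * t)

def time_to_fraction(dilution_rate, fraction):
    """Time needed for the excess to fall to `fraction` of its starting value."""
    return -math.log(fraction) / dilution_rate

# Assumed dilution rate: feed rate / vessel volume = 400 mL/d / 400 mL = 1 per day.
D = 400.0 / 400.0

# Illustrative threshold (not from the thesis): 1% of the excess remaining.
days_to_1_percent = time_to_fraction(D, 0.01)
print(round(days_to_1_percent, 2))
```

Under these assumptions the time to reach a given residual fraction depends only on D, not on the starting amount, which is why the threshold is expressed as a fraction.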

Figure S6. The statistically significant difference (ANOVA p-value: 0.008) between the β-dispersion of Euclidean distance matrix transformed, centre-log ratio normalized 16S rRNA marker gene sequencing count data as determined in R software version 3.5 by the package vegan version 2.5.2.
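The quantities behind this figure can be illustrated with a short Python sketch: a centre-log ratio transform of each sample's counts, followed by each sample's Euclidean distance to its group centroid, which is the per-sample dispersion that an ANOVA then compares between groups. This is not the vegan code used in the thesis. The pseudocount of 0.5 for zero counts is an assumption (the caption does not say how zeros were handled), and the sketch uses the group centroid, whereas vegan's betadisper can also use the spatial median.

```python
import math

def clr(counts, pseudocount=0.5):
    """Centre-log ratio transform of one sample; the pseudocount is an assumption."""
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def centroid(rows):
    """Coordinate-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def distances_to_centroid(rows):
    """Euclidean distance of every sample to the centroid of its group."""
    c = centroid(rows)
    return [math.dist(r, c) for r in rows]

def mean_dispersion(count_table):
    """Average distance to centroid for one group of raw count vectors."""
    transformed = [clr(sample) for sample in count_table]
    d = distances_to_centroid(transformed)
    return sum(d) / len(d)

# Hypothetical counts for two groups of three samples (illustration only).
group_a = [[120, 30, 5], [110, 35, 4], [130, 28, 6]]
group_b = [[200, 3, 40], [20, 90, 1], [75, 75, 10]]
print(mean_dispersion(group_a) < mean_dispersion(group_b))
```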

Figure S7. Partial least squares discriminant analysis of compositional and metabonomic data. Microbial community compositional data were generated from gDNA extracted from bioreactor samples and profiled by Illumina sequencing of the 16S rRNA gene, with subsequent processing and centre-log ratio transformation. Metabolite data were measured by 1H-NMR from 0.2 μm filtered bioreactor samples. The bioreactors were seeded with a defined microbial community representing an ulcerative colitis patient fecal sample. The before treatment (Before), after antibiotic perturbation (Abx), and after microbial therapeutic replenishment with (Abx-MET) and without (MET) prior antibiotic usage groupings are represented.

Figure S8. Calculated rifaximin concentration in the bioreactors over time. The dosing regimen of 100 mg every 12 hr into 400 mL vessels, with a medium feed rate equivalent to 400 mL/d, began on day 21 of the run and ended on day 25. The time points at which the Microbial Ecosystem Therapeutic was pulsed into the bioreactors, days 27 and 29, are indicated by the dotted lines. The maximum concentration of rifaximin reached was 672.86 μg/mL, and the amounts retained on delivery days peaked at 156.52 μg/mL and 21.18 μg/mL respectively.
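A calculation of this kind can be sketched in Python as a sum of instantaneous boluses, each washed out at the dilution rate D = feed rate / vessel volume. This is a simplified illustration, not the model used for the figure. Instantaneous mixing, no degradation or uptake, and a final dose at exactly day 25.0 are all assumptions, so the absolute values need not match the reported maximum. The ratio between the day 29 and day 27 concentrations, exp(−2) ≈ 0.135, does agree with the reported 21.18/156.52.

```python
import math

def rifaximin_concentration(t_day, dose_mg=100.0, volume_ml=400.0,
                            feed_ml_per_day=400.0, first_dose_day=21.0,
                            last_dose_day=25.0, interval_day=0.5):
    """Concentration (ug/mL) at t_day from superposed boluses under washout."""
    dilution = feed_ml_per_day / volume_ml        # per day
    bolus = dose_mg * 1000.0 / volume_ml          # ug/mL added by each dose
    # Assumption: doses at first_dose_day, then every interval, through last_dose_day.
    n_doses = int(round((last_dose_day - first_dose_day) / interval_day)) + 1
    total = 0.0
    for i in range(n_doses):
        t_dose = first_dose_day + i * interval_day
        if t_day >= t_dose:
            total += bolus * math.exp(-dilution * (t_day - t_dose))
    return total

# After dosing stops, two days of washout scale the concentration by exp(-2).
ratio = rifaximin_concentration(29.0) / rifaximin_concentration(27.0)
print(round(ratio, 4))
```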

B. Supplementary Tables

Table S1. Bioreactor medium formulations. The suppliers of the reagents are indicated, which include Sigma-Aldrich (Saint Louis, MO, USA), Thermo Fisher Scientific (Waltham, MA, USA), Alfa Aesar (Ward Hill, MA, USA), BD (Franklin Lakes, NJ, USA), VWR (Radnor, PA, USA), and BioNutrition (Laval, QC, Canada).

| REAGENT | SUPPLIER | HIGH FIBER DIET (G/L) | HIGH PROTEIN DIET (G/L) |
|---|---|---|---|
| ARABINOGALACTAN | Sigma-Aldrich | 2.0 | 0.5 |
| BACTO TRYPTONE | BD | 0.6 | 2.5 |
| BILE SALTS | Sigma-Aldrich | 0.5 | 0.5 |
| CALCIUM CHLORIDE | Sigma-Aldrich | 0.01 | 0.01 |
| CASEIN | Alfa Aesar | 0.6 | 2.5 |
| GLUCOSE | Sigma-Aldrich | 0.4 | 0.2 |
| HEMIN | VWR | 0.005 | 0.005 |
| INULIN (DAHLIA TUBERS) | Alfa Aesar | 1.0 | 0.25 |
| L-CYSTEINE HYDROCHLORIDE | Thermo Fisher Scientific | 0.5 | 0.5 |
| MAGNESIUM SULFATE | VWR | 0.01 | 0.01 |
| MENADIONE | Sigma-Aldrich | 0.001 | 0.001 |
| PECTIN (CITRUS) | Sigma-Aldrich | 2.0 | 0.5 |
| PEPTONE WATER | Thermo Fisher Scientific | 0.6 | 2.5 |
| PORCINE GASTRIC MUCIN (TYPE II) | Sigma-Aldrich | 4.0 | 4.0 |
| POTASSIUM PHOSPHATE DIBASIC | Thermo Fisher Scientific | 0.04 | 0.04 |
| POTASSIUM PHOSPHATE MONOBASIC | Sigma-Aldrich | 0.04 | 0.04 |
| SODIUM BICARBONATE | Thermo Fisher Scientific | 2.0 | 2.0 |
| SODIUM CHLORIDE | Sigma-Aldrich | 0.1 | 0.1 |
| STARCH (WHEAT, UNMODIFIED) | Sigma-Aldrich | 2.0 | 1.75 |
| XYLO-OLIGOSACCHARIDE | BioNutrition | 2.0 | 0.5 |
| YEAST EXTRACT | BD | 3.0 | 3.0 |

Table S2. The human fecal sample derived defined microbial community composition utilized to inoculate the bioreactors. Taxonomic information was inferred from NCBI BLAST searches of the 16S rRNA gene regions v3 – v6, with a percentage identity of ≥ 97% required for species names and ≥ 95% for genus names.

| PHYLUM | FAMILY | SPECIES |
|---|---|---|
| ACTINOBACTERIA | Bifidobacteriaceae | Bifidobacterium longum |
| ACTINOBACTERIA | Coriobacteriaceae | Collinsella aerofaciens |
| ACTINOBACTERIA | Eggerthellaceae | Adlercreutzia equolifaciens |
| ACTINOBACTERIA | Microbacteriaceae | Microbacterium saccharophilum |
| BACTEROIDETES | Odoribacteraceae | Odoribacter splanchnicus |
| BACTEROIDETES | Porphyromonadaceae | Parabacteroides gordonii; Parabacteroides merdae |
| BACTEROIDETES | Bacteroidaceae | Bacteroides cellulosilyticus; Bacteroides eggerthii; Bacteroides ovatus; Bacteroides thetaiotaomicron; Bacteroides uniformis; Bacteroides vulgatus |
| BACTEROIDETES | Rikenellaceae | Alistipes onderdonkii; Alistipes putredinis; Alistipes shahii |
| FIRMICUTES | Acidaminococcaceae | Phascolarctobacterium succinatutens |
| FIRMICUTES | Bacillaceae | Bacillus circulans; Bacillus simplex |
| FIRMICUTES | Catabacteriaceae | Catabacter hongkongensis |
| FIRMICUTES | Clostridiaceae | Clostridium saudiense |
| FIRMICUTES | Erysipelotrichaceae | [Clostridium] innocuum; Catenibacterium mitsuokai; Erysipelatoclostridium ramosum; Erysipelotrichaceae sp.; Holdemanella biformis; Holdemania massiliensis |
| FIRMICUTES | Eubacteriaceae | Eubacterium callanderi |
| FIRMICUTES | Lachnospiraceae | [Clostridium] bolteae; [Clostridium] hathewayi; [Clostridium] hylemonae; [Clostridium] lavalense; [Clostridium] scindens; [Clostridium] symbiosum; [Eubacterium] contortum; [Eubacterium] eligens; [Eubacterium] rectale; Blautia hydrogenotrophica; Blautia luti; Blautia producta; Coprococcus comes; Coprococcus eutactus; Dorea formicigenerans; Dorea longicatena; Eisenbergiella massiliensis; Eisenbergiella tayi; Eubacterium ventriosum; Lachnospiraceae sp.; Roseburia faecis; Roseburia hominis; Ruminococcus faecis |
| FIRMICUTES | Ruminococcaceae | Gemmiger formicilis; Neglecta timonensis; Ruminococcus bromii |
| FIRMICUTES | Staphylococcaceae | Staphylococcus epidermidis |
| FIRMICUTES | Streptococcaceae | Streptococcus salivarius; Streptococcus tigurinus |
| FIRMICUTES | unclassified Clostridiales | Colidextribacter sp.; Flavonifractor plautii |
| PROTEOBACTERIA | Enterobacteriaceae | Escherichia coli |
| PROTEOBACTERIA | Sutterellaceae | Parasutterella excrementihominis |
| SYNERGISTETES | Synergistaceae | Rarimicrobium hominis |
| VERRUCOMICROBIA | Akkermansiaceae | Akkermansia muciniphila |
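The naming rule in the caption of Table S2 (≥ 97% identity for a species name, ≥ 95% for a genus name) can be written as a small Python function. This is an illustrative sketch only: the function name is hypothetical, and the label returned below 95% is an assumption, since the caption does not say how such hits were named.

```python
def taxonomic_label(best_hit_name, percent_identity):
    """Name to report for a BLAST best hit, given its percent identity."""
    genus = best_hit_name.split()[0]
    if percent_identity >= 97.0:
        return best_hit_name           # species-level name allowed
    if percent_identity >= 95.0:
        return genus + " sp."          # genus-level name only
    return "unclassified"              # assumption: fallback below 95%

print(taxonomic_label("Roseburia hominis", 98.4))
print(taxonomic_label("Roseburia hominis", 96.1))
```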

Table S3. The composition of the defined microbial communities with source information indicated for each strain of the artificial microbial community.

| SPECIES | SOURCE INFORMATION |
|---|---|
| Bifidobacterium adolescentis | British Columbia Cancer Agency REB H09-01268 |
| Bifidobacterium longum | University of Guelph REB09AP011 |
| Bifidobacterium pseudocatenulatum | British Columbia Cancer Agency REB H09-01268 |
| Collinsella aerofaciens | University of Guelph REB11AP003 |
| Bacteroides ovatus | Human Microbiome Jumpstart Reference Strain Consortium[457] |
| Parabacteroides distasonis | British Columbia Cancer Agency REB H09-01268 |
| Acidaminococcus intestinalis | British Columbia Cancer Agency REB H09-01268 |
| Eubacterium limosum | British Columbia Cancer Agency REB H09-01268 |
| [Eubacterium] eligens | Kind gift from Dr. Mike Surette, McMaster University, Canada |
| [Eubacterium] rectale | University of Guelph REB17-11-014 |
| Coprococcus comes | University of Guelph REB16-12-668 |
| Dorea formicigenerans | Queen's University REB DMED-867-05 |
| Dorea longicatena | Natividad et al.[199] |
| Eubacterium ventriosum | Petrof et al.[23] |
| Faecalicatena fissicatena | DSMZ (International Culture Collection) |
| Roseburia inulinivorans | DSMZ (International Culture Collection) |
| Ruminococcus faecis | Kind gift from Dr. Sydney Finegold, UCLA, USA |
| Lactobacillus paracasei | University of Guelph REB 13AP008 |
| Faecalibacterium prausnitzii | ATCC (International Culture Collection) |
| Streptococcus mitis | British Columbia Cancer Agency REB H09-01268 |
| Enterobacter aerogenes | Natividad et al.[199] |
| Escherichia coli | British Columbia Cancer Agency REB H09-01268 |
| Akkermansia muciniphila | British Columbia Cancer Agency REB H09-01268 |

Table S4. Statistical determination of steady-state among bioreactors grouped by condition (medium formulation and community) for both the initial community assembly and the response to dietary change, separated by a ‘--’ row. The p-values are given from the Wald-type statistic (WTS) and ANOVA-type statistic (ATS) with wild bootstrap resampling, computed for the single predictive component separating the ‘early’ and ‘late’ time point groupings of the orthogonal partial least squares discriminant analysis model, and from the PERMANOVA computed for the Euclidean distance matrix with the ‘early’ and ‘late’ time point groupings. Both methods were calculated in repeated-measures designs. Statistics were conducted in R software version 3.5 by packages ropls version 1.12, MANOVA.RM version 0.3.1 and vegan version 2.5.2. The WTS and ATS in these cases always yielded identical results, and therefore the p-values are denoted as ‘(WTS and ATS)/(PERMANOVA)’. Legend: CC = control community, AC = artificial community, HF = high fiber medium formulation, HP = high protein medium formulation, SEQ = centre-log ratio normalized 16S rRNA marker gene sequencing count data, NMR = 1H-NMR spectral binning data, NS = non-significant p-value > 0.05.

| DAY | CC HF SEQ | CC HF NMR | CC HP SEQ | CC HP NMR | AC HF SEQ | AC HF NMR | AC HP SEQ | AC HP NMR |
|---|---|---|---|---|---|---|---|---|
| 2 | 0.003/NS | 0.004/0.004 | 0.002/NS | 0.003/0.004 | NS/NS | NS/NS | 0.004/NS | NS/0.03 |
| 4 | NS/NS | NS/NS | NS/NS | NS/NS | x | x | NS/NS | NS/0.004 |
| 6 | x | x | x | x | x | x | x | NS/NS |
| -- | -- | -- | -- | -- | -- | -- | -- | -- |
| 16 | NS/NS | NS/0.02 | NS/NS | 0.003/NS | NS/NS | NS/NS | NS/NS | 0.003/0.04 |
| 18 | x | NS/NS | x | NS/NS | x | x | x | NS/NS |

Table S5. Statistical determination of grouping differences among bioreactors both between and within conditions. A ‘VS.’ denotes a comparison between, as opposed to within, conditions. The following information is provided for each column: PCA-MANOVA = the p-values for the multivariate Wald-type statistic (WTS)/modified ANOVA-type statistic (MATS), both with parametric bootstrap resampling, of the principal component analysis (PCA); PCA-MD = the pairwise Mahalanobis distance of the PCA; PLS-MANOVA = the p-values for the multivariate WTS/MATS, both with parametric bootstrap resampling, of the partial least squares discriminant analysis (PLS-DA); PLS-MD = the pairwise Mahalanobis distance of the PLS-DA; PERMANOVA = the p-value of the PERMANOVA computed from the Euclidean distance matrix; PAM Cluster = whether the best solution found from untargeted partitioning around medoids (PAM) clustering, based upon the average silhouette width (ASW), recapitulated the expected patterns; PAM ASW = the maximum ASW computed from PAM clustering. Statistics were conducted in R software version 3.5 by the packages ropls version 1.12, MANOVA.RM version 0.3.1, HDMD version 1.2, vegan version 2.5.2 and fpc version 2.1.11. Legend: CC = control community, AC = artificial community, HF = high fiber medium formulation, HP = high protein medium formulation, A-HF/HP = medium formulation in which the community was initially assembled, C-HF/HP = medium formulation the community was changed to after being assembled in the opposite medium, SEQ = centre-log ratio normalized 16S rRNA marker gene sequencing count data, NMR = 1H-NMR spectral binning data, NS = non-significant p-value > 0.05, NC = not calculable (i.e., statistical assumption failure), NA = not applicable, Y = yes, N = no.

SEQ

| COMPARISON | PCA-MANOVA | PCA-MD | PLS-MANOVA | PLS-MD | PERMANOVA | PAM CLUSTER | PAM ASW |
|---|---|---|---|---|---|---|---|
| CC HF | 2E-02 / 2E-02 | 4.3 - 8.7 | 4E-04 / 4E-04 | 5.7 - 9.3 | 9E-03 | N | 0.28 |
| CC HP | < 1E-04 / < 1E-04 | 7.6 - 10.3 | < 1E-04 / < 1E-04 | 9.1 - 12.3 | 1E-03 | N | 0.25 |
| CC HF VS. HP | 4E-02 / 4E-02 | 3.0 | < 1E-04 / < 1E-04 | 5.0 | NS | NA | NA |
| CC A-HF VS. C-HF | NS | 3.0 | < 1E-04 / < 1E-04 | 4.6 | 1E-02 | N | 0.22 |
| CC A-HP VS. C-HF | NS | 2.0 | < 1E-04 / < 1E-04 | 4.3 | NS | NA | NA |
| AC HF | < 1E-04 / < 1E-04 | 5.4 - 10.6 | < 1E-04 / < 1E-04 | 6.7 - 11.0 | 1E-03 | N | 0.23 |
| AC HP | < 1E-04 / 1E-02 | 5.3 - 8.6 | 4E-04 / 1E-04 | 6.4 - 9.8 | 3E-02 | N | 0.28 |
| AC HF VS. HP | NS | 0.6 - 0.8 | NS | NA | NS | NA | NA |
| AC A-HF VS. C-HF | NS | 1.7 | NS | NA | NS | NA | NA |
| AC A-HP VS. C-HP | NS | 0.8 | NS | NA | NS | NA | NA |
| CC HF VS. AC HF | < 1E-04 / < 1E-04 | 5.1 | < 1E-04 / < 1E-04 | 6.2 | NC | NA | NA |
| CC HP VS. AC HP | < 1E-04 / 2E-04 | 5.3 | < 1E-04 / < 1E-04 | 6.00 | 2E-02 | N | 0.19 |

NMR

| COMPARISON | PCA-MANOVA | PCA-MD | PLS-MANOVA | PLS-MD | PERMANOVA | PAM CLUSTER | PAM ASW |
|---|---|---|---|---|---|---|---|
| CC HF | 2E-04 / < 1E-04 | 10.1 - 18.1 | 1E-04 / < 1E-04 | 10.1 - 19.7 | 1E-03 | N | 0.13 |
| CC HP | 5E-03 / 2E-03 | 11.3 - 17.6 | 7E-04 / 2E-03 | 15.3 - 23.0 | 1E-03 | N | 0.17 |
| CC HF VS. HP | < 1E-04 / < 1E-04 | 19.3 | < 1E-04 / < 1E-04 | 19.5 | 1E-03 | Y | 0.29 |
| CC A-HF VS. C-HF | NS | 5.5 | < 1E-04 / < 1E-04 | 9.3 | 6E-03 | N | 0.12 |
| CC A-HP VS. C-HF | NS | 4.0 | < 1E-04 / < 1E-04 | 10.9 | 4E-02 | N | 0.2 |
| AC HF | 1E-04 / 3E-03 | 9.5 - 14.5 | < 1E-04 / 1E-04 | 11.2 - 15.7 | 1E-03 | N | 0.17 |
| AC HP | < 1E-04 / 1E-03 | 12.1 - 19.1 | 1E-03 / < 1E-04 | 24.6 - 29.5 | 1E-03 | N | 0.24 |
| AC HF VS. HP | < 1E-04 / < 1E-04 | 17.6 | < 1E-04 / < 1E-04 | 17.7 | 1E-03 | Y | 0.25 |
| AC A-HF VS. C-HF | 6E-04 / 1E-03 | 6.5 | < 1E-04 / < 1E-04 | 9.6 | 1E-02 | N | 0.17 |
| AC A-HP VS. C-HP | NS | 5.2 | NS | NA | NS | NA | NA |
| CC HF VS. AC HF | < 1E-04 / < 1E-04 | 11.8 | < 1E-04 / < 1E-04 | 12.0 | 1E-03 | N | 0.13 |
| CC HP VS. AC HP | < 1E-04 / < 1E-04 | 13.0 | < 1E-04 / < 1E-04 | 14.3 | 1E-03 | N | 0.18 |
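As an illustration of the Mahalanobis distance (MD) entries, the following Python sketch computes the distance between two group centroids in a two-component score space, using the pooled within-group covariance. It is a minimal sketch under stated assumptions, not the HDMD code used in the thesis. The restriction to two components, the pooled covariance, and the example score values are all assumptions, and the function reports the distance itself rather than its square.

```python
import math

def mean_2d(points):
    """Centroid of a list of (x, y) score pairs."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def scatter_2d(points, m):
    """Sums of squares and cross-products about the centroid m."""
    sxx = sum((p[0] - m[0]) ** 2 for p in points)
    sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
    syy = sum((p[1] - m[1]) ** 2 for p in points)
    return sxx, sxy, syy

def mahalanobis_2d(group_a, group_b):
    """Mahalanobis distance between two centroids, pooled covariance (assumed)."""
    ma, mb = mean_2d(group_a), mean_2d(group_b)
    axx, axy, ayy = scatter_2d(group_a, ma)
    bxx, bxy, byy = scatter_2d(group_b, mb)
    dof = len(group_a) + len(group_b) - 2
    sxx, sxy, syy = (axx + bxx) / dof, (axy + bxy) / dof, (ayy + byy) / dof
    det = sxx * syy - sxy ** 2
    dx, dy = ma[0] - mb[0], ma[1] - mb[1]
    # d^2 = diff^T S^-1 diff, with the 2x2 inverse written out explicitly.
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)

# Hypothetical score coordinates for two groups (illustration only).
scores_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
scores_b = [(3.0, 0.0), (4.0, 0.0), (3.0, 1.0), (4.0, 1.0)]
print(round(mahalanobis_2d(scores_a, scores_b), 3))
```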


Table S6. Results of the statistical tests differentiating metabolite concentrations, obtained by profiling 1H-NMR spectra, between medium formulations and communities. Given are the q-values computed from the Kruskal-Wallis statistic after Benjamini-Hochberg correction and Dunn’s post-hoc analysis, and the effect size computed as eta squared. Statistics were conducted in R software version 3.5 using the package dunn.test version 1.3.5. Legend: CC = control community, AC = artificial community, HF = high fiber medium formulation, HP = high protein medium formulation.


| METABOLITES | CC HF vs. HP Q-VALUE | CC HF vs. HP EFFECT SIZE | AC HF vs. HP Q-VALUE | AC HF vs. HP EFFECT SIZE | HF CC vs. AC Q-VALUE | HF CC vs. AC EFFECT SIZE | HP CC vs. AC Q-VALUE | HP CC vs. AC EFFECT SIZE |
|---|---|---|---|---|---|---|---|---|
| 4-AMINOBUTYRATE | 0.011328 | 0.173461 | 0.58734 | -0.01789 | 2.91E-05 | 0.501909 | 0.324178 | 0.00959 |
| 4-HYDROXYPHENYLACETATE | 6.93E-06 | 0.606865 | 1.65E-06 | 0.722168 | 0.025014 | 0.126271 | 3.70E-06 | 0.736243 |
| 5-AMINOPENTANOATE | 7.79E-07 | 0.743243 | 1.65E-06 | 0.721781 | 0.005654 | 0.209459 | 0.058978 | 0.104744 |
| ACETATE | 0.00036 | 0.373609 | 0.002057 | 0.282565 | 5.32E-07 | 0.739189 | 0.0217 | 0.162724 |
| ALANINE | 7.79E-07 | 0.743243 | 1.65E-06 | 0.721781 | 0.62209 | -0.01928 | 0.000248 | 0.459583 |
| ASPARTATE | 7.79E-07 | 0.743243 | 3.47E-06 | 0.674843 | 5.51E-07 | 0.731103 | 3.70E-06 | 0.736243 |
| BUTYRATE | 0.052484 | 0.08744 | 0.045478 | 0.108562 | 5.88E-05 | 0.462055 | 0.333691 | 0.005693 |
| ETHANOL | 0.209535 | 0.020079 | 9.01E-05 | 0.462866 | 0.171527 | 0.030974 | 0.00375 | 0.279612 |
| FORMATE | 0.000174 | 0.416004 | 0.000662 | 0.347066 | 5.32E-07 | 0.739189 | 3.70E-06 | 0.736243 |
| GLUTAMATE | 0.476518 | -0.01451 | 0.013635 | 0.176698 | 0.000525 | 0.335478 | 8.40E-06 | 0.680371 |
| GLYCINE | 6.26E-06 | 0.615468 | 0.078731 | 0.077795 | 0.000116 | 0.42396 | 0.906596 | -0.03158 |
| GLYOXYLIC ACID | 0.011871 | 0.167819 | 3.22E-05 | 0.52953 | 0.014384 | 0.15789 | 0.81579 | -0.02918 |
| HISTAMINE | 0.376015 | -0.00548 | 0.479466 | -0.01048 | 0.171527 | 0.032113 | 0.448354 | -0.00896 |
| HISTIDINE | 0.005964 | 0.209123 | 0.15203 | 0.044788 | 0.000354 | 0.35813 | 0.000974 | 0.372853 |
| INDOLE | 3.20E-06 | 0.659777 | 1.65E-06 | 0.721974 | 0.025014 | 0.128012 | 0.333691 | 0.005693 |
| INDOLE-3-ACETATE | 0.052484 | 0.087455 | 0.161952 | 0.040404 | 0.090786 | 0.060868 | 0.102211 | 0.073213 |
| ISOBUTYRATE | 0.001945 | 0.276895 | 1.65E-06 | 0.721781 | 5.32E-07 | 0.739189 | 3.70E-06 | 0.736243 |
| ISOLEUCINE | 7.79E-07 | 0.743243 | 0.809786 | -0.02771 | 5.32E-07 | 0.739189 | 3.70E-06 | 0.736371 |
| ISOVALERATE | 7.79E-07 | 0.743243 | 1.65E-06 | 0.721781 | 3.56E-06 | 0.622394 | 3.70E-06 | 0.736243 |
| LACTATE | 0.003457 | 0.241918 | 0.000205 | 0.414295 | 5.32E-07 | 0.739344 | 3.70E-06 | 0.736243 |
| LEUCINE | 7.79E-07 | 0.743243 | 6.24E-05 | 0.486305 | 1.43E-06 | 0.675697 | 3.70E-06 | 0.736243 |
| LYSINE | 6.26E-06 | 0.615468 | 4.11E-06 | 0.656492 | 0.596737 | -0.01757 | 0.899371 | -0.0312 |
| METHANOL | 7.79E-07 | 0.743243 | 5.29E-06 | 0.638383 | 0.007104 | 0.196139 | 0.00239 | 0.309298 |
| METHIONINE | 7.79E-07 | 0.743243 | 1.65E-06 | 0.721974 | 0.025014 | 0.126178 | 6.54E-05 | 0.544972 |
| ORNITHINE | 7.79E-07 | 0.743343 | 4.11E-06 | 0.656668 | 3.56E-06 | 0.618863 | 0.00239 | 0.309355 |
| PHENYLACETATE | 7.79E-07 | 0.743243 | 1.65E-06 | 0.721781 | 0.22751 | 0.018361 | 0.124971 | 0.06089 |
| PHENYLALANINE | 7.79E-07 | 0.743243 | 1.65E-06 | 0.721877 | 0.775087 | -0.02443 | 0.019116 | 0.174362 |
| PROLINE | 0.019407 | 0.140706 | 0.032404 | 0.127673 | 0.053801 | 0.087303 | 0.448354 | -0.00995 |
| PROPIONATE | 0.004115 | 0.230731 | 0.305592 | 0.008752 | 5.32E-07 | 0.739189 | 0.00239 | 0.309298 |
| PROPYL ALCOHOL | 0.002335 | 0.265 | 0.013635 | 0.176698 | 0.009686 | 0.178979 | 0.405068 | -0.00375 |
| PYROGLUTAMATE | 0.138009 | 0.039859 | 0.075634 | 0.081429 | 0.001305 | 0.287023 | 0.010493 | 0.211342 |
| PYRUVATE | 0.000158 | 0.423743 | 0.772971 | -0.02607 | 0.000351 | 0.361324 | 0.942317 | -0.03209 |
| SUCCINATE | 9.96E-06 | 0.581081 | 0.189269 | 0.030586 | 5.32E-07 | 0.7395 | 3.70E-06 | 0.736243 |
| TRYPTAMINE | 0.221939 | 0.01653 | 0.18365 | 0.03328 | 0.088037 | 0.063599 | 0.324178 | 0.010921 |
| TRYPTOPHAN | 0.011506 | 0.171024 | 0.772971 | -0.02607 | 0.238293 | 0.015458 | 0.019327 | 0.171489 |
| TYRAMINE | 9.96E-06 | 0.581553 | 1.99E-05 | 0.560043 | 0.865732 | -0.02625 | 0.151186 | 0.049405 |
| TYROSINE | 7.79E-07 | 0.743343 | 1.65E-06 | 0.721877 | 0.551437 | -0.01467 | 0.005139 | 0.258233 |
| URACIL | 7.79E-07 | 0.743243 | 4.29E-05 | 0.510288 | 1.01E-05 | 0.560554 | 3.70E-06 | 0.736243 |
| UROCANATE | 0.001585 | 0.290093 | 0.022778 | 0.147808 | 0.002885 | 0.245 | 0.010493 | 0.211586 |
| VALERATE | 7.79E-07 | 0.743343 | 0.382369 | -0.00124 | 0.000351 | 0.3639 | 3.70E-06 | 0.736371 |
| VALINE | 7.79E-07 | 0.743243 | 0.779044 | -0.02668 | 5.32E-07 | 0.739189 | 3.70E-06 | 0.736243 |
| P-CRESOL | 7.79E-07 | 0.743343 | 1.65E-06 | 0.722071 | 0.048404 | 0.093678 | 9.27E-06 | 0.669684 |
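
The quantities in Table S6 (a Kruskal-Wallis statistic, Benjamini-Hochberg adjusted q-values and an eta squared effect size) can be illustrated with a short, dependency-free Python sketch. It is not the dunn.test workflow used for the thesis. The eta squared estimator shown, (H - k + 1)/(n - k), is an assumption: it is one commonly used form, and like the values in the table it can be slightly negative when H is small. All function names and the example numbers are hypothetical.

```python
def kruskal_h(*groups):
    """Kruskal-Wallis H statistic with average ranks and tie correction."""
    if len(groups) < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")
    values = sorted(v for g in groups for v in g)
    n = len(values)
    rank_of, tie_term, i = {}, 0.0, 0
    while i < n:
        j = i
        while j < n and values[j] == values[i]:
            j += 1
        rank_of[values[i]] = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        tie_term += (j - i) ** 3 - (j - i)
        i = j
    h = 12.0 / (n * (n + 1)) * sum(
        sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups
    ) - 3.0 * (n + 1)
    correction = 1.0 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction


def eta_squared_from_h(h, k, n):
    """Assumed estimator: eta^2 = (H - k + 1) / (n - k); may be negative."""
    return (h - k + 1) / (n - k)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda idx: pvalues[idx])
    qvalues = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


# Hypothetical example: two fully separated groups of three observations.
h_stat = kruskal_h([1, 2, 3], [4, 5, 6])
effect = eta_squared_from_h(h_stat, k=2, n=6)
q_vals = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

In this toy case the two groups do not overlap at all, so the effect size is large even though the sample is tiny; the q-values are reported in the same order as the input p-values.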


Table S7. 16S rRNA marker gene sequencing count data obtained from the single strain Acidaminococcus intestini of the control community, grown on fastidious anaerobe agar with and without mucin (Mu) supplementation. Raw counts are given per technical replicate. Data processing was conducted in R software version 3.5 using the package DADA2 version 1.8, with classification to the genus level using the SILVA database version 132.


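| OTU# | TAXONOMY | R1 | R2 | R3 | R4 | R5 | R6 | R7 | R8 | R1 - MU | R2 - MU | R3 - MU | R4 - MU |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| OTU01 | Acidaminococcus | 24381 | 33039 | 7943 | 9480 | 6949 | 11282 | 13187 | 8580 | 14 | 7 | 2 | 5 |
| OTU02 | Eubacterium | 44 | 118 | 10 | 21 | 13 | 41 | 44 | 34 | 366 | 88 | 75 | 203 |
| OTU03 | Lachnospiraceae unclassified | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 |
| OTU04 | Flavonifractor | 29 | 51 | 13 | 10 | 6 | 20 | 23 | 9 | 0 | 0 | 0 | 0 |
| OTU05 | Dialister | 63 | 106 | 15 | 31 | 16 | 32 | 29 | 30 | 0 | 0 | 0 | 0 |
| OTU06 | Akkermansia | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 50 | 5 | 5 | 40 |
| OTU07 | Dorea | 2 | 1 | 0 | 0 | 2 | 0 | 0 | 1 | 1 | 0 | 1 | 0 |
| OTU08 | Lactococcus | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 1 |
| OTU09 | Collinsella | 0 | 2 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
| OTU10 | Streptococcus | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 2 |
| OTU11 | Bacteria unclassified | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| SUM | | 24519 | 33319 | 7981 | 9542 | 6986 | 11376 | 13286 | 8654 | 435 | 100 | 84 | 251 |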
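
The raw counts in Table S7 can be converted to per-replicate proportions to see how strongly Acidaminococcus reads dominate each library. The Python sketch below transcribes the counts from the table, recomputes the column totals as a check against the SUM row, and derives the Acidaminococcus fraction for each replicate. The variable and function names are illustrative assumptions, not part of the thesis pipeline.

```python
REPLICATES = ["R1", "R2", "R3", "R4", "R5", "R6", "R7", "R8",
              "R1-Mu", "R2-Mu", "R3-Mu", "R4-Mu"]

# Raw counts transcribed from Table S7 (one list per genus-level OTU).
COUNTS = {
    "Acidaminococcus": [24381, 33039, 7943, 9480, 6949, 11282, 13187, 8580, 14, 7, 2, 5],
    "Eubacterium": [44, 118, 10, 21, 13, 41, 44, 34, 366, 88, 75, 203],
    "Lachnospiraceae unclassified": [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0],
    "Flavonifractor": [29, 51, 13, 10, 6, 20, 23, 9, 0, 0, 0, 0],
    "Dialister": [63, 106, 15, 31, 16, 32, 29, 30, 0, 0, 0, 0],
    "Akkermansia": [0, 0, 0, 0, 0, 0, 0, 0, 50, 5, 5, 40],
    "Dorea": [2, 1, 0, 0, 2, 0, 0, 1, 1, 0, 1, 0],
    "Lactococcus": [0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1],
    "Collinsella": [0, 2, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0],
    "Streptococcus": [0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 2],
    "Bacteria unclassified": [0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
}

# SUM row as reported in Table S7.
REPORTED_SUM = [24519, 33319, 7981, 9542, 6986, 11376, 13286, 8654, 435, 100, 84, 251]


def column_sums(counts):
    """Total reads per replicate across all OTUs."""
    return [sum(column) for column in zip(*counts.values())]


def relative_abundance(counts, taxon):
    """Fraction of each replicate's reads assigned to `taxon`."""
    totals = column_sums(counts)
    return [count / total for count, total in zip(counts[taxon], totals)]


acidaminococcus_fraction = dict(
    zip(REPLICATES, relative_abundance(COUNTS, "Acidaminococcus"))
)
```

Computed this way, Acidaminococcus accounts for more than 99% of reads in each of R1 to R8 and for less than 8% of reads in each of the four Mu replicates, which also have far fewer total reads.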


Table S8. Bacterial strain information and relative rifaximin resistance for the defined microbial communities. Given are the strain source, taxonomic information, relative abundance in formulation, and rifaximin resistance profile for the defined microbial community derived from an ulcerative colitis patient fecal sample (UCC) and for the Microbial Ecosystem Therapeutic (MET). Strains that were not originally included in the formulation described in Petrof et al.[23] are indicated as MET-A. Closest species matches were determined from NCBI BLAST searches of the 16S rRNA gene, with matches < 97% identity denoted by the genus and < 95% identity denoted by the family. The relative abundance in formulation is represented by biomass measured using standard 10 μL loopfuls as described in Petrof et al.[23]. Note that relative abundance only applies to MET; the UCC strains were added in equal ratios, since the community was allowed ample equilibration time in the bioreactor. Rifaximin resistance profiles were determined via the disc diffusion method with a 40 μg dosage. The diameters were measured in cm and assigned to categories as follows: completely resistant (R); highly resistant, i.e., 1-2 (+); relatively resistant, i.e., 2-3 (++); sensitive, i.e., 3-4 (+++); and very sensitive, i.e., 4+ (++++).

| SOURCE | TAXONOMIC INFORMATION | RELATIVE ABUNDANCE | RIFAXIMIN SENSITIVITY |
|---|---|---|---|
| UCC | [Clostridium] innocuum | + | + |
| UCC | Adlercreutzia equolifaciens | + | ++++ |
| UCC | Bacteroides cellulosilyticus | + | +++ |
| UCC | Bacteroides dorei | + | R |
| UCC | Bacteroides fragilis | + | ++ |
| UCC | Bacteroides thetaiotaomicron | + | ++ |
| UCC | Collinsella aerofaciens | + | +++ |
| UCC | Enterococcus faecalis | + | ++ |
| UCC | Escherichia coli | + | + |
| UCC | Klebsiella aerogenes | + | R |
| UCC | Klebsiella oxytoca | + | R |
| UCC | Lachnospiraceae sp. | + | ++++ |
| UCC | Parabacteroides merdae | + | ++ |
| UCC | Phascolarctobacterium faecium | + | + |
| UCC | Pseudoflavonifractor sp. | + | + |
| UCC | Streptococcus anginosus | + | ++ |
| UCC | Streptococcus gordonii | + | ++ |
| UCC | Streptococcus mutans | + | +++ |
| UCC | Streptococcus parasanguinis | + | ++ |
| UCC | Veillonella atypica | + | ++ |
| UCC | Veillonella denticariosi | + | + |
| UCC | Veillonella dispar | + | ++ |
| UCC | Veillonella parvula | + | + |
| UCC | Veillonella tobetsuensis | + | ++ |
| MET | [Eubacterium] eligens | +++++ | ++++ |
| MET-A | [Eubacterium] fissicatena | +++ | ++ |
| MET | [Eubacterium] rectale | +++ | +++ |
| MET | [Eubacterium] rectale | +++ | ++ |
| MET | [Eubacterium] rectale | +++++ | ++++ |
| MET | [Eubacterium] rectale | ++++ | +++ |
| MET | Acidaminococcus intestini | +++ | R |
| MET-A | Akkermansia muciniphila | + | ++ |
| MET | Bacteroides ovatus | ++++ | ++ |
| MET | Bifidobacterium adolescentis | ++++ | ++ |
| MET | Bifidobacterium adolescentis | ++++ | ++ |
| MET | Bifidobacterium longum | + | ++ |
| MET | Bifidobacterium longum | +++++ | +++ |
| MET-A | Bifidobacterium pseudocatenulatum | +++ | ++ |
| MET | Blautia luti | + | + |
| MET | Blautia sp. | ++++ | ++ |
| MET | Blautia stercoris | ++ | ++++ |
| MET | Butyricicoccus faecihominis | +++ | ++ |
| MET | Collinsella aerofaciens | +++ | +++ |
| MET-A | Coprococcus comes | +++ | ++ |
| MET-A | Dialister invisus | + | ++++ |
| MET-A | Dorea formicigenerans | +++ | +++ |
| MET | Dorea longicatena | +++ | +++ |
| MET | Dorea longicatena | +++++ | +++ |
| MET | Erysipelotrichaceae sp. | +++ | + |
| MET | Escherichia coli | + | R |
| MET | Eubacterium limosum | ++++ | ++ |
| MET | Eubacterium ventriosum | +++ | ++++ |
| MET | Faecalibacterium prausnitzii | +++++ | ++ |
| MET-A | Flavonifractor plautii | + | R |
| MET | Klebsiella aerogenes | +++ | R |
| MET-A | Lachnoclostridium sp. | + | ++ |
| MET | Lactobacillus casei | +++ | + |
| MET | Lactobacillus paracasei | +++ | + |
| MET | Parabacteroides distasonis | ++++ | +++ |
| MET | Roseburia faecis | ++++ | +++ |
| MET-A | Roseburia inulinivorans | + | ++++ |
| MET | Ruminococcus faecis | +++ | ++ |
| MET | Ruminococcus faecis | +++++ | ++ |
| MET | Streptococcus mitis | ++ | ++ |
| MET-A | Sutterella stercoricanis | + | ++ |
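
The two labelling rules stated in the Table S8 caption (the rifaximin category assigned from the measured diameter, and the taxonomic rank reported for a given 16S rRNA identity) can be written as a small Python sketch. Several details are assumptions made only for illustration: the caption does not say which category a diameter exactly on a boundary (e.g., 2 cm) falls into, so half-open intervals are used here, and any diameter below 1 cm is treated as completely resistant (R). The function names and example taxa are hypothetical.

```python
def rifaximin_category(diameter_cm):
    """Map a disc diffusion diameter in cm to a Table S8 category.

    Assumptions (not stated in the caption): intervals are half-open,
    [1, 2) -> "+", [2, 3) -> "++", [3, 4) -> "+++", >= 4 -> "++++",
    and anything below 1 cm counts as completely resistant ("R").
    """
    if diameter_cm < 1:
        return "R"
    if diameter_cm < 2:
        return "+"
    if diameter_cm < 3:
        return "++"
    if diameter_cm < 4:
        return "+++"
    return "++++"


def taxonomic_label(identity_percent, species, genus, family):
    """Report the rank implied by the closest BLAST match identity.

    >= 97% identity keeps the species name, < 97% is denoted by the
    genus, and < 95% is denoted by the family, as in the caption.
    """
    if identity_percent >= 97:
        return species
    if identity_percent >= 95:
        return f"{genus} sp."
    return f"{family} sp."


# Hypothetical inputs, for illustration only.
example_category = rifaximin_category(3.2)
example_label = taxonomic_label(96.0, "Blautia luti", "Blautia", "Lachnospiraceae")
```

With these assumed boundaries, a 3.2 cm diameter is labelled sensitive (+++), and a 96% identity match is reported at the genus level.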


Table S9. Genome information and KEGG orthologies for all species present in the bioreactors. Given are the source, condition, source of genomes, number of genomes from which proteome data were obtained, and KEGG orthologies for each species present in the bioreactors, as defined by at least 0.01% compositional abundance in three replicates from 16S rRNA profiling via Illumina sequencing. The species were sourced from defined microbial communities, derived either from an ulcerative colitis patient fecal sample (UCC) or from a healthy donor fecal sample (MET). The UCC species were present both before and after treatments (All), whereas the MET species engrafted either in the condition without rifaximin usage (MET) or in the rifaximin pretreated condition (Abx-MET). Genomes were obtained from KEGG or NCBI. When deposited proteome data could be obtained from NCBI despite the absence of a complete genome, the genome source is listed as NCBI, but the number of genomes is listed as zero. If no data were available, either because the strain is a potential novel species or otherwise, a draft genome of the strain was assembled and annotated from shotgun genomic Illumina sequencing conducted at the Broad Institute.

| SPECIES | SOURCE | CONDITION | GENOME SOURCE | NUMBER OF GENOMES |
|---|---|---|---|---|
| [Clostridium] innocuum | UCC | All | NCBI | 1 |
| Bacteroides cellulosilyticus | UCC | All | KEGG | 1 |
| Bacteroides dorei | UCC | All | KEGG | 2 |
| Bacteroides fragilis | UCC | All | KEGG | 4 |
| Bacteroides thetaiotaomicron | UCC | All | KEGG | 2 |
| Klebsiella oxytoca | UCC | All | KEGG | 3 |
| Lachnospiraceae sp. | UCC | All | Broad (NCBI:txid1357394) | 1 |
| Parabacteroides merdae | UCC | All | NCBI | 3 |
| Phascolarctobacterium faecium | UCC | All | Broad (NCBI:txid1357409) | 1 |
| Pseudoflavonifractor sp. | UCC | All | Broad (NCBI:txid1357380) | 1 |
| Veillonella denticariosi | UCC | All | NCBI | 1 |
| Veillonella dispar | UCC | All | NCBI | 2 |
| Acidaminococcus intestini | MET | All | KEGG | 1 |
| [Eubacterium] eligens | MET | MET | KEGG | 1 |
| Bacteroides ovatus | MET | MET | KEGG | 1 |
| Eubacterium ventriosum | MET | MET | NCBI | 1 |
| Parabacteroides distasonis | MET | MET | KEGG | 1 |
| Roseburia faecis | MET | MET | NCBI | 0 |
| Roseburia inulinivorans | MET | MET | NCBI | 1 |
| [Eubacterium] fissicatena | MET | Abx-MET | NCBI | 1 |
| Coprococcus comes | MET | Abx-MET | NCBI | 1 |
| Flavonifractor plautii | MET | Abx-MET | KEGG | 1 |
| Escherichia coli | Both | All | KEGG | 65 |
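
The presence criterion in the Table S9 caption (at least 0.01% compositional abundance in three replicates) can be expressed as a one-line filter. The Python sketch below reads the criterion as "at least three replicates each at or above 0.01%"; that reading, the function name and the example abundances are assumptions made for illustration.

```python
def is_present(abundances_percent, threshold=0.01, min_replicates=3):
    """True if enough replicates reach the abundance threshold.

    `abundances_percent` holds one compositional abundance (in %) per
    replicate. Assumed reading of the caption: at least `min_replicates`
    replicates must each be >= `threshold` percent.
    """
    passing = sum(1 for value in abundances_percent if value >= threshold)
    return passing >= min_replicates


# Hypothetical abundances (%), for illustration only.
kept = is_present([0.02, 0.5, 0.011])
dropped = is_present([0.02, 0.005, 0.5])
```

Under this reading, a species detected above the threshold in only two of three replicates would not be listed in the table.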
