
The Structure and Function of the Murine Gut Microbiome in Sub-Therapeutic Antibiotic-Induced Obesity

by Christine Tara Peterson

M.F.S. Forensic Science, January 2001, George Washington University

B.A. Government, May 1999, University of Virginia

A Dissertation submitted to

The Faculty of The Columbian College of Arts and Sciences of The George Washington University in partial fulfillment of the requirements for the degree of Doctor of Philosophy

May 18, 2014

Dissertation directed by

David Leitenberg Associate Professor of Microbiology, Immunology, and Tropical Medicine, and of Pediatrics

The Columbian College of Arts and Sciences of The George Washington University certifies that Christine T. Peterson has passed the Final Examination for the degree of Doctor of Philosophy as of March 25, 2014. This is the final and approved form of the dissertation.

The Structure and Function of the Murine Gut Microbiome in Sub-Therapeutic Antibiotic-Induced Obesity

Christine Tara Peterson

Dissertation Research Committee:

David Leitenberg, Associate Professor of Microbiology, Immunology, and Tropical Medicine, and of Pediatrics, Dissertation Director

Norman H. Lee, Professor of Pharmacology and Physiology and of Biochemistry and Molecular Medicine, Committee Member

Cynthia L. Sears, Professor of Medicine, Johns Hopkins University, Committee Member


© Copyright 2014 by Christine Tara Peterson. All rights reserved.


Dedication

To the alleviation of suffering in all beings everywhere.


Acknowledgments

The author wishes to thank Shri Hanuman Ji, the breath of Ram, who teaches that in service all things are possible.

The author would also like to thank Martin J. Blaser and Ilseung Cho at the New York University School of Medicine for stool samples from their sub-therapeutic antibiotic-treated (STAT) and control mice, which served as biological materials for this project.

The author is grateful to Joshua Adkins and Brooke L.D. Kaiser at the Pacific Northwest National Laboratory for assistance with protein fractionation, the use of mass spectrometry equipment, and metaproteomic data analysis with SEQUEST.

The author would like to thank a number of scientists at the J. Craig Venter Institute (JCVI) for advice, training, or assistance with various aspects of this project, including: Rembert Pieper for advice on metaproteomics and protein isolation methods; Shih-Ting Huang for assistance with proteomic data processing for METAREP analysis; Shibu Yoosef for help with metagenomic assemblies; Johannes B. Goll and Mathangi Thiagarajan for advice and assistance with METAREP data formatting, data loading, and analysis; and Karen E. Nelson for the use of DNA sequencing equipment.

The author is grateful for the help and advice of the Dissertation Research Committee, Director and Chair, namely David Leitenberg, Norman H. Lee, Paul J. Brindley, Cynthia L. Sears, Anelia D. Horvath, and Linda Werling. Thank you for all of your help and encouragement.


The author expresses gratitude to the microbes for teaching us that in community, cooperation, and synergy, great feats are possible.

Abstract of Dissertation

The Structure and Function of the Murine Gut Microbiome in Sub-Therapeutic Antibiotic-Induced Obesity

Obesity represents an important public health concern in the United States and throughout the world. The composition of the gut microbiome is fairly stable unless disturbed by outside variables such as antibiotic treatment. While farmers have exploited sub-therapeutic antibiotic treatment for growth promotion in livestock for over 50 years, the mechanism of action remains unknown. The hypothesis of this dissertation is that the structure and activities of the gut microbiome impact health and disease in the context of obesity. We employed a systems biology approach to examine the murine gut microbiome in sub-therapeutic antibiotic-treated (STAT) and control mice.

We generated deep DNA sequence coverage and gene annotation of the gut microbiomes of obese and lean mice and compared the representation of metabolic pathways. Although the approach is less sensitive than the metagenomics strategy, we also compared profiles of abundant proteins using fractionated mouse microbiome samples derived from lean and obese mice with nLC-MS/MS. The combination of metagenomic and metaproteomic data allowed us to construct a model for major metabolic processes in the mouse gut. We identified differences in under- and over-represented gene/protein functions. Taxonomic analyses revealed that Enterococcaceae were 3-fold higher and Akkermansia muciniphila was reduced 4-fold in STAT compared to control mice. These results suggest that increases in Enterococcaceae may reflect an elevated inflammatory state in the gut of obese mice, and that A. muciniphila may promote gut health by reducing inflammation and improving gut barrier function. The metagenomics data revealed an over-representation of β-glucosidase as well as increases in carbohydrate binding functions, and the metaproteomics data indicated an elevated production of D-xylose and -D-glucose in STAT compared to control mice. These results suggest that the STAT mouse microbiota harbors an elevated potential for energy harvest from polysaccharides. The signatures of obesity indicate that the development of therapeutic interventions that drive the microbial composition of the gut toward that of healthy individuals may alleviate obesity.


Table of Contents

Dedication

Acknowledgments

Abstract of Dissertation

List of Figures

List of Tables

Glossary of Terms

Chapter 1: Introduction

Chapter 2: Literature Review

Chapter 3: Methods

Chapter 4: Results

Chapter 5: Conclusions

References

Appendices



Glossary of Terms

Basic Local Alignment Search Tool (BLAST): Algorithm that finds regions of local similarity between DNA and protein sequences and calculates the statistical significance of matches. The results can be used to identify members of gene families and infer functional and evolutionary relationships between sequences.

Contig: A set of overlapping DNA reads or segments that together represent a consensus region of DNA.

Enteric nervous system (ENS): A division of the autonomic nervous system that is composed of ganglionated nerve plexuses situated between layers of gut tissues.

Enteroendocrine cell: Specialized epithelial cells distributed throughout the gastrointestinal tract that secrete granules containing peptides that communicate with the nervous system.

Gnotobiotic mice: Germ-free mice, which are born and raised in sterile isolators without colonization by microorganisms, are said to be gnotobiotic after being experimentally colonized with defined strains of bacteria.

Gut-associated lymphoid tissue (GALT): The peripheral lymphoid structures and aggregates associated with the intestinal mucosa, including the tonsils, Peyer’s patches, isolated lymphoid follicles, and intraepithelial lymphocytes. It is enriched in lymphocytes, macrophages and specialized dendritic cells and provides the first line of defense against the entry of pathogens across the mucosal barrier.


Isolated lymphoid follicles: Organized GALT tissues that contain mostly B cells.

Koch’s Postulates: Four criteria designed to establish a causal relationship between an infectious microbe and a disease. The postulates are: (1) the microorganism must be found in abundance in all organisms suffering from the disease, but should not be found in healthy organisms, (2) the microorganism must be directly isolated from a diseased organism and grown in pure culture, (3) the cultured microorganism should cause disease when introduced into a healthy host, and (4) the microorganism must be reisolated from the inoculated, diseased experimental host and identified as being identical to the original, single causative agent.

Lamina propria: The connective tissue which is located beneath the surface epithelium of the mucosa and contains lymphocytes and immune system cells.

Mesenteric lymph nodes: Lymph nodes located in the fold of peritoneum that connects the small intestine to the posterior wall of the abdomen. They drain the intestinal mucosa, Peyer’s patches, and lymphoid follicles of the gut.

Metagenomic sequencing: The DNA sequencing of the genomes of all of the microorganisms contained in a given sample.

Microbiome: The complete genomes of all of the microorganisms that colonize a specific environment. This term has been used interchangeably with microbiota.

Microbiota: The community of microorganisms living within a given anatomical niche.

Mucosa-associated lymphoid tissue (MALT): The mucosal immune system which consists of all lymphoid cells in the epithelia and in the lamina propria that underlies mucosal surfaces.


Mutualism: A symbiotic association in which both members benefit from the relationship.

Operational taxonomic unit (OTU): A group of organisms that have 16S ribosomal DNA sequences with 97% or greater identity. The term is often used to approximate a species.

Pathobiont: A symbiont that has the potential to cause deregulated inflammation and disease under certain environmentally induced conditions.

Peyer’s patches: Groups of lymphoid follicles that are present along the small intestine. They contain lymphoid follicles and T cell areas. Together with the mesenteric lymph nodes, they form the inductive compartment for intestinal immune responses.

Phylotype: A biological type that classifies organisms based on phylogenetic, or evolutionary, relationships with other organisms. This term is not taxon specific, thus it can be used in reference to any phylogenetic level or species (See also Operational Taxonomic Unit).

Psi: The pound per square inch or pound-force per square inch is a unit of pressure.

Read: A term that refers to the primary output of DNA sequencing.


Chapter 1: Introduction

Obesity represents an important public health concern in the United States (US) and throughout the world. In the US, adult obesity doubled between 1980 and 2004, and obesity rates continue to rise (see Figure 1) [1]. In 2010, more than one-third (35.7%) of the adult population and almost 17% of children and adolescents were obese, with the highest prevalence in southern states [2]. In 2008, the World Health Organization (WHO) estimated that over 1.4 billion adults over the age of 20 were overweight and over 500 million adults (10% of the world population) were obese [3]. Recently, the American Medical Association (AMA) officially declared obesity a disease [4]. According to the Centers for Disease Control and Prevention (CDC) and the AMA, individuals who are considered overweight or obese have a greater risk of developing many disorders including type 2 diabetes, hypertension, heart disease, stroke, non-alcoholic fatty liver disease, osteoarthritis, depression and some types of cancer [5]. Moreover, morbidity for these conditions increases as body mass index (BMI), a measure of body fat based on height and weight, increases. Approximately 2.8 million adults die each year due to overweight or obesity, which together rank as the fifth leading risk factor for death worldwide. Additionally, low- and middle-income countries are facing the double burden of obesity and malnutrition, as less expensive foods tend to be higher in calories and lower in micronutrients [3].

Figure 1. Obesity rates in the US. Obesity rates have been increasing steadily in recent years [6, 7].

The direct and indirect costs of overweight/obesity and the associated health problems place an enormous burden on our healthcare system. Approximately 44% of the diabetes health burden, 23% of the ischemic heart disease burden, and between 7% and 41% of the cancer burden are associated with overweight and obesity [3]. In 2008, the medical costs associated with obesity were estimated at 147 billion dollars, and the per capita medical spending was 42 percent higher for an insured obese patient than for someone of normal weight [8]. The total cost of diabetes healthcare, just one of the conditions associated with obesity, was estimated at 245 billion dollars in 2012, and the prevalence was estimated at 25.8 million sufferers, which represents a 41% increase over five years (see Figure 2) [9]. Studies have shown that obesity influences aggregate healthcare spending. For example, data collected by the Behavioral Risk Factor Surveillance System (BRFSS) at the CDC revealed that obesity rates increased by 37 percent (obesity prevalence rose from 18.3 percent to 25.1 percent) between 1998 and 2006, which suggests that the increased occurrence of obesity may be contributing to increases in total medical spending [8].
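The 37 percent figure above is a relative increase, not a difference in percentage points. As a minimal sketch, the following Python snippet reproduces the arithmetic from the two prevalence values quoted in the text; the function and variable names are illustrative and are not taken from the cited study.

```python
def relative_increase(old: float, new: float) -> float:
    """Return the change from old to new as a percentage of old."""
    return (new - old) / old * 100.0


# Obesity prevalence (percent of adults) as quoted in the text.
prevalence_1998 = 18.3
prevalence_2006 = 25.1

# Difference in percentage points versus relative (percent) change.
absolute_change = prevalence_2006 - prevalence_1998
relative_change = relative_increase(prevalence_1998, prevalence_2006)

print(f"absolute change: {absolute_change:.1f} percentage points")
print(f"relative change: {relative_change:.0f} percent")
```

The distinction matters when comparing studies: a rise of about 6.8 percentage points and a rise of about 37 percent describe the same change.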

A recent study employing nonlinear regression models predicted that by 2030, 42% of the US population will be obese and the resulting increase in healthcare spending will total 549.5 billion dollars [10]. If these estimates are accurate, obesity prevalence will increase by 33% over the next two decades, interfering with efforts toward healthcare cost containment. Thus, understanding, preventing, and reducing obesity would confer great economic and health benefits. Both genetic and environmental factors such as antibiotic treatment, gut dysbiosis, high-fat diet, high-fructose corn syrup, processed food, and inactivity have been implicated as contributing factors in obesity, which is a risk factor for type 2 diabetes, colorectal cancer (CRC), and perhaps even inflammatory bowel disease (IBD) [11].

Figure 2. New cases of diagnosed diabetes among US adults. If the current trends continue, 1 in 3 adults will have diabetes by 2050 [12].

The human gastrointestinal (GI) tract contains a diverse microbial community that greatly influences human health and coevolves with us [13]. Alterations in the population structure of this community can cause both beneficial and harmful effects on human health. Dysbiosis, which is a disruption of the healthy gut microbiota, is associated with gastrointestinal diseases such as obesity [14, 15]. In addition, gut dysbiosis is observed in malnutrition, systemic diseases such as type 2 diabetes, chronic inflammatory diseases such as inflammatory bowel disease (IBD), and colorectal cancer [16-19]. The role of the gut microbiome, the genomes of the microbes colonizing the gut, in human health and disease has become clearer through the application of high-throughput sequencing (HTS) technologies as well as other tools, both by individual researchers and through large-scale efforts.

Historical perspective of the human microbiome. In 2008, large initiatives such as the European Metagenomics of the Human Intestinal Tract (MetaHIT) and the US-based Human Microbiome Project (HMP) were undertaken to understand the complexity of the human microbiome. These efforts included large-scale metagenomic sequencing of the gut microbiota in healthy and diseased states as well as developments in computational analysis [20-24]. The first phase of the HMP emphasized the cataloguing of human microbiota and involved the deep metagenomic sequencing of DNAs derived from five body sites (oral cavity, gastrointestinal tract, skin, vagina and nasal cavity) and multiple subdomains within each of those sites in 250 healthy human subjects [25]. Additional resources generated by the program include reference genome sequences from over 1,000 strain isolates as well as 16S rRNA sequences containing taxonomic information from each body site.

One of the goals of the first phase HMP initiative was to determine whether a conserved core microbiome could be defined for healthy individuals [26]. However, one of the important observations of human microbiome data is the substantial inter-personal variability that exists in all site-specific microbiota [22, 27, 28]. The specific genera and species represented at each body site (domain) are unique and reflect millions of years of natural selection that created these differentiated microbial communities. Comparing the microbial residents at these domains across human subjects revealed surprisingly low levels of sequence conservation at the species level but increased similarity at progressively higher taxonomic levels [29]. Based on these observations, it was speculated that human microbial communities feature many highly related genera and species, and therefore those genomes may encode a high level of functional redundancy. Functional redundancy of individual species may strongly alleviate positive selective pressure for maintaining functionally equivalent species. Therefore, the massive complexity and diversity of microbiota encoding numerous successful and highly fit communities may naturally give rise to high inter-personal variation in the microbiota [20]. This speculation was strongly supported by metagenomic sequencing efforts that confirmed that phylogenetic heterogeneity is indeed a hallmark of human microbiota, but so too is the high-level conservation of encoded functions [22]. While compelling, these observations are biased to the extent that the most highly conserved functions, those encoded by all Eubacteria or major phyla, are also the most comprehensively annotated functions. Strain-specific genomic functions, particularly those functions that are difficult to identify in the laboratory, such as host interaction factors, are among the least well understood and most poorly annotated (hypothetical proteins) at the functional level.

The challenges associated with interpreting the relationships between human microbiota and human health and disease are great due to the rather extreme subject-to-subject heterogeneity. Case-control studies of human disease and their associated microbiota display geographical distinctions, which further complicates the identification of meaningful differences in the community structure. Finally, while the human microbiome has been implicated in a large number of important human diseases, the cause-effect relationship for the vast majority of these associations remains unclear and difficult to unambiguously define. It is likely that the dysbioses of microbiota in some instances will be causal in disease initiation and/or progression, whereas in other cases, the observed alterations in microbiota may be the effect of alterations in the disease microenvironment that positively select for varied microbial communities. Additional studies will be needed to clarify this important distinction for a number of human diseases.

Strong international efforts served to complement and enhance the NIH roadmap initiative. An important aspect of the international effort is to ensure that human microbiome research examines a broad range of races, ethnicities and cultures. The MetaHIT project focused on correlations between the gut microbiome, which is the most complex and plays a particularly important role in human health and well-being, and intestinal diseases such as IBD and obesity. In one study, the gut microbiomes from 124 subjects, including healthy individuals and those with IBD or obesity, were sequenced to create a database of non-redundant genes from the GI tract [20]. The data included the characterization of 3.3 million non-redundant microbial genes, which revealed that 40% of the microbial genes were shared among the majority of subjects and represented a core metagenome. Both the MetaHIT and HMP consortia have produced invaluable microbial databases that have revealed the significant variation in microbial species and genes in the gut environment. This body of work as well as individual studies have furthered our understanding of the definition of a “healthy” gut microbiota/microbiome and provided correlations between the gut community and GI disease.


The global interest in the human microbiome is strong and is based, at least in part, upon the realization that the microbiomes present at various sites of the human body encode an enormous number of gene functions. The gut microbiota has even been referred to as a hidden metabolic “organ” due to its connection with host metabolism, nutrition, and immune function [30]. Among the more compelling concepts gleaned from human microbiome data is that the microbiome is not simply a potential etiological agent of disease, but may be more appropriately viewed as having the capacity to encode the functions required for optimal human health [31]. In this regard, microbiome research may benefit from an improved understanding of both aspects of the microbiome’s capacity. Our future ability to modulate the microbiota through therapeutic intervention and ecological restoration, creating a “healthy” community from one that promotes disease, is a subject of great interest [32]. The expanded development of pre- and probiotics is being aggressively pursued in Europe but has yet to become part of a strong US research initiative [33]. Similarly, it is likely that health-promoting compounds synthesized by the microbiome will be actively mined and may represent society’s next generation of drugs.

The large international framework for human microbiome research has fueled numerous studies that have sought to establish links between the function and activities of the microbiota and a variety of human diseases. In this regard, the study of the human microbiome has been very productive. Over the past decade, associations have been established between microbiomes and human health as well as a surprisingly large number of significant human diseases and conditions including obesity, type 1 and 2 diabetes, metabolic disease, colon cancer, IBD, Irritable Bowel Syndrome (IBS), psoriasis, dental caries and periodontal disease, autism, and non-alcoholic liver disease. Microbiome research is now part of the research agenda in virtually every field of medicine including gastroenterology, dermatology, gynecology, psychiatry, neurology, endocrinology, cardiology, cariology, periodontology, oncology, and urology, and encompasses diverse fields such as immunology, microbiology, behavioral science, and nutrition. The dynamic nature of this field is unique. The pace of this growth over the past decade is a reflection of the many fields of research that are impacted by the human microbiome, which has shifted the outlook and perspectives of scientists and clinicians in the collective quest to improve human health. It is conceivable, if not expected, that the rate of growth in this exciting field of research will continue to increase, thereby accelerating basic science discovery and advances in therapeutic and applied developments for many years to come. A simple literature search in the PubMed database using the search term “microbiome” illustrates the rapid growth in this field (Figure 3) [34]. Additional technological breakthroughs will be required to enable microbiome research to expand beyond its strong dependence on DNA sequencing technology so that the functional relationships between the vast networks of bacterial, fungal, viral and bacteriophage residents in the microbiota may be better understood, as well as the means by which those relationships provide fitness advantages to the human host, help to maintain homeostasis, and promote human health.


Figure 3. Microbiome research publication trends. PubMed search term = “microbiome” retrieved 4005 articles; 2012=1661, 2011=1082, 2010=691, 2009=280, 2008=148, 2007=63, 2006=38, 2005=21, 2004=16, 2003=5 articles.
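The yearly counts in the Figure 3 caption can be checked and summarized with a few lines of Python. This is a minimal sketch that uses only the numbers given in the caption; the variable names are illustrative, and the year-over-year ratio is just one simple way to express the growth the text describes.

```python
# Articles per year, copied from the Figure 3 caption.
articles_per_year = {
    2003: 5, 2004: 16, 2005: 21, 2006: 38, 2007: 63,
    2008: 148, 2009: 280, 2010: 691, 2011: 1082, 2012: 1661,
}

# The caption reports 4005 articles in total; the yearly counts sum to that.
total_articles = sum(articles_per_year.values())

# Ratio of each year's count to the previous year's count.
years = sorted(articles_per_year)
growth = {
    year: articles_per_year[year] / articles_per_year[year - 1]
    for year in years[1:]
}

print(f"total articles: {total_articles}")
for year, ratio in growth.items():
    print(f"{year}: {ratio:.2f}x the previous year's count")
```

Every ratio is greater than one, so the count rose in each year of the series.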

Figure 3 plot: PubMed cited articles per year (y-axis, 0 to 1800) for 2003 through 2012 (x-axis).

It has been estimated that the gut microbiota consists of hundreds of member species with unknown numbers of strains. The large interpersonal differences in microbiota suggest that the combined human population’s microbial gene pool may approach one billion genes (assuming >1,000 species/individual × ~3,000 genes/genome × 7 billion person world population). The scale of the human microbiota at the worldwide population level makes defining its role in human health and disease a daunting task.
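As a rough check on the scale implied by these assumptions, the following Python sketch multiplies them out. The variable names are illustrative. The raw product is an upper bound that treats every gene in every person as distinct; it is many orders of magnitude larger than one billion, so the estimate in the text presumably assumes that most genes are shared between individuals, a step the text does not spell out.

```python
# Assumptions as stated in the text (order-of-magnitude values).
species_per_individual = 1_000
genes_per_genome = 3_000
world_population = 7_000_000_000

# Microbial genes carried by one person under these assumptions.
genes_per_individual = species_per_individual * genes_per_genome

# Upper bound on the worldwide gene pool if no gene were shared
# between any two people (an unrealistic extreme, used only as a bound).
naive_upper_bound = genes_per_individual * world_population

print(f"genes per individual: {genes_per_individual:,}")
print(f"upper bound with no sharing: {naive_upper_bound:.1e}")
```

Under the same assumptions, a pool of one billion distinct genes equals the gene content of a few hundred individuals with no overlap at all, which gives a sense of how much sharing the estimate implies.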

Historically, microbiologists and clinical microbiologists have applied cultivation-based approaches to understand the diversity and functional significance of bacterial representation in microbial communities. These studies focused predominantly on the oral cavity and the gut [35]. While these efforts provided invaluable insights and resources, the labor-intensiveness of these characterizations coupled with the inability to cultivate most of the species comprising human microbial communities has led researchers to conclude that the comprehensive, large-scale characterization of human microbiota using these classical techniques is not practical. The development of inexpensive high-throughput culture-independent methods to catalogue the taxa present in microbiota has revolutionized our ability to characterize the microbiome population structure in both health and dysbiosis in association with a number of human diseases [36]. While these sequence-based approaches continue to fuel ongoing and future studies, there is a need to complement them with approaches that will allow an appropriate scale of functional characterization. Numerous studies today are combining culture-independent methodologies with traditional microbiology to allow the functional aspects of the microbiota to be better resolved [37, 38].

Clinical microbiology and infectious disease research also have significant histories and have established important paradigms and perspectives regarding the processes and approaches taken to understand and combat infectious disease. In the recent past, infectious disease research did not consider the microbiome as relevant to health and disease and held a reductionist view that included only the human host and a single infectious agent [39]. In this context, microbiome research has inspired infectious disease researchers to consider the etiology of polymicrobial disease, in which many species may be either over- or under-represented. One of the early lessons learned from human microbiome characterization is that microbial communities are relatively stable in healthy adults throughout their adult life; however, the use of antibiotics, changes in diet, and other yet to be defined factors may alter the stability of the microbiota, creating a state of dysbiosis [40]. These observations have provided specific insights that potentially link known environmental factors of disease predisposition to alterations in microbial homeostasis, thereby strengthening suspicions that the microbiota may play a causative role in human disease initiation or disease progression. The paradigm shifts associated with our improved perspective of the human-associated microbiota raise some specific and novel challenges, as we are not yet adept at fulfilling Koch’s postulates in the context of these communities or at defining the mechanistic basis of microbiomes as etiological agents of polymicrobial disease [41].

Genomic technologies. The determination of the composition and functional capacity of the gut microbiome is a challenge. The rapid advancement in genomic technologies represents a critical factor that greatly facilitated the accelerated characterization of the human microbiome in terms of microbial composition, genetic content, and function. The infrastructure, tool development, and bioinformatics support structures were being driven by rapid advances in genomic sciences that were focused on the determination of a large number of microbial, eukaryotic, vertebrate and mammalian genomes. These endeavors, previously considered out of reach, were being completed based on brute-force sequencing and the implementation of large fleets of high-throughput 96-well capillary sequencers (Figure 4a and 4b). These instruments employed the Sanger sequencing method, referred to as long-read technology, and were capable of simultaneously generating around 750 base reads from 96 clones per instrument in a few hours. The largest sequencing facilities housed hundreds of these instruments. Despite these advances, the complexity and community structures of the microbiota demanded even greater advances in DNA sequence generation capacity and cost reductions. Short read technologies such as the Roche 454 platform offered 1 million sequences in a single run with intermediate read lengths of 250 bases, later improved to 450 bases [42]. Shortly thereafter, the competing ABI SOLiD and Illumina GSX II sequencing instruments were introduced and provided many millions of short 35 base reads [43, 44]. In subsequent years, the Illumina platform was improved and has become the dominant technology for microbiome research, as it currently facilitates the generation of 3 billion 150 base reads per run. The short-read next generation (NexGen) DNA sequencers began to meet the needs of researchers who had undertaken the monumental task of characterizing the human microbiome.
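The per-run figures quoted above can be put on a common scale by multiplying reads per run by read length. The following Python sketch does this using only the numbers in the text; the class and platform labels are illustrative, and real instrument yields vary with chemistry and run configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Platform:
    """Per-run figures for one sequencing platform, as quoted in the text."""
    name: str
    reads_per_run: int
    read_length: int  # bases per read

    def bases_per_run(self) -> int:
        # Nominal yield: number of reads times bases per read.
        return self.reads_per_run * self.read_length


platforms = [
    Platform("Sanger capillary, 96-well", 96, 750),
    Platform("Roche 454, early", 1_000_000, 250),
    Platform("Roche 454, later", 1_000_000, 450),
    Platform("Illumina, as described in the text", 3_000_000_000, 150),
]

for platform in platforms:
    print(f"{platform.name}: {platform.bases_per_run():,} bases per run")
```

By this calculation a capillary run yields about 72 kb, consistent with the roughly 75 kb per run given for the ABI 3730 in the Figure 4 caption, while the Illumina figures correspond to hundreds of gigabases per run.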

Figure 4. a, The ABI 3730. This sequencing instrument has a throughput capacity of ~75 kb/run [45]. b, The Illumina HiSeq 2000. This sequencing platform is capable of generating over 500 Gb of sequence data/run and 200 million sequence reads/lane [46].

The continued growth of DNA sequencing technology development allowed warehouses filled with sequencing instruments to be replaced by a single tabletop instrument and provided a means for most biological research laboratories to generate huge quantities of data and to conduct sequence-based investigations that involved large numbers of samples. In many regards, the ability to generate DNA sequence data pertaining to microbiomes is outpacing both upstream (sample procurement and library construction) and downstream (data storage and comprehensive bioinformatics analysis) activities. The sharply decreased costs of DNA sequencing, coupled with the dramatic increases in data generation, have propelled human genome and microbiome research into the realm of what was not possible a decade ago (Figure 5). Within a 5-year timespan, microbiome studies have progressed from the sequencing of hundreds of 16S rDNA clones to large projects like MetaHIT, which generated terabases (10^12 bases) of metagenomic sequence data [20, 47]. The genomics research community was well poised to carry out the initial microbiome cataloguing efforts that provided the necessary framework for more detailed investigations, thus accounting for the exponential growth in microbiome research we are witnessing today. Attention is now turning towards finding complementary high-throughput approaches to enable the functional characterization of the human microbiome [48]. Many of the high-throughput approaches being developed and applied to the newly defined field of systems biology, which includes genomics, metabolomics, transcriptomics and proteomics, represent attractive techniques due to their high sensitivity and high content data generation potential.


Figure 5. Decreasing DNA sequencing costs. The rate of cost reductions in DNA sequencing, portrayed as cost/human genome, is currently outpacing Moore’s law, which suggests that data storage and analysis may become tight bottlenecks in the future [49].


Systems biology application and the microbiome. Among the omics sciences, the fields of metagenomics, transcriptomics, proteomics and metabolomics are well poised with high-throughput technologies that contribute to the functional characterization of the human microbiome. The use of multi-omics approaches, or systems biology, is becoming more common in investigations of the human microbiome [39, 50, 51]. Both metagenomics and transcriptomics as applied to the characterization of the human microbiome leverage the massive DNA sequence production capacities of NexGen instruments. Metagenomic analysis of human microbiomes involves the random or shotgun sequencing of genomic DNAs derived from microbial communities. While not expression based, analysis of metagenomic DNA sequences does provide functional clues as reflected by the genes present in the community [26]. Conceptually, this approach is based on the observation that many microbiomes display broad shifts in the taxonomic representation of community members, whereby some species/genera may be sharply reduced in abundance, whereas others may be increased. The genes encoded in the genomes of species undergoing abundance shifts necessarily change their abundance within the community in parallel. Thus, the application of metagenomic analysis of microbiomes is potentially fruitful, providing an objective way of assessing whether specific functional attributes of the microbiome have undergone higher-order changes.

The generation of many millions of genomic DNA sequence reads followed by their functional annotation using automated microbial gene annotation pipelines allows the direct comparison of the gene representation of microbiomes. Comparing frequency tables for the representation of genes belonging to specific gene ontology (GO) categories or biochemical pathways allows specific hypotheses to be formed and biological insights to be gained [52]. The metagenomic data can also provide a powerful means of comparing the taxonomic representation of microbiomes. The data can be used to perform BLAST analyses such that informative gene sequences with high levels of sequence identity to reference genome sequences or annotated sequences present in the National Center for Biotechnology Information’s (NCBI) GenBank database can be tabulated and assigned to specific genera and species [53, 54]. As the number of reference genomes continues to expand, this approach is being used more frequently as a result of the shortcomings of 16S rRNA profiling [55]. Disadvantages of 16S rDNA sequencing include the introduction of polymerase chain reaction (PCR) bias and the fact that different subregions of the gene provide varying levels of taxonomic resolution. While 16S rRNA studies allow compositional analysis, they do not directly reveal the functional potential of the population. A significant advantage of the metagenomic DNA analysis approach is the ability to combine accurate phylogenetic classification with gene functional representation analysis such that the specific sources of over- or under-represented gene functions may be viewed in the context of the species encoding those functions in the community.
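The comparison of functional frequency tables described above can be illustrated with a minimal sketch. It is not the annotation pipeline used in this work: the function name, the category labels, and the read counts are hypothetical, and the pseudocount of 1 is an assumed convention to avoid division by zero for categories absent from one sample.

```python
from math import log2

def log2_fold_changes(counts_a, counts_b, pseudocount=1.0):
    """Compare the relative representation of functional categories.

    counts_a, counts_b: dicts mapping a category label to the number of
    annotated reads assigned to it in each metagenome.
    Returns {category: log2(frequency_b / frequency_a)}.
    """
    categories = sorted(set(counts_a) | set(counts_b))
    # A pseudocount is added to every category (assumed convention) so that
    # categories missing from one sample do not cause division by zero.
    total_a = sum(counts_a.get(c, 0) + pseudocount for c in categories)
    total_b = sum(counts_b.get(c, 0) + pseudocount for c in categories)
    changes = {}
    for c in categories:
        freq_a = (counts_a.get(c, 0) + pseudocount) / total_a
        freq_b = (counts_b.get(c, 0) + pseudocount) / total_b
        changes[c] = log2(freq_b / freq_a)
    return changes

# Hypothetical read counts per functional category (illustrative only).
control = {"carbohydrate metabolism": 400, "lipid metabolism": 100, "motility": 100}
treated = {"carbohydrate metabolism": 800, "lipid metabolism": 100, "motility": 100}

changes = log2_fold_changes(control, treated)
for category, lfc in sorted(changes.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{category}: {lfc:+.2f}")
```

Because the comparison is made on relative frequencies, a category whose raw count is unchanged can still appear under-represented when another category expands. A formal analysis would add a statistical test and multiple-testing correction on top of such a table.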

In addition to gene content, microbiome researchers seek to understand the activity and interactions of these genes and their products within metabolic pathways. Transcriptomics is being applied to microbiome analysis as a means of assessing the overall expression state of the community in the context of health and disease [56-58]. The introduction of RNA-Seq methodologies that exploit NexGen DNA sequencing platforms provided a robust means for mapping cDNA reads onto reference genomes or functionally annotated microbiome databases of interest. Currently, one drawback of this approach is the lack of commercial software to conduct the mapping of cDNA sequence reads onto the appropriate reference genomes. Therefore, this type of analysis requires a strong bioinformatics support structure that allows computationally expensive analyses to be conducted. The primary advantage of the transcriptomic analysis of microbiomes is that, unlike metagenomic analyses, this approach allows one to identify those genes encoded in microbial genomes that are expressed or repressed in health and in disease, thereby narrowing the focus from the full gene complement to a smaller subset of genes that may serve important biological functions in those contexts.

Proteomics analysis of microbiomes has enabled the identification of those proteins that are expressed and potentially contain post-translational modifications in a variety of biological contexts [59-61]. Given that the correlation of RNA and protein expression cannot be broadly assumed, analysis of protein expression is widely considered a more accurate reflection of the expression state of the cell or microbial community [62]. Advances in mass spectrometers have been ongoing for many years and have led to improvements not only in their resolution but also in their sensitivity, thus enabling the large-scale identification of proteins [63, 64]. Developments in protein separation have also driven the improved identification of proteins derived from highly complex samples [65]. Despite these improvements, proteomics technologies suffer from limitations in dynamic range and, as with many brute-force random sampling approaches, the most abundant proteins are oversampled at the expense of observing other less abundant but potentially important proteins. For this reason and others, the comprehensiveness of proteomic characterization of human microbiomes lags far behind the DNA sequence-based approaches. Although only a few studies have been published on the metaproteomics of microbiome samples, metaproteomics provides a useful means for establishing a fundamental understanding of the most abundant functions encoded and expressed by microbiomes, the degree to which proteins undergo modifications such as phosphorylation, and the extent to which the expressed microbial proteins interact with host cell machinery. It is possible that this technology will be applied in a more specific and/or selective manner. For example, researchers in the field of chemical proteomics have developed chemical probes, often substrates for enzymes, that enable the quantification of protein functions and activities based on simple readouts [66]. While these methods hold great promise, the development of these chemical probes is laborious and has not yet enabled large-scale, multi-function characterization. However, developers may adapt proteomics technology in ways not yet conceived in order to apply this powerful technology to the human microbiome.

Nearly every human disease for which the microbiome has been implicated as a potential etiological factor is a multi-factorial disease. Human genetics, environment, ethnicity, age, diet, epigenetics, and the microbiome may all play a role in the transition from health to disease. Perhaps in the future, systems biology will come to define approaches where all of the inherent risk factors of a disease can be explored simultaneously rather than as separate fields lacking cross-talk. The complexity of human disease demands multi-disciplinary approaches in order to demonstrate how the microbiome is a contributing factor. The further integration of previously disparate fields will enable a true systems-level understanding of human health and disease.

My thesis work employed cutting-edge genomic, proteomic and functional genomic methodologies to gain insights into the potential role of the gut microbiome in the newly developed sub-therapeutic antibiotic-treated (STAT) mouse model of obesity [67]. We employed a systems biology approach to examine the human gut microbiome in obesity.

The hypothesis of this dissertation is that the structure and activities of the human gut microbiome impact health and disease in the context of obesity. The composition of the gut microbiome is stable unless disturbed by outside variables such as antibiotic treatment. While farmers have exploited sub-therapeutic antibiotic treatment for growth promotion in livestock for over 50 years, the mechanism of action remains unknown. We generated deep DNA sequence coverage and gene annotation of the gut microbiomes of lean and obese mice in order to compare the representation of metabolic pathways. Although proteomics is less sensitive than the metagenomics strategy, we also compared profiles of abundant proteins in fractionated microbiome samples derived from lean and obese mice using nLC-MS/MS (nano liquid chromatography electrospray ionization tandem mass spectrometry). The combination of metagenomic and proteomic data allowed us to construct a model for major metabolic processes in the mouse gut. We identified differences in under- and over-represented gene/protein functions. The statistical significance of the changes, often apparent at the level of biochemical pathways, was determined and assessed in the context of health and disease.


Chapter 2: Literature Review

The Gut Microbiome. The term human microbiome refers to the microbes, and their genomes, that live in or on a human being. Overall, the human microbiota contains about 10-fold more cells than there are human cells in the body and 150-fold more genes than the human genome [68, 69]. Accommodating the highest cell densities documented for any ecosystem, the human gut contains around to bacteria [70, 71] with a diversity of at least 1000 species [72]. There are over 50 phyla on the planet, but the predominant phyla in the mammalian intestine include Bacteroidetes, Firmicutes, Proteobacteria, Actinobacteria, and Fusobacteria [29]. With its 400 m² of surface area, the gastrointestinal (GI) tract represents one of the largest interfaces for host-microbe interactions [70, 73]. The microbiota interacts with the host in a dynamic way, contributing to both health and disease. The gut microbiota performs important protective and metabolic functions, such as fermentation of indigestible plant components and endogenous mucus, using genes encoding enzymes and biochemical pathways not contained within the host genome [47]. However, changes in the composition of the gut microbiota due to environmental exposures, such as antibiotic treatment, or polymorphisms in host genes that are involved in the regulation of inflammatory responses in the gut may lead to pathogenesis. It has been proposed that changes in host-microflora relationships contribute to various diseases such as obesity, inflammatory bowel disease, and colorectal cancer [74]. The underlying mechanisms by which the gut microbiota affects host metabolism and contributes to obesity are not well defined but likely include pathways related to lipogenesis, fatty acid oxidation, inflammation, and the gut endocannabinoid system [9, 17, 75, 76].

Gastrointestinal and Microbiome Function. The information encoded by the mammalian genome is not sufficient to maintain health and protect against disease. However, some or most gut bacteria are mutualists and provide many key health functions to the host such as adequate digestion, metabolism, nutrient biosynthesis, angiogenesis, detoxification, enterocyte growth, defense against colonization by opportunistic pathogens, and immune system development and function [77-81]. The gut microbiota provides enzymes for the synthesis of vitamin K, various B vitamins, and amino acids [78]. A major function of this bacterial population is energy harvest through the metabolism of otherwise indigestible complex polysaccharides from food and the subsequent generation of short-chain fatty acids (SCFA) [75]. The GI tract is also a site of production for hormones involved in energy homeostasis (such as insulin, glucagon, leptin, and ghrelin) and growth (such as glucagon-like peptide 1 (GLP-1) and glucose-dependent insulinotropic polypeptide (GIP)) [82]. Symbiotic bacteria interact with the host in dynamic ways to influence the development and function of the intestinal architecture and immune system.

The colonic mucosa is protected by the innate and adaptive immune systems. The mucosal-associated lymphoid tissue (MALT) comprises the largest component of the body’s total immune tissue and contains over three quarters of all lymphocytes. The majority of these cells are located in the gastrointestinal tract within the gut-associated lymphoid tissues (GALT), which include the Peyer’s patches, appendix, isolated lymphoid follicles, and the associated mesenteric lymph nodes [83]. The largest epithelial cell surface in the body lines the GI tract and provides a site for numerous cell interactions that regulate host responses to enormous quantities of food and bacterial antigens. Highly glycosylated mucins and the glycocalyx form a barrier between the lumenal contents and the gut epithelium [84]. The gut epithelial barrier, a single layer of cells overlaying the lamina propria, prevents food and microbial antigens in the intestinal lumen from interacting with the host immune system. The gut epithelial cells contain tight junctions in the paracellular space, which act as regulated gateways that respond to various cytokine and bacterial signals [85]. Specialized microfold cells (M cells), which are interspersed in small numbers with the conventional enterocytes in the follicle-associated epithelium, take up antigen from the lumen by endocytosis or phagocytosis; the antigen is subsequently transported via transcytosis to the basal cell membrane and released into the extracellular space [86]. Some pathogens, such as retroviruses as well as Salmonella, Shigella, and Yersinia species, exploit this route of access to the subepithelial space, which contains an abundance of macrophages and effector lymphocytes [87]. Dendritic cells not only receive antigen from vesicles released at the basal cell membrane of M cells but also send processes through an intact epithelium in order to sample lumenal antigens.

The gut epithelium represents the largest surface for the interaction between the microbiota and host tissues. Host interactions as well as gut microbial composition affect gut barrier function. The barrier function of the intestinal epithelium is enhanced by mucus layers (one layer in the small intestine and two layers in the colon) and host-produced immune factors [88]. Paneth cells produce antimicrobial peptides, such as α-defensins, lysozyme C, phospholipases, RegIIIγ (also produced by enterocytes), and C-type lectin, that enhance innate immunity [89]. Effector cells of the adaptive immune system secrete IgA to enhance barrier function and restrict bacterial invasion [90]. Lipids of the endocannabinoid system, such as 2-arachidonoylglycerol, which reduces both metabolic endotoxemia and systemic inflammation, regulate barrier function and inflammation [91, 92]. In addition to host interactions, the mucosal barrier function also operates with the support of the normal gut microbiota, which helps to prevent colonization and invasion by pathogens both directly and indirectly through interactions with the host immune system. The healthy gut does not generate productive immunity against these residents; however, antibiotic treatment and other environmental disruptions provide opportunities for pathogens or pathobionts, such as Clostridium difficile, to thrive in this ecological niche [93]. Antibiotic treatment has the side effect of changing the gut community architecture and is linked to an increased risk of developing C. difficile colitis and asthma [69, 94]. Some commensal bacteria also inhibit colonization by competing for nutrients and inhibiting the pro-inflammatory nuclear factor-kappa B (NFκB) mediated signaling pathways that pathogens activate and rely upon for invasion. This pathway inhibition is achieved by inhibiting the degradation of the inhibitor of kappa B (IκB), which is bound to the transcription factor NFκB in the cytoplasm, or by activating peroxisome proliferator activated receptor-γ (PPARγ), which removes NFκB from the nucleus [95].

Modulators of Microbiota Population Structure and Microbial Succession. Studies of the human microbiota have revealed the complexity and unique phylotypes harbored in distinct ecological niches and the high degree of interpersonal variability of these microbial populations [27, 28]. Additional studies have demonstrated the overall stability of the gut microbiota in healthy adults [40]. The relative impact of environmental and genetic factors in shaping the adult microbiome has been evaluated but contradictory results have hampered definitive conclusions. An important question in terms of understanding how dysbioses of microbiota originate and are perpetuated involves gaining a greater appreciation of how microbial populations form and how that evolutionary process contributes to the inherent stability of microbial populations.

Humans are born essentially sterile but quickly acquire microbiomes that populate a variety of host surfaces and tissues. The pioneer colonizers are distinct from the phylotypes observed in the adult microbiome. In early life, neonates develop a microbiota that over time presumably matures into a stable and highly fit community that maximizes the overall efficiency of its individual components. Studies examining the microbial succession associated with the formation of the adult microbiota in early life provide a powerful framework to address fundamental questions with regard to how microbiota form stable states of homeostasis [96]. The pioneering studies detailing the microbial succession of microbiota initiated at birth have demonstrated that the maternally inherited, low-diversity microbiota is unstable but matures over the first years of life to form a stable configuration with an elevated representation of niche-adapted phylotypes. Comparisons of the microbiota derived from vaginally and Cesarean section (C-section) delivered babies indicate that despite the distinct taxonomic representation of the initial microbial inocula, they evolve dynamically and display increasing relatedness to one another over time as climax communities are formed [97-99]. These findings illustrate the dynamics of microbial succession during early life that involve the “replacement” of vertically acquired microbiota with phylotypes that are more highly adapted for competition in specific ecological niches such as those found in the gut.

The population structure of the human microbiota displays a phylogenetic topology similar to that first described in the gut, featuring “deep fan structures” with impressive radiations of species derived from a substantially smaller set of genera [100, 101]. The phylogenetic representation of related species within bacterial communities confers functional redundancy to the community and represents, at least in part, the basis of the observed interpersonal variation in human microbiota. It is possible that most phylotypes in the microbiota are functionally dispensable since their contribution to the community fitness may be easily replaced by related phylotypes. High levels of functional redundancy and interpersonal variation of microbial communities suggest that microbiota may be capable of adopting a very large number of configurations or stable states.


To understand the dysbioses displayed by microbiota in association with various human diseases, it is important to first understand how the microbiota establishes homeostasis. Microbial succession is driven at least in part by the fact that the metabolic activities of the initial pioneer community alter the virginal ecosystem, thereby providing novel opportunities for subsequent community succession that in turn broadens the functional complementarity of species forming the community. This process is iterative and, in the context of the developing human microbiota, represents a dynamic process that entails the replacement of the inocula acquired at birth with niche-adapted species. It may be appropriate to view this process as an optimization of the community with a trend towards elevated fitness and increased complexity of functional networks over time. The functional networks involve those occurring between groups of species within the community and those between the microbiota and the host [102, 103]. The resilience of microbiota is defined as the ability of a community to re-attain its original structure following a defined perturbation. The resilience of a microbiome may be a direct reflection of the number and robustness of such networks. The number and nature of selective forces driving successional communities and network formation are largely unknown at present.

The microbial colonization of human microenvironments is initiated at birth. The initial inoculum for vaginally delivered babies is derived primarily from the mother’s vaginal and fecal microbiome and is typically dominated by Lactobacillus, Prevotella, or Sneathia spp. In C-section births, the skin of individuals handling the newborns is the primary source of the initial microbiota, which is dominated by distinct taxa that include Staphylococcus, Corynebacterium, and Propionibacterium spp. [104, 105]. The diversity of both gut and oral microbiota is higher in vaginally delivered than in C-section delivered infants [98, 106, 107]. The microbiota of family members are more similar to each other than to those of unrelated individuals, suggesting that human genetic and/or environmental factors, such as physical contact and diet, may play an important role in shaping the community composition of microbiota [26, 108]. Detailed investigations examining twin pairs revealed that their gut microbiota were more similar to each other than those of one twin were to those of its parents or non-twin siblings [26]. However, the similarity between monozygotic (mz) twin pairs was only marginally greater than that between dizygotic (dz) twin pairs, suggesting that environmental factors rather than host genetics play a role in microbiota development [26]. A subsequent analysis employing substantially deeper sequencing of the microbiota derived from twin pairs noted a greater similarity between the microbiota of mz compared to dz twin pairs, which provided some evidence for a host genetic basis for microbiota assembly [109]. This is consistent with studies of inbred mouse strains that concluded that the microbiota of littermates are more similar to each other than to those derived from separate lines. Further support for a host genetic role came from the demonstration that mice of different genetic backgrounds display varying levels of inter-mouse community similarity [110]. Similarly, mouse quantitative trait loci (QTL) detection identified 18 significant host QTL displaying genome-wide linkages [28]. These QTL impacted features of the microbial community such as species representation. The increasing number of identified genetic loci (e.g. TLRs, NOD-1 and -2, IL-10, NLRP6) that impact aspects of microbial community composition provides further evidence of a host genetic component in microbiota assembly [111-114]. The potential influence of the environment on microbiota development was illustrated by a study demonstrating that mice reared together display greater similarity in microbiomes than littermates separated just after birth and reared in separate cages [27, 28, 115]. Diet also exerts a strong influence on gut microbiota formation [26, 116-118]. In both the gut and oral cavity, breast-fed and formula-fed babies develop similar but distinct microbiota [119, 120]. Children exposed to antibiotics one or more times in early life display altered degrees of community resilience, suggesting that resilience and the response of the microbiota to external perturbation may be a personalized trait [121, 122]. Taken together, the determinants of early microbial succession, its trajectories and the climax communities that are formed are complex and may involve a combination of maternal, genetic and environmental factors.
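Statements that two microbiota are "more similar" to each other rest on a quantitative dissimilarity measure between community profiles. The sketch below uses Bray-Curtis dissimilarity as one common choice; it is an assumption for illustration and not necessarily the metric used in the cited studies. The function name, sample names, and counts are hypothetical, and each profile is scaled to the same total (100) so that sequencing depth does not drive the result.

```python
def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity between two taxon-count profiles.

    Returns 0.0 for identical profiles and 1.0 when no taxa are shared.
    """
    taxa = set(sample_a) | set(sample_b)
    difference = sum(abs(sample_a.get(t, 0) - sample_b.get(t, 0)) for t in taxa)
    total = sum(sample_a.get(t, 0) + sample_b.get(t, 0) for t in taxa)
    if total == 0:
        raise ValueError("both samples are empty")
    return difference / total

# Hypothetical genus-level profiles, each summing to 100 (illustrative only).
twin_1 = {"Bacteroides": 50, "Lactobacillus": 30, "Prevotella": 20}
twin_2 = {"Bacteroides": 40, "Lactobacillus": 30, "Prevotella": 30}
unrelated = {"Bacteroides": 10, "Staphylococcus": 90}

within_pair = bray_curtis(twin_1, twin_2)
between = bray_curtis(twin_1, unrelated)
print(f"twin vs twin: {within_pair:.2f}, twin vs unrelated: {between:.2f}")
```

In a twin or cohousing comparison, the distribution of within-pair values would be compared against the distribution of between-pair values rather than relying on a single pair as shown here.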

Microbial Succession. The development of the gut microbiota in early life has been addressed in several studies. The most comprehensive longitudinal study of a single individual (60 stool samples) from birth to age 28 months [123] revealed that the phylogenetic diversity of the microbiota increased over time [123-125]. The interpersonal variation in the microbial communities was higher in early life compared to adults, suggesting that microbiota development involves a convergence towards stable adult configurations. Primary microbial community succession is accompanied by dynamic changes that start in the first week or two of life. Analyses of gut succession communities in early life bear resemblance to models of punctuated equilibrium wherein brief periods of transient stability are followed by bursts of change [123, 124, 126]. The developing microbiota is perturbed by several environmental factors such as diet and antibiotic exposure. The rate at which communities restore equilibrium following a perturbation is known as elasticity. The stability or resilience of the microbiota describes its resistance to change following a perturbation. The mechanisms dictating these features in microbial communities are not well understood but may be personalized and define an individual’s stability landscape [40, 122]. Ecological models suggest that some community topologies or phylotypes will be highly resistant to change whereas others may be prone to larger change in response to the same perturbation. In the context of human microbiota, it is tempting to speculate that the resilience of the microbial community may be directly related to its maturity and the inter-connectivity of species networks within the community. Many such mutualistic interactions have been described in the oral cavity [127]. These cooperative interactions relate to a wide variety of environmental contexts, which include flux in dietary nutrients [128], O2 concentration [129], temperature [130], pH [131] and energy metabolism [132, 133]. The stability landscape of human microbial communities is of great interest, as such analyses will improve our understanding of how the microbiota establish and maintain homeostasis.

The maturation of the human microbiota is an example of ecological succession [134]. The pioneer species that initially colonize an ecological niche undergo consecutive changes in composition and function that culminate in a climax community displaying increased diversity and stability. The stability of communities in equilibrium states is influenced by community evenness, as communities dominated by relatively few species are less resilient than high-diversity populations [135]. Antibiotic administration may drive communities from one stable state to another, sometimes resulting in the development of a “degraded” state, as was illustrated when the gut microbiota of healthy volunteers subjected to two courses of ciprofloxacin at a 10-month interval were analyzed [136]. Functional response diversity describes the relative sensitivity of a phylotype to change following a perturbation (e.g. pH, redox state, etc.). High functional response diversity may be a hallmark of succession communities, thereby facilitating dynamic change. The parameters controlling such dynamics are likely to be complex and include a variety of positive and negative feedback. Negative feedback is thought to contribute to community stability since changes in the environment would be met with down-regulating feedback loops, controlled by the host and/or the microbial community itself. Positive feedback is thought to induce ecosystem change resulting from signals/effectors produced by phylotypes that provide a fitness advantage to them or other phylotypes, resulting in their increased representation. Positive and negative feedback loops may play a role in destabilizing the microbiota during regime changes, such as during succession in early development and/or following perturbation.
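Community evenness, referred to above as an influence on stability, can be quantified in several ways. The following minimal sketch uses the Shannon diversity index and Pielou's evenness as one common pair of measures; whether reference [135] used these particular indices is not stated here, so this choice is an assumption. The function names and the two abundance profiles are hypothetical.

```python
from math import log

def shannon_diversity(counts):
    """Shannon index H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * log(p) for p in proportions)

def pielou_evenness(counts):
    """Pielou's J = H / ln(S), where S is the number of observed taxa.

    J is 1.0 when all taxa are equally abundant and approaches 0 as one
    taxon dominates. For fewer than two taxa, 0.0 is returned here by
    convention, since J is otherwise undefined.
    """
    observed = sum(1 for c in counts if c > 0)
    if observed < 2:
        return 0.0
    return shannon_diversity(counts) / log(observed)

# Hypothetical abundance profiles with the same richness (four taxa).
even_community = [25, 25, 25, 25]
dominated_community = [97, 1, 1, 1]

print(f"even: J = {pielou_evenness(even_community):.2f}")
print(f"dominated: J = {pielou_evenness(dominated_community):.2f}")
```

Both profiles contain four taxa, so the difference in J reflects only how abundance is distributed among them, which is the property the text links to resilience.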

The human microbiota represents a complex assemblage of member species and a significantly larger set of gene functions. Collectively these gene functions provide the driving force for the establishment and maintenance of mutualistic relationships including both microbe-microbe and host-microbe interactions. One strategy microbes utilize to gain fitness advantage in the population is to engage in metabolic cooperation [137-139].

The intensive selective pressure and fierce competition for nutrients in the gut microenvironment may drive the subsequent shedding of genes involved in metabolic coding, thereby creating and/or deepening co-dependent relationships between pairs or groups of microbial species. It has been speculated that mutualism and syntrophic relationships abound in human microbiota. Evidence is quickly accumulating to support this conjecture [71, 140, 141]. A combination of in vitro microbiological co-culture studies and the use of gnotobiotic mice has led to a broadened appreciation of the evolutionary strategies underlying functional interactions that define higher-order networks within microbial communities [142, 143].

The Gut Microbiota Shapes Immune Responses during Health and Disease. A mutualistic relationship has evolved between symbiotic bacteria and the immune system. The microbiota seems to direct the development of the immune system and the immune system shapes the composition of the microbiota. Mutualists have been shown to prevent inflammatory disease during colonization in contrast to pathogens that induce immune responses that lead to tissue damage. However, the healthy microbiota has been shown to induce inflammation under certain conditions. Thus, the healthy gut microbiota exerts both pro- and anti-inflammatory responses. The symbionts that induce an inflammatory response under certain conditions are called pathobionts. Moreover, altered microbiota composition or dysbiosis may lead to altered immune responses that lead to inflammatory disorders.

Studies using gnotobiotic mice have revealed several important effects of the microbiota on the host immune system. Gnotobiology uses the selective colonization of germ-free or sterile mice in which immune responses have not been influenced by microorganisms.

Gut bacteria affect the morphology of the gut. Germ-free mice display significant defects in gut-associated lymphoid tissue (GALT) development and have fewer and less developed mesenteric lymph nodes and Peyer’s patches [144]. The intestinal epithelial cells (IECs) of germ-free mice, which line the gut and sequester the lumen from immune cells, display altered microvilli architecture with shorter crypts containing fewer cells [145]. Gnotobiotic mice display significant reductions in the size of all peripheral lymphoid organs, serum immunoglobulin levels, and all immune responses, while wild-type animals have higher levels of IgA as well as T cell populations that recognize commensal bacteria [146]. The T cells residing within the GALT may indeed recognize commensals; however, the systemic immune system remains in a state of ignorance or unresponsiveness due in part to anatomical compartmentalization and the quiescent state of most intestinal dendritic cells.

Both resident and non-resident members of the gut microflora play a key role in colonization resistance in the host. Pathogens and pathobionts exist in the gut microflora in low titer unless the ecosystem is disrupted [147] or under certain conditions such as immunodeficiency [85]. For example, antibiotics cause major disruptions in the gut community architecture [148]. Clostridium difficile, normally carried at a rate of 3-5% in adults, may become opportunistic and cause colitis associated with antibiotic or immunosuppressive drug treatment [149]. The restoration of the gut ecological balance is necessary in the treatment of opportunistic enterocolitis. Reconstitution of the community balance using whole fecal transplant strategies has been demonstrated as directly curative [150, 151].

Homeostasis in the Gut. Gut homeostasis is maintained in the healthy gut as an inflammatory tone, allowing a rapid and self-limiting response that is appropriate to stress signals. These responses require a substantial amount of metabolic energy, particularly following tissue injury. Elevated and chronic inflammatory responses have been implicated in the development of frailty in the elderly. Endogenous and exogenous signals in the gut are recognized by a repertoire of innate immune cell pattern recognition receptors (PRRs), leading to the activation of an inflammatory cascade and of an adaptive immune response. Toll-like receptors (TLRs), nucleotide-binding oligomerization domain (NOD)-like receptors (NLRs) and retinoic acid-inducible gene (RIG)-like receptors (RLRs) act in distinct cellular compartments and cell-type specific combinations. Activation of PRRs leads to activation of the transcription factors NF-κB and interferon (IFN) regulatory factor (IRF), leading to the induction of pro-inflammatory genes such as TNF-α and IL-1 [152].

The gut microbiota harbors the potential as a collective to promote chronic pro-inflammatory states in the host. Indeed, dysbioses of the gut microbiota have been strongly implicated as a factor in a number of inflammatory diseases including obesity, inflammatory bowel disease (IBD), irritable bowel syndrome (IBS), colitis, colon (CRC) and stomach cancers, type I and II diabetes and others. The host immune system has co-evolved with the gut microbiota, as the host derives substantial nutrient benefits in the form of short chain fatty acids (SCFA) based on the capacity of gut community members to metabolize complex carbohydrates and indigestible fibers that are abundant in plant-based diets. It has been speculated that this mutualism has driven and fortified the interactions between the gut microbiota and the host. The human immune system must strike a delicate balance to display tolerance to the gut commensal microbiota while maintaining vigilance to guard against infectious agents and commensal bacteria that manage to breach the gut epithelium. Immune surveillance of the gut commensal community involves the recognition of a diversity of pathogen-associated molecular patterns (PAMPs) such as lipopolysaccharide (LPS) and bacterial peptidoglycan cell walls through toll-like receptors (TLRs). These prevalent surface molecules have a strong impact on the immune system and its inflammatory status. For example, the numerically dominant Bacteroidetes express a penta-acylated LPS that is a poor agonist of TLR4, whereas Proteobacteria make a hexa-acylated LPS that is strongly endotoxic and a potent agonist of TLR4 signaling. In human subjects with IBD and in IL-10 deficient mice, Proteobacteria increase their abundance relative to Bacteroidetes and Firmicutes, suggesting that the relative balance of specific bacterial taxa impacts the expression of pro-inflammatory pathways.

The mucin, produced by epithelial goblet cells, forms a protective layer that places a distance between the gut microbiota and host epithelium. The mucin is embedded with a number of antimicrobial factors such as IgA and α- and β-defensins [153]. TLRs are sequestered in the crypts and generate commensal-induced signals that maintain tolerance. Likewise, regulatory T cells in the gut sub-epithelium produce IL-10 and TGF-β that also contribute to tolerance. Commensal bacteria display a variety of mechanisms that help the host maintain immune homeostasis and serve to limit the extent of pro-inflammatory responses. For example, non-pathogenic Salmonella block the ubiquitination of IκBα, preventing its degradation and thereby maintaining NF-κB in an inactive state. Bacteroides thetaiotaomicron, an abundant member of the gut microbiota, increases the nuclear export of the RelA subunit of NF-κB through an unknown mechanism, thereby reducing its activity. Lactobacillus casei down-regulates a number of components of the proteasome complex, thereby decreasing the degradation and turnover of IκB. Bacteroides fragilis regulates the production of the cytokine IL-10 in gut immune cells.

Recent studies suggest that immunosenescence, characterized by persistent NF-κB-mediated inflammation and loss of naive CD4+ T cells, is not based on progressive reductions in all immune functions but rather on a remodeling of the repertoire of immune cell types [154]. The adaptive immune system appears to be more strongly impacted by age [155] as a variety of T cells display an age-dependent decline that mirrors the involution of the thymus, which is essentially complete in human subjects >60 years of age [156]. Specifically, CD3+ cell numbers are reduced as are helper CD4+ and suppressor/cytotoxic CD8+ cells, accompanied by an increase of type 1 (IL-2, IFN-γ, TNF-α) and type 2 (IL-4, IL-6, IL-10) cytokines [157]. Taken together, the results support the perspective that aging is associated with chronic pro-inflammatory status, now commonly referred to as inflamm-aging [158]. The endogenous and exogenous signals that induce a chronic pro-inflammatory status in the growing elderly population are a subject of great interest. Our current understanding of homeostasis and dysbioses of the gut microbiota, and of its association with significant human diseases, suggests that gut commensal bacteria may represent a significant source of signals promoting pro-inflammatory responses of the mucosal and systemic immune system. For example, the gut microbiota of the elderly progresses toward decreased diversity and increased interpersonal variability, making this decay seem random [159, 160]. Alterations in the gut microbiota during the aging process may lead to dysbioses and provide an opportunity for increased colonization of pathobionts and susceptibility to infectious agents. The inflammatory status of the gut also increases the potential for bacteria to adhere to the colonic mucosa due to reduced mucin production [161]. Chronic inflammation of the gut serves to weaken epithelial tight junctions, leading to “leaky gut” and an increased frequency of bacterial invasion that further exacerbates the inflammatory response of the gut-associated immune system. The frequency of a variety of cancers is elevated in the elderly population. Chronic infections with enterotoxin-producing B. fragilis (ETBF) lead to persistent enterocolitis and tumor formation in multiple intestinal neoplasia (Min) mice [162, 163]. ETBF induces Stat3 and a selective Th17 response distributed between CD4+ TCRγδ+ and CD4-, CD8- TCRγδ+ T cells. Blocking IL-17 and/or IL-23, a key cytokine amplifying Th17 responses, inhibits ETBF-induced colitis, colonic hyperplasia and tumor formation. Chronic inflammation is a well-documented risk factor for cancers including CRC [164]. Inflammatory mediators such as TNF-α, IL-6, IL-23 and reactive oxygen species promote the development of colon tumors [165, 166].

A study of the gut microbiota and its relationship to inflammation revealed that the inflammatory status of the gut modulates the gut community in colitis-susceptible Il10-/- mice, whose microbiota displayed reduced richness and clustered distinctly from the microbiota of wild-type mice [113]. These mice displayed increases in Verrucomicrobia, Bacteroidetes and Proteobacteria. Specifically, commensal Escherichia coli levels were increased by ~100-fold. Azoxymethane (AOM)-treated Il10-/- mice mono-associated with either E. coli or Enterococcus faecalis developed colitis, but only the E. coli-associated mice developed tumors, suggesting that elevated inflammation is not sufficient to drive tumor formation. Importantly, AOM-treated WT mice mono-associated with E. coli did not develop inflammation or colon tumors, suggesting that in the absence of inflammation, E. coli is not tumorigenic. These results support a model in which inflammation is necessary but not sufficient for tumor formation in the gut and genotoxic gut bacteria provide additional signals required for tumorigenesis. The extent to which this model resembles features of inflamm-aging remains untested, but it may suggest that the elevated inflammatory status of the elderly leads to specific alterations in the gut microbiota that serve to further exacerbate the immune status, thereby increasing the susceptibility of aged individuals to infectious agents, a variety of chronic diseases and cancer.

Gut Inflammation and Human Disease. Alterations in the composition of the gut microbiota have now been associated with a broad spectrum of chronic inflammatory and metabolic diseases including IBD, cancer, type I and II diabetes, obesity and cardiovascular disease. Common to many chronic inflammatory diseases are alterations in gut permeability and in a variety of host defensins, which maintain surveillance of the gut microbiota to achieve homeostasis. Interestingly, the cytokines generated by CD4+ T helper cells have a significant impact on the intestinal epithelium, affecting permeability, proliferation, repair and the expression of critical tight junctions and antimicrobial defensins. Recent studies have shed light on several species in the gut microbiota that play a role in the regulation of CD4+ T helper cells [167-170]. Innate lymphoid cells (ILCs), which bear many similarities to CD4+ T cells, are a cell population of growing interest. One distinguishing feature of ILCs is that their differentiation does not require somatic recombination. Tbet+ ILCs produce IFN-γ and TNF-α and play a role in tumor suppression and immunity to intracellular pathogens. GATA3+ ILCs require both GATA3 and RORα for differentiation and are involved in airway hypersensitivity and immunity toward helminth parasites. Finally, RORγt+ ILCs respond to extracellular pathogens, inducing the production of inflammatory molecules, and play a role in tissue repair processes. While Tbet+ and GATA3+ ILCs differentiate in the absence of commensal microbiota, some RORγt+ ILC subsets do not [171]. While development of ILCs may be largely independent of commensal microbiota, it is increasingly clear that the commensal microbiota has profound effects on their function. While the presence of the commensal microbiota promotes protection against viral infection involving NK cells [172], the microbiota appears to inhibit IL-22 production by RORγt+ ILCs, although this was reversible by damage to the intestinal epithelium [173]. While RORγt+ ILCs express a functional TLR2, it may be more likely that these cells respond to the commensal microbiota indirectly via IL-1β and IL-23 produced by myeloid cells [174].

Host Inflammation in IBD. Inflammatory bowel disease (IBD) comprises two distinct inflammatory diseases of the gut: Crohn’s disease (CD) and ulcerative colitis (UC). Studies of IBD have advanced rapidly and have become a model for a number of inflammatory diseases of the gut that now recognize the gut microbiota as a key player in the disease process, although the specific etiology of the disease remains unclear. Among the significant changes associated with IBD is the reduction in barrier function of the gut epithelium. The increased expression of claudin 2 results in pores and, coupled with the down-regulation and spatial redistribution of claudin 5 and 8, results in discontinuities in tight junctions that are thought to increase the likelihood of bacterial breach and induction of strong inflammatory responses [175, 176]. Similarly, defective mucin organization and expression contribute to a reduction in barrier function, leading to ER stress and inflammation [177, 178]. The dysregulation of host defensins, including α- and β-defensins, associated with CD also serves to modulate the gut microbiota and may contribute to host-microbiome inflammatory interactions [179]. A highly informative study using mice deficient in T-bet, a transcription factor required for Th1 cell development, showed that the gut microbiota of these mice became colitogenic and, furthermore, could confer colitis when transferred to wild-type mice [180].

The Gut Microbiota in IBD. The hallmarks of the gut microbiota in subjects with IBD include a reduction in overall complexity of the microbial populations coupled with an overall loss of many of the high-abundance commensal species present in the healthy gut microbiota. These lost commensal populations appear to be replaced by microbes that may be referred to as pathobionts and that display an elevated inflammatory potential. These pathobionts are suspected of an increased propensity to breach the epithelium of the gut and exacerbate the inflammatory response mounted by the host tissues, leading to a chronic inflammatory status. Specific organisms and phyla with increased mucolytic and/or adherence properties in IBD subjects have been identified and include Ruminococcus gnavus and Ruminococcus torques [181-183] as well as γ-proteobacteria, actinobacteria and bifidobacteria [18, 184, 185]. A decrease in the anti-inflammatory species Faecalibacterium prausnitzii has been reported in IBD patients [184, 186]. Subjects with IBD displayed a significant increase in mucus-penetrant bacteria compared to healthy controls [184]. Recent studies have implicated the increased abundance of adherent-invasive Escherichia coli (AIEC) as a potentially significant causal agent in the potentiation of CD. This strain of E. coli is highly adherent to intestinal epithelial cells and displays an invasive phenotype [187, 188]. It is interesting to note that defects in the host autophagy-related genes NOD2, ATG16L1 and IRGM result in an increased prevalence of AIEC [189, 190]. Another host factor that promotes AIEC colonization is the abnormal expression of carcinoembryonic antigen-related cell adhesion molecule 6 (CEACAM6). This protein serves as a receptor for AIEC in patients with ileal CD [191]. Similar associations with other bacterial species have been noted in subjects with CD, including Listeria monocytogenes, Campylobacter spp., Salmonella spp., Yersinia enterocolitica and Yersinia pseudotuberculosis; however, many of these observations were not confirmed by additional studies that attempted to identify these species.

Obesity, Inflammation and the Gut Microbiome. Studies have demonstrated that the gut microbiome may have a role in the induction of systemic, adipose, and hepatic tissue inflammation that contributes to the development of obesity-associated insulin resistance, type 2 diabetes, liver cancer, and related metabolic diseases [192-194]. For example, dietary or genetic obesity induces changes in gut microbial composition, which lead to increases in enterohepatic circulation of the gut microbial metabolite deoxycholic acid (DCA). DCA, which causes DNA damage, also provokes the secretion of inflammatory cytokines and tumor-promoting factors in the liver, thus promoting obesity-associated hepatocellular carcinoma [194]. Bacterial lipopolysaccharide (LPS), a structural component of gram-negative cell walls, has been shown to act via toll-like receptor 4 (TLR4) as a trigger that initiates inflammation, insulin resistance, obesity and diabetes. Moreover, a high-fat diet increases plasma LPS to levels that increase body weight, fasting glycemia, and inflammation. LPS treatment of mice also increases inflammation and the expression of genes encoding proinflammatory cytokines such as IL-6, TNF-α, IL-1, and PAI-1 [195]. TNF-α is an important inflammatory marker that impairs glucose tolerance and insulin sensitivity in animal models [196]. This cytokine contributes to the induction of insulin resistance by increasing the rate of lipolysis in adipose tissue [197]. Obesity also increases monocyte chemoattractant protein-1 (MCP-1) secretion and causes the infiltration and accumulation of macrophages in adipose tissue, which leads to insulin resistance [198].

The resulting insulin resistance contributes to hyperinsulinemia and excessive lipid storage in hepatic and adipose tissue. Type 2 diabetes is associated with obesity-related insulin resistance and alterations in the composition of the gut microbiota. Sequencing of the 16S rRNA gene has revealed that members of the Clostridia class within the Firmicutes were reduced while the Betaproteobacteria class and Bacteroidetes phylum were enriched in type 2 diabetes patients compared to healthy adults [199]. A large metagenomic study also found gut dysbiosis in patients with type 2 diabetes that was characterized by a reduction in butyrate-producing bacteria and enrichment of both opportunistic pathogens, such as Clostridium spp., and important gut residents such as Akkermansia muciniphila, Bacteroides spp. and Desulfovibrio spp. [17]. While it remains unclear whether these alterations in microbial composition represent a cause or an effect, the identification of these markers may help diagnose or classify patients with obesity and related diseases.

Strategies to Manipulate the Gut Microbiota in Obesity. One of the major advances in our understanding of the gut microbiota is the recognition that it is a metabolically adaptable “organ”. Among the numerous factors that dictate the microbial community composition of the gut, diet is considered among the most influential. Other therapeutic strategies to modulate the gut microbiota in the context of disease include the administration of prebiotics, supplementation with probiotics, employing antimicrobial compounds such as bacteriocins to eradicate pathogens, and reconstitution of healthy gut population structure by fecal microbiota transplantation (FMT). Therefore, the opportunities related to enhancing human health through modulation of the gut microbiota through dietary interventions and other strategies remain highly promising.

Prebiotics are nutritional compounds that promote the growth of beneficial microbes and demonstrate potential to improve gut health. Elevated Bifidobacteria levels are beneficial at all ages and have been associated with benefits in a number of human diseases, including boosting the declining immune systems of the elderly. Inulin-type fructans are naturally occurring polysaccharides that are indigestible by humans and serve as prebiotics to enhance the growth of populations such as Bifidobacteria [200]. The widespread capacity of the Bifidobacteria to metabolize polyphenolic compounds (esters, glycosides or polymers) into bio-active metabolites and their impact on human health have been studied extensively; however, many gaps in our mechanistic understanding of these relationships remain.

Supplementation with oral probiotic cultures has led to promising results in the treatment of disorders such as obesity, type 2 diabetes and UC [201, 202]. Probiotics alter gene expression and the function of the microbiome, thus altering energy homeostasis, for example by expanding carbohydrate utilization potential, as well as other physiological functions [203, 204]. The anti-obesity potential of probiotics is of great interest. For example, Lactobacillus rhamnosus PL60, which produces conjugated linoleic acid (CLA), has been shown to reduce weight gain and white adipose tissue mass in mice fed a high-fat diet [205]. Lactobacillus plantarum PL62, another CLA-producing strain, has also been reported to reduce weight gain and glucose levels in diet-induced obese mice [206]. The administration of synbiotics, which are nutritional supplements that synergistically combine both prebiotics and probiotics, resulted in an increase in IL-8 secretion and subsequent increases in neutrophil populations [207].

The development of the gut microbiota is a dynamic process that produces a relatively stable community throughout adulthood until old age, when this relative stability becomes reduced. The abundance of Bifidobacteria in the gut is reduced in elderly populations [208]. Galactooligosaccharide (GOS) treatment of elderly subjects resulted in an increase in Bifidobacteria levels in the gut [209]. The physiology of aging is complex and increases the challenges associated with evaluating factors impacting healthy aging. For example, the elderly undergo changes in taste and smell, appetite, gastrointestinal motility, gastric atrophy and tooth loss that play a direct role in nutrition. The gut displays a reduction in its absorption of calcium, iron and vitamin B-12. The elderly show reductions in a large number of important minerals and vitamins. Furthermore, the health status of the elderly has been associated with the quantity of fiber in their diet through consumption of fruits and vegetables. These age-related phenomena highlight the difficulties in establishing direct links between gut dysbiosis and health decline in the elderly.

Decreases in Faecalibacterium prausnitzii, which is known for its anti-inflammatory properties, and in members of Clostridium cluster XIVa have been noted in the elderly [207, 208]. Other alterations associated with the aged gut microbiota include increases in the streptococci, staphylococci, enterococci and enterobacteria groups, which are known to contain a number of pathogens and pathobionts. A reduction in the abundance of health-promoting Bifidobacteria and in species diversity has also been noted in elderly populations. It is known that gut bacteria from the Faecalibacterium, Bifidobacterium and Lactobacillus genera down-regulate pro-inflammatory responses. Indeed, the level of Bifidobacteria is inversely correlated with serum TNF-α and IL-10 levels. Probiotics containing Bifidobacterium have been shown to positively modulate the microbiota of the elderly. Prebiotics have produced similar effects in the elderly [200, 210, 211]. To date, a limited number of studies have evaluated the relationship between the microbiota and processes associated with inflamm-aging. It has been demonstrated that probiotics containing Bifidobacterium or treatment with inulin reduce the frequency of translocating Enterobacteriaceae in rats with DSS-induced colitis, and similarly treated mice showed a decrease in mortality following infection with either Listeria monocytogenes or Salmonella typhimurium [212, 213]. Oligofructose prebiotic treatment of human subjects decreased colonization by C. difficile and was accompanied by reductions in relapse and hospital stay duration [214].

The abundance of Akkermansia muciniphila, a dominant gut mucus layer resident that degrades mucin, is inversely correlated with body weight in mice and humans [215, 216] and is decreased in obese and type 2 diabetic mice [217]. Moreover, administration of oligofructose (prebiotics) to genetically obese or high-fat diet-induced obese mice increased A. muciniphila levels, commensurate with improved glucose and lipid metabolism, leptin sensitivity and barrier function, and with reduced LPS-mediated endotoxemia [217, 218]. Probiotic treatment with viable but not heat-killed A. muciniphila reversed high-fat diet-induced metabolic endotoxemia and related disorders, which included fat mass gain, inflammation, and insulin resistance [56]. This probiotic treatment also increased endocannabinoid levels in the intestine, which impact inflammatory pathways and gut barrier function, and reversed diet-induced gut mucosal barrier dysfunction by restoring the thickness of the inner mucus layer and decreasing gut permeability. Additional studies must be undertaken to develop prevention or treatment protocols using this probiotic for obesity and related disorders.

FMT, also referred to as fecal bacteriotherapy, is not a new therapeutic modality; however, interest in this technique has recently been rekindled [219]. FMT reintroduces a balanced intestinal community derived from healthy donors into diseased individuals and has been especially effective in treating recurrent C. difficile infection, with up to a 98% remission rate [220]. An obese phenotype has been demonstrated to be transmissible from genetically obese mice to germ-free mice [75, 221]. In humans, one double-blinded controlled study reported that patients who received fecal transplants from lean donors developed significantly lower fasting triglyceride levels and improved insulin sensitivity compared to controls [222]. While FMT is highly effective at treating recurrent C. difficile infection, additional controlled trials in the context of both GI and non-GI disorders are needed before it can be widely accepted and applied clinically.

The Gut-Brain Axis. The gut-brain axis, or brain-gut axis, refers to interactions between the enteric nervous system (ENS) within the GI tract and the central nervous system (CNS) through a bi-directional communication system comprising neural and humoral pathways [223]. The enteric nervous system, often referred to as the “second brain”, contains as many nerves as the spinal cord, regulates basic gut functions such as gut motility, intestinal permeability, secretion and mucosal immune activity, and shares similar neurotransmitters and signaling molecules with the brain. The brain communicates with the GI tract through multiple parallel pathways, which include both sympathetic and parasympathetic divisions of the autonomic nervous system (ANS), the hypothalamic-pituitary-adrenal (HPA) axis, and the sympathoadrenal axis (which modulates the GALT) [224]. Peripheral metabolic signals, such as nutrients and hormones, can even access the arcuate nucleus (ARC) of the hypothalamus without crossing the blood-brain barrier due to anatomical features of the ARC [225]. Another interesting area of the hypothalamus, the ventromedial nucleus (VMN), contains neurons with receptors for glucose and leptin, receives neuronal projections from the ARC, projects axons to the ARC and other hypothalamic nuclei, and produces the anorexigenic neuropeptide brain-derived neurotrophic factor (BDNF) [226, 227]. Leptin is produced by adipocytes, as well as in gut epithelium, and is released into systemic circulation in concentrations that are proportional to body fat mass. Leptin receptors are highly expressed in the ARC, and activation of leptin signaling results in reduced food intake and increased energy expenditure. Insulin, another peripheral adiposity marker, also binds receptors in the ARC, thus relaying an anorexigenic signal to the brain [228].

The brainstem is another area that is involved in feeding behavior and energy balance.

The sensory vagus nerve transmits satiety signals to the solitary tract nucleus, which receives both humoral and neural signals, in the medulla oblongata [229]. Thus, peripheral metabolic signals and nutrients signal the brain through the hypothalamus, brainstem, and some areas of the midbrain such as the ventral tegmental area [230].

The gut has been recognized as the largest endocrine organ in the body, producing myriad hormones and bioactive peptides [231]. Gut-to-brain signaling is accomplished by highly chemosensitive primary afferent neurons, immune cells and enteroendocrine cells, which contain over 30 different hormones. For example, different types of afferent neurons express receptors for hunger-generating (ghrelin) and satiety-generating (including peptide YY (PYY), cholecystokinin (CCK), glucagon-like peptide 1 (GLP-1), and oxyntomodulin) peptides that are released from enteroendocrine cells.

Animal studies suggest that the expression of these receptors may become altered by nutritional state and diet [232]. The expression of receptors for hunger-generating peptides was found to be up-regulated while the expression of receptors for satiety-generating peptides was down-regulated during periods of fasting, which demonstrates vagal afferent plasticity in response to homeostatic state. Dysregulation in gut-brain signaling has been implicated in chronic diseases such as inflammatory bowel disease (IBD) and obesity [233]. For example, satiety-inducing gut-brain signaling mechanisms, such as the vagal afferent pathway, have been observed to be down-regulated in high-fat diet-induced obesity models [234]. Brain imaging of obese patients has revealed an increased sensitivity in neuronal pathways that predict reward in response to high-calorie food cues, accompanied by a decreased sensitivity in dopaminergic pathways to the actual effects/rewards of food intake. This mismatch between expected and actual reward may describe a mechanism involved in food addiction and eating disorders [235]. Another study found that the presence of gut microbiota influenced body fat-regulating genes in the hypothalamus and brainstem, which are two neuronal areas that control body fat regulation [236]. The gene expression of the anti-obesity neuropeptides BDNF and the GLP-1 precursor glucagon was reduced in conventionally raised compared to germ-free mice. The presence of the body fat-inducing microbiota was also associated with signs of decreased hypothalamic sensitivity to leptin, which is produced by adipocytes in proportion to fat mass to suppress appetite, and may contribute to the relative obesity observed in conventionally raised compared to germ-free mice. Thus, the microbiota contributes to obesity by affecting gene expression and signaling in the CNS. Additionally, the GI endocannabinoid system may be involved in the regulation of food intake. For example, cannabinoid type 1 receptor (CB1) antagonist administration decreased food intake and weight gain in obese mice, which suggests that endocannabinoids may exert an orexigenic effect [237]. In addition to hormones, nutrients can relay signals to the hypothalamus. For example, glucose and leucine relay satiety signals to the ARC, and free fatty acids may induce an anorexigenic effect through ATP-sensitive potassium channels [238-240]. Recently, guanylyl cyclase C (GUCY2C) receptors, previously found in the intestinal epithelium, were identified in the mouse hypothalamus [241]. GUCY2C signaling mediates satiety and is involved in gut homeostatic mechanisms, including anti-tumorigenesis, electrolyte balance, and gut barrier function. The regulation of both energy balance and tumor suppression by this facet of the gut-brain endocrine axis may contribute to the established link between obesity and colorectal cancer. Thus, the brain, especially the hypothalamus, plays a key role in appetite regulation, energy homeostasis, and obesity and its associated pathologies.

Obesity, Energy Harvest and the Gut Microbiome. Obesity represents an important public health concern in the US and throughout the world. Obesity predisposes an individual to many disorders including type II diabetes, hypertension, heart disease, stroke, non-alcoholic fatty liver disease, arthritis, depression and some types of cancer.

Both genetic and environmental factors such as high-fat diet, use of high-fructose corn syrup in processed food, and inactivity have been implicated as factors in obesity [11].

This disease may indeed have a genetic component; however, a growing body of data

52 suggests that it may be caused by the type and/or ratio of bacterial populations residing in the gut. The gut microbiota helps to extract calories from ingested food that is then stored as adipose tissue or used to facilitate microbial growth. Individuals whose microbiota is overly efficient at energy extraction may be predisposed to obesity, and data have shown that the gut microbiome plays an important role in energy harvest from the host diet and fat storage [71, 221]. One study clearly demonstrated this connection when germ-free mice colonized with normal gut flora increased their body fat composition by 60% within 14 days despite a decrease in food consumption. However, when the microbiota from lean donor mice was transplanted, no significant alteration in phenotype was observed [75, 242]. This study demonstrated that colonization increases both the host capacity to harvest dietary energy as well as store the harvested energy in fat cells. The microbiota increased monosaccharide uptake in the gut, transfer of monosaccharides to the liver, and transactivation of lipogenic enzymes [75]. Quantitative

RT-PCR analysis demonstrated that fasting-induced adipocyte factor (Fiaf) expression is down-regulated by the gut microbiota and leads to increased hepatic lipogenesis, demonstrating that the microbiota may regulate host genes that promote deposition of fat into adipocytes. Studies have also shown that not only the presence but also the relative proportions of microbial divisions in the gut correlate with obesity. The Gordon group demonstrated that obese mice had a 50% reduction in Bacteroidetes and correspondingly more Firmicutes present in the gut relative to lean mice. These changes were shown to be division wide and not due to differences in chow consumption or total body mass [243]. In another study, this group transplanted the gut microbiota from both obese and lean mice into lean, germ-free mice. The mice that received the obese microbiota had significantly greater dietary caloric extraction and fat gain compared to those that received the lean microbiota. These results demonstrate that the increased energy extraction traits in the obese microbiota are transmissible and

suggest that manipulation of the gut microbiota could be used for the treatment of obesity [221]. Another study analyzed the microbiome of obese humans who were fed a weight loss diet [14]. This experiment demonstrated that the ratios of specific bacterial phyla, Firmicutes and Bacteroidetes, were altered over time in the human gut and that an increase in energy harvest correlated with these changes, as observed in previous studies in the mouse gut. It was concluded from these studies that the composition of the gut microbiota influences body weight and that the obese gut in mice and humans is more efficient than the lean gut at harvesting and converting dietary nutrition into energy [14, 221]. Thus, the gut microbiome could contribute to the development of obesity through a variety of mechanisms, including polysaccharide fermentation, increased absorption of monosaccharides and SCFA, and increased hepatic lipogenesis.
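The division-level comparisons discussed above are commonly summarized as relative abundances and a Firmicutes-to-Bacteroidetes ratio. The following is a minimal Python sketch of that calculation, assuming per-phylum read counts as input. The function names and the counts are hypothetical and chosen only for illustration; they are not data from the cited studies.

```python
def relative_abundance(counts):
    """Convert raw per-phylum counts to fractions of the sample total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no counts")
    return {phylum: n / total for phylum, n in counts.items()}


def firmicutes_bacteroidetes_ratio(counts):
    """Ratio of Firmicutes to Bacteroidetes counts in one sample."""
    bacteroidetes = counts.get("Bacteroidetes", 0)
    if bacteroidetes == 0:
        raise ValueError("Bacteroidetes count is zero; ratio undefined")
    return counts.get("Firmicutes", 0) / bacteroidetes


# Hypothetical counts, invented to illustrate the arithmetic only.
lean = {"Firmicutes": 600, "Bacteroidetes": 400}
obese = {"Firmicutes": 800, "Bacteroidetes": 200}

lean_ratio = firmicutes_bacteroidetes_ratio(lean)
obese_ratio = firmicutes_bacteroidetes_ratio(obese)
obese_fractions = relative_abundance(obese)
```

In this invented example, halving the Bacteroidetes count while Firmicutes rises correspondingly moves the ratio from 1.5 to 4.0, which shows why the ratio is sensitive to shifts in either phylum.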

Antibiotic-Induced Changes in the Gut Microbiome and Immune Cell Homeostasis.

The commensal microbiome engages in constant cross talk with the human immune system and pattern recognition receptors (PRRs). Interestingly, the best-studied interactions between the gut microbiota and the immune system not only have an impact on local immunity but also contribute to distal or systemic immunity to microbial infection. While additional studies must be undertaken to examine this interplay, it has been established that the microbiota is involved in immune development, tolerance, and priming of the immune system, which enables the host to mount a robust response to microbial infection.

The gut microbiota provides substantial energy to the host cell epithelium through the production of butyrate, which is also anti-inflammatory, reducing neutrophil-derived reactive oxygen species [244]. Butyrate also reduces the lipopolysaccharide (LPS)-mediated production of TNF-α in peripheral blood mononuclear cells [245]. The use of germ-free mice has been a key tool to elucidate the role of the microbiome in immunity and dysbioses related to human disease. While powerful, germ-free animals do not have well-developed immune systems and therefore complicate the interpretation of studies involving the interplay between the microbiota and immunity. An alternative approach used in multiple studies involves the use of broad-spectrum antibiotics that impact overall microbial titers or specific classes of bacteria resident in the gut bacterial communities.

Antibiotic treatments have substantial direct and indirect effects on the gut microbiota.

Antibiotics target specific classes of bacteria, such as gram-positive organisms; however, due to the extensive mutualistic relationships amongst member species, the ablation of sets of species has an impact beyond the direct antibiotic effect and may result in additional losses within the community due to reduced fitness (indirect effect). This concept is well illustrated by the treatment of mice with vancomycin, a gram-positive-specific antibiotic, which resulted in a sharp reduction of not only gram-positive species but also several gram-negative species [246]. Exposure to antibiotics is known to impact both innate and adaptive immunity of the host. The treatment of mice with antibiotics necessarily redistributes not only the phyla present in the gut but also their accompanying immune modulatory ligands. Antibiotics that target gram-negative bacteria are likely to impact signaling through TLR4 and NOD1, whereas those targeting gram-positive species are likely to impact signaling through TLR2 and NOD2 pathways [247, 248].

Clostridium spp., a major group of bacteria in the commensal gut community, have been shown to increase TGF-β expression and Treg titers [168]. Commensal production of ATP is known to stimulate differentiation of Th17 cells [249]. Similarly, segmented filamentous bacteria (SFB) play a role in the induction of Th17 cells [169].

Polysaccharide A (PSA), derived from the gut commensal B. fragilis, resulted in increased systemic CD4+ T-cell titers in germ-free mice and in a separate study was shown to induce FoxP3+ regulatory T cells to produce IL-10 via TLR2, thus preventing inflammation in a mouse model of colitis [47, 170, 250]. Accordingly, TLR9-/- mice display increases in levels of Treg cells and reductions in IL-17- and IFN-γ-producing CD4+ T cells [170].

Antibiotic treatment impacts the immune system, resulting in the reduction of antimicrobial peptides (AMPs), including defensins, C-type lectins and cathelicidins secreted by Paneth and goblet cells and enterocytes [251]. Treatment of mice with amoxicillin resulted in the loss of Lactobacillus spp. and a reduction in β-defensins, matrilysin and phospholipase A2 production [252]. These alterations in community structure resulted in the downregulation of MHC class II and class Ib mRNA expression and an increased expression of mast cell proteases. Treatment of mice with gram-positive-specific antibiotics (vancomycin or ampicillin), but not gram-negative-specific treatments (metronidazole and neomycin), resulted in depletion of Th17 cells in the intestine [253]. These studies led to the eventual identification of SFB as the signal whose loss caused the Th17 cell collapse [169]. The continued presence of the gut microbiota appears to be important for maintaining effector T cell populations in the gut. Treating mice for 10 days with a five-antibiotic cocktail resulted in reduced abundance of CD4+ T cells expressing IFN-γ or IL-17A with associated changes in the gut community architecture [148]. Recently, it was demonstrated that NK cells residing in non-mucosal lymphoid organs could not be primed to mount anti-viral immunity in germ-free mice as macrophages and DCs failed to produce type I interferons (IFN-I) [254]. While PRR signaling as well as NF-κB and

IRF3 nuclear translocation appeared normal, the chromatin structure of germ-free mice failed to induce expression of cytokine-associated promoter elements. As alterations in the gut microbiota alter the abundance of the molecular patterns, decreased CpG methylation resulted in reduced TLR9 signaling and a suppressed response to oral vaccination to antigen or infection by E. cuniculi [255]. The treatment of mice with amoxicillin and clavulanic acid resulted in reduced serum IgG levels without alteration in IgA levels [256]. The peptidoglycan of gram-negative bacteria, signaling through NOD1, is required for the generation of adaptive lymphoid tissues [257]. The loss of immunoglobulins in the host results in a 100-fold expansion of the anaerobic gut microbiota [258]. Thus, antibiotic treatment directly and indirectly impacts the microbiota, which interacts with PRRs and the immune system.

STAT Mouse. The mouse microbiome has similar levels of both Firmicutes (60-80%) and Bacteroidetes (20-40%) when compared to the human microbiome, and the abundances of the taxa are 51% and 48% of the total, respectively [242]. Therefore, it is a relevant model of obesity despite having the disadvantages of an animal model. For the purposes of this dissertation, we will focus on the STAT mouse model system for obesity. Given the extensive work that has been performed using this model and the close association at the division level seen thus far between its microbiome and that of humans, we believe that the goals of our experimentation were served using this well-characterized model system. A long-standing observation in the agricultural industry is that sub-therapeutic antibiotic administration results in elevated body mass and adiposity of livestock. In the United States, antibiotics are most commonly used on farms, and these low doses administered in feed or water increase weight gain in animals by up to 15%, yet until recently the mechanisms for this effect remained unclear. Dr. Martin Blaser at the New York University (NYU) School of Medicine (SOM) has developed a mouse model of obesity, which is referred to as STAT (sub-therapeutic antibiotic treatment) [67]. At weaning, C57BL/6J mice were exposed to penicillin, vancomycin, penicillin plus vancomycin, chlortetracycline, or no antibiotic at standard agriculturally implemented levels for 7 weeks. No significant differences in overall growth or gut microbial count were observed between STAT and control mice. However, total fat mass and percent body fat were increased in the STAT groups compared to controls, indicating that the 7-week exposures changed body composition but not overall weight.

While results vary from experiment to experiment, the fat mass gain observed in male mice is greater than that observed in female mice. Dual energy x-ray absorptiometry (DEXA) scans revealed that penicillin-treated mice had the most marked change in phenotype, with 32% body fat compared to 23% in untreated control mice. Levels of GIP, an incretin that stimulates lipoprotein lipase activity and targets adipocytes, were measured to define metabolic correlates of the changes in body composition. Serum GIP levels were significantly higher in STAT mice compared to controls, which implies a possible mechanism for the observed increase in adiposity but could also be a secondary effect.

The titer of microbes in the gut of STAT mice is unaltered; however, specific changes in the composition of the intestinal microbiota are associated with the observed increase in adiposity. The ratio of Firmicutes to Bacteroidetes was significantly higher in STAT compared to control mice. In particular, a Lachnospiraceae family bloom was observed within the residents from the Firmicutes phylum within the STAT gut. It was also found that STAT exposure alters gut microbiome SCFA metabolism as evidenced by differences in gene copy number of formyltetrahydrofolate synthetase (FTHFS), which is involved in acetate synthesis, in STAT compared to control mice, which may reflect changes in microbial residents. Gas chromatography analysis demonstrated significant increases in acetate, propionate, and butyrate in STAT mice, and the ratio of butyrate to acetate was also significantly altered. Thus, STAT exposure may not only alter the composition of the gut microbiome but also SCFA metabolic capabilities. Increased

SCFA and butyrate/acetate levels may represent mechanisms for the STAT-induced adiposity phenotype. Metabolic cage experiments showed that there was no significant difference in caloric intake between STAT and control mice; however, a lower caloric output was observed in STAT stool samples, indicating that STAT exposure may select for species that are more adept at energy harvest. Microarray analysis of liver samples identified 466 genes that were differentially expressed between STAT and control groups. Several pathways related to lipogenesis and triglyceride synthesis were upregulated in STAT mice and indicate regulatory changes in hepatic metabolism of fatty acids and lipids as a result of STAT treatment. No significant differences in adipocyte counts were observed between control and STAT visceral adipose tissue dissections.

While the STAT model of obesity does not perfectly mirror the weight gain observed in agriculture, the data provide evidence that antibiotics administered to young animals alter the microbiome and adiposity. The researchers concluded that STAT exposure selected for a gut microbiome with an increased capacity for energy harvest from complex carbohydrates, thus enabling increased lipogenesis as a result of increased SCFA production.

Obesity and High-Fat Diet. A major metabolic consequence of a high-fat diet is that the insulin and regulatory mechanisms controlling body weight are deregulated through a lipotoxic effect [259]. Moreover, high-fat diets create changes in the composition of the gut microbiota, such as decreasing the numbers of Bifidobacteria [260]. It has been shown that members of this genus reduce LPS levels in the intestine and improve mucosal barrier function [68, 261, 262]. Obesity and type 2 diabetes are characterized by altered gut community profile, inflammation as well as gut barrier disruption [68, 263].

One study demonstrated that high-fat feeding induces changes in the gut microbiota, but the development of inflammation is more strongly associated with hyperphagia and an obesity phenotype [264]. The data suggested that the activation of TLR4, which has been shown to alter tight junctions and increase intestinal permeability, gut inflammation, and LPS contribute to the development of obesity in response to a high-fat diet.

Interestingly, they found that a high-fat diet induces changes in the gut microbiota, but such changes were not always associated with an obese phenotype [264]. Thus, the authors concluded that the concept of an “obese” microbiota may not always be applicable. They concluded that changes in the gut microbiota could lead to an increase in TLR4 activation at the epithelium, which could lead to altered tight junction permeability and an increase in gut inflammation. The increase in permeability could then increase the transfer of LPS from the lumen to the lamina propria, thereby increasing plasma levels of LPS (endotoxemia). These mechanisms are plausible, although their relative roles in promoting gut inflammation and metabolic disease remain undefined. Another study using female RELMβ null mice, which are resistant to the obesogenic effects of high-fat feeding, found similarities in the gut microbiota of both null and control mice. These data also suggested that diet rather than phenotype is more strongly correlated with the composition of the gut microbiome [265].

Germ-free mice fed a high-fat diet do not exhibit an obese phenotype or display metabolic effects, such as insulin resistance, associated with consumption of high-fat diets in conventionally raised mice [266]. In addition, germ-free mice have reduced adiposity despite increased food intake and are resistant to diet-induced obesity compared to their conventional counterparts [9, 75]. Germ-free mice fed a high-fat, high-carbohydrate diet, modeling an average Western diet, gained less weight than conventionalized mice and were resistant to diet-induced glucose intolerance and insulin resistance [148, 266]. Conventionalization of germ-free mice with the microbiota from lean or obese mice, both genetically and diet-induced, results in transfer of the donor phenotype. These findings suggest associations between the gut microbiota, nutrition, and energy homeostasis.

Several studies have found an altered composition and functional potential in the obese gut microbiome. Deep metagenomic sequencing of stool samples of a large human cohort by the European MetaHIT consortium revealed that the gut microbiota can be grouped into three general enterotypes that are composed of mostly Bacteroides, Prevotella, and Ruminococcus [21]. Diet has been associated with enterotypes, with Bacteroides being associated with a high-fat diet and Prevotella associated with a high-carbohydrate diet [267]. A reduction in the abundance of Bacteroidetes accompanied by a proportional increase in Firmicutes was observed in genetically obese (ob/ob) leptin-deficient mice [243]. Similar compositional changes were observed in wild-type mice fed a high-fat, high-carbohydrate Western diet [268]. However, conflicting data have been reported in humans and animal models regarding the ratio of these two major bacterial phyla in the obese microbiome. Some studies have reported increases in the Firmicutes/Bacteroidetes ratio [14, 269], while other studies have not observed a change in ratio [15, 270]. Such inconsistencies highlight the need to consider variables such as diet, age, demographics, and methodology when comparing studies. An altered functional potential has more consistently been associated with the obese microbiome.

The Turnbaugh group described a “core microbiome” of commonly shared microbial genes and deviations that were associated with obesity [271]. Other studies employing metagenomic and systems biology approaches yielded similar results and suggested that obese microbiomes undergo topological shifts and reductions in diversity [26, 272].

Coding Potential of Obese and Lean Gut Microbiomes. Clustering of metagenomic sequence data has been used to identify functional categories with statistically significant differences in their representation in healthy humans [47], obese and lean mice [221], and obese and lean twins [26]. The metagenome of the healthy human distal gut microbiota was shown to be enriched for energy production and conversion; carbohydrate transport and metabolism; amino acid transport and metabolism; coenzyme transport and metabolism; and secondary metabolites biosynthesis, transport, and catabolism. The gut microbiome provides us with the ability to degrade common plant polysaccharides and is enriched for enzymes that metabolize starch, sucrose, glucose, galactose, fructose, arabinose, mannose, and xylose. The gut microbiota is also important for generating short-chain fatty acids, which provide energy for colonocytes and may thus protect the integrity of the intestinal mucosal barrier [47]. In a comparative study, the obese microbiome was found to be enriched for glycoside hydrolase families used for degradation of dietary polysaccharides such as starch.

Proteins that import (ABC transporters) and metabolize the products of these enriched glycoside hydrolases (α- and β-galactosidases) as well as generate butyrate and acetate (pyruvate formate-lyase and formate-tetrahydrofolate ligase), the major end products of fermentation, were also enriched in the obese gut. Biochemical analysis has shown that the obese murine microbiota contains a higher concentration of butyrate and acetate, which are produced by many Firmicutes, and obese mice have less energy remaining in their feces compared to lean mice [221]. In a twin study, the Gordon group demonstrated that the majority of relevant genes enriched in obesity were derived from Actinobacteria and Firmicutes. Specifically, the group identified over 400 genes of metabolic pathways that were enriched or depleted in the obese gut; one example was a phosphotransferase system responsible for microbial transport and processing of carbohydrates [26], which is consistent with what was also found in a mouse model of diet-induced obesity [268]. Many of the obesity-associated genes were involved in carbohydrate, lipid, and amino-acid metabolism, and as the authors suggested, comprise a set of biomarkers for obesity. Similarly, in this thesis project, through KEGG and COG analysis of metagenomic sequence data, we identified biomarkers of both health and antibiotic-induced obesity.

Chapter 3: Methods

Metaproteomics in Obese STAT and Control Mice

Mouse husbandry and antibiotic dosing. Animal experiments using the STAT mouse model of obesity [67] were performed in the laboratory of Dr. Martin Blaser at the NYU SOM by Dr. Ilseung Cho and Laura Cox. The following protocol was approved by the NYU SOM Institutional Animal Care and Use Committee (IACUC) (approval number 100708). There were two groups of mice, Control Male and STAT Male, with 2 individual mice per group. C57BL/6J mice were obtained at weaning (21 days of life) from Jackson Laboratories and allowed to adjust to the NYU animal facility for 1 week.

The mice were allowed ad libitum access to food and water, were maintained on a 12-hour light/dark cycle, and were fed normal laboratory chow (PicoLab® rodent diet 5053, Lab Diet) with 13.2% kcal% fat, 24% kcal% protein, and 62% kcal% carbohydrates. The STAT mice had been exposed to penicillin since birth because the dams were exposed to penicillin during nursing. Beginning on day 28 of life, control mice were given standard water (pH 6.8) and STAT mice were given water containing 6.67 mg/L penicillin, which represented a mid-range dose of those approved for use in agriculture. Fecal pellets were collected, placed in RNAlater® RNA stabilization reagent (Qiagen), and the samples were frozen at -80°C. The stool from the control mice was pooled into one tube and the same was done for the STAT mice. The average weight of the control mice was 26.4 g and body fat was 14.8%. In the male STAT group, the average weight of the mice was 29.8 g and body fat was 23.4%.
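The group averages reported above imply an approximate absolute fat mass for each group (body weight multiplied by percent body fat). The short Python sketch below works through that arithmetic using only the stated averages. Treating the product of two group averages as the group's average fat mass is an approximation assumed here, and the function name is hypothetical.

```python
def fat_mass_g(body_weight_g, body_fat_percent):
    """Approximate fat mass in grams from body weight and percent body fat."""
    return body_weight_g * body_fat_percent / 100.0


control_fat = fat_mass_g(26.4, 14.8)  # control males
stat_fat = fat_mass_g(29.8, 23.4)     # STAT males

print(round(control_fat, 2), round(stat_fat, 2))  # 3.91 6.97
```

On these figures, roughly 3.1 g of the 3.4 g difference in average body weight between the two groups would be attributable to fat.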

Isolation of bacteria from STAT and control mouse stool and cell lysis for protein recovery. A schematic for metaproteomics experimentation is shown in Appendix A.

For male control mice, 1.5 g (~100 stool pellets) of stool was processed. For male STAT mice, 1.75 g stool was processed. First, 20 mL of ice-cold homogenization buffer PBST (100 mM sodium phosphate composed of 25 mM NaH2PO4 and 75 mM Na2HPO4 salts, pH 7.8; 50 mM NaCl; 0.05% Triton X-100) was added to each sample. The suspension was homogenized with stirring for 30 minutes at 4°C. The samples were placed in a sonicator water bath (Elma) filled with ice at low amplitude for 5 minutes. This particular sonication treatment disrupts mammalian cell membranes only. The homogenates were filtered through a 100 μm nylon sieve and the filtrate spun at 900 xg for 15 minutes at 4°C. The pellets were resuspended in 2 mL of 1X PBST and spun again at 900 xg for 15 minutes at 4°C. This step was repeated two additional times.

The remaining pellet consisted of mouse cells, cell debris and insoluble food residues. The pellet from control mice weighed 0.98 g wet and the pellet from STAT mice weighed 0.65 g wet. The 900 xg supernatants, which were enriched in bacteria, were combined. The combined supernatants were spun at 30,000 xg (Beckman JA 20.1 rotor) for 15 minutes at 4°C. The bacteria-enriched pellet from control mice weighed 0.45 g and the pellet from STAT mice weighed 0.63 g wet. The supernatants contained smaller particles including proteins, DNA, or polysaccharides released from bacterial cells and viruses. The bacterial pellets were resuspended in 1X PBST and spun at 30,000 xg for 15 minutes at 4°C. This step was repeated one additional time. These washes remove any remaining structures at bacterial cell surfaces such as polysaccharides.

Differential pressure cycling was used for cell lysis. The bacterial pellets were resuspended in 1.4 mL of TTL Buffer (25 mM Tris-HCl, 5 mM EDTA, 0.05% Triton X-100 and 50 μg/mL lysozyme), then 1 mM (final) AEBSF and 1 mM benzamidine were added with pipette mixing. The samples were subjected to 40 cycles of pressure cycling at 30,000 psi at 25°C. Next, 20 mM (final) MgCl2 and 1 μL of 2 U/μL DNase (New England Biolabs) were added, and the samples were incubated at 37°C for 15 minutes. After the first incubation, 500 mM (final) NaCl was added and the samples were incubated an additional 15 minutes. The salt may result in further solubilization of proteins trapped in cell debris and aggregates. The samples were spun at 30,000 xg for 15 minutes at 4°C.
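The "(final)" concentrations in this step depend on the stock solutions used, which the protocol does not state. As an illustration only, the Python sketch below computes the volume of stock to add so that the mixture reaches a target final concentration, accounting for the volume of the addition itself. The stock concentrations (1 M MgCl2 and 5 M NaCl) and the function name are assumptions, not values from the protocol, and the 1 μL DNase addition is ignored.

```python
def stock_volume_ml(sample_volume_ml, final_mM, stock_mM):
    """Volume of stock (mL) to add so the mixture reaches final_mM.

    Solves C_stock * V = C_final * (V_sample + V) for V, so the
    dilution caused by the added volume is taken into account.
    """
    if stock_mM <= final_mM:
        raise ValueError("stock must be more concentrated than the target")
    return final_mM * sample_volume_ml / (stock_mM - final_mM)


# Assumed stocks (hypothetical): 1 M MgCl2 and 5 M NaCl.
mgcl2_ml = stock_volume_ml(1.4, final_mM=20, stock_mM=1000)
# NaCl is added after MgCl2, so the sample volume has already grown.
nacl_ml = stock_volume_ml(1.4 + mgcl2_ml, final_mM=500, stock_mM=5000)

# Check: concentration of MgCl2 right after its addition (mM).
mgcl2_final_mM = 1000 * mgcl2_ml / (1.4 + mgcl2_ml)
```

Under these assumed stocks, about 29 μL of MgCl2 and about 159 μL of NaCl would be added to the 1.4 mL sample. The later NaCl addition slightly dilutes the MgCl2, which this sketch does not correct for.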

The supernatant (SUP1), containing proteins from the cell lysate, was put on ice, and an aliquot was taken for SDS-PAGE analysis. The pellet may contain unlysed cells, Gram-positive cells that have cell walls that are not effectively degraded with lysozyme, cell debris, and aggregates. The pellets were resuspended in 0.5 volume of TTL Buffer, mutanolysin (Sigma Aldrich) was added to a concentration of 75 μg/mL, and samples were incubated for 2 hours at 37°C. This step may render the lysis of Gram-positive bacteria more effective by cleaving peptidoglycan. The samples were subsequently subjected to 40 cycles of pressure cycling at 35,000 psi at 55°C. The samples were spun at 30,000 xg for 15 minutes at 4°C. The supernatants (SUP2), containing proteins from this cell lysis step, were put on ice and aliquots were taken for SDS-PAGE analysis.

This experiment was repeated with new stool pellets to generate biological replicates.

The two supernatants were combined. Samples were concentrated to 100 μL in Microcon YM-10 NMWCO 10 kDa (Sigma) centrifugal filter units and visualized on sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) gels.

SDS-PAGE analysis of mouse protein samples. SDS-PAGE or 1-dimensional (1D) gel analysis was performed to assess the protein samples. Protein concentrations were determined using a Pierce™ bicinchoninic acid (BCA) assay (Thermo Scientific). For each sample, 5 μg as determined by BCA assay was loaded. An equal volume of rehydration buffer (62.5 mM Tris-HCl, pH 6.8; 2% SDS; 25% glycerol; 0.01% bromophenol blue) was added to each sample. Samples, protein standards (Bio-Rad), and bovine serum albumin (BSA) were incubated in a water bath for 3 minutes at 100°C. The SDS-PAGE gel was run in MOPS SDS running buffer (50 mM MOPS, pH 7.7; 50 mM Tris; 0.1% SDS; 1 mM EDTA) at 100 V for 2 hours. The gel was removed from the gel box and placed in fix solution (25% isopropanol, 10% HOAc) for 30 minutes on a lab rotator with gentle shaking. The gel was subsequently placed in freshly prepared staining solution (10% HOAc, Coomassie® Brilliant Blue G-250), wrapped in foil, and placed on a lab rotator overnight. Finally, the gel was placed in destain solution (10% HOAc) with a Kimwipe™ on a lab rotator for 30 minutes, then scanned.

Filter-aided sample preparation and trypsin digestion for LC/MS analysis. The protein enriched supernatants (SUP1 and SUP2 pooled) derived from STAT and control male mouse stool were equilibrated to room temperature. Next, 20 μL of 1 M DTT and 12 μL of 10% SDS were added to each sample and mixed by vortexing. The samples were heated at 95°C in a boiling water bath for 3 minutes. The heat-denatured samples were equilibrated to room temperature and spun briefly at low speed. Microcon YM-10 NMWCO 10 kDa (Sigma) filters were washed with 400 μL water and spun at 8,000 xg for 10 minutes at room temperature. The heat-denatured cell samples were transferred to the prepared Microcon YM-10 filters and 5 μL aliquots were pipetted back into the original tube for SDS-PAGE analysis. The Microcon YM-10 filter units were spun at 14,000 xg for 40 minutes. The flow-through was discarded from the collection tube.

Next, 150 μL of UA denaturation buffer (8 M urea (Sigma); 50 mM Tris-HCl, pH 8.0) was added to the filter unit. The samples were spun at 14,000 xg for 40 minutes at room temperature. The flow-through was discarded from the collection tube and another 150 μL aliquot of UA buffer was added to the filter. The samples were spun again at 14,000 xg for 40 minutes at room temperature. Next, 100 μL of IAA alkylation solution (0.05 M iodoacetamide; 50 mM Tris-HCl, pH 8.0) was added to each sample. Samples were mixed by vortexing the filter units for 1 minute and incubated without mixing for 10 minutes in the dark at room temperature. The samples were spun at 14,000 xg for 30 minutes at room temperature. The flow-through was discarded from the collection tube.

Next, 150 μL of UA buffer was added and the samples were spun at 14,000 xg for 40 minutes. This step was repeated once and subsequently 30 μL of UA buffer, 112 μL of 50 mM NH4HCO3, and 3 μg of sequencing grade trypsin (Promega) were added to each sample. The samples were mixed by vortexing and new collection tubes were placed on the filter units. The filter units were sealed with Parafilm® M barrier film and incubated at room temperature overnight (~16 hours). The samples were spun at 14,000 xg for 40 minutes and the first filtrate was collected. Next, 50 μL of 500 mM ammonium formate (pH 6.8) was added and samples were incubated for 5 minutes at room temperature. The samples were spun at 14,000 xg for 20 minutes, and the second filtrate was collected. This step was repeated and the three filtrates were pooled for each sample. Finally, 100 μL of 50 mM NH4HCO3 was added to each filter unit and the samples were incubated for 5 minutes at room temperature. The samples were spun at 14,000 xg for 20 minutes, and the last filtrate was collected and pooled with the first three. The filtrates were lyophilized overnight and stored at -80°C. The samples were then analyzed by LC/MS on an Orbitrap (Thermo Scientific) mass analyzer during a trip to the laboratory of Dr. Joshua Adkins at Pacific Northwest National Laboratory (PNNL).

Tryptic digestion and high pH-reversed phase fractionation with smart pooling.

Protein samples from two biological replicates of each experimental condition were combined into one control and one STAT sample, and then digested with trypsin (Promega). Briefly, samples were combined with 7 M urea and 5 mM dithiothreitol (DTT), then incubated at 60°C for 30 minutes with shaking. Samples were diluted 10-fold with 150 mM NH4HCO3 and CaCl2 was added to a final concentration of 1 mM. Finally, trypsin (resuspended in NH4HCO3) was added to each sample (final concentration 1 unit trypsin: 50 units protein) and the samples were incubated for 3 hours at 37°C with gentle shaking. Peptide samples were subjected to C18 solid phase extraction for clean-up prior to further analysis. The high pH reversed phase fractionation of each peptide sample was accomplished according to the method of Wang et al. [273], with the following exceptions: 96 fractions were collected over the 100-minute gradient, and every 25th fraction was combined to total 24 pooled fractions per biological condition.
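One reading of the pooling scheme, assumed here, is that fractions spaced 24 apart were combined (fraction 1 with fractions 25, 49, and 73, and so on), which yields 24 pools of four fractions each from the 96 collected. The Python sketch below illustrates that assignment; the function name is hypothetical, and the cited method [273] should be consulted for the exact scheme.

```python
def concatenate_fractions(n_fractions=96, n_pools=24):
    """Assign fractions 1..n_fractions to pools by combining every n_pools-th fraction."""
    pools = {pool: [] for pool in range(1, n_pools + 1)}
    for fraction in range(1, n_fractions + 1):
        pool = (fraction - 1) % n_pools + 1
        pools[pool].append(fraction)
    return pools


pools = concatenate_fractions()
first_pool = pools[1]   # fractions combined into pool 1
last_pool = pools[24]   # fractions combined into pool 24
```

Combining early-, middle-, and late-eluting fractions in each pool spreads peptides across the second-dimension gradient, which is the usual motivation for this kind of concatenation.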

Capillary nLC-MS/MS analysis. Peptide samples were analyzed using a nanocapillary liquid chromatography (LC) column [274, 275] coupled to a mass spectrometer (details below) via an in-house-manufactured interface. Fractionated samples were analyzed with both an LTQ Orbitrap mass spectrometer (Thermo Scientific) and an LTQ Orbitrap Velos (Thermo Fisher Scientific); unfractionated samples were analyzed with an LTQ Orbitrap Velos only. The heated capillary temperature and spray voltage were 275°C and 2.2 kV, respectively. Data acquisition began 20 minutes after the sample was injected and continued for 100 minutes over an m/z range of 400 to 2,000. For each cycle, the ten most abundant ions from MS analysis were selected for MS/MS analysis.
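The data-dependent selection described above (the ten most abundant ions per cycle within m/z 400 to 2,000) can be sketched in Python as follows. This is an illustration of the selection rule only. The function name and peak list are hypothetical, and instrument settings not described in the text, such as dynamic exclusion or charge-state filtering, are not modeled.

```python
def select_precursors(ms1_peaks, top_n=10, mz_min=400.0, mz_max=2000.0):
    """Pick the top_n most intense MS1 ions inside the scan window.

    ms1_peaks: iterable of (m/z, intensity) tuples from one survey scan.
    """
    in_window = [peak for peak in ms1_peaks if mz_min <= peak[0] <= mz_max]
    ranked = sorted(in_window, key=lambda peak: peak[1], reverse=True)
    return ranked[:top_n]


# Invented survey-scan peaks, for illustration only.
peaks = [(350.0, 9e6), (450.2, 1e5), (800.4, 5e6), (1200.7, 2e6), (2100.0, 8e6)]
selected = select_precursors(peaks, top_n=2)
```

In this example the two most intense peaks overall fall outside the m/z window, so the selection returns the peaks at m/z 800.4 and 1200.7.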

MS data analysis using SEQUEST and MS-GF+. Raw spectra were searched using SEQUEST [276] against the mouse gut metagenome sequence assembly generated in this study and MS-GF+ [277] against the Mus musculus genome sequence. The SEQUEST parameter file that was used specified the following characteristics: fully tryptic peptide ends, dynamic methionine oxidation, and static cysteine reduction and alkylation, with a 2.8 Da peptide mass tolerance and 0.5 Da fragment ion tolerance. The MS-GF+ parameter file specified partially tryptic ends and dynamic methionine oxidation. Results were filtered using the MSGF [277] SpecProb value (<= 1e-12 for metagenome-based searches, <= 1e-10 for Mus musculus searches). An estimate of the false-positive rate was determined by searching against a reversed FASTA database, as described elsewhere [278]. An estimated false-discovery rate of 0.07-0.2% (range represents different datasets) was determined. The total number of peptide observations from each protein (spectral count) was used as a measure of relative abundance. Peptide observation counts and spectral counts were calculated using Microsoft Access.

Multiple charge states of a single peptide were considered individual observations, as were the same peptides detected in different mass spectral analyses. The metaproteomics spectral count data was analyzed using the JCVI Metagenomics

Reports (Metarep) open source tool for high-performance comparative metagenomics

[82, 279], and hits with p values < 0.05 using the Fisher’s Exact Test were further analyzed. The Hierarchical Cluster (HCL) matrices were generated by calculating the

Bray-Curtis dissimilarity indexes among samples using normalized category counts

(count divided by the number of overall counts per sample) and datasets were clustered with Average Linkage, which uses the average distance of two merged clusters for subsequent clustering iterations, in Metarep. The data was mapped onto KEGG pathways using the KEGG Mapper tool [280].
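The Fisher’s Exact Test filter described above can be illustrated with a minimal sketch. It assumes a 2x2 table in which one row holds the counts for a category of interest in each of two samples and the other row holds all remaining counts; the function name and the table layout are illustrative assumptions and do not reproduce Metarep's internal implementation.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Illustrative layout (an assumption, not taken from Metarep):
      a = counts for the category in sample 1, b = all other counts in sample 1
      c = counts for the category in sample 2, d = all other counts in sample 2
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of every table that is no more likely than the observed one.
    # The small relative tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Toy example with made-up counts: keep the category only if p < 0.05.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
print(p_value < 0.05)
```

Exact integer binomial coefficients are used so that large count tables do not lose precision before the final division.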

Metagenomics in Obese STAT and Control Mice

Mouse husbandry and antibiotic dosing for metagenomics experiments. Animal experiments were performed in the laboratory of Dr. Martin Blaser at the NYU SOM by Dr. Ilseung Cho and Laura Cox. The following protocol was approved by the NYU SOM Institutional Animal Care and Use Committee (IACUC) (approval number 100708). There were four groups of mice (Control Male, STAT Male, Control Female, and STAT Female), with 10 individual mice per group. C57BL/6J mice were obtained at weaning (21 days of life) from Jackson Laboratories and allowed to adjust to the NYU animal facility for 1 week. The mice were allowed ad libitum access to food and water, were maintained on a 12-hour light/dark cycle, and were fed normal laboratory chow (PicoLab® rodent diet 5053, Lab Diet) with 13.2 kcal% fat, 24 kcal% protein, and 62 kcal% carbohydrates. STAT mice were exposed to penicillin from the time of birth because the dams received penicillin during nursing.

Beginning on day 28 of life, control mice were given standard water (pH 6.8) and STAT mice were given water containing 6.67 mg/L penicillin, which represented a mid-range dose among those approved for use in agriculture. At 17 weeks of age, mice were switched to a high fat diet (Rodent Diet D12451, Research Diets) with 45 kcal% fat, 20 kcal% protein, and 35 kcal% carbohydrates. Mice were placed in individual metabolic cages from 17 to 30 weeks of age. RNAlater RNA stabilization reagent (Qiagen) was placed in the fecal pellet collector. Pellets were collected at 30 weeks of age, and samples were frozen at -80°C. The average weight of the female controls was 22 g with 23% body fat, and that of the female STAT-treated mice was 28 g with 34% body fat. The average weight of the male controls was 36 g with 39% body fat, and that of the male STAT-treated mice was 40 g with 43% body fat.

Genomic DNA isolation from STAT and control mouse stool and re-association kinetics enrichment (RAKE) of moderate and low abundance gDNA. A schematic overview of the metagenomic experimentation is shown in Appendix B. First, the PowerSoil® DNA Isolation Kit (MoBio) was used to isolate microbial genomic DNA from STAT male, control male, STAT female, and control female mouse stool. Pools were made for each of the 4 groups using pellets from at least 8 individual mice per group. The genomic DNA was quantitated using a Nanodrop UV spectrophotometer (Thermo Scientific). Next, the samples were subjected to RAKE to enrich the samples both 100- and 1,000-fold through the sequential removal of the most abundant genomic DNAs using re-association kinetics (Cot analysis) [281]. Genomic DNAs (10 µg each) isolated from STAT and control mice were combined to create two reference pools (STAT and Control). The DNA pools were mixed with re-association buffer (50 mM HEPES, pH 7.5; 200 µM EDTA) and sheared to 3–8 Kb by nebulization. The samples were heat denatured in a volume of 10 µl at 95°C for 5 minutes, NaCl was added to 0.5 M (final), and the tubes were subsequently held at 65°C. Hydroxyapatite (HA) columns were prepared using ~20 cm³ of resin (Bio-Rad) and were maintained at 65°C using a water jacket for optimal binding and nucleic acid migration. The columns were equilibrated sequentially at 65°C with 10 mL of sodium phosphate buffer (SPB): wash 1 (480 mM SPB), followed by wash 2 (120 mM SPB), and finally wash 3 (30 mM SPB).

The DNAs were diluted 100-fold in 30 mM SPB and loaded onto hydroxyapatite columns to separate and recover single-stranded (ss) DNA and double-stranded (ds) DNA fractions. The ssDNA was eluted using 120 mM SPB, followed by elution of dsDNA with 480 mM SPB. The DNAs were then precipitated, washed with 80% ethanol, resuspended in TE, and quantitated using a Nanodrop spectrophotometer. This procedure resulted in 3 distinct genomic DNA fractions from both the STAT and Control genomic DNA pools: (1) non-enriched, (2) 100-fold enriched (6 hours re-association), and (3) 1,000-fold enriched (24 hours re-association). These determinations were made by performing qPCR and comparing the number of cycles required for the ssDNA and dsDNA fractions to cross the fluorescence threshold.
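A threshold-cycle comparison of this kind is commonly converted to a fold difference by assuming that the amount of product roughly doubles with each cycle. The sketch below shows that conversion only as an illustration; the function name, the efficiency parameter, and the example cycle values are hypothetical and are not taken from this study.

```python
def fold_difference(ct_reference, ct_sample, efficiency=1.0):
    """Fold difference in starting template implied by a threshold-cycle (Ct) shift.

    efficiency=1.0 assumes perfect doubling per cycle, which is an idealization.
    A result above 1 means the sample crossed the threshold earlier than the
    reference, i.e. it contained more starting template.
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    delta_ct = ct_reference - ct_sample
    return (1.0 + efficiency) ** delta_ct


# Hypothetical example: a shift of about 6.64 cycles corresponds to
# roughly a 100-fold difference under perfect doubling.
print(round(fold_difference(26.64, 20.0), 1))
```

Under this assumption, a 100-fold difference corresponds to a shift of about 6.6 cycles and a 1,000-fold difference to a shift of about 10 cycles.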

A total of 5 μg for each sample (STAT male, control male, STAT female, control female) was sent to the JCVI Sequencing Core for DNA sequencing using Illumina technology.

In addition, the 100-fold RAKE samples were pooled and the 1,000-fold RAKE samples were pooled, and 5 μg from each of those pools was also sent for DNA sequencing on the Illumina Hi-Seq 2000, which represented 2 additional lanes of sequencing. Deep metagenomic DNA assemblies were carried out on the combined DNA sequence to maximize coverage and contig length. The Celera Assembler tool was used as described previously [282]. The JCVI automated Metagenomics Annotation Pipeline [283] was used to generate open reading frame (ORF) predictions and functional annotations. The metagenomic data were analyzed using the JCVI Metagenomics Reports (Metarep) open source tool for high-performance comparative metagenomics [279, 284], and hits with p values < 0.05 by Fisher’s Exact Test were analyzed. The Hierarchical Cluster (HCL) matrices were generated by calculating the Bray-Curtis dissimilarity indexes among samples using normalized category counts (count divided by the number of overall counts per sample), and datasets were clustered with Average Linkage, which uses the average distance of two merged clusters for subsequent clustering iterations, in Metarep. Metagenomic data were mapped onto KEGG Pathways using the KEGG Mapper tool [280]. Individual sequence reads were also aligned to the reference genomes of the abundant gut residents Akkermansia muciniphila and Alistipes shahii using the CLC bio Genomics Workbench software (Qiagen), and differential abundance estimates using RPKM (reads per kilobase per million mapped reads) were calculated using the open source JCVI Multiple Experiment Viewer (MeV) software [285]. The Linear Expression Graph (LEG) function of the Linear Expression Map (LEM) module was used to visualize the mapped RPKM data. Gene Ontology (GO) categories with > 0.1% of total reads were further analyzed.

qPCR analysis of RAKE-enriched DNA from STAT and Control samples. The 16S, 18S, and GAPDH genes from DNA samples isolated using either PCT or kit only were amplified with fluorescence detection (SYBR® Green) on the iCycler IQ™ real-time detection system (Bio-Rad). PCR reactions consisted of 8 μl sample (3.2 ng total), 2 μl of primer mix (150 nM final), and 10 μl of iQ™ SYBR® Green Supermix (Bio-Rad) run at a 60°C annealing temperature for 60 cycles. PCR RFU/cycle graphs were collected for each gene.
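The RPKM normalization used for the reference-genome read mapping in the metagenomics methods above can be sketched as follows. This is a minimal illustration of the standard definition (reads mapped to a gene, divided by gene length in kilobases and by total mapped reads in millions); the function names and example numbers are hypothetical, and the sketch does not reproduce the CLC or MeV implementations.

```python
import math


def rpkm(reads_mapped_to_gene, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene per million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    kilobases = gene_length_bp / 1_000
    millions = total_mapped_reads / 1_000_000
    return reads_mapped_to_gene / (kilobases * millions)


def log2_ratio(rpkm_treated, rpkm_control):
    """log2 of the treated/control RPKM ratio; both values must be positive."""
    if rpkm_treated <= 0 or rpkm_control <= 0:
        raise ValueError("RPKM values must be positive to take a log ratio")
    return math.log2(rpkm_treated / rpkm_control)


# Hypothetical example: the same 2 kb gene in two samples of equal depth.
control = rpkm(500, 2_000, 40_000_000)
treated = rpkm(250, 2_000, 40_000_000)
print(control, treated, log2_ratio(treated, control))
```

Dividing by both gene length and sequencing depth lets genes of different sizes be compared across samples that yielded different numbers of reads.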


Chapter 4: Results

Overview. The results presented here summarize findings based on metagenomic DNA sequencing of the microbial communities present in male and female control and STAT mice. These data allowed a detailed analysis of the alterations in species composition in STAT mice compared to controls and an evaluation of the shifts in functional gene representation resulting from the altered microbiota. Metaproteomic analysis was also conducted to confirm the metagenomic results and to identify functions that are differentially expressed in control and STAT mice. Due to resource limitations at PNNL, we analyzed only male control and STAT mice in the metaproteomic studies. In all cases, stool samples were provided by the Blaser laboratory (NYU), which carried out all of the animal work and antibiotic treatments. Male and female control mice gained weight and fat mass over time in an equivalent manner. STAT mice, however, displayed a gender-specific response to antibiotic treatment, such that male mice had increased fat mass compared to female mice. The reason for this phenotypic difference is not currently known. We exploited this difference in the analysis of all data sets to distinguish changes in the STAT mouse microbiota that may be gender-specific but independent of the obese phenotype from differences that were present in both male and female STAT mice but were more pronounced in male mice. Applying this strategy substantially reduced the amount of differential OMICs data, allowing the analysis to focus on the observed differences most likely to be linked to the obese phenotype under investigation.

Differential lysis of bacteria from healthy stool and protein isolation using TCL buffer and differential pressure cycling. Differential lysis of mammalian and bacterial cells was compared to iodixanol density gradient centrifugation as a means of enriching for microbially encoded proteins. These methods removed human contamination to a greater degree (over 65,000-fold enrichment of microbial DNA) than iodixanol density gradient centrifugation (over 1,000-fold enrichment of microbial DNA), as determined by real-time quantitative PCR. The level of mammalian cells in stool samples is nominal, but the amount of protein secreted into the lumen of the distal colon is substantial. To improve our ability to visualize the gut microbiome’s metaproteome, we applied strategies that enrich the microbial fraction and limit contamination from mouse cells and host-secreted proteins. Interestingly, 334 unique mouse proteins were nevertheless observed in our metaproteomic analysis. Although for brevity they are not described in this thesis, they form an interesting group of proteins that may bind tightly to microbial cell surfaces and represent an understudied aspect of the host-microbiome interaction in the gut.

Re-Association Kinetics Enrichment (RAKE) of Moderate and Low Abundance Genomic DNAs. Despite the massive number of DNA sequences generated by Next Generation sequencing technologies, brute force DNA sequencing of microbiome samples provides high coverage only of genomic sequences derived from the dominant species present within the population. To evaluate the species and gene functions encoded in the gut microbial communities more comprehensively and to improve the representation of lower abundance species, we applied an optimized re-association kinetics enrichment (RAKE) protocol, which enriches metagenomic DNA samples through sequential removal of the most abundant genomic sequences using re-association kinetics (Cot analysis). This enrichment is based on the principle that the rate of DNA re-association following complete denaturation is directly proportional to the relative concentration of the sequences in the population [281]. In a DNA re-association time course, the concentration of dsDNA increases proportionately as the concentration of ssDNA decreases. By recovering the ssDNA fraction using hydroxyapatite chromatography, it is possible to enrich for the DNAs that are slower to anneal. The rate of re-association is not linear over time: dsDNAs form rapidly at first, and annealing slows progressively over time. Preliminary experiments indicated that a 100-fold normalization ([dsDNA]/[ssDNA]=100) was achieved after a 100 hour re-association reaction. The ssDNA fractions were isolated from fractions approximating 10- and 100-fold normalization and converted to dsDNA using random hexamers and the DNA polymerase Klenow fragment. It is important to note that the normalized DNAs were sequenced deeply for the purpose of improved DNA assembly, but only DNA sequences derived from unnormalized samples were used for quantitative comparison of gene representation in control and STAT mice.
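The enrichment principle can be illustrated with the idealized second-order re-association model, in which the fraction of a sequence remaining single-stranded is 1 / (1 + k·C0·t). In that model the ratio [dsDNA]/[ssDNA] equals k·C0·t. The sketch below applies the model independently to each sequence class; the rate constant, concentrations, and abundance values are hypothetical, and real mixtures deviate from this ideal behavior.

```python
def fraction_single_stranded(c0, k, t):
    """Ideal second-order re-association: C/C0 = 1 / (1 + k * C0 * t)."""
    if c0 < 0 or k < 0 or t < 0:
        raise ValueError("c0, k and t must be non-negative")
    return 1.0 / (1.0 + k * c0 * t)


def ss_composition(abundances, k, total_c0, t):
    """Relative composition of the ssDNA fraction after re-association time t.

    abundances maps a sequence class to its starting fraction of the pool.
    Each class is treated as re-associating only with itself (a simplification).
    """
    remaining = {
        name: frac * fraction_single_stranded(frac * total_c0, k, t)
        for name, frac in abundances.items()
    }
    total = sum(remaining.values())
    return {name: amount / total for name, amount in remaining.items()}


# Hypothetical pool: one dominant, one moderate and one rare sequence class.
start = {"dominant": 0.90, "moderate": 0.09, "rare": 0.01}
after = ss_composition(start, k=1.0, total_c0=1.0, t=100.0)
for name in start:
    print(name, round(start[name], 3), "->", round(after[name], 3))
```

Because abundant sequences find their complements faster, the recovered ssDNA fraction is depleted of dominant sequences and relatively enriched for rare ones.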

DNA sequence generation and assembly. Deep sequencing of samples was performed on the Illumina GSA IIX platform, resulting in the generation of over 221 million reads (100 base) or 22 Gb of primary DNA sequence data. The number of reads and total number of bases of DNA sequence generated from each sample subjected to metagenomic DNA sequence analysis are shown in Table 1. Approximately 30% of the generated DNA sequence data (66 million reads) was derived from RAKE normalized DNAs to improve the representation of moderate and low abundance genomic sequences (10X normalized = 24 million reads and 100X normalized = 42 million reads). The normalized DNAs were derived from pooled male and female STAT mice. While the functional annotation of 100 base reads has improved dramatically due to the more comprehensive availability of reference bacterial genomes derived from the gut microbiota, the ability to unambiguously annotate putative gene functions is improved by assembly of DNA sequence into larger contigs. We therefore performed assemblies of the data to create a project-specific set of DNA scaffolds for read mapping of DNA sequences derived from unnormalized control and STAT mouse microbiota. Assemblies were conducted using the Celera assembler. Assembly generated 61,433 contigs (2 or more reads/contig). The average contig length was 3,135 bases, with a median contig length of 895 bases. The N50 contig length was 78,760 bases, where N50 is the weighted median contig length: the length of the smallest contig S in the list of all contigs sorted by length such that the cumulative length from the largest contig through contig S is at least 50% of the total length. Nearly 29,000 contigs of 1 Kb or greater in length were generated, and this large number of gene-size (1 Kb or greater) contigs facilitates mapping to functionally annotated gene sequences. The largest contig generated was over 225 Kb in length. Sequence reads from unnormalized samples were mapped to the assembled contigs and singletons. Functional annotations of the contigs and singletons were generated, and comparisons of gene functional representation were carried out using the MetaRep program [279], developed by scientists at the JCVI for analysis of community-based metagenomic DNA sequences.
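The N50 definition given above translates directly into a short calculation. The sketch below is a minimal illustration; the function name and the toy contig lengths are hypothetical and are not assembly output from this study.

```python
def n50(contig_lengths):
    """Length of the smallest contig S such that contigs of length >= S,
    taken from largest to smallest, cover at least 50% of the total length."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Toy example: total length is 54, so the threshold is 27.
# 10 + 9 + 8 = 27 reaches the threshold, giving an N50 of 8.
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

Because N50 weights contigs by their length, it can be far larger than the mean or median contig length when a small number of long contigs hold most of the assembled sequence.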

Table 1. Summary of metagenomic DNA sequencing data. The samples are listed in the first column. Two individual technical replicates of the RAKE samples were sequenced. The number of reads and the total number of bases generated for each sequenced sample are listed.

Sample                      # of reads     Total # of bases
Male Control                39,961,048      3,996,104,800
Male STAT                   48,269,772      4,826,977,200
Female Control              20,986,078      2,098,607,800
Female STAT                 46,439,698      4,643,969,800
STAT RAKE Normalized 1      24,288,606      2,428,860,600
STAT RAKE Normalized 2      41,880,248      4,188,024,800
TOTAL                      221,825,450     22,182,545,000

Taxonomic analysis of STAT obese and lean metagenomes. Most studies of the gut microbiota in obesity have reported changes in microbial community composition at the phylum level, although some studies have also identified genus-level changes that may impact host metabolism in this context. While early investigations into microbiota profiles in obese subjects reported increases in Firmicutes and decreases in Bacteroidetes as a promising general pattern [243], the results of studies to date have not yielded a consistent gut microbiome profile in obesity. The inconsistencies in findings are likely due to differences in the model system (animal or human) and analytical methods used, as well as confounding variables such as age, enterotypes, genetic background, and diet. Thus, links between taxonomic features and obesity must be interpreted with caution.

Taxonomic analyses of STAT obese and lean microbiomes in both male and female mice were performed using the MetaRep program [279] at the superkingdom, phylum, class, order, and family taxonomic levels. Deeper full taxonomic analysis at the species level was not possible given the large datasets and server capabilities, although read mapping onto the most abundant full reference genomes was also performed, thus allowing species-level associations to be established. As stated previously, the treatment of mice with sub-therapeutic doses of antibiotic induced increased fat mass [67]. However, the obese phenotype is more pronounced in male compared to female mice, and the mechanism is unknown. Throughout this chapter, the results indicate that the antibiotic-induced changes are greater in males than in females, which is consistent with the phenotypic measurements.

The superkingdom-level metagenomic analysis revealed that Archaea domain levels (0.25% of total mapped reads) were 2.5-fold higher in male STAT compared to control mice but similar in female STAT and control mice. Methanosarcina acetivorans was the most abundant Archaea species identified across all samples. Elevated Archaea abundance levels in obesity have been observed in some studies [15, 138, 286]. The Eukarya, Bacteria, and sequences classified as Viruses were similar across all mouse conditions and represented 0.35%, 99.3%, and 0.05% of the total mapped reads, respectively. Most of the viral sequences derived from bacteriophages; the most abundant sequences in the STAT mouse were Bacteroides phage B40-8 (11% of detected viral sequence) and Geobacillus phage GBSV1 (20% of viral sequence).

Analysis of the top ten most abundant phyla (Figure 6) revealed that Bacteroidetes levels were similar in all mouse groups, which has also been observed in other studies [270, 287]. The relative abundance of Bacteroidetes ranged from 78 – 83% and that of Firmicutes from 7 – 13%, which is consistent with previous observations of these phyla in obesity [14]. In male STAT mice, the Firmicutes counts were about 3-fold higher compared to controls, while in female mice the levels were similar. This trend of increased Firmicutes in obesity has been observed in other studies [14]; however, no change or reductions in Firmicutes have also been reported [15, 26, 270]. The Fusobacteria and Euryarchaeota were over 2-fold higher in male STAT compared to control mice, and Actinobacteria and Spirochaetes were approximately 3-fold higher in obese male mice; the levels of these four phyla were similar in female STAT compared to control mice.

Figure 6. Taxonomic variation at the phylum level in the mouse gut microbiome. Relative abundance of the ten most abundant phyla from male and female STAT and control mice. Percent abundance is shown on the x-axis. The phyla are named in the legend.

[Figure 6 image. Groups: Male STAT, Male Control, Female STAT, Female Control. X-axis: percent abundance, 0 to 100. Legend: Bacteroidetes, Firmicutes, Proteobacteria, Verrucomicrobia, Actinobacteria, Euryarchaeota, Cyanobacteria, Fusobacteria, Spirochaetes.]

The Verrucomicrobia were the third most abundant phylum in Bacteria and the phylum found to be most reduced in both male and female STAT compared to matched control mice (2.5- and 5-fold reduced, respectively). Akkermansia muciniphila, which represents 3-5% of the gut community in healthy human subjects, is a mucin-degrading bacterium belonging to the Verrucomicrobia phylum and a dominant resident of the gut mucous layer [288]. The abundance of A. muciniphila correlates inversely with body weight in both mice and humans [215] and is decreased in obese and type 2 diabetic mice [217, 218]. Moreover, administration of viable A. muciniphila cells to mice by oral gavage reversed high-fat diet-induced metabolic pathologies, including fat-mass gain, adipose tissue inflammation, and insulin resistance [217]. The administration of oligofructose, a prebiotic, to genetically obese mice increased the abundance of A. muciniphila by two orders of magnitude [218]. Thus, the development of an A. muciniphila treatment for obesity has been suggested.

The Bacteroidia, Clostridia, and Verrucomicrobiae were the most abundant classes of Bacteria detected (Figure 7). Hierarchical clustering of the gut microbiome DNA sequence data revealed that the STAT mice clustered together and were distinct from the control mice at the class level (Figure 8). These clusters reveal a correlation between the host treatment and the gene content of the microbiome. Deeper phylogenetic hierarchical clustering was not possible with the MetaRep program. Methanobacteria and Fusobacteria abundances were 2 times higher in male STAT compared to control mice, while the levels were similar among female mice. The Clostridia, Bacilli, Actinobacteria, Erysipelotrichi, Spirochaetes, and Negativicutes were all approximately 3-fold more abundant in male STAT compared to control mice. The Verrucomicrobiae were the most reduced class in both male and female mice. These results make evident that STAT treatment of mice results in shifts in the microbial communities involving a broad range of taxa.
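The clustering approach behind this class-level comparison (Bray-Curtis dissimilarity on normalized counts, followed by average linkage) can be sketched as follows. This is an illustrative pure-Python version, not Metarep's implementation; the function names, sample labels, categories, and counts are hypothetical and are not data from this study.

```python
from itertools import combinations


def normalize(counts):
    """Convert raw category counts to fractions of the sample total."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("sample has no counts")
    return {category: n / total for category, n in counts.items()}


def bray_curtis(p, q):
    """Bray-Curtis dissimilarity between two abundance profiles."""
    categories = set(p) | set(q)
    diff = sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)
    total = sum(p.get(c, 0.0) + q.get(c, 0.0) for c in categories)
    return diff / total


def average_linkage(profiles):
    """Agglomerative clustering; returns merges as (cluster, cluster, distance)."""
    pair_dist = {
        frozenset((a, b)): bray_curtis(profiles[a], profiles[b])
        for a, b in combinations(profiles, 2)
    }

    def linkage(c1, c2):
        # Average of all pairwise sample distances between the two clusters.
        pairs = [pair_dist[frozenset((a, b))] for a in c1 for b in c2]
        return sum(pairs) / len(pairs)

    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        merges.append((c1, c2, linkage(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical toy counts for four samples and three categories.
toy_counts = {
    "A": {"cat1": 80, "cat2": 15, "cat3": 5},
    "B": {"cat1": 82, "cat2": 12, "cat3": 6},
    "C": {"cat1": 78, "cat2": 6, "cat3": 16},
    "D": {"cat1": 80, "cat2": 5, "cat3": 15},
}
toy_profiles = {name: normalize(c) for name, c in toy_counts.items()}
toy_merges = average_linkage(toy_profiles)
for c1, c2, distance in toy_merges:
    print(c1, c2, round(distance, 3))
```

In this toy input, samples A and B pair up, as do C and D, before the two pairs join. That is the same pattern as treatment groups forming separate clusters.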

Figure 7. Taxonomic variation at the class level in the mouse gut microbiome. Relative abundance of the ten most abundant classes from male and female STAT and control mice. Percentages are shown on the x-axis. Classes are listed in the legend in decreasing order of abundance.

Figure 8. Gut microbiomes cluster according to host treatment at the class level. Hierarchical clustering plot of mouse gut microbiome data at the class level. Data from STAT mice formed one cluster, while the data from control mice formed a separate cluster.

The most abundant orders detected in Bacteria were the Bacteroidales, Clostridiales, and Verrucomicrobiales (Figure 9). The Lactobacillales, Clostridiales, Selenomonadales, Thermoanaerobacterales, Spirochaetales, Bifidobacteriales, Erysipelotrichales, and Coriobacteriales were all 2-fold or more abundant in male STAT compared to control mice. The Burkholderiales were reduced 2.5-fold in male STAT compared to controls, and the Verrucomicrobiales were reduced 10-fold in male STAT mice and 2.5-fold in female STAT mice compared to controls.
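Fold changes of the kind reported above can be derived from per-sample read counts as in the following sketch. It normalizes each taxon to the sample total, keeps taxa that reach a minimum share of reads in at least one sample, and reports a signed fold change in which reductions are negative. The signed convention, the thresholds, the function names, and the toy counts are all assumptions made for illustration, not the procedure or data of this study.

```python
def signed_fold_change(treated_fraction, control_fraction):
    """Ratio treated/control if >= 1, otherwise the negative inverse ratio."""
    if treated_fraction <= 0 or control_fraction <= 0:
        raise ValueError("fractions must be positive")
    ratio = treated_fraction / control_fraction
    return ratio if ratio >= 1 else -1.0 / ratio


def differential_taxa(treated_counts, control_counts, min_fraction=0.001, min_fold=2.0):
    """Taxa whose normalized abundance differs by at least min_fold."""
    treated_total = sum(treated_counts.values())
    control_total = sum(control_counts.values())
    result = {}
    for taxon in set(treated_counts) & set(control_counts):
        t = treated_counts[taxon] / treated_total
        c = control_counts[taxon] / control_total
        if t == 0 or c == 0 or max(t, c) < min_fraction:
            continue  # skip absent taxa and taxa below the abundance floor
        change = signed_fold_change(t, c)
        if abs(change) >= min_fold:
            result[taxon] = change
    return result


# Hypothetical toy counts (10,000 reads per sample).
toy_treated = {"taxon_x": 300, "taxon_y": 100, "taxon_z": 9600}
toy_control = {"taxon_x": 100, "taxon_y": 1000, "taxon_z": 8900}
toy_changes = differential_taxa(toy_treated, toy_control)
for taxon in sorted(toy_changes):
    print(taxon, round(toy_changes[taxon], 2))
```

Normalizing to the sample total before taking the ratio keeps differences in sequencing depth between samples from appearing as fold changes.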

The most abundant taxonomic families measured were Bacteroidaceae, Porphyromonadaceae, and Prevotellaceae (Figure 10). The Veillonellaceae, Aerococcaceae, Bifidobacteriaceae, Erysipelotrichaceae, Peptococcaceae, and Clostridiales Family XI. Incertae Sedis were 2 times higher in male STAT compared to control mice. The Enterococcaceae, Thermoanaerobacterales Family III. Incertae Sedis, Ruminococcaceae, and Coriobacteriaceae were 3 times higher in male STAT compared to control mice, and the Eubacteriaceae and Lachnospiraceae were over 4 times higher in male STAT compared to control mice (Figure 11). Only two small differences in abundance were observed in the female mouse data: the Thermoanaerobacterales Family III. Incertae Sedis and Lachnospiraceae were 1.5 times higher in female STAT compared to control mice. The Verrucomicrobiaceae were reduced 9-fold in male STAT compared to control mice and 2.5-fold in female STAT compared to control mice. The MetaRep program could not provide taxonomic comparisons deeper than the family level; however, it did generate lists of the most abundant species and functions, with read counts, for each sample, and these lists were compared at the species level for additional higher-resolution taxonomic analysis. These data suggest that the most significant change in the microbiota of STAT mice is the reduction of Verrucomicrobiaceae, the family of which A. muciniphila is a prominent member. While the fold-change observed for the Bacteroidaceae is relatively small, the reduction of this family in the STAT obese male mouse is evident and potentially significant given its numerical dominance in the gut microbiota of these mice. Moreover, the increased abundance of Enterococcaceae in STAT mice may reflect an elevated inflammatory state in the gut of the obese mice. Our published work examining the changes in the gut microbiota that result from Salmonella typhimurium-mediated intestinal infection in mice showed that the abundance of Enterococcus spp. increased sharply in parallel with the growth of the infectious agent over a 28-day time course [289]. These increases came at the expense of the pre-infection commensal microbiota. In addition, at several time points, the abundance of Enterococcaceae exceeded that of S. typhimurium. It is likely that the heightened inflammatory status of the gut environment is highly inhibitory to the normal commensal microbiota but may favor the outgrowth of Enterococcaceae. It cannot be ruled out that the Enterococcaceae directly contribute to the microbially-induced inflammatory response mounted by the host.

Measured features of the microbiota likely reflect a combination of an antibiotic effect that modulates the gut microbial community and a host immune response effect that may induce further changes in the community structure, which could confound data interpretation.

Figure 9. Taxonomic variation at the order level in the mouse gut microbiome. Relative abundance of the ten most abundant orders from male and female STAT and control mice. Percentages are shown on the x-axis. Orders are listed in the legend in decreasing order of abundance.

Figure 10. Taxonomic variation at the family level in the mouse gut microbiome. Relative abundance of the ten most abundant families from male and female STAT and control mice. Percentages are shown on the x-axis. Families are listed in the legend in decreasing order of abundance.

Figure 11. Differential abundance at the family level in the gut microbiome of male mice. The x-axis displays normalized differential abundance ratios (male STAT divided by male control) from families that contain at least 0.1% of the total metagenomic reads from a given sample. The y-axis displays the corresponding family name.

[Figure 11 image. Y-axis (Family), top to bottom: Lachnospiraceae, Eubacteriaceae, Coriobacteriaceae, Ruminococcaceae, Thermoanaerobacterales Family III. Incertae Sedis, Enterococcaceae, Clostridiales Family XI. Incertae Sedis, Peptococcaceae, Erysipelotrichaceae, Bifidobacteriaceae, Aerococcaceae. X-axis ticks: -10, -5, 0, 5.]

The most abundant species detected in the male control mouse group was Akkermansia muciniphila, which was not among the top eight most abundant species in the male STAT mice. The most abundant organism detected in the male STAT animals was reported as Bacteroides fragilis; however, upon closer inspection of the data, this assignment is unlikely to be correct (Figure 12). The reads assigned to B. fragilis likely derive from an unsequenced, close relative of B. fragilis. After STAT treatment, A. muciniphila is no longer found within the top eight species, and the distribution of the most abundant species is more uniform. When comparing the most abundant species for each group, Alistipes shahii is enriched 1.8-fold and Bacteroides cellulosilyticus and Bacteroides intestinalis are 1.6-fold more prevalent in male STAT compared to control mice. A similar trend was observed in the females, wherein the most abundant species detected in the control female group was A. muciniphila, while the most abundant organism detected in the female STAT mice was the B. fragilis sibling species (Figure 13). When comparing the abundance of the top species in the female groups, A. muciniphila is reduced 2.2-fold in STAT compared to control mice. The metagenomic mapping pertaining to A. shahii is increased 1.3-fold in female STAT-treated mice and is therefore not consistent with the trend established by the male mice. More detailed analyses of this genome and that of A. muciniphila were undertaken and are described below. In addition, B. fragilis-like spp., Parabacteroides merdae, B. cellulosilyticus, and Parabacteroides johnsonii were enriched 1.4-fold in female STAT compared to control animals. The increased abundance in the STAT mouse of A. shahii, an active cellulose degrader whose genome is replete with glycosyl hydrolases, may represent a potential topic for future research [290]. B. intestinalis and B. cellulosilyticus were also enriched in the STAT mouse and are related species that also actively metabolize cellulose in the gut [291].

Metagenomic read mapping to the reference genomes of abundant gut species. Metagenomic reads were aligned with the A. muciniphila (Verrucomicrobia phylum), A. shahii (Bacteroidetes phylum), and B. fragilis (Bacteroidetes phylum) genomes, which were of interest given the differential abundance observed between treatment groups and because full reference genomes were available for these organisms. A. muciniphila DNA sequence reads mapped uniformly across the length of the reference genome and to 90% of the organism’s genes, with the exception of 7 strain-specific genomic regions (Figure 14). These results also indicated that A. muciniphila abundance is reduced 2-fold in male STAT mice. The A. shahii reads also mapped uniformly across its genome; however, substantially more heterogeneity exists between the reference genome and the strains populating the gut of our experimental mice, as the gene coverage of mapped reads was about 80% (Figure 15). Nevertheless, it is evident that A. shahii is overrepresented 2-fold in female STAT mice and 8-fold in male STAT mice. Few reads mapped to the B. fragilis reference genome; thus, we have likely detected a highly abundant yet unsequenced Bacteroides sp. A. muciniphila is a mucin degrader and exists in the gut as a dominant member of the mucosal-associated microbiota. The abundance of A. muciniphila is inversely proportional to body weight and to molecular features associated with type 2 diabetes. Given these relationships, it may be concluded that A. muciniphila, in the rodent model, promotes gut health.

Common to both obesity and type 2 diabetes are an altered gut microbial community structure, chronic inflammation, and reduced barrier function of the intestinal epithelial cells (IECs). Recent studies have shown that A. muciniphila may be administered to treat obesity and type 2 diabetes by reversing high fat diet-induced metabolic disorders and the inflammation induced by endotoxemia and by adipose tissue-produced inflammatory cytokines [217]. Mice treated with A. muciniphila also displayed improved gut barrier function. Taken together, these findings indicate that A. muciniphila is an important species in the context of obesity. Future efforts should seek to determine more precisely the A. muciniphila-host interaction and the mechanisms through which it helps to maintain metabolic function and homeostasis.

Figure 12. Relative abundance of abundant species in the gut microbiome of male STAT and control mice. The species are listed in the legend.

Figure 13. Relative abundance of abundant species in the gut microbiome of female STAT and control mice. The species are listed in the legend.

Figure 14. Metagenomic read mapping onto the Akkermansia muciniphila reference genome and differential abundance of mapped reads in male and female

STAT and control mice. a, Genome coverage of DNA read mapping. The x-axis indicates chromosome position of detected reads. RPKM log2 ratios for each sample, which are named above each respective plot, are displayed on the y-axis. RPKM = total reads / mapped reads (millions) X gene length (Kb). b, Differential abundance of mapped reads. The y-axis displays STAT/Control log2 ratios, with the totals that include both genders on the first row, followed by the female data, and finally the male data in the last row.


Figure 15. Metagenomic read mapping onto the Alistipes shahii reference genome and differential abundance of mapped reads in male and female STAT and control mice. a, Genome coverage of DNA read mapping. The x-axis indicates chromosome position. RPKM log2 ratios for each sample, which are named above each respective plot, are displayed on the y-axis. b, Differential abundance of mapped reads. The y-axis displays log2 ratios of STAT/Control, with the totals that include both genders on the first line, followed by the female then male data. The x-axis indicates chromosome position of mapped metagenomic reads.


Comparative functional genomics analysis in STAT obese and lean gut microbiomes. Functional analysis revealed that the three most abundant enzyme categories in the metagenomic data were ligases forming aminoacyl-tRNA and related compounds (EC 6.1.1), nucleotidyltransferases (EC 2.7.7), and methyltransferases (EC 2.1.1). The categories that were most overrepresented in male STAT compared to control mice and that represented at least 0.1% of total mapped reads for a given sample were (Figure 16): With a quinone or similar compound as acceptor (EC 1.1.5), Protein-histidine kinases (EC 2.7.13), Acting on acid anhydrides to facilitate cellular and subcellular movement (EC 3.6.4), and Acting on polysaccharides (EC 4.2.2).

The female data were similar for these enzyme groups with the exception of EC 1.1.5, which was almost two times higher in female STAT compared to control mice. The most underrepresented categories in both male and female STAT compared to control mice that represented at least 0.1% of total mapped reads for a given sample were: With an iron-sulfur protein as acceptor (EC 1.4.7), Protein-serine/threonine kinases (EC 2.7.11), Protein-tyrosine kinases (EC 2.7.10), and With oxygen as acceptor (EC 1.4.3).
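The selection of over- and underrepresented enzyme categories described above can be sketched in Python as follows. The signed fold-change convention (depleted categories reported as negative values, as the negative axis values in the figures suggest) is an assumption of this sketch, and the category counts are hypothetical.

```python
def signed_fold_change(stat_fraction, control_fraction):
    """STAT/Control when enriched, -(Control/STAT) when depleted (assumed convention)."""
    if stat_fraction <= 0 or control_fraction <= 0:
        raise ValueError("fractions must be positive")
    ratio = stat_fraction / control_fraction
    return ratio if ratio >= 1 else -1 / ratio


def differential_categories(stat_counts, control_counts, min_fraction=0.001):
    """Fold changes for categories holding at least min_fraction (0.1%)
    of the mapped reads in at least one of the two samples."""
    stat_total = sum(stat_counts.values())
    control_total = sum(control_counts.values())
    result = {}
    for category in stat_counts.keys() & control_counts.keys():
        s = stat_counts[category] / stat_total
        c = control_counts[category] / control_total
        if max(s, c) >= min_fraction:
            result[category] = signed_fold_change(s, c)
    return result


# Hypothetical read counts per enzyme category (not data from this study).
stat = {"EC 1.1.5": 400, "EC 1.4.7": 50, "other": 99_550}
control = {"EC 1.1.5": 100, "EC 1.4.7": 400, "other": 99_500}
print(differential_categories(stat, control))
```

Normalizing each count by the sample total before taking the ratio keeps differences in sequencing depth between samples from appearing as fold changes.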

Under the anaerobic conditions present in the gut, glycolysis produces lactate through the reduction of pyruvate coupled to the oxidation of NADH. Under aerobic conditions, pyruvate generally enters the TCA cycle, where it is oxidized to carbon dioxide while generating reducing equivalents in the form of NADH. Oxygen serves as the terminal electron acceptor as ATP is produced during electron transport. Under anaerobic conditions, the TCA cycle is bypassed and ATP production occurs through less efficient processes which utilize sulfate, fumarate, methane, and acetate as terminal electron acceptors. Several of the differentially represented gene sequences mentioned previously encode functions related to redox states. For example, NADH:quinone oxidoreductase is also known as NADH dehydrogenase and catalyzes the conversion of NADH to NAD+. The over-representation of this enzyme in STAT mice compared to controls may drive elevated NAD+ levels in the cell.

Iron-sulfur proteins chelate iron in various oxidation states in Fe-S clusters. These proteins mediate a wide range of cellular functions involving metalloproteinases, NADH dehydrogenase, cytochrome c reductase, and succinate-coenzyme Q reductase. Enzymes and proteins containing Fe-S clusters are particularly sensitive to oxidative damage caused by reactive oxygen species (ROS). This class of enzymes was under-represented in the STAT compared to control group.

The hierarchical cluster (HCL) analysis of the metagenomics data revealed that, at the enzyme level, sequence data from the STAT mice formed one cluster and data from the control mice formed a separate, discrete cluster (Figure 17). The HCL analysis is based on a dissimilarity matrix of samples that is contingent on differences in their counts per category. The matrix was generated by calculating the Bray-Curtis dissimilarity indices among samples using normalized category counts (count divided by the number of overall counts per sample). Datasets were clustered in the MetaRep program with Average Linkage, which uses the average distance between the members of two clusters to choose each subsequent merge.
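The clustering procedure described above can be illustrated with a small, self-contained Python sketch: counts are normalized per sample, pairwise Bray-Curtis dissimilarities are computed, and clusters are merged by average linkage. This is not the MetaRep implementation; the function names, sample names, and counts are hypothetical.

```python
from itertools import combinations


def normalize(counts):
    """Divide each category count by the total count of the sample."""
    total = sum(counts)
    return [c / total for c in counts]


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance profiles."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator


def average_linkage(profiles):
    """Agglomerative clustering; returns a list of (cluster_a, cluster_b, distance)."""
    names = list(profiles)
    dist = {frozenset((a, b)): bray_curtis(profiles[a], profiles[b])
            for a, b in combinations(names, 2)}

    def cluster_distance(x, y):
        # Average of all pairwise distances between members of x and y.
        pairs = [dist[frozenset((a, b))] for a in x for b in y]
        return sum(pairs) / len(pairs)

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        x, y = min(combinations(clusters, 2), key=lambda p: cluster_distance(*p))
        merges.append((x, y, cluster_distance(x, y)))
        clusters = [c for c in clusters if c not in (x, y)] + [x + y]
    return merges


# Hypothetical counts for three enzyme categories in four samples.
samples = {
    "STAT_male": normalize([40, 10, 50]),
    "STAT_female": normalize([38, 12, 50]),
    "Control_male": normalize([10, 40, 50]),
    "Control_female": normalize([12, 38, 50]),
}
for left, right, distance in average_linkage(samples):
    print(left, right, round(distance, 3))
```

With these illustrative counts, the two STAT samples and the two control samples are each joined first, and the two treatment clusters are joined last.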


Figure 16. Differential abundance of enzyme class hits in the gut microbiome. The y-axis displays normalized fold-change ratios (STAT/Control) for the most under and over abundant enzyme classes that represent at least 0.05% of the total reads mapped for a given sample. The legend displays EC numbers and categories.

[Figure 16 plot. Series: Female, Male. Legend: 1.1.5.- With a quinone or similar compound as acceptor; 2.7.13.- Protein-histidine kinases; 3.6.4.- Acting on acid anhydrides to facilitate cellular and subcellular movement; 4.2.2.- Acting on polysaccharides; 1.4.3.- With oxygen as acceptor; 2.7.10.- Protein-tyrosine kinases; 2.7.11.- Protein-serine/threonine kinases; 1.4.7.- With an iron-sulfur protein as acceptor. Axis range: -8 to 5.]

Figure 17. Microbiomes cluster according to host treatment at the enzyme category level. Hierarchical clustering plot of mouse gut microbiome data at the enzyme level. Sequences from STAT mice enzyme hits formed one cluster while the data from control mice formed a separate cluster.


Additional functional analysis of the coding capacity of lean and STAT obese gut microbiomes was performed using Gene Ontology (GO) categories [52]. GO Slim analysis revealed 9 enriched and 6 depleted categories in obese compared to lean mice (Figure 18). While not the most highly abundant GO categories, these differentially enriched categories met a threshold of at least 1,000 reads mapped per category (0.05% of total reads per category). GO Slim output represents higher-order, overview ontologies that contain a subset of the entire GO content. These terms are useful for summarizing the GO annotation of genome or microarray data because they denote broad classes of gene function. Functions involving organism motility and inter-cellular movement were enriched in STAT mice. GO analysis of the molecular function namespace, or container of indices, revealed that carbohydrate binding (GO:0005529, GO synonym: sugar binding) was enriched 5.5-fold in male STAT compared to control mice and 2-fold in female STAT compared to control mice (Figure 19). While below the 2-fold change threshold, two-component sensor activity (GO:0000155), thioredoxin peroxidase activity (GO:0008379), and heat shock protein binding (GO:0031072) were enriched 1.7-fold or greater in male STAT compared to control mice. Certain metabolic processes were depleted in STAT compared to control mice. The protein transporter activity (GO:0008565), guanine deaminase activity (GO:0008892), acyl carrier activity (GO:0000036), and ligase activity, forming carbon-carbon bonds (GO:0016885) functions were underrepresented by 2-fold in male STAT compared to control mice. Additionally, the phosphopantetheine binding (GO:0031177) category was underrepresented by 3-fold, and the oxidoreductase activity, acting on hydrogen as donor (GO:0016695) and ligase activity, forming carbon-sulfur bonds (GO:0016877) functions were reduced by more than 4-fold.


HCL analysis of both GO Slim and GO molecular function yielded a discrete clade with female and male STAT mice in contrast to the female and male control data (Figure 20).

Among the observed changes, the increase in sugar binding functions in the male STAT mouse may be significant with respect to the obese phenotype. The reduction of genes encoding thioredoxin peroxidase is predicted to increase the concentration of oxidized thioredoxin. Thioredoxin in its oxidized form will be inhibited in terms of its ability to metabolize H2O2 leading to elevated oxidative stress in the gut environment. Elevated oxidative stress is known to increase inflammation, which in turn may stimulate increased IEC turn-over and apoptosis, both of which lead to reduced barrier function that is a significant characteristic of the obese phenotype [292].


Figure 18. GO Slim analysis of lean and obese gut microbiomes. The x-axis displays normalized fold-change ratios (STAT/Control). The legend displays GO Slim categories.

[Figure 18 plot. Series: Male, Female. Legend: locomotion; flagellum; cell projection; motor activity; cellular component movement; site-specific recombinase activity; DNA packaging; energy reserve metabolic process; cyclase activity; multi-organism process; cellular component disassembly. x-axis range: -12 to 8.]

Figure 19. GO molecular function namespace analysis of STAT and lean gut microbiomes. The y-axis displays fold change differences and the legend lists the GO category.

[Figure 19 plot. Series: Female, Male. Legend: ligase activity, forming carbon-sulfur bonds; phosphopantetheine binding; ligase activity, forming carbon-carbon bonds; acyl carrier activity; guanine deaminase activity; protein transporter activity; two-component sensor activity; thioredoxin peroxidase activity; heat shock protein binding; sugar binding. y-axis range: -6 to 6.]

Figure 20. GO HCL plots of male and female lean and STAT gut microbiomes. a, GO Slim cluster. b, GO molecular function clusters.


Comparative metagenomic analysis of the common name annotation fields revealed that the top ten most abundant common names were similar across all mouse groups in terms of both gender and antibiotic treatment. The top ten common names were identical in female STAT and control mice; however, the TonB-linked outer membrane protein SusC/RagA family was slightly enriched (1.4-fold) in female STAT compared to control mice. The top ten common names identified in the male STAT and control groups were similar with the exception of the RNA polymerase sigma factor, sigma-70 family, which was found in the male control but not the male STAT group (Figure 21). In the male groups, the representation of the top common names was similar with the exception of the TonB-linked outer membrane protein SusC/RagA family, which was underrepresented by 2-fold in the male STAT compared to control group. Members of this family of outer membrane proteins are almost exclusively restricted to the Bacteroidetes phylum and occur in high copy number. Members of this clade pair with SusD/RagB family members to form transporter complexes that are theorized to import protein degradation products (e.g. RagA) or carbohydrates (e.g. SusC) as nutrients, as opposed to siderophores [293]. TonB-linked outer membrane protein members have also been implicated in the IBD-associated immune response [294]. However, the significance of the reduction in representation of this family in the male STAT group compared to controls is unclear.


Figure 21. Most abundant gene annotation common names in male STAT and control gut microbiomes. The common name annotation fields are listed in the legend.


The TIGRFAMs resource contains Hidden Markov Models (HMM) derived from multiple sequence alignments for protein sequence classification [295]. Over 120 TIGRFAMs with greater than 1,000 mapped reads were found to be differentially enriched in STAT and control gut microbiomes, with 45 TIGRFAMs representing > 0.1% of the total counts per sample (Figures 22 and 23). The most differentially represented TIGRFAM in both the male and female groups was CRISPR-associated protein Cas9/Csn1, subtype II/NMENI (TIGR01865), which was enriched by > 40-fold in STAT mice compared to controls. CRISPR-associated endonuclease Cas1, subtype II/NMENI (TIGR03639) and CRISPR-associated helicase Cas3 (TIGR01587) were also enriched but to a lesser extent. Clustered Regularly Interspaced Short Palindromic Repeat (CRISPR) loci and the associated proteins (Cas) are bacterial mobile elements, encoded by a wide range of prokaryotes, that act as a system of adaptive immunity against viruses and plasmids and interact with key DNA repair proteins such as RecA [296]. The second most differentially enriched category in both male and female STAT mice compared to controls was lipoyltransferase and lipoate-protein ligase (TIGR00545), which was enriched by 20-fold in male STAT mice and 3.4-fold in female STAT mice compared to controls. Obesity-reducing effects of alpha-lipoic acid have been observed in some studies. The glycerol-3-phosphate dehydrogenase, anaerobic, A, B, and C subunit (TIGR03377, TIGR03378, TIGR03379) protein families were enriched 4-fold in both male and female STAT compared to control mice. This enzyme complex converts glycerol-3-phosphate to dihydroxyacetone phosphate using a final electron acceptor other than oxygen, such as fumarate or nitrate. This complex plays a key role in lipid biosynthesis, and enhanced adipose tissue activity has been associated with obesity and adipocyte differentiation [297, 298].

The significance of this activity in the gut microbiota is unclear with respect to host obesity. The C subunit contains motifs indicative of iron-sulfur binding. The magnesium-dependent HAD phosphoserine phosphatase-like hydrolase, family IB (TIGR01488) and HAD hydrolase, family IIA (TIGR01460) and family IIB (TIGR01484) were all enriched by more than 2-fold in male STAT compared to control mice. These families are involved in N-acetylglucosamine metabolism and may mediate cell wall recycling [299]. The magnesium (TIGR00400) and ammonium (TIGR00836) transporters were enriched in both male and female STAT compared to control mice. Ammonium transport is necessary to maintain active cell growth during periods of low intracellular ammonium concentration [300]. One could speculate that increased magnesium transport is a marker of increased bacterial growth in the STAT mouse gut [301].

The Delta-proteobacterial GC-motif containing C-terminal protein sorting domain (TIGR03901) was the most depleted (10-fold reduction) category in male STAT compared to control mice and was also depleted in female STAT compared to controls. A member protein from this family, TraA, can repair mobility defects in community members by contact-dependent outer membrane exchange of lipids and proteins [302]. The PEP-CTERM protein sorting domain (TIGR02595), which is analogous to TIGR03901, was the second most depleted category in male STAT (6-fold reduction) compared to control mice and was also depleted in female STAT compared to controls, although to a slightly lesser extent. The Tripartite ATP-independent periplasmic transporter (TRAP-T) family, DctM subunit TRAP-T family permeases (TIGR00786) category was depleted by 5-fold in male STAT compared to control mice. This subfamily contains membrane proteins involved in the uptake of dicarboxylates such as malate, fumarate, and succinate. The reduction of these uptake systems may be compensated for by the over-representation of enzymes in the STAT mice that increase the levels of succinate and/or fumarate in the cell (see biochemical pathway analysis). The subfamily TRAP transporter solute receptor, DctP family (TIGR00787) was depleted by 3-fold in male STAT compared to control mice and was also depleted in female STAT compared to control groups. The Bacteroides expansion family 1 (TIGR02985) was depleted 4-fold in male STAT compared to control mice. With the exception of two known family members outside of Bacteroides, the TIGR02985 subfamily represents a group of sigma factors that are found within members of the Bacteroides genus.


Figure 22. HMM TIGRFAM analysis of depleted categories in STAT and lean gut microbiomes. Depleted TIGRFAM categories (STAT/control ratios) representing > 0.1% of total mapped reads per sample are shown. The x-axis displays fold change and the legend lists the TIGRFAM category.




Figure 23. TIGRFAM HMM domain analysis of enriched categories in STAT and lean gut microbiomes. Enriched TIGRFAM categories (STAT/Control ratios) with > 0.1% of hits per sample are shown. The x-axis displays fold change and the legend lists the TIGRFAM category.


Biochemical pathway analysis in STAT obese and lean gut microbiomes. With the exception of the superpathways, MetaCyc pathways model individual biological pathways from individual organisms and are smaller than KEGG pathways, which are typically mosaic pathways representing the activities of multiple species. In contrast, MetaCyc superpathways describe pathways possessed by multiple taxa.

Seven MetaCyc pathways, each with greater than 1,000 reads mapped, were found to be differentially enriched in STAT and control gut microbiomes (Figure 24). The UDP-N-acetyl-galactosamine biosynthesis I pathway (PWY-5512), a step in the processing and incorporation of cell wall polysaccharides, was the most enriched (5-fold increased representation) MetaCyc pathway in male STAT compared to control groups. Interestingly, the glycerol-3-phosphate shuttle (PWY-6118) was enriched in both male and female STAT compared to control mice, which corresponds to the glycerol-3-phosphate dehydrogenase A and B subunit (TIGR03377 and TIGR03378) protein family members also found to be enriched in both male and female STAT compared to control groups in the HMM analysis. Glycerol-3-phosphate dehydrogenase represents the first step in the glycerol-3-phosphate shuttle pathway, which is annotated as occurring in yeast but not in bacteria and generates NAD+ and glycerol-3-phosphate. The glycerol degradation IV (PWY-4261) and fatty acid activation (PWY-5143) pathways were enriched, while the acetate conversion to acetyl-CoA pathway (PWY-1313) represented the most depleted category in male STAT compared to control mice. The reduction of enzymes that convert acetate to acetyl-CoA in STAT compared to control mice may be related to the increased utilization of acetate as a terminal electron acceptor, which is used by acetogenic bacteria for anaerobic respiration and ATP generation [142]. The ppGpp biosynthesis pathway (PPGPPMET-PWY) was found to be enriched in male STAT compared to control animals. These nucleotides are key markers and regulators of the stringent response in many bacterial species during nutrient deprivation and environmental stress. They contribute to the regulation of cellular processes such as growth, cell division, motility, and both positive and negative transcriptional regulation of some Sigma-70-dependent genes, and they often signal the induction of virulence factors [303]. This is interesting in the context of the GO analysis, which revealed enrichment of functions related to growth and motility in male STAT compared to control animals, and of the depletion of Bacteroides-specific Sigma-70 family sequences in male STAT compared to control groups. The DIMBOA-glucoside degradation pathway (PWY-4441) was enriched in both male and female STAT compared to control mice. This pathway is annotated as found in the Plant kingdom; however, it is likely that these enzymes are bacterially encoded. Their elevated representation in STAT mice is consistent with increased energy harvest from dietary polysaccharides in STAT mice compared to controls.

The superpathway of sulfur metabolism (Desulfocapsa sulfoexigens) (PWY-5308) and superpathway of thiosulfate metabolism (Desulfocapsa sulfoexigens) (PWY-5306) were the most depleted categories in female STAT compared to control mice and the second most depleted categories in male STAT compared to control mice. The superpathway of sulfur amino acid biosynthesis (Saccharomyces cerevisiae) (PWY-821) was depleted in both the male and female STAT compared to control groups. The methylglyoxal degradation VI pathway (MGLDLCTANA-PWY), which is annotated as occurring in mammals such as Rattus norvegicus, was decreased in male STAT compared to control mice. A methylglyoxal degradation VI pathway is found in bacteria such as Escherichia coli K-12, and methylglyoxal reductases have been characterized for this kingdom [304]. Thus, these signatures could have been derived from either the host or the gut commensals.

The NADH to cytochrome bo oxidase electron transfer (PWY0-1335), 3-amino-5-hydroxybenzoate biosynthesis (PWY-5979), which is curated with the Actinobacteria phylum as its expected taxonomic range, and pentose phosphate pathway (oxidative branch) (OXIDATIVEPENT-PWY) pathways were also reduced in the male STAT compared to control group. The oxidative branch of the pentose phosphate pathway serves to convert glucose-6-P to pentose sugars and reducing equivalents in the form of NADPH. One important function of NADPH in the cell is to reduce oxidative stress. NADPH, acting with glutathione and glutathione reductase, supports the conversion of H2O2 into H2O; when this function is inhibited, H2O2 is converted to harmful hydroxyl free radicals that cause damage to macromolecules within the cell.
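The redox relationships described above can be written out explicitly. The reactions below are standard textbook stoichiometry rather than results of the metagenomic analysis; the glutathione peroxidase and Fenton steps are not named in the text and are included here only as the usual routes by which glutathione detoxifies H2O2 and by which unmetabolized H2O2 yields hydroxyl radicals.

```latex
\begin{align*}
% Oxidative branch of the pentose phosphate pathway (net)
\text{glucose-6-P} + 2\,\mathrm{NADP^+} + \mathrm{H_2O}
  &\rightarrow \text{ribulose-5-P} + 2\,\mathrm{NADPH} + 2\,\mathrm{H^+} + \mathrm{CO_2} \\
% Glutathione reductase regenerates reduced glutathione (GSH)
\mathrm{GSSG} + \mathrm{NADPH} + \mathrm{H^+}
  &\rightarrow 2\,\mathrm{GSH} + \mathrm{NADP^+} \\
% Glutathione peroxidase detoxifies hydrogen peroxide
2\,\mathrm{GSH} + \mathrm{H_2O_2}
  &\rightarrow \mathrm{GSSG} + 2\,\mathrm{H_2O} \\
% Fenton reaction: fate of H2O2 when detoxification is limited
\mathrm{Fe^{2+}} + \mathrm{H_2O_2}
  &\rightarrow \mathrm{Fe^{3+}} + \mathrm{OH^-} + {}^{\bullet}\mathrm{OH}
\end{align*}
```

Read together, the first three reactions show why a reduced oxidative pentose phosphate branch would be expected to limit the NADPH available for H2O2 detoxification.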


Figure 24. MetaCyc pathway analysis of STAT and lean gut microbiomes. The x-axis displays fold-change and the legend lists the names of specific MetaCyc pathways.


The KEGG pathway analysis involved mapping metagenomic reads onto KEGG pathway maps, which represent known interactions with KEGG Orthology groups, for the interpretation of high-level systemic functions. From a total pathway perspective, few differences in the total number of reads mapped or abundance were detected between the STAT and control groups (Figure 25). The Drug metabolism - cytochrome P450 (00982) and Metabolism of xenobiotics by cytochrome P450 (00980) pathways were the most differentially enriched pathways in male STAT compared to control animals (5-fold) and were enriched to a lesser extent in the female STAT group. Xenobiotics are metabolized by a large family of enzymes. This metabolism and the division of activities between the host and gut microbiota are not yet well defined. Xenobiotic metabolism is increased in STAT mice compared to controls, but the products of this activity are largely uncharacterized and therefore the impact on the host is unknown. Dietary xenobiotics are recognized by a variety of nuclear receptors. The nonsteroid constitutive androstane receptor (CAR) regulates glucose homeostasis, lipogenesis, and xenobiotic and energy metabolism. CAR-deficient mice are more susceptible to high fat diet-induced obesity [305]. Other critical receptors recognizing xenobiotics include the nuclear receptor protein group called the peroxisome proliferator-activated receptors (PPARα, β/δ, and γ). These receptors respond to a variety of metabolic signals that impact lipid balance. PPARγ is activated in the host by fatty acid binding, which promotes lipid biosynthesis and storage. Administration of penicillin group antibiotics induces cytochrome P450 isozymes [306], which is intriguing given that the STAT mice were treated with penicillin. The Starch and sucrose metabolism (00500), Biosynthesis of vancomycin group antibiotics (01055), and Cyanoamino acid metabolism (00460) pathways were enriched 1.3-fold in male STAT compared to control mice; of these pathways, only the Cyanoamino acid metabolism pathway was enriched in female STAT compared to control mice.

The Penicillin and cephalosporin biosynthesis (00311) and beta-Lactam resistance (00312) pathways, which interact with one another, were slightly underrepresented in the male STAT compared to control group, which is interesting given the antibiotic treatment in the obese group. The Sulfur metabolism (00920) pathway was also depleted in male STAT compared to control animals, which corresponds to the MetaCyc pathway analysis that revealed underrepresentation of the sulfur (PWY-5308) and thiosulfate (PWY-5306) metabolism superpathways and the superpathway of sulfur amino acid biosynthesis (PWY-821). The Arachidonic acid metabolism (00590) pathway was depleted in male STAT compared to control mice. The cytochrome P450 enzymes in this pathway metabolize arachidonic acid to eicosanoids, which control a variety of biological functions such as cell proliferation and inflammation [307]. The depletion of the Arachidonic acid metabolism pathway and enrichment of the two Cytochrome P450 KEGG pathways in male STAT compared to control mice is noteworthy.


Figure 25. KEGG Pathway analysis at the whole pathway level in STAT and lean gut microbiomes. The x-axis displays fold-change at the overall pathway level and the legend lists KEGG Pathway names.


The analysis of individual KEGG pathways in male STAT and control mouse metagenomes revealed many differentially abundant enzymes. For brevity, only KEGG pathways that displayed overall differential abundance at the pathway level and contained more than 2 differentially represented enzymes were subsequently analyzed in greater detail. First, the overrepresented pathways were examined. In the Cyanoamino acid metabolism pathway (00460), which is described primarily in plant microbes, seven enzymes were differentially represented in STAT and control mice. Beta-glucosidase (EC 3.2.1.21) was overrepresented 1.7-fold and the category With NADH or NADPH as one donor, and incorporation of one atom of oxygen (EC 1.14.13.-) was overrepresented 1.5-fold in male STAT compared to control mice (Figure 26). Amino acids such as L-tyrosine, L-isoleucine, L-valine and L-phenylalanine are metabolized in a series of pathways involving overlapping and unique enzymes (EC 1.14.13.-) that produce (Z)-4-hydroxyphenyl-acetaldehyde oxime, (Z)-2-methyl-butanal oxime, (Z)-2-methyl-propanal oxime and (Z)-phenyl-acetaldoxime, respectively. Beta-glucosidase is a glycoside hydrolase that enables the release of glucose from polysaccharides, and its inhibitors are being studied extensively as potential anti-obesity, anti-diabetic, and anti-tumor therapeutic agents [308]. The over-representation of β-glucosidase in STAT mice is of potential importance in converting these substrates to a series of nitrile compounds such as (S)-4-hydroxy-mandelonitrile, 2-hydroxy-2-methyl-butanenitrile, 2-hydroxy-2-methyl-propanenitrile and (R)-mandelonitrile. The significance of these compounds has not been well established in the scientific literature. It is possible that the over-representation of β-glucosidases in STAT compared to control mice has more relevance in the context of other pathways (discussed below) that metabolize cellulose and other host-indigestible polysaccharides. Asparaginase (EC 3.5.1.1) was slightly underrepresented (1.3-fold) in the male STAT compared to control group. Given the locations of these differentially represented enzymes within the Cyanoamino acid metabolism pathway, it is likely that L-aspartate metabolites are reduced and cyano amino acid metabolites are increased in STAT compared to control mice (Figure 26).

The most abundant species representing the metagenomic reads mapped onto the Cyanoamino acid metabolism pathway derive from the Bacteroidetes phylum (Figure 27). The detection of Porphyromonas gingivalis in both the male STAT and control groups is likely incorrect, as this species is a known oral pathogen [309]. The reads may derive from a yet to be sequenced Porphyromonas species. The pathway mapping revealed that Prevotella marshii, Parabacteroides merdae, and Clostridium bolteae were among the top species of male control but not male STAT mice, and that Bacteroides coprophilus, Alistipes shahii, Butyrivibrio fibrisolvens, and Spirochaeta smaragdinae were among the top species in male STAT but not control mice. The Faecalibacterium prausnitzii, Prevotella bergensis, and Bacteroides xylanisolvens species were represented more than 2-fold higher in male STAT compared to controls within the Cyanoamino acid metabolism hits, and Bacteroides thetaiotaomicron was underrepresented by 1.5-fold in male STAT compared to control.


Figure 26. Mapping of differential enzyme abundance from mouse gut microbiomes onto the Cyanoamino acid metabolism pathway (00460). Enzymes overrepresented in male STAT compared to control groups are colored in yellow while those underrepresented are colored in blue. Enzymes with equal representation are shown but not colored. Metagenomic reads mapped onto 26% of the enzymes in this pathway.


Figure 27. Top species to which metagenomic reads mapped onto the Cyanoamino acid metabolism pathway (00460) also map. Species listed in the legend in decreasing order of abundance per treatment group.


In the Starch and sucrose metabolism KEGG pathway (00500), 18 enzymes were differentially represented in male STAT compared to control mice (Figure 28). Given that two starch synthases are overrepresented in male STAT compared to control mice in this pathway, and given the position of the overabundant enzymes within the pathway, it is likely that amylose, D-xylose, fructose and glucose metabolites are overabundant in the male STAT compared to control group (Figure 29). Specifically, this differential gene representation predicts several alterations in the STAT mouse, such as the increased generation of D-xylose due to increased representation of xylan 1,4-β-xylosidase (EC 3.2.1.37), and the increased generation of α-D-glucose and/or β-D-fructose due to increased representation of β-fructosidase (EC 3.2.1.26) and sucrose α-glucosidase (EC 3.2.1.48), respectively. Similarly, increased representation of amylo-α-1,6-glucosidase (EC 3.2.1.10) promotes the increased conversion of isomaltose and/or dextrin to α-D-glucose. It appears that these shifts in enzyme representation may increase glycogen/amylose. The UDP-glycogen synthase (EC 2.4.1.11), which is overrepresented, converts UDP-glucose to glycogen. Likewise, ADP-glucose pyrophosphorylase (EC 2.7.7.27) and ADP-glucose synthase (EC 2.4.1.21) drive increased conversion of α-D-glucose-1P to ADP-glucose and glycogen, respectively. Finally, the increased representation of glycogen phosphorylase (EC 2.4.1.1) and amylo-(1,4->1,6)-transglycosylase (EC 2.4.1.18) may result in the increased biosynthesis of glycogen.

A number of genes encoding enzymes within the Starch and sucrose metabolism pathway are also decreased in representation in the STAT mouse compared to controls. The representation of the α-D-glucose 1,6-phosphomutase (EC 5.4.2.2), glucose-1-phosphate cytidylyltransferase (EC 2.7.7.33), uridine 5'-diphosphoglucose pyrophosphorylase (EC 2.7.7.9), and UDP-D-glucose dehydrogenase (EC 1.1.1.22) genes was decreased in the STAT mouse compared to controls, and all of these enzymes metabolize α-D-glucose-1P. The reduced representation of these gene sequences in the STAT mouse suggests that α-D-glucose-1P will be maintained at higher concentrations in the STAT mouse compared to controls.

The KEGG pathway mapping revealed that Akkermansia muciniphila, Prevotella marshii, Clostridium bolteae, Ferrimonas balearica, and Prevotella ruminicola were among the top species of male control but not male STAT mice, and that Bacteroides vulgatus, Bacteroides dorei, and Eubacterium ventriosum were among the top species in male STAT but not control mice within the Starch and sucrose metabolism KEGG pathway (Figure 30). The Alistipes shahii and Butyrivibrio fibrisolvens species were represented 1.3-fold higher in male STAT compared to controls within the Starch and sucrose metabolism hits, and Parabacteroides distasonis was underrepresented over 2-fold in male STAT compared to control.


Figure 28. Differential abundance of enzymes in the Starch and sucrose metabolism (00500) KEGG pathway in male STAT compared to control mouse gut microbiomes. The x-axis displays fold-change and the legend lists enzyme names and EC numbers in parentheses.


Figure 29. Mapping of differential enzyme abundance from mouse gut microbiomes onto the Starch and sucrose metabolism KEGG pathway (00500). Enzymes overrepresented in male STAT compared to control groups are colored in yellow while those underrepresented are colored in blue. Enzymes with equal representation are shown but not colored. Metagenomic reads mapped onto 55% of the enzymes in this pathway.


Figure 30. Top species of origin for metagenomic reads that mapped onto the Starch and sucrose metabolism pathway (00500). Species are listed in the legend in decreasing order of abundance per treatment group.


Next, the underrepresented KEGG pathways in male STAT and control mice were further analyzed. In the Sulfur metabolism KEGG pathway (00920), 8 enzymes were differentially represented in male STAT compared to control mice (Figure 31). Given the position of the overabundant enzymes within the sulfur metabolism pathway, it is likely that L-cysteine as well as succinate and/or L-homocysteine metabolites are overabundant in the male STAT compared to control group (Figure 32). Gene representation of enzymes comprising both the assimilatory and dissimilatory sulfate reduction and utilization pathways is reduced in the STAT mouse, suggesting diminished reduction of sulfate to sulfite. This suggests that sulfate levels may be higher in the STAT mouse compared to controls. The observed differential representation of enzymes mediating sulfate reduction may have implications with regard to the gut microbial composition as it relates to the obese phenotype of STAT mice. Sulfate-reducing bacteria (e.g. Desulfovibrio spp.) present in the gut microbiota are primarily responsible for the reduction of sulfate (oxidation state +6) to sulfite (oxidation state +4) and sulfide (oxidation state -2). The reducing equivalents required for sulfate reduction are supplied by the activities of H2-producing fermentative bacteria. H2 production and accumulation via branched fermentation leads to feedback inhibition that limits the activities of H2-producing fermenters. Therefore, the H2-consuming bacteria (methanogens, acetogens and sulfate-reducing bacteria) are critical to the maintenance of the primary fermenters in the gut [310]. The excess sulfate that likely accumulates in STAT compared to control mice suggests an overall reduction of sulfate-reducing bacteria, which emphasizes the potential importance of methanogens and acetogens in maintaining a reduced H2 concentration and elevated fermentative activity in gut bacteria.
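The oxidation states quoted above fix the number of reducing equivalents that sulfate reduction consumes, which is why the process depends on H2 supplied by fermenters. The following is a simplified electron-bookkeeping sketch only: it treats H2 as the sole electron donor and omits the activation of sulfate (for example via adenylyl sulfate) that precedes reduction in cells.

```latex
\begin{align*}
\mathrm{SO_4^{2-}}\;(\mathrm{S}:+6)
  &\xrightarrow{\;2\,e^-\;} \mathrm{SO_3^{2-}}\;(\mathrm{S}:+4)
   \xrightarrow{\;6\,e^-\;} \mathrm{HS^-}\;(\mathrm{S}:-2) \\
4\,\mathrm{H_2} + \mathrm{SO_4^{2-}} + \mathrm{H^+}
  &\longrightarrow \mathrm{HS^-} + 4\,\mathrm{H_2O}
\end{align*}
```

Under these assumptions, the full reduction of one sulfate to sulfide takes up eight electrons, equivalent to four molecules of H2, so a decline in sulfate-reducing capacity would leave more H2 to be removed by methanogens and acetogens.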


The KEGG pathway mapping revealed that Rhodopirellula baltica, Rhodospirillum rubrum, Oceanobacillus iheyensis, Parabacteroides distasonis, and Ethanoligenens harbinense were among the most abundant species in male control but not male STAT mice, whereas Bacteroides cellulosilyticus, Bacteroides xylanisolvens, Alistipes shahii, and Bacteroides sp. 1_1_14 and 3_1_19 were among the top species present within the Sulfur metabolism pathway in male STAT but not control mice (Figure 33). Alistipes shahii was represented 2-fold higher in male STAT compared to control mice within the Sulfur metabolism hits, while Prevotella bergensis was underrepresented by 2.5-fold and Akkermansia muciniphila was decreased by 5.5-fold in the male STAT compared to control group.


Figure 31. Differential abundance of enzymes in the Sulfur metabolism (00920) KEGG pathway in male STAT compared to control mouse gut microbiomes. The x-axis displays fold-change and the legend lists enzyme names and EC numbers in parentheses.


Figure 32. Mapping of differential enzyme abundance from mouse gut microbiomes onto the Sulfur metabolism KEGG pathway (00920). Enzymes overrepresented in male STAT compared to control groups are colored in yellow while those underrepresented are colored in blue. Enzymes with equal representation are shown but not colored. Metagenomic reads mapped onto 38% of the enzymes in this pathway.


Figure 33. Top species of origin for metagenomic reads that mapped onto the Sulfur metabolism pathway (00920). Species are listed in the legend in decreasing order of abundance per treatment group.


In the Arachidonic acid metabolism pathway (00590), 3 enzymes were differentially represented in STAT compared to control mice (Figure 34). The Aminoacyltransferases (EC 2.3.2.-) and gamma-glutamyltransferase (EC 2.3.2.2) were overrepresented by over 3-fold, and the Transferring hydroxy groups (EC 5.4.4.-) enzymes were underrepresented 2.5-fold in male STAT compared to control mice. Given the position of these enzymes within the pathway, it is possible that Hepoxilin A3 and B3 would be reduced while LTF4 would be increased in male STAT compared to control mice. Hepoxilin B3 has proinflammatory action in the skin, while Hepoxilin A3 functions to target polymorphonuclear leukocytes to the intestinal lumen at sites of inflammation [311, 312]. The leukotriene LTF4 is an inflammatory mediator that increases vascular permeability and bronchoconstriction [313]. The production of LTF4 induces the production of prostaglandins and histamine. Thus, it is possible that increased inflammation and decreased neutrophil migration to the intestines in STAT compared to control mice may be mediated by increased LTF4-induced host pathways. The KEGG pathway mapping revealed that the top species were very different in the two treatment groups (Figure 35). The Prevotella sp. was 5-fold higher and Akkermansia muciniphila was 1.4-fold reduced in male STAT compared to control mice in the Arachidonic acid metabolism pathway. The aforementioned Prevotella species has not been reported in PubMed, but has been observed in metagenome sequence (NCBI taxon ID 619693).


Figure 34. Mapping of differential enzyme abundance from mouse gut microbiomes onto the Arachidonic acid metabolism KEGG pathway (00590). Enzymes overrepresented in male STAT compared to control groups are colored or outlined in yellow while those underrepresented are outlined in blue. Enzymes with equal representation are shown but not colored. Metagenomic reads mapped onto 24% of the enzymes in this pathway.


Figure 35. Top species of origin for metagenomic reads that mapped onto the Arachidonic acid metabolism pathway (00590). Species are listed in the legend in decreasing order of abundance per treatment group.


Taxonomic analysis of STAT Obese and Lean metaproteomes. A total of 2,975 unique proteins were identified in STAT and control mice by searching the STAT model-specific database, which contained over 666,000 protein sequence entries, with Sequest. Taxonomic analyses of STAT obese and lean metaproteomes in male mice were performed using the MetaRep program [279] at the superkingdom, phylum, class, order, and family taxonomic levels. The superkingdom level analysis revealed that Archaea protein expression levels (0.06% of total mapped peptides) were 3-fold lower in STAT compared to control mice, while in the metagenomic data the representation of the Archaea domain was 2.5-fold higher in the male STAT compared to control group. At the phylum level, the most abundant protein expression derived from Bacteroidetes (88-89%), Firmicutes (5-6.5%), and Proteobacteria (4%) (Figure 36), which was similar in representation to the metagenomics data (Figure 6). In addition, protein spectral counts from 11 phyla were differentially represented in the STAT compared to control group (Figure 37). In the metagenomic data, the Firmicutes read counts were higher and the Verrucomicrobia read counts were reduced in STAT compared to control mice; however, in the metaproteomes, Firmicutes spectral counts were 2-fold lower and Verrucomicrobia counts were 2-fold higher in STAT compared to control mice. The disparity between metaproteomic and metagenomic results in some instances is difficult to explain; however, the reliability of the metagenomic data is substantially greater in phylogenomic analyses since the depth of coverage is very high for abundant species. Conversely, the identification of species based on short peptide sequences is error prone.

The Bacteroidia, Clostridia, and Betaproteobacteria were the classes with the highest spectral counts (Figure 38). Additionally, protein spectral counts from 9 classes were differentially represented in the STAT compared to control group (Figure 39). The Verrucomicrobiae, which were the most reduced class in STAT compared to control metagenomes, were 2-fold higher at the protein level in the STAT compared to control group. The Fusobacteria, which were 2 times higher in STAT compared to control metagenomes, were 3-fold lower in STAT compared to control mice at the protein expression level. The orders with the highest spectral counts were the Bacteroidales, Clostridiales, and Neisseriales, and 19 orders exhibited differential protein expression (Figures 40 and 41). The short length of the peptides used for mapping makes the metaproteomic data a less reliable basis for phylogenomic analysis than the metagenomic data, which are substantially deeper in their range of representation. Therefore, the metagenomic data may be considered more accurate for phylogenomic profiling of STAT and control groups.
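The figure captions in this section describe the plotted values as normalized differential expression ratios (male STAT divided by male control), restricted to taxa that hold at least 0.1% of the total spectral counts in a given sample. A minimal sketch of that calculation follows. It assumes that normalization means dividing each taxon's spectral count by the sample total, and that decreases are plotted as negative reciprocals (suggested by the negative axis values of Figure 37 but not stated in the text). The function names and the toy counts are hypothetical and are not taken from MetaRep.

```python
from typing import Dict


def relative_abundance(counts: Dict[str, float]) -> Dict[str, float]:
    """Convert raw spectral counts into fractions of the sample total."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("sample has no spectral counts")
    return {taxon: count / total for taxon, count in counts.items()}


def signed_fold_change(ratio: float) -> float:
    """Assumed plotting convention: ratios below 1 become negative reciprocals."""
    return ratio if ratio >= 1.0 else -1.0 / ratio


def differential_ratios(stat: Dict[str, float],
                        control: Dict[str, float],
                        min_fraction: float = 0.001) -> Dict[str, float]:
    """Signed STAT/control ratios for taxa reaching min_fraction in either sample."""
    stat_rel = relative_abundance(stat)
    control_rel = relative_abundance(control)
    ratios = {}
    for taxon in sorted(set(stat_rel) & set(control_rel)):
        s, c = stat_rel[taxon], control_rel[taxon]
        if max(s, c) < min_fraction:
            continue  # below the 0.1% abundance threshold in both samples
        if s == 0 or c == 0:
            continue  # ratio undefined; such taxa would need separate handling
        ratios[taxon] = signed_fold_change(s / c)
    return ratios


# Toy spectral counts (hypothetical, not data from this study).
stat_counts = {"Taxon A": 9000, "Taxon B": 500, "Taxon C": 495, "Taxon D": 5}
control_counts = {"Taxon A": 8000, "Taxon B": 1000, "Taxon C": 995, "Taxon D": 5}
ratios = differential_ratios(stat_counts, control_counts)
for taxon, value in ratios.items():
    print(f"{taxon}: {value:+.2f}")
```

In this toy example Taxon D is dropped because it falls below the threshold in both samples, which mirrors how rare taxa with unstable ratios are kept out of the plots.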


Figure 36. Taxonomic variation at the phylum level in the mouse gut metaproteomes. Relative abundance of the ten most abundant phyla from male STAT and control mice. Categories listed in the legend in decreasing order of abundance.

[Figure 36 image. Bars: Male STAT, Male Control. Legend: Bacteroidetes, Firmicutes, Proteobacteria, Actinobacteria, Verrucomicrobia, Euryarchaeota, Aquificae, Fusobacteria, Cyanobacteria, Spirochaetes. Axis range: 80% to 100%.]

Figure 37. Differential protein expression at the phylum level in the gut metaproteome of male mice. The x-axis displays normalized differential expression ratios (male STAT divided by male control) from phyla that contain at least 0.1% of the total metaproteomic spectral counts from a given sample. The y-axis displays the corresponding phylum name.

[Figure 37 image. Phyla shown: Spirochaetes, Verrucomicrobia, Ascomycota, Fibrobacteres, Firmicutes, Fusobacteria, Euryarchaeota, Aquificae, Chloroflexi, Cyanobacteria. x-axis range: -12 to 3.]

Figure 38. Taxonomic variation at the class level in the mouse gut metaproteomes. Relative abundance of the ten most abundant classes from male STAT and control mice. Categories listed in the legend in decreasing order of abundance.


Figure 39. Differential protein expression at the class level in the gut metaproteomes of male mice. The x-axis displays normalized differential expression ratios (male STAT divided by male control) from classes that contain at least 0.1% of the total metaproteomic spectral counts from a given sample. The y-axis displays the corresponding class name.


Figure 40. Taxonomic variation at the order level in the mouse gut metaproteomes. Relative abundance of the ten most abundant orders from male STAT and control mice. Categories listed in the legend in decreasing order of abundance.

[Figure 40 image. Bars: STAT, Control. Legend: Bacteroidales, Clostridiales, Neisseriales, Enterobacteriales, Bifidobacteriales, Flavobacteriales, Sphingobacteriales, Verrucomicrobiales, Burkholderiales, Rhizobiales. Axis range: 80% to 100%.]

Figure 41. Differential protein expression at the order level in the gut metaproteomes of male mice. The x-axis displays normalized differential expression ratios (male STAT divided by male control) from orders that contain at least 0.1% of the total metaproteomic spectral counts from a given sample. The y-axis displays the corresponding order name.


At the family level, the structural analysis of gut metaproteomes in STAT and control mice revealed that the most abundant protein expression derived from the Bacteroidaceae, Porphyromonadaceae, and Prevotellaceae (Figure 42), which were also the three most abundant families in the metagenomes of STAT and control mice (Figure 10). In addition, spectral counts from 12 families were differentially represented in the STAT compared to control group (Figure 43). While the abundance of the Verrucomicrobiaceae was reduced 9-fold in male STAT compared to control metagenomes, the spectral counts for this family were 3-fold higher in the male STAT compared to control group.
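Several taxa in this section change in opposite directions at the DNA and protein levels. One simple way to keep track of such cases is to compare the sign of the fold-change in the two data sets, as sketched below. The Archaea and Verrucomicrobiaceae values are the fold-changes quoted in the text, written with an assumed sign convention (positive for higher in STAT, negative for lower); the third entry and all function names are hypothetical and included only to show a concordant case.

```python
from typing import Dict, List, Tuple


def direction(signed_fold: float) -> str:
    """Classify a signed fold-change (assumed convention: negative means lower in STAT)."""
    if signed_fold > 1.0:
        return "up"
    if signed_fold < -1.0:
        return "down"
    return "unchanged"


def discordant_taxa(metagenome: Dict[str, float],
                    metaproteome: Dict[str, float]) -> List[Tuple[str, str, str]]:
    """Return taxa whose direction of change differs between the two data sets."""
    flagged = []
    for taxon in sorted(set(metagenome) & set(metaproteome)):
        dna = direction(metagenome[taxon])
        protein = direction(metaproteome[taxon])
        if "unchanged" not in (dna, protein) and dna != protein:
            flagged.append((taxon, dna, protein))
    return flagged


# Signed fold-changes, STAT relative to control.
metagenome_fc = {
    "Archaea": 2.5,                   # from the text
    "Verrucomicrobiaceae": -9.0,      # from the text
    "Example concordant taxon": 1.5,  # hypothetical
}
metaproteome_fc = {
    "Archaea": -3.0,                  # from the text
    "Verrucomicrobiaceae": 3.0,       # from the text
    "Example concordant taxon": 2.0,  # hypothetical
}

flagged = discordant_taxa(metagenome_fc, metaproteome_fc)
for taxon, dna, protein in flagged:
    print(f"{taxon}: metagenome {dna}, metaproteome {protein}")
```

A list of this kind only records disagreement; as noted above, peptide-based taxonomic assignment is the less reliable of the two measurements, so flagged taxa are candidates for closer inspection rather than evidence of a biological difference.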

The ten most abundant species were similar in the STAT and control mouse gut metaproteomes, with the exceptions of Prevotella bryantii, found in the control mice, and Bacteroides intestinalis, found in the STAT mice (Figure 44). The 3 species displaying the highest protein expression were Bacteroides plebeius, Parabacteroides merdae, and Bacteroides coprocola. As explained previously, the Bacteroides fragilis detection is predicted to represent an imprecise hit, as the peptides likely derive from a different Bacteroides species given that metagenomic read mapping to the B. fragilis reference genome displayed minimal coverage. The Bacteroides uniformis protein expression levels were 1.5-fold higher in male STAT compared to control mice. Interestingly, Bacteroides uniformis strains have recently been found to reduce weight gain and immunological dysfunction in high-fat diet induced obesity in C57BL/6 mice [314].


Figure 42. Taxonomic variation at the family level in the mouse gut metaproteomes. Relative abundance of the spectral counts of the ten most abundant families from male STAT and control mice. Categories listed in the legend in decreasing order of abundance.

[Figure 42 image. Bars: STAT, Control. Legend: Bacteroidaceae, Porphyromonadaceae, Prevotellaceae, Neisseriaceae, Rikenellaceae, Clostridiaceae, Enterobacteriaceae, Bifidobacteriaceae, Lachnospiraceae, Flavobacteriaceae. Axis range: 60% to 100%.]

Figure 43. Differential protein expression at the family level in the gut metaproteome of male mice. The x-axis displays normalized differential expression ratios (male STAT divided by male control) from families that contain at least 0.1% of the total metaproteomic spectral counts from a given sample. The y-axis displays the corresponding family name.


Figure 44. Taxonomic variation at the species level in the mouse gut metaproteomes. Relative abundance of the spectral counts of the ten most abundant species from male STAT and control mice. Categories listed in the legend in decreasing order of abundance.


Comparative functional analysis in STAT obese and lean gut metaproteomes. Functional analysis of obese and lean gut metaproteomes using MetaRep [279] and KEGG Mapper revealed that the three enzyme categories with the highest number of spectral counts were In linear amides (EC 3.5.1), Interconverting aldoses, ketoses, and related compounds (EC 5.3.1), and Hydro-lyases (EC 4.2.1), which represented 40% of the enzyme hits. In addition, 29 enzyme classes were differentially abundant in male STAT compared to control mice (Figure 45). The three most overexpressed enzyme classes in STAT compared to control mice were With a quinone or similar compound as acceptor (EC 1.6.5), With a quinone or related compound as acceptor (EC 1.3.5), and Aminopeptidases (EC 3.4.11), while the most overrepresented enzyme category in the metagenomic data was With a quinone or similar compound as acceptor (EC 1.1.5); the three aforementioned classes that use quinones as acceptors consist mostly of reductases.

Additional analysis of the functional capacity of STAT obese and lean gut metaproteomes was performed using GO categories. GO Slim overview ontologies revealed 15 enriched and 11 depleted categories in obese compared to lean mice. While not the most highly abundant GO categories detected, which displayed equal abundance, these differentially expressed categories met a threshold of containing greater than 0.1% of total spectral counts per category (Figure 46). The carbohydrate binding GO Slim ontologies were increased 2-fold in male STAT compared to control mice, which was also the pattern observed in the metagenomic data. Protein transport, cellular respiration, and molecular and signal transducer activity were also increased in the STAT compared to control group.


Figure 45. Differential abundance of enzyme class hits in the obese and lean gut metaproteomes. The y-axis displays normalized fold-change ratios (STAT/Control) for the most under- and over-abundant enzyme classes that represent at least 0.1% of the total spectral counts mapped for a given sample. The legend displays EC numbers and categories.


Figure 46. Differential abundance of GO Slim categories in the obese and lean gut metaproteomes. The y-axis displays normalized fold-change ratios (STAT/Control) for the most under- and over-abundant categories that represent at least 0.1% of the total spectral counts mapped for a given sample. The legend displays GO Slim categories.


GO analysis of the molecular function namespace at a distance of 2 from the root revealed that Thioredoxin-disulfide reductase activity (GO:0004791), Signal transducer activity (GO:0004871), and Carbohydrate binding (GO:0030246) were enriched in male STAT compared to control mice (Figure 47). Similarly, in the metagenomic data, carbohydrate binding and thioredoxin peroxidase activity, which is essential for the transcriptional induction of other components of the thioredoxin system such as thioredoxin reductase, were found to be overrepresented in the STAT compared to control group [315]. The three most abundant categories detected were Transmembrane transporter activity (GO:0022857), Oxidoreductase activity (GO:0016491), and Ion binding (GO:0043167), which represented 42% of the total spectral counts.

GO analysis of the molecular function namespace at a distance of 3 from the root revealed that Pyridoxal phosphate binding (GO:0030170), Vitamin B6 binding (GO:0070279), and Protein transporter activity (GO:0008565) were the most over-expressed categories in male STAT compared to control mice (Figure 48). Two-component response regulator activity and polysaccharide binding were also increased in male STAT compared to control metaproteomes, which was also observed in the metagenomics data. The three most abundant categories detected were Cation binding (GO:0043169), Carbon-oxygen lyase activity (GO:0016835), and Uptake transmembrane transporter activity (GO:0015563); only GO:0015563 was over-expressed, whereas the former two categories were equal in representation within the STAT and control gut metaproteomes. GO analysis of the molecular function namespace at a distance of 4 from the root revealed that Macromolecule transmembrane transporter (GO:0022884), Protein transmembrane transporter (GO:0008320), and Oxidoreductase activity, acting on the CH-OH group of donors, NAD or NADP as acceptor (GO:0016655) were the most over-expressed categories in male STAT compared to control mice (Figure 49). Succinate (GO:0000104) and malate (GO:0016615) dehydrogenase activity were also overrepresented in STAT compared to lean metaproteomes.

GO categories are portrayed at various distances from the root. A root distance of 2 portrays functions of a general class, e.g. hydrolase, whereas root distances of 3 and 4 describe progressively more specific functional roles. GO analysis of the biological process namespace at a distance of 2 from the root revealed that Cell wall organization or biogenesis (GO:0071554), Establishment of protein localization (GO:0045184), and Macromolecule localization (GO:0033036) were the most over-expressed categories in male STAT compared to control mice (Figure 50). The most underrepresented category in the male STAT compared to control metaproteome was Response to abiotic stimulus (GO:0009628). The over-representation of the protein localization category is of interest given the high level of protein secretion functions inherent to gut microbiota, in which enzymes involved in cellulose and polysaccharide metabolism are thought to function in the extracellular environment.
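The notion of root distance used in these analyses can be illustrated with a small sketch. It assumes that root distance means the number of parent-child steps along the shortest path from the namespace root, with the root itself at distance 0; this is consistent with hydrolase activity lying at distance 2, but it is an assumption about how MetaRep counts levels. The term hierarchy below is a hand-written toy subset for illustration, not a parsed copy of the Gene Ontology.

```python
from collections import deque
from typing import Dict, List

# Toy subset of the molecular function namespace: term -> list of parent terms.
PARENTS: Dict[str, List[str]] = {
    "molecular_function": [],
    "catalytic activity": ["molecular_function"],
    "hydrolase activity": ["catalytic activity"],
    "hydrolase activity, acting on ester bonds": ["hydrolase activity"],
    "binding": ["molecular_function"],
    "carbohydrate binding": ["binding"],
}


def root_distances(parents: Dict[str, List[str]], root: str) -> Dict[str, int]:
    """Breadth-first search from the root; returns the shortest distance per term."""
    # Invert the parent map so the graph can be walked from the root downward.
    children: Dict[str, List[str]] = {}
    for term, term_parents in parents.items():
        for parent in term_parents:
            children.setdefault(parent, []).append(term)

    distances = {root: 0}
    queue = deque([root])
    while queue:
        term = queue.popleft()
        for child in children.get(term, []):
            if child not in distances:  # first visit is the shortest path in BFS
                distances[child] = distances[term] + 1
                queue.append(child)
    return distances


distances = root_distances(PARENTS, "molecular_function")
for term, distance in sorted(distances.items(), key=lambda item: item[1]):
    print(distance, term)
```

Because GO is a directed acyclic graph in which a term may have several parents, a term can be reachable by paths of different lengths; the breadth-first search above keeps the shortest one, which is one of several possible conventions.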

Comparative metaproteomic analysis of common name annotation fields revealed that the ten most abundant common names expressed were similar in STAT obese and lean control mice (Figure 51). The most abundant common names detected in the metaproteomes were hypothetical proteins, TonB-linked outer membrane protein SusC/RagA family, and phosphopyruvate hydratase. The TonB-linked outer membrane protein SusC/RagA family was the second most abundant common name found in the metagenomes of STAT and control mice. The propionyl-CoA carboxylase subunit beta and xylose common names were both enriched 1.5-fold in male STAT compared to control metaproteomes. Both KEGG pathway and metagenomic data analyses have indicated increased production of D-xylose, a substrate for pentose and glucuronate interconversions, in STAT compared to control mice (see metagenomic and metaproteomic analysis of the Starch and sucrose metabolism KEGG pathway).


Figure 47. GO molecular function namespace analysis of STAT obese and lean gut metaproteomes. The y-axis displays fold change differences and the legend lists the GO category. The root distance is 2.


Figure 48. GO molecular function namespace analysis of STAT obese and lean gut metaproteomes. The y-axis displays fold change differences (STAT/Control) and the legend lists the GO category. The root distance is 3.


Figure 49. GO molecular function namespace analysis of STAT obese and lean gut metaproteomes. The y-axis displays fold change differences (STAT/Control) and the legend lists the GO category. The root distance is 4.


Figure 50. GO biological process namespace analysis of STAT obese and lean gut metaproteomes. The y-axis displays fold change differences and the legend lists the GO category. The root distance is 2.


Figure 51. Most abundant gene annotation common names in male STAT obese and lean control gut metaproteomes. The common name annotation fields are listed in the legend. The x-axis displays spectral count percentages within the top ten list of most abundant common names.


The TIGRFAMs protein family analysis revealed that 56 TIGRFAMs were enriched while 86 families were depleted in male STAT compared to control metaproteomes. Within the enriched families, 21 TIGRFAMs represented > 0.1% of the total spectral counts per sample (Figure 52). The beta-hydroxyacyl-(acyl-carrier-protein) dehydratase FabZ (TIGR01750), NADH:ubiquinone oxidoreductase, Na(+)-translocating A subunit (TIGR01936), and protein-export membrane protein SecD (TIGR01129) were the most enriched families in STAT compared to control animals. In addition, malate dehydrogenase NAD dependent, succinate dehydrogenase or fumarate reductase flavoprotein subunit (TIGR01811), and TonB-dependent outer membrane receptor SusC/RagA subfamily signature region (TIGR04057) were found to be the most abundant families within the enriched TIGRFAMs in the STAT compared to control group. These data were supported by the PFAM domain analysis of gut metaproteomes, as the lactate/malate dehydrogenase alpha/beta C-terminal (PF02866) and NAD binding domains (PF00056), oxidoreductase family NAD binding (PF01408), SecD/SecF GG motif (PF07549), TonB dependent receptor (PF00593), and TonB-dependent receptor plug domain were also found to be enriched in the STAT compared to control group (data not shown).


Figure 52. TIGRFAM HMM domain analysis of enriched categories in STAT obese and lean control gut metaproteomes. Enriched TIGRFAM categories (STAT/Control ratios) with > 0.1% of mapped peptides per sample are shown. The x-axis displays fold change and the legend lists the TIGRFAM category.


Biochemical pathway analysis of STAT obese and lean control gut metaproteomes. Analysis of the MetaCyc pathways in STAT mouse gut proteomes revealed that, among pathways representing > 0.1% of total spectral counts, 20 pathways were enriched and 13 pathways were depleted compared to controls (Figure 53). Most of the enriched MetaCyc pathways derive from electron transfer and TCA superpathways, and this finding was confirmed by the KEGG pathway analysis (Figure 54). The KEGG pathway analysis of murine gut metaproteomes at the overall pathway level revealed that 4 pathways were enriched and 9 pathways were depleted among those with > 0.1% of total spectral counts in STAT compared to control mice. Deeper analysis at the individual pathway level was required since a very abundant and enriched enzyme in a given pathway can skew the data at the overview level. This was indeed the case: upon further scrutiny, the over-represented enzymes mapped onto the TCA cycle reflect functions that also operate outside of the TCA cycle (Figure 55), which is not used in an anaerobic environment.
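The skewing effect described above can be made concrete with a toy calculation. The sketch below assumes that the overview-level fold-change for a pathway is the ratio of summed normalized spectral counts across its enzymes; that is one plausible reading of the overview analysis, not a documented MetaRep formula. The enzyme names and counts are hypothetical.

```python
from statistics import median
from typing import Dict, Tuple

# Hypothetical normalized spectral counts per enzyme: (STAT, control).
pathway: Dict[str, Tuple[float, float]] = {
    "enzyme_A": (400.0, 100.0),  # very abundant and strongly enriched
    "enzyme_B": (10.0, 10.0),
    "enzyme_C": (9.0, 10.0),
    "enzyme_D": (11.0, 10.0),
}


def overview_fold_change(enzymes: Dict[str, Tuple[float, float]]) -> float:
    """Pathway-level ratio computed from counts summed over all enzymes."""
    stat_total = sum(stat for stat, _ in enzymes.values())
    control_total = sum(control for _, control in enzymes.values())
    return stat_total / control_total


def per_enzyme_fold_changes(enzymes: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """STAT/control ratio for each enzyme taken on its own."""
    return {name: stat / control for name, (stat, control) in enzymes.items()}


overview = overview_fold_change(pathway)
per_enzyme = per_enzyme_fold_changes(pathway)
typical = median(per_enzyme.values())

print(f"overview fold-change: {overview:.2f}")       # driven by enzyme_A
print(f"median per-enzyme fold-change: {typical:.2f}")
```

Here the pathway appears more than 3-fold enriched at the overview level even though three of its four enzymes are essentially unchanged, which is why the individual pathway maps were examined before drawing conclusions.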


Figure 53. MetaCyc pathway analysis of STAT obese and lean gut metaproteomes. The x-axis displays fold-change (STAT/Control) at the overall pathway level and the legend lists the names of specific MetaCyc pathways.


Figure 54. Overview of the differential abundance of KEGG pathways in male STAT obese compared to control metaproteomes. The x-axis displays fold-change (STAT/Control) for categories at the overall pathway level that represent > 0.1% spectral counts per pathway and the legend lists KEGG pathway names.


Figure 55. Mapping of differential enzyme abundance from mouse gut microbiomes onto the Citrate cycle (TCA cycle) KEGG pathway (00200). Enzymes overexpressed in male STAT compared to control groups are colored in yellow while those underrepresented are colored in blue. Enzymes with equal representation are shown but not colored.


Nicotinate and nicotinamide metabolism KEGG pathway (00760) analysis revealed that 3 enzymes were over-expressed in male STAT compared to control mice (Figure 56). The placement of these enzymes within the pathway does not permit clear interpretation of the results but suggests the elevated conversion of nicotinate (niacin or vitamin B3) to NADP+. The enzyme nicotinate phosphoribosyltransferase (EC 6.3.4.21), which catalyzes the conversion of nicotinate to nicotinate D-ribonucleotide, is over-expressed in STAT mice. Likewise, the enzyme NAD+ synthase (EC 6.3.1.5), which catalyzes the conversion of deamino-NAD+ to NAD+, is over-expressed, followed sequentially by the over-expression of the enzyme NAD+ kinase (EC 2.7.1.23) that converts NAD+ to NADP+. NADP+ is a coenzyme for a number of cellular reactions that require NADPH as a reducing agent. The significance of these differentially expressed enzymes in STAT mice is not clear, as the increased abundance of NADP+ at the expense of nicotinate and NAD+ is likely to affect the activity of a wide variety of proteins in the cell.


Figure 56. Mapping of differential enzyme abundance from mouse gut microbiomes onto the Nicotinate and nicotinamide metabolism KEGG pathway (00760). Enzymes overexpressed in male STAT compared to control groups are colored in yellow while those underrepresented are colored in blue. Enzymes with equal representation are shown but not colored.


Mapping of the proteomics data to the Cyanoamino acid metabolism KEGG pathway (00460) recapitulates the metagenomic data, as the over-expression of β-glucosidase (EC 3.2.1.21) was also observed at the protein level (Figure 57). This enzyme catalyzes the generation of a series of nitrile compounds. The functional significance of this finding is not clear; however, each of these nitrile products is only a single step removed from producing cyanide. The production of cyanide even in low quantities could have a profound effect not only on the gut community but also on host cell functions. A number of bacteria, including Brevibacter spp., can metabolize a number of aryl and aromatic nitriles as energy sources [316]. The detection of enzymes in this pathway by shotgun proteomics suggests that nitrile metabolism is relatively prevalent among the gut microbiota community. It is not clear how these results may pertain to the obese phenotype of STAT mice.


Figure 57. Mapping of differential enzyme abundance from mouse gut microbiomes onto the Cyanoamino acid metabolism KEGG pathway (00460). Enzymes overexpressed in male STAT compared to control groups are colored in yellow while those underrepresented are colored in blue. Enzymes with equal representation are shown but not colored.


Consistent with the metagenomic data, a number of enzymes are differentially expressed in STAT and control mice within the Starch and sucrose metabolism KEGG pathway (00500). The increased abundance of gene sequences encoding xylan 1,4-β-xylosidase (EC 3.2.1.37) in the metagenomic sequence data is corroborated by the proteomics analysis, where this enzyme is also overrepresented in STAT mice. This suggests increased production of D-xylose, a substrate that may influence pentose and glucuronate interconversion activities. The metagenomic DNA sequence analysis revealed a predicted increase in the generation of α-D-glucose and/or β-D-fructose due to increased representation of β-fructosidase (EC 3.2.1.26) and sucrose α-glucosidase (EC 3.2.1.48), respectively. Differential expression of these enzymes was not observed in the proteomics analysis; however, the enzyme α-glucosidase (EC 3.2.1.20), which catalyzes the same reaction, was overrepresented in STAT mice. It is possible that the inability to detect the enzymes reported in the metagenomic analysis reflects technical limitations related to detection sensitivity, protein solubility, trypsin cleavage efficiency or ionization efficiency for mass spectrometry detection. Nevertheless, it appears that the STAT mice generate increased α-D-glucose and/or β-D-fructose compared to control mice. The remaining enzymes that were identified as differentially represented in the metagenomic analyses were not observed in the proteomics data. The conclusion drawn from the metagenomics analysis predicted that differentially represented enzymes may drive increased glycogen biosynthesis. The proteomics data are in strong agreement with the overproduction of α-D-glucose as described above. In addition to α-glucosidase (EC 3.2.1.20), further overrepresented enzymes in STAT mice also contribute to increased production of α-D-glucose. The enzyme amylo-α-1,6-glucosidase (EC 3.2.1.33) converts glycogen to α-D-glucose. The enzyme glucan 1,4-α-glucosidase (EC 3.2.1.3) also catalyzes the conversion of glycogen and dextrin to α-D-glucose. Finally, amylomaltase (EC 2.4.1.25) and α-glucosidase (EC 3.2.1.20) both catalyze the conversion of maltose to α-D-glucose. Collectively, the proteomics data appear coherent and indicate elevated production of D-xylose and α-D-glucose in STAT compared to control mice.

These are the most significant results pertaining to the obesity phenotype of STAT mice and support the notion that STAT mouse microbiota display elevated potential for energy harvest from cellulose. Taken together with the over-representation of β-glucosidase and species redistribution, these data indicate that STAT mice harbor microbiota with increased capacity for generating simple sugars from cellulose-like polysaccharides. The stool from obese mice contains fewer calories than that from lean animals. Therefore, the increased energy harvest in the STAT microbiota is likely to provide excess energy to the host in the form of mono- and disaccharides.


Figure 58. Mapping of differential enzyme abundance from mouse gut microbiomes onto the Starch and sucrose metabolism KEGG pathway (00500). Enzymes overexpressed in male STAT compared to control groups are colored in yellow while those underrepresented are colored in blue. Enzymes with equal representation are shown but not colored.


Enzymes comprising the Histidine metabolism KEGG pathway (00340), in which glutamate is produced, were evaluated (Figure 59). There are widespread reports in the scientific literature describing monosodium glutamate-induced obesity through a mechanism involving altered feeding behavior [317, 318]. The proteomics data are inconsistent with the previously mentioned reports, as expression of the enzyme imidazolonepropionase (EC 3.5.2.7) that generates N-formimino-L-glutamate, which is the precursor to L-glutamate, was increased; however, glutamate formimidoyltransferase (EC 2.1.2.5), which is the enzyme that converts this substrate to L-glutamate, is underrepresented in the STAT mouse. It is interesting that an alternate enzyme catalyzing this conversion, formimidoylglutamase (EC 3.5.3.8), is equal in abundance in the STAT and control groups. It is therefore still a possibility that L-glutamate levels are increased in the STAT mouse compared to controls. STAT mice do not display any obvious changes in feeding behavior; nevertheless, the possible increase in glutamate levels in STAT mice will be important to verify experimentally.


Figure 59. Mapping of differential enzyme abundance from mouse gut microbiomes onto the Histidine metabolism KEGG pathway (00340). Enzymes overexpressed in male STAT compared to control groups are colored or outlined in yellow while those underrepresented are colored in blue. Enzymes with equal representation are shown but not colored.


Several enzymes involved in the One carbon pool by folate KEGG pathway (00670) are differentially expressed by the gut microbiota of STAT compared to control mice. Most of the differentially expressed enzymes are bidirectional in the reactions that they catalyze; therefore, the interpretation of these results may be imprecise (Figure 60).

Overall, the over-expressed enzymes may drive the increased production of 5,6,7,8-tetrahydrofolate (THF). This THF species is also known as vitamin B9 [319]. Similarly, the underrepresented enzymes in STAT mice may facilitate reduced conversion of THF to its metabolites 10-formyl-THF, 5-formimino-THF, 5,10-methenyl-THF and 5,10-methylene-THF. It is likely that the increased availability of 5,6,7,8-THF increases host cell folate generation, which is essential for a wide variety of cellular processes.

Specifically, the increased expression of 10-formyltetrahydrofolate deformylase (EC 3.5.1.10) and methionine synthase (also known as 5-methyltetrahydrofolate-homocysteine S-methyltransferase) (EC 2.1.1.13) may drive 5,6,7,8-THF production using 10-formyl-THF and 5,10-methylene-THF as substrates, respectively. There are six unique enzymes underrepresented in STAT mice. For brevity, the details are not described here, but the mapping displays the coherent clustering of these enzymes at multiple key points in the pathway. While multiple interpretations of these results are possible, viewed in the context of the enzymes that are over-expressed in STAT mice, it may be that the reduced expression of these enzymes is enabling the maintenance of high levels of 5,6,7,8-THF.

A relatively high proportion of gut microbiota species lack the genes required for B-vitamin biosynthesis. These pathways are essential for cellular life as many B-vitamin biosynthetic pathways generate essential cofactors (e.g. NAD, FAD, CoA, etc.) as end products. It is therefore likely that B-vitamins are secreted by prototrophic species and taken up by auxotrophic species, enabling continued proliferation of recipient cells.


Growth dependencies like the example illustrated here serve to enforce functional networks of species that coexist in the gut microbial community. It is conceivable that the dysbiotic microbiota associated with the STAT mouse model of obesity are maintained by such dependencies.


Figure 60. Mapping of differential enzyme abundance from mouse gut microbiomes onto the One carbon pool by folate KEGG pathway (00670). Enzymes overexpressed in male STAT compared to control groups are colored in yellow while those underrepresented are colored in blue. Enzymes with equal representation are shown but not colored.


Within the Butanoate metabolism KEGG pathway (00650), the succinate dehydrogenase (EC 1.3.99.1) and acetolactate synthase (EC 2.2.1.6) enzymes are increased in representation in male STAT compared to control mice (Figure 61). Succinate dehydrogenase performs a reversible reaction that may increase the concentrations of either succinate or fumarate. There are reports in the literature suggesting that succinate has anti-obesity effects; therefore, in the context of the STAT model, one interpretation of this result is that succinate levels in the STAT mouse may be reduced; however, this speculation must be tested experimentally. It is perhaps more likely that fumarate levels are increased in STAT mice, since this molecule serves as a terminal electron acceptor in anaerobic electron transport. The observed increase of acetolactate synthase, which converts pyruvate to (S)-2-acetolactate, may also serve to drive pyruvate generated by glycolysis to (S)-2-acetolactate, thereby preventing pyruvate from entering the TCA cycle where succinate is generated. Pyruvate is another electron acceptor used in anaerobic respiration and ATP generation.

Five enzymes in the Butanoate metabolism KEGG pathway were reduced in abundance in the STAT compared to control mice. Two of these enzymes, pyruvate-formate lyase (EC 2.3.1.54) and pyruvate synthase (EC 1.2.7.1), may act in concert with the previously mentioned over-represented enzymes to reduce the conversion of pyruvate to acetyl-CoA. Three additional enzymes that are underrepresented in the STAT mouse do not appear to be directly related to one another, nor would their reduced activity appear to impact butyrate levels. Analysis of the over- and underrepresented enzymes expressed in this pathway does not support the previously reported observation that butyrate levels are elevated in obese mice and humans [67]. This disparity is most likely a reflection of the differences in the mechanisms underlying the STAT-induced obese phenotype and does not necessarily contradict previous reports.


Figure 61. Mapping of differential enzyme abundance from mouse gut microbiomes onto the Butanoate metabolism KEGG pathway (00650). Enzymes overexpressed in male STAT compared to control groups are colored in yellow while those underrepresented are colored in blue. Enzymes with equal representation are shown but not colored.


The analysis of the Oxidative phosphorylation pathway (00190) revealed increased expression of two enzymes that constitute the succinate dehydrogenase/fumarate reductase complex II. The increased abundance of succinate dehydrogenase A (EC 1.3.99.1) and B subunits (EC 1.3.5.1) in STAT mice indicates that increases in either fumarate or succinate may occur. As mentioned previously, the directionality of this reaction cannot be discerned from the proteomic data. Additional proteomic alterations within the oxidative phosphorylation pathway involve the reduced abundance of several components of the F and/or V-type ATPase in STAT compared to control mice. This complex V system is universally conserved in Eubacteria and uses a proton gradient to drive H+ ions into the cell, which results in the phosphorylation of ADP to generate ATP.

The polyphosphate kinase (EC 2.7.4.1) and inorganic diphosphatase (EC 3.6.1.1) as well as ATPase subunits (EC 3.6.3.14) are reduced in STAT mice compared to controls.

Thus, it appears that the activity of the ATPase is reduced in STAT mice. The significance of this observation with respect to the obese phenotype is not clear.


Figure 62. Mapping of differential enzyme abundance from mouse gut microbiomes onto the Oxidative phosphorylation KEGG pathway (00190). Enzymes overexpressed in male STAT compared to control groups are colored in yellow while those underrepresented are colored in blue. Enzymes with equal representation are shown but not colored.


Chapter 5: Conclusions

Obesity is a complex phenomenon and many factors may contribute to its pathology. These factors range from psychological, behavioral and environmental to genetic features. More recently, the gut microbiota has been implicated as another potential factor in the etiology of obesity. The impact of antibiotic exposure on body weight and composition has been recognized for many decades by the agricultural industry and was recognized as a potential means of inducing obesity in animal models by Dr. Martin Blaser [67]. The STAT mouse model of obesity is novel as it represents an antibiotic-induced obesity, driven at least in part by alterations of the gut microbiota. In this regard, the STAT model is potentially informative since it perturbs a specific risk factor and is a model of obesity in which the genetic relationship between antibiotic-treated and control mice, diet, and other factors can be carefully controlled so that the impact of microbiota modulation may be studied independently of other confounding factors.

Despite the simplification of the STAT model, alterations of the gut microbiota may impact the host in ways that are not yet fully characterized, such as reduced epithelial barrier function and increased inflammatory status, which in turn can further perturb the gut microbiota composition. These primary versus secondary effects in STAT mice were not addressed in these studies but represent an interesting direction for future experimentation that examines STAT mice in the context of a longitudinal study in which changes in gut microbiota, host inflammation and other relevant parameters may be examined to determine cause and effect relationships of the host-microbiota interaction as it pertains to obesity.

The major novelty of this thesis work lies in the application of a systems biology approach to improve our understanding of the complex underpinnings of obesity, which is a disease of growing importance as the frequency of its occurrence in the human population has reached epidemic levels. The rationale for systems biology is two-fold.


First, the objective analysis of complex diseases like obesity requires global approaches that increase the opportunity to understand previously unrecognized relationships between cellular systems that are not generally taken into consideration by traditional approaches. This approach becomes more critical when considering the microbiome, as it is difficult to study the microbiota-host interaction by studying one bacterium at a time or alternatively by studying subsystems of bacterial physiology as a whole. The second reason for applying a systems biology approach lies in the recognition that most human disease is complex and therefore no single OMICs approach will provide adequate clarity to the processes under investigation. In this work, phylogenomic, metagenomic and proteomic analyses were conducted. Transcriptomic data were also generated to further extend the systems biology approach but those data were not described due to restrictions on data sharing. While several studies have been published on the relationship of the gut microbiota and obesity, none have applied a multi-pronged approach as described here. The combination of metagenomic DNA sequence analysis with proteomic analysis is novel and powerful: it uses the deep profiling capacity of NextGen sequencing technologies to compare the species and corresponding gene functions that may be over- or under-represented in STAT mice. While of high utility for some analyses, the relative abundance of any gene sequence does not necessarily imply anything specific about the expression of those genes. Therefore, the use of state-of-the-art metaproteomics allows a direct evaluation of the biological relevance of gene abundance measurements. Indeed, there was generally strong agreement between these measurements but exceptions were also noted. The use of metaproteomics is also considered a novel aspect of this thesis. While other groups have reported the use of this method, these studies are very few and limited in depth, and none have been applied to obesity.


The study design employed in this thesis is novel and proved to be of high value in that public databases were not relied upon for data analysis. While the public databases are replete with an enormous quantity of sequence data pertaining to human and mouse microbiomes, it became clear in the early stages of this work that the relevance of much of this data was low with regard to the STAT model. Instead, a project-specific database was constructed based on the deep sequencing performed on STAT and control mice only. In this regard, all of the sequence data used for functional assignments, sequence assembly and mapping of peptides to six-frame translations of sequenced genes were directly applicable and greatly reduced the noise in the datasets that can arise from the use of very large sequence collections. This was observed in the metaproteomics data, where short peptides often map equally well to distinct and mutually exclusive target proteins. By reducing the database to solely those sequences of relevance to this study, we enhanced the quantity of unambiguously mapped peptides significantly (18-fold greater protein identification). Furthermore, the metagenomic DNA sequences were normalized by a novel method based on C₀t curve analysis that has not to our knowledge been previously used for metagenomic DNA sequence analysis and assembly. Despite the impressive number of DNA sequences generated by current sequencing technologies, the use of brute force DNA sequencing on microbiota that display an impressive dynamic range of abundance (i.e. greater than 4 logs) results in a strong overrepresentation of a few highly abundant genomes at the expense of moderate and low abundance genomes. Given that STAT treatment alters the microbiota composition, the use of RAKE normalized DNAs for the purpose of building highly representative sequence scaffolds and assemblies was considered to have strong merit. The combination of the project-specific assembly and use of normalized DNAs allowed DNA sequence reads and peptide sequences to be mapped at a higher frequency and with greater confidence in specificity.
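The effect of dynamic range on brute force sequencing can be illustrated with simple arithmetic. The following is a minimal sketch, assuming reads are sampled in proportion to genome abundance; the community names, abundance values and read total are hypothetical, and the "equalized" community is an idealized stand-in for normalization, not the C₀t-based procedure used in this work.

```python
def expected_reads(relative_abundance, total_reads):
    """Expected reads per genome, assuming sampling proportional to abundance."""
    total = sum(relative_abundance.values())
    return {name: total_reads * value / total
            for name, value in relative_abundance.items()}


# Hypothetical community spanning four logs of abundance (illustrative values).
community = {"dominant": 1.0, "moderate": 0.01, "rare": 0.0001}
raw = expected_reads(community, total_reads=1_000_000)

# Idealized, perfectly equalized community (an assumption made for contrast).
equalized = {name: 1.0 for name in community}
normalized = expected_reads(equalized, total_reads=1_000_000)

for name in community:
    print(f"{name}: raw={raw[name]:.0f} reads, normalized={normalized[name]:.0f} reads")
```

Under these assumptions the rare genome receives roughly four orders of magnitude fewer reads than the dominant genome before normalization, which is why low abundance genomes assemble poorly without it.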


The vast majority of phylogenetic studies pertaining to microbiota have applied the simple and efficient method of 16S rDNA profiling. However, a growing number of investigators are adopting metagenomic DNA sequence analysis, which is less prone to PCR amplification bias arising from the failure or reduced efficiency of amplification of the 16S rDNA target from selected species present in microbiota using universal primer pairs (e.g. Bifidobacterium spp.). The specific region of 16S rDNA analyzed can also produce results that are inconsistent. Finally, the rapid increase in reference genomes for species present in the gut microbiota increases the ability to perform reliable phylogenomic analysis of microbiota samples from metagenomic sequence while simultaneously deriving substantial insights with respect to the coding potential of microbiota. The primary barrier for more widespread adoption of this method is the lack of user-friendly software that facilitates the functional analysis of microbiota. This thesis work benefited from the availability of the MetaRep program developed at JCVI [279] for the purposes described.

The analysis of deep metagenomic DNA sequence data revealed a number of interesting and potentially relevant differences between STAT and control mice in terms of both taxonomic and gene function coding potential. The Gordon group described the microbiota associated with obese (ob/ob) and lean mice and human subjects [14]. The primary conclusions of this study suggested that the microbiota of obese animals and humans displayed increased energy harvesting potential and involved large-scale reductions in Bacteroidetes with a proportionate increase in Firmicutes. It is important to note that the stool of ob/ob mice possessed significantly fewer calories compared to control mice. Taken together, the increased energy harvest by the microbiota must result in increased caloric uptake for the host, most likely via microbially-generated short chain fatty acids such as propionate and butyrate.


Taxonomic analysis of metagenomic sequence derived from STAT and control mice revealed that Archaea, while only of moderate abundance (0.25%), were elevated in the male STAT mice. The Archaea in the gut are methanogens and as such are of relevance to the pathophysiology of obesity. The increased abundance of methanogens, which use H2, permits elevated metabolic efficiency of the bacteria that ferment polysaccharides. The anaerobic fermentation of cellulose and cellulose-like polysaccharides generates H2, and its accumulation is inhibitory to these organisms. Methanogens consume H2 to generate energy. In addition to the increase in Archaea, the increased abundance of Firmicutes (Eubacterium, Enterococcus) was noted. The increased abundance of Firmicutes is significant and consistent with the Gordon model featuring increased energy harvest of the microbiota associated with obese subjects. An altered abundance of the Bacteroidetes division was not observed in the STAT mouse, but at the family level the Bacteroidaceae were reduced in abundance in male STAT mice compared to controls. The magnitude of change in the Bacteroidaceae was not as large as that described by Turnbaugh et al. [221]. Taken together, the results presented in this thesis are in strong agreement with those initially described by Turnbaugh and support the conclusion that the STAT mouse displays increased capacity for energy harvest and therefore bears resemblance to the leptin deficiency (ob/ob) mouse model of obesity.

The reduction in the order Verrucomicrobiales was the largest change noted in the microbiota of STAT mice. This is consistent with the recent report that A. muciniphila abundance is inversely proportional to body mass, obesity and biomarkers of type 2 diabetes in human subjects [217]. The abundance of A. muciniphila was decreased by 4-fold in male STAT mice. These results allow the conclusion that this aspect of the STAT mouse model corresponds to characteristics of human obesity. The similarities between the phylogenomic profiles of the STAT and (ob/ob) mouse models are significant since the leptin-deficient mouse displays increased feeding behavior and caloric intake.


In the leptin-deficient model, increased caloric intake independent of diet type (high-fat, high-carbohydrate) induces alterations in the gut microbiota composition. In the STAT model, the gut microbiota is modulated by antibiotic exposure, independent of increased caloric intake. This example illustrates the importance of independent models of diseases such as obesity, since cause and effect relationships become easier to identify when experimental outcomes derived from different models are compared. The parallels between the shifts in microbiota structure of STAT and (ob/ob) mice are difficult to explain, since antibiotic treatment with, for example, penicillin is unlikely to cause the precise changes in the microbiota required to mimic the basic characteristics of the (ob/ob) mouse. Therefore, it may be reasonable to conclude that shifts in microbiota community structure introduced by antibiotic treatment may generally result in a microbiota with increased energy harvest potential. This may induce host obesity, which in turn may further modulate the gut microbiota in a manner that accounts for the similarities of the STAT and ob/ob mouse models.

The results obtained by analysis of the metagenomic sequence data and associated functional annotation were informative and provided substantial insights into the obese phenotype of the STAT mouse. Several of the gene sequences that are differentially represented in STAT and control mice are a reflection of the shifts in the proportionality of species present in the microbiota. In this regard, caution must be exercised when interpreting the results, since the gene sets that are differentially represented may be genomic passengers and have no impact on obesity. For example, the increased prevalence of CRISPR elements and two-component transcription factors represents functions that are present in certain bacterial species and as such, shifts in the microbiota of STAT mice may serve to increase these functions; however, their role in the obese phenotype of STAT mice should not yet be ruled out. Many of these results support the increased energy harvesting activity of the gut microbiota in obese mice.
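The passenger effect described above can be made concrete with a short calculation. The sketch below assumes that the community-level abundance of a gene is the sum, over taxa, of taxon abundance multiplied by gene copies per genome; the taxon names, gene names, copy numbers and abundances are all hypothetical and chosen only for illustration.

```python
def gene_abundance(taxon_abundance, gene_copies):
    """Community-level gene abundance: sum over taxa of abundance x copies per genome."""
    totals = {}
    for taxon, abundance in taxon_abundance.items():
        for gene, copies in gene_copies.get(taxon, {}).items():
            totals[gene] = totals.get(gene, 0.0) + abundance * copies
    return totals


def fold_change(treated, control):
    """Ratio of treated to control abundance for genes present in both profiles."""
    return {gene: treated[gene] / control[gene]
            for gene in treated if gene in control}


# Hypothetical gene content: taxon_X carries both genes, taxon_Y only one.
gene_copies = {
    "taxon_X": {"crispr_cas": 1, "polysaccharide_hydrolase": 2},
    "taxon_Y": {"polysaccharide_hydrolase": 1},
}
control_taxa = {"taxon_X": 0.2, "taxon_Y": 0.8}
treated_taxa = {"taxon_X": 0.4, "taxon_Y": 0.6}

changes = fold_change(gene_abundance(treated_taxa, gene_copies),
                      gene_abundance(control_taxa, gene_copies))
for gene, ratio in sorted(changes.items()):
    print(f"{gene}: {ratio:.2f}-fold")
```

In this toy case every gene carried by the expanding taxon increases in representation, whether or not it has any bearing on the phenotype, which is why pathway context is needed before a differentially represented gene is interpreted as functionally relevant.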

The increased energy harvest via increased fermentation of host indigestible polysaccharides by the STAT mouse microbiota results in the provision of elevated calories to the host in the form of mono- and disaccharides and other potential sources of energy. Consistent with this, Gordon and others have reported increased butyrate levels in obese mice and humans. In the STAT model of obesity, the differentially expressed enzymes in the butanoate pathway were reduced in abundance in STAT mice and, given their position within the pathway, it is unlikely that their altered expression would impact butanoate levels produced by the microbial community. These findings should be validated by direct metabolite measurements in future experimental efforts but suggest that the increased energy harvesting capacity displayed by STAT mice may induce obesity by mechanisms distinct from those involving elevated SCFA generation. The Gordon group showed that the obese phenotype was transmissible via fecal transplantation of the gut microbiota from an obese human subject into germ free mice [320]. The butyrate and propionate levels in obese mice were reduced compared to control and lean microbiota recipient mice. In some respects, the fecal microbiota transplantation studies mimic the STAT obesity model in that the induced change is restricted initially to the gut microbiota. The lack of an apparent butyrate association in either study may highlight common pathway(s) of obesity induction in these two models.

The most compelling differential expression of enzymes in STAT mice was observed in the Starch and sucrose metabolism pathway. The STAT mice appear to produce increased quantities of D-xylose from xylan and α-D-glucose from metabolized starch. D-xylose is a substrate for pentose and glucuronate interconversions. The increased energy harvest via increased fermentation of host indigestible polysaccharides by the STAT mouse microbiota may result in the provision of elevated calories to the host in the form of increased glucose availability.

The cellular energetics of the STAT microbiota were distinct in additional features. The increased expression of succinate dehydrogenase in STAT mice would increase the production of either succinate or fumarate. Fumarate is one of several terminal electron acceptors used by anaerobic bacteria in place of the oxygen used in aerobic respiration. Another terminal electron acceptor used by anaerobic bacteria is sulfate. Analyses conducted in this study suggest that sulfate levels may be elevated in STAT mice, suggesting that sulfate-reducing bacteria may be reduced in abundance. Methanogens use carbon dioxide as a terminal electron acceptor for energy production, generating methane. The representation of methanogens was increased in STAT mice, suggesting that methanogens are the primary user of the H2 produced by branched fermentation. Finally, acetogens use carbon dioxide as a terminal electron acceptor, producing acetate, and the relative abundance of acetogenic bacteria in the gut microbiota suggests that the formation of acetyl-CoA by these bacteria is a driving force for anaerobic respiration in the gut.

A novel finding observed in STAT mice relates to the reduced abundance of several components of the F and/or V-type ATPases. This result appears consistent with the overall reduction of enzymes comprising the TCA cycle, which also generates ATP as energy for a variety of cellular processes. Given the apparent increased levels of fermentation in the STAT mice, it is unclear how the protons generated in that process are managed, as their buildup in the cell eventually becomes inhibitory.

The importance of the results presented in this thesis lies in the relevance of the increased energy harvest observed in STAT mice. It is understood that gut bacteria, more than species occupying other environmental niches, secrete many enzymes involved in polysaccharide metabolism. The extracellular metabolism or fermentation of polysaccharides may provide unique insights into the mechanisms through which the increased energy harvesting potential of the gut microbiota of STAT mice might directly increase caloric availability to the host. It is possible that the additional energy harvest mediated by the STAT microbiota may not be shared equally. Redox imbalances in sulfate/sulfite and reduced V and/or F-type ATPase activity suggest that energy utilization of the microbiota may be negatively impacted due to antibiotic-induced dysbiosis. The validity of this speculation will need to be tested in future experiments.

The work presented in this thesis applies an OMICs approach to the study of the role of the gut microbiota as a potential factor in obesity using the STAT mouse model developed by Dr. Martin Blaser at NYU. Despite the perceived strengths of this approach to make global and unbiased observations pertaining to molecular differences in experimental and control animals, several limitations are also inherent. The metagenomic DNA sequence data highlight functions that are made more or less abundant as the result of antibiotic exposure; however, these functions may or may not be relevant to the phenotype under study. For example, the shifts in the relative abundance of microbial communities necessarily alter the representation of all genes encoded in those genomes; therefore, care must be taken with respect to how data are interpreted. One significant advantage of the STAT mouse model of obesity is that male mice increase fat mass more than female mice. Throughout the presentation of results we observe gene sets that remain unchanged and those that are altered in both male and female STAT mice as a result of antibiotic exposure. In each case, focus was applied to those functions that correlated with the measured phenotype (i.e. observed fat mass changes in male mice were greater than those observed in female mice). The application of this approach greatly focused attention onto those changes that appeared to have direct relevance to the obese phenotype. The power of the OMICs approach lies in its ability to provide global analysis of species, genes, transcripts and proteins expressed under the experimental conditions being evaluated; however, these methods rarely prove functional relationships and causality, but rather provide strong clues to these issues. Additional focused studies are required to verify the results of all observations presented here, since virtually all of the results represent well founded hypotheses that may be more meticulously tested to better understand the host-microbiota interaction in obesity.

The use of animal models is widespread in biology; however, some animal models are better surrogates than others with respect to their application to improving our understanding of human health and disease. Despite the significant advantages of using inbred mice with a controlled diet, features that would be impossible to achieve in human populations, the true significance of any observation made in animal models must be translated to that of the human subject. Published studies conducted by a number of research groups have initiated this process and, while only a small fraction of the total changes associated with obese mice have been evaluated in human subjects, thus far they are largely confirmatory. It is evident that obesity as a phenotype does not constitute a single set of pathways and, like most complex diseases, represents a process with multiple distinct and alternative initiators and effector pathways that may lead to the same phenotypic end point. The data presented in this thesis focus on a single potential initiator, namely the composition and functional activities of the gut microbiota, and therefore may be limited in their general applicability to obesity.

The analysis of metagenomic data has significant power to provide insights with regard to phylogenomic distribution of species present in the gut and those functions that may be altered in representation as the result of experimental treatment. However, certain limitations are also evident in this approach. For example, the ability to unambiguously identify the complement of species present in a microbial community using metagenomic DNA sequence methods is dependent on the sequences of those species’ genomes being available. Despite the large and growing efforts to expand the reference genomes pertaining to gut species, gaps will always remain and current genome representation is heavily biased in favor of human, rather than mouse, isolates. Confidence in assignments is achieved when a large number of sequence reads can be aligned to the genomic sequence of known species with an overall high sequence identity (95-100%). There were instances pointed out in the results section where the best matches mapped to reference microbial genomes with reduced sequence identity (e.g. B. fragilis and P. gingivalis). Upon further scrutiny, these alignments suggested that these sequences were unlikely to be derived from these species and are more likely to be derived from the genomes of close relatives of these species. Presumably, the genomes of many species present in the STAT mice have yet to be sequenced, thereby causing the uncertainty in some species identifications.

The identification of differentially represented metagenomic sequences is also limited in its utility by the fact that the presence of any gene does not reveal anything with respect to its level of expression and functional activity. The mapping of metagenomic sequence reads to known biochemical pathways represents a useful means of establishing context with respect to the vast set of genes that may be differentially represented in the gut microbiota following antibiotic treatment. By the same token, the differential abundance of genes, even in a male dominant manner, does not by itself necessarily constitute a biologically meaningful observation. Care was taken to evaluate those differentially represented gene functions and to focus on those differences involving multiple genes in a single pathway and, moreover, those differences that were biochemically coherent, either involving sequential steps in a pathway or impacting the steps that generate key end products of metabolism. Nevertheless, certain limitations in data interpretation were encountered and described. The most notable were based on the fact that the majority of metabolic enzymes portrayed in biochemical pathways are bidirectional in their activity. In this regard, biological intuition rather than rigorous biochemical proof was applied in interpreting these results. In many instances, two or more interpretations of such data are possible but a favored interpretation was presented based on our current understanding of obesity-related pathways. The primary achievement provided by the analysis of the metagenomic data is a relatively small set of predicted alterations in specific substrates and/or products that may be directly confirmed or refuted by the application of targeted metabolomics approaches in the future.

One powerful way of compensating for some of the weaknesses described regarding the use of metagenomic DNA sequence analysis is the application of complementary approaches that measure expression states of genes. The use of metaproteomics presented two central strengths. First, the data generated represent those gene functions that are expressed at high levels and imply functional importance as the result of their overall abundance in the metaproteome. Second, the data generated provided an independent experimental platform to evaluate the level of reproducibility of those observations made in metagenomic DNA sequence analysis. The application of proteomics techniques has a number of limitations. No proteomics data sets are complete, since not all proteins are recovered in a soluble form and some proteins are resistant to, or have a limited number of sites for, trypsin cleavage. Furthermore, not all peptides ionize with equal efficiencies in the mass spectrometer, therefore creating certain biases in the observed peptide profiles. Therefore “missing” data may or may not accurately reflect the true expression state of the system under investigation. Despite the significant progress made in the past decade to improve the performance of mass spectrometers used in proteomics studies (increased mass accuracy and sensitivity), the dynamic range of detection is still rather limited compared to DNA based sequencing approaches. The most highly expressed proteins in any sample become over-sampled in proteomics studies at the expense of observing peptides of lower abundance. In this respect, the metaproteomics data presented in this thesis are limited to those proteins expressed at the highest level in the gut metaproteome, whereas proteins of moderate and low expression levels may only be sporadically observed or not observed at all.

Another shortcoming of metaproteomics is the difficulty in mapping peptides to their source genes with a uniformly high level of confidence. This limitation arises because many tryptic peptides are only 5-7 amino acids in length. When such peptides are used as queries against large databases like NCBI, they may match two or more different proteins, which confounds interpretation of the results. The work presented in this thesis specifically addressed this problem by conducting peptide searches against a “model-specific” database generated by the deep sequencing of normalized metagenomic libraries constructed for this purpose. This undertaking resulted not only in an increase in the number of mapped peptides but also in higher confidence values for the protein assignments made.
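
The ambiguity of short tryptic peptides, and the way a smaller sample-specific database can reduce it, may be illustrated with a minimal sketch. It is not the search pipeline used in this work. The protein names and sequences are hypothetical, the digestion rule is the common simplification of trypsin specificity (cleavage after K or R unless followed by P, with no missed cleavages), and the assumption that only one of the two proteins is encoded in the sample is made purely for illustration.

```python
from collections import defaultdict


def tryptic_digest(sequence, min_length=5):
    """Cleave after K or R, except when the next residue is P.

    Simplified rule: no missed cleavages; peptides shorter than
    min_length are discarded as too short to identify.
    """
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide
    return [p for p in peptides if len(p) >= min_length]


def peptide_index(database):
    """Map each tryptic peptide to the set of proteins that contain it."""
    index = defaultdict(set)
    for name, sequence in database.items():
        for peptide in tryptic_digest(sequence):
            index[peptide].add(name)
    return index


# Hypothetical sequences, invented for illustration only.
broad_db = {
    "protA": "MAGTLEKAVLDGRSAEELKWTPQR",
    "protB": "MSNDYKAVLDGRHHFGEWLTK",
}
# Assumption: only protA is encoded in the sample's own metagenome.
model_db = {"protA": broad_db["protA"]}

broad_index = peptide_index(broad_db)
model_index = peptide_index(model_db)

# Peptides that map to more than one protein cannot be assigned uniquely.
ambiguous_broad = sorted(p for p, hits in broad_index.items() if len(hits) > 1)
ambiguous_model = sorted(p for p, hits in model_index.items() if len(hits) > 1)
print("ambiguous in broad database:", ambiguous_broad)
print("ambiguous in model-specific database:", ambiguous_model)
```

In this toy case the six-residue peptide shared by both proteins is ambiguous against the broad database but maps uniquely once the search space is restricted to sequences expected in the sample.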

The data sets generated were analyzed almost exclusively with the MetaRep program, which was developed at the JCVI. This program, like other standard analysis tools, performs basic statistical analyses of its outputs. The details of these statistical analyses are not, however, published or described in any detail in the user manual. In this regard, the data presented in this thesis could be improved by a more rigorous statistical analysis conducted by an expert in this discipline. In general, significance of findings was determined using somewhat arbitrary criteria: specific stated minimum thresholds of abundance relative to the total had to be met, and fold-changes observed when comparing experimental versus control animals had to be greater than 2-fold unless otherwise stated. Owing to the large data sets generated by omics approaches and the costs associated with their use, in most cases true biological replicates were not performed as independent experiments. Instead, stool samples collected from multiple mice were pooled to average potential biological variation. While not ideal, this approach was applied for practical reasons. Technical replicates were performed for all proteomics experiments, consistent with standards applied in that field.
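
The kind of threshold-based screen described above (a minimum relative abundance combined with a greater than 2-fold change between pooled groups) can be sketched as follows. This is an illustrative sketch only and does not reproduce MetaRep's internal calculations. The feature names and counts are invented, the 5% abundance cutoff is a placeholder for the "stated minimum thresholds", and the pseudocount used to avoid division by zero is an added assumption. As noted in the text, a screen of this kind is not a statistical test.

```python
def relative_abundance(counts):
    """Convert raw counts to fractions of the sample total."""
    total = sum(counts.values())
    return {feature: n / total for feature, n in counts.items()}


def flag_features(control_counts, treated_counts,
                  min_abundance=0.05, min_fold=2.0, pseudocount=1e-9):
    """Flag features that pass an abundance cutoff and a fold-change cutoff.

    min_abundance and pseudocount are hypothetical values chosen for
    this example; min_fold=2.0 mirrors the 2-fold criterion in the text.
    """
    control = relative_abundance(control_counts)
    treated = relative_abundance(treated_counts)
    flagged = {}
    for feature in sorted(set(control) | set(treated)):
        c = control.get(feature, 0.0)
        t = treated.get(feature, 0.0)
        if max(c, t) < min_abundance:
            continue  # too rare in both groups to interpret
        ratio = (t + pseudocount) / (c + pseudocount)
        fold = ratio if ratio >= 1 else 1 / ratio
        if fold > min_fold:
            flagged[feature] = ("up" if ratio > 1 else "down", round(fold, 2))
    return flagged


# Invented counts for one pooled control sample and one pooled treated sample.
control_counts = {"taxon_A": 500, "taxon_B": 300, "taxon_C": 195, "taxon_D": 5}
treated_counts = {"taxon_A": 200, "taxon_B": 330, "taxon_C": 450, "taxon_D": 20}

flagged = flag_features(control_counts, treated_counts)
for feature, (direction, fold) in flagged.items():
    print(feature, direction, fold)
```

Here taxon_D changes 4-fold but is excluded because it never reaches the abundance cutoff, which shows how the two criteria interact; with pooled samples and no biological replicates, the screen cannot estimate within-group variance.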

In summary, the approaches applied here carried specific strengths and weaknesses. The limitations inherent in each method are important to consider; however, care was taken throughout to evaluate experimental results in the context of these limitations so that only results of potential significance were presented. Ultimately, as with any omics-based study, focused experimental verification is required to establish biological significance and applicability to human conditions.

In terms of gender-specific effects on obesity, it has been reported that male mice are more susceptible to weight gain as the result of a high-fat diet [321, 322]. The STAT model allows speculation with regard to the gender-specific effects on fat mass gain in male and female mice. Given that all mice were treated identically with penicillin, it might be predicted that the microbiota would be altered identically in both male and female mice. Based on the data generated and described in this thesis, this is not the case. In general, shifts in microbiota composition were more dramatic in male compared to female mice, consistent with male mice displaying a greater increase in fat mass gain in response to antibiotic treatment. This suggests that not all of the alterations in microbiota composition are the direct result of antibiotic treatment and that feedback from the host generates qualitatively similar effects in both sexes that are quantitatively more dramatic in male mice. Therefore, the difference between male and female responses to antibiotic treatment must involve gender-specific differences in host physiology.

One published study investigated the role of caveolin 1, a target of sex-dependent hormones, in high-fat diet-induced obesity [323]. These investigators demonstrated that estrogen (17β-estradiol) and androgen (dihydrotestosterone) had opposite effects on body weight and caveolin 1 expression. Mice treated with 17β-estradiol displayed reduced weight gain independent of diet compared to untreated mice, whereas androgen treatment resulted in increased fat mass gain. The interaction of caveolin 1 with components of mitochondrial and lipid oxidative pathways may therefore mediate gender-specific effects of high-fat diet-induced obesity. Cav1-null mice display decreased body weight owing to the reduced efficiency with which adipose tissue stores lipids. Monitoring of Cav1 expression in visceral adipose tissue of obese human subjects showed consistently reduced expression [324]. Sex steroid hormones are known to regulate aspects of metabolism, including lipid and carbohydrate metabolism [325-327]. Interestingly, estrogen has been independently shown to reduce body weight in mice provided a high-fat diet [328]. These findings provide a potential explanation for the sex-dependent differential response of the host to obesity-inducing signals. The relevance of these findings with respect to the STAT mouse model of obesity represents an interesting topic for future investigation.

This thesis has presented evidence of the overall utility of the STAT mouse model of obesity and of its significant similarities to models used by other groups, including genetic models and fecal microbiota transplantation models. Taken collectively, this work supports the hypothesis that gut microbiota dysbiosis is sufficient to induce obesity in mouse models and human subjects. The phylogenetic signatures of obesity defined here and by others may allow the development of therapeutic interventions in the future that drive the microbial composition of the gut toward that of lean individuals. Probiotic and prebiotic formulations may be generated to achieve this goal. Therapeutics used to treat obesity may only need to be administered temporarily, as it may be predicted that once gut microbiota homeostasis is achieved, continued treatment should be unnecessary. While this scenario might normally dissuade investment by pharmaceutical companies, the large and rapidly growing number of individuals with obesity worldwide makes evident that anti-obesity therapeutics will be highly profitable.


References

1. Ogden CL, C.M., McDowell MA, Flegal KM, Obesity among adults in the United States - no change since 2003-2004, in NCHS Data Brief No. 1, 2007, National Center for Health Statistics: Hyattsville, MD.
2. Ogden CL, C.M., Kit BK, Flegal KM., et al., Prevalence of obesity in the United States, 2009–2010, in NCHS Data Brief, 2012, National Center for Health Statistics: Hyattsville, MD.
3. WHO, W.H.O., Overweight and obesity factsheet, 2013.
4. Breymaier, S., AMA Adopts New Policies on Second Day of Voting at Annual Meeting, in AMA Newsroom Press Release, 2013, American Medical Association.
5. Pi-Sunyer, X. Clinical Guidelines on the Identification, Evaluation, and Treatment of Overweight and Obesity in Adults. National Institutes of Health NHLBI Obesity Education Initiative. The Evidence Report. 1998 [cited 2013]; No. 98-4803. Available from: http://www.nhlbi.nih.gov/guidelines/obesity/ob_gdlns.pdf
6. CDC, 2010 State Obesity Rates, 2010, Division of Nutrition, Physical Activity, and Obesity. National Center for Chronic Disease Prevention and Health Promotion.
7. Sturm, R., J.S. Ringel, and T. Andreyeva, Increasing obesity rates and disability trends. Health Aff (Millwood), 2004. 23(2): p. 199-205.
8. Eric A. Finkelstein, J.G.T., Joel W. Cohen, William Dietz, Annual Medical Spending Attributable To Obesity: Payer-And Service-Specific Estimates. Health Affairs, 2009. 28(5): p. w822-w831.
9. American Diabetes, A., Economic costs of diabetes in the U.S. in 2012. Diabetes Care, 2013. 36(4): p. 1033-46.
10. Finkelstein, E.A., et al., Obesity and severe obesity forecasts through 2030. Am J Prev Med, 2012. 42(6): p. 563-70.
11. Hill, J.O. and J.C. Peters, Environmental contributions to the obesity epidemic. Science, 1998. 280(5368): p. 1371-4.
12. CDC. Diabetes: Successes and Opportunities for Population-Based Prevention and Control At A Glance 2011. 2011 [cited 2013]; Available from: http://www.cdc.gov/chronicdisease/resources/publications/AAG/ddt.htm.
13. Ley, R.E., et al., Evolution of mammals and their gut microbes. Science, 2008. 320(5883): p. 1647-51.
14. Ley, R.E., et al., Microbial ecology: human gut microbes associated with obesity. Nature, 2006. 444(7122): p. 1022-3.
15. Zhang, H., et al., Human gut microbiota in obesity and after gastric bypass. Proc Natl Acad Sci U S A, 2009. 106(7): p. 2365-70.
16. Kau, A.L., et al., Human nutrition, the gut microbiome and the immune system. Nature, 2011. 474(7351): p. 327-36.
17. Qin, J., et al., A metagenome-wide association study of gut microbiota in type 2 diabetes. Nature, 2012. 490(7418): p. 55-60.
18. Frank, D.N., et al., Molecular-phylogenetic characterization of microbial community imbalances in human inflammatory bowel diseases. Proc Natl Acad Sci U S A, 2007. 104(34): p. 13780-5.
19. Wu, N., et al., Dysbiosis signature of fecal microbiota in colorectal cancer patients. Microb Ecol, 2013. 66(2): p. 462-70.


20. Qin, J., et al., A human gut microbial gene catalogue established by metagenomic sequencing. Nature, 2010. 464(7285): p. 59-65.
21. Arumugam, M., et al., Enterotypes of the human gut microbiome. Nature, 2011. 473(7346): p. 174-80.
22. Human Microbiome Project, C., Structure, function and diversity of the healthy human microbiome. Nature, 2012. 486(7402): p. 207-14.
23. Human Microbiome Project, C., A framework for human microbiome research. Nature, 2012. 486(7402): p. 215-21.
24. Turnbaugh, P.J., et al., The human microbiome project. Nature, 2007. 449(7164): p. 804-10.
25. Group, N.H.W., et al., The NIH Human Microbiome Project. Genome Res, 2009. 19(12): p. 2317-23.
26. Turnbaugh, P.J., et al., A core gut microbiome in obese and lean twins. Nature, 2009. 457(7228): p. 480-4.
27. Wen, L., et al., Innate immunity and intestinal microbiota in the development of Type 1 diabetes. Nature, 2008. 455(7216): p. 1109-13.
28. Benson, A.K., et al., Individuality in gut microbiota composition is a complex polygenic trait shaped by multiple environmental and host genetic factors. Proc Natl Acad Sci U S A, 2010. 107(44): p. 18933-8.
29. Eckburg, P.B., et al., Diversity of the human intestinal microbial flora. Science, 2005. 308(5728): p. 1635-8.
30. Baquero, F. and C. Nombela, The microbiome as a human organ. Clin Microbiol Infect, 2012. 18 Suppl 4: p. 2-4.
31. Ventura, M., et al., Genome-scale analyses of health-promoting bacteria: probiogenomics. Nat Rev Microbiol, 2009. 7(1): p. 61-71.
32. Jeffery, I.B. and P.W. O'Toole, Diet-microbiota interactions and their implications for healthy living. Nutrients, 2013. 5(1): p. 234-52.
33. Robles Alonso, V. and F. Guarner, Linking the gut microbiota to human health. Br J Nutr, 2013. 109 Suppl 2: p. S21-6.
34. Medicine, U.N.L.o.; Available from: http://www.ncbi.nlm.nih.gov/pubmed.
35. Prakash, O., et al., Microbial cultivation and the role of microbial resource centers in the omics era. Appl Microbiol Biotechnol, 2013. 97(1): p. 51-62.
36. Akondi, K.B. and V.V. Lakshmi, Emerging trends in genomic approaches for microbial bioprospecting. OMICS, 2013. 17(2): p. 61-70.
37. Goodman, A.L., et al., Extensive personal human gut microbiota culture collections characterized and manipulated in gnotobiotic mice. Proc Natl Acad Sci U S A, 2011. 108(15): p. 6252-7.
38. Ze, X., et al., Ruminococcus bromii is a keystone species for the degradation of resistant starch in the human colon. ISME J, 2012. 6(8): p. 1535-43.
39. Fontana, J.M., E. Alexander, and M. Salvatore, Translational research in infectious disease: current paradigms and challenges ahead. Transl Res, 2012. 159(6): p. 430-53.
40. Lozupone, C.A., et al., Diversity, stability and resilience of the human gut microbiota. Nature, 2012. 489(7415): p. 220-30.
41. Nelson, A., et al., Polymicrobial challenges to Koch's postulates: ecological lessons from the bacterial vaginosis and cystic fibrosis microbiomes. Innate Immun, 2012. 18(5): p. 774-83.
42. Zhao, J. and S.F. Grant, Advances in whole genome sequencing technology. Curr Pharm Biotechnol, 2011. 12(2): p. 293-305.


43. Morozova, O. and M.A. Marra, Applications of next-generation sequencing technologies in functional genomics. Genomics, 2008. 92(5): p. 255-64.
44. Liu, L., et al., Comparison of next-generation sequencing systems. J Biomed Biotechnol, 2012. 2012: p. 251364.
45. Institute, J.C.V.; Available from: http://www.glassdoor.com/Photos/J-Craig-Venter-Institute-Office-Photos-E111330.htm.
46. Sorek, R. Microbiology with ultra-high-throughput sequencing. 2013; Available from: http://www.weizmann.ac.il/molgen/Sorek/solexa.html.
47. Gill, S.R., et al., Metagenomic analysis of the human distal gut microbiome. Science, 2006. 312(5778): p. 1355-9.
48. Song, S., T. Jarvie, and M. Hattori, Our second genome-human metagenome: how next-generation sequencer changes our life through microbiology. Adv Microb Physiol, 2013. 62: p. 119-44.
49. Institute, N.H.G.R.; Available from: http://www.genome.gov/SequencingCosts/.
50. Zhao, L., et al., Targeting the human genome-microbiome axis for drug discovery: inspirations from global systems biology and traditional Chinese medicine. J Proteome Res, 2012. 11(7): p. 3509-19.
51. Karlsson, F.H., et al., Prospects for systems biology and modeling of the gut microbiome. Trends Biotechnol, 2011. 29(6): p. 251-8.
52. Ashburner, M., et al., Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet, 2000. 25(1): p. 25-9.
53. Chivian, D., et al., MetaMicrobesOnline: phylogenomic analysis of microbial communities. Nucleic Acids Res, 2013. 41(Database issue): p. D648-54.
54. Altschul, S.F., et al., Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res, 1997. 25(17): p. 3389-402.
55. Human Microbiome Jumpstart Reference Strains, C., et al., A catalog of reference genomes from the human microbiome. Science, 2010. 328(5981): p. 994-9.
56. Carlisle, E.M., et al., Murine gut microbiota and transcriptome are diet dependent. Ann Surg, 2013. 257(2): p. 287-94.
57. Bomar, L., et al., Directed culturing of microorganisms using metatranscriptomics. MBio, 2011. 2(2): p. e00012-11.
58. Maurice, C.F., H.J. Haiser, and P.J. Turnbaugh, Xenobiotics shape the physiology and gene expression of the active human gut microbiome. Cell, 2013. 152(1-2): p. 39-50.
59. Kolmeder, C.A. and W.M. de Vos, Metaproteomics of our microbiome - Developing insight in function and activity in man and model systems. J Proteomics, 2013.
60. Hettich, R.L., et al., Metaproteomics: harnessing the power of high performance mass spectrometry to identify the suite of proteins that control metabolic activities in microbial communities. Anal Chem, 2013. 85(9): p. 4203-14.
61. Erickson, A.R., et al., Integrated metagenomics/metaproteomics reveals human host-microbiota signatures of Crohn's disease. PLoS One, 2012. 7(11): p. e49138.
62. Gry, M., et al., Correlations between RNA and protein expression profiles in 23 human cell lines. BMC Genomics, 2009. 10: p. 365.
63. Thelen, J.J. and J.A. Miernyk, The proteomic future: where mass spectrometry should be taking us. Biochem J, 2012. 444(2): p. 169-81.
64. Van Riper, S.K., et al., Mass spectrometry-based proteomics: basic principles and emerging technologies and directions. Adv Exp Med Biol, 2013. 990: p. 1-35.


65. Wu, Q., et al., Recent advances on multidimensional liquid chromatography-mass spectrometry for proteomics: from qualitative to quantitative analysis--a review. Anal Chim Acta, 2012. 731: p. 1-10.
66. Serim, S., U. Haedke, and S.H. Verhelst, Activity-based probes for the study of proteases: recent advances and developments. ChemMedChem, 2012. 7(7): p. 1146-59.
67. Cho, I., et al., Antibiotics in early life alter the murine colonic microbiome and adiposity. Nature, 2012. 488(7413): p. 621-6.
68. Cani, P.D., et al., Changes in gut microbiota control metabolic endotoxemia-induced inflammation in high-fat diet-induced obesity and diabetes in mice. Diabetes, 2008. 57(6): p. 1470-81.
69. Marra, F., et al., Antibiotic use in children is associated with increased risk of asthma. Pediatrics, 2009. 123(3): p. 1003-10.
70. Cani, P.D. and N.M. Delzenne, The role of the gut microbiota in energy metabolism and metabolic disease. Curr Pharm Des, 2009. 15(13): p. 1546-58.
71. Backhed, F., et al., Host-bacterial mutualism in the human intestine. Science, 2005. 307(5717): p. 1915-20.
72. Ley, R.E., D.A. Peterson, and J.I. Gordon, Ecological and evolutionary forces shaping microbial diversity in the human intestine. Cell, 2006. 124(4): p. 837-48.
73. Macdonald, T.T. and G. Monteleone, Immunity, inflammation, and allergy in the gut. Science, 2005. 307(5717): p. 1920-5.
74. Rabizadeh, S. and C. Sears, New Horizons for the Infectious Diseases Specialist: How Gut Microflora Promote Health and Disease. Curr Infect Dis Rep, 2008. 10(2): p. 92-98.
75. Backhed, F., et al., The gut microbiota as an environmental factor that regulates fat storage. Proc Natl Acad Sci U S A, 2004. 101(44): p. 15718-23.
76. Geurts, L., et al., Gut microbiota controls adipose tissue expansion, gut barrier and glucose metabolism: novel insights into molecular targets and interventions using prebiotics. Benef Microbes, 2013: p. 1-15.
77. DiBaise, J.K., et al., Gut microbiota and its possible relationship with obesity. Mayo Clin Proc, 2008. 83(4): p. 460-9.
78. Hooper, L.V., T. Midtvedt, and J.I. Gordon, How host-microbial interactions shape the nutrient environment of the mammalian intestine. Annu Rev Nutr, 2002. 22: p. 283-307.
79. Artis, D., Epithelial-cell recognition of commensal bacteria and maintenance of immune homeostasis in the gut. Nat Rev Immunol, 2008. 8(6): p. 411-20.
80. Stappenbeck, T.S., L.V. Hooper, and J.I. Gordon, Developmental regulation of intestinal angiogenesis by indigenous microbes via Paneth cells. Proc Natl Acad Sci U S A, 2002. 99(24): p. 15451-5.
81. Cerf-Bensussan, N. and V. Gaboriau-Routhiau, The immune system and the gut microbiota: friends or foes? Nat Rev Immunol, 2010. 10(10): p. 735-44.
82. Hansotia, T. and D.J. Drucker, GIP and GLP-1 as incretin hormones: lessons from single and double incretin receptor knockout mice. Regul Pept, 2005. 128(2): p. 125-34.
83. Murphy, K., et al., Janeway's immunobiology. 7th ed. 2008, New York: Garland Science. xxi, 887 p.
84. Turner, J.R., Intestinal mucosal barrier function in health and disease. Nat Rev Immunol, 2009. 9(11): p. 799-809.
85. Tlaskalova-Hogenova, H., et al., The role of gut microbiota (commensal bacteria) and the mucosal barrier in the pathogenesis of inflammatory and autoimmune diseases and cancer: contribution of germ-free and gnotobiotic animal models of human diseases. Cell Mol Immunol, 2011. 8(2): p. 110-20.


86. Jang, M.H., et al., Intestinal villous M cells: an antigen entry site in the mucosal epithelium. Proc Natl Acad Sci U S A, 2004. 101(16): p. 6110-5.
87. Owen, R.L., Uptake and transport of intestinal macromolecules and microorganisms by M cells in Peyer's patches--a personal and historical perspective. Semin Immunol, 1999. 11(3): p. 157-63.
88. Johansson, M.E., J.M. Larsson, and G.C. Hansson, The two mucus layers of colon are organized by the MUC2 mucin, whereas the outer layer is a legislator of host-microbial interactions. Proc Natl Acad Sci U S A, 2011. 108 Suppl 1: p. 4659-65.
89. Bevins, C.L. and N.H. Salzman, Paneth cells, antimicrobial peptides and maintenance of intestinal homeostasis. Nat Rev Microbiol, 2011. 9(5): p. 356-68.
90. Macpherson, A.J., et al., The habitat, double life, citizenship, and forgetfulness of IgA. Immunol Rev, 2012. 245(1): p. 132-46.
91. Cani, P.D., Crosstalk between the gut microbiota and the endocannabinoid system: impact on the gut barrier function and the adipose tissue. Clin Microbiol Infect, 2012. 18 Suppl 4: p. 50-3.
92. Alhouayek, M., et al., Increasing endogenous 2-arachidonoylglycerol levels counteracts colitis and related systemic inflammation. FASEB J, 2011. 25(8): p. 2711-21.
93. Chow, J., H. Tang, and S.K. Mazmanian, Pathobionts of the gastrointestinal microbiota and inflammatory disease. Curr Opin Immunol, 2011. 23(4): p. 473-80.
94. De La Cochetiere, M.F., et al., Effect of antibiotic therapy on human fecal microbiota and the relation to the development of Clostridium difficile. Microb Ecol, 2008. 56(3): p. 395-402.
95. Neish, A.S., et al., Prokaryotic regulation of epithelial responses by inhibition of IkappaB-alpha ubiquitination. Science, 2000. 289(5484): p. 1560-3.
96. Mills, M.E., The CNS and collaborative practice. Clin Nurse Spec, 1990. 4(4): p. 194-5.
97. Biasucci, G., et al., Mode of delivery affects the bacterial community in the newborn gut. Early Hum Dev, 2010. 86 Suppl 1: p. 13-5.
98. Dominguez-Bello, M.G., et al., Delivery mode shapes the acquisition and structure of the initial microbiota across multiple body habitats in newborns. Proc Natl Acad Sci U S A, 2010. 107(26): p. 11971-5.
99. Lif Holgerson, P., et al., Mode of birth delivery affects oral microbiota in infants. J Dent Res, 2011. 90(10): p. 1183-8.
100. Peterson, S.N., et al., Dental caries pathogenicity: a genomic and metagenomic perspective. Int Dent J, 2011. 61 Suppl 1: p. 11-22.
101. Walter, J. and R. Ley, The human gut microbiome: ecology and recent evolutionary changes. Annu Rev Microbiol, 2011. 65: p. 411-29.
102. Faust, K., et al., Microbial co-occurrence relationships in the human microbiome. PLoS Comput Biol, 2012. 8(7): p. e1002606.
103. Faust, K. and J. Raes, Microbial interactions: from networks to models. Nat Rev Microbiol, 2012. 10(8): p. 538-50.
104. Gueimonde, M., et al., Effect of maternal consumption of lactobacillus GG on transfer and establishment of fecal bifidobacterial microbiota in neonates. J Pediatr Gastroenterol Nutr, 2006. 42(2): p. 166-70.
105. Vaishampayan, P.A., et al., Comparative metagenomics and population dynamics of the gut microbiota in mother and infant. Genome Biol Evol, 2010. 2: p. 53-66.
106. Penders, J., et al., Factors influencing the composition of the intestinal microbiota in early infancy. Pediatrics, 2006. 118(2): p. 511-21.


107. Nelun Barfod, M., et al., Oral microflora in infants delivered vaginally and by caesarean section. Int J Paediatr Dent, 2011. 21(6): p. 401-6.
108. Schwarz, S., et al., Horizontal versus familial transmission of Helicobacter pylori. PLoS Pathog, 2008. 4(10): p. e1000180.
109. Turnbaugh, P.J., et al., Organismal, genetic, and transcriptional variation in the deeply sequenced gut microbiomes of identical twins. Proc Natl Acad Sci U S A, 2010. 107(16): p. 7503-8.
110. Deloris Alexander, A., et al., Quantitative PCR assays for mouse enteric flora reveal strain-dependent differences in composition that are influenced by the microenvironment. Mamm Genome, 2006. 17(11): p. 1093-104.
111. Elinav, E., et al., NLRP6 inflammasome regulates colonic microbial ecology and risk for colitis. Cell, 2011. 145(5): p. 745-57.
112. Hooper, L.V., D.R. Littman, and A.J. Macpherson, Interactions between the microbiota and the immune system. Science, 2012. 336(6086): p. 1268-73.
113. Arthur, J.C., et al., Intestinal inflammation targets cancer-inducing activity of the microbiota. Science, 2012. 338(6103): p. 120-3.
114. Natividad, J.M., et al., Commensal and probiotic bacteria influence intestinal barrier function and susceptibility to colitis in Nod1-/-; Nod2-/- mice. Inflamm Bowel Dis, 2012. 18(8): p. 1434-46.
115. Grehan, M.J., et al., Durable alteration of the colonic microbiota by the administration of donor fecal flora. J Clin Gastroenterol, 2010. 44(8): p. 551-61.
116. Turnbaugh, P.J., et al., The effect of diet on the human gut microbiome: a metagenomic analysis in humanized gnotobiotic mice. Sci Transl Med, 2009. 1(6): p. 6ra14.
117. Yatsunenko, T., et al., Human gut microbiome viewed across age and geography. Nature, 2012. 486(7402): p. 222-7.
118. Fallani, M., et al., Determinants of the human infant intestinal microbiota after the introduction of first complementary foods in infant samples from five European centres. Microbiology, 2011. 157(Pt 5): p. 1385-92.
119. Holgerson, P.L., et al., Oral Microbial Profile Discriminates Breastfed from Formula-Fed Infants. J Pediatr Gastroenterol Nutr, 2012.
120. Hascoet, J.M., et al., Effect of formula composition on the development of infant gut microbiota. J Pediatr Gastroenterol Nutr, 2011. 52(6): p. 756-62.
121. Dethlefsen, L., et al., The pervasive effects of an antibiotic on the human gut microbiota, as revealed by deep 16S rRNA sequencing. PLoS Biol, 2008. 6(11): p. e280.
122. Relman, D.A., The human microbiome: ecosystem resilience and health. Nutr Rev, 2012. 70 Suppl 1: p. S2-9.
123. Koenig, J.E., et al., Succession of microbial consortia in the developing infant gut microbiome. Proc Natl Acad Sci U S A, 2011. 108 Suppl 1: p. 4578-85.
124. Palmer, C., et al., Development of the human infant intestinal microbiota. PLoS Biol, 2007. 5(7): p. e177.
125. Dominguez-Bello, M.G., et al., Development of the human gastrointestinal microbiota and insights from high-throughput sequencing. Gastroenterology, 2011. 140(6): p. 1713-9.
126. Morowitz, M.J., et al., Strain-resolved community genomic analysis of gut microbial colonization in a premature infant. Proc Natl Acad Sci U S A, 2011. 108(3): p. 1128-33.
127. Marsh, P.D. and D.A. Devine, How is the development of dental biofilms influenced by the host? J Clin Periodontol, 2011. 38 Suppl 11: p. 28-35.


128. Van der Hoeven, J.S. and P.J. Camp, Synergistic degradation of mucin by Streptococcus oralis and Streptococcus sanguis in mixed chemostat cultures. J Dent Res, 1991. 70(7): p. 1041-4.
129. Diaz, P.I., P.S. Zilm, and A.H. Rogers, Fusobacterium nucleatum supports the growth of Porphyromonas gingivalis in oxygenated and carbon-dioxide-depleted environments. Microbiology, 2002. 148(Pt 2): p. 467-72.
130. Fedi, P.F., Jr. and W.J. Killoy, Temperature differences at periodontal sites in health and disease. J Periodontol, 1992. 63(1): p. 24-7.
131. Svensater, G., et al., Acid tolerance response and survival by oral bacteria. Oral Microbiol Immunol, 1997. 12(5): p. 266-73.
132. Palmer, R.J., Jr., P.I. Diaz, and P.E. Kolenbrander, Rapid succession within the Veillonella population of a developing human oral biofilm in situ. J Bacteriol, 2006. 188(11): p. 4117-24.
133. Jakubovics, N.S., et al., Role of hydrogen peroxide in competition and cooperation between Streptococcus gordonii and Actinomyces naeslundii. FEMS Microbiol Ecol, 2008. 66(3): p. 637-44.
134. Proctor, L.M., The Human Microbiome Project in 2011 and beyond. Cell Host Microbe, 2011. 10(4): p. 287-91.
135. Wittebolle, L., et al., Initial community evenness favours functionality under selective stress. Nature, 2009. 458(7238): p. 623-6.
136. Dethlefsen, L. and D.A. Relman, Incomplete recovery and individualized responses of the human distal gut microbiota to repeated antibiotic perturbation. Proc Natl Acad Sci U S A, 2011. 108 Suppl 1: p. 4554-61.
137. Sonnenburg, J.L., C.T. Chen, and J.I. Gordon, Genomic and metabolic studies of the impact of probiotics on a model gut symbiont and host. PLoS Biol, 2006. 4(12): p. e413.
138. Samuel, B.S. and J.I. Gordon, A humanized gnotobiotic mouse model of host-archaeal-bacterial mutualism. Proc Natl Acad Sci U S A, 2006. 103(26): p. 10011-6.
139. Chassard, C. and A. Bernalier-Donadille, H2 and acetate transfers during xylan fermentation between a butyrate-producing xylanolytic species and hydrogenotrophic microorganisms from the human gut. FEMS Microbiol Lett, 2006. 254(1): p. 116-22.
140. Mahowald, M.A., et al., Characterizing a model human gut microbiota composed of members of its two dominant bacterial phyla. Proc Natl Acad Sci U S A, 2009. 106(14): p. 5859-64.
141. Fischbach, M.A. and J.L. Sonnenburg, Eating for two: how metabolism establishes interspecies interactions in the gut. Cell Host Microbe, 2011. 10(4): p. 336-47.
142. Rey, F.E., et al., Dissecting the in vivo metabolic potential of two human gut acetogens. J Biol Chem, 2010. 285(29): p. 22082-90.
143. Faith, J.J., et al., Creating and characterizing communities of human gut microbes in gnotobiotic mice. ISME J, 2010. 4(9): p. 1094-8.
144. Falk, P.G., et al., Creating and maintaining the gastrointestinal ecosystem: what we know and need to know from gnotobiology. Microbiol Mol Biol Rev, 1998. 62(4): p. 1157-70.
145. Alam, M., T. Midtvedt, and A. Uribe, Differential cell kinetics in the ileum and colon of germfree rats. Scand J Gastroenterol, 1994. 29(5): p. 445-51.
146. Mueller, C. and A.J. Macpherson, Layers of mutualism with commensal bacteria protect us from intestinal inflammation. Gut, 2006. 55(2): p. 276-84.


147. van der Waaij, D., The ecology of the human intestine and its consequences for overgrowth by pathogens such as Clostridium difficile. Annu Rev Microbiol, 1989. 43: p. 69-87.
148. Hill, D.A., et al., Metagenomic analyses reveal antibiotic-induced temporal and spatial changes in intestinal microbiota with associated alterations in immune cell homeostasis. Mucosal Immunol, 2010. 3(2): p. 148-58.
149. McFarland, L.V., S.A. Brandmarker, and S. Guandalini, Pediatric Clostridium difficile: a phantom menace or clinical reality? J Pediatr Gastroenterol Nutr, 2000. 31(3): p. 220-31.
150. Schwan, A., et al., Relapsing Clostridium difficile enterocolitis cured by rectal infusion of normal faeces. Scand J Infect Dis, 1984. 16(2): p. 211-5.
151. Brandt, L.J., American Journal of Gastroenterology Lecture: Intestinal Microbiota and the Role of Fecal Microbiota Transplant (FMT) in Treatment of C. difficile Infection. Am J Gastroenterol, 2013. 108(2): p. 177-85.
152. Sanderson, I.R. and W.A. Walker, TLRs in the Gut I. The role of TLRs/Nods in intestinal development and homeostasis. Am J Physiol Gastrointest Liver Physiol, 2007. 292(1): p. G6-10.
153. Sansonetti, P.J. and R. Medzhitov, Learning tolerance while fighting ignorance. Cell, 2009. 138(3): p. 416-20.
154. Franceschi, C., et al., The immunology of exceptional individuals: the lesson of centenarians. Immunol Today, 1995. 16(1): p. 12-6.
155. Franceschi, C., M. Bonafe, and S. Valensin, Human immunosenescence: the prevailing of innate immunity, the failing of clonotypic immunity, and the filling of immunological space. Vaccine, 2000. 18(16): p. 1717-20.
156. Sansoni, P., et al., Lymphocyte subsets and natural killer cell activity in healthy old people and centenarians. Blood, 1993. 82(9): p. 2767-73.
157. Zanni, F., et al., Marked increase with age of type 1 cytokines within memory and effector/cytotoxic CD8+ T cells in humans: a contribution to understand the relationship between inflammation and immunosenescence. Exp Gerontol, 2003. 38(9): p. 981-7.
158. Franceschi, C., et al., Inflamm-aging. An evolutionary perspective on immunosenescence. Ann N Y Acad Sci, 2000. 908: p. 244-54.
159. Kinross, J. and J.K. Nicholson, Gut microbiota: Dietary and social modulation of gut microbiota in the elderly. Nat Rev Gastroenterol Hepatol, 2012. 9(10): p. 563-4.
160. Claesson, M.J., et al., Composition, variability, and temporal stability of the intestinal microbiota of the elderly. Proc Natl Acad Sci U S A, 2011. 108 Suppl 1: p. 4586-91.
161. Schiffrin, E.J., et al., The inflammatory status of the elderly: the intestinal contribution. Mutat Res, 2010. 690(1-2): p. 50-6.
162. Rhee, K.J., et al., Induction of persistent colitis by a human commensal, enterotoxigenic Bacteroides fragilis, in wild-type C57BL/6 mice. Infect Immun, 2009. 77(4): p. 1708-18.
163. Wu, S., et al., A human colonic commensal promotes colon tumorigenesis via activation of T helper type 17 T cell responses. Nat Med, 2009. 15(9): p. 1016-22.
164. Balkwill, F.R. and A. Mantovani, Cancer-related inflammation: common themes and therapeutic opportunities. Semin Cancer Biol, 2012. 22(1): p. 33-40.
165. Lin, W.W. and M. Karin, A cytokine-mediated link between innate immunity, inflammation, and cancer. J Clin Invest, 2007. 117(5): p. 1175-83.
166. Uronis, J.M., et al., Modulation of the intestinal microbiota alters colitis-associated colorectal cancer susceptibility. PLoS One, 2009. 4(6): p. e6026.
167. Littman, D.R. and E.G. Pamer, Role of the commensal microbiota in normal and pathogenic host immune responses. Cell Host Microbe, 2011. 10(4): p. 311-23.

263

168. Atarashi, K., et al., Induction of colonic regulatory T cells by indigenous Clostridium species. Science, 2011. 331(6015): p. 337-41. 169. Ivanov, II, et al., Induction of intestinal Th17 cells by segmented filamentous bacteria. Cell, 2009. 139(3): p. 485-98. 170. Round, J.L. and S.K. Mazmanian, Inducible Foxp3+ regulatory T-cell development by a commensal bacterium of the intestinal microbiota. Proc Natl Acad Sci U S A, 2010. 107(27): p. 12204-9. 171. Sonnenberg, G.F., L.A. Fouser, and D. Artis, Border patrol: regulation of immunity, inflammation and tissue homeostasis at barrier surfaces by IL-22. Nat Immunol, 2011. 12(5): p. 383-90. 172. Ichinohe, T., et al., Microbiota regulates immune defense against respiratory tract influenza A virus infection. Proc Natl Acad Sci U S A, 2011. 108(13): p. 5354-9. 173. Sawa, S., et al., RORgammat+ innate lymphoid cells regulate intestinal homeostasis by integrating negative signals from the symbiotic microbiota. Nat Immunol, 2011. 12(4): p. 320-6. 174. Shaw, M.H., et al., Microbiota-induced IL-1beta, but not IL-6, is critical for the development of steady-state TH17 cells in the intestine. J Exp Med, 2012. 209(2): p. 251- 8. 175. Gassler, N., et al., Inflammatory bowel disease is associated with changes of enterocytic junctions. Am J Physiol Gastrointest Liver Physiol, 2001. 281(1): p. G216-28. 176. Zeissig, S., et al., Changes in expression and distribution of claudin 2, 5 and 8 lead to discontinuous tight junctions and barrier dysfunction in active Crohn's disease. Gut, 2007. 56(1): p. 61-72. 177. Kaser, A., et al., XBP1 links ER stress to intestinal inflammation and confers genetic risk for human inflammatory bowel disease. Cell, 2008. 134(5): p. 743-56. 178. Heazlewood, C.K., et al., Aberrant mucin assembly in mice causes endoplasmic reticulum stress and spontaneous inflammation resembling ulcerative colitis. PLoS Med, 2008. 5(3): p. e54. 179. 
Salzman, N.H., et al., Enteric defensins are essential regulators of intestinal microbial ecology. Nat Immunol, 2010. 11(1): p. 76-83. 180. Garrett, W.S., et al., Communicable ulcerative colitis induced by T-bet deficiency in the innate immune system. Cell, 2007. 131(1): p. 33-45. 181. Willing, B.P., et al., A pyrosequencing study in twins shows that gastrointestinal microbial profiles vary with inflammatory bowel disease phenotypes. Gastroenterology, 2010. 139(6): p. 1844-1854 e1. 182. Martinez-Medina, M., et al., Abnormal microbiota composition in the ileocolonic mucosa of Crohn's disease patients as revealed by polymerase chain reaction-denaturing gradient gel electrophoresis. Inflamm Bowel Dis, 2006. 12(12): p. 1136-45. 183. Png, C.W., et al., Mucolytic bacteria with increased prevalence in IBD mucosa augment in vitro utilization of mucin by other bacteria. Am J Gastroenterol, 2010. 105(11): p. 2420- 8. 184. Swidsinski, A., et al., Comparative study of the intestinal mucus barrier in normal and inflamed colon. Gut, 2007. 56(3): p. 343-50. 185. Andoh, A., et al., Terminal restriction fragment length polymorphism analysis of the diversity of fecal microbiota in patients with ulcerative colitis. Inflamm Bowel Dis, 2007. 13(8): p. 955-62. 186. Mondot, S., et al., Highlighting new phylogenetic specificities of Crohn's disease microbiota. Inflamm Bowel Dis, 2011. 17(1): p. 185-92.

264

187. Darfeuille-Michaud, A., et al., Presence of adherent Escherichia coli strains in ileal mucosa of patients with Crohn's disease. Gastroenterology, 1998. 115(6): p. 1405-13. 188. Boudeau, J., et al., Invasive ability of an Escherichia coli strain isolated from the ileal mucosa of a patient with Crohn's disease. Infect Immun, 1999. 67(9): p. 4499-509. 189. Lapaquette, P., et al., Crohn's disease-associated adherent-invasive E. coli are selectively favoured by impaired autophagy to replicate intracellularly. Cell Microbiol, 2010. 12(1): p. 99-113. 190. Brest, P., et al., A synonymous variant in IRGM alters a for miR-196 and causes deregulation of IRGM-dependent xenophagy in Crohn's disease. Nat Genet, 2011. 43(3): p. 242-5. 191. Barnich, N., et al., CEACAM6 acts as a receptor for adherent-invasive E. coli, supporting ileal mucosa colonization in Crohn disease. J Clin Invest, 2007. 117(6): p. 1566-74. 192. Hotamisligil, G.S. and E. Erbay, Nutrient sensing and inflammation in metabolic diseases. Nat Rev Immunol, 2008. 8(12): p. 923-34. 193. Hotamisligil, G.S., Inflammation and metabolic disorders. Nature, 2006. 444(7121): p. 860-7. 194. Yoshimoto, S., et al., Obesity-induced gut microbial metabolite promotes liver cancer through senescence secretome. Nature, 2013. 499(7456): p. 97-101. 195. Cani, P.D., et al., Metabolic endotoxemia initiates obesity and insulin resistance. Diabetes, 2007. 56(7): p. 1761-72. 196. Hotamisligil, G.S., N.S. Shargill, and B.M. Spiegelman, Adipose expression of tumor necrosis factor-alpha: direct role in obesity-linked insulin resistance. Science, 1993. 259(5091): p. 87-91. 197. Green, A., et al., Tumor necrosis factor increases the rate of lipolysis in primary cultures of adipocytes without altering levels of hormone-sensitive lipase. Endocrinology, 1994. 134(6): p. 2581-8. 198. Xu, H., et al., Chronic inflammation in fat plays a crucial role in the development of obesity-related insulin resistance. J Clin Invest, 2003. 112(12): p. 
1821-30. 199. Larsen, N., et al., Gut microbiota in human adults with type 2 diabetes differs from non- diabetic adults. PLoS One, 2010. 5(2): p. e9085. 200. Gibson, G.R., et al., Selective stimulation of bifidobacteria in the human colon by oligofructose and inulin. Gastroenterology, 1995. 108(4): p. 975-82. 201. Kadooka, Y., et al., Regulation of abdominal adiposity by probiotics (Lactobacillus gasseri SBT2055) in adults with obese tendencies in a randomized controlled trial. Eur J Clin Nutr, 2010. 64(6): p. 636-43. 202. Bibiloni, R., et al., VSL#3 probiotic-mixture induces remission in patients with active ulcerative colitis. Am J Gastroenterol, 2005. 100(7): p. 1539-46. 203. Martin, F.P., et al., Probiotic modulation of symbiotic gut microbial-host metabolic interactions in a humanized microbiome mouse model. Mol Syst Biol, 2008. 4: p. 157. 204. Kovatcheva-Datchary, P. and T. Arora, Nutrition, the gut microbiome and the metabolic syndrome. Best Pract Res Clin Gastroenterol, 2013. 27(1): p. 59-72. 205. Lee, H.Y., et al., Human originated bacteria, Lactobacillus rhamnosus PL60, produce conjugated linoleic acid and show anti-obesity effects in diet-induced obese mice. Biochim Biophys Acta, 2006. 1761(7): p. 736-44. 206. Lee, K., et al., Antiobesity effect of trans-10,cis-12-conjugated linoleic acid-producing Lactobacillus plantarum PL62 on diet-induced obese mice. J Appl Microbiol, 2007. 103(4): p. 1140-6.

265

207. Jirillo, E., F. Jirillo, and T. Magrone, Healthy effects exerted by prebiotics, probiotics, and symbiotics with special reference to their impact on the immune system. Int J Vitam Nutr Res, 2012. 82(3): p. 200-8. 208. Toward, R., et al., Effect of prebiotics on the human gut microbiota of elderly persons. Gut Microbes, 2012. 3(1): p. 57-60. 209. Walton, G.E., et al., A randomised crossover study investigating the effects of galacto- oligosaccharides on the faecal microbiota in men and women over 50 years of age. Br J Nutr, 2012. 107(10): p. 1466-75. 210. Kleessen, B., et al., Effects of inulin and lactose on fecal microflora, microbial activity, and bowel habit in elderly constipated persons. Am J Clin Nutr, 1997. 65(5): p. 1397-402. 211. Bartosch, S., et al., Microbiological effects of consuming a synbiotic containing Bifidobacterium bifidum, Bifidobacterium lactis, and oligofructose in elderly persons, determined by real-time polymerase chain reaction and counting of viable bacteria. Clin Infect Dis, 2005. 40(1): p. 28-37. 212. Osman, N., et al., Bifidobacterium infantis strains with and without a combination of oligofructose and inulin (OFI) attenuate inflammation in DSS-induced colitis in rats. BMC Gastroenterol, 2006. 6: p. 31. 213. Buddington, K.K., J.B. Donahoo, and R.K. Buddington, Dietary oligofructose and inulin protect mice from enteric and systemic pathogens and tumor inducers. J Nutr, 2002. 132(3): p. 472-7. 214. Lewis, S., S. Burmeister, and J. Brazier, Effect of the prebiotic oligofructose on relapse of Clostridium difficile-associated diarrhea: a randomized, controlled study. Clin Gastroenterol Hepatol, 2005. 3(5): p. 442-8. 215. Santacruz, A., et al., Gut microbiota composition is associated with body weight, weight gain and biochemical parameters in pregnant women. Br J Nutr, 2010. 104(1): p. 83-92. 216. Karlsson, C.L., et al., The microbiota of the gut in preschool children with normal and excessive body weight. 
Obesity (Silver Spring), 2012. 20(11): p. 2257-61. 217. Everard, A., et al., Cross-talk between Akkermansia muciniphila and intestinal epithelium controls diet-induced obesity. Proc Natl Acad Sci U S A, 2013. 110(22): p. 9066-71. 218. Everard, A., et al., Responses of gut microbiota and glucose and lipid metabolism to prebiotics in genetic obese and diet-induced leptin-resistant mice. Diabetes, 2011. 60(11): p. 2775-86. 219. Aroniadis, O.C. and L.J. Brandt, Fecal microbiota transplantation: past, present and future. Curr Opin Gastroenterol, 2013. 29(1): p. 79-84. 220. Brandt, L.J., et al., Long-term follow-up of colonoscopic fecal microbiota transplant for recurrent Clostridium difficile infection. Am J Gastroenterol, 2012. 107(7): p. 1079-87. 221. Turnbaugh, P.J., et al., An obesity-associated gut microbiome with increased capacity for energy harvest. Nature, 2006. 444(7122): p. 1027-31. 222. Vrieze, A., et al., Transfer of intestinal microbiota from lean donors increases insulin sensitivity in individuals with metabolic syndrome. Gastroenterology, 2012. 143(4): p. 913-6 e7. 223. Bercik, P., S.M. Collins, and E.F. Verdu, Microbes and the gut-brain axis. Neurogastroenterol Motil, 2012. 24(5): p. 405-13. 224. Mayer, E.A., Gut feelings: the emerging biology of gut-brain communication. Nat Rev Neurosci, 2011. 12(8): p. 453-66. 225. Yu, J.H. and M.S. Kim, Molecular mechanisms of appetite regulation. Diabetes Metab J, 2012. 36(6): p. 391-8.

266

226. Xu, B., et al., Brain-derived neurotrophic factor regulates energy balance downstream of melanocortin-4 receptor. Nat Neurosci, 2003. 6(7): p. 736-42. 227. Iguchi, A., P.D. Burleson, and A.J. Szabo, Decrease in plasma glucose concentration after microinjection of insulin into VMN. Am J Physiol, 1981. 240(2): p. E95-100. 228. Taniguchi, C.M., B. Emanuelli, and C.R. Kahn, Critical nodes in signalling pathways: insights into insulin action. Nat Rev Mol Cell Biol, 2006. 7(2): p. 85-96. 229. Schwartz, G.J., The role of gastrointestinal vagal afferents in the control of food intake: current prospects. Nutrition, 2000. 16(10): p. 866-73. 230. Fields, H.L., et al., Ventral tegmental area neurons in learned appetitive behavior and positive reinforcement. Annu Rev Neurosci, 2007. 30: p. 289-316. 231. Ahlman, H. and Nilsson, The gut as the largest endocrine organ in the body. Ann Oncol, 2001. 12 Suppl 2: p. S63-8. 232. de Lartigue, G., C.B. de La Serre, and H.E. Raybould, Vagal afferent neurons in high fat diet-induced obesity; intestinal microflora, gut inflammation and cholecystokinin. Physiol Behav, 2011. 105(1): p. 100-5. 233. Dham, S. and M.A. Banerji, The brain-gut axis in regulation of appetite and obesity. Pediatr Endocrinol Rev, 2006. 3 Suppl 4: p. 544-54. 234. Paulino, G., et al., Increased expression of receptors for orexigenic factors in nodose ganglion of diet-induced obese rats. Am J Physiol Endocrinol Metab, 2009. 296(4): p. E898-903. 235. Volkow, N.D., G.J. Wang, and R.D. Baler, Reward, dopamine and the control of food intake: implications for obesity. Trends Cogn Sci, 2011. 15(1): p. 37-46. 236. Schele, E., et al., The gut microbiota reduces leptin sensitivity and the expression of the obesity suppressing neuropeptides proglucagon (Gcg) and brain-derived neurotrophic factor (Bdnf) in the central nervous system. Endocrinology, 2013. 237. Di Marzo, V., et al., Leptin-regulated endocannabinoids are involved in maintaining food intake. Nature, 2001. 
410(6830): p. 822-5. 238. Gonzalez, J.A., F. Reimann, and D. Burdakov, Dissociation between sensing and metabolism of glucose in sugar sensing neurones. J Physiol, 2009. 587(Pt 1): p. 41-8. 239. Obici, S., et al., Central administration of oleic acid inhibits glucose production and food intake. Diabetes, 2002. 51(2): p. 271-5. 240. Cota, D., et al., Hypothalamic mTOR signaling regulates food intake. Science, 2006. 312(5775): p. 927-30. 241. Kim, G.W., J.E. Lin, and S.A. Waldman, GUCY2C: at the intersection of obesity and cancer. Trends Endocrinol Metab, 2013. 24(4): p. 165-73. 242. Tsai, F. and W.J. Coyle, The microbiome and obesity: is obesity linked to our gut flora? Curr Gastroenterol Rep, 2009. 11(4): p. 307-13. 243. Ley, R.E., et al., Obesity alters gut microbial ecology. Proc Natl Acad Sci U S A, 2005. 102(31): p. 11070-5. 244. Hamer, H.M., et al., Review article: the role of butyrate on colonic function. Aliment Pharmacol Ther, 2008. 27(2): p. 104-19. 245. Usami, M., et al., Butyrate and trichostatin A attenuate nuclear factor kappaB activation and tumor necrosis factor alpha secretion and increase prostaglandin E2 secretion in human peripheral blood mononuclear cells. Nutr Res, 2008. 28(5): p. 321-8. 246. Robinson, C.J. and V.B. Young, Antibiotic administration alters the community structure of the gastrointestinal micobiota. Gut Microbes, 2010. 1(4): p. 279-284. 247. Brandl, K., et al., Vancomycin-resistant enterococci exploit antibiotic-induced innate immune deficits. Nature, 2008. 455(7214): p. 804-7.

267

248. Dessein, R., et al., Toll-like receptor 2 is critical for induction of Reg3 beta expression and intestinal clearance of Yersinia pseudotuberculosis. Gut, 2009. 58(6): p. 771-6. 249. Atarashi, K., et al., ATP drives lamina propria T(H)17 cell differentiation. Nature, 2008. 455(7214): p. 808-12. 250. Mazmanian, S.K., J.L. Round, and D.L. Kasper, A microbial symbiosis factor prevents intestinal inflammatory disease. Nature, 2008. 453(7195): p. 620-5. 251. Meyer-Hoffert, U., et al., Secreted enteric antimicrobial activity localises to the mucus surface layer. Gut, 2008. 57(6): p. 764-71. 252. Schumann, A., et al., Neonatal antibiotic treatment alters gastrointestinal tract developmental gene expression and intestinal barrier transcriptome. Physiol Genomics, 2005. 23(2): p. 235-45. 253. Ivanov, II, et al., Specific microbiota direct the differentiation of IL-17-producing T-helper cells in the mucosa of the small intestine. Cell Host Microbe, 2008. 4(4): p. 337-49. 254. Ganal, S.C., et al., Priming of natural killer cells by nonmucosal mononuclear phagocytes requires instructive signals from commensal microbiota. Immunity, 2012. 37(1): p. 171- 86. 255. Hall, J.A., et al., Commensal DNA limits regulatory T cell conversion and is a natural adjuvant of intestinal immune responses. Immunity, 2008. 29(4): p. 637-49. 256. Dufour, V., et al., Effects of a short-course of amoxicillin/clavulanic acid on systemic and mucosal immunity in healthy adult humans. Int Immunopharmacol, 2005. 5(5): p. 917- 28. 257. Bouskra, D., et al., Lymphoid tissue genesis induced by commensals through NOD1 regulates intestinal homeostasis. Nature, 2008. 456(7221): p. 507-10. 258. Fagarasan, S., et al., Critical roles of activation-induced in the homeostasis of gut flora. Science, 2002. 298(5597): p. 1424-7. 259. Magnan, C., et al., Lipid infusion lowers sympathetic nervous activity and leads to increased beta-cell responsiveness to glucose. J Clin Invest, 1999. 103(3): p. 413-9. 260. 
Cani, P.D., et al., Selective increases of bifidobacteria in gut microflora improve high-fat- diet-induced diabetes in mice through a mechanism associated with endotoxaemia. Diabetologia, 2007. 50(11): p. 2374-83. 261. Wang, Z., et al., The role of bifidobacteria in gut barrier function after thermal injury in rats. J Trauma, 2006. 61(3): p. 650-7. 262. Ewaschuk, J.B., et al., Secreted bioactive factors from Bifidobacterium infantis enhance epithelial cell barrier function. Am J Physiol Gastrointest Liver Physiol, 2008. 295(5): p. G1025-34. 263. Cani, P.D., et al., Changes in gut microbiota control inflammation in obese mice through a mechanism involving GLP-2-driven improvement of gut permeability. Gut, 2009. 58(8): p. 1091-103. 264. de La Serre, C.B., et al., Propensity to high-fat diet-induced obesity in rats is associated with changes in the gut microbiota and gut inflammation. Am J Physiol Gastrointest Liver Physiol, 2010. 299(2): p. G440-8. 265. Hildebrandt, M.A., et al., High-fat diet determines the composition of the murine gut microbiome independently of obesity. Gastroenterology, 2009. 137(5): p. 1716-24 e1-2. 266. Backhed, F., et al., Mechanisms underlying the resistance to diet-induced obesity in germ-free mice. Proc Natl Acad Sci U S A, 2007. 104(3): p. 979-84. 267. Wu, G.D., et al., Linking long-term dietary patterns with gut microbial enterotypes. Science, 2011. 334(6052): p. 105-8.

268

268. Turnbaugh, P.J., et al., Diet-induced obesity is linked to marked but reversible alterations in the mouse distal gut microbiome. Cell Host Microbe, 2008. 3(4): p. 213-23. 269. Armougom, F., et al., Monitoring bacterial community of human gut microbiota reveals an increase in Lactobacillus in obese patients and Methanogens in anorexic patients. PLoS One, 2009. 4(9): p. e7125. 270. Schwiertz, A., et al., Microbiota and SCFA in lean and overweight healthy subjects. Obesity (Silver Spring), 2010. 18(1): p. 190-5. 271. Turnbaugh, P.J. and J.I. Gordon, The core gut microbiome, energy balance and obesity. J Physiol, 2009. 587(Pt 17): p. 4153-8. 272. Greenblum, S., P.J. Turnbaugh, and E. Borenstein, Metagenomic systems biology of the human gut microbiome reveals topological shifts associated with obesity and inflammatory bowel disease. Proc Natl Acad Sci U S A, 2012. 109(2): p. 594-9. 273. Wang, Y., et al., Reversed-phase chromatography with multiple fraction concatenation strategy for proteome profiling of human MCF10A cells. Proteomics, 2011. 11(10): p. 2019-26. 274. Adkins, J.N., et al., Analysis of the Salmonella typhimurium proteome through environmental response toward infectious conditions. Mol Cell Proteomics, 2006. 5(8): p. 1450-61. 275. Shen, Y., et al., High-throughput proteomics using high-efficiency multiple-capillary liquid chromatography with on-line high-performance ESI FTICR mass spectrometry. Anal Chem, 2001. 73(13): p. 3011-21. 276. Eng, J.K., A.L. McCormack, and J.R. Yates III, An Approach to Correlate Tandem Mass Spectral Data of Peptides with Amino Acid Sequences in a Protein Database. J Am Soc Mass Spectrom, 1994. 5(11): p. 976-989. 277. Kim, S., N. Gupta, and P.A. Pevzner, Spectral probabilities and generating functions of tandem mass spectra: a strike against decoy databases. J Proteome Res, 2008. 7(8): p. 3354-63. 278. 
Peng, J., et al., Evaluation of multidimensional chromatography coupled with tandem mass spectrometry (LC/LC-MS/MS) for large-scale protein analysis: the yeast proteome. J Proteome Res, 2003. 2(1): p. 43-50. 279. Goll, J., et al., METAREP: JCVI metagenomics reports--an open source tool for high- performance comparative metagenomics. Bioinformatics, 2010. 26(20): p. 2631-2. 280. Ogata, H., et al., KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res, 1999. 27(1): p. 29-34. 281. Peterson, D.G., et al., Integration of Cot analysis, DNA cloning, and high-throughput sequencing facilitates genome characterization and gene discovery. Genome Res, 2002. 12(5): p. 795-807. 282. Miller, J.R., et al., Aggressive assembly of pyrosequencing reads with mates. Bioinformatics, 2008. 24(24): p. 2818-24. 283. Tanenbaum, D.M., et al., The JCVI standard operating procedure for annotating prokaryotic metagenomic shotgun sequencing data. Stand Genomic Sci, 2010. 2(2): p. 229-37. 284. Goll, J., et al., A case study for large-scale human microbiome analysis using JCVI's metagenomics reports (METAREP). PLoS One, 2012. 7(6): p. e29044. 285. Saeed, A.I., et al., TM4: a free, open-source system for microarray data management and analysis. Biotechniques, 2003. 34(2): p. 374-8. 286. Patil, D.P., et al., Molecular analysis of gut microbiota in obesity among Indian individuals. J Biosci, 2012. 37(4): p. 647-57.

269

287. Million, M., et al., Obesity-associated gut microbiota is enriched in Lactobacillus reuteri and depleted in Bifidobacterium animalis and Methanobrevibacter smithii. Int J Obes (Lond), 2012. 36(6): p. 817-25. 288. Derrien, M., et al., Akkermansia muciniphila gen. nov., sp. nov., a human intestinal mucin-degrading bacterium. Int J Syst Evol Microbiol, 2004. 54(Pt 5): p. 1469-76. 289. Deatherage Kaiser, B.L., et al., A Multi-Omic View of Host-Pathogen-Commensal Interplay in -Mediated Intestinal Infection. PLoS One, 2013. 8(6): p. e67155. 290. Song, Y., et al., Alistipes onderdonkii sp. nov. and Alistipes shahii sp. nov., of human origin. Int J Syst Evol Microbiol, 2006. 56(Pt 8): p. 1985-90. 291. Robert, C., et al., Bacteroides cellulosilyticus sp. nov., a cellulolytic bacterium from the human gut microbial community. Int J Syst Evol Microbiol, 2007. 57(Pt 7): p. 1516-20. 292. Vereecke, L., R. Beyaert, and G. van Loo, Enterocyte death and intestinal barrier maintenance in homeostasis and disease. Trends Mol Med, 2011. 17(10): p. 584-93. 293. Cho, K.H. and A.A. Salyers, Biochemical analysis of interactions between outer membrane proteins that contribute to starch utilization by Bacteroides thetaiotaomicron. J Bacteriol, 2001. 183(24): p. 7224-30. 294. Wei, B., et al., Molecular cloning of a Bacteroides caccae TonB-linked outer membrane protein identified by an inflammatory bowel disease marker antibody. Infect Immun, 2001. 69(10): p. 6044-54. 295. Haft, D.H., J.D. Selengut, and O. White, The TIGRFAMs database of protein families. Nucleic Acids Res, 2003. 31(1): p. 371-3. 296. Babu, M., et al., A dual function of the CRISPR-Cas system in bacterial antivirus immunity and DNA repair. Mol Microbiol, 2011. 79(2): p. 484-502. 297. Swierczynski, J., et al., Enhanced glycerol 3-phosphate dehydrogenase activity in adipose tissue of obese humans. Mol Cell Biochem, 2003. 254(1-2): p. 55-9. 298. 
Sledzinski, T., et al., Association between cytosolic glycerol 3-phosphate dehydrogenase gene expression in human subcutaneous adipose tissue and BMI. Cell Physiol Biochem, 2013. 32(2): p. 300-9. 299. Peri, K.G., H. Goldie, and E.B. Waygood, Cloning and characterization of the N- acetylglucosamine operon of Escherichia coli. Biochem Cell Biol, 1990. 68(1): p. 123-37. 300. Kim, M., et al., Need-based activation of ammonium uptake in Escherichia coli. Mol Syst Biol, 2012. 8: p. 616. 301. Scribner, H., E. Eisenstadt, and S. Silver, Magnesium transport in Bacillus subtilis W23 during growth and sporulation. J Bacteriol, 1974. 117(3): p. 1224-30. 302. Pathak, D.T., et al., Cell contact-dependent outer membrane exchange in myxobacteria: genetic determinants and mechanism. PLoS Genet, 2012. 8(4): p. e1002626. 303. Traxler, M.F., et al., The global, ppGpp-mediated stringent response to amino acid starvation in Escherichia coli. Mol Microbiol, 2008. 68(5): p. 1128-48. 304. Kalapos, M.P., Methylglyoxal in living organisms: chemistry, biochemistry, toxicology and biological implications. Toxicol Lett, 1999. 110(3): p. 145-75. 305. Gao, J., et al., The constitutive androstane receptor is an anti-obesity nuclear receptor that improves insulin sensitivity. J Biol Chem, 2009. 284(38): p. 25984-92. 306. Huwyler, J., et al., Induction of cytochrome P450 3A4 and P-glycoprotein by the isoxazolyl-penicillin antibiotic flucloxacillin. Curr Drug Metab, 2006. 7(2): p. 119-26. 307. Kroetz, D.L. and D.C. Zeldin, Cytochrome P450 pathways of arachidonic acid metabolism. Curr Opin Lipidol, 2002. 13(3): p. 273-83. 308. Asano, N., Glycosidase inhibitors: update and perspectives on practical use. Glycobiology, 2003. 13(10): p. 93R-104R.

270

309. Nelson, K.E., et al., Complete genome sequence of the oral pathogenic Bacterium porphyromonas gingivalis strain W83. J Bacteriol, 2003. 185(18): p. 5591-601. 310. Cummings, J.H. and G.T. Macfarlane, The control and consequences of bacterial fermentation in the human colon. J Appl Bacteriol, 1991. 70(6): p. 443-59. 311. Anton, R., et al., Hepoxilin B3 and its enzymatically formed derivative trioxilin B3 are incorporated into phospholipids in psoriatic lesions. J Invest Dermatol, 2002. 118(1): p. 139-46. 312. Mrsny, R.J., et al., Identification of hepoxilin A3 in inflammatory events: a required role in neutrophil migration across intestinal epithelia. Proc Natl Acad Sci U S A, 2004. 101(19): p. 7421-6. 313. Denis, D., et al., Synthesis and biological activities of leukotriene F4 and leukotriene F4 sulfone. Prostaglandins, 1982. 24(6): p. 801-14. 314. Gauffin Cano, P., et al., Bacteroides uniformis CECT 7771 ameliorates metabolic and immunological dysfunction in mice with high-fat-diet induced obesity. PLoS One, 2012. 7(7): p. e41079. 315. Ross, S.J., et al., Thioredoxin peroxidase is required for the transcriptional response to oxidative stress in budding yeast. Mol Biol Cell, 2000. 11(8): p. 2631-42. 316. Kopf, M.A., et al., Key role of alkanoic acids on the spectral properties, activity, and active-site stability of iron-containing nitrile hydratase from Brevibacterium R312. Eur J Biochem, 1996. 240(1): p. 239-44. 317. Franca, L.M., et al., Mechanisms underlying hypertriglyceridemia in rats with monosodium l-glutamate-induced obesity: Evidence of XBP-1/PDI/MTP axis activation. Biochem Biophys Res Commun, 2014. 443(2): p. 725-30. 318. Xu, Y., et al., Glutamate release mediates leptin action on energy expenditure. Mol Metab, 2013. 2(2): p. 109-15. 319. Krummenacker, M., et al., Querying and computing with BioCyc databases. Bioinformatics, 2005. 21(16): p. 3454-5. 320. 
Ridaura, V.K., et al., Gut microbiota from twins discordant for obesity modulate metabolism in mice. Science, 2013. 341(6150): p. 1241214. 321. Wang, X., et al., Differential expression of liver proteins between obesity-prone and obesity-resistant rats in response to a high-fat diet. Br J Nutr, 2011. 106(4): p. 612-26. 322. Choi, D.K., et al., Gender difference in proteome of brown adipose tissues between male and female rats exposed to a high fat diet. Cell Physiol Biochem, 2011. 28(5): p. 933-48. 323. Mukherjee, R., et al., Sex-dependent expression of caveolin 1 in response to sex steroid hormones is closely associated with development of obesity in rats. PLoS One, 2014. 9(3): p. e90918. 324. Fernandez-Real, J.M., et al., Study of caveolin-1 gene expression in whole adipose tissue and its subfractions and during differentiation of human adipocytes. Nutr Metab (Lond), 2010. 7: p. 20. 325. Joo, J.I. and J.W. Yun, Gene expression profiling of adipose tissues in obesity susceptible and resistant rats under a high fat diet. Cell Physiol Biochem, 2011. 27(3-4): p. 327-40. 326. Choi, J.W., et al., Plasma proteome analysis in diet-induced obesity-prone and obesity- resistant rats. Proteomics, 2010. 10(24): p. 4386-400. 327. Faulds, M.H., et al., The diversity of sex steroid action: regulation of metabolism by estrogen signaling. J Endocrinol, 2012. 212(1): p. 3-12. 328. Bryzgalova, G., et al., Mechanisms of antidiabetogenic and body weight-lowering effects of estrogen in high-fat diet-fed mice. Am J Physiol Endocrinol Metab, 2008. 295(4): p. E904-12.

271

Appendix A

2 mice per group (male control, male STAT)

Protein isolation

Trypsin digestion

2D (high pH - low pH) RP-RP liquid chromatography

MS/MS (m/z)

SEQUEST mapping of peptides onto metagenomic DB

Taxonomic and functional analysis in MetaRep
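
The trypsin digestion and MS/MS (m/z) steps above can be illustrated in silico. The following is a minimal sketch, not the search configuration used in this work: the function names `trypsin_digest` and `peptide_mz` and the example sequence are hypothetical, and the cleavage rule (cut after K or R unless the next residue is P, with no missed cleavages or modifications) is a simplifying assumption.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # mass of H2O added to the residue sum of a free peptide
PROTON = 1.00728   # mass of one proton, added once per positive charge


def trypsin_digest(protein):
    """Split a protein sequence after K or R, except when followed by P.

    Assumption: complete digestion, no missed cleavages.
    """
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        # Keep the C-terminal peptide that does not end in K or R.
        peptides.append(protein[start:])
    return peptides


def peptide_mz(peptide, charge=1):
    """Return the monoisotopic m/z of an unmodified peptide ion."""
    neutral_mass = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (neutral_mass + charge * PROTON) / charge


if __name__ == "__main__":
    # Hypothetical sequence, for illustration only.
    for pep in trypsin_digest("MKRPAGKLLR"):
        print(pep, round(peptide_mz(pep, charge=2), 4))
```

A database search tool such as SEQUEST compares observed precursor and fragment m/z values against theoretical values of this kind computed for every candidate peptide in the database; the sketch covers only the precursor calculation.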


Appendix B

10 mice/group
• male and female control and STAT

Genomic DNA isolation

RAKE normalization
• 10- and 100-fold normalized

1 lane Illumina HiSeq 2000 sequencing
• male/female control (2 indiv lanes)
• male/female STAT (2 indiv lanes)
• 10-fold normalized mixture (2 replicate lanes)
• 100-fold normalized mixture (2 replicate lanes)

Functional and phylogenomic comparisons (MetaRep)

Metagenomic annotation and assembly of normalized DNA sequence

Annotation with JCVI Metagenomics Automated Pipeline

Mapping of metaproteomic data to annotated assembly
