<<

University of Connecticut OpenCommons@UConn

Doctoral Dissertations University of Connecticut Graduate School

4-1-2019 Exploratory Studies of and Plasticity in Mice and Humans Xin Zhou University of Connecticut - Storrs, [email protected]

Follow this and additional works at: https://opencommons.uconn.edu/dissertations

Recommended Citation Zhou, Xin, "Exploratory Studies of Microbiome Dysbiosis and Plasticity in Mice and Humans" (2019). Doctoral Dissertations. 2083. https://opencommons.uconn.edu/dissertations/2083

Exploratory Studies of Microbiome Dysbiosis and Plasticity in Mice and Humans Xin Zhou, PhD University of Connecticut, 2019

Abstract

The human body is cohabitated with large number of , which potentially influence the health conditions of the host. Dynamic interactions between microbiome and host are critical for this cohabitation relationship. Among human population, the composition of microbiome shows clear interpersonal variations and intrapersonal plasticity, with patterns and mechanisms that are currently unclear. In the first part of this thesis, we explored the variations in relation to healthy aging, aimed to understand if microbiome imbalance, or dysbiosis, is a common phenotype of older adults regardless of the age-related diseases. In the second part, we expanded our investigation on the microbiome plasticity to individuals with prediabetes condition, which is more common in older adults. We confirmed that microbiome composition in healthy adults was not significantly different based on their chronological age but were dramatically altered among some individuals at risks metabolic diseases. More importantly, we identified a concomitant change of serum cytokines and the gut microbiome among individuals, which imply a causational relationship between microbiome, immune system and the metabolic diseases. This thesis explored the microbiome variation in both healthy individuals and patients with prediabetes, demonstrated the development of dysbiosis under the risk for metabolic diseases, and potentially revealed a new microbiome-related etiology for metabolic diseases.

Exploratory Studies of Microbiome Dysbiosis and Plasticity in Mice and Humans

Xin Zhou

B.S., Fudan University, 2011

A Dissertation

Submitted in Partial Fulfillment of the

Requirements for the Degree of

Doctor of Philosophy

at the

University of Connecticut

2019

i

Copyright by

Xin Zhou

2019

ii

APPROVAL PAGE

Doctor of Philosophy Dissertation

Exploratory Studies of Microbiome Dysbiosis and Plasticity in Mice and Humans

Presented by Xin Zhou, B.S

Major Advisor ______George Weinstock

Major Advisor ______George Kuchel

Associate Advisor ______Julia Oh

Associate Advisor ______Derya Unutmaz

Associate Advisor ______Blanka Rogina

Associate Advisor ______Laura Haynes

University of Connecticut 2019

iii

Acknowledgement

To me, the past six year of my life as a Ph.D student is a very challenging journey, but I am glad my story has a happy ending. This experience made me become a better, more mature, and older person. I have studied in a field which I didn’t know that exists, met many people that I never thought that I never thought to work with, and gained experience that I never expected before I joined the graduate school.

This is a valuable experience which I will remember for many years.

I am a quite lucky person, which is very important when one is in graduate school. First of all, I would like to thank my parents. The physical distance between me and them is over 7000 miles, but their blessings are with me all the time. They fully supported my idea to study in the United States, and I am grateful for it. Also, I am very lucky that I have my wife Xin with me. We met in Cleveland, and having her around always encouraged me to continue. I could not accomplish this degree without my wife and family’s support.

I also would like to thank my thesis mentors, George Weinstock and George Kuchel. They not only share the first name, but also share the endless passion and strong knowledge about science. Mentor means a lot in Chinese culture, and it is a tremendous honor for me to work with them. With them I have learned not only how to compete scientific projects, but also how to face challenges from life. Meanwhile, I want to thank my committee members, Derya Unutmaz, Julia Oh, Laura Haynes, Blanka Rogina, who had been very helpful during the past years. Despite their busy schedule, they always found time to discuss science with me, and provided input to my projects.

iv

At the very beginning of my graduate school, I know little about aging, and nothing about bioinformatics.

Without the help from members of Weinstock and Kuchel lab, I will not be able to survive. I want to thank all members of the two labs. There are several members of the two labs who worked with me closely, and I am sincerely grateful for it. I have learned the basics of bioinformatics from Lei and Yanfei, and advanced bioinformatics from Jethro and Dan. Also, Hoan is my good neighbor in the lab and my great bioinformatics teacher. The Dorsett family: Yanjiao and Yair are my colleagues and family friends, we always have party at their place. Wei and Anita are from the Oh Lab, and the they always help me and cheer me up.

This list can be very long, but I will put a stop here. As everybody told me after the thesis defense, getting a Ph.D. is just a beginning. I am looking forward to have more challenges, and are prepared for my next journey.

v

Contents Chapter 1: General Introduction ...... 1 The human microbiome ...... 1 Methods to study the microbiome ...... 5 Culturing approaches ...... 5 Sequencing approaches ...... 6 Analysis of Microbiome data ...... 8 Statistical Analysis of Richness and Diversity ...... 8 Statistical Analysis Based on Distance Matrix ...... 12 Statistical Analysis based on Principal Component Analysis and Principal Coordinate Analysis ...... 12 Statistical Analysis based on Linear Discriminant Analysis (LDA) ...... 13 Statistical Analysis based on Machine Learning ...... 13 Animal Models for Microbiome Research ...... 16 Current Datasets for Human Microbiome Research ...... 18 The (HMP):...... 18 The Metagenomics of the Human Intestinal Tract (MetaHIT) ...... 19 The 500 Functional Genomics Project (500FGP) ...... 20 Thesis Objective ...... 20 Chapter 2 ...... 22 Background ...... 22 Results ...... 26 Description of study design ...... 26 Metagenomic Sequencing data Overview ...... 27 Gut microbiome richness and diversity between groups ...... 30 taxonomic signatures in young and old subjects ...... 33 ...... 42 ...... 43 Discussion ...... 44 Chapter 3 ...... 48

vi

Background ...... 48 Result ...... 53 Description of iHMP Cohort ...... 53 Within-individual cytokine variability identifies discrete groups of Th17 activity among iHMP subjects ...... 53 Th17 HA individuals display milder metabolic syndrome related phenotype...... 58 Th17 HA individuals have a gut microbiome characterized by a high level of Clostridia ...... 63 ...... 69 Discussion ...... 70 Chapter 04...... 72 Conclusions and Future Direction ...... 72 Appendix I: Methods ...... 84 Sample Collection ...... 84 DNA Sequencing ...... 84 Sequencing Data Process for Chapter 2 ...... 85 Sequencing Data Curation for Chapter 3 ...... 85 Statistical Analysis ...... 86 Data Modelling: Mixture model of Th17 group ...... 86 Data Modelling: Linear Mixed Model ...... 87 Bayesian Mixed-Effects Model for Taxa and Cytokine interactions conditional on cluster assignment ...... 88 Appendix II: Impact of Age, Caloric Restriction, and Influenza Infection on Mouse Gut Microbiome: An Exploratory Study of the Role of Age-Related Microbiome Changes on Influenza Responses ...... 90 Abstract ...... 90 Introduction ...... 91 Materials and Methods ...... 95 Mice ...... 95 Viral Infection and Analysis ...... 95 Statistical Analysis ...... 96 RESULTS ...... 97 Effect of Age and CR on the Response to Influenza Infection ...... 97

vii

Effect of Age and CR on the Gut Microbiome during Influenza Infection ...... 100 ...... 104 DISCUSSION ...... 105 ETHICS STATEMENT ...... 108 AUTHOR CONTRIBUTIONS ...... 108 FUNDING...... 108 Reference ...... 109

viii

List of figures

Figure 1.1. Example of Microbiome data…….………………………………………………………….....9 Figure 1.2. Equations of Diversity Index…………………………………………………………….……11 Figure 1.3. Random Forest………………………………………………………………………….……..15 Figure 2.1. Overview of Microbiome Community……………………………………………………..…29 Figure 2.2. Ecology Richness and Diversity of Microbiome……………………………………………...32 Figure 2.3. Taxa modulated by age……………………………………..…………………………………34 Figure 2.4. Relative abundance of age-related bacteria…………………………………………………...36 Figure 2.5. Age related microbiome ……………………………………………………………………...41 Figure 2.6. power estimation forage-related OTUs comparison……………………………………….…42 Figure 2.7. Representative OTUs with more than 40% prevalence………………………………………43 Figure 3.1. GMM model to group IL17 and IL22 into different gaussian distribution…………………...52 Figure 3.2. Grouping of iHMP participants based on Th17 related cytokine abundance…………………55 Figure3.3. Comparison of the longitudinal design with the cross-sectional design…………………….57 Figure 3.4. Steady State Plasma Glucose (SSPG) measurement by group……………………………...59 Figure 3.5. Age, BMI and glucose level of subject from different groups……………………………...61 Figure 3.6. Longitudinal modelling of serum metabolism marker and blood cell volume marker show higher type 2 diabetes risk in individuals from LA group…………………………………..…….………62 Figure 3.7. Differences in the gut microbiome of HA vs LA individuals………………………………64 Figure 3.8. Longitudinal modelling of Shannon diversity shows higher diversity in HA group………….66 Figure 3.9. Longitudinal modelling of Firmicutes to Bacteroidetes ratio show higher ratio in HA group.68 Figure3.10. The Genera Correlate with Serum IL-17…………………………………………………...69 Figure 4.1. Comparison of the gut microbiome between Young and aged mice from UCHC………….74 Figure 4.2 Hypothesized model for Th17 relate cytokines and microbiome plasticity during prediabetes progression………………………………………………………………………………………………...77 Figure 4.3 Participants clinical category………………..…………………………………………………80 Appendix Figure 1………………………………………………………………………………………...99 Appendix Figure 2……………………………………………………………………………….………101 Appendix Figure 3…………………………………………………………………………………….…103 Appendix Figure 4…………………………………………………………………………………….…104

ix

Chapter 1: General Introduction

The human microbiome

The human body is colonized with a plethora of microorganisms, such as bacteria, , fungi, and viruses. There were many studies aiming to estimate the number of the different types of microbes at different human body sites. To date, the numbers are being revised regularly due to the application of different counting techniques. In the earliest study published in 1972, the author Dr.Luckey estimated1 that

1012 bacteria colonize the epidermis and 1014 bacteria colonize the alimentary tract of an adult. This estimation has been revised in a most recent study, which stated that there are approximately 3.9*1013 bacteria2 in an adult human with a body weight of 70 kg, which is close to the number of human somatic cells. Other microbes such as viruses and phages are believed to be present in a similar or even higher number than bacteria3. Meanwhile, the number of fungi in an adult is still unknown. Genes carried by these microorganisms are substantially more diverse than those in the human eukaryotes, and consequently provide high levels of functional diversity. The collection of information describing the entire habitat, which includes the microorganisms, the genome of those microorganisms and the information of surrounding environment is termed as “microbiome” 4.

Microbes are present at various human body sites, such as the skin, oral cavity, intestine, and urogenital tract. The skin bacterial microbiome is mainly dominated by the genus of Propionibacterium,

Staphylococcus, and , depending on the skin characteristics (such as dry, moist or oily) 5.

The most common genera in the oral cavity include Streptococcus, Gemella, Granulicatella, ,and

Prevotella6. , Gardnerella and Sneathia are the major bacteria genera found in the urogenital tract 7, 8. The colon is the largest contributor to the total bacterial population in hosts, and the overall number of bacteria in the colon outnumbers the other body sites by approximately ten to one2. Approximately 1000 bacterial species can be found throughout the human body, and the genomes of these bacterial species

1 encode 100 times more genes than human genes9-11. Those bacteria play a role in host digestive process such as digesting the polysaccharides which are otherwise indigestible12 and participating in the development of the immune system 13,14.

Today it is thought that bacterial inoculation occurs at birth upon the exposure of vaginal, fecal, and skin during delivery. Although there is data implying that placenta may also contain bacteria before delivery (these bacteria are transported from other body sites into placenta by blood cells), these bacteria are often in low abundant and/or not culturable 15. Early microbial colonization in the infant gut starts with mainly facultative aerobes from the environment, such as and Propionibacterium16, and the major population of gut microbiome quickly switches to anaerobic microbes within one week after birth17.

The species colonizing the neonates are highly dependent on the method of delivery: Infants born vaginally primarily acquire their gut microbiome from the mother’s vagina while infants who are delivered surgically tend to carry bacteria from human skin18. The composition of the microbial community in the gut microbiome is highly variable during a child’s first three years of life, and then become relatively stable across adulthood 19.

Is the microbiome composition as stable as the human genome? The answer is no. Unlike the human genome, there is a greater microbial variation within an individual over time. Since the bacterial microbiome consists of around 1000 bacterial strains, the disappearance of a pre-existing bacterial strain or the acquisition of a new bacterial strain can cause changes in the microbiome composition. In theory, the accumulation of such small changes of bacterial strains over time could ultimately alter the entire microbiome community in the host. However, several recent studies found that this may not be the case.

One study showed that roughly 45% of the species were maintained in stool samples taken from the same individual 46 years apart 20. Meanwhile, the gut microbiome composition is quite stable over a short time period, based on the comparison of stool samples collected monthly and yearly from the same donor 21,22,20,23, if the composition is unperturbed by external influences. In addition to the gut microbiome, studies focusing

2 on other body sites also concluded that, like the gut microbiome, the skin and virginal microbiome are relatively stable over time24-26.

Nevertheless, the microbiome is relatively stable in a person over time, with unneglectable differences between individuals. Such variability in the composition of microbiome between individuals is called interpersonal variation, 27 which describes the presence and absence of bacteria stains when comparing the microbiome from different hosts. The 1000 Genome Project, which sequenced the complete genomes from

1000 individuals, demonstrated that individual has a unique genome except for identical twins28. The interpersonal variability of the microbiome may be partially connected to the highly personalized genome.

However, the clear relationship between the human genome and the microbiome is still under investigation.

Currently, we do know that functional mutations of some genes in the host will affect the microbiome composition 29,30. For example, were reported with increased abundance31 in the subjects carrying point mutations in Nucleotide-binding oligomerization domain-containing protein 2 (NOD2) gene.

Other genes that are known to affect the microbiome include the fucosyltransferase 2 gene (FUT2) and lactase gene (LCT), and their mutations affect the growth of Bifidobacterium32,33. Other than single mutations, microbiome interpersonal variabilities are also associated with sex34, ethnicity group 35 and systematic host genetic variation36.

Certain environmental perturbations can cause temporal changes of the gut microbiome and sometimes even overwrite the genetic influences on the gut microbiome. Studies found that monozygotic twins do not necessarily carry the 100% same microbiome12, although their genetic background is identical.

Geographical location is one of the major contributors to the microbiome plasticity. If twins live in two different counties, the difference between their microbiome is larger than twins who live together37, indicating that the living environment influence the gut microbiome independent of the genetic background.

Also, the gut microbiome of immigrants quickly converges towards the microbiome of the destination. A recent study found that if a resident from southeast Asia relocates to the US, his/her microbiome will quickly

3 switch to a low diversity, low , high type, which is considered a westernized microbiome type38. Diet is another big contributor to the gut microbiome plasticity. The influence of the diet on the microbiome can be traced back to the very early stage of life. Studies showed that babies fed with breast milk have very different gut compared to babies fed with infant formula. 39 The reason for this is not only the nutritional differences in breast milk and baby formula, 40 but also that both live bacteria and immune components are found in human breast milk. 41 The influence of diet on the microbiome continues during childhood and adulthood. Children growing up in Europe and Africa also harbor different microbiomes, due to different diet components inherent to the two geographic locations. 42

Generally speaking, animal-based diet is associated with an increase of genera such as Bacteroides,

Alistipes, and Bilophila, whereas plant-based diet is linked to an increase of the phylum Firmicutes and the genera of Ruminococcus, Roseburia, and Eubacterium. 43 Switching diet for even a few days is enough to change the microbial composition43, and may have long-term effect on host. A report showed that weight gain by overfeeding systematically altered the host microbiome and physiology conditions, including increased inflammation and hypertrophic cardiomyopathy. 44 This report also described a potential long- term effect that will persist even after the body weight dropped to the pre-perturbation level. Meanwhile, microbiome composition can also be altered during pregnancy, and is characterized by an increase of bacteria phyla Proteobacteria and Actinobacteria. 45 Other components, beyond the scope of this thesis, that may induce changes of the microbiome composition, include the usage of and /prebiotics, which was also experimentally demonstrated and reviewed 46-49. In conclusion, studying the microbiome is complex due to the plasticity caused by the highly personalized genetic background, environmental influences and physiology of the host.

4

Methods to study the microbiome

Since the microbiome is a collection of microorganisms (such as viruses, fungi, archaea and bacteria), specialized methods are required for each type of . Given that this thesis focuses on the bacterial component of the microbiome, the respective methods are introduced in the following.

Culturing approaches

The traditional approach to study the microbiome is by microbiological culture. 50,51 Depending on their response and requirement of oxygen, bacteria can be classified as anaerobic (growth in the absence of oxygen), aerobic (growth with oxygen) and facultative aerobic (growth with or without oxygen). For example, aerobic and facultative aerobic bacteria mainly colonize the oxygen rich environment such as certain areas of the human skin and oral cavity. In contrast, anaerobic bacteria normally colonize the oxygen-lacking environment such as the colon. Besides the different oxygen requirements, microbial growth is also highly dependent on the availability of nutrients. For instance, bacteria living in the gut utilize food ingested by the host as their major source of nutrition and bacteria living on the skin use sebum as a source of nutrition. 52 Bacteria growth can be inter-dependent on each other. On one hand, there is competition for nutrition among bacterial species; on the other hand, bacteria are also sometimes taking advantage of metabolic products secreted from their cohabitants. 53,54 In summary, efforts have been made to cultivate the microbiome from different environmental samples. However, the reasons mentioned above just partly highlight how the complexity of the microbiome makes it complicated to capture the diversity of the microbiome using culture-based methods. As a result, it is believed that currently we are only culturing the tip of the iceberg and that majority of bacteria in the gut microbiome is not culturable. 55,56

5

Sequencing approaches

Our knowledge about the bacterial microbiome is accelerated with the development of sequencing technologies, especially for unculturable bacteria. The first generation of sequencing is called Sanger sequencing, which was developed by Frederick Sanger in 1977. 57 This method is comparatively time- consuming and costly. Hence, it is currently only used for small scale sequencing experiments. The second generation of sequencing technology (Next Generation Sequencing, NGS) utilizes fluorescently-labelled deoxy-ribonucleotide triphosphate (dNTPs) to label the DNA nucleotide and can determine the sequence of millions of DNA strands simultaneously through detecting the fluorescent signals. 58 The NGS technology largely increases the efficiency of DNA sequencing and is currently used as the method for the large-scale genomic sequencing projects. 59 However, both the first and second-generation sequencing techniques are not designed for reads longer than 1000 base pairs (bps). 58 Therefore, the shotgun sequencing method was developed for sequencing long DNA strands, by breaking down long DNA sequences into smaller fragments that can be sequenced separately, and assembly though computational methods to yield the overall sequence. This shotgun-sequencing method can be combined with either first or second-generation sequencing technique for long reads sequencing.

The sequencing processes are routinely used nowadays. Once an environmental sample (e.g. fecal matter, skin swab, soil) is collected, the total DNA is extracted and sequenced. To determine the possible origin of each sequencing read, the data is aligned to the genome database, which contains DNA reads associated taxonomy information. The current databases which are widely used include Ribosomal Database

Project(RDP, https://rdp.cme.msu.edu/), 60 SILVA ribosomal RNA database (https://www.arb-silva.de/) 61 and National Center for Biotechnology Information (NCBI, https://www.ncbi.nlm.nih.gov/). 62 Sequencing the entire genetic material extracted from an environmental sample is called metagenomic sequencing, which is one of the most widely used sequencing method for microbiome research when combined with the shotgun method. This method is named the whole metagenome shotgun (mWGS) sequencing. 63

6

In addition to sequencing the total genetic material, we can also sequence bacterial “name-tag” genes to obtain taxonomy information for the bacterial microbiome. Within the bacteria genome, there is a region of about 1500 base-pairs (bps) called “16S ribosomal RNA (rRNA) gene”. Most part of this gene is conserved, because any detrimental mutation of it will lead to the death of the bacteria. 64 and only nine small regions can bare mutations. Meanwhile, this gene can be found in almost all the bacteria species. The

DNA sequences of the nine variable-regions (also called 16S rRNA gene V1~V9 regions) contain enough information to distinguish bacteria at species level. 65 Thus, the 16S rRNA gene has been used as a label gene for bacteria taxonomy identification. 66,67 The first application of 16S rRNA gene as a bacteria taxonomy marker can be traced back to 1977, when Woese and Fox used the 16S rRNA to understand the phylogenetic relationship of the prokaryotic domain. 68

However, both the mWGS and 16S rRNA sequencing methods have limitations. Although the mWGS sequencing can generates the most comprehensive information, it lacks a computationally easy and accurate method to obtain the microbiome taxonomy composition. 69,70 Moreover, mWGS costs nearly four times more than 16S rRNA sequencing for one sample. The major advantage of the 16S rRNA gene sequencing is the low cost with a focus on bacteria taxonomy. Also, it requires a lot less bacterial DNA to reconstruct the microbiome composition compared to mWGS. However, there are three major limitations of the 16S rRNA sequencing method: 1) Since the interpretation of bacteria taxonomy largely relies on the accuracy of the existing database, the taxonomy estimation cannot be confidently confirmed if the bacteria species have not been cultured and sequenced previously. 2) Bacteria may contain multiple copies of the 16S rRNA gene, so the quantitative comparison of bacteria taxonomy based on 16S rRNA gene reads count may not be accurate. The fold-changes of 16S rRNA gene reads of a certain bacterial species may not reflect the biological quantitative fold-change of the bacteria cell when compared by samples. 3) The preparation of the 16S rRNA sequencing library involves a polymerase chain reaction (PCR), which may introduce bias on certain bacteria species depending on the selection of PCR primers [data from unpublished papers at

7

Weinstock Lab]. Thus, when interpreting results from a 16S rRNA sequencing based experiment, these limitations need to be taken into consideration.

Analysis of Microbiome data

Statistical Analysis of Richness and Diversity

Once the sequencing data is obtained, there are several processing steps before the data can be used for biological interpretation. The methods will be described in detail in the method section of this thesis. Briefly, the 16S rRNA reads are being clustered by their nucleotide sequence similarity. In general, if 97% of the reads are identical, they will be clustered together. 71,72 When all sequencing reads are assigned to different clusters, a representative read for each cluster is identified (normally the representative read is the read that appears the most within a cluster). Then these representative reads are aligned with existing reads in the online database for taxonomy identification. These clusters are not exactly equal to a biologically defined genus or species, hence they are called “Operational Taxonomic Unit (OTU)” 73 At the end, one table is formed with the information of how many reads are assigned to each OTU (Figure 1.1a: example of an

OTU table), and another for the taxonomy information of each OTU (Figure 1.1b, example of a taxonomy table).

After the OTU table is generated, it can be used for the comparative analysis. If one sample contains more reads that belong to a given taxa, then this sample contains more bacteria from such taxonomy. However, before this assumption can be made, the difference of total sample reads (also called sequencing depth) needs to be taken into consideration. Thus, a table illustrating the normalized proportion of reads belonging to each OTU is formed, which is called the relative abundance table (Figure 1.1c, example of a relative abundance table). This is simply calculated by the number of reads in each OTU divided by the total reads of a given sample.

8

Figure 1.1 Example of Microbiome data

a. Example of OTU table Sample 1 Sample 2 Sample 3 OTU_1 15 10 0

OTU_2 5 0 0

OTU_3 0 20 35

OTU_4 5 5 5

b. Example of Taxonomy table

phylum class order family genus Epsilonproteoba Campylobacter Campylobacte Campylobacte OTU_1 Proteobacteria cteria ales raceae r Bacteroidacea OTU_2 Bacteroidetes Bacteroidia Bacteroidales Bacteroides e Fusobacteriac OTU_3 Fusobacteria Fusobacteriia Fusobacteriales eae Prevotellacea OTU_4 Bacteroidetes Bacteroidia Bacteroidales Prevotella e

c. Example of Relative abundance table Sample 1 Sample 2 Sample 3 OTU_1 0.6 0.29 0

OTU_2 0.2 0 0

OTU_3 0 0.57 0.88

OTU_4 0.2 0.14 0.12

Figure 1.1. Example of Microbiome data a) Example of an OTU table, number indicate the reads count belonged to each OTU in each sample. b) Example of a Taxonomy table. c) Example of a relative abundance table, the number indicate the fraction of reads that belonged to each OTU. The column (each person) should have a sum of 1.

9

Since the microbiome is an ecological community of microorganism, there are ecology parameters to describe the characteristics of this community. The most widely used ecology parameters are richness and diversity. 74 Richness simply describes how many unique taxa are present in a community. Measurements of richness include observed species index, which measures how many OTUs can be formed from one sample, and Chao 1 richness (Figure 1.2). There is slight difference between the two parameters in that if an OTU has a low prevalence (can only be found in one sample or two samples), Chao 1 richness does not consider it as a “real” OTU, and these OTUs will be removed from richness estimation. 75,76

If a species is very dominate in the community (for example, 70% of the reads are contributed by one species), the biological influence of this species might be more significant than others. However, the richness index cannot be used to distinguish the more dominate species from the less dominate species. In order to take the species dominance into consideration, ecological diversity indexes such as Shannon diversity and Simpson Index are used. The equations for the two indexes are shown in Figure 1.2. 77-80

Shannon diversity weights the proportion of each species by adding a natural logarithm of the relative proportion, and Simpson diversity index weights the proportion of each species by adding a square of the relative proportion value. Mathematically, rare species that do not have high relative abundances will not greatly impact the index. Another index, which is called Renyi index (formula showed in Figure 1.2 81), adds another parameter, the q-value, to measure the ecological diversity. When the q-value increases, a higher mathematical weight is given to the dominant species. As a result, how sharply the Renyi index decreases with the increase of q-value (the slope of the curve) describes the evenness of a community

(Figure 1.2). If the Renyi index measurements decrease sharply when q-value increase, it means this community is very skewed. 82

10

Figure 1.2. Equations of Diversity Index

Shannon Diversity

Simpson Index

Renyi Diversity

Figure 1.2. Equations of Diversity Index pi is the proportion of individuals belonging to the i th species in the dataset of interest, R is the actual number of species (observed species).

11

Statistical Analysis Based on Distance Matrix

Dissimilarity is a measurement of how similar (or dissimilar) two communities are. The index of dissimilarity can also be applied to microbiome communities. One of the most widely used dissimilarity indexes in microbiome research is called the “Bray-Curtis” Distance. This method was developed to quantitatively compare the ecological similarity of forest around Wisconsin. 83 It uses count data to generate a score of dissimilarity based on two calculations: total number of species in each community and number of species that are shared by the two communities. When each microbiome sample is considered a bacterial community, Bray-Curtis dissimilarity can be calculated for any pair of microbiome samples. The “UniFrac” distance is an alternative to Bray-Curtis distance, 84 it factors the phylogenetic relationships between members of a community. For example, if Community A has 2 fish, 2 birds and 3 mice, Community B has

2 fish, 2 birds and 3 rabbits, Community C has 2 fish, 2 birds and 3 flies. The Bray Curtis dissimilarity index will determine that the distance between Community A and B is the same as the distance between

Community A and C, but the UniFrac distance will determine that the distance between Community A and

B is smaller than the distance between Community A and C. This is because mice and rabbits are phylogenetically more similar than mice and flies. However, the accuracy of UniFrac distance is largely based on obtaining an accurate phylogenetic distance, 84 which is not computationally straightforward when hundreds of taxa are involved.

Statistical Analysis based on Principal Component Analysis and Principal Coordinate Analysis

When pairwise dissimilarity scores are generated between microbiome samples, one way to visualize the total variance of all samples is to perform a principal component analysis (PCA). PCA displays high dimensional data using two major components by mathematically transforming high dimensional data into two dimensions, while still maintaining the majority of relative dissimilarity between samples. The two components are usually set as X axis and Y axis in a scatter plot, with an annotation of the variance that

12 each axis explains. If the dissimilarity is calculated beyond the Euclid coordinates, then this method is generalized to principal coordinate analysis (PCoA). The PCA and PCoA are usually used as methods for exploratory analysis in order to generate a graphical overview of the entire data from a project. To statistically test if the two groups are forming different clusters, a test such as Permutational multivariate analysis of variance (PERMANOVA) or analysis of group similarities can be used. 85

Statistical Analysis based on Linear Discriminant Analysis (LDA)

In the microbiome research, PCA and PCoA are mainly used for visualization of the relationship between microbiome samples, additional analysis is required for a deeper understanding of the data, especially differences on single taxonomy level. For example, if the microbiome from stool samples and from saliva samples show two clusters on a PCA plot, then the next question is to identify the bacteria that are responsible for this result. Linear Discriminant Analysis (LDA) is one of the widely used methods for this purpose. 86 To use this method in above example, each sample needs to be labeled either “stool” or “saliva”, then each bacteria taxonomy unit within a sample will be examined and obtain a score called “linear discriminant score”. 87 This score is a mathematical summary of the relative abundance differences between the two population. The way in which each sample need to be labelled for identifying the differential- abundant species is called “supervised analysis”. 88

Statistical Analysis based on Machine Learning

Machine Learning is a collection of newly emerged methods for pattern recognition and data modelling.

This collection of methods can also be applied to the microbiome data, especially 16S rRNA data. For example, if we want to determine the species that are differentially abundant in stool and saliva sample, we could use traditional statistical methods such as nonparametric test or LDA. However, these methods usually make stringent pre-assumptions about data distributions or sample sizes. Nonparametric tests do

13 not have the same statistical power as the parametric tests, which generally means that more samples are needed for a comparison. Also, nonparametric tests will generate large false positive signals when multiple tests are performed. Similarly, LDA analysis is not designed for a small sample size or zero-inflated data

(indicating that when the data contains many zeros). 89 Thus, the machine learning methods may be an alternative to the traditional statistical methods. Here, I will introduce one of the currently used, supervised machine learning method: Random Forest (RF). An application of RF will be described in Chapter two.

The concept of RF involves using a given taxa to determine the class of a sample. For example, if we hypothetically know from our existing data that OTU X is more abundant in stool samples than in saliva samples, then when a new sample arrives, we only need to measure whether this sample contains a high level of OTU X to determine whether this sample is a stool sample or a saliva sample. This OTU X is named an “important” OTU by the method because it has the power to categorize the sample. In microbiome analysis, the purpose of building the RF model is often to determine which OTU is important. For each

OTU, we will randomly rearrange the class of each sample, and measure if using this OTU we can still determine the class of a given sample. If the random rearrangement process does not decrease the predicting accuracy of this taxon, then it is not important. The mean-decrease-accuracy (MDA) score is used to describe how important a given OTU is for determining the sample class. (Figure 1.3)

In summary, I provided an overview of human microbiome research in this section; stated the benefit and limitations of using the 16S rRNA gene to study the microbiome; described the structure of the microbiome data; and explained how statistical parameters can be used for the data analysis.

14

Figure 1.3

Stool Stool Stool Stool Saliva Saliva Saliva sample sample sample sample sample sample sample

OTU_X high high high high low low low

OTU_X

high low

accurate: Stool sample Saliva sample

randomly rearrange the labels of the sample

Stool Saliva Stool Saliva Stool Stool Saliva sample sample sample sample sample sample sample

OTU_X high high high high low low low

OTU_X

high low

not accurate: Stool sample Saliva sample

Figure 1.3. Random Forest Blue table is the ”original” distribution, yellow table is the “randomly rearranged” distribution. When sample label are randomly rearranged , the OTU_X cannot be used to accurately distinguish sample class, this means OTU_X is “important”.

15

Animal Models for Microbiome Research

The importance of the microbiome in immune response has been demonstrated by studies using different animal models. Therefore, I am going to introduce several animal models. The “germ-free” mouse is a typical animal model since these mice are maintained in a germ-free environment and does not carry any bacteria. The history of developing the germ-free mouse model can be traced back to the year of 1939. 90

This animal model made many important discoveries possible, especially early experiments that demonstrated the causational role of the microbiome in many phenotypes. For instance, studies found that germ-free mice do not develop intact immune systems. They have increased activities on type 2 helper T

(Th2) cells 91 and decreased activity of type 17 helper T (Th17) cells 92 compared to the wildtype mice.

Researchers also observed that the germ-free mice showed an impaired immune response against viral infection. 93 These observations imply that the gut microbiome is required for the development and accurate functioning of the immune system, at least in mice.

Transgenic mouse models are also widely used in the microbiome studies. The manipulation of certain genes in mice allows to address how the expression of these genes or the lack thereof are affecting the composition of microbiome. For example, Toll like receptor 5 gene (TLR5) plays an important role in host recognition of bacterial flagellin. The knockout of the TLR5 gene in mice will lead to gut microbiome changes and host phenotypes such as metabolic syndrome, 94 decreased host immune response against influenza viral infections, 95 and impaired bone tissue material properties. 96 The usage of transgenic animal models can help dissect the specific gene-microbial interactions and the impact of such interactions on host physiology.

Another widely used model is mice fed with different diets. Since diet has a substantial influence on the gut microbiome, changes in components of diets often alter the gut microbiome composition of mice. The correlation between diet and microbiome composition can be established by comparing host microbial

16 compositions before and after the switch of diet. The measurements for host physiologies (such as serum glucose level and BMI) can provide biological relevance associated with such diet-induced microbial alterations. One famous study from Jeff Gordon’s group showed that feeding mice with a high fat diet (HFD) will induce and lead to a microbiome alteration. 97 The study also confirmed that the microbiome associated with the HDF is the cause of obesity, in addition to the excessive nutrition in the food. When the microbiome was transferred from obese to lean germ-free mice, the germ-free mice became obese without the HDF diet. Five years later, a follow up study from the same group confirmed that the causational effect of microbiome on obesity also exist when the human fecal samples are transferred to mice. 98 It is hypothesized that the HFD will cause an overgrowth of certain bacteria species, and thus increase the ability for the gut microbiome to supply energy to the host. 99 This obesity-associated microbiome type is also linked to inflammation in the gut 100 and is proposed as one of the etiologies of Inflammatory Bowel Disease

(IBD) 101,102 and Type 2 Diabetes. 103

A deviation of the normal microbiome composition is called “Dysbiosis” and is potentially harmful to the host. This harmful stage does not necessarily involve an overgrowth of any pathogenic species. 101,102 The precise definition of dysbiosis and the associated species is still under debate and development. Currently, dysbiosis is usually used to describe an altered microbiome composition that is associated with a disease type, such as dysbiosis with IBD, or dysbiosis with Type 2 Diabetes. 104,105

Using animal models to study the microbiome has benefits and limitations. The major benefit is the possibility for manipulating and controlling genetic and environmental background. As mentioned before, it is ethically impossible for humans to be born in a germ-free environment but is feasible for mice.

Meanwhile, it is more practical to manipulate certain gene expressions in mice when studying the gene- microbial interaction. Meanwhile, perturbations such as fecal microbiome transfer (FMT) are much less risky to perform in mice than in humans. 106 When correlational relationships are observed between

17 microbial components and disease phenotypes, the application of FMT will allow researchers to examine the causational relationships of this observation.

However, limitations of using animal models to study the microbiome should not be neglected. The physiological differences between mice and humans should always be considered. The initial usage of mice as a genetic model to study human biology was based on the idea that 95% of human genes can be traced, at least as an isotype, in mice. 107 However, the story in the microbiome research might be different.

Although the microbiome on phylum level is quite similar between mice and humans, 108 taxonomic differences are quite large on species level. 109 Nevertheless, the loss of genetic diversity in experimental mouse models may cause issues for gut microbiome research. For example, a commonly used mouse model

C57B/6J is known to carry a mutation on a gene that encodes phospholipase A2 (PLA2) enzymes, 110 which potentially modulates the gut microbiome composition since this enzyme serves as an anti-microbial peptide. 111 In fact, diversity outbred mice (which have a mixed genetic background) are now proposed for population level microbiome research. 112

Current Datasets for Human Microbiome Research

After the NGS became available, there were several large studies carried out in order to get an understanding of the human microbiome on a large scale. In the last part of this introduction, I will describe several famous datasets and their major findings to provide an overview of the human microbiome research.

The Human Microbiome Project (HMP):

The Human Microbiome Project (HMP) funded by National Institution of Health (NIH) is one of the earliest projects that focused on understanding the composition and variance of human microbiome. This project has two phases. The overall goal for this project is to establish a general understanding of microbiome in

18 the human population, as the majority of the previous studies were based on culture. The first phase of HMP started in 2008. It included 242 healthy subjects who had been sampled for 15 (male) or 18

(female) body sites, which include Gut, Airway, Oral cavity (9 samples), Skin (4 samples), and Vagina (3 samples, female only). This project generated both the microbiome 16S rRNA data (2,971 samples with

V1~V3 region and 4,879 samples with V3~V5 region) and the mWGS data (749 samples, 5,140,472 total non-redundant gene counts from 11.7Gb of sequencing data). 113 This project also generated approximately

2,200 microbial isolates whole genome sequences. In 2017, a supplementary study9 provided 2103 samples with mWGS data from 252 additional subjects. The first phase of HMP provided an excellent overview of the human microbiome and archived a set of high-quality sequencing data for hypothesis testing studies.

The second phase of the HMP project was named “The Integrative Human Microbiome Project” (iHMP), which was started in 2014 and is still in the data-analysis phase.114 This is an additional effort integrating multiple host "omics" with microbiome data in a longitudinal manner. The theme of iHMP is more disease- related compared with first HMP phase. There are three cohorts collected under this project in order to understand the role of the microbiome 1) during pregnancy and onset of preterm birth, 2) on IBD development, 3) on the early stage of Type 2 Diabetes (T2D) development. The study of the third cohort is leaded by Weinstock group at The Jackson Laboratory in collaboration with the Snyder group at the

Stanford University. The data generated from this cohort will be used in the third chapter of this thesis.

The Metagenomics of the Human Intestinal Tract (MetaHIT)

The European initiative MetaHIT project involves eight countries and 13 research centers, which was funded by the European Commission in 2008. The goal for this project was to establish a gene catalog for microbial genes, and mainly focused on the gut microbiome. Similar to the result from HMP, they initially identified 3.3 million different genes in 576.7 Gb of sequencing data from 124 individuals, 11 and later expanded this number to 9.8 million genes from 249 additional samples. 115 The MetaHIT consortium also

19 has a general interest in bacteria related to Inflammatory Bowel Disease (IBD) and obesity in addition to characterizing the microbiome in healthy people. One major discovery of this project is that inter-individual dissimilarity of the gut microbiome could be summarized into three clusters, termed as “Enterotypes”. 116

The formation of clusters is not a result of gender, age, or BMI but does show some clinical significances.

116,117 The Enterotypes were also reported from HMP two years after the first enterotype discovery. 118

Another finding is that the ecological richness of the human gut microbiome is an indicator of host metabolic profiles. 119

The 500 Functional Genomics Project (500FGP)

The 500 Functional Genomics Project aims to understand the interaction between host genomics, host microbiome and host physiological phenotypes. This project started in 2013, with participants from the

Netherlands, Romania, Tanzania, Indonesia and Brazil. The cohorts recruited in this project include individuals with obesity, HIV, gout, fungal infections, Lyme disease, diabetes, and autoimmune diseases.

Different from the iHMP, this project used a cross-sectional design, which means only one time point was included for each participant. This project systematically compared the relationship between host genetics, serum cytokine level, season, sex, age and the gut microbiome composition. 30,120,121 There are several major findings from this project. It reported that microbial palmitoleic acid metabolism and tryptophan degradation to tryptophol was linked to proinflammatory cytokine production, 122 and high fat diet was triggering the proinflammatory cytokines production through NLRP3-Dependent innate immune pathways.

123

Thesis Objective

The purpose of this thesis is to begin an initial exploration of how the microbiome may be altered with healthy aging and whether microbiome dysbiosis reflects the onset and progression of chronic disorders of

20 aging, specifically prediabetes. By exploring the microbiome variation in both healthy and disease conditions, we aim to gain a better understanding of how microbiome composition is linked to host physiologies and health conditions. Additionally, we will explore the overall hypothesis that the altered microbiome composition in individuals with certain cytokine profiles could be a sign of the microbiome dysbiosis, which are related to the onset and progression of metabolic disorders.

21

Chapter 2

Background

In this chapter we will present a comparative study of the microbiome of young and older adults. We use this study as an example to outline experimental design and computational methods for conducting studies on the human microbiome. Our overreaching hypothesize is that the microbiome in older adults show signs of dysbiosis such as lower ecological richness and higher number of harmful bacteria. Here we present a cross-sectional study with the goal of determining if the microbiomes of young and older individuals are significantly different and to illustrate those differences. The results of such a study will help address the myriad of health issues associated with age. We further postulate that investigating microbial composition at the two poles of the natural human health condition would lend insight into the direct and indirect relationships that commensal bacteria have with their hosts. Additionally, we postulate that the effects of the variables which complicated interpretation of the study on microbiome plasticity discussed in chapter

1, will be minimized in this study as both groups are starkly different in terms of age but not designed to be biased for other variations. We hope to reduce the background noise of microbiome variation in this way.

Although our analysis of this study identified several microbes associated with age, our analysis revealed that even in small studies comparing two very different groups, human microbiome studies can require many subjects in order to obtain significant statistical power to draw confident conclusions.

Chronological age may affect microbiome composition. Advanced age is associated with a wide range of changes in physiology, biology, behaviors, as well as social status, and each of these variables can influence microbiome composition to varying degrees. For example, the aged population generally has altered eating habits, reduced physical activity, 124 slow gut motility, 125 immune deficits, 126,127 and a decline in gut stem cell regenerative capacity. 128 These age-related characteristics in old subjects may collectively alter basic biological processes in different tissues, thus influencing both microbial composition a well as the

22 biological responses influenced by the microbiome. 129 Before looking into any causational connection between age-related diseases and microbiome changes, we must understand if and how microbiome composition differs between young and old subjects.

The design of a well-controlled experiment for age-related microbiome studies is faces biological limitations as well as technical challenges. The biological limitations mainly come from the coexistence of confounding factors (e.g. environmental factors) that may affect microbiome composition. This may undermine the conclusion of any study that only focuses on age. The ELDERMET project, one of the largest studies on the microbiome in the old population, pointed out that diet and living condition (such as geographic location and status of being institutionalized) are two major factors that influence the older adults’ gut microbial composition. 130-132 Additionally, age-related diseases such as IBD, 133 Alzheimer’s disease, 134 diabetes 135 and behaviors such as usage of antibiotics 47 could also affect microbiome composition. As a result, not all conclusions of the ELDERMET project may apply to old individuals generally. Therefore, conclusions about age-related microbiome differences should be considered together with the geographic location, living conditions as well as other criteria noted upon a subject’s recruitment.

In addition to biological limitations, statistical challenges arise when testing age-related microbiome differences. Depending on the aim, comparative microbiome analyses could involve either statistical tests on summary statistics or statistical tests across microbial taxa. The former test if the microbiome diversity and richness – common indicators of community complexity – of the young subjects are different from the microbiome richness and diversity in the elderly subjects. The null hypothesis for such tests is that the microbiome richness and diversity in the young and elderly subjects are the same. On the other hand, statistical tests across microbial taxa compare the relative abundance of each individual taxonomy unit

(taxonomy units can be a species, a genus or an operational taxonomy unit) generated from the 16S rDNA sequences and determine whether each unit is differentially abundant between the two age groups. The null

23 hypothesis is that, for a given taxonomy unit, the relative abundance of this unit in the young subjects is equal to that in the elderly subjects.

When a null hypothesis is formed, statistical tests can be used to determine if the null hypothesis is likely to be true. For example, a two-sample t test can be performed to compute the probability (i.e., the p-value) of obtaining a more extreme t value given that two groups have the same mean of a measurement of interest

(e.g., community richness, community diversity, or the relative abundance of a given taxonomy unit). If this probability is less than 0.05, the two groups are considered “significantly different”. A p-value less than 0.05 can be interpreted as there is a less than 0.05 chance to observe the same data if the null hypothesis is correct. Therefore, we reject the null hypothesis and conclude that the sample-mean, community richness, community diversity or the relative abundance of a taxonomy unit are not equal between the two groups.

However, when multiple tests are performed, a more stringent cutoff value then 0.05 is required. For instance, if 500 OTUs are compared across numerous groups with a p-value cutoff set at 0.05, according to the definition of p values a total of 25 OTUs will appear “significantly different” just by chance even if none of the OTUs were truly different between groups. Many statistical techniques have been developed to control for such false positive rate, including resampling approaches such as bootstrap, p value adjustment procedures such as Benjamini-Hochberg, and the introduction of other significance statistics such as q values. In the field of microbiome studies, for the sake of simplicity, we generally reduce the p-value cutoff to 0.002 when multiple comparisons are present, which, in the example above, is equivalent to to an expectation of <1 false positive.

In setting the cutoff p-value for rejecting the null hypothesis one must also consider “effect size” (measures how different two population are) as well as “sample size” (how many biological samples are collected).

These statistics are used to calculate another statistic that determines if the amount of data collected for a specific dataset, is enough to detect a statistically meaningful difference between groups at a specified cutoff p value. This statistic is termed “Power”. Power is calculated from both the expected effect size and

24 sample size across a range of significance levels. For example, if the difference between two groups is large

(large effect size), or a less stringent significance level is set (bigger p-value cutoff), the Power statistic will indicate that relatively few samples are required to detect significant differences at a specified p value.

Two ways to increase the statistical power of a study are to increase the sample size or to increase the effect size. Increasing sample size requires collecting more biological replicates for the comparison. But exactly how to increase effect size is not straightforward. Effect size is determined by the difference between the population mean of two groups and the standard deviation (SD). 136 The effect size increases when the difference between sample-means is increased, or if the SD is decreased. Increasing the mean difference is not applicable to observational studies. Therefore, either increasing the measurements accuracy and/or consistently selecting additional subjects for specific groups is good practice for obtaining a good power.

Determining the sample size that should be used for a particular microbiome study in order for the power of statistical interpretations to be significant is complicated as it cannot predetermine what the distribution of the microbiome data will be. Based on the previous reports, several different statistical assumptions can be made about the data distribution, depending on the hypothesis to be tested. These include the Dirichlet multinomial distribution, 137,138 negative binomial distribution 139,140 and the zero-inflated Gaussian distribution. 141,142 Interestingly, one research demonstrated that the data distribution may not affect the power estimation very much when the expected difference between sample means is less than 10%. 141 To make a simple estimation, the power calculation in this chapter was based on a T-distribution.

Due to the biological and technical challenges mentioned above, the conclusions of previous studies investigating how the microbiome differs between young and old groups is riddled with inconsistencies.

For example, the percentage of Bacteroides, a major member of the human gut microbiome, has been reported to be higher in old adults in some studies, 143,144 and lower in others. 145,146 We attempted to alleviate

25 the inconsistencies identified in previous studies by designing more controlled experiments and using more dependable statistical approaches. For example, to address biological challenges, we reduced the potential confounding effects of disease and medication on the microbiome by selecting subjects who were declared disease-free and not taking certain kinds of medication that known to affect the microbiome (see methods).

Additionally, all subjects were community-dwelling and fully-independent, thus minimizing the confounding effects of disability or life in an institutionalized setting. To address technical challenges in data interpretation, we generate a detailed power estimation for each of our analysis. The findings presented in this chapter will provide additional information on age-related differences in the human microbiome as well as provide an outline on how to design similar comparative studies regarding the human microbiome.

Results

Description of study design

All studies were conducted following approval by the Institutional Review Board of UConn Health Center

(IRB Number: 14-194J-3). Following informed consent, oral and fecal microbiome samples were obtained from 10 healthy young (HY, 23-32yrs, mean=25) and 13 healthy old (HO, 69-94yrs, mean=77) volunteers residing in the Greater Hartford, CT, USA region using services of the UConn Center on Aging Recruitment and Community Outreach Research Core (http://health.uconn.edu/aging/research/research-cores/).

Recruitment criteria were selected to identify healthy individuals who were experiencing “usual aging” and were thus the representatives of average health conditions of the population within the corresponding age groups. 147 Selecting this type of cohort increases the generalizability of our studies and the likelihood that these findings can be translated to the general population. 147 Subjects were carefully screened to exclude potentially confounding diseases and medications, as well as frailty. Individuals who reported chronic or recent (i.e., within two weeks) infections were also excluded. Subjects could have chronic diseases but were excluded if the following were present: congestive heart failure, kidney disease (serum creatinine >1.2 mg/dl in men and >1.1 mg/dl in women), diabetes mellitus requiring medications, immunosuppressive

26 disorders or the use of immunosuppressive agents including oral prednisone in doses >10 mg daily. Since declines in self-reported physical performance are highly predictive of frailty, subsequent disability and mortality, 148 all subjects were questioned as to their ability to walk. Scoring TUG > 10sec was considered an indication of increased frailty and resulted in exclusion from the study. 149,150

Metagenomic Sequencing data Overview

Total DNA was extracted from the samples, and a variable region of 16S rRNA gene was amplified before the sample are sequenced. The raw sequencing reads were being processed in bioinformatics pipeline described in method. There were 548 OTUs generated from the raw sequencing data. OTUs that contained five or less reads were removed from the data set. 381 OTUs from the saliva samples and 405 OTUs from stool samples were obtained. In addition, 120 young-related OTUs and 35 aged-related OTUs were found in saliva samples, while 39 young-related OTUs and 150 aged-related OTUs were identified in fecal samples.

We first examined the beta dissimilarity between samples based on OTUs under unweighted UniFrac

Distance. This method uses a phylogenetic-tree-based beta dissimilarity measurement to calculate the relative distance between samples. According to a permutational multivariate analysis of variance

(PERMANOVA), age may be a significant variable (the P value is on the edge) in explaining the difference between gut microbiome from young subjects compare to gut microbiome from old subjects

(PERMANOVA: Pr>(F) value=0.049). However, age could not explain the potential variation in saliva bacterial microbiome community when we applied the same method to analyze saliva samples

(PERMANOVA: Pr>(F) value=0.449) (Figure 2.1a).

To further understand if bacterial community structure differs regarding to the dominant taxonomies, we constructed a tree based on the 20 most abundant OTUs from all 46 samples (Figure 2.1b). As expected,

27 the saliva and stool samples clearly formed two distinct branches. Interestingly, the saliva sample from subject #107 also contained representative OTUs found in stool samples (Figure 2.1b). Among the 20 most abundant OTUs, we did not observe any additional branch that related to age groups. This implies that dominant bacterial microbiota is relatively stable across the healthy human aging process. Combined with the result from PERMANOVA analysis, we concluded that microbiome population may not have a dramatic change during aging in terms of bacteria type and abundance.

28

a)

b) Saliva

Old

Young Stool

Figure 2.1: Overview of Microbiome Community aPrinciple coordinate analysis of beta dissimilarity. Unweighted-UniFrac Dissimilarity between samples. Green dots represent saliva from young cohorts, red represents saliva sample from old cohorts. Purple dots represents stool sample from young cohorts, and Blue dots represents stool sample from old cohorts. b) Cluster based on 20 most abundant OTU. “Manhattan” clustering of samples based on 20 most abundant OTUs. Color of sample labels indicate the biological origin of the sample. Vertical bars on the right (and color of the branch) indicate the age-group of each sample. Horizontal bar next to each sample label is the relative abundance of 20 most abundant OTUs, the annotation is on the left.

29

Gut microbiome richness and diversity between groups

Diversity of bacterial microbiota is an important characteristic of microbiome composition. 151 Richness composes a mathematic measure of the number of detected species and diversity takes the relative abundance of the species into account. Here we assumed each OTU to be an organism and calculated the richness (total number of different bacterial types present) and diversity (variability of different bacterial types present) for each sample.

To estimate species richness, we calculated the Observed Species and Chao 1 index of our samples. The number of observed species in saliva samples from the old cohort was significantly lower than the young cohort. Although less obvious, a similar trend was observed when calculating the Chao 1 index. Observed

Species richness in Stool from the old cohort showed a higher mean value compared to young, and Chao 1 index showed a similar trend (Figure 2.2a). We speculated this result was due to sequencing singletons, which contribute more to the total variance in stool than in saliva, since Chao1 index does not take singletons into calculation. 76

Within a microbiome community, there are “dominate bacteria” which have higher relative abundances. To give a higher mathematical weight on these dominate bacteria, we used Shannon entropy and the inversed

Simpson index. Similar to the trend in species richness, Shannon entropy of saliva microbiome from the old cohort is lower compared to that from young cohort, and Shannon entropy of stool microbiome is opposite (Figure 2.2b). The measure of inverse Simpson index weights more on abundant taxonomies than the measure of Shannon entropy. We think this is another piece of evidence that the bacterial population difference between young and aged cohort resides in less abundant organisms within bacterial microbiota.

Besides the bacterial diversity, the evenness of bacteria community is another important character of gut microbiome. It is a mathematical description of how different the relative abundance of each species is.

30

One way to describe it is by increasing the mathematical weight on dominate species when calculate diversity and observe the sequentially decreased diversity.

Renyi diversity is designed to perform such a calculation. 152 When alpha-value increases, a heavier mathematical weight is given to more abundant organism. And the slope of the curve represents an overall evenness of microbial community (Figure 2.2c). Like the trends described above, significance of diversity difference is lost when give a higher alpha-value (higher weight on dominate species), which indicates that the bacterial community is more similar when only consider the most abundant organisms. This again confirmed that the variance between young and aged cohorts arises from organisms that are less abundant.

To understand our study power on detecting the difference between richness and diversity, we simulated the relationship between effect size, sample size, expected significance and study power (Figure 2.2d). The actual effect sizes of observed richness, Shannon Diversity and Inverse Simpson index are 0.94, 0.84 and

0.72 for saliva dataset, 0.93, 0.95 and 0.75 for stool dataset. As a result, the study power of our richness/diversity comparison is between 0.676 and 0.869 under a p-value equal to 0.05. This could be interpreted that if we accept the null hypothesis (richness/diversity are equal between groups), there is around 15~25% of a probability that they are not biologically equal on a given effect size. The lower effect size on comparing Inverse Simpson index is consistent with our result generating from Renyi Diversity calculation, which suggests that the more abundant species tends to be similar between two age groups.

31

a) b)

c) d)

Figure 2.2: Ecology Richness and Diversity of Microbiome a) Microbiome richness plot of saliva sample and stool sample. (* P<0.05, **P<0.01) b) Microbiome Shannon diversity and inverse Simpson index plot of saliva sample and stool sample. (* P<0.05, **P<0.01) c) Renyi Diversity of Microbiome in different age group. X axis represents the alpha value (increased alpha value means a higher weight on dominate organisms when calculating diversity), Y axis is the renyi diversity index calculated based on each alpha value. (* P<0.05, **P<0.01) d) Power estimation for microbiome richness/diversity calculation. X axis is the simulated effect size and Y axis is the power. Four panels represent the results under different significance value, significance values are labeled on the top of each panel.

32

Bacteria taxonomic signatures in young and old subjects

Although hierarchical clustering and PCA did not identify data from two age cohorts into distinct clusters, there is an evidence from ecological diversity calculations that some of the taxon may show a different abundant distribution between two age groups. To present those taxa that are different between two age groups, we performed a linear discriminant analysis based on effect size (LEfSe). 86 At significance level of 0.05 for factorial Kruskal-Wallis test, we identified 29 OTUs that are statistically more abundant in young cohort and 2 OTUs in old group within saliva sample. Also, the order of Bacteroidales, the class of

Bacteroidia and the phylum of Bacteroidetes is more abundant in young cohort, and the family of

Micrococcaceae is more abundant in aged group (Figure 2.3a). For stool samples, we identified 13 OTUs that are more abundant in young cohort, and 24 OTUs that are more abundant in aged group. Besides single

OTUs, there are two clusters that are higher in aged populations. One cluster is the order of Coriobacteriales, and consequently the family of Coriobacteriaceae, the phylum of Actinobacteria and the class of

Actinobacteria; another cluster is the family of Victivallaceae, order of Victivallales, class of Lentisphaeria and phylum of Lentisphaerae are more abundant in aged population (Figure 2.3a).

The power for estimating OTU abundance comparison largely relies on the effect size, therefore the effect size is simulated here using the actual OTU relative abundance table (Figure 2.6a). We first calculated the standard derivation of OTUs by sample types and the host-age group (Figure 2.6a). Most SDs of OTU are below 2.5%, so SDs ranging from 1% to 5% are used to simulate the relationship between effect size and the sample-mean differences. From the simulation (Figure 2.6a), we can conclude that the effect size=2 is a large effect, which means we can detect a 4% relative abundance difference under the most cases

(SD<=2%), those differences are likely to be biologically very meaningful. Effect size = 1 is a medium sized effect, which is good for generating hypothesis but hard to draw biological conclusions. Effect size that below 1 is a small effect which we may not have enough power to detect.

33

b)

Figure 2.3: Taxa modulated by age a) Age-related bacteria taxa calculated by LEfSE. Result is ordered with log-transformed Linear discriminant analysis score. Taxa with green bar are more abundant in young cohorts and with red bar are more abundant in older adults. Taxa are calculated from phylum level to single OTU. b) Random Forest result for using each OTU to predict age group. Random forest result plotted by “Mean Decreased Accuracy (MDA)”, which is calculated for each OTU that is the abundance distribution is randomly disrupted, how much of the accuracy will decrease when using this OTU to predict the age group of samples. For example, OTU_131 has a MDA larger than 200, it means if OTU_131’s distribution across samples is randomly changed, it will change the accuracy of using OTU_131 to determine which sample belongs to which age groups. Result was bootstrapped for 1000 times.

34

After we performed an effect size simulation on our dataset, we next estimated the statistical power of comparing OTUs relative abundance based on the simulated effect size (Figure 2.6a). As I mentioned in introduction, the multiple comparison may introduce Type I Error, which undermines our confidence on the detected differential-abundant taxa between groups. Therefore, we may need to apply more stringent significance cutoff if performing too many comparisons. For instance, we identified around 500 OTUs in our data, which belong to 136 genera and 15 phyla. When we expect an OTU to be different from young and old groups, a p-value of 1e-05 would be a conserved cut-off, and our study power under this significance value is 0.16. If less comparison is involved by raising a defined null hypothesis (ie. compare taxa on higher taxonomy level, or only focuses on high abundant OTUs), we will have a better power. The power to detect large difference on phylum (15 comparisons, p=0.0033 before adjustment) and genus level (136 comparisons, p=0.0004 before adjustment) are 0.96 and 0.74 respectively, to detect medium differences on phylum and genus are 0.31 and 0.09 respectively.

Since LEfSe have a chance of missing OTUs that less prevalent, we employed another two methods for picking up taxonomies that show different distribution across groups. Random Forest (RF) 153 is a widely used method for classification and is one of the most accurate method for factor variables classification[ref].

154,88 It generates a variable importance score for each variable to indicate whether using the value of the variable would degrade prediction accuracy. We trimmed out OTUs with global variables less than 0.005 in each site to remove less varied OTUs and obtained 258 OTUs in stool samples and 155 OTUs in saliva sample. Then we generated a list of 30 OTUs with highest importance value (Figure 2.3b).

35

Age-related Saliva Microbiome a

Not related Old related b Age-related Stool Microbiome Young related

c

Figure 2.4: Relative abundance of age related bacteria a) Microbiome core reads percentage to total reads for each subject. Each column represents a subject. X axis shows the subject age. Gray part is the microbiome reads that do not linked to the two age groups. b) Venn Diagram comparing 3 methods. Upper left panel is Saliva Young microbiome that has been found by 3 different method, numbers outside circles represent the number of OTUs generated by each method, upper area is the number for Random Forest, lower right is for Binary (prevalence) and lower left is for LEfSe.

36

If an OTU has a low relative abundance, but show distinct prevalence distribution between age groups,

Random Forest may assign this OTU with low importance score since its importance estimation is based on the entire population. To compensate this, we also listed the OTUs that present in one age cohort but absent in another regardless of its abundance, and we named this type of OTUs as the “Binary” OTUs.

Among all saliva samples, there are 103 binary OTUs in young population and 27 binary OTUs in aged population. Among stool samples, there are 24 binary OTUs in young population whereas there are 131 binary OUTs in aged population (Table 1). A Venn Diagram summarizing the relationship of three methods is showed in Figure 2.4c.

There is also statistical estimation on the detection power of “Binary” OTUs. For the presence-absence analysis, we cannot assume the probability of presence or absence of an OTU follows Gaussian. The presence-absence probability matrix is a classic example of binomial distribution. Thus, understanding the power of such comparison is based on binomial distribution.

To estimate the study power for presence-absence analysis, the first parameter we need to determine is the background of presents (level of background noise). The background means how likely an OTU is artificial

(not biologically real but being identified as an OTU by sequencing). During the data processing, we already accommodated the possibility that OTU is artificial by removing OTUs that have less than 5 reads (see method), so all remained OTUs are considered biologically present. Thus, the minimum valid detection for any given OTU is 1 out of 46 samples, and the detection limit (noise) was set to be 0.5 out of 46.

Under this level of background noise, the study power to detect a difference on presence-absence is a simple function of how many cases of presence can be detected (Figure 2.6b). In our current design (red line), only the cases that one OTU is detected four times (40%) or more have a good detection power. Figure 2.7 shows the taxa that are found more than four times, and this simulation provided the rationale for such selection.

37

Combining the three methods described above, we conclude that 155 OTUs from 381 OTUs in saliva samples, and 189 OTUs from 405 OTUs in stool samples are related to different ages (Table 1). The reads belong to those OTUs only occupy approximately 10% of relative abundance in saliva samples and 30% of that the stool samples, although it is about half of the number of OTUs. If we merge all age-related OTUs together, the relative abundance of this merged bacteria cluster is similar across samples from the same age group (Figure 2.4a, b). Among 155 age-related OTUs in saliva, 120 are more abundant or only present in young cohorts, 35 in old (Table 1). Among the young cohort related microbiome across 23 saliva samples

(Figure 2.5a), 43.06% of the reads belongs to genus Prevotella, followed by Unclassified Prevotellaceae

(12.89%) and (7.53%). For advanced age related OTUs in saliva samples (Figure 2.5b),

Rothia occupies 81.15% of entire advanced age-related population, followed by Capnocytophaga (10.21%) and Prevotella (3.37%).

In stool samples, 39 OTUs are more abundant or only present in young cohorts and 150 OTUs are in old cohorts (Figure 2.5c and Figure 2.5d). Among young cohort associated gut microbiome (Figure 2.5c),

85.12% belongs to Bacteroides. This is consistent with several previous reports. 155-158 With 10 most dominant genus of young cohort related stool microbiome, half of them belongs to order Clostridiales, indicating that the order Clostridiales may be sharply affected by age. Among 53 genera found more abundant in aged cohort (Figure 2.5d), Lachnospiraceae Incertae Sedis and Unclassified Lachnospiraceae contribute to 26.26% of the entire population. 6.7% belongs to Bacteroides, 5.87% of this core population is Alistipes. This genus has been reported more abundant in aged mouse gut, 159 as well as associated with more frail aging population in Europe. 160 It is worth noticing that there are many unclassified bacteria

OTUs in advanced-age related gut microbiome. Among those 150 OTUs, 7.23 % are unclassified

Ruminococcaceae (0.56% in young cohorts), 2.24% are unclassified Clostridiales (0.35% in young cohorts),

3.5% are unclassified Bacteria, and 2.86% are unclassified Prevotellaceae.

38

Table 1. OTUs identified by different methods

Saliva Stool Total number

Total OTU number 381 405 548

Total Reads Count 331977 267234 599211

Young Old Young Old

Age Associated OTU 120 35 39 150

Age Associated Reads 20038 21189 60876 26232

Binary OTU 103 27 24 131

Binary Reads 3105 219 1657 11866

Lefse 29 2 13 24

Lefse Reads 16331 16627 60262 10584

Random Forest 23 7 12 18

RF Reads 7513 5120 7926 8806

39

Finally, we also want to get a general overlook on the age-related bacteria species that we have more confidences. Based on our power estimation, we removed the age-related OTUs that are less prevalent (40%) in one cohort and show an obvious distribution that associated with aging (Figure 2.7). Among those OTUs,

OTU_197 (Figure 2.7) Tannerella was more prevalent and abundant in aged saliva sample than in young saliva sample. Given the fact that and are among 3 major pathogens that involved in periodontitis, 161 this could serve as a hint that oral cavity environment is friendlier to Tannerella growth in old individuals. Moreover, OTU_190 (Figure 2.7) belongs to a species of Prevotella. This OTU could only be found in young saliva cohorts and only present 164 reads in the entire sequencing. The same pattern was found in stool sample for OTU_447 (Figure 2.7). The loss of those Prevotella species may not affect the whole Prevotella population in terms of overall relative abundance, but it is very interesting to us that why a specific species (OTU) could appear or disappear during aging process.

40 a) Saliva microbiome related to young subjects b) Saliva microbiome related to old subjects

c) Stool microbiome related to young subjects d) Stool microbiome related to old subjects

Figure 2.5: Age related microbiome a) Saliva microbiome related to young subjects contain 120 OTUs that belongs to 56 genera, the most abundant 15 genera was listed here, and the other 41 genera are combined and plotted by “Others”. b) Saliva microbiome related to old subjects contain 35 OTUs that belongs to 24 genera, the most abundant 10 genera was listed here, and the other 14 genera are combined and plotted by “Others”. c) Stool microbiome related to young subjects contain 39 OTUs that belongs to 53 genera, the most abundant 10 genera was listed here, and the other 28 genera are combined and plotted by “Others”. d) Stool microbiome related to old subjects contain 150 OTUs that belongs to 53 genera, the most abundant 15 genera was listed here, and the other 38 genera are combined and plotted by “Others”.

The percentage abundance is calculated by dividing the reads of each OTU to the total reads of this sample. A combined percentage plot is showed with circle plot which is a sum of the percentage abundance of each genus from all 23 subjects.

41 a

Figure 2.6: power estimation forage-related OTUs b comparison a) Effect and power simulation for t test based OTU comparison. Upper left panel shows standard derivation (SD) of OTUs by sample types and the host-age group. Upper right panel shows effect size simulation based on different SD (from 1% to 5%) and differences between sample-mean(from 0 to 5%). Three lower panels show power estimation based on different sample size and effect size. Expected significance values are labeled on top of each panel. b) Power estimation for OTU presence/absence analysis. X axis shows the percentage of presence for any given OTU, Y axis shows the power to detect a difference between presence and absence. Color indicate how many samples are included in each group.

42

Young subjects Old subjects

Figure 2.7: Representative OTUs with more than 40% prevalence The relative abundance of age-related OTUs that are more than 40% prevalence. Each dot represents one subject, with x axis shows this subject’s age and Y axis shows the percentage of read that belongs this OTU in each sample. Total reads across sample and subject was listed. Title shows phylum, class, order, family, genus and OTU number, “Unclassified” means this OTU could only be classified at a higher taxonomy rank.

43

Discussion

This chapter presented our pioneer study which investigated the variations of microbiome in healthy aging of individuals from both stool and saliva samples. In spite of this being a small pilot feasibility cross- sectional study, we were able to identify some statistically significant differences of saliva and stool microbiome between aged healthy cohorts and young controls. However, the dominant taxa and the overall population (as tested by PERMANOVA) of microbiome in young and old subjects are not significantly different. One possible explanation for this observation is that the dominant microbiome species is relatively stable during the aging process, especially when the subject is relatively healthy. This is confirmed later in a larger survey regarding the same topic. 158 We also provided a detailed analysis on power estimation since our study serves as a pilot, hypothesis-generating effort. The statistical power estimation discussed in this chapter can be served as effect size and sample size estimation guidelines for future experimental design.

In our study, the gut microbiome in the aged cohort shows a trend of higher ecological richness and diversity.

The increased gut microbiome diversity has been linked to aging in other studies as well. For example, higher richness of the bacterial microbiome has been observed in Drosophila melanogaster, 162 and an increased count of facultative anaerobes have been reported in antibiotics-free old subjects. 145 We think this increase in gut microbiome richness (or facultative anaerobes) of aged cohorts could be ascribed to a decline in the immune function in the gut. 163,164 For example, when hosts fail to provide inhibiting signals

(such as anti-microbial peptides, specific and non-specific immune response) for bacteria growth, certain bacteria that are less sensitive to these inhibiting signals would start to colonize in the gut. Thus, the bacteria that colonize in older adults but cannot be detected in young subject may contribute to some age-related phenotypes. For instance, overgrowth of Clostridiales subpopulation in older adults is considered detrimental. 165 In our study, we found nine species belong to Clostridiales that are at least 40% abundant in of aged cohorts.

44

We next noticed the oral microbiome shows a lower richness and diversity in the older-adults group. there are several interpretations for this observation. First, this could be a result of different nutrition availability between oral and gut environment. The food digestion mainly occurs in gut, so that nutarians available in gut are richer than nutarians available in saliva. As a result, oral microbiome may response to aging differently compare to gut microbiome, both because of the bacteria species of the community are different and because the stability of the community is different. Second, it may because oral environment of older adults is a more difficult for bacteria colonize and growth than that of young. Phenotypes that are unfriendly to bacteria growth, such as tooth loss and Xerostomia (dry mouth), 166 are found more prevalent in old subjects.

Based on our power simulation, the power to detect the difference of a richness/diversity between young and older adults is between 0.676 and 0.869. This indicates that our study is not underpowered for such comparisons. We also demonstrated that when a more stringent p-value is set, more samples are required to make the comparison. To achieve a high level of confidence on diversity differences (p-value < 0.01), at least 20 biological replicates with a p-value equal to 0.001 is required for each group of the study.

In addition, we identified a highly abundant saliva OTU in gut microbiome of older adults, which may suggest a higher potential of microbial migration from saliva to gut in older adults. The concept of microbial migration has been proved and used as a signature of liver cirrhosis associated with microbiome dysbiosis.

167 In our cohort, OTU_33 is identified as the second largest OTU of Streptococcus in saliva samples. The stool abundance of this OTU is significantly higher in old subjects, which indicate a possible oral microbiome migration in older adults. We think this observation may be an indication of general liver functional decline that associated with aging.

Studies 168,169 on the human oral microbiome showed that patients with periodontitis had a low saliva bacteria ecological diversity due to the overgrowth of Selenomonas and Streptococcus. They proposed that

45 the imbalanced bacterial population may contribute to the inflammatory status of periodontitis. In our study, we also noticed one Selenomonas OTU (OTU_394) was more abundant in elder cohort than in young cohort

(62% vs 20%, respectively). Similar pattern was observed for Tannerella (OTU_197), which was contributed to periodontitis. The higher abundance of those bacteria in older-adults cohorts may explain why older adults are more susceptible to periodontitis. 161,169

However, we do notice a major limitation in this study, which is the small sample size and thus low statistical power. Although we do not have a good power (0.16) to detect OTU level differences for the current hypothesis, we do have a good statistical power to determine if we can reject the null hypothesis which is: the most abundant 20 OTUs between young and old subjects are the same in their mean relative abundance. Therefore, one of the conclusions in this chapter is that the dominant gut and saliva microbiome is relatively stable between young and old subjects. Based on our power estimation, we believe this conclusion is not underpowered and have biological significance. The ELDERMET project 160 and another project 143 also confirmed age is not a dominate contributor of gut microbiome inter-person variation among old. In fact, there were only 13 and 9 young-subject controls included in the study respectively, so that the detection power in two published studies is not very different from our study design. Remarkably, one of the well-designed studies, which included more than 1000 samples, directly confirmed our conclusion, that dominate gut microbiome is relatively stable, but subjects around 20-year-old show lower Shannon diversity because their Prevotella and Bacteroides are relatively more dominate. 158

With our current study design, we have good detection power to distinguish whether bacteria presented more than 4 times are significantly different from bacteria that are never present. Also, our way of combining three methods together eliminates the problem of lowering detection power introduced by multiple testing. By merging those OTUs together and adjusted for SD (5% increase), our power of detection rises from 0.16 to a value between 0.47 (to detect medium effect size) and 0.80 (to detect large effect size).

46

In conclusion, we provided an overview of microbiome composition data in 23 human subjects in this chapter. By collectively using three statistical methods for comparison, we identified difference in richness and diversity between young and older adults’ microbiome. We also confirmed that the dominant organisms are relatively stable between two age groups. Therefore, the difference on richness and diversity is due to changes of less abundant organism. Also, we identified a salivary bacterial OTU which was more abundant in older adults’ gut microbiome. This observation exhibits signs of microbiome migration associated with aging. As a pilot project, this study not only showed power estimation for each step, but also offered a good size estimation for a study design aiming to understand difference in microbiome composition between two populations.

47

Chapter 3

Background

In the previous chapter, we described a detailed comparative analysis of microbiome from young and older individuals. We showed that the dominant members of the microbiome (defined by the relative abundance of 20 most abundant bacteria OTUs) is relatively stable across the age-related groups. However, differences do exist in ecological richness (the number of bacteria can be identified) and the ecological diversity of microbiome between young and older adults. Our analysis showed that bacteria with low relative abundance are contributing to the variations in ecological richness and diversity across the age-related groups. As a pilot study, the previous chapter also provided an example of statistical power and sample size estimation for future studies with similar experimental designs. In this chapter, we will extend our investigation of the microbiome plasticity to the study involving hosts with early stage type 2 diabetes (T2D). The overall hypothesis is: The interactions between microbiome and immune system are linked to the onset and progression of prediabetes.

A dysregulation of microbiome, or microbiome dysbiosis, is commonly found in patients with type 2 diabetes 170,171. A pioneer study comparing 18 diabetic males with 18 healthy males demonstrated that phylum Firmicutes and class Clostridia were reduced in diabetic individuals. 172 A larger study in European revealed a decrease in relative abundance of five species and an increase of Lactobacillus in gut microbiome of T2D patients, 173 which is very similar to the result of a study in Japan. 174 Furthermore,

Qin and his colleagues established a correlational association between the loss of certain clostridia population in T2D patients and a reduced SCFA production. 171 Later study confirmed this observation, and demonstrated that supplementary of SCFA-producing bacteria strains to T2D patients would improve their post-treatment clinical outcomes. 175 Meanwhile, the loss of microbial ecological diversity, another signature of the gut microbiome dysbiosis, 176 has been associated with obesity, 12 and T2D. 171,172 These

48 evidences collectively pointed out the possibility that dysbiosis might be associated with T2D onset and progression. Furthermore, transferring the gut microbiome from lean individuals to patients with metabolic syndrome can increase the insulin sensitivity of the recipients. Among 16 bacteria strains that increased in gut microbiome of recipients post procedure, 12 strains belong to the class of Clostridia. 177

Given that type 2 diabetes are associated with inflammation 178,179 and signs of bacteria infiltration to host peripheral blood 180,181, we believe that the immune system regulating the gut microbiome may be involved in the development of T2D. Th17 cells are a subset of CD4 T helper cell that can be induced by combination of IL-6, IL-1β, and TNF-α. 179 This type of immune cells is closely related to the host regulation of gut microbiome through the secretion of anti-microbial peptides and the maintenance of gut epithelial barrier functions, 179 the disruption of which is a direct cause of bacteria infiltration. 182 Thus, Th17 cells are involved in maintaining the homeostasis between host and the gut microbiome. Dysfunction of Th17 has been reported correlating with gut microbiome dysbiosis in patients with multiple sclerosis, 183 IBD, 184 and systemic lupus erythematosus. 185,186 IL-17 and IL-22 are two type of cytokines that preferentially produced by Th17 cells187,188. Although the functions of IL-17 and IL-22 are not quite clear yet, it is generally believed that the two cytokines play a role in antimicrobial activities. 189-193 Reciprocally, signals from the gut microbiome could also affect Th17 cells activities in hosts. Germ-free mice do not carry Th17 cells in the gut, 92 but the inoculation of certain bacteria to germ-free mice can induce the development of Th17 cells.

92,194 Interestingly, Atarashi and his colleagues demonstrated that the bacteria can induce Th17 cells are mainly from clostridium cluster XIVa, XIVb, XVIII, IV, and the class of Bacteroidia, Actinobacteria. Also, the authors claimed that the attachment of bacteria cell to gut epithelium was very important to the induction of Th17 cells in mouse model. 195,196

The interaction between Th17 cells and the microbiome plays an important role in the onset of T2D in mouse models. TLR5 signaling is critical in the production of IL17 and IL22, 197 and a report showed that

TLR5 -/- mice, whom cannot detect the flagellin from the gut microbiome, developed metabolic syndrome

49 spontaneously. 94 Also, the IL-17 deficient mice showed worse body weight control post HFD induced obesity. 198 This indicates that immune system mediated bacteria host interaction is required for host energy homeostasis. Garidou and his colleague showed that high fat diet (HFD) induced metabolic disorder in mice was associated with a reduction of CD4+ IL17+ cells population in the small intestine lamina propria (SILP)

199. One possible explanation, as indicated by later studies, is that the HDF induced gut microbiome dysbiosis reduced the microbiome-produced metabolites that were required for maintaining Th17 cells.

200,201 Consistently, mice fed with HFD also decreased the expression of IL-22 in the intestines. 200,202

Similarly, in human subjects, IL-22 level in serum has been found significantly lower in patients with impaired fasting glucose and T2D. 203 Another group demonstrated that the lack of IL-23 mediated Th17 cells maintenance increased the prevalence of glucose intolerance and insulin resistance in mice. 204 Lastly, studies showed that both adaptive transfer of Th17 cells to gut 205and the supplement of IL-22 202 alleviated the insulin resistance and metabolic disorders in mice.

In humans, the relationship between Th17 cells, their related cytokines and diabetes is less clear. As mentioned above, one recent study found the serum level of IL-22 was negatively correlated with the onset of T2D in humans. 203 However, other studies also claimed that IL-1β and IL-22 were the key contributors of obesity mediated metabolic disorders. 206 In addition to IL-22, there are also debates over whether IL-17 plays a role in obesity mediated metabolic disorders. It was found beneficial in some studies. One study concluded that the decreased serum level of IL-17 was associated with metabolic syndrome and T2D related nephropathic complications. 207,208 Another study showed that, comparing with healthy controls, IL-17 level was higher in diabetes patients without diabetic retinopathy (relatively early-stage of T2D) but was lower in diabetes patients with diabetic retinopathy (relatively late-stage of T2D). Also, the production of IL-17 in PBMC upon phytohaemagglutinin and ionomycin stimulation was negatively correlated with patient

T2D duration. 209

50

On the contrary, the level of IL-17 has been proposed to be an independent predictor of future development of Insulin resistance in a Finnish female cohort. 210 Also, Th17 cell isolated from human T2D patient show higher level of IL-17 production post T cell mitogens stimulation. 211 Furthermore, the number of Th17 cell has been found positively correlate with severity of both T2D disease 211 and T2D related complications.

212,213

Therefore, more studies are required to understand the link between Th17 cell, diabetes and microbiome.

With the access to iHMP data, we hypothesize that: the reduction of Th17-related cytokines in serum is linked to gut microbiome dysbiosis, indicating the onset and progression of prediabetes in humans.

51

Figure 3.1. GMM model to group IL17 and IL22 into different gaussian distribution

a) Model selection based on Bayesian information criterion Our selection

Number of clusters

b) Input variables and clustering result based on the three-cluster assumption

Figure 3.1. GMM model to group IL17 and IL22 into different gaussian distribution a) The Bayesian information criterion (BIC) simulation for the potential group number (Number of components). The highest three BIC indicate the most likely numbers of groups based on GMM. b) Pairwise correlation of three cytokines mean value and SD. Color represents the assigned group of each sample point(blue=inactive group, red=intermediate, green=active group)

52

Result

Description of iHMP Cohort

According to the protocol of the iHMP, 105 individuals at risk of developing T2D were followed for a period of 1-4 years. During the study period, detailed multi-omics profiling was carried out four times per year, and more frequently during periods of self-reported stress, or upper respiratory infection [Ref main manuscript]. In this chapter, we used the repeated sampling of subjects as a basis for an analysis in which each subject was assumed to represent a fixed state relating to development of T2D.

Within-individual cytokine variability identifies discrete groups of Th17 activity among iHMP subjects

First, we hypothesize that the onset of metabolic disease is associated with altered Th17 cell cytokines production. To test this hypothesis, we tried to separate individuals based on their IL-17A, IL-17F, and

IL-22 cytokines levels in the serum. This is based on the previous observation that serum IL-17 and IL-22 level can be used as indicator of Th17 cell activity in hosts with rheumatoid arthritis, 214,215 Psoriasis, 216

Takayasu Arteritis217 or Crohn's disease. 218,219 We clustered individuals based on their serum-cytokine levels by Gaussian mixture modelling (GMM). GMM is a modeling method designed to identify the possible numbers of gaussian distributions in a given dataset. In our analysis, we used GMM to identify how many gaussian distributions can be found in iHMP participants based on their IL-17A, IL-17F and

IL-22. We supplied the mean cytokine abundances and variance estimations over time for each subject to the model, which then indicated that the participants could be optimally separated into at least 3 groups based on Bayesian information criterion (BIC) (Figure 3.1a). Differences between the three groups are illustrated in a pairwise correlation plot (Figure 3.2b). Individuals at one gaussian distribution, characterized by consistently low cytokine mean abundances and variances (Blue Dots), can be separated from subjects at the other distribution, by high mean abundances and/or variances for one or more

53 cytokines (Fig. 3.2b). Based on the result from GMM, we concluded that longitudinally-sampled serum cytokine levels can distinguish discrete groups of subjects, with differences in cytokines that may reflect

Th17 cell activities. We hence referred to the groups as ‘High Activity’ (HA), ‘Indeterminate Activity”

(IA), and ‘Low Activity’ (LA) (Fig. 3.2a, b), to represent their inferred levels of Th17 activity.

Importantly, we noticed that inter-individual variation does not overwrite the intra-individual variation

(Figure 3.3a, b). This result lead us to speculate that identifying subject-clusters in this manner required estimating both inter- and intra-individual variation, which is not possible in cross-sectional data and hence necessitated a longitudinal study design.

54

Figure 3.2. Participants grouped according to IL-17/IL-22 cytokines a b

Figure 3.2. Grouping of iHMP participants based on Th17 related cytokine abundance a) Gaussian Mixture Modelling of cytokine mean abundance and variance (see Online Methods) separates study participants into three discrete groups (columns). Lines within each panel represent repeated measurements of serum cytokine abundance for one individual over the study period. Rows represent serum cytokines associated with Th17 activity (IL-17A, IL-17F, IL-22). CHEX4 is a measurement of background fluorescent intensity, and can be treated as an negative control. (Note different scales on Y axis for each row). b) Density distributions show the abundance of each cytokine within each group. Dashed lines represent the lower limits of detection (LLOD) for the Luminex assay used to calculate cytokine abundance (see Methods). LLODs are calculated as 3 standard deviations above the mean absorbance for the negative control associated with each measurement. They can be interpreted as conservative thresholds below which the accuracy of measurements may become limited. (IL17A: 64.4; IL17F: 140.5, IL22: 199.9)

55

In fact, the cross-sectional design of similar experiment will generate cases that one measurement is not enough to detect the possible existence of certain cytokines. We demonstrated this using an example

(Figure 3.3c). We first took one random sample from each individual and generated a simulated cross- sectional design of the same experiment. By comparing the IL-22 value from this simulated cross- sectional experiment with the mean value of IL-22 from longitudinal design, we can identify 13 individuals whose IL-22 measurements are under the lower limit of detection in randomly-subset samples but are above the lower limit of detection in mean value from repeated measures. Therefore, we believe that the cross-sectional design will generate a great amount of false positive signals in “LA” group, because we showed that one measurement is not enough to capture their real signal related to IL-22. This simulation justified our advantage of using longitudinal measurements from iHMP study.

Also, mean values from repeated measurements will better fit the GMM because the distribution of the sample means will approach normally distributed when the sample size increases according to the central limit theorem. There is no clear definition on how many samples are required in central limit theorem. To get a relatively normal distribution without dropping too much samples, we excluded individuals who had been sampled less than 5 times from our analysis.

56

Figure 3.3 Comparison of the longitudinal design with the cross-sectional design

a b

● ● ● ● ● ● ● ● ● ● 8 ● 20 ●● ● ● ● ● ● ● ●●● ● ●● ● ●● ● ● ● ● ● ● ● ● 6 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●● ● ● ● ● ●●● ●● ● ● 10 ● ●● ● ●● ●●● ● ● ●● ● ● ● ● 4 ● ● ● ● ● ●● ● ●● ● ● ● ● ● ● ●● ● ●● ● ● ●● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ●●●●●●● ●● ● ●●● ●● ●●● ● ● ●●● ●●●●● ●● ● ● ● ● ● ●● ● ● ● ●●●●●●●●● ● ● ● ● ● ● ● ● ● ● ● ● ● 2 ● ● ● ●● ● ●● ● ● ● ●●● ● ● ●● ● ●●● ● ● ● ●● ● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ●● ●● ● ● 0 ● ● ● ●● ● ● ●●●● ● ● ● ● ● ● ●● ● ●● ●●● ●● ●●●● ● ●●● ●●●● ●●● ●●●●● ● ● ● ● ● ● ●●● ● ● ● ●●●●● ●●●●● ● ● ● ●● ● ● ●●● ●●● ● ●●● ●● ●●● ●● 0 ●●●●● ● ●●●●● ● ●● ●●●●●● ● ●● ● ●● ●●●● ● ● ● ● ●●●●●● ●● ● ● ● ● ● ● ●● ●● ● ●●●●● ● ●●●● ●●●●● ● ● ● ● ● ● ●● ●● ● ●●●●●● ● ● ●● ● ● ● ● ● ● ●●●● ●● ●● ●●●● ● ● ● ● ●●● ● ● ●●● ● ●● ● ● ●● ●● ● ● ● ● ●●● ●●●●●● ●●● ●● ● ● ● ● ● ● ●● ●●●●●● ●● ● ● ●● ● ● ●● ●● ● ●●●●●●●●●● ● ● ● ● ● ● ● −2 ● ●●●●●●●●●● ● ● ● ● ●● ●●●● ●●●●● ●●●● ● ● ● ● ● ●●●●●● ● ●●● ●● ●●●●● ● ● ●

PCO2 (10% variance) ● PCO2 (12% variance) ●● ● ●●● −10 ● ● ● ●●● ● ● ●● ● ● ● ● ● ●● ● −4 ● −40 −30 −20 −10 0 10 20 −10 0 5 10 15 20

PCO1 (18% variance) PCO1 (32% variance) c

Figure3.3 Comparison of the longitudinal design with the cross-sectional design

Non-metric multidimensional scaling (nMDS) plots of study participants based on a) microbiome compositions and b) cytokine profiles. Repeated measurements from different individuals are colored separately to highlight the relationship between intra- and inter-individual variation. c) Compare the longitudinal design with the cross-sectional design for measurements of IL-22. One random sample from each participant was taken in order to simulate a dataset for cross-sectional study design, the blue box cover samples that will be under the lower limit of detection for cross-sectional design, but above the lower limit of detection under longitudinal design.

57

Th17 HA individuals display milder metabolic syndrome related phenotype

After we demonstrated that individuals may be assigned to three discrete groups, we reasoned that discrete Th17 cytokine profiles in three groups may reflect different stages of metabolic disorders progression. Previous studies in mice have demonstrated that loss of Th17 cells is an outcome of high fat diet induced metabolic syndrome. 199 While studies in humans have shown a negative correlation between serum level of IL-22 and the phenotypes of T2D. 203 Based on the results from these studies, we hypothesized that subjects with a Th17 LA profile would show more severe T2D-related phenotypes than subjects with a Th17 HA profile.

To test this hypothesis, we firstly compared the insulin sensitivity among participants in different groups by using the steady state plasma glucose (SSPG) measurement which was available for two thirds (n=65) of study participants. Based on this measurement, individuals could be diagnosed as “insulin sensitive” or

“insulin resistant”. We observed that SSPG levels were significantly higher in Th17 LA individuals compared to Th17 HA individuals (Wilcoxon test, W=64.5, p=0.021, Figure 3.4). As a result, Th17 LA individuals were more frequently insulin resistant (Figure 3.5a). Consistent with this result, we found three-month average glucose (A1C) level tends to be slightly higher in Th17 HA group, but not statistically significant (Figure 3.6). However, mean fasting plasma glucose (FPG) levels, age, and body mass index (BMI) did not vary significantly across groups (Figure 3.5b).

58

Figure 3.4 Steady State Plasma Glucose measurements by group mg/dl

Figure 3.4 Steady State Plasma Glucose (SSPG) measurement by group. P-values for pairwise Wilcoxon test are labelled above the bar plot, the p-value for a one-way ANOVA test is labelled under the bar plot.

LA: Th17 Low-Activity group, IA: Th17 Indeterminate-Activity group, HA: Th17 High-Activity group

59

Secondly, we modelled longitudinal clinical data collected across the 4-year study period to investigate the metabolic profiles difference between groups. Using the linear mixed model, we looked for multiple markers that may suggest differences between Th17 LA subjects compared to Th17 HA or indeterminate subjects. Interestingly, Th17 LA individuals showed significantly lower plasma high-density lipopolysaccharide (HDL, Figure 3.6) and higher plasma triglycerides (TGL, Figure 3.6). Both a decrease of HDL and an increase of TGL has been observed in T2D patients, indicating our LA group has higher risk for T2D development. 220 Additionally, serum sodium and chloride levels were higher in the

Th17 LA group (Figure 3.6). From previous studies, we know that high insulin might be associated with increased sodium retention in T2D patients, 221 so this result is also in line with our hypothesis that Th17

LA group showed higher risk for T2D.

Finally, we examined a red blood cell volume measurement among our participants. The rationale for this analysis comes from a previous large-scale study which identified low red blood cell distribution width

(RDW) as being associated with low HDL and high TGL, and thus negatively correlated with risk for

T2D. 222 As we expected, the RDW was significantly lower in Th17 LA subjects among our iHMP participants. (Figure 3.6).

60

Figure 3.5. Age, BMI and glucose level of subject from different groups

a b

Group 1: Th17 LA Group 2: Th17 IA Group 3: Th17 HA Relative percentages

Group Group

Figure 3.5 Age, BMI and glucose level of subject from different groups a) Percentage of insulin resistant individuals in each of the group. Total number of individuals are labelled on top of each column. IR: Insulin Resistant; IS: Insulin Sensitive b) Comparison of age, Body Mass Index(BMI), and Fasting Plasmid Glucose (FPG, mmoL/L) across groups. Pairwise t test was performed(* P<0.05, ns: p>0.05)

Group 1: Th17 LA Group 2: Th17 IA Group 3: Th17 HA

61

HDL Glucose −20 −10 0 10 20 −10 −5 0 5 10 Adj.age Adj.age BMI BMI GenderM GenderM Days Days

x3 x3

x2 x2

A1C Triglycerides −0.4 0.0 0.2 0.4 −100 −50 0 50 100

Adj.age Adj.age BMI BMI GenderM GenderM Days Days x3 x3 x2 x2

CL Sodium −2 −1 0 1 2 −2 −1 0 1 2 Adj.age Adj.age BMI BMI GenderM GenderM Days Days x3 x3 x2 x2

Figure 3.6 Longitudinal modelling of serum RDW metabolism marker and blood cell volume marker −1.0 −0.5 0.0 0.5 1.0 show higher type 2 diabetes risk in individuals from

Adj.age LA group BMI X3 represents the Th17 HA group, X2 presents IA group, the GenderM Days group is treated as categorical variables (see Online Method). MCV Sex, age (adjusted for the length of their study involvement), MCH BMI and length between project start to each sample collection x3 x2 (Days) are included in building the model. Regression coefficient (beta) is plotted for each variables. Thick lines for each variable indicate 50% Credible intervals or ± 1 SD, thin lines for 95% CI or ± 2SD.

62

In conclusion, analysis on multiple biomarkers of metabolic disease progression indicated that Th17 LA subjects display a more severe clinical phenotype than Th17 HA subjects, supporting our hypothesis that loss of Th17 activity may be associated with the onset and progression of T2D.

Th17 HA individuals have a gut microbiome characterized by a high level of Clostridia

Based on the close association between Th17 cells and the gut microbiome, 92,223 we next asked whether subjects distinguished by Th17 cytokine profiles differed in the composition of their gut microbiome. As a subject’s gut microbiome remained relatively stable throughout the course of this study (Figure 3.3), 224 we began by comparing mean microbiome abundance values for each subject between Th17 LA and

Th17 HA groups. Th17 LA subjects had a significantly lower alpha diversity compared to Th17 HA subjects (Wilcoxon test, W=34.0, p=0.003, Figure 3.7)) and a lower Firmicutes: Bacteroidetes (F/B) ratio

(Wilcoxon test, W=54.0 p=0.044, Figure 3.7). Furthermore, in order to take the repeated measurement design into our comparison, we performed LMM to address the same question. The results from LMM were consistent with our comparisons made on mean values (Figure 3.8, 3.9). Multiple studies also demonstrated a functional association between members of the Firmicutes (taxa associated with the class

Clostridia) and Th17 cell proliferation. 196,225 Our results therefore indicate that the gut microbiome of

Th17 HA individuals is enriched in microbial taxa known to influence Th17 cell abundance.

63

Figure 3.7 Differences in the gut microbiome of HA and LA individuals a b c

Figure 3.7 Differences in the gut microbiome of HA vs LA individuals a) Shannon diversity estimates for the HA group and LA groups. Mean value of diversity for each participant across the study period is used to generate this plot. The p-value from a Wilcoxon test is labelled above the plot. b) Firmicutes to Bacteroidetes ratio of HA group and LA groups. Mean value of F/B ratio for each participant across the study period is used to generate this plot. The p-value from a Wilcoxon test is labelled above the plot. c) Cladogram representing the LEfSe results for comparing taxa abundance between HA and LA groups. Circles on the cladogram represent the phylogenetic relationship of taxa that are tested, with phylum at the center and OTU on the edges. Each point represents a taxonomic unit. Red color covering a dot/region indicates the taxa that are more abundant in the HA group, blue color covering a dot/area indicates the taxa are more abundant in the LA group.

64

To further investigate the relationship between Th17-related cytokines and the gut microbiome, we used data from Th17 HA and Th17 LA individuals to perform longitudinal modelling of the pairwise relationships between individual bacterial genera and cytokine abundances. Previous studies have identified significant associations between cytokines and the gut microbiome, but their cross-sectional design means they have focused on inter-individual variation. 121 By using the repeated measurements for each individual in the iHMP study, we addressed the possibility that intra-individual variation in cytokine/bacteria abundance may be used to identify host-microbe associations of biological interests. We used a mixed-effects model with random effects introduced by repeated measures to accommodate within-subject correlation. To accommodate the zero-inflated nature of microbiome abundance counts we used a Bayesian framework with a negative binomial distribution with a sparse matrix. Th17 groups (HA vs. IA) and cytokines (either IL-17A or IL-17F) were included as fixed effects, as was an interaction term for the combined effect of Th17 group and cytokine on microbe abundances. We excluded IL-22 excluded from this analysis because a large proportion of IL-22 measurements appeared to be below the sensitivity threshold of the Luminex assay (Figure 3.2) and were not reliable for longitudinal modelling.

We included the Th17 cell related cytokine interaction-term with the specific aim of testing the hypothesis that significant microbe-cytokine associations may be detected in Th17 HA individuals that are not present in Th17 LA individuals.

65

Figure 3.8. Longitudinal modelling of Shannon diversity show higher diversity in HA group

Th17 HA group

Th17 IA group

Figure 3.8. Longitudinal modelling of Shannon diversity show higher diversity in HA group The sex, age and BMI is included to build a model for the relationship between Th17 related groups and their gut microbiome Shannon diversity. Regression estimates are plotted with thick lines for each variable indicate 50% Credible intervals or ± 1 SD of the correlation coefficient, thin lines for 95% CI or ± 2SD.

66

Longitudinal modelling of microbe cytokine interactions indicated that the abundance of genus Alistipes varied significantly in response to changes in IL-17F level in serum (Figure 3.10). As models were designed to compare cytokine vs. microbe interactions in the context of Th17 activity (encoded as HA vs.

IA), this result indicates this genus was significantly associated with IL-17F in Th17 LA individuals. In contrast, seven bacterial taxa were significant for the Th17 group: cytokine interaction term (Figure

3.10), indicating their abundance was significantly associated with IL-17F or IL-17A cytokine level in

Th17 HA individuals only. Notably, 6 out of 8 of these significant relationships involved taxa belonging to the class Clostridia, and in all 8 cases, cytokine abundance was positively correlated with taxon abundance (Figure 3.10). In conclusion, analysis of variation in the taxonomic abundance of gut microbiota between individuals (Figure 3.2) and within individuals (Table 1) both provide evidence that members of the class Clostridia are associated with increased levels of Th17 cell activity.

67

Figure 3.9. LMM model to compare Firmicutes/Bacteroidetes ratio

Th17 HA group

Figure 3.9. Longitudinal modelling of Firmicutes to Bacteroidetes ratio show higher ratio in HA group The sex, age and BMI is included to build a model for the relationship between Th17 related groups and Firmicutes to Bacteroidetes ratio. Regression estimates are plotted with thick lines for each variable indicate 50% Credible intervals or ± 1 SD of the correlation coefficient, thin lines for 95% CI or ± 2SD.

68

Figure3.10 The Genera Correlate with Serum IL-17

Correlation in LA Group Correlations in HA Group

Figure3.10 The Genera Correlate with Serum IL-17 Significant correlations between serum IL-17 and bacterial genus abundance are shown for Th17 HA individuals (red panels) and Th17 LA individuals (blue panel). Distributions show estimated effect sizes from Bayesian Markov chain Monte Carlo draws after parameter convergence. Panels show bacteria for which the estimated effect is significantly greater or less than 0 (95% credible interval does not include 0).

69

Discussion

This analysis of the iHMP prediabetes dataset provides an opportunity for detailed exploration of human- microbiome associations which is hard to achieve in the smaller, cross-sectional studies. The longitudinal design of iHMP allowed us to access the chronic properties of human Th17 related cytokines, and thus to distinguish individuals who have either high absolute value of cytokine or high variation of cytokine from individuals who have very low cytokine production across years.

Based on our analysis, we propose that human subjects with a Th17 LA profile and loss of Th17-cell- inducing bacteria in the gut display higher potential risk for metabolic disorders based on several clinical markers. This observation has been made in mouse models previously yet have not been detected in humans. Our observation that, longitudinal monitoring the individuals might be necessary to this discovery, may explain why it is not shown until today.

Our results strengthen support for the hypothesis that gut microbial dysbiosis previously described in T2D patients may arise due to malfunction of host Th17-related regulation of the gut microbiome.

Reciprocally, altered gut microbiome composition may further reduce signals that stimulate Th17 cell differentiation and maintenance. As both Th17 cells and their related cytokines are a critical component of a healthy intestinal epithelial barrier, 204,218 there might be causational effect beyond the observed correlations. For example, Aryl hydrocarbon Receptor (AhR) signaling is critical for maintenance of

Th17 cells, 226 and is reduced after mice are switched to a high fat diet. 200 The decrease of Th17 cell related cytokines may trigger the vicious circle of gut barrier malfunction and gut microbiome dysbiosis.

However, there was several limitations that we need to discuss. First, in the current dataset, we do not have the measurements on short chain fatty acid (SCFA) level in stool samples, thus could not address the

70 possible effects of SCFA on either Th17 cells or metabolic profiles. The hypothesis that SCFA may induce Th17 cell proliferation are currently investigated by other groups. 227

Second, we do notice the complication of diabetes among individuals. Although we can establish correlation between microbiome, Th17 related cytokines and measurements such as HDL, LDL or glucose, we must point out that diabetes is a complicated syndrome that beyond one measurement. For example, one of our participants (Randomized ID: ZVNCGHM) who was clinically diagnosed as

“Diabetes” based on his FPG level (138 mg/dL), was clinically categorized as “insulin sensitive” based on his SSPG (67.13 mg/dL). Meanwhile, four participants who were categorized as “healthy control” based on their A1C levels are diagnosed as “insulin resistant” based on FPG. The two examples demonstrated cases when participants have normal insulin sensitivity, and when participants have normal glucose sensitivity but impaired insulin sensitivity. This complication of diabetes limited our ability to correlate our result simply with different stages of diabetes progression.

In conclusion, this chapter demonstrated a correlational relationship between human gut microbiome and

T2D progression. It provides a new perspective in which certain bacteria in gut microbiome could affect human metabolic process through interacting with the host immune system, raising the possibility that therapeutic targets, related to Th17 cells, may effectively alleviate metabolic diseases in human subjects as they do in animal models.

71

Chapter 04

Conclusions and Future Direction

In summary, the studies in this thesis demonstrated the plasticity of microbiome under the influence of healthy aging and prediabetes. Also, it established a new link between human gut microbiome dysbiosis and the progression of metabolic diseases. Notably, the proposed association we found between microbiome and immune system effectively validated numbers of observations in mouse model and provided a new possible etiology of human T2D onset and progression.

In line with several previous studies, 19,158 we did not identify dysbiosis in healthy aging individuals with our current study design. In theory, the previous studies and ours might miss the taxonomy abundances differences which have smaller effect size. However, the consistent results between our project and other studies strengthened our confidence that we observed a biologically-valid phenotype: In healthy individuals, the microbiome identities are to some extent stable, and we speculate that even if there are nuance taxonomical differences between young and aged cohorts, they might not be dramatic enough to influence the host health status. On the other hand, if we think this observation in a survivorship bias prospective, it might indicate that individuals who have dramatic microbiome changes may unlikely be healthy. In fact, there are links between frailty in older adults and dramatic gut microbiome changes, 155,228 which indicates that dysbiosis may correlate with age-related disease phenotypes instead of chronological ages alone.

However, we cannot definitively reject the hypothesis that microbiome in old individuals might be in higher potential to switch to dysbiosis, especially when considering that old individuals often have chronic low- grade inflammation and immunosenescence. We did a small-scale study regarding this hypothesis in mice.

To test if aged mice are more likely to develop dysbiosis during inflammation, we proposed an experiment described in the Appendix. Briefly, we compared the microbiome plasticity in young (5–6 months old) and

72 old (19–21 months old) post influenza viral infection. We already know from a previous publication that

A/PR/8/34 (PR8) strain of influenza virus does not infect gut tissue229, thus we believe the microbiome plasticity we observed post-infection is a response to mucosal inflammation mediated by IFN 230 and Th17 cells. 229 In response to infection-induced inflammation, the gut microbiome of aged mice changed the same way as it is in young mice in day 4 and day 7. Therefore, we concluded that the microbiome plasticity in aged mice do not change in response to such flu-induced inflammation, and such changes in humans remained unclear.

Although we did not identify any dramatic age-related dysbiosis in our human aging microbiome study

(chapter 2) and the study described in mice, a study from McMaster University do show that transferring the gut microbiome from aged mice to young recipient mice may induce an increase of TNF-a and IL-6 in those recipient mice. 231 When carefully comparing their animal cohorts with ours, we found two major differences: 1) the mice at McMaster University (MC) and UConn Health Center (UCHC) do not bare the same gut microbiome. The microbiome in mice from MC shows high abundance of genus from Escherichia,

Alistipes and Klebsiella, which cannot be found in the most 10 abundant genera in microbiome from UCHC mice (Figure 4.1); 2) the mice at MC show an age-related increase in TNF-alpha and IL-6 in plasma and lung tissues at the baseline level (before fecal microbiome transplantation (FMT)), whereas the mice at

UCHC do not have such phenotypes. So, the correlation between age and increased pro-inflammatory cytokines may be specific to the MC cohort. In addition, the FMT experiment from a later study 232 also tried to claim that aged microbiome caused systemic inflammation. However, they observed that transferring the young microbiome to recipient mice induced an upregulation of TLR4 (an immuno-sensor for bacterial Lipopolysaccharides (LPS)) only when recipient mice are germ-free. As a result, current evidences are not convincing enough to establish causational relationship between microbiome dysbiosis and aging.

73

Figure 4.1 Comparison of the gut microbiome between Young and aged mice from UConn Health Center

Young old

Figure 4.1 Comparison of the gut microbiome between Young and aged mice at UCHC Microbiome relative abundance from young (5–6 months old) and old (19–21 months old) C57B6/J mice from UConn Health Center (UCHC), the ten most-abundant genera are annotated by color (lower part of the right plot).

74

Therefore, whether aged-microbiome causes inflammation also remains unclear, and more studies are required. One experiment that need to be done is to establish an association between certain bacteria genera

(or even species/strains) to an increase in chronic low-grade inflammation instead of chronological age, and to design the FMT experiment using the bacteria isolates that are known to correlate with the increasing of cytokines levels such as TNF-a and IL-6 (which are the biomarkers of chronic low-grade inflammation233,234). However, one of the big challenges in such studies is to meet the sample size and study power requirements to get accurate estimates on potential correlations between bacteria and cytokines.

Studies focus on “chronic” and “low grade” inflammation usually require a longitudinal sampling in order to identify cytokine differences in small effect size, which naturally lead to a very large sample size requirement.

Lucky, the iHMP provided such an opportunity to study the relationship of systemic chronic inflammation and the early progression of an inflammation-related disease, prediabetes. 224,235 Unlike chronic low-grade inflammation which is characterized by elevated serum level of IL-6 and TNF-a, we hypothesized that prediabetes associated inflammation might be linked to plasticity of Th17 cell-related cytokines. The detailed rationale of such hypothesis was discussed in Chapter 3.

One of the major conclusions we made from Chapter 3 was: the serum levels of IL-17 and IL-22 could be an indicator of prediabetes progression. There are contradictory results from previous studies about whether

IL-17 increases as a result of T2D. On one hand, IL-17 and IL-22 has pro-inflammatory properties, 236,237 an increase in IL-17 and IL-22 has been linked to many inflammation-related disorders; 236,238 On the other hand, IL-17 and IL-22 are required for gut epithelium function, thus have anti-inflammatory role, and a reduction of which could also lead to an inflammation caused by the “leaky gut”. 239,240 This controversy as well as the actual function of IL-17 and IL-22 has been previously reviewed. 189,241-243 We hence proposed a model which we think will explain our result, and potentially the results from previous studies.

75

This model is explicitly illustrated in Figure 4.2. We used four colors (green, red, grey and blue) to represent the stages of prediabetes progression reflected in this study. Green zone represents individuals in relatively low risk for T2D. The transit from green zone to red zone indicate the onset of inflammation, which correlates with an increased risk for T2D. In the cohort described in Chapter 3, participants in “HA” group are likely in this stage, whose serum contains high levels of IL-17 and IL-22. The previously reported

244 positive correlations between obesity and level of IL-17 may be observations of individuals in this stage.

If the inflammation persists, multiple immune pathways will be activated, which may influence members of the gut microbiome which are susceptible to inflammation. In line with this theory, we observed a reduced bacteria diversity after flu induced systemic inflammation in mice. 245 The fact that we can observe this decreased diversity at Day 4 post infection implied this is likely an innate inflammation, and therefore non-specific. This inflammation may be the cause of decreased microbiome ecological diversity. Data from studies in both our appendix and chapter 3 indicate that phylum Proteobacteria is more resistant to this inflammation induced decrease of diversity. The gut microbiome dysbiosis (characterized by decreased diversity of microbiome) may not only be a result of inflammation but also contribute to the inflammation in many ways. When the vicious circle of dysbiosis and inflammation developed into certain stage, it is possible that the adaptive immune system will be involved and response to all the bacteria genus that are detectable by antigen presenting cells. As a result, bacteria in a microbiome community that are detectable to the immune system will reduce, hence the interaction between the gut microbiome and the immune system will decrease.

76

Figure 4.2 Hypothesized model for plasticity of Th17 relate cytokines and microbiome during prediabetes progression

Lower Risk Chronic Insulin For T2D Inflammation Resistance

Clostridia Th17 Cell related related Cell Th17 Cytokine level in Serum level Cytokine

Progression of Prediabetic stage

• Increased risk for Type 2 Diabetes • Increased chance of Insulin resistance (increased SSPG) • Increased gut permeability (because the lack of Th17 cells related cytokines) • Increased chance of obesity (increased BMI, increased low-grade inflammation) • Increased Bacteroidia, decreased Firmicutes, Clostridia

Figure 4.2 Hypothesized model for Th17 relate cytokines and microbiome plasticity during prediabetes progression The x axis indicate the early onset and progression of diabetes, from left to right, the risk for diabetes increases. Y axis indicate the level of Th17 cells, when transit from green zone to red zone, the Th17 cell related cytokines increase as a result of inflammation. From red zone to blue zone, the Th17 cell related cytokines decrease as a result of diabetes progressing. The Purple line indicate the hypothesized increase and decrease of Clostridia population. We think the Clostridia plasticity is one of the cause of the variation observed in Th17 cell related cytokines based on previous publications.

77

In our analysis, we believe that individuals from the “LA” group are under such a condition. Consistent with our speculation, we found less microbiome genera are correlated with the Th17 related cytokines in the “LA” group, and individuals in the “LA” group maintains a very low level of microbiome diversity, which indicate a gut microbiome dysbiosis phenotype. As a result, some signals from the gut microbiome, which are required for the maintenance of metabolism homeostasis, such as TLR5 mediated signals94 and the aryl hydrocarbon receptor mediated signals, 200,246 are likely absent or reduced.

However, our analysis also has limitations. We will discuss two major ones. First, our result did not elucidate the origin of IL-17 and IL-22. Although the production of IL-17 and IL-22 has been generally linked to the activity of Th17 cells, we cannot exclude the possibility that other types of cell are also responsible for the observed IL-17 and IL-22 variations. In fact, several research groups demonstrated the

IL-17 and IL-22 production from gamma delta T cells (γδ T cells) 247 and Mucosal Associated Invariant T

(MAIT) 248 cells, and it is still unclear what proportion of IL-17 and IL-22 in serum is from these cell types.

Also, the data from the Th17 “HA” group may have only one Th17 cell related cytokines showed a high level, but two others showed low levels. Thus, we could only interpret our “IA” and “HA” group as having low and high probability that their Th17 cells are active, but we cannot directly equal the cytokine level to their Th17 cell activity. Second, we lacked the measurements to correlate the Th17 cell related cytokine to the progression of diabetes due to the complication of this disease. We could only conclude that the “HA” and “LA” group may show different level of SSPG or similar level of FPG and using those correlation to interpret the prediabetes progression. However, when taking other measurements into consideration, we found our study design is not powered enough to identify an absolutely “healthy-control” group in which individuals have no risk for T2D. This situation is illustrated in figure 4.3. It is worth noticing that the two subjects who have highest FPG in this cohort are insulin sensitive. Also, as we mentioned in Chapter 03, several individuals who have normal glucose level (measured by A1C<5.7%) are insulin resistant.

Therefore, instead of using discrete “sick” and “control” category to group individuals, we tend to think

78 that the cohort are all under risks for type 2 diabetes, and some participants (relatively healthier) have lower risk than the others. We believe that treating the diabetes risk as a continues variable is a better model of the disease progression, however the major limitation is that this risk cannot be mathematically quantified, therefore we cannot computationally correlate any measurement to the overall risk of diabetes.

79

Figure 4.3 Participants clinical category

140 Diabetic Insulin Resistant Based On FPG

Prediabetic Based on FPG 120 ) group dL 1 Inactive 2 Indeterminate 3 Active NANo group FPG (mg/ as.numeric(FPG) 100

80 Healthy Based on FPG 100 200 as.numeric(SSPG)SSPG (mg/dL) Figure 4.3 Participants clinical category The clinical category of each individual is color labeled as the background of this scattered plot. Each dot represent a participant from iHMP2 project. Grey dot means the indicated individual do not have more than five repeated measurements, thus they do not have an assigned group from GMM.

SSPG > 150 mg/dL are categorized as insulin resistant [PMID: 17957034] Among individuals whose SSPG < 150 mg/dL (which is considered insulin sensitive), FPG < 100 mg/dL is considered healthy, 100 < FPG < 150 mg/dL is considered prediabetic, FPG > 150 mg/dL is considered diabetic. http://www.diabetes.org/diabetes-basics/diagnosis/

80

Regarding the limitation we mentioned, there are several experiments that can be proposed at the end of this thesis. We are going to propose following experiments:

1) To identify the age-related gut microbiome changes

As mentioned in chapter 2, we propose to increase the sample size in order to detect smaller changes

in the gut microbiome of older adults. Also, we believe that the chronological age might not be the

most ideal measurements when searching for age-related gut microbiome compositional

differences. Therefore, we would like to add two additional points in testing the hypothesis that:

age-related microbiome changes are linked to the increased low-grade chronic inflammation. First,

we propose to include at least 20 individuals in each group to achieve a reasonable power estimation,

instead of 10 individuals in our previous study design. Also, in addition to young and old groups,

we propose to include a frail group. The additional frail group would allow us to compare the age-

related microbiome difference under healthy and unhealthy conditions, and test if the microbiome

dysbiosis is more prevalent in frail group. The result from this experiment will better test if the gut

microbiome is relatively stable during the healthy aging. Second, we propose to measure the

biological ages in addition to the chronological ages. It has been proposed that individuals could

have different biological ages compared to their chronological ages. There are ways to measure

biological ages such as measure the serum level of IL-6 and TNF-a. 249,250 The measurements

regarding biological ages will allow us to establish associations between the gut microbiome

plasticity with certain biological phenotypes such as low-grade chronic inflammation, therefore

increase the biological relevance and significance of searching for age-related microbiome

differences.

2) To validate the correlation between gut microbiome and Th17 cells

Fecal microbiome transplantation (FMT) experiments are widely used to establish causational

relationship between the gut microbiome and the biological phenotypes. We propose the

81

application of such experiments to validate our finding about the correlation between class

Clostridia and the activity of Th17 cells. Specifically, we could transplant the stool samples from

individuals belonging to “HA” and “LA” group to germ free mice and measure the Th17 cells

development and function in the recipient mice. If the results are the same as expected, we should

observe more Th17 cells and/or Th17 cell related cytokines in mice which received the stool sample

from “HA” group donor (high Clostridia) than “LA” group donor (low Clostridia). Meanwhile, we

propose to measure the gut epithelium function and insulin sensitivity of the recipient mice. If our

hypothesis is true, we should expect a better gut epithelium function and better insulin sensitivity

in mice which received stool samples from “HA” group individuals. These FMT experiments will

help us establish the causational relationship between the gut microbiome and the Th17 cells related

cytokines.

3) To validate the deterioration of Th17 cell functions under high glucose environment

To further understand the Th17 cells activity under different glucose environment, we propose to

culture PBMC/Th17 cells in different glucose concentrations. According to our result, there is a

trend of increased A1C in “LA” group, indicating that a high glucose level may correlate with a

decrease of IL-17 and IL-22 levels in serum. We therefore hypothesize that the chronic exposure

to high concentration of glucose would induce an “exhaustion” like phenotype in Th17 cells. T cell

exhaustion describes a deterioration of cell functions after T cells are exposed to chronic antigen

or inflammatory signals. 251 T cell exhaustion is normally seen in patients with persistent

cancer antigens or in chronic viral infections 251-253. After treat the Th17 cells in high glucose

condition, they show a proinflammatory phenotype by upregulate the IL-17 and IL-6 expression,

indicating that high glucose environment may be considered as an inflammatory signal under

certain circumstances. 254 Therefore, chronic exposure of the Th17 cells to the high glucose

concentration may drive the Th17 cells to a phenotype that are similar to the T cell exhaustion. To

validate this theory, we think culturing Th17 cells isolated from diabetes patients (and healthy

82

individuals) in different glucose conditions and measuring cytokine production as well as markers

of T cell exhaustion will contribute to our understanding of the role Th17 cells plays in the onset

and progression of T2D.

4) To rescue the differentiation and function of Th17 cell by blocking the glycosylation

According to a previous report, high fat diet induced metabolism is correlated with a decrease in

number of Th17 cells in mouse model. 199 Similarly, in iHMP2 cohort, individuals in the Th17 “LA”

group are those who have higher risks for T2D onset and progression. These results indicated that

high level glucose may be detrimental for Th17 cells. We therefore hypothesize that reducing the

glucose exposure of Th17 cells will enhance the differentiation and function of Th17 cells in T2D

patients. To validate this hypothesis, we propose an experiment to reduce the glucose availability

of Th17 cells and observe the functional changes post treatments. The Th17 cells and PBMCs of

T2D patients can be collected and cultured under low glucose or no glucose condition or with the

know inhibitory chemicals that block the metabolism of glucose, 255 and the IL-17 and IL-22

productions in cell cultures can be measured and compared. Taking this experiment and the

experiment proposed in the above paragraph, we will have a better understanding of how Th17

cells behaves under different glucose conditions.

83

Appendix I: Methods

Sample Collection

Fecal samples were collected using Fisherbrand™ Commode Specimen Collection System (Thermo Fisher

Scientific, Waltham, MA, USA). Saliva samples were collected by asking each participant to let saliva collect in the mouth for at least 1 minute. The participant was then asked to drool into the labeled 50 mL collection tube (Falcon, sterile conical polypropylene tube with flat-top screw cap). This process was repeated multiple times in order to collect 2-5 mL of saliva. The specimens were kept at cold until stored at -80 C or for DNA extraction.

DNA Sequencing

Fresh samples were stored at -80 ℃ immediately after collection. Total DNA was extracted from fecal/saliva samples using the Power Soil DNA Extraction kit (Mo Bio Laboratories, Carlsbad, CA, USA) according to manufacturer’s protocol. Bacterial 16S rRNA gene DNA was amplified using the 27F/534R primer set (27F 5′-AGAGTTTGATCCTGGCTCAG-3′, 534R 5′-ATTACCGCGGCTGCTGG-3′). PCR reactions were performed using phusion high-fidelity PCR Master mix (Invitrogen, Carlsbad, CA, USA) with the following condition: 95 °C for 2 min (1 cycle), 95 °C for 20 s /56 °C for 30s /72 °C for 1 min (30 cycles). PCR products were barcoded and purified using Agencourt AMPure XP beads (Beckman coulter,

Brea, CA, USA) according to manufacturer’s protocol. Libraries were prepared with Illumina’s protocol for MiSeq platform. DNA sequencing was conducted on an Illumina MiSeq system. The sample processing was complied with Human Microbiome Project standard protocol (#07-001. V12.0).

84

Sequencing Data Process for Chapter 2

Raw reads were filtered according to sequence length and quality. Filter-pass reads were assembled using the Flash assembly software, for which the minimum overlap requirement is 30 bps, and the maximum mismatch ratio is 10%. 256 After assembly, chimeric sequences were removed using the USEARCH software based on the UCHIME algorithm. 257 After the barcode was removed from each sequence,

Operational Taxonomic Units (OTUs) was calculated using a de novo OTU picking protocol with a 95% similarity threshold. Taxonomy assignment of OTUs was performed by comparing sequences to The

Ribosomal Database Project (RDP) 258 at https://rdp.cme.msu.edu/ with a similarity cutoff of 0.8. 259 There were 599,211 assembled reads generated from the 46 samples. The mean read depth is 13,026 (ranged from

3700 to 29,332) and the standard derivation of read depth is 6099. Multivariate trees were plotted using

Tree of Life v1.0. 260 To normalize the sequence depth of each sample, 3700 reads were randomly picked from each sample for alpha diversity analysis, 51 OTUs were hence removed after random rarefication for diversity measurement.

Sequencing Data Curation for Chapter 3

The iHMP project consists of 105 human subject who has been followed over four years period. In short, the participants were recruited under the Stanford University following Institutional Review Board

Protocol #23062. Stool sample and blood sample were collected from host after a fast overnight [ref main paper]. Stool samples were processed according to Human Microbiome Project standard protocol (#07-

001. V12.0) as described above. Cytokine data was generated by 63-plex Luminex antibody conjugated bead capture assay (Affymetrix, Santa Clara, California). Raw cytokine data were normalized to eliminate the variation introduced by using multiple plates for the assay.

85

After we obtain microbiome 16S rRNA sequence reads count data as an OTU table and normalized cytokine data in Median Fluorescence Intensity (MFI), we selected time points when both microbiome data and cytokine data were collected and time-matched. There were 92 subjects whose measurements were selected for further analysis, which included 753 sample points. Participants age ranged from 25 to

75 years old, Body Mass Index (BMI) ranged from 19.10 kg/m2 to 40.83 kg/m2, steady-state plasma glucose (SSPG) ranged from 40 to 276 mg/dL. According to the manufacturer’s protocol, CHEX1 to

CHEX4 are different types of background control for Luminex MFI data. Any data points showed a

CHEX1~4 background noise that was 5 standard derivations or more away from mean were removed.

Among 92 subjects, 35 subjects had less than 5 time points, those subjects were hence removed for building the general mixture model.

Statistical Analysis

The R packages “phyloseq” and “vegan” were used for alpha diversity and beta dissimilarity analysis.

261,262 The two-sided student’s t-test was used for significance testing of the normally distributed variables, otherwise the Wilcoxon signed-rank test or Mann-Whitney U test was used, depend on whether two samples are from the same population or independent. Chi-Square test was used to determine if the proportion of insulin resistance is different between Th-17 High-Activity (HA) and Th17 Low-Activity

(LA) groups. Linear discriminant analysis based on effect size (LEfSe) 86 was performed with the web tool at http://galaxyproject.org/. The statistical tests were performed using R (version 3.5.0). Plots were generated with either R program “ggplot2” and “cowplot” package263-265 or Graph-Pad Prism version 6.00 for Mac. (GraphPad Software, La Jolla, California, USA)

Data Modelling: Mixture model of Th17 group

Subjects sampled more than five times were included in the General Mixture Model (GMM) as described above. The longitudinal IL-17A, IL-17F and IL-22 MFI data were summarized as mean value and SD for

86 each individual, and then scaled in R. To determine the optimal number of Gaussian distributed clusters, models with 1~9 clusters were evaluated using the Bayesian Information Criterion (BIC), resulting in three clusters selected for further analyses. Cluster #1 was comprised of 25 subjects (Th17 LA group), 32 to cluster #2 (Th17 Indeterminate-Activity (IA) group) and 11 to cluster #3 (Th17 HA group), the mixing probability of each cluster was 0.3634941, 0.4749886, and 0.1615172, respectively. Participants assigned to each cluster were associated with a “confidence of assignment” probability (0%~100%); those samples with less than 99% confidence to be assigned to any group (6 individuals in total) were removed from the subsequent analyses.

Data Modelling: Linear Mixed Model

R package “lme4” 266 was used to build and execute the Linear Mixed Model. The model was built separately to test different hypothesis as described below.

To test if metabolic profiles were different among three groups, the equation we used was:

Target of Interest ~ Group + Days + Gender + BMI + Adj.age + (1|Subject_ID)

Fixed effects included Group as the categorical variable derived from the GMM model, Gender was a binary categorical variable, Body Mass Index (BMI) and adjusted age (Adj.age) were continuous variables. Days was a numerical measurement of how many days after the overall study start date the sample was collected. Adjusted age described the average age of an individual during the study period.

Random effects included a random intercept for each subject (1|Subject_ID).

For RDW, to adjust the previous known effect of Mean Corpuscular Hemoglobin (MCH) and Mean corpuscular volume (MCV) on the readout of RDW222, the model was built as

87

RDW ~ Group + Days + MCH + MCV + Gender + BMI + Adj.age + (1|Subject_ID)

For analysis related to microbiome alpha diversity and Firmicutes/Bacteroidetes Ratio, we simply would like to understand the fixed effect in this mixed model, so the model is built as

Microbiome Diversity (or F/B ratio) ~ Group+ Gender + BMI + Adj.age + (1|Subject_ID)

Bayesian Mixed-Effects Model for Taxa and Cytokine interactions conditional on cluster assignment

A Bayesian negative binomial longitudinal mixed effects model was used to evaluate the relationship between individual microbes and cytokines. Each microbe was modeled as the response variable with a random intercept for each subject, and with fixed effects for time, and an interaction term for the cluster identity (defined by the GMM, described above) and the cytokine of interest, thereby evaluating whether the relationship between a microbe and cytokine pair differed depending on the identity of the cluster.

This followed standard matrix notation:

Mi ~ Xiβ+Zibi+ εi

Where Mi is a vector of microbe relative abundances for each subject i, Xi is the matrix of fixed effects, Zi is the random effects vector of 1’s denoting a random intercept, bi is a scalar for each subject, and εi is a zero-centered error term. The fixed-effects matrix Xi is comprised of the days post study start Di, and an interaction term for cluster (Ci) and cytokine (Yi).

88

Sampling was performed with four chains with 5000 iterations per sample, and a burn-in of 1000 iterations. Samples were drawn using a No-U-Turn Sampler implemented in the “brms” package267-269.

Chain convergence was confirmed by visual inspection of iteration plots and posterior predictive distributions.

89

Appendix II: Impact of Age, Caloric Restriction, and Influenza Infection on

Mouse Gut Microbiome: An Exploratory Study of the Role of Age-Related

Microbiome Changes on Influenza Responses

The following is a duplicate version of the article published in Frontiers in Immunology and reprinted with permission under https://creativecommons.org/licenses/by/4.0/

Jenna M. Bartley*, Xin Zhou*, George A. Kuchel, George M. Weinstock, Laura Haynes. Impact of Age,

Caloric Restriction, and Influenza Infection on Mouse Gut Microbiome: An Exploratory Study of the Role of Age-Related Microbiome Changes on Influenza Response. Front Immunol. 2017; 8: 1164

Abstract

Immunosenescence refers to age-related declines in the capacity to respond to infections such as influenza

(flu). Caloric restriction represents a known strategy to slow many aging processes, including those involving the immune system. More recently, some changes in the microbiome have been described with aging, while the gut microbiome appears to influence responses to flu vaccination and infection. With these considerations in mind, we used a well-established mouse model of flu infection to explore the impact of flu infection, aging, and caloric restriction on the gut microbiome. Young, middle-aged, and aged caloric restricted (CR) and ad lib fed (AL) mice were examined after a sublethal flu infection. All mice lost 10–

20% body weight and, as expected for these early time points, losses were similar at different ages and between diet groups. Cytokine and chemokine levels were also similar with the notable exception of IL-1α, which rose more than fivefold in aged AL mouse serum, while it remained unchanged in aged CR serum.

90

Fecal microbiome phyla abundance profiles were similar in young, middle-aged, and aged AL mice at baseline and at 4 days post flu infection, while increases in Proteobacteria were evident at 7 days post flu infection in all three age groups. CR mice, compared to AL mice in each age group, had increased abundance of Proteobacteria and Verrucomicrobia at all time points. Interestingly, principal coordinate analysis determined that diet exerts a greater effect on the microbiome than age or flu infection. Percentage body weight loss correlated with the relative abundance of Proteobacteria regardless of age, suggesting flu pathogenicity is related to Proteobacteria abundance. Further, several microbial Operational Taxonomic

Units from the Bacteroidetes phyla correlated with serum chemokine/cytokines regardless of both diet and age suggesting an interplay between flu-induced systemic inflammation and gut microbiota. These exploratory studies highlight the impact of caloric restriction on fecal microbiome in both young and aged animals, as well as the many complex relationships between flu responses and gut microbiota. Thus, these preliminary studies provide the necessary groundwork to examine how gut microbiota alterations may be leveraged to influence declining immune responses with aging.

Keywords: aging, influenza, gut microbiome, caloric restriction, cytokines

Introduction

Aging is a complex process that has dramatic impacts on most systems in the body. 270 This is especially true of the immune system where significant age-related changes are observed in both innate and adaptive immune responses. One of the most prominent manifestations of aging is an increase in susceptibility to infections. In fact, influenza infection is one of the top killers of elderly people in the world, with the oldest being most at risk. 271,272 In aging mouse models, the clearance of influenza virus is slower and T cell responses are reduced and delayed when compared to young mice, which closely mirrors what happens during influenza infection of elderly humans. 273-276 In addition, our recent study demonstrated that there are lingering inflammatory cytokines such as IL-6, IL-1α, and G-CSF in the bronchiolar lavage fluid (BAL)

91 of aged mice during influenza infection. We have also shown that there is an increasing level of albumin in the BAL of aged mice, which is indicative of lung damage during the later stages of infection. 277 These results indicate that the inability to efficiently clear virus in aged lungs corresponds to extended inflammation and lung damage.

One of the most consistent ways to delay aging in mice is via caloric restriction (CR). Indeed, CR was first shown to extend the lifespan of rodents in the 1930s 278 and has since been observed across multiple species, ranging from invertebrates to rodents to even some non-human primate studies; 279-282 however, it is important to note that this is not observed in all studies 283 284 285 286 287 288 suggesting some mechanisms of longevity with CR may not be conserved among species and that details of implementation likely effect outcomes. Despite these discrepancies, and more importantly, along with extending lifespan, CR has been shown to delay age-related deficits in multiple physiological systems in mice, 289-294 seemingly targeting the process of aging itself. From a translational perspective, human CR studies assessing longevity are nearly impossible, and short-term studies evaluating other health span measures are difficult to control and criticized for practicality (please refer to Ref. 295 for a recent review of human CR trials). Regardless of these limitations, results from the both population studies, 296 as well as the Comprehensive Assessment of

Long-term Effects of Reducing Intake of Energy 297,298,299 and Caloric Restriction with Optimal Nutrition

300,301 studies show benefits in some areas, albeit not in all aspects of CR in rodent trials, suggesting CR does hold value within human aging research. Moreover, the benefits to multiple different systems evident in rodent studies makes it an attractive avenue to pursue. It is important to note that CR in these cases is not malnutrition (although feed is generally reduced by 40% caloric content, it is fortified with micronutrients to prevent deficiencies), and it is normally introduced after mice have reached maturation. There are many hypotheses for the mechanism of action of CR, including modulation of glucose–insulin homeostasis, growth hormone axis, autophagy changes, and alteration of inflammatory pathways; however, it is likely many of these changes may act in concert to delay aging. Nonetheless, CR has consistently shown improvement in multiple facets of aging, including delaying immunosenescence. 302,303,304 ,305,306 Most

92 notably, CR attenuates the shift from naive to memory phenotype observed in T cells with aging 307 and maintains the proliferative capacity of T cells. 308,309 Despite these positive effects on the aged immune system, the effects of CR on immune responses to influenza with aging is not clear. Original studies by

Effros et al. 310 demonstrated that CR could enhance the immune response in aged mice ameliorating age- related declines in T cell proliferation and antibody production in response to intraperitoneal immunization to influenza. Conversely, work from Gardner and colleagues 311-314 demonstrated increased severity of infection in aged CR mice with increased viral titers and mortality in response to intranasal influenza infection attributed to impaired NK cell responses 312,313 and/or reduced energy reserves and lethal weight loss. 315 However, in these studies, high doses of influenza were utilized with mortality even observed in young mice; thus, it is unclear how CR modulates the immune response to more sublethal doses of influenza.

More recently, the importance of the microbiome in regulating immune responses has been elucidated. The gut microbiota has regulatory effects on not only intestinal immunity but also systemic immune responses

316 and systemic T cell subset populations can be skewed by different microbiota predominance. 91,317-319

Further, pulmonary immunity is directly affected by the gut microbiota with regards to both allergic airway

320 and infectious disease 321,322 responses. Importantly, the gut microbiota plays a major role in mediating flu infection-related immune responses and is particularly crucial for respiratory tract DCs migration, T cell priming, cytokine secretion, and overall viral clearance. 321 Dysbiosis of the gut microbiota induced by administration during flu infection influences helper T cell responses and can negatively impact flu outcomes and recovery. 323 In addition, influenza infection itself induces gut microbiota dysbiosis through type I interferons (IFN-I) favoring Proteobacteria overgrowth. 230 Thus, the relationship between gut microbiota and influenza infection seems complex and integrated.

Furthering this line of thought, the gut microbiome also changes with age [recently reviewed in Ref. 165,324].

More specifically, there seems to be a general decrease in microbiota diversity with aging, 160,325 as well as an increase in Proteobacteria abundance and lower levels of Firmicutes. 143,160,325-327 Also, a shift toward a

93 more Bacteroidetes dominated microbiome was associated with frailty. 143 It is unknown how these changes to gut microbiota with aging may influence immune responses. CR also impacts the microbiome with greater levels of Lactobacillus and other potential probiotics associated with longevity. 328 But importantly, the influence of gut bacteria microbiota changes on age-related pathologies has yet to be determined. It is possible that gut microbiota changes with CR may be a potential mechanism of longevity and/or related to some of the “antiaging” effects evident in many murine studies.

It is known that different components of nutrition can affect the aged immune system 329 and that prebiotic/ supplementation may decrease the severity of infection in the elderly. 330-333 Moreover, small-scale studies have indicated that specific probiotics and prebiotics may improve flu vaccine response in hospitalized elderly 334-336 suggesting that age-related alterations in gut microbiota may precipitate reduced flu responses in the elderly, and that this dysbiosis can be treated to improve immune responses.

Indeed, depletion of specific gut microbiota through antibiotic treatment can negatively affect both DC 321 and T cell 321,323 influenza immune responses, while the gut microbiota itself is affected by influenza infection through type I interferons (IFN-I), 230 thus the relationship between gut microbiota and influenza infection is bidirectional and warrants further investigation.

Here, in this exploratory study, we sought to examine how CR, a known modulator of aging and gut microbiota, can influence influenza induced gut microbiota changes and immune responses during acute influenza infection in young, middle, and aged C57BL/6 mice. We hypothesized that CR would protect aged mice from age-related gut microbiota changes and thus mitigate influenza induced changes to gut microbiota and improve immune responses.

94

Materials and Methods

Mice

Young (5–6 months old), middle (9–10 months old), and aged (19–21 months old) caloric restricted (CR) and ad libitum (AL) C57BL/6 male mice from the National Institute on Aging caloric restricted rodent colony were obtained at least 6 weeks prior to experimentation to allow appropriate acclimation to our facility. CR mice were fed the NIH31 fortified diet with CR was initiated at 14 weeks of age at 10% restriction, increased to 25% restriction at 15 weeks, and to 40% restriction at 16 weeks where it was maintained throughout the life of the animal. AL mice were fed the NIH31 diet for their entire life. All mice were singly housed in a climate controlled environment with 12:12 light: dark cycle and water was provided ad libitum. For all analyses, 2–3 mice per group were analyzed. Due to the closing of the NIA caloric restricted rodent colony, we were unable to obtain more mice for experiments and thus the results presented are preliminary insights into the ability of CR to modulate gut microbiota and influenza responses with aging. All procedures were approved by the University of Connecticut School of Medicine Institutional

Animal Care and Use Committee and carried out in accordance with these regulations. All mice underwent gross pathological examination at time of sacrifice and animals with obvious pathology were excluded from the study.

Viral Infection and Analysis

To infect with Influenza virus A/PR/8/34 (PR8), 400 EID50 were given intranasally in 40 µl to isoflurane anesthetized animals. Mice were sacrificed 7 days post infection. Lungs were harvested and the viral burden in mRNA from digested whole lung tissue was determined by real-time PCR measuring influenza polymerase acidic protein gene (PA) copy number. 256,337 BAL was collected by flushing lungs with 1 ml saline. BAL and serum were assayed for cytokine and chemokine content using the Luminex Mouse

Cytokine/Chemokine 32-plex panel (EMD Millipore, Billerica, MA, USA). Stool Collection and

95

Microbiome Analysis Fecal samples were collected between 6:00 a.m. and 7:00 a.m. in the morning each day beginning 3 days prior to infection (day −3) and stored at −80°C immediately after collection for microbiome analysis. A total number of 187 samples were collected. Total DNA was extracted from fecal samples by using Power Soil DNA Extraction kit (Mo Bio Laboratories, Carlsbad, CA, USA) per manufacturer’s protocol. Bacterial 16S ribosomal RNA gene was amplified by using the 27F/534R primer set (27F 5′-AGAGTTTGATCCTGGCTCAG-3′, 534R 5′-ATTACCGCGGCTGCTGG-3′). PCR reactions were performed using phusion high-fidelity PCR Mastermix (Invitrogen, Carlsbad, CA, USA) with the following condition: 95°C for 2 min (1 cycle), 95°C for 20 s/56°C for 30 s/72°C for 1 min (30 cycles). PCR products were purified using Agencourt AMPure XP beads (Beckman Coulter, Brea, CA, USA) according to manufacturer’s protocol. Library was prepared with Illumina’s instruction specifically for Miseq platform. 17 samples failed the DNA extraction and sequencing library preparation. Full microbiome data are available at https://www.ncbi.nlm.nih.gov/bioproject/PRJNA393321.

Statistical Analysis

Weight loss and cytokine/chemokine parameters were analyzed via two-way ANOVA with Bonferroni post hoc corrections (GraphPad Prism, GraphPad Software, Inc., La Jolla, CA, USA). Raw sequencing reads were assembled using FLASH. 256 Chimeric sequences were removed using USEARCH. 257 Operational taxonomic units (OTUs) were generated at ≥97% sequence similarity. Taxonomic assignment of OTUs was performed by comparing sequences to the RDP database (Confidence threshold = 50%). 60 The R package

“phyloseq” 261 was used for alpha diversity and beta dissimilarity analysis. Relationship between microbiota phyla and influenza-induced weight loss and serum/BAL cytokine/chemokine concentration were analyzed via Spearman’s correlation using GraphPad software for Mac 6.0 (GraphPad Prism, La Jolla, CA, USA) and R package “corrplot”.338

96

RESULTS

The goal of this study was to explore the interaction of aging, diet, and influenza infection with the gut microbiome. We aimed to gain preliminary insight into how aging may impact influenza induced microbiome changes, as well as how age-related microbiome changes may impact influenza responses. The first part of the study examined how age and CR impact the response to flu, while the second part examined the effect on the composition of gut microbiome. For these studies, we employed a sublethal infection dose and examined immune parameters at 7 days post infection. At this time point, influenza-induced weight loss becomes more evident, however, is not different between ages 339 or CR and AL groups (unpublished experiments from the Haynes lab). Thus, percent weight loss should not be a confounding factor between groups and should not put CR mice at greater risk to succumb to infection. 314,315 While this limits the observed differences between groups due to infection, it allows us to assess early time points where the gut microbiome may play a role. Additionally, since it is known that weight loss itself affects gut microbiota,

340-342 this control was necessary to gain preliminary insight into influenza-induced alterations.

Effect of Age and CR on the Response to Influenza Infection

Weight loss was monitored throughout the experiment. Figure 1A shows that there were no significant differences in percent weight loss in any of the groups regardless of age or diet. In addition, Figure 1B shows that on day 7 post infection, there was no significant difference in the amount of flu virus in the lungs in each group. These correlate well with our previous studies showing that the main age-related differences in response to sublethal influenza infection are not seen during the peak of the infection during the first week, but become apparent during the resolution phase in week two following flu infection. 342

The BAL fluid from each mouse was analyzed for cytokine and chemokine contents. As shown in Figure

1C, there were no significant differences in cytokines important for a protective immune response to

97 influenza infection including GCSF, IFNγ, IL-1α, IL-6, and TNF. In addition, there were no significant differences in chemokines that recruit protective immune cells to the lungs including CCL3, CCL4, CCL5,

CXCL1, and CXCL10. A similar pattern was also observed when the serum from these groups was analyzed

(Figure 1D) except for a significant increase in IL-1α in the aged AL group that was not seen in aged CR mice. Thus, both locally in the BAL and systemically in the serum, there are few measurable differences between the response to infection in young, middle aged, and aged groups and between AL and CR groups at this relatively early time point following flu infection.

98

Appendix Figure 1

Response to influenza infection. Young, middle-aged, and aged C57BL6 mice on an ad libitum (AL) or caloric-restricted (CR) diet were anesthetized and infected with 400 EID50 of influenza virus. (A) Each day, mice were weighed and the percent of the starting weight is shown. On day 7 postinfection, mice were sacrificed. (B) mRNA was isolated from the lungs and influenza virus was quantitated by real-time PCR measuring influenza polymerase acidic protein gene (PA) copy number. (C) Bronchiolar lavage fluid and (D) serum was collected and subjected to multiplex analysis to determine levels of cytokines and chemokines. Data were analyzed via two-way ANOVA with Bonferroni post hoc corrections; *p < 0.05.

99

Effect of Age and CR on the Gut Microbiome during Influenza Infection

It is known that gut microbiome is affected by influenza viral infection. 230 To fully understand the dynamics of this process and its implication, stool samples were collected prior to and during influenza infection.

Analysis was done by reconstruction of the gut bacterial microbiome by amplicon sequencing to access the composition of the microbiota population. Figure 2A shows the relative abundance of nine bacterial phyla in each experimental group. While there are no major age-related differences apparent in phylum level, there are differences due to influenza infection and diet. In each age category, the distribution of phyla is changed by day 7 post infection in both AL and CR groups, characterized by an increase in Proteobacteria and Verrucomicrobia. Furthermore, Proteobacteria and Verrucomicrobia are more abundant in the CR groups when compared to AL in each age category. Figure 2B shows the Bray Curtis dissimilarity of samples indicating they segregate by diet but not by infection in each age group, this implies that diet had a greater impact on the composition of gut microbiome when compared to the impact of influenza infection.

Similar to a previous report, 230 we observed increased phylum Proteobacteria in all groups at day 7 post influenza infection. We determined that regardless of age, the relative abundance of phylum Proteobacteria was positively correlated with percent body weight loss at this time in AL and CR (r = 0.8095, p = 0.0218 and r = 0.8333, p = 0.0154; respectively, Figures 3A, B) indicating a relationship between severity of infection and Proteobacteria abundance. To further examine the relationship between flu pathogenicity and gut microbiota abundances, we examined this relationship among all groups and major OTUs. This approach increases the overall sample size (n = 17) to increase the power of our correlation analysis and provide preliminary insight into potential key bacteria associated with flu responses regardless of age and diet. Ten specific OTUs correlated with percent weight loss, and interestingly, Alistipes OTU_34 and

Parabacteroides OTU_9, both members of the Bacteroidetes phyla showed the strongest relationship (r =

0.8235, p < 0.0001 and r = 0.7672, p = 0.0005; respectively, Figures 3C,D), while another Bacteroidetes,

Hallella OUT_11 was the next strongest relationship, however, was negatively correlated with percent weight loss (r = −0.5907, p = 0.0216, Figure 3E); highlighting differential relationships within phyla.

100

Appendix Figure 2

(A) Relative abundance of nine phyla at 0, 4, and 7 day postinfection (dpi) in young, middle, and aged ad libitum (AL) or caloric-restricted (CR) mice. For middle-aged CR group, we do not have data at 7 dpi, so 6 dpi is shown here instead. Relative abundance of bacteria is shown as a fraction. (B) PCoA plot of Bray Curtis dissimilarity prior to (naïve, −3–0 dpi) and after (postinfection, 4–7 dpi) flu infection.

101

Next, we examined the relationship between serum cytokine/chemokines and major OTUs observed among all groups to determine how influenza-induced inflammation relates to bacterial abundances (Figure 4). We observed 22 significant correlations between relative bacterial abundance and inflammatory mediators in the serum. Of note, CXCL1 was positively correlated with unclassified Lachnospiraceae OTU_12,

Alistipes OTU_26, and Bacteroides OTU_29, CCL2 was positively correlated with Parabacteroides

OTU_9, Alistipes OTU_34, and Unclassified Clostridiales OTU_25, and TNFα was positively correlated with unclassified alpha-Proteobacteria OTU_2, Butyrivibrio OTU_28, Unclassified Clostridiales OTU_30, and Lachnospiraceae incertae sedis OTU_37. Conversely, CXCL10 was negatively correlated with

Unclassified Porphyromonadaceae OTU_1 and Parasutterella OTU_18, and CCL5 is negatively correlated with Barnesiella OTU_3, Butyrivibrio OTU_28, Unclassified Porphyromonadaceae OTU_27,

Allobaculum OTU_35, and Anaerostipes OTU_40. Among those correlations we observed, the three strongest positive correlations were CXCL1 with Bacteroides OTU_29, CCL2 with Parabacteroides

OTU_9, and CCL2 with Alistipes OTU_34. The three strongest negative correlation were CCL3 with

Prevotella OTU_8, CCL5 with Unclassified Porphyromonadaceae OTU_27, and CCL5 with Butyrivibrio

OTU_19. Thus, systemic immune responses were related to gut microbiota alterations. Interestingly, aside from Butyrivibrio, the strongest correlations, both positive and negative, were Bacteroidetes. Though it is not possible to conclude any causal relations between these bacteria taxa and the correlated host response, these findings provide insight into future research in manipulating bacterial gut microbiome to facilitate antiviral immune responses.

102

Appendix Figure 3

The percent weight loss for AL (A) and caloric restricted (B) at 7 days post infection (dpi) is plotted against the relative abundance of phylum Proteobacteria (after log2 transform). Correlation of relative abundance of Alistipes OTU_34 (C), Parabacteroides OTU_9 (D), and Hallella OTU_11 (E) with percent weight loss among all groups is shown. The bacteria abundance is the average of 5, 6, and 7 dpi. Spearman’s correlation with p < 0.05.

103

Appendix Figure 4

Correlation of operational taxonomic units (OTUs) relative abundance with serum cytokines at 7 day post infection (dpi). OTUs relative abundance is the average of 5, 6, and 7 dpi. Color intensity and size are proportional to the spearman correlation coefficient (p < 0.05). Positive correlations are displayed as blue and negative correlations are displayed in red. UC: unclassified according to RDP database (confidence threshold = 50%).

104

DISCUSSION

In this exploratory study, we sought to obtain preliminary information as to how CR, a known modulator of aging and gut microbiota, could influence influenza-induced gut microbiota changes and immune responses during acute influenza infection in young, middle aged, and aged mice. Different outcomes of influenza infection, due to the influence of aging and CR, could be mediated by modulation of the gut microbiota by these factors with subsequent effects of infection due to the different microbial communities.

We observed that CR affected the gut microbial communities (Figure 2), as found in previous studies, and that age also has an effect (the different patterns in Figure 2B). However, no obvious effect of infection was observed (Figure 2).

Unlike previous studies, 310,314 we did not observe significant differences in weight loss or viral burden following influenza infection in CR aged animals. However, our study differed in that we utilized a sublethal dose of influenza and sacrificed mice at a relatively early point of infection, and both of these could contribute to a lessening of the effect of infection on the microbiome.

Although there was not a large effect on microbial community structure following infection, we did observe an increase in the proportion of Proteobacteria, which was more significant in CR mice but independent of age (Figure 2A). The increase in the proportion of Proteobacteria was correlated with weight loss (Figure

3), taken as a marker of infection severity. This again was independent of age. Further, multiple OTUs correlated with weight loss from the Bacteroidetes phylum. This connection raises the possibility that a change in the microbiome has a connection with infection outcome, as hypothesized above; however, the relationship does not seem to be straight forward and members of the same phyla have differential relationships. Future research can examine if elimination or transfer of these specific bacteria can impact flu responses. Indeed, others have shown antibiotic treatment is detrimental to flu immune responses, specifically oral neomycin eliminated Gram-positive bacteria and impaired immune responses. 321 Here, we

105 identify Gram-negative bacteria that may also be crucial for immune responses. The PR8 strain of influenza virus used for these studies will not directly infect gut tissue of B6 mice, 229 raising the question of how a respiratory infection can affect the gut microbiome. The mucosal surfaces in lung and gut are considered a common mucosal surface that share immunological signals, 343 so inflammation from one site is likely to affect the other site. Since Proteobacteria are generally observed during gut inflammation, 344 the more severe infection in the lung, the greater the effect on the gut, and this would result in the observed correlation between Proteobacteria (a measure of inflammation in the gut) and weight loss (a measure of infection).

Finally, we note that overgrowth of Proteobacteria can be impaired by blocking the type I interferon (IFN-

I) signal, 160 suggesting this gut microbiota change correlated with lung inflammation is mediated by an

IFN-I-related immune response.

The gut is believed to be the largest immunological compartment in the body 345 and thus signaling from the gut microbiome may play an important role in viral infections. For example, mouse mammary tumor virus 346 and Enteric virus 347 require intestinal bacterial flora to establish effective infection. Lymphocytic choriomeningitis virus and influenza virus, 348 conversely, will trigger a more effective immune response if intestinal bacterial flora is present. Also, germ-free mice, 349 mice treated with an antibiotic cocktail, 321 or

TLR5 KO mice (with impaired function in sensing bacterial flagellin) 95 will not generate adequate immune response to influenza viral infection. Thus, the microbiome helps protect during flu infection. Although here we show a correlation between Proteobacteria abundance and infection, each member of the microbiota may signal the immune system in a different manner. 350 For example, a previous report 321 showed that a host with antibiotic treatment to largely deplete Lachnospiraceae would not generate a good antibody response against trivalent inactivated influenza vaccine. Similarly, we noticed that higher relative abundance of genus Lachnospiraceae OTU_12 and OTU_32 are correlated with higher amount of CXCL1 and IL-1α, respectively. More generally, we show here that cytokine production associated with flu infection in our study correlates differently for each of 22 OTUs (Figure 4). This suggests that members of the microbiome regulate the immune response in different ways.

106

These preliminary findings contribute to the understanding of dynamics and complexity of gut bacterial microbiota and influenza infection. We are the first to show the early changes (Day 4 post influenza infection) of gut bacterial microbiota composition in both young, middle, and aged mice on both AL or CR diet. Interestingly, though CR had a great impact on gut microbiota, it did not seem to affect flu-induced immune responses or flu-induced alterations in the gut microbiota at these early time points. It is possible that these alterations, such as increased proteobacteria compared to AL mice, may have effects at later time points in flu responses and recovery. More research is necessary to determine if CR modulation of gut microbiota with aging is beneficial to flu immune responses. Cytokine production associated with influenza infection in our study correlates differently with each OTUs. Although the data we present do not allow causal relations between bacteria and cytokine production to be determined, they do provide hypotheses for virus-bacterial interactions through the immune system. There is evidence that the genus Lactobacillus can improve the immune response to the PR8 strain of influenza 351,352 and respiratory syncytial virus 353 infection. Here, we find multiple members of the Bacteroidetes phyla to be correlated with immune parameters and flu pathogenicity. Thus, while influenza infection promotes Proteobacteria overgrowth through IFN-I, 230 members of the Bacteroidetes phyla are also affected and likely in turn affect immune parameters, both positively and negatively. Interestingly, a Bacteroidetes dominated microbiome was associated with increased frailty among the elderly, 143 perhaps suggesting a microbiota link associated with the increased disability observed following influenza infection in the elderly. 354 Future research should explore manipulation of bacterial species from these phyla to modulate flu immune responses. Thus, we believe this exploratory study can provide some additional guidance to the use of microbiota to facilitate virus-specific immune responses, especially for elderly whose immune responses are known to be deficient.

107

ETHICS STATEMENT

This study was carried out in accordance with all federal, state, and institutional laws, policies, and guidelines. The protocol was approved by the UConn Health Institutional Animal Care and Use Committee.

AUTHOR CONTRIBUTIONS

JB and LH designed the study; JB carried out the influenza infection and analysis of responses; XZ and

GW performed the microbiome analysis; JB, LH, GK, XZ, and GW participated in preparation of the figures and manuscript.

FUNDING

This study was funded by a National Institutes of Health, National Institute on Aging grant P01 AG021600 to LH.

108

Reference

1. Luckey, T.D. Introduction to intestinal microecology. Am J Clin Nutr 25, 1292-1294 (1972). 2. Sender, R., Fuchs, S. & Milo, R. Are We Really Vastly Outnumbered? Revisiting the Ratio of Bacterial to Host Cells in Humans. Cell 164, 337-340 (2016). 3. Gilbert, J.A., et al. Current understanding of the human microbiome. Nat Med 24, 392-400 (2018). 4. Marchesi, J.R. & Ravel, J. The vocabulary of microbiome research: a proposal. Microbiome 3, 31 (2015). 5. Byrd, A.L., Belkaid, Y. & Segre, J.A. The human skin microbiome. Nat Rev Microbiol 16, 143- 155 (2018). 6. Krishnan, K., Chen, T. & Paster, B.J. A practical guide to the oral microbiome and its relation to health and disease. Oral Dis 23, 276-286 (2017). 7. Rampersaud, R., Randis, T.M. & Ratner, A.J. Microbiota of the upper and lower genital tract. Semin Fetal Neonatal Med 17, 51-57 (2012). 8. Gottschick, C., et al. The urinary microbiota of men and women and its changes in women during bacterial vaginosis and antibiotic treatment. Microbiome 5, 99 (2017). 9. Lloyd-Price, J., et al. Strains, functions and dynamics in the expanded Human Microbiome Project. Nature 550, 61-66 (2017). 10. Weinstock, G.M. Genomic approaches to studying the human microbiota. Nature 489, 250-256 (2012). 11. Qin, J., et al. A human gut microbial gene catalogue established by metagenomic sequencing. Nature 464, 59-65 (2010). 12. Turnbaugh, P.J., et al. A core gut microbiome in obese and lean twins. Nature 457, 480-484 (2009). 13. Cebra, J.J. Influences of microbiota on intestinal immune system development. Am J Clin Nutr 69, 1046S-1051S (1999). 14. Belkaid, Y. & Hand, T.W. Role of the microbiota in immunity and inflammation. Cell 157, 121- 141 (2014). 15. Pelzer, E., Gomez-Arango, L.F., Barrett, H.L. & Nitert, M.D. Review: Maternal health and the . Placenta 54, 30-37 (2017). 16. Houghteling, P.D. & Walker, W.A. Why is initial bacterial colonization of the intestine important to infants' and children's health? J Pediatr Gastroenterol Nutr 60, 294-307 (2015). 17. Jost, T., Lacroix, C., Braegger, C.P. & Chassard, C. New insights in gut microbiota establishment in healthy breast fed neonates. PLoS One 7, e44595 (2012). 18. Dominguez-Bello, M.G., et al. Delivery mode shapes the acquisition and structure of the initial microbiota across multiple body habitats in newborns. Proc Natl Acad Sci U S A 107, 11971- 11975 (2010). 19. Yatsunenko, T., et al. Human gut microbiome viewed across age and geography. Nature 486, 222-227 (2012). 20. Faith, J.J., et al. The long-term stability of the human gut microbiota. Science 341, 1237439 (2013).

109

21. Schloissnig, S., et al. Genomic variation landscape of the human gut microbiome. Nature 493, 45-50 (2013). 22. Mehta, R.S., et al. Stability of the human faecal microbiome in a cohort of adult men. Nat Microbiol 3, 347-355 (2018). 23. Lozupone, C.A., Stombaugh, J.I., Gordon, J.I., Jansson, J.K. & Knight, R. Diversity, stability and resilience of the human gut microbiota. Nature 489, 220-230 (2012). 24. Hannigan, G.D., et al. The human skin double-stranded DNA virome: topographical and temporal diversity, genetic enrichment, and dynamic associations with the host microbiome. MBio 6, e01578-01515 (2015). 25. Ma, B., Forney, L.J. & Ravel, J. Vaginal microbiome: rethinking health and disease. Annu Rev Microbiol 66, 371-389 (2012). 26. Oh, J., et al. Temporal Stability of the Human Skin Microbiome. Cell 165, 854-866 (2016). 27. Ursell, L.K., et al. The interpersonal and intrapersonal diversity of human-associated microbiota in key body sites. J Allergy Clin Immunol 129, 1204-1208 (2012). 28. Genomes Project, C., et al. A global reference for human genetic variation. Nature 526, 68-74 (2015). 29. Turpin, W., et al. Association of host genome with intestinal microbial composition in a large healthy cohort. Nat Genet 48, 1413-1417 (2016). 30. Bonder, M.J., et al. The effect of host genetics on the gut microbiome. Nat Genet 48, 1407-1412 (2016). 31. Knights, D., et al. Complex host genetics influence the microbiome in inflammatory bowel disease. Genome Med 6, 107 (2014). 32. Kolde, R., et al. Host genetic variation and its microbiome interactions within the Human Microbiome Project. Genome Med 10, 6 (2018). 33. Goodrich, J.K., et al. Genetic Determinants of the Gut Microbiome in UK Twins. Cell Host Microbe 19, 731-743 (2016). 34. Org, E., et al. Sex differences and hormonal effects on gut microbiota composition in mice. Gut Microbes 7, 313-322 (2016). 35. Gupta, V.K., Paul, S. & Dutta, C. Geography, Ethnicity or Subsistence-Specific Variations in Human Microbiome Composition and Diversity. Front Microbiol 8, 1162 (2017). 36. Goodrich, J.K., et al. Human genetics shape the gut microbiome. Cell 159, 789-799 (2014). 37. Lee, S., Sung, J., Lee, J. & Ko, G. Comparison of the gut of healthy adult twins living in South Korea and the United States. Applied and environmental microbiology 77, 7433- 7437 (2011). 38. Vangay, P., et al. US Immigration Westernizes the Human Gut Microbiome. Cell 175, 962-972 e910 (2018). 39. Mueller, E. & Blaser, M. Breast milk, formula, the microbiome and overweight. Nat Rev Endocrinol 14, 510-511 (2018). 40. Chu, D.M., Meyer, K.M., Prince, A.L. & Aagaard, K.M. Impact of maternal nutrition in pregnancy and lactation on offspring gut microbial composition and function. Gut Microbes 7, 459-470 (2016). 41. Tamburini, S., Shen, N., Wu, H.C. & Clemente, J.C. The microbiome in early life: implications for health outcomes. Nat Med 22, 713-722 (2016). 42. De Filippo, C., et al. Impact of diet in shaping gut microbiota revealed by a comparative study in children from Europe and rural Africa. Proc Natl Acad Sci U S A 107, 14691-14696 (2010). 43. David, L.A., et al. Diet rapidly and reproducibly alters the human gut microbiome. Nature 505, 559-563 (2014). 44. Piening, B.D., et al. Integrative Personal Omics Profiles during Periods of Weight Gain and Loss. Cell Syst 6, 157-170 e158 (2018). 45. Koren, O., et al. Host remodeling of the gut microbiome and metabolic changes during pregnancy. Cell 150, 470-480 (2012).

110

46. Langdon, A., Crook, N. & Dantas, G. The effects of antibiotics on the microbiome throughout development and alternative approaches for therapeutic modulation. Genome Med 8, 39 (2016). 47. Francino, M.P. Antibiotics and the Human Gut Microbiome: Dysbioses and Accumulation of Resistances. Front Microbiol 6, 1543 (2015). 48. Saulnier, D.M., et al. The intestinal microbiome, probiotics and prebiotics in neurogastroenterology. Gut Microbes 4, 17-27 (2013). 49. Petschow, B., et al. Probiotics, prebiotics, and the host microbiome: the science of translation. Ann N Y Acad Sci 1306, 1-17 (2013). 50. Moore, W.E. & Holdeman, L.V. Human fecal flora: the normal flora of 20 Japanese-Hawaiians. Appl Microbiol 27, 961-979 (1974). 51. Finegold, S.M., Sutter, V.L. & Mathisen, G.E. CHAPTER 1 - Normal Indigenous Intestinal Flora. in Human Intestinal Microflora in Health and Disease (ed. Hentges, D.J.) 3-31 (Academic Press, 1983). 52. Grice, E.A. & Segre, J.A. The skin microbiome. Nat Rev Microbiol 9, 244-253 (2011). 53. Faust, K. & Raes, J. Microbial interactions: from networks to models. Nat Rev Microbiol 10, 538- 550 (2012). 54. Hibbing, M.E., Fuqua, C., Parsek, M.R. & Peterson, S.B. Bacterial competition: surviving and thriving in the microbial jungle. Nat Rev Microbiol 8, 15-25 (2010). 55. Suau, A., et al. Direct analysis of genes encoding 16S rRNA from complex communities reveals many novel molecular species within the human gut. Applied and environmental microbiology 65, 4799-4807 (1999). 56. Eckburg, P.B., et al. Diversity of the human intestinal microbial flora. Science 308, 1635-1638 (2005). 57. Sanger, F., Nicklen, S. & Coulson, A.R. DNA sequencing with chain-terminating inhibitors. Proc Natl Acad Sci U S A 74, 5463-5467 (1977). 58. Heather, J.M. & Chain, B. The sequence of sequencers: The history of sequencing DNA. Genomics 107, 1-8 (2016). 59. Escobar-Zepeda, A., Vera-Ponce de Leon, A. & Sanchez-Flores, A. The Road to Metagenomics: From Microbiology to DNA Sequencing Technologies and Bioinformatics. Front Genet 6, 348 (2015). 60. Cole, J.R., et al. Ribosomal Database Project: data and tools for high throughput rRNA analysis. Nucleic acids research 42, D633-642 (2014). 61. Quast, C., et al. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic acids research 41, D590-596 (2013). 62. Geer, L.Y., et al. The NCBI BioSystems database. Nucleic acids research 38, D492-496 (2010). 63. Chen, K. & Pachter, L. Bioinformatics for whole-genome shotgun sequencing of microbial communities. PLoS Comput Biol 1, 106-112 (2005). 64. Clarridge, J.E., 3rd. Impact of 16S rRNA gene sequence analysis for identification of bacteria on clinical microbiology and infectious diseases. Clinical microbiology reviews 17, 840-862, table of contents (2004). 65. Caporaso, J.G., et al. Global patterns of 16S rRNA diversity at a depth of millions of sequences per sample. Proc Natl Acad Sci U S A 108 Suppl 1, 4516-4522 (2011). 66. Tringe, S.G. & Hugenholtz, P. A renaissance for the pioneering 16S rRNA gene. Current opinion in microbiology 11, 442-446 (2008). 67. Pace, N.R. A molecular view of microbial diversity and the biosphere. Science 276, 734-740 (1997). 68. Woese, C.R. & Fox, G.E. Phylogenetic structure of the prokaryotic domain: the primary kingdoms. Proc Natl Acad Sci U S A 74, 5088-5090 (1977). 69. Bazinet, A.L. & Cummings, M.P. A comparative evaluation of sequence classification programs. BMC Bioinformatics 13, 92 (2012).

111

70. Poretsky, R., Rodriguez, R.L., Luo, C., Tsementzi, D. & Konstantinidis, K.T. Strengths and limitations of 16S rRNA gene amplicon sequencing in revealing temporal microbial community dynamics. PLoS One 9, e93827 (2014). 71. Stackebrandt, E. & Goebel, B. Taxonomic note: a place for DNA-DNA reassociation and 16S rRNA sequence analysis in the present species definition in bacteriology. International Journal of Systematic and Evolutionary Microbiology 44, 846-849 (1994). 72. Mysara, M., et al. Reconciliation between operational taxonomic units and species boundaries. FEMS Microbiol Ecol 93(2017). 73. Blaxter, M., et al. Defining operational taxonomic units using DNA barcode data. Philos Trans R Soc Lond B Biol Sci 360, 1935-1943 (2005). 74. Hughes, J.B., Hellmann, J.J., Ricketts, T.H. & Bohannan, B.J. Counting the uncountable: statistical approaches to estimating microbial diversity. Appl. Environ. Microbiol. 67, 4399-4406 (2001). 75. Chao, A. & Lee, S.-M. Estimating the number of classes via sample coverage. Journal of the American statistical Association 87, 210-217 (1992). 76. Hughes, J.B., Hellmann, J.J., Ricketts, T.H. & Bohannan, B.J. Counting the uncountable: statistical approaches to estimating microbial diversity. Applied and environmental microbiology 67, 4399-4406 (2001). 77. Hill, M.O. Diversity and evenness: a unifying notation and its consequences. Ecology 54, 427- 432 (1973). 78. Jost, L. Entropy and diversity. Oikos 113, 363-375 (2006). 79. Tuomisto, H. A diversity of beta diversities: straightening up a concept gone awry. Part 1. Defining beta diversity as a function of alpha and gamma diversity. Ecography 33, 2-22 (2010). 80. Tuomisto, H. A consistent terminology for quantifying species diversity? Yes, it does exist. Oecologia 164, 853-860 (2010). 81. Mayoral, M. Renyi's entropy as an index of diversity in simple-stage cluster sampling. Information Sciences 105, 101-114 (1998). 82. Bonilla-Rosso, G., Eguiarte, L.E., Romero, D., Travisano, M. & Souza, V. Understanding microbial community diversity metrics derived from metagenomes: performance evaluation using simulated data sets. FEMS Microbiol Ecol 82, 37-49 (2012). 83. Bray, J.R. & Curtis, J.T. An ordination of the upland forest communities of southern Wisconsin. Ecological monographs 27, 325-349 (1957). 84. Lozupone, C. & Knight, R. UniFrac: a new phylogenetic method for comparing microbial communities. Applied and environmental microbiology 71, 8228-8235 (2005). 85. Xia, Y. & Sun, J. Hypothesis Testing and Statistical Analysis of Microbiome. Genes Dis 4, 138- 148 (2017). 86. Segata, N., et al. Metagenomic biomarker discovery and explanation. Genome biology 12, R60 (2011). 87. McLachlan, G. Discriminant analysis and statistical pattern recognition, (John Wiley & Sons, 2004). 88. Knights, D., Costello, E.K. & Knight, R. Supervised classification of human microbiota. FEMS microbiology reviews 35, 343-359 (2011). 89. Zhou, Y., Wan, X., Zhang, B. & Tong, T. Classifying next-generation sequencing data using a zero-inflated Poisson model. Bioinformatics 34, 1329-1335 (2018). 90. Reyniers, J.A. The pure‐culture concept and gnotobiotics. Annals of the New York Academy of Sciences 78, 3-16 (1959). 91. Mazmanian, S.K., Liu, C.H., Tzianabos, A.O. & Kasper, D.L. An immunomodulatory molecule of symbiotic bacteria directs maturation of the host immune system. Cell 122, 107-118 (2005). 92. Ivanov, II, et al. Specific microbiota direct the differentiation of IL-17-producing T-helper cells in the mucosa of the small intestine. Cell Host Microbe 4, 337-349 (2008). 93. Robinson, C.M. & Pfeiffer, J.K. Viruses and the Microbiota. Annu Rev Virol 1, 55-69 (2014).

112

94. Vijay-Kumar, M., et al. Metabolic syndrome and altered gut microbiota in mice lacking Toll-like receptor 5. Science 328, 228-231 (2010). 95. Oh, J.Z., et al. TLR5-mediated sensing of gut microbiota is necessary for antibody responses to seasonal influenza vaccination. Immunity 41, 478-492 (2014). 96. Guss, J.D., et al. Alterations to the Gut Microbiome Impair Bone Strength and Tissue Material Properties. J Bone Miner Res 32, 1343-1353 (2017). 97. Turnbaugh, P.J., Backhed, F., Fulton, L. & Gordon, J.I. Diet-induced obesity is linked to marked but reversible alterations in the mouse distal gut microbiome. Cell Host Microbe 3, 213-223 (2008). 98. Ridaura, V.K., et al. Gut microbiota from twins discordant for obesity modulate metabolism in mice. Science 341, 1241214 (2013). 99. Turnbaugh, P.J., et al. An obesity-associated gut microbiome with increased capacity for energy harvest. Nature 444, 1027-1031 (2006). 100. Kamada, N., Seo, S.U., Chen, G.Y. & Nunez, G. Role of the gut microbiota in immunity and inflammatory disease. Nat Rev Immunol 13, 321-335 (2013). 101. Sartor, R.B. Intestinal microflora in human and experimental inflammatory bowel disease. Curr Opin Gastroenterol 17, 324-330 (2001). 102. Farrell, R.J. & LaMont, J.T. Microbial factors in inflammatory bowel disease. Gastroenterology clinics of North America 31, 41-62 (2002). 103. Pahwa, R., et al. Gut Microbiome and Inflammation: A Study of Diabetic Inflammasome- Knockout Mice. J Diabetes Res 2017, 6519785 (2017). 104. DeGruttola, A.K., Low, D., Mizoguchi, A. & Mizoguchi, E. Current Understanding of Dysbiosis in Disease in Human and Animal Models. Inflamm Bowel Dis 22, 1137-1150 (2016). 105. Zaneveld, J.R., McMinds, R. & Vega Thurber, R. Stress and stability: applying the Anna Karenina principle to animal microbiomes. Nat Microbiol 2, 17121 (2017). 106. Weil, A.A. & Hohmann, E.L. Fecal microbiota transplant: benefits and risks. Open Forum Infect Dis 2, ofv005 (2015). 107. Barre-Sinoussi, F. & Montagutelli, X. Animal models are essential to biological research: issues and perspectives. Future Sci OA 1, FSO63 (2015). 108. Franklin, C.L. & Ericsson, A.C. Microbiota and reproducibility of rodent models. Lab Anim (NY) 46, 114-122 (2017). 109. Nguyen, T.L., Vieira-Silva, S., Liston, A. & Raes, J. How informative is the mouse for human gut microbiota research? Dis Model Mech 8, 1-16 (2015). 110. Babu, A., et al. Secretory phospholipase A2 is required to produce histologic changes associated with gastroduodenal reflux in a murine model. J Thorac Cardiovasc Surg 135, 1220-1227 (2008). 111. Gallo, R.L. & Hooper, L.V. Epithelial antimicrobial defence of the skin and intestine. Nat Rev Immunol 12, 503-516 (2012). 112. Carmody, R.N., et al. Diet dominates host genotype in shaping the murine gut microbiota. Cell Host Microbe 17, 72-84 (2015). 113. Human Microbiome Project, C. A framework for human microbiome research. Nature 486, 215- 221 (2012). 114. Integrative, H.M.P.R.N.C. The Integrative Human Microbiome Project: dynamic analysis of microbiome-host omics profiles during periods of human health and disease. Cell Host Microbe 16, 276-289 (2014). 115. Li, J., et al. An integrated catalog of reference genes in the human gut microbiome. Nat Biotechnol 32, 834-841 (2014). 116. Arumugam, M., et al. Enterotypes of the human gut microbiome. Nature 473, 174-180 (2011). 117. Costea, P.I., et al. Enterotypes in the landscape of gut microbial community composition. Nat Microbiol 3, 8-16 (2018). 118. Zhou, Y., et al. Exploration of bacterial community classes in major human habitats. Genome biology 15, R66 (2014).

113

119. Le Chatelier, E., et al. Richness of human gut microbiome correlates with metabolic markers. Nature 500, 541-546 (2013). 120. Li, Y., et al. A Functional Genomics Approach to Understand Variation in Cytokine Production in Humans. Cell 167, 1099-1110 e1014 (2016). 121. Ter Horst, R., et al. Host and Environmental Factors Influencing Individual Human Cytokine Responses. Cell 167, 1111-1124 e1113 (2016). 122. Schirmer, M., et al. Linking the Human Gut Microbiome to Inflammatory Cytokine Production Capacity. Cell 167, 1125-1136 e1128 (2016). 123. Christ, A., et al. Western Diet Triggers NLRP3-Dependent Innate Immune Reprogramming. Cell 172, 162-175 e114 (2018). 124. Woodmansey, E.J. Intestinal bacteria and ageing. J Appl Microbiol 102, 1178-1186 (2007). 125. O'Mahony, D., O'Leary, P. & Quigley, E.M. Aging and intestinal motility: a review of factors that affect intestinal motility in the aged. Drugs Aging 19, 515-527 (2002). 126. Shaw, A.C., Joshi, S., Greenwood, H., Panda, A. & Lord, J.M. Aging of the innate immune system. Curr Opin Immunol 22, 507-513 (2010). 127. Gomez, C.R., Boehmer, E.D. & Kovacs, E.J. The aging innate immune system. Curr Opin Immunol 17, 457-462 (2005). 128. Oh, J., Lee, Y.D. & Wagers, A.J. Stem cell aging: mechanisms, regulators and therapeutic opportunities. Nat Med 20, 870-880 (2014). 129. Tiihonen, K., Ouwehand, A.C. & Rautonen, N. Human intestinal microbiota and healthy ageing. Ageing Res Rev 9, 107-116 (2010). 130. Cusack, S. & O'toole, P.W. Challenges and implications for biomedical research and intervention studies in older populations: Insights from the ELDERMET study. Gerontology 59, 114-121 (2013). 131. O’Toole, P.W. & Jeffery, I.B. Gut microbiota and aging. Science 350, 1214-1215 (2015). 132. Claesson, M.J., et al. Gut microbiota composition correlates with diet and health in the elderly. Nature 488, 178 (2012). 133. Nimmons, D. & Limdi, J.K. Elderly patients and inflammatory bowel disease. World J Gastrointest Pharmacol Ther 7, 51-65 (2016). 134. Choi, J., Hur, T.Y. & Hong, Y. Influence of Altered Gut Microbiota Composition on Aging and Aging-Related Diseases. J Lifestyle Med 8, 1-7 (2018). 135. Kirkman, M.S., et al. Diabetes in older adults. Diabetes Care 35, 2650-2664 (2012). 136. Lakens, D. Calculating and reporting effect sizes to facilitate cumulative science: a practical primer for t-tests and ANOVAs. Front Psychol 4, 863 (2013). 137. Mattiello, F., et al. A web application for sample size and power calculation in case-control microbiome studies. Bioinformatics 32, 2038-2040 (2016). 138. La Rosa, P.S., et al. Hypothesis testing and power calculations for taxonomic-based human microbiome data. PLoS One 7, e52078 (2012). 139. Zhang, X., et al. Negative binomial mixed models for analyzing microbiome count data. BMC Bioinformatics 18, 4 (2017). 140. White, J.R., Nagarajan, N. & Pop, M. Statistical methods for detecting differentially abundant features in clinical metagenomic samples. PLoS Comput Biol 5, e1000352 (2009). 141. Mandal, S., et al. Analysis of composition of microbiomes: a novel method for studying microbial composition. Microb Ecol Health Dis 26, 27663 (2015). 142. Paulson, J.N., Stine, O.C., Bravo, H.C. & Pop, M. Differential abundance analysis for microbial marker-gene surveys. Nat Methods 10, 1200-1202 (2013). 143. Claesson, M.J., et al. Composition, variability, and temporal stability of the intestinal microbiota of the elderly. Proc Natl Acad Sci U S A 108 Suppl 1, 4586-4591 (2011). 144. Hopkins, M.J. & Macfarlane, G.T. Changes in predominant bacterial populations in human faeces with age and with Clostridium difficile infection. Journal of medical microbiology 51, 448-454 (2002).

114

145. Woodmansey, E.J., McMurdo, M.E., Macfarlane, G.T. & Macfarlane, S. Comparison of compositions and metabolic activities of fecal microbiotas in young adults and in antibiotic- treated and non-antibiotic-treated elderly subjects. Applied and environmental microbiology 70, 6113-6122 (2004). 146. Galkin, F., et al. Human microbiome aging clocks based on deep learning and tandem of permutation feature importance and accumulated local effects. bioRxiv, 507780 (2018). 147. Studenski, S. & Ferrucci, L. Chapter 29 - Research in Special Populations: Geriatrics. in Clinical and Translational Science (Second Edition) (eds. Robertson, D. & Williams, G.H.) 533-553 (Academic Press, 2017). 148. Hardy, S.E., Kang, Y., Studenski, S.A. & Degenholtz, H.B. Ability to walk 1/4 mile predicts subsequent disability, mortality, and health care costs. J Gen Intern Med 26, 130-135 (2011). 149. Podsiadlo, D. & Richardson, S. The timed “Up & Go”: a test of basic functional mobility for frail elderly persons. Journal of the American Geriatrics Society 39, 142-148 (1991). 150. Rockwood, K., Awalt, E., Carver, D. & MacKnight, C. Feasibility and measurement properties of the functional reach and the timed up and go tests in the Canadian study of health and aging. J Gerontol A Biol Sci Med Sci 55, M70-73 (2000). 151. Human Microbiome Project, C. Structure, function and diversity of the healthy human microbiome. Nature 486, 207-214 (2012). 152. Li, K., Bihan, M., Yooseph, S. & Methe, B.A. Analyses of the microbial diversity across the human microbiome. PLoS One 7, e32118 (2012). 153. Breiman, L. Random forests. Machine learning 45, 5-32 (2001). 154. Statnikov, A., et al. A comprehensive evaluation of multicategory classification methods for microbiomic data. Microbiome 1, 11 (2013). 155. van Tongeren, S.P., Slaets, J.P., Harmsen, H.J. & Welling, G.W. Fecal microbiota composition and frailty. Applied and environmental microbiology 71, 6438-6442 (2005). 156. Woodmansey, E.J., McMurdo, M.E., Macfarlane, G.T. & Macfarlane, S. Comparison of compositions and metabolic activities of fecal microbiotas in young adults and in antibiotic- treated and non-antibiotic-treated elderly subjects. Applied and environmental microbiology 70, 6113-6122 (2004). 157. Bartosch, S., Fite, A., Macfarlane, G.T. & McMurdo, M.E. Characterization of bacterial communities in feces from healthy elderly volunteers and hospitalized elderly patients by using real-time PCR and effects of antibiotic treatment on the fecal microbiota. Applied and environmental microbiology 70, 3575-3581 (2004). 158. Bian, G., et al. The Gut Microbiota of Healthy Aged Chinese Is Similar to That of the Healthy Young. mSphere 2(2017). 159. Langille, M.G., et al. Microbial shifts in the aging mouse gut. Microbiome 2, 50 (2014). 160. Claesson, M.J., et al. Gut microbiota composition correlates with diet and health in the elderly. Nature 488, 178-184 (2012). 161. Scapoli, L., et al. Microflora and . Dental research journal 9, S202-206 (2012). 162. Ren, C., Webster, P., Finkel, S.E. & Tower, J. Increased internal and external bacterial load during Drosophila aging without life-span trade-off. Cell Metab 6, 144-152 (2007). 163. Shanley, D.P., Aw, D., Manley, N.R. & Palmer, D.B. An evolutionary perspective on the mechanisms of immunosenescence. Trends Immunol 30, 374-381 (2009). 164. Ostan, R., et al. Immunosenescence and immunogenetics of human longevity. Neuroimmunomodulation 15, 224-240 (2008). 165. O'Toole, P.W. & Jeffery, I.B. Gut microbiota and aging. Science 350, 1214-1215 (2015). 166. Martin, R.E. Management of dry mouth in elderly patients. J Gt Houst Dent Soc 66, 25-28; quiz 29 (1994). 167. Qin, N., et al. Alterations of the human gut microbiome in liver cirrhosis. Nature 513, 59-64 (2014).

115

168. He, J., Li, Y., Cao, Y., Xue, J. & Zhou, X. The oral microbiome diversity and its relation to human diseases. Folia microbiologica 60, 69-80 (2015). 169. Faveri, M., et al. Microbiological diversity of generalized aggressive periodontitis by 16S rRNA clonal analysis. and immunology 23, 112-118 (2008). 170. Carding, S., Verbeke, K., Vipond, D.T., Corfe, B.M. & Owen, L.J. Dysbiosis of the gut microbiota in disease. Microb Ecol Health Dis 26, 26191 (2015). 171. Qin, J., et al. A metagenome-wide association study of gut microbiota in type 2 diabetes. Nature 490, 55-60 (2012). 172. Larsen, N., et al. Gut microbiota in human adults with type 2 diabetes differs from non-diabetic adults. PLoS One 5, e9085 (2010). 173. Karlsson, F.H., et al. Gut metagenome in European women with normal, impaired and diabetic glucose control. Nature 498, 99-103 (2013). 174. Sato, J., et al. Gut dysbiosis and detection of "live gut bacteria" in blood of Japanese patients with type 2 diabetes. Diabetes Care 37, 2343-2350 (2014). 175. Zhao, L., et al. Gut bacteria selectively promoted by dietary fibers alleviate type 2 diabetes. Science 359, 1151-1156 (2018). 176. Kriss, M., Hazleton, K.Z., Nusbacher, N.M., Martin, C.G. & Lozupone, C.A. Low diversity gut microbiota dysbiosis: drivers, functional implications and recovery. Current opinion in microbiology 44, 34-40 (2018). 177. Vrieze, A., et al. Transfer of intestinal microbiota from lean donors increases insulin sensitivity in individuals with metabolic syndrome. Gastroenterology 143, 913-916 e917 (2012). 178. Cefalu, W.T. Inflammation, insulin resistance, and type 2 diabetes: back to the future? Diabetes 58, 307-308 (2009). 179. Donath, M.Y. & Shoelson, S.E. Type 2 diabetes as an inflammatory disease. Nat Rev Immunol 11, 98-107 (2011). 180. Amar, J., et al. Involvement of tissue bacteria in the onset of diabetes in humans: evidence for a concept. Diabetologia 54, 3055-3061 (2011). 181. Thomsen, R.W., et al. Diabetes mellitus as a risk and prognostic factor for community-acquired bacteremia due to enterobacteria: a 10-year, population-based study among adults. Clin Infect Dis 40, 628-631 (2005). 182. Luissint, A.C., Parkos, C.A. & Nusrat, A. Inflammation and the Intestinal Barrier: Leukocyte- Epithelial Cell Interactions, Cell Junction Remodeling, and Mucosal Repair. Gastroenterology 151, 616-632 (2016). 183. Cosorich, I., et al. High frequency of intestinal TH17 cells correlates with microbiota alterations and disease activity in multiple sclerosis. Sci Adv 3, e1700492 (2017). 184. Kempski, J., Brockmann, L., Gagliani, N. & Huber, S. TH17 Cell and Epithelial Cell Crosstalk during Inflammatory Bowel Disease and Carcinogenesis. Front Immunol 8, 1373 (2017). 185. Lopez, P., et al. Th17 responses and natural IgM antibodies are related to gut microbiota composition in systemic lupus erythematosus patients. Scientific reports 6, 24072 (2016). 186. Hevia, A., et al. Intestinal dysbiosis associated with systemic lupus erythematosus. MBio 5, e01548-01514 (2014). 187. Liang, S.C., et al. Interleukin (IL)-22 and IL-17 are coexpressed by Th17 cells and cooperatively enhance expression of antimicrobial peptides. J Exp Med 203, 2271-2279 (2006). 188. Miossec, P. & Kolls, J.K. Targeting IL-17 and TH17 cells in chronic inflammation. Nature reviews. Drug discovery 11, 763-776 (2012). 189. Gaffen, S.L. An overview of IL-17 function and signaling. Cytokine 43, 402-407 (2008). 190. Wang, Y., Mumm, J.B., Herbst, R., Kolbeck, R. & Wang, Y. IL-22 Increases Permeability of Intestinal Epithelial Tight Junctions by Enhancing Claudin-2 Expression. J Immunol 199, 3316- 3325 (2017). 191. Aden, K., et al. Epithelial IL-23R Signaling Licenses Protective IL-22 Responses in Intestinal Inflammation. Cell Rep 16, 2208-2218 (2016).

116

192. Aujla, S.J., et al. IL-22 mediates mucosal host defense against Gram-negative bacterial pneumonia. Nat Med 14, 275-281 (2008). 193. Ishigame, H., et al. Differential roles of interleukin-17A and -17F in host defense against mucoepithelial bacterial infection and allergic responses. Immunity 30, 108-119 (2009). 194. Ivanov, II, et al. Induction of intestinal Th17 cells by segmented filamentous bacteria. Cell 139, 485-498 (2009). 195. Tan, T.G., et al. Identifying species of symbiont bacteria from the human gut that, alone, can induce intestinal Th17 cells in mice. Proc Natl Acad Sci U S A 113, E8141-E8150 (2016). 196. Atarashi, K., et al. Th17 Cell Induction by Adhesion of Microbes to Intestinal Epithelial Cells. Cell 163, 367-380 (2015). 197. Van Maele, L., et al. TLR5 signaling stimulates the innate production of IL-17 and IL-22 by CD3(neg)CD127+ immune cells in spleen and mucosa. J Immunol 185, 1177-1185 (2010). 198. Zuniga, L.A., et al. IL-17 regulates adipogenesis, glucose homeostasis, and obesity. J Immunol 185, 6947-6959 (2010). 199. Garidou, L., et al. The Gut Microbiota Regulates Intestinal CD4 T Cells Expressing RORgammat and Controls Metabolic Disease. Cell Metab 22, 100-112 (2015). 200. Natividad, J.M., et al. Impaired Aryl Hydrocarbon Receptor Ligand Production by the Gut Microbiota Is a Key Factor in Metabolic Syndrome. Cell Metab 28, 737-749 e734 (2018). 201. Baricza, E., Tamasi, V., Marton, N., Buzas, E.I. & Nagy, G. The emerging role of aryl hydrocarbon receptor in the activation and differentiation of Th17 cells. Cell Mol Life Sci 73, 95- 117 (2016). 202. Wang, X., et al. Interleukin-22 alleviates metabolic disorders and restores mucosal immunity in diabetes. Nature 514, 237-241 (2014). 203. Shen, J., Fang, Y., Zhu, H. & Ge, W. Plasma interleukin-22 levels are associated with prediabetes and type 2 diabetes in the Han Chinese population. J Diabetes Investig 9, 33-38 (2018). 204. Martins, L.M.S., et al. Interleukin-23 promotes intestinal T helper type17 immunity and ameliorates obesity-associated metabolic syndrome in a murine high-fat diet model. Immunology (2018). 205. Hong, C.P., et al. Gut-Specific Delivery of T-Helper 17 Cells Reduces Obesity and Insulin Resistance in Mice. Gastroenterology 152, 1998-2010 (2017). 206. Dalmas, E., et al. T cell-derived IL-22 amplifies IL-1beta-driven inflammation in human adipose tissue: relevance to obesity and type 2 diabetes. Diabetes 63, 1966-1977 (2014). 207. Surendar, J., Aravindhan, V., Rao, M.M., Ganesan, A. & Mohan, V. Decreased serum interleukin-17 and increased transforming growth factor-beta levels in subjects with metabolic syndrome (Chennai Urban Rural Epidemiology Study-95). Metabolism 60, 586-590 (2011). 208. Arababadi, M.K., et al. Nephropathic complication of type-2 diabetes is following pattern of autoimmune diseases? Diabetes Res Clin Pract 87, 33-37 (2010). 209. Chen, H., Ren, X., Liao, N. & Wen, F. Th17 cell frequency and IL-17A concentrations in peripheral blood mononuclear cells and vitreous fluid from patients with diabetic retinopathy. J Int Med Res 44, 1403-1413 (2016). 210. Santalahti, K., et al. Circulating Cytokines Predict the Development of Insulin Resistance in a Prospective Finnish Population Cohort. J Clin Endocrinol Metab 101, 3361-3369 (2016). 211. Jagannathan-Bogdan, M., et al. Elevated proinflammatory cytokine production by a skewed T cell compartment requires monocytes and promotes inflammation in type 2 diabetes. J Immunol 186, 1162-1172 (2011). 212. Zhang, C., et al. The alteration of Th1/Th2/Th17/Treg paradigm in patients with type 2 diabetes mellitus: Relationship with diabetic nephropathy. Hum Immunol 75, 289-296 (2014). 213. Yousefidaredor, H., et al. IL-17A plays an important role in induction of type 2 diabetes and its complications. Asian Pacific Journal of Tropical Disease 4, 412-415 (2014). 214. Wang, B., et al. Increased expression of Th17 cytokines and interleukin-22 correlates with disease activity in pristane-induced arthritis in rats. PLoS One 12, e0188199 (2017).

117

215. Al-Saadany, H.M., Hussein, M.S., Gaber, R.A. & Zaytoun, H.A. Th-17 cells and serum IL-17 in rheumatoid arthritis patients: correlation with disease activity and severity. The Egyptian Rheumatologist 38, 1-7 (2016). 216. Michalak-Stoma, A., et al. Serum levels of selected Th17 and Th22 cytokines in psoriatic patients. Disease Markers 35, 625-631 (2013). 217. Misra, D.P., Chaurasia, S. & Misra, R. Increased Circulating Th17 Cells, Serum IL-17A, and IL- 23 in Takayasu Arteritis. Autoimmune Dis 2016, 7841718 (2016). 218. Schmechel, S., et al. Linking genetic susceptibility to Crohn's disease with Th17 cell function: IL-22 serum levels are increased in Crohn's disease and correlate with disease activity and IL23R genotype status. Inflamm Bowel Dis 14, 204-212 (2008). 219. Sahin, A., et al. Serum interleukin 17 levels in patients with Crohn's disease: real life data. Dis Markers 2014, 690853 (2014). 220. Wang, Y.L., Koh, W.P., Talaei, M., Yuan, J.M. & Pan, A. Association between the ratio of triglyceride to high-density lipoprotein cholesterol and incident type 2 diabetes in Singapore Chinese men and women. J Diabetes 9, 689-698 (2017). 221. Brands, M.W. & Manhiani, M.M. Sodium-retaining effect of insulin in diabetes. Am J Physiol Regul Integr Comp Physiol 303, R1101-1109 (2012). 222. Pilling, L.C., et al. Red blood cell distribution width: Genetic evidence for aging pathways in 116,666 volunteers. PLoS One 12, e0185083 (2017). 223. Wu, W., Chen, F., Liu, Z. & Cong, Y. Microbiota-specific Th17 Cells: Yin and Yang in Regulation of Inflammatory Bowel Disease. Inflamm Bowel Dis 22, 1473-1482 (2016). 224. Wenyu Zhou, M.R.S., Kévin Contrepois, Yanjiao Zhou, Sara Ahadi, Shana R. Leopold, Martin J. Zhang, Varsha Rao, Monika Avina, Tejaswini Mishra, Jethro Johnson, Brittany Lee, Songjie Chen, Dong-Binh Tran, Hoan Nguyen, Xin Zhou, Brandon Albright, Bo-Young Hong, Lauren Petersen, Eddy Bautista, Blake Hanson, Lei Chen, Daniel Spakowicz, Denis Salins, Benjamin Leopold, Melanie Ashland, Orit Shannon Rego, Pats Limcaoco, Elizabeth Colbert, Candice Allister, Dalia Perelman, Colleen Craig, Eric Wei, Hassan Chaib, Daniel Hornburg, Jessilyn Dunn, Liang Liang, Sophia Miryam Schüssler-Fiorenza Rose, Brian Piening, Hannes Rost, David Tse, Tracey McLaughlin, Erica Sodergren, George M. Weinstock, Michael Snyder. Complex host-microbial dynamics in prediabetes revealed through longitudinal multi-omics profiling. Nature (Submitted)(2018). 225. Onuora, S. Autoimmunity: Human gut bacteria induce TH17 cells. Nat Rev Rheumatol 13, 2 (2016). 226. Quintana, F.J., et al. Control of T(reg) and T(H)17 cell differentiation by the aryl hydrocarbon receptor. Nature 453, 65-71 (2008). 227. Asarat, M., Apostolopoulos, V., Vasiljevic, T. & Donkor, O. Short-Chain Fatty Acids Regulate Cytokines and Th17/Treg Cells in Human Peripheral Blood Mononuclear Cells in vitro. Immunol Invest 45, 205-222 (2016). 228. Jackson, M.A., et al. Signatures of early frailty in the gut microbiota. Genome Med 8, 8 (2016). 229. Wang, J., et al. Respiratory influenza virus infection induces intestinal immune injury via microbiota-mediated Th17 cell-dependent inflammation. J Exp Med 211, 2397-2410 (2014). 230. Deriu, E., et al. Influenza Virus Affects Intestinal Microbiota and Secondary Salmonella Infection in the Gut through Type I Interferons. PLoS pathogens 12, e1005572 (2016). 231. Thevaranjan, N., et al. Age-Associated Microbial Dysbiosis Promotes Intestinal Permeability, Systemic Inflammation, and Macrophage Dysfunction. Cell Host Microbe 21, 455-466 e454 (2017). 232. Fransen, F., et al. Aged Gut Microbiota Contributes to Systemical Inflammaging after Transfer to Germ-Free Mice. Front Immunol 8, 1385 (2017). 233. Minciullo, P.L., et al. Inflammaging and Anti-Inflammaging: The Role of Cytokines in Extreme Longevity. Arch Immunol Ther Exp (Warsz) 64, 111-126 (2016).

118

234. Baylis, D., Bartlett, D.B., Patel, H.P. & Roberts, H.C. Understanding how we age: insights into inflammaging. Longev Healthspan 2, 8 (2013). 235. Sjoholm, A. & Nystrom, T. Inflammation and the etiology of type 2 diabetes. Diabetes Metab Res Rev 22, 4-10 (2006). 236. Jin, W. & Dong, C. IL-17 cytokines in immunity and inflammation. Emerg Microbes Infect 2, e60 (2013). 237. Vasseur, P., et al. High plasma levels of the pro-inflammatory cytokine IL-22 and the anti- inflammatory cytokines IL-10 and IL-1ra in acute pancreatitis. Pancreatology 14, 465-469 (2014). 238. Muhl, H. Pro-Inflammatory Signaling by IL-10 and IL-22: Bad Habit Stirred Up by Interferons? Front Immunol 4, 18 (2013). 239. Valeri, M. & Raffatellu, M. Cytokines IL-17 and IL-22 in the host response to infection. Pathog Dis 74(2016). 240. Eyerich, K., Dimartino, V. & Cavani, A. IL-17 and IL-22 in immunity: Driving protection and pathology. Eur J Immunol 47, 607-614 (2017). 241. Sanjabi, S., Zenewicz, L.A., Kamanaka, M. & Flavell, R.A. Anti-inflammatory and pro- inflammatory roles of TGF-beta, IL-10, and IL-22 in immunity and autoimmunity. Curr Opin Pharmacol 9, 447-453 (2009). 242. Schnyder, B. & Schnyder-Candrian, S. Dual role of IL-17 in allergic asthma. in Th 17 Cells: Role in Inflammation and Autoimmune Disease (eds. Quesniaux, V., Ryffel, B. & Di Padova, F.) 95- 104 (Birkhäuser Basel, Basel, 2009). 243. Zenewicz, L.A. & Flavell, R.A. Recent advances in IL-22 biology. International immunology 23, 159-163 (2011). 244. Chehimi, M., Vidal, H. & Eljaafari, A. Pathogenic Role of IL-17-Producing Immune Cells in Obesity, and Related Inflammatory Diseases. J Clin Med 6(2017). 245. Bartley, J.M., Zhou, X., Kuchel, G.A., Weinstock, G.M. & Haynes, L. Impact of Age, Caloric Restriction, and Influenza Infection on Mouse Gut Microbiome: An Exploratory Study of the Role of Age-Related Microbiome Changes on Influenza Responses. Front Immunol 8, 1164 (2017). 246. Krishnan, S., et al. Gut Microbiota-Derived Tryptophan Metabolites Modulate Inflammatory Response in Hepatocytes and Macrophages. Cell Rep 23, 1099-1111 (2018). 247. Mabuchi, T., Takekoshi, T. & Hwang, S.T. Epidermal CCR6+ gammadelta T cells are major producers of IL-22 and IL-17 in a murine model of psoriasiform dermatitis. J Immunol 187, 5026-5031 (2011). 248. Toussirot, E., Laheurte, C., Gaugler, B., Gabriel, D. & Saas, P. Increased IL-22- and IL-17A- Producing Mucosal-Associated Invariant T Cells in the Peripheral Blood of Patients With Ankylosing Spondylitis. Front Immunol 9, 1610 (2018). 249. Jylhava, J., Pedersen, N.L. & Hagg, S. Biological Age Predictors. EBioMedicine 21, 29-36 (2017). 250. Belsky, D.W., et al. Quantification of biological aging in young adults. Proc Natl Acad Sci U S A 112, E4104-4110 (2015). 251. Wherry, E.J. & Kurachi, M. Molecular and cellular insights into T cell exhaustion. Nat Rev Immunol 15, 486-499 (2015). 252. Wherry, E.J. T cell exhaustion. Nature Immunology 12, 492 (2011). 253. Yi, J.S., Cox, M.A. & Zajac, A.J. T-cell exhaustion: characteristics, causes and conversion. Immunology 129, 474-481 (2010). 254. Kumar, P., Natarajan, K. & Shanmugam, N. High glucose driven expression of pro-inflammatory cytokine and chemokine genes in lymphocytes: molecular mechanisms of IL-17 family gene expression. Cell Signal 26, 528-539 (2014). 255. Esko, J.D., Bertozzi, C. & Schnaar, R.L. Chemical Tools for Inhibiting Glycosylation. in Essentials of Glycobiology (eds. rd, et al.) 701-712 (Cold Spring Harbor (NY), 2015).

119

256. Magoc, T. & Salzberg, S.L. FLASH: fast length adjustment of short reads to improve genome assemblies. Bioinformatics 27, 2957-2963 (2011). 257. Edgar, R.C., Haas, B.J., Clemente, J.C., Quince, C. & Knight, R. UCHIME improves sensitivity and speed of chimera detection. Bioinformatics 27, 2194-2200 (2011). 258. Wang, Q., Garrity, G.M., Tiedje, J.M. & Cole, J.R. Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Applied and environmental microbiology 73, 5261-5267 (2007). 259. Claesson, M.J., et al. Comparative analysis of pyrosequencing and a phylogenetic microarray for exploring microbial community structures in the human distal intestine. PLoS One 4, e6669 (2009). 260. Letunic, I. & Bork, P. Interactive Tree Of Life (iTOL): an online tool for phylogenetic tree display and annotation. Bioinformatics 23, 127-128 (2007). 261. McMurdie, P.J. & Holmes, S. phyloseq: an R package for reproducible interactive analysis and graphics of microbiome census data. PLoS One 8, e61217 (2013). 262. Scrucca, L., Fop, M., Murphy, T.B. & Raftery, A.E. mclust 5: Clustering, Classification and Density Estimation Using Gaussian Finite Mixture Models. R J 8, 289-317 (2016). 263. Wilke, C.O. cowplot: Streamlined Plot Theme and Plot Annotations for 'ggplot2'. R package version 0.9.2. https://CRAN.R-project.org/package=cowplot (2017). 264. Wickham, H. ggplot2: Elegant Graphics for Data Analysis, (Springer Publishing Company, Incorporated, 2009). 265. Wickham, H. Ggplot2 : elegant graphics for data analysis, (Springer, New York, 2009). 266. Bates, D., Mächler, M., Bolker, B. & Walker, S. Fitting linear mixed-effects models using lme4. arXiv preprint arXiv:1406.5823 (2014). 267. Bürkner, P.-C. Advanced Bayesian Multilevel Modeling with the R Package brms. arXiv preprint arXiv:1705.11123 (2017). 268. Bürkner, P.-C. brms: An R package for Bayesian multilevel models using Stan. Journal of Statistical Software 80, 1-28 (2017). 269. Hoffman, M.D. & Gelman, A. The No-U-turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research 15, 1593-1623 (2014). 270. Lopez-Otin, C., Blasco, M.A., Partridge, L., Serrano, M. & Kroemer, G. The hallmarks of aging. Cell 153, 1194-1217 (2013). 271. Thompson, W.W., et al. Mortality associated with influenza and respiratory syncytial virus in the United States. JAMA 289, 179-186 (2003). 272. Heron, M. Deaths: leading causes for 2010. Natl Vital Stat Rep 62, 1-96 (2013). 273. Bender, B.S., Johnson, M.P. & Small, P.A. Influenza in senescent mice: impaired cytotoxic T- lymphocyte activity is correlated with prolonged infection. Immunology 72, 514-519 (1991). 274. Bender, B.S., Taylor, S.F., Zander, D.S. & Cottey, R. Pulmonary immune response of young and aged mice after influenza challenge. J Lab Clin Med 126, 169-177 (1995). 275. Murasko, D.M. & Jiang, J. Response of aged mice to primary virus infections. Immunol Rev 205, 285-296 (2005). 276. Toapanta, F.R. & Ross, T.M. Impaired immune responses in the lungs of aged mice following influenza infection. Respir Res 10, 112 (2009). 277. Lefebvre, J.S., et al. Vaccine efficacy and T helper cell differentiation change with aging. Oncotarget 7, 33581-33594 (2016). 278. McCay, C.M., Crowell, M.F. & Maynard, L.A. The effect of retarded growth upon the length of life span and upon the ultimate body size. 1935. Nutrition 5, 155-171; discussion 172 (1989). 279. Speakman, J.R., Mitchell, S.E. & Mazidi, M. Calories or protein? The effect of dietary restriction on lifespan in rodents is explained by calories alone. Exp Gerontol 86, 28-38 (2016). 280. Kapahi, P., Kaeberlein, M. & Hansen, M. Dietary restriction and lifespan: Lessons from invertebrate models. Ageing Res Rev 39, 3-14 (2017).

120

281. Colman, R.J., et al. Caloric restriction reduces age-related and all-cause mortality in rhesus monkeys. Nat Commun 5, 3557 (2014). 282. Masoro, E.J. Overview of caloric restriction and ageing. Mech Ageing Dev 126, 913-922 (2005). 283. Mattison, J.A., et al. Impact of caloric restriction on health and survival in rhesus monkeys from the NIA study. Nature 489, 318-321 (2012). 284. Carey, J.R., et al. Life history response of Mediterranean fruit flies to dietary restriction. Aging cell 1, 140-148 (2002). 285. Cooper, T.M., Mockett, R.J., Sohal, B.H., Sohal, R.S. & Orr, W.C. Effect of caloric restriction on life span of the housefly, Musca domestica. FASEB J 18, 1591-1593 (2004). 286. Molleman, F., Ding, J., Boggs, C.L., Carey, J.R. & Arlet, M.E. Does dietary restriction reduce life span in male fruit-feeding butterflies? Exp Gerontol 44, 601-606 (2009). 287. Liao, C.Y., Rikke, B.A., Johnson, T.E., Diaz, V. & Nelson, J.F. Genetic variation in the murine lifespan response to dietary restriction: from life extension to life shortening. Aging cell 9, 92-95 (2010). 288. Rikke, B.A., Liao, C.Y., McQueen, M.B., Nelson, J.F. & Johnson, T.E. Genetic dissection of dietary restriction in mice supports the metabolic efficiency model of life extension. Exp Gerontol 45, 691-701 (2010). 289. Heilbronn, L.K. & Ravussin, E. Calorie restriction and aging: review of the literature and implications for studies in humans. Am J Clin Nutr 78, 361-369 (2003). 290. Ungvari, Z., Parrado-Fernandez, C., Csiszar, A. & de Cabo, R. Mechanisms underlying caloric restriction and lifespan regulation: implications for vascular aging. Circ Res 102, 519-528 (2008). 291. Yan, L., et al. Calorie restriction can reverse, as well as prevent, aging cardiomyopathy. Age (Dordr) 35, 2177-2182 (2013). 292. McKiernan, S.H., et al. Caloric restriction delays aging-induced cellular phenotypes in rhesus monkey skeletal muscle. Exp Gerontol 46, 23-29 (2011). 293. Aspnes, L.E., et al. Caloric restriction reduces fiber loss and mitochondrial abnormalities in aged rat muscle. FASEB J 11, 573-581 (1997). 294. Nikolich-Zugich, J. & Messaoudi, I. Mice and flies and monkeys too: caloric restriction rejuvenates the aging immune system of non-human primates. Exp Gerontol 40, 884-893 (2005). 295. Most, J., Tosti, V., Redman, L.M. & Fontana, L. Calorie restriction in humans: An update. Ageing Res Rev 39, 36-45 (2017). 296. Willcox, B.J., et al. Caloric restriction, the traditional Okinawan diet, and healthy aging: the diet of the world's longest-lived people and its potential impact on morbidity and life span. Ann N Y Acad Sci 1114, 434-455 (2007). 297. Ravussin, E., et al. A 2-Year Randomized Controlled Trial of Human Caloric Restriction: Feasibility and Effects on Predictors of Health Span and Longevity. J Gerontol A Biol Sci Med Sci 70, 1097-1104 (2015). 298. Villareal, D.T., et al. Effect of Two-Year Caloric Restriction on Bone Metabolism and Bone Mineral Density in Non-Obese Younger Adults: A Randomized Clinical Trial. J Bone Miner Res 31, 40-51 (2016). 299. Fontana, L., et al. Effects of 2-year calorie restriction on circulating levels of IGF-1, IGF-binding proteins and cortisol in nonobese men and women: a randomized clinical trial. Aging cell 15, 22- 27 (2016). 300. Fontana, L., Meyer, T.E., Klein, S. & Holloszy, J.O. Long-term calorie restriction is highly effective in reducing the risk for atherosclerosis in humans. Proc Natl Acad Sci U S A 101, 6659- 6663 (2004). 301. Fontana, L., Klein, S. & Holloszy, J.O. Effects of long-term calorie restriction and endurance exercise on glucose tolerance, insulin action, and adipokine production. Age (Dordr) 32, 97-108 (2010). 302. Messaoudi, I., et al. Delay of T cell senescence by caloric restriction in aged long-lived nonhuman primates. Proc Natl Acad Sci U S A 103, 19448-19453 (2006).

121

303. Pahlavani, M.A. Influence of caloric restriction on aging immune system. J Nutr Health Aging 8, 38-47 (2004). 304. Jolly, C.A., Fernandez, R., Muthukumar, A.R. & Fernandes, G. Calorie restriction modulates Th- 1 and Th-2 cytokine-induced immunoglobulin secretion in young and old C57BL/6 cultured submandibular glands. Aging (Milano) 11, 383-389 (1999). 305. Spaulding, C.C., Walford, R.L. & Effros, R.B. Calorie restriction inhibits the age-related dysregulation of the cytokines TNF-alpha and IL-6 in C3B10RF1 mice. Mech Ageing Dev 93, 87- 94 (1997). 306. Spaulding, C.C., Walford, R.L. & Effros, R.B. The accumulation of non-replicative, non- functional, senescent T cells with age is avoided in calorically restricted mice by an enhancement of T cell apoptosis. Mech Ageing Dev 93, 25-33 (1997). 307. Chen, J., Astle, C.M. & Harrison, D.E. Delayed immune aging in diet-restricted B6CBAT6 F1 mice is associated with preservation of naive T cells. J Gerontol A Biol Sci Med Sci 53, B330- 337; discussion B338-339 (1998). 308. Avula, C.P. & Fernandes, G. Inhibition of H2O2-induced apoptosis of lymphocytes by calorie restriction during aging. Microsc Res Tech 59, 282-292 (2002). 309. Grossmann, A., Maggio-Price, L., Jinneman, J.C., Wolf, N.S. & Rabinovitch, P.S. The effect of long-term caloric restriction on function of T-cell subsets in old mice. Cell Immunol 131, 191-204 (1990). 310. Effros, R.B., Walford, R.L., Weindruch, R. & Mitcheltree, C. Influences of dietary restriction on immunity to influenza in aged mice. J Gerontol 46, B142-147 (1991). 311. Gardner, E.M., Beli, E., Clinthorne, J.F. & Duriancik, D.M. Energy intake and response to infection with influenza. Annu Rev Nutr 31, 353-367 (2011). 312. Clinthorne, J.F., Adams, D.J., Fenton, J.I., Ritz, B.W. & Gardner, E.M. Short-term re-feeding of previously energy-restricted C57BL/6 male mice restores body weight and body fat and attenuates the decline in natural killer cell function after primary influenza infection. J Nutr 140, 1495-1501 (2010). 313. Ritz, B.W., Aktan, I., Nogusa, S. & Gardner, E.M. Energy restriction impairs natural killer cell function and increases the severity of influenza infection in young adult male C57BL/6 mice. J Nutr 138, 2269-2275 (2008). 314. Gardner, E.M. Caloric restriction decreases survival of aged mice in response to primary influenza infection. J Gerontol A Biol Sci Med Sci 60, 688-694 (2005). 315. Ritz, B.W. & Gardner, E.M. Malnutrition and energy restriction differentially affect viral immunity. J Nutr 136, 1141-1144 (2006). 316. Samuelson, D.R., Welsh, D.A. & Shellito, J.E. Regulation of lung immunity and host defense by the intestinal microbiota. Front Microbiol 6, 1085 (2015). 317. Atarashi, K., et al. Induction of colonic regulatory T cells by indigenous Clostridium species. Science 331, 337-341 (2011). 318. Lee, Y.K., Menezes, J.S., Umesaki, Y. & Mazmanian, S.K. Proinflammatory T-cell responses to gut microbiota promote experimental autoimmune encephalomyelitis. Proc Natl Acad Sci U S A 108 Suppl 1, 4615-4622 (2011). 319. Sudo, N., et al. An oral introduction of intestinal bacteria prevents the development of a long- term Th2-skewed immunological memory induced by neonatal antibiotic treatment in mice. Clin Exp Allergy 32, 1112-1116 (2002). 320. Noverr, M.C., Noggle, R.M., Toews, G.B. & Huffnagle, G.B. Role of antibiotics and fungal microbiota in driving pulmonary allergic responses. Infection and immunity 72, 4996-5003 (2004). 321. Ichinohe, T., et al. Microbiota regulates immune defense against respiratory tract influenza A virus infection. Proc Natl Acad Sci U S A 108, 5354-5359 (2011). 322. Fagundes, C.T., et al. Transient TLR activation restores inflammatory response and ability to control pulmonary bacterial infection in germfree mice. J Immunol 188, 1411-1420 (2012).

122

323. Yu, B., et al. Dysbiosis of gut microbiota induced the disorder of helper T cells in influenza virus-infected mice. Hum Vaccin Immunother 11, 1140-1146 (2015). 324. Salazar, N., Valdes-Varela, L., Gonzalez, S., Gueimonde, M. & de Los Reyes-Gavilan, C.G. Nutrition and the gut microbiome in the elderly. Gut Microbes 8, 82-97 (2017). 325. Jeffery, I.B., Lynch, D.B. & O'Toole, P.W. Composition and temporal stability of the gut microbiota in older persons. The ISME journal 10, 170-182 (2016). 326. Odamaki, T., et al. Age-related changes in gut microbiota composition from newborn to centenarian: a cross-sectional study. BMC microbiology 16, 90 (2016). 327. Biagi, E., et al. Through ageing, and beyond: gut microbiota and inflammatory status in seniors and centenarians. PLoS One 5, e10667 (2010). 328. Zhang, C., et al. Structural modulation of gut microbiota in life-long calorie-restricted mice. Nat Commun 4, 2163 (2013). 329. Pae, M., Meydani, S.N. & Wu, D. The role of nutrition in enhancing immunity in aging. Aging Dis 3, 91-129 (2012). 330. Fukushima, Y., et al. Improvement of nutritional status and incidence of infection in hospitalised, enterally fed elderly by feeding of fermented milk containing probiotic Lactobacillus johnsonii La1 (NCC533). Br J Nutr 98, 969-977 (2007). 331. Guillemard, E., Tondu, F., Lacoin, F. & Schrezenmeir, J. Consumption of a fermented dairy product containing the probiotic Lactobacillus casei DN-114001 reduces the duration of respiratory infections in the elderly in a randomised controlled trial. Br J Nutr 103, 58-68 (2010). 332. de Vrese, M., et al. Probiotic bacteria reduced duration and severity but not the incidence of common cold episodes in a double blind, randomized, controlled trial. Vaccine 24, 6670-6674 (2006). 333. Bunout, D., et al. Effects of a nutritional supplement on the immune response and cytokine production in free-living Chilean elderly. JPEN J Parenter Enteral Nutr 28, 348-354 (2004). 334. Akatsu, H., et al. Clinical effects of probiotic Bifidobacterium longum BB536 on immune function and intestinal microbiota in elderly patients receiving enteral tube feeding. JPEN J Parenter Enteral Nutr 37, 631-640 (2013). 335. Duncan, S.H., et al. Proposal of Roseburia faecis sp. nov., Roseburia hominis sp. nov. and Roseburia inulinivorans sp. nov., based on isolates from human faeces. Int J Syst Evol Microbiol 56, 2437-2441 (2006). 336. Nagafuchi, S., et al. Effects of a Formula Containing Two Types of Prebiotics, Bifidogenic Growth Stimulator and Galacto-oligosaccharide, and Fermented Milk Products on Intestinal Microbiota and Antibody Response to Influenza Vaccine in Elderly Patients: A Randomized Controlled Trial. Pharmaceuticals (Basel) 8, 351-365 (2015). 337. Jelley-Gibbs, D.M., et al. Persistent depots of influenza antigen fail to induce a cytotoxic CD8 T cell response. J Immunol 178, 7563-7570 (2007). 338. Taiyun Wei, V.S. R package "corrplot": Visualization of a Correlation Matrix. Version 0.84 Available from https://github.com/taiyun/corrplot(2017). 339. Bartley, J.M., et al. Aging augments the impact of influenza respiratory tract infection on mobility impairments, muscle-localized inflammation, and muscle atrophy. Aging (Albany NY) 8, 620-635 (2016). 340. Ley, R.E., Turnbaugh, P.J., Klein, S. & Gordon, J.I. : human gut microbes associated with obesity. Nature 444, 1022-1023 (2006). 341. Nadal, I., et al. Shifts in clostridia, bacteroides and immunoglobulin-coating fecal bacteria associated with weight loss in obese adolescents. Int J Obes (Lond) 33, 758-767 (2009). 342. Santacruz, A., et al. Interplay between weight loss and gut microbiota composition in overweight adolescents. Obesity (Silver Spring) 17, 1906-1915 (2009). 343. McDermott, M.R. & Bienenstock, J. Evidence for a common mucosal immunologic system. I. Migration of B immunoblasts into intestinal, respiratory, and genital tissues. J Immunol 122, 1892-1898 (1979).

123

344. Winter, S.E. & Baumler, A.J. Why related bacterial species bloom simultaneously in the gut: principles underlying the 'Like will to like' concept. Cell Microbiol 16, 179-184 (2014). 345. Unutmaz, D. & Pulendran, B. The gut feeling of Treg cells: IL-10 is the silver lining during . Nat Immunol 10, 1141-1143 (2009). 346. Kane, M., et al. Successful of a retrovirus depends on the commensal microbiota. Science 334, 245-249 (2011). 347. Kuss, S.K., et al. Intestinal microbiota promote enteric virus replication and systemic pathogenesis. Science 334, 249-252 (2011). 348. Abt, M.C., et al. Commensal bacteria calibrate the activation threshold of innate antiviral immunity. Immunity 37, 158-170 (2012). 349. Dolowy, W.C. & Muldoon, R.L. Studies of Germfree Animals. I. Response of Mice to Infection with Influenza a Virus. Proc Soc Exp Biol Med 116, 365-371 (1964). 350. Geva-Zatorsky, N., et al. Mining the Human Gut Microbiota for Immunomodulatory Organisms. Cell 168, 928-943 e911 (2017). 351. Kawase, M., et al. Heat-killed Lactobacillus gasseri TMC0356 protects mice against influenza virus infection by stimulating gut and respiratory immune responses. FEMS Immunol Med Microbiol 64, 280-288 (2012). 352. Nakayama, Y., et al. Oral administration of Lactobacillus gasseri SBT2055 is effective for preventing influenza in mice. Scientific reports 4, 4638 (2014). 353. Fujimura, K.E., et al. House dust exposure mediates gut microbiome Lactobacillus enrichment and airway immune defense against allergens and virus infection. Proc Natl Acad Sci U S A 111, 805-810 (2014). 354. Ferrucci, L., Guralnik, J.M., Pahor, M., Corti, M.C. & Havlik, R.J. Hospital diagnoses, Medicare charges, and nursing home admissions in the year when older persons become severely disabled. JAMA 277, 728-734 (1997).

124