<<

The assembly and functions of microbial communities on complex substrates

by

Xiaoqian Yu

B.S./M.S., Molecular Biophysics and Biochemistry

Yale University (2011)

Submitted to the Biology Graduate Program in Partial Fulfillment of the

Requirements for the Degree of

Doctor of Philosophy

at the

MASSACHUSETTS INSTITUTE OF TECHNOLOGY

September 2019

© 2019 Massachusetts Institute of Technology. All rights reserved.

Signature of Author: ______Xiaoqian Yu Department of Biology

Certified by: ______Eric J. Alm Professor of Biological Engineering Professor of Civil and Environmental Engineering Thesis Advisor

Accepted by: ______Amy Keating Professor of Biology Co-Director, Biology Graduate Committee

The assembly and functions of microbial communities on complex substrates

by Xiaoqian Yu Submitted to the Department of Biology on August 5th, 2019 in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in Biology

Abstract

Microbes form diverse and complex communities that influence the health and function of all ecosystems on Earth. However, the key ecological and evolutionary processes that allow microbial communities to form and maintain their diversity, and how this diversity further affects ecosystem function, are largely underexplored. This is especially true for natural microbial communities that harbor large numbers of species whose interactions are often the result of long-term evolutionary processes of co-occurring organisms. In this thesis, I make use of “common garden experiments” (introducing varying microbial communities to the same environments) to investigate how the assembly and functions of natural microbial communities are affected by the diversity of the communities, as well as by the chemical nature of the substrates that they assemble on. In the first project, I present an experimental workflow that streamlines the generation of self-assembled microbial communities with a wide range of diversity and the measurement of community function in “common gardens”, followed by isolation of the most abundant taxa from these communities via dilution-to-extinction. This high-throughput workflow is applied to assess how interactions scale with organismal diversity to affect the function of microbial communities from the coastal ocean. In the second project, I use a combination of theoretical models and an ex vivo experimental framework to examine how the volume and content of gas produced by gut microbiota assembling on different prebiotic substrates (“gardens”) are influenced by the chemical nature of the substrate and the composition of the gut microbiota itself. As a whole, this body of work represents a small step towards finding common organizing principles in microbial community assembly and their functional consequences.

Thesis Supervisor: Eric J. Alm

Title: Professor of Biological Engineering, Professor of Civil and Environmental Engineering


Acknowledgements

As the long journey of graduate school finally comes to a conclusion, I look back and, although it has taken many more twists and turns than I expected, still cannot help but say that it was a joyful and satisfying experience. I owe this to the mentors who have guided me along the way, and to my friends and family, who were always encouraging, patient and supportive.

My first thanks must, of course, go to my advisor, Eric Alm. Eric was brave enough to take me into his lab even though I knew nothing about the microbiome, genomics, or statistics. Transitioning from a protein biochemist to a microbial ecologist was hard, and I definitely had a long lag phase in doing so, but Eric was always positive, optimistic and encouraging, pulling me out of the weeds enough times before I eventually learned to not get lost in them so frequently. And as I became more scientifically mature, I was given all the freedom and trust to carry my projects in directions that I desired, which was a great boost to my confidence. Eric was also always able to come to the rescue when I got confused with math; any improvement in my quantitative skills over the past years I attribute to Eric.

I am also deeply thankful to Martin Polz, who has effectively been my co-advisor despite not signing any official papers. Little did I know when I first met with Martin because I “did a rotation on Vibrios” how lucky I would be to have him as a mentor, always being there when I needed help or support. Through working with Martin I learned how to treat scientific results with both rigor and respect, and how to not be intimidated by the imperfection of experimental results. He also exemplified to me how novelty in science is deeply rooted in logic and knowledge, as well as how doing research is a life-long learning process. These are values that I will carry on for life.

I am also grateful to my committee members, Penny Chisholm and Jing-ke Weng, who have been with me for six years. Both in and outside of committee meetings, Penny has showered me with gems of wisdom, providing me with both intellectual and emotional support. Jing-ke not only brings a non-ecologist view to my committee meetings, but has been a constant source of practical advice. Outside of my committee, I have benefited greatly from the conversations I have had with Serguei Saavedra, which always helped me put my non-systematic knowledge of ecology into a systematic context. Jeff Gore was also pivotal in my decision to switch into the field of microbial ecology.

I have been incredibly lucky to have two “homes” for my graduate school: both the Alm lab and the Polz lab were filled with young scientists who were friendly, fun and incredibly helpful.

Sarah Spencer was my first window to the Alm lab; this was followed by many conversations over tea in the Tech square kitchen. A lot of my intellectual growth during grad school I owe to the three postdocs that I shared an office with: Sean Gibbons, Tami Lieberman and Fangqiong Ling. They were always patient in answering my questions about data processing, helping me look over my writing, or telling me how to build academic connections. We also had great times together brainstorming about anything related to the microbiome, ecology, or population genomics. Shijie Zhao has also been a constant source of intellectual stimulation, always bringing new ideas and techniques to the table. My collaborators in the lab for the project in chapter three, Thomas Gurry and Tu Nguyen, have always been quick and responsive in answering my questions or providing experimental support. I also greatly enjoyed the times that I would “whine” with Sean Kearney about library preps or funky , or just anything not working in lab. And as I was rushing to try to finish my projects at the end of grad school, Fuqing Wu and Anni Zhang never hesitated to lend a helping hand. Shandrina was also a great hero in making sure that I could get facetime with Eric, that is, making things happen.

Jan-Hendrik Hehemann was my first mentor in the Polz lab. Also coming from a biochemistry background, he helped me see and learn how genomics, biochemistry and ecology can be brought together into a project and complement each other. He also helped shape some of my earliest research ethics, that is, to always be respectful of others’ ideas and contributions. Manoshi Datta also provided me a lot of guidance during my early years in the lab, even though she was only one year senior to me. Chris Corzett was one of the best collaborators I could have hoped for: responsible and meticulous, so much so that I would often trust his experimental results more than my own. Kathryn Kauffman is truly someone that I admire and love: her persistence, kindness, and generosity make her the exemplary lab citizen that I hope to become one day. Fabiola Miranda is also someone that I would look forward to seeing when I came to work; her energy is infectious like sunshine, and she was never shy to share her knowledge and advice, or even cuddle me a bit when I was feeling low. And whenever I rushed into the Polz lab office for help, i.e. looking for Joy because I needed help with stats, or Joseph and Dave for bioinformatics, or Fatima for almost anything general in lab, I was able to find it. Finally, Michael Cutler was essential in helping me with many aspects of the lab, amazingly cobbling things together (often from cardboard!) whenever I needed it.

I also owe much of my development to the wonderful scientific community here at MIT. I am deeply grateful for the generosity and supportiveness of Amy Keating, the co-director of the graduate program in the biology department, while I was going through the “bumps” of life and research. Betsey Walsh was also an ever-amazing administrator who knew the solution to every question or problem that I had. Craig Mclean has been a great friend and collaborator, sharing my passion for connecting chemistry to ecology. I have also benefited greatly from talking to many people in the Gore lab, the Cordero lab, the Chisholm lab and the Saavedra lab: Jonathan Friedman, Arolyn Conwil, Avihu Yona, Daniel Amor, Shaul Pollak, Ali Ebrahimi, Julia Schwartzman, Rogier Braakman, and Chuliang Song. Many of my molecular biology techniques I learned from my mentors and labmates during my short time in the Lindquist lab: Kendra Frederick, Greg Newby, and Can Kayatekin.

Finally, I thank my friends and family for keeping me happy and positive during my PhD. Jie Bai and I came to MIT together from college, and we have not only shared rooms and dinners, but also all the ups and downs in graduate school, research and life. My wonderful friends in the “gang of Chedan (bullshit)”, Lixi Wang, Tewei Luo and Xuanzong Guo have traveled far and wide with me, attempted to keep a book blog together, and listened to my rants about difficulties in life. I have had many fun discussions about life and research with my fellow PhD friends Eugene Serebryany, Yuan Qiao, and Xiang Ma. And finally, I owe much to my parents for their quiet yet firm support, and respecting me as an independent person to make my own life choices and pursue my dreams.


Table of Contents

Abstract
Acknowledgements
Table of Contents
Chapter 1 Introduction
  1.1 Studying ecological processes in microbiomes is challenging
  1.2 Common garden experiments to explore relationships between diversity, interaction and community functions
  1.3 Common garden experiments to explore gas production of human gut microbiota
  1.4 Studying microbiome assembly in well controlled contexts can complement survey based studies
  1.5 Figures illustrating the common garden experiment
Chapter 2 Interactions in self-assembled microbial communities saturate with diversity
  Abstract
  2.1 Introduction
  2.2 Results
    2.2.1 Experimental Workflow
    2.2.2 Relationships between diversity and community functions vary
    2.2.3 Relative total function as an indicator for interaction effects on community function
    2.2.4 Interaction effects on community function differentially increase with diversity
    2.2.5 Community resource uptake increases with diversity while individual taxa resource uptake decreases
    2.2.6 Competition in more diverse communities reduces carbon use efficiency (CUE)
    2.2.7 Effects of interactions on community function increase logistically with diversity
    2.2.8 Specific taxa effects on community function
  2.3 Discussion
  2.4 Methods
    2.4.1 Media preparation
    2.4.2 Sample collection and experimental design
    2.4.3 Community growth and function measurements
    2.4.4 DNA extraction
    2.4.6 Calculation of RTFc
    2.4.7 Estimation of relative CUE
    2.4.8 Curve fitting and diversity calculations
    2.4.9 Data availability
    2.4.10 Acknowledgements
  2.5 Figures
    Figure 1
    Figure 2
    Figure 3
    Figure 4
    Figure 5
  2.6 Supplementary Figures
    Figure S1
    Figure S2
    Figure S3
    Figure S4
    Figure S5
    Figure S6
    Figure S7
    Figure S8
    Figure S9
    Figure S10
    Figure S11
  2.7 Supplementary Tables
    Table S1
    Table S2
    Table S3
    Table S4
Chapter 3 Prebiotics and community composition influence gas production of the human gut microbiota
  Abstract
  3.1 Introduction
  3.2 Results
    3.2.1 Modeling community production with mass and electron balance
    3.2.2 Feasible product space of pectin fermentation is more limited compared to inulin
    3.2.3 Pectin degradation takes up reducing agents from the environment
    3.2.4 H2 and Acetate distinguishes the product profile of inulin and pectin degradation, but inter-personal variation of gas production is large
    3.2.5 Relative effects of microbiome and substrate chemistry on gas production differ among gases
  3.3 Discussion
  3.4 Methods
    3.4.1 Experimental model and participant details
    3.4.2 Linear systems model for modeling community production
    3.4.3 Setup of ex vivo system
    3.4.4 Gas and SCFA measurements
    3.4.5 DNA extraction, library prep, sequencing and analysis
    3.4.6 Metagenome analysis
    3.4.7 Acknowledgements
  3.5 Figures
    Figure 1
    Figure 2
    Figure 3
    Figure 4
    Figure 5
  3.6 Supplementary Figures
    Figure S1
    Figure S2
    Figure S3
  3.7 Supplementary Text
Chapter 4 Discussion
  4.1 Limitations and Expansion of current work
    4.1.1 Data resolution limits interpretation of ecological processes and community functions
    4.1.2 Variations of the common garden experiment can provide additional insights
    4.1.3 Culture-based experiments can complement “top-down” experiments
  4.2 Connecting ecological theory and industrial applications
References


Chapter 1 Introduction

1.1 Studying ecological processes in microbiomes is challenging

Microbes are key players in global biogeochemical cycles and directly influence human health and welfare. However, the microbial world has largely been hidden from us: a large portion of bacteria found in the wild remain uncultivated in laboratories because their growth requirements are still unknown (Stewart, 2012). Older cultivation-independent microbial survey techniques such as denaturing gradient gel electrophoresis characterize the composition of microbial communities by amplifying select regions of the microbial genome that vary in GC content and can be resolved on a denaturing gel, and are thus limited in both resolution and throughput (Green et al., 2010). Not until recent advances in nucleic acid sequencing technology did we start to systematically catalog the diversity and functions of microbial communities in various environments, frequently referring to them as “microbiomes” (Cho and Blaser, 2012; Tropini et al., 2017). These surveys of microbiome composition in a wide range of habitats, from hydrothermal vents in the deep ocean to sites on our own bodies, have captured microbial community compositions at an unprecedented resolution, and revealed systematic relationships between community structure and habitat characteristics (Pasolli et al., 2019; The Human Microbiome Project Consortium et al., 2012; Thompson et al., 2017). However, the ecological processes that shape these communities, as well as how these processes influence community function and adaptation to habitat, are largely underexplored.

The difficulties of studying ecological processes and their effects on community function stem primarily from the fact that inferring community history from a snapshot of community composition at a single time point is almost impossible. Methods that compare measured community composition to that simulated from a random assembly process can give us a coarse-grained view of the relative importance of selection versus stochastic processes such as dispersal and drift in shaping community composition, but they provide only very limited insight into the detailed mechanisms of community assembly (Dini-Andreote et al., 2015; Stegen et al., 2012). Artificial communities consisting of a few isolates can provide highly mechanistic insights into how communities form, stay stable or interact with the environment (Friedman et al., 2017; Ratzke and Gore, 2018). However, since these communities are limited in richness and phylogenetic diversity compared to natural communities, it is not clear to what extent these mechanisms, such as intra-organism interactions or self-inhibition, scale with diversity to reflect the ecological processes that shape community composition and function in the wild. High-resolution time series of natural communities could avoid the problems of the previous two methods, but the acquisition of this type of data is often limited by the amount of labor and cost required to repeatedly sample a site at suitable time intervals (David et al., 2014; Martin-Platero et al., 2018).

One approach that may allow us to simultaneously preserve the diversity and interactions within natural communities and track community assembly in real time at relatively low cost is a derivative of what is called a “common garden experiment”. Traditionally used by plant biologists to identify the genetic basis of phenotypic differences across plant populations by introducing plants from different habitats to the same environment (Linhart and Grant, 1996), this kind of experiment can be extended to observe how communities of varying compositions assemble on the same substrate, and how this variance leads to different assembly patterns and community functions (Bell, 2019; Goldford et al., 2017; Rivett and Bell, 2018). The variation in community composition can either be natural variation in communities directly collected from diverse habitats, or be artificially introduced to test specific hypotheses (Figure 1a, b).

In this thesis, I present two studies that make use of this approach, one applied to communities that originate from the coastal ocean, and another applied to human gut communities. Both studies aim to move towards a more mechanistic understanding of community assembly for natural communities.

1.2 Common garden experiments to explore relationships between diversity, interaction and community functions

In the second chapter of this thesis, I demonstrate how the common garden experiment can be combined with systematic manipulations of the diversity of a natural community to disentangle how the effects of different interactions on community function scale with diversity. While many ecological models have been developed to determine how and why the relative strength of different interactions on community function changes with diversity, they often require measuring the relevant functions of community members in monoculture, making them most suitable for systems in which diversity can be varied by making a series of assemblages from well-characterized species (Fox, 2006a; Loreau and Hector, 2001a). However, due to the high diversity and lack of systematic culturing of microbes in the wild, artificial assemblages of microbes can be severely limited in richness and phylogenetic diversity compared to natural communities, and may not harbor naturally occurring interactions, which are often the result of long-term evolutionary processes of co-occurring organisms. Therefore, how naturally occurring interactions influence community function across ecologically relevant ranges of diversity remains an open question.

In this study, I address the problem of how microbial interactions change across ecologically relevant ranges of diversity and affect ecosystem functions by developing an experimental workflow that comprises two stages. The first stage generates communities of different diversities via serial dilution of natural communities and assesses the functions of these communities after they self-assemble on the same substrate; the second stage yields isolates that represent the most abundant taxa in communities from the previous stage via dilution-to-extinction, thus allowing community function to be compared to that expected from individual taxa without interactions. Specifically, I grew serially diluted seawater on brown algal leachate to generate self-assembled communities that span a wide range of diversity, and measured their biomass production and respiration as complementary community functions of interest. Following the measurements, each community was diluted to extinction to generate monocultures representing abundant taxa in the communities, and the same functions were measured and compared to the community values. I further developed relative total function (RTF, the summed relative functions of all individual taxa in a community compared to their monocultures) as an index for net positive or negative effects of interactions on different community functions. By comparing the RTF of different community functions, I found that with increasing diversity, stronger niche complementation allowed the total resource access of the communities to increase, but the efficiency of carbon-to-biomass conversion in the communities decreased due to increased competition. The increase of both niche complementation and competition happened over a narrow range of taxonomic richness, and quickly reached a threshold at a moderate number of taxa. This suggests that the production of a natural community may be limited at two ends: the upper limit is imposed by its potential to access resources, while the lower limit is imposed by the amount of niche overlap that allows members to coexist in nature.

1.3 Common garden experiments to explore gas production of human gut microbiota

In the third chapter of the thesis, instead of manually manipulating communities to alter their composition, I demonstrate how to take advantage of the natural variation of the human gut microbiota and use the common garden experiment to identify how this variation can affect the gas production abilities of the microbiota on different prebiotics (different common gardens). While much attention has been paid to designing prebiotics (compounds selectively utilized by human-associated microbes to confer health benefits) that can maximize the production of health-promoting metabolites (Gibson et al., 2017; Holscher, 2017), the by-products formed in this production process and their physiological effects are often ignored. For example, many studies aimed at selecting suitable prebiotics place much emphasis on maximizing the production of metabolites that are beneficial for the human body, such as short chain fatty acids (SCFAs), but do not take into account gas, a by-product of SCFA production that also has physiological effects on the human body (Azpiroz, 2005; Sahakian et al., 2010). There is thus a lack of systematic investigation of the factors that affect the volume and content of gas produced in prebiotic fermentation.

In this project, I “transplant” the fecal microbiota of 9 different subjects into two different “prebiotic gardens”, the dietary fibers inulin and pectin, and ask how the chemical nature of prebiotic “gardens” and the composition of the fecal microbiota influence gas production after the transplant. I found that although the major gas products in both “gardens” were H2, CO2 and CH4, the more oxidized fiber pectin resulted in less production of the reductive gas H2 as well as more of the most oxidized SCFA, acetate. Cross-referencing the experimental results with a theoretical model based on simple chemical principles such as mass balance and electron balance, I found that pectin degradation not only produced less H2, but even took up reducing agents from the environment. I also observed heterogeneity in H2 production between different gut microbiotas during inulin degradation, mostly driven by a single Lachnospiraceae amplicon sequence variant (ASV) that was likely a strong hydrogen producer. By contrast, methane production was not affected by prebiotic chemistry and only showed dependence on whether there were detectable levels of Methanobacteria in the microbiota. I thus concluded that metabolites that can be produced by more organisms in the human gut, such as H2, are more affected by the chemical composition of prebiotics than metabolites that are produced by less common organisms in the gut, such as methane.
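The mass and electron balance argument above can be illustrated with a minimal sketch. It is not the linear systems model of Chapter 3: it assumes a single hypothetical product profile (2 acetate + 2 CO2 per monomer), treats inulin as a polymer of hexose (C6H12O6) and pectin as a polymer of galacturonic acid (C6H10O7, ignoring methyl esterification), and asks how much H2 is needed to close the electron balance. All function and variable names are illustrative.

```python
from typing import Dict, Tuple

# Atom counts (C, H, O) for neutral molecules; acids are written in protonated form.
FORMULAS: Dict[str, Tuple[int, int, int]] = {
    "hexose": (6, 12, 6),             # C6H12O6, e.g. the fructose units of inulin
    "galacturonic_acid": (6, 10, 7),  # C6H10O7, assumed here as the pectin monomer
    "acetate": (2, 4, 2),             # C2H4O2
    "CO2": (1, 0, 2),
    "H2": (0, 2, 0),
}


def available_electrons(compound: str) -> int:
    """Degree of reduction per molecule: +4 per C, +1 per H, -2 per O."""
    c, h, o = FORMULAS[compound]
    return 4 * c + h - 2 * o


def carbon_atoms(mixture: Dict[str, float]) -> float:
    """Total carbon atoms in a mixture given as {compound: moles}."""
    return sum(moles * FORMULAS[name][0] for name, moles in mixture.items())


def electrons(mixture: Dict[str, float]) -> float:
    """Total available electrons in a mixture given as {compound: moles}."""
    return sum(moles * available_electrons(name) for name, moles in mixture.items())


def h2_to_close_electron_balance(substrate: str, products: Dict[str, float]) -> float:
    """Moles of H2 per mole of substrate so that electrons in equal electrons out."""
    surplus = available_electrons(substrate) - electrons(products)
    return surplus / available_electrons("H2")


# Hypothetical product profile, chosen so that carbon balances for both monomers.
products = {"acetate": 2, "CO2": 2}
assert carbon_atoms(products) == carbon_atoms({"hexose": 1})
assert carbon_atoms(products) == carbon_atoms({"galacturonic_acid": 1})

h2_from_hexose = h2_to_close_electron_balance("hexose", products)
h2_from_galacturonic_acid = h2_to_close_electron_balance("galacturonic_acid", products)
print(h2_from_hexose, h2_from_galacturonic_acid)  # prints: 4.0 2.0
```

Under these assumptions the more oxidized monomer carries 20 rather than 24 available electrons, so the same acetate and CO2 yield leaves room for only half as much H2. Real fermentations produce other reduced products as well, so the numbers only show the direction of the effect.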

1.4 Studying microbiome assembly in well controlled contexts can complement survey based studies

The rise of the field of microbiome research is often seen as a result of large decreases in sequencing costs, and the field has thus historically been dominated by observational ‘omics surveys that associate features of community structure with environmental or health parameters. However, it is clear that the field is now moving beyond hypothesis-generating association studies towards more mechanistic studies regarding the colonization, recovery, succession and stability of microbiomes (Coyte et al., 2015; Faith et al., 2013; Fischbach, 2018; Guittar et al., 2019; Zaneveld et al., 2017). This thesis represents an effort to study the assembly and function of microbial communities in semi-natural contexts with well-controlled “common garden” experiments. I demonstrate how this method can allow us to gain insights into how biotic interactions scale with diversity to affect community production (Chapter 2), and how community functions such as gas production are affected by community composition and substrate chemistry (Chapter 3). I expect that future work with these common garden experiments will serve as a complement to ‘omics surveys, allowing us to validate the hypotheses generated by the surveys and to better understand the ecological mechanisms and statistical rules of microbiome assembly and function.


1.5 Figures illustrating the common garden experiment

Figure 1. Illustrations of the common garden experiment in (a) Chapter 2, where community variance is artificially introduced, and (b) Chapter 3, where community variance is natural.


Chapter 2 Interactions in self-assembled microbial communities saturate with diversity

Xiaoqian Yu, Martin F. Polz, Eric J. Alm

The work presented in this chapter is available as a manuscript published in the ISME Journal (2019) Feb 26;1

Abstract

How the diversity of organisms competing for or sharing resources influences community function is an important question in ecology but has rarely been explored in natural microbial communities. These generally contain large numbers of species, making it difficult to disentangle how the effects of different interactions scale with diversity. Here, we show that changing diversity affects measures of community function in relatively simple communities but that increasing richness beyond a threshold has little detectable effect. We generated self-assembled communities with a wide range of diversity by growth of cells from serially diluted seawater on brown algal leachate. We subsequently isolated the most abundant taxa from these communities via dilution-to-extinction in order to compare productivity functions of the entire community to those of individual taxa. To parse the effect of different types of organismal interactions, we defined relative total function (RTF) as an index for positive or negative effects of diversity on community function. Our analysis identified three overall regimes with increasing diversity. At low richness (<12 taxa), positive and negative effects of interactions were both weak, while at moderate richness (12-26 taxa), community resource uptake increased but the carbon use efficiency decreased. Finally, beyond 26 taxa, the effect of interactions on community function saturated and further diversity increases did not affect community function. Although more diverse communities had overall greater access to resources, on average individual taxa within these communities had lower resource availability and reduced carbon use efficiency. Our results thus suggest competition and complementation simultaneously increase with diversity but both saturate at a threshold.

2.1 Introduction

Organismal diversity is recognized as a driver of ecological functions such as biomass production, resource turnover, and community stability. As the number of taxa increases so do positive and negative biotic interactions such as facilitation, niche complementation, and competition, which all modulate the efficiency of resource use. While niche complementation leads to optimization of resource use through avoidance of resource use overlap, competition – either directly for resources or indirectly by chemical interference – often negatively impacts ecosystem functions. A considerable number of models have been developed to determine how and why the relative strength of different interactions on community function changes with diversity (Fox, 2006b; Jaillard et al.; Jousset et al., 2011; Loreau and Hector, 2001b; Maynard et al., 2017). Most models determine the net effect of interactions on communities by comparing the observed community function to that predicted from monoculture functions of community members and agree on a general increase of all types of interactions with diversity, as well as diverse communities being more productive in many ecosystems due to the strong effect of niche complementation. It is, however, also possible for this relationship to be reversed (Cardinale et al., 2006). Especially in microbial systems, it has been proposed that the negative effects of antagonism on community function are only outweighed by the positive effects of niche complementation if the microbes are functionally dissimilar and the resource environment is heterogeneous (Jousset et al., 2011). Recent work further shows that even when both conditions are satisfied, a negative relationship between diversity and biomass production can occur if interspecific competition is strong and hierarchical, such as in a highly antagonistic system of wood degrading fungi (Maynard et al., 2017).

Because of their strong dependence on obtaining measurements of relevant functions of community members in monoculture, many community interaction models are most suitable for systems in which diversity can be varied by making a series of assemblages from well-characterized species. However, this approach is difficult for microbial ecosystems because they typically display high richness and often only a small portion of the total diversity can easily be isolated and grown in pure culture (Epstein, 2013). An alternative approach that removes species from natural communities via serial dilution can effectively generate communities of decreasing diversity while circumventing isolation, but in turn makes separating the effects of individuals on community function from those of interactions challenging (Peter et al., 2011; Philippot et al., 2013; Szabó et al., 2007). As a result, artificial microbial assemblages have been used to experimentally study organismic interactions, and these assemblages have been limited in richness and phylogenetic diversity. Importantly, it is not clear to what extent such artificial assemblages reflect naturally occurring interactions, which are often the result of long-term evolutionary processes of co-occurring organisms. It therefore remains an open question how different types of biotic interactions contribute to community function across ecologically relevant ranges of diversity for microbes that have co-diversified in the wild.

Here, we address the problem of how microbial interactions and their effects on ecosystem functions change over a wide diversity spectrum by combining the dilution approach with isolation and pure culture studies. Dilution series of planktonic microbial communities were first allowed to self-assemble on seaweed extract as a realistic, complex environmental carbon substrate, which mimics an algal bloom in the coastal ocean (Takemura et al., 2017; Teeling et al., 2012). This process generated a series of replicate microbial communities with varying diversity and allowed measurement of biomass production and respiration as relevant community-wide parameters. Following the measurements, each community was diluted to extinction to approach monoculture-level diversity, and the same production measures were obtained and compared to the community values. We extend relative yield total (RYT), a classical criterion for determining whether intercropping leads to higher crop yield than mono-cropping (de Wit and van den Bergh, 1965), to our multi-taxa system and generalize it to summable functions across community members as the relative total function (RTF). RTF can be further broken down to investigate how interactions affect community production through changes in total resource use or resource conversion efficiency. We find that with increasing diversity, total resource access increases due to niche complementation, but carbon use efficiency decreases due to competition, leading to a stronger increase in community respiration than biomass production. Our results show that the production of a natural community may be limited by both its potential to access resources and by the amount of niche overlap that allows members to coexist in nature.

2.2 Results

2.2.1 Experimental Workflow

To test the relationship between diversity and community function, we generated microbial communities of decreasing diversity by serially diluting the same seawater sample and subsequently tracking relevant measures of microbial functions during growth in seaweed-seawater medium (SSM, pasteurized seawater containing extract from the brown algae Fucus). Specifically, our approach consisted of two stages where the first generated communities of similar overall biomass but different richness for which community production and respiration were measured, and the second consisted of dilution-to-extinction of communities from the previous stage to generate monocultures for which the same community functions were measured as input to the RTF model (Fig. 1, Methods).

In the first stage, we generated 15 replicate samples from seawater passed through a 5 µm filter and serially diluted each sample in 12 steps of 4-fold dilution each to yield a total of 185 communities of differing cell numbers and diversity. To acclimate these communities to the conditions used for measurements of community function, each was regrown first in pasteurized filtered seawater without any additional nutrients, then diluted 1:1 into SSM and regrown for 48h, and finally diluted 1:30 into SSM and grown for 160h (Fig. 1). This final regrowth experiment was used to measure community functions at different diversity levels. In the second stage, the resultant communities were diluted to extinction to obtain low complexity communities (typically 1-10 taxa) and regrown in SSM for 160h in order to determine the same functions under near monoculture conditions.
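The arithmetic of the dilution series can be sketched as follows. This is only an illustration, assuming each step dilutes the previous one 4-fold; the function name `dilution_series` and the starting density are hypothetical and are not taken from the study.

```python
def dilution_series(start_density, fold=4.0, steps=12):
    """Return the expected cell density after each successive dilution step."""
    densities = []
    density = start_density
    for _ in range(steps):
        density /= fold  # each step dilutes the previous step by `fold`
        densities.append(density)
    return densities


# Hypothetical starting density in cells/mL (not a measured value).
series = dilution_series(start_density=1.0e6)

# Cumulative dilution factor after the 12th 4-fold step.
cumulative_fold = 4 ** 12
```

At the most dilute steps the expected number of cells per sample becomes small, which is what removes rare taxa and generates the diversity gradient.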


Biomass production was measured as the maximum of both cell numbers and per cell protein content that the communities reached within the first 110 hours of observation, while respiration was estimated as the total CO2 production during this time. The combined measures of biomass production and respiration allow estimation of resource use efficiency. Finally, community composition was assessed by 16S rRNA amplicon sequencing immediately after the first diversity removal step and after each community reached early stationary phase in SSM. To estimate the total taxonomic richness, we first counted the number of taxa as clusters of identical 16S rRNA amplicon sequence variants (ASVs) (Callahan et al., 2016, 2017), and then inferred the number of ASVs that were uncounted due to limitations in sequencing depth (Willis and Bunge, 2015). Since observed and estimated taxonomic richness were highly similar in all cases (Fig. S1), we estimate richness from direct ASV counts.

Of the total 185 possible communities from the first stage dilution series, 151 showed sufficient cell numbers after regrowth in pasteurized seawater to serve as inoculum into SSM. These communities ranged in taxonomic richness and cell densities from 5 to 350 and 1.1×10⁴ to 3×10⁵ cells/mL, respectively (Figs. 1 and S2a). As expected, more diluted communities contained fewer and more dissimilar sets of taxa (Figs. S2b, c). All communities experienced further loss of taxa during their regrowth in SSM (Figs. S3a, b), possibly due to environmental filtering, population bottlenecks during transfer, or competitive exclusion.

The second-stage dilution-to-extinction experiment to create monocultures yielded 882 samples, most containing fewer than 10 taxa, with 275 almost completely dominated by a single taxon (>90% of reads belonging to one ASV). These monocultures represent 37 ASVs in 7 families (Table S1), and cover 75.8% of total reads for the communities grown in SSM from the first dilution, hence providing a good basis for comparison of community-level and single-taxon measures of function.


2.2.2 Relationships between diversity and community functions vary

We observed Michaelis-Menten-like hyperbolic relationships between taxonomic richness and cell density or CO2 accumulation, but negative relationships between taxonomic richness and protein production per cell, independent of whether taxonomic richness of the inoculum or resultant communities after growth in SSM was used (Figs. 2a, b). Normalizing both respiration (CO2 production) and cell density to the maximum measured difference between two communities, we found that respiration increased at a faster rate than cell density with increasing taxonomic richness (Fig. 2c, p < 2.2×10⁻¹⁶, paired t-test, two-tailed). Since the amount of protein per cell also decreased with diversity, this differential rate indicates that an increasing portion of assimilated carbon was released as CO2 as the number of taxa rises. Because such lowered yield in more complex communities could be due to individuals having intrinsically lower efficiencies in converting assimilated carbon into biomass or due to different taxa negatively affecting each other, we developed an indicator that allows differentiating between these two possibilities.

2.2.3 Relative total function as an indicator for interaction effects on community function

We established relative total function (RTF) as an indicator for inter-taxa relationship effects on community function by generalizing the concept of relative yield total (RYT) proposed by de Wit and van den Bergh (1965). RYT is calculated as the sum of the relative yields of two species in a community compared to each of their monocultures. It is thus a measure of how resource use of one species is influenced by the other when the two occur together in a community, under the assumption that the resource-to-biomass conversion efficiency of each species is maintained between mono and community cultures so that biomass becomes an indirect readout for resource use. Generalizing two species to multi-species and yield to all community functions that are summable across species, we defined RTF as the sum, over all taxa in a community, of each taxon's function in the community relative to its monoculture:


$$RTF_C = \sum_{i}^{N_C} \frac{F_{C,i}}{F_i} \qquad (1)$$

where $F_{C,i}$ is the function of a taxon $i$ in community $C$, $F_i$ is the measured monoculture function of taxon $i$, and $N_C$ is the number of taxa in community $C$. Under the null model that community function is solely dependent on individual function, $RTF_C$ equals 1, while an $RTF_C$ larger than 1 indicates that biotic interactions increase community function, and an $RTF_C$ smaller than 1 indicates a decrease (see Figs. 3, S4 for graphical illustration).
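Equation 1 can be illustrated with a minimal sketch. The function name `relative_total_function`, the taxon labels, and all numbers below are hypothetical and serve only to show the calculation, not the analysis code used in this work.

```python
def relative_total_function(community, monoculture):
    """Sum of F_{C,i} / F_i over all taxa i in the community (Equation 1)."""
    missing = [taxon for taxon in community if taxon not in monoculture]
    if missing:
        raise KeyError(f"no monoculture measurement for: {missing}")
    return sum(community[taxon] / monoculture[taxon] for taxon in community)


# Hypothetical two-taxon community (e.g., cells/mL attributed to each taxon).
community = {"ASV_a": 3.0e5, "ASV_b": 1.0e5}
# Hypothetical monoculture values for the same taxa.
monoculture = {"ASV_a": 6.0e5, "ASV_b": 2.0e5}

rtf = relative_total_function(community, monoculture)  # 0.5 + 0.5 = 1.0
```

In this made-up case each taxon reaches half of its monoculture value, so the relative functions sum to 1, the value expected under the null model.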

Under most resource concentration regimes, the carrying capacity of a community can be seen as a linear function of resource uptake. The assumption of linearity may be violated only when resource overabundance induces incomplete respiration of substrates (Polz and Cordero, 2016). Since this is unlikely here because only limited resources were provided, we can further define, for community functions analogous to carrying capacity:

$$RTF_C = \sum_{i}^{N_C} \frac{a_{C,i}}{a_i}\,\frac{S_{C,i}}{S_i} \overset{*}{=} \bar{a}_C \sum_{i}^{N_C} \frac{S_{C,i}}{S_i} \qquad (2)$$

\* The second step only holds when $\frac{a_{C,i}}{a_i}$ and $\frac{S_{C,i}}{S_i}$ do not co-vary.

where $a$ is the efficiency of resource conversion into the function of interest (i.e., the amount of resource converted into a specific community function, such as biomass or carbon dioxide production, divided by the total resource uptake), and $S$ is the amount of resource uptake. Thus, there are two major ways that interactions could affect community function: by altering the total amount of resource uptake or the efficiency of resource conversion into the community function of interest. Under the null model, both the mean relative resource conversion efficiency ($\bar{a}_C$) and the relative total resource uptake ($\sum_{i}^{N_C} \frac{S_{C,i}}{S_i}$) would be 1. If either factor is larger or smaller than 1, it is positively or negatively affected by interactions between taxa (Figs. 3, S4).
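The condition attached to the second step of Equation 2 can be made explicit with a short derivation. The shorthand $x_i$ and $y_i$ is introduced here only for illustration and is not part of the original notation. Writing $x_i = a_{C,i}/a_i$ and $y_i = S_{C,i}/S_i$, with means $\bar{x}$ and $\bar{y}$ and population covariance $\mathrm{Cov}(x,y) = \frac{1}{N_C}\sum_{i}^{N_C}(x_i-\bar{x})(y_i-\bar{y})$:

$$\sum_{i}^{N_C} x_i y_i = N_C\,\bar{x}\,\bar{y} + N_C\,\mathrm{Cov}(x,y) = \bar{x}\sum_{i}^{N_C} y_i + N_C\,\mathrm{Cov}(x,y)$$

With $\bar{x}$ identified as the mean relative resource conversion efficiency, the right-hand side reduces to the second form of Equation 2 exactly when the covariance term vanishes, i.e., when relative efficiency and relative uptake do not co-vary across taxa.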


While $RTF_C$ compares the function of a community to the monoculture functions of its constituents, the function of individual taxa in a community can be compared to their monocultures using the relative mean function $RMF_C$. $RMF_C$ is calculated by dividing $RTF_C$ by the total number of taxa in the community ($N_C$). A relative mean function ($RMF_C$) larger than 1 indicates facilitation between the majority of community members. Furthermore, $RMF_C$ can be seen as the product of the mean relative per taxa resource uptake ($\bar{S}_C = \sum_{i}^{N_C} \frac{S_{C,i}}{S_i} / N_C$) and the mean relative resource conversion efficiency ($\bar{a}_C$). Thus, in ecological terms, the mean relative per taxa resource uptake ($\bar{S}_C$) is the ratio of the realized niche to the fundamental niche for the average community member (Figs. 3, S4).

We used the $RTF_C$ indicator to evaluate how interactions between taxa impacted different community function measurements as diversity increased. Since $RTF_C$ is only suitable for community functions that are summed across taxa, we applied it to three functions of interest: cell number, total protein production, and total respiration (as CO2 accumulation). However, calculation of $RTF_C$ also requires knowing the function in community ($F_{C,i}$) and in monoculture ($F_i$) for every taxon in a community. We achieved this by defining as “constitutable” those communities that were at least 85% covered by the 37 taxa we were able to isolate in sufficient purity. Hence, we assumed that the remaining 15% of taxa contribute to community function similarly to the constitutable 85%. The 85% cutoff was chosen because it was the highest criterion that allowed constitutable communities to cover the full diversity spectrum of the stationary phase communities (Fig. S5). Using this cutoff, we identified 82 (35%) of the 235 total communities with good sequencing coverage to be constitutable for every community function.

Subsequently, we calculated functions of the individual taxa in each community ($F_{C,i}$) by multiplying the measured total community function $F_C$ by the relative abundance of each taxon from community composition measurements. This calculation estimates cell counts for each taxon ($F_{C,i}(\mathrm{Cell\ count})$) in the community; however, for estimating protein and CO2 production of individual taxa in the community ($F_{C,i}(\mathrm{Protein})$ and $F_{C,i}(\mathrm{CO_2})$), the calculation is only valid if there are no large variations in the amount of protein per cell and CO2 production per cell across different taxa, which we found to be the case for the majority of taxa in our communities (see Methods for details).
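The two bookkeeping steps described above (checking whether a community is constitutable, and apportioning a measured community function among taxa by relative abundance) can be sketched as follows. The function names, taxon labels, and values are hypothetical; only the 85% cutoff comes from the text.

```python
def per_taxon_function(total_function, rel_abundance):
    """Estimate F_{C,i} as total community function times relative abundance."""
    return {taxon: total_function * frac for taxon, frac in rel_abundance.items()}


def is_constitutable(rel_abundance, isolated_taxa, cutoff=0.85):
    """True if the isolated taxa account for at least `cutoff` of the community."""
    covered = sum(frac for taxon, frac in rel_abundance.items()
                  if taxon in isolated_taxa)
    return covered >= cutoff


# Hypothetical community composition (relative abundances sum to 1).
rel_abundance = {"ASV_a": 0.5, "ASV_b": 0.25, "ASV_c": 0.25}

# Hypothetical total cell density of the community, split among taxa.
cells = per_taxon_function(2.0e5, rel_abundance)

# Only ASV_a and ASV_b isolated: 75% coverage, below the 85% cutoff.
partly_covered = is_constitutable(rel_abundance, {"ASV_a", "ASV_b"})
# All three isolated: 100% coverage.
fully_covered = is_constitutable(rel_abundance, {"ASV_a", "ASV_b", "ASV_c"})
```

As noted in the text, applying the same proportional split to protein or CO2 is only reasonable when per-cell protein content and per-cell CO2 production do not vary strongly among taxa.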

2.2.4 Interaction effects on community function differentially increase with diversity

By tracking how different community functions relative to monoculture change with diversity, we found that as diversity increased, interactions led to a stronger increase in community respiration (CO2 production) than community biomass production (cell count and total protein). At low richness ($N_C \le 12$), $RTF_C$ was not significantly different from 1 for CO2 production, and slightly below 1 for total protein production and cell count, indicating that the effect of interactions on respiration was negligible but weakly negative for biomass production (Figs. 4a, S6a, t-test, two-tailed, pCO2=0.90, pprotein=1.0×10⁻⁶, pcell=2.4×10⁻¹⁰). However, as taxonomic richness increased beyond 12, $RTF_C(\mathrm{CO_2})$ steadily rose above 1 until it eventually plateaued when a moderate richness level of 26 taxa was reached, while $RTF_C(\mathrm{Protein})$ remained around 1 over the entire diversity range, and $RTF_C(\mathrm{Cell})$ remained around 1 until 26 taxa,

-3 -11 beyond which it stayed slightly above 1 (t-test, two-tailed, pCO2,1226=1.4×10 ,

pprotein,1226=0.29, pcell,1226=7.5×10⁻³). Thus, at moderately high diversity, interactions had a strong net positive effect on community respiration, but only a weakly positive effect on the production of cells, and no net effect on community protein production.

2.2.5 Community resource uptake increases with diversity while individual taxa resource uptake decreases

Since the relative community CO2 production increased significantly with diversity, and relative community biomass production also exhibited a weak increase, the relative resource uptake of the community must also increase with diversity. Detailing this under the $RTF_C$ framework, since each community has a single relative total resource uptake value ($\sum_{i}^{N_C} \frac{S_{C,i}}{S_i}$), and the conversion efficiencies for CO2 and biomass accumulation have to change in opposite directions (i.e., they cannot simultaneously increase or decrease), the relative total resource uptake of a community must fall between $RTF_C(\mathrm{CO_2})$ and $RTF_C(\mathrm{Biomass})$. Taking total protein production as the measurement for biomass, we find that the region between $RTF_C(\mathrm{CO_2})$ and $RTF_C(\mathrm{Protein})$ moved from around 1 to higher than 1 as diversity increased (Figs. 4a, S6a). This indicates that biotic interactions with positive effects on resource uptake, such as niche complementation, are more prevalent in more diverse communities.
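The bounding argument can be written out in a few lines. The symbol $U_C$ for the relative total resource uptake is shorthand introduced here for illustration only. Applying Equation 2 to each function separately gives

$$RTF_C(\mathrm{CO_2}) = \bar{a}_C(\mathrm{CO_2})\,U_C, \qquad RTF_C(\mathrm{Biomass}) = \bar{a}_C(\mathrm{Biomass})\,U_C, \qquad U_C = \sum_{i}^{N_C}\frac{S_{C,i}}{S_i}.$$

If the two mean relative conversion efficiencies change in opposite directions, one of them is at least 1 and the other at most 1, so that

$$\min\{RTF_C(\mathrm{CO_2}),\,RTF_C(\mathrm{Biomass})\} \;\le\; U_C \;\le\; \max\{RTF_C(\mathrm{CO_2}),\,RTF_C(\mathrm{Biomass})\}.$$

This holds under the same assumption as Equation 2, namely that relative efficiency and relative uptake do not co-vary across taxa.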

However, individual taxa in more diverse communities on average take in fewer resources, i.e., they have narrower realized niches relative to their fundamental niches in monoculture. This is supported by the gradual decrease with diversity of the mean relative per taxa resource uptake ($\bar{S}_C$), whose boundaries are defined by the relative mean functions $RMF_C(\mathrm{CO_2})$ and $RMF_C(\mathrm{Protein})$ (Figs. 4b, S6b). Also, since no $RMF_C$ value exceeded 1, communities where most community members facilitate each other are rare. In fact, only a very small fraction of taxa produced more biomass or CO2 in communities than in monocultures (Fig. S7), indicating that each taxon always has more competitors than facilitators, and the frequency of mutualistic interactions among different taxa is lower than that of negative interactions.

2.2.6 Competition in more diverse communities reduces carbon use efficiency (CUE)

Since relative community CO2 production increased faster with diversity than relative community biomass production, the relative carbon-to-biomass conversion efficiency (often known as carbon use efficiency, CUE) decreased, possibly as a result of stronger competition. Because the ratio of $RTF_C(\mathrm{CO_2})$ to $RTF_C(\mathrm{Biomass})$ should scale positively with both the expected CUE from monocultures and how much this CUE changes upon introduction of the organism into a community, we estimated the range of the relative-to-expected CUE from the $RTF_C(\mathrm{CO_2})$ to $RTF_C(\mathrm{Protein})$ ratios while limiting the expected CUE to the thermodynamic limits of 0-0.6 (Geyer et al., 2016; see Methods for details). We found that as taxonomic richness increased, interactions had increasingly negative effects on the relative CUE of the community, but the effect leveled off at moderate taxonomic diversity (Figs. 4c, S6c). Furthermore, since at higher richness $RTF_C(\mathrm{Protein})$ was lower than $RTF_C(\mathrm{Cell})$ ($N_C > 26$, p=0.06, paired Wilcoxon rank sum test, one-tailed), the relative CUE decrease is a joint effect of both smaller cell size and lower cell number. This is likely because microbes in more diverse communities are under stronger competition for resources, as evidenced by their narrower relative realized niches. They either have to grow faster to directly compete with other microbes for preferred resources or avoid competition by using more recalcitrant and less preferred resources. Because growth rate generally scales positively with cell size (Cermak et al., 2017; Roller et al., 2016; Schaechter et al., 1958), it appears more likely that our observation of smaller cells in more diverse communities indicates reduced growth rate and partitioning of a larger portion of carbon taken up toward respiration due to the lower energy yield of more recalcitrant substrates.

However, it is possible that the “faster growth” strategy dominates initial periods of growth and the “recalcitrant resource” strategy takes over in later growth phases. We observed that more diverse communities had, at least initially, faster growth rates. An overwhelming majority of communities with more than 26 taxa reached maximum cell density by 24h of growth, while those with fewer than 26 taxa had equal probability of growing to maximum at 24h, 32h or 40h (Fig. 5). Hence, our data suggest that diverse communities had a high probability of containing members specialized for substrates that allow initial rapid growth. Conversely, communities with lower diversity only occasionally contained such potentially fast growers, which can bloom under conducive conditions but otherwise occur at low concentration in natural communities, explaining the broader distribution of times to maximum cell numbers. Although this does not provide direct evidence for individual taxa growing faster in more diverse communities, it may indicate that it could be advantageous for taxa in more diverse communities to regulate themselves for faster growth compared to monocultures. We would then expect relatively lower CUE but larger cells to accumulate early in the experiment (Lipson, 2015; Pfeiffer et al., 2001). Over the full observation period, however, more diverse communities on average had smaller cell size, possibly because after reaching peak density, the initial population was replaced by other taxa or switched to a metabolism that utilizes substrates that are more recalcitrant and less energy efficient.

2.2.7 Effects of interactions on community function increase logistically with diversity

Although cell density and CO2 both exhibit Michaelis-Menten-like hyperbolic relationships with diversity, both competition and complementation have logistically increasing effects on community function as diversity increases. By estimating CUE through the size of the difference between $RTF_C(\mathrm{CO_2})$ and $RTF_C(\mathrm{Protein})$, and total niche occupation through their relative position to 1, we found that at low diversity ($N_C \le 12$), the effects of both competition and niche complementation were weak, with the negative effects of competition narrowly exceeding benefits of niche complementation. As diversity increased beyond this level, the positive effect of niche space expansion gradually exceeded the negative effect of competition until they both stabilized at moderately high diversity ($N_C > 26$). This indicates that while niche complementation is probably limited by the total amount of resources available, competition also has plateaued (Fig. 4d). Overall, these patterns translate to a logistic model of growth as diversity increases for impacts on community function by either competition or complementation.

2.2.8 Specific taxa effects on community function

Since it is often assumed that organisms that are more phylogenetically distant are also more metabolically distinct, and taxonomic richness only partially explained the variance of community function (Figs. 2a, b), we asked if phylogenetic diversity of the communities affected community cell production or respiration through niche complementation. We found that when compared to species richness, neither of the two most common measurements for phylogenetic distance, the abundance weighted mean pairwise distance (MPD) or the abundance weighted mean nearest taxon distance (MNTD), was better in explaining the variance of community functions (see Table S2 for comparison between all models/diversity metrics).

We then checked the possibility of specific “key” taxa affecting community function, i.e., whether one taxon could alter community function without altering MPD or MNTD. We screened for these taxa by looking for ASVs whose relative abundance significantly correlated (Kendall rank correlation, q<0.05 after FDR correction) with community function within sliding windows of taxonomic richness (window widths ranging from 2-10) (see Table S3 for all significant correlations). In most cases, each identified taxon was specific to a certain range of taxonomic richness (usually of size 10-15), with the direction of correlation highly consistent among windows within the range. Also, correlations between taxa and CO2 were few compared to those for cell density and total protein production, indicating that biomass production is more sensitive to taxonomic alterations than respiration. We identified five ASVs that exemplify how diversity may affect the outcome of microbial interactions and consequently community function.
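The multiple-testing step can be illustrated with a small sketch. The text states only that q-values were obtained by FDR correction; the Benjamini-Hochberg procedure used below is an assumption about which correction was meant, and the function name and p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted q-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda k: pvalues[k])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


# Hypothetical p-values, e.g., one Kendall correlation test per ASV in a window.
pvals = [0.001, 0.04, 0.03, 0.5]
qvals = benjamini_hochberg(pvals)
significant = [q < 0.05 for q in qvals]
```

In this made-up example only the first test passes the q<0.05 threshold, even though three of the raw p-values are below 0.05, which is the kind of filtering the screen relies on when many ASVs and windows are tested.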

Taxonomic richness of the inoculum strongly influenced dominance patterns in communities and thereby had significant effects on community function. For example, one taxon (ASV3) belonging to the genus Alteromonas was found to positively correlate with CO2 production but only in communities with fewer than 10 taxa. However, ASV3 did not show strong dominance in communities with higher inoculum diversity. This suggests that there were other, rarer taxa that were able to compete with ASV3 and demonstrates the importance of priority effects in colonization of small resource patches in the environment (Datta et al., 2016).

Despite having generally more niche overlap for resources, communities with high diversity can still benefit from having taxa that use the less common resources. ASV30, identified as Wenyingzhuangia, positively correlated with total protein production in communities with high taxonomic richness. Members of the genus Wenyingzhuangia are among the few organisms known to degrade the highly sulfated and recalcitrant sugar fucoidan (Chen et al., 2016), estimated to be 4-10% of the dry weight of Fucus (Fletcher et al., 2017). ASV30 may thus have positively affected community production by occupying a niche inaccessible to most other taxa, and might even have acted as a pioneer taxon by degrading the recalcitrant fucoidan and converting it into more easily usable substrates.

Other taxa were found to have effects on community production likely through facilitation and predation. Several ASVs belonging to the genus Sulfitobacter were consistently found to be positively correlated with community production across different ranges of community richness and production measurements. Members of this genus are found as stable associates with algae (Singh and Reddy, 2014), and possess sulfite-oxidation and aromatic compound degradation abilities (Mas-Lladó et al., 2014). Given that Fucus contains large amounts of aromatic compounds, Sulfitobacter could have positively affected the community production through degradation of aromatic compounds, which could otherwise impede the growth of other bacteria. We also found a negative correlation between ASV66, a putative predatory bacterium from the genus Halobacteriovorax (Williams et al., 2016), and community cell density at high taxonomic richness, indicating that predatory behavior may also play a role in determining community production.

2.3 Discussion

In this study, by growing serially diluted seawater on brown algal leachate to generate self-assembled communities that span a wide range of diversity, we examined how diversity could impact community function through various interactions. Consistent with observations from artificially assembled systems, we found communities with greater diversity to have greater resource uptake due to niche complementation. However, the expansion of resource use came at the cost of increased competition driven by niche overlap and decreased the carbon use efficiency of the communities.


Compared to assembling “bottom-up” communities from known isolates, our method has both advantages and caveats. Since our communities self-assemble, our workflow is inevitably less well controlled compared to artificial assembly experiments and has more challenges regarding detailed characterization of individual traits and interactions. For example, unlike isolate assemblages that could be mixed to the exact same cell density, our inoculum communities only had cell densities on the same order of magnitude. Thus, checking that relationships between diversity and community function remain significant after considering the effect of initial cell density via multi-variable regression is necessary (Table S4). Also, since our method was based on 16S rRNA gene amplicon sequencing, we were only able to distinguish bacteria that had at least one nucleotide difference in the 16S rRNA V4 region. Different strains of bacteria that show genetic variability beyond the 16S rRNA region and have different productivities would be collapsed as one taxon in our analyses.

Despite these limitations, our method has the advantage of providing a more accurate recapitulation of how natural microbial communities assemble under defined environmental conditions and a more precise measurement of how community diversity alters different measurements of community function. Using organisms that have shared and adapted to a common habitat is especially important in light of studies that show evolutionary history can alter resource use patterns of taxa and can have a strong influence on how community functions are affected by biodiversity (Fiegna et al., 2015; Gravel et al., 2011). Therefore, communities artificially assembled from strains that do not necessarily have a common history may provide a less realistic picture than provided by using a “top-down” approach of directly allowing natural communities to self-assemble to different diversities.

We thus argue that a “top-down approach” provides a better window into how diversity affects community function in nature. In previous work, this self-assembly approach has proven powerful in evaluating how and when various biogeochemical functions are affected by loss of diversity in a number of ecosystems such as fresh water lakes and top soil (Peter et al., 2011; Philippot et al., 2013; Szabó et al., 2007). However, it was not possible to provide more detailed exploration of the underlying mechanisms for functional loss (Peter et al., 2011; Philippot et al., 2013; Szabó et al., 2007). We addressed this problem by adding a dilution-to-extinction step to the traditional dilution approach, effectively allowing us to study how community function is mediated via interactions across a diversity gradient by comparing community function to monoculture functions via the $RTF_C$ index.

Data generated from our method is also suitable for most other models that evaluate the net effect of interactions on communities by comparing the observed community function to that predicted from monoculture functions of community members. While our null model places a strong emphasis on comparing the uptake and utilization efficiency of resources between monocultures and communities, many other null models assume the performance of an individual in a community is directly proportional to its initial relative abundance, i.e., all individuals have the same growth rate. For example, in the Loreau and Hector model (Loreau and Hector, 2001b), the deviation between the expected and observed community function is called the net biodiversity effect, which can be further broken down into a part due to growth rate differences between individuals (“selection effect”) and a part due to resource complementation and competition (“complementarity effect” that is linearly related to the $RTF_C$ index). Applying such a null model towards our system shows that the net biodiversity effect is mostly driven by the complementarity effect (Fig. S9). Thus, since our major purpose was to elucidate how interactions influence community functions through different resource use patterns across a diversity gradient, we chose to use our current null model that allows easy breakdown of interaction effects on functions into those on resource use and resource-to-function conversion efficiency.
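For readers unfamiliar with the Loreau and Hector partition mentioned above, a minimal sketch is given below. It follows the standard additive partition (net effect = complementarity + selection), with expected relative yields set to initial relative abundances as described in the text. The function name, taxon labels, and numbers are hypothetical, and this is not the code used to produce Fig. S9.

```python
def loreau_hector_partition(observed, monoculture, expected_fraction):
    """Return (net effect, complementarity effect, selection effect)."""
    taxa = list(observed)
    n = len(taxa)
    # Deviation of the observed relative yield from the expected relative yield.
    delta_ry = [observed[t] / monoculture[t] - expected_fraction[t] for t in taxa]
    mono = [monoculture[t] for t in taxa]
    mean_dry = sum(delta_ry) / n
    mean_mono = sum(mono) / n
    # Complementarity: N * mean(delta RY) * mean(monoculture function).
    complementarity = n * mean_dry * mean_mono
    # Selection: N * population covariance of delta RY and monoculture function.
    selection = sum((d - mean_dry) * (m - mean_mono)
                    for d, m in zip(delta_ry, mono))
    # Net biodiversity effect: observed minus expected community function.
    net = (sum(observed[t] for t in taxa)
           - sum(expected_fraction[t] * monoculture[t] for t in taxa))
    return net, complementarity, selection


# Hypothetical two-taxon example.
monoculture = {"ASV_a": 4.0, "ASV_b": 2.0}
expected_fraction = {"ASV_a": 0.5, "ASV_b": 0.5}  # initial relative abundances
observed = {"ASV_a": 3.0, "ASV_b": 1.5}

net, complementarity, selection = loreau_hector_partition(
    observed, monoculture, expected_fraction)
```

When the expected fractions sum to 1, the complementarity term equals the mean monoculture function multiplied by ($RTF_C$ − 1), which is the linear relationship to the $RTF_C$ index noted above. In this made-up example $RTF_C$ = 1.5 and the mean monoculture function is 3, giving a complementarity effect of 1.5 and no selection effect.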

In our system, we found a large decrease in CUE with diversity, possibly due to increased competition for resources. Although decrease of CUE with increasing diversity due to interactions has been documented previously, our data suggest an unusually large effect. For example, in an artificially assembled system of saprophytic basidiomycete fungi, it was found that interactions in multispecies communities can decrease CUE by up to 25%, a stronger reduction than induced by many abiotic factors such as temperature increase (Maynard et al., 2017). However, it is estimated that most of our communities at moderately high diversity have a 60%-80% decrease in CUE due to interactions. It is unlikely that this large CUE decrease in our system is due to antagonism as suggested by Maynard et al. (2017), since instead of selecting a system (wood decaying fungi) and conditions that favor antagonism, we are mimicking a free-living and dilute marine environment. Although antagonistic potential has been demonstrated in marine bacteria (Cordero et al., 2012; Rypien et al., 2010), it is more likely that the large interaction effect on CUE in our system results from resource competition, which Maynard et al. (2017) made efforts to minimize by having excess supply of both carbon and nitrogen. By contrast, with seaweed extract being mostly organic matter, our system is likely limited by nitrogen or phosphate. Thus, with the narrower realized niches we observe as diversity increases, resource competition is likely the major driving force for the CUE decrease in our communities.

Furthermore, in our system, the impacts of both competition and complementation on community function logistically increased as diversity increased. This was probably a result of algal exudates being overall ubiquitous in the coastal ocean, but consisting of substrates that have different numbers of bacterial consumers. Certain substrates may only be utilized by a portion of bacteria in the environment, and the chance of getting such bacteria in a community would be equally small among communities with different richness when diversity is low ($N_C \le 12$). Only when there was a sufficient number of taxa in the community did adding in new taxa actually expand the resource use profile, and this expansion quickly became limited by the amount and types of resources available. Moreover, the speed of the resource use expansion was slower than that of the increase in taxa, thus niche overlap also increased with diversity. However, niche overlap also only started to translate to negative effects on community production when there was a sufficient number of taxa in the community, possibly because only when there are enough competitors around would there be a need for employing a fast-growth, low-efficiency strategy.

The saturation of competition in our system at moderately high diversity is possibly due to a combined effect of organisms co-diversifying over long periods of time under the specific environmental conditions of the coastal ocean. The limited amount of competition we observe is consistent with theoretical predictions that environmental fluctuations place upper bounds on how much overlap there can be between niches for species in the community to stably coexist (May and Arthur, 1972). Indeed, the communities assembled here were drawn from the highly fluctuating coastal environment, and given that niche overlap is the result of organisms having sets of redundant functional genes, this might indicate that the frequency of environmental fluctuation limits the amount of functional similarity between algal degrading bacteria.

In conclusion, using self-assembled microbial communities directly derived from a coastal ocean seawater sample, we found that complementation and competition for resources both increase with diversity but reach a threshold at a moderate number of taxa, beyond which no further effect of these interactions is evident. The simultaneous increase of complementation and competition with diversity generates trade-offs between the range and efficiency of resource use. Although the exact diversity thresholds where the impacts of competition and complementation on community function saturate in our system are specific to our experimental setup, such a threshold should exist in many natural habitats: while it is expensive for organisms to maintain pathways for resources they rarely encounter, competing for more common resources can also require costly strategies and puts the organism at higher risk of competitive exclusion.

Therefore, limits on interactions between wild populations of bacteria are likely a result of them maintaining a tremendous amount of diversity in fluctuating environments over long periods of time.

2.4 Methods

2.4.1 Media preparation

To prepare pasteurized seawater as a medium for bacterial growth and as the solvent for making seaweed-seawater media (SSM), a total of 8 L of coastal surface seawater was collected from a sampling site near Northeastern University’s Marine Science Center (Canoe Beach, Nahant, MA, USA; N 42° 25' 11.6", W 70° 54' 24.8") on Nov 12th, 2016. The water temperature at the time of sample collection was 12.0°C.

Seawater was pasteurized as described in Takemura et al. (Takemura et al., 2017). Briefly, the seawater sample was divided into 2 L bottles, heated to temperatures between 78-82°C in water baths and maintained at that temperature for 1 hour. Each bottle was pasteurized twice with at least 48 hours between the pasteurization events. The pasteurized seawater was then combined and filtered through 0.22 µm filters to remove any large size particles.

Stock solution for making the SSM was made from Fucus vesiculosus collected from the rocky shorelines of Canoe Beach on July 12th, 2015. The Fucus was washed, sun dried and ground using a blender (Waring). Four grams of the ground Fucus was mixed with 100 mL of pasteurized seawater, and stirred at 150 rpm for 2 hours at room temperature. The mixture was then passed through 20 µm Steriflip filters (Millipore), diluted 4 fold with pasteurized seawater, passed through a 0.22 µm filter (Corning, pre-washed three times with MilliQ water), and pasteurized again. The seaweed media extract stock solution was stored in the dark at room temperature until time of use, when it was diluted 10 fold in pasteurized seawater to make SSM. Lyophilizing 10 mL of the 10X stock solution resulted in 22 mg of dried material; thus the concentration of dissolved organic matter in the SSM was approximately 0.022% (w/v).
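The concentration stated above follows from simple unit conversion. The short Python sketch below reproduces the arithmetic; the function and variable names are illustrative and are not taken from the study's own code.

```python
def percent_w_v(mass_mg: float, volume_ml: float) -> float:
    """Convert a mass in mg dissolved in a volume in mL to % (w/v), i.e. g per 100 mL."""
    grams_per_ml = mass_mg / 1000.0 / volume_ml
    return grams_per_ml * 100.0

# Values from the text: 22 mg of dried material from 10 mL of 10X stock.
stock_pct = percent_w_v(22.0, 10.0)   # concentration of the 10X stock
ssm_pct = stock_pct / 10.0            # 10-fold dilution gives 1X SSM

print(f"10X stock: {stock_pct:.3f}% (w/v); SSM: {ssm_pct:.3f}% (w/v)")
```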

2.4.2 Sample collection and experimental design

In order to generate the inoculum communities, a coastal surface seawater sample was collected from a sampling site near Northeastern University’s Marine Science Center (Nahant, MA, USA; N 42° 25' 11.6", W 70° 54' 24.8") on Nov 18th, 2016. The water temperature at the time of sample collection was 11.5°C.

The collected seawater was filtered through a 5 µm filter (Whatman) to remove particulates and larger eukaryotes and was estimated to have a microbial concentration of 3×10⁵ cells/mL via fluorescence-activated cell sorting (FACS) using absolute count beads. Thus, an initial “undiluted” seawater community was defined as 30 mL of the filtrate, containing approximately 10⁶ cells in total. Fifteen “undiluted” seawater communities were 4-fold serially diluted with pasteurized seawater, generating 15 communities (30 mL) for each dilution level (4² X to 4¹¹ X). An additional 5 and 15 sub-communities were generated for the two highest dilution levels (4¹⁰ X and 4¹¹ X).

All diluted communities were placed in 50 mL Falcon tubes and rotated end-over-end at 6.5 rotations per minute in the dark. Bacterial growth in each tube was repeatedly sampled over time using FACS until the community was determined to have reached stationary phase (less than 20% increase in cell count between two consecutive time points, following time points with over 20% growth in between), at which point it was destructively sampled for DNA extraction and used as inoculum into SSM. Time points for sampling were 0, 15, 24, 38, 52, 72, 96, 120 and 360 hours.
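The stationary-phase rule described above can be sketched in Python as follows. This is one possible reading of the criterion (the first interval with under 20% increase that comes after at least one interval with over 20% increase); the function name and the example cell counts are hypothetical, and only the sampling hours are taken from the text.

```python
def stationary_index(counts, threshold=0.20):
    """Return the index of the first time point judged stationary, or None.

    Time point k is called stationary when the count rose by less than
    `threshold` from k-1 to k AND some earlier interval showed growth above
    `threshold`, so that a lag phase is not mistaken for stationary phase.
    """
    has_grown = False
    for k in range(1, len(counts)):
        prev, cur = counts[k - 1], counts[k]
        if prev <= 0:
            continue  # a relative increase cannot be computed
        increase = (cur - prev) / prev
        if increase > threshold:
            has_grown = True
        elif has_grown and increase < threshold:
            return k
    return None

hours = [0, 15, 24, 38, 52, 72, 96, 120, 360]             # from the text
counts = [1e3, 1.05e3, 2e3, 8e3, 3e4, 3.3e4, 3.4e4, 3.4e4, 3.4e4]  # hypothetical
k = stationary_index(counts)
print("stationary at", hours[k], "h" if k is not None else "")
```

The initial 5% rise between 0 and 15 hours is ignored because no growth above the threshold has yet occurred.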

For each community, three replicates of 0.3 mL each were used to inoculate an equal volume of 2X SSM. The communities were allowed to grow for 48 hours in 96 deep well plates on a floor shaker at 300 rpm before being diluted 1/30 into 580 µL of SSM. The re-inoculated cultures were grown in MicroResp Systems (James Hutton Ltd, Aberdeen, UK) on a floor shaker at 300 rpm, and their community functions were tracked as cell count, total protein, and CO2 production at time points 0, 16, 24, 32, 40, 64, 110, and 160 hours. The 160 hour data point was eventually omitted due to possible changes in the physical properties of the culture interfering with the FACS measurements.

At the end of the tracking period, communities similar in initial dilution levels and 160 hour cell count were combined. The combined communities were diluted in SSM according to their cell densities so that on average each diluted community would contain 1 cell per 200 µL of culture. Each combined community had 18-24 corresponding diluted communities. These diluted communities were allowed to grow for 7 days in flat-bottom 96 well plates before they were screened for positive growth using FACS.
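As a rough guide to what a target of 1 cell per 200 µL well implies, the sketch below computes well occupancy under a Poisson model. The Poisson assumption (cells distributed independently among wells) is ours and is not stated in the text; actual outcomes also depend on which cells are able to grow.

```python
import math

def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k cells in a well when the average is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

mean_cells = 1.0                          # 1 cell per 200 µL well (from the text)
p_empty = poisson_pmf(0, mean_cells)      # wells that receive no cell
p_single = poisson_pmf(1, mean_cells)     # wells that receive exactly one cell
p_growth = 1.0 - p_empty                  # wells that receive at least one cell
p_single_given_growth = p_single / p_growth

print(f"empty: {p_empty:.3f}, single cell: {p_single:.3f}, "
      f"single cell among seeded wells: {p_single_given_growth:.3f}")
```

Under this assumption a substantial fraction of seeded wells would start from more than one cell, which is consistent with verifying monocultures by sequencing rather than assuming them.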


Communities that scored positive for growth were diluted 1/30 into 580 µL of SSM, and again grown in MicroResp Systems on a floor shaker at 300 rpm, with their community functions tracked as cell count, total protein, and CO2 production at time points 0, 16, 24, 32, 40, 64, 110, and 160 hours. The 160 hour data point was eventually omitted due to possible changes in the physical properties of the culture interfering with the FACS measurements.

2.4.3 Community growth and function measurements

For tracking growth of the diluted communities in pasteurized seawater for generating inoculum communities, at each time point 100 µL subsamples of the communities were obtained and fixed 1:1 with 0.8% Formaldehyde + 0.5 µg/mL 4',6-Diamidino-2-Phenylindole (DAPI, Sigma). During subsequent growth of the inoculum communities in SSM for community function measurements, at each time point 20 µL of each culture was fixed 1:10 with 0.8% Paraformaldehyde (BeanTown Chemical), and mixed 1:1 with staining media (1:5000 SYPRO red + 0.02% SDS + 1 µg/mL DAPI) for 30 min in the dark at room temperature [(36), with some modifications].

All FACS measurements were performed using a BD LSRFortessa Flow Cytometer with a high throughput sampler. Bacterial cells were gated by forward and side scatter as well as the intensity of DAPI staining under a 500 V laser with activation wavelength of 405 nm, and collected through a 450/50 nm band-pass filter. Signal points that were between 20-4×10⁴ FSC-H, 40-4×10⁴ SSC-H and showed more than 200U blue fluorescence were counted as bacteria (Fig. S10). These gates were based on the following criteria: (a) The SSC-H threshold represents the lowest values above the region that accumulated 1000 events/min when running a blank sample, while the gated region above the threshold accumulated less than 100 events/min. (b) Events in the gated region shift to higher FSC-H values when the voltage of FSC is dialed up, while events below the FSC-H threshold do not move and are likely machine noise. (c) A population of cells clearly distinct from background noise appeared in the region bound by SSC-H and FSC-H for different isolate cultures as well as environmental samples stained with DAPI, while no events appeared for blank samples or unstained cells (to reduce background noise, acquisition was triggered by blue fluorescence signal). The gating protocols were validated by comparing the CFU/mL of a Vibrionaceae strain (grown with twice the substrate concentration as SSM) to cell density calculated from the number of gated events (Fig. S10f). The CFUs were counted in triplicate by plating 100 µL of serial dilutions on Marine Broth 2216 plates (BD Difco). Fluorescence for the SYPRO red stain was determined with a 561 nm excitation laser (630 V) and 610/20 band-pass filter.
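The rectangular gate described above can be expressed as a simple filter over events. The sketch below is illustrative only: the thresholds come from the text, while the event tuples, the function names, and the choice to treat the scatter bounds as inclusive are assumptions.

```python
FSC_RANGE = (20, 4e4)   # FSC-H gate, from the text
SSC_RANGE = (40, 4e4)   # SSC-H gate, from the text
MIN_BLUE = 200          # blue (DAPI) fluorescence must exceed this value

def is_bacterium(fsc_h: float, ssc_h: float, blue: float) -> bool:
    """True if an event falls inside the scatter gate and is DAPI positive."""
    return (FSC_RANGE[0] <= fsc_h <= FSC_RANGE[1]
            and SSC_RANGE[0] <= ssc_h <= SSC_RANGE[1]
            and blue > MIN_BLUE)

def count_gated(events) -> int:
    """Count events, given as (FSC-H, SSC-H, blue) tuples, that pass the gate."""
    return sum(1 for event in events if is_bacterium(*event))

# Hypothetical events: in gate, FSC-H too low, too dim, on the gate boundary.
example_events = [(100, 100, 500), (10, 100, 500), (100, 100, 150), (4e4, 40, 201)]
print(count_gated(example_events))
```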

For consistency, cell counts for both community function measurements and growth tracking were determined by the number of DAPI positive events in the selected gated region for bacteria. Since 99.0±3.6% of DAPI positive cells in all non-blank samples also stained positive for SYPRO red, the protein per cell was determined by measuring the mean SYPRO red fluorescence for all DAPI positive events. Total protein was calculated by multiplying protein per cell by cell counts. For each 96 well plate of samples, cell counts were normalized to 3-6 standards of CountBright absolute count beads (Thermo Fisher) at 990,000 beads/mL, and total protein was normalized to 3-6 wells that contained a fixed standard marine bacteria mixture containing approximately 1:1 Vibrionaceae and .

CO2 production of communities was calculated from reading indicator plates in the MicroResp System on a plate reader at λ=572 nm (Campbell et al., 2003). In the MicroResp system, all target communities were placed in deep 96 well blocks and connected to a top indicator plate using a seal. The seal insulated wells from each other but allowed the indicator plate to reflect the % of CO2 accumulated in the headspace of each well. The MicroResp indicator plates were made and calibrated according to the manufacturer’s instructions. The relationship (R²=0.996) between %CO2 in headspace and absorbance was (%CO2) = 0.1648/(Δ572-0.2457)-0.2301, with Δ572 being the difference in A572 between the start and end of CO2 production time. Compared to the manufacturer’s instructions, we increased the number of measurements at low CO2 concentration, and were able to detect CO2 at levels as low as 0.025% (v/v). The average precision for all points on the standard curve was ±4% of each measurement (Fig. S11). The rate of CO2 production per volume of culture was calculated from %CO2 for each sampling interval, and a normalized total CO2 production of communities was calculated by summing CO2 production rate × average time × average community volume for each sampling interval. The effect of atmospheric CO2 was removed by normalizing the CO2 production values to the average of the blank wells.
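The standard curve and the interval summation above can be sketched as two small Python functions. The calibration constants are those given in the text; the function names are hypothetical, and reading "average time" as the duration of each sampling interval is an assumption on our part. The conversion from %CO2 to a production rate is not specified in enough detail to reproduce here, so rates are taken as inputs.

```python
def percent_co2(delta_572: float) -> float:
    """Standard curve from the text: %CO2 = 0.1648/(Δ572 - 0.2457) - 0.2301.

    The curve is undefined at Δ572 = 0.2457, so inputs should lie within the
    calibrated range of the indicator plate.
    """
    return 0.1648 / (delta_572 - 0.2457) - 0.2301

def total_co2(rates, durations_h, volumes):
    """Sum rate x duration x volume over sampling intervals.

    `rates` are per-volume CO2 production rates for each interval; their
    units, and the interpretation of the time term as interval duration, are
    assumptions made for illustration.
    """
    return sum(r * t * v for r, t, v in zip(rates, durations_h, volumes))

# Hypothetical example with two sampling intervals.
example_total = total_co2(rates=[1.0, 2.0], durations_h=[16, 8], volumes=[0.5, 0.5])
print(percent_co2(0.5), example_total)
```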

Measurements eventually used as community functions were: maximum cell density the community reached within 110 hours, maximum total protein production and protein per cell within 110 hours, as well as total normalized CO2 production within 110 hours. To account for cells clumping in later time points, the maximum protein per cell was calculated from maximum total protein production/maximum cell density, instead of directly comparing protein per cell measurements between time points.

2.4.4 DNA extraction

In order to determine community composition for the inoculum communities, 30 mL of the inoculum communities were pushed through Swinnex filter holders (13 mm, Millipore) containing 13 mm 0.22 µm filters (autoclaved, Durapore membrane PVDF, Millipore) connected to Luer-Lok syringes (BD). Filter paper was removed from the holder, cut into 4-6 smaller pieces, submerged in 125 µL QE buffer with 1% Ready-Lyse Lysozyme (Epicentre, Quick Extract Kit) in Eppendorf tubes, and shaken at 400 rpm overnight at room temperature. The tubes were spun down at 1700 rpm for 5 min the second day and the supernatant was stored at -20°C until future use.

The composition of communities growing in SSM was determined at early stationary phase. For each community, 200 µL of sample was taken and filtered through MultiScreen HTS GV filter plates (0.22 µm, sterile, PVDF membrane, Millipore) by spinning the plates for 5 min at 3000 rpm. Each well was incubated overnight in 100 µL QE buffer with 1% Ready-Lyse Lysozyme (Epicentre) on a tabletop shaker at 400 rpm. DNA extract was collected by spinning the plates for 5 min at 3000 rpm and obtaining the flow through.


2.4.5 Library Prep, Sequencing, and Quality Control

16S rRNA gene amplicon libraries (V4 hypervariable region, U515-E786) were prepared according to the method described by Illumina 16S metagenomic library preparation with some slight modifications (first PCR clean-up was done by using ExoSAP-IT express PCR clean up reagent, Thermo Fisher). Samples were sequenced on an Illumina MiSeq (PE 250+250) at the BioMicro Center (Massachusetts Institute of Technology, Cambridge, MA). Reads were processed using a custom pipeline where cutadapt was used for primer trimming, QIIME 1.9 (Caporaso et al., 2010) was used for demultiplexing, and DADA2 (Callahan et al., 2016) was used to infer amplicon sequence variants (ASVs). Default settings were used except forward reads were truncated to 200 base pairs, and reverse reads were truncated to 175 base pairs before merging. Communities with fewer than 2000 reads were removed. ASVs that had a relative abundance of more than 2% in more than 20% of the blank samples were considered contaminants and also removed. Taxonomy for the sequence variants was assigned using the RDP database (Cole et al., 2014). 16S copy number correction was performed with microbiome helper (Comeau et al., 2017): sequence variants were combined with the Greengenes database v13.5 (DeSantis et al., 2006) to build a new reference tree using FastTree (Price et al., 2010), and assigned copy numbers using PICRUSt (Langille et al., 2013).
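The two quality-control filters above (minimum read depth and blank-based contaminant removal) can be sketched in plain Python. This is not the pipeline used in the study; the data layout (a dictionary of sample to ASV read counts), the function names, and the choice to apply the depth cutoff to raw totals before contaminant removal are assumptions made for illustration.

```python
def relative_abundance(sample):
    """Convert a dict of ASV -> read count into ASV -> fraction of reads."""
    total = sum(sample.values())
    return {asv: n / total for asv, n in sample.items()} if total else {}

def find_contaminants(blanks, abund=0.02, prevalence=0.20):
    """ASVs above `abund` relative abundance in more than `prevalence` of blanks."""
    if not blanks:
        return set()
    hits = {}
    for sample in blanks:
        for asv, frac in relative_abundance(sample).items():
            if frac > abund:
                hits[asv] = hits.get(asv, 0) + 1
    return {asv for asv, n in hits.items() if n / len(blanks) > prevalence}

def filter_communities(samples, blank_ids, min_reads=2000):
    """Drop blanks and shallow samples, then remove contaminant ASVs."""
    contaminants = find_contaminants([samples[b] for b in blank_ids])
    kept = {}
    for sample_id, sample in samples.items():
        if sample_id in blank_ids or sum(sample.values()) < min_reads:
            continue
        kept[sample_id] = {a: n for a, n in sample.items() if a not in contaminants}
    return kept, contaminants

# Hypothetical read counts.
samples = {
    "blank1": {"ASV_x": 50, "ASV_1": 1},
    "blank2": {"ASV_x": 10},
    "c1": {"ASV_1": 3000, "ASV_x": 100},
    "c2": {"ASV_1": 500},
}
kept, contaminants = filter_communities(samples, blank_ids={"blank1", "blank2"})
print(kept, contaminants)
```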

2.4.6 Calculation of $RTF_C$

For $RTF_C$ calculation, the monoculture functions measured from the second stage dilution-to-extinction were used as $F_i$. $F_{C,i}$ was calculated from $F_C R_{C,i}$, where $F_C$ is the total community function and $R_{C,i}$ is the relative abundance of taxon $i$ in community $C$. This is only completely accurate when the community function of study is cell density, since different taxa may have different function to cell ratios. Thus, for community functions other than cell density,

$$F_{C,i} = F_C R_{C,i} \, \frac{a_{c,i}}{a_{c,i(cell)}} \, \frac{\overline{a_{c,i(cell)}}}{\overline{a_{c,i}}}$$

$F_C R_{C,i}$ can be used as an approximation for $F_{C,i}$ as long as the adjustment factor $\frac{a_{c,i}}{a_{c,i(cell)}} \frac{\overline{a_{c,i(cell)}}}{\overline{a_{c,i}}}$ is close to 1. Under the null model, the adjustment factor becomes $\frac{a_i}{a_{i(cell)}} \left( \frac{\overline{a_{i(cell)}}}{\overline{a_i}} \right)_C$ for each taxon in a community, which we find to be within the range of 0.5-2 for the majority of our taxa in communities (Fig. S8). The overall effect of these adjustment factors should be even closer to one when they are further averaged across different taxa in a community for calculating $RTF_C$. Thus, for simplicity, the adjustment factor was assumed to be 1 for all taxa in communities, i.e. the protein/cell ratio and CO2/cell ratio among taxa in a community were equal.

Other estimations and assumptions used to calculate $RTF_C$ included: i) The criterion for determining if a sample was a monoculture was that 90% of the reads in the sample belonged to one sequence variant; the functions of the community were directly used as the monoculture functions. ii) The criterion for a community to be “constitutable” was that 85% of the reads were covered by sequence variants for which we had a monoculture function. The constitutable parts of the communities were re-normalized so that the relative abundances of the sequence variants added up to 100%. iii) CO2 production from 0-40h was used for $RTF_C$ calculations for CO2, since community composition measurements taken at early stationary phase were all around 40h.
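The criteria above can be combined into a short sketch of the calculation. It takes $RTF_C$ as the sum over taxa of $F_{C,i}/F_i$ (the summed relative function described in the Figure 3 caption), with $F_{C,i} = F_C R_{C,i}$ and the adjustment factor set to 1. The function names and example numbers are hypothetical, and "at least" is assumed for both percentage cutoffs.

```python
def is_monoculture(reads, cutoff=0.90):
    """True if one sequence variant holds at least `cutoff` of the reads."""
    total = sum(reads.values())
    return total > 0 and max(reads.values()) / total >= cutoff

def rtf(community_function, rel_abund, mono_function, coverage_cutoff=0.85):
    """Relative total function of a community, or None if not constitutable.

    community_function: F_C, the total function of the community.
    rel_abund: taxon -> relative abundance R_{C,i} (fractions summing to 1).
    mono_function: taxon -> monoculture function F_i for isolated taxa.
    """
    known = {t: r for t, r in rel_abund.items() if t in mono_function}
    coverage = sum(known.values())
    if coverage < coverage_cutoff:
        return None  # too few reads covered by taxa with monoculture data
    # Re-normalize covered taxa to 100%, then sum F_C * R_{C,i} / F_i.
    return sum(community_function * (r / coverage) / mono_function[t]
               for t, r in known.items())

# Hypothetical community: taxon C has no monoculture measurement.
example_rtf = rtf(180.0, {"A": 0.5, "B": 0.4, "C": 0.1}, {"A": 100.0, "B": 200.0})
print(example_rtf)
```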

2.4.7 Estimation of relative CUE

Since $ratio_{RTF_c} = \frac{RTF_{C(CO_2)}}{RTF_{C(Protein)}} = \frac{\overline{\left(\frac{a_{c,i}}{a_i}\right)}_{(CO_2)}}{\overline{\left(\frac{a_{c,i}}{a_i}\right)}_{(Protein)}}$, and $\frac{a_{c,i}}{a_i}$ is a ratio, we can alter the units of $a_i$ so that $a_{i(Protein)}$ and $a_{i(CO_2)}$ are expressed in the same units, i.e., number of carbons in protein/number of carbons taken up. Now $a_{i(Protein)}$ is equivalent to CUE, and $a_{i(CO_2)}$ is equivalent to 1-CUE. By definition, $a_{C,i(Protein)}$ is relative-to-expected CUE (denoted as $R$) times CUE, so that $a_{C,i(CO_2)}$ is $1 - R \cdot CUE$. Thus, by treating the whole community as behaving like an “average taxon”, $\frac{RTF_{C(CO_2)}}{RTF_{C(Protein)}} = \frac{(1 - R \cdot CUE)/(1 - CUE)}{(R \cdot CUE)/CUE}$. This gives

$$R = \frac{1}{ratio_{RTF_c}(1 - CUE) + CUE}$$

Estimation of relative CUE depending on the ratio between $RTF_{C(CO_2)}$ and $RTF_{C(Protein)}$ was performed according to the equation above and by setting up the expected CUE in intervals of 0.01, ranging from 0-0.6.
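The estimate above can be evaluated over the stated grid of expected CUE values with a few lines of Python. This is a minimal sketch of the formula for $R$; the function name and the example ratio are hypothetical.

```python
def relative_cue(ratio_rtf: float, cue: float) -> float:
    """R = 1 / (ratio * (1 - CUE) + CUE), with ratio = RTF_C(CO2) / RTF_C(Protein)."""
    return 1.0 / (ratio_rtf * (1.0 - cue) + cue)

# Expected CUE from 0 to 0.6 in steps of 0.01, as described in the text.
cue_grid = [i / 100 for i in range(0, 61)]

example_ratio = 2.0  # hypothetical: CO2 is twice as favoured as protein
estimates = [relative_cue(example_ratio, cue) for cue in cue_grid]
print(min(estimates), max(estimates))
```

A ratio of 1 returns $R = 1$ for any expected CUE, and ratios above 1 (relatively more CO2 than protein) return $R < 1$, i.e. a lower-than-expected CUE.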

2.4.8 Curve fitting and diversity calculations

All curve fitting was performed using the nls function in R (2017), and the 95% confidence intervals of the curve fits were computed using uncertainty propagation by first-/second-order Taylor expansion and Monte Carlo simulation including covariances using the package “propagate” (Spiess, 2018). A linear, log-linear, and hyperbolic least squares fit was performed for each dataset, and the model with the least AIC or an AIC comparable to the least AIC was selected. All sum of squares calculations were type II, and performed using the function “Anova” in the R package “car” (Fox and Weisberg, 2011).

Species richness was determined as the number of ASVs in each community, and was compared against a rarefied species richness, using the R package “vegan” (Oksanen et al., 2013), and an estimated species richness, using the R package “breakaway” (Willis and Bunge, 2015). MPD and MNTD of each community were calculated by first performing a multiple alignment using the R package “DECIPHER” (Wright, 2016), then constructing a GTR+G+I (generalized time-reversible with Gamma rate variation) maximum likelihood tree with the R package “phangorn” (Schliep, 2011), and calculating the actual values using the R package “picante” (Kembel et al., 2010).
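The AIC-based choice among linear, log-linear, and hyperbolic fits described in this section can be illustrated with a small stand-alone Python sketch. It is not the R nls workflow used in the study: the hyperbolic form $y = ax/(b + x)$ is an assumed parameterization, the fit uses a crude grid scan over $b$ instead of nls, and the sketch simply picks the lowest AIC without the "comparable AIC" judgment mentioned in the text.

```python
import math

def ols(u, y):
    """Least-squares fit of y = a + b*u; returns (a, b, rss)."""
    n = len(u)
    mu, my = sum(u) / n, sum(y) / n
    sxx = sum((ui - mu) ** 2 for ui in u)
    b = sum((ui - mu) * (yi - my) for ui, yi in zip(u, y)) / sxx
    a = my - b * mu
    rss = sum((yi - a - b * ui) ** 2 for ui, yi in zip(u, y))
    return a, b, rss

def fit_hyperbolic(x, y, grid=None):
    """Fit y = a*x/(b + x) by scanning b and solving a in closed form."""
    grid = grid or [10 ** (k / 20) for k in range(-40, 81)]  # b from 0.01 to 1e4
    best = None
    for b in grid:
        g = [xi / (b + xi) for xi in x]
        a = sum(gi * yi for gi, yi in zip(g, y)) / sum(gi * gi for gi in g)
        rss = sum((yi - a * gi) ** 2 for gi, yi in zip(g, y))
        if best is None or rss < best[2]:
            best = (a, b, rss)
    return best

def aic(rss, n, k):
    """Least-squares AIC up to an additive constant shared by all models."""
    return n * math.log(rss / n) + 2 * k

def select_model(x, y):
    """Return the name of the lowest-AIC model and all AIC scores."""
    n = len(x)
    fits = {
        "linear": ols(x, y),
        "log-linear": ols([math.log(xi) for xi in x], y),  # requires x > 0
        "hyperbolic": fit_hyperbolic(x, y),
    }
    scores = {name: aic(max(fit[2], 1e-300), n, 2) for name, fit in fits.items()}
    return min(scores, key=scores.get), scores

# Hypothetical saturating data generated from y = 10x/(5 + x).
richness = [1, 2, 4, 8, 16, 32, 64]
function = [10 * r / (5 + r) for r in richness]
best_model, aic_scores = select_model(richness, function)
print(best_model)
```

Because all three candidate models have two parameters, the AIC ranking here reduces to a ranking by residual sum of squares; the penalty term matters only when models differ in parameter count.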

2.4.9 Data availability

All amplicon sequencing data generated in this study can be accessed on the US National Center for Biotechnology Information SRA database under BioProject PRJNA477654. All community function measurements, ASV tables, and code for data analysis are available at https://github.com/cusoiv.

2.4.10 Acknowledgements

This work was supported by the U.S. Department of Energy (DE-SC0008743) to M.F.P. and E.J.A. X.Y. was partially supported by a Lord Foundation Graduate Fellowship. We are grateful to Christopher Corzett for providing us dried Fucus and for helpful discussions, and Michael Cutler for laboratory assistance. We thank the BioMicroCenter at MIT for their assistance with sequencing, and the Koch Institute Flow Cytometry Core for their assistance on flow cytometry. We thank the Chisholm lab and the Weiss Lab at MIT for allowing us to use their qPCR machines. We also wish to thank Sean Gibbons, Chuliang Song, Serguei Saavedra, Sean Kearney, Fangqiong Ling, Joseph Elsherbini, and Fabiola Miranda for helpful discussions and/or comments on various versions of this manuscript.


2.5 Figures

Figure 1

Figure 1 Schematic for community self-assembly and functioning measurements. In the first stage, a seawater sample is serially diluted and regrown in filtered seawater to generate inocula of different diversities for community growth and function measurements in seaweed-seawater medium (SSM) over a period of 160 hours. In the second stage, communities from the first stage are diluted to extinction to generate near monoculture samples that are subjected to the same growth and function measurements in SSM over a period of 160 hours. Unfilled points/lines represent time points that were taken but were eventually omitted due to changes in physical properties of the cultures.


Figure 2

Figure 2 Diversity strongly impacts all measures of community function. Relationship of different measurements of community production (cell numbers and protein per cell) and respiration (CO2 production) with (a) inoculum taxonomic richness and (b) stationary phase taxonomic richness of communities. (c) Comparison of the hyperbolic fits for CO2 production and cell density scaled to 1 as maximum. Each dot represents one inoculum community; colored bars indicate the standard deviation of the measurement; black/colored lines indicate fits to an appropriate model selected between a linear, log-linear, and hyperbolic least squares fit with gray/colored regions around the line indicating the 95% confidence of the fit.


Figure 3

Figure 3 Graphic illustration of the $RTF_c$ concept. The $RTF_c$ allows determination of interaction effects on community function by summing the relative function of each taxon in the community compared to its monoculture. Functions suitable for this analysis are thus limited to those that can be summed across taxa, such as biomass or CO2 production. For illustration purposes, the function of interest used in this graph is the number of microbes. Community C1 represents a case of the null model ($RTF_c = 1$): taxa 1 and 2 are capable of utilizing the same resources when separately in monoculture (colored regions in circles; circles represent total resources available). When together in a community, the two taxa equally split the pool of resources that they are both capable of utilizing (relative total resource uptake $\sum_i^{N_c} \frac{S_{C,i}}{S_i} = 1$, mean relative per taxa resource uptake $\overline{S_c} = 0.5$). However, the resource to function conversion efficiency of each taxon is not affected by the other (mean relative resource to function conversion efficiency $\overline{a_c} = 1$). As a result, the output function (number of cells) is proportional to the resource allocation between the two taxa. In Community C2 ($RTF_c < 1$), the two taxa 3 and 4 interfere with each other when sharing resources in a community, resulting in a mean relative resource to function conversion efficiency $\overline{a_c} < 1$. Here, although resource uptake is exactly the same as in the null model, the output function is reduced. In Community C3 ($RTF_c > 1$), the two taxa 5 and 6 perfectly complement each other in resource use (relative total resource uptake $\sum_i^{N_c} \frac{S_{C,i}}{S_i} = 2$, mean relative per taxa resource uptake $\overline{S_c} = 1$), and do not affect each other’s resource conversion efficiencies ($\overline{a_c} = 1$). See also Figure S4 for an extension case where complementation and competition cancel out to result in an $RTF_c$ of 1 but should be distinguished from the null model.


Figure 4

Figure 4 Relative community-to-monoculture functions indicate differential increases in niche complementation and competition with diversity. Relationship between diversity and (a) relative total biomass production, respiration, and resource uptake ($RTF_C$ and $\sum_i^{N_C} \frac{S_{C,i}}{S_i}$), (b) relative per taxa biomass production, respiration, and resource uptake ($RMF_C$ and $\overline{S_c}$), and (c) estimated relative carbon use efficiency (CUE). In (a), (b) each point represents one community at stationary phase (biological replicates were not averaged). In (c), each point represents a combination of one relative total function ($RTF_C$) ratio with a random value of CUE drawn from 0-0.6. The black line represents the mean of all combinations with the same taxonomic richness. Colored regions around the points indicate the standard deviation of the interaction effect, and the grey areas indicate the range to which relative total and per taxa resource uptake is limited at each taxonomic richness. The y axes are in log scale for better resolution of points (see Fig. S6 for the same data in linear scale). (d) A hypothetical model for how the effects of niche complementation and competition on community function scale with taxonomic richness. Numbers on the y-axis are arbitrary.


Figure 5

Figure 5 More diverse communities reach stationary phase earlier. Distributions of time for communities to reach peak cell density in communities with (a) at most 26 taxa ($N_c \le 26$), and (b) more than 26 taxa ($N_c > 26$).


2.6 Supplementary Figures

Figure S1

Figure S1 Estimated taxonomic richness is almost identical to ASV counts. Relationship between the number of ASVs counted directly from sequencing data and the number estimated after accounting for uncounted ASVs due to finite sequencing depth for all sequenced communities. Only communities that fit the estimation criterion (at least 6 different read frequencies) are shown. Grey bars, standard error of the estimate; broken line, estimated taxonomic richness=counted number of ASVs.


Figure S2

Figure S2 Effects of dilution-regrowth of microbial communities in pasteurized seawater. (a) Taxonomic richness and cell density of all inoculum communities. (b) Relationship of dilution factor and taxonomic richness. (c) Bray-Curtis dissimilarity between any two communities with the same dilution factor. Each red dot represents one community; the black lines in box plots represent the median; whiskers represent the 95% confidence interval of the median.


Figure S3

Figure S3 Relationship between inoculum taxonomic richness and stationary taxonomic richness. Taxonomic richness displayed as (a) direct ASV counts and (b) estimated taxonomic richness. Each black dot represents one inoculum community, and error bars are standard deviations over three biological replicates derived from the same inoculum community. Error bars in b) include the propagated standard error of the taxonomic richness estimates.


Figure S4

Figure S4 Extended graphical illustration of the RTF concept. Illustration of a case where $RTF_c$ equals 1 but is caused by interactions having positive effects on the relative total resource uptake ($\sum_i^{N_c} \frac{S_{C,i}}{S_i}$) and negative effects on the mean relative resource to function conversion efficiency ($\overline{a_c}$). The purpose of this illustration is to emphasize there is a difference between interactions having no effects on community function (only requires $RTF_c$ to equal 1), and no interactions having effects on community function (requires $RTF_c$, $\sum_i^{N_c} \frac{S_{C,i}}{S_i}$, and $\overline{a_c}$ to all be 1). Our null model corresponds to the latter case.


Figure S5

Figure S5 Constitutable communities under different community coverage criteria. Relaxing the criterion (95%, 90%, 85%, 80% of reads covered by taxa with known monoculture functions) for constitutable communities increases their diversity coverage. The final criterion selected for constitutable communities was 85% coverage since it was the highest standard that allowed sufficient coverage of diversity. Colored dots, communities with production and respiration measurements that are constitutable; grey dots, communities with production and respiration measurements that are not constitutable.


Figure S6

Figure S6 Relative community-to-monoculture functions indicate differential increases in niche complementation and competition with diversity (linear scale). Relationship between diversity and (a) relative total community biomass production, respiration, and resource uptake, (b) relative per taxa biomass production, respiration, and resource uptake, and (c) estimated relative carbon use efficiency (CUE). In (a), (b) each point represents one community at stationary phase (biological replicates are not averaged). In (c), each point represents a combination of one relative total function ($RTF_C$) ratio with a random value of CUE drawn from 0-0.6. The black line represents the mean of all combinations with the same taxonomic richness. Colored regions around the points indicate the standard deviation of the interaction effect, and the grey areas indicate the range to which relative total community and per taxa resource uptake is limited at each taxonomic richness. (d) A hypothetical model for how the effects of niche complementation and competition on community function scale with taxonomic richness. Numbers on the y-axis are arbitrary.


Figure S7

Figure S7 Relative function of all isolate taxa in communities. The (a) relative cell count, (b) relative protein production, and (c) relative CO2 production for all the isolate taxa in communities compared to their monocultures. Each colored line with points represents one taxon. Dotted lines represent relative functions of 1.


Figure S8

Figure S8 Adjustment factors for all taxa in communities. The adjustment factor for (a) protein measurements and (b) CO2 production (0-40h) for all taxa of non-zero abundance in communities. Dotted black lines represent the range 0.5-2.


Figure S9

Figure S9 Additive partitioning of diversity-function relationships with the Loreau and Hector model. Relationship between diversity as taxonomic richness at stationary phase and community function as (a) Maximum cell density the communities reached (b) Maximum measured protein production (c) total CO2 production in 0-40h. In (a), (b), and (c), the top panels labeled “All Observed” show all observed data points for relationships between stationary taxonomic richness and different measurements of community function. “Observed”, observed taxonomic richness and community function relationships for all constitutable communities. “Expected”, expected taxonomic richness and community function relationships under the null model. “Observed-Expected”, relationship between taxonomic richness and deviance between observed function and expected function of communities (NBE). “Observed-Expected” can further be broken down as the sum of “Selection Effect” and “Complementarity Effect”. “Selection Effect”, relationship between taxonomic richness and the selection effect on community function. “Complementarity effect”, relationship between taxonomic richness and the complementarity effect on productivity. Each dot represents one community; black lines indicate fits to an appropriate model between a linear, log-linear, and hyperbolic least squares fit, with gray regions around the line indicating the 95% confidence of the fit.


Figure S10

Figure S10 FACS histograms and calibrations. The 2D FSC-H, SSC-H density plots as well as fluorescence histograms for several bacterial isolates, communities of different diversity and controls. (a) a Vibrionaceae strain at mid-log phase stained with DAPI and SYPRO red and 7 µm beads, (b) an unstained Vibrionaceae strain at mid-log phase and 7 µm beads, (c) 7 µm beads only, (d) a high diversity community at mid-log phase stained with DAPI and SYPRO red, and (e) a low diversity community at mid-log phase stained with DAPI and SYPRO red. (f) Comparison between FACS and CFU counts for the strain of Vibrionaceae from (a). Each black dot represents the average FACS counts/CFUs for three replicates, and error bars represent standard deviations of the measurement. The dark cyan line represents the fitted line that describes the relationship between FACS measurements and CFU counts. The grey shaded area represents all the FACS measurements (jittered for better visualization) in our experiment. The 2D FSC-H, SSC-H density plots for cultures of (g) , (h) Halomonadaceae, (i) , (j) , (k) , (l) Oceanospirillaceae.

Figure S11 Standard curve for CO2 measurements with the MicroResp colorimetric assay. Each black dot represents the average O.D. of four replicates at the same CO2 concentration, and error bars represent the standard deviation of the measurement. The dark cyan line represents the fitted standard curve, %CO2 = 0.1648/(Δ572 - 0.2457) - 0.2301, and the grey shaded area represents all the raw O.D. measurements (jittered for better visualization) for CO2 in our experiment.
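As a worked illustration, the fitted standard curve in this caption can be applied to a single reading as follows. The function name and the example reading are hypothetical; the coefficients are those given in the caption, and the curve should only be trusted within the range of CO2 concentrations covered by the standards.

```python
def percent_co2(delta_572):
    """Convert a Δ572 reading to %CO2 with the fitted curve from Figure S11."""
    denominator = delta_572 - 0.2457
    if denominator == 0:
        # The fitted hyperbola is undefined at its vertical asymptote
        raise ValueError("reading lies on the asymptote of the standard curve")
    return 0.1648 / denominator - 0.2301


# Hypothetical reading, for illustration only
print(round(percent_co2(0.5), 3))
```

Because the curve is hyperbolic, readings close to 0.2457 translate into very large changes in estimated %CO2, so small measurement errors matter most near that value.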

2.7 Supplementary Tables

Table S1 List of isolates

ASV Kingdom Phylum Class Order Family Genus Species

ASV_13 Bacteria Bacteroidetes Flavobacteriia Flavobacteriales Flavobacteriaceae Algibacter NA

ASV_71 Bacteria Bacteroidetes Flavobacteriia Flavobacteriales Flavobacteriaceae Algibacter NA

ASV_33 Bacteria Bacteroidetes Flavobacteriia Flavobacteriales Flavobacteriaceae Maribacter NA

ASV_18 Bacteria Bacteroidetes Flavobacteriia Flavobacteriales Flavobacteriaceae Polaribacter NA

ASV_40 Bacteria Bacteroidetes Flavobacteriia Flavobacteriales Flavobacteriaceae Polaribacter NA

ASV_30 Bacteria Bacteroidetes Flavobacteriia Flavobacteriales Flavobacteriaceae Wenyingzhuangia NA

ASV_55 Bacteria Bacteroidetes Flavobacteriia Flavobacteriales Flavobacteriaceae Zobellia russellii

ASV_28 Bacteria Proteobacteria Alphaproteobacteria Rhodobacterales Rhodobacteraceae Celeribacter NA

ASV_64 Bacteria Proteobacteria Alphaproteobacteria Rhodobacterales Rhodobacteraceae Labrenzia NA

ASV_29 Bacteria Proteobacteria Alphaproteobacteria Rhodobacterales Rhodobacteraceae Litoreibacter NA

ASV_26 Bacteria Proteobacteria Alphaproteobacteria Rhodobacterales Rhodobacteraceae Loktanella NA

ASV_8 Bacteria Proteobacteria Alphaproteobacteria Rhodobacterales Rhodobacteraceae Loktanella pontiacus

ASV_20 Bacteria Proteobacteria Alphaproteobacteria Rhodobacterales Rhodobacteraceae NA NA

ASV_23 Bacteria Proteobacteria Alphaproteobacteria Rhodobacterales Rhodobacteraceae NA NA

ASV_38 Bacteria Proteobacteria Alphaproteobacteria Rhodobacterales Rhodobacteraceae NA NA

ASV_42 Bacteria Proteobacteria Alphaproteobacteria Rhodobacterales Rhodobacteraceae NA NA

ASV_45 Bacteria Proteobacteria Alphaproteobacteria Rhodobacterales Rhodobacteraceae NA NA

ASV_72 Bacteria Proteobacteria Alphaproteobacteria Rhodobacterales Rhodobacteraceae NA NA

ASV_88 Bacteria Proteobacteria Alphaproteobacteria Rhodobacterales Rhodobacteraceae NA NA

ASV_89 Bacteria Proteobacteria Alphaproteobacteria Rhodobacterales Rhodobacteraceae NA NA

ASV_52 Bacteria Proteobacteria Alphaproteobacteria Rhodobacterales Rhodobacteraceae Oceanicola NA

ASV_43 Bacteria Proteobacteria Alphaproteobacteria Rhodobacterales Rhodobacteraceae Roseovarius nubinhibens

ASV_16 Bacteria Proteobacteria Alphaproteobacteria Rhodobacterales Rhodobacteraceae Sulfitobacter arcticus

ASV_12 Bacteria Proteobacteria Alphaproteobacteria Rhodobacterales Rhodobacteraceae Sulfitobacter dubius

ASV_41 Bacteria Proteobacteria Alphaproteobacteria Rhodobacterales Rhodobacteraceae Sulfitobacter NA

ASV_9 Bacteria Proteobacteria Alphaproteobacteria Rhodobacterales Rhodobacteraceae Sulfitobacter NA

ASV_44 Bacteria Proteobacteria Alphaproteobacteria Rhodobacterales Rhodobacteraceae Thalassobius NA

ASV_31 Bacteria Proteobacteria Alphaproteobacteria Rhodospirillales Rhodospirillaceae Thalassospira NA

ASV_3 Bacteria Proteobacteria Gammaproteobacteria Alteromonadales Alteromonadaceae Alteromonas NA

ASV_34 Bacteria Proteobacteria Gammaproteobacteria Alteromonadales Alteromonadaceae Alteromonas NA

ASV_17 Bacteria Proteobacteria Gammaproteobacteria Alteromonadales Alteromonadaceae Paraglaciecola NA

ASV_6 Bacteria Proteobacteria Gammaproteobacteria Alteromonadales Pseudoalteromonadaceae Pseudoalteromonas NA

ASV_7 Bacteria Proteobacteria Gammaproteobacteria Alteromonadales Pseudoalteromonadaceae Pseudoalteromonas NA

ASV_177 Bacteria Proteobacteria Gammaproteobacteria Oceanospirillales Halomonadaceae Cobetia NA

ASV_14 Bacteria Proteobacteria Gammaproteobacteria Oceanospirillales Oceanospirillaceae Marinomonas NA

ASV_19 Bacteria Proteobacteria Gammaproteobacteria Oceanospirillales Oceanospirillaceae Marinomonas NA

ASV_21 Bacteria Proteobacteria Gammaproteobacteria Vibrionales Vibrionaceae Vibrio NA

Table S2 Comparison between different diversity metrics and models for explaining community function measurements

Diversity Metric  Model       Measurement   AIC       R2
SR                Hyperbolic  Cell Density  3971.87   0.38
SR                Linear      Cell Density  3974.00   0.36
SR                Log Linear  Cell Density  3970.11   0.39
MPD               Hyperbolic  Cell Density  4006.16   0.16
MPD               Linear      Cell Density  3999.64   0.21
MPD               Log Linear  Cell Density  3997.31   0.22
MNTD              Hyperbolic  Cell Density  4024.37   0.02
MNTD              Linear      Cell Density  4012.92   0.11
MNTD              Log Linear  Cell Density  4021.29   0.04
SR                Hyperbolic  Protein/Cell  -2307.11  0.05
SR                Linear      Protein/Cell  -2321.94  0.16
SR                Log Linear  Protein/Cell  -2317.03  0.13
MPD               Hyperbolic  Protein/Cell  -2305.45  0.04
MPD               Linear      Protein/Cell  -2313.57  0.10
MPD               Log Linear  Protein/Cell  -2311.63  0.09
MNTD              Hyperbolic  Protein/Cell  -2303.58  0.02
MNTD              Linear      Protein/Cell  -2303.32  0.02
MNTD              Log Linear  Protein/Cell  -2301.26  0.00
SR                Hyperbolic  CO2           294.04    0.57
SR                Linear      CO2           302.05    0.54
SR                Log Linear  CO2           289.55    0.59
MPD               Hyperbolic  CO2           367.35    0.19
MPD               Linear      CO2           340.40    0.36
MPD               Log Linear  CO2           347.33    0.32
MNTD              Hyperbolic  CO2           389.42    0.00
MNTD              Linear      CO2           354.66    0.27
MNTD              Log Linear  CO2           368.74    0.18

SR = species richness at stationary phase
MPD = abundance-weighted mean pairwise distance
MNTD = abundance-weighted mean nearest taxon distance
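One way to read Table S2 is to pick, for each measurement, the diversity metric and model with the lowest AIC. The sketch below does this with the AIC values copied from the table; the variable and function names are hypothetical, and the snippet only ranks the reported values without refitting any model.

```python
# (diversity metric, model, measurement, AIC), copied from Table S2
TABLE_S2 = [
    ("SR", "Hyperbolic", "Cell Density", 3971.87),
    ("SR", "Linear", "Cell Density", 3974.00),
    ("SR", "Log Linear", "Cell Density", 3970.11),
    ("MPD", "Hyperbolic", "Cell Density", 4006.16),
    ("MPD", "Linear", "Cell Density", 3999.64),
    ("MPD", "Log Linear", "Cell Density", 3997.31),
    ("MNTD", "Hyperbolic", "Cell Density", 4024.37),
    ("MNTD", "Linear", "Cell Density", 4012.92),
    ("MNTD", "Log Linear", "Cell Density", 4021.29),
    ("SR", "Hyperbolic", "Protein/Cell", -2307.11),
    ("SR", "Linear", "Protein/Cell", -2321.94),
    ("SR", "Log Linear", "Protein/Cell", -2317.03),
    ("MPD", "Hyperbolic", "Protein/Cell", -2305.45),
    ("MPD", "Linear", "Protein/Cell", -2313.57),
    ("MPD", "Log Linear", "Protein/Cell", -2311.63),
    ("MNTD", "Hyperbolic", "Protein/Cell", -2303.58),
    ("MNTD", "Linear", "Protein/Cell", -2303.32),
    ("MNTD", "Log Linear", "Protein/Cell", -2301.26),
    ("SR", "Hyperbolic", "CO2", 294.04),
    ("SR", "Linear", "CO2", 302.05),
    ("SR", "Log Linear", "CO2", 289.55),
    ("MPD", "Hyperbolic", "CO2", 367.35),
    ("MPD", "Linear", "CO2", 340.40),
    ("MPD", "Log Linear", "CO2", 347.33),
    ("MNTD", "Hyperbolic", "CO2", 389.42),
    ("MNTD", "Linear", "CO2", 354.66),
    ("MNTD", "Log Linear", "CO2", 368.74),
]


def best_by_aic(rows):
    """Return {measurement: (metric, model, AIC)} for the lowest-AIC row."""
    best = {}
    for metric, model, measurement, aic in rows:
        if measurement not in best or aic < best[measurement][2]:
            best[measurement] = (metric, model, aic)
    return best


for measurement, (metric, model, aic) in best_by_aic(TABLE_S2).items():
    print(f"{measurement}: {metric}, {model} (AIC = {aic})")
```

By this criterion, SR gives the lowest AIC for all three measurements, with the log-linear model for cell density and CO2 and the linear model for protein per cell. Some differences are small (for example, 1.76 AIC units between the log-linear and hyperbolic SR models for cell density), so the ranking among model shapes is not equally strong in every case.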

Table S3 Taxa associated with community function in all taxonomic richness windows

ASV number RDP Classification Kendall's Tau p-value Taxonomic richness window

Cell Density

ASV_6 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Pseudoalteromonadaceae;Pseudoalteromonas;NA -0.49 3.99E-02 3-5

ASV_6 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Pseudoalteromonadaceae;Pseudoalteromonas;NA -0.35 4.13E-02 9-17

ASV_6 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Pseudoalteromonadaceae;Pseudoalteromonas;NA -0.37 2.36E-02 9-18

ASV_6 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Pseudoalteromonadaceae;Pseudoalteromonas;NA -0.37 5.00E-02 10-18

ASV_6 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Pseudoalteromonadaceae;Pseudoalteromonas;NA -0.45 1.43E-03 10-19

ASV_6 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Pseudoalteromonadaceae;Pseudoalteromonas;NA -0.47 3.30E-03 11-19

ASV_6 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Pseudoalteromonadaceae;Pseudoalteromonas;NA -0.45 2.88E-03 11-20

ASV_6 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Pseudoalteromonadaceae;Pseudoalteromonas;NA -0.50 5.87E-03 12-19

ASV_6 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Pseudoalteromonadaceae;Pseudoalteromonas;NA -0.46 7.92E-03 12-20

ASV_6 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Pseudoalteromonadaceae;Pseudoalteromonas;NA -0.39 2.85E-02 12-21

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA 0.35 3.15E-02 12-21

ASV_6 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Pseudoalteromonadaceae;Pseudoalteromonas;NA -0.53 5.70E-03 13-19

ASV_6 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Pseudoalteromonadaceae;Pseudoalteromonas;NA -0.49 8.60E-03 13-20

ASV_6 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Pseudoalteromonadaceae;Pseudoalteromonas;NA -0.40 3.14E-02 13-21

ASV_6 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Pseudoalteromonadaceae;Pseudoalteromonas;NA -0.39 1.37E-02 13-22

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA 0.38 1.37E-02 13-22

ASV_6 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Pseudoalteromonadaceae;Pseudoalteromonas;NA -0.58 3.77E-03 14-19

ASV_6 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Pseudoalteromonadaceae;Pseudoalteromonas;NA -0.53 6.62E-03 14-20

ASV_6 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Pseudoalteromonadaceae;Pseudoalteromonas;NA -0.43 3.09E-02 14-21

ASV_6 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Pseudoalteromonadaceae;Pseudoalteromonas;NA -0.40 1.93E-02 14-22

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA 0.38 1.93E-02 14-22

ASV_28 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Celeribacter;NA 0.34 3.63E-02 14-22

ASV_6 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Pseudoalteromonadaceae;Pseudoalteromonas;NA -0.36 4.56E-02 14-23

ASV_17 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Paraglaciecola;NA -0.31 4.56E-02 14-23

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA 0.31 4.56E-02 14-23

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.31 4.77E-02 14-23

ASV_28 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Celeribacter;NA 0.32 4.56E-02 14-23

ASV_6 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Pseudoalteromonadaceae;Pseudoalteromonas;NA -0.60 7.04E-03 15-19

ASV_6 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Pseudoalteromonadaceae;Pseudoalteromonas;NA -0.52 1.67E-02 15-20

ASV_6 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Pseudoalteromonadaceae;Pseudoalteromonas;NA -0.64 1.40E-02 16-19

ASV_6 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Pseudoalteromonadaceae;Pseudoalteromonas;NA -0.52 4.66E-02 16-20

ASV_39 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Aliiglaciecola;NA -0.45 4.04E-02 20-28

ASV_39 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Aliiglaciecola;NA -0.48 1.24E-02 20-29

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.38 4.95E-02 23-30

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.41 4.95E-02 23-30

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.36 4.39E-02 23-31

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.43 1.72E-02 23-31

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.37 2.06E-02 23-33

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.35 2.06E-02 23-33

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.30 4.35E-02 23-33

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.44 2.35E-02 24-31

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.33 4.22E-02 24-33

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.31 4.22E-02 24-33

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.31 4.22E-02 24-33

ASV_73 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.42 3.25E-02 24-33

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.34 2.87E-02 24-34

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.32 2.87E-02 24-34

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.32 2.87E-02 24-34

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.53 1.22E-02 25-30

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.53 4.29E-03 25-31

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.38 2.53E-02 25-34

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.39 1.09E-02 25-35

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.59 1.45E-02 26-30

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.61 2.64E-03 26-31

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.36 3.67E-02 26-33

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.36 3.60E-02 26-33

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.43 2.26E-02 26-33

ASV_73 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.45 3.03E-02 26-33

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.36 3.42E-02 26-34

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.36 3.32E-02 26-34

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.44 1.07E-02 26-34

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.34 3.34E-02 26-35

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.45 3.39E-03 26-35

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.29 4.06E-02 26-36

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.33 1.65E-02 26-36

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.37 1.23E-02 26-36

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.62 7.25E-03 27-31

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.41 2.19E-02 27-33

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.41 2.19E-02 27-33

ASV_73 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.49 2.19E-02 27-33

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.41 1.33E-02 27-34

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.42 1.33E-02 27-34

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.41 1.33E-02 27-34

ASV_73 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.46 1.33E-02 27-34

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.33 4.87E-02 27-35

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.38 2.09E-02 27-35

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.43 1.48E-02 27-35

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.31 3.87E-02 27-36

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.36 2.01E-02 27-36

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.30 3.76E-02 27-37

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.37 5.38E-03 27-37

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.69 8.11E-03 28-31

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.47 2.44E-02 28-33

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.44 2.44E-02 28-33

ASV_73 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.49 2.44E-02 28-33

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.37 4.59E-02 28-34

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.48 1.19E-02 28-34

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.47 1.19E-02 28-34

ASV_73 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.47 2.77E-02 28-34

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.43 1.19E-02 28-35

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.48 8.84E-03 28-35

ASV_73 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.42 3.90E-02 28-35

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.40 8.22E-03 28-36

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.43 7.66E-04 28-37

ASV_36 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.33 3.02E-02 28-37

ASV_73 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.51 7.66E-04 28-37

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.43 5.12E-04 28-38

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.75 8.35E-03 29-31

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.51 4.84E-02 29-33

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.48 2.85E-02 29-34

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.42 3.45E-02 29-34

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.45 2.85E-02 29-34

ASV_73 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.47 3.45E-02 29-34

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.38 4.74E-02 29-35

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.38 4.74E-02 29-35

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.48 1.83E-02 29-35

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.37 1.48E-02 29-36

ASV_12 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Sulfitobacter;dubius 0.33 2.73E-02 29-36

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.37 1.48E-02 29-36

ASV_73 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.47 1.48E-02 29-36

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.33 2.19E-02 29-37

ASV_12 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Sulfitobacter;dubius 0.28 4.07E-02 29-37

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.41 2.36E-03 29-37

ASV_36 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.33 2.96E-02 29-37

ASV_73 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.51 2.28E-03 29-37

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.32 1.83E-02 29-38

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.42 9.86E-04 29-38

ASV_36 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.30 3.69E-02 29-38

ASV_73 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.50 9.86E-04 29-38

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.31 1.85E-02 29-39

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.42 6.40E-04 29-39

ASV_73 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.50 6.40E-04 29-39

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.66 4.18E-03 30-33

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.62 4.18E-03 30-33

ASV_73 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.52 4.95E-02 30-33

ASV_7 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Pseudoalteromonadaceae;Pseudoalteromonas;NA 0.44 4.18E-02 30-34

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.60 4.95E-03 30-34

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.59 4.95E-03 30-34

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.42 4.18E-02 30-34

ASV_73 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.47 4.18E-02 30-34

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.45 3.40E-02 30-35

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.52 1.21E-02 30-35

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.41 1.08E-02 30-36

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.46 5.84E-03 30-36

ASV_73 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.47 1.08E-02 30-36

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.35 1.84E-02 30-37

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.48 6.89E-04 30-37

ASV_73 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.51 1.71E-03 30-37

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.34 1.66E-02 30-38

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.48 2.83E-04 30-38

ASV_73 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.51 1.03E-03 30-38

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.32 1.77E-02 30-39

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.48 2.08E-04 30-39

ASV_73 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.50 5.79E-04 30-39

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.47 6.24E-05 30-40

ASV_73 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.48 5.11E-04 30-40

ASV_3 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Alteromonas;NA -0.69 4.22E-02 31-33

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.64 4.96E-02 31-33

ASV_3 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Alteromonas;NA -0.52 4.72E-02 31-34

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.60 2.63E-02 31-34

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.58 2.63E-02 31-34

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.61 2.63E-02 31-34

ASV_3 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Alteromonas;NA -0.44 1.57E-02 31-36

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.40 1.88E-02 31-36

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.48 1.10E-02 31-36

ASV_73 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.47 1.88E-02 31-36

ASV_3 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Alteromonas;NA -0.36 2.55E-02 31-37

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.32 4.35E-02 31-37

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.52 8.83E-04 31-37

ASV_36 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.33 4.84E-02 31-37

ASV_73 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.51 4.85E-03 31-37

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.51 4.01E-04 31-38

ASV_73 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.51 2.78E-03 31-38

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.51 2.64E-04 31-39

ASV_73 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.51 1.38E-03 31-39

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.49 9.09E-05 31-40

ASV_73 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.47 1.34E-03 31-40

ASV_3 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Alteromonas;NA -0.25 2.89E-02 31-41

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.44 8.13E-05 31-41

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.25 2.89E-02 31-41

ASV_66 Bacteria;Proteobacteria;Deltaproteobacteria;Bdellovibrionales;Bacteriovoracaceae;Halobacteriovorax;NA -0.31 2.89E-02 31-41

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.49 5.57E-03 33-37

ASV_73 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.51 9.69E-03 33-37

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.50 2.00E-03 33-38

ASV_73 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.51 4.85E-03 33-38

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.48 1.68E-03 33-39

ASV_73 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.50 2.95E-03 33-39

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.47 5.82E-04 33-40

ASV_73 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.47 2.46E-03 33-40

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.42 4.77E-04 33-41

ASV_66 Bacteria;Proteobacteria;Deltaproteobacteria;Bdellovibrionales;Bacteriovoracaceae;Halobacteriovorax;NA -0.35 1.66E-02 33-41

ASV_73 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.41 2.84E-03 33-41

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.40 6.07E-04 33-42

ASV_66 Bacteria;Proteobacteria;Deltaproteobacteria;Bdellovibrionales;Bacteriovoracaceae;Halobacteriovorax;NA -0.34 1.47E-02 33-42

ASV_73 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.41 1.51E-03 33-42

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.37 3.29E-03 34-43

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.35 9.64E-03 35-43

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.37 1.07E-03 35-44

ASV_73 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.58 2.50E-02 36-39

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.37 9.43E-03 36-43

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.38 1.03E-03 36-44

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.37 1.34E-03 36-45

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.39 4.97E-02 37-41

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.38 2.44E-02 37-43

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.35 2.55E-02 37-43

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.40 3.86E-03 37-44

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.37 3.76E-03 37-45

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.36 3.74E-03 37-46

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.35 3.60E-02 38-44

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.33 2.95E-02 38-45

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.31 3.48E-02 38-46

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.29 3.71E-02 38-47

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.52 2.95E-02 39-41

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.46 3.33E-02 39-42

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.42 4.74E-02 39-43

ASV_78 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.54 4.50E-02 44-46

Protein

ASV_9 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Sulfitobacter;NA 0.40 4.92E-02 11-17

ASV_9 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Sulfitobacter;NA 0.40 4.93E-02 11-18

ASV_9 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Sulfitobacter;NA 0.37 2.31E-02 11-20

ASV_39 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Aliiglaciecola;NA -0.45 4.33E-02 18-25

ASV_39 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Aliiglaciecola;NA -0.44 4.09E-02 18-26

ASV_39 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Aliiglaciecola;NA -0.42 4.25E-02 18-27

ASV_39 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Aliiglaciecola;NA -0.44 4.78E-02 19-26

ASV_39 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Aliiglaciecola;NA -0.42 4.90E-02 19-27

ASV_39 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Aliiglaciecola;NA -0.40 4.34E-02 19-28

ASV_39 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Aliiglaciecola;NA -0.59 1.27E-02 20-25

ASV_39 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Aliiglaciecola;NA -0.55 1.42E-02 20-26

ASV_39 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Aliiglaciecola;NA -0.53 1.07E-02 20-27

ASV_39 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Aliiglaciecola;NA -0.51 7.89E-03 20-28

ASV_13 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Algibacter;NA 0.37 3.61E-02 20-29

ASV_39 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Aliiglaciecola;NA -0.52 3.60E-03 20-29

ASV_12 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Sulfitobacter;dubius 0.43 4.10E-02 23-31

ASV_12 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Sulfitobacter;dubius 0.42 9.93E-03 23-33

ASV_12 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Sulfitobacter;dubius 0.41 1.91E-02 24-33

ASV_12 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Sulfitobacter;dubius 0.37 3.30E-02 24-34

ASV_12 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Sulfitobacter;dubius 0.43 1.68E-02 25-33

ASV_12 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Sulfitobacter;dubius 0.39 3.02E-02 25-34

ASV_12 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Sulfitobacter;dubius 0.43 5.55E-03 25-35

ASV_12 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Sulfitobacter;dubius 0.40 2.01E-02 26-35

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.31 3.23E-02 33-41

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.28 4.99E-02 33-42

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.26 4.99E-02 33-42

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.29 1.89E-02 36-44

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.29 1.89E-02 36-44

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.30 1.93E-02 36-45

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.25 4.70E-02 36-45

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.34 2.54E-02 37-44

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.31 3.04E-02 37-44

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.34 1.07E-02 37-45

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.31 2.16E-02 37-46

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.26 4.38E-02 37-46

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.32 4.18E-02 38-45

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.31 4.05E-02 39-44

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.31 4.05E-02 39-44

ASV_30 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Wenyingzhuangia;NA 0.37 3.34E-02 39-44

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.32 2.78E-02 39-45

ASV_30 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Wenyingzhuangia;NA 0.38 1.20E-02 39-45

ASV_7 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Pseudoalteromonadaceae;Pseudoalteromonas;NA 0.45 4.01E-02 40-43

ASV_30 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Wenyingzhuangia;NA 0.37 4.21E-02 40-44

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.34 2.19E-02 40-45

ASV_30 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Wenyingzhuangia;NA 0.38 1.60E-02 40-45

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.31 1.11E-02 40-49

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.29 1.11E-02 40-49

ASV_7 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Pseudoalteromonadaceae;Pseudoalteromonas;NA 0.43 2.43E-02 41-44

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.41 2.43E-02 41-44

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.41 1.50E-02 41-45

ASV_30 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Wenyingzhuangia;NA 0.34 4.51E-02 41-45

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.36 2.60E-02 41-46

ASV_30 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Wenyingzhuangia;NA 0.32 3.86E-02 41-46

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.37 1.12E-02 41-47

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.33 2.81E-02 41-48

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.36 2.84E-03 41-49

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.29 1.56E-02 41-49

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.36 2.31E-03 41-50

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.31 8.87E-03 41-50

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.32 1.54E-02 42-49

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.36 1.13E-02 42-49

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.32 1.24E-02 42-50

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.38 5.13E-03 42-50

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.32 1.60E-02 42-51

ASV_7 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Pseudoalteromonadaceae;Pseudoalteromonas;NA 0.64 2.55E-02 43-44

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.52 3.71E-02 43-45

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.37 1.06E-02 43-49

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.36 1.06E-02 43-49

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.37 5.22E-03 43-50

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.37 5.22E-03 43-50

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.31 1.68E-02 43-51

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.32 1.68E-02 43-51

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.31 1.70E-02 43-53

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.32 1.70E-02 43-53

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.33 4.50E-02 44-50

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.38 2.31E-02 46-56

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.52 2.32E-02 47-50

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.43 1.49E-02 47-56

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.43 1.01E-02 47-57

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.40 4.16E-02 48-57

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.43 1.77E-02 48-58

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.43 3.05E-02 49-58

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.42 2.30E-02 49-59

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.78 1.21E-02 53-63

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.67 3.17E-02 53-63

ASV_30 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Wenyingzhuangia;NA -0.61 4.12E-02 53-63

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.64 3.33E-02 53-66

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.60 3.33E-02 53-66

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.56 4.96E-02 53-77

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.60 4.96E-02 53-77

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.67 2.53E-02 54-66

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.67 2.53E-02 54-66

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.69 2.81E-02 54-77

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.67 3.17E-02 55-77

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.72 2.94E-02 55-77

CO2

ASV_3 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Alteromonas;NA 0.36 1.16E-02 2-7

ASV_3 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Alteromonas;NA 0.34 8.29E-03 2-8

ASV_3 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Alteromonas;NA 0.34 5.40E-03 2-9

ASV_3 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Alteromonas;NA 0.30 1.49E-02 2-10

ASV_3 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Alteromonas;NA 0.28 2.14E-02 2-11

ASV_3 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Alteromonas;NA 0.47 3.03E-02 3-5

ASV_3 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Alteromonas;NA 0.45 1.38E-02 3-6

ASV_3 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Alteromonas;NA 0.44 2.75E-03 3-7

ASV_3 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Alteromonas;NA 0.40 2.41E-03 3-8

ASV_3 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Alteromonas;NA 0.39 1.45E-03 3-9

ASV_3 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Alteromonas;NA 0.34 5.36E-03 3-10

ASV_3 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Alteromonas;NA 0.31 9.50E-03 3-11

ASV_3 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Alteromonas;NA 0.30 8.69E-03 3-12

ASV_3 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Alteromonas;NA 0.37 4.47E-02 4-7

ASV_3 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Alteromonas;NA 0.34 3.22E-02 4-8

ASV_3 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Alteromonas;NA 0.34 2.12E-02 4-9

ASV_3 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Alteromonas;NA 0.41 4.90E-02 5-7

ASV_3 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Alteromonas;NA 0.38 2.55E-02 5-8

ASV_3 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Alteromonas;NA 0.38 1.43E-02 5-9

ASV_28 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Celeribacter;NA 0.39 1.97E-02 10-19

ASV_6 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Pseudoalteromonadaceae;Pseudoalteromonas;NA -0.36 3.55E-02 11-19

ASV_28 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Celeribacter;NA 0.39 3.55E-02 11-19

ASV_6 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Pseudoalteromonadaceae;Pseudoalteromonas;NA -0.36 2.33E-02 11-20

ASV_28 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Celeribacter;NA 0.37 2.33E-02 11-20

ASV_6 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Pseudoalteromonadaceae;Pseudoalteromonas;NA -0.47 2.27E-02 13-19

ASV_6 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Pseudoalteromonadaceae;Pseudoalteromonas;NA -0.43 3.16E-02 13-20

ASV_6 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Pseudoalteromonadaceae;Pseudoalteromonas;NA -0.49 3.06E-02 14-19

ASV_6 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Pseudoalteromonadaceae;Pseudoalteromonas;NA -0.51 4.07E-02 15-19

ASV_12 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Sulfitobacter;dubius 0.41 3.93E-02 20-27

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.43 3.93E-02 20-27

ASV_39 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Aliiglaciecola;NA -0.43 3.93E-02 20-27

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.40 3.69E-02 20-28

ASV_39 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Aliiglaciecola;NA -0.42 3.69E-02 20-28

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.34 4.94E-02 20-29

ASV_39 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Aliiglaciecola;NA -0.40 4.94E-02 20-29

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.63 7.86E-03 22-27

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.56 1.05E-02 22-28

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.46 3.49E-02 22-29

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.45 1.66E-02 22-30

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.38 3.74E-02 22-30

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.47 5.07E-03 22-31

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.37 3.25E-02 22-31

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.68 4.34E-02 23-26

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.72 5.36E-03 23-27

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.54 3.71E-02 23-27

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.61 1.07E-02 23-28

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.56 1.07E-02 23-28

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.48 2.13E-02 23-29

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.50 1.97E-02 23-29

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.48 6.71E-03 23-30

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.47 6.71E-03 23-30

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.50 3.80E-03 23-31

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.45 5.65E-03 23-31

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.44 3.44E-03 23-33

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.70 1.65E-02 24-27

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.58 2.67E-02 24-28

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.54 2.67E-02 24-28

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.53 2.34E-02 24-29

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.43 3.16E-02 24-30

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.49 1.57E-02 24-30

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.45 1.01E-02 24-31

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.47 1.01E-02 24-31

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.39 2.26E-02 24-33

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.41 7.61E-03 24-34

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.32 4.31E-02 24-34

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.47 4.18E-02 25-30

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.44 1.89E-02 25-31

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.45 1.89E-02 25-31

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.38 4.33E-02 25-33

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.40 1.36E-02 25-34

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.38 1.73E-02 25-35

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.30 4.93E-02 26-36

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.30 4.93E-02 26-36

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.53 4.02E-02 27-31

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.37 4.58E-02 27-35

ASV_20 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;NA;NA -0.35 4.58E-02 27-35

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.32 4.90E-02 28-36

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.31 4.90E-02 28-36

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.31 4.90E-02 28-38

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.37 2.98E-02 29-36

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.35 2.98E-02 29-36

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.35 3.16E-02 29-37

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.33 3.16E-02 29-37

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.35 2.59E-02 29-38

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.30 3.96E-02 29-38

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.36 1.33E-02 29-39

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.30 3.88E-02 29-39

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.54 4.13E-02 30-33

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.50 4.13E-02 30-33

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.51 3.44E-02 30-34

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.47 3.44E-02 30-34

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.46 3.44E-02 30-34

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.45 3.40E-02 30-35

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.47 3.29E-02 30-35

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.42 3.79E-02 30-35

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.42 1.23E-02 30-36

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.45 8.21E-03 30-36

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.38 1.17E-02 30-37

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.39 1.17E-02 30-37

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.37 1.16E-02 30-38

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.36 1.16E-02 30-38

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.39 8.74E-03 30-39

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.34 1.33E-02 30-39

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.31 1.81E-02 30-40

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.36 6.05E-03 30-40

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.61 4.84E-02 31-34

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.50 3.72E-02 31-35

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA 0.52 3.72E-02 31-35

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.51 5.53E-03 31-36

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.43 1.47E-02 31-37

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.38 2.79E-02 31-38

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.35 2.12E-02 31-39

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.35 2.12E-02 31-39

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.39 5.16E-03 31-40

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.31 1.88E-02 31-41

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.46 4.14E-02 33-36

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.36 2.59E-02 33-40

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.29 4.05E-02 33-42

ASV_66 Bacteria;Proteobacteria;Deltaproteobacteria;Bdellovibrionales;Bacteriovoracaceae;Halobacteriovorax;NA -0.31 4.05E-02 33-42

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.43 3.05E-02 34-39

ASV_8 Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Loktanella;pontiacus 0.58 6.86E-03 36-39

ASV_19 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Oceanospirillaceae;Marinomonas;NA -0.45 4.55E-02 36-39

ASV_3 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Alteromonas;NA -0.48 4.30E-02 40-42

ASV_3 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Alteromonas;NA -0.45 3.32E-02 40-43

ASV_3 Bacteria;Proteobacteria;Gammaproteobacteria;Alteromonadales;Alteromonadaceae;Alteromonas;NA -0.52 4.72E-02 41-43

ASV_18 Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Polaribacter;NA -0.54 3.31E-02 42-44

Table S4 ANOVA of community function

Chapter 3 Prebiotics and community composition influence gas production of the human gut microbiota

Xiaoqian Yu, Thomas Gurry, Le Thanh Tu Nguyen, Hunter S. Richardson, Eric J. Alm

The work in the chapter presented below constitutes a manuscript in preparation for submission.

Abstract

Prebiotics confer benefits to human health often by promoting the growth of gut bacteria that produce metabolites valuable to the human body, such as short chain fatty acids (SCFAs). While prebiotic selection has strongly focused on maximizing the production of SCFAs, less attention has been paid to gases, a byproduct of SCFA production that also has physiological effects on the human body. Here, we investigate how the content and volume of gas produced by human gut microbiota are affected by the chemical composition of the prebiotic and by the composition of the microbiota. We first constructed a linear systems model based on mass and electron balance and compared the theoretical product range of two prebiotics, inulin and pectin. Modeling shows that pectin is more restricted in product space, with less potential for H2 but more potential for CO2 production. An ex vivo experimental system showed that pectin degradation produced significantly less H2 than inulin degradation, but CO2 production fell outside the theoretical product range, suggesting fermentation of fecal debris. Microbial community composition also impacted results. Methane production was dependent on the presence of Methanobacteria, and inter-individual differences in H2 production during inulin degradation were driven by a Lachnospiraceae taxon. Overall, these results suggest that both the chemistry of the prebiotic and the composition of the microbiota are relevant to gas production: metabolic processes that are relatively prevalent in the microbiome, such as H2 production, will depend more on substrate, while rare metabolisms like methanogenesis depend more strongly on microbiome composition.

Author Contributions: X.Y. and E.A. developed the theoretical product range model; L.N., X.Y., and T.G. developed the ex vivo platform and performed experiments with the assistance of H.R.; X.Y., T.G., and E.A. prepared the manuscript.

3.1 Introduction

The gut microbiota plays an important role in human nutrition and health, leading to increasing interest in modulation of the gut microbiome via dietary interventions for improving human health (Cotillard et al., 2013; De Filippis et al., 2018; Tremaroli and Bäckhed, 2012). Compounds that can be selectively metabolized by microbes in the gut, resulting in beneficial effects on the host, are defined as prebiotics (Gibson et al., 2017). While some phenolic compounds and fatty acids are suspected to have prebiotic activities, most known prebiotics are dietary carbohydrates that are neither digested nor absorbed in the human small intestine, and are thus capable of reaching the colon and promoting the growth of select beneficial bacteria (Gibson et al., 2017; Holscher, 2017). These bacteria, in turn, can prevent the colonization of pathogens or produce metabolites that are beneficial for the human body, most notably short chain fatty acids (SCFAs) such as acetate, propionate and butyrate. These SCFAs not only contribute directly to host energy metabolism but also have a number of positive effects on host physiology.

Butyrate is the major energy source for colonocytes and enterocytes (Canani et al., 2011), and can also activate gluconeogenesis and modulate inflammatory responses and cytokine levels via G protein‐coupled receptors or histone deacetylases (Belcheva et al., 2015). Similarly, acetate and propionate are involved in the regulation of host immune or metabolic systems (Belcheva et al., 2015; Shibata et al., 2017). Thus, selection for prebiotics has largely focused on those that allow the proliferation of bacteria that maximize production of SCFAs (Holscher, 2017; Míguez et al., 2016; Verspreet et al., 2016).

SCFA fermentation from carbohydrates by the gut microbiota is often coupled with the production of gases. Production of H2 is often necessary for the cycling of NAD+/NADH during fermentation, and CO2 is released whenever decarboxylation occurs (Müller, 2008). H2 can be further utilized by methanogens and sulfate reducers for the production of CH4 and H2S (Carbonero et al., 2012). Most intestinal gas is absorbed into the bloodstream and removed via the lungs (Cormier, 1990), but it can still have physiological effects on the human body. The volume of gas production can affect the distension of the colonic wall and in turn affect the speed of material transit through the colon (Azpiroz, 2005). Methane production can result in slowed intestinal transit and reduced serotonin levels in the gastrointestinal tract, potentially impacting constipation-predominant irritable bowel syndrome (IBS-C) and chronic constipation (Sahakian et al., 2010). Therefore, gas production may be an important factor to consider in the selection of prebiotics, especially since bloating is a major symptom for many functional gut disorders such as IBS (Seo et al., 2013).

Many prebiotics are already known to impact fermentation products. For example, short-chain fructooligosaccharides (FOS) and inulin are some of the most extensively documented prebiotics because they promote the growth of bifidobacteria and increase SCFA production (Gibson et al., 2017; Holscher, 2017). In addition, the low FODMAP (fermentable oligosaccharides, disaccharides, monosaccharides, and polyols) diet has been shown to improve the symptoms of some IBS patients, because foods containing FOS and inulin can increase luminal distension and gas production (Gibson and Shepherd, 2010; McIntosh et al., 2017). Thus, it may be valuable to identify prebiotics that maximize SCFA production while minimizing gas production, or the production of specific gases. However, few studies that consider the efficacy of prebiotics take gas and SCFA production into account simultaneously, and systematic investigations of the factors that affect gas production in prebiotic fermentation are lacking.

In this study, we investigate whether the chemical composition of the prebiotic and heterogeneity in the composition of gut microbiota can affect the content and volume of gas production during prebiotic fermentation. We compare the fermentation products of two common prebiotics, inulin and pectin, both theoretically via linear systems modeling and experimentally via an ex vivo framework that measures gas and SCFA production of stool microbiota responding to fiber addition. We find that inulin, a more reduced carbohydrate, produces more H2 than pectin, but the amount of H2 production is strongly associated with a Lachnospiraceae amplicon sequencing variant (ASV). Inulin also yielded greater amounts of the more reduced SCFA butyrate and less acetate. Methane production is, however, less affected by the chemical nature of the substrate, being entirely dependent on the level of Methanobacteria in the microbiota. Overall, these results suggest that the production of different gases upon prebiotic fermentation by gut microbiota is differentially affected by the chemical nature of the prebiotic and microbiome composition.

3.2 Results

3.2.1 Modeling community production with mass and electron balance

To explore the general effect of prebiotic chemical composition on fermentation product formation, we established a linear systems model that allowed us to determine the theoretical range of product output considering mass and electron balance. Considering a system of n chemicals as possible inputs and outputs, made up of a total of m chemical elements (counting total valence electrons as an element), we defined a matrix M in which the rows represent different elements and the columns represent different chemicals; the elemental composition of a chemical is thus a column in M. Any reaction that satisfies both mass and electron balance is an n-dimensional vector s whose elements are the stoichiometric coefficients of the chemicals in M and which satisfies Ms=0 (see Figure 1a for a more detailed representation).

By definition, s must be within the null space of M. The feasible product space of the biological system, represented by the elements in s that are coefficients of the possible products, is thus a convex cone defined by the linear combinations of the basis vectors of null(M). Since our model did not account for the thermodynamic constraints on the metabolic fluxes within the system, it represented an upper limit of the feasible product space.
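As a minimal sketch of the balance condition Ms=0, the Python below builds M for a toy system and checks one candidate reaction. The choice of glucose and water as inputs, the small set of products, the valence electron counts (4 for C, 1 for H, 6 for O), and all function names are illustrative assumptions rather than the matrix used in this chapter. Inputs are entered as negative columns so that all coefficients in s are non-negative.

```python
# Toy mass-and-electron balance check for M s = 0.
# Assumptions for illustration only: glucose and water as inputs;
# acetic acid, CO2 and H2 as products; neutral formulas throughout.

VALENCE = {"C": 4, "H": 1, "O": 6}  # assumed valence electrons per atom
ELEMENTS = ("C", "H", "O")


def column(formula, is_input=False):
    """Return the column [C, H, O, valence electrons] for one chemical.

    Input columns are negated so that a reaction vector s can be non-negative.
    """
    counts = [formula.get(e, 0) for e in ELEMENTS]
    electrons = sum(VALENCE[e] * formula.get(e, 0) for e in ELEMENTS)
    sign = -1 if is_input else 1
    return [sign * x for x in counts + [electrons]]


def build_matrix(chemicals):
    """Stack columns into M (rows = C, H, O, electrons; columns = chemicals)."""
    cols = [column(formula, is_input) for formula, is_input in chemicals]
    return [list(row) for row in zip(*cols)]


def residual(M, s):
    """Compute M s; a balanced reaction gives an all-zero vector."""
    return [sum(m * x for m, x in zip(row, s)) for row in M]


chemicals = [
    ({"C": 6, "H": 12, "O": 6}, True),   # glucose (input)
    ({"H": 2, "O": 1}, True),            # water (input)
    ({"C": 2, "H": 4, "O": 2}, False),   # acetic acid
    ({"C": 1, "O": 2}, False),           # CO2
    ({"H": 2}, False),                   # H2
]
M = build_matrix(chemicals)

# Candidate reaction: glucose + 2 H2O -> 2 acetic acid + 2 CO2 + 4 H2
s = [1, 2, 2, 2, 4]
print(residual(M, s))
```

A vector such as [1, 0, 2, 2, 0] (the same reaction without water or H2) leaves a non-zero hydrogen residual and is therefore not a feasible reaction in this toy system.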

3.2.2 Feasible product space of pectin fermentation is more limited compared to inulin

We applied our model to compare the feasible product space for the fermentation of 1 mol of inulin (C6nH10n+2O5n+1) to that of 1 mol of pectin (C6nH8n+2O6n+1) in a closed system. Since we were modeling product output from carbohydrate input, we only included C, H, O and valence electrons as rows in our matrix M. For products (columns) in M, we included the three most abundant SCFAs in the gut (acetate, propionate and butyrate), the three major components of intestinal gas (H2, CH4, and CO2), as well as water and biomass. All product concentrations were restricted to be non-negative to simulate a closed system (i.e. product formation is solely from fiber input). Our model showed that in a closed system, the product space of pectin was more restricted than that of inulin (Figure S1); in particular, inulin had more potential for H2 production, while pectin had more potential for the production of CO2 (Figure 1c). The results of our model are in accordance with the simple intuition that a more oxidized substrate (pectin) would lead to more production of oxidized products such as CO2 and less production of reduced products such as H2. Model predictions were conserved even when further constraints were placed on the system (Figure S2), e.g. when 15% of the C in the fiber is converted into biomass, as in a typical carbohydrate fermentation (Sawyer et al., 2003).
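The intuition about oxidation state can be made concrete with a short calculation. The sketch below computes the electrons available per carbon atom, (4C + H - 2O)/C, a standard measure often called the degree of reduction. It is not part of the model above; the function names and the choice of n are assumptions made for illustration, and the formulas are those given in the text.

```python
from fractions import Fraction


def degree_of_reduction(c, h, o):
    """Available electrons per carbon for a neutral CxHyOz compound: (4C + H - 2O) / C."""
    return Fraction(4 * c + h - 2 * o, c)


def inulin(n):
    """Atom counts (C, H, O) for the inulin formula in the text: C(6n) H(10n+2) O(5n+1)."""
    return 6 * n, 10 * n + 2, 5 * n + 1


def pectin(n):
    """Atom counts (C, H, O) for the pectin formula in the text: C(6n) H(8n+2) O(6n+1)."""
    return 6 * n, 8 * n + 2, 6 * n + 1


n = 10  # arbitrary chain length; the result does not depend on n
print(degree_of_reduction(*inulin(n)))  # 4
print(degree_of_reduction(*pectin(n)))  # 10/3
```

Under this measure inulin carries 4 and pectin 10/3 available electrons per carbon, consistent with pectin being the more oxidized substrate.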

3.2.3 Pectin degradation takes up reducing agents from the environment

We next asked if our theoretical predictions could be experimentally validated using an ex vivo framework in which we measured the response of stool microbiota to fiber addition. First, stool from 9 healthy human subjects was homogenized with phosphate-buffered saline (PBS) under anaerobic conditions to create a fecal slurry. The slurry was then incubated in serum bottles at 37°C starting with 100% N2 in the headspace, with inulin, pectin, cellulose, or no additional fiber input (Figure 2a). Since our preliminary testing showed that fiber degradation in this system was almost entirely complete within 24h (Gurry et al. 2019), we used the gas and SCFA concentrations at 24h as the experimental product concentrations for comparison to those predicted from our theoretical models. Because the fecal slurry itself contained a certain amount of residue material from food digestion in the human body, even samples that did not receive additional fiber produced gas and SCFAs; the fermentation products of a given fiber in a sample were thus determined as the difference between the product concentrations measured in the sample that received additional fiber and those measured in the sample that did not.
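The background correction described above amounts to a per-product subtraction. A minimal sketch follows; the product names, the numbers, and the function name are placeholders invented for illustration, not measurements from this study.

```python
def net_products(with_fiber, no_fiber):
    """Subtract the no-fiber control from the fiber-treated sample, product by product."""
    return {p: with_fiber[p] - no_fiber.get(p, 0.0) for p in with_fiber}


# Placeholder values in arbitrary units (not data from this study).
with_fiber = {"H2": 5.0, "CO2": 12.0, "acetate": 20.0}
no_fiber = {"H2": 1.0, "CO2": 4.0, "acetate": 6.5}

print(net_products(with_fiber, no_fiber))
```

A negative value for a product would indicate net uptake of that product relative to the control.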

Focusing on gas production in the ex vivo systems, we found that the amount of H2 produced by pectin fermentation was significantly lower than that of inulin (Figure 2b, Kruskal–Wallis test, paired, p=0.004), and the total amount of gas production also tended to be lower (Kruskal–Wallis test, paired, p=0.07). However, we did not observe a higher amount of CO2 production in pectin fermentation compared to inulin fermentation as theoretically predicted; in fact, the measured CO2 production from pectin fermentation did not fall within the theoretically predicted range (Figure 3a). This was also the case for acetate production in some samples that fermented inulin. Since our model only considers the most basic laws of chemistry and represents the maximum possible theoretical product range for a closed system, we hypothesized that the disagreement between the experimental results and the theoretical model was due to our assumption that the experimental system was closed. Indeed, although the serum bottle is a closed system with no material exchange with the environment outside the bottle, fermentation of the additional fiber should be seen as a subsystem that can exchange products with the other subsystem in the bottle, which ferments residue material in the fecal slurry (Figure 3b).

We thus investigated what input the “fiber subsystem” would need from the “residue subsystem” for the measured CO2 to fall within the feasible product range determined by the theoretical model. Since on average the samples that did not receive additional fiber produced approximately 1/3 as much gas and SCFAs as those that did, we limited the input from the residue subsystem to the equivalent amount of product that can be produced by 1/3 mol of inulin or pectin. Allowing one input at a time, we found that the measured CO2 fell within the predicted range only when H2 or CH4 was used as input (Figure 3c). Although we did not observe net uptake of either H2 or CH4 in our experimental data, both H2 and CH4 are chemicals with reducing power, so there was likely an influx of other reducing substrates not represented in our model from the “residue subsystem” to the “fiber subsystem”. Thus, in our ex vivo system, pectin degradation not only had a lower net production of H2 compared to inulin, but also took up reducing agents from the surrounding environment. We thus speculate that when pectin is degraded in the human gut, it also takes up reducing agents, a process whose consequences are unclear and possibly worth further investigation.

3.2.4 H2 and acetate distinguish the product profiles of inulin and pectin degradation, but inter-personal variation of gas production is large

We next asked if the overall product profiles of inulin degradation and pectin degradation can be distinguished from each other, and whether changing the fiber or the microbial community contributed more to the variation in product profiles. We found that product profiles primarily clustered by fiber and not by human subject (Figure 4). Given that inulin fermentation generated significantly more H2 and less acetate compared to pectin (Figure 2b), we hypothesized that these are the two major products that allowed the product profiles of inulin and pectin fermentation to be distinguished. Indeed, after training a Random Forest Classifier to predict whether a product profile was the result of inulin or pectin fermentation (AUC=1), we found that H2 and acetate were the two most important features for predicting the fermented fiber (Figure S3). The oxidation state of the fiber affected not only gas production but also SCFA production: the more oxidized substrate, pectin, produced more of the most oxidized SCFA, acetate. Conversely, in addition to H2, inulin produced more of the most reduced SCFA, butyrate (Figure 2b).

Having determined that fiber type strongly impacts product profiles, we next investigated inter-individual differences. Euclidean distances between the product profiles of different fibers within the same person were not significantly larger than those between different people fermenting inulin (Mann–Whitney U test, p=0.25), and only slightly larger than those between different people fermenting pectin (Mann–Whitney U test, p=0.03). Thus, the microbiota of different people exhibit functional heterogeneity in converting the same fibers into products of different quantities and composition.

3.2.5 Relative effects of microbiome and substrate chemistry on gas production differ among gases

We further explored if there were signatures within the microbiomes that promoted the production of gases. Since levels of H2 production were generally low for pectin fermentation, we investigated if there were specific ASVs associated with net H2 production during inulin fermentation. Selecting for these ASVs via Lasso regression identified a Lachnospiraceae amplicon sequencing variant (ASV) positively associated with net H2 production (Figure 5a, observed vs. expected Pearson’s r=0.95, p=6.243×10^-5).

Since net H2 production in the gut is the difference between the total production of H2 and the total consumption of H2 (Carbonero et al., 2012), and Lachnospiraceae can be either hydrogen producers or consumers (Gagen et al., 2015; Rajilić-Stojanović and de Vos, 2014), the positive association of the Lachnospiraceae ASV with H2 production in the human gut may indicate that net H2 production is more dependent on H2 production than consumption. Consistent with this hypothesis, H2 consumption abilities of gut microbiota may be more consistent between different people compared to H2 production because of the higher diversity of H2 consumption pathways (methanogenesis, reductive acetogenesis and sulfate reduction) compared to production. We did, however, observe that net H2 production was lowest in the samples that produced methane, probably because the amount of sulfate in the ex vivo system is not enough for the most energetically favorable H2 consumption pathway, sulfate reduction, to consume all the H2 produced (Supplementary text), and because methanogenesis is more energetically favorable than reductive acetogenesis.

In contrast to H2, whose production was strongly affected by substrate, we found that methane production was solely dependent on whether there were detectable levels of Methanobacteria in the microbiota (Figure 5b). Given that methane is a downstream product of H2 (Figure 5c), we asked why methane production was not affected by substrate chemistry as H2 production was. We hypothesized that this is because there are generally large numbers of bacteria in the human gut that contain carbohydrate active enzymes and hydrogenases that allow fiber breakdown and hydrogen production; thus the amount of these enzymes is not a limiting factor, allowing hydrogen production to instead depend on substrate stoichiometry. Meanwhile, a large portion of the human population does not harbor sufficient numbers of methanogens, making them the limiting factor for methane production; however, when the number of methanogens is sufficient, they are not limited by the amount of H2 production because methanogens are stronger competitors for H2 than reductive acetogens. To test this hypothesis, we surveyed the abundance of carbohydrate active enzymes (CAZymes) for inulin and pectin, as well as hydrogenogenic hydrogenases and methyl-CoM reductases (mcrA, marker gene for methanogens), in the metagenomes of 160 randomly selected healthy human subjects from the Human Microbiome Project (HMP). While only approximately 20% of subjects had detectable levels of methanogens, nearly all subjects harbored CAZymes for inulin and pectin degradation as well as hydrogenogenic hydrogenases (Figure 5d). The percentage of subjects (20%) with detectable methanogens in the HMP data is in accordance with our results: 2 out of the 9 subjects were methane producers in our ex vivo experiment. Thus, overall, the production of more “general” metabolites such as H2 is more likely to be affected by the chemical composition of the prebiotic, while more “rare” metabolites such as methane are more likely to be limited by the organisms that produce them.

3.3 Discussion

In this study, we used a combination of theoretical models and an ex vivo experimental framework to examine how the chemistry of prebiotics and the composition of the gut microbiota influence gas production during prebiotic fermentation by gut microbiota. Specifically selecting two common prebiotics (inulin and pectin) with different levels of oxidation, we find that metabolites that can be produced by more organisms in the human gut, such as H2, are more affected by the chemical composition of prebiotics than metabolites that are produced by less common organisms in the gut, such as methane. Overall, these results suggest that both the chemical nature of the prebiotic and the individual’s gut microbiome need to be taken into account when administering prebiotics to individuals.

Our data also reveal that there may be general trade-offs in the production of SCFAs versus gas. For example, while inulin fermentation leads to more production of the more reduced SCFA butyrate, it also leads to more production of the reducing agent H2, in turn increasing overall gas production. However, which outcome is preferable for the subject (more production of butyrate, less production of overall gas, or just less production of H2) is often unknown and specific to the individual subject. This can be further complicated if inter-individual differences in H2 and SCFA production are considered: not every individual produces more butyrate when fermenting inulin. Similarly, for pectin we were able to infer, by comparing the experimental data to the theoretical product range, that pectin degradation requires uptake of reducing agents from the surrounding environment. Again, what effect this has on the host is unknown: would the uptake of these reducing agents lead to the generation of more reactive oxygen species that can directly attack cells in the gut epithelial barrier, interfere with iron uptake, or initiate lipid peroxidation processes (Owen et al., 2000)? Would this be costlier to the host compared to generating more H2? More importantly, we also lack a way to evaluate if the scale of the differences is large enough for them to count as a factor in prebiotic selection.

These problems emphasize that the effect of prebiotics on gut and human health must be looked at from both individual and systems perspectives. Often a compound is deemed a prebiotic because it can increase the growth of known beneficial microbes such as bifidobacteria and lactobacilli, or promote the production of target metabolites. However, the full diversity of a mixed culture environment such as the human gut must be considered when selecting prebiotics: it is very hard to selectively grow only the organisms that produce one or a few metabolites of interest, and the effect of any by-products must be considered. Inter-individual differences in product formation due to heterogeneity in gut microbiota composition, as well as responses to the metabolites produced, must also be considered. Our use of theoretical modeling and the ex vivo experimental system to explore gas production and its relationship to SCFA production is just a beginning: these are relatively cheap and simple methods to shed light on important points that should be considered in prebiotic design. In the future, a more systematic evaluation of which factors other than the formation of beneficial metabolites should be considered in prebiotic design is needed.

3.4 Methods

3.4.1 Experimental model and participant details

Nine healthy human volunteers were enrolled into the study under the supervision of the MIT Committee on the Use of Humans as Experimental Subjects (COUHES), which approved the study under protocol number 1510271631. All participants provided written, informed consent, and the study was conducted in accordance with the relevant guidelines and regulations. Inclusion and exclusion criteria for subjects were the same as in Gurry et al. (2018). Enrollment occurred between June 2017 and October 2017.

3.4.2 Linear systems model for modeling community production

The product space for the system Ms=0 is a polytope defined by linear combinations of the basis vectors of Null(M), i.e. Bx=s, where the columns of B are the basis vectors. Constraints on the product space (i.e. for the closed system, all elements in s corresponding to products are non-negative) were used to find the vertices of the polytope of x by converting the half-space representation (the intersection of half spaces, obtained by applying the constraints to Bx=s) into the vertex representation (the set of extreme points of the polytope). Vertices of the polytope of s were calculated by multiplying B with the vertices of the polytope of x. The vertices for s were used to draw 2D hulls for pairs of products to visualize the product polytope, as in Figures 1c, 3a, and 3c. When product input was allowed for the system, the constraint on the element in s corresponding to the input product was relaxed to be larger than the negative of the amount of that product that can be produced by 1/3 mol of inulin or pectin.

The basis set of vectors for the null space of matrix M was calculated from the QR-decomposition of the matrix using the R package “pracma” (Borchers, 2019). The conversion of the H-Representation to the V-Representation of polytopes was performed using the R package “rcdd” (Meeden and Fukuda, 2019).
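For readers without R, the null-space step can be sketched in plain Python. The version below swaps in exact Gauss-Jordan elimination over rationals in place of the QR-based routine used in this work, so the basis it returns may differ from the one used here while spanning the same space. The toy matrix (glucose and water as inputs; acetic acid, CO2 and H2 as products) and all function names are assumptions for illustration, and the vertex enumeration performed with rcdd is not reproduced.

```python
from fractions import Fraction


def null_space(M):
    """Return a basis of {s : M s = 0} using exact Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in row] for row in M]
    n_cols = len(rows[0])
    pivots = []  # pivot column of each reduced row, in row order
    r = 0        # index of the next pivot row
    for c in range(n_cols):
        if r == len(rows):
            break
        # Find a row at or below r with a non-zero entry in column c.
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue  # no pivot here: column c is a free column
        rows[r], rows[p] = rows[p], rows[r]
        pivot = rows[r][c]
        rows[r] = [x / pivot for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    free = [c for c in range(n_cols) if c not in pivots]
    basis = []
    for f in free:
        # Set one free variable to 1 and solve for the pivot variables.
        v = [Fraction(0)] * n_cols
        v[f] = Fraction(1)
        for row_idx, pc in enumerate(pivots):
            v[pc] = -rows[row_idx][f]
        basis.append(v)
    return basis


# Toy M: rows = C, H, O, valence electrons; inputs entered as negative columns.
# Columns: glucose (input), water (input), acetic acid, CO2, H2.
M = [
    [-6, 0, 2, 1, 0],
    [-12, -2, 4, 0, 2],
    [-6, -1, 2, 2, 0],
    [-72, -8, 24, 16, 2],
]
B = null_space(M)
for b in B:
    print([str(x) for x in b])
```

Any balanced reaction in this toy system is a linear combination of the returned vectors; for example, glucose + 2 H2O -> 2 acetic acid + 2 CO2 + 4 H2 equals 2 times the first basis vector plus 4 times the second.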

3.4.3 Setup of ex vivo system

The setup of the ex vivo system was the same as in Gurry et al. (2019), with some adaptation for gas measurements. Briefly, fresh stool samples were collected and homogenized with reduced PBS containing 0.1% L-Cysteine in a ratio of 1g/5ml. Fiber was spiked into the homogenates from stock solutions such that the final concentrations of fibers in the samples were as follows: control no fiber, 10g/L inulin, 5g/L pectin or 20g/L cellulose. For each participant, 2mL of the final fecal slurry of each condition was added in triplicate to 60mL glass serum bottles (Supelco, Bellefonte PA). The serum bottles containing the samples were transferred to a vinyl anaerobic chamber filled with 100% N2, with no detectable amounts of CO2 and H2, and sealed in the chamber using magnetic crimp seals with PTFE/silicone septa (Supelco, Bellefonte PA). A total of 12 bottles per participant were set up and incubated at 37°C for 24h with no shaking.

3.4.4 Gas and SCFA measurements

Concentrations of headspace gases were determined using gas chromatography. We used a Shimadzu GC-2014 gas chromatograph (GC) configured with a packed column (Carboxen-1000, 5’ x 1/8” (Supelco, Bellefonte PA)) held at 140°C, argon carrier gas, and thermal conductivity (TCD) and methanizer-flame ionization (FID) detectors. At the end of the 24h incubation period, subsamples of the headspace (0.20 cm3 at laboratory temperature, ca. 23°C) from each serum bottle were taken via a gas-tight syringe and injected onto the column. Gas concentrations were determined by comparing the partial pressures of samples and standards with known concentrations. Accuracy of the analyses, evaluated from standards, was 5%. Measurements of H2 were taken using the TCD while measurements of CH4 and CO2 were taken with the FID.

SCFA measurements were made by taking 1mL of fecal slurry from each serum bottle immediately after the GC measurements were taken, and freezing the fecal slurry at -80°C until the time of measurement. SCFA measurements were made on an Agilent 7890B system with an FID at the Harvard Digestive Disease Core (Agilent Technologies, Santa Clara, CA). Detailed procedures for SCFA measurements are the same as in Gurry et al. (2019). Although the amounts of 10 volatile acids (acetic, propionic, isobutyric, butyric, isovaleric, valeric, isocaproic, caproic, and heptanoic acids) were reported, all but the acetic, butyric and propionic acids were present only in trace amounts, and we used only these three SCFAs for our models.

3.4.5 DNA extraction, library prep, sequencing and analysis

The MoBio Powersoil 96 kit (now Qiagen Cat No./Id: 12955-4), with minor modifications, was used to extract the DNA from all fecal samples. For all samples, 250 µL of the fecal slurry was used with the MoBio High Throughput PowerSoil bead plate (12955-4 BP). 16S rRNA gene amplicon libraries (V4 hypervariable region, U515-E786) were prepared using a two-step PCR approach according to the method described in Preheim et al. (2013). Samples were sequenced on an Illumina MiSeq (PE 150+150) at the Broad Walk-up Sequencing platform (Broad Institute, Cambridge, MA). The average sequencing depth was 53,046 reads/sample.

All 16S rRNA amplicon libraries were processed according to a custom pipeline based on DADA2, as described in Yu et al. (2019), except that only the forward reads were used due to issues in merging reads. The output of the pipeline was amplicon sequencing variants (ASVs). Taxonomy for all sequence variants was assigned using the RDP database.

3.4.6 Metagenome analysis

We downloaded 160 randomly selected metagenomes from the human stool microbial communities of the Human Microbiome Project (National Institutes of Health, USA). Each metagenome was rarefied to 20 million reads (forward+reverse) using seqtk seeded with the parameter -s100. The rarefied metagenomes were screened in DIAMOND (blastx; maximum number of high scoring pairs (HSPs) per subject sequence to save for each query=1) against hydrogenogenic hydrogenases retrieved from the HydDB database (Søndergaard et al., 2016), mcrA genes retrieved from the PhyMet database (Michał et al., 2018), and CAZymes from the dbCAN database (Zhang et al., 2018). Results were then filtered [length of amino acid>25 residues, percent identical matches>65% (mcrA and hydrogenases) or >35% (CAZymes)]. Reads were eventually normalized to reads per kilobase million (RPKM) using the formula

$\mathrm{RPKM} = \dfrac{\text{actual read count}}{\left(\dfrac{\text{ave. gene length}}{1000}\right)\left(\dfrac{\text{seq. depth}}{10^{6}}\right)}$
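The normalization above can be written as a small function. This is a minimal sketch: the function and parameter names are hypothetical, the example numbers are invented, and it assumes that gene length is measured in base pairs and sequencing depth in reads.

```python
def rpkm(read_count, avg_gene_length_bp, sequencing_depth_reads):
    """Reads per kilobase million: count / ((length / 1000) * (depth / 1e6))."""
    if avg_gene_length_bp <= 0 or sequencing_depth_reads <= 0:
        raise ValueError("gene length and sequencing depth must be positive")
    kilobases = avg_gene_length_bp / 1000
    millions_of_reads = sequencing_depth_reads / 1e6
    return read_count / (kilobases * millions_of_reads)


# Invented example: 200 hits to a gene family averaging 1,500 bp,
# in a metagenome rarefied to 20 million reads.
print(rpkm(200, 1500, 20_000_000))
```

Because all metagenomes were rarefied to the same depth, the depth term acts as a constant scaling factor across samples, and differences in RPKM reflect read counts and gene lengths.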

3.4.7 Acknowledgements

This work was supported by a grant from the Rasmussen Family Foundation. We are deeply thankful to the Ono Lab at MIT and Jeemin Rhim for teaching and allowing us to use their GC for measurements of gas production. We thank the Broad walk-in sequencing center for their assistance with sequencing, and Harvard Digestive Disease Core for their assistance on GC for measurements of SCFA production. We also wish to thank Chengzhen Dai, Siavash Isazadeh, Chuliang Song, Fangqiong Ling, and Martin Polz for helpful discussions and/or comments on various versions of this manuscript.

3.5 Figures

Figure 1 Modeling community production with mass and electron balance. a) A detailed representation of the theoretical model Ms=0, where M is a matrix with n chemicals (j inputs and k products). In M, inputs are represented in negative numbers while outputs are represented in positive numbers. Each row in M represents an element (or electrons) that needs to be balanced. b) The specific M matrix corresponding to our system of interest, fermentation of two different fibers, inulin and pectin. c) A set of selected 2D projections of the feasible product space predicted from our theoretical model for the fermentation of 1 mol inulin or 1 mol pectin. See Figure S1 for the full set of 2D projections for all fermentation products.

Figure 2 Inulin fermentation produces more H2 and less acetate than pectin fermentation in the ex vivo system. a) Experimental scheme for studying fiber fermentation products in the ex vivo system. b) Major product concentrations in the ex vivo system after 24h, measured as mol product produced per mol of fiber. **, p<0.01; *, p<0.05 for paired Kruskal–Wallis test.

Figure 3 Pectin degradation requires the uptake of reducing agents. a) Comparison of product measurements in the ex vivo system to theoretically predicted feasible product ranges. b) Illustration of the two subsystems within the serum bottle and their material exchanges. c) Comparison of product measurements in the ex vivo system to theoretically predicted feasible product ranges in a closed system and when allowing different inputs.

Figure 4 Principal coordinate analysis (PCoA) of the product profiles of different fibers

Figure 5 Different gases are differentially influenced by substrate chemistry and gut microbiome composition. a) Relationship between the predicted value of H2 production per mol of inulin based on a linear model with two Lachnospiraceae ASVs as variables, and the observed amount of H2 production per mol of inulin in the ex vivo system. Dotted line represents a linear fit of the relationship, with grey areas around the line representing the standard error of the fit. b) Relationship between the production of methane per mol of fiber in the ex vivo system and the relative abundance of Methanobacteria in the samples. Error bars represent standard deviation (n=3). c) Schematic of fiber degradation and production of gas and SCFAs. d) Distribution of the abundance of mcrA genes, hydrogenogenic hydrogenases, and CAZymes in the metagenomes of different people in the HMP dataset. Since the x-axis is in log-scale, all the gene counts were increased by 10^-5 so that samples with zero hits could be included in the distribution.

3.6 Supplementary Figures

Figure S1 2D projections of the feasible product space for all fermentation products. The full set of 2D projections of the feasible product space predicted from our theoretical model for the fermentation of 1 mol inulin or 1 mol pectin. Units on all axes are mol product/mol fiber.

Figure S2 2D projections of the feasible product space for all fermentation products assuming constant biomass yield. a) The full set of 2D projections for the feasible product space predicted from our theoretical model for the fermentation of 1 mol inulin or 1 mol pectin, when assuming that 15% of the C in fiber is eventually converted into biomass. Units on all axes are mol product/mol fiber. b) Comparison of the 2D projections for the feasible product space predicted from our theoretical model for 1 mol of inulin with no restriction on biomass yield (light blue) and assuming a 15% biomass yield during the fermentation process (dark blue). c) Comparison of the 2D projections for the feasible product space predicted from our theoretical model for 1 mol of pectin with no restriction on biomass yield (light coral) and assuming a 15% biomass yield during the fermentation process (coral).

Figure S3 Performance of Random Forest Classifier for predicting substrate based on product profile. a) Receiver operating characteristic (ROC) curve of the classifier. b) Importance graphs for the classifier, ranked from top to bottom by two parameters (Mean Decrease Accuracy or Mean Decrease Gini) that represent the increase in misclassification produced on average by removing the given predictor (more important predictors are on top).

3.7 Supplementary Text

Estimation of Sulfate content in ex vivo system

The content of sulfate in human feces is estimated to be 2.7 µmol/g, independent of diet (Florin et al., 1991). Since each serum bottle in our ex vivo system contains approximately 0.4g of feces, the sulfate content in each bottle is 1.08 µmol, which would take 4.32 µmol of H2 to undergo complete reduction to hydrogen sulfide. This is lower than (inulin) or approximately the same as (pectin) the net amount of H2 left in the serum bottles at our time of measurement (32±22 µmol for inulin, 3.2±1.4 µmol for pectin), and given that gross production should be not less than net production, it is likely that the amount of sulfate in the ex vivo system does not support elimination of H2 based on sulfate reduction alone.
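The estimate above is a two-step multiplication, shown here as a short worked calculation. The constant and function names are hypothetical; the factor of 4 assumes the overall stoichiometry SO4^2- + 4 H2 + 2 H+ -> H2S + 4 H2O for complete sulfate reduction, which is the ratio implied by the figures in the text.

```python
SULFATE_UMOL_PER_G_FECES = 2.7  # µmol sulfate per g feces (value cited in the text)
FECES_G_PER_BOTTLE = 0.4        # approximate g feces per serum bottle (from the text)
H2_PER_SULFATE = 4              # assumed: SO4^2- + 4 H2 + 2 H+ -> H2S + 4 H2O


def sulfate_per_bottle_umol(sulfate_per_g, feces_g):
    """µmol of sulfate contained in one serum bottle."""
    return sulfate_per_g * feces_g


def h2_demand_umol(sulfate_umol, h2_per_sulfate=H2_PER_SULFATE):
    """µmol of H2 needed to reduce the given sulfate completely to hydrogen sulfide."""
    return sulfate_umol * h2_per_sulfate


sulfate = sulfate_per_bottle_umol(SULFATE_UMOL_PER_G_FECES, FECES_G_PER_BOTTLE)
demand = h2_demand_umol(sulfate)
print(round(sulfate, 2), round(demand, 2))  # 1.08 4.32
```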

Chapter 4 Discussion

4.1 Limitations and Expansion of current work

In this thesis, I present two projects that share the same experimental scheme, “the common garden experiment”, which allows the assembly and function of natural or naturally-derived microbial communities to be studied under well-controlled conditions. In the first project, when combined with a high-throughput experimental workflow to systematically manipulate the diversity of a natural community and collect isolates from the community, this approach allowed me to disentangle how the effects of different interactions on community function change over a wide diversity spectrum. In the second project, by taking advantage of the natural variation in human gut microbiomes and transplanting these microbiomes into different “common gardens”, I assessed how different gas products of prebiotic fermentation could be differentially affected by the chemistry of the prebiotic and variations in the gut microbiome. While both projects take advantage of the well-controlled environmental conditions in common garden experiments, they share similar limitations. I now discuss how both projects could be improved through acquiring data with higher resolution, more variation of the “gardens” that communities are transplanted into, and more complementary culture-based experiments.

4.1.1 Data resolution limits interpretation of ecological processes and community functions

One of the most significant limitations of both projects is that community composition is determined at the 16S rRNA level. Despite the use of high-resolution sequence denoising algorithms so that bacteria with at least one nucleotide difference in the 16S rRNA V4 region could be distinguished from each other, bacterial functions can be decoupled from their phylogeny due to horizontal gene transfer and convergent evolution; thus, even bacteria that are 100% identical in their 16S rRNA sequences can subsume multiple ecological units (Ravenhall et al., 2015). As a result, insights on the functional structure of the communities and their assembly process have been limited in both projects. Future work should strive to incorporate more meta-omics into the current experimental frameworks.

The project in chapter 2 would have benefited from coupling species richness in the self-assembled communities to functional genes inferred from metagenomics or metatranscriptomics, as well as quantifying resource use via metabolomics. This would allow the interactions between organisms and their effects on community function to be understood on a more molecular level. For chapter 2, it would be especially interesting to see if the narrow region (between 12 and 26 taxa) where the effects of both complementation and competition on community function increased rapidly is truly due to more uptake of total resources (i.e. gain of functional genes relevant to the uptake of components in the seaweed extract, or increased uptake of certain substrates identified in metabolomics data) and more overlap in resource use potential between community members, as was hypothesized in the chapter.

The project in chapter 3 could also be improved if supplemented with meta-omics data. Finding genes and pathways that are enriched in the metagenomes of communities that received the pectin spike-in compared to those that received the inulin spike-in may allow us to understand, from a biochemical perspective, why pectin fermentation produces less H2 than inulin. Metatranscriptomics data on how gene expression in the communities changes over time would provide information on the dynamics of hydrogen production and consumption in the communities, and would allow testing of the hypothesis that net H2 production is more dependent on H2 producers than consumers.

4.1.2 Variations of the common garden experiment can provide additional insights

In both projects, I performed the common garden experiment with batch cultures as “gardens” and observed how communities behaved in the gardens for a short period of time. While this procedure maximizes throughput and records the initial response of the community to the new environment, it also limits the kinds of ecological processes and community functions that can be studied. One obvious example is stability, which cannot be evaluated unless the community is studied over a relatively long period of time (usually 60-100 generations). This could be resolved by allowing the communities to undergo multiple dilution-regrowth cycles, as a simplified version of a chemostat, and tracking how the diversity and function of the communities change over time.
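As a rough planning aid, the number of dilution-regrowth cycles needed to cover the 60-100 generations mentioned above can be estimated from the dilution factor. The sketch below assumes that the community regrows to its pre-dilution density in every cycle, so that one cycle spans log2(dilution factor) generations; the function name `cycles_needed` and the 1:100 dilution are illustrative assumptions, not values used in this thesis.

```python
import math


def cycles_needed(target_generations: float, dilution_factor: float) -> int:
    """Smallest number of dilution-regrowth cycles that reaches the target.

    Assumption (not from the thesis): the community regrows to its
    pre-dilution density in each cycle, so one cycle corresponds to
    log2(dilution_factor) generations.
    """
    if dilution_factor <= 1:
        raise ValueError("dilution_factor must be greater than 1")
    generations_per_cycle = math.log2(dilution_factor)
    return math.ceil(target_generations / generations_per_cycle)


if __name__ == "__main__":
    # Hypothetical 1:100 dilution at every transfer.
    for target in (60, 100):
        print(target, "generations:", cycles_needed(target, 100), "cycles")
```

Under this assumption, stronger dilutions reach the target in fewer transfers but impose a harsher bottleneck on rare taxa at each transfer, so the dilution factor itself would be an experimental design choice.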

Increasing the types of “common gardens” in both projects would also be helpful. For the project in chapter 2, allowing the inoculum communities to also assemble on “common gardens” where the substrate is a component of the seaweed extract could help elucidate how interaction networks in the communities change over a diversity gradient. Although the project in chapter 3 already has two different “common gardens” with different levels of oxidation, increasing the number of substrates would allow a much more systematic study of how substrate chemistry could affect gas production in prebiotic fermentation.

There are also more ways in which transplanted communities could be manipulated to generate variation between communities so that specific ecological processes could be studied. One example is the introduction of a well-characterized organism to the communities, i.e. invasion. If the inoculum communities in chapter 2 were split into two groups, with a strong “killer strain” that can antagonize a wide range of bacteria introduced to communities in one group but not the other, tracking the function and diversity of communities in both groups over time would allow exploration of stability-function trade-offs. Similarly, in chapter 3, simultaneously transplanting a known beneficial microbe, a “probiotic”, with the gut microbiota into the “prebiotic gardens” would facilitate the study of prebiotic-probiotic interactions.

4.1.3 Culture-based experiments can complement “top-down” experiments

Although both projects in this thesis place a strong emphasis on using natural communities as the raw material for experiments, so that observations are more likely to reflect the ecological processes that shape community composition and function in the wild, complementing these “top-down” experiments with phenotypic or genotypic characterization of individual members of the community could help elucidate the molecular mechanisms behind ecological processes. In chapter 2, a cheaper alternative to acquiring meta-omics data for the self-assembled communities would be to acquire whole-genome sequences or substrate uptake profiles of the isolates whose 16S rRNA sequences perfectly match those detected in the community. This method has two potential problems: the sequenced isolate is not necessarily the representative strain in the community (since 16S rRNA does not resolve bacteria at the strain level), and the function of rare community members that are less likely to be isolated cannot be accounted for. Nevertheless, it allows better characterization of niche overlap between community members.
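The matching step described above, finding isolates whose 16S rRNA sequence perfectly contains an ASV detected in the community, can be sketched as an exact substring search. This is a minimal illustration under stated assumptions, not the procedure used in this thesis: the function names, identifiers, and toy sequences are hypothetical, the isolate sequences are assumed to span the amplified region, and both strand orientations are checked because the orientation of an isolate sequence may not be known.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    complement = str.maketrans("ACGT", "TGCA")
    return seq.upper().translate(complement)[::-1]


def match_isolates_to_asvs(asvs: dict, isolates: dict) -> dict:
    """Map each ASV id to the isolates that contain it with 100% identity.

    asvs:     ASV id -> amplicon sequence (e.g. the V4 region)
    isolates: isolate id -> longer 16S rRNA sequence
    A match requires the ASV, or its reverse complement, to occur as an
    exact substring of the isolate sequence.
    """
    matches = {asv_id: [] for asv_id in asvs}
    for asv_id, asv_seq in asvs.items():
        forward = asv_seq.upper()
        reverse = reverse_complement(forward)
        for iso_id, iso_seq in isolates.items():
            iso_seq = iso_seq.upper()
            if forward in iso_seq or reverse in iso_seq:
                matches[asv_id].append(iso_id)
    return matches


if __name__ == "__main__":
    # Toy sequences for illustration only (real V4 amplicons are ~250 bp).
    toy_asvs = {"ASV1": "ACGTTGCC", "ASV2": "GGGTTTAA"}
    toy_isolates = {
        "iso_A": "TTACGTTGCCGG",  # contains ASV1 in forward orientation
        "iso_B": "CCTTAAACCCGG",  # contains ASV2 as its reverse complement
        "iso_C": "ACGTAGCC",      # one mismatch to ASV1, so no match
    }
    print(match_isolates_to_asvs(toy_asvs, toy_isolates))
```

An ASV that maps to several isolates in such a table is a direct reminder of the caveat above: identical 16S rRNA sequences do not guarantee that the isolates are the same strain.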

Meanwhile, chapter 3 could benefit from the isolation of the Lachnospiraceae ASV that was found to be positively associated with hydrogen production. Acquiring its whole genome, testing its growth on inulin and measuring the gas produced during growth, and comparing its growth and genetic profile to those of other isolates (preferably other Lachnospiraceae ASVs that were not found to be associated with H2 production) would greatly help to confirm whether the association is in fact a causal relationship.

4.2 Connecting ecological theory and industrial applications

Despite sharing a common experimental scheme, the two projects in this thesis have very different foci: the project in chapter 2 aims to find general principles on how interactions and their effects on community function would change with diversity as a result of community assembly over time in nature, while the project in chapter 3 aims to provide guidance for selection of chemicals that could be used as prebiotics, emphasizing that prebiotics design needs to take into account that the gut microbiome is a diverse microbial community whose composition can vary over time and between people.

Thus, one of the purposes of this thesis is to demonstrate that there could be a tighter connection between theory and applications in microbial ecology. Consider the engineering of synthetic microbial consortia for the production of chemicals such as biofuels or pharmaceuticals: the key principles for designing such a consortium are often the same as the important concepts in community ecology. A synthetic community for industrial production needs to be highly functional, highly stable, and resilient to environmental fluctuations, external invasion, or the rise of mutants in the community (Eng and Borenstein, 2019; Tsoi et al., 2019). These are also core questions asked in community ecology. However, many ecologists do not consider potential applications when they perform an experiment or develop a model; often a paper cites “Topic X is a central question in ecology” as the major reason that the work is important. On the other hand, when it comes to designing communities de novo, ecological theories are rarely taken seriously; they are often deemed too theoretical, with too many assumptions to generalize to the system of interest. Indeed, it may be impossible to develop “unified theories” for all microbial communities, but bioengineers may find it helpful to selectively look at theoretical studies of microbial systems that share similar characteristics with the to-be-designed system. Microbial ecologists who study ecological processes in natural communities to elucidate ecological theory may also be able to further validate their theory if they can use it to predict the behavior of a synthetic community that originates from the natural community. My hope is that, in the future, this strict “niche partitioning” between bioengineers and microbial ecologists could be removed, with everyone moving from specialists to generalists.

References

Azpiroz, F. (2005). Intestinal gas dynamics: mechanisms and clinical relevance. Gut 54, 893–895.

Belcheva, A., Irrazabal, T., and Martin, A. (2015). Gut microbial metabolism and colon cancer: Can manipulations of the microbiota be useful in the management of gastrointestinal health? BioEssays 37, 403–412.

Bell, T. (2019). Next-generation experiments linking community structure and ecosystem functioning. Environ. Microbiol. Rep. 11, 20–22.

Borchers, H.W. (2019). pracma: Practical Numerical Math Functions.

Callahan, B.J., McMurdie, P.J., Rosen, M.J., Han, A.W., Johnson, A.J.A., and Holmes, S.P. (2016). DADA2: High-resolution sample inference from Illumina amplicon data. Nat. Methods 13, 581.

Callahan, B.J., McMurdie, P.J., and Holmes, S.P. (2017). Exact sequence variants should replace operational taxonomic units in marker-gene data analysis. ISME J. 11, 2639–2643.

Campbell, C.D., Chapman, S.J., Cameron, C.M., Davidson, M.S., and Potts, J.M. (2003). A Rapid Microtiter Plate Method To Measure Carbon Dioxide Evolved from Carbon Substrate Amendments so as To Determine the Physiological Profiles of Soil Microbial Communities by Using Whole Soil. Appl. Environ. Microbiol. 69, 3593–3599.

Canani, R.B., Costanzo, M.D., Leone, L., Pedata, M., Meli, R., and Calignano, A. (2011). Potential beneficial effects of butyrate in intestinal and extraintestinal diseases. World J. Gastroenterol. WJG 17, 1519–1528.

Caporaso, J.G., Kuczynski, J., Stombaugh, J., Bittinger, K., Bushman, F.D., Costello, E.K., Fierer, N., Peña, A.G., Goodrich, J.K., Gordon, J.I., et al. (2010). QIIME allows analysis of high-throughput community sequencing data. Nat. Methods 7, 335–336.

Carbonero, F., Benefiel, A.C., and Gaskins, H.R. (2012). Contributions of the microbial hydrogen economy to colonic homeostasis. Nat. Rev. Gastroenterol. Hepatol. 9, 504–518.

Cardinale, B.J., Srivastava, D.S., Emmett Duffy, J., Wright, J.P., Downing, A.L., Sankaran, M., and Jouseau, C. (2006). Effects of biodiversity on the functioning of trophic groups and ecosystems. Nature 443, 989–992.

Cermak, N., Becker, J.W., Knudsen, S.M., Chisholm, S.W., Manalis, S.R., and Polz, M.F. (2017). Direct single-cell biomass estimates for marine bacteria via Archimedes’ principle. ISME J. 11, 825–828.

Chen, F., Chang, Y., Dong, S., and Xue, C. (2016). Wenyingzhuangia fucanilytica sp. nov., a sulfated fucan utilizing bacterium isolated from shallow coastal seawater. Int. J. Syst. Evol. Microbiol. 66, 3270–3275.

Cho, I., and Blaser, M.J. (2012). The human microbiome: at the interface of health and disease. Nat. Rev. Genet. 13, 260–270.

Cole, J.R., Wang, Q., Fish, J.A., Chai, B., McGarrell, D.M., Sun, Y., Brown, C.T., Porras-Alfaro, A., Kuske, C.R., and Tiedje, J.M. (2014). Ribosomal Database Project: data and tools for high throughput rRNA analysis. Nucleic Acids Res. 42, D633–D642.

Comeau, A.M., Douglas, G.M., and Langille, M.G.I. (2017). Microbiome Helper: a Custom and Streamlined Workflow for Microbiome Research. MSystems 2, e00127-16.

Cordero, O.X., Wildschutte, H., Kirkup, B., Proehl, S., Ngo, L., Hussain, F., Roux, F.L., Mincer, T., and Polz, M.F. (2012). Ecological Populations of Bacteria Act as Socially Cohesive Units of Antibiotic Production and Resistance. Science 337, 1228–1231.

Cormier, R.E. (1990). Abdominal Gas. In Clinical Methods: The History, Physical, and Laboratory Examinations, H.K. Walker, W.D. Hall, and J.W. Hurst, eds. (Boston: Butterworths), p.

Cotillard, A., Kennedy, S.P., Kong, L.C., Prifti, E., Pons, N., Le Chatelier, E., Almeida, M., Quinquis, B., Levenez, F., Galleron, N., et al. (2013). Dietary intervention impact on gut microbial gene richness. Nature 500, 585–588.

Coyte, K.Z., Schluter, J., and Foster, K.R. (2015). The ecology of the microbiome: Networks, competition, and stability. Science 350, 663–666.

de Wit, C.T., and van den Bergh, J.P. (1965). Competition between herbage plants. 13.

Datta, M.S., Sliwerska, E., Gore, J., Polz, M.F., and Cordero, O.X. (2016). Microbial interactions lead to rapid micro-scale successions on model marine particles. Nat. Commun. 7, ncomms11965.

David, L.A., Materna, A.C., Friedman, J., Campos-Baptista, M.I., Blackburn, M.C., Perrotta, A., Erdman, S.E., and Alm, E.J. (2014). Host lifestyle affects human microbiota on daily timescales. Genome Biol. 15, R89.

De Filippis, F., Vitaglione, P., Cuomo, R., Berni Canani, R., and Ercolini, D. (2018). Dietary Interventions to Modulate the Gut Microbiome—How Far Away Are We From Precision Medicine. Inflamm. Bowel Dis. 24, 2142–2154.

DeSantis, T.Z., Hugenholtz, P., Larsen, N., Rojas, M., Brodie, E.L., Keller, K., Huber, T., Dalevi, D., Hu, P., and Andersen, G.L. (2006). Greengenes, a Chimera-Checked 16S rRNA Gene Database and Workbench Compatible with ARB. Appl. Environ. Microbiol. 72, 5069–5072.

Dini-Andreote, F., Stegen, J.C., Elsas, J.D. van, and Salles, J.F. (2015). Disentangling mechanisms that mediate the balance between stochastic and deterministic processes in microbial succession. Proc. Natl. Acad. Sci. 112, E1326–E1332.

Eng, A., and Borenstein, E. (2019). Microbial community design: methods, applications, and opportunities. Curr. Opin. Biotechnol. 58, 117–128.

Epstein, S. (2013). The phenomenon of microbial uncultivability. Curr. Opin. Microbiol. 16, 636–642.

Faith, J.J., Guruge, J.L., Charbonneau, M., Subramanian, S., Seedorf, H., Goodman, A.L., Clemente, J.C., Knight, R., Heath, A.C., Leibel, R.L., et al. (2013). The Long-Term Stability of the Human Gut Microbiota. Science 341, 1237439.

Fiegna, F., Moreno-Letelier, A., Bell, T., and Barraclough, T.G. (2015). Evolution of species interactions determines microbial community productivity in new environments. ISME J. 9, 1235–1245.

Fischbach, M.A. (2018). Microbiome: Focus on Causation and Mechanism. Cell 174, 785–790.

Fletcher, H.R., Biller, P., Ross, A.B., and Adams, J.M.M. (2017). The seasonal variation of fucoidan within three species of brown macroalgae. Algal Res. 22, 79–86.

Florin, T., Neale, G., Gibson, G.R., Christl, S.U., and Cummings, J.H. (1991). Metabolism of dietary sulphate: absorption and excretion in humans. Gut 32, 766–773.

Fox, J.W. (2006a). Using the Price equation to partition the effects of biodiversity loss on ecosystem function. Ecology 87, 2687–2696.

Fox, J.W. (2006b). Using the Price equation to partition the effects of biodiversity loss on ecosystem function. Ecology 87, 2687–2696.

Fox, J., and Weisberg, S. (2011). An R Companion to Applied Regression (SAGE Publications).

Friedman, J., Higgins, L.M., and Gore, J. (2017). Community structure follows simple assembly rules in microbial microcosms. Nat. Ecol. Evol. 1, 0109.

Gagen, E.J., Padmanabha, J., Denman, S.E., and McSweeney, C.S. (2015). Hydrogenotrophic culture enrichment reveals rumen Lachnospiraceae and Ruminococcaceae acetogens and hydrogen-responsive Bacteroidetes from pasture-fed cattle. FEMS Microbiol. Lett. 362.

Geyer, K.M., Kyker-Snowman, E., Grandy, A.S., and Frey, S.D. (2016). Microbial carbon use efficiency: accounting for population, community, and ecosystem-scale controls over the fate of metabolized organic matter. Biogeochemistry 127, 173–188.

Gibson, P.R., and Shepherd, S.J. (2010). Evidence-based dietary management of functional gastrointestinal symptoms: The FODMAP approach. J. Gastroenterol. Hepatol. 25, 252–258.

Gibson, G.R., Hutkins, R., Sanders, M.E., Prescott, S.L., Reimer, R.A., Salminen, S.J., Scott, K., Stanton, C., Swanson, K.S., Cani, P.D., et al. (2017). Expert consensus document: The International Scientific Association for Probiotics and Prebiotics (ISAPP) consensus statement on the definition and scope of prebiotics. Nat. Rev. Gastroenterol. Hepatol. 14, 491–502.

Goldford, J.E., Lu, N., Bajic, D., Estrela, S., Tikhonov, M., Sanchez-Gorostiaga, A., Segre, D., Mehta, P., and Sanchez, A. (2017). Emergent Simplicity in Microbial Community Assembly. BioRxiv 205831.

Gravel, D., Bell, T., Barbera, C., Bouvier, T., Pommier, T., Venail, P., and Mouquet, N. (2011). Experimental niche evolution alters the strength of the diversity–productivity relationship. Nature 469, 89–92.

Green, S.J., Leigh, M.B., and Neufeld, J.D. (2010). Denaturing Gradient Gel Electrophoresis (DGGE) for Microbial Community Analysis. In Handbook of Hydrocarbon and Lipid Microbiology, K.N. Timmis, ed. (Berlin, Heidelberg: Springer Berlin Heidelberg), pp. 4137–4158.

Guittar, J., Shade, A., and Litchman, E. (2019). Trait-based community assembly and succession of the infant gut microbiome. Nat. Commun. 10, 512.

Gurry, T., Gibbons, S.M., Nguyen, L.T.T., Kearney, S.M., Ananthakrishnan, A., Jiang, X., Duvallet, C., Kassam, Z., and Alm, E.J. (2018). Predictability and persistence of prebiotic dietary supplementation in a healthy human cohort. Sci. Rep. 8, 12699.

Gurry, T., Nguyen, L.T.T., Yu, X., Fischbach, M.A., and Alm, E.J. Functional heterogeneity in the fermentation capabilities of the healthy human gut microbiota (In prep).

Holscher, H.D. (2017). Dietary fiber and prebiotics and the gastrointestinal microbiota. Gut Microbes 8, 172–184.

Jaillard, B., Richon, C., Deleporte, P., Loreau, M., and Violle, C. An a posteriori species clustering for quantifying the effects of species interactions on ecosystem functioning. Methods Ecol. Evol. 9, 704–715.

Jousset, A., Schmid, B., Scheu, S., and Eisenhauer, N. (2011). Genotypic richness and dissimilarity opposingly affect ecosystem functioning. Ecol. Lett. 14, 537–545.

Kembel, S.W., Cowan, P.D., Helmus, M.R., Cornwell, W.K., Morlon, H., Ackerly, D.D., Blomberg, S.P., and Webb, C.O. (2010). Picante: R tools for integrating phylogenies and ecology. Bioinformatics 26, 1463–1464.

Langille, M.G.I., Zaneveld, J., Caporaso, J.G., McDonald, D., Knights, D., Reyes, J.A., Clemente, J.C., Burkepile, D.E., Vega Thurber, R.L., Knight, R., et al. (2013). Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences. Nat. Biotechnol. 31, 814–821.

Linhart, Y.B., and Grant, M.C. (1996). Evolutionary Significance of Local Genetic Differentiation in Plants. Annu. Rev. Ecol. Syst. 27, 237–277.

Lipson, D.A. (2015). The complex relationship between microbial growth rate and yield and its implications for ecosystem processes. Front. Microbiol. 6, 615.

Loreau, M., and Hector, A. (2001a). Partitioning selection and complementarity in biodiversity experiments. Nature 412, 72–76.

Loreau, M., and Hector, A. (2001b). Partitioning selection and complementarity in biodiversity experiments. Nature 412, 72–76.

Martin-Platero, A.M., Cleary, B., Kauffman, K., Preheim, S.P., McGillicuddy, D.J., Alm, E.J., and Polz, M.F. (2018). High resolution time series reveals cohesive but short-lived communities in coastal plankton. Nat. Commun. 9, 266.

Mas-Lladó, M., Piña-Villalonga, J.M., Brunet-Galmés, I., Nogales, B., and Bosch, R. (2014). Draft Genome Sequences of Two Isolates of the Roseobacter Group, Sulfitobacter sp. Strains 3SOLIMAR09 and 1FIGIMAR09, from Harbors of Mallorca Island (Mediterranean Sea). Genome Announc. 2, e00350-14.

May, R.M., and Arthur, R.H.M. (1972). Niche Overlap as a Function of Environmental Variability. Proc. Natl. Acad. Sci. 69, 1109–1113.

Maynard, D.S., Crowther, T.W., and Bradford, M.A. (2017). Competitive network determines the direction of the diversity–function relationship. Proc. Natl. Acad. Sci. 201712211.

Maynard, D.S., Crowther, T.W., and Bradford, M.A. Fungal interactions reduce carbon use efficiency. Ecol. Lett. 20, 1034–1042.

McIntosh, K., Reed, D.E., Schneider, T., Dang, F., Keshteli, A.H., De Palma, G., Madsen, K., Bercik, P., and Vanner, S. (2017). FODMAPs alter symptoms and the metabolome of patients with IBS: a randomised controlled trial. Gut 66, 1241–1251.

Geyer, C.J., and Meeden, G.D. (2019). rcdd: Computational Geometry. Incorporates code from cddlib (ver. 0.94f) written by K. Fukuda.

Michał, B., Gagat, P., Jabłoński, S., Chilimoniuk, J., Gaworski, M., Mackiewicz, P., and Marcin, Ł. (2018). PhyMet2: a database and toolkit for phylogenetic and metabolic analyses of methanogens. Environ. Microbiol. Rep. 10, 378–382.

Míguez, B., Gómez, B., Gullón, P., Gullón, B., and Alonso, J.L. (2016). Pectic Oligosaccharides and Other Emerging Prebiotics. Probiotics Prebiotics Hum. Nutr. Health.

Müller, V. (2008). Bacterial Fermentation. In ELS, (American Cancer Society), p.

Oksanen, J., Blanchet, F.G., Kindt, R., Legendre, P., Minchin, P., O’Hara, R., Simpson, G., Solymos, P., Stevens, M., and Wagner, H. (2013). Vegan: Community Ecology Package. R Package Version. 2.0-10. CRAN.

Owen, R.W., Spiegelhalder, B., and Bartsch, H. (2000). Generation of reactive oxygen species by the faecal matrix. Gut 46, 225–232.

Pasolli, E., Asnicar, F., Manara, S., Zolfo, M., Karcher, N., Armanini, F., Beghini, F., Manghi, P., Tett, A., Ghensi, P., et al. (2019). Extensive Unexplored Human Microbiome Diversity Revealed by Over 150,000 Genomes from Metagenomes Spanning Age, Geography, and Lifestyle. Cell 176, 649-662.e20.

Peter, H., Beier, S., Bertilsson, S., Lindström, E.S., Langenheder, S., and Tranvik, L.J. (2011). Function-specific response to depletion of microbial diversity. ISME J. 5, 351–361.

Pfeiffer, T., Schuster, S., and Bonhoeffer, S. (2001). Cooperation and Competition in the Evolution of ATP-Producing Pathways. Science 292, 504–507.

Philippot, L., Spor, A., Hénault, C., Bru, D., Bizouard, F., Jones, C.M., Sarr, A., and Maron, P.-A. (2013). Loss in microbial diversity affects nitrogen cycling in soil. ISME J. 7, 1609.

Polz, M.F., and Cordero, O.X. (2016). Bacterial evolution: Genomics of metabolic trade-offs. Nat. Microbiol. 1, 16181.

Preheim, S.P., Perrotta, A.R., Friedman, J., Smilie, C., Brito, I., Smith, M.B., and Alm, E. (2013). Chapter Eighteen - Computational Methods for High-Throughput Comparative Analyses of Natural Microbial Communities. In Methods in Enzymology, E.F. DeLong, ed. (Academic Press), pp. 353–370.

Price, M.N., Dehal, P.S., and Arkin, A.P. (2010). FastTree 2 – Approximately Maximum-Likelihood Trees for Large Alignments. PLOS ONE 5, e9490.

Rajilić-Stojanović, M., and de Vos, W.M. (2014). The first 1000 cultured species of the human gastrointestinal microbiota. FEMS Microbiol. Rev. 38, 996–1047.

Ratzke, C., and Gore, J. (2018). Modifying and reacting to the environmental pH can drive bacterial interactions. PLOS Biol. 16, e2004248.

Ravenhall, M., Škunca, N., Lassalle, F., and Dessimoz, C. (2015). Inferring Horizontal Gene Transfer. PLoS Comput. Biol. 11.

Rivett, D.W., and Bell, T. (2018). Abundance determines the functional role of bacterial phylotypes in complex communities. Nat. Microbiol. 3, 767.

Roller, B.R.K., Stoddard, S.F., and Schmidt, T.M. (2016). Exploiting rRNA operon copy number to investigate bacterial reproductive strategies. Nat. Microbiol. 1, 16160.

Rypien, K.L., Ward, J.R., and Azam, F. (2010). Antagonistic interactions among coral-associated bacteria. Environ. Microbiol. 12, 28–39.

Sahakian, A.B., Jee, S.-R., and Pimentel, M. (2010). Methane and the Gastrointestinal Tract. Dig. Dis. Sci. 55, 2135–2143.

Sawyer, C.N., McCarty, P.L., and Parkin, G.F. (2003). Chemistry for Environmental Engineering and Science (McGraw-Hill Education).

Schaechter, M., Maaloe, O., and Kjeldgaard, N.O. (1958). Dependency on medium and temperature of cell size and chemical composition during balanced growth of Salmonella typhimurium. J. Gen. Microbiol. 19, 592–606.

Schliep, K.P. (2011). phangorn: phylogenetic analysis in R. Bioinformatics 27, 592–593.

Seo, A.Y., Kim, N., and Oh, D.H. (2013). Abdominal Bloating: Pathophysiology and Treatment. J. Neurogastroenterol. Motil. 19, 433–453.

Shibata, N., Kunisawa, J., and Kiyono, H. (2017). Dietary and Microbial Metabolites in the Regulation of Host Immunity. Front. Microbiol. 8.

Singh, R.P., and Reddy, C.R.K. (2014). Seaweed–microbial interactions: key functions of seaweed-associated bacteria. FEMS Microbiol. Ecol. 88, 213–230.

Søndergaard, D., Pedersen, C.N.S., and Greening, C. (2016). HydDB: A web tool for hydrogenase classification and analysis. Sci. Rep. 6, 34212.

Spiess, A.-N. (2018). propagate: Propagation of Uncertainty.

Stegen, J.C., Lin, X., Konopka, A.E., and Fredrickson, J.K. (2012). Stochastic and deterministic assembly processes in subsurface microbial communities. ISME J. 6, 1653–1664.

Stewart, E.J. (2012). Growing Unculturable Bacteria. J. Bacteriol. 194, 4151–4160.

Szabó, K.É., Itor, P.O.B., Bertilsson, S., Tranvik, L., and Eiler, A. (2007). Importance of rare and abundant populations for the structure and functional potential of freshwater bacterial communities. Aquat. Microb. Ecol. 47, 1–10.

Takemura, A.F., Corzett, C.H., Hussain, F., Arevalo, P., Datta, M., Yu, X., Le Roux, F., and Polz, M.F. (2017). Natural resource landscapes of a marine bacterium reveal distinct fitness-determining genes across the genome. Environ. Microbiol. 19, 2422–2433.

Teeling, H., Fuchs, B.M., Becher, D., Klockow, C., Gardebrecht, A., Bennke, C.M., Kassabgy, M., Huang, S., Mann, A.J., Waldmann, J., et al. (2012). Substrate-Controlled Succession of Marine Bacterioplankton Populations Induced by a Phytoplankton Bloom. Science 336, 608–611.

The Human Microbiome Project Consortium, Huttenhower, C., Gevers, D., Knight, R., Abubucker, S., Badger, J.H., Chinwalla, A.T., Creasy, H.H., Earl, A.M., FitzGerald, M.G., et al. (2012). Structure, function and diversity of the healthy human microbiome. Nature 486, 207–214.

Thompson, L.R., Sanders, J.G., McDonald, D., Amir, A., Ladau, J., Locey, K.J., Prill, R.J., Tripathi, A., Gibbons, S.M., Ackermann, G., et al. (2017). A communal catalogue reveals Earth’s multiscale microbial diversity. Nature 551, 457–463.

Tremaroli, V., and Bäckhed, F. (2012). Functional interactions between the gut microbiota and host metabolism. Nature 489, 242–249.

Tropini, C., Earle, K.A., Huang, K.C., and Sonnenburg, J.L. (2017). The Gut Microbiome: Connecting Spatial Organization to Function. Cell Host Microbe 21, 433–442.

Tsoi, R., Dai, Z., and You, L. (2019). Emerging strategies for engineering microbial communities. Biotechnol. Adv.

Verspreet, J., Damen, B., Broekaert, W.F., Verbeke, K., Delcour, J.A., and Courtin, C.M. (2016). A Critical Look at Prebiotics Within the Dietary Fiber Concept. Annu. Rev. Food Sci. Technol. 7, 167–190.

Williams, H.N., Lymperopoulou, D.S., Athar, R., Chauhan, A., Dickerson, T.L., Chen, H., Laws, E., Berhane, T.-K., Flowers, A.R., Bradley, N., et al. (2016). Halobacteriovorax, an underestimated predator on bacteria: potential impact relative to viruses on bacterial mortality. ISME J. 10, 491–499.

Willis, A., and Bunge, J. (2015). Estimating diversity via frequency ratios. Biometrics 71, 1042–1049.

Wright, E.S. (2016). Using DECIPHER v2.0 to Analyze Big Biological Sequence Data in R. 8, 8.

Yu, X., Polz, M.F., and Alm, E.J. (2019). Interactions in self-assembled microbial communities saturate with diversity. ISME J. 1.

Zaneveld, J.R., McMinds, R., and Thurber, R.V. (2017). Stress and stability: applying the Anna Karenina principle to animal microbiomes. Nat. Microbiol. 2, nmicrobiol2017121.

Zhang, H., Yohe, T., Huang, L., Entwistle, S., Wu, P., Yang, Z., Busk, P.K., Xu, Y., and Yin, Y. (2018). dbCAN2: a meta server for automated carbohydrate-active enzyme annotation. Nucleic Acids Res. 46, W95–W101.

Zubkov, M.V., Fuchs, B.M., Eilers, H., Burkill, P.H., and Amann, R. (1999). Determination of Total Protein Content of Bacterial Cells by SYPRO Staining and Flow Cytometry. Appl. Environ. Microbiol. 65, 3251–3257.

R Core Team (2017). R: A Language and Environment for Statistical Computing (Vienna, Austria: R Foundation for Statistical Computing).
