Environmental proteomics reveals taxonomic and functional changes in an enriched aquatic ecosystem
Citation Northrop, Amanda C., Rachel K. Brooks, Aaron M. Ellison, Nicholas J. Gotelli, and Bryan A. Ballif. 2017. “Environmental Proteomics Reveals Taxonomic and Functional Changes in an Enriched Aquatic Ecosystem.” Ecosphere 8 (10) (October): e01954. doi:10.1002/ecs2.1954.
Published Version 10.1002/ecs2.1954
Citable link http://nrs.harvard.edu/urn-3:HUL.InstRepos:34389684
Terms of Use This article was downloaded from Harvard University’s DASH repository, and is made available under the terms and conditions applicable to Other Posted Material, as set forth at http://nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of-use#LAA
Authors: Amanda C. Northrop¹, Rachel Brooks¹, Aaron M. Ellison², Nicholas J. Gotelli¹*, and Bryan A. Ballif¹*
Affiliations:
¹ Department of Biology, University of Vermont, Burlington, VT 05405, USA.
² Harvard Forest, Harvard University, Petersham, MA 01366, USA.
*Corresponding authors. E-mail: [email protected] (B.A.B.); [email protected] (N.J.G.)

Conflict of Interest
The authors declare no conflict of interest.

Abstract
Aquatic ecosystem enrichment can lead to distinct and irreversible changes to undesirable states. Understanding changes in active microbial community function and composition following organic-matter loading in enriched ecosystems can help identify biomarkers of such state changes. In a field experiment, we enriched replicate aquatic ecosystems in the pitchers of the northern pitcher plant, Sarracenia purpurea. Shotgun metaproteomics using a custom metagenomic database identified proteins, molecular pathways, and contributing microbial taxa that differentiated control ecosystems from those that were enriched. The number of microbial taxa contributing to protein expression was comparable between treatments; however, taxonomic evenness was higher in controls. Functionally active bacterial composition differed significantly between treatments and was more divergent in control pitchers than in enriched pitchers. Aerobic and facultative anaerobic bacteria contributed most to identified proteins in control and enriched ecosystems, respectively. The molecular pathways and contributing taxa in enriched pitcher ecosystems were similar to those found in larger enriched aquatic ecosystems and are consistent with microbial processes occurring at the base of detrital food webs. Detectable differences between protein profiles of enriched and control ecosystems suggest that a time series of environmental proteomics data may identify protein biomarkers of impending state changes to enriched states.

Key words: aquatic ecosystems; bacterial communities; environmental proteomics; model ecosystem; organic matter enrichment; Sarracenia purpurea.
Introduction
Chronic and directional environmental drivers such as nutrient and organic matter enrichment are causing state changes in many ecosystems (Rabalais et al. 2009, Scheffer 2009). Mitigating or preventing these state changes requires predicting them with sufficient lead-time (Biggs et al. 2009). Current prediction methods rely on the statistical signature of “critical slowing down” (Scheffer et al. 2009), an increase in the variance or temporal autocorrelation of a state variable (Dakos et al. 2015). However, such indicators usually require long time series of data with frequent sampling of an appropriate state variable (Bestelmeyer et al. 2011, Levin and Mollmann 2015). Even when such data are available, the signature of critical slowing down may not provide enough lead-time for intervention (Biggs et al. 2009, Contamin and Ellison 2009).
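To make the two indicators named above concrete, the following minimal sketch computes the variance and lag-1 autocorrelation of a state variable in a sliding window. It is an illustration only: the function names and the window size are assumptions, and the cited studies use more elaborate detrending and significance testing.

```python
from statistics import mean, pvariance


def lag1_autocorrelation(series):
    """Lag-1 autocorrelation of a sequence (0.0 if the sequence is constant)."""
    m = mean(series)
    num = sum((a - m) * (b - m) for a, b in zip(series[:-1], series[1:]))
    den = sum((x - m) ** 2 for x in series)
    return num / den if den else 0.0


def rolling_indicators(series, window):
    """Return (variance, lag-1 autocorrelation) for each sliding window."""
    out = []
    for start in range(len(series) - window + 1):
        chunk = series[start:start + window]
        out.append((pvariance(chunk), lag1_autocorrelation(chunk)))
    return out
```

A sustained rise in either value across successive windows is the pattern usually read as critical slowing down.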
In aquatic systems, water quality indicators such as total suspended solids (Hargeby et al. 2007), submersed macrophyte vegetation cover (Dennison et al. 1993, Sondergaard et al. 2010), diatom composition (Pan et al. 1996), and phytoplankton biomass (Carpenter et al. 2008) often are used as state variables. However, whether top-down or bottom-up forces initiate the change, the proximate cause of eutrophication in many freshwater aquatic ecosystems is microbial processes associated with the breakdown of detritus (Chrost and Siuda 2006). A primary reason that it has been difficult to forecast shifts with sufficient lead-time may be that changes in monitored variables lag behind the microbial processes that underlie state changes. We hypothesize that biomarkers linked closely to microbial function, such as proteins, may serve as better early warning signals of impending state changes than traditional aquatic ecosystem biomarkers.
One of the challenges to studying aquatic ecosystem state changes is the lack of replicable natural ecosystems that can be ethically manipulated. Recently, we have identified the aquatic ecosystem that assembles in the cup-shaped leaves of the northern pitcher plant Sarracenia purpurea as a model system for identifying whole-ecosystem microbial processes associated with detrital enrichment. Each leaf functions as an independent ecosystem that can be experimentally enriched and monitored through time in the field or lab (Srivastava et al. 2004). Arthropod prey, mostly ants and flies, form the base of a “brown” food web that includes dipteran larvae, protozoa, mites, rotifers, and a diverse assemblage of bacteria that decompose and mineralize nearly all the captured prey biomass (Ellison et al. 2003, Butler et al. 2008, Koopman and Carstens 2011, Gray et al. 2012). Even in the absence of macroinvertebrates, the dominant transfer of nutrients to the plant occurs via microbial activity (Butler et al. 2008). With excess organic matter loading, microbial activity increases, pitcher fluid becomes turbid, and oxygen levels collapse to hypoxic conditions even during daytime photosynthesis (Sirota et al. 2013). Such consequences are similar to those seen in larger aquatic ecosystems that have switched from a green to a brown food web dominated by detritivores, as an initial increase in primary production leads to internal organic-matter loading and increasing biological oxygen demand as primary producers decompose (Correll 1998).
In the last decade, environmental proteomics has emerged as a powerful tool to measure microbial community function in a variety of aquatic habitats, including contaminated groundwater (Benndorf et al. 2007), coastal upwelling systems (Sowell et al. 2011), estuaries (Colatriano et al. 2015), and meromictic lakes (Lauro et al. 2011). Additionally, environmental proteomics has promise as a tool for identifying biomarkers of changing environmental conditions, including aquatic pollution (Campos et al. 2012, Ullrich et al. 2016). Environmental proteomics looks at the complete set of proteins expressed in an ecosystem at a single time point and gives insight into the function of a community. While metatranscriptomics also serves as an important tool for understanding community function, mRNA and protein levels are generally not strongly correlated (Vogel and Marcotte 2012); this is especially true for bacteria in perturbed systems (Jayapal et al. 2008). Therefore, metaproteomics may provide a more accurate picture of bacterial community function in enriched aquatic habitats.
As a first step toward determining the utility of microbial protein biomarkers as early warning signals of state changes, we conducted an environmental proteomics screen of the aquatic ecosystem in S. purpurea pitchers enriched with organic matter to determine whether there are detectable differences between the proteins, associated molecular pathways, and taxa contributing to expressed proteins in microbial (nonviral organisms <30 µm) communities in enriched vs. control ecosystems. We hypothesized that an environmental proteomics survey would reveal detectable differences in taxa contributing to protein expression, proteins, and functional pathways between enriched and control ecosystems. We expected to find differences between control and enriched pitchers in pathways related to respiration and decomposition, changes in the oxygen requirement of microbes contributing to expressed proteins, and shifts in the taxonomic composition of microbes contributing to protein expression. Specifically, we predicted an abundance of contributing anaerobic bacteria in enriched pitchers relative to controls. We identified and found detectable differences in taxa, proteins, and pathways common to a wide range of aquatic ecosystems. Our results suggest that environmental proteomics can be a useful tool for detecting alternative enriched and unenriched states in aquatic ecosystems and may serve as a means to identify protein biomarkers of impending shifts between such states.

Methods
Enrichment Experiment
The field experiment was conducted in Tom Swamp, a nutrient-poor fen located at the northern end of Harvard Pond (42.51 N, −72.21 W) at Harvard Forest, Worcester County, Massachusetts. Newly opened pitchers were identified and randomly assigned to an ambient control or detritus-enriched treatment (Appendix S1). Previous work by Peterson et al. (2008) using culture-independent methods revealed that newly opened pitchers are sterile and impermeable to bacteria, so we are reasonably sure that our experimental pitchers did not harbor diverse bacterial communities prior to the start of the experiment. Detritus-enriched pitchers received 1 mg ml⁻¹ d⁻¹ of oven-dried, finely ground wasps (Dolichovespula maculata) (Appendix S1), which have elemental ratios (C:N, 5.99:1; N:P:K, 10.7:1.75:1.01) similar to those of Sarracenia’s natural ant prey (C:N, 5.9:1; N:P:K, 12.1:1.52:0.93) (Farnsworth and Ellison 2008). Proteomic analysis of the ground wasp (not reported here) failed to identify microbial proteins, so we are confident that microbial contribution to enriched pitchers from the wasps was minimal. Enrichment treatments were applied for 14 consecutive days; all pitchers were otherwise unmanipulated. Pitcher fluid was sampled on the first and last days of the experiment, filtered to remove microbes >30 µm, pelleted, and stored at −80 °C until processed (Appendix S1).

Protein Extraction, SDS-PAGE, and Mass Spectrometry
Six of ten replicate microbial pellets from each treatment yielded enough protein for analysis via tandem mass spectrometry. All replicates were analyzed separately using SDS-PAGE and Coomassie staining (Fig. 1, Appendix 1: Fig. S1a, and Appendix 1: Fig. S1b). All six of the enriched pitchers and five of the six control pitchers had visible protein staining levels and were chosen for mass spectrometry. Proteins were subjected to a tryptic digest (Appendix S1) and to LC-MS/MS as previously described (Cheerathodi and Ballif 2011) using a linear ion trap mass spectrometer (Thermo Electron, Waltham, MA, USA). MS/MS spectra were matched to peptides in a custom protein database using SEQUEST software as described below.

Custom Metagenomic Databases
We generated a custom protein database from a six-frame forward and reverse translation of a metagenomic database constructed from microbial communities of three previously collected pitchers that had captured diverse amounts of prey (Appendix 1: Fig. S2). Pitchers were collected from Molly Bog, an ombrotrophic bog located in Morristown, VT (44.50 N, −72.64 W) on 18 August 2008 and transported in a cooler directly from the field to the University of Vermont. Microbial pellets were obtained immediately as described above. DNA was extracted, prepared, and sent to Génome Québec (Montréal, QC, Canada) for library construction, sequencing, and assembly with the 454 GS-FLX Titanium Sequencing System (Roche) (Appendix S1). Contigs were assembled de novo with Roche’s Newbler assembler v2.3 (release 091027_1459) using default parameters (minimum Read Length = 20; overlap Seed Step = 12; overlap Seed Length = 16; overlap Min Seed Count = 1; overlap Seed Hit Limit = 70; overlap Min Match Length = 40; overlap Min Match Identity = 90; overlap Match Ident Score = 2; overlap Match Diff Score = −3; overlap Match Unique Thresh = 12; map Min Contig Depth = 1; all Contig Thresh = 100), with the exception of minimum read length (20 bp) and overlap Hit Position Limit (1,000,000). The assembled contigs were imported into MG-RAST 4.0.2 to assess functional and taxonomic potential (Meyer et al. 2008). Taxonomic assignments were visualized using the Krona plugin, and the following cutoffs were applied to both taxonomic and subsystem functional category assignments: minimum identity = 60%, e-value of 1 × 10⁻⁵ or less, and a minimum alignment length of 15 bp (Appendix 1: Fig. S3). We calculated Hurlbert’s probability of an interspecific encounter (PIE) to estimate the evenness of bacterial classes in the metagenome (Hurlbert 1971) (Appendix S1). KEGG pathways (level 2 and level 3) were assigned to contigs using the KEGG database via MG-RAST (we report only the top 73 level 3 pathways here) (Appendix 1: Fig. S4).
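For readers unfamiliar with Hurlbert’s PIE, a minimal sketch of the calculation follows. It assumes the finite-sample form PIE = [N/(N − 1)](1 − Σ pᵢ²), where pᵢ is the proportion of observations in class i; the exact variant used here is given in Appendix S1 and may differ (for example, by omitting the N/(N − 1) correction). The function name is hypothetical.

```python
def hurlbert_pie(counts):
    """Probability that two observations drawn without replacement differ in class.

    counts: number of observations (e.g., contigs or peptides) per class.
    Assumes the finite-sample form with the N / (N - 1) correction.
    """
    n = sum(counts)
    if n < 2:
        raise ValueError("PIE needs at least two observations")
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts))
```

Values near 0 indicate dominance by a single class, and values near 1 indicate an even spread across many classes.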
A metaproteomic database was created with a six-frame forward and reverse translation of the assembled metagenome using open-source Ruby software. Sequences longer than 100 amino acids (n = 184,128) were retained. A decoy database was constructed by reversing the retained sequences and concatenating them to the forward database to allow for an estimation of the false discovery rate, as has been described (Elias and Gygi 2007).
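The database construction described above can be sketched as follows. This is an illustration in Python rather than the Ruby software actually used, and several details are assumptions: the standard codon table, splitting translations at stop codons before applying the length filter, and the `target_`/`decoy_` naming scheme.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; unknown codons (e.g., containing N) become X."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frame_peptides(dna, min_len=100):
    """Stop-free stretches longer than min_len residues from all six frames."""
    dna = dna.upper()
    peptides = []
    for strand in (dna, reverse_complement(dna)):
        for frame in range(3):
            for segment in translate(strand[frame:]).split("*"):
                if len(segment) > min_len:
                    peptides.append(segment)
    return peptides


def with_decoys(proteins):
    """Concatenate forward (target) sequences with their reversed (decoy) copies."""
    database = {f"target_{i}": seq for i, seq in enumerate(proteins)}
    database.update({f"decoy_{i}": seq[::-1] for i, seq in enumerate(proteins)})
    return database
```

Because each decoy is a reversed target, the two halves of the database have identical size and amino acid composition, which is what lets decoy matches stand in for random target matches.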

Protein Orthologue Identification
Peptide and protein identifications were made via a SEQUEST search of the tandem mass spectral data against the custom pitcher-plant microbial community protein database described above (Appendix S1). The number of protein hits varied substantially among replicates, so to have enough proteins for treatment comparisons, peptides and proteins from the five control samples and six enriched samples were pooled after LC-MS/MS and the SEQUEST search into a single control and a single enriched sample dataset. Only doubly and triply charged peptide ions were considered further, and each dataset was filtered by first adjusting the cutoffs for XCorr and ΔCn until the false discovery rate was <10%. The final filters were: XCorr ≥ 3.0 for doubly charged ions, XCorr ≥ 3.3 for triply charged ions, and unique Δcorr ≥ 0.15. The resulting list of protein hits for each treatment was then ranked by unique number of peptides, and the top 220 proteins from each treatment were selected so that the false discovery rates for the control and enriched treatments were 6.6% and 0%, respectively. These top 220 proteins and their associated peptides are found in Data Supplement S1.
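A minimal sketch of the filtering and false-discovery-rate logic described above is given below. The filter thresholds are those stated in the text; the FDR estimator (twice the number of decoy hits divided by all hits) is one common target-decoy convention and is an assumption here, since the exact formula is not stated. Function and argument names are hypothetical.

```python
# Minimum XCorr by precursor charge state, as stated in the text.
MIN_XCORR = {2: 3.0, 3: 3.3}
MIN_DELTA = 0.15


def passes_filters(charge, xcorr, delta):
    """Keep only doubly and triply charged ions that meet both score cutoffs."""
    return charge in MIN_XCORR and xcorr >= MIN_XCORR[charge] and delta >= MIN_DELTA


def decoy_fdr(hits):
    """Estimate the FDR from (identifier, is_decoy) pairs.

    Assumes decoy hits are as frequent as false target hits, so the
    estimated number of false positives is twice the decoy count.
    """
    hits = list(hits)
    if not hits:
        return 0.0
    n_decoy = sum(1 for _, is_decoy in hits if is_decoy)
    return 2 * n_decoy / len(hits)
```

In practice the score cutoffs would be tightened and `decoy_fdr` recomputed until the estimate falls below the chosen ceiling.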
In the list of control peptides, a protein hit from the decoy database was represented by 25 total peptides; therefore, we suspected that this hit was a true positive not represented in our target database. However, a BLAST search of the full amino acid sequence did not yield an identical match, so we cannot definitively claim it is a true positive; therefore, we removed this hit from our top 220 list for the control treatment. With this hit removed, the false discovery rate for the control treatment was 4.3%.
All peptide hits were pooled within treatments and mapped back to their source sequences in the custom protein database. Those source sequences were imported in fasta file format into blast2go v.2.8.0 (Conesa et al. 2005) for identification and annotation using the following configuration settings: blastp program, Blast Expect Value of 1.0E-3, 10 Blast Hits, Annotation CutOff >55, GO Weight >5.

Analysis of the Top Proteins Shared Between Treatments
A randomization test was done using RStudio (v. 0.98.1059) to test the hypothesis that there was a single common protein pool for both the control and enriched treatments and that the number of observed shared proteins between treatments reflects chance effects resulting from random draws from this single protein pool (Appendix S1). We conducted an additional simulation in R to determine the likelihood of a Type I error in our randomization test (Appendix S1).
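The logic of such a randomization test can be sketched as follows. This is not the procedure in Appendix S1: it is written in Python rather than R, it assumes every protein in the common pool is equally likely to be drawn, and the pool size is a free parameter that would have to be justified from the data.

```python
import random


def shared_count_null(pool_size, n_top, n_iter, seed=0):
    """Null distribution of shared proteins for two random draws from one pool.

    Assumes uniform sampling without replacement within each draw.
    """
    rng = random.Random(seed)
    pool = range(pool_size)
    counts = []
    for _ in range(n_iter):
        control = set(rng.sample(pool, n_top))
        enriched = set(rng.sample(pool, n_top))
        counts.append(len(control & enriched))
    return counts


def lower_tail_p(null_counts, observed):
    """Fraction of simulated overlaps at or below the observed overlap."""
    return sum(c <= observed for c in null_counts) / len(null_counts)
```

A small lower-tail probability indicates fewer shared proteins than expected if both treatments sampled the same pool.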

Comparison of the Top 20 Proteins from Each Treatment
We downloaded the sequence by annotation file from the blast2go search for each treatment to get the protein names associated with each protein hit (sequence description in blast2go). Each of the top 220 identified proteins in each treatment, ordered by the number of total peptides associated with the protein hit, was matched to a protein name using R software. If multiple protein hits within a treatment matched a single protein name, the protein names were merged in silico and the total peptides representing them were summed. Protein names were ranked in order of the abundance of total peptides for each treatment.
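The merge-and-rank step above amounts to summing peptide counts by protein name. A minimal sketch, in Python rather than the R code actually used and with a hypothetical function name:

```python
from collections import Counter


def rank_protein_names(hits):
    """Sum peptide counts over hits sharing a name, then rank by total.

    hits: iterable of (protein_name, total_peptides) pairs for one treatment.
    """
    totals = Counter()
    for name, n_peptides in hits:
        totals[name] += n_peptides
    return totals.most_common()
```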

Taxonomic Analysis
To determine the taxonomic composition of the microbes contributing to identified proteins in our treatments, we conducted a BLAST homology search of the metagenomic sequence data for protein hits. All peptides from the top 220 identified proteins in each treatment were mapped back to their contigs of origin to obtain nucleotide sequences. Because contigs were at least 500 base pairs in length, we felt confident that a BLAST search of the nucleotide sequences would yield correct taxonomic identifications at coarse taxonomic levels, and we acknowledge that ambiguity can remain in the taxonomic identification from a metacommunity at genus and species levels. The top BLAST hit was retained for each nucleotide sequence associated with an identified protein and linked to a bacterial class (Appendix S1). For each bacterial class identified, a 2×2 contingency table was created with treatments as columns and the number of peptides associated and not associated with the taxon as rows. A chi-square test was then used to determine if the abundance of the bacterial class was significantly different between treatments. All P values were adjusted using the Benjamini-Hochberg method (Benjamini and Hochberg 1995) (Table 1). Species composition was visualized using Krona (Ondov et al. 2011) (Appendix 1: Fig. S5). In addition to the BLAST homology search, we used Unipept (Mesuere 2016) to map tryptic peptides to the UniProtKB database and retrieve the least common taxonomic ancestor (= most derived shared taxonomic node) associated with each peptide for pooled replicates (Appendix 1: Fig. S6). We calculated Hurlbert’s PIE to estimate the evenness of bacterial classes contributing to expressed proteins in control and enriched pitchers (Hurlbert 1971) (Appendix S1).
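The per-class test described above (a 2×2 chi-square test followed by Benjamini-Hochberg adjustment) can be sketched as follows. The sketch assumes Pearson’s statistic without a continuity correction, which may differ from the settings actually used; function names are hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the table [[a, b], [c, d]] (1 df).

    Columns are treatments; rows are peptides assigned / not assigned
    to the taxon. No continuity correction is applied (an assumption).
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, 1.0
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of the chi-square distribution with 1 df.
    return chi2, math.erfc(math.sqrt(chi2 / 2))


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

One table is built per bacterial class, and the resulting P values are adjusted together as a single family.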

Functional Analysis
Functional pathways (two levels) associated with each identified protein from each treatment were retrieved using the KEGG (Kyoto Encyclopedia of Genes and Genomes) (Kanehisa et al. 2014) mapping function of blast2go v.2.8.0. Each pathway was weighted by the total number of peptides associated with protein hits, or the number of spectral counts, mapping to that pathway (Appendix 1: Fig. S7). For each pathway identified, a 2×2 contingency table was created with treatments as columns and the number of peptides associated and not associated with the pathway as rows. A chi-square test was used to determine if each pathway was significantly over- or under-represented in enriched pitchers relative to controls. All P values were adjusted using the Benjamini-Hochberg method (Benjamini and Hochberg 1995) (Appendix 1: Table S1).
To determine whether bacteria contributing to expressed proteins in control and enriched ecosystems differed in their O₂ requirements, we mapped each bacterial species identified in our BLAST search to its O₂ requirement using data from the Integrated Microbial Genomes database (IMG) (Timinskas et al. 2014, Reddy et al. 2015) (Appendix S1, Fig. ). The IMG database contains six classes of O₂ requirements: aerobe, anaerobe, facultative, microaerophilic, obligate aerobe, and obligate anaerobe. The latter three categories make up less than 7% of the database. We merged any species classified as obligate aerobes or obligate anaerobes into the aerobe and anaerobe classes, respectively.

Analysis of Unpooled Data
In addition to analyzing pooled data, we used ordination and permutation analyses to determine the effect of enrichment on microbial community protein expression, taxonomic contribution to expressed proteins at the class and family levels, and KEGG pathways. We tested the similarity within and among replicates of control and enriched microbial communities using ADONIS, a nonparametric permutation test in the ‘vegan’ package (v. 2.4.1) in R (Oksanen et al. 2016). We used a multivariate homogeneity of group dispersions test (betadisper function in the ‘vegan’ package) to determine if the composition of contributing microbial taxa was more divergent in control replicates than in enriched replicates. The permutation tests used 999 permutations and were done using total peptide counts associated with protein identifications, microbial classes, microbial families, and KEGG pathways (Table 2). To visualize the similarities among replicate ecosystems, we used the ‘vegan’ package function metaMDS to perform non-metric multidimensional scaling (NMDS) ordination using Bray-Curtis distances. Data were square-root transformed and standardized using Wisconsin double standardization. To determine which taxa contributed the most to Bray-Curtis dissimilarity of taxa contributing to protein expression between the treatments, we did a similarity percentages test using the simper function in the ‘vegan’ package.
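The transformation and distance underlying the ordination can be illustrated with a short sketch. It is written in plain Python rather than with the ‘vegan’ functions actually used, and it covers only the square-root transform, Wisconsin double standardization (each taxon divided by its maximum, then each replicate divided by its total), and the Bray-Curtis dissimilarity; NMDS itself is not reproduced. Function names are hypothetical.

```python
import math


def sqrt_wisconsin(table):
    """Square-root transform, then Wisconsin double standardization.

    table: list of rows (replicates), each a list of peptide counts per taxon.
    """
    rooted = [[math.sqrt(v) for v in row] for row in table]
    # Standardize each taxon (column) by its maximum; guard all-zero columns.
    col_max = [max(col) or 1.0 for col in zip(*rooted)]
    scaled = [[v / m for v, m in zip(row, col_max)] for row in rooted]
    # Standardize each replicate (row) by its total; guard all-zero rows.
    return [[v / (sum(row) or 1.0) for v in row] for row in scaled]


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    total = sum(x) + sum(y)
    return sum(abs(a - b) for a, b in zip(x, y)) / total if total else 0.0
```

The dissimilarity is 0 for identical replicates and 1 for replicates that share no taxa.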

Results
From 243 Mb of DNA sequence information, roughly 54% of 567,549 filtered reads (median read length = 482 bp) were assembled into 26,713 contigs ranging from 500 to 43,200 bp (N50 = 1135) (Appendix 1: Fig. S2b, Appendix 1: Fig. S2c). All the contigs passed MG-RAST quality control. The metagenome was dominated by bacteria (99.11%) at the domain level. The top five bacterial classes were Betaproteobacteria (31.99%), Alphaproteobacteria (19.42%), Sphingobacteria (13.32%), Gammaproteobacteria (10.10%), and Acidobacteria (7.04%). The top five genera comprising the metagenome were Burkholderia (8.87%), Variovorax (6.50%), Pedobacter (5.24%), Mucilaginibacter (4.04%), and Lutiella (3.91%). Within the metagenome, 23% of aligned contigs were mapped to the order Burkholderiales, while only 7% mapped to Neisseriales (Appendix 1: Fig. S3). Taxonomic evenness of the metagenome, calculated using Hurlbert’s PIE, was equal to 0.79.
Representation of the contigs mapping to functional pathways was dominated by amino acid metabolism (20.6%), followed by membrane transport (12.9%), carbohydrate metabolism (11.9%), translation (7.2%), and metabolism of cofactors and vitamins (6.4%). Within amino acid metabolism, pathways were represented primarily by glycine, serine, and threonine metabolism (17.1%), alanine, aspartate, and glutamate metabolism (13.8%), and valine, leucine, and isoleucine degradation (12.7%). Membrane transport was represented by ABC transporters (78.2%), bacterial secretion system (19.4%), and phosphotransferase system (PTS) (2.4%). Carbohydrate metabolism was dominated by pyruvate metabolism (13.9%), glycolysis/gluconeogenesis (12.6%), and pentose phosphate pathway (11.6%). Overall, the top five level 3 KEGG categories were ABC transporters (10.1%), two-component system (4.8%), aminoacyl-tRNA biosynthesis (3.8%), glycine, serine, and threonine metabolism (3.5%), and ribosome (3.3%) (Appendix 1: Fig. S4).
We identified a total of 986 proteins in the enriched treatment and 616 proteins in the control treatment. Of the 220 most abundant protein identifications for each treatment, 65 were shared between treatments, leaving 155 unique to each treatment (Fig. 2a). The randomization test revealed significantly fewer protein hits shared between the treatments than expected by chance (Fig. 2b). The top three of the 20 most abundant proteins, as measured by the total number of matched peptides (spectral counts), were the same in the control and enriched treatments. However, the relative abundances of the remaining 17 proteins in this top list differed strongly between treatments, with only seven of the 20 proteins unique to each treatment (Fig. 2c).
The majority of identified proteins were associated with bacteria. The most common microbial class contributing to identified proteins in both treatments was Betaproteobacteria, but the contribution was higher in enriched (84.4%) versus control (50.3%) treatments (Table 1, Fig. 3a, Appendix 1: Fig. S5, Appendix 1: Fig. S6). This difference was driven by a higher abundance of Alphaproteobacteria in multiple families, including Sphingobacteriaceae, Phyllobacteriaceae, Xanthomonadaceae, and Rhizobiaceae, in control ecosystems relative to the enriched ecosystems. The similarity percentages test identified Betaproteobacteria (38.8%) and Alphaproteobacteria (9.9%) as the main contributors to dissimilarity of active microbial class composition between treatments and Neisseriaceae (23.8%) and Comamonadaceae (9.7%) as the main contributors to active microbial family dissimilarity between treatments. Although both treatments yielded similar numbers of identified microbial classes (control = 12, enriched = 11), taxonomic evenness of microbial classes contributing to identified proteins was substantially higher in the controls (PIE = 0.71) than in the enriched pitchers (PIE = 0.31). Similar taxonomic profiles were obtained using Unipept’s search for the least common taxonomic ancestors of the pooled data (Appendix 1: Fig. S6). For the unpooled data, taxonomic and functional variability between treatments was greater than variability among replicate ecosystems within treatments (Fig. 3, Fig. 4). Multivariate analysis of group dispersion revealed that composition of microbes contributing to protein expression was significantly more variable in control replicates than in enriched replicates at both the family (P = 0.003) and class (P = 0.023) levels.
The BLAST search yielded taxonomic assignments for 191 and 173 of the 220 sequences in enriched and control treatments, respectively, and all E-values were less than 10⁻⁵. Of the top species hits identified in the BLAST search, Variovorax paradoxus and Chromobacterium violaceum were the only two of the six most abundant “species” contributing to identified proteins that were common to both treatments. Novosphingobium aromaticivorans, Starkeya novella, Sphingomonas wittichii, and Sphingomonas sp. were among the six most abundant contributors in control pitchers. Pseudogulbenkiania sp., Rhodanobacter denitrificans, Janthinobacterium sp., and Dechlorosoma suillum were among the six most abundant contributors in enriched pitchers (Appendix 1: Table S2). Obligate aerobic bacteria contributed the most to identified proteins in the control pitchers, while facultative anaerobic bacteria contributed the most in enriched pitchers (Fig. 5b).
Functional pathways represented by the top 220 expressed microbial proteins also differed between control and enriched pitchers. We detected significant differences in metabolic pathways, including those involved in the metabolism of amino acids, carbohydrates, lipids, secondary metabolites, cofactors & vitamins, and terpenoids & polyketides (Appendix 1: Table S1, Appendix 1: Fig. S7, Appendix 1: Fig. S8a) and, at coarser pathway levels, energy metabolism, nucleotide metabolism, and amino acid metabolism (Fig. 5a). In the control treatment, 161 of the top 220 protein hits were not assigned to a KEGG pathway (represented by 906 total peptides). Of the 220 top protein hits in the enriched treatment, 129 were not assigned to a pathway (represented by 2,375 total peptides).

Discussion
We hypothesized that there would be detectable differences in the taxonomic composition of microbes contributing to expressed proteins. Indeed, we observed striking differences between unenriched and enriched ecosystems in the taxonomic composition of the microbes contributing to identified proteins (Fig. 3). The taxonomic composition of bacteria contributing to protein expression in our study, and in our metagenome, is consistent with findings of previous studies of bacterial communities in Sarracenia species. S. purpurea pitchers contain more than 1,000 species of bacteria and a negligible amount of archaea (Paisie et al. 2014). One genomic study of S. alata pitcher bacterial communities revealed an abundance of Proteobacteria (primarily Gammaproteobacteria). Taxonomic groups within the Betaproteobacteria had relative abundances similar to our metagenome and to control pitcher communities in our experiment, with a high percentage of sequences derived from Burkholderiales and a lower proportion from the Neisseriales (Koopman et al. 2010). A study of sub-habitats in S. purpurea revealed an abundance of Betaproteobacteria (primarily Burkholderiales) on the pitcher walls and in the sediment, co-dominance in pitcher liquid by Beta- and Alphaproteobacteria, and the presence of Bacteroidetes and Firmicutes, though in a low proportion, in the sediment, fluid, and pitcher walls (Krieger and Kourtev 2012). This finding is fairly consistent with the taxonomic potential revealed by our metagenome, in which 35%, 23%, 14%, and 1% of identified contigs were mapped to Betaproteobacteria, Alphaproteobacteria, Bacteroidetes, and Firmicutes, respectively. Gray et al. (2012) found that S. purpurea pitchers were composed primarily of Proteobacteria and Bacteroidetes, with Gammaproteobacteria, Alphaproteobacteria, or Betaproteobacteria dominating within the Proteobacteria, but that taxonomic composition varied from pitcher to pitcher within and across geographic regions.
The composition of bacteria contributing to protein expression in our experiment varied between control replicates, much more so than between enriched pitcher communities. This pattern is likely the result of a combination of factors. First, pitchers contain distinct sub-habitats that vary in light availability and concentration of dissolved oxygen and organic matter and therefore provide multiple habitats for a diverse set of microbes (Krieger and Kourtev 2012). As organic matter enrichment increases biological oxygen demand, the subsequent decline in dissolved oxygen may create a more homogeneous oxygen environment such that microbes sensitive to oxygen conditions can no longer compete against low-oxygen tolerant bacteria, decreasing bacterial diversity.

Low bacterial diversity in enriched pitchers echoes findings in larger enriched
aquatic ecosystems. Analysis of the 16S rRNA gene product of bacterial communities in
nutrient-enriched salt marsh sediments revealed that the bacterial diversity of active
bacteria decreased relative to that of communities in unenriched sediments (Kearns et al.
2016). Similarly, enrichment of heterotrophic stream biofilm communities yielded
lowered diversity; however, in contrast to our enriched pitcher communities, the stream
biofilm communities diverged in composition (Van Horn et al. 2011).

The composition of microbes contributing to protein expression in S. purpurea
pitchers was similar to the composition of larger freshwater aquatic ecosystems.
Betaproteobacteria dominated microbes contributing to protein expression in both
enriched and control pitchers, though in higher abundances in enriched pitchers relative to
control pitchers. Betaproteobacteria are generally the most abundant class of bacteria in
freshwater lakes (Percent et al. 2008, Newton et al. 2011) and dominate contaminated
sediments (Haller et al. 2011) and organic aggregates in eutrophic lakes (Tang et al.
2009). Betaproteobacteria populations associated with the beta II clade have been shown
to increase rapidly with the addition of organic carbon in humic lakes (Burkert et al.
2003, Kent et al. 2006). Furthermore, experimental dissolved organic matter additions to
microcosms containing alpine lake bacteria cultures led to a near-dominance of
Betaproteobacteria, suggesting that these bacteria are good competitors in enriched
aquatic ecosystems (Perez and Sommaruga 2006). These results suggest that bacterial
communities in S. purpurea pitchers are structured and behave like bacterial communities
in larger lakes and ponds in response to enrichment. It is important to note that most
existing literature on freshwater bacteria and S. purpurea bacterial communities relies
primarily on genomic methods for identification and therefore likely captures both
functionally active and inactive bacteria, whereas our methods capture only the
functionally active bacteria. As a result, we use caution when directly comparing the
results of our study to those in larger aquatic ecosystems. However, the Unipept search of
our identified tryptic peptides and NCBI BLAST search of their contigs of origin yielded
remarkably similar results (Fig. 3a, Appendix 1: Fig. S6), suggesting that tryptic peptides
could be used to correctly identify microbes contributing to identified proteins, though at
coarser taxonomic levels than can be achieved by nucleic acid analysis.

We hypothesized that there would be detectable differences in the function of
microbial communities in control and enriched pitchers. We measured function in two
ways: first, we mapped identified bacterial classes associated with proteins to their
oxygen requirements, and second, we mapped peptides to functional KEGG pathways.
Oxygen requirements differed significantly between taxa contributing to protein
expression in control and enriched microbial communities. Bacteria contributing to
protein expression in control pitchers were predominantly aerobic, whereas bacteria
contributing to protein expression in enriched pitchers were primarily facultatively
anaerobic. The difference in oxygen requirement of contributing bacteria between the two
treatments was driven largely by two taxa: the obligate aerobe Variovorax paradoxus
(28.4% of total peptides in the control treatment and 7.2% in the enriched treatment) and
the facultative anaerobe Chromobacterium violaceum (53.3% of total peptides in the
enriched treatment and 6.6% in the control treatment) (Appendix 1: Table S2). Peptides
that mapped to C. violaceum in the BLAST search mapped in the Unipept search to
Aquitalea magnusonii, a betaproteobacterium closely related to C. violaceum that was
isolated from a humic lake in Wisconsin, USA (Lau et al. 2006). Although we did not
measure dissolved oxygen during the field experiment, pitchers in a subsequent
experiment that were enriched with the same concentration of organic matter became
hypoxic within 48 hours, suggesting that pitchers in the field were likely hypoxic (Sirota
et al. 2013). Dissolved oxygen concentration is one of three primary drivers of bacterial
community composition in eutrophic, dimictic lakes (Shade et al. 2007) and appears to
also drive the composition of functionally active bacteria in enriched S. purpurea
pitchers.
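
The mapping from taxa to oxygen requirements described above can be illustrated with a minimal Python sketch. It uses only the two peptide shares reported in this paragraph; the dictionary names and the `share_by_oxygen_class` helper are hypothetical and are not taken from the authors' analysis code.

```python
from collections import defaultdict

# Oxygen requirement of each taxon, as stated in the text.
OXYGEN_REQUIREMENT = {
    "Variovorax paradoxus": "obligate aerobe",
    "Chromobacterium violaceum": "facultative anaerobe",
}

# Percent of total peptides attributed to each taxon (values quoted in the text).
PEPTIDE_SHARE = {
    "control": {"Variovorax paradoxus": 28.4, "Chromobacterium violaceum": 6.6},
    "enriched": {"Variovorax paradoxus": 7.2, "Chromobacterium violaceum": 53.3},
}


def share_by_oxygen_class(shares, requirement):
    """Sum peptide shares (percent) over taxa with the same oxygen requirement."""
    totals = defaultdict(float)
    for taxon, percent in shares.items():
        totals[requirement[taxon]] += percent
    return dict(totals)


for treatment, shares in PEPTIDE_SHARE.items():
    print(treatment, share_by_oxygen_class(shares, OXYGEN_REQUIREMENT))
```

With more taxa in the two dictionaries, the same tally would give the full aerobic versus facultatively anaerobic breakdown for each treatment.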

We expected to see a high proportion of obligate anaerobic bacteria in enriched
pitchers. Bacteroidetes and, to a lesser degree, Firmicutes have been found to inhabit S.
purpurea pitchers (Krieger and Kourtev 2012); however, we identified very few proteins
associated with these taxa. Of the 3008 and 969 peptides associated with the top 220
proteins in enriched and control treatments, respectively, we found only 17 peptides
associated with obligate anaerobes in the enriched pitchers (7 of which were associated
with Firmicutes) and 13 associated with obligate anaerobes in the control pitchers (3 of
which were associated with Firmicutes). Though we did find a higher number of peptides
associated with Bacteroidetes (74 peptides in control pitchers and 89 in enriched
pitchers), they were facultative anaerobes and not strict anaerobes. It is likely that the low
numbers of identified peptides associated with these taxa in experimental pitchers are the
result of a skewed protein database. Our database was built using metagenomic data from
pitchers in the field, the majority of which are oxygen-rich (Adlassnig et al. 2011), and
likely contained nucleotide sequences primarily from aerobic and facultative anaerobic
bacteria. Additionally, pitchers are generally oxygen-rich due to photosynthetic activity
of the plant and therefore primarily harbor aerobic inquilines (Adlassnig et al. 2011).
Even when dissolved oxygen is low, there is a constant flux of oxygen into the pitcher
fluid, and so the pitchers are rarely truly anoxic. It is not surprising, therefore, that
peptides associated with anaerobic bacteria were rare. In the absence of a fully
representative database, we feel that the higher number of proteins represented by
facultative bacteria in enriched pitchers relative to control pitchers is a good indicator of
changing oxygen conditions. These results are consistent with the shift to a hypoxic state
when S. purpurea is enriched with additional prey (Sirota et al. 2013).
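
Because the enriched and control treatments contributed different numbers of peptides (3008 and 969), the counts above may be easier to compare as proportions. The arithmetic below is only a worked restatement of the reported counts, not an additional analysis by the authors:

```latex
\text{Obligate anaerobes:}\quad
\frac{17}{3008} \approx 0.57\% \ \text{(enriched)}, \qquad
\frac{13}{969} \approx 1.34\% \ \text{(control)}

\text{Bacteroidetes:}\quad
\frac{89}{3008} \approx 2.96\% \ \text{(enriched)}, \qquad
\frac{74}{969} \approx 7.64\% \ \text{(control)}
```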

We assigned KEGG pathways to contigs in the metagenome and to protein
identifications in the metaproteomes to compare microbial community function between
control and enriched pitchers, and between the metaproteomes and functional potential in
the metagenome. Not surprisingly, the functional potential revealed by the metagenome
differed from function revealed by the metaproteomes. Amino acid metabolism and
carbohydrate metabolism were represented in the top five rank-ordered pathways in both
the metaproteomes and the metagenome; however, carbohydrate metabolism was ranked
first in the metaproteomes (~34-40% of total peptides) and third in the metagenome
(~12% of mapped contigs). Nucleotide metabolism and energy metabolism were
represented in the top five in the metaproteomes (~18% of total peptides in controls and
~34% of total peptides in enriched pitchers), but were ranked 9th (~4%) and 7th (~5%) in
the metagenome, respectively. Such differences could be a result of not all nucleotide
sequences being transcribed and translated to proteins, but may also be an artifact of only
including 220 proteins from each treatment in the metaproteome analysis.

We hesitate to hypothesize broader relevance of our functional pathway results
for two reasons. First, we are most interested in the identification of proteins that can
serve as biomarkers of aquatic ecosystem state changes. Whereas we expect that
functional information will be useful for determining the utility and generality of such
biomarkers, it is not necessary for finding useful biomarkers. Second, it seems
impossible, with our limited data, to identify a complete set of functions. With that
caveat, we found that coarse KEGG pathway assignments differed between control and
enriched microbial communities. Enriched pitchers contained significantly more
microbial biomass, as evidenced by the size of the microbial pellets post-centrifugation.
When samples were pooled and total peptide counts were normalized, chi-square analysis
revealed an enrichment of peptides associated with energy metabolism in enriched
pitchers.

These results are consistent with patterns seen in larger aquatic ecosystems:
mineralization of organic matter, an effect of microbial energy metabolism, has been
shown to increase along trophic gradients, with bacteria contributing most to
mineralization in eutrophic freshwater lakes (Simcic 2005). Not surprisingly, peptides
associated with processes requiring oxygen, including oxidative phosphorylation and the
citric acid cycle, were enriched in oxygen-rich control pitcher microbial communities.
One protein associated with the citric acid cycle, isocitrate lyase, was present in the top
20 rank-ordered protein identifications in the enriched treatment, but not in the control
treatment. This protein, which has been found to be upregulated during periods of oxygen
depletion in Mycobacterium tuberculosis (Wayne and Lin 1982), could be a candidate
biomarker for an impending tipping point in the S. purpurea microecosystem. Though we
did not find a significant difference in lipid metabolism pathways between control and
enriched pitcher proteins, there was a trend for increased pathway representation of
unsaturated fatty acid biosynthesis and fatty acid elongation in enriched pitchers. Such an
increase has been found in bacteria in low-oxygen or anaerobic conditions, primarily
resulting from an increase in membrane lipids (Lemmer et al. 2015). While these
differences do not immediately reveal a functional explanation, it is promising that there
were signatures of detectable differences in the protein profiles between treatments. Such
differences imply that there are changes in the expression of the most abundant proteins
in the most abundant taxa related to organic matter loading.

In larger aquatic systems, traditional water quality indicators may not provide
enough lead time to forecast a tipping point (Contamin and Ellison 2009), especially if
they lag behind changes in the microbial community. We hypothesize that microbial
proteins may be more sensitive and timely indicators of impending tipping points than
traditional chemical markers of water quality. We argue that even though
metatranscriptomic and metagenomic methods have superior throughput, metaproteomic
methods can inexpensively, rapidly, and simultaneously characterize the function and
(indirectly) composition of the active microbial community members responsible for
processes related to aquatic ecosystem state changes. Our study is based on a small,
semi-quantitative initial sample taken at a single time point and therefore does not yet
enable a proteomic analysis comprehensive enough to determine the identity of
biomarkers or place them in an ecological context. Future studies using more sensitive
instrumentation will allow for the identification of a larger number of proteins. Time
series of environmental proteomics data and quantitative analysis of changes in protein
abundances prior to state changes will allow for the identification and ecological
characterization of tipping point biomarkers.

Acknowledgments
This work was funded by the National Science Foundation (grant numbers 1144055 and
1144056). Proteomic analysis was funded by the Vermont Genetics Network through
U.S. National Institutes of Health Grant 8P20GM103449 from the INBRE program of the
NIGMS. The authors thank Hailee Tenander for assisting with preparation of samples for
mass spectrometry analysis.

This Whole Genome Shotgun project has been deposited at DDBJ/ENA/GenBank
under the accession NMRC01000000. The version described in this paper is version
NMRC01000000. The protein database and all code used to analyze the data are freely
available on the Harvard Forest Data Archive under ID number HF295.

References

Adlassnig, W., M. Peroutka, and T. Lendl. 2011. Traps of carnivorous pitcher plants as a
    habitat: composition of the fluid, biodiversity and mutualistic activities. Annals of
    Botany 107:181-194.
Benjamini, Y., and Y. Hochberg. 1995. Controlling the False Discovery Rate - a Practical
    and Powerful Approach to Multiple Testing. Journal of the Royal Statistical
    Society Series B-Methodological 57:289-300.
Benndorf, D., G. U. Balcke, H. Harms, and M. von Bergen. 2007. Functional
    metaproteome analysis of protein extracts from contaminated soil and
    groundwater. ISME Journal 1:224-234.
Bestelmeyer, B. T., A. M. Ellison, W. R. Fraser, K. B. Gorman, S. J. Holbrook, C. M.
    Laney, M. D. Ohman, D. P. C. Peters, F. C. Pillsbury, A. Rassweiler, R. J.
    Schmitt, and S. Sharma. 2011. Analysis of abrupt transitions in ecological
    systems. Ecosphere 2.
Biggs, R., S. R. Carpenter, and W. A. Brock. 2009. Turning back from the brink:
    Detecting an impending regime shift in time to avert it. Proceedings of the
    National Academy of Sciences of the United States of America 106:826-831.
Burkert, U., F. Warnecke, D. Babenzien, E. Zwirnmann, and J. Pernthaler. 2003.
    Members of a readily enriched beta-proteobacterial clade are common in surface
    waters of a humic lake. Applied and Environmental Microbiology 69:6550-6559.
Butler, J. L., N. J. Gotelli, and A. M. Ellison. 2008. Linking the brown and green:
    Nutrient transformation and fate in the Sarracenia microecosystem. Ecology
    89:898-904.
Campos, A., S. Tedesco, V. Vasconcelos, and S. Cristobal. 2012. Proteomic research in
    bivalves: Towards the identification of molecular markers of aquatic pollution.
    Journal of Proteomics 75:4346-4359.
Carpenter, S. R., W. A. Brock, J. J. Cole, J. F. Kitchell, and M. L. Pace. 2008. Leading
    indicators of trophic cascades. Ecology Letters 11:128-138.
Cheerathodi, M., and B. A. Ballif. 2011. Identification of CrkL-SH3 Binding Proteins
    from Embryonic Murine Brain: Implications for Reelin Signaling during Brain
    Development. Journal of Proteome Research 10:4453-4462.
Chrost, R. J., and W. Siuda. 2006. Microbial production, utilization, and enzymatic
    degradation of organic matter in the upper trophogenic layer in the pelagial zone
    of lakes along a eutrophication gradient. Limnology and Oceanography 51:749-762.
Colatriano, D., A. Ramachandran, E. Yergeau, R. Maranger, Y. Gelinas, and D. A.
    Walsh. 2015. Metaproteomics of aquatic microbial communities in a deep and
    stratified estuary. Proteomics 15:3566-3579.
Conesa, A., S. Gotz, J. M. Garcia-Gomez, J. Terol, M. Talon, and M. Robles. 2005.
    Blast2GO: a universal tool for annotation, visualization and analysis in functional
    genomics research. Bioinformatics 21:3674-3676.
Contamin, R., and A. M. Ellison. 2009. Indicators of regime shifts in ecological systems:
    What do we need to know and when do we need to know it? Ecological
    Applications 19:799-816.
Correll, D. L. 1998. The role of phosphorus in the eutrophication of receiving waters: a
    review. Journal of Environmental Quality 27:261-266.
Dakos, V., S. R. Carpenter, E. H. van Nes, and M. Scheffer. 2015. Resilience indicators:
    prospects and limitations for early warnings of regime shifts. Philosophical
    Transactions of the Royal Society B-Biological Sciences 370.
Dennison, W. C., R. J. Orth, K. A. Moore, J. C. Stevenson, V. Carter, S. Kollar, P. W.
    Bergstrom, and R. A. Batiuk. 1993. Assessing water-quality with submersed
    aquatic vegetation. BioScience 43:86-94.
Elias, J. E., and S. P. Gygi. 2007. Target-decoy search strategy for increased confidence
    in large-scale protein identifications by mass spectrometry. Nature Methods
    4:207-214.
Ellison, A. M., N. J. Gotelli, J. S. Brewer, D. L. Cochran-Stafira, J. M. Kneitel, T. E.
    Miller, A. C. Worley, and R. Zamora. 2003. The evolutionary ecology of
    carnivorous plants. Advances in Ecological Research 33:1-74.
Farnsworth, E. J., and A. M. Ellison. 2008. Prey availability directly affects physiology,
    growth, nutrient allocation and scaling relationships among leaf traits in 10
    carnivorous plant species. Journal of Ecology 96:213-221.
Gray, S. M., D. M. Akob, S. J. Green, and J. E. Kostka. 2012. The Bacterial Composition
    within the Sarracenia purpurea Model System: Local Scale Differences and the
    Relationship with the Other Members of the Food Web. PLoS ONE 7.
Haller, L., M. Tonolla, J. Zopfi, R. Peduzzi, W. Wildi, and J. Pote. 2011. Composition of
    bacterial and archaeal communities in freshwater sediments with different
    contamination levels (Lake Geneva, Switzerland). Water Research 45:1213-1228.
Hargeby, A., I. Blindow, and G. Andersson. 2007. Long-term patterns of shifts between
    clear and turbid states in Lake Krankesjon and Lake Takern. Ecosystems 10:29-36.
Hurlbert, S. H. 1971. The nonconcept of species diversity: A critique and alternative
    parameters. Ecology 52:577-586.
Jayapal, K. P., R. J. Philp, Y. J. Kok, M. G. S. Yap, D. H. Sherman, T. J. Griffin, and W.
    S. Hu. 2008. Uncovering Genes with Divergent mRNA-Protein Dynamics in
    Streptomyces coelicolor. PLoS ONE 3.
Kanehisa, M., S. Goto, Y. Sato, M. Kawashima, M. Furumichi, and M. Tanabe. 2014.
    Data, information, knowledge and principle: back to metabolism in KEGG.
    Nucleic Acids Research 42:D199-D205.
Kearns, P. J., J. H. Angell, E. M. Howard, L. A. Deegan, R. H. R. Stanley, and J. L.
    Bowen. 2016. Nutrient enrichment induces dormancy and decreases diversity of
    active bacteria in salt marsh sediments. Nature Communications 7.
Kent, A. D., S. E. Jones, G. H. Lauster, J. M. Graham, R. J. Newton, and K. D.
    McMahon. 2006. Experimental manipulations of microbial food web interactions
    in a humic lake: shifting biological drivers of bacterial community structure.
    Environmental Microbiology 8:1448-1459.
Koopman, M. M., and B. C. Carstens. 2011. The microbial phyllogeography of the
    carnivorous plant Sarracenia alata. Microbial Ecology 61:750-758.
Koopman, M. M., D. M. Fuselier, S. Hird, and B. C. Carstens. 2010. The Carnivorous
    Pale Pitcher Plant Harbors Diverse, Distinct, and Time-Dependent Bacterial
    Communities. Applied and Environmental Microbiology 76:1851-1860.
Krieger, J. R., and P. S. Kourtev. 2012. Bacterial diversity in three distinct sub-habitats
    within the pitchers of the northern pitcher plant, Sarracenia purpurea. FEMS
    Microbiology Ecology 79:555-567.
Lau, H. T., J. Faryna, and E. W. Triplett. 2006. Aquitalea magnusonii gen. nov., sp. nov.,
    a novel Gram-negative bacterium isolated from a humic lake. International
    Journal of Systematic and Evolutionary Microbiology 56:867-871.
Lauro, F. M., M. Z. DeMaere, S. Yau, M. V. Brown, C. Ng, D. Wilkins, M. J. Raftery, J.
    A. E. Gibson, C. Andrews-Pfannkoch, M. Lewis, J. M. Hoffman, T. Thomas, and
    R. Cavicchioli. 2011. An integrative study of a meromictic lake ecosystem in
    Antarctica. ISME Journal 5:879-895.
Lemmer, K. C., A. C. Dohnalkova, D. R. Noguera, and T. J. Donohue. 2015.
    Oxygen-Dependent Regulation of Bacterial Lipid Production. Journal of
    Bacteriology 197:1649-1658.
Levin, P. S., and C. Mollmann. 2015. Marine ecosystem regime shifts: challenges and
    opportunities for ecosystem-based management. Philosophical Transactions of the
    Royal Society B-Biological Sciences 370.
Mesuere, B., T. Williams, F. Van der Jeugt, B. Devreese, P. Vandamme, and P. Dawyndt.
    2016. Unipept web services for metaproteomics analysis. Bioinformatics.
Meyer, F., D. Paarmann, M. D’Souza, R. Olson, E. M. Glass, M. Kubal, T. Paczian,
    A. Rodriguez, R. Stevens, A. Wilke, J. Wilkening, and R. A. Edwards. 2008. The
    Metagenomics RAST Server - a Public Resource for the Automatic Phylogenetic
    and Functional Analysis of Metagenomes. BMC Bioinformatics 9:386.
Newton, R. J., S. E. Jones, A. Eiler, K. D. McMahon, and S. Bertilsson. 2011. A Guide to
    the Natural History of Freshwater Lake Bacteria. Microbiology and Molecular
    Biology Reviews 75:14-49.
Oksanen, J., F. G. Blanchet, R. Kindt, P. Legendre, P. R. Minchin, R. B. O’Hara, G. L.
    Simpson, P. Solymos, M. H. H. Stevens, E. Szoecs, and H. Wagner. 2016.
    Vegan: community ecology package, R Package Version 2.4-1 edn.
Oksanen, J., F. G. Blanchet, R. Kindt, P. Legendre, P. R. Minchin, R. B. O’Hara, et al.
    2012. Vegan: community ecology package, R Package Version 2.1-17 edn.
Ondov, B. D., N. H. Bergman, and A. M. Phillippy. 2011. Interactive metagenomic
    visualization in a Web browser. BMC Bioinformatics 12.
Pan, Y. D., R. J. Stevenson, B. H. Hill, A. T. Herlihy, and G. B. Collins. 1996. Using
    diatoms as indicators of ecological conditions in lotic systems: A regional
    assessment. Journal of the North American Benthological Society 15:481-495.
Percent, S. F., M. E. Frischer, P. A. Vescio, E. B. Duffy, V. Milano, M. McLellan, B. M.
    Stevens, C. W. Boylen, and S. A. Nierzwicki-Bauer. 2008. Bacterial community
    structure of acid-impacted lakes: What controls diversity? Applied and
    Environmental Microbiology 74:1856-1868.
Perez, M. T., and R. Sommaruga. 2006. Differential effect of algal- and soil-derived
    dissolved organic matter on alpine lake bacterial community composition and
    activity. Limnology and Oceanography 51:2527-2537.
Peterson, C. N., S. Day, B. E. Wolfe, A. M. Ellison, R. Kolter, and A. Pringle. 2008. A
    keystone predator controls bacterial diversity in the pitcher-plant (Sarracenia
    purpurea) microecosystem. Environmental Microbiology 10:2257-2266.
Rabalais, N. N., R. E. Turner, R. J. Diaz, and D. Justic. 2009. Global change and
    eutrophication of coastal waters. ICES Journal of Marine Science 66:1528-1537.
Reddy, T. B. K., A. D. Thomas, D. Stamatis, J. Bertsch, M. Isbandi, J. Jansson, J.
    Mallajosyula, I. Pagani, E. A. Lobos, and N. C. Kyrpides. 2015. The Genomes
    OnLine Database (GOLD) v.5: a metadata management system based on a four
    level (meta)genome project classification. Nucleic Acids Research
    43:D1099-D1106.
Scheffer, M. 2009. Critical Transitions in Nature and Society. Princeton University Press.
Scheffer, M., J. Bascompte, W. A. Brock, V. Brovkin, S. R. Carpenter, V. Dakos, H.
    Held, E. H. van Nes, M. Rietkerk, and G. Sugihara. 2009. Early-warning signals
    for critical transitions. Nature 461:53-59.
Shade, A., A. D. Kent, S. E. Jones, R. J. Newton, E. W. Triplett, and K. D. McMahon.
    2007. Interannual dynamics and phenology of bacterial communities in a
    eutrophic lake. Limnology and Oceanography 52:487-494.
Simcic, T. 2005. The role of plankton, zoobenthos, and sediment in organic matter
    degradation in oligotrophic and eutrophic mountain lakes. Hydrobiologia
    532:69-79.
Sirota, J., B. Baiser, N. J. Gotelli, and A. M. Ellison. 2013. Organic-matter loading
    determines regime shifts and alternative states in an aquatic ecosystem.
    Proceedings of the National Academy of Sciences of the United States of America
    110:7742-7747.
Sondergaard, M., L. S. Johansson, T. L. Lauridsen, T. B. Jorgensen, L. Liboriussen, and
    E. Jeppesen. 2010. Submerged macrophytes as indicators of the ecological quality
    of lakes. Freshwater Biology 55:893-908.
Sowell, S. M., P. E. Abraham, M. Shah, N. C. Verberkmoes, D. P. Smith, D. F. Barofsky,
    and S. J. Giovannoni. 2011. Environmental proteomics of microbial plankton in a
    highly productive coastal upwelling system. ISME Journal 5:856-865.
Srivastava, D. S., J. Kolasa, J. Bengtsson, A. Gonzalez, S. P. Lawler, T. E. Miller, P.
    Munguia, T. Romanuk, D. C. Schneider, and M. K. Trzcinski. 2004. Are natural
    microcosms useful model systems for ecology? Trends in Ecology & Evolution
    19:379-384.
Tang, X. M., G. Gao, B. Q. Qin, L. P. Zhu, J. Y. Chao, J. J. Wang, and G. J. Yang. 2009.
    Characterization of bacterial communities associated with organic aggregates in a
    large, shallow, eutrophic freshwater lake (Lake Taihu, China). Microbial Ecology
    58:307-322.
Timinskas, K., M. Balvociute, A. Timinskas, and C. Venclovas. 2014. Comprehensive
    analysis of DNA polymerase III alpha subunits and their homologs in bacterial
    genomes. Nucleic Acids Research 42:1393-1413.
Ullrich, N., P. Casper, A. Otto, and M. O. Gessner. 2016. Proteomic evidence of
    methanotrophy in methane-enriched hypolimnetic lake water. Limnology and
    Oceanography 61:S91-S100.
Van Horn, D. J., R. L. Sinsabaugh, C. D. Takacs-Vesbach, K. R. Mitchell, and C. N.
    Dahm. 2011. Response of heterotrophic stream biofilm communities to a gradient
    of resources. Aquatic Microbial Ecology 64:149-161.
Vogel, C., and E. M. Marcotte. 2012. Insights into the regulation of protein abundance
    from proteomic and transcriptomic analyses. Nature Reviews Genetics
    13:227-232.
Wayne, L. G., and K. Lin. 1982. Glyoxylate Metabolism and Adaptation of
    Mycobacterium tuberculosis to Survival under Anaerobic Conditions. Infection and
    Immunity 37:1042-1049.

Table 1. Results of chi-square analysis of bacterial classes in control and enriched
pitchers. Bolded values represent those in which the adjusted P value is <0.05.

Class                  Control peptides   Enriched peptides   Adjusted P (chi-square)
Acidobacteria                         6                   0                     0.000
Actinobacteria                       32                   3                     0.000
Alphaproteobacteria                 276                 196                     0.000
Bacteroidia                          12                  16                     0.059
Betaproteobacteria                  469                2448                     0.000
Chloroflexi                           0                   3                     0.816
Clostridia                            3                   7                     0.959
Cytophagia                           14                  17                     0.021
Deltaproteobacteria                   2                   0                     0.132
Flavobacteriia                        0                   3                     0.816
Gammaproteobacteria                  50                 146                     0.816
Gloeobacteria                         8                   0                     0.000
Sphingobacteriia                     48                  53                     0.000
Spirochaetia                         11                   9                     0.006
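
As a rough illustration of the kind of test summarized in Table 1, the sketch below runs a 2x2 Pearson chi-square test for each class (peptides in that class versus all other classes, control versus enriched) and then applies a Benjamini-Hochberg adjustment. The 2x2 layout, the absence of a continuity correction, and the use of Benjamini-Hochberg are assumptions made for this example, so the output is not expected to reproduce the tabulated values; the function names are hypothetical.

```python
import math

# Pooled peptide counts per bacterial class from Table 1: (control, enriched).
TABLE1 = {
    "Acidobacteria": (6, 0), "Actinobacteria": (32, 3),
    "Alphaproteobacteria": (276, 196), "Bacteroidia": (12, 16),
    "Betaproteobacteria": (469, 2448), "Chloroflexi": (0, 3),
    "Clostridia": (3, 7), "Cytophagia": (14, 17),
    "Deltaproteobacteria": (2, 0), "Flavobacteriia": (0, 3),
    "Gammaproteobacteria": (50, 146), "Gloeobacteria": (8, 0),
    "Sphingobacteriia": (48, 53), "Spirochaetia": (11, 9),
}


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    margins = (a + b, c + d, a + c, b + d)
    if 0 in margins:
        return 0.0, 1.0
    chi2 = n * (a * d - b * c) ** 2 / math.prod(margins)
    # Survival function of a chi-square variable with 1 df.
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


control_total = sum(ctrl for ctrl, _ in TABLE1.values())
enriched_total = sum(enr for _, enr in TABLE1.values())

raw_p = {}
for name, (ctrl, enr) in TABLE1.items():
    # Rows: control, enriched. Columns: this class, every other class.
    _, raw_p[name] = chi_square_2x2(ctrl, control_total - ctrl,
                                    enr, enriched_total - enr)

adjusted_p = dict(zip(raw_p, benjamini_hochberg(list(raw_p.values()))))
for name, p in adjusted_p.items():
    print(f"{name:22s} adjusted P = {p:.3g}")
```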

Table 2. Effect of treatment on microbial proteins, contributing taxa (class and family),
and pathways. Bolded values represent those in which the adjusted P value is <0.05.

            Proteins                  Taxa (Class)              Taxa (Family)             KEGG Pathways
            df  F      R2     P       df  F      R2     P       df  F      R2     P       df  F      R2     P
Treatment    1  4.217  0.319  0.004    1  3.766  0.295  0.022    1  4.218  0.319  0.003    1  4.753  0.373  0.024
Residuals    9         0.681           9         0.705           9         0.681           9         0.627
Total       10         1.000          10         1.000          10         1.000          10         1.000

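
For readers relating the F and R2 columns of Table 2, the following worked equation may help. It assumes the values come from a permutational analysis of variance as implemented in vegan's adonis (the test named in the Fig. 4 caption), where the pseudo-F statistic is the ratio of mean squares and R2 is the share of the total sum of squares explained by treatment. The Proteins column is used as the example, and the small difference from the tabulated 4.217 is attributed to rounding of R2.

```latex
F = \frac{SS_{\text{treatment}} / df_{\text{treatment}}}
         {SS_{\text{residual}} / df_{\text{residual}}}
  = \frac{R^2 / df_{\text{treatment}}}
         {(1 - R^2) / df_{\text{residual}}},
\qquad
F_{\text{proteins}} \approx \frac{0.319 / 1}{0.681 / 9} \approx 4.22
```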
[Figure 1: workflow diagram. Control (N=5) and enriched (N=6) pitchers; in-gel digest and LC-MS/MS; protein ID via search of a custom metagenome database; panels labeled "Protein analysis", "Taxonomic analysis", and "Functional analysis".]

Fig. 1. Pipeline for data collection and analysis. Proteins from the microbial
communities in experimentally enriched and ambient control pitcher fluid were processed
using SDS-PAGE, tryptic digest, LC-MS/MS, and a SEQUEST search of a custom
metagenomic database. The composition of microbial communities was determined using
a BLAST homology search of metagenomic data associated with identified proteins.
Protein identity and annotation were determined via a blastp search to identify orthologs
and via blast2go.

37 (a) (b)
Shared Proteins 0.10
C E 0.08 0.06 Density 155 65 155 0.04 0.02 0.00 60 80 100 120 140 160 180 Number of shared proteins
(c)
C E
-elongation factor tu - (158) -elongation factor tu - (623) -f0f1 atp synthase subunit beta - (105) -f0f1 atp synthase subunit beta - (192) -molecular chaperone - (100) -molecular chaperone - (162) -branched-chain amino acid abc transporter -porin - (128) substrate-binding protein - (53) -branched-chain amino acid abc transporter -porin - (38) substrate-binding protein - (126) -f0f1 atp synthase subunit alpha - (34) -atp synthase beta subunit - (113) -phosphate abc transporter -dna-directed rna polymerase subunit beta - (84) substrate-binding protein - (30) -abc transporter -heat shock protein 60 - (28) substrate-binding protein - (81) -dna-directed rna polymerase subunit beta - (22) -outer membrane protein - (69) --dependent receptor plug - (19) -isocitrate lyase - (58) -phosphate abc transporter -glutamine synthetase - (19) substrate-binding protein - (54) -malate dehydrogenase - (17) -3-hydroxyacyl- dehydrogenase - (53) -membrane protein - (16) -aldehyde dehydrogenase - (50) -rna polymerase sigma factor - (16) -atp synthase alpha subunit - (46) -outer membrane protein - (13) -glutamine synthetase - (44) -polymerase - (13) -phasin family protein - (44) --dependent receptor - (12) -ribosomal protein l1 - (36) -abc transporter -malate dehydrogenase - (34) substrate-binding protein - (12) -atp synthase beta subunit - (11) -rna polymerase sigma factor - (34) -60 kda partial - (9) -ribosomal l10 family protein - (34) -not present - (369) -not present - (1154) 745
Fig. 2. Protein identifications differed between control and enriched pitchers. (a) Protein hits shared between control and enriched treatments. (b) Results of a randomization test in which 220 protein hits were randomly assigned to each treatment and the number of shared protein hits was calculated. Red line indicates the actual shared number of proteins. Grey probability density function indicates the 95% confidence interval for the simulated shared protein hit values. (c) Top 20 proteins in rank order for each treatment. Proteins are ranked by the number of total peptides associated with them (in parentheses). Identical proteins in both treatments are connected by lines. Blue lines indicate proteins unique to the top 20 in control pitchers (C) and brown lines indicate proteins unique to the top 20 in enriched pitchers (E).
Fig. 3. Distinctly different microbial communities contributed to protein expression in control and enriched pitchers. The proportion of total peptides from the top 220 proteins associated with particular microbial classes (a) and families (b) in all enriched and control replicate pitchers.
[Figure 4 image. Four NMDS ordination panels (axes NMDS1 and NMDS2): protein hits, taxa (class), taxa (family), and KEGG pathways, each showing control and enriched pitchers.]
Fig. 4. Microbial communities in control and enriched pitchers differ in the proteins they produce, taxa that contribute to protein expression, and function. Ordination of Bray-Curtis dissimilarities of total peptides shows clustering of pitcher microbial communities by treatment (control or enriched) for protein hits (adonis P=0.004), microbial classes (adonis P=0.022), microbial families (adonis P=0.003), and KEGG pathways (adonis P=0.003).
[Figure 5 image. (a) Heat map with rows for KEGG pathway categories (carbohydrate metabolism, amino acid metabolism, nucleotide metabolism, energy metabolism, xenobiotics biodegradation and metabolism, biosynthesis of other secondary metabolites, lipid metabolism, metabolism of other amino acids, metabolism of terpenoids and polyketides, metabolism of cofactors and vitamins, signal transduction, translation) and columns for pooled treatments and individual replicates; color key: proportion of total peptide identifications in treatment/replicate, 0 to 0.4. (b) Bar chart of the proportion of all peptides by oxygen requirement (aerobe, anaerobe, facultative, unclassified) in control (C) and enriched (E) pitchers.]
Fig. 5. Microbial function differed between control and enriched pitchers. (a) Heat map of the proportional representation of coarse-level KEGG pathways between control pitchers (C) and enriched pitchers (E) and individual control (H4, H6, E3, C4) and enriched (H1, H2, H3, 2C, 5B, 5A) replicates. Significantly different pathways between pooled control and enriched samples are indicated with “*”. (b) Oxygen requirement of microbial classes contributing to protein expression as a proportion of all peptides in control (C) and enriched (E) pitchers.
Appendix S1

Detailed Methods

Field Experiment

Starting June 10, 2011, we selected newly opened, and therefore sterile (Peterson et al., 2008), S. purpurea pitchers for five days until 20 pitchers were selected. One pitcher from each group was randomly assigned to one of two treatments: ambient control and detritus-enriched. The average pitcher length, measured from the base of the pitcher along the back of the keel to the top of the hood, was 12.4±2.3 cm. Pitcher volume was not measured during the experiment. The final average volume of fluid in the pitchers was 9.6±5.8 ml.

After the first rain, initial samples of 1.5 ml of pitcher fluid were drawn from all pitchers and replaced with 1.5 ml of deionized water. In the detrital enrichment treatment, each pitcher received 1 mg of detritus per day between 7:00 am and 9:00 am. The detritus consisted of wasps that were ground in a coffee grinder, dried for 48-72 hours in an oven at 70 °C, weighed, and stored in a −20 °C freezer until used.

Sampling
Initial 1.5-ml samples and the entire final contents of each pitcher were drawn independently through the frit of separate Bio-Rad (Hercules, California, USA) Poly-Prep chromatography columns to remove any organisms larger than 30 microns. For each sample, the filtrate was centrifuged in 2-ml aliquots at 13,000 × g to concentrate the microbial assemblage, and the supernatant was removed. The resulting microbial pellet was stored at −80 °C. All frozen samples were transported on dry ice from Harvard Forest to the University of Vermont (June 29, 2011), where they were stored at −80 °C until processed.

Metagenome Extraction and Sequencing
We used the DNeasy Blood and Tissue kit (Qiagen) to extract DNA from the microbial pellets of three pitchers using the Purification of Total DNA from Animal Tissues Spin-Column Protocol (pages 28-30 of the handbook dated 07/2006). Samples were pre-treated with proteinase K (as described on page 45 of the handbook). For each pitcher, one pellet was also pre-treated with lysozyme during the extraction. Five percent of each genomic DNA preparation was loaded on a 1% agarose gel (Appendix 1: Fig. S2a). Samples from all six preparations were pooled and the DNA was precipitated with 0.1 volume of 3 M sodium acetate and two volumes of absolute ethanol. The pooled samples were then centrifuged, and the precipitated DNA was washed with 75% ethanol and resuspended in water. More than 10 µg of total DNA was sent to Génome Québec (Montréal, QC, Canada) for library construction, sequencing, and assembly using the 454 GS-FLX Titanium Sequencing System (Roche).

Protein Extraction, SDS-PAGE, and Mass Spectrometry
Microbial pellets were resuspended in 100 µl of bromophenol blue sample buffer (150 mM Tris pH 6.8, 2% SDS, 5% β-mercaptoethanol, 7.8% glycerol) and heated at 95 °C for five minutes. All samples were diluted in proportion to their pellet size to obtain similar staining levels. After centrifugation, samples were loaded into separate lanes of a 10% polyacrylamide (37.5:1 acrylamide:bis-acrylamide) gel and subjected to SDS-PAGE and Coomassie staining (Fig. 1, Appendix 1: Fig. S1a, and Appendix 1: Fig. S1b).
All six of the enriched pitchers and five of the six control pitchers had visible protein staining levels and were chosen for mass spectrometry. These 11 sample lanes were each divided into five regions (Appendix 1: Fig. S1b) and each region was diced into 1 mm³ pieces. Gel cubes were rinsed with HPLC-grade water, incubated at 37 °C for 30 minutes in 1 ml of destain solution (50 mM ammonium bicarbonate, 50% acetonitrile), and dehydrated in 100% acetonitrile for 10 minutes to remove the Coomassie stain. This destain procedure was repeated a second time to ensure complete removal of the stain.
An in-gel tryptic digest was performed by submerging the dehydrated gel pieces in ice-cold sequencing-grade modified trypsin (6 ng/µl) (Promega, Fitchburg, WI, USA) for 15 minutes, adding ice-cold 50 mM ammonium bicarbonate solution, letting the gel pieces swell on ice, and then incubating the pieces overnight at 37 °C. Digests were centrifuged at 13,000 × g for five minutes and the peptide-containing supernatant was transferred to a 0.6-ml tube. Peptides were further extracted from the gel pieces by adding 100 µl of 50% acetonitrile and 2.5% formic acid, centrifuging for 15 minutes at 13,000 × g, and dehydrating in 100% acetonitrile. All extracted peptides were pooled, dried in a SpeedVac for 1 hour, and stored at −80 °C.

Custom Metagenomic and Protein Databases
We generated a custom protein database from a six-frame translation (three forward and three reverse reading frames) of a metagenomic database constructed from microbial communities of three previously collected pitchers that had captured diverse amounts of prey (Appendix 1: Fig. S2). Pitchers were collected from Molly Bog, an ombrotrophic bog located in Morristown, VT (44.50° N, 72.64° W), on August 18, 2008 and transported in a cooler directly from the field to the University of Vermont. Microbial pellets were obtained immediately as described above.
DNA was extracted from these pellets, pooled, precipitated, and sent for library construction, sequencing, and assembly as described above under Metagenome Extraction and Sequencing. From 243 Mb of sequence information, roughly 54% of 567,549 filtered reads (median read length = 482 bp) were assembled into 26,713 contigs of length greater than 500 bp (Appendix 1: Fig. S2b, Appendix 1: Fig. S2c).
A custom metaproteomic database was created from the metagenome database using the open-source Ruby programming language. Each contig was translated to an amino acid sequence in all six reading frames. Of the resulting amino acid sequences, only those longer than 100 amino acids were retained. Those 184,128 sequences were written to new FASTA files and retained their original description lines. If multiple amino acid sequences came from a single contig, the resulting description lines included unique letter identifiers. As such, every amino acid sequence could be mapped back to a single nucleotide sequence greater than 300 bp in length. To create a decoy database, all retained protein sequences were reversed and then concatenated to the forward database. The decoy database allowed for an estimate of the false identification rate during the database search process, as has been described (Elias & Gygi, 2007).
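The translation and decoy steps described above can be sketched in Python as follows. This is an illustrative sketch, not the Ruby software used in the study: the function names, the `REV_` decoy prefix, the lowercase letter suffixes, and the choice to split each translated frame at stop codons are all assumptions made for the example.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; codons with ambiguous bases become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def letter_id(n):
    """Hypothetical suffix scheme: 0 -> 'a', 25 -> 'z', 26 -> 'aa'."""
    n += 1
    out = ""
    while n:
        n, r = divmod(n - 1, 26)
        out = chr(ord("a") + r) + out
    return out


def six_frame_peptides(contig, min_len=100):
    """Translate all six frames and keep segments longer than min_len.

    Assumption: each frame is split at stop codons ('*').
    """
    contig = contig.upper()
    peptides = []
    for strand in (contig, reverse_complement(contig)):
        for offset in range(3):
            for segment in translate(strand[offset:]).split("*"):
                if len(segment) > min_len:
                    peptides.append(segment)
    return peptides


def build_target_decoy(contigs, min_len=100):
    """contigs maps a description line to a nucleotide sequence.

    Returns (header, protein) pairs: forward entries first, then reversed
    decoy entries concatenated to the end.
    """
    forward = []
    for header, seq in contigs.items():
        for n, pep in enumerate(six_frame_peptides(seq, min_len)):
            forward.append((f"{header}_{letter_id(n)}", pep))
    decoys = [("REV_" + header, pep[::-1]) for header, pep in forward]
    return forward + decoys
```

Searching the concatenated forward and reversed entries together is what lets decoy hits serve as an estimate of false identifications, in the spirit of the target-decoy approach cited above.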

SEQUEST Search Parameters
The following search parameters were used during the SEQUEST search: peptides were required to be tryptic; peptide precursor mass tolerance was set at ±2 Da; and differential oxidation of methionine (15.9949 Da) and differential acrylamidation of cysteine (71.0371 Da) were permitted.

Randomization Test of Shared Proteins
A pool of identified proteins was generated by combining the top 220 protein hits from both treatments. We chose to analyze only the top 220 proteins in each group because the identification of rarer proteins is less certain and because including many rare proteins in the test was likely to add noise caused by the sampling of rare elements. With enough noise added from rarity, there is a danger that the real signal of differences among the common proteins will be swamped. Two hundred twenty protein hits were randomly drawn and assigned to each of the two treatments, without replacement. Each protein hit in the original pool was weighted by the sum of the total number of peptides associated with that protein hit in the two treatments. For each simulation (N = 1000), the number of shared protein hits between treatments was calculated, yielding a probability distribution of the expected number of shared protein hits. The observed number of shared protein hits was calculated by finding the intersection of the list of top 220 control protein hits and the top 220 enriched protein hits. Whether protein hits were drawn with or without replacement, the observed number of shared proteins was less than expected by chance, supporting the alternative hypothesis that the protein pools from the two treatments are distinct from one another (Fig. 2b).
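A minimal Python sketch of the without-replacement variant of this randomization test is given below. It is an illustration under stated assumptions rather than the analysis code used in the study: the function names, the input format (a dictionary from protein identifier to summed peptide count), and the lower-tail p-value with a +1 correction are all choices made for the example.

```python
import numpy as np


def shared_protein_null(peptide_counts, n_draw=220, n_sim=1000, seed=0):
    """Simulate the number of protein hits shared by two random draws.

    peptide_counts: dict mapping each protein in the combined pool to the
    sum of its peptide counts across both treatments (used as weights).
    """
    rng = np.random.default_rng(seed)
    proteins = np.array(list(peptide_counts))
    weights = np.array([peptide_counts[p] for p in proteins], dtype=float)
    probs = weights / weights.sum()
    shared = np.empty(n_sim, dtype=int)
    for k in range(n_sim):
        # Weighted draws without replacement, one per treatment.
        control = rng.choice(proteins, size=n_draw, replace=False, p=probs)
        enriched = rng.choice(proteins, size=n_draw, replace=False, p=probs)
        shared[k] = np.intersect1d(control, enriched).size
    return shared


def lower_tail_p(null_values, observed):
    """Fraction of simulations with as few or fewer shared hits (assumed form)."""
    null_values = np.asarray(null_values)
    return (np.sum(null_values <= observed) + 1) / (null_values.size + 1)
```

With this setup, the observed overlap between the two top-220 lists would be passed to `lower_tail_p` together with the simulated values to ask whether the treatments share fewer proteins than expected by chance.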
We conducted an additional simulation experiment (programmed in R) to test for the possibility of a Type I error (incorrectly rejecting a true null hypothesis) in the randomization test that we used to determine the expected number of shared proteins between enriched and control pitchers. We first simulated a single source protein pool consisting of 10,000 distinct protein types. Next, we created two sets to represent control and treatment groups. For each group, we sampled with replacement from the protein source pool until exactly 200 distinct proteins were represented in each group (typically this required sampling between 200 and 210 individual proteins because occasional duplicates were drawn). As expected, these two samples usually share no proteins or only a small number.
Next, we followed the procedure that we described in our randomization test. Namely, we reshuffled these proteins between the two groups and calculated the number of shared proteins between them. We used 100 replicates per simulated set of proteins and repeated this procedure for 100 trials (preliminary runs showed that the results were just as precise using only 100 replicates instead of the full 1000 employed in the analysis of the real data). If our algorithm is behaving properly, fewer than 5% of such trial simulations should yield a statistically significant result. We conducted two variants of this test. In the first variant, each of the 10,000 proteins was equally abundant. In the second variant, the protein abundances followed an exponential distribution, in which there are a few relatively abundant proteins and a large number of relatively rare proteins. We simulated this distribution by drawing elements from a beta distribution with parameters shape1 = 0.5, shape2 = 1.0.
Of the 100 trials with equally abundant proteins, there was only 1 in which the null hypothesis was rejected. Of the 100 trials with an exponential distribution of protein abundances, none rejected the null hypothesis. We conclude from this exercise that the null model test that we used has good Type I error properties and does not lead to spurious rejection of the null hypothesis when both treatments are sampled from a single protein pool.
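The group-construction step of this Type I error check can be sketched in Python as shown below. The original simulation was written in R and its code is not reproduced here, so this is only an assumed illustration of how two groups might be sampled from one source pool under the two abundance variants; the function names and the `"equal"`/`"skewed"` labels are hypothetical, and the reshuffling and significance-testing steps are not included.

```python
import numpy as np


def draw_group(rng, probs, n_distinct=200):
    """Sample protein types with replacement until n_distinct are represented.

    Returns the set of distinct protein indices and the number of draws used.
    """
    seen = set()
    n_draws = 0
    while len(seen) < n_distinct:
        seen.add(int(rng.choice(probs.size, p=probs)))
        n_draws += 1
    return seen, n_draws


def shared_between_groups(variant="equal", pool_size=10_000,
                          n_distinct=200, seed=0):
    """Count proteins shared by two groups drawn from one source pool."""
    rng = np.random.default_rng(seed)
    if variant == "equal":
        abundance = np.ones(pool_size)
    else:
        # Skewed abundances from a beta distribution (shape1=0.5, shape2=1.0).
        abundance = rng.beta(0.5, 1.0, size=pool_size)
    probs = abundance / abundance.sum()
    control, _ = draw_group(rng, probs, n_distinct)
    treatment, _ = draw_group(rng, probs, n_distinct)
    return len(control & treatment)
```

Because both groups come from the same pool, any apparent difference between them is due to sampling alone, which is the condition under which a well-behaved test should rarely reject the null hypothesis.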

Taxonomic Analysis
To determine the taxonomic composition of the microbes contributing to identified proteins in our treatments, we conducted a BLAST homology search of the metagenomic sequence data for protein hits. All peptides from the top 220 identified proteins in each treatment were mapped back to their contigs of origin to obtain nucleotide sequences. Each nucleotide sequence was repeated by the number of associated peptides and searched via BLAST (NCBI), allowing us to obtain a weighted hit table for each treatment. The GI number of the top BLAST hit was extracted from the hit table for each query sequence in each treatment. The resulting GI numbers were then searched against the NCBI Nucleotide Database via a script that returned organism subfield values (i.e., species names), yielding a list of species names associated with the top BLAST hits for each treatment.
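Repeating each query sequence by its peptide count is equivalent to tallying each top-hit species once per associated peptide. The short Python sketch below illustrates that weighting only; the function name and both input dictionaries are hypothetical, and no BLAST or NCBI lookup is performed.

```python
from collections import Counter


def weighted_species_counts(peptides_per_contig, top_hit_species):
    """Tally peptides per species.

    peptides_per_contig: dict contig id -> number of associated peptides.
    top_hit_species: dict contig id -> species name of the top BLAST hit.
    Contigs without a top hit are skipped.
    """
    counts = Counter()
    for contig, n_peptides in peptides_per_contig.items():
        species = top_hit_species.get(contig)
        if species is not None:
            counts[species] += n_peptides
    return counts
```

The resulting per-species totals have the same form as the peptide columns reported for each treatment in Table S2.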
Hurlbert’s Probability of an Interspecific Encounter (PIE) was calculated for each treatment using the following equation:

$$\mathrm{PIE} = \frac{N}{N-1}\left(1 - \sum_{i=1}^{s} p_i^{2}\right)$$

where N is the total number of peptides identified in a treatment, p_i is the proportion of peptides in a treatment represented by bacterial class i, and s is the number of bacterial classes identified in a treatment.
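As a worked illustration of this index, the following Python sketch computes PIE from a list of peptide counts per bacterial class, using the standard form of Hurlbert's index. The function name and input format are assumptions for the example.

```python
def hurlbert_pie(class_counts):
    """Hurlbert's PIE from peptide counts per bacterial class.

    PIE = N / (N - 1) * (1 - sum(p_i ** 2)), where p_i = count_i / N.
    """
    n_total = sum(class_counts)
    if n_total < 2:
        raise ValueError("PIE requires at least two peptides in total")
    sum_sq = sum((count / n_total) ** 2 for count in class_counts)
    return n_total / (n_total - 1) * (1.0 - sum_sq)
```

For example, ten peptides split evenly between two classes give PIE = (10/9)(1 − 0.5) ≈ 0.556, whereas a treatment in which every peptide belongs to one class gives PIE = 0.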

O2 Requirements
We mapped each bacterial species identified in our BLAST search to its O2 requirement using data from the Integrated Microbial Genomes database (IMG) (Reddy et al., 2015; Timinskas et al., 2014). The IMG database contains six classes of O2 requirements: aerobe, anaerobe, facultative, microaerophilic, obligate aerobe, and obligate anaerobe. The latter three categories make up less than 7% of the database. We merged any species classified as obligate aerobes or obligate anaerobes into the aerobe and anaerobe classes, respectively.
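The merging rule above, followed by a per-category peptide proportion, can be sketched in Python as follows. The function name, the input dictionaries, and the use of an "unclassified" label for species with no recorded requirement are assumptions for illustration (the label mirrors the one used in Table S2); no IMG query is performed.

```python
from collections import defaultdict

# Obligate categories are folded into their broader classes.
MERGE = {"obligate aerobe": "aerobe", "obligate anaerobe": "anaerobe"}


def peptide_proportions_by_o2(species_peptides, o2_lookup):
    """Proportion of peptides per merged O2 requirement category.

    species_peptides: dict species name -> peptide count.
    o2_lookup: dict species name -> O2 requirement string.
    """
    totals = defaultdict(int)
    for species, n_peptides in species_peptides.items():
        raw = o2_lookup.get(species, "unclassified").lower()
        totals[MERGE.get(raw, raw)] += n_peptides
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("no peptides to summarize")
    return {category: n / grand_total for category, n in totals.items()}
```

Applied separately to the control and enriched peptide counts, this yields the kind of proportions plotted by oxygen requirement in Fig. 5b.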
Table S1. Results of a chi-square analysis of pathways represented by proteins in control and enriched pitchers. Bolded values represent those in which the adjusted p-value is <0.05. Columns with peptide counts refer to the total number of peptides associated with a pathway.
Pathway  Control Peptides  Enriched Peptides  Adjusted P-value
Aflatoxin biosynthesis 9 0 0.000
Alanine aspartate and glutamate metabolism 15 196 0.000
alpha Linolenic acid metabolism 0 10 0.294
Aminoacyl tRNA biosynthesis 0 21 0.069
Aminobenzoate degradation 1 23 0.120
Arginine and proline metabolism 21 170 0.000
Ascorbate and aldarate metabolism 4 10 0.824
Benzoate degradation 7 34 0.697
beta Alanine metabolism 4 190 0.000
Biosynthesis of unsaturated fatty acids 2 37 0.048
Biotin metabolism 6 8 0.172
Butanoate metabolism 21 118 0.120
C5 Branched dibasic acid metabolism 0 20 0.077
Caprolactam degradation 7 15 0.465
Carbon fixation in photosynthetic organisms 47 53 0.000
Carbon fixation pathways in prokaryotes 51 152 0.354
Chloroalkane and chloroalkene degradation 8 67 0.064
Citrate cycle TCA cycle 92 140 0.000
Cyanoamino acid metabolism 0 7 0.447
Cysteine and methionine metabolism 68 39 0.000
Drug metabolism cytochrome P450 2 14 0.660
Fatty acid biosynthesis 4 6 0.400
Fatty acid degradation 22 51 0.168
Fatty acid elongation 2 56 0.004
Geraniol degradation 3 67 0.004
Glutathione metabolism 14 73 0.346
Glycerolipid metabolism 2 111 0.858
Glycerophospholipid metabolism 0 19 0.085
Glycine serine and threonine metabolism 0 45 0.004
Glycolysis and glucogenesis 33 169 0.120
Glyoxylate and dicarboxylate metabolism 81 239 0.186
Histidine metabolism 5 33 0.369
Inositol phosphate metabolism 0 70 0.000
Limonene and pinene degradation 21 67 0.781
Lysine degradation 21 53 0.298
Metabolism of xenobiotics by cytochrome P450 2 6 1.000
Methane metabolism 28 76 0.331
Naphthalene degradation 0 4 0.741
Nitrogen metabolism 21 169 0.000
Novobiocin biosynthesis 39 0 0.000
One carbon pool by folate 0 16 0.120
Oxidative phosphorylation 18 98 0.194
Pantothenate and CoA biosynthesis 14 19 0.021
Pentose and glucuronate interconversions 3 39 0.075
Pentose phosphate pathway 40 2 0.000
Phenylalanine tyrosine and tryptophan biosynthesis 30 3 0.000
Phenylalanine metabolism 3 49 0.026
Phenylpropanoid biosynthesis 0 27 0.031
Phosphatidylinositol signaling system 0 5 0.637
Porphyrin and chlorophyll metabolism 3 4 0.027
Primary bile acid biosynthesis 3 20 0.554
Propanoate metabolism 22 144 0.027
Purine metabolism 91 346 0.767
Pyrimidine metabolism 94 257 0.048
Pyruvate metabolism 62 58 0.000
Retinol metabolism 0 3 0.858
Selenocompound metabolism 4 0 0.004
Streptomycin biosynthesis 0 3 0.858
Sulfur metabolism 4 0 0.004
Synthesis and degradation of ketone bodies 5 5 0.120
Taurine and hypotaurine metabolism 6 7 0.120
Terpenoid backbone biosynthesis 2 3 0.741
Tetracycline biosynthesis 2 0 0.120
Thiamine metabolism 3 11 1.000
Toluene degradation 11 17 0.085
Tryptophan metabolism 20 203 0.000
Tyrosine metabolism 13 14 0.007
Valine leucine and isoleucine biosynthesis 10 89 0.021
Valine leucine and isoleucine degradation 37 222 0.013
Table S2. Species, oxygen requirements, and bacterial classes identified in control and enriched pitchers in a BLAST search of nucleotide sequences associated with the top 220 proteins in each treatment, weighted by total peptides. NA values represent species that were non-bacterial.
Species Name  Oxygen Requirement  Class  Control Peptides  Enriched Peptides
Achromobacter xylosoxidans Aerobe Betaproteobacteria 9 0
Acidiphilium multivorum Aerobe Alphaproteobacteria 0 7
Acidovorax avenae Aerobe Betaproteobacteria 12 4
Acidovorax citrulli Aerobe Betaproteobacteria 13 32
Agrobacterium radiobacter Aerobe Alphaproteobacteria 3 0
Alicycliphilus denitrificans Facultative Betaproteobacteria 9 2
Alkalilimnicola ehrlichii Anaerobe Gammaproteobacteria 0 2
Azoarcus sp Facultative Betaproteobacteria 4 0
Azorhizobium caulinodans Unclassified Alphaproteobacteria 3 3
Azospirillum sp Facultative Alphaproteobacteria 0 3
Bordetella pertussis Aerobe Betaproteobacteria 0 2
Bradyrhizobium sp Aerobe Alphaproteobacteria 10 0
Burkholderia cenocepacia Facultative Betaproteobacteria 0 6
Burkholderia cepacia Aerobe Betaproteobacteria 2 0
Burkholderia fungorum Aerobe Betaproteobacteria 14 0
Burkholderia gladioli Aerobe Betaproteobacteria 0 24
Chitinophaga pinensis Aerobe Sphingobacteriia 17 14
Chromobacterium violaceum Facultative Betaproteobacteria 62 1549
Clavibacter michiganensis Aerobe Actinobacteria 2 0
Clostridium saccharobutylicum Anaerobe Clostridia 3 7
Collimonas fungivorans Aerobe Betaproteobacteria 10 0
Corynebacterium halotolerans Aerobe Actinobacteria 3 0
Croceicoccus naphthovorans Unclassified Alphaproteobacteria 4 0
Cupriavidus taiwanensis Facultative Betaproteobacteria 2 0
Dechloromonas aromatica Facultative Betaproteobacteria 2 5
Dechlorosoma suillum Anaerobe Betaproteobacteria 17 61
Delftia acidovorans Aerobe Betaproteobacteria 3 0
Delftia sp Aerobe Betaproteobacteria 0 2
Desulfovibrio vulgaris Anaerobe Deltaproteobacteria 2 0
Draconibacterium orientale Facultative Bacteroidia 12 16
Dyadobacter fermentans Aerobe Cytophagia 0 2
Dyella jiangningensis Aerobe Gammaproteobacteria 2 17
Emticicia oligotrophica Aerobe Cytophagia 4 11
Flavobacteriaceae bacterium Aerobe Flavobacteriia 0 3
Gloeobacter violaceus Aerobe Gloeobacteria 8 0
Hymenobacter sp Aerobe Cytophagia 8 4
Janthinobacterium agaricidamnosum Unclassified Betaproteobacteria 0 4
Janthinobacterium sp Unclassified Betaproteobacteria 13 82
Laribacter hongkongensis Anaerobe Betaproteobacteria 0 8
Leifsonia xyli Aerobe Actinobacteria 14 0
Leptospira interrogans Aerobe Spirochaetia 11 9
Leptothrix cholodnii Aerobe Betaproteobacteria 2 0
Mesorhizobium ciceri Aerobe Alphaproteobacteria 11 0
Methylobacterium aquaticum Aerobe Alphaproteobacteria 4 5
Methylobacterium populi Aerobe Alphaproteobacteria 2 4
Methylobacterium radiotolerans Aerobe Alphaproteobacteria 17 0
Methylovorus sp Facultative Betaproteobacteria 4 0
Microbacterium testaceum Aerobe Actinobacteria 13 3
Niabella soli Aerobe Sphingobacteriia 7 6
Novosphingobium aromaticivorans Aerobe Alphaproteobacteria 56 40
Oxalis latifolia NA NA 2 2
Pedobacter heparinus Aerobe Sphingobacteriia 16 27
Polaromonas naphthalenivorans Aerobe Betaproteobacteria 0 6
Polymorphum gilvum Facultative Alphaproteobacteria 2 0
Pseudogulbenkiania sp Facultative Betaproteobacteria 15 445
Pseudomonas denitrificans Aerobe Gammaproteobacteria 0 7
Pseudomonas entomophila Aerobe Gammaproteobacteria 0 7
Pseudomonas knackmussii Unclassified Gammaproteobacteria 0 10
Pseudomonas pseudoalcaligenes Aerobe Gammaproteobacteria 0 3
Pseudomonas putida Aerobe Gammaproteobacteria 0 12
Pseudomonas rhizosphaerae Aerobe Gammaproteobacteria 21 0
Pseudopedobacter saltans Aerobe Sphingobacteriia 8 6
Pseudoxanthomonas spadix Aerobe Gammaproteobacteria 2 0
Pusillimonas sp Unclassified Betaproteobacteria 2 0
Ramlibacter tataouinensis Aerobe Betaproteobacteria 8 6
Rhizobium etli Aerobe Alphaproteobacteria 22 31
Rhizobium sp Aerobe Alphaproteobacteria 0 5
Rhizophagus intraradices NA NA 2 0
Rhodanobacter denitrificans Facultative Gammaproteobacteria 22 88
Rhodopseudomonas palustris Facultative Alphaproteobacteria 2 31
Roseiflexus sp Facultative Chloroflexi 0 3
Runella slithyformis Aerobe Cytophagia 2 0
Sinorhizobium fredii Aerobe Alphaproteobacteria 8 0
Sphingobium chlorophenolicum Aerobe Alphaproteobacteria 0 5
Sphingobium japonicum Aerobe Alphaproteobacteria 2 0
Sphingobium sp Unclassified Alphaproteobacteria 7 0
Sphingomonas sanxanigenens Aerobe Alphaproteobacteria 3 0
Sphingomonas sp Aerobe Alphaproteobacteria 38 3
Sphingomonas taxi Aerobe Alphaproteobacteria 4 0
Sphingomonas wittichii Aerobe Alphaproteobacteria 30 11
Sphingopyxis alaskensis Aerobe Alphaproteobacteria 7 7
Starkeya novella Aerobe Alphaproteobacteria 41 41
Stenotrophomonas rhizophila Unclassified Gammaproteobacteria 3 0
Terriglobus roseus Aerobe Acidobacteria 4 0
Terriglobus saanensis Aerobe Acidobacteria 2 0
Variovorax paradoxus Aerobe Betaproteobacteria 266 210
Fig. S1. Microbial proteins in control (C) and enriched (E) pitchers. (a) Three replicate pitchers of each treatment were initially processed in November 2012. (b) The remaining replicates were processed in May 2013. Lanes 4, 5, and 7 represent enriched pitchers. Lanes 9, 11, and 13 represent control pitchers. The replicate in lane 13 was omitted from the study due to a lack of protein. Letters a-e represent the regions that each lane was cut into for MS/MS analysis.
[Figure S2 image. (a) Agarose gel. (b) Histogram of read length (bp) against frequency. (c) Histogram of contig length (bp) against frequency.]
Fig. S2. (a) Agarose gel electrophoresis of metagenomic DNA from three pitcher plant microbial communities. One pellet from each pitcher was treated with lysozyme. All samples were pooled prior to sequencing. (b) Frequency distribution of the read lengths in the sequenced metagenomic data. The median read length was 482 bp. (c) Frequency distribution of assembled contig lengths in the metagenomic database. All contigs were 500 bp or greater in length.
Fig. S3. Taxonomic assignments of metagenome, as visualized by Krona. The rings, from the center outward, represent Kingdom (Bacteria), Phylum, Class, and Order.
[Figure S4 image. Bar charts of the proportion of contigs assigned to pathways, panels a) and b).]
Fig. S4. Functional potential of the metagenome. Rank abundance of the proportion of mapped contigs assigned to a) level 2 KEGG pathways and b) level 3 KEGG pathways.
[Figure: Krona sunburst diagrams, panels (a) and (b). Legible minor-taxon labels: (a) Gloeobacter violaceus 1%, Terriglobus 0.8%, Clostridium saccharobutylicum 0.4%, Roseiflexus sp. RS-1 0%; (b) Actinomycetales 0.1%, Gloeobacter violaceus 0%, Terriglobus 0%, Clostridium saccharobutylicum 0.2%, Roseiflexus sp. RS-1 0.1%.]

Fig. S5. Taxonomic assignments of bacterial proteins, as visualized by Krona, differed between control (a) and enriched (b) pitchers. Sunburst diagrams were constructed using nucleotide sequences from the metagenomic data associated with identified proteins in the custom protein database. Nucleotide sequences were weighted by the total number of peptides associated with each sequence. Replicates were pooled for each treatment. Figures feature only matches to bacteria. The rings, from the center outward, represent Kingdom (Bacteria), Phylum, Class, Order, and Family.
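The peptide weighting described in the Fig. S5 caption can be illustrated with a minimal sketch. It assumes each nucleotide sequence has already been given a taxon label and a peptide count; the function name, taxon labels, and counts are hypothetical and do not come from the study's data or from Krona itself.

```python
from collections import defaultdict

def weighted_taxon_proportions(records):
    """Sum peptide counts per taxon and convert them to proportions.

    records: iterable of (taxon, peptide_count) pairs, one pair per
    nucleotide sequence, already pooled across replicates (assumed input).
    """
    totals = defaultdict(int)
    for taxon, peptide_count in records:
        # Each sequence contributes its peptide count, not a count of 1.
        totals[taxon] += peptide_count
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {taxon: count / grand_total for taxon, count in totals.items()}

# Hypothetical pooled records for one treatment.
pooled_records = [("Taxon A", 6), ("Taxon B", 3), ("Taxon A", 1)]
proportions = weighted_taxon_proportions(pooled_records)
```

With this weighting, a taxon represented by a few sequences that each match many peptides can occupy a larger share of the diagram than a taxon with many sequences that each match one peptide.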
[Figure: Unipept sunburst diagrams, panels (a) and (b).]

Fig. S6. Taxonomic assignments of bacterial proteins, as visualized by Unipept, differed between (a) control and (b) enriched pitchers. The rings, from the center outward, represent Kingdom (dark blue = Bacteria), Phylum (white = Proteobacteria), Class (red = Alphaproteobacteria, light blue = Betaproteobacteria), and Order (dark blue = Burkholderiales, rose = Neisseriales, light blue = Sphingomonadales).
Fig. S7. Pathway representation of the proportion of total peptides associated with KEGG pathways differed between control (blue) and enriched (brown) pitchers.
[Figure: heat map; rows: KEGG pathways, with significantly different pathways marked "*"; columns: pooled treatments (C, E) and individual replicates (H1, H2, H3, 2C, 5B, 5A, H4, H6, 3E, 4C); color key: proportion of total peptide identifications in treatment/replicate, 0 to 0.15.]

Fig. S8. KEGG pathway assignments differed between control and enriched pitchers. (a) Heat map of the proportional representation of pathways between control pitchers (C) and enriched pitchers (E) and individual control (H4, H6, E3, C4) and enriched (H1, H2, H3, 2C, 5B, 5A) replicates. Significantly different pathways between pooled control and enriched samples are indicated with “*”.
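The heat map columns combine pooled and per-replicate proportions, which can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the denominator here is the sum of pathway-assigned peptide counts in each column, the pathway names and counts are hypothetical, and only the replicate and treatment labels (H4, H6, H1, C, E) are taken from the caption.

```python
def proportions(counts):
    """Convert {pathway: peptide count} into {pathway: proportion}."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()} if total else {}

def heatmap_columns(replicates, groups):
    """Build one heat map column per pooled treatment and per replicate.

    replicates: {replicate label: {pathway: peptide count}}
    groups:     {pooled label: [replicate labels in that treatment]}
    """
    columns = {}
    for label, members in groups.items():
        pooled = {}
        for rep in members:
            # Pool raw counts first, then normalize the pooled totals.
            for pathway, n in replicates[rep].items():
                pooled[pathway] = pooled.get(pathway, 0) + n
        columns[label] = proportions(pooled)
    for rep, counts in replicates.items():
        columns[rep] = proportions(counts)
    return columns

# Hypothetical counts; labels follow the Fig. S8 caption.
replicates = {
    "H4": {"Pathway X": 2, "Pathway Y": 2},
    "H6": {"Pathway X": 1, "Pathway Y": 3},
    "H1": {"Pathway X": 6, "Pathway Y": 2},
}
groups = {"C": ["H4", "H6"], "E": ["H1"]}
columns = heatmap_columns(replicates, groups)
```

Pooling counts before normalizing means replicates with more identified peptides contribute more to the pooled column than they would if per-replicate proportions were averaged.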
Data S1. Table in .csv file form listing the proteins from the top 220 analyzed in control and enriched treatments and their associated peptides.