bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1
2 Title: A robust metaproteomics pipeline for a holistic taxonomic and functional
3 characterization of microbial communities from marine particles
4 Authors: Doreen Schultz,1 Daniela Zühlke,1 Jörg Bernhardt,1 Thomas Ben Francis,2 Dirk
5 Albrecht,1 Claudia Hirschfeld,1 Stephanie Markert,3 and Katharina Riedel 1*
6
7 Addresses: 1 Institute of Microbiology, University of Greifswald, Greifswald, Germany.
8 2 Max-Planck Institute for Marine Microbiology, Bremen, Germany.
9 3 Institute of Pharmacy, University of Greifswald, Greifswald, Germany.
10
11 *Corresponding author mailing address: Institute of Microbiology, Center for Functional
12 Genomics of Microbes (C_FunGene), University of Greifswald, Felix-Hausdorff-Straße 8, DE-
13 17487 Greifswald, Germany. E-mail [email protected]; Tel (+49) 3834 420 5900
14
15 Running title: Metaproteomic pipeline to analyse marine particles
1
bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
16 Originality-Significance Statement
17 Marine particles consist of organic particulate matter (e.g. phyto- or zooplankton) and
18 particle-associated (PA) microbial communities, which are often embedded in a sugary
19 matrix. A significant fraction of the decaying algal biomass in marine ecosystems is expected
20 to be mineralized by PA heterotrophic communities, which are thus greatly contributing to
21 large-scale carbon fluxes. Whilst numerous studies have investigated the succession of
22 planktonic marine bacteria along phytoplankton blooms, the community structure and
23 functionality of PA bacterial communities remained largely unexplored and knowledge on
24 specific contributions of these microorganisms to carbon cycling is still surprisingly limited.
25 This has been mostly been due to technical problems, i.e. to the difficulty to retrieve genomic
26 DNA and proteins from these polysaccharide-rich entities, their enormous complexity and the
27 high abundance of eukaryotic microorganisms.
28 Our study presents an innovative, robust, reproducible, and reliable metaproteomics
29 pipeline for marine particles, which will help to address and fill the above-described
30 knowledge gap. Employing the here established workflow enabled us to identify more than
31 5,000 PA proteins, which is, at least to our knowledge, the largest number of protein groups
32 ever assigned to marine particles. Notably, the novel pipeline has been validated by a first,
33 comparative metaproteome analysis of free-living and PA bacterial communities indicating a
34 significant functional shift enabling surface-associated bacteria to adapt to particle-specific
35 living conditions. In conclusion, our novel metaproteomics pipeline presents a solid and
36 promising methodological groundwork for future culture-independent analyses of seasonal
37 taxonomic and functional successions of PA microbial communities in aquatic habitats.
2
bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
38 Summary
39 This study aimed to establish a robust, reproducible and reliable metaproteomic pipeline for
40 an in-depth characterization of marine particle-associated (PA) bacteria. To this end, we
41 compared six well-established protein extraction protocols together with different MS-sample
42 preparation techniques using particles sampled during a North Sea spring algae bloom in
43 2009. In this optimized workflow, proteins are extracted using a combination of SDS-
44 containing lysis buffer and cell disruption by bead-beating, separated by SDS-PAGE, in-gel
45 digested and analysed by LC-MS/MS, before MASCOT search against a metagenome-
46 based database and data processing/visualization with the in-house-developed
47 bioinformatics tools Prophane and Paver.
48 As proof of principle, free-living (FL) and particulate communities sampled in April 2009
49 were analysed, resulting in an as yet unprecedented number of 9,354 and 5,034 identified
50 protein groups for FL and PA bacteria, respectively. Our data revealed that FL and PA
51 communities appeared similar in their taxonomic distribution, with notable exceptions:
52 eukaryotic proteins and proteins assigned to Flavobacteriia, Cyanobacteria, and some
53 proteobacterial genera were found more abundant on particles, whilst overall proteins
54 belonging to Proteobacteria were more dominant in the FL fraction. In contrast, significant
55 functional differences including proteins involved in polysaccharide degradation, sugar- and
56 phosphorus uptake, adhesion, motility, and stress response were detected.
3
bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
57 Introduction
58 The oceans provide about one half of the global net primary production (NPP) (Field et al.,
59 1998; Falkowski et al., 1998; Azam and Malfatti, 2007), a large part of which is reprocessed
60 by bacterioplankton in the so-called “microbial loop”, a pathway during which dissolved
61 organic carbon is returned to higher trophic levels via its incorporation into bacterial biomass
62 (Azam, 1998). Notably, about 20% of the bacterioplankton lives attached to algae or marine
63 particles (Azam et al., 1983). These marine particles consist of various kinds of organic
64 matter, i.e. dead/dying zoo- or phytoplankton, bacterioplankton, as well as inorganic small
65 particles held together by a sugary matrix consisting of polysaccharide-composed
66 transparent extracellular particles (TEPs) composed of polysaccharides, which are exuded
67 mostly by phytoplankton but also bacteria (Alldredge et al., 1993). These particle-associated
68 (PA) microbial communities present one of the most ancient forms of “multicellularity” (Hall-
69 Stoodley et al., 2004) and have adapted to strive and survive in marine environments. Whilst
70 some bacteria are only loosely associated with algae, others colonize algal surfaces
71 (Grossart, 1999), where they form commensalistic or symbiotic communities with their host
72 or even predate on algae (Sohn et al., 2004; Amin et al., 2012). Marine particles grow while
73 sinking and thus contribute largely to the “biological pump” by transporting carbon to deeper
74 waters and sediments (Volkman and Tanoue, 2002). These aggregates may reach several
75 centimetres in diameter. They are enzymatically well equipped to metabolize high molecular
76 weight substrates, thus providing nutrition to the attached community as well as leaving
77 nutrients to the surrounding water column community (Simon et al., 2002; Grossart, 2010).
78 About one decade ago, scientists started to link molecular systems biology of
79 microorganisms to ecosystem level processes (e.g. reviewed in Raes and Bork, 2008).
80 Metagenomic studies were initiated to provide valuable knowledge about diversity and
81 distribution of microorganisms in natural environments. Moreover, metatranscriptomics and
82 metaproteomics approaches were established to investigate, which genes are expressed at
83 a given time point and which proteins are particularly abundant in complex biological
84 systems. Metaproteomics has meanwhile widely proven its potential to revisit microbial 4
bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
85 ecology concepts by linking genetic and functional diversity in microbial communities and
86 relating taxonomic and functional diversity to ecosystem stability (Schneider and Riedel,
87 2010). Numerous studies, describing large-scale proteome analyses of acid-mine drainage
88 (AMD) biofilms (Ram et al., 2005), wastewater treatment plants (Wilmes et al., 2008), and
89 fresh-water stream biofilms (Hall et al., 2012) have demonstrated the power of
90 metaproteomics to unveil molecular mechanisms involved in function, physiology, and
91 evolution of surface-associated aquatic microbial communities. Marine metaproteomics has
92 meanwhile been widely applied (Saito et al., 2019; Wang et al., 2014), in particular in
93 habitats such as ocean scale shifts (Morris et al., 2010), the Atlantic (Bergauer et al., 2017)
94 or Antarctic oceans (Williams et al., 2012), e.g., to investigate Roseobacter clade (Christie-
95 Oleza and Armengaud 2015) and bacterioplankton (e.g. Wöhlbrand et al., 2017a) physiology.
96 Teeling et al. (2012) studied the bacterioplankton response to a diatom bloom in the North
97 Sea by an integrated meta-omics approach employing metagenomics and metaproteomics
98 and provided strong evidence that distinct free-living (FL) populations of Bacteroidetes,
99 Gammaproteobacteria, and Alphaproteobacteria specialized in a successive decomposition
100 of algal-derived organic matter. As mentioned above, a significant fraction of decaying algal
101 biomass is, however, mineralized by heterotrophic bacteria living on particles, which process
102 a large fraction of the biosynthesized organic matter (Azam, 1998) and are thus greatly
103 contributing to large-scale carbon fluxes (Battin et al., 2003; Bauer et al., 2006).
104 So far, the majority of the published studies focused on FL bacterioplankton, thereby
105 leaving the PA bacterial communities largely unexplored. The particulate fraction of
106 bacterioplankton has proven challenging to comprehensive meta-omics characterization, due
107 to its high complexity, presence of DNA/protein-binding polysaccharides, process-interfering
108 substances and lack of (meta)genomic information on marine particles (e.g. Wöhlbrand et al.,
109 2017b), despite the fact that information on marine metagenomes is constantly growing
110 (reviewed by Mineta and Gojobori 2016; Alma'abadi et al., 2015). Previous experiments also
111 indicate that a high abundance of eukaryotic proteins contributes to these challenges (Smith
112 et al., 2017; Saito et al., 2019). 5
bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
113 Our goal was therefore to establish a robust, reproducible and reliable pipeline
114 enabling in-depth metaproteomics analyses of marine particles by testing different
115 established protocols for their applicability for protein extraction from PA bacterioplankton in
116 order to unravel the PA community’s specific contribution to polysaccharide decomposition in
117 marine habitats. As stated above, heterotrophic microbial communities are supposed to be
118 well equipped to metabolize algal high molecular weight substrates (Simon et al., 2002;
119 Grossart, 2010). We thus hypothesize that these communities either differ taxonomically
120 from their FL counterparts (as shown by Bizic-Ionescu et al. (2015)) and/or express different
121 genes to adapt to the sessile life style and the availability of specific polysaccharides such as
122 insoluble glycan fibres (as observed by Ganesh et al. (2014)). Moreover, we postulated that
123 this adaption will mostly affect the expression of proteins involved in motility, adhesion, stress
124 response as well as CAZymes together with appropriate sugar transporters. In order to test
125 these hypotheses, a metaproteomics pipeline for marine particles was established and tested
126 as a proof of concept on spring bloom samples collected in April 2009, for which the FL
127 bacterial fraction had previously been characterized (Teeling et al., 2012).
6
bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
128 Results and Discussion
129 Establishment of a metaproteomics pipeline for PA microbial communities
130 As stated above, the metaproteomics analysis of PA microbial communities is severely
131 hampered by their high complexity, the presence of a large proportion of eukaryotic proteins,
132 the sugary particle-matrix as well as the lack of (meta)genomic information on PA-specific
133 pro- and eukaryotes (Wöhlbrand et al., 2017b; Saito et al., 2019). Whilst the metaproteomics
134 analyses by Teeing et al. (2012) and Kappelmann et al. (2019) of FL bacterioplankton
135 (harvested on 0.2 µm filters) sampled during spring blooms from 2009 to 2012 at
136 “Kabeltonne” Helgoland resulted in the identification of several thousand protein groups, the
137 PA microbial communities retained on 3 and 10 µm pore-sized filters emerged as difficult to
138 analyse by the integrated metagenomic/metaproteomic approach employed at that time.
139 We thus aimed to develop a suitable and effective metaproteomics pipeline for in-depth
140 analyses of taxonomic composition and functionality of marine PA microbial communities. To
141 this end, six different well-established protein extraction protocols for various particulate
142 samples were tested using biomass from 3 µm and 10 µm pore-sized filters (assigned as PA
143 fraction) of the MIMAS bio-archive (www.mimas-project.de) covering different stages of the
144 bloom, which had been collected in 2009 during the spring phytoplankton bloom sampling
145 campaign by Teeling et al. (2012). Moreover, various MS sample preparation protocols and
146 different protein sequence databases were evaluated.
147 [Table 1]
148 Protein extraction. Efficient protein extraction is a crucial step for successful
149 metaproteomics analyses of the microbial communities. In a first step, we therefore tested
150 five different protein extraction methods that employ different strategies and that were
151 already successfully applied for metaproteome analyses of microbial communities from
152 different environments, i.e. sewage sludge (phenol extraction; Kuhn et al., 2011), leaf litter
153 (SDS-TCA; Schneider et al., 2012), stream hyporheic biofilms (SDS-acetone; Hall et al.,
7
bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
154 2012), hypersaline microbial mats (bead beating; Moog, 2012), and soil (freezing and
155 thawing; Chourey et al., 2010; Thompson et al., 2008). In addition, the commercially
156 available TRI-Reagent® (Sigma Aldrich) for simultaneous isolation of RNA, DNA and proteins
157 was tested (Table 1). Filter samples used for protocol evaluation originated from several
158 sampling dates in the 2009 spring bloom sampling campaign (9th of February, 7th of April, 21st
159 of April, and 16th of June 2009). Filters were cut into pieces and subjected to the respective
160 extraction protocol (Fig. 1A). Total protein amounts extracted from the filters by each of the
161 applied methods were quite variable. Highest protein yield as determined by the Pierce™
162 BCA Protein Assay and 1D SDS-PAGE was obtained using the SDS-acetone or bead
163 beating approach (Table 1; Fig. 1B). In conclusion, SDS-acetone- and bead beating-based
164 protocols turned out to be most efficient for protein extraction from particles and were
165 therefore used for optimizing the downstream MS sample preparation procedure.
166 MS sample preparation. Total protein was extracted by the SDS-acetone and bead beating
167 method from filters collected on 28th of April 2009 and separated by 1D SDS-PAGE (Fig.
168 1B). Even though MS sample preparation via GeLC MS/MS is more time-consuming
169 compared to 1D or 2D-LC approaches, it has been proven valuable to purify protein extracts
170 and remove polymeric contaminants (e.g. Lassek et al. 2015; Keiblinger and Riedel, 2018)
171 and yields comparable results as LC-based peptide fractionation (Hinzke et al., 2019). To
172 determine whether an increase in the total number of individual gel sub-fractions will lead to
173 more protein identifications, gel lanes were cut in either 10 or 20 equal-sized fractions,
174 proteins were in gel trypsin-digested and the resulting peptides were subjected to LC-MS/MS
175 analysis. Moreover, we tested whether reduction and alkylation of the proteins prior to tryptic
176 digestion increased protein identification rates (Fig. 1B). Searching the acquired spectra in
177 the so far available 0.2 µm 2009 (MIMAS) database (Teeling et al., 2012) revealed that the
178 best results (Fig. 1B, Fig. S1) were obtained by higher fractionation (20 gel pieces) without
179 reduction and alkylation.
180 [Figure 1]
8
bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
181 Optimizing databases. Metagenomic sequencing, assembling and annotation of FL (0.2 µm
182 pore-sized filters) and PA (3 and 10 µm pore-sized filters) fractions of water samples
183 collected during the Helgoland spring bloom 2009 was performed in parallel to the
184 optimization of the metaproteomics protocol (for details see experimental procedures).
185 Unfortunately, the coverage and quality of the metagenome sequences of the large
186 particulate fraction (10 µm pore-sized filters) was not sufficient to be correctly assembled,
187 annotated and translated. Thus, the metagenomic database used for subsequent database
188 searches was only composed of sequences of FL bacteria (0.2 µm pore-sized filters) and
189 microbial communities present in the medium particulate fraction (3 µm pore-sized filters).
190 The LC-MS/MS spectra obtained with the bead beating protocol were searched against
191 four different databases to identify the database that results in the highest number of reliably
192 identified protein groups (Fig. S2): (I) the non-redundant NCBI database (NCBInr), (II) a
193 database with Uniprot sequences from abundant bacteria and diatoms identified by Teeling
194 et al. (2012) (PABD), (III) the database used by Teeling et al. (2012) containing proteins
195 based on translated metagenomes of FL bacteria (0.2 µm pore-sized filters from different
196 sampling time points) of the spring bloom 2009 (MIMAS) and (IV) a database based on the
197 metagenomes of the 0.2 and 3 µm pore-sized filters from samples of the 14th of April 2009
198 (0.2 + 3 µm 2009). Best results were obtained with the 0.2 + 3 µm 2009 database (Fig. S2),
199 which is not surprising as the resolving power of metaproteome analyses relies heavily upon
200 the database used for protein identification (e.g. Schneider and Riedel, 2010; Teeling et al.,
201 2012). It is, moreover, well accepted that metaproteomic data are most informative in
202 combination with complementary omics approaches, i.e. genomics and transcriptomics (e.g.
203 Banfield et al., 2005; Ram et al., 2005).
204 Based on our results, we finally decided to use the bead beating-based protocol for
205 protein extraction since this method resulted in more reproducible protein yields compared to
206 the SDS-acetone extraction protocol (Fig. 1B). Furthermore, this method was less time-
207 intensive compared to the SDS-acetone approach and resulted in the identification of the
9
bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
208 highest number of unique protein groups, most probably due to the effective disintegration of
209 the particulate matrix by EDTA added to the extraction buffer (Passow, 2002). In summary,
210 the optimized metaproteomic pipeline for marine particles (Fig. 2) comprises protein
211 extraction by bead beating, protein fractionation by 1D SDS-PAGE (20 fractions), followed by
212 in-gel tryptic digestion, LC-MS/MS and database search against the matching metagenome
213 database (0.2 + 3 µm 2009).
214 [Figure 2]
215 Proof of principle – comparative metaproteome analyses of FL and PA bacterioplankton
216 As proof of principle, the newly established protocol was applied for a comparative
217 metaproteomic analysis of FL and PA microbial communities sampled at the 14th of April
218 2009 in 0.2 µm - 3 µm (= FL), 3 - 10 µm and ≥ 10 µm (= PA) fractions. Five technical
219 replicates of each sample were subjected to the final optimized workflow and the resulting
220 MS/MS-data were searched against the matching metagenome-based database (0.2 + 3 µm
221 2009). Employing our optimized pipeline, we were able to record 460,000, 360,000, and
222 440,000 spectra, which subsequently led to the identification of 9,354 protein groups (19.4%
223 of spectral IDs), 2,263 protein groups (10.2% of spectral IDs), and 2,771 protein groups
224 (10.7% of spectral IDs) for the 0.2 - 3 µm (Table S1), 3 - 10 µm (Table S2), and ≥ 10 µm
225 (Table S3) fractions, respectively. This is, at least to our knowledge, the largest number of
226 protein groups ever identified for marine particles. Comparable studies addressing
227 metaproteomic analyses of marine sediments of the Bering Sea (Moore et al., 2012), the
228 coastal North Sea, and the Pacific Ocean (Wöhlbrand et al., 2017b) identified less than 10%
229 of the protein identification numbers resulting from the here presented novel metaproteomic
230 pipeline.
231 1,956 of the identified protein groups of the two PA fractions were also identified in the
232 FL fraction and only 276 proteins were exclusively found in the PA fractions (Fig. S3). This
233 suggests that protein expression profiles of planktonic and particulate bacteria vary less than
10
bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
234 expected. However, this might also be due to the fact that PA bacteria are known to hop on
235 and off particles and, e.g. as offspring cells searching for a place to settle, may thus only
236 temporarily be part of the planktonic community (Ghiglione et al., 2007; Grossart, 2010;
237 Crespo et al., 2013). Moreover, clogging of filter pores by particles may cause retention of FL
238 bacteria thus contaminating the PA fractions by planktonic bacteria. In addition, the lack of ≥
239 10 µm pore-sized filter metagenomic sequences hampers comprehensive protein
240 identifications in this PA fraction that may result in a virtually lower abundance than
241 expected.
242 Taxonomic differences between FL and PA bacterioplankton. Besides the somewhat
243 unexpected similarity of the FL and PA metaproteomic datasets, the phylogenetic
244 assignment of the identified protein groups indicated some notable taxonomic differences
245 between the FL and PA fractions (Fig. 3, Table S4).
246 [Figure 3]
247 The number of eukaryotic protein groups was significantly higher in the PA fractions
248 (Fig. 3A) and comprised 43% and 54% of the protein groups identified for the 3 - 10 µm and
249 ≥ 10 µm fractions, respectively, in contrast to only 11% of the protein groups identified for the
250 FL phytoplankton. Moreover, the number of viral protein groups was found to be almost three
251 times higher in the two particulate fractions when compared to their planktonic counterpart
252 (Fig. 3A). The most abundant phyla within both, the FL and PA fractions, were
253 Proteobacteria (FL 55%; PA 41% and 39%, 3 µm and 10 µm pore-sized filters) and
254 Bacteroidetes (FL 40%; PA 48% and 47%, 3 µm and 10 µm pore-sized filters). Proteins
255 expressed by Alpha-, Beta- and Gammaproteobacteria were generally more dominant in the
256 FL bacteria, whilst proteins assigned to Cyanobacteria (e.g. Synechococcus, Arthrospira),
257 Opitutae, Flavobacteriia (e.g. Arenitalea, Olleya, Algibacter, Lacinutrix), and some
258 proteobacterial genera (e.g. Oceanicoccus, Candidatus Puniceispirillum, Neptuniibacter,
259 Halioglobus, Ramlibacter) were more abundant in the PA fraction (Fig. 3B). This is in good
260 accordance to other studies confirming that Bacteroidetes have been identified in both, FL 11
bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
261 and PA, bacterioplankton (DeLong et al., 1993; Eilers et al., 2001; Abell and Bowman, 2005;
262 Alonso et al., 2007). Moreover, Flavobacteriia have been found highly abundant during
263 phytoplankton blooms indicating that they play an important role as consumers of algal-
264 derived organic matter (Simon et al., 1999; Riemann et al., 2000; Pinhassi et al., 2004;
265 Grossart et al., 2005; Teeling et al., 2016; Chafee et al., 2018).
266 Functional differences between FL and PA bacterioplankton. Notably, differences in the
267 protein profiles between FL and PA bacteria were more evident on the functional level (Fig.
268 4). Most importantly, the SusC/D utilization system, specific glycoside hydrolases, i.e. GH
269 family 1, 13, and 16 (including beta-glucosidases, alpha-1,4-amylases, and exo- and endo-
270 1,3-beta-glycanases), glycosyl transferases and TonB-dependent transporters were found
271 with higher overall expression levels in the PA fractions compared to the FL fraction (Fig.
272 4A). This is in good accordance with the high substrate availability (Caron et al., 1982;
273 Grossart et al., 2003; Fernández-Gómez et al., 2013), especially the presence of highly
274 abundant microalgae storage polysaccharides, i.e. alpha- and beta-glucans (Kroth et al.,
275 2008), in the particles. Sulfatases, capable of cleaving sulphate sugar ester bonds, are
276 contributing to the degradation of specific sulphated algal polysaccharides such as mannans
277 and fucans (Gómez-Pereira et al., 2012). This is well supported by our finding that sulfatases
278 are strongly expressed by PA Flavobacteriia, especially Formosa sp. (Fig. 4A & B).
279 [Figure 4]
280 FL and PA bacteria seem to employ different phosphate acquisition strategies: whilst in
281 FL bacteria phosphate and phosphonate ABC-type transporters appeared highly expressed,
282 PA bacteria rather seem to employ phytases and phosphate-selective porins. As expected,
283 various proteins involved in stress response were differentially expressed. Interestingly,
284 functions involved in oxidative stress defense appeared to be less abundant in the PA
285 fraction (maybe due to shading, reducing solar irradiation stress in the particles), whilst
286 proteins for heavy metal and antibiotic resistance were strongly expressed in the ≥ 10 µm
287 fraction, which also contained the highest proportion of eukaryotic proteins (Fig. 4A). This 12
bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
288 might be due to the fact that some algae take up and store heavy metals (Gaudry et al.,
289 2007) and are capable of producing antibiotics (Grossart, 1999). This indicates that close
290 eukaryote-bacterial interactions in particles require such defense strategies of the associated
291 bacteria. As expected, adhesion proteins as well as proteins involved in motility, i.e. flagella
292 and type IV-pili, were more abundant on the particles, emphasizing their importance for
293 biofilm/aggregate formation (O´Toole & Kolter, 1998; Lemon et al., 2007; Houry et al., 2010;
294 Burke et al., 2011). Interestingly, proteorhodopsin, an inner membrane protein involved in
295 light-dependent energy generation, which has been proposed to enable FL bacteria such as
296 Polaribacter (Fernández-Gómez et al., 2013) and Pelagibacter (Giovannoni et al., 2005) to
297 survive under low nutrient conditions, was also abundantly identified in PA bacteria such as
298 Polaribacter, Paraglaciecola, and Marinosulfomonas in our analyses.
299 Eukaryotes are highly abundant and might contribute to polysaccharide degradation
300 on marine particles. Our metaproteomics data demonstrate a high abundance of
301 eukaryotes on the particles (Fig. 5). Preliminary analyses indicate that these include
302 numerous microalgal groups, e.g. diatoms, Pelagophytes, Raphidiophytes, Cryptophytes,
303 Dinoflagellates and Haptophytes, but also fungi and various protozoa (Table S2 and S3).
304 This clearly sets particles apart from the FL fraction and highlights the importance of direct
305 eukaryote-bacterial interactions in particles. Previous work on FL bacteria showed that
306 bacterial succession was largely independent of phytoplankton composition, and instead
307 determined by broad substrate availability (Teeling et al., 2016). PA bacterial composition is
308 more likely to be directly controlled by algal composition due to the intimate nature of their
309 interactions (Grossart et al., 2005), although functional redundancy may be substantial
310 (Burke et al., 2011). Moreover, eukaryotes may also contribute to polysaccharide
311 degradation in concert with bacteria. For example, fungal taxa can be abundant in marine
312 particles and have been shown to utilize algal polysaccharides such as laminarin
313 (Bochdansky et al., 2016; Cunliffe et al., 2017).
314 [Figure 5]
13
bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
315 Conclusions and outlook
316 Our comparative metaproteomic analyses of marine microbial communities living either
317 planktonically or attached to particles resulted in an as yet unequalled number of identified
318 protein groups for marine particles. Interestingly, the great overlap between metaproteomes
319 of FL and PA heterotrophic bacterial communities indicates that taxonomic differences
320 between them might be less pronounced than previously thought. This might be due to the
321 fact that (I) FL bacteria can rapidly adapt to the surface-associated life style, as the majority
322 of these bacteria seems to be also present on the particles and proteins important for biofilm-
323 formation, i.e. motility and adhesion proteins, are also expressed when living planktonically,
324 and that (II) FL or PA-specific bacteria are frequently hopping on and off the particles.
325 Notably, there is strong evidence that bacteria, when living on the particles, express life style-
326 specific functions, i.e. special CAZymes, sugar transporters and proteins involved in certain
327 stress responses, which enable them to cope with the unique living conditions on marine
328 particles.
329 Although our optimized metaproteomic workflow significantly improved the identification
330 rate of PA proteins, the number of protein identifications from the particles is still
331 considerably lower compared to FL bacterial communities. We assume that especially the
332 high abundance of eukaryotic proteins poses problems in protein identification due to the
333 complexity and diversity of microbial eukaryote genomes and the presence of introns and
334 repeats in the metagenomic DNA sequence databases, which hinders peptide identification
335 (Saito et al., 2019). Metaproteome coverage of marine particles could be significantly
336 improved by employing customized databases including eukaryotic metatranscriptomic
337 (RNA-based) sequence data (Keeling et al., 2014). This can be achieved by generating
338 metatranscriptomes from the particular fractions. Alternatively, protein identification could
339 also be substantially improved by extracting already existing metatranscriptomic and
340 metagenomic data from relevant eukaryotic taxa from public databases. Key to the latter
341 approach is reliable information on which eukaryotic organisms make up the particles, which
14
bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
342 can be attained by 18S rRNA gene amplicon sequencing. Perspectively, we will extend our
343 analyses on eukaryotic taxa and analyse multiple time points during phytoplankton bloom to
344 investigate succession of taxonomical clades and expressed functions of marine particles
345 from pre-bloom to post-bloom conditions.
15
bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
346 Experimental Procedures
347 Bacterial biomass samples. Sampling of bacterioplankton was performed as described
348 previously (Teeling et al., 2012). Briefly, surface water samples were taken during spring
349 2009 at the station “Kabeltonne” (50° 11.3’ N, 7° 54.0’ E) between the main island Helgoland
350 and the minor island Düne about 40 km offshore in the southeastern North Sea in the
351 German Bight. Bacterial biomass for protein extraction was sequentially filtered with
352 peristaltic pumps onto 10 µm, 3 µm, and 0.2 µm pore-sized filters (142 mm diameter) to
353 separate PA and FL bacteria. All filters were stored at -80 °C until further analyses.
354 Testing of protein extraction protocols. To test six different existing protein extraction
355 protocols for their applicability on PA bacteria, filters from several time points containing
356 varying amounts of biomass were chosen. Sample preparation for the metaproteomic
357 analysis included cutting the filters into quarters and subsequently into small pieces (1-2 mm
358 in diameter). Pieces of one quarter filter were transferred into 15 ml falcon tubes and treated
359 according to the respective protocol.
360 Protocol 1 - Phenol. Filter pieces were incubated in 2.4 ml of a 0.1 M NaOH solution for 10
361 min at room temperature and were then sonicated three times for 30 s at 20% power output
362 (Sonopuls HD2200 with microtip MS 73; Bandelin electronic, Germany). Subsequently the
363 sample was centrifuged for 15 min at 12,500 x g at 20 °C to separate the supernatant from
364 the filter pieces. The supernatant was transferred into a new tube and protein extraction
365 using phenol was performed according to the protocol published by Kuhn and colleagues
366 (Kuhn et al., 2011).
367 Protocol 2 - SDS-TCA. Filter pieces were mixed with 5 ml extraction buffer (1% (w/v) SDS,
368 50 mM Tris/HCl, pH 7.0) and vigorously shaken for 2 min at room temperature. The cell
369 disruption by sonication, boiling and shaking was performed according to the protein
370 extraction protocol published by Schneider and colleagues (Schneider et al., 2012).
371 Subsequently, proteins were precipitated with 10% TCA over night at 4 °C. The precipitated
16
bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
372 proteins were centrifuged for 20 min at 12,500 x g at 4 °C and the pellet was washed two
373 times in ice-cold acetone.
374 Protocol 3 - TRI-Reagent®. The TRI-Reagent® (Sigma-Aldrich, product-number T9424) is
375 used for the simultaneous isolation of RNA, DNA, and proteins. Filter pieces were transferred
376 into 4 ml TRI-Reagent and shaken vigorously for 5 min. Subsequently the proteins were
377 extracted according to the manufacturer’s guidelines.
378 Protocol 4 - Freeze and Thaw. Protein extraction was carried out according to a combination
379 of the extraction protocols of Chourey et al. (2010) and Thompson et al. (2008). To this end,
380 filter pieces were mixed with 4 ml lysis buffer (5% SDS, 50 mM Tris/HCl, 0.1 mM EDTA,
381 0.15 M NaCl, 1 mM MgCl2, 50 mM DTT, pH 8.5) and vigorously shaken for 3 min.
382 Subsequently, the samples were boiled for 10 min, followed by two freezing and thawing
383 cycles with liquid nitrogen. After cooling at 4 °C the samples were vigorously shaken for
384 3 min. To remove cell debris, samples were centrifuged for 20 min at 12,500 x g at 4 °C. The
385 proteins in the supernatant were precipitated with 25% TCA over night at 4 °C. Precipitated
386 proteins were centrifuged for 20 min at 12,500 x g at 4 °C and the resulting protein pellet was
387 washed with ice-cold acetone.
388 Protocol 5 - SDS-Acetone. Filter pieces were mixed with 5 ml extraction buffer (50 mM Tris,
389 1% (w/v) SDS, pH 7.5) and vortexed vigorously. Proteins were extracted by sonication,
390 boiling and shaking as described by Hall and colleagues (Hall et al., 2012). Subsequently,
391 proteins were precipitated with five volumes acetone over night at -20 °C. The precipitated
392 proteins were centrifuged for 20 min at 12,500 x g at 4 °C and the pellet washed two times in
393 ice-cold acetone.
394 Protocol 6 – bead beating. Protein extraction was carried out according to the extraction
395 protocol of Moog (2012), which is based on the protocol of Teeling and colleagues (Teeling
396 et al., 2012). To this end, filter pieces were covered with 4 ml lysis buffer (0.1 M DTT, 0.01 M
397 EDTA, 10% Glycerol (v/v), 1.7 mM PMSF, 5% SDS (w/v), 0.05 M Tris/HCl, pH 6.8) and 2 ml
398 glass beads (0.1 – 0.11 mm diameter) were added. The cells on the filter pieces were 17
bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
399 subsequently disrupted four times for 30 s with 6.5 m/s via bead beating with a Fast Prep™-
400 24 (MP Biomedicals, Germany). To remove cell debris and glass beads, samples were
401 centrifuged for 20 min at 12,500 x g at 4 °C and the supernatant was transferred into new
402 tubes. This washing step was repeated 2 to 4 times until the beads were colourless. The
403 glass beads were washed with 3 ml lysis buffer and vigorously shaken. Proteins enriched in
404 the pooled supernatants were precipitated with 1:4 acetone at -20 °C over night. Precipitated
405 proteins were centrifuged for 20 min at 12,500 x g at 4 °C and the resulting protein pellet
406 washed with ice-cold acetone.
407 All resulting protein pellets were air-dried and resolved in 8 M urea / 2 M thiourea.
408 Determination of protein concentrations. . Protein concentrations were determined using
409 the Pierce™ BCA Protein Assay Kit (Thermo Fisher Scientific). Protein extracts were
410 prepared with the Compat-Able™ Protein Assay Preparation Reagent Kit (Thermo Fisher
411 Scientific) according to the manufacturer’s guidelines.
412 SDS-PAGE protein separation. 30 µg protein or 30 µl protein extract was mixed with
413 4x SDS sample buffer (20% glycerol, 100 mM Tris/HCl, 10% (w/v) SDS, 5% β-
414 mercaptoethanol, 0.8% bromphenol blue, pH 6.8) and loaded on TGX precast 4-20% gels
415 (Biorad, Germany). Samples were separated by electrophoresis at 150 V for 45 min. After
416 fixation (10% acetic acid, 40% ethanol, 30 min) the gels were stained with Brilliant Blue G250
417 Coomassie and imaged.
418 Protein digestion and MS-sample preparation. Three different protocols were tested on
419 proteins extracted from 3 µm and 10 µm filters.
420 Protocol 1 - 10 gel pieces. Protein lanes were cut into 10 equal-sized pieces and washed
421 with a buffer containing 50 mM ammoniumbicarbonate and 30% (v/v) acetonitrile. Prior to
422 tryptic digestion, gel pieces were dried in a vacuum concentrator and re-swollen with 2 ng/µl
423 trypsin solution (sequencing grade trypsin, Promega, USA) followed by overnight digestion at
424 37 °C. After digestion the gel pieces were covered with water and peptides were eluted from
18
bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
425 the gel in an ultrasonic bath. The eluted peptides were desalted with C18 Millipore® ZipTip
426 columns (Millipore) according to the manufacturer’s guidelines.
427 Protocol 2 - 20 gel pieces. The protein lanes were cut into 20 equal-sized pieces and treated
428 as described above.
429 Protocol 3 - 20 gel pieces with reduction and alkylation. The protein lanes were cut into 20
430 equal pieces and washed with a buffer containing 100 mM ammoniumbicarbonate
431 (NH4HCO3) and 50% (v/v) methanol. Subsequently, proteins were reduced in 50 mM
432 NH4HCO3 containing 10 mM DTT for 30 min at 60 °C, followed by alkylation in 50 mM
433 NH4HCO3 containing 50 mM iodoacetamide (IAA) for 60 min in the dark at room temperature.
434 Prior to tryptic digestion, the gel pieces were dehydrated using 100% acetonitrile and dried,
435 re-swollen with 2 ng/µl trypsin solution and incubated at 37 °C over night. Peptides were
436 eluted from the gel pieces by a six-step procedure, using acetonitrile, 1% (v/v) acetic acid in
437 water, acetonitrile, 10% (v/v) acetic acid and two times acetonitrile. Peptide-containing
438 supernatants were pooled and completely dried in a vacuum concentrator. Samples were
439 subsequently resolved in buffer A (5% (v/v) acetonitrile, 0.1% (v/v) formic acid) and desalted
440 with C18 Millipore® ZipTip columns (Millipore) according to the manufacturer’s guidelines.
441 Constructions of a protein sequence database from marine particle metagenomes.
442 Environmental DNA was extracted from the 0.2 µm, 3 µm and 10 µm pore-sized filters
443 sampled on the 14th of April 2009 by a modified standard protocol of Zhou et al. (1996). In
444 detail one polycarbonate filter was cut into 4 pieces and mixed with 13.5 ml extraction buffer
445 (100 mM Tris-HCl (pH 8.0), 100 mM EDTA (pH 8.0), 100 mM Na-phosphate (pH 8.0), 1.5 M
446 NaCl, 1% CTAB (Hexadecyltrimethylammonium-bromide)). Subsequently 100 µl 10 mg/ml
447 Proteinase K was added and the sample was incubated shaking at 37 °C for 30 min. 1.5 ml
448 20% SDS was added and the sample was incubated shaking at 65 °C for 2 h. The sample
449 was centrifuged at 6,000 x g for 10 min at room temperature and the supernatants were
450 transferred to fresh tubes. Subsequently, an equal volume of chloroform/isoamylalcohol was
451 added and the sample was mixed carefully by shaking and was centrifuged at 10,000 x g for 19
bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
452 10 min at room temperature. Afterwards the aqueous upper phase was transferred into a
453 new tube and the DNA was precipitated by addition of 0.6 volumes isopropanol. The sample
454 was moderately shaken over night at 4°C. After centrifugation at 50,000 x g for 20 min at
455 room temperature, the pellet was washed with 10 ml 80% (v/v) ethanol and dried. The pellet
456 was resuspended in 200 µl TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 7) and stored at -20
457 °C until sequencing.
458 DNA was sequenced at the Max Planck Sequencing Centre (Cologne, Germany), using the
459 Illumina HiSeq 2500 platform and 2 x 250 bp chemistry. Sequences were then trimmed using
460 bbduk v35.14 (http://bbtools.jgi.doe.gov) with the following parameters: ktrim = r k = 28
461 mink = 12 hdist = 1 tbo = t tpe = t qtrim = rl trimq = 20 minlength = 100. Read quality for each
462 sample was then confirmed using FastQC v0.11.2 (Andrews, 2010). Trimmed and filtered
463 reads from the three metagenomic datasets were then assembled individually. The 0.2 µm
464 pore-sized filter sample was assembled with metaSPAdes v3.10.1 (Nurk et al., 2017) with
465 kmers of length 21, 33, 55, 77, 99, and 127, and error correction mode switched on.
466 Assembly of the larger size fraction was done with MEGAHIT v1.1.3 (Li et al., 2016) with
467 kmers 21, 33, 55, 77, 99, 127, 155, 183, and 211. Assembled contigs longer than 1500 base
468 pairs were kept for gene predictions. Genes were predicted and annotated using Prokka
469 v1.11 (Seeman, 2014), which implements prodigal v2.6.3 (Hyatt et al., 2010) for ORF
470 prediction.
471 Raw read sequences and assembled contig sequences have been deposited in the
472 European nucleotide archive (ENA) under the project accession number PRJEB2888.
473 LC-MS/MS data acquisition and data analysis. Peptides were separated by reversed-
474 phase chromatography on an Easy-nLC 1000 (Thermo Scientific) with self-packed C18
475 analytical columns (100 µm × 20 cm) and coupled to a LTQ Orbitrap Velos mass
476 spectrometer(Thermo Scientific) using a non-linear binary gradient of 80 minutes from 5 %
477 solvent A (0.1 % (v/v) acetic acid) to 99 % solvent B (0.1 % acetic acid (v/v), 99.9 %
478 acetonitrile (v/v)) and a flow rate of 300 nl/min. Survey scans at a resolution of 30,000 were 20
bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
479 recorded in the Orbitrap analyser (m/z 300 - 1700) and the 20 most intense precursor ions
480 were selected for CID fragmentation in the LTQ. Dynamic exclusion of precursor ions was
481 enabled; single-charged ions and ions with unknown charge state were excluded from
482 fragmentation. Internal lock mass calibration was enabled (lock mass 445.120025).
483 The mass spectrometry raw data were converted into mgf files using MSConvert (64-bit,
484 Proteowizard 3) and subsequently subjected to database searching via Mascot (Matrix
485 Science; version 2.6.0). Four different protein sequence databases were used for peptide to
486 spectrum matching: I) the non-redundant NCBI database (NCBI nr - NCBIprot_20171030
487 database (136,216,794 entries)), II) a database containing protein sequences of abundant
488 bacteria and diatoms (PABD) based on the study of Teeling et al. (2012), and retrieved from
489 Uniprot KB (Uniprot_DoS_complete_20170829 database (2,638,314 entries)), III) a database
490 containing protein sequences of the free-living fraction from Teeling et al. (2012) (0.2 µm
491 2009 (MIMAS) - MIMAS_forward_reverse_all_contaminants database (1,579,724 entries)),
492 and IV) a database based on translated metagenomes from 0.2 µm and 3 µm filters (see
493 below for details) (0.2 + 3 µm 2009 - 02_plus_3_POMPU_nr97_fw_cont_20181015 database
494 (1,463,571 entries). Mascot was searched with a fragment ion mass tolerance of 0.80 Da
495 and a parent ion tolerance of 10.0 ppm. Oxidation of methionine was specified as a variable
496 modification, trypsin was set as digestion enzyme and a maximum of two missed cleavages
497 was allowed.
498 Scaffold (version Scaffold_4.8.7, Proteome Software Inc.) was used to validate MS/MS
499 based peptide and protein identifications. Peptide identifications were accepted if they could
500 be established at greater than 95.0% probability by the Peptide Prophet algorithm (Keller et
501 al., 2002) with Scaffold delta-mass correction. Protein identifications were accepted if they
502 could be established at greater than 99.0% probability and contained at least one identified
503 peptide. Protein probabilities were assigned by the Protein Prophet algorithm (Nesvizhskii et
504 al., 2003). Proteins that contained similar peptides and could not be differentiated based on
505 MS/MS analysis alone were grouped to satisfy the principles of parsimony. Peptides that
21
bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
506 were only found in one of the replicates were excluded from the following data analysis.
507 Mass spectrometry proteomics data have been deposited to the ProteomeXchange
508 Consortium via the PRIDE partner repository (Perez-Riverol et al., 2019) with the data set
509 identifier PXD12699 (reviewer account details: username [email protected];
510 password 5CKUi0AF).
511 For further data analysis, the software ProPHAnE (Proteomics result Pruning and Homology
512 group Annotation Engine; version 3.1.1) (Schneider et al., 2011) was used. For the
513 taxonomical classification of the identified protein groups the NCBI NR database (version
514 2018-08-02; e-value 0.01, query cover 0.9, max-target-seqs 1) and the diamond blastp
515 algorithm (version 0.8.22) were used. For functional classification of the identified protein
516 groups the eggmap database (version 4.5.1, downloaded at 2018-07-31) and the algorithm
517 e-mapper were used.
518 A list of common contaminants was added to all translated ORF sequences found by
519 metagenome analysis of the 0.2 µm and 3 µm filters from the sampling date 14th of April
520 2009. Redundant sequences were eliminated (97% redundancy, elimination of shorter
521 sequence) using CD-HIT (www.cdhit.org), a program for clustering and comparing protein or
522 nucleotide sequences, resulting in the database
523 02_plus_3_POMPU_nr97_fw_cont_20181015 (1,463,571 entries).
524
525 Acknowledgements.
526 We thank the DFG for financial support (RI 969/9-1) in the frame of the Research Unit
527 “POMPU” (FOR 2406). We are grateful to Sabine Kühn for technical assistance and to
528 Thomas Schweder for helpful discussions.
529
530
22
bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
531 References
532 Abell, G.C.J., and Bowman, J.P. (2005) Colonization and community dynamics of class
533 Flavobacteria on diatom detritus in experimental mesocosms based on Southern Ocean
534 seawater. FEMS Microbiol Ecol 53: 379-391.
535 Alma'abadi, A.D., Gojobori, T., and Mineta, K. (2015) Marine metagenome as a resource for
536 novel enzymes. GPB 13: 290-295.
537 Alldredge, A.L., Passow, U., and Logan, B. (1993) The existence, abundance, and
538 significance of large transparent exopolymer particles in the ocean. Deep-Sea Res I 40:
539 1131-1140.
540 Alonso, C., Warnecke, F., Amann, R., and Pernthaler, J. (2007) High local and global
541 diversity of Flavobacteria in marine plankton. Environ Microbiol 9: 1253-1266.
542 Amin, S.A., Parker, M.S., and Armbrust, E.V. (2012) Interactions between Diatoms and
543 Bacteria. MMBR 76: 667-684.
544 Andrews, S. (2010) FastQC: a quality control tool for high throughput sequence data.
545 Available online at: http://www.bioinformatics.babraham.ac.uk/projects/fastqc
546 Azam, F., Fenchel, T., Field, J.G., Gray, J.S., Meyer-Reil, L.A., and Thingstad, F. (1983) The
547 Ecological Role of Water-Column Microbes in the Sea. Mar Ecol Prog Ser 10: 257-263.
548 Azam F. (1998) Microbial control of Oceanic carbon flux: the plot thickens. Science 280: 694-
549 696.
550 Azam, F., and Malfatti, F. (2007) Microbial structuring of marine ecosystems. Nature Reviews
551 5: 782-791.
552 Banfield, J.F., Verberkmoes, N.C., Hettich, R.L., and Thelen, M.P. (2005) Proteogenomic
553 approaches for the molecular characterization of natural microbial communities. OMICS 9:
554 301-333.
555 Battin, T.J., Kaplan, L.A., Denis Newbold, J., and Hansen, C.M. (2003) Contributions of
556 microbial biofilms to ecosystem processes in stream mesocosms. Nature 426: 439-442.
23
bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
557 Bauer, M., Kube, M., Teeling, H., Richter, M., Lombardot, T., Allers, E., et al. (2006) Whole
558 genome analysis of the marine Bacteroidetes ‘Gramella frosetii’ reveals adaptations to
559 degradation of polymeric organic matter. Environ Microbiol 8: 2201-2213.
560 Bergauer, K., Fernandez-Guerra, A., Garcia, J.A.L., Sprenger, R.R., Stepanauskas, R.,
561 Pachiadaki, M.G., et al. (2018) Organic matter processing by microbial communities
562 throughout the Atlantic water column as revealed by metaproteomics. Proc Natl Acad Sci
563 U S A 115: E400-E408.
564 Bižić-Ionescu, M., Zeder, M., Ionescu, D., Orlić, S., Fuchs, B.M., Grossart, H.P., et al. (2015)
565 Comparison of bacterial communities on limnic versus coastal marine particles reveals
566 profound differences in colonization. Environ Microbiol 17: 3500-3514.
567 Bochdansky, A.B., Clouse, M.A., and Herndl, G.J. (2016). Eukaryotic microbes, principally
568 fungi and labyrinthulomycetes, dominate biomass on bathypelagic marine snow. ISME J
569 11: 362-373.
570 Burke, C., Steinberg, P., Kjelleberg, S., and Thomas, T. (2011) Bacterial community
571 assembly based on functional genes rather than species. PNAS 108: 14288-14293.
572 Caron, D.A., Davis, P.G., Madin, L.P., and Sieburth, J.M. (1982) Heterotrophic bacteria and
573 bacterivorous protozoa in oceanic macroaggregates. Science 218: 795-797.
574 Chafee, M., Fernàndez-Guerra, A., Buttigieg, P.L., Gerdts, G., Eren, A.M., Teeling, H., et al.
575 (2018) Recurrent patterns of microdiversity in a temperate coastal marine environment.
576 ISME J 12: 237-252.
577 Chourey, K., Jansson, J., VerBerkmoes, N., Shah, M., Chavarria, K.L., Tom, L.M., et al.
578 (2010) Direct cellular lysis/protein extraction protocol for soil metaproteomics. J Proteome
579 Res 9: 6615-6622.
580 Christie-Oleza, J.A., and Armengaud, J. (2015) Proteomics of the Roseobacter clade, a
581 window to the marine microbiology landscape. Proteomics 15: 3928-3942.
582 Crespo, B.G., Pommier, T., Fernández-Gómez, B., and Pedrós-Alió, C. (2013) Taxonomic
583 composition of the particle-attached and free-living bacterial assemblages in the
584 Northwest Mediterranean Sea. Microbiology Open 2: 541-552.
24
bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
585 Cunliffe, M., Hollingsworth, A., Bain, C., Sharma, V., and Taylor, J.D. (2017) Algal
586 polysaccharide utilisation by saprotrophic planktonic marine fungi. Fungal Ecol 30: 135-
587 138.
588 DeLong, E.F., Franks, D.G., and Alldredge, A.L. (1993) Phylogenetic diversity of aggregate-
589 attached vs. free-living marine bacterial assemblages. Limnol Oceanogr 38: 924-934.
590 Eilers, H., Pernthaler, J., Peplies, J., Glöckner, F.O., Gerdts, G., and Amann, R. (2001)
591 Isolation of novel pelagic bacteria from the German Bight and their seasonal contribution
592 to surface picoplankton. AEM 67: 5134-5142.
593 Falkowski, P.G., Barber, R.T., and Smetacek, V. (1998) Biogeochemical controls and
594 feedbacks on ocean primary production. Science 281: 200-206.
595 Fernández-Gómez, B., Richter, M., Schüler, M., Pinhassi, J., Acinas, S., González, J.M. et
596 al. (2013) Ecology of marine Bacteroidetes. A comparative genomics approach. ISME J 7:
597 1026-1037.
598 Field, C.B., Behrenfeld, M.J., Randerson, J.T., and Falkowski, P. (1998) Primary production
599 of the biosphere: integrating terrestrial and oceanic components. Science 281: 237-240.
600 Ganesh, S., Parris, D.J., DeLong, E.F., and Stewart, F.J. (2014) Metagenomic analysis of
601 size-fractionated picoplankton in a marine oxygen minimum zone. ISME J 8: 187-211.
602 Gaudry, A., Zeroual, F., Gaie-Levrel, F., Moskura, M., Boujrhal, F.Z., Cherkaoui, E. et al.
603 (2007) Heavy metals pollution of the atlantic marine environment by the Moroccan
604 phosphate industry, as observed through their bioaccumulation in Ulva lactuca. Water Air
605 Soil Pollut 178: 267-285.
606 Ghiglione, J.F., Mevel, G., Pujo-Pay, M., Mousseau, L., Lebaron, P., and Goutx, M. (2007)
607 Diel and seasonal variations in abundance, activity, and community structure of particle-
608 attached and free living bacteria in NW Mediterranean Sea. Microb Ecol 54: 217-231.
609 Giovannoni, S.J., Tripp, H.J., Givan, S., Podar, M., Vergin, K.L., Baptista, D., et al. (2005)
610 Genome Streamlining in a Cosmopolitan Oceanic Bacterium. Science 309: 1242-1245.
25
bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
611 Gómez-Pereira, P.R., Schüler, M., Fuchs, B.M., Bennke, C., Teeling, H., Waldmann, J., et al.
612 (2012) Genomic content of uncultured Bacteroidetes from contrasting oceanic provinces
613 in the North Atlantic Ocean. Environ Microbiol 14: 52-66.
614 Grossart, H.P. (1999) Interactions between marine bacteria and axenic diatoms
615 (Cylindrotheca fusiformis, Nitzschia laevis, and Thalassiosira weissflogii) incubated under
616 various conditions in the lab. Aquat. Microb Ecol 19: 1-11.
617 Grossart, H.P., Hietanen, S., and Ploug, H. (2003) Microbial dynamics on diatom aggregates
618 in Oresund, Denmark. Mar Ecol Prog Ser 249: 69-78.
619 Grossart, H.P., Levold, F., Allgaier, M., Simon, M., and Brinkhoff, T. (2005) Marine diatom
620 species harbour distinct bacterial communities. Environ Microbiol 7: 860-873.
621 Grossart, H.P. (2010) Ecological consequences of bacterioplankton lifestyles: changes in
622 concepts are needed. Environ Microbiol Rep 2: 706-714.
623 Hall, E.K., Besemer, K., Kohl, L., Preiler, C., Riedel, K., Schneider, T., et al. (2012) Effects of
624 resource chemistry on the composition and function of stream hyporheic Biofilms. Front
625 Microbiol 3: 1-14.
626 Hinzke, T., Kouris, A., Hughes, R.A., Strous, M., and Kleiner, M. (2019) More Is Not Always
627 Better: Evaluation of 1D and 2D-LC-MS/MS Methods for Metaproteomics. Front Microbiol
628 10: 238.
629 Houry, A., Briandet, R., Aymerich, S., and Gohar, M. (2010) Involvement of motility and
630 flagella in Bacillus cereus biofilm formation. Microbiology 156: 1009-1018.
631 Hyatt, D., Chen, G.L., LoCascio, P.F., Land, M.L., Larimer, F.W., and Hauser, L.J. (2010)
632 Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC
633 Bioinformatics 11: 119.
634 Kappelmann, L., Krüger, K., Hehemann, J.H., Harder, J., Markert, S., Unfried, F., et al.
635 (2019) Polysaccharide utilization loci of North Sea Flavobacteriia as basis for using
636 SusC/D-protein expression for predicting major phytoplankton glycans. ISME J 13: 76-91
637 Keeling, P.J., Burki, F., Wilcox, H.M., Allam, B., Allen, E.E., Amaral-Zettler, L.A., et al. (2014)
638 The Marine Microbial Eukaryote Transcriptome Sequencing Project (MMETSP):
26
bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
639 Illuminating the functional diversity of eukaryotic life in the oceans through transcriptome
640 sequencing. PLoS Biol 12: e1001889-6.
641 Keiblinger, K.M., and Riedel, K. (2018) Sample Preparation for Metaproteome Analyses of
642 Soil and Leaf Litter. Methods Mol Biol 1841: 303-318.
643 Keller, A., Nesvizhskii, A.I., Kolker, E., and Aebersold, R. (2002) Empirical statistical model to
644 estimate the accuracy of peptide identifications made by MS/MS and database search.
645 Anal Chem 74: 5383-5392.
646 Kroth, P.G., Chiovitti, A., Gruber, A., Martin-Jezequel, V., Mock, T., Schnitzler Parker, M., et
647 al. (2008). A model for carbohydrate metabolism in the diatom Phaeodactylum tricornutum
648 deduced from comparative whole genome analysis. PLoS One 3: e1426.
649 Kuhn, R., Benndorf, D., Rapp, E., Reichl, U., Palese, L.L., and Pollice, A. (2011)
650 Metaproteome analysis of sewage sludge from membrane bioreactors. Proteomics 11:
651 2738-2744.
652 Lassek, C., Burghartz, M., Chaves-Moreno, D., Otto, A., Hentschker, C., Fuchs, S., et al.
653 (2015) A metaproteomics approach to elucidate host and pathogen protein expression
654 during catheter-associated urinary tract infections (CAUTIs). Mol Cell Proteomics 14: 989-
655 1008.
656 Lemon, K.P., Higgins, D.E., and Kolter, R. (2007) Flagellar motility is critical for Listeria
657 monocytogenes biofilm formation. J Bacteriol 189: 1009-1018.
658 Li, D., Luo, R., Liu, C.M., Leung, C.M., Ting, H.F., Sadakane, K., et al. (2016) MEGAHIT
659 v1.0: A Fast and Scalable Metagenome Assembler driven by Advanced Methodologies
660 and Community Practices. Methods 102: 3-11.
661 Mineta, K., and Gojobori, T. (2016). Databases of the marine metagenomics. Gene 576: 724-
662 728.
663 Moog, G. (2012) Optimierung der Probenaufbereitung für Metaproteomanalysen mittels
664 Massenspektrometrie. Diploma thesis at the University of Greifswald, Institute of
665 Microbiology.
27
bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
666 Moore, E.K., Nunn, B.L., Goodlett, D.R., Harvey, H.R. (2012) Identifying and tracking
667 proteins through the marine water column: insights into the inputs and preservation
668 mechanisms of protein in sediments. Geochim Cosmochim Acta 83: 324-359.
669 Morris, R.M., Nunn, B.L., Frazar, C., Goodlett, D.R., Ting, Y.S., and Rocap, G. (2010)
670 Comparative metaproteomics reveals ocean-scale shifts in microbial nutrient utilization
671 and energy transduction. ISME J 4: 673-685.
672 Nesvizhskii, A.I., Keller, A., Kolker, E., and Aebersold, R. (2003) A statistical model for
673 identifying proteins by tandem mass spectrometry. Anal Chem 75: 4646-4658.
674 Nurk, S., Meleshko, D., Korobeynikov, A., and Pevzner, P.A. (2017) metaSPAdes: a new
675 versatile metagenomic assembler. Genome Res 27: 824-834.
676 O’Toole, G.A., and Kolter, R. (1998) Flagellar and twitching motility are necessary for
677 Pseudomonas aeruginosa biofilm development. Mol Microbiol 30: 295-304.
678 Perez-Riverol, Y., Csordas, A., Bai, J., Bernal-Llinares, M., Hewapathirana, S., Kundu, D.J.,
679 et al. (2019) The PRIDE database and related tools and resources in 2019: improving
680 support for quantification data. Nucleic Acids Res 47: D442-D450.
681 Pinhassi, J., Sala, M.M., Havskum, H., Peters, F., Guadayol, O., Malits, A., et al. (2004)
682 Changes in bacterioplankton composition under different phytoplankton regimens. AEM
683 70: 6753-6766.
684 Raes, J., and Bork, P. (2008) Molecular eco-systems biology: towards an understanding of
685 community function. Nat Rev Microbiol 6: 693-699.
686 Ram, R.J., Verberkmoes, N.C., Thelen, M.P., Tyson, G.W., Baker, B.J., Blake, R.C., et al.
687 (2005) Community proteomics of a natural microbial biofilm. Science 308: 1915-1920.
688 Riemann, L., Steward, G.F., and Azam, F. (2000) Dynamics of bacterial community
689 composition and activity during a mesocosm diatom bloom. AEM 66: 578-587.
690 Saito, M.A., Bertrand, E.M., Duffy, M.E., Gaylord, D.A., Held, N.A., Hervey, W.J., et al.
691 (2019) Progress and challenges in ocean metaproteomics and proposed best practices for
692 data sharing. J Proteome Res 18: 1461-1476.
28
bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
693 Schneider, T., and Riedel, K. (2010) Environmental proteomics. Analysis of structure and
694 function of microbial communities. Proteomics 10: 785-798.
695 Schneider, T., Schmid, E., de Castro Junior, J.V., Cardinale, M., Eberl, L., Grube, M., et al.
696 (2011) Structure and function of the symbiosis partners of the lung lichen (Lobaria
697 pulmonaria L. Hoffm.) analyzed by metaproteomics. Proteomics 11: 2752-2756.
698 Schneider, T., Keiblinger, K.M., Schmid, E., Sterflinger-Gleixner, K., Ellersdorfer, G.,
699 Roschitzki, B., et al. (2012) Who is who in litter decomposition? Metaproteomics reveals
700 major microbial players and their biogeochemical functions. ISME J 6: 1749-1762.
701 Seemann, T. (2014) Prokka: rapid prokaryotic genome annotation. Bioinformatics 30: 2068-
702 2069.
703 Simon, M., Grossart, H.P., Schweitzer, B., and Ploug, H. (2002) Microbial ecology of organic
704 aggregates in aquatic ecosystems. Aquat Microb Ecol 28: 175-211.
705 Simon, M., Glöckner, F.O., and Amann, R. (1999) Different community structure and
706 temperature optima of heterotrophic picoplankton in various regions of the Southern
707 Ocean. Aquat Microb Ecol 18: 275-284.
708 Sohn, J.H., Lee, J.H., Yi, H., Chun, J., Bae, T.S., Ahn, T.Y., et al. (2004) Kordia algicida gen.
709 nov., sp. nov., an aligicidal bacterium isolated from red tide. Int J Syst Evol Microbiol 54:
710 675-80.
711 Teeling, H., Fuchs, B.M., Becher, D., Klockow, C., Gardebrecht, A., Bennke, C.M., et al.
712 (2012) Substrate-controlled succession of marine bacterioplankton populations induced by
713 a phytoplankton bloom. Science 336: 608-611.
714 Teeling, H., Fuchs, B.M., Bennke, C.M., Krüger, K., Chafee, M., Kappelmann, L., et al.
715 (2016) Recurring patterns in bacterioplankton dynamics during coastal spring algae
716 blooms. Elife 5: e1188.
717 Thompson, M.R., Chourey, K., Froelich, J.M., Erickson, B.K., VerBerkmoes, N.C., and
718 Hettich, R.L. (2008) Experimental approach for deep proteome measurements from small-
719 scale microbial biomass samples. Anal Chem 80: 9517-9525.
29
bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
720 Volkman, J., and Tanoue, E. (2002) Chemical and biological studies of particulate organic
721 matter in the ocean. J Oceanogr 58: 265-279.
722 Wang, D.Z., Xie, Z.X., and Zhang, S.F. (2014). Marine metaproteomics: current status and
723 future directions. J Proteomics 97: 27-35.
724 Williams, T.J., Long, E., Evans, F., Demaere, M.Z., Lauro, F.M., Raftery, M.J., et al. (2012) A
725 metaproteomic assessment of winter and summer bacterioplankton from Antarctic
726 Peninsula coastal surface waters. ISME J 6: 1883-1900.
727 Wilmes, P., Wexler, M., and Bond, P.L. (2008) Metaproteomics provides functional insight
728 into activated sludge wastewater treatment. PLoS One 3: e1778.
729 Wöhlbrand, L., Wemheuer, B., Feenders, C., Ruppersberg, H.S., Hinrichs, C., Blasius, B., et
730 al. (2017a) Complementary metaproteomic approaches to assess the bacterioplankton
731 response toward a phytoplankton spring bloom in the Southern North Sea. Front Microbiol
732 8: 442.
733 Wöhlbrand, L., Feenders, C., Nachbaur, J., Freund, H., Engelen, B., Wilkes, H., et al.
734 (2017b) Impact of extraction methods on the detectable protein complement of
735 metaproteomic analyses of marine sediments. Proteomics 17: 22.
736 Zhou, J., Bruns, M.A., and Tiedje, J.M. (1995) DNA Recovery from Soils of Diverse
737 Composition. AEM 62: 316-322.
30
bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
738 Figure legends
739 Figure 1: (A) Sampling strategy and (B) evaluation of protein extraction and MS
740 sample preparation protocols. (A) Water samples collected at “Kabeltonne” Helgoland
741 during the spring bloom 2009 were sequentially filtered to obtain the 0.2 - 3 µm (FL) and two
742 PA (3 - 10 µm = medium, ≥ 10 µm = large) fractions as described in Teeling et al. (2012).
743 Filters were initially cut into three or four pieces, which were subsequently shredded and
744 mixed with the respective extraction buffer. (B) Filters (medium particles = yellow; large
745 particles = blue) from different sampling time points (turquoise, green and red) were
746 processed according to the six different protocols describe in the experimental procedure
747 section. With regard to the extracted protein amount the bead beating and SDS-acetone
748 approaches obviously outcompeted the four other protocols. However, the SDS-acetone
749 protocol was less reproducible than the bead beating protocol. Considering bead beating and
750 SDS-acetone as best performing protocols, they were employed to test different MS sample
751 preparation approaches, i.e., different number of SDS gel fractions for tryptic digestion
752 together with protein reduction (red.) and alkylation (alk.) prior to tryptic digestion. The
753 subsequent LC-MS/MS analyses revealed best results for the bead beating protocol followed
754 by GeLC-MS/MS from 20 fractions without protein reduction and alkylation as shown in the
755 bottom line of the figure. Bubble sizes for the large (blue) and medium (yellow) particles
756 correspond to the number of identified protein groups (see also Fig. S1).
757
758 Figure 2: Final metaproteomics pipeline. Protein extraction from filters was conducted
759 using 5% (w/v) SDS containing lysis buffer, cell disruption by FastPrep-mediated bead
760 beating, separation of proteins by 1D-SDS-PAGE, tryptic in-gel digestion, LC-MS/MS
761 analyses on an Orbitrap VelosTM mass spectrometer, MASCOT database search against the
762 metagenome-based database (0.2 + 3 µm 2009) and data-processing and visualization with
763 the in-house-developed bioinformatics tools Prophane 3.1 and Paver.
764
31
bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
765 Figure 3: Taxonomic affiliation of proteins of FL and PA metaproteomes during the
766 spring bloom on 14th of April 2009 at “Kabeltonne” Helgoland. (A) Distribution of pro-
767 and eukaryotes in the FL (0.2 - 3 µm) and PA (3 - 10 µm, ≥ 10 µm) fractions based on the
768 relative abundance of protein groups assigned to the different phylogenetic groups. (B)
769 Voronoi treemaps visualizing the phylogenetic assignment of bacterial protein groups
770 identified in FL (red) and PA (yellow and blue) fractions. Cell size corresponds to the relative
771 abundance of the respective bacterial genus on protein level. Proteins of Reinekea for
772 example are most abundant in the FL fraction and are therefore encoded by a large red
773 treemap cell. In the PA fractions they can be detected only in traces resulting in very small
774 cell sizes (coloured in yellow and blue). Algibacter protein abundance, on the other hand,
775 was notably higher in the PA fractions, compared to the FL fraction.
776
777 Figure 4: Functional assignment of proteins in FL and PA metaproteomes during the
778 spring bloom on 14th of April 2009 at “Kabeltonne” Helgoland. (A) Total abundance of
779 selected protein groups with assigned functions in the FL (0.2 – 3 µm = small) and PA (3 - 10
780 µm = medium and ≥ 10 µm = large) fractions. (B) Voronoi treemaps showing the
781 phylogenetic assignment of selected functional protein groups identified in FL (red) and PA
782 (yellow and blue) fractions. Cell size corresponds to the relative abundance of the respective
783 genus within specific functional categories.
784
785 Figure 5: Phylogenetic assignment of eukaryotic proteins present in the FL and PA
786 fractions during the spring bloom on 14th of April 2009 at “Kabeltonne” Helgoland. (A)
787 Distribution of different eukaryotes in the FL (0.2 µm) and PA (0.2 - 3 µm and ≥ 10 µm)
788 fractions as shown by the relative protein abundances assigned to the different eukaryotic
789 phylogenetic groups. (B) Voronoi treemap visualizing the relative abundance of eukaryotic
790 taxa based on the abundance of assigned proteins extracted from the FL (red) and PA
791 (yellow and blue) fractions. Cell size corresponds to the relative abundance of the respective
792 genus. In this preliminary analysis, protein identification is based on metagenomic (DNA-
32
bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
793 based) information from the filtered fractions, which suffers limitations for eukaryotic protein
794 identification, probably resulting in incomplete functional and taxonomic profiles.
795
796
33
bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
797 Supplemental Material
798 Figure S1. Protein identifications obtained by different extraction and protein pre-
799 fractionation protocols. For the medium particle size fraction (3 - 10 µm, yellow), 20 gel
800 fractions after standard treatment, i.e. without protein reduction (red.) and alkylation (alk.),
801 resulted in the highest number of identified protein groups, no matter which protein extraction
802 protocol (SDS-acetone (red) or bead beating (green)) was applied. For the large particle
803 fraction (≥ 10 µm, blue) the general trend was similar. However, the beat beating protocol
804 performed better compared to the SDS-acetone protocol.
805
806 Figure S2: Number of identified protein groups obtained with different databases: (I)
807 the non-redundant NCBI database (NCBInr, 136,216,794 entries), (II) a database with
808 Uniprot sequences of known abundant bacteria and diatoms identified by the study of
809 Teeling et al. (2012) (PABD, 2,638,314 entries), (III) a metagenome-based database
810 employed for the FL bacterial fraction within the study of Teeling et al. (2012) (MIMAS,
811 1,579,724 entries) and (IV) a database based on translated metagenomes of the FL fraction
812 on the 0.2 µm filters and particles on the 3 µm filters sampled on the 14th of April 2009 (0.2 +
813 3 µm 2009, 1,463,572 entries).
814
815 Figure S3: Venn diagram of overlapping and fraction-specific protein sets.
816
817 Table S1: Prophane output for proteins extracted from 0.2 µm pore-sized filters
818 Table S2: Prophane output for proteins extracted from 3 µm pore-sized filters
819 Table S3: Prophane output for proteins extracted from 10 µm pore-sized filters
820 Table S4: Distribution of phylogenetic groups within proteins extracted from the 0.2 µm, 3
821 µm and 10 µm pore-sized filters.
34 bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Table 1: Comparison of the six tested protein extraction protocols
Protocol Nr. 1 - Phenol 2 - SDS-TCA 3 - TRI-reagent® 4 - Freeze and thaw 5 - SDS-acetone 6 - Bead beating
Reference Kuhn et al. (2011) Schneider et al. (2012) Sigma Aldrich Chourey et al. (2010) Hall et al. (2012) Moog (2012) Thompson et al. (2008) Originally used for sewage sludge from leaf litter simultaneous extraction soil stream hyporheic hypersaline microbial biomembrane reactors of RNA, DNA, and biofilms mats proteins Composition of 0.1 M NaOH 1% (w/v) SDS, 50 mM TRI-Reagent® 5% (w/v) SDS, 50 mM 1% (w/v) SDS, 50 mM 5% (w/v) SDS, 0.05 the protein Tris/HCl, pH 7 (guanidine thiocyanate Tris/HCl, 0.1 mM Tris/HCl, pH 6.8 mM Tris/HCl, 0.1 M extraction buffer and phenol monophase EDTA, 0.15 M NaCl, 1 DTT, 0.01 M EDTA,
solution) mM MgCl2, 50mM DTT, 10% (v/v) glycerol, 1.7 pH 8.5 mM PMSF, pH 6.8 Cell disruption sonication sonication TRI-Reagent® 2 freeze and thaw sonication FastPrep® 6.5 m/s, methodology 3 x 30 s (20% power 3 x 40 s (20% power cycles (liquid nitrogen, 5 x 1 min (20% power 4 x 30 s output) output) rt), 10 min boiling output), 15 min boiling, procedure repeated on the pellet Additional protein phenol extraction (2x) chloroform extraction, / / / / purification ethanol extraction Protein 0.1 M ammonium- 10% TCA 2-propanol (1:1.5) 25% TCA acetone (1:5) acetone (1:4) precipitation acetate in methanol (1:5) Mean total protein amount [µg] 8.9 22.7 16.1 25.2 38.6 27.3 3 - 10 µm fraction Mean total protein amount [µg] 8.8 12.8 38.2 24.3 102.1 114.2 ≥ 10 µm fraction
35
A Kabeltonne Helgoland ≥10µm large (pa*) ≥3µmmedium (pa) bioRxiv preprint doi: https://doi.org/10.1101/667428≥0.2µm small (f*); this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. sampling fltering >10 µm cutting cutting >3 µm +buffer free water flters flter pieces sample ready for sample with cells for comparative cell disruption and analyses protein extraction
B protocol development and validation samples on flters 9.2.2009 16.6.2009 7.4.2009 21.4.2009
protein phenol SDS SDS bead TRI freezing extraction TCA acetone beating Reagent thawing protein amount [µg] legend samples area ≙ 1µg on flters ≥ 3µm 28.4.2009 ≥10µm replicates
protein SDS bead extraction acetone beating
protein amount [µg]
SDS PAGE
classifed as inadequate * * classifed as inadequate protein 20 10 20 20 10 20 fractions pieces pieces pieces pieces pieces pieces bubble size ≙# red. & alk. red. & alk. of identifed protein groups
LC- MS/MS and data base search classifed as inadequate fnal classifed as inadequate protocol bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Kabeltonne Helgoland ≥10µm large (pa*) ≥3µmmedium (pa) ≥0.2µm small (f*) cell sampling fltering >10 µm cutting disruption >3 µm +glassbeads 6.5 m/s free water flters lysis buffer sample ready for 4x30s cell lysate sample with cells cell disruption and and debris protein extraction centrifuging washing pooling centri- protein drying fugation precipitation add 1:4 acetone overnight -20 °C dry protein protein precipitated pooled pellet pellet protein supernatants with proteins re- 8M urea 2M thio- suspending urea 1D SDS tryptic PAGE cutting digestion ...... 30 µg/lane one lane add trypsin 20 fractions protein in 1D gel gel fractions digest sample buffer ZipTip purifcation Mascot Server Prophane Paver signal signal ...
m/z m/z data analysis raw mass spec LC-MS/MS and visualization data measurement digest bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
A 100% proteins assigned particle fraction to main taxa color code 80% >0.2 µm >3 µm >10 µm Archaea 0.05 0.12 0.04 60% Bacteria 84.74 44.57 30.84 40% Eukaryota 11.08 43.27 54.32 unclassified 2.76 8.84 11.75 20% various 0.04 0.23 0.26 Viruses 1.33 2.97 2.79 ≥0.2µm 0% free living (f) particle associated (pa) >0.2 µm >3 µm >10 µm particle sizes small medium large
Catenovulum Lacimicrobium Salmonella Candidatus Pseudobacteriovorax Bacteriovorax Enterococcus Bowmanella Riesia Thauera Staphylococcus Citrobacter Lactobacillus Anaerococcus var Myxococcus uc Brachyspira Azovibrio Alteromonadaceae Clostridiales
Salinimonas Calditerricola
Weissella
Streptococcus
Plesiocystis Anaeromyxobacter Clostridium uc Marinomonas Deltaproteobacteria Marinobacterium Glaciecola Alishewanella Escherichia Motiliproteus Colwellia Bacillus
Chondromyces Desulfopila B Candidatus Butyricicoccus Faecalibacterium Candidatus Paenibacillus Babela Magnetomorum Eubacterium
Saccharibacillus Leptospira Desulfatibacillum Thermoactinomyces
uc Kosakonia Melittangium Enhygromyxa Geothermobacter uc Haliangium Desulfuromonas Minicystis Agarivorans Geoalkalibacter Syntrophobacter Proteobacteria Geobacter Megasphaera Thalassomonas
uc Betaproteobacteria Aquaspirillum Yersiniaceae var Ferrimonas Thalassotalea Serratia Marinobacter Proteobacteria Neisseria
Oligofexia Polynucleobacter
Tissierellia uc Psychrosphaera Paraferrimonas Methylococcaceae var Snodgrassella Alteromonas Burkholderiaceae Paraglaciecola Caballeronia Methylomonas
Burkholderia
Algicola Clostridia Neptuniibacter Methylomagnum
Deltaproteobacteria Bacilli Bordetella Cupriavidus Amphritea Luminiphilus Plasticicumulans
Spirochaetia uc Pseudoalteromonas Advenella Paraburkholderia Proteobacteria Negativicutes Reinekea var Balneatrix Limnohabitans Burkholderiales Ramlibacter Collimonas Candidatus Thioglobus Oceanospirillum Massilia var Proteobacteria uc
uc Alteromonadales Oceanospirillaceae Betaproteobacteria Bermanella Duganella Thiolapillus Comamonas Pseudospirillum Ferrovum Hydrogenophaga
Variovorax Simplicispira Candidatus Halomonas Halioglobus Rhodoferax Symbiobacter Aestuariibacter Lampropedia Neptunomonas Pseudorhodoferax uc Polaromonas
Thalassolituus Bacterioplanes uc Hylemonella Halieaceae Comamonadaceae Porticoccus Marinospirillum Aliiglaciecola uc Alteromonadaceae Acidovorax Curvibacter Nitrincola Malikia Macromonas Loktanella Ascidiaceihabitans Oceanobacter Salinicola Gynuella var uc Xylophilus Pseudohaliea Burkholderiales Tepidimonas Ketobacter Idiomarina Thiomonas uc Psychromonas var uc Burkholderiales Marinicella Nitrosomonadales Methylotenera Candidatus Alteromonadales Methylopumilus
uc uc Methylophilaceae Alcanivorax Spongiibacteraceae Gammaproteobacteria Leptothrix Methyloversatilis Haliea Chromatocurvus Corallincola Sulfuricella Endozoicomonas Nitrosomonas Saccharospirillum Thiobacillus
var var Shewanella Betaproteobacteria Spongiibacteraceae uc
Congregibacter Candidatus Kangiella uc Candidatus Lambdaproteobacteria uc Lambdaproteobacteria Oceanospirillales Dasania Gammaproteobacteria Terasakiispira Acidithiobacillia
Zetaproteobacteria Thiohalobacter uc Acidithiobacillales
uc Hahella Zetaproteobacteria
Planktomarina var
Oceanospirillales Haemophilus Oleiphilus uc Oceanicoccus Porticoccaceae Spongiibacter
Sulftobacter MonaibacteriumRoseivivax Arenicella Zhongshania var Yoonia Pseudohongiella Aquimixticola Cellvibrionales Sedimenticola
Thiolinea var Gilvimarinus Cycloclasticus Rhodobacteraceae Methylophaga Thiomicrospira
Microbulbifer Marinimicrobium Thiothrix Shimia Cellvibrio Teredinibacter
uc Leucothrix Francisellaceae
Ruegeria Saccharophagus
Fangia
var Phaeobacter Pseudomonadales Pseudooctadecabacter Marinagarivorans
Jannaschia Planktotalea Amylibacter uc Agarilytica Pseudomonas Candidatus Simiduia Endobugula Rhodobacter Cellvibrionales
Azotobacter
Pseudopelagicola Emcibacter Roseobacter Polaribacter uc Thiorhodovibrio Alphaproteobacteria Leisingera Candidatus Rhodobacterales Tenderia Leeuwenhoekiella Epsilonproteobacteria Pseudophaeobacter uc Chromatiaceae Thalassobius Psychrobacter
Rheinheimera Ectothiorhodospira Nitrosococcus Acinetobacter Seonamhaeicola Roseovarius Thioalkalivibrio
var uc Moraxellaceae Chromatiales Moraxella Xanthomonas Perlucidibaca Pseudoxanthomonas Aeromonas uc Aquisalimonas Moraxellaceae
Granulosicoccus
Stenotrophomonas
Thiohalomonas
Oceanimonas Labrenzia Actibacterium Algiphilus
Marinosulfonomonas Rhodanobacter Oleiagrimonas Sinimarinibacterium var uc Gammaproteobacteria Marivita Thalassococcus Sphingopyxis uc Pelagimonas Pseudoruegeria Alphaproteobacteria Sphingobium Vibrio Photobacterium Dyella
Maritimibacter Legionella Nereida Sulfurospirillum Litorimicrobium
Rhodobacteraceae Tropicibacter Sphingomonas
Pseudodonghicola Aliiroseovarius Primorskyibacter var Sphingomonadales Arcobacter
var uc Confuentimicrobium Alphaproteobacteria Marinovum Hasllibacter Sphingorhabdus Flaviramulus Flavobacteria Cognatiyoonia Palleronia Albidovulum Helicobacter Roseibium Gramella
Litoreibacter var Sulfurimonas
Natronohydrobacter Pelagicola Acuticoccus Pseudaestuariivita Celeribacter Rhodobacterales Altererythrobacter Nautilia Sedimentitalea Nautella Marivivens Aestuariivita
Pseudovibrio
Tropicimonas Maribius Antarctobacter
Oceaniovalibus Neptunicoccus Pseudooceanicola Croceicoccus Puniceibacterium Candidatus uc Cribrihabitans Boseongicola Hyphomonadaceae Rhodosalinus Winogradskyella Oceanicola Porphyrobacter
Donghicola Albimonas
Dinoroseobacter Epibacterium Jhaorihella
Marinibacterium
Ahrensia Candidatus Puniceispirillum Pseudozobellia Thalassobacter Flavimaricola Robiginitomaculum Jidaibacter
uc Tateyamaria Paracoccus Sphingomonadales Halocynthiibacter Defuviimonas
Thermoleophilia Rhodovulum Candidatus Midichloria
var
Actinobacteria Chryseobacterium Chlorofexia Hyphobacterium Salipiger uc Actinobacteria Salinimicrobium
uc Tritonibacter Cognatishimia Hyphomonas Chlorofexi Thalassobium Mollicutes Lentibacter
Octadecabacter Sediminimonas Nioella Deinococci Lutibacter Flavobacteriaceae Acidimicrobiia uc Mesofavibacter
uc Chlorobia Candidatus Rifebacteria Rickettsiales Zobellia Ulvibacter Hanstruepera Psychroserpens Aureitalea Anaplasma uc Tamlana Confuentibacter uc Salegentibacter Kordia
uc Nitrospirae Actinobacteria Verrucomicrobiae Rhodospirillaceae Kiloniella Nitrospira Gilvibacter uc uc Tenacibaculum var Sediminicola uc Pelagibacterales Rhodospirillales Flagellimonas Bacteria Bacteroidetes Candidatus Muricauda Marinirhabdus Phycisphaerae uc Planctomycetes uc var Pelagibacter Pelagibacteraceae Candidatus Oceanibaculum Arenibacter
uc Opitutae Lentisphaerae Endolissoclinum Flavobacteriaceae Magnetospira SkermanellaFodinicurvata
Lentisphaeria Magnetovibrio uc Kordiimonadales Thalassospira
uc Caulobacterales Phenylobacterium Candidatus Thalassobaculum Rhizobium Planctomycetia Tistrella Candidatus Fonsibacter Aquiluna Ensifer Azospirillum Magnetospirillum uc Sinorhizobium
Verrucomicrobia Martelella Robiginitalea uc Aureimonas Phyllobacteriaceae Aurantimonas
Mesorhizobium uc Hyphomicrobiaceae Nisaea
Tistlia
uc Devosia Holosporaceae Formosa
uc Maritalea var
Microbacteriaceae Bradyrhizobium uc Alphaproteobacteria Arenitalea uc Rhodoluna Actinobacteria Salinispora
uc Candidatus Bathyarchaeota var Janibacter Pelagibacterales Actinoplanes Lacinutrix uc Thaumarchaeota var Patulibacter uc Pelagibacteraceae Microvirga uc Cytophagia uc Mycobacteroides Archaea uc var uc Actinobacteria Acidimicrobiales Herpetosiphon Acidobacteria Methanobacteria Bifdobacterium Rhizobiales
uc Candidatus Bacteria Wolfebacteria Acidimicrobium Saprospiria uc Chlamydiia var Chlorofexi Microbacteriaceae Mycobacterium Mycoplasma Maribacter Cyanobacteria Cohaesibacter Methyloceanibacter Lysinimicrobium Cellulomonas uc Pontimonas Rhodococcus Corynebacterium Breoghania Sphingobacteriia uc Microbacterium Dietzia Nakamurella Candidatus uc Candidatus Nonomuraea Olleya Handelsmanbacteria Fibrobacteria Rathayibacter Deinococcus Amycolatopsis uc Candidatus uc var Bacteroidetes Euryarchaeota Marinimicrobia Acidobacteriia Atribacteria uc Candidatus Candidatus Kerfeldbacteria Deferribacteres uc Methylorubrum uc uc Candidatusuc Candidatus uc Candidatus Fusobacteriia Rhodothermaeota Parcubacteria Ignavibacteria PortnoybacteriaUhrbacteria Gemmatimonadetes Saccharibacteria Bacteroidia Chitinophagia Nitrospinae Methanomicrobia Actinophytocola Bauldia Hartmannibacter uc uc Prosthecochloris Pseudorhodoplanes Candidatus Acidimicrobiaceae Rifebacteria uc
Chloroherpeton
Methylobacterium
Streptomyces uc Terasakiella Crocinitomicaceae Actinomycetales Rubritalea Gaetbulibacter
Candidatus Microthrix
var Aquimarina Actinobacteria uc Jiangella Verrucomicrobiaceae Spongiibacterium Dokdonia Bizionia Altibacter Hyunsoonleella
uc Haloferula
Verrucomicrobium
uc Nitrospirae Actinobacteria Jejuia
Nitrospira Pedosphaera uc Cellulophaga
var Eudoraea
Imtechella
uc Xanthomarina bacterial groups Phycisphaeraceae Bacteria Bacteroidetes Riemerella
uc Verrucomicrobiales Akkermansia Algibacter Psychrofexus Flavivirga Phycisphaera Zhouia uc Planctomycetes uc
var Flammeovirgaceae Pricia Nonlabens uc Verrucomicrobiae Siansivirga Aureicoccus uc legend of hierarchical top level Lentisphaerae var Opitutae Verrucomicrobiae Lentisphaera
Gemmata Rubinisphaera Bergeyella
Opitutus Mangrovimonas uc Opitutaceae var uc Ochrovirga Capnocytophaga Flavobacteriales Planctomycetaceae Fuerstia Ereboglobus Flavobacterium Crocinitomix Moheibacter uc
uc Joostella Opitutales Schleiferia Verrucomicrobia Mesonia Elizabethkingia Nodosilinea Coraliomargarita Brumimicrobium Owenweeksia
Gillisia
Thermobaculum Fluviicola Wenyingzhuangia Oceanihabitans Aequorivita Marixanthomonas Gelidibacter Reichenbachiella Rhodopirellula Leptolyngbya
Lishizhenia uc Puniceicoccaceae uc Phaeodactylibacter
uc Rhodothermaceae Cryomorphaceae
Candidatus Scalindua Rhodothermus proteins assigned Phormidesmis uc Flavobacteriales Haliscomenobacter Planktothricoides Sediminitomix Arthrospira uc uc Saprospira Marivirga
uc Candidatus uc Bathyarchaeota
Trichodesmium
Flavobacteriia var uc Synechococcus Thaumarchaeota uc uc uc Candidatus Altiarchaeales Flammeovirgaceae Flavobacteriia Methanobrevibacter Archaea Dyadobacter
to bacterial genera which have been Roseivirga var Saprospiraceae Ekhidna
Acidobacteria Candidatus Eisenibacter Nephrothrix Methanosphaera
Sporocytophaga Aureispira uc Candidatus Bacteria Mucilaginibacter Wolfebacteria
Moorea Neochlamydia
Limnoraphis Lyngbya Cyanothece Flexithrix Spirosoma Leadbetterella Phormidium Lewinella uc Persicobacter Flectobacillus Cecembia Pedobacter uc Chryseolinea uc Candidatus Candidatus uc Prochlorococcus Marinilabiliales Handelsmanbacteria Fibrobacter Echinicola uc uc Candidatus Fabibacter uc Anaerophaga var Euryarchaeota Cytophagales Algoriphagus var Bacteroidetes Bacteroidia Marinimicrobia uc Granulicella Atribacteria uc Candidatus uc
Porphyromonas Candidatus Kerfeldbacteria Hymenobacter Prolixibacter Saccharicrinis Geovibrio Catalinimonas Mongoliibacter uc Cyanobacterium Candidatus Sphingobacterium Lewinellaceae Oscillatoriales uc Hydrobacter uc uc uc Candidatusuc Candidatus uc Candidatus uc Candidatus found in all respective fractions (colors) Cyanobacteria Aphanocapsa TrichormusNostoc Fusobacterium Rhodothermaeota Parcubacteria Ignavibacteriales PortnoybacteriaUhrbacteria Gemmatirosa Saccharibacteria HugenholtziaRufbacter Amoebophilus Aquifexum Parapedobacter uc Sphingobacteriia Bacteroides Marinilabilia Salinivirgaceae Flavihumibacter Nitrospinae Methanoperedens Prevotella bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
A fold ch. to smallest fold ch. to smallest abundance frac�ons small medium large abundance frac�ons small medium large particle fraction color code carbon compound utilization adhesion Glycoside hydrolase family 1 2.1 1.0 6.7 Pili 1.0 3.7 3.3 Glycoside hydrolase family 13 1.0 1.5 1.1 other Adhesion proteins 1.0 3.0 2.0 Glycoside hydrolase family 16 1.0 1.3 1.4 mo�lity Glycoside hydrolase family 45 +++ - ++ Starch utilization system SusD 1.0 1.8 1.0 Chemotaxisproteins ++++ - - Starch utilization system SusC 1.0 2.5 1.9 Gliding motility ++++ - + Glycosyl transferases 1.0 6.4 21.5 Flagellum 1.0 4.0 1.7 Sulfatases 2.0 1.0 4.1 other Motilityproteins ++++ - - transporters stress TonB-dependent transporter 1.0 2.4 1.6 Oxidative stress 3.0 1.0 1.1 ABC-transporter 3.2 1.2 1.0 Heavy metal resistance 1.0 1.4 3.8 TRAP-transporter 8.8 1.8 1.0 Antibiotic resistance 1.0 1.6 1.0 Antibiotic biosynthesis - - +++ phosphorus supply light driven energy genera�on Phosphonate ABC-type transport 1.8 1.3 1.0 Phosphate sodium symporter 1.1 1.0 1.1 Proteorhodopsin 1.0 1.1 1.1 ≥0.2µm Phosphate-selective porins 1.0 9.4 6.3 free living (f) particle associated (pa) Phosphate ABC-type transport 2.0 1.2 1.0 Phytase 1.0 2.7 3.1 particle sizes small medium large
uc uc uc Nitrosomonadales Lutibacter Flavobacteriaceae Crocinitomicaceae uc Flammeovirgaceae
uc Limnohabitans Polaribacter Crocinitomicaceae uc Marinirhabdus uc Porticoccaceae Hyunsoonleella Bacteria Seonamhaeicola uc Bacteroidetes
Flavobacterium Joostella
Flaviramulus Methyloversatilis Flavobacterium Flavivirga uc Thiobacillus uc uc uc Comamonadaceae Betaproteobacteria uc Lewinellaceae Crocinitomicaceae
Bacteria Tenacibaculum Formosa uc B Bacteroidetes
Aureicoccus var Algibacter Proteobacteria
var Sinorhizobium Psychroserpens Polaribacter Bacteria Olleya uc Rhodobacteraceae Synechococcus Lutibacter Lewinella Glycoside uc Crocinitomicaceae hydrolase uc family 45 Maribacter Flavobacteriia
uc Formosa Σof Flavobacteriaceae Pseudomonas uc sulfatases Polaribacter Ochrovirga Amylibacter Flavobacterium phytase Flavobacterium Bacteria Roseovarius Sulftobacter Paraglaciecola Polaribacter Gilvibacter Formosa uc uc Flammeovirgaceae Kordia Flavobacteriia
uc Kordia Bacteroidetes Cellulophaga Colwellia
uc Gammaproteobacteria Starch utilization Winogradskyella uc Gammaproteobacteria
heavy Saccharospirillum system SusD Tenacibaculum Reinekea metal uc Vibrio Clostridiales resistance Cycloclasticus var Flagellum Proteobacteria uc
uc Rhodospirillaceae uc Bacteroidetes Ochrovirga Lutibacter Ruegeria Flammeovirgaceae Algibacter Reinekea uc uc Porticoccaceae Muricauda Formosa Flavobacteriaceae Cellvibrio Cecembia Lacinutrix Leisingera Yoonia Polaribacter Starch utilization Gynuella Motility phosphonate uc ABC-type Bacteria Muricauda transport Litoreibacter Hahella Amylibacter uc Cellvibrionales Ulvibacter Psychroserpens uc
Polaribacter system SusC Aurantimonas Rhodobacterales Methylomagnum uc uc Halieaceae Pseudoruegeria Loktanella Pelagimonas uc Rhodospirillaceae Rhizobiales Algoriphagus Candidatus Marinirhabdus
Planktotalea Pelagibacter
Phaeobacter Agarivorans
uc Loktanella Jejuia Nitrospirae Pseudomonas Hartmannibacter uc uc Rhodobacteraceae Cryomorphaceae Roseobacter Candidatus Endolissoclinum
Breoghania Cohaesibacter uc Rhodoferax uc Chlorofexi Thalassobacter Rhodobacterales
var Flavobacteriaceae Octadecabacter Marivirga Polaribacter ABC-transporter uc Deltaproteobacteria uc uc Candidatus Alphaproteobacteria Rifebacteria Sulftobacter var Rhodobacteraceae uc uc Nitrospinae var Candidatus Puniceispirillum Candidatus Arenitalea Aureitalea Rhodobacteraceae Pelagibacter Bacteria Curvibacter Glycoside uc Flammeovirgaceae Schleiferia
uc hydrolase Bacteroidetes uc Rhodospirillales family 1 Malikia Formosa
Roseovarius uc Flavobacteriales Ruegeria Arenitalea
Thalassococcus uc uc Rhodospirillaceae Shimia Marinirhabdus
uc Betaproteobacteria Porticoccaceae Labrenzia
Gynuella Pseudooceanicola
uc Loktanella Chromatiales Planktomarina uc Thalassobium Pelagibacteraceae
Marinobacterium uc Maribacter uc Oceanospirillales
var uc Rhodobacterales var Bacteria Thioalkalivibrio Pontimonas Candidatus Oceanobacter Thioglobus Flavobacteriaceae Bacteria Leucothrix
Reinekea Colwellia Aquimarina
uc Reinekea Cryomorphaceae uc Candidatus Porticoccaceae
var Microbacteriaceae Agarilytica
uc Actinobacteria Algibacter
uc Aquiluna Actinomycetales Flaviramulus
Lysinimicrobium
uc Zhouia Ulvibacter Flavobacteriaceae
uc Acidimicrobium phosphate-selective Microbacteriaceae Halomonas
uc uc oxidative Flavobacteriia porins Bacteroidetes Pseudophaeobacter uc uc stress Acidimicrobiaceae var Alphaproteobacteria Bacteroidetes Gaetbulibacter
uc TonB-dependent Polaribacter Lutibacter Crocinitomicaceae
Kordia Flaviramulus Flavobacterium uc uc Alteromonadales Coraliomargarita Formosa Lacimicrobium Actinobacteria
uc transporter Flavobacteriales Lacinutrix Ruegeria
uc Bacteroidetes Imtechella Formosa
var Flavobacteriia gliding Polaribacter Arenitalea uc motility Flammeovirgaceae Pseudoalteromonas uc uc Deltaproteobacteria Rhodothermaceae uc
Winogradskyella Tateyamaria Antibiotic uc uc Opitutae uc Hyunsoonleella Alphaproteobacteria Halieaceae Gilvibacter var Bacteria Rhodobacteraceae resistance Haliscomenobacter Lewinella
Xanthomarina var Chromatocurvus Bacteria uc Tenacibaculum var Robiginitomaculum Rhodobacterales Algibacter phosphate uc Flavobacteriia Crocinitomicaceae uc TRAP-transporter Rhodothermaeota var Rhodobacteraceae uc Actibacterium Flammeovirgaceae Paraglaciecola Marivirga Arenicella
uc var Cellvibrionales Sulftobacter
uc ABC-type Rhodobacteraceae uc Betaproteobacteria Rhodobacteraceae uc uc Porticoccaceae Gammaproteobacteria uc Halieaceae uc uc Aliiroseovarius Betaproteobacteria Rickettsiales
Xylophilus Glycoside transport Psychrobacter Ruegeria Alteromonas Hasllibacter hydrolase Neisseria
family 16 phosphate Tistrella Reinekea Planktomarina Paraglaciecola Glaciecola Phycisphaera sodium Hydrogenophaga fagellum Oceaniovalibus symporter Glaciecola Idiomarina uc Nitrosomonadales Aquimarina uc Proteobacteria
Polaribacter Colwellia uc Saccharospirillum Pelagibacteraceae uc Betaproteobacteria adhesion Motiliproteus Snodgrassella Pilus Proteorhodopsin var Candidatus more motility Proteobacteria uc uc uc uc Porticoccaceae proteins Glycoside hydrolase Algibacter Pelagibacter Flavobacterium Cellvibrionales Glycosyl uc Flammeovirgaceae uc transferases family 13 Antibiotic Crocinitomicaceae Rhizobiales biosynthesis Porticoccaceae proteins uc Chemotaxis Bowmanella Porticoccaceae uc
Rhodobacteraceae Skermanella Neptunicoccus uc uc uc Rhodospirillaceae Gammaproteobacteria Arenicella Gammaproteobacteria Oceanicoccus Polaribacter Reinekea Robiginitomaculum Tateyamaria Albimonas Haliea Dokdonia uc Oceanospirillales Flaviramulus Shimia Alcanivorax
Litorimicrobium
uc Pelagibacteraceae Marinobacterium
Planktotalea
uc Thalassococcus Flavobacteriia Candidatus Thiohalobacter var uc Roseobacter
Sphingomonadales uc Crocinitomicaceae Pelagibacter Roseovarius uc Polaribacter Formosa Phaeobacter uc Gammaproteobacteria Rhodobacterales Flavobacteriaceae uc uc Marivita Litoreibacter Sulftobacter Gammaproteobacteria Colwellia uc Rhodobacterales Methylophaga Ruegeria Flavobacteriia Amylibacter uc uc Planktomarina uc Roseibium Alphaproteobacteria Flammeovirgaceae Bacteroidetes Reinekea Reichenbachiella Sphingomonas Colwellia Planktomarina Litorimicrobium Motiliproteus Roseovarius Sulftobacter Candidatus
var uc Neptuniibacter Rhodobacteraceae Pelagibacter
uc Rhodobacteraceae Pelagibacterales Candidatus Octadecabacter Puniceispirillum Epibacterium Pelagimonas Amylibacter Pseudoalteromonas uc Paraglaciecola Simiduia uc Pelagibacteraceae Candidatus Rhodospirillaceae functional groups the given Halocynthiibacter Puniceispirillum Formosa
uc Bacteria
Marinomonas uc uc Rhodospirillales var Flavobacteriia Candidatus uc Saccharospirillum Pelagibacter Rhodobacteraceae uc Spongiibacteraceae Alphaproteobacteria Ascidiaceihabitans uc uc Oceanospirillaceae uc Rhodobacterales Nitrosomonadales Planktomarina Neptunomonas Reichenbachiella Saprospira
uc Algibacter Cellvibrionales uc uc Rubritalea proteins were assigned to Rhizobiales Reinekea Chromatiales Jannaschia Primorskyibacter Oceanicoccus uc uc MarinilabilialesFormosa Porticoccaceae uc PaenibacillusAureispira Hydrogenophaga uc Actinobacteria Nitrosomonadales uc Rhodospirillaceae
uc uc Gammaproteobacteria Polaribacter uc Betaproteobacteria Arenicella Simiduia uc Bacteria
uc Betaproteobacteria Acidovorax uc uc Rhodospirillales
uc Cryomorphaceae uc Alphaproteobacteria Ekhidna Flavobacteriia Kordia uc Betaproteobacteria uc Microbacteriaceae Flammeovirgaceae Algoriphagus Candidatus Puniceispirillum Polaribacter uc Bacteria uc Bacteria Candidatus Pelagibacter uc uc Winogradskyella Cryomorphaceae Polaribacter uc Flammeovirgaceae
uc Alphaproteobacteriauc Motiliproteus Reinekea Pelagibacteraceae uc
Porticoccaceae uc Lyngbya Planctomycetaceae Candidatus Pelagibacter
Bacteria Colwellia Acidovorax uc Polaribacter Glaciecola Gammaproteobacteria
Serratia Sulfuricella uc uc uc Oceanicoccus Gammaproteobacteria proteins assigned Rhodobacterales uc Thalassotalea uc var Mucilaginibacter Bacteroidetes uc Bacteriovorax Gammaproteobacteria Citrobacter Bacteria uc Crocinitomicaceae Betaproteobacteria Porticoccaceae Alteromonas Formosa Saccharophagus uc Aestuariibacter Flavobacteriia Polaribacter Thiomicrospira
Winogradskyella Algibacter uc uc Pseudoalteromonas Alphaproteobacteria var Paraglaciecola Rubritalea Verrucomicrobiae Bergeyella uc Bacteroidetes Cyanobacteria Thalassomonas Caballeronia uc Porticoccaceae Candidatus uc Flavobacteriales Oceanicoccus to bacterial genera which have been Algibacter Candidatus Puniceispirillum Endolissoclinum Candidatus Polaribacter Polaribacter Sphingomonas Pelagibacter Paraglaciecola Candidatus Mesofavibacter Aquiluna Arenitalea uc
uc Pelagibacteraceae Candidatus uc Porticoccaceae var
uc Gammaproteobacteria Flavobacteriales uc Muricauda uc Alteromonadaceae Alteromonas Staphylococcus Gammaproteobacteria uc Pelagibacter Bacteroidetes Planktomarina Wenyingzhuangia uc Rhodospirillales Marinosulfonomonas uc Flammeovirgaceae uc Reinekea uc Bacteroidetes Bacteroidetes Rhodospirillaceae Ochrovirga Opitutus uc uc uc found in all respective fractions (colors) Reinekea Bacteria Rhizobialesuc Salinivirgaceae Psychroserpens Reinekea Methylococcaceae Enhygromyxa Lutibacter bioRxiv preprint doi: https://doi.org/10.1101/667428; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
A 100% proteins assigned particle fraction to Eukaryotes color code 80% >0.2 µm >3 µm >10 µm Phytoplankton 67.08 72.41 69.78 60% Macroalgae 10.68 9.74 9.97 40% Fungi 2.23 1.50 1.48 Animals 3.78 2.48 3.87 20% Land Plants 1.37 0.70 1.64 other Eukaryotes 14.86 13.16 13.26 ≥0.2µm 0% free living (f) particle associated (pa) >0.2 µm >3 µm >10 µm particle sizes small medium large
Heterotrichida ucAconoidasida NeuropteraAraneae Galliformes Pterioida Octopoda Diplostraca Calanoida
Scorpiones Passeriformes Parachela Bucerotiformes Pectinoida Ichthyophonida Squamata Trichinellida Anthoathecata Xiphosura Anura Rhabditida Schizopyrenida Haemosporida Ostreoida uc Orectolobiformes Ichthyosporea Trombidiformes uc Actinopteri uc Kinetoplastida Sarcoptiformes var var uc Petromyzontiformes Cypriniformes Lepidoptera Testudines Gastropoda Choanofagellida Pleuronectiformes Diplogasterida Chordata Enterogona Actiniaria uc uc Other Actinophryidae Hemiptera Eucoccidiorida Mesostigmata Siluriformes Blattodea Chiroptera uc Architaenioglossa Mammalia Osteoglossiformes Gobiiformes Coleoptera B Characiformes Scleractinia
Dasyuromorphia Clupeiformes Bacillariales Beloniformes
uc Diptera Hymenoptera Enteropneusta Euglyphida Carangiformes Rodentia Golfngiida
Lingulida
Entomobryomorpha Acytosteliales unclassifed uc Placozoa
Longamoebia Primates Capitellida
Aspidochirotida Vitales Ranunculales Perkinsida Valvatida
Proboscidea
Sirenia Trichomonadida Cyprinodontiformes Scandentia Echinoida Animals Saprolegniales Tritrichomonadida Eukaryotes Cucurbitales
Leptocylindrales Heteronemertea Haplosclerida
Strigeidida
Macrostomida Plagiorchiida
Rosales Solanales Fagales Asterales
Sapindales Opisthorchiida Gentianales
Funariales Fabales
Lamiales Selaginellales Chlorokybales Ericales Coscinodiscales Apiales Brassicales Charales Land Poales uc plants Peronosporales Bacillariophyceae Naviculales Cymbellales
Oomycetes Chaetocerotales Arecales
Klebsormidiales
Marchantiales Albuginales Pythiales
Philasterida Thalassiosirales
Peniculida
Myzocytiopsidales Eunotiales Hymenostomatida uc Labyrinthulomycetes Bacillariophyta Rhizosoleniales
Cryptomonadales Phyto- Sporadotrichida Pyrenomonadales
Licmophorales Compsopogonales
Toxariales Erythropeltidales Cymatosirales plankton Euplotida Mediophyceae
Lithodesmiales Anaulales Fragilariales
Phaeocystales Hemiaulales Parmales Triceratiales
var uc Biddulphiales Phaeophyceae
Ulvales
Pavlovales Dictyotales Pelagomonadales Isochrysidales Pyramimonadales Eustigmatales Macroalgae Synurales
Gloeochaetales Fungi Prymnesiales
Chromulinales Mamiellales
Florenciellales uc Microsporidia Monomastigales Ectocarpales generic eukaryotic groups Sphaeropleales Blastocladiales
Pucciniales Chlamydomonadales
Cystoflobasidiales Ustilaginales Cantharellales legend of hierarchical top level Hymenochaetales Microstromatales Violaceomycetales Pedinellales Gigartinales Auriculariales Sebacinales Tilletiales
Agaricales Dacrymycetales
Acrosymphytales
Basidiomycota Chattonellales Polyporales Corticiales Gloeophyllales Chlorellales Microbotryales Schizosaccharomycetales
uc Ascomycota Gracilariales uc Nephroselmidophyceae Rhodellales Pleosporales
uc Trebouxiophyceae Porphyridiales Dothideales uc protein assigned Chromerida Hypocreales Suessiales Euglenales Gymnodiniales Cyanidiales Sordariales Saccharomycetales uc Dothideomycetes Ceramiales Dixoniellales Glomerales Batrachospermales Glomerellales Peridiniales
uc Eutreptiales Leotiomycetes Gelidiales Mortierellales Laminariales to orders of Eukaryotes which have been Mischococcales Chaetothyriales Mucorales Helotiales Botrydiales Eurotiales Pneumocystidales Fucales uc var uc Xanthophyceae Bangiales Cryptomycota Rhizophydiales Basidiobolales Nemaliales Monoblepharidales Vaucheriales Plocamiales Dictyosteliales Gonyaulacales Chytridiales Neocallimastigales found in all respective fractions (colors) Protosteliales Harpellales