<<

bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

1 Expanding the diversity of isolates and modeling isolation efficacy with 2 large scale dilution-to-extinction cultivation 3 4 5 6 Michael W. Henson1,#, V. Celeste Lanclos1, David M. Pitre2, Jessica Lee Weckhorst2,†, Anna M. 7 Lucchesi2, Chuankai Cheng1, Ben Temperton3*, and J. Cameron Thrash1* 8 9 1Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, 10 U.S.A. 11 2Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, 12 U.S.A. 13 3School of Biosciences, University of Exeter, Stocker Road, Exeter, EX4 4QD, U.K. 14 #Current affiliation: Department of Geophysical Sciences, University of Chicago, Chicago, IL 15 60637, U.S.A. 16 †Quantitative and Computational Biosciences Program, Baylor College of Medicine, Houston, 17 TX 77030, U.S.A. 18 19 *Correspondence: 20 21 J. Cameron Thrash 22 [email protected] 23 24 Ben Temperton 25 [email protected] 26 27 28 29 30 31 32 33 34 35 36 37 38 39 Running title: Evaluation of large-scale DTE cultivation 40 41 42 Keywords: dilution-to-extinction, cultivation, bacterioplankton, LSUCC, , 43 coastal microbiology 44 45 46

Page 1 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

47 48 Abstract 49 Cultivated bacterioplankton representatives from diverse lineages and locations are 50 essential for microbiology, but the large majority of taxa either remain uncultivated or lack 51 isolates from diverse geographic locales. We paired large scale dilution-to-extinction (DTE) 52 cultivation with microbial community analysis and modeling to expand the phylogenetic and 53 geographic diversity of cultivated bacterioplankton and to evaluate DTE cultivation success. 54 Here, we report results from 17 DTE experiments totaling 7,820 individual incubations over 55 three years, yielding 328 repeatably transferable isolates. Comparison of isolates to microbial 56 community data of source waters indicated that we successfully isolated 5% of the observed 57 bacterioplankton community throughout the study. 43% and 26% of our isolates matched 58 operational taxonomic units (OTUs) and amplicon single nucleotide variants (ASVs), 59 respectively, within the top 50 most abundant taxa. Isolates included those from previously 60 uncultivated clades such as SAR11 LD12 and acIV, as well as geographically 61 novel members from other ecologically important groups like SAR11 subclade IIIa, SAR116, 62 and others; providing the first isolates in eight genera and seven . Using a newly 63 developed DTE cultivation model, we evaluated taxon viability by comparing relative abundance 64 with cultivation success. The model i) revealed the minimum attempts required for successful 65 isolation of taxa amenable to growth on our media, and ii) identified important and abundant taxa 66 such as SAR11 with low viability that likely impacts cultivation success. By incorporating 67 viability in experimental design, we can now statistically constrain the number of attempts 68 necessary for successful cultivation of a given taxon on a defined medium. 69 70 71 Importance 72 Even before the coining of the term “great plate count anomaly” in the 1980s, scientists 73 had noted the discrepancy between the number of microorganisms observed under the 74 microscope and the number of colonies that grew on traditional agar media. New cultivation 75 approaches have reduced this disparity, resulting in the isolation of some of the “most wanted” 76 bacterial lineages. Nevertheless, the vast majority of microorganisms remain uncultured, 77 hampering progress towards answering fundamental biological questions about many important 78 microorganisms. Furthermore, few studies have evaluated the underlying factors influencing 79 cultivation success, limiting our ability to improve cultivation efficacy. Our work details the use 80 of dilution-to-extinction (DTE) cultivation to expand the phylogenetic and geographic diversity 81 of available axenic cultures. We also provide a new model of the DTE approach that uses 82 cultivation results and natural abundance information to predict taxon-specific viability and 83 iteratively constrain DTE experimental design to improve cultivation success. 84 85 86 87 88 89 90 91 92

Page 2 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

93 Introduction 94 Axenic cultures of environmentally important microorganisms are critical for 95 fundamental microbiological investigation, including generating physiological information about 96 environmental tolerances, determining organismal-specific metabolic and growth rates, testing 97 hypotheses generated from in situ ‘omics observations, and experimentally examining microbial 98 interactions. Research using important microbial isolates has been critical to a number of 99 discoveries such as defining microorganisms involved in surface methane saturation (1–3), 100 the role of proteorhodopsin in maintaining cellular functions during states of carbon starvation 101 (4, 5), the complete of ammonia within a single organism (6), and identifying novel 102 metabolites and (7–10). However, the vast majority of taxa remain uncultivated (11– 103 13), restricting valuable experimentation on such topics as of unknown function, the role 104 of analogous substitutions in overcoming auxotrophy, and the multifaceted interactions 105 occurring in the environment inferred from sequence data (11, 14–16). 106 The quest to bring new microorganisms into culture, and the recognition that traditional 107 agar-plate based approaches have limited success (17–19), have compelled numerous 108 methodological advances spanning a wide variety of techniques like diffusion chambers, 109 microdroplet encapsulation, and slow acclimatization of cells to artificial media (20–25). 110 Dilution-to-extinction (DTE) cultivation using sterile as the medium has also proven 111 highly successful for isolating bacterioplankton (26–32). Pioneered by Don Button and 112 colleagues for the cultivation of oligotrophic , this method essentially pre-isolates 113 organisms after serial dilution by separating individual or small groups of cells into their own 114 incubation vessel (32, 33). This prevents slow-growing, obligately oligotrophic bacterioplankton 115 from being outcompeted by faster-growing organisms, as would occur in enrichment-based 116 isolation methods, particularly those that would target aerobic . It is also a practical 117 method for taxa that cannot grow on solid media. Natural seawater media provide these taxa with 118 the same chemical surroundings from which they are collected, reducing the burden of 119 anticipating all the relevant compounds required for growth (33). 120 Improvements to DTE cultivation in multiple labs have increased the number of 121 inoculated wells and decreased the time needed to detect growth (26, 28, 34), thereby earning the 122 moniker “high-throughput culturing” (26, 28). We (35) and others (30) have also adapted DTE 123 culturing by incorporating artificial media in place of natural seawater media to successfully 124 isolate abundant bacterioplankton. Thus far, DTE culturing has lead to isolation of many 125 numerically abundant marine and freshwater groups such as marine SAR11 126 (28, 29, 34–36), the freshwater SAR11 LD12 clade (29), SUP05/Arctic96BD-19 127 (37–39), OM43 (26, 27, 31, 40, 41), HIMB11-Type 128 (35, 42), numerous so-called Oligotrophic Marine Gammaproteobacteria (43), and 129 acI Actinobacteria (44). 130 Despite the success of DTE cultivation, many taxa continue to elude domestication (11– 131 13, 16). Explanations include a lack of required nutrients or growth factors in media (20, 45–49) 132 and biological phenomena such as dormancy and/or phenotypic heterogeneity within populations 133 (47, 48, 50–56). However, there have been few studies empirically examining the factors 134 underlying isolation success in DTE cultivation experiments (34, 57, 58), restricting our ability 135 to determine the relative importance of methodological vs. biological influences on cultivation 136 reliability for any given organism. Moreover, even for those taxa that we have successfully 137 cultivated, in many cases we lack geographically diverse strains, restricting comparisons of the 138 phenotypic and genomic diversity that may influence taxon-specific cultivability.

Page 3 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

139 We undertook a three-year cultivation effort in the coastal northern Gulf of Mexico 140 (nGOM), from which we lack representatives of many common bacterioplankton groups, to 141 provide new model organisms for investigating microbial function, ecology, biogeography, and 142 . Simultaneously, we paired our cultivation efforts with 16S rRNA gene amplicon 143 analyses to compare cultivation results with the microbial communities in the source waters. We 144 have previously reported on the success of our artificial media in obtaining abundant taxa over 145 the course of seven experiments from this campaign (35). Here, we expand our report to include 146 the cultivation results from a total of seventeen experiments, and update the classic viability 147 calculations of Button et al. (33) with a new model to estimate the viability of individual taxa 148 using relative abundance information. Individuals from eight and seven cultivated groups 149 belonged to putatively novel genera and species, respectively, and our isolates expand the 150 geographic representation for many important clades like SAR11. Additionally, using model- 151 based predictions, we identify possible taxon-specific viability variation that can influence 152 cultivation success. By incorporating these new viability estimates into the model, our method 153 facilitates statistically-informed experimental design for targeting individual taxa, thereby 154 reducing uncertainty for future culturing work (59). 155 156 Material and Methods 157 158 Sampling 159 Surface water samples were collected at six different sites once a year for three years, except for 160 Terrebonne Bay, which was collected twice. The sites sampled were Lake Borgne (LKB; Shell 161 Beach, LA), Bay Pomme d'Or (JLB; Buras, LA), Terrebonne Bay (TBON; Cocodrie, LA), 162 Atchafalaya River Delta (ARD; Franklin, LA), Freshwater City (FWC; Kaplan, LA), and 163 Calcasieu Jetties (CJ; Cameron, LA) (lat/long coordinates provided in Table S1). Water 164 collection for biogeochemical and biological analysis followed the protocol in (35). Briefly, we 165 collected surface water in a sterile, acid-washed polycarbonate bottle. Duplicate 120 ml water 166 samples were filtered serially through 2.7 µm Whatman GF/D (GE Healthcare, Little Chalfort, 167 UK) and 0.22 µm Sterivex (Millipore, Massachusetts, USA) filters and placed on ice until 168 transferred to -20˚C in the laboratory (maximum 3 hours on ice). The University of Washington 169 Marine Chemistry Laboratory analyzed duplicate subsamples of 50 ml 0.22 µm-filtered water 170 collected in sterile 50 ml falcon tubes (VWR, Pennsylvania, USA) for concentrations of SiO4, 3- + - - 171 PO4 , NH4 , NO3 , and NO2 . Samples for cell counts were filtered through a 2.7 µm GF/D filter, 172 fixed with 10% formaldehyde, and stored on ice until enumeration (maximum 3 hours). 173 Temperature, salinity, pH, and dissolved were measured using a handheld YSI 556 174 multiprobe system (YSI Inc., Ohio, USA). All metadata is available in Table S1. 175 176 Dilution-to-extinction culturing and propagation 177 Isolation, propagation, and identification of isolates were completed as previously reported (29, 178 35). A subsample of 2.7 µm filtered surface water was stained with 1X SYBR Green (Lonza, 179 Basal, Switzerland), and enumerated using a flow cytometer as described (60). After serial 180 dilution to a predicted 1-3 cells·µl-1, 2 µl water was inoculated into five, 96-well PTFE plates 181 (Radleys, Essex, UK) containing 1.7 ml artificial seawater medium (Table S1) to achieve an 182 estimated 1-3 cells·well-1 (Table 1). The salinity of the medium was chosen to match in situ 183 salinity after experiment JLB (January 2015) (Table S1). After year two (TBON3-LKB3), a 184 second generation of media, designated MWH, was designed to incorporate additional important

Page 4 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

185 osmolytes, reduced sulfur compounds, and other constituents (Table S1) potentially necessary 186 for in vitro growth of uncultivated clades (49, 61–67). Cultures were incubated at in situ 187 temperatures (Table S1) in the dark for three to six weeks and evaluated for positive growth (> 188 104 cells·ml-1) by . 200 µl from positive wells was transferred to duplicate 125 ml 189 polycarbonate flasks (Corning, New York, USA) containing 50 ml of medium. At FWC, FWC2, 190 JLB2c, and JLB3, not all positive wells were transferred because of the large number of positive 191 wells. At each site, 48/301, 60/403, 60/103, and 60/146 of the positive wells were transferred, 192 respectively, selecting for flow cytometric signals that maximized our chances of isolating small 193 microorganisms that encompass many of the most abundant, and most wanted taxa, like SAR11. 194 195 Culture identification 196 Cultures reaching ≥ 1 x 105 cells·ml-1 had 35 ml of the 50 ml volume filtered for identification 197 via 16S rRNA gene PCR onto 25 mm 0.22-µm polycarbonate filters (Millipore, Massachusetts, 198 USA). DNA extractions were performed using the MoBio PowerWater DNA kit (QIAGEN, 199 Massachusetts, USA) following the manufacturer’s instructions and eluted in sterile water. The 200 16S rRNA gene was amplified as previously reported in Henson et al. 2016 (35) and sequenced 201 at Michigan State University Research Technology Support Facility Core. Evaluation 202 of Sanger sequence quality was performed with 4Peaks (v. 1.7.1) 203 (http://nucleobytes.com/4peaks/) and forward and reverse complement sequences (converted via 204 http://www.bioinformatics.org/sms/rev_comp.html) were assembled where overlap was 205 sufficient using the CAP3 web server (http://doua.prabi.fr/software/cap3). 206 207 Community iTag sequencing, operational taxonomic units, and single nucleotide variants 208 Sequentially (2.7µm, 0.22µm) filtered duplicate samples were extracted and analyzed using our 209 previously reported protocols and settings (35, 68). We sequenced the 2.7-0.22 µm fraction for 210 this study because this fraction corresponded with the < 2.7 µm communities that were used for 211 the DTE experiments. To avoid batch sequencing effects, DNA from the first seven collections 212 reported in (35) was resequenced with the additional samples from this study (FWC2 and after- 213 Table 1). We targeted the 16S rRNA gene V4 region with the 515F, 806RB primer set (that 214 corrects for poor amplification of taxa like SAR11) (69, 70) using Illumina MiSeq 2 x 250bp 215 paired-end sequencing at Argonne National Laboratories, resulting in 2,343,106 raw reads for 216 the 2.7-0.22 µm fraction. Using Mothur v1.33.3 (71), we clustered 16S rRNA gene amplicons 217 into distinctive OTUs with a 0.03 dissimilarity threshold (OTU0.03) and classified them according 218 to the Silva v119 database (72, 73). After these steps, 55,256 distinct OTUs0.03 remained. We 219 also used minimum entropy decomposition (MED) to partition reads into fine-scale amplicon 220 single nucleotide variants (ASVs) (74). Reads were first analyzed using Mothur as described 221 above up to the screen.seqs() command. The cleaned reads fasta file was converted to MED- 222 compatible headers with the ‘mothur2oligo’ tool renamer.pl from the functions in MicrobeMiseq 223 (https://github.com/DenefLab/MicrobeMiseq) (75) using the fasta output from screen.seqs() and 224 the Mothur group file. These curated reads were analyzed using MED (v. 2.1) with the flags –M 225 60, and -d 1. MED resulted in 2,813 refined ASVs. ASVs were classified in Mothur using 226 classify.seqs(), the Silva v119 database, and a cutoff bootstrap value of 80% (76). After 227 classification, we removed ASVs identified as “”, “mitochondria”, or “unknown” 228 from the dataset. 229 230 Community analyses

Page 5 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

231 OTU (OTU0.03) and ASV abundances were analyzed within the R statistical environment v.3.2.1 232 (77) following previously published protocols (29, 35, 68). Using the package PhyloSeq (78), 233 OTUs and ASVs were rarefied using the command rarefy_even_depth() and OTUs/ASVs 234 without at least two reads in four of the 34 samples (2 sites; ~11%) were removed. This cutoff 235 was used to remove any rarely occurring or spurious OTUs/ASVs. Our modified PhyloSeq 236 scripts are available on our GitHub repository https://github.com/thrash-lab/Modified-Phyloseq. 237 After filtering, the datasets contained 777 unique OTUs and 1,323 unique ASVs (Table S1). For 238 site-specific community comparisons, beta diversity between sites was examined using Bray- 239 Curtis distances via ordination with non-metric multidimensional scaling (NMDS) (Table S1). 240 The nutrient data were normalized using the R function scale which subtracts the mean and 241 divides by the standard deviation for each column. The influence of the transformed 242 environmental parameters on beta diversity was calculated in R with the envfit function. Relative 243 abundances of an OTU or ASV from each sample were calculated as the number of reads over 244 the sum of all the reads in that sample. The relative abundance was then averaged between 245 biological duplicates for a given OTU or ASV. To determine the best matching OTU or ASV for 246 a given LSUCC isolate, the OTU representative fasta file, provided by Mothur using 247 get.oturep(), and the ASV fasta file were used to create a BLAST database (makeblastdb) against 248 which the LSUCC isolate 16S rRNA genes could be searched via blastn (BLAST v 2.2.26) 249 (“OTU_ASVrep_db” - Available as Supplemental Information at 250 https://doi.org/10.6084/m9.figshare.12142098). We designated a LSUCC isolate 16S rRNA gene 251 match with an OTU or ASV sequence based on ≥ 97% or ≥ 99% sequence identity, respectively, 252 as well as a ≥ 247 bp alignment. 253 254 16S rRNA gene phylogeny 255 Taxa in the Alpha-, Beta-, and Gammaproteobacteria phylogenies from (35) served as the 256 backbone for the trees in the current work. For places in these trees with poor representation near 257 isolate sequences, additional taxa were selected by searching the 16S rRNA genes of LSUCC 258 isolates against the NCBI nt database online with BLASTn (79) and selecting a variable number 259 of best hits. The and Actinobacteria trees were composed entirely of non- 260 redundant top 100-300 MegaBLAST hits to a local version of the NCBI nt database, accessed 261 August 2018. Sequences were aligned with MUSCLE v3.6 (80) using default settings, culled 262 with TrimAl v1.4.rev22 (81) using the -automated1 flag, and the final alignment was inferred 263 with IQ-TREE v1.6.11 (82) with default settings and -bb 1000 for ultrafast bootstrapping (83). 264 Tips were edited with the nw_rename script within Newick Utilities v1.6 (84) and trees were 265 visualized with Archaeopteryx (85). Fasta files for these trees and the naming keys are available 266 as Supplemental Information at https://doi.org/10.6084/m9.figshare.12142098. 267 268 Assessment of isolate novelty 269 We quantified taxonomic novelty using BLASTn of our isolate 16S rRNA genes to those of 270 other known isolates collected in three databases: 1) The NCBI nt database (accessed August 271 2018) - “NCBIdb”; 2) a custom database comprised of sequences from other DTE experiments - 272 “DTEdb”; and 3) a database containing all our isolate 16S rRNA genes - “LSUCCdb”. The 273 DTEdb and LSUCCdb fasta files are available as Supplemental Information at 274 https://doi.org/10.6084/m9.figshare.12142098. We compared our isolate sequences to these 275 databases as follows:

Page 6 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

276 1) All representative sequences were searched against the nt database using BLASTn 277 (BLAST+v. 2.7.1) with the flags -perc_identity 84, -evalue 1E-6, -task blastn, -outfmt “6 278 qseqid sseqid pident length slen qlen mismatch evalue bitscore sscinames sblastnames 279 stitle”, and -negative_gilist to remove uncultured and environmental sequences. The 280 negative GI list was obtained by searching `“environmental samples”[organism] OR 281 metagenomes[orgn]”` in the NCBI Nucleotide database (accessed September 12th, 2018) 282 and hits were downloaded in GI list format. This negative GI list is available as 283 Supplemental Information at https://doi.org/10.6084/m9.figshare.12142098. The resultant 284 hits from the nt database search were further manually curated to remove sequences 285 classified as single cell , clones, duplicates, and previously deposited LSUCC 286 isolates. 287 2) We observed that many known HTCC, IMCC, and HIMB isolates that had previously 288 been described as matching our clades (Figs. S1-5) were missing from the resultant lists 289 of nt hits, so we extracted isolate accession numbers from numerous DTE experiments 290 (26–28, 31, 34, 37, 44, 86, 87) from the nt database via blastdbcmd and generated a 291 separate DTEdb using makeblastdb. Duplicate accession numbers found in the NCBIdb 292 were removed. The same BLASTn settings as in 1) were used to search our isolate 293 sequences against DTEdb. Any match that fell below the lowest percent identity hit to the 294 NCBIdb was removed from the DTEdb search since the match would not have been 295 present in the first NCBIdb search. 296 3) Finally, using the same BLASTn settings, we compared all pairwise identities of our 328 297 LSUCC isolate 16S rRNA gene sequences via the LSUCCdb. 298 The output from these searches is available in Table S1 under the “taxonomic novelty” tab. 299 We placed our LSUCC isolates into 55 taxonomic groups based on sharing ≥ 94% 300 identity and/or their occurrence in monophyletic groups within our 16S rRNA gene trees (Figs. 301 S1-5, see above). For visualization purposes, in groups with multiple isolates we used our 302 chronologically first cultivated isolate as the representative sequence for blastn searches, and 303 these are the top point (100% identity to itself) in each group column of Figure 1. Sequences 304 from the other DTE culture collections were labeled with the corresponding collection name, 305 while all other hits were labeled as “Other”. 306 Geographic novelty was assessed by manually screening the accession numbers from hits 307 to LSUCC isolates with ≥ 99% 16S rRNA gene sequence identity for the latitude and longitude 308 from a connected publication or location name (e.g. source, country, site) in the NCBI 309 description. LSUCC isolates in the Janibacter sp., Micrococcus sp., Altererythrobacter sp., 310 sp., and Phycicoccus sp. groups (16 total isolates) were not assessed because of 311 missing isolation source information and no traceable publication. 312 313 Modeling DTE cultivation via Monte Carlo simulations 314 We developed a model using Monte Carlo simulation to estimate the median number of positive 315 and pure wells (and associated 95% confidence intervals (CI)) expected from a DTE experiment 316 for a given taxon at different inoculum sizes (λ), relative abundances (r), and viability (V) (Fig. 317 5). For each bootstrap, the number of cells added to each well was simulated using a Poisson 318 distribution at a mean inoculum size of λ cells per well across n wells. The number of cells added 319 to each well that belonged to a specific taxon was then estimated using a binomial distribution 320 where the number of trials was set as the number of cells in a well and the probability of a cell 321 belonging to a specific taxon, r, was the relative abundance of its representative ASV in the

Page 7 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

322 community analysis. Wells that contained at least one cell of a specific taxon were designated 323 ‘positive’. Wells in which all the cells belonged to a specific taxon were designated as ‘pure’. 324 Finally, the influence of taxon-specific viability on recovery of ‘pure’ wells was simulated using 325 a second binomial distribution, where the number of cells within a ‘pure’ well was used as the 326 number of trials and the probability of growth was a viability score ranging from 0 to 1. For each 327 simulation, 9,999 bootstraps were performed. Code for the model and all simulations is available 328 in the ‘viability_test.py’ at our GitHub repository https://github.com/thrash-lab/viability_test. 329 330 Actual versus expected number of isolates 331 For each taxon in each DTE experiment, the Monte Carlo simulation was used to evaluate 332 whether the number of recovered pure wells for each taxon was within 95% CI of simulated 333 estimates, assuming optimum growth conditions (i.e. V = 100%). For each of 9,999 bootstraps, 334 460 wells were simulated with the inoculum size used for the experiment and the relative 335 abundance of the ASV. For taxa where the number of expected wells fell outside the 95% CI of 336 the model, a deviance score was calculated as the difference between the actual number of wells 337 observed and median of the simulated dataset. The results of this output are presented in Table 338 S1 under the “Expected vs actual” tab, and the R script for visualizing this output as Figure 6 is 339 available at our GitHub repository https://github.com/thrash-lab/EvsA-visualization. 340 341 Estimating viability in under-represented taxa 342 For taxa where the observed number of positive wells was lower than the 95% CI lower limit 343 within a given experiment, and because our analysis was restricted to only those organisms for 344 which our media was sufficient for growth at least once, the deviance was assumed to be a 345 function of a viability term, V, (ranging from 0 to 1) associated with suboptimal growth 346 conditions, dormancy, persister cells, etc. To estimate a value of viability for a given taxon 347 within a particular experiment, the Monte Carlo simulation was run using an experiment- 348 appropriate inoculum size, relative abundance, and number of wells (460 for each experiment). 349 Taxon-specific viability was tested across a range of decreasing values from 99% to 1% until 350 such time as the observed number of pure wells for a given taxon fell between the 95% CI 351 bounds of the simulated data. At this point, the viability value is the maximum viability of the 352 taxon that enables the observed number of pure wells for a given taxon to be explained by the 353 model. The results of this output are presented in Table S1 under the “Expected vs actual” tab. 354 355 Likelihood of recovering taxa at different relative abundances 356 To estimate the number of wells required in a DTE experiment to have a significant chance of 357 recovering a taxon with a relative abundance of r, assuming optimum growth conditions (V = 358 100%), the Monte Carlo model was used to simulate experiments from 92 wells to 9,200 wells 359 per experiment across a range of relative abundances from 0 to 100% in 0.1% increments, and a 360 range of inoculum sizes (cells per well of 1, 1.27, 1.96, 2, 3, 4 and 5). Each experiment was 361 bootstrapped 999 times and the number of bootstraps in which the lower-bound of the 95% CI 362 was ≥ 1 was recorded. 363 364 Data accessibility 365 All iTag sequences are available at the Short Read Archive with accession numbers 366 SRR6235382-SRR6235415 (29). PCR-generated 16S rRNA gene sequences from this study are 367 accessible on NCBI GenBank under the accession numbers MK603525-MK603769. Previously

Page 8 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

368 generated 16S rRNA genes sequences are accessible on NCBI GenBank under the accession 369 numbers KU382357-KU38243 (35). Table S1 is available at 370 https://doi.org/10.6084/m9.figshare.12142113. 371 372 Results 373 General cultivation campaign results 374 We conducted a total of seventeen DTE cultivation experiments to isolate bacterioplankton (sub 375 2.7 µm fraction), with paired microbial community characterization of source waters (0.22 µm - 376 2.7 µm fraction), from six coastal Louisiana sites over a three year period (Table S1). We 377 inoculated 7,820 distinct cultivation wells (all experiments) with an estimated 1-3 cells·well-1 378 using overlapping suites of artificial seawater media, JW (years 1 and 2, (35)) and MWH (year 379 3), designed to match the natural environment (Table 1). The MWH suite of media was modified 380 from the JW media to additionally include choline, glycerol, glycine betaine, cyanate, DMSO, 381 DMSP, thiosulfate, and orthophosphate. These compounds have been identified as important 382 metabolites and osmolytes for marine and freshwater and were absent in the first 383 iteration (JW) of our media (Table S1) (88–94). A total of 1,463 wells were positive (> 104 384 cells·ml-1), and 738 of these were transferred to 125 mL polycarbonate flasks. For four 385 experiments (FWC, FWC2, JLB2, and JLB3) we only transferred a subset of positives (48/301, 386 60/403, 60/103, and 60/146) because the number of isolates exceeded our ability to maintain and 387 identify them at that time (Table 1). The subset of positive wells for these four experiments was 388 selected using flow cytometry signatures with < 102 green fluorescence and < 102 side scatter- 389 usually indicative of smaller oligotrophic cells like SAR11 strain HTCC1062 (49) using our 390 settings. Of the 738 wells from which we transferred cells across all experiments, 328 of these 391 yielded repeatably transferable isolates that we deemed as pure cultures based on 16S rRNA 392 gene PCR and Sanger sequencing. 393 394 Table 1. Cultivation statistics, including whole community viability estimates our model: estimated # wells with 1 cell our (bootstrapped model: median: (xx-xx) Vest: min- 95% CI) if max 95% in situ Medium Site Date n z p λ V* (ASE) CV V==1 ** CI *** salinity salinity Medium Study JWAMPF CJ Sep 2014 460 15 0.033 1.27 2.6 (0.67) 0.259 164 (144-185) 1.5-4.2 24.6 34.8 e (35) ARD Nov 2014 460 1 0.002 1.5 0.1 (0.15) 1.451 154 (134-174) 0.1-0.7 1.72 34.8 JW1 (35) JLB Jan 2015 460 61 0.133 1.96 7.3 (0.93) 0.127 127 (109-146) 5.6-9.2 26.0 34.8 JW1 (35) FWC† Mar 2015 460 301 0.654 2 53.1 (3.2) 0.06 125 (106-143) 47.1-59.7 5.39 5.79 JW4 (35) LKB June 2015 460 15 0.033 1.8 1.8 (0.48) 0.266 137 (118-156) 1.1-3.0 2.87 5.79 JW4 (35) Tbon2 Aug 2015 460 41 0.089 1.56 6.0 (0.93) 0.156 151 (132-171) 4.3-8.1 14.2 11.6 JW3 (35) CJ2 Oct 2015 460 61 0.133 2 7.1 (0.91) 0.128 125 (106-143) 5.6-9.1 22.2 23.2 JW2 (35) FWC2† Apr 2016 460 403 0.876 2 104.4 (6.2) 0.059 125 (106-143) >92.3 20.9 23.2 JW2 This study ARD2c Jun 2016 460 7 0.015 2 0.8 (0.29) 0.362 125 (106-143) 0.3-1.5 0.18 1.45 JW5 This study JLB2c† May 2016 460 103 0.224 2 12.7 (1.25) 0.099 125 (106-143) 10.3-15.4 6.89 5.79 JW4 This study LKB2 Jul 2016 460 39 0.085 2 4.4 (0.71) 0.161 125 (106-143) 3.2-6.0 2.39 1.45 JW5 This study Tbon3 Jul 2016 460 78 0.17 2 9.3 (1.05) 0.113 125 (106-143) 7.4-11.5 17.7 34.8 MWH1 This study CJ3 Sep 2016 460 69 0.15 2 8.1 (0.98) 0.121 125 (106-143) 6.4-10.2 23.7 23.2 MWH2 This study FWC3 Nov 2016 460 27 0.059 2 3.0 (0.58) 0.194 125 (106-143) 2.0-4.4 18.0 23.2 MWH2 This study ARD3 Dec 2016 460 58 0.126 2 6.7 (0.89) 0.132 125 (106-143) 5.2-8.6 3.72 1.45 MWH5 This study JLB3† Jan 2017 460 146 0.317 2 19.1 (1.59) 0.083 125 (106-143) 16.1-22.4 12.4 11.6 MWH3 This study LKB3 Feb 2017 460 38 0.083 2 4.3 (0.70) 0.163 125 (106-143) 3.1-5.8 3.55 1.45 MWH5 This study 395 *Viability according to equation 1, above. Asymptotic Standard Error (ASE) is presented in parentheses.

Page 9 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

396 **Based on 9,999 bootstraps. 397 ***Based on 9,999 bootstraps tested at viability increments of 0.1%. 398 †Experiments where a subset of positive wells were transferred. 399 FWC2 shows the advantage of our method over equation 1 for extreme values. 400 401 Phylogenetic and geographic novelty of our isolates 402 The 328 isolates belonged to three Phyla: (n = 319), Actinobacteria (n = 8), and 403 Bacteroidetes (n = 1) (Figs. S1-S5). We placed these isolates into 55 groups based on their 404 positions within 16S rRNA gene phylogenetic trees (Figs. S1-S5) and as a result of having ≥ 405 94% 16S rRNA gene sequence identity to other isolates. We applied a nomenclature to each 406 group based on previous 16S rRNA gene database designations and/or other cultured 407 representatives (Fig. 1, Table S1). Isolates represented eight putatively novel genera with ≤ 408 94.5% 16S rRNA gene identity to a previously cultured representative: the Actinobacteria acIV 409 subclades A and B, and one other unnamed Actinobacteria group; an undescribed 410 Acetobacteraceae clade (Alphaproteobacteria); the freshwater SAR11 LD12 (Candidatus 411 Fonsibacter ubiquis (29)); the MWH-UniPo and an unnamed Burkholderaceae clade 412 (Betaproteobacteria); and the OM241 Gammaproteobacteria (Fig. 1, Table S1). Seven 413 additional putatively novel species were also cultured (between 94.6 and 96.9% 16S rRNA gene 414 sequence identity) in unnamed Commamonadaceae and clades 415 (Betaproteobacteria); the SAR92 clade and Pseudohonigella (Gammaproteobacteria); and 416 unnamed and Bradyrhizobiaceae clades, as well as Maricaulis spp. 417 (Alphaproteobacteria) (Fig. 1). LSUCC isolates belonging to the groups BAL58 418 Betaproteobacteria (Fig. S4), OM252 Gammaproteobacteria, HIMB59 Alphaproteobacteria, 419 and what we designated the LSUCC0101-type Gammaproteobacteria (Fig. S5) had close 16S 420 rRNA gene matches to other cultivated taxa at the species level, however, none of those 421 previously cultivated organisms have been formally described (Fig. 1). The OM252, BAL58, and 422 MWH-UniPo clades were the most frequently cultivated, with 124 of our 328 isolates belonging 423 to these three groups (Table S1). In total, 73 and 10 of the 328 isolates belonged in putatively 424 novel genera and species, respectively. We estimated that at least 310 of these isolates were 425 geographically novel, being the first of their type cultivated from the nGOM (Fig. 2). This 426 included isolates from cosmopolitan groups like SAR11 subclade IIIa, OM43 427 Betaproteobacteria, SAR116, and HIMB11-type “Roseobacter” spp.. Cultivars from Vibrio sp. 428 and sp. were the only two groups that contained hits to organisms isolated from the 429 GoM. 430 431 Natural abundance of isolates 432 We matched LSUCC isolate 16S rRNA gene sequences with both operational taxonomic units 433 (OTUs) and amplicon single nucleotide variants (ASVs) from bacterioplankton communities to 434 assess the relative abundances of our isolates in the coastal nGOM waters that served as inocula. 435 While OTUs provide a broad group-level designation (97% sequence identity), this approach can 436 artificially combine multiple ecologically distinct taxa (95). Due to higher stringency for 437 defining a matching 16S rRNA gene, ASVs can increase the confidence that our isolates 438 represent environmentally relevant organisms (74, 96). However, while many abundant 439 oligotrophic bacterioplankton clades such as SAR11 (29, 97), OM43 (40, 41), SAR116 (98), and 440 Sphingomonas spp. (99) have a single copy of the rRNA gene operon, other taxa can have 441 multiple rRNA gene copies (97, 100), complicating ASV analyses. Since we could not a priori

Page 10 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

442 rule out multiple rRNA gene operons for novel groups with no sequenced 443 representatives, we used both OTU and ASV approaches. 444 In total, we obtained at least one isolate from 40 of the 777 OTUs and 71 of the 1,323 445 ASVs observed throughout the three-year dataset. 43% and 26% of LSUCC isolates matched the 446 top 50 most abundant OTUs (median relative abundances, all sites, from 0.11-8.1%; Fig. S6A) 447 and ASVs (mean relative abundances, all sites, from 0.11-3.8%; Fig. S6B), respectively, across 448 all sites and samples. Microbial communities from all collected samples clustered into two 449 groups corresponding to those inhabiting salinities below 7 and above 12, and salinity was the 450 primary environmental driver distinguishing community beta diversity (OTU: R2=0.88, P=0.001, 451 ASV: R2=0.89, P =0.001). As part of the cultivation strategy after the first five experiments, we 452 used a suite of five media differing by salinity and matched the experiment with the medium that 453 most closely resembled the salinity at the sample site. Consequently, our isolates matched 454 abundant environmental groups from both high and low salinity regimes. At salinities above 455 twelve, LSUCC isolates matched 13 and 14 of the 50 most abundant OTUs and ASVs, 456 respectively (Figs. 3A, 3B; Table S1). These taxa included the abundant SAR11 subclade IIIa.1, 457 HIMB59, HIMB11-type “Roseobacter”, and SAR116 Alphaproteobacteria; the OM43 458 Betaproteobacteria; and the OM182 and LSUCC0101-type Gammaproteobacteria. At salinities 459 below seven, 10 and 9 of the 50 most abundant OTUs and ASVs, respectively, were represented 460 by LSUCC isolates, including one of most abundant taxa in both cluster sets, SAR11 LD12 461 (Figs. 3C, 3D). Some taxa, such as SAR11 IIIa.1 and OM43, were among the top 15 most 462 abundant taxa in both salinity regimes (Fig. 3, Table S1), suggesting a euryhaline lifestyle. In 463 fact, our cultured SAR11 IIIa.1 ASV7471 was the most abundant ASV in the aggregate dataset 464 (Fig. S6). 465 Overall, this effort isolated taxa representing 18 and 12 of the top 50 most abundant 466 OTUs and ASVs, respectively (Table 2, Fig. S6). When looking at different median relative 467 abundance categories of > 1%, 0.1% - 1%, and < 0.1%, isolate OTUs were distributed across 468 those categories in the following percentages: 15%, 20%, and 27%; isolate ASVs were 469 distributed accordingly: 4%, 26%, and 37% (Table 2). Isolates with median relative abundances 470 of < 0.1%, such as Pseudohongiella spp., Rhodobacter spp., and spp., would 471 canonically fall within the rare biosphere (101) (Table S1). A number of isolates did not match 472 any identified OTUs or ASVs (38% and 33% of LSUCC isolates when compared to available 473 OTUs and ASVs, respectively), either because their matching OTUs/ASVs were below our 474 thresholds for inclusion (at least two reads from at least two sites), or because they were below 475 the detection limit from our sequencing effort (Table 2). Thus, 43% and 30% of our isolates 476 belonged to OTUs and ASVs, respectively, with median relative abundances > 0.1%. 477 478 Table 2. Median relative abundances (r) of cultured OTUs and ASVs across all samples In top 50 ranks r > 1% 1% - 0.1% r r < 0.1% Not detected OTUs 18 (140 isolates) 50 isolates (15%) 90 isolates (27%) 65 isolates (20%) 123 isolates (38%) ASVs 12 (84 isolates) 13 isolates (4%) 84 isolates (26%) 122 isolates (37%) 109 isolates (33%) 479 480 Modeling DTE cultivation 481 An enigma that became immediately apparent through a review of our data was the absence of an 482 obvious relationship between the abundance of a given taxon in the inoculum and the frequency 483 of obtaining an isolate of the same type from a DTE cultivation experiment (Figs. S7, S8). For 484 example, although we could culture SAR11 LD12 over a range of media conditions (29), and the 485 matching ASV had relative abundances of > 5% in six of our seventeen experiments (Fig. 4), we

Page 11 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

486 only isolated one representative (LSUCC0530). In an ideal DTE cultivation experiment where 487 cells are randomly subsampled from a Poisson-distributed population, if the medium were 488 sufficient for a given microorganism’s growth, then the number of isolates should correlate with 489 that microorganism’s abundance in the inoculum. However, a qualitative examination of several 490 abundant taxa that grew in our media, some of which we cultured on multiple occasions, 491 revealed no clear pattern between abundance and isolation success (Fig. 4). Considering that 492 medium composition was sufficient for cultivation of these organisms on at least some 493 occasions, we hypothesized that cultivation frequency may reflect viability differences in the 494 populations of a given taxon. Thus, we decided to model cultivation frequency in relationship to 495 estimated abundances in a way that could generate estimates of cellular viability. We also hoped 496 that this information might help us inform experimental design and make DTE cultivation efforts 497 more predictable (59). 498 Previously, Don Button and colleagues developed a statistical model for viability (V) of 499 cells in the entire sample for a DTE experiment (33): 500 501 (1) V = -(ln(1-p)/λ) 502 503 Where p is the proportion of wells or tubes, n, with growth, z, (p=z/n) and λ is the estimated 504 number of cells inoculated per well (the authors used X originally). The equation uses a Poisson 505 distribution to account for the variability in cell distribution within the inoculum and therefore 506 the variability in the number of wells or tubes receiving the expected number of cells. We and 507 others have used this equation in the past (26, 28, 35) to evaluate the efficacy of our cultivation 508 experiments in the context of commonly cited numbers for cultivability using agar-plate based 509 methods (13, 17, 102). 510 While Equation 1 is effective for its intended purpose, it has a number of drawbacks that 511 limit its utility for taxon-specific application: 1) If p=1, i.e. all wells are positive, then the 512 equation is invalid; 2) At high values of p and low values of λ, estimates of V can exceed 100% 513 (Table 2); 3) Accuracy of viability, calculated by the asymptotic standard error, ASEV, or the 514 coefficient of variation, CVV, was shown to be non-uniform across a range of λ, with greatest 515 accuracy when true viability was ~10% (33). Thus, low viability, low values of λ, and small 516 values of n were found to produce unreliable results; 4) If p=0, i.e. no positive wells are 517 observed, estimates of viability that could produce 0 positive wells cannot be calculated. In 518 addition, 5) Button’s original model assumes that a well will only produce a pure culture if the 519 inoculated well contains one cell. By contrast, in low diversity samples, samples dominated by a 520 single taxon, or experiments evaluating viability from axenic cultures across different media, a 521 limitation that only wells with single cells are axenic will underestimate the expected number of 522 pure wells. To overcome these limitations, we developed a Monte Carlo simulation model that 523 facilitates the incorporation of relative abundance data from complementary community profiling 524 data (e.g. 16S rRNA gene amplicons) to calculate the likelihood of positive wells, pure wells, 525 and viability at a taxon-specific level, based on the observed number of wells for we obtained an 526 isolate of a particular taxon (Fig. 5). By employing a Monte Carlo approach, our model is robust 527 across all values of p and n with uniform prediction accuracy, and we can estimate the accuracy 528 of our prediction within 95% confidence intervals (CI). Furthermore, the width of 95% CI 529 boundaries of viability, as well as the expected number of positive and pure wells, are entirely 530 controllable and dependent only on available computational capacity for bootstrapping (i.e., 531 these can be improved with more bootstrapping, but at greater computational cost). When zero

Page 12 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

532 positive wells are observed experimentally, our approach enables estimation of a maximum 533 viability that could explain such an observation by identifying the range of variability values for 534 which zero resides within the bootstrapped 95% CI. Finally, the ability to calculate the viability 535 of the entire community, as in Equation 1, is retained simply by estimating viability using a 536 relative abundance of one. 537 We compared our model to that of Button et al. for evaluating viability from whole 538 community experimental results, similarly to previous reports (26, 28, 35) (Table 1). Our 539 viability estimates (Vest) generally agreed with those using the Button et al. calculation, but we 540 have now provided 95% CI to depict the maximum and minimum viability that would match the 541 returned positive well distribution, as well as maximum and minimum values for the number of 542 wells that ought to have contained a single cell. Maximum Vest ranged from 1.1% to > 92.3% 543 depending on the experiment, with a median Vest across all our experiments of 8.6% (Table 1). In 544 one case, the extremely high value (FWC2) was better handled by our model compared to 545 equation 1, because it did not lead to a viability estimation greater than 100%. FWC and FWC2 546 represent Vest outliers compared with the entire dataset (maximums of 59.7% and > 92.3%, 547 respectively; Table 1). We believe these high numbers most likely resulted from underestimating 548 the number of cells inoculated into each well because of the use of microscopy, clumped cells, or 549 pipet error, thus increasing the estimated viability (35). 550 551 Isolate-specific viability estimates 552 Our new model also facilitates taxon-specific viability estimates. Cultivation efficacy was 553 evaluated for 71 cultured ASVs (219 isolates) across 17 sites (1,207 pairwise combinations) by 554 comparing the number of observed pure wells to those predicted by the Monte Carlo simulation 555 using 9,999 bootstraps, 460 wells per experiment, and an assumption that all cells were viable 556 (i.e. V = 100%). In total, for 1,158 out of 1,207 pairwise combinations (95.9%) the observed 557 number of pure wells fell within the 95% CI of data simulated at matching relative abundance 558 and inoculum size, suggesting that these two parameters alone could explain the observed 559 cultivation success for most taxa (Table S1). 1,059 out of these 1,158 combinations (91%) 560 recorded zero observed wells, but with a maximum relative abundance of 2.8% within these 561 combinations, a score of zero fell within predicted 95% CI of simulations with 460 wells. 562 Sensitivity analysis showed that with 460 wells per experiment, an observation of zero pure 563 wells falls below the 95% confidence intervals lower-bound (and is thus significantly depleted to 564 enable viability to be estimated) for taxa with relative abundances of 2.3%, 2.9% and 4.5% for 565 inoculum sizes of one, two and three cells per well, respectively (Fig. S9). In fact, modeling DTE 566 experiments from 92 wells to 9,200 wells per experiment showed that for taxa comprising ~1% 567 of a microbial community 1,104 wells (or 12 plates at 92 wells per plate), 1,380 wells (15 plates) 568 and 2,576 wells (28 plates) were required to be statistically likely to recover at least one positive, 569 pure well using inocula of one, two or three cells per well, respectively, with V = 100% (Fig. S9). 570 A small, but taxonomically relevant minority (49 out of 1,207) of pairwise combinations 571 had a number of observed pure wells that fell outside of the simulated 95% CI with V = 100% 572 (Fig. 6). Of these, 28 had either one, two, or three more observed pure wells than the upper 95% 573 CI (Table S1), suggesting cultivability higher than expected based purely a model capturing the 574 interaction between a Poisson-distributed inoculum and a binomially-distributed relative 575 abundance, with V = 100%. However, the deviance from the expected number of positive wells 576 for those above the 95% CI was limited to three or fewer cells, meaning that we only obtained 1- 577 3 more isolates than expected (Table S1). On the other hand, those organisms that we isolated

Page 13 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

578 less frequently than expected showed greater deviance. 21 out of the 49 outliers had lower than 579 expected cultivability (Fig. 6). These taxa had relative abundances ranging from 2.7% to 14.5%, 580 but recorded only 0, 1, or 2 isolates. In the most extreme case, ASV7629 (SAR11 LD12) at Site 581 ARD2c comprised 14.5% of the community but recorded no observed pure wells, compared to 582 expected number of 13-30 isolates (95% CI) predicted by the Monte Carlo simulation. All the 583 examples of taxa that were isolated less frequently than expected given the assumption of V = 584 100% belonged to either SAR11 LD12, SAR11 IIIa.1, or one particular OM43 ASV (7241) (Fig. 585 6). 586 We used our model to calculate estimated viability (Vest) for these organisms based on 587 their cultivation frequency at sites where the assumption of V = 100% appeared violated (Table 588 3). Using the extreme example of SAR11 LD12 ASV7629 at Site ARD2c, simulations across a 589 range of V indicated that a result of zero positive wells fell within 95% of simulated values when 590 the associated taxon Vest ≤ 15%. When considering all anomalous cultivation results, LD12 had 591 estimated maximum viabilities that ranged up to 55% (Table 3). OM43 (ASV7241) estimated 592 maximum viabilities ranged from 52-80%, depending on the site, and similarly, SAR11 IIIa.1 593 ranged between 22-82% viability (Table 3). 594 595 Table 3. Estimated viabilities for taxa cultivated less frequently than expected Estimated # wells with 1 cell (bootstrapped Vest: min-max 95% median: (xx-xx) 95% CI based on ASV Group Site r* n z λ CI) if V==1 ** cultivation results*** 7241 OM43 ARD3 0.03 460 0 2 4 (1-9) 0.1-80 7241 OM43 FWC† 0.04 460 0 2 5 (1-9) 0.1-77 7241 OM43 JLB 0.05 460 0 1.96 7 (2-12) 0.1-52 7471 SAR11 IIIa.1 ARD3 0.11 460 0 2 15 (8-23) 0.1-22 7471 SAR11 IIIa.1 CJ 0.03 460 0 1.27 4 (1-9) 0.1-82 7471 SAR11 IIIa.1 FWC3 0.07 460 2 2 9 (4-15) 2.5-80 7471 SAR11 IIIa.1 JLB 0.05 460 0 1.96 6 (2-11) 0.1-59 7471 SAR11 IIIa.1 JLB2c† 0.08 460 0 2 11 (5-18) 0.1-31 7471 SAR11 IIIa.1 JLB3† 0.04 460 0 2 5 (1-9) 0.1-74 7471 SAR11 IIIa.1 LKB 0.05 460 0 1.8 6 (2-12) 0.1-55 7471 SAR11 IIIa.1 LKB2 0.04 460 0 2 5 (1-9) 0.1-77 7471 SAR11 IIIa.1 LKB3 0.09 460 0 2 11 (6-19) 0.1-30 7471 SAR11 IIIa.1 TBON2 0.04 460 0 1.56 6 (2-12) 0.1-56 7471 SAR11 IIIa.1 TBON3 0.04 460 0 2 5 (1-10) 0.1-73 7629 SAR11 LD12 ARD 0.11 460 0 1.5 18 (10-27) 0.1-20 7629 SAR11 LD12 ARD2c 0.15 460 0 2 21 (13-30) 0.1-15 7629 SAR11 LD12 ARD3 0.05 460 0 2 7 (2-12) 0.1-53 7629 SAR11 LD12 FWC† 0.09 460 0 2 12 (6-19) 0.1-28 7629 SAR11 LD12 LKB 0.05 460 0 1.8 7 (2-13) 0.1-51 7629 SAR11 LD12 LKB2 0.08 460 1 2 11 (5-18) 0.3-49 7629 SAR11 LD12 LKB3 0.06 460 0 2 7 (3-13) 0.1-48 596 *Fractional relative abundance 597 **Based on 9,999 bootstraps. 598 ***Based on 9,999 bootstraps tested at viability increments of 0.1%. 599 †Experiments where a subset of positive wells were transferred. 600 601 Discussion 602 This work paired 17 DTE cultivation experiments with cultivation-independent assessments of 603 microbial community structure in source waters to evaluate cultivation efficacy. We generated 604 328 new bacterial isolates representing 40 of the 777 OTUs and 71 of the 1,323 ASVs observed 605 across all samples from which we inoculated DTE experiments. Stated another way, we 606 successfully cultivated 5% of the total three year bacterioplankton community observed via 607 either OTU or ASV analyses. A large fraction of our isolates (43% of cultured OTUs, 30% of

Page 14 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

608 cultured ASVs) represented taxa present at median relative abundances > 0.1%, with 15% and 609 4% of cultured OTUs and ASVs, respectively, at median abundances > 1%. 140 of our isolates 610 matched the top 50 most abundant OTUs, and 84 isolates matched the top 50 most abundant 611 ASVs. 612 This campaign led to the first isolations of the abundant SAR11 LD12 and Actinobacteria 613 acIV; the second isolate of the HIMB59 Alphaproteobacteria; and new genera within the 614 Acetobacteraceae, , OM241 and LSUCC0101-type Gammaproteobacteria, and 615 MWH-UniPo Betaproteobacteria; thereby demonstrating again that continued DTE 616 experimentation leads to isolation of previously uncultured organisms with value for aquatic 617 microbiology. We have also added a considerable collection of isolates to previously cultured 618 groups like OM252 Gammaproteobacteria, BAL58 Betaproteobacteria, and HIMB11-type 619 “Roseobacter” spp., and the majority of our isolates represent the first versions of these types of 620 taxa from the Gulf of Mexico, adding comparative biogeographic value to these cultures. 621 Our viability model improved upon the statistical equation developed by Button and 622 colleagues (33) to extend viability estimates to individual taxa within a mixed community and 623 provide 95% CI constraining those viability estimates. We cultured several groups of organisms 624 abundant enough to evaluate viability with 460 wells (Figs. 6, S9). The fact that these organisms 625 were successfully cultured at least once meant that we could reasonably assume that the medium 626 was sufficient for growth. Some taxa were cultivated more frequently than expected (Fig. 6). We 627 explore two possible explanations for this phenomenon- errors in quantification and variation in 628 microbial cell organization. Any systematic error that led to underestimating the abundance of an 629 organism would have correspondingly resulted in our underestimating the number of wells in 630 which we would expect to find a pure culture of that organism. Such underestimations could 631 come from primer biases associated with amplicon sequencing (69, 70, 103), but we do not know 632 if those protocols specifically underestimate the OM252, MWH-UniPo, and HIMB11-type taxa 633 cultured more frequently than expected (Fig. 6). However, due the low number of expected 634 isolates in these groups, and the small deviances in actual isolates from those expected numbers 635 (within 1-3 isolates compared to expected values), the biases inherent in the relative abundance 636 estimations for these taxa were probably small. Furthermore, one of the microorganisms isolated 637 more frequently than expected matched the OM43 ASV1389 (Fig. S6), whereas another OM43 638 ASV (7241) was cultivated less frequently than expected (see below), meaning that if primer 639 bias were the cause of this discrepancy, it would have to be operating differently on very closely 640 related organisms. 641 There exists a biological explanation for why some isolates might have been cultured 642 more frequently than expected: clumped cells. If cells of any given taxon in grew in small 643 clusters, then the number of cells we added to a well was greater than expected based on a 644 Poisson distribution. Furthermore, the model assumes that each cell is independent, and that the 645 composition of a subset of cells is only a function of the relative abundance of the taxon in the 646 community. Within a cluster of cells, this assumption is violated as the probability of cells being 647 from the same taxon is higher. Thus, the model will underestimate the probability of a well being 648 pure and therefore underestimate the number of pure wells likely to be observed within an 649 experiment, leading to a greater number of isolates than expected. Future microscopy work could 650 examine whether microorganisms such as OM252 and MWH-UniPo form small clusters in situ 651 and/or in pure culture, and whether this phenomenon may be different for different ASVs of 652 OM43, or if clumping may be a transient phenotype.

Page 15 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

653 We also identified three taxa- SAR11 LD12, SAR11 subclade IIIa.1, and the 654 aforementioned OM43 ASV7241- that were isolated much less frequently than expected based 655 on their abundances (Fig. 6, Table 3). This could mean that our assumption of V = 100% was 656 incorrect, or that, in contrast to the taxa that were cultured more frequently than expected 657 (above), our methods had biases that overestimated the abundance of these organisms, thereby 658 over-inflating the expected number of isolates. We used the modified 515/806RB primers that 659 have been shown to be much more accurate in quantifying SAR11 compared to FISH than the 660 original 515/806 primers (within 6% ± 4% SD), and this protocol almost always underestimates 661 SAR11 abundances (69). This suggests that our expected number of isolates may have actually 662 been underestimated, our cultivation success was poorer than we measured, and therefore we 663 may be overestimating viability for the SAR11 taxa. Other sources of systematic error that might 664 impinge on successful transfers, and thereby reduce our recovery, include sensitivity to pipette 665 tip and/or flask material. However, the fact that these taxa were sometimes successfully isolated 666 means that if these mechanisms were impacting successful transfers, then their activity was less 667 than 100% efficient, which implies variations in subpopulation vulnerability that would be very 668 similar conceptually to variations in subpopulation viability. 669 Another possible source of error that could have resulted in lower than expected numbers 670 of isolates was the subset of experiments for which we did not transfer all positive wells due to 671 limitations in available personnel time (Tables 1 & 3). However, our selection criteria for the 672 subset of wells to transfer was based on flow cytometric signatures that would have encompassed 673 small cells like SAR11 (see Results), and in any case, there were many examples of lower than 674 expected recovery from other experiments where we transferred all positive wells (Table 3). 675 Therefore, these four experiments were unlikely to contribute major errors biasing our estimates 676 of viability for SAR11 LD12, SAR11 IIIa.1, and other small cells like OM43. 677 If we instead explore biological reasons for the lower than expected numbers of positive 678 wells in DTE experiments, a plausible explanation supported by the literature is simply that a 679 large fraction of the population is in some state of inactivity or at least not actively dividing 680 (104). Studies using uptake of a variety of radiolabelled carbon and sulfur sources have 681 demonstrated substantial fractions of SAR11 cells may be inactive depending on the population 682 (105–108). SAR11 cells in the northwest Atlantic and Mediterranean showed variable uptake of 683 labelled leucine (30-50% -(105, 106); ~25-55% (108, 109) and amino acids (34-61% (105, 106); 684 34-66% (105, 106). Taken in reverse, this means that up to 75% of the SAR11 population may 685 be inactive at any given time. In another study focused on brackish communities, less than 10% 686 of SAR11 LD12 cells took up labelled leucine and/or thymidine (107). While this was likely not 687 the ideal habitat for LD12 based on salinities above 6 (29, 107), this study supports the others 688 above that show substantial proportions of inactive SAR11 cells, the fraction of which may 689 depend on environmental conditions and other unknown factors. Bi-orthagonal non-canonical 690 amino acid tagging (BONCAT) shows similar trends for SAR11 (110). These data also match 691 general data indicating prevalent inactivity among aquatic bacterioplankton (104, 111–113). 692 Although labelled uptake methods do not directly measure rates of cell division, the 693 incorporation of these compounds requires active DNA replication or translation, which 694 represent an even more fundamental level of activity than cell division (114). 695 Why might selection favor low viability? One possibility is as an effective defense 696 mechanism against abundant viruses. Viruses infecting SAR11 have been shown to be extremely 697 abundant in both marine (115, 116) and freshwater (117) systems. Indeed, the paradox of high 698 viral abundances and high host abundances in SAR11 has led to a refining of negative density

Page 16 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

699 dependent selection through Lokta-Volterra predator-prey dynamics (118) to include 700 heterogeneous susceptibility at the strain level (119, 120) and positive density dependent 701 selection through intraspecific proliferation of defense mechanisms (121). Activity of lytic 702 viruses infecting SAR11 in situ demonstrated that phages infecting SAR11 have lower ratios of 703 viral transcripts to host cells compared to other abundant taxa, and that observed abrupt changes 704 in these ratios suggest co-existence of several SAR11 strains with different life strategies and 705 phage susceptibility (122). Phenotypic stochasticity of phage receptor expression has been 706 shown to maintain a small proportion of phage-insensitive hosts within a population, enabling 707 coexistence of predator and prey without extinction (123). Phages adsorb to a vast array of 708 receptor proteins on their hosts, with many well-characterised receptors (e.g. OmpC, TonB, 709 BtuB, LamB) associated with nutrient uptake or osmoregulation (124). Selection therefore 710 favours phenotypes that limit receptor expression, with an associated fitness cost, particularly in 711 nutrient-limited environments. However, an alternative mechanism is possible if a population of 712 cells comprised a small number of viable cells, and a large number of non-viable or dormant 713 cells where metabolism is restricted but presentation of receptor proteins is retained. The 714 majority of host-virus encounters would occur with non-viable cells, and constrain viral 715 propagation through inefficient or failed , effectively acting as a sink for infectious 716 particles. Thus, the population as a whole can maintain expression of receptor proteins, while 717 reducing viral , with low-viability cells retaining the possibility of a return to normal 718 phenotypic growth if and when viral selection pressure abates. Similar systems of bacterial 719 phenotypic bistability associated with tolerance and survival in harsh environments are 720 well documented (125, 126) and have been suggested as a possible mechanism for maintaining 721 long-term co-stability between the abundant phage ɸCrAss001 and its host in the human gut 722 (127). 723 Detailed measurements of dormancy in SAR11 and what types of cellular functions 724 become inactivated are part of our ongoing work. In the meantime, it is worth it to think about 725 the implications of a substantial proportion of non-dividing cells for our understanding of basic 726 growth dynamics. Studies attempting to measure SAR11 growth rates in nature have yielded a 727 wide range of results, ranging from 0.03-1.8 day-1 (97, 105, 108, 128–130). These span wider 728 growth rates than observed for axenic cultures of SAR11 (0.4-1.2 day-1), but isolate-specific 729 growth ranges within that spread are much more constrained (29, 36, 49, 131, 132)). Conversion 730 factors for determining production from 3H-leucine incorporation appear to be accurate for at 731 least Ia subclade members of SAR11 (133), so variations in growth rate estimates from 732 microradiography experiments likely have other explanations. It is possible that different strains 733 of SAR11 simply have variations in growth rate not captured by existing isolates. Another, not 734 mutually exclusive, possibility is that the differences in in situ growth rate estimates also reflect 735 variations in the proportion of actively dividing cells within the population. A simple model of 736 cell division with binary fission where only a subset of cells divide and non-dividing cells 737 persist, rather than die, can still yield logarithmic growth curves (Fig. S10) like those observed 738 for SAR11 in pure culture (29, 49, 134). However, this means that the division rate for the subset 739 of cells that are actively dividing is much higher than calculated when assuming 100% viability 740 in the population. Based on our estimated viability for SAR11 LD12 of 15-55%, to obtain our 741 previously calculated maximum division rate (0.5 day-1) for the whole culture (29), the per-cell 742 division rate for only a subpopulation would span 2.48-0.79 day-1 (Fig. S10, Supplemental Text). 743 Verifying the proportion of SAR11 cells actively dividing in a culture may be challenging. Time-

Page 17 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

744 lapse microscopy (135) offers an elegant solution if SAR11 can be maintained for the requisite 745 time periods for accurate measurements in a microfluidic device. 746 In addition to identifying taxa whose isolation success suggested deviations from 747 biological assumptions of single planktonic cells with 100% viability, the model also revealed 748 the limitations of DTE cultivation in assessing viability depending on relative abundance (Fig. 749 S9). We cannot ascertain whether any given taxon may violate an assumption of V = 100% 750 unless we have enough wells to demonstrate that it grew in fewer wells than expected. For 751 example, taxa at 1% of the microbial community require more than 1,000 wells before the lack 752 of a cultured organism represents a statistically important negative event, rather than a taxon 753 simply lacking sufficient abundance to ensure inclusion in a well within 95% CI. In our 460 well 754 experiments, we could not resolve whether taxa may have had viabilities below 100% if they 755 were less than 3% of the community for any given experiment (Fig. S9). Modeling DTE 756 experiments showed that for experiments targeting rare taxa, lower inoculum sizes are favoured 757 where selective media for enrichment is either unknown or undesirable. The exponential increase 758 in the number of required wells with respect to the inoculum size is a function of a pure well 759 requiring all cells within it to belong to the same taxon, assuming all cells are equally and 760 optimally viable. 761 By providing taxon-specific predictions of viability from cultivation data, our model now 762 facilitates an iterative process to improve experimental design and make cultivation more 763 reliable. First, we use the cultivation success rates to determine for which taxa the assumption of 764 100% viability was violated. Second, we use the model to estimate viability for those organisms. 765 Third, we use the viability and relative abundance data to determine, within 95% CI, the 766 appropriate number of inoculation attempts required to isolate a new version of that organism. 767 Using SAR11 LD12 as an example, given a relative abundance of 10%, and a viability of 15%, 768 800 DTE wells should yield four pure, positive wells (1-8 95% CI). This means that, for 769 microorganisms that we know successfully grow in our media, we can now statistically constrain 770 the appropriate number of wells required to culture a given taxon again. For organisms that were 771 not abundant enough to estimate viability using the model, we can use a conservative viability 772 assumption (e.g., 50% (111)) with which to base our cultivation strategy, thereby still reducing 773 uncertainty about the experimental effort necessary to re-isolate one of these microorganisms. 774 775 Conclusions 776 This work has provided hundreds of new cultures for microbiological research, many among the 777 most abundant members of the nGOM coastal bacterioplankton community. It also provides 778 another demonstration of the effectiveness of sustained cultivation efforts for bringing previously 779 uncultivated strains into culture. Our modeled cultivation results have generated compelling 780 evidence for low viability within subpopulations of SAR11 LD12 and IIIa.1, as well as OM43 781 Betaproteobacteria. The prevalence of, and controls on, dormancy in these clades deserves 782 further study. We anticipate that future work with larger DTE experiments will yield similar 783 viability data about other groups of taxa with lower abundance, highlighting a valuable 784 diagnostic application of DTE cultivation/modeling beyond the primary role in isolating new 785 microorganisms. The integration of cultivation results, natural abundance data from inoculum 786 communities, and DTE modeling represents an important step forward in quantifying the risk 787 associated with DTE efforts to isolate valuable taxa from new sources, or repeating isolation 788 from the same locations. We hope variations of this approach will be incorporated into wider 789 community efforts to invest in culturing the uncultured.

Page 18 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

790 791 792 793 794 Acknowledgments 795 We would like to thank Dr. Nancy Rabalais for her comments and edits on a very early draft of 796 this manuscript. Portions of this research were conducted with high-performance computing 797 resources provided by Louisiana State University (http://www.hpc.lsu.edu) and the University of 798 Southern California (http://hpc.usc.edu). 799 800 Funding Information 801 This work was supported by the Department of Biological Sciences at Louisiana State 802 University, a Louisiana Board of Regents grant to JCT (LEQSF(2014-17)-RD-A-06), a National 803 Academies of Science, Engineering, and Medicine Gulf Research Program Early Career 804 Fellowship to JCT, and the Dornsife College of Letters, Arts and Sciences at the University of 805 Southern California. BT was partially funded by NERC award NE/R010935/1 and by the Simons 806 Foundation BIOS-SCOPE program. The funders had no role in study design, data collection, and 807 interpretation, or the decision to submit the work for publication. 808 809 Author Contributions 810 MWH led sample collections, cultivation experiments, nucleic acid extraction, amplicon 811 sequencing, and analyses; VCL, DMP, JLW, and AML supported sample collections, cultivation 812 experiments, and nucleic acid extractions; MWH and JCT conducted cultivation comparisons 813 and phylogenetic analyses; BT developed the viability model and led the statistical analyses; CC 814 derived the cell-specific growth rate equations incorporating viability; JCT designed the study 815 and assisted with sample collections and model refinement; MWH, BT, and JCT led manuscript 816 writing; and all authors contributed to and reviewed the text. 817 818 819 820 821

Page 19 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

822 Figures 823 824 825

826 827 828 Figure 1. Percent identity of LSUCC isolate 16S rRNA genes compared with those from other 829 isolates in NCBI (“Other”, gray dots) or from the DTE culture collections IMCC ( dots), 830 HTCC (blue dots), and HIMB (green dots). Each dot represents a pairwise 16S rRNA gene 831 comparison (via BLASTn). X-axis categories are groups designated according to ≥ 94% 832 sequence identity and phylogenetic placement (see Figs S1-S4). Above the graph is the 16S 833 rRNA gene sequence percent identity to the closest non-LSUCC isolate within a column. Groups 834 colored in red indicate those where LSUCC isolates represent putatively novel genera, whereas 835 orange indicates putatively novel species. 836 837 838 839

Page 20 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

840 841 842 843 844 845

846 847 848 Figure 2. A global map of the isolation location of isolates from selected important aquatic 849 bacterioplankton clades. Circles represent LSUCC isolates, while triangles are non-LSUCC 850 isolates. Inset: a zoomed view of the coastal Louisiana region where LSUCC bacterioplankton 851 originated. 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875

Page 21 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

876 877

0.20 C A C C B 0.15 LSUCC LSUCC LSUCC LSUCC LSUCC LSUCC LSUCC LSUCC LSUC LSUCC LSUCC LSUCC LSUCC LSUCC LSUCC LSUC LSUCC LSUCC LSUCC LSUCC LSUCC LSUC LSUCC LSUCC LSUCC LSUCC LSUCC

0.15

0.10

0.10

0.05 0.05

0.00 0.00 Phylum Acidobacteria sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. type type type type type type type r r r s s type type type type a a a a r r IIIa.2 IIIa.2 IIIa.1 s s s clade clade clade clade clade clade clade clade clade clade clade clade clade clade clade a a a a a Actinobacteria IIIa.1 IIIa.2 group group group group group group group clade clade clade clade clade clade clade clade clade clade clade clade clade clade o 1 1 9 9 1 9 group group group group group group 1 1 9 o e e e I 2 5 5 8 6 6 3 6 6 3 6 4 3 1 6 e e e e e e e e e 2 2 6 5 1 1 3 6 6 6 6 4 8 e e e e e e Bacteroidetes Aquilun NOR Aquilun Actinomarina Rickettsiales marin marin marin marin marin marin marin Actinomarina marin marin marin marin marin marin . s . subclad subclad subclad OTU33;SV1-8 s Actinobacteria s ASV3515;E01- 5 5 4 5 5 5 6 OTU19;OM60/

subclad subclad Gemmatimonadetes Proteobacteria Nitrosopumilus s 6 5 5 5 4 5 Saprospiraceae . Pelagibacte Pelagibacte Pelagibacte OTU1;ac Saprospiraceae ASV7591;SV1-8 . 1 1 1 Pelagibacte Pelagibacte Actinomarin e Actinomarin Actinomarin . e 1 1 s Methylophilaceae . s s s

Acidimicrobiaceae Proteobacteria s s s s s . Acidimicrobiaceae Rhodobacteraceae . Rhodobacteraceae Betaproteobacteria Betaproteobacteria 9C-2 . Alphaproteobacteria OTU8;HIMB5 . . OTU9;SAR1 OTU11;OM4 Alphaproteobacteria . . . clad clad OTU84;BAL5 . OTU10;HIMB1 OTU30;SAR1 OTU16;SAR8 OTU50;OM25 OTU56;OM18 . OTU152;Balneol ASV6693;OM4 ASV1389;OM4 ASV7241;OM4 20 6 OTU24;SAR40 OTU67;SAR11 OTU21;SAR11 OTU35;SAR32 ASV1000;BAL5 6 ASV6516;HIMB1 ASV5716;HIMB5 ASV5713;HIMB5 ASV5073;HIMB1 ASV3702;HIMB5 ASV459;SAR40 ASV7760;SAR1 ASV7230;SAR8 ASV5512;OM25 OTU184;SAR11 OTU163;Marinicell ASV1118;SAR11 ASV1106;SAR11 ASV7223;SAR11 ASV5567;SAR11 ASV7490;SAR32 Verrucomicrobia Abundance OTU15;MWH-UniP OTU31;Owenweeksi OTU6;Synechococcu OTU3;Synechococcu OTU47;NS OTU23;NS OTU136;unc OTU308;LSUCC010 OTU104;Owenweeksi ASV7027;MWH-UniP OTU59;unc ASV3029;unc ASV7518;LSUCC010 OTU43;Synechococcu OTU124;NS OTU106;NS OTU131;NS e ASV3349;NS ASV5387;NS ASV3952;NS ASV1334;NS ASV1783;NS ASV1302;NS Candidatu ASV1505;OM60/NOR ASV5606;OM60/NOR OTU147;unc ASV7713;Synechococcu ASV6017;Synechococcu OTU5;SAR1 ; OTU118;unc ASV1566;unc OTU49;SAR1 OTU22;unc OTU74;unc OTU4;Candidatu OTU66;unc OTU36;unc OTU40;unc ASV4672;SAR1 ASV7245;unc ASV4667;SAR1 ASV7471;SAR1 OTU18;Candidatu OTU70;E01-9C-2 ASV7214;unc Salinity ASV267;Candidatu ASV7530;unc ASV7707;Candidatu OTU255;unc ASV4540;unc OTU13;Candidatu OTU2 OTU44;SAR8 Relativ ASV473;SAR8 OTU117;Candidatu ASV4651;Candidatu ASV7641;Candidatu ASV7779;Candidatu 30 ASV4832;Candidatu ASV7135;Candidatu ASV7150;Candidatu 20

C C C D 10 0 LSUCC LSUCC LSUCC LSUCC LSUCC LSUCC LSUCC LSUCC LSUCC 0.20 LSUC LSUCC LSUCC LSUC LSUCC LSUCC LSUCC LSUCC LSUCC LSUCC 0.15

0.15 0.10

0.10

0.05

0.05

0.00 0.00 I 8 e sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. sp. IIIa sp. type r r type type type type s s s s s a a a s s IIIa.1 a a a s IIIa.1 e clade clade LD12 clade clade clade clade clade clade clade clade clade clade clade clade clade clade clade clade m m clade clade clade clade clade clade clade clade clade clade clade clade clade clade LD12 clade clade clade clade clade group gorup group 9 o o o 9 I I I I I I I I I I I I I I I e e 1 1 3 o 8 o 6 8 1 8 6 8 1 1 3 V V V V V e V V V V e e Bacteria Bacteria Opitutae Opitutae Aquiluna . . . . subclad s UniP 1 Aquilun subclad marin marin marin errimona s Actinobacteria Actinobacteria Proteobacteria subclad Proteobacteria T subclad 1 5 OTU33;SV1-8 5 a OTU1;ac . . Holophagaceae . . Holophagaceae Pelagibacte OTU286;PeM15 Burholderiaceae 1 ASV7146;MWH- ASV7025;MWH- 1 Actinomarin Sporichthyaceae OTU93;TRA3-20 OTU69;ac OTU20;ac OTU41;ac Sporichthyaceae Actinomarin . . ASV7591;SV1 . Burkholderiaceae Methylophilaceae s Flavisolibacte . ASV510;ac ASV566;ac OTU158;ac OTU133;ac . OTU80;acI OTU12;acI OTU27;acI OTU37;acI OTU63;acI s e e s . . Betaproteobacteria Betaproteobacteria Betaproteobacteria ; OTU77;LD2 ASV3167;ac ASV7420;ac ASV2238;ac ASV7613;ac ASV7735;ac ASV3173;ac ASV7609;ac OTU8;HIMB5 OTU7;SAR1 OTU9;SAR1 OTU11;OM4 Gemmatimonadetes . . . OTU14;BAL5 ASV2995;acI ASV6590;acI ASV5060;acI ASV7657;acI OTU34;unc OTU16;SAR8 OTU192;Opitutu OTU57;unc . ASV7252;LD2 ASV2671;Opitutu Gammaproteobacteria Gammaproteobacteria ASV7241;OM4 Gammaproteobacteria Gammaproteobacteria clad clad ASV7731;BAL5 ASV5716;HIMB5 ASV6293;SAR1 ASV1142;SAR8 ASV3891;unc ASV7760;SAR1 ASV7629;SAR1 . . ASV5787;unc . . OTU102; I I OTU45;Algoriphagu OTU28;Owenweeksi OTU31;Owenweeksi e OTU3;Synechococcu OTU6;Synechococcu OTU47;NS OTU15;MWH-UniP OTU65 ASV6014;Algoriphagu OTU53;Pseudospirillu OTU107;MWH-UniP ASV7027;MWH-UniP ASV2234;MWH-UniP ASV4829;Owenweeksi ASV7779;SAR1 OTU153;unc OTU46;unc typ ASV5387;NS OTU537;unc OTU147;unc OTU18;Candidatu OTU5;SAR1 ASV4880;unc OTU25;unc ASV7713;Synechococcu ASV2733;NS3 OTU39;Sediminibacteriu OTU157;unc o ASV7108;SAR1 ASV6770;unc OTU36;unc ASV3971;unc ASV7249;unc ASV7471;SAR1 ASV7245;unc OTU68;unc ASV6466;unc ASV5003;unc ASV7707;Candidatu OTU2;Candidatu OTU54;unc OTU62;unc UniP OTU4;Candidatu ASV1817;ac ASV2241;ac ASV5396;unc ASV3353;unc

878 ASV7150;Candidatu 879 880 881 Figure 3. Rank abundances of the 50 most abundant taxa based on median relative abundance at 882 salinities less than seven (top) and greater than twelve (bottom) for OTUs (A, C) and ASVs (B, 883 D) across all sites. The boxes indicate the interquartile range (IQR) of the data, with vertical lines 884 indicating the upper and lower extremes according to 1.5 x IQR. Horizontal lines within each 885 box indicate the median. The data points comprising the distribution are plotted on top of the 886 boxplots. The shade of the dot represents the salinity at the sample site (red-blue :: lower-higher), 887 while the color of the box indicates broad taxonomic identity. LSUCC labels indicate OTUs and 888 ASVs with at least one cultivated representative. 889 890 891 892 893 894 895 896 897 898 899

Page 22 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

900 901 902 acIV subclade A BAL58 clade HIMB11 type

Po clade OM252 clade OM43 clade SAR11 LD12 ASV Type type1 type2 type3 type4 type5

Relative Abundace

SAR11 subclade IIIa1 SAR11 subclade IIIa.2

903 Salinity 904 905 Figure 4. Relative abundance of ASVs within key taxonomic groups compared with salinity. 906 ASV types are colored independently, and triangle points indicate experiments for which at least 907 one isolate was obtained. Non-linear regression lines are provided as a visual aid for abundance 908 trends. 909 910 911 912 913 914 915 916 917 918 919 920 921 922

Page 23 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

923 924 925

Step 1: Simulate n wells

w1 w2 w3 w4 w5 w6 w7 w w

Step : Simulate inoculation of wells from oisson distribution =inoculum

w1 w2 w3 w4 w5 w6 w7 w w

n=cellsinwell Step : Simulate lielihood of taon from Binomial distribution p=relabund

w1 w2 w3 w4 w5 w6 w7 w w

Step : Count positive wells, taon positive wells, pure wells and taon pure wells

w1 w2 w3 w4 w5 w6 w7 w w

Step : or taon pure wells, simulate lielihood of viability from Binomial n=cellsinwell distribution p=viability

w5 w7

Step 6: Count wells where viable cells >=1 Step 7: Bootstrap steps 1-6 k times at different levels of viability, 0≤v ≤

v0 v0 v0 v0 v0

0 Expected # taxn pe e n

Step 8: dentify min, max values of v where observed wells falls within bootstrapped C

bserved value 926 927 928 Figure 5. Graphical depiction of the viability model. 929 930

Page 24 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

931 932 933 934 935

936 937 938 939 Figure 6. Actual vs. expected numbers of isolates. Each point represents the actual number of 940 isolates for every ASV/experiment pair compared to the expected number of isolates based on 941 our model assuming 100% viability. Colors represent the relationship to the model predictions: 942 green- relative abundance low enough whereby 0 isolates fell within the 95% CI; pink- isolation 943 results within the 95% CI for taxa with relative abundance ≥ 2.9%; orange- actual isolates > 944 maximum 95% CI for expected isolates; blue- actual isolates < minimum 95% CI for expected 945 isolates. Circle size is proportional to the deviation of the number of actual isolates from the 946 maximum (for orange) or minimum (for blue) 95% CI for expected isolates. The dotted line is 947 the 1:1 ratio. Notable taxa on the extremities of the actual and expected values are labeled. 948 949 950 951 952 953

Page 25 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

954 Supplemental Information 955 956 Supplemental Text 957 Math for back calculating the per-cell division rates in Henson et al. 2018 assuming 15%--55% 958 viability. 959 960 Denote the initial cell count as �, and the cell count in � generation as�. If all the cells are 961 viable, per each generation, all the cells are doubling: 962 963 � = � ⋅ 2 (eq. S1.1) 964 965 And therefore, after �generations, the cell count �would be: 966 � = � ⋅ 2 (eq. S1.2) 967 968 When not all the cells are viable, for each generation, instead of simply doubling the whole 969 population, only a portion of the cells (denoted as� < 1) replicates. In this case: 970 971 � = � ⋅ (1 + �) (eq. S1.3) 972 973 Here we can see that (eq. 3.1) is actually a special case of (eq. 3.3), when � = 1, 974 975 � = � ⋅ (1 + 1) = � ⋅ 2 976 977 When � < 1, after � generations: 978 � = � ⋅ (1 + �) (eq. S1.4) 979 980 Denoting the total time to get � generations as �, which means the number of generations per unit 981 time, also known as the per-cell division rate, is � = �/�. Rewriting (eq. 3.4), we have: 982 ⋅ ⋅!()⋅ 983 � = � ⋅ (1 + �) = � ⋅ 2 (eq. S1.5) 984 985 According to (eq. 3.5), a cell population of �viability and � per cell division rate, the population 986 division rate is:

987 � = � ⋅ ���(1 + �) (eq. S1.6) 988 989 Henson et al. 2018 show that the optimal population division rate of SAR11 LD12 is 990 � = 0.5 ��� , assuming the viability of LD12 is � = 0.15, by plugging in these two 991 numbers to (eq. S1.6), we can get the per-cell division rate of LD12 is:

992 � = �����������/���2(1 + �) 993 = 0.5 ��� /���(1 + 0.15) 994 = 0.5 ���/0.20163 995 = 2.48 ��� 996 If assuming � = 0.55, � = 0.79 ���−1. Therefore, given the range of the viability of LD12 is 997 between 15%--55%, the corresponding per cell division rate is between 0.79 divisions to 2.48 998 divisions per day.

Page 26 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

999 1000 1001 Supplemental Table 1002 Supplemental Table S1 is a spreadsheet, Table_S1.xlsx. This includes MWH media recipes, 1003 ASV and OTU tables, taxonomic and relative abundance information for all isolates, NMDS 1004 information, biogeochemical data for the samples, and the BLAST results of isolate hits. 1005 Available at https://doi.org/10.6084/m9.figshare.12142113. 1006 1007 Supplemental Figures 1008 Figure S1. 16S rRNA gene phylogeny of LSUCC isolates in the Phylum Actinobacteria. Scale 1009 bar represents 0.01 changes per position. Values at internal nodes indicate bootstrap values (n = 1010 1000). 1011 1012 Figure S2. 16S rRNA gene phylogeny of LSUCC isolates in the Class Alphaproteobacteria. 1013 Scale bar represents 0.01 changes per position. Values at internal nodes indicate bootstrap values 1014 (n = 1000). 1015 1016 Figure S3. 16S rRNA gene phylogeny of LSUCC isolates in the Phylum Bacteroidetes. Scale 1017 bar represents 0.01 changes per position. Values at internal nodes indicate bootstrap values (n = 1018 1000). 1019 1020 Figure S4. 16S rRNA gene phylogeny of LSUCC isolates in the Class Betaproteobacteria. Scale 1021 bar represents 0.01 changes per position. Values at internal nodes indicate bootstrap values (n = 1022 1000). 1023 1024 Figure S5. 16S rRNA gene phylogeny of LSUCC isolates in the Class Gammaproteobacteria. 1025 Scale bar represents 0.01 changes per position. Values at internal nodes indicate bootstrap values 1026 (n = 1000). 1027 1028 These trees have been attached at the end of the manuscript for greater visibility 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044

Page 27 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

1045 1046 1047 1048 A B LSUCC LSUCC LSUCC LSUCC LSUCC LSUCC LSUCC LSUCC LSUCC LSUCC LSUCC LSUCC LSUCC LSUCC LSUCC LSUCC LSUCC LSUCC LSUCC LSUCC LSUCC LSUCC LSUCC LSUCC LSUCC LSUCC LSUCC LSUCC LSUCC LSUCC 0.20 0.15

Phylum 0.15 Acidobacteria Actinobacteria 0.10 Bacteroidetes Cyanobacteria Deferribacteres 0.10 Proteobacteria

Abundance Thaumarchaeota

e 0.05

0.05 Salinity Relativ 25 20 15 10 0.00 0.00 5 e a 8 sp. sp. sp. sp. sp. sp. sp. sp. sp. eri type type type type type type type type type r r type r lad s s type a s IIIa.2 a IIIa.1 IIIa.1 IIIa.1 IIIa.2 clade clade clade clade clade clade clade clade clade clade clade clade clade clade clade LD12 clade clade clade clade clade clade clade clade clade LD12 clade clade clade clade clade clade clade clade clade clade clade clade clade clade clade m c ct group group group group group clade o 9 9 9 o 9 1 o 1 o I I I I I 1 I I I e e e e l e s s 1 6 6 3 3 4 6 8 3 1 1 6 3 1 3 6 1 6 8 5 6 4 2 6 6 8 V V V V V e e e e V V Aquiluna Aquiluna cI soi s s eoba a UniP - t ; 5 Actinomarina marin marin marin marin Acinobacteria Actinobacteria Actinobacteria subclad Proteobacteria subclad subclad subclad Proteobacteria Proteobacteria Proteobacteria . s subclad OTU33;SV1- 5 5 OTU59;PeM15 5 5 . . OTU1;ac U63 . . . . apro Pelagibacte Pelagibacte Pelagibacte ASV7591;SV1-8 1 1 1 1 t 1 OTU41;ac OTU20;ac Methylophilaceae s s e s Acidimicrobiaceae Acidimicrobiaceae OTU37;acI OTU27;acI Flavobacteriaceae . Rhodobacteraceae Betaproteobacteria Betaproteobacteria Rhodobacteraceae OT OTU12;acI ASV1821;ac ASV7420;ac ASV7735;ac Betaproteobacteria ASV7613;ac ASV7609;ac Rhodobacteraceae B . . OTU8;HIMB5 . OTU9;SAR1 OTU7;SAR1 OTU11;OM4 ...... ASV6590;acI ASV7657;acI ASV5060;acI OTU14;BAL5 OTU84;BAL5 . OTU10;HIMB1 OTU16;SAR8 OTU44;SAR8 OTU50;OM25 c ASV473;SAR8 ASV6691;OM4 ASV6693;OM4 ASV1389;OM4 ASV7241;OM4 OTU21;SAR11 OTU35;SAR32 OTU67;SAR11 Gammaproteobacteria Gammaproteobacteria ASV7731;BAL5 ASV3702;HIMB5 ASV5716;HIMB5 ASV5713;HIMB5 ASV6293;SAR1 ASV1142;SAR8 ASV7760;SAR1 ASV7629;SAR1 . OTU184;SAR11 . un ASV7223;SAR11 ASV7490;SAR32 ASV5567;SAR11 ; OTU15;MWH-UniP OTU31;Owenweeksi OTU23;NS OTU47;NS OTU308;LSUCC010 OTU127;OPB3 ASV2234;MWH OTU19;OM60/NOR ASV7146;MWH-UniP OTU53;Pseudospirillu ASV7027;MWH-UniP ASV2602;Owenweeksi ASV7518;LSUCC010 OTU111;unc ASV5375;NS OTU153;unc ASV5387;NS OTU3;Synechococcu OTU6;Synechococcu ASV4880;unc OTU18;Candidatu OTU147;unc OTU257;unc OTU537;unc OTU5;SAR1 U36 ASV6017;Synechococcu ASV7713;Synechococcu ASV3595;unc OTU48;Pseudoaltermona OTU49;SAR1 ASV7707;Candidatu OTU22;unc OTU74;unc OTU40;unc OTU66;unc OT OTU4;Candidatus_Actinomarina ASV4667;SAR1 ASV7245;unc ASV7108;SAR1 ASV7471;SAR1 ASV7214;unc ASV7530;unc ASV5003;unc ASV6466;unc ASV5073;unc OTU2;Candidatu OTU62;unc ASV7150;Candidatu ASV3353;unc 1049 ASV7779;Candidatu ASV7641;Candidatu 1050 1051 1052 1053 Figure S6. Rank abundance plot of the top 50 most abundant (A) OTUs and (B) ASVs across all 1054 sites. The boxes indicate the interquartile range (IQR) of the data, with vertical lines indicating 1055 the upper and lower extremes according to 1.5 x IQR. Horizontal lines within each box indicate 1056 the median. The underlying data points are each individual OTU’s sample relative abundances. 1057 The shade of the dot represents the salinity, while the color of the box is the Phylum of the OTU. 1058 OTUs with cultured representatives from the LSU culture collection are labeled with one 1059 LSUCC isolate. 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080

Page 28 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

1081 1082 1083 1084 1085 1086 acIV clade Altererythrobacter sp. BAL58 clade Bordetella sp. HIMB11 type HIMB59 type 0.0025 0.05 0.020 0.04 0.04 0.009 0.02 0.0020 0.015 0.03 0.03 0.006 0.0015 0.010 0.02 0.01 0.0010 0.02 0.003 0.01 0.005 0.0005 0.01 0.000 0.000 0.00 0.0000 0.00 0.00 sp. Limnohabitans sp. LSUCC0101 type Marinobacterium sp. MWH-UniPo type OM182 clade 0.025 0.012 0.020 0.00075 0.020 0.0075 0.015 0.02 0.008 0.00050 0.015 0.0050 0.010 0.010 0.004 0.00025 0.005 0.0025 0.01 0.005 0.000 0.000 0.00000 0.000 0.0000 0.00 OM241 clade OM252 clade OM43 clade OM60/NOR5 clade Other Roseobacter clades Polynucleobacter sp. 0.020 0.08 0.012 0.06 0.010 0.015 0.02 0.015 0.008 OTU type 0.010 0.010 0.04 type1 0.005 0.01 0.004 0.005 0.005 0.02 type2 0.000 0.00 0.000 0.00 0.000 0.000 type3 Pseudohongiella sp. SAR11 LD12 SAR11 subclade IIIa1 SAR11 subclade IIIa2 SAR116 clade SAR92 clade 0.15 0.012 0.03 Cultivated 0.003 0.006

Relative Abundance 0.009 0.10 0.10 0.02 0.002 0.004 0.006 0.001 0.05 0.05 0.01 0.003 0.002 0.000 0.00 0.00 0.000 0.00 0.000 unk. Bradyrhizobiaceae unk. Burkholderiaceae unk. Burkholderiales unk. Flavobacteria unk. Methylophilales unk. 0.008 0.020 0.004 0.006 0.006 0.002 0.002 0.015 0.003 0.004 0.004 0.010 0.002 0.001 0.001 0.001 0.002 0.005 0.002 0.000 0.000 0.000 0.000 0.000 0.000 unk. Rhodobacteraceae (3) 0 10 20 0 10 20 0 10 20 0 10 20 0 10 20

0.0015 0.0010 0.0005 0.0000 0 10 20 1087 Salinity 1088 1089 1090 Figure S7. Relative abundance of cultivated OTUs according to site salinity. The color of the 1091 line represents the different OTUs classified within the LSUCC group. Non-linear regressions 1092 have been added for reference. 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103

Page 29 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

1104 1105 1106 1107 acIV subclade A Altererythrobacter sp. BAL58 clade Bordetella sp. HIMB11 type HIMB59 type 0.015 0.025 0.003 0.010 0.020 0.010 0.010 0.02 0.002 0.015 0.005 0.005 0.010 0.01 0.005 0.001 0.005 0.000 0.000 0.000 0.000 0.000 0.00 Limnohabitans sp. LSUCC0101-type Marinobacterium sp. MWH-UniPo clade OM182 clade OM241 clade 0.005 0.004 0.009 0.010 0.0050 0.02 0.02 0.003 0.006 0.0025 0.002 0.005 0.01 0.01 ASV Type 0.001 0.003 0.000 0.0000 0.000 type1 0.00 0.000 0.00 type2 OM252 clade OM43 clade OM60/NOR5 clade Other Roseobacter clades Polynucleobacter sp. Pseudoalteromonas sp. 0.020 0.05 type3 0.010 0.015 0.04 0.004 0.008 0.002 type4 0.03 0.010 0.005 type5 0.02 0.002 0.004 0.001 0.005 0.01 type6 0.000 0.000 0.000 0.00 0.000 0.000 type7 Pseudohongiella sp. Rhodobacter sp. SAR11 LD12 SAR11 subclade IIIa1 SAR11 subclade IIIa2 SAR116 clade Relative Abundance 0.00075 0.15 type8 0.006 0.003 0.09 0.006 type9 0.00050 0.10 0.004 0.002 0.06 0.004 type10 0.00025 0.05 type11 0.001 0.03 0.002 0.002 0.00000 0.000 0.00 0.00 0.000 0.000 Cultivated SAR92 clade unk. Bradyrhizobiaceae unk. Burkholderiales unk. unk. Flavobacteriaceae unk. Oxalobacteraceae 0.0012 0.020 0.003 0.0015 0.003 0.0009 0.015 0.004 0.002 0.0010 0.002 0.0006 0.010 0.002 0.001 0.0005 0.0003 0.005 0.001

0.000 0.0000 0.000 0.0000 0.000 0.000 unk. Rhodobacteraceae (2) unk. Rhodobacteraceae (3) 0 10 20 0 10 20 0 10 20 0 10 20 0.0015 0.0020 0.0010 0.0015

0.0005 0.0010 0.0005 0.0000 0.0000 0 10 20 0 10 20 1108 Salinity 1109 1110 1111 Figure S8. Relative abundance of ASVs with a cultured LSUCC representative according to site 1112 salinity. The color of the line represents the different ASVs classified within the LSUCC group. 1113 Non-linear regressions have been added for reference. Arrows represent sites where isolates were 1114 cultivated. 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126

Page 30 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

1127 1128 1129 1130 1131 1 1.96 3 5 inoculum 1.27 2 4

30.0

10.0

3.0

1.0

0.3 Minimum relative abundance where zero positive wells is significant (%) wells positive where zero abundance relative Minimum 100 1000 10000 1132 Number of inoculated wells per experiment 1133 1134 1135 1136 Figure S9. Sensitivity testing of the viability model to determine minimum numbers of wells 1137 required to overcome false negatives associated with differing levels of taxon relative 1138 abundance. Different inoculum densities (λ) are depicted with colors. The dotted line indicates 1139 460 wells which corresponded to the experiments in this study. 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151

Page 31 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

1152 1153 1154 1155 1156 1157 1158 1159

1160 1161 1162 1163 1164 Figure S10. Logarithmic growth with different percentages of cell viability. Under different cell 1165 viabilities, such as 100%, 55% and 15% shown in (A), the culture can have logarithmic growth 1166 in the population size if we assume no cell death. It takes around 25 generations to have 10fold 1167 expansion of population size when all cells are active (100% viable), whereas when 15% of the 1168 population is actively duplicating, it takes around 115 generations to reach the same population 1169 size expansion. B) For the SAR11 LD12 strain LSUCC0530, the previously measured population 1170 doubling rate was 0.5/day (29). If all the cells are active (100% viability), each cell within the 1171 population will double 0.5 times per day (Fit 1 in B). However, with 15% viability, the actual 1172 active cell doubling rate is 2.48 doublings day-1 (Fit 3 in B). 1173 1174 1175

Page 32 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

1176 References

1177 1. Metcalf WW, Griffin BM, Cicchillo RM, Gao J, Janga SC, Cooke HA, Circello BT, Evans 1178 BS, Martens-Habbena W, Stahl DA, van der Donk WA. 2012. Synthesis of 1179 methylphosphonic acid by marine microbes: a source for methane in the aerobic ocean. 1180 Science 337:1104–1107.

1181 2. Carini P, White AE, Campbell EO, Giovannoni SJ. 2014. Methane production by 1182 phosphate-starved SAR11 chemoheterotrophic marine bacteria. Nat Commun 5:4346.

1183 3. Karl DM, Beversdorf L, Björkman KM, Church MJ, Martinez A, Delong EF. 2008. Aerobic 1184 production of methane in the sea. Nat Geosci 1:473–478.

1185 4. Steindler L, Schwalbach MS, Smith DP, Chan F, Giovannoni SJ. 2011. Energy starved 1186 Candidatus Pelagibacter ubique substitutes light-mediated ATP production for endogenous 1187 carbon respiration. PLoS One 6:e19725.

1188 5. Gómez-Consarnau L, González JM, Coll-Lladó M, Gourdon P, Pascher T, Neutze R, 1189 Pedrós-Alió C, Pinhassi J. 2007. Light stimulates growth of proteorhodopsin-containing 1190 marine Flavobacteria. Nature 445:210–213.

1191 6. Daims H, Lebedeva EV, Pjevac P, Han P, Herbold C, Albertsen M, Jehmlich N, Palatinszky 1192 M, Vierheilig J, Bulaev A, Kirkegaard RH, von Bergen M, Rattei T, Bendinger B, Nielsen 1193 PH, Wagner M. 2015. Complete nitrification by Nitrospira bacteria. Nature 528:504–509.

1194 7. Lam KS. 2006. Discovery of novel metabolites from marine actinomycetes. Curr Opin 1195 Microbiol 9:245–251.

1196 8. Jensen PR, Williams PG, Oh D-C, Zeigler L, Fenical W. 2007. Species-specific secondary 1197 metabolite production in marine actinomycetes of the genus Salinispora. Appl Environ 1198 Microbiol 73:1146–1152.

1199 9. Nett M, Erol Ö, Kehraus S, Köck M, Krick A, Eguereva E, Neu E, König GM. 2006. 1200 Siphonazole, an Unusual Metabolite from Herpetosiphon sp. Angew Chem Int Ed 45:3863– 1201 3867.

1202 10. Ling LL, Schneider T, Peoples AJ, Spoering AL, Engels I, Conlon BP, Mueller A, 1203 Schäberle TF, Hughes DE, Epstein S, Jones M, Lazarides L, Steadman VA, Cohen DR, 1204 Felix CR, Fetterman KA, Millett WP, Nitti AG, Zullo AM, Chen C, Lewis K. 2015. A new 1205 antibiotic kills pathogens without detectable resistance. Nature 517:455–459.

1206 11. Lloyd KG, Steen AD, Ladau J, Yin J, Crosby L. 2018. Phylogenetically Novel Uncultured 1207 Microbial Cells Dominate Earth Microbiomes. mSystems 3:e00055–18.

1208 12. Hug LA. 2018. Sizing Up the Uncultured Microbial Majority. mSystems 3:e00185–18.

1209 13. Steen AD, Crits-Christoph A, Carini P, DeAngelis KM, Fierer N, Lloyd KG, Thrash JC. 1210 2019. High proportions of bacteria and archaea across most biomes remain uncultured.

Page 33 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

1211 ISME J 13:3126–3130.

1212 14. Overmann J, Abt B, Sikorski J. 2017. Present and Future of Culturing Bacteria. Annu Rev 1213 Microbiol 71:711–730.

1214 15. Rappé MS. 2013. Stabilizing the foundation of the house that ’omics builds: the evolving 1215 value of cultured isolates to marine microbiology. Curr Opin Microbiol 16:618–624.

1216 16. Carini P. 2019. A “Cultural” Renaissance: Genomics Breathes New Life into an Old Craft. 1217 mSystems 4:e00092–19.

1218 17. Staley JT, Konopka A. 1985. Measurement of in situ activities of nonphotosynthetic 1219 microorganisms in aquatic and terrestrial habitats. Annu Rev Microbiol 39:321–346.

1220 18. Amann RI, Ludwig W, Schleifer KH. 1995. Phylogenetic identification and in situ detection 1221 of individual microbial cells without cultivation. Microbiol Rev 59:143–169.

1222 19. Hahn MW, Koll U, Schmidt J. 2019. Isolation and Cultivation of Bacteria, p. 313–351. In 1223 Hurst, CJ (ed.), The Structure and Function of Aquatic Microbial Communities. Springer 1224 International Publishing, Cham.

1225 20. Kaeberlein T, Lewis K, Epstein SS. 2002. Isolating “uncultivable” microorganisms in pure 1226 culture in a simulated natural environment. Science 296:1127–1129.

1227 21. Zengler K, Toledo G, Rappe M, Elkins J, Mathur EJ, Short JM, Keller M. 2002. Cultivating 1228 the uncultured. Proc Natl Acad Sci U S A 99:15681–15686.

1229 22. Steinert G, Whitfield S, Taylor MW, Thoms C, Schupp PJ. 2014. Application of Diffusion 1230 Growth Chambers for the Cultivation of Marine Sponge-Associated Bacteria. Mar 1231 Biotechnol 16:594–603.

1232 23. Nichols D, Cahoon N, Trakhtenberg EM, Pham L, Mehta A, Belanger A, Kanigan T, Lewis 1233 K, Epstein SS. 2010. Use of Ichip for High-Throughput In Situ Cultivation of 1234 “Uncultivable” Microbial Species. Appl Environ Microbiol 76:2445–2450.

1235 24. Tandogan N, Abadian PN, Epstein S, Aoi Y, Goluch ED. 2014. Isolation of microorganisms 1236 using sub-micrometer constrictions. PLoS One 9:e101429.

1237 25. Hahn MW, Stadler P, Wu QL, Pöckl M. 2004. The filtration–acclimatization method for 1238 isolation of an important fraction of the not readily cultivable bacteria. J Microbiol Methods 1239 57:379–390.

1240 26. Connon SA, Giovannoni SJ. 2002. High-throughput methods for culturing microorganisms 1241 in very-low-nutrient media yield diverse new marine isolates. Appl Environ Microbiol 1242 68:3878–3885.

1243 27. Yang SJ, Kang I, Cho JC. 2016. Expansion of Cultured Bacterial Diversity by Large-Scale 1244 Dilution-to-Extinction Culturing from a Single Seawater Sample. Microb Ecol 71:29–43.

Page 34 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

1245 28. Stingl U, Tripp HJ, Giovannoni SJ. 2007. Improvements of high-throughput culturing 1246 yielded novel SAR11 strains and other abundant marine bacteria from the Oregon coast and 1247 the Bermuda Atlantic Time Series study site. ISME J 1:361–371.

1248 29. Henson MW, Lanclos VC, Faircloth BC, Thrash JC. 2018. Cultivation and genomics of the 1249 first freshwater SAR11 (LD12) isolate. ISME J 12:1846–1860.

1250 30. Hahnke RL, Bennke CM, Fuchs BM, Mann AJ, Rhiel E, Teeling H, Amann R, Harder J. 1251 2015. Dilution cultivation of marine heterotrophic bacteria abundant after a spring 1252 bloom in the North Sea. Environ Microbiol 17:3515–3526.

1253 31. Sosa OA, Gifford SM, Repeta DJ, DeLong EF. 2015. High molecular weight dissolved 1254 organic matter enrichment selects for methylotrophs in dilution to extinction cultures. ISME 1255 J 9:2725–2739.

1256 32. Schut F, de Vries EJ, Gottschal JC, Robertson BR, Harder W, Prins RA, Button DK. 1993. 1257 Isolation of Typical Marine Bacteria by Dilution Culture: Growth, Maintenance, and 1258 Characteristics of Isolates under Laboratory Conditions. Appl Environ Microbiol 59:2150– 1259 2160.

1260 33. Button DK, Schut F, Quang P, Martin R, Robertson BR. 1993. Viability and isolation of 1261 marine bacteria by dilution culture: theory, procedures, and initial results. Appl Environ 1262 Microbiol 59:881–891.

1263 34. Song J, Oh HM, Cho JC. 2009. Improved culturability of SAR11 strains in dilution-to- 1264 extinction culturing from the East Sea, West Pacific Ocean. FEMS Microbiol Lett 295:141– 1265 147.

1266 35. Henson MW, Pitre DM, Weckhorst JL, Lanclos VC, Webber AT, Thrash JC. 2016. 1267 Artificial Seawater Media Facilitate Cultivating Members of the Microbial Majority from 1268 the Gulf of Mexico. mSphere 1:e00028–16.

1269 36. Rappé MS, Connon SA, Vergin KL, Giovannoni SJ. 2002. Cultivation of the ubiquitous 1270 SAR11 marine bacterioplankton clade. Nature 418:630–633.

1271 37. Marshall KT, Morris RM. 2013. Isolation of an aerobic sulfur oxidizer from the 1272 SUP05/Arctic96BD-19 clade. ISME J 7:452–455.

1273 38. Shah V, Chang BX, Morris RM. 2017. Cultivation of a chemoautotroph from the SUP05 1274 clade of marine bacteria that produces nitrite and consumes ammonium. ISME J 11:263– 1275 271.

1276 39. Spietz RL, Lundeen RA, Zhao X, Nicastro D, Ingalls AE, Morris RM. 2019. Heterotrophic 1277 carbon metabolism and energy acquisition in Candidatus Thioglobus singularis strain PS1, a 1278 member of the SUP05 clade of marine Gammaproteobacteria. Environ Microbiol 21:2391– 1279 2401.

1280 40. Huggett MJ, Hayakawa DH, Rappé MS. 2012. Genome sequence of strain HIMB624, a

Page 35 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

1281 cultured representative from the OM43 clade of marine Betaproteobacteria. Stand Genomic 1282 Sci 6:11–20.

1283 41. Giovannoni SJ, Hayakawa DH, Tripp HJ, Stingl U, Givan SA, Cho J-C, Oh H-M, Kitner 1284 JB, Vergin KL, Rappé MS. 2008. The small genome of an abundant coastal ocean 1285 methylotroph. Environ Microbiol 10:1771–1782.

1286 42. Durham BP, Grote J, Whittaker KA, Bender SJ, Luo H, Grim SL, Brown JM, Casey JR, 1287 Dron A, Florez-Leiva L, Others. 2014. Draft genome sequence of marine 1288 alphaproteobacterial strain HIMB11, the first cultivated representative of a unique lineage 1289 within the Roseobacter clade possessing an unusually small genome. Stand Genomic Sci 1290 9:632–645.

1291 43. Cho JC, Giovannoni SJ. 2004. Cultivation and Growth Characteristics of a Diverse Group 1292 of Oligotrophic Marine Gammaproteobacteria. Appl Environ Microbiol 70:432–440.

1293 44. Kim S, Kang I, Seo J-H, Cho J-C. 2019. Culturing the ubiquitous freshwater actinobacterial 1294 acI lineage by supplying a biochemical “helper”. ISME J 13:2252–2263.

1295 45. Tripp HJ. 2013. The unique metabolism of SAR11 aquatic bacteria. J Microbiol 51:147– 1296 153.

1297 46. Nichols D, Lewis K, Orjala J, Mo S, Ortenberg R, O’Connor P, Zhao C, Vouros P, 1298 Kaeberlein T, Epstein SS. 2008. Short Peptide Induces an “Uncultivable” Microorganism 1299 To Grow In Vitro. Appl Environ Microbiol 74:4889–4897.

1300 47. Stewart EJ. 2012. Growing unculturable bacteria. J Bacteriol 194:4151–4160.

1301 48. Kell DB, Young M. 2000. Bacterial dormancy and culturability: the role of autocrine 1302 growth factors. Curr Opin Microbiol 3:238–243.

1303 49. Carini P, Steindler L, Beszteri S, Giovannoni SJ. 2013. Nutrient requirements for growth of 1304 the extreme “Candidatus Pelagibacter ubique” HTCC1062 on a defined medium. 1305 ISME J 7:592–602.

1306 50. Epstein SS. 2013. The phenomenon of microbial uncultivability. Curr Opin Microbiol 1307 16:636–642.

1308 51. Shah D, Zhang Z, Khodursky A, Kaldalu N, Kurg K, Lewis K. 2006. Persisters: a distinct 1309 physiological state of E. coli. BMC Microbiol 6:53.

1310 52. Kell D, Potgieter M, Pretorius E. 2015. Individuality, phenotypic differentiation, dormancy 1311 and “persistence” in culturable bacterial systems: commonalities shared by environmental, 1312 laboratory, and clinical microbiology. F1000Res 4:179.

1313 53. Lennon JT, Jones SE. 2011. Microbial seed banks: the ecological and evolutionary 1314 implications of dormancy. Nat Rev Microbiol 9:119–130.

Page 36 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

1315 54. Grassi L, Di Luca M, Maisetta G, Rinaldi AC, Esin S, Trampuz A, Batoni G. 2017. 1316 Generation of persister cells of and Staphylococcus aureus by 1317 chemical treatment and evaluation of their susceptibility to membrane-targeting agents. 1318 Front Microbiol 8:1917.

1319 55. Bergkessel M, Basta DW, Newman DK. 2016. The physiology of growth arrest: uniting 1320 molecular and environmental microbiology. Nat Rev Microbiol 14:549–562.

1321 56. Kussell E, Leibler S. 2005. Phenotypic diversity, population growth, and information in 1322 fluctuating environments. Science 309:2075–2078.

1323 57. Kurm V, van der Putten WH, Gera Hol WH. 2019. Cultivation-success of rare soil bacteria 1324 is not influenced by incubation time and growth medium. PLoS One 14:e0210073.

1325 58. D’Onofrio A, Crawford JM, Stewart EJ, Witt K, Gavrish E, Epstein S, Clardy J, Lewis K. 1326 2010. Siderophores from neighboring organisms promote the growth of uncultured bacteria. 1327 Chem Biol 17:254–264.

1328 59. Thrash JC. 2019. Culturing the Uncultured: Risk versus Reward. mSystems 4:e00130–19.

1329 60. Thrash JC, Weckhorst JL, Pitre DM. 2017. Cultivating Fastidious Microbes, p. 57–78. In 1330 McGenity, TJ, Timmis, KN, Nogales, B (eds.), and Lipid Microbiology 1331 Protocols. Springer Berlin Heidelberg, Berlin, Heidelberg.

1332 61. Vila-Costa M, Simó R, Harada H, Gasol JM, Slezak D, Kiene RP. 2006. 1333 Dimethylsulfoniopropionate uptake by marine phytoplankton. Science 314:652–654.

1334 62. Mou X, Hodson RE, Moran MA. 2007. Bacterioplankton assemblages transforming 1335 dissolved organic compounds in coastal seawater. Environ Microbiol 9:2025–2037.

1336 63. Stein LY. 2015. Microbiology: Cyanate fuels the . Nature 524:43–44.

1337 64. Repeta DJ, Ferrón S, Sosa OA, Johnson CG, Repeta LD, Acker M, Delong EF, Karl DM. 1338 2016. Marine methane paradox explained by bacterial degradation of dissolved organic 1339 matter. Nat Geosci 9:884–887.

1340 65. Curson ARJ, Liu J, Bermejo Martínez A, Green RT, Chan Y, Carrión O, Williams BT, 1341 Zhang S-H, Yang G-P, Bulman Page PC, Zhang X-H, Todd JD. 2017. 1342 Dimethylsulfoniopropionate biosynthesis in marine bacteria and identification of the key 1343 gene in this process. Nat Microbiol 2:17009.

1344 66. Widner B, Mulholland MR. 2017. Cyanate distribution and uptake in North Atlantic coastal 1345 waters. Limnol Oceanogr 62:2538–2549.

1346 67. Widner B, Mulholland MR, Mopper K. 2016. Distribution, Sources, and Sinks of Cyanate 1347 in the Coastal North Atlantic Ocean. Environ Sci Technol Lett 3:297–302.

1348 68. Henson MW, Hanssen J, Spooner G, Fleming P, Pukonen M, Stahr F, Thrash JC. 2018.

Page 37 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

1349 Nutrient dynamics and stream order influence microbial community patterns along a 2914 1350 kilometer transect of the Mississippi River. Limnol Oceanogr 63:1837–1855.

1351 69. Apprill A, McNally S, Parsons R, Weber L. 2015. Minor revision to V4 region SSU rRNA 1352 806R gene primer greatly increases detection of SAR11 bacterioplankton. Aquat Microb 1353 Ecol 75:129–137.

1354 70. Walters W, Hyde ER, Berg-Lyons D, Ackermann G, Humphrey G, Parada A, Gilbert JA, 1355 Jansson JK, Caporaso JG, Fuhrman JA, Apprill A, Knight R. 2016. Improved Bacterial 16S 1356 rRNA Gene (V4 and V4-5) and Fungal Internal Transcribed Spacer Marker Gene Primers 1357 for Microbial Community Surveys. mSystems 1:e00009–15.

1358 71. Schloss PD, Westcott SL, Ryabin T, Hall JR, Hartmann M, Hollister EB, Lesniewski R a., 1359 Oakley BB, Parks DH, Robinson CJ, Sahl JW, Stres B, Thallinger GG, Van Horn DJ, 1360 Weber CF. 2009. Introducing mothur: Open-source, platform-independent, community- 1361 supported software for describing and comparing microbial communities. Appl Environ 1362 Microbiol 75:7537–7541.

1363 72. Pruesse E, Quast C, Knittel K, Fuchs BM, Ludwig W, Peplies J, Glockner FO, Glöckner 1364 FO. 2007. SILVA: a comprehensive online resource for quality checked and aligned 1365 ribosomal RNA sequence data compatible with ARB. Nucleic Acids Res 35:7188–7196.

1366 73. Quast C, Pruesse E, Yilmaz P, Gerken J, Schweer T, Yarza P, Peplies J, Glöckner FO. 1367 2013. The SILVA ribosomal RNA gene database project: Improved data processing and 1368 web-based tools. Nucleic Acids Res 41:590–596.

1369 74. Eren AM, Borisy GG, Huse SM, Mark Welch JL. 2014. Oligotyping analysis of the human 1370 oral microbiome. Proc Natl Acad Sci U S A 111:E2875–84.

1371 75. Fujimoto M, Cavaletto J, Liebig JR, McCarthy A, Vanderploeg HA, Denef VJ. 2016. 1372 Spatiotemporal distribution of bacterioplankton functional groups along a freshwater 1373 estuary to pelagic gradient in Lake Michigan. J Great Lakes Res 42:1036–1048.

1374 76. Cole JR, Wang Q, Fish JA, Chai B, McGarrell DM, Sun Y, Brown CT, Porras-Alfaro A, 1375 Kuske CR, Tiedje JM. 2014. Ribosomal Database Project: data and tools for high 1376 throughput rRNA analysis. Nucleic Acids Res 42:D633–42.

1377 77. Team RC. 2017. R Core Team (2017). R: A language and environment for statistical 1378 computing. R Found Stat Comput Vienna, Austria URL http://www R-project org/ , page R 1379 Foundation for Statistical Computing.

1380 78. McMurdie PJ, Holmes S. 2013. phyloseq: An R Package for Reproducible Interactive 1381 Analysis and Graphics of Microbiome Census Data. PLoS One 8:e61217.

1382 79. Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL. 2009. 1383 BLAST+: architecture and applications. BMC Bioinformatics 10:421.

1384 80. Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high

Page 38 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

1385 throughput. Nucleic Acids Res 32:1792–1797.

1386 81. Capella-Gutiérrez S, Silla-Martínez JM, Gabaldón T. 2009. trimAl: a tool for automated 1387 alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25:1972–1973.

1388 82. Nguyen L-T, Schmidt HA, von Haeseler A, Minh BQ. 2015. IQ-TREE: a fast and effective 1389 stochastic algorithm for estimating maximum-likelihood phylogenies. Mol Biol Evol 1390 32:268–274.

1391 83. Hoang DT, Chernomor O, von Haeseler A, Minh BQ, Vinh LS. 2018. UFBoot2: Improving 1392 the Ultrafast Bootstrap Approximation. Mol Biol Evol 35:518–522.

1393 84. Junier T, Zdobnov EM. 2010. The Newick utilities: high-throughput 1394 processing in the UNIX shell. Bioinformatics 26:1669–1670.

1395 85. Han MV, Zmasek CM. 2009. phyloXML: XML for evolutionary biology and comparative 1396 genomics. BMC Bioinformatics 10:356.

1397 86. Stingl U, Cho JC, Foo W, Vergin KL, Lanoil B, Giovannoni SJ. 2008. Dilution-to- 1398 extinction culturing of psychrotolerant planktonic bacteria from permanently ice-covered 1399 lakes in the McMurdo Dry Valleys, . Microb Ecol 55:395–405.

1400 87. Page KA, Connon SA, Giovannoni SJ. 2004. Representative freshwater bacterioplankton 1401 isolated from Crater Lake, Oregon. Appl Environ Microbiol 70:6542–6550.

1402 88. Noell SE, Giovannoni SJ. 2019. SAR11 bacteria have a high affinity and multifunctional 1403 glycine betaine transporter. Environ Microbiol 21:2559–2575.

1404 89. Thume K, Gebser B, Chen L, Meyer N, Kieber DJ, Pohnert G. 2018. The metabolite 1405 dimethylsulfoxonium propionate extends the marine organosulfur cycle. Nature 563:412– 1406 415.

1407 90. Asher EC, Dacey JWH, Stukel M, Long MC, Tortell PD. 2017. Processes driving seasonal 1408 variability in DMS, DMSP, and DMSO concentrations and turnover in coastal Antarctic 1409 waters. Limnol Oceanogr 62:104–124.

1410 91. Mizuno CM, Rodriguez-Valera F, Ghai R. 2015. Genomes of planktonic Acidimicrobiales: 1411 widening horizons for marine Actinobacteria by metagenomics. MBio 6:e02083–14.

1412 92. Lee J, Kwon KK, Lim S-I, Song J, Choi AR, Yang S-H, Jung K-H, Lee J-H, Kang SG, Oh 1413 H-M, Cho JC. 2019. Isolation, cultivation, and genome analysis of proteorhodopsin- 1414 containing SAR116-clade strain Candidatus Puniceispirillum marinum IMCC1322. J 1415 Microbiol 57:676–687.

1416 93. Kitzinger K, Padilla CC, Marchant HK, Hach PF, Herbold CW, Kidane AT, Könneke M, 1417 Littmann S, Mooshammer M, Niggemann J, Petrov S, Richter A, Stewart FJ, Wagner M, 1418 Kuypers MMM, Bristow LA. 2019. Cyanate and urea are substrates for nitrification by 1419 Thaumarchaeota in the marine environment. Nat Microbiol 4:234–243.

Page 39 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

1420 94. Kamennaya NA, Post AF. 2011. Characterization of cyanate metabolism in marine 1421 and spp. Appl Environ Microbiol 77:291–301.

1422 95. Eren AM, Maignien L, Sul WJ, Murphy LG, Grim SL, Morrison HG, Sogin ML. 2013. 1423 Oligotyping: Differentiating between closely related microbial taxa using 16S rRNA gene 1424 data. Methods Ecol Evol 4:1111–1119.

1425 96. Needham DM, Fuhrman JA. 2016. Pronounced daily succession of phytoplankton, archaea 1426 and bacteria following a . Nat Microbiol 1:16005.

1427 97. Campbell BJ, Yu L, Heidelberg JF, Kirchman DL. 2011. Activity of abundant and rare 1428 bacteria in a coastal ocean. Proc Natl Acad Sci U S A 108:12776–12781.

1429 98. Oh HM, Kwon KK, Kang I, Kang SG, Lee JH, Kim SJ, Cho JC. 2010. Complete genome 1430 sequence of “Candidatus puniceispirillum marinum” IMCC1322, a representative of the 1431 SAR116 clade in the Alphaproteobacteria. J Bacteriol 192:3240–3241.

1432 99. Fegatella F, Lim J, Kjelleberg S, Cavicchioli R. 1998. Implications of rRNA operon copy 1433 number and ribosome content in the marine oligotrophic ultramicrobacterium 1434 Sphingomonas sp. strain RB2256. Appl Environ Microbiol 64:4433–4438.

1435 100. Acinas SG, Marcelino LA, Klepac-Ceraj V, Polz MF. 2004. Divergence and redundancy 1436 of 16S rRNA sequences in genomes with multiple rrn operons. J Bacteriol 186:2629–2635.

1437 101. Pedrós-Alió C. 2012. The rare bacterial biosphere. Ann Rev Mar Sci 4:449–466.

1438 102. Martiny AC. 2019. High proportions of bacteria are culturable across major biomes. 1439 ISME J 13:2125–2128.

1440 103. Parada AE, Needham DM, Fuhrman JA. 2015. Every base matters: assessing small 1441 subunit rRNA primers for marine microbiomes with mock communities, time series and 1442 global field samples. Environ Microbiol 18:1403–1414.

1443 104. Del Giorgio PA, Gasol JM. 2008. Physiological structure and single-cell activity in 1444 marine bacterioplankton. Microbial Ecology of the 2:243–298.

1445 105. Malmstrom RR, Kiene RP, Cottrell MT, Kirchman DL. 2004. Contribution of SAR11 1446 bacteria to dissolved dimethylsulfoniopropionate and amino acid uptake in the North 1447 Atlantic Ocean. Appl Environ Microbiol 70:4129–4135.

1448 106. Malmstrom RR, Cottrell MT, Elifantz H, Kirchman DL. 2005. production and 1449 assimilation of dissolved organic matter by SAR11 bacteria in the Northwest Atlantic 1450 Ocean. Appl Environ Microbiol 71:2979–2986.

1451 107. Piwosz K, Salcher MM, Zeder M, Ameryk A, Pernthaler J. 2013. Seasonal dynamics and 1452 activity of typical freshwater bacteria in brackish waters of the Gulf of Gdańsk. Limnol 1453 Oceanogr 58:817–826.

Page 40 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

1454 108. Laghdass M, Catala P, Caparros J, Oriol L, Lebaron P, Obernosterer I. 2012. High 1455 contribution of SAR11 to microbial activity in the north west . Microb 1456 Ecol 63:324–333.

1457 109. Salter I, Galand PE, Fagervold SK, Lebaron P, Obernosterer I, Oliver MJ, Suzuki MT, 1458 Tricoire C. 2015. Seasonal dynamics of active SAR11 in the oligotrophic 1459 Northwest Mediterranean Sea. ISME J 9:347–360.

1460 110. Leizeaga A, Estrany M, Forn I, Sebastián M. 2017. Using Click-Chemistry for 1461 Visualizing in Situ Changes of Translational Activity in Planktonic Marine Bacteria. Front 1462 Microbiol 8:2360.

1463 111. Kirchman DL. 2016. Growth Rates of Microbes in the Oceans. Ann Rev Mar Sci 8:285– 1464 309.

1465 112. Samo TJ, Smriga S, Malfatti F, Sherwood BP, Azam F. 2014. Broad distribution and high 1466 proportion of protein synthesis active marine bacteria revealed by click chemistry at the 1467 single cell level. Frontiers in Marine Science 1:48.

1468 113. Smriga S, Samo TJ, Malfatti F, Villareal J, Azam F. 2014. Individual cell DNA synthesis 1469 within natural marine bacterial assemblages as detected by “click”chemistry. Aquat Microb 1470 Ecol 72:269–280.

1471 114. Bradley JA, Amend JP, LaRowe DE. 2019. Survival of the fewest: Microbial dormancy 1472 and maintenance in marine sediments through deep time. Geobiology 17:43–59.

1473 115. Zhao Y, Temperton B, Thrash JC, Schwalbach MS, Vergin KL, Landry ZC, Ellisman M, 1474 Deerinck T, Sullivan MB, Giovannoni SJ. 2013. Abundant SAR11 viruses in the ocean. 1475 Nature 494:357–360.

1476 116. Martinez-Hernandez F, Fornas Ò, Lluesma Gomez M, Garcia-Heredia I, Maestre- 1477 Carballa L, López-Pérez M, Haro-Moreno JM, Rodriguez-Valera F, Martinez-Garcia M. 1478 2018. Single-cell genomics uncover Pelagibacter as the putative host of the extremely 1479 abundant uncultured 37-F6 viral population in the ocean. ISME J 13:232–236.

1480 117. Chen L-X, Zhao Y, McMahon KD, Mori JF, Jessen GL, Nelson TC, Warren LA, 1481 Banfield JF. 2019. Wide Distribution of Phage That Infect Freshwater SAR11 Bacteria. 1482 mSystems 4:e00410–19.

1483 118. Thingstad TF, Lignell R. 1997. Theoretical models for the control of bacterial growth 1484 rate, abundance, diversity and carbon demand. Aquat Microb Ecol 13:19–27.

1485 119. Våge S, Storesund JE, Thingstad TF. 2013. SAR11 viruses and defensive host strains. 1486 Nature 499:E3–4.

1487 120. Thingstad TF, Vage S, Storesund JE, Sandaa RA, Giske J. 2014. A theoretical analysis of 1488 how strain-specific viruses can control microbial species diversity. Proc Natl Acad Sci U S 1489 A 111:7813–7818.

Page 41 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

1490 121. Giovannoni S, Temperton B, Zhao Y. 2013. Giovannoni et al. reply. Nature 499:E4–E5.

1491 122. Alonso-Sáez L, Morán XAG, Clokie MR. 2018. Low activity of lytic pelagiphages in 1492 coastal marine waters. ISME J 12:2100–2102.

1493 123. Chapman-McQuiston E, Wu XL. 2008. Stochastic receptor expression allows sensitive 1494 bacteria to evade phage attack. Part I: experiments. Biophys J 94:4525–4536.

1495 124. Bertozzi Silva J, Storms Z, Sauvageau D. 2016. Host receptors for 1496 adsorption. FEMS Microbiol Lett 363:fnw002.

1497 125. Oliver JD. 2005. The viable but nonculturable state in bacteria. J Microbiol 43 Spec 1498 No:93–100.

1499 126. Dubnau D, Losick R. 2006. Bistability in bacteria. Mol Microbiol 61:564–572.

1500 127. Shkoporov AN, Khokhlova EV, Fitzgerald CB, Stockdale SR, Draper LA, Ross RP, Hill 1501 C. 2018. ΦCrAss001 represents the most abundant bacteriophage family in the human gut 1502 and infects intestinalis. Nat Commun 9:4781.

1503 128. Teira E, Martínez-García S, Lønborg C, Alvarez-Salgado XA. 2009. Growth rates of 1504 different phylogenetic bacterioplankton groups in a coastal system. Environ 1505 Microbiol Rep 1:545–554.

1506 129. Lankiewicz TS, Cottrell MT, Kirchman DL. 2016. Growth rates and rRNA content of 1507 four marine bacteria in pure cultures and in the Delaware estuary. ISME J 10:823–832.

1508 130. Ferrera I, Gasol JM, Sebastián M, Hojerová E, Koblízek M. 2011. Comparison of growth 1509 rates of aerobic anoxygenic phototrophic bacteria and other bacterioplankton groups in 1510 coastal Mediterranean waters. Appl Environ Microbiol 77:7451–7458.

1511 131. Carini P, Campbell EO, Morré J, Sañudo-Wilhelmy SA, Thrash JC, Bennett SE, 1512 Temperton B, Begley T, Giovannoni SJ. 2014. Discovery of a SAR11 growth requirement 1513 for thiamin’s pyrimidine precursor and its distribution in the Sargasso Sea. ISME J 8:1727– 1514 1738.

1515 132. Grant SR, Church MJ, Ferrón S, Laws EA, Rappé MS. 2019. Elemental Composition, 1516 Phosphorous Uptake, and Characteristics of Growth of a SAR11 Strain in Batch and 1517 Continuous Culture. mSystems 4:e00218–18.

1518 133. White AE, Giovannoni SJ, Zhao Y, Vergin K, Carlson CA. 2019. Elemental content and 1519 stoichiometry of SAR11 chemoheterotrophic marine bacteria. Limnology and 1520 Oceanography Letters 4:44–51.

1521 134. Smith DP, Cameron Thrash J, Nicora CD, Lipton MS, Burnum-Johnson KE, Carini P, 1522 Smith RD, Giovannoni SJ. 2013. Proteomic and Transcriptomic Analyses of “Candidatus 1523 Pelagibacter ubique” Describe the First PII-Independent Response to Nitrogen Limitation in 1524 a Free-Living Alphaproteobacterium. MBio 4:e00133–12.

Page 42 of 43 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

1525 135. van Vliet S, Dal Co A, Winkler AR, Spriewald S, Stecher B, Ackermann M. 2018. 1526 Spatially Correlated Gene Expression in Bacterial Groups: The Role of Lineage History, 1527 Spatial Gradients, and Cell-Cell Interactions. Cell Syst 6:496–507.e6.

1528

Page 43 of 43 gi 1434172709 gb MH671546 1 Terrabacter sp strain Actino-50 98 gi 260150789 gb GQ369088 1 Terrabacter sp Z0-YC6820 76 Figure S1 98 gi 322184102 gb JF198697 1 Uncultured bacterium clone ncd2318c07c1 gi 1434172666 gb MH671503 1 Terrabacter sp strain Actino-6 95 gi 1219688315 gb MF526112 1 Terracoccus sp strain MadaFrogSkinBac DB 2810 93 9377 gi 929046992 gb KR010185 1 Terrabacter terrae strain NSB-36 8295 7896gi 1219687689 gb MF525486 1 Terracoccus sp strain MadaFrogSkinBac DB 3694 95 96 gi 33309345 gb AF408945 1 Terrabacter sp Ellin103 96 gi 675399428 gb KJ878630 1 Terrabacter sp ES3-48 95 93 gi 984880284 emb LN876372 2 Terrabacter sp URHB0043 94 gi 984880215 emb LN876345 1 Terrabacter sp URHB0013 gi 1056940501 gb KX753012 1 Uncultured bacterium clone VLK 16 gi 411023858 gb JQ376619 2 Uncultured bacterium clone W031c9 11143 gi 636560307 ref NR 116367 1 Terrabacter aeriphilus strain 5414T-18 gi 109391678 gb DQ643663 1 Uncultured soil bacterium clone W4Ba03 gi 375244974 gb JQ625316 1 Uncultured bacterium clone OLH168 54 93 100 gi 381138672 gb JQ782930 1 Terrabacter sp 8-140 gi 166202141 gb EU262306 1 Uncultured bacterium clone 2005-MA-2-100207 gi 399762847 gb JX079240 1 Uncultured Terrabacter sp clone BG2-162 gi 226816208 dbj D84624 2 Phycicoccus sp S23430 92 gi 984880275 emb LN876406 1 Phycicoccus sp URHC0019 92 96gi 307593624 gb HM049681 1 Uncultured bacterium clone SZS-0 6 76 90 gi 359815377 emb FR853183 1 Uncultured bacterium 68 LSUCC0193 86 gi 38195153 gb AY360576 1 Uncultured actinobacterium clone H5Ba24 91 gi 636560657 ref NR 116717 1 Phycicoccus cremeus strain V2M29 Phycicococcus sp. 97 100 gi 9944265 emb AJ277696 1 Uncultured bacterium ARFS-29 partial gi 1341394009 ref NR 153681 1 Phycicoccus ginsengisoli strain DCY87 91 gi 356032913 gb GQ344406 2 Phycicoccus ochangensis strain B5a-b5 gi 1146076795 gb KY012263 1 Phycicoccus ginsenosidimutans strain AS4-17 99 gi 511194379 gb KC554448 1 Uncultured bacterium clone 8-586 100 gi 341925932 dbj AB649010 1 Phycicoccus aerophilus gi 327342252 gb JF417737 1 Uncultured bacterium clone QQSB18 96 gi 902940936 gb KP639140 1 Phycicoccus sp LZB055 72 81 99 gi 322184042 gb JF198637 1 Uncultured bacterium clone ncd2317b12c1 66 88 53 gi 511201697 gb KC606872 1 Uncultured bacterium clone CSbiofilm 18 040210 G04 68 gi 627529353 gb KJ473526 1 Uncultured Tetrasphaera sp clone BB32 46 gi 322160805 gb JF175399 1 Uncultured bacterium clone ncd2025g03c1 46 gi 1066599747 gb KX858544 1 Uncultured bacterium isolate DGGE gel band b1 93 gi 162137989 gb EU298817 1 Uncultured Tetrasphaera sp clone GASP-KB3W1 D12 100 gi 1124890878 gb KY386418 1 Knoellia sp strain R-68061 82 gi 1150392589 gb KY176950 1 Tetrasphaera sp strain S512 66 gi 1150392751 gb KY445722 1 Tetrasphaera sp strain TM-S120 68 gi 1150392586 gb KY176947 1 Tetrasphaera sp strain S405 gi 1434131666 gb MG594811 1 Janibacter hoylei strain O18 93 gi 928195899 gb KR054059 1 Janibacter cremeus strain HB45 93 gi 1189442627 gb KY775504 1 Janibacter anophelis strain IC B3 2 90 gi 918464473 gb KT382393 1 Bacterium BEL B20 24 79 gi 322211313 gb JF225908 1 Uncultured bacterium clone ncd2570c07c1 100 gi 322211930 gb JF226525 1 Uncultured bacterium clone ncd2581c01c1 10066 gi 643406902 gb KF086612 1 Uncultured bacterium clone ncm23c02c1 gi 1434172683 gb MH671520 1 Janibacter sp strain Actino-23 55 20 gi 27497666 gb AY167846 1 Janibacter limosus strain SAFR-043 74 98gi 595250671 gb KJ000756 1 Janibacter sp SCU-B24 gi 498905833 gb KC195789 1 Janibacter sp AB7 100 gi 1261504570 gb MG205908 1 Janibacter terrae strain NIOER183 25 99gi 498905834 gb KC195790 1 Janibacter sp AB9 98 83 gi 363991874 gb JQ014316 1 Janibacter sp LC420 gi 826022232 gb KP345928 1 Janibacter sp 3329 gi 221706541 gb FJ595986 1 Janibacter sp LI-80 19LSUCC0537 12gi 1287453529 dbj LC338047 1 Janibacter sp ASH66 29 Janibacter sp. 99gi 725096926 gb KM253202 1 Janibacter sp UR 2-02 gi 190364966 gb EU730944 1 Janibacter anophelis strain 221 24 gi 643428046 gb KF107716 1 Uncultured bacterium clone ncm61d02c1 100 gi 116871391 gb EF028236 1 Tetrasphaera remsis strain 3-M5-R-7 72 gi 636559739 ref NR 115799 1 Tetrasphaera remsis strain 3-M5-R-4 100 gi 238331493 gb GQ095316 1 Uncultured bacterium clone nbw437a10c1 gi 1219688047 gb MF525844 1 Intrasporangium calvum strain MadaFrogSkinBac DB 2536 100 gi 846386474 gb KR856422 1 Intrasporangium mesophilum strain SG66 gi 1185542091 emb LT631758 1 Micrococcus luteus LSUCC0042 17gi 330717791 gb JF792083 1 Micrococcus yunnanensis strain 17D 86 55gi 441090801 gb KC203087 1 Micrococcus sp WY22 55 53 gi 1388592975 gb MF093190 1 Micrococcus luteus strain B8 Micrococcus sp. 100 gi 493664540 emb HF952653 1 Micrococcus luteus gi 728852048 gb KJ544037 1 Bacterium AM0325 gi 826022277 gb KP345973 1 Micrococcus sp 3738 gi 1041571625 gb KX097063 1 Micrococcus sp A1S138 gi 194475415 gb EU861822 1 Uncultured soil bacterium clone A07 bac con 96 gi 8017974 emb AJ244799 1 Gram-positive bacteria SOGA22 gi 223034390 gb FJ354115 1 Uncultured bacterium clone L73 ML 217 53 gi 940373643 gb JX193447 2 Uncultured bacterium clone Q31020 100 100 gi 223030234 gb FJ349959 1 Uncultured bacterium clone 051011 T3S4 W T SDP 331 69 gi 482680808 gb KC836020 1 Uncultured bacterium clone C-101 70 gi 197359976 gb EU703237 1 Uncultured Actinomycetales bacterium clone XZNMC38 100 gi 223033479 gb FJ353204 1 Uncultured bacterium clone C7p3 ML 273 83 gi 223033973 gb FJ353698 1 Uncultured bacterium clone I9S18D ML 083 87 100gi 793958782 gb KP687038 1 Uncultured bacterium clone T6 0611 94 gi 260072812 gb GQ850555 1 Uncultured actinobacterium clone u101 100 gi 260072831 gb GQ850574 1 Uncultured actinobacterium clone d118 96 51 gi 197131256 gb EU930663 1 Uncultured actinobacterium clone PLF173 LSUCC0392 acSTL clade gi 312860470 gb HM856511 1 Uncultured Actinomycetales bacterium clone YL147 97 gi 317573447 gb HQ661252 1 Uncultured bacterium clone H-37 gi 1246764471 gb MG002170 1 Uncultured bacterium clone PS207 86 gi 193795783 gb EU852548 1 Uncultured actinobacterium clone SBC31 85 85 gi 4996392 dbj AB021332 1 Bacterium rJ14 75 gi 643403313 gb KF083030 1 Uncultured bacterium clone ncd985h01c1 54 88 gi 359327483 emb FQ660334 1 Uncultured bacterium 87 gi 24935355 gb AY145533 1 Actinomycetales bacterium GP-5 100 gi 24935356 gb AY145534 1 Actinobacterium GP-6 gi 294471361 gb GU591555 1 Uncultured bacterium clone M72 67 gi 217388856 gb FJ524959 1 Uncultured bacterium 051102 S1 W T SDP18 183 41 62 gi 388951535 gb JQ196204 1 Uncultured bacterium clone SeaWat O253 26 gi 61676995 gb AY948359 1 Uncultured bacterium clone sponge clone11 12 12gi 388954039 gb JQ198708 1 Uncultured bacterium clone SeaWat F743 27 2198gi 61676993 gb AY948357 1 Uncultured bacterium clone sponge clone8 1565 39 gi 1132313789 gb KX932758 1 Uncultured marine bacterium clone 70S3Bb05Y 29 gi 1132314281 gb KX933250 1 Uncultured marine bacterium clone 70SA1b05G 19 30 gi 224496639 gb FJ744948 1 Uncultured Actinomycetales bacterium clone SHWH night1 38 90 gi 672374989 gb KJ811951 1 Uncultured bacterium clone 0806N7 3 81 H06914 100 gi 1216866617 gb MF498123 1 Uncultured bacterium clone W1-60 gi 239775476 gb FJ999603 1 Uncultured Actinomycetales bacterium clone B61 72 gi 952074740 gb EU182063 2 Uncultured bacterium clone D15 7 SW G 42 63 gi 238481764 gb FJ937867 1 Uncultured actinobacterium clone MWLSB148 19 100 96 gi 239939247 gb GQ144861 1 Uncultured actinobacterium clone a-26 90 gi 891166350 dbj AB974135 1 Uncultured bacterium gi 388950905 gb JQ195574 1 Uncultured bacterium clone SeaWat G670 gi 388951813 gb JQ196482 1 Uncultured bacterium clone SeaWat O026 gi 388954849 gb JQ199518 1 Uncultured bacterium clone SeaWat 51611 gi 190693925 gb EU592353 1 Uncultured bacterium clone SSW95Ap 95 gi 239939248 gb GQ144862 1 Uncultured actinobacterium clone a-64 gi 224496491 gb FJ744800 1 Uncultured Actinomycetales bacterium clone SHWH night1 gi 224496588 gb FJ744897 1 Uncultured Actinomycetales bacterium clone SHWH night1 gi 224496641 gb FJ744950 1 Uncultured Actinomycetales bacterium clone SHWH night1 gi 674111465 gb KJ807908 1 Uncultured bacterium clone A85 gi 197131259 gb EU930666 1 Uncultured actinobacterium clone PLF190 gi 21326166 gb AF513101 1 Uncultured bacterium clone 7 100 gi 237987550 emb CU918001 1 Uncultured Actinobacteria bacterium gi 295880921 gb HM129855 1 Uncultured bacterium clone SINO700 43 gi 351693997 gb JN656811 1 Uncultured actinobacterium clone KWK6S 40 42gi 190705110 gb EU800185 1 Uncultured bacterium clone 2C228192 35 8886gi 190706293 gb EU801368 1 Uncultured bacterium clone 3C002637 93 gi 39979382 dbj AB154303 1 Uncultured bacterium 78 36 gi 284517620 gb GU305709 1 Uncultured bacterium clone MYY16 42 gi 190705082 gb EU800157 1 Uncultured bacterium clone 2C228150 94 93 97 gi 296172557 emb FN668338 2 Uncultured actinobacterium 95 gi 190706986 gb EU802061 1 Uncultured bacterium clone 3C003455 gi 190705672 gb EU800747 1 Uncultured bacterium clone 2C228934 gi 190706870 gb EU801945 1 Uncultured bacterium clone 3C003327 76 92 82LSUCC0889 acIV subclade A gi 254971670 gb GQ340276 1 Uncultured bacterium clone A08-159-BAC gi 351694108 gb JN656922 1 Uncultured actinobacterium clone KWK23S 36 gi 190706444 gb EU801519 1 Uncultured bacterium clone 3C002829 gi 37496727 emb AJ575505 1 Uncultured actinobacterium 31 gi 388542108 gb JQ941833 1 Uncultured bacterium clone 17 16gi 545346894 gb KF596601 1 Uncultured bacterium clone 261 357 7836gi 944382977 gb EU234298 2 Uncultured bacterium clone D34 gi 296172562 emb FN668343 2 Uncultured actinobacterium 20gi 944382987 gb EU234309 2 Uncultured bacterium clone D26 gi 190705287 gb EU800362 1 Uncultured bacterium clone 2C228424 gi 284517672 gb GU305761 1 Uncultured bacterium clone MYW12 87 100 gi 320579894 gb HQ827940 1 Uncultured bacterium clone E36 95 84 gi 312860373 gb HM856414 1 Uncultured Acidimicrobineae bacterium clone YL044 94 98 gi 190705017 gb EU800092 1 Uncultured bacterium clone 2C228070 100 84 76 gi 296172548 emb FN668330 2 Uncultured actinobacterium gi 312860360 gb HM856401 1 Uncultured Acidimicrobineae bacterium clone YL029 76 gi 190708568 gb EU803643 1 Uncultured bacterium clone 5C231243 66 gi 312860371 gb HM856412 1 Uncultured Acidimicrobineae bacterium clone YL042 gi 308209370 gb HM446118 1 Uncultured bacterium clone WT98 074 98 gi 440657943 gb KC189730 1 Uncultured bacterium clone Wat6 29 gi 186893264 gb DQ316381 2 Uncultured actinobacterium clone STH11-7 22 88gi 308209349 gb HM446097 1 Uncultured bacterium clone SI72 071 31gi 312860508 gb HM856549 1 Uncultured Acidimicrobineae bacterium clone YL188 gi 1189398004 gb MF040530 1 Uncultured bacterium clone DWI06A gi 190708939 gb EU804014 1 Uncultured bacterium clone 5C231696 94gi 320579900 gb HQ827946 1 Uncultured bacterium clone E42 50 gi 1042417572 gb KX504602 1 Uncultured bacterium clone F4019 92 gi 171672650 gb EU592504 1 Uncultured bacterium clone IFBC1D03 7265 gi 171672771 gb EU592625 1 Uncultured bacterium clone MFBC5D09 82 gi 341833495 gb JF830216 1 Bacterium enrichment culture clone B97 2011 24 gi 482680826 gb KC836038 1 Uncultured bacterium clone C-119 25 gi 482680746 gb KC835958 1 Uncultured bacterium clone E-39 gi 388542127 gb JQ941852 1 Uncultured bacterium clone Contig 83 100 gi 391225992 gb JQ958641 1 Uncultured bacterium clone E84 gi 1025725631 gb KX163616 1 Uncultured bacterium clone HN1-3-9-14 100 gi 1025725805 gb KX163790 1 Uncultured bacterium clone HN4-1-41-1 97 gi 111073296 emb AM292622 1 Uncultured actinobacterium 100 88gi 254797337 gb GQ302523 1 Uncultured actinobacterium clone sw-xj83 72 gi 972975534 gb KT905679 1 Uncultured bacterium clone Md-77 56 100 gi 114342101 gb DQ463715 2 Uncultured actinobacterium clone TK-SE2 100gi 190708663 gb EU803738 1 Uncultured bacterium clone 5C231355 100 100 gi 1189397903 gb MF040429 1 Uncultured bacterium clone LVB03H 76 gi 440548473 gb KC253346 1 Uncultured bacterium clone C28 gi 254971537 gb GQ340143 1 Uncultured bacterium clone B07-70-BAC gi 186893252 gb DQ316363 2 Uncultured actinobacterium clone ST11-21 10 gi 312860439 gb HM856480 1 Uncultured Acidimicrobineae bacterium clone YL114 11 99gi 296172567 emb FN668348 2 Uncultured actinobacterium 68 71 gi 351694091 gb JN656905 1 Uncultured actinobacterium clone KWK23S 09 98 gi 371496940 gb JN626704 1 Uncultured bacterium clone AK2b2 93f 93r 100 100 gi 284517685 gb GU305774 1 Uncultured bacterium clone MYW32 100 gi 371496908 gb JN626672 1 Uncultured bacterium clone AK2b2 10r 10f gi 295880691 gb HM129625 1 Uncultured bacterium clone SINO1121 95 100gi 295880901 gb HM129835 1 Uncultured bacterium clone SINO672 gi 312860431 gb HM856472 1 Uncultured Acidimicrobineae bacterium clone YL106 gi 220978705 gb FJ572664 1 Uncultured actinobacterium 100 gi 226429112 gb FJ827871 1 Uncultured actinobacterium clone ME012F1 gi 295879493 gb HM128427 1 Uncultured bacterium clone SINN1018 96 92 64gi 295879760 gb HM128694 1 Uncultured bacterium clone SINN659 89 79 99gi 295879749 gb HM128683 1 Uncultured bacterium clone SINN647 91 gi 295879481 gb HM128415 1 Uncultured bacterium clone SINN1002 100 70 gi 295880721 gb HM129655 1 Uncultured bacterium clone SINO396 12 gi 295881065 gb HM129999 1 Uncultured bacterium clone SINO935 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.04689696 ; this version posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer100 review) is92 githe 295880928 author/funder, gb HM129862 who has 1 Uncultured granted bioRxivbacterium a clonelicense SINO709 to display the preprint in perpetuity. It is made 97gi 295880837 gb HM129771 1 Uncultured bacterium clone SINO577 85 73 available under aCC-BY-NC 4.0 International license. 97 gi 295880824 gb HM129758 1 Uncultured bacterium clone SINO555 96 100 gi 295881108 gb HM130042 1 Uncultured bacterium clone SINO996 83 gi 295879674 gb HM128608 1 Uncultured bacterium clone SINN529 gi 295879964 gb HM128898 1 Uncultured bacterium clone SINN946 gi 226429100 gb FJ827859 1 Uncultured actinobacterium clone ME002B4 gi 400981773 gb JQ994351 1 Uncultured Acidimicrobineae bacterium clone YD2-8 gi 325960882 gb HQ860608 1 Uncultured bacterium clone C-51 gi 237637590 gb FJ916095 1 Uncultured actinobacterium clone SA1A4 100 gi 793958344 gb KP686600 1 Uncultured bacterium clone T1 1110 44 gi 261499664 gb GQ923925 1 Uncultured bacterium clone N06Jun-85 100 gi 295880661 gb HM129595 1 Uncultured bacterium clone SINO1068 gi 793958870 gb KP687126 1 Uncultured bacterium clone T6 0911 33 gi 190708303 gb EU803378 1 Uncultured bacterium clone 5C230939 100 gi 190708950 gb EU804025 1 Uncultured bacterium clone 5C231708 LSUCC0897 acIV subclade B gi 99909473 gb DQ520183 1 Uncultured bacterium clone ML-7-116 2 gi 410061040 gb JQ906023 1 Uncultured bacterium clone SW 4 15 100 gi 440548409 gb KC253282 1 Uncultured bacterium clone CB11 99 gi 190708813 gb EU803888 1 Uncultured bacterium clone 5C231541 gi 186893265 gb DQ316383 2 Uncultured actinobacterium clone STH5-13 32 gi 312860348 gb HM856389 1 Uncultured Acidimicrobineae bacterium clone YL014 38 100 gi 1189397984 gb MF040510 1 Uncultured bacterium clone DWI02B gi 83817252 gb DQ316362 1 Uncultured actinobacterium clone ST11-18 gi 157498247 gb EU132728 1 Uncultured bacterium clone FFCH11714 78 gi 296963063 gb HM269468 1 Uncultured bacterium clone ncd249b03c1 gi 347664591 gb JN609377 1 Uncultured bacterium clone 15-4-153 gi 724140977 gb KM338810 1 Uncultured bacterium clone II960TT01CVMD3 99gi 724141479 gb KM339312 1 Uncultured bacterium clone IJ1ZXBL01DTJJT 91 gi 724140425 gb KM338258 1 Uncultured bacterium clone IJ1ZXBL01BMTLE 74 9794 gi 724142722 gb KM340555 1 Uncultured bacterium clone IJ1ZXBL01C9N5D gi 724147744 gb KM345577 1 Uncultured bacterium clone IJ1ZXBL01DL28H 76 gi 724142995 gb KM340828 1 Uncultured bacterium clone IJ1ZXBL01A2IX5 84 31 gi 724144006 gb KM341839 1 Uncultured bacterium clone IJ1ZXBL01ACV5W 44 82 100 gi 724145059 gb KM342892 1 Uncultured bacterium clone IJ1ZXBL01AE9FC 100 92 75 gi 724143421 gb KM341254 1 Uncultured bacterium clone II960TT01BDM3M 94 94 gi 724147365 gb KM345198 1 Uncultured bacterium clone IJ1ZXBL01AHUZZ 94 97 gi 724148496 gb KM346329 1 Uncultured bacterium clone IJ1ZXBL01ARY9P 94 gi 724149115 gb KM346948 1 Uncultured bacterium clone IJ1ZXBL01AGJFT gi 724147249 gb KM345082 1 Uncultured bacterium clone IJ1ZXBL01DVKU6 99 gi 724136766 gb KM334907 1 Uncultured bacterium clone IJ1ZXBL01A5YCQ gi 724137954 gb KM335787 1 Uncultured bacterium clone IJ1ZXBL01AOTKJ 83 gi 724143366 gb KM341199 1 Uncultured bacterium clone II960TT01C7BC5 89 gi 724123172 gb KM321313 1 Uncultured bacterium clone II960TT01CLYAP 99 gi 697243561 emb LN561450 1 Uncultured bacterium gi 399762676 gb JX079069 1 Uncultured bacterium clone BG1-123 79 gi 395486262 gb JX080241 1 Uncultured organism clone 56 gi 223679703 gb FJ675406 1 Uncultured bacterium clone LL141-8C18 91 gi 440497988 gb JX394382 1 Uncultured bacterium clone NYS000169 gi 442557766 gb KC251739 1 Gaiella sp EBR4-RS1 100 gi 822591905 gb KR537271 1 Uncultured bacterium clone NCAAH clone 30A21 96 gi 822591855 gb KR537221 1 Uncultured bacterium clone NCAAH 30N19 gi 365928091 gb JN825541 1 Uncultured actinobacterium clone Alchichica AL31 2 1B 105 100 gi 55418235 gb AY642541 1 Uncultured actinobacterium clone LV9-22 100 100 gi 1127547985 gb KX177192 1 Uncultured bacterium clone PW-14D09 96 95 100 gi 58866424 gb AY897257 1 Uncultured bacterium clone DOVER239 78 gi 374714717 gb JN672050 1 Uncultured bacterium clone AC30789d02 gi 300079388 gb HM444351 1 Uncultured bacterium clone MColforestRozo21 gi 927583828 gb KR029513 1 Uncultured bacterium clone a121 100 gi 1071974560 gb KX589492 1 Uncultured bacterium clone 100 100gi 376315527 gb JQ323132 1 Uncultured bacterium clone C14 gi 697225630 emb LN571084 1 Uncultured bacterium 100 gi 697276965 emb LN567929 1 Uncultured bacterium 51 gi 156145385 gb EF562061 1 Uncultured actinobacterium clone 7025P4B31 100 gi 347438703 gb JN178890 1 Uncultured actinobacterium clone 11D gi 440506236 gb KC000230 1 Unidentified marine bacterioplankton clone P1-1B 23 95 gi 984686048 gb KU578781 1 Uncultured bacterium clone E131 B07 99 gi 717470804 gb KM467786 1 Uncultured bacterium clone 2009ECS-StD 161 74 94 gi 717470823 gb KM467805 1 Uncultured bacterium clone 2009ECS-StD 180 gi 16326934 gb AF428773 1 Uncultured bacterium clone CR98-5-43 90 gi 254971469 gb GQ340075 1 Uncultured bacterium clone A07-25-BAC 98 gi 190708955 gb EU804030 1 Uncultured bacterium clone 5C231714 97 100 49LSUCC0894 100 96 99 gi 16327110 gb AF428949 1 Uncultured bacterium clone CR98-35-67 86 gi 284517648 gb GU305737 1 Uncultured bacterium clone MYS16 unk. Actinobacteria gi 317573541 gb HQ661346 1 Uncultured bacterium clone T-35 87 LSUCC0931 gi 573871253 gb KF766784 1 Uncultured bacterium clone IMCC19281 gi 381139116 emb HE654923 1 Uncultured actinobacterium gi 71089281 gb DQ125586 1 Uncultured bacterium clone AKAU3582 gi 359326793 emb FQ659644 1 Uncultured bacterium 54 gi 663095541 gb KJ014298 1 Uncultured bacterium clone 11170229B10-20-98 A09 54 gi 663095135 gb KJ013892 1 Uncultured bacterium clone 11170224A20-30-6 F04 49 50 100 gi 724121500 gb KM319641 1 Uncultured bacterium clone IJ1ZXBL01AGHCS 96 gi 731161910 dbj AB803972 1 Uncultured bacterium gene for gi 310756070 gb HQ330646 1 Uncultured bacterium clone rPT34 97 gi 187481142 gb EU644439 1 Uncultured bacterium clone SUL 705 A01 100 gi 507848943 gb KC954289 1 Uncultured bacterium clone h31 gi 663095429 gb KJ014186 1 Uncultured bacterium clone 11170228B5-10-30 D11 100 gi 663095748 gb KJ014505 1 Uncultured bacterium clone 11170232B60-100-65 G09 95 gi 364773091 gb JN535606 1 Uncultured organism clone SBZP 1247 gi 511195031 gb KC554780 1 Uncultured bacterium clone A1-182 0.01 6453798 ... 6408336 (39) 640716313 Meso R0044 Mesorhizobium sp BNC1 100 642519700 FP R0117 Fulvimarina pelagi HTCC2506 Figure S2 99 2503199691 Mesorhizobium opportunistum WSM2075 63 100 640693527 MAFFr06 Mesorhizobium loti MAFF303099 88 72 650637439 Ahrensia sp R2A130 86 649871491 Mesci R0014 Mesorhizobium ciceri biovar biserrulae WSM1271 81 NR 115822 1 Nitratireductor pacificus strain pht-3B 642537650 HPDFL R0100 Hoeflea phototrophica DFL-43 59 KC534148 2 Mesorhizobium sp VBW011 100 MH746195 1 Pseudohoeflea suaedae strain R12 640696004 SMc03222 Ensifer meliloti 1021 648748598 SinmeDRAFT R0050 Ensifer meliloti AK83 100 100648741934 SinmeBDRAFT R0056 Ensifer meliloti BL225C 97 64 640791439 Smed R0041 Ensifer medicae WSM419 643826846 NGR c26520 Rhizobium sp NGR234 78 640440760 RHE CH03751 Rhizobium etli CFN 42 96643059149 RetlK5 010100r32547 Rhizobium etli Kim 5 74 99 643086089 RetlC8 010100r30294 Rhizobium etli CIAT 894 87 98 87 80 100 643072589 RetlG 010100r17868 Rhizobium etli GR56 642696998 RHECIAT CH0000062 Rhizobium etli CIAT 652 100 643064220 RetlB5 010100r29169 Rhizobium etli Brasil 5 100 2509020457 Rhizobium leguminosarum bv trifolii WSM2297 643397863 Rleg2 R0042 Rhizobium leguminosarum bv trifolii WSM2304 89 100644826001 Rleg R0040 Rhizobium leguminosarum bv trifolii WSM1325 640718003 RLr01 Rhizobium leguminosarum bv viciae 3841 643644490 Arad 5100 radiobacter K84 640696008 AGR C r01 Agrobacterium tumefaciens C58 Cereon 100640697208 Atu0053 Agrobacterium tumefaciens C58 Dupont 97 100 643649435 Avi 6501 Agrobacterium vitis S4 649984820 AGROH133 02873 Agrobacterium sp H13-3 640703939 BQ10910 quintana Toulouse 82 640722339 BARBAKC583 1086 KC583 83 77 640703991 BH13790 Houston-1 96 100 644821222 Bgr 16540 as4aup 641343533 BTrrna0003 CIP 105476 649890396 Bartonella clarridgeiae 73 LSUCC0549 unk. Rhizobiales 640713386 RPC R0065 Rhodopseudomonas palustris BisB18 90 640718204 RPE R0063 Rhodopseudomonas palustris BisA53 97 640712573 RPB R0055 Rhodopseudomonas palustris HaA2 95 100 640714108 RPD R0070 Rhodopseudomonas palustris BisB5 65 640714216 Nham R0013 hamburgensis X14 638919880 NB311A r04906 Nitrobacter sp Nb-311A 100640709301 Nwi R0013 Nitrobacter winogradskyi Nb-255 62 640558074 BBta rRNABradyrhizobium sp BTAi1 82100640700549 Bjar01 Bradyrhizobium japonicum USDA 110 93 100 640523918 BRADO Bradyrhizobium sp ORS278 92 2507504415 Bradyrhizobium sp WSM1417 2508694674 Bradyrhizobium sp WSM1253 100 76 2508545847 Bradyrhizobium sp WSM471 640703121 RNA 52 Rhodopseudomonas palustris CGA009 100642715029 Rpal R0056 Rhodopseudomonas palustris TIE-1 84 99 649841927 Rpdx1 R0054 Rhodopseudomonas palustris DX-1 643402822 OCAR 7782 Oligotropha carboxidovorans OM5 85 100 648633438 AfiDRAFT R0022 Afipia sp 1NLS2 641366367 Mext R0013 Methylobacterium extorquens PA1 644910008 METDI Methylobacterium extorquens DM4 1002507329885 Methylobacterium extorquens DSM 13060 100 100 644810630 MexAM1 META1pMethylobacterium extorquens AM1 98 100 643523388 Mchl R0015 Methylobacterium chloromethanicum CM4 100 98 642649267 Mpop R0013 Methylobacterium populi BJ001 100 99 641643651 M446 R0029 Methylobacterium sp 4-46 100 643597972 Mnod R0008 Methylobacterium nodulans ORS 2060 100 641628105 Mrad2831 R0029 Methylobacterium radiotolerans JCM 2831 77 2508725373 Microvirga sp Lut6 74 100 2509076004 Microvirga sp WSM3557 76 MF077188 1 Bosea vestrisii strain 233-LR44 100MF101096 1 Bosea lupini strain HJG-9 100 80 100 DQ664253 1 Bosea sp IMCC1738 JQ659580 1 Bosea thiooxidans strain R2-58 2525858126 Brevundimonas aveniformis DSM 17977 100 648121223 Bresu R0046 Brevundimonas subvibrioides ATCC 15264 99 650637422 Brevundimonas sp BAL3 94 2525303628 Brevundimonas naejangsanensis DSM 23858 100 651323409 Brevundimonas diminuta ATCC 11568 98 100 641564240 Caul R0005 Caulobacter sp K31 98 100646752220 Cseg R0002 Caulobacter segnis ATCC 21756 100 640693796 CC r06 Caulobacter crescentus CB15 100643620968 CCNA R0069 Caulobacter crescentus NA1000 649816892 Astex R0001 Asticcacaulis excentricus CB 48 LSUCC0492 unk. Bradyrhizobiaceae 641704597 Bind R0005 Beijerinckia indica indica ATCC 9039 100 643465583 Msil R0043 Methylocella silvestris BL2 647521215 MettrDRAFT R0051 Methylosinus trichosporium OB3b 100 650347292 Met49242DRAFT R0020 Methylocystis sp ATCC 49242 640880150 Xaut R0006 Xanthobacter autotrophicus Py2 100 641258208 AZC 16r01 caulinodans ORS 571 100 648042682 Snov R0042 DSM 506 LSUCC0244 100 NR 115950 1 Stappia conradae strain MIO Labrenzia sp. 100 82 NR 043040 1 Labrenzia marina strain mano18 32 648027473 SADFL R0053 Labrenzia alexandrii DFL-11 98 100 642973308 SIAM R0194 Stappia aggregata IAM 12614 31 MK156390 1 Labrenzia alba strain CECT 7551 85 648027353 PJE R0075 Pseudovibrio sp JE062 648929654 TRICHSKD4 1113 Roseibium sp TrichSKD4 2507047819 Hyphomicrobium denitrificans 1NES1 100 648068968 Hden R0048 Hyphomicrobium denitrificans ATCC 51888 96 649745637 Rvan R0042 Rhodomicrobium vannielii ATCC 17100 LSUCC0240 74LSUCC0267 99 61 LSUCC0575 GU479437 1 Altererythrobacter sp IMCC5004 32LSUCC0092 92LSUCC0109 4283 67LSUCC0107 80 60AlIMCC5003 Altererythrobacter sp. 100 LSUCC0669 44 35LSUCC0408 59 AlterKYW2 77 LSUCC0236 63 LSUCC0617 84 LSUCC0608 LSUCC0824 100 638916654 NAP1 r15982 sp NAP1 99 640712130 ELI r15155 Erythrobacter litoralis HTCC2594 640445392 Saro R0053 Novosphingobium aromaticivorans DSM 12444 59 648027096 CbatJ R0047 Citromicrobium bathyomarinum JL354 LSUCC0202 100 LSUCC0211 Porphyrobacter sp. 100 LSUCC0212 640715210 Sala R0048 Sphingopyxis alaskensis RB2256 640706682 ZMOr009 Zymomonas mobilis mobilis ZM4 84646346073 Za10 R0038 Zymomonas mobilis subsp mobilis NCIMB 11163 100 645561741 ZmobDRAFT R0042 Zymomonas mobilis subsp mobilis ATCC 10988 75 646708474 SJA C1-r0010 Sphingobium japonicum UT26S 100 648766464 SphchDRAFT R0054 Sphingobium chlorophenolicum L-1 100 83 639160139 SKA58 r18278 Sphingomonas sp SKA58 640582079 Swit R0031 Sphingomonas wittichii RW1 640877330 Plav R0021 Parvibaculum lavamentivorans DS-1 99 648157341 PB2503 r13907 bermudensis HTCC2503 644909013 CLIBASIA r05785 Candidatus Liberibacter asiaticus str psy62 100 649794468 CKC r06093 Candidatus Liberibacter solanacearum CLso-ZC1 LSUCC0228 80 LSUCC0672 78LSUCC0697 27LSUCC0259 15 84LSUCC0710 10085 73 LSUCC0675 LSUCC0683 2515058738 Rhodobacteraceae sp HIMB11 55LSUCC0555 18 13LSUCC0373 11 19 LSUCC0547 85 98 LSUCC0576 9 73 LSUCC0587 78 97 8017LSUCC0687 85 LSUCC0755 HIMB11-type 49LSUCC0749 LSUCC0692 LSUCC0721 LSUCC0255 91 LSUCC0406 LSUCC0265 LSUCC0698 81 LSUCC0753 87 52LSUCC0676 LSUCC0579 LSUCC0703 LSUCC0581 97LSUCC0583 53 JX439099 1 Alpha proteobacterium IMCC14605 94 JX439136 1 Alpha proteobacterium IMCC14648 28JX438772 1 Alpha proteobacterium IMCC14208 55 29 4095 JX439094 1 Alpha proteobacterium IMCC14600 52 85 97 JX439499 1 Alpha proteobacterium IMCC15120 98 JX439618 1 Alpha proteobacterium IMCC15279 JX439119 1 Alpha proteobacterium IMCC14630 63 JX439442 1 Alpha proteobacterium IMCC15052 67 EF616591 1 Bacterium HTCC8001 640715113 TM1040 R0002 Silicibacter sp TM1040 98 648027660 SCH R0077 Silicibacter sp TrichCH4B 98 16 640698510 SPO SpSilicibacter pomeroyi DSS-3 98 100 648027458 RKLH R0059 Rhodobacteraceae bacterium KLH11 83 100 650637424 Silicibacter lacuscaerulensis ITI-1157 genomic scaffold scf 1113211289287 JX439441 1 Alpha proteobacterium IMCC15051 648027379 RR R0064 Ruegeria sp R11 80 642509711 ISM R0103 Roseovarius nubinhibens ISM 82 642508328 OB R0095 Oceanicola batsensis HTCC2597 87 642526669 SSE R0099 stellata E-37 86 LSUCC0451 86 17 91 99 100LSUCC0470 93 100 100 58 20 LSUCC0484 68 83 82 642532802 RTM R0101 Roseovarius sp TM1035 79 100 642973074 ROS R0135 Roseovarius sp 217 100 LSUCC0482 648027700 CSE R0068 Citreicella sp SE45 98 648283912 R2601 r01439 Pelagibaca bermudensis HTCC2601 69 71 642973072 MED R0169 Roseobacter sp MED193 99 650637436 bacterium Y4I 100 97 642537456 RGBS R0136 Phaeobacter gallaeciensis BS107 98 63 73642537529 RG R0125 Phaeobacter gallaeciensis 2 10 99 642526738 RSK R0129 Roseobacter sp SK209-2-6 JX439585 1 Alpha proteobacterium IMCC15238 642532430 RAZWK R0096 Roseobacter sp AzwK-3b 48 LSUCC0143 642973158 EE R0165 Sulfitobacter sp EE-36 100 642973159 NAS R0155 Sulfitobacter sp NAS-14 1 100 100 648027407 RGAI R0055 Roseobacter sp GAI101 99 642536891 OIHEL R0100 Oceanibulbus indolifex HEL-45 640715898 RD1 0727 Roseobacter denitrificans OCh 114 99 642536841 RLO R0080 Roseobacter litoralis Och 149 LSUCC0418 Other 100 LSUCC0477 Roseobacter 96 649679315 EIO 3093 Ketogulonicigenium vulgare Y25 clade LSUCC0826 100 LSUCC0866 100 86 KU199727 1 Marivita sp IMCC25610 36 LSUCC0849 648027314 RB R0042 Rhodobacterales bacterium HTCC2083 75 650637251 Maritimibacter alkaliphilus HTCC2654 1099457000262 642525980 RB R0083 Rhodobacterales bacterium HTCC2150 98 AY102029 1 Alpha proteobacterium HTCC152 98 642973296 OM R0226 Rhodobacterales sp HTCC2255 93 642526788 RCCS R0079 Roseobacter sp CCS2 99 642973281 SKA R0130 Loktanella vestfoldensis SKA53 52 640713039 Jann R0051 Jannaschia sp CCS1 91 648027586 TR R0030 Thalassiobium sp R2A62 37 639047783 OG2516 r00282 Oceanicola granulosus HTCC2516 36 648027301 OA R0047 Octadecabacter antarcticus 307 100 648027310 OA R0046 Octadecabacter antarcticus 238 JX439361 1 Alpha proteobacterium IMCC14952 LSUCC0554 65 LSUCC0620 92LSUCC0412 100 LSUCC0374 63LSUCC0387 78 LSUCC0246 Marivita sp SSW136 641266290 Dshi R0006 Dinoroseobacter shibae DFL 12 LSUCC0149 unk. Rhodobacteraceae (1) 640721004 Pden R0002 Paracoccus denitrificans PD1222 646738163 RCAP rcr00001 Rhodobacter capsulatus SB1003 LSUCC0440 unk. Rhodobacteraceae (2) 640723425 Rsph17029 R0002 ATCC 17029 643622220 RSKD131 rrna16s2 Rhodobacter sphaeroides KD131 100 98 640710014 RSP 4352 Rhodobacter sphaeroides 2 4 1 64 640484384 Rsph17025 R0036 Rhodobacter sphaeroides ATCC 17025 Rhodobacter sp. 82 AY584573 1 Rhodobacter sp HTCC515 LSUCC0463 100LSUCC0469 100 LSUCC0133 LSUCC0467 unk. Rhodobacteraceae (3) IMCC1702 100 MG456779 1 Gemmobacter lanyuensis strain IMCC34207 82 647133437 Rsw2DRAFT R0044 Rhodobacter sp SW2 640717943 HNE 2659 Hyphomonas neptunium ATCC 15444 100 644904152 Hbal R0040 Hirschia baltica ATCC 49814 99 640717895 Mmar10 R0043 Maricaulis maris MCS10 100 LSUCC0147 Maricaulis sp. 640708248 SAR11 Candidatus Pelagibacter ubique HTCC1062 100642513240 PU R0069 Candidatus Pelagibacter ubique HTCC1002 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896; this version88 posted April 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted88 2503364240 bioRxiv aCandidatus license toPelagibacter display the ubique preprint SAR11 in HTCC9565 perpetuity. It is made 100 2503352260 Candidatus Pelagibacter sp HTCC7211 available under aCC-BY-NC100 2504109332 4.0 International Alphaproteobacteria license. sp SAR11 HIMB5 gi 1813612 gb U75649 1 UAU75649 Unidentified alpha proteobacterium 100 100 gi 48310 emb X52172 1 Unknown marine bacterioplankton 75 76 2236452142 alpha proteobacterium SCGC AAA288-N07 76 46 Unc54443 AB193883 100 52 52 2236674298 alpha proteobacterium SCGC AAA240-E13 87 2236446051 alpha proteobacterium SCGC AAA288-G21 2236674967 alpha proteobacterium SCGC AAA288-E13 U75255 1 Unidentified alpha proteobacterium clone SAR203 AF353212 1 Uncultured alpha proteobacterium clone Arctic96AD-8 U75257 1 Unidentified alpha proteobacterium clone SAR220 gi 17225726 gb AF353214 1 Uncultured alpha proteobacterium clone Arctic95B-1 100 gi 1786054 gb U75256 1 UAU75256 Unidentified alpha proteobacterium clone SAR211 LSUCC0734 100 100LSUCC0740 100LSUCC0714 100 SAR11 subclade IIIa.1 100 gi 2707386 gb U70686 1 Unidentified cytophagales OM155 56 100 100LSUCC0261 44 100 2505687060 Candidatus Pelagibacter sp IMCC9063 2503356127 alpha proteobacterium sp SAR11 HIMB114 100LSUCC0723 SAR11 subclade IIIa.2 100 LSUCC0664 100 gi 389497379 gb JN941972 1 Candidatus Pelagibacter sp Nan172 34gi 39979383 dbj AB154304 1 Uncultured bacterium 12 6 2246768993 alpha proteobacterium SCGC AAA024-N17 6 172236660315 alpha proteobacterium SCGC AAA028-D10 1002246773267 alpha proteobacterium SCGC AAA280-B11 SAR11 LD12 100 gi 3036826 emb Z99997 1 Uncultured freshwater bacterium LD12 2246770601 alpha proteobacterium SCGC AAA280-P20 992264663758 alpha proteobacterium SCGC AAA027-C06 LSUCC0530 2504111167 alpha proteobacterium sp HIMB59 100LSUCC0245 100 JX438886 1 Alpha proteobacterium IMCC14357 100 HIMB59-type 100 EF616585 1 Bacterium HTCC7212 100 JX439163 1 Alpha proteobacterium IMCC14691 UncMa812 DQ009262 subcladeV UncMa833 DQ009255 subcladeIV 6407020 ... 6426625 (36) gi 347663534 gb JN018691 1 Uncultured bacterium clone 73-S-52 100gi 440506887 gb KC000881 1 Unidentified marine bacterioplankton clone P2-1B 42 97 99 gi 388953197 gb JQ197866 1 Uncultured bacterium clone SeaWat 71268 98gi 342357476 gb JN233113 1 Uncultured SAR116 cluster alpha proteobacterium clone IMS2R79 99 100 gi 190710242 gb EU805317 1 Uncultured bacterium clone 6C233320 100gi 670604915 gb KJ754635 1 Uncultured bacterium clone UMTpAT03 98 gi 388954851 gb JQ199520 1 Uncultured bacterium clone SeaWat 51739 LSUCC0719 JX439180 1 Alpha proteobacterium IMCC14711 100 99JX439554 1 Alpha proteobacterium IMCC15198 100 100 646717355 SAR116 Candidatus Puniceispirillum marinum IMCC1322 98 JX439158 1 Alpha proteobacterium IMCC14684 89 96 gi 190704933 gb EU800008 1 Uncultured bacterium clone 1C227689 18100gi 46392161 gb AY580542 1 Uncultured alpha proteobacterium clone PI 4h7g 96gi 388954544 gb JQ199213 1 Uncultured bacterium clone SeaWat F019 99 5 gi 260071233 gb GQ349963 1 Uncultured alpha proteobacterium clone SHAZ551 1998gi 546241098 gb KC336746 1 Uncultured bacterium clone Seq7 B09 100 gi 157063688 gb EF629999 1 Uncultured alpha proteobacterium clone WW01G03 100 98gi 190147335 gb EU600590 1 Uncultured alpha proteobacterium clone 2 F7 100 99 gi 440507680 gb KC001674 1 Unidentified marine bacterioplankton clone P4-2B 44 100gi 602750065 gb KF817112 1 Uncultured bacterium clone BL Sep2011 2 5m F09 gi 656904762 gb KF928816 1 Uncultured bacterium clone EXA-Snow-A1 LSUCC0827 gi 342357690 gb JN233592 1 Uncultured marine bacterium clone IMS2R84 gi 224496527 gb FJ744836 1 Uncultured alpha proteobacterium clone SHWH night1 100 SAR116 clade 100 85gi 342357696 gb JN233602 1 Uncultured marine bacterium clone IMS2R71 LSUCC0396 64 LSUCC0744 73gi 342358106 gb JN233634 1 Uncultured marine bacterium clone IMS2D24 LSUCC0225 gi 224496623 gb FJ744932 1 Uncultured alpha proteobacterium clone SHWH night1 30 LSUCC0226 gi 224496499 gb FJ744808 1 Uncultured alpha proteobacterium clone SHWH night1 HIMB100 SAR116 100JX439267 1 Alpha proteobacterium IMCC14833 100 72 JX439497 1 Alpha proteobacterium IMCC15118 gi 144228590 gb EF471702 1 Uncultured alpha proteobacterium clone CB22D08 97gi 440547501 gb JX405966 1 Uncultured marine bacterium clone S17-55 96 93 gi 440547110 gb JX405575 1 Uncultured marine bacterium clone B10-125 95 100 gi 440547202 gb JX405667 1 Uncultured marine bacterium clone B21-142 LSUCC0684 LSUCC0691 100 gi 388951961 gb JQ196630 1 Uncultured bacterium clone SeaWat 30167 100 100 gi 388954016 gb JQ198685 1 Uncultured bacterium clone SeaWat F325 99 gi 281495588 gb FJ900495 1 Uncultured bacterium clone Pol112 95 100 gi 281495595 gb FJ900502 1 Uncultured bacterium clone Pol565 gi 295638918 gb GU981832 1 Uncultured bacterium clone WG1 100 gi 440506587 gb KC000581 1 Unidentified marine bacterioplankton clone P1-6B 83 gi 388954922 gb JQ199591 1 Uncultured bacterium clone SeaWat 51128 641433681 BAL199 r29817 Nisaea sp BAL199 640711482 amb r1 Magnetospirillum magneticum AMB-1 100 642515386 Magn R0275 Magnetospirillum magnetotacticum MS-1 79 643411277 RC1 0374 Rhodospirillum centenum SW 100 646550891 AZL r007 Azospirillum sp B510 645031833 APA01 25560 Acetobacter pasteurianus IFO 3283-01 646875352 APA03 14420 Acetobacter pasteurianus IFO 3283-03 646893997 APA12 14420 Acetobacter pasteurianus IFO 3283-12 646879585 APA07 25560 Acetobacter pasteurianus IFO 3283-07 100 646881591 APA22 14420 Acetobacter pasteurianus IFO 3283-22 646884711 APA26 14420 Acetobacter pasteurianus IFO 3283-26 100 646887829 APA32 14420 Acetobacter pasteurianus IFO 3283-32 646891992 APA42C 25560 Acetobacter pasteurianus IFO 3283-01-42C 96 640554278 Acry R0048 Acidiphilium cryptum JF-5 82 640717603 GbCGDNIH1 R0009 Granulobacter bethesdensis CGDNIH1 82 76 643390602 Gdia R0015 Gluconacetobacter diazotrophicus PAl 5 640706954 GOX0224 Gluconobacter oxydans 621H gi 385251522 gb JQ712105 1 Uncultured bacterium clone B-123 94LSUCC0417 100 unk. Acetobacteraceae 100 LSUCC0390 gi 326372138 gb HQ190491 1 Uncultured bacterium clone BP2 NR 137248 1 Limibacillus halophilus strain CAU 1121 640711591 Rru AR0064 Rhodospirillum rubrum ATCC 11170 640693401 PM r10 multocida Pm70 85 641608812 ECDH10B 3453 K12 DH10B 100 640702970 VVAr01 YJ016 100 82 640697159 RS05737 Ralstonia solanacearum GMI1000 100 100 640702511 CV rRNA16s1 Chromobacterium violaceum ATCC 12472 640710551 Gmet R0046 Geobacter metallireducens GS-15 640720015 Mmc1 R0003 Magnetococcus sp MC-1 0.01 gi 546240860 gb KC336508 1 Uncultured bacterium clone Seq-11 C09 95gi 546241220 gb KC336868 1 Uncultured bacterium clone p01 D03 98 35 gi 546240902 gb KC336550 1 Uncultured bacterium clone Seq-11 H09 1 1 gi 306479285 emb FR647980 1 Uncultured Bacteroidetes bacterium Figure S3 1 gi 305355094 emb FR685826 1 uncultured marine bacterium 1 gi 546241075 gb KC336723 1 Uncultured bacterium clone Seq-08 G03 0 gi 304855101 emb FR685242 1 uncultured marine bacterium 90 gi 306479358 emb FR648054 1 Uncultured Bacteroidetes bacterium gi 299855530 gb HM140645 1 Uncultured Flavobacteria bacterium clone PD15 1015 8 gi 304940180 emb FR685483 1 uncultured marine bacterium gi 299855536 gb HM140651 1 Uncultured Flavobacteria bacterium clone PD42 1015 21 gi 304855076 emb FR685217 1 uncultured marine bacterium 0 gi 46392219 gb AY580600 1 Uncultured Bacteroidetes bacterium clone PI 4b12b 33 gi 546240884 gb KC336532 1 Uncultured bacterium clone Seq-11 F04 0 0 gi 144228520 gb EF471632 1 Uncultured Bacteroidetes bacterium clone CB41G01 4211gi 304940187 emb FR685490 1 uncultured marine bacterium 43 gi 499105284 gb KC899250 1 Uncultured Flavobacteriaceae bacterium clone Klon157-MesoI 3 0 61 gi 546241048 gb KC336696 1 Uncultured bacterium clone Seq-08 C09 0 1 gi 306479276 emb FR647971 1 Uncultured Bacteroidetes bacterium 72 gi 306479583 emb FR648279 1 Uncultured Bacteroidetes bacterium 0 60 gi 403310888 gb JX537862 1 Uncultured bacterium clone 32-79 gi 304658301 emb FR683251 1 uncultured marine bacterium 90 gi 304855065 emb FR685206 1 uncultured marine bacterium gi 388845097 gb JX016053 1 Uncultured bacterium clone HglApr613 6 gi 388845364 gb JX016320 1 Uncultured bacterium clone HglApr926 0 1 gi 305355136 emb FR685868 1 uncultured marine bacterium 22 gi 305375063 emb FR685923 1 uncultured marine bacterium gi 261410442 gb GQ926872 1 Uncultured bacterium clone Mn6 11 17 gi 304694087 emb FR684224 1 uncultured marine bacterium 0 gi 304940028 emb FR685331 1 uncultured marine bacterium gi 299855537 gb HM140652 1 Uncultured Flavobacteria bacterium clone PD45 1015 gi 356605700 gb JN172554 1 Uncultured bacterium clone SI-BU120-376 17 gi 388845056 gb JX016012 1 Uncultured bacterium clone HglApr563 1 gi 388844954 gb JX015910 1 Uncultured bacterium clone HglApr450 gi 388844778 gb JX015734 1 Uncultured bacterium clone HglApr263 gi 388845091 gb JX016047 1 Uncultured bacterium clone HglApr607 82 gi 388845293 gb JX016249 1 Uncultured bacterium clone HglApr849 2 gi 305375137 emb FR685997 1 uncultured marine bacterium 4 gi 388845112 gb JX016068 1 Uncultured bacterium clone HglApr635 gi 260069802 gb GQ348532 1 Uncultured Bacteroidetes bacterium clone SHAB408 37 gi 356605703 gb JN172557 1 Uncultured bacterium clone SI-BU120-379 11 11gi 356605585 gb JN172434 1 Uncultured bacterium clone SI-BU110-165 22 2 gi 546241114 gb KC336762 1 Uncultured bacterium clone Seq7 D09 9 5 gi 546240849 gb KC336497 1 Uncultured bacterium clone Seq-11 B07 2 gi 385298924 gb JQ436108 1 Uncultured bacterium clone ST11C7 gi 46392220 gb AY580601 1 Uncultured Bacteroidetes bacterium clone PI RT331 gi 388844623 gb JX015579 1 Uncultured bacterium clone HglApr85 gi 260069186 gb GQ347916 1 Uncultured Bacteroidetes bacterium clone SGSO697 5 gi 388844681 gb JX015637 1 Uncultured bacterium clone HglApr154 gi 356605412 gb JN172257 1 Uncultured bacterium clone SI-BS110-176 23 gi 388845120 gb JX016076 1 Uncultured bacterium clone HglApr647 2 gi 388845277 gb JX016233 1 Uncultured bacterium clone HglApr832 2 gi 112806778 emb AM279192 1 Uncultured Flavobacteria bacterium 96 gi 222430624 gb FJ615164 1 Uncultured Bacteroidetes/Chlorobi group bacterium clone SI021806 10AAB 1 gi 385298746 gb JQ435930 1 Uncultured bacterium clone ST10B7 8 6398 gi 388845283 gb JX016239 1 Uncultured bacterium clone HglApr839 32 24 gi 356605699 gb JN172553 1 Uncultured bacterium clone SI-BU120-375 68 2 1 gi 546240891 gb KC336539 1 Uncultured bacterium clone Seq-11 G04 gi 304940010 emb FR685313 1 uncultured marine bacterium gi 310279931 gb HM921260 1 Uncultured bacterium clone OTU390 15 gi 388844893 gb JX015849 1 Uncultured bacterium clone HglApr385 5 0 gi 260069004 gb GQ347734 1 Uncultured Bacteroidetes bacterium clone SGSO430 1 gi 388845012 gb JX015968 1 Uncultured bacterium clone HglApr514 gi 388845243 gb JX016199 1 Uncultured bacterium clone HglApr793 3 gi 388845256 gb JX016212 1 Uncultured bacterium clone HglApr809 gi 260068071 gb GQ346801 1 Uncultured Bacteroidetes bacterium clone SGPW676 gi 260069220 gb GQ347950 1 Uncultured Bacteroidetes bacterium clone SGSO744 33 gi 305375098 emb FR685958 1 uncultured marine bacterium 57 gi 304658426 emb FR683377 1 uncultured marine bacterium gi 260069822 gb GQ348552 1 Uncultured Bacteroidetes bacterium clone SHAB432 gi 545346843 gb KF596550 1 Uncultured bacterium clone 63 159 gi 388844575bioRxiv gb JX015531 preprint 1 Uncultured doi: https://doi.org/10.1101/2020.04.17.046896 bacterium clone HglApr17 ; this version posted April 18, 2020. The copyright holder for this preprint (which gi 388845088 gb JX016044 1 Uncultured bacterium clone HglApr603 gi 388845360was gb JX016316 not certified 1 Uncultured bybacterium peer clone review) HglApr922 is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made gi 260070839 gb GQ349569 1 Uncultured Bacteroidetes bacterium clone SHAW395available under aCC-BY-NC 4.0 International license. gi 388845175 gb JX016131 1 Uncultured bacterium clone HglApr715 52 gi 304654540 emb FR686324 1 uncultured marine bacterium gi 388844795 gb JX015751 1 Uncultured bacterium clone HglApr280 gi 342366347 gb HQ242317 1 Uncultured Bacteroidetes/Chlorobi group bacterium clone F9P4 10O11 2 gi 260069233 gb GQ347963 1 Uncultured Bacteroidetes bacterium clone SGSO763 gi 304742053 emb FR684688 1 uncultured marine bacterium 63 gi 304855117 emb FR685258 1 uncultured marine bacterium 44 gi 260068982 gb GQ347712 1 Uncultured Bacteroidetes bacterium clone SGSO399 2 1 gi 304741942 emb FR684577 1 uncultured marine bacterium 15 gi 306479548 emb FR648244 1 Uncultured Bacteroidetes bacterium gi 260069154 gb GQ347884 1 Uncultured Bacteroidetes bacterium clone SGSO648 1 gi 260069175 gb GQ347905 1 Uncultured Bacteroidetes bacterium clone SGSO683 25 gi 388844824 gb JX015780 1 Uncultured bacterium clone HglApr311 2 gi 388844860 gb JX015816 1 Uncultured bacterium clone HglApr349 1 gi 388844891 gb JX015847 1 Uncultured bacterium clone HglApr383 15 gi 388845036 gb JX015992 1 Uncultured bacterium clone HglApr542 2 gi 388844797 gb JX015753 1 Uncultured bacterium clone HglApr282 gi 546241206 gb KC336854 1 Uncultured bacterium clone p01 A07 96 gi 546241207 gb KC336855 1 Uncultured bacterium clone p01 A08 43gi 306479299 emb FR647994 1 Uncultured Bacteroidetes bacterium 7 576 gi 546240858 gb KC336506 1 Uncultured bacterium clone Seq-11 C07 20 20 18 gi 546241150 gb KC336798 1 Uncultured bacterium clone p02 A06 21 19 62 47 gi 546240841 gb KC336489 1 Uncultured bacterium clone Seq-11 A02 9 10 gi 546241210 gb KC336858 1 Uncultured bacterium clone p01 B03 5 10 2 8 gi 546241187 gb KC336835 1 Uncultured bacterium clone p02 F07 1 4 3 20 gi 546241178 gb KC336826 1 Uncultured bacterium clone p02 E06 16 2 31 gi 546241241 gb KC336889 1 Uncultured bacterium clone p01 F07 gi 306479306 emb FR648001 1 Uncultured Bacteroidetes bacterium 96 gi 306479374 emb FR648070 1 Uncultured Bacteroidetes bacterium 11 gi 260068062 gb GQ346792 1 Uncultured Bacteroidetes bacterium clone SGPW636 9 7 gi 46392212 gb AY580593 1 Uncultured Bacteroidetes bacterium clone PI 4z4g 32 gi 124109467 gb EF202337 1 Uncultured Flavobacteria bacterium clone MS024-3C gi 570553659 gb KF799228 1 Uncultured bacterium clone Woods-Hole a5853 gi 304940061 emb FR685364 1 uncultured marine bacterium gi 56069492 gb AY794206 1 Uncultured Bacteroidetes bacterium clone F4C41 65 gi 546240869 gb KC336517 1 Uncultured bacterium clone Seq-11 D08 64 77 gi 546240876 gb KC336524 1 Uncultured bacterium clone Seq-11 E05 15 46 2 gi 388844812 gb JX015768 1 Uncultured bacterium clone HglApr299 4 gi 167540813 gb EU283111 1 Uncultured bacterium clone HDSP-100 7 83 gi 227955408 gb FJ895247 1 Uncultured marine bacterium clone SW2-29-8F 59 53 gi 190693907 gb EU592335 1 Uncultured bacterium clone SSW191N 44 50 5241 LSUCC0859 57 58gi 255069258 dbj AB496663 1 Flavobacteria bacterium Yb008 57 gi 255069161 dbj AB496644 1 Flavobacteria bacterium Ib005 unk. Flavobacteriaceae gi 255069159 dbj AB496642 1 Flavobacteria bacterium Ib003 61 gi 255069160 dbj AB496643 1 Flavobacteria bacterium Ib004 gi 255069158 dbj AB496641 1 Flavobacteria bacterium Ib002 42 43 gi 255069157 dbj AB496640 1 Flavobacteria bacterium Ib001 gi 255069175 dbj AB496658 1 Flavobacteria bacterium Yb001 81 64gi 451775185 gb KC425509 1 Uncultured marine microorganism clone NB031206 141 43 gi 1279395310 gb MG552026 1 Uncultured bacterium clone CSVI4 65 gi 537367099 gb KF271253 1 Uncultured bacterium clone N8 12 A 6D 77 gi 388951399 gb JQ196068 1 Uncultured bacterium clone SeaWat 24006 gi 388952425 gb JQ197094 1 Uncultured bacterium clone SeaWat 30703 73gi 388951461 gb JQ196130 1 Uncultured bacterium clone SeaWat 24537 gi 429144208 gb KC006391 1 Uncultured Flavobacteriaceae bacterium clone M7N74 gi 167540907 gb EU283205 1 Uncultured bacterium clone EDP-22 78 gi 262072172 gb GU061249 1 Uncultured Bacteroidetes bacterium clone 5m-61 51 gi 1246764463 gb MG002162 1 Uncultured bacterium clone PS199 gi 30421398 gb AY274838 1 Uncultured Bacteroidetes bacterium clone 1D10 JQ012990 1 Cellulophaga sp 403 70 JQ043064 1 Bacterium W15M25 61 45KC534369 1 Cellulophaga tyrosinoxydans strain VSW306 54AB681444 1 Cellulophaga sp NBRC 101309 100 91 FR821160 1 Cellulophaga sp AB204d 100 94 DQ356487 1 Cellulophaga sp CC12 81 NR 044502 1 Cellulophaga tyrosinoxydans strain EM41 49 gi 425702885 emb HF558903 1 Uncultured bacterium 41 AF493679 1 Flavobacteriaceae str SW072 gi 56069359 gb AY794073 1 Uncultured Cellulophaga sp clone F1C36 gi 385844395 gb JN976366 1 Uncultured bacterium clone OA5-30d-053 92 gi 390197161 gb JQ858682 1 Uncultured Aureimarina sp clone SV27 95 gi 344050319 gb JN410120 1 Uncultured bacterium clone DNA 75 1 5m gi 306479404 emb FR648100 1 Uncultured Bacteroidetes bacterium 55gi 46392209 gb AY580590 1 Uncultured Bacteroidetes bacterium clone PI RT295 48 gi 306479413 emb FR648109 1 Uncultured Bacteroidetes bacterium gi 440547608 gb JX406073 1 Uncultured marine bacterium clone S24-105 80 gi 83270969 gb DQ295950 1 Uncultured Flavobacteria bacterium clone SIMO-4005 31 15 gi 618731378 emb HE804004 1 uncultured Bacteroidetes bacterium 31 96 gi 224496931 gb FJ745240 1 Uncultured sp clone SHWN night2 99 gi 118582569 gb EF108215 1 Aureimarina marisflavi strain IMCC3054 gi 929657839 gb KT764026 1 Uncultured Flavobacteriaceae bacterium clone 1-4-203 100 gi 751136972 gb KP410313 1 Uncultured bacterium clone XL5PA 17 gi 46392214 gb AY580595 1 Uncultured Bacteroidetes bacterium clone PI 4j12e 58 gi 75766783 gb DQ189541 1 Uncultured Flavobacteria bacterium clone SIMO-2566 55 39 91 gi 75767060 gb DQ189818 1 Uncultured Flavobacteria bacterium clone SIMO-2843 93 gi 75767062 gb DQ189820 1 Uncultured Flavobacteria bacterium clone SIMO-2845 gi 46392218 gb AY580599 1 Uncultured Bacteroidetes bacterium clone PI 4t12e 54 gi 417351482 gb JX948651 1 Uncultured bacterium clone 4II 23 95 gi 68160462 gb DQ015848 1 Uncultured bacterium clone WLB13-050 99 gi 68160407 gb DQ015793 1 Uncultured bacterium clone ELB16-036 100 gi 14719142 gb AF388893 1 Uncultured Flavobacteriaceae bacterium clone GOBB3-CL103 100 gi 971455426 emb LM651999 1 Uncultured bacterium gi 260069007 gb GQ347737 1 Uncultured Bacteroidetes bacterium clone SGSO433 84 gi 306479335 emb FR648031 1 Uncultured Bacteroidetes bacterium 72 gi 190147407 gb EU600662 1 Uncultured Flavobacteria bacterium clone 2 A9 95 99 gi 306479554 emb FR648250 1 Uncultured Bacteroidetes bacterium 54 gi 304654514 emb FR686298 1 uncultured marine bacterium 86 gi 310280346 gb GU451674 1 Uncultured bacterium clone OTU343 gi 75767103 gb DQ189861 1 Uncultured Flavobacteria bacterium clone SIMO-2886 87 gi 83270976 gb DQ295957 1 Uncultured Flavobacteria bacterium clone SIMO-4012 11 14 gi 75767069 gb DQ189827 1 Uncultured Flavobacteria bacterium clone SIMO-2852 12 16 gi 75767200 gb DQ189958 1 Uncultured Flavobacteria bacterium clone SIMO-2983 11 30gi 75766895 gb DQ189653 1 Uncultured Flavobacteria bacterium clone SIMO-2678 15 1 gi 157742796 gb EU167448 1 Uncultured Flavobacteria bacterium clone C0 C02 012 7 72 gi 157742750 gb EU167402 1 Uncultured Flavobacteria bacterium clone D1 B03 029 30 gi 157742612 gb EU167264 1 Uncultured Flavobacteria bacterium clone V1 E12 088 81 gi 157742593 gb EU167245 1 Uncultured Flavobacteria bacterium clone V1 C12 092 37 gi 75766907 gb DQ189665 1 Uncultured Flavobacteria bacterium clone SIMO-2690 gi 157742683 gb EU167335 1 Uncultured Flavobacteria bacterium clone D2 A02 016 gi 75766766 gb DQ189524 1 Uncultured Flavobacteria bacterium clone SIMO-2549 gi 46392210 gb AY580591 1 Uncultured Bacteroidetes bacterium clone PI 4s5a gi 46392217 gb AY580598 1 Uncultured Bacteroidetes bacterium clone PI 4z11a 50gi 53773646 gb AY712172 1 Uncultured Flavobacteriia bacterium clone SIMO-635 46gi 384872502 gb JQ432688 1 Uncultured bacterium clone C9 E10 gi 190707073 gb EU802148 1 Uncultured bacterium clone 3C003562 80gi 220979637 emb FM958459 1 Uncultured Flavobacteriaceae bacterium gi 388954246 gb JQ198915 1 Uncultured bacterium clone SeaWat F288 gi 306479567 emb FR648263 1 Uncultured Bacteroidetes bacterium gi 46392216 gb AY580597 1 Uncultured Bacteroidetes bacterium clone PI RT205 gi 452096259 gb JX864357 1 Uncultured bacterium clone BF2010 May 5m G1 gi 315110890 gb HQ585162 1 Uncultured bacterium clone C10YRMYR05SP14STGEOTHERMALENV gi 75766882 gb DQ189640 1 Uncultured Flavobacteria bacterium clone SIMO-2665 gi 546240896 gb KC336544 1 Uncultured bacterium clone Seq-11 G09 gi 546240892 gb KC336540 1 Uncultured bacterium clone Seq-11 G05 gi 46392211 gb AY580592 1 Uncultured Bacteroidetes bacterium clone PI RT267 gi 46392213 gb AY580594 1 Uncultured Bacteroidetes bacterium clone PI 4f7d gi 546240866 gb KC336514 1 Uncultured bacterium clone Seq-11 D04 67 gi 546240889 gb KC336537 1 Uncultured bacterium clone Seq-11 F12 70 64 gi 546241215 gb KC336863 1 Uncultured bacterium clone p01 C05 58 58 gi 546240850 gb KC336498 1 Uncultured bacterium clone Seq-11 B08 29 gi 403310871 gb JX537845 1 Uncultured bacterium clone 63-14 gi 385298919 gb JQ436103 1 Uncultured bacterium clone ST11B11 gi 452096471 gb JX864569 1 Uncultured bacterium clone BF2010 Nov 5m F2 gi 388953967 gb JQ198636 1 Uncultured bacterium clone SeaWat F763 gi 384872465 gb JQ432652 1 Uncultured bacterium clone JELLYFISH C9 A1 70 gi 385251541 gb JQ712124 1 Uncultured bacterium clone B-185 gi 546240853 gb KC336501 1 Uncultured bacterium clone Seq-11 C01 80 gi 56069445 gb AY794159 1 Uncultured Bacteroidetes bacterium clone F3C31 79gi 546240879 gb KC336527 1 Uncultured bacterium clone Seq-11 E09 61 62 gi 46392215 gb AY580596 1 Uncultured Bacteroidetes bacterium clone PI 4a3g 6368 gi 46392221 gb AY580602 1 Uncultured Bacteroidetes bacterium clone PI 4t10c gi 262072166 gb GU061243 1 Uncultured Bacteroidetes bacterium clone 5m-50 gi 190704532 gb EU799607 1 Uncultured bacterium clone 1C227226 93gi 385251468 gb JQ712051 1 Uncultured bacterium clone A-125 gi 388950402 gb JQ195071 1 Uncultured bacterium clone SeaWat 15469 91 gi 388952064 gb JQ196733 1 Uncultured bacterium clone SeaWat 30049 gi 388950325 gb JQ194994 1 Uncultured bacterium clone SeaWat 15674 gi 619854139 gb KJ094193 1 Uncultured bacterium clone SEM2H071 gi 388952915 gb JQ197584 1 Uncultured bacterium clone SeaWat 67565 gi 297499231 gb HM057748 1 Uncultured Bacteroidetes bacterium clone D13W 13 71 gi 388950278 gb JQ194947 1 Uncultured bacterium clone SeaWat 15530 69 gi 385251516 gb JQ712099 1 Uncultured bacterium clone B-103 gi 619854099 gb KJ094153 1 Uncultured bacterium clone SEL2C061 gi 817253970 dbj LC018938 1 Uncultured Bacteroidetes bacterium 97 gi 817254205 dbj LC018968 1 Uncultured Bacteroidetes bacterium gi 310280344 gb GU451672 1 Uncultured bacterium clone OTU341 71 gi 51492510 gb AY697923 1 Uncultured bacterium clone F4C78 52 60 gi 384872473 gb JQ432660 1 Uncultured bacterium clone JELLYFISH C9 B4 17 25 gi 874580618 gb KR054290 1 Uncultured marine bacterium clone BS OffshoreExp T48h C C2 12 44 gi 308190934 gb GU326603 1 Uncultured marine bacterium clone SiDMar08M126 gi 388951899 gb JQ196568 1 Uncultured bacterium clone SeaWat 30731 gi 602749654 gb KF816701 1 Uncultured bacterium clone BF Feb2011 5m B07 gi 308190928 gb GU326597 1 Uncultured marine bacterium clone SiDMar08M120 0.01 gi 636558407 ref NR 114463 1 konjaci strain ATCC 33996 90 gi 636558408 ref NR 114464 1 Acidovorax citrulli strain ATCC 29625 Figure S4 73 100 gi 343201069 ref NR 041756 1 Acidovorax cattleyae strain ICMP 2826 gi 507148049 ref NR 102856 1 Acidovorax avenae strain ATCC 19860 98 gi 523967244 gb KC848895 1 Variovorax paradoxus strain 09-Lb0410 78 100 gi 631253016 ref NR 114214 1 Variovorax boronicumulans strain NBRC 103145 100 96 gi 631252538 ref NR 113736 1 Variovorax paradoxus strain NBRC 15149 96 gi 631253102 ref NR 114300 1 Variovorax soli strain NBRC 106424 78 gi 636558405 ref NR 114461 1 Xylophilus ampelinus strain ATCC 33914 AY429716 1 Ramlibacter sp HTCC332 99 gi 444304218 ref NR 074643 1 Ramlibacter tataouinensis strain TTB310 58 gi 219857346 ref NR 024934 1 Hylemonella gracilis strain ATCC 19624 12 67 99 gi 219857526 ref NR 025114 1 strain KF46F 52 57 gi 343206349 ref NR 044941 1 Simplicispira metamorpha strain DSM 1837 gi 219856883 ref NR 024702 1 Curvibacter lanceolatus strain ATCC 14669 gi 32351696 gb AY315173 1 Glacier bacterium FJI16 97 gi 542215029 gb KF441575 1 Albidiferax sp A1-591 42 43 KX505858 2 Rhodoferax sp strain IMCC26218 gi 672238947 ref NR 125536 1 Rhodoferax saidenbachensis strain ED16 27 87 86 gi 444439445 ref NR 074760 1 Albidiferax ferrireducens strain T118 90 79 52 gi 559795245 ref NR 104835 1 Rhodoferax antarcticus strain ANT BR 50 82 gi 50957209 gb AY584578 1 Beta proteobacterium HTCC541 66 DQ664242 1 Rhodoferax sp IMCC1723 gi 265678411 ref NR 028713 1 Curvibacter delicatus strain 146 48 gi 631252498 ref NR 113696 1 Curvibacter delicatus strain NBRC 14919 MG456807 1 Limnohabitans australis strain IMCC34713 62 85 gi 353742094 emb HE600690 1 Limnohabitans sp T6-20 genomic DNA containing 97 gi 672238952 ref NR 125541 1 Limnohabitans planktonicus strain II-D5 85 94 94 LSUCC0146 97 89 LSUCC0148 84 LSUCC0142 gi 672238953 ref NR 125542 1 Limnohabitans parvus strain II-B4 67 AY584577 1 Beta proteobacterium HTCC540 Limnohabitans sp. 99 MG456804 1 Undibacterium squillarum strain IMCC34688 EU800810 1 Uncultured bacterium clone 2C229026 92 LSUCC0888 83 LSUCC0799 gi 631253103 ref NR 114301 1 Polaromonas jejuensis strain NBRC 106434 11 LSUCC0765 unk. Comamonadaceae (1) gi 429844856 gb KC247351 1 Hydrogenophaga sp KMM 6728 LSUCC0136 100 LSUCC0141 Hydrogenophaga sp. gi 265678414 ref NR 028716 1 Hydrogenophaga taeniospiralis strain 2K1 gi 343200216 ref NR 040903 1 bipunctata strain IAM 14880 84 96 99 gi 343204361 ref NR 043769 1 Hydrogenophaga caeni strain EMB71 84 97 gi 219857228 ref NR 024856 1 Hydrogenophaga intermedia strain S1 97 96 93 100 gi 631252931 ref NR 114129 1 Hydrogenophaga intermedia strain NBRC 102510 78 63 gi 265678719 ref NR 029024 1 Hydrogenophaga defluvii strain BSB 9 5 gi 631253226 ref NR 114424 1 Hydrogenophaga palleronii strain DSM 63 gi 317184306 gb HQ652595 1 Hydrogenophaga sp p3 2011 gi 631250870 ref NR 112066 1 spinosa strain ATCC 14606 gi 806637559 dbj AB795550 1 Hydrogenophaga taeniospiralis gi 636558800 ref NR 114857 1 Hydrogenophaga pseudoflava strain DSM 1034 91gi 636558801 ref NR 114858 1 Hydrogenophaga flava strain DSM 619 83gi 631252935 ref NR 114133 1 Hydrogenophaga flava strain NBRC 102514 LSUCC0776 67LSUCC0779 69 1 LSUCC0485 LSUCC0793 2 gi 32966962 gb AY317112 1 Beta proteobacterium BAL58 1 1 LSUCC0867 LSUCC0807 LSUCC0462 18 1 LSUCC0768 49 LSUCC0468 LSUCC0434 53 LSUCC0466 40 LSUCC0831 94 1 LSUCC0113 82 28LSUCC0781 LSUCC0117 95 100LSUCC0123 BAL58 clade LSUCC0762 97LSUCC0788 97 28LSUCC0828 AY102032 1 Beta proteobacterium HTCC226 24LSUCC0474 LSUCC0836 86 61LSUCC0814 LSUCC0815 LSUCC0832 100 LSUCC0875 LSUCC0249 92LSUCC0254 95 94 LSUCC0439 99 94LSUCC0448 LSUCC0516 LSUCC0521 unk. Comamonadaceae (2) gi 343198666 ref NR 043226 1 Caldimonas taiwanensis strain On1 100 gi 631252646 ref NR 113844 1 Caldimonas manganoxidans strain NBRC 16448 gi 343201081 ref NR 041768 1 Methylibium petroleiphilum strain PM1 99 gi 631252863 ref NR 114061 1 Piscinibacter aquaticus strain NBRC 102349 FJ612332 1 Uncultured bacterium clone DP10 1 39 99LSUCC0760 23 9 LSUCC0515 10 50 LSUCC0481 3 1 LSUCC457 LSUCC0882 LSUCC0507 3 LSUCC0526 LSUCC0432 LSUCC0502 93 LSUCC0506 95 15 LSUCC0471 37LSUCC0508 8 LSUCC0505 91 LSUCC0229 32LSUCC0922 60 75LSUCC0923 82FJ349981 1 Uncultured bacterium clone 051011 T3S4 W T SDP 280 LSUCC0115 92 LSUCC0499 LSUCC0490 94 LSUCC0531 89 LSUCC0750 93 LSUCC0773 LSUCC0452 HM129142 1 Uncultured bacterium clone SING499 39 LSUCC0464 97 12 LSUCC0810 14 994 LSUCC0813 675 40 891LSUCC0524 4829 34 LSUCC0932 91 LSUCC0479 LSUCC0921 MWH-UniPo-Type 95 LSUCC0513 100 LSUCC0885 HM129951 1 Uncultured bacterium clone SINO867 87 LSUCC0118 LSUCC0896 90 LSUCC0156 LSUCC0532 99LSUCC0534 HQ661230 1 Uncultured bacterium clone H-14 99 LSUCC0445 91 LSUCC0523 LSUCC0514 100 LSUCC0528 LSUCC0501 54 LSUCC0533 LSUCC0510 56LSUCC0443 LSUCC0475 29LSUCC0746 54 29LSUCC0433 15 530 LSUCC0522 37LSUCC0441 LSUCC0491 85 98LSUCC0567 LSUCC0887 LSUCC0450 LSUCC0886 LSUCC0437 LSUCC0459 96 39LSUCC0512 10 LSUCC0918 LSUCC0919 AJ565421 1 Beta proteobacterium MWH-UniP1 100 gi 31441940 emb AJ565422 1 Beta proteobacterium MWH-UniP4 gi 265678447 ref NR 028749 1 Pandoraea apista strain LMG 16407 100gi 636558463 ref NR 114519 1 Pandoraea pnomenusa strain LMG 18087 100 100 gi 265678449 ref NR 028751 1 Pandoraea sputorum strain LMG 18819 87 gi 265678448 ref NR 028750 1 Pandoraea pulmonicola strain LMG 18106 92 99 gi 631250872 ref NR 112068 1 Burkholderia phenazinium strain ATCC 33666 100 gi 631252920 ref NR 114118 1 Burkholderia fungorum strain NBRC 102489 gi 672238969 ref NR 125558 1 Burkholderia terrestris strain R-23321 gi 559795367 ref NR 104960 1 Burkholderia andropogonis strain LMG 2129 AB376659 1 Beta proteobacterium KF070 AB607287 1 Beta proteobacterium TEG17 84 86LSUCC0758 100 AJ964889 1 Beta proteobacterium LF4-65 94 98 AB626822 1 Beta proteobacterium IN12 AJ938029 1 Beta proteobacterium MWH-P2sevCIIIb 100 AJ938031 1 Beta proteobacterium QLW-P2DMWB-4 81 92 EF628480 1 Bacterium HTCC4038 Bordetella sp. gi 219878141 ref NR 025280 1 Castellaniella defragrans strain 54Pin 86 100 gi 343206210 ref NR 044802 1 Castellaniella denitrificans strain NKNTAU 85 MG712864 1 Bordetella sp strain HZ20 92 JN119052 1 Uncultured marine bacterium clone FJ-C9 47LSUCC0454 100 KP233467 1 Bordetella bronchiseptica strain R2 92 gi 736580552 gb KM289177 1 sp BAB-4390 99 gi 736580565 gb KM289184 1 Bordetella sp BAB-4398 76 100 gi 756403333 gb KM884887 1 79 79 NR 025669 1 Kerstersia gyiorum strain LMG 5906 AF366576 1 strain Tohama 100 AY466116 1 isolate G9910 88 100 LT223643 1 Bordetella bronchiseptica 78 77 AJ278450 1 57 97 41 KJ022626 1 strain GV 36 84 KX138518 1 Bordetella pertussis strain VITSBSTV04 KY486811 1 Collimonas pratensis strain EN14AR2 gi 343200254 ref NR 040941 1 gummosa strain IAM 13946 100 gi 631252929 ref NR 114127 1 Derxia gummosa strain NBRC 102506 LSUCC0493 57LSUCC0518 78 86 MF040427 1 Uncultured bacterium clone LVB03F unk. Burkholderiaceae 100 FJ612227 1 Uncultured bacterium clone DP7 3 90 JX013364 1 Uncultured bacterium clone PA243 NR 159181 1 Quisquiliibacterium transsilvanicum strain CGI-09 MF118165 1 Burkholderiales bacterium strain 113A2 100 MK343090 1 Quisquiliibacterium sp CC-CFT501 30 JF922449 1 Uncultured bacterium clone B2-13 99 99 LN875366 1 Uncultured bacterium DQ988313 1 Uncultured bacterium clone LR A2-32 97 LSUCC0461 100 LSUCC0473 96 GQ472430 1 Uncultured bacterium clone 5S3-4 unk. Burkholderiales 87 MG551972 1 Uncultured sp clone CSIII144 MK748163 1 Lautropia sp strain KCOM 2505 AY234491 1 Bacterium Ellin5074 gi 559795322 ref NR 104915 1 Rugamonas rubra strain MOM 28 2 79 96 gi 631252936 ref NR 114134 1 Janthinobacterium agaricidamnosum strain NBRC 102515 78 gi 359805817 dbj AB681896 1 Herminiimonas glaciei 82 98 gi 76365592 gb DQ205303 1 Ultramicrobacterium str HI-G4 94 81 gi 507148008 ref NR 102815 1 Herminiimonas arsenicoxydans 87 gi 343198624 ref NR 043090 1 Herminiimonas fonticola strain S-94 71 gi 631252422 ref NR 113620 1 Oxalicibacterium horti strain NBRC 13594 32 AB626830 1 Beta proteobacterium TE1 gi 566085528 ref NR 109599 1 Undibacterium terreum strain C3 12 75 NR 113747 1 Herbaspirillum autotrophicum strain NBRC 15327 gi 631252942 ref NR 114140 1 Herbaspirillum frisingense strain NBRC 102522 90 28 KU199736 1 Herbaspirillum sp IMCC25601 33 85 27 gi 474422727 dbj AB769218 1 Herbaspirillum sp UKPF17 100 gi 712001303 gb KM272811 1 Herbaspirillum aquaticum strain LAMA 1150 gi 40804455 gb AY429714 1 Oxalobacteraceae bacterium HTCC302 98 LSUCC0519 88 LC094640 1 Oxalobacteraceae bacterium MORI3 LSUCC0766 LSUCC0817 82 unk. Oxalobacteraceae 100LSUCC0465 98 LSUCC0122 gi 332144608 dbj AB626815 1 Beta proteobacterium IN5 gi 631250868 ref NR 112064 1 Paucimonas lemoignei strain ATCC 17989 AY429715 1 Oxalobacteraceae bacterium HTCC315 52 JQ966282 1 Herbaspirillum sp TS7 gi 515019153 gb KC778366 1 Limnobacter thiooxidans strain BGB1 97MK100324 1 Limnobacter sp strain LZ-4 100 100 gi 219878282 ref NR 025421 1 Limnobacter thiooxidans strain CS-K2 99 MH463959 2 Limnobacter sp strain DRY11W AB366173 2 Limnobacter litoralis 100 gi 631253093 ref NR 114291 1 Limnobacter litoralis strain NBRC 105857 LSUCC0497 NR 125545 1 Polynucleobacter acidiphobus strain MWH-PoolGreenA3 96LSUCC0495 88 100 LSUCC0803 91 91 Z99998 1 Uncultured freshwater bacterium LD17 92 AB599869 1 Polynucleobacter acidiphobus Polynucleobacter sp. 100 NR 125546 1 Polynucleobacter difficilis strain AM-8B5 83 bioRxiv preprint doi: https://doi.org/10.1101/2020.04.17.046896AY584579 1 Polynucleobacter sp HTCC543 ; this version posted April 18, 2020. The copyright holder for this preprint (which 85 MG786577 1 Polynucleobacter sp strain IMCC30063 was not certifiedLSUCC0455 by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 100 LSUCC0498 available under aCC-BY-NC 4.0 International license. gi 219857550 ref NR 025138 1 Cupriavidus basilensis strain DSM 11853 96 gi 444439409 ref NR 074724 1 Ralstonia eutropha JMP134 strain JMP134 100 gi 219857549 ref NR 025137 1 Cupriavidus campinensis strain WS2 gi 343205994 ref NR 044535 1 Aquitalea denitrificans strain 5YN1-3 100 gi 631253002 ref NR 114200 1 Aquitalea magnusonii strain NBRC 103050 99 gi 343202262 ref NR 042548 1 Gulbenkiania mobilis strain E4FC31 100 gi 444303800 ref NR 074222 1 Chromobacterium violaceum strain ATCC 12472 91 100 gi 631252397 ref NR 113595 1 Chromobacterium violaceum strain NBRC 12614 100 gi 566085409 ref NR 109451 1 Chromobacterium vaccinii strain MWU205 gi 219857579 ref NR 025167 1 Laribacter hongkongensis strain HKU1 100 gi 444304244 ref NR 074669 1 Laribacter hongkongensis HLHK9 strain HLHK9 79 gi 219846886 ref NR 026478 1 Formivibrio citricus strain CreCit1 Formvibrio sp. 94 LSUCC0880 91 AY102016 1 Beta proteobacterium HTCC185 87 86 92 gi 602270591 ref NR 029099 2 Chitinimonas taiwanensis strain cf 86 gi 343200113 ref NR 040800 1 Vogesella indigofera strain ATCC 19706 gi 343200208 ref NR 040895 1 Aquaspirillum serpens strain IAM 13944 100 gi 631252501 ref NR 113699 1 Aquaspirillum serpens strain NBRC 14924 gi 559795351 ref NR 104944 1 elongata subsp nitroreducens strain B1019 97 gi 631252842 ref NR 114040 1 Bergeriella denitrificans strain NBRC 102155 97 gi 645320503 ref NR 117694 1 Neisseria perflava strain Branham 7078 98 100 100 gi 659364465 ref NR 121687 1 Neisseria cinerea strain ATCC 14685 gi 636560707 ref NR 116767 1 Neisseria shayeganii strain WC 08-871 gi 219846304 ref NR 025894 1 Vitreoscilla stercoraria strain Gottingen 1488-6 LSUCC0566 LSUCC0603 94LSUCC0665 LSUCC0568 100 42 JX439553 1 Beta proteobacterium IMCC15197 100JX439570 1 Beta proteobacterium IMCC15222 4499 98 KJ491828 1 Bacterium IMCC11234 EU433381 1 Marine bacterium HIMB624 LSUCC0578 LSUCC0622 99LSUCC0717 100 AB022337 1 1140 OM43 clade bacterium POCPN-5 AY102014 1 Beta proteobacterium HTCC168 38 LSUCC0416 61 LSUCC0486 92LSUCC0227 95LSUCC0536 OM43 clade 97 85 97LSUCC0389 98LSUCC0401 25 84LSUCC0489 92 96 LSUCC0268 5 LSUCC0520 FR686240 1 1502 OM43 clade uncultured marine bacterium 93 40KJ492150 1 Bacterium IMCC8640 96 100 91 KJ492136 1 Bacterium IMCC8621 20 86 95 AACY020000612 319 1480 OM43 clade marine metagenome 91 31AACY020064250 1 942 OM43 clade marine metagenome JX016508 1 1500 OM43 clade uncultured bacterium 95 KJ491911 1 Bacterium IMCC11514 AY102015 1 Beta proteobacterium HTCC174 60 LSUCC0550 84 HTCC2181 98 KJ492153 1 Bacterium IMCC8643 gi 343200571 ref NR 041258 1 Methylophilus leisingeri strain DSM 6813 100 gi 636558503 ref NR 114560 1 Methylophilus leisingeri strain DM11 100 gi 343200570 ref NR 041257 1 Methylophilus methylotrophus strain NCIMB 10515 100 AY429717 1 Beta proteobacterium HTCC349 100 gi 444304267 ref NR 074693 1 Methylotenera versatilis strain 301 100 gi 507148035 ref NR 102842 1 Methylotenera mobilis strain JLW8 gi 733581266 emb LN681455 1 Methylophilaceae bacterium MMS-18-77 LSUCC0130 100 LSUCC0135 unk. Methylophilales gi 559795172 ref NR 104761 1 Methylovorus glucosotrophus strain 6B1 100 gi 636559187 ref NR 115244 1 Methylovorus glucosotrophus strain 6B1 100 gi 470467498 ref NR 074178 1 flagellatus strain KT 100 gi 559795171 ref NR 104760 1 Methylobacillus glycogenes strain TK 0113 gi 469939658 gb KC577612 1 Methylotrophic bacterium RS-X3 94 gi 659364515 ref NR 121695 1 Sulfuricella denitrificans skB26 gi 444304256 ref NR 074682 1 Nitrosomonas sp AL212 strain AL212 81 54 100 gi 631250962 ref NR 112159 1 Nitrosospira multiformis strain ATCC 25196 gi 219846885 ref NR 026477 1 dicarboxylicus strain CreMal1 93 gi 219857227 ref NR 024855 1 Propionivibrio pelophilus strain asp 66 91 97 gi 219878316 ref NR 025455 1 Propionivibrio limicola strain GolChi1 92 gi 444439448 ref NR 074763 1 Candidatus Accumulibacter phosphatis strain UW-1 97 gi 219846249 ref NR 025839 1 tenuis strain 2761 27 gi 219857224 ref NR 024852 1 oryzae strain 6a3 94 92 100 gi 470466077 ref NR 074103 1 Dechlorosoma suillum PS 54 98 98 gi 343198916 ref NR 044023 1 Azospira restricta strain SUA2 gi 219857253 ref NR 024884 1 Dechloromonas agitata strain CKB gi 219846872 ref NR 026464 1 Ferribacterium limneticum strain cda-1 100 gi 444439433 ref NR 074748 1 Dechloromonas aromatica RCB strain RCB gi 343198954 ref NR 044125 1 hydrophilus strain d8-1 62 gi 219857225 ref NR 024853 1 Azonexus fungiphilus strain BS5-8 95 gi 343200330 ref NR 041017 1 Azonexus caeni strain Slu-05 82 gi 219846538 ref NR 026130 1 Zoogloea ramigera strain 106 90 gi 631252871 ref NR 114069 1 Zoogloea oryzae strain NBRC 102407 100 gi 224581407 ref NR 027188 1 Zoogloea resiniphila strain DhA-35 69 gi 224581409 ref NR 027190 1 Azoarcus buckelii strain U120 56 100 gi 444304250 ref NR 074676 1 Aromatoleum aromaticum EbN1 strain EbN1 69 100 63 gi 265678958 ref NR 029266 1 Azoarcus evansii strain KB740 gi 444439486 ref NR 074801 1 Azoarcus sp BH72 strain BH72 53 100gi 566084821 ref NR 108183 1 Azoarcus olearius strain DQS-4 83 gi 219857222 ref NR 024850 1 Azoarcus communis strain SWub3 gi 219878145 ref NR 025284 1 Thauera terpenica strain 58Eu gi 265678376 ref NR 028678 1 Azovibrio restrictus strain S5b2 gi 444304233 ref NR 074658 1 Gallionella capsiferriformans strain ES-2 94 gi 636559696 ref NR 115756 1 Sideroxydans lithotrophicus strain ES-1 gi 472440221 gb KC432239 1 Uncultured bacterium clone SEAB1DB031 35 JF488142 1 Beta proteobacterium SCGC AAA206-N16 99 gi 325494847 gb HQ738415 1 Uncultured bacterium clone g-31 97 gi 71384143 gb DQ123771 1 Uncultured soil bacterium clone PAH-Feed-31 23 gi 1040460153 emb LN870675 1 Uncultured bacterium 100gi 1040460213 emb LN870735 1 Uncultured bacterium 100 3165 gi 125968113 gb EF393426 1 Uncultured bacterium clone ORSFC2 d12 100 79 gi 373426563 gb JQ072868 1 Uncultured bacterium clone NBBAB0409 04 gi 472439988 gb KC432006 1 Uncultured bacterium clone SEAA1AD041 80 LSUCC0895 unk. Betaproteobacterium 70 gi 523452485 gb KF182905 1 Uncultured beta proteobacterium clone M1-061 80 gi 733165355 emb LN680217 1 Uncultured bacterium AB308363 1 Bacterium TG161 gi 343198675 ref NR 043249 1 Denitratisoma oestradiolicum strain AcBE2-1 gi 406829528 gb DQ011289 2 Pseudoalteromonas byunsanensis strain FR1199 100 gi 566085169 ref NR 108899 1 Shewanella indica strain KJW27 gi 219878219 ref NR 025358 1 denitrificans strain NCIMB 9548 100 gi 636559698 ref NR 115758 1 Thiobacillus sajanensis strain 4HG 75 98 gi 645320478 ref NR 117675 1 Thiobacillus aquaesulis strain DSM 4255 gi 343206201 ref NR 044793 1 Thiobacillus aquaesulis strain ATCC 43788 gi 343198787 ref NR 043684 1 Leeia oryzae strain HW7 0.01 LSUCC0185 99 LSUCC0196 38 44LSUCC0183 Figure S5 70LSUCC0192 93 100 LSUCC0201 100 100 LSUCC0176 98LSUCC0180 78 LSUCC0172 77 LSUCC0197 Pseudoalteromonas sp. 100 LSUCC0184 gi 406829528 gb DQ011289 2 Pseudoalteromonas byunsanensis strain FR1199 91 98 LSUCC0039 100 gi 292386059 gb GU726859 1 Pseudoalteromonas nigrifaciens strain 1350 gi 265678506 ref NR 028809 1 Pseudoalteromonas phenolica strain O-BC30 100 LSUCC0210 LSUCC0034 100 MG456773 1 Pseudoalteromonas spongiae strain IMCC34174 gi 1850385 gb U85853 1 MPU85853 punicea ACAM 611T 98 gi 406829506 gb AY207502 2 Aestuariibacter salexigens strain JC2042 93 100 gi 4688598 emb Y18228 1 Alteromonas macleodii 98 90 71 98 LSUCC0175 Alteromonas sp. gi 409724282 ref NR 025717 2 Thalassomonas ganghwensis strain JC2041 GQ406611 1 Vibrio sp PaD1 53 97 55 95 LSUCC0419 95 Vibro sp. 93 MG996721 1 Vibrio sp strain THAF213 98 98 MG896165 1 Vibrio sinaloensis strain UMTBB221 99 97 NR 108732 1 Vibrio caribbeanicus ATCC BAA-2122 strain N384 100 MG386395 1 Vibrio brasiliensis strain HH101310 99 88 KP329558 1 Vibrio tubiashii strain T33 gi 48329 emb X16895 1 Vibrio anguillarum 91 gi 49417 emb Z21856 1 V cholerae gi 1240022 emb X80725 1 E coli ATCC 11775T 100 gi 4581981 emb AJ233408 1 freundii gi 400433 emb X74677 1 A hydrophila ATCC 7966T gi 3095030 gb U96412 1 ASU96412 Anaerobiospirillum succiniciproducens 100 gi 4775549 emb Y17600 1 Succinivibrio dextrinosolvens 93 gi 175590 gb M35018 1 PASRRDA P multocida 100 gi 557477695 gb M35019 2 HEARRDA influenzae strain ATCC 33391 gi 566085169 ref NR 108899 1 Shewanella indica strain KJW27 69 gi 4049366 dbj AB006769 1 Marinospirillum minutulum gi 43461 emb X67023 1 H elongata 100 gi 992985 emb X87219 1 C marismortui 100 gi 17976825 emb AJ306890 1 Halomonas marina 100 gi 2808527 emb Y13299 1 Carnomonas nigrifaciens 100 gi 457165 dbj D14555 1 ZYP gi 175567 gb M22365 1 OCERRDA O linum EU050833 1 Uncultured gamma proteobacterium clone SS1 B 07 22 96 HM587890 1 Uncultured bacterium clone OV01102 03 100 HM587888 1 Uncultured Oceanospirillales bacterium clone OV00301 93 HM587889 1 Uncultured Oceanospirillales bacterium clone BM580104 34 100 AM747817 1 Oceaniserpentilla haliotis 55 100 AY697896 1 Uncultured Oleispira sp clone F3C18 90 92 AM117931 1 Spongiispira norvegica 100 AY136131 2 Bermanella marisrubri strain RED65 AF468253 1 Uncultured bacterium clone ARKDMS-13 100 83DQ530482 1 Oleispira sp gap-e-97 91 AF468248 1 Uncultured bacterium clone ARKDMS-38 100 87 AJ426421 1 Oleispira antarcticam 100 DQ521390 1 Oleispira sp ice-oil-381 AM279755 1 Thalassolituus oleivorans 100 DQ513004 1 Uncultured bacterium clone FS140-103B-02 100 gi 18996279 emb AJ431699 1 Thalassolituus oleivorans 100 45 AJ302699 1 Oceanospirillum sp ME101 100 gi 4049364 dbj AB006767 1 Oceanobacter kriegii gi 343205728 ref NR 044132 1 Saccharospirillum salsuginis strain YIM-Y25 32 100 gi 558508622 ref NR 104547 1 Saccharospirillum aestuarii strain IMCC 4453 100 100 gi 31414233 emb AJ315983 1 Saccharospirillum impatiens gi 30724821 emb AJ561121 1 Reinekia marinisedimentorum KF146347 1 Marinobacterium sp IMCC1424 100 LSUCC0864 99 NR 125520 1 Marinobacterium marisflavi strain IMCC4074 Marinobacterium sp. 100 JX438851 1 Gamma proteobacterium IMCC14314 65 100 100 LSUCC0821 92 63 gi 1718231 gb U58339 1 MGU58339 Marinobacterium georgiense 57 gi 3170515 gb AF053734 1 AF053734 Neptunomonas naphthovorans NR 044016 1 Marinobacterium litorale strain IMCC1877 gi 63146099 gb DQ011528 1 Marinomonas communis strain LMG 2864 gi 4049363 dbj AB006766 1 Pseudospirillum japonicum EU308447 1 Marinobacter sp ST17 FM992711 1 Marinobacter sp M71 D21 51MG252494 1 Marinobacter sp strain LYGB-3 26AJ551128 1 Marinobacter sp es7 DQ270711 1 Marinobacter sp B-3081 33FJ461443 1 Marinobacter sp SCSWC09 52EF140752 1 Marinobacter sp DG1591 88 7990KC295366 1 Marinobacter sp DG1471 57 74 EU864262 1 Marinobacter sp p78 73LT221181 1 Marinobacter sp AT2RP15 100 LT221154 1 Marinobacter sp HL3RP4 AM990795 1 Marinobacter sp MOLA 19 66 57 72 KX954612 1 Marinobacter sp strain S-14 AB166972 1 Marinobacter sp NT N17 KX344938 1 Marinobacter sp FTI04 63KY770704 1 Marinobacter sp strain 7002-278 99 98MF382054 1 Marinobacter sp strain P7 30 53 47 CP011929 1 Marinobacter sp CP1 AB166973 1 Marinobacter sp NT N18 77 MF382066 1 Marinobacter sp strain M1 30 gi 13873309 gb AF195410 1 chejuensis strain KCTC 2396 98 gi 406829503 gb AY130994 2 Zooshikella ganghwensis strain JC2044 94 gi 11558296 emb AJ295154 1 Oleiphilus messinensis KY270473 1 Gammaproteobacteria bacterium strain 308 gi 3063723 emb Y17112 1 Balneatrix alpica LSUCC0110 100 LSUCC0112 Pseudohongiella sp. 100 DQ354722 1 Uncultured bacterium clone DR550SWSAEE23 100 100 EU037337 2 Uncultured bacterium clone G3DCM-92 100 EU887784 1 Uncultured Cellvibrio sp clone 21IISN 100 100 EF157250 1 Uncultured bacterium clone 101-155 100 100 LN851791 1 Uncultured gamma proteobacterium 99 LN851804 1 Uncultured gamma proteobacterium 100 HM127593 1 Uncultured bacterium clone SINP624 NR 126265 1 Pseudohongiella spirulinae strain Ma-20 NR 136444 1 Pseudohongiella acticola strain GBSW-5 100 NR 153732 1 Pseudohongiella nitratireducens strain SCS-49 gi 2707399 gb U70699 1 Unidentified gamma proteobacterium OM182 100LSUCC0449 64 100 GU061024 1 Gamma proteobacterium SF293 OM182 clade 99 gi 2707400 gb U70700 1 Unidentified gamma proteobacterium OM185 100 99 gi 37528802 gb AY386344 1 Marine gamma proteobacterium HTCC2188 100 gi 17225754 gb AF353242 1 Uncultured gamma proteobacterium clone Arctic96B-1 100 gi 17432498 gb AF355037 1 Uncultured gamma proteobacterium Arctic97A-11 100 gi 27652372 gb AY171346 1 Uncultured bacterium clone s10 gi 22595279 gb AF406527 1 Uncultured gamma proteobacterium clone AEGEAN 133 gi 37528800 gb AY386342 1 Marine gamma proteobacterium HTCC2151 100 gi 37528801 gb AY386343 1 Marine gamma proteobacterium HTCC2180 100 99 gi 28268910 gb AF468261 1 Uncultured bacterium clone ARKDMS-58 100 100 gi 5726416 gb AF159675 1 Uncultured marine eubacterium HstpL49 gi 37528803 gb AY386345 1 Marine gamma proteobacterium HTCC2178 gi 28208629 gb AF382137 1 Uncultured bacterium clone ZA3412c LSUCC420 59LSUCC426 45 49 LSUCC0398 LSUCC0430 46LSUCC0404 8766 LSUCC0553 9673LSUCC0382 LSUCC0384 50 LSUCC0399 47 60 LSUCC0376 60 LSUCC0427 LSUCC0258 55 86 LSUCC0270 61 42 LSUCC0400 33 LSUCC0711 58 59 LSUCC0093 62 85 79LSUCC272 100 98 LSUCC0096 100 LSUCC0218 OM252 clade 100 LSUCC0220 61 91LSUCC0222 71 84 87 LSUCC0041 92 LSUCC0724 52 LSUCC0264 LSUCC0835 79 LSUCC0848 LSUCC0383 91LSUCC0413 90 57 90 JX438912 1 Gamma proteobacterium IMCC14390 JX438821 1 Gamma proteobacterium IMCC14273 gi 380850743 gb JQ269292 1 Bacterium WHC5-12 2504636695 Gammaproteobacteria sp OM252 HIMB30 60 LSUCC0679 LSUCC0569 100 LSUCC0574 100 LSUCC0600 gi 121592413 gb EF176580 1 Litoricola lipolytica strain IMCC1097 100 gi 631252831 ref NR 114029 1 Litoricola lipolytica strain NBRC 102074 gi 37528790 gb AY386332 1 Marine gamma proteobacterium HTCC2089 100 gi 4218068 dbj AB022713 1 Obligately oligotrophic bacteria KI89C 100 gi 4063003 dbj AB021704 1 Uncultured marine bacterium LSUCC0713 90LSUCC0741 99 81 LSUCC0243 LSUCC0101 97 73 EF379881 1 Uncultured bacterium clone PC-FL10-68 93 45 JX438837 1 Gamma proteobacterium IMCC14296 90 94 gi 475051043 gb JX439362 1 Gamma proteobacterium IMCC14953 100 LSUCC0101-type 90 LSUCC0231 100 LSUCC0238 LSUCC0271 99 LSUCC0592 96 KC425607 1 Uncultured marine microorganism clone NB062806 381 JQ197119 1 Uncultured bacterium clone SeaWat 63137 KF786604 1 Uncultured gamma proteobacterium clone S1-5-69 JQ738933 1 Uncultured bacterium clone SA 59 2547562624 SAR86 cluster bacterium SAR86C 100gi 373127561 gb JN591881 1 Uncultured SAR86 cluster bacterium clone GG101008Clone23 96 2547569585 SAR86 cluster bacterium SAR86D 96 2597972942 SAR86 cluster bacterium SAR86A 100 100 100 2597973533 SAR86 cluster bacterium SAR86B 2520215531 SAR86 cluster bacterium SAR86E 96 gi 373127519 gb JN591839 1 Uncultured SAR86 cluster bacterium clone GG101008Clone184 gi 529986 gb L35461 1 GPSRGDA Uncultured gamma proteobacterium clone SAR86 LSUCC0552 100LSUCC0735 100 81 LSUCC0742 LSUCC0604 54 LSUCC0737 9994LSUCC0748 9395 99 LSUCC0605 LSUCC0582 100LSUCC0715 95LSUCC0757 100 bioRxiv preprint doi: LSUCC0247https://doi.org/10.1101/2020.04.17.046896; this version posted April 18, 2020. The copyright holder for this preprint (which LSUCC0621 was not certified by JX438958peer review) 1 Gamma is proteobacteriumthe author/funder, IMCC14442 who has granted bioRxiv a license to display the preprint in perpetuity. It is made 99 97 JX439217 1 Gamma proteobacteriumavailable IMCC14761 under aCC-BY-NC 4.0 International license. 33 44gi 37528789 gb AY386331 1 Marine gamma proteobacterium HTCC2149 91JX439325 1 Gamma proteobacterium IMCC14907 100 JX438957 1 Gamma proteobacterium IMCC14441 41JX439262 1 Gamma proteobacterium IMCC14827 gi 9255823 gb AF235120 1 Uncultured gamma proteobacterium KTc1119 LSUCC0235 100 29 LSUCC0712 OM60/NOR5 clade 37 98 LSUCC0677 100 100 LSUCC0730 JX439339 1 Gamma proteobacterium IMCC14924 LSUCC0674 LSUCC0372 73LSUCC0593 39 96LSUCC0616 98 66 95 95LSUCC0610 98 100 100 LSUCC0375 LSUCC0673 LSUCC0045 LSUCC0262 100 AY386336 1 Marine gamma proteobacterium HTCC2223 73 EF616594 1 Bacterium HTCC8011 54 97AY102017 1 Gamma proteobacterium HTCC160 99 JX439552 1 Gamma proteobacterium IMCC15196 99 gi 2707396 gb U70696 1 Unidentified gamma proteobacterium OM60 98gi 37528797 gb AY386339 1 Marine gamma proteobacterium HTCC2080 JX439096 1 Gamma proteobacterium IMCC14602 Congregibacter litoralis KT71 100gi 12000362 gb AY007676 1 Unknown marine gamma proteobacterium NOR5 100 100 Gammaproteobacterium Mo10 red 99 Haliea sp ETY-NAG 66 gi 636559001 ref NR 115058 1 Chromatocurvus halotolerans strain EG19 gi 23168853 gb AY133409 1 Uncultured bacterium clone BB2 126 92 85 88 gi 28269036 gb AF468388 1 Arctic sea ice bacterium ARK10038 gi 16517907 gb AF424063 1 Uncultured gamma proteobacterium MERTZ 2CM 38 84 26 gi 37528795 gb AY386337 1 Marine gamma proteobacterium HTCC2246 74 76 65 gi 37528792 gb AY386334 1 Marine gamma proteobacterium HTCC2148 gi 631252081 ref NR 113279 1 Halioglobus pacificus strain S1-72 LSUCC0820 94LSUCC0874 32 100 OM241 clade 100 2707402LSUCC0097 gi 2707402 gb U70702 1 Unidentified gamma proteobacterium OM241 gi 3256290 dbj AB015537 1 Uncultured gamma proteobacterium gi 3256272 dbj AB015519 1 Uncultured gamma proteobacterium 100 gi 37528791 gb AY386333 1 Marine gamma proteobacterium HTCC2143 gi 37528798 gb AY386340 1 Marine gamma proteobacterium HTCC2120 100gi 37528799 gb AY386341 1 Marine gamma proteobacterium HTCC2121 51 54EF182727 1 Gamma proteobacterium HTCC2167 60 53 EF616593 1 Bacterium HTCC8009 100 69 AY102022 1 Gamma proteobacterium HTCC153 gi 9663752 emb AJ400355 1 Uncultured marine bacterium ZD0424 EF182728 1 Gamma proteobacterium HTCC2227 JX439547 1 Gamma proteobacterium IMCC15189 92 JX438828 1 Gamma proteobacterium IMCC14284 100JX438916 1 Gamma proteobacterium IMCC14394 89 69 JX439538 1 Gamma proteobacterium IMCC15172 59 gi 148729 gb M63811 1 GAPRGDA Uncultured gamma proteobacterium clone 92 93 91JX439511 1 Gamma proteobacterium IMCC15138 JX439436 1 Gamma proteobacterium IMCC15046 100 100JX439614 1 Gamma proteobacterium IMCC15274 100 EF182732 1 Gamma proteobacterium HTCC6268 SAR92 clade 98 JX439061 1 Gamma proteobacterium IMCC14561 100 100 93 JX438865 1 Gamma proteobacterium IMCC14331 AY386338 1 Marine gamma proteobacterium HTCC2290 AY102021 1 Gamma proteobacterium HTCC151 67AY102025 1 Gamma proteobacterium HTCC221 96AY102020 1 Gamma proteobacterium HTCC148 98 AY386335 1 Marine gamma proteobacterium HTCC2207 99 100 17 JX439630 1 Gamma proteobacterium IMCC15298 100 28JX439274 1 Gamma proteobacterium IMCC14843 AY102028 1 Gamma proteobacterium HTCC234 EF195480 1 Gamma proteobacterium HTCC6124 LSUCC0872 LSUCC0057 gi 134153996 gb EF468720 1 Gamma proteobacterium IMCC2136 100 LSUCC0077 100 unk. Porticoccaceae 99 gi 9663743 emb AJ400346 1 Uncultured marine bacterium ZD0117 gi 631252981 ref NR 114179 1 Porticoccus litoralis strain NBRC 102686 gi 11 11 11 Alcanivorax dieselolei str NO1A 97 gi 513045872 gb KF021876 1 Alcanivorax sp H-207 100 97 gi 686688018 gb KM033245 1 Alcanivorax sp SA13 100 gi 55 55 55 Alcanivorax venustensis MARC4Q gi 183238720 gb EU603451 1 Alcanivorax borkumensis strain PTG4-9 100 gi 2076548 emb Y12579 1 Alcanivorax borkumensis JX122583 1 Pseudomonas guineae strain X14 100 LSUCC0140 95 Pseodomonas sp. 93HE659365 1 Pseudomonas fluorescens 95 100 KM187461 1 Pseudomonas sp HP3R 96 100 NR 042451 1 Pseudomonas peli strain R-20805 HQ848085 1 Pseudomonas peli strain ML-5 KY053176 1 Gammaproteobacteria bacterium strain IMCC25652 AY102030 1 Gamma proteobacterium HTCC172 LSUCC0134 100 NR 043919 1 Perlucidibaca piscinae strain IMCC1704 Perlucidibaca sp. 93 gi 21389187 gb AF513979 1 Alkane-degrading soil bacterium MVAB Hex1 100 52 gi 13276758 emb AJ309942 1 Psychrobacter immobilis 100 gi 984035 dbj D64049 1 MBO gi 21668433 emb AJ421425 1 Salinisphaera shabanense gi 20799096 emb AJ459801 1 Thermithiobacillus tepidarius 100 gi 454888 emb Z29975 1 T caldus DSM 8584 55 gi 1657412 emb X95917 1 X campestris 49 100 gi 19571777 emb AJ311653 1 Fulvimonas soli 54 gi 174216 gb M26629 1 CVNRRSA C vinosum 100 gi 2780221 emb AJ223234 1 Chromatium okenii gi 175189 gb M36023 1 LPNURNCTCZ strain Philadelphia 1 52 gi 174052 gb M35014 1 CDBRRDA hominis strain ATCC 15826 100 gi 174985 gb M35015 1 KINRRDAA Suttonella indologenes strain ATCC 25869 98 gi 4731577 gb AF126548 1 Thioalcalomicrobium aerophilum 100 gi 790923 gb L40809 1 THMRRDB Thiomicrospira pelophila 80 52 gi 23395405 gb AF544015 1 Parvularcula bermudensis strain HTCC2503 100 IMCC17027 98 648027305 gamma proteobacterium HTCC5015 gi 175268 gb M95657 1 MLCRRDA Methylococcus luteus 100 gi 6118093 gb AF177296 1 AF177296 Type I methanotroph AML-C10 76 96 gi 639913 gb U12624 1 CPU12624 pugetii PS-1 gi 1888435 emb X87338 1 M marina 52 gi 335353092 gb JN003574 1 Gamma proteobacterium GSO-PS1 100 gi 363542306 gb JQ253969 1 Gamma proteobacterium GSO-NP1 100 91 gi 307567063 gb HQ162915 1 Uncultured SUP05 cluster bacterium clone SHZZ468 gi 41352514 gb AY520560 1 Kangiella koreensis 0.01