J. Microbiol. Biotechnol. (2017), 27(4), 825–833 https://doi.org/10.4014/jmb.1701.01047 Research Article

Genomic Analysis of a Freshwater Actinobacterium, “Candidatus Limnosphaera aquatica” Strain IMCC26207, Isolated from Lake Soyang

Suhyun Kim, Ilnam Kang, and Jang-Cheon Cho*

Department of Biological Sciences, Inha University, Incheon 22212, Republic of Korea

Strain IMCC26207 was isolated from the surface layer of Lake Soyang in Korea by the dilution-to-extinction culturing method, using a liquid medium prepared with filtered and autoclaved lake water. The strain could neither be maintained in a synthetic medium other than natural freshwater medium nor grown on solid agar plates. Phylogenetic analysis of 16S rRNA gene sequences indicated that strain IMCC26207 formed a distinct lineage in the order Acidimicrobiales of the phylum Actinobacteria. The closest relative among the previously identified bacterial taxa was “Candidatus Microthrix parvicella” with 16S rRNA gene sequence similarity of 91.7%. Here, the draft genome sequence of strain IMCC26207, a freshwater actinobacterium, is reported with the description of the genome properties and annotation summary. The draft genome consisted of 10 contigs with a total size of 3,316,799 bp and an average G+C content of 57.3%. The IMCC26207 genome was predicted to contain 2,975 protein-coding genes and 51 non-coding RNA genes, including 45 tRNA genes. Approximately 76.8% of the protein-coding genes could be assigned a specific function. Annotation of the IMCC26207 genome showed several traits of adaptation to living in oligotrophic freshwater environments, such as phosphorus-limited conditions. Comparative genomic analysis revealed that the genome of strain IMCC26207 was distinct from that of “Candidatus Microthrix” strains; therefore, we propose the name “Candidatus Limnosphaera aquatica” for this bacterium.

Keywords: Actinobacteria, “Candidatus Microthrix,” “Candidatus Limnosphaera,” genome, freshwater, dilution-to-extinction culturing

Received: January 16, 2017
Revised: February 3, 2017
Accepted: February 6, 2017
First published online: February 7, 2017

*Corresponding author
Phone: +82-32-860-7711; Fax: +82-32-232-0541; E-mail: [email protected]

pISSN 1017-7825, eISSN 1738-8872
Copyright © 2017 by The Korean Society for Microbiology and Biotechnology

Introduction

Freshwater ecosystems have unique planktonic bacterial communities that are distinct from bacterial communities in other habitats, such as soil and marine environments. The freshwater bacterial community is typically dominated by the phyla Actinobacteria, Proteobacteria (mainly the classes Alphaproteobacteria and Betaproteobacteria), and [1, 2]. Considering the crucial roles of freshwater environments in global carbon cycling and climate change [3], it is important to study the diversity and function of the freshwater bacterial communities, which mediate diverse biogeochemical reactions.

The phylum Actinobacteria is a common and often numerically important component in a variety of freshwater habitats, where they can constitute more than 50% of the total epilimnetic bacterial community [4, 5]. Molecular approaches based on sequences of metagenomes and 16S rRNA genes have shown the abundance, phylogenetic diversity, and functional potential of freshwater actinobacteria [6], revealing that major freshwater actinobacteria are as-yet-uncultivated in spite of their enormous abundance and diversity. Recent studies using single-cell genomics have additionally revealed putative ecological niches of the acI clade, one of the most abundant groups among freshwater actinobacteria [7, 8]. However, the physiology and genome properties of many actinobacterial groups that play crucial roles in biogeochemical processes in freshwater environments have remained largely unknown because the major freshwater actinobacterial groups, such as acI and acIV

April 2017 ⎪ Vol. 27⎪ No. 4 826 Kim et al.

clades, have not been cultivated.

The genus “Candidatus Microthrix” affiliated with the order Acidimicrobiales of the phylum Actinobacteria currently contains two filamentous Candidatus species, retrieved from activated sludge of wastewater treatment plants [9-11]. Genomic and metagenomic analyses uncovered the metabolic versatility of “Candidatus Microthrix parvicella,” which is highly abundant in sludge environments [12]. Several studies in freshwater ecosystems have described bacterial groups related to the genus “Candidatus Microthrix,” using cultivation-independent molecular analyses [13, 14]. To the best of our knowledge, however, there have been no reports on the isolation and genomic characterization of freshwater bacteria phylogenetically related to the “Candidatus Microthrix” group.

In this study, we report the genome sequence of strain IMCC26207, an actinobacterium isolated from an oligotrophic freshwater reservoir using dilution-to-extinction culturing and phylogenetically assigned to a novel group related to “Candidatus Microthrix.” Based on morphological, phylogenetic, and genomic differences between strain IMCC26207 and “Candidatus Microthrix,” the name “Candidatus Limnosphaera aquatica” is additionally proposed for this bacterium.

Materials and Methods

Isolation of Strain IMCC26207
Strain IMCC26207 was isolated from a surface water sample collected from Lake Soyang, an oligotrophic reservoir located in South Korea, by using the high-throughput dilution-to-extinction culturing (HTC) method [15, 16]. The culture medium (referred to as LNFMC, low nutrient freshwater medium amended with carbon mixtures) used for HTC was prepared by amending filtered and autoclaved lake water with the following ingredients: 10 µM NH4Cl, 10 µM KH2PO4, 50 µM pyruvic acid, 5 µM D-glucose, 5 µM D-ribose, 5 µM N-acetyl-D-glucosamine, 5 µM methyl alcohol, and 1:10,000 diluted Va vitamin mixture [17]. To determine the cell density for the inoculum, direct cell counting was performed microscopically, after staining with 4′,6-diamidino-2-phenylindole, using an epifluorescence microscope (Nikon 80i; Nikon, Japan). Freshwater bacterial cells were diluted to 2 cells/ml with LNFMC, and 1 ml of the inoculum was transferred into 48-well plates (BD Falcon, USA). Positively grown cells, including strain IMCC26207, were screened with SYBR Green I (Invitrogen, USA) using a flow cytometer (Guava EasyCyte Plus; Millipore, USA) after 4 weeks of incubation at 15°C in the dark. To test whether the initial axenic culture of IMCC26207 could grow in chemically defined media or synthetic media, R2A broth (BD, USA), 1/3 diluted R2A broth, 1/10 diluted R2A broth, and artificial freshwater medium (200 µM KH2PO4, 300 µM (NH4)2SO4, 200 µM KCl, 300 µM MgSO4·7H2O, 500 µM CaCl2·2H2O, 300 µM NaHCO3, 117 nM FeCl3·6H2O, 9 nM MnCl2·4H2O, 0.8 nM ZnSO4·7H2O, 0.5 nM CoCl2·6H2O, 0.3 nM Na2MoO4·2H2O, 1 nM Na2SeO3, 1 nM NiCl2·6H2O, and 1:5,000 diluted Va vitamin mixture [17] per 1 L of distilled water) amended with the above carbon mixtures were prepared. In addition, the colony-forming ability of strain IMCC26207 was tested by spotting 10 µl of viable cultures onto plates of R2A, 1/3 R2A, and 1/10 R2A.

Culture Conditions, Electron Microscopy, and Genomic DNA Preparation
Strain IMCC26207 was inoculated into 4 L of the liquid medium, LNFMC, the same culture medium used for the isolation, and grown aerobically at 15°C. During the cultivation, cell densities were monitored using a flow cytometer (Guava EasyCyte Plus; Millipore). For morphological characterization, bacterial cells in the stationary growth phase were adsorbed onto a 200-mesh formvar and carbon-coated copper grid (Electron Microscopy Sciences, USA), stained with uranyl acetate solution (2% (w/v)), and examined with a transmission electron microscope (CM200; Philips, The Netherlands). For the genome sequencing, 4 L cultures of strain IMCC26207 were harvested by filtration onto a polyethersulfone membrane filter (0.2 µm pore size; Pall Corporation, USA). The genomic DNA was directly extracted from the filter using a DNeasy Blood & Tissue Kit (Qiagen, USA) following the protocol for gram-positive bacteria, and was further purified using a PowerClean Pro DNA Clean-Up Kit (Mo Bio Laboratories, USA). The purity, quality, and concentration of the genomic DNA were determined using 1% (w/v) agarose gel electrophoresis and a Qubit 3.0 fluorometer (Invitrogen).

Genome Sequencing, Assembly, and Annotation
A sequencing library was constructed from the genomic DNA by using Nextera DNA Library Preparation Kits (Illumina, USA) according to the manufacturer’s instructions. The genome of strain IMCC26207 was sequenced at ChunLab, Inc. (Korea), using the Illumina MiSeq platform, at a read length of 2 × 300 bp. A total of 3,441,153 paired reads obtained from the genome sequencing were assembled using SPAdes ver. 3.5.0 [18], generating 10 contigs (90–836 kbp; N50, 345 kbp) with an average of 63.1× coverage. The total length of all contigs comprised 3,316,799 bp. Gene calling and genome annotation were performed by following the Integrated Microbial Genomes-Expert Review (IMG-ER) pipeline [19] developed by the Joint Genome Institute [20]. tRNA and rRNA genes were found using the tRNAscan-SE tool and HMMER, respectively [21, 22], and other non-coding RNAs were detected by searching the genome for the corresponding Rfam profiles using INFERNAL [23]. CRISPR repeats were checked by CRT-CLI and PILER-CR [24]. Protein-coding genes were identified using Prodigal [25]. The genome map was generated by CGview software using a merged GenBank file from the RAST server [26]. The predicted CDSs were translated and used to search the NCBI non-redundant, TIGRFam, Pfam, KEGG, COG, and InterPro

J. Microbiol. Biotechnol. Genome Sequence of “Candidatus Limnosphaera Aquatica” 827

databases. Additional functional analyses of metabolic pathways encoded by the genome were also performed using IMG-ER.
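The assembly summary reported above (10 contigs; N50, 345 kbp) uses the standard N50 definition: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. A minimal Python sketch of that calculation follows. The contig lengths in the example are hypothetical values chosen only for illustration; the individual IMCC26207 contig lengths are not listed in the text.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (in bp)."""
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig downwards until at least half of
    # the total assembly length has been covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (bp), for illustration only.
example_contigs = [800_000, 600_000, 400_000, 300_000, 200_000, 100_000]
print(n50(example_contigs))  # 600000
```

Comparing integers (`2 * running >= total`) avoids any floating-point rounding when the total length is odd.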

Phylogenetic Analyses and Genome Comparison
The 16S rRNA gene sequence of strain IMCC26207 was obtained from the genome sequence, and 16S rRNA gene sequence similarities between strain IMCC26207 and closely related strains were calculated using the EzTaxon-e server [27]. The 16S rRNA gene sequence of strain IMCC26207 was initially aligned with the SINA online aligner (http://www.arb-silva.de). The aligned sequence was imported into the SILVA database SSURef NR99 (Release 123) by using the ARB program [28], manually curated based on the secondary structure of SSU rRNA, and the aligned sequence set containing IMCC26207 was exported to build phylogenetic trees. Maximum-likelihood trees were constructed with rapid bootstrapping (a maximum of 1,000 bootstrap replicate searches) in RAxML (ver. 8.1.17) [29]. For a comparative genomics analysis, two publicly available “Candidatus Microthrix parvicella” genomes (strains BIO17-1 and RN1, GenBank accession nos. AMPG00000000 and CANL00000000, respectively) were chosen since strain IMCC26207 was phylogenetically related to the genus “Candidatus Microthrix.” All protein sequences encoded by the three genomes were exported from the IMG database. Unique and shared protein-coding genes were identified using GET_HOMOLOGUES software [30], with the default parameters. Proteins satisfying coverage (≥75%) and E-value (≤1e-05) cutoffs in BLAST pairwise alignments were clustered by using the orthoMCL algorithm implemented in the software. Comparisons between the IMCC26207 genome and two “Candidatus Microthrix” genomes were performed using Blast Ring Image Generator (BRIG) [31]. BLASTn results were displayed with a similarity cutoff of 30%. To determine the genome sequence similarity between the bacterial strains, average nucleotide identity (ANI) values were calculated from the EzGenome database (http://www.ezbiocloud.net/ezgenome/ani) using the BLASTn algorithms, as described by Goris et al. [32].

Nucleotide Sequence Accession Numbers
The assembled and annotated genome of “Candidatus Limnosphaera aquatica” IMCC26207 has been deposited in GenBank under the accession numbers LCZK01000001-LCZK01000010, JGI portal with GOLD ID of Gp0120613, and IMG taxon ID of 2606217176.

Results and Discussion

Strain Characteristics and Phylogenetic Analysis
Strain IMCC26207 was initially isolated using dilution-to-extinction culturing, employed with a filtered and autoclaved natural freshwater medium designated LNFMC. Since some bacteria such as Lentisphaera araneosa [33] cultivated using dilution-to-extinction culturing formed colonies on a synthetic medium or grew in an artificial liquid medium, and thus their binomial nomenclatures could be validly published, the colony-forming ability and growth capability in artificial liquid media were tested for strain IMCC26207. Unfortunately, strain IMCC26207 did not grow in any of the artificial liquid media or on any of the solid media, but grew only in LNFMC, which was consequently used for further analyses. The culture reached the stationary phase after 2 weeks of incubation with a maximum cell density of approximately 8 × 10⁵ cells/ml. Cells of strain IMCC26207 looked pure in epifluorescence microscopy and transmission electron microscopy, showing a coccus-shaped morphology with diameters of 0.4–0.5 µm, dividing by binary fission (Fig. 1).

Fig. 1. Transmission electron micrographs of strain IMCC26207 in non-dividing (left) and dividing states (right). Scale bars represent 100 nm (left) and 200 nm (right), respectively.

The comparative 16S rRNA gene sequence analyses showed that strain IMCC26207 was only distantly related to “Candidatus Microthrix parvicella” Bio17-1 [9], “Candidatus Microthrix calida” TNO2-1 [10], Aquihabitans daechungensis CH22-21T [34], Aciditerrimonas ferrireducens IC-180T [35], and Iamia majanohamensis F12T [36], with low sequence similarity values of 91.7%, 90.8%, 89.7%, 89.2%, and 88.2%, respectively. In the phylogenetic tree constructed by using the maximum-likelihood algorithm, strain IMCC26207 formed a monophyletic lineage with many uncultured environmental clones, such as STH5-14 (DQ316384) and ANTLV1_D11 (DQ521487) previously reported from freshwater habitats, and occupied a phylogenetic clade that was distinct from the “Candidatus Microthrix” clade (Fig. 2). The larger clade containing strain IMCC26207 and “Candidatus Microthrix” was clearly separated from other members of the order Acidimicrobiales, suggesting that strain IMCC26207 and “Candidatus Microthrix” may comprise a separate family in the order. From the phylogenetic inference and morphological distinction (coccus vs. filamentous), strain IMCC26207 is considered to represent a novel taxon that is different from the genus “Candidatus Microthrix.”
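The pairwise 16S rRNA gene sequence similarities quoted above are, in essence, percentages of identical positions between aligned sequences. The following Python sketch shows one simple way such a value can be computed. It assumes two pre-aligned sequences of equal length with "-" marking gaps, and it skips any column that is gapped in either sequence. These conventions and the example fragments are hypothetical, and the exact procedure used by the EzTaxon-e server may differ.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == "-" or base_b == "-":
            continue  # skip alignment columns that contain a gap
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments, for illustration only.
seq_a = "ACGTAC-GTTAG"
seq_b = "ACGTTCAGT-AG"
print(round(percent_identity(seq_a, seq_b), 1))  # 90.0
```

In the example, 10 ungapped columns are compared and 9 of them match. Whether gapped columns are skipped or counted as mismatches changes the result, so the convention should be stated whenever similarity values are compared across studies.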


Fig. 2. Maximum-likelihood phylogenetic tree based on 16S rRNA gene sequences showing the position of strain IMCC26207. Bootstrap replicate searches were performed 350 times. Bootstrap values (>60%) are shown at branch nodes. Streptomyces griseus subsp. griseus NBRC 13350 was used as an outgroup. Scale bar, 0.1 substitutions per nucleotide position.

Genome Characteristics
The draft genome of strain IMCC26207 comprised 10 contigs with a total length of 3,316,799 bp and G+C content of 57.26% (Table 1). The bacterial genome contained 3,026 genes: 2,975 protein-coding genes and 51 non-coding RNA genes, including 3 rRNA genes and 45 tRNA genes. Neither plasmid nor phage DNA sequences were detected. The majority (2,284 genes) of 2,975 protein-coding genes predicted in the genome (76.77%) could be assigned a putative function. A total of 1,917 genes (63.35%) were allocated into COG functional categories (Table 2). The major COG categories were lipid transport and metabolism (I; 226 genes; 10.14%), amino acid transport and metabolism (E; 168 genes; 7.54%), coenzyme transport and metabolism (H; 165 genes; 7.41%), translation, ribosomal structure and biogenesis (J; 159 genes; 7.14%), energy production and conversion (C; 146 genes; 6.55%), cell wall/membrane biogenesis (M; 138 genes; 6.19%), and secondary metabolites biosynthesis, transport and catabolism (Q; 125 genes; 5.61%).

Table 1. Genome statistics of “Candidatus Limnosphaera aquatica” strain IMCC26207 and “Candidatus Microthrix parvicella” strains Bio17-1 and RN1.

Attribute                           IMCC26207    Bio17-1      RN1
Genome size (bp)                    3,316,799    4,202,850    4,202,896
DNA coding (%)                      93.24        90.76        90.72
DNA G+C (%)                         57.26        66.39        66.33
DNA scaffolds                       10           13           87
Total genes                         3,026        3,968        3,987
Protein-coding genes                2,975        3,913        3,930
RNA genes                           51           55           57
rRNA genes                          3            3            3
tRNA genes                          45           49           49
Other RNA genes                     3            3            5
Genes in internal clusters          236          494          453
Genes with function prediction      2,284        2,835        2,828
Genes assigned to COGs              1,917        2,252        2,239
Genes with Pfam domains             2,374        2,958        2,972
Genes with signal peptides          140          123          124
Genes with transmembrane helices    726          870          892
CRISPR repeats                      0            3            3

Table 2. Number of genes associated with general COG functional categories.

Code  Description                                                     Value   Percent
E     Amino acid transport and metabolism                             168     7.54
G     Carbohydrate transport and metabolism                           107     4.80
D     Cell cycle control, cell division, chromosome partitioning      31      1.39
N     Cell motility                                                   17      0.76
M     Cell wall/membrane/envelope biogenesis                          138     6.19
B     Chromatin structure and dynamics                                1       0.04
H     Coenzyme transport and metabolism                               165     7.41
Z     Cytoskeleton                                                    2       0.09
V     Defense mechanisms                                              49      2.20
C     Energy production and conversion                                146     6.55
W     Extracellular structures                                        14      0.63
S     Function unknown                                                110     4.94
R     General function prediction only                                238     10.68
P     Inorganic ion transport and metabolism                          101     4.53
U     Intracellular trafficking, secretion, and vesicular transport   28      1.26
I     Lipid transport and metabolism                                  226     10.14
X     Mobilome: prophages, transposons                                5       0.22
F     Nucleotide transport and metabolism                             62      2.78
O     Posttranslational modification, protein turnover, chaperones    83      3.73
A     RNA processing and modification                                 1       0.04
L     Replication, recombination and repair                           81      3.64
Q     Secondary metabolites biosynthesis, transport and catabolism    125     5.61
T     Signal transduction mechanisms                                  63      2.83
K     Transcription                                                   108     4.85
J     Translation, ribosomal structure and biogenesis                 159     7.14
-     Not in COG                                                      1,109   36.65

Metabolic Pathways and Genome Comparison
In order to understand the putative ecological niche of strain IMCC26207 in freshwater environments, the genome sequence analysis was focused on metabolic pathways encoded in the genome. The details of the genes related to the metabolic pathways described are shown in Table 3 with IMG Gene ID. A schematic diagram of metabolic pathways encoded in the genome of strain IMCC26207 based on KEGG annotations is also presented in Fig. 3.

Regarding carbohydrate metabolism, the IMCC26207 genome contained genes encoding the Embden-Meyerhof-Parnas pathway of glycolysis and the complete oxidative tricarboxylic acid cycle. Genes encoding the complete pentose phosphate pathway (PPP; oxidative and non-oxidative) were also present and it was predicted that several pentoses, such as ribose and xylose, could be utilized as a carbon source by assimilation through the PPP. Xylose is one of the most abundant monosaccharides in terrestrial plants [37], which may contribute to the carbon pool of inland waters. Genes encoding enzymes required for trehalose synthesis and hydrolysis were also found in the IMCC26207 genome sequence, suggesting the role of trehalose as a storage material. With regard to inorganic nutrient utilization, the IMCC26207 genome contained genes coding for proteins that participate in the assimilatory sulfate reduction and ammonia assimilation cycle. The IMCC26207 genome also encoded the phosphate transporter (pstABC), polyphosphate kinases (ppk), and two-component regulatory system (senX3-regX3) inducible by low concentrations of environmental inorganic phosphate, suggesting that this bacterium can survive under phosphorus-limiting freshwater conditions. These phosphorus-related genes were also found in the genomes of two “Candidatus Microthrix parvicella” strains that are known to undergo phosphorus uptake only under phosphorus-limiting conditions [10, 12].

Comparative genomics analysis based on orthologs among the genomes retrieved using GET_HOMOLOGUES showed that strain IMCC26207 shared only 52% of its proteins with “Candidatus Microthrix” strains, whereas


Table 3. A list of genes involved in major metabolic pathways predicted in the IMCC26207 genome.

Pathway / Genes                                                   IMG Gene ID

Embden-Meyerhof-Parnas glycolysis pathway
  Glucose-6-phosphate isomerase                                   2606506697
  6-Phosphofructokinase                                           2606506363
  Fructose-bisphosphate aldolase                                  2606508673
  Triosephosphate isomerase                                       2606509032
  Glyceraldehyde-3-phosphate dehydrogenase                        2606509034
  Phosphoglycerate kinase                                         2606509033
  Phosphoglycerate mutase                                         2606506161
  Enolase                                                         2606507641
  Pyruvate kinase                                                 2606508771
  Pyruvate dehydrogenase E1 component                             2606507486
  Dihydrolipoamide dehydrogenase                                  2606506403

Tricarboxylic acid cycle
  Citrate synthase                                                2606509055
  Aconitase                                                       2606509050
  Isocitrate dehydrogenase                                        2606507723
  2-Oxoglutarate dehydrogenase E1 component                       2606508549
  2-Oxoglutarate dehydrogenase E2 component                       2606507476
  Dihydrolipoamide dehydrogenase                                  2606507477
  Succinyl-CoA synthetase alpha subunit                           2606507870, 2606507871
  Succinate dehydrogenase                                         2606507709, 2606507710, 2606507711
  Fumarase                                                        2606508271
  Malate dehydrogenase                                            2606507718

Pentose phosphate pathway
  Glucose-6-phosphate 1-dehydrogenase                             2606507495
  6-Phosphogluconolactonase                                       2606507496
  6-Phosphogluconate dehydrogenase                                2606507494
  Ribulose-phosphate 3-epimerase                                  2606506785
  Ribose-5-phosphate isomerase                                    2606506483
  Transketolase                                                   2606508788
  Transaldolase                                                   2606508787

Xylose and ribose utilization
  D-Xylose isomerase                                              2606509163
  Xylulokinase                                                    2606509164
  Ribulose-phosphate 3-epimerase                                  2606506785
  Ribokinase                                                      2606508596

Trehalose metabolism
  Trehalose 6-phosphate synthase                                  2606508370
  Trehalose 6-phosphate phosphatase                               2606508371
  Trehalose synthase                                              2606506807

Assimilatory sulfate reduction
  Sulfate adenylyltransferase                                     2606508541, 2606508542, 2606508543
  Adenylyl-sulfate kinase                                         2606508542, 2606508543
  Phosphoadenosine phosphosulfate reductase                       2606506866
  Sulfite reductase                                               2606506867

Ammonia assimilation
  Glutamine synthetase                                            2606507065, 2606507066
  Glutamate synthase                                              2606508738, 2606508739

Polyphosphate metabolism
  ABC-type phosphate transport system, substrate-binding protein  2606507196, 2606507697
  Phosphate transport system permease protein                     2606507193, 2606507194, 2606507696, 2606507695
  Phosphate transport system ATP-binding protein                  2606507195, 2606507694
  Polyphosphate kinase                                            2606507699
  Two-component system, SenX3 and RegX3                           2606507690, 2606507691


>87% of proteins were shared between the two strains of “Candidatus Microthrix,” which suggested a genome-scale difference between strain IMCC26207 and the “Candidatus Microthrix” strains (Fig. 4A). The overall genome statistics of “Candidatus Microthrix” are presented in Table 1, together with those of strain IMCC26207. The 1,364 proteins found to be unique to strain IMCC26207 by GET_HOMOLOGUES were analyzed in terms of COG category distribution. Only 51.9% of the unique proteins were assigned to COGs, and the major COG categories were general function prediction (7.11%), lipid transport and metabolism (5.65%), function unknown (4.62%), and cell wall/membrane/envelope biogenesis (4.55%). Most of the unique proteins that could not be assigned to COGs were hypothetical proteins. In contrast, of 1,489 proteins shared among the three Candidatus strains, 95.9% were assigned to COGs, and the major categories were lipid transport and metabolism (9.47%), translation/ribosomal structure and biogenesis (9.00%), general function prediction only (8.60%), amino acid transport and metabolism (8.19%), coenzyme transport and metabolism (8.06%), and energy production and conversion (8.06%). In the comparative genomic analyses using BRIG, the two “Candidatus Microthrix” strains showed high genome sequence similarity as expected, whereas strain IMCC26207 had a wide variety of genome regions that were unique to the IMCC26207 genome (Fig. 4B). In addition, ANI values between strain IMCC26207 and the other two “Candidatus Microthrix” strains were both ~65.7%, whereas the ANI value between the two “Candidatus Microthrix” strains was 98.9%. This confirms that the two “Candidatus Microthrix” strains belong to the same genomic species, since their ANI value exceeded 95-96%, the currently accepted ANI threshold for bacterial genomic species demarcation [38, 39]. This overall genome comparison result was consistent with the phylogenetic relatedness inferred from 16S rRNA gene sequences, as strain IMCC26207 was located in a strongly supported branch clearly separated from the genus “Candidatus Microthrix” (Fig. 2).

Fig. 3. Overview of metabolic pathways of strain IMCC26207. The presence and absence of genes were predicted within the Integrated Microbial Genomes system based on KEGG annotations.

In conclusion, analysis of the IMCC26207 genome showed several metabolic features relevant to living in freshwater environments. A comparative analysis of the IMCC26207 genome sequence showed a low level of genomic similarity to “Candidatus Microthrix” strains isolated from activated sludge. These analyses, combined with phylogenetic analyses, suggested that strain IMCC26207 is distantly related to “Candidatus Microthrix” and represents a novel genus-level taxon. As strain IMCC26207 does not grow on conventional agar media but grows only in a liquid medium prepared with natural lake water, we propose to


call this species “Candidatus Limnosphaera aquatica” according to the following etymology: Lim.no.sphae’ra. Gr. n. limnos pool of standing water, lake; L. fem. n. sphaera sphere; N.L. fem. n. Limnosphaera, a spherical cell living in lake; a.qua’ti.ca. L. fem. adj. aquatica, living, growing, or found in or by water, aquatic.

Fig. 4. Comparison of the genome of strain IMCC26207 with the genomes of “Candidatus Microthrix parvicella” strains BIO17-1 and RN1. (A) The Venn diagram illustrates the number of unique and shared protein-coding genes in the three genomes. (B) BLAST-based comparison of the IMCC26207 genome with other reference genomes. The innermost rings show the GC skew (purple/green) and GC content (black) of the IMCC26207 genome sequence represented as a pseudoscaffold. The next two rings show the coding regions for the two reference genomes, BIO17-1 (red) and RN1 (blue). The colored regions indicate the pairwise genomic similarity according to BLASTn. The figure was generated using BRIG (BLAST Ring Image Generator) software. Lower identity threshold, 30%.

Acknowledgments

This study was supported by the Mid-Career Research Program through the National Research Foundation (NRF) funded by the Ministry of Science, ICT, and Future Planning (NRF-2016R1A2B2015142).

References

1. Zwart G, Crump BC, Kamst-van Agterveld MP, Hagen F, Han S-K. 2002. Typical freshwater bacteria: an analysis of available 16S rRNA gene sequences from plankton of lakes and rivers. Aquat. Microb. Ecol. 28: 141-155.
2. Newton RJ, Jones SE, Eiler A, McMahon KD, Bertilsson S. 2011. A guide to the natural history of freshwater lake bacteria. Microbiol. Mol. Biol. Rev. 75: 14-49.
3. Tranvik LJ, Downing JA, Cotner JB, Loiselle SA, Striegl RG, Ballatore TJ, et al. 2009. Lakes and reservoirs as regulators of carbon cycling and climate. Limnol. Oceanogr. 54: 2298-2314.
4. Glöckner FO, Zaichikov E, Belkova N, Denissova L, Pernthaler J, Pernthaler A, Amann R. 2000. Comparative 16S rRNA analysis of lake bacterioplankton reveals globally distributed phylogenetic clusters including an abundant group of actinobacteria. Appl. Environ. Microbiol. 66: 5053-5065.
5. Warnecke F, Sommaruga R, Sekar R, Hofer JS, Pernthaler J. 2005. Abundances, identity, and growth state of actinobacteria in mountain lakes of different UV transparency. Appl. Environ. Microbiol. 71: 5551-5559.
6. Eiler A, Zaremba-Niedzwiedzka K, Martinez-Garcia M, McMahon KD, Stepanauskas R, Andersson SGE, Bertilsson S. 2014. Productivity and salinity structuring of the microplankton revealed by comparative freshwater metagenomics. Environ. Microbiol. 16: 2682-2698.
7. Garcia SL, McMahon KD, Martinez-Garcia M, Srivastava A, Sczyrba A, Stepanauskas R, et al. 2013. Metabolic potential of a single cell belonging to one of the most abundant lineages in freshwater bacterioplankton. ISME J. 7: 137-147.
8. Ghylin TW, Garcia SL, Moya F, Oyserman BO, Schwientek P, Forest KT, et al. 2014. Comparative single-cell genomics reveals potential ecological niches for the freshwater acI Actinobacteria lineage. ISME J. 8: 2503-2516.
9. Blackall LL, Stratton H, Bradford D, Del Dot T, Sjörup C, Seviour EM, Seviour RJ. 1996. “Candidatus Microthrix parvicella,” a filamentous bacterium from activated sludge sewage treatment plants. Int. J. Syst. Bacteriol. 46: 344-346.
10. Levantesi C, Rossetti S, Thelen K, Kragelund C, Krooneman J, Eikelboom D, et al. 2006. Phylogeny, physiology and distribution of ‘Candidatus Microthrix calida’, a new Microthrix species isolated from industrial activated sludge wastewater treatment plants. Environ. Microbiol. 8: 1552-1563.
11. Rossetti S, Tomei MC, Nielsen PH, Tandoi V. 2005. “Microthrix parvicella”, a filamentous bacterium causing bulking and foaming in activated sludge systems: a review of current knowledge. FEMS Microbiol. Rev. 29: 49-64.
12. McIlroy SJ, Kristiansen R, Albertsen M, Karst SM, Rossetti S, Nielsen JL, et al. 2013. Metabolic model for the filamentous ‘Candidatus Microthrix parvicella’ based on genomic and metagenomic analyses. ISME J. 7: 1161-1172.
13. Allgaier M, Grossart H-P. 2006. Diversity and seasonal dynamics of Actinobacteria populations in four lakes in northeastern Germany. Appl. Environ. Microbiol. 72: 3489-3497.
14. Annika CM, Murray AE, Fritsen CH. 2007. Microbiota within the perennial ice cover of Lake Vida, Antarctica. FEMS Microbiol. Ecol. 59: 274-288.
15. Cho J-C, Giovannoni SJ. 2004. Cultivation and growth characteristics of a diverse group of oligotrophic marine Gammaproteobacteria. Appl. Environ. Microbiol. 70: 432-440.
16. Connon SA, Giovannoni SJ. 2002. High-throughput methods for culturing microorganisms in very-low-nutrient media yield diverse new marine isolates. Appl. Environ. Microbiol. 68: 3878-3885.
17. Davis HC, Guillard RR. 1958. Relative value of ten genera of micro-organisms as foods for oyster and clam larvae. USFWS Fish. Bull. 58: 293-304.
18. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, et al. 2012. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J. Comput. Biol. 19: 455-477.
19. Chen I, Markowitz VM, Chu K, Anderson I, Mavromatis K, Kyrpides NC, Ivanova NN. 2013. Improving microbial genome annotations in an integrated database context. PLoS One 8: e54859.
20. Markowitz VM, Mavromatis K, Ivanova NN, Chen I-MA, Chu K, Kyrpides NC. 2009. IMG ER: a system for microbial genome annotation expert review and curation. Bioinformatics 25: 2271-2278.
21. Lowe TM, Eddy SR. 1997. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25: 0955-0964.
22. Finn RD, Clements J, Arndt W, Miller BL, Wheeler TJ, Schreiber F, et al. 2015. HMMER web server: 2015 update. Nucleic Acids Res. 43: W30-W38.
23. Nawrocki EP, Kolbe DL, Eddy SR. 2009. Infernal 1.0: inference of RNA alignments. Bioinformatics 25: 1335-1337.
24. Edgar RC. 2007. PILER-CR: fast and accurate identification of CRISPR repeats. BMC Bioinformatics 8: 18.
25. Hyatt D, Chen G-L, LoCascio PF, Land ML, Larimer FW, Hauser LJ. 2010. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11: 119.
26. Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, et al. 2008. The RAST Server: rapid annotations using subsystems technology. BMC Genomics 9: 75.
27. Kim OS, Cho YJ, Lee K, Yoon SH, Kim M, Na H, et al. 2012. Introducing EzTaxon-e: a prokaryotic 16S rRNA gene sequence database with phylotypes that represent uncultured species. Int. J. Syst. Evol. Microbiol. 62: 716-721.
28. Ludwig W, Strunk O, Westram R, Richter L, Meier H, Yadhukumar, et al. 2004. ARB: a software environment for sequence data. Nucleic Acids Res. 32: 1363-1371.
29. Stamatakis A. 2014. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30: 1312-1313.
30. Contreras-Moreira B, Vinuesa P. 2013. GET_HOMOLOGUES, a versatile software package for scalable and robust microbial pangenome analysis. Appl. Environ. Microbiol. 79: 7696-7701.
31. Alikhan NF, Petty NK, Ben Zakour NL, Beatson SA. 2011. BLAST Ring Image Generator (BRIG): simple prokaryote genome comparisons. BMC Genomics 12: 402.
32. Goris J, Konstantinidis KT, Klappenbach JA, Coenye T, Vandamme P, Tiedje JM. 2007. DNA–DNA hybridization values and their relationship to whole-genome sequence similarities. Int. J. Syst. Evol. Microbiol. 57: 81-91.
33. Cho JC, Vergin KL, Morris RM, Giovannoni SJ. 2004. Lentisphaera araneosa gen. nov., sp. nov, a transparent exopolymer producing marine bacterium, and the description of a novel bacterial phylum, Lentisphaerae. Environ. Microbiol. 6: 611-621.
34. Jin L, Huy H, Kim KK, Lee H-G, Kim H-S, Ahn C-Y, Oh H-M. 2013. Aquihabitans daechungensis gen. nov., sp. nov., an actinobacterium isolated from reservoir water. Int. J. Syst. Evol. Microbiol. 63: 2970-2974.
35. Itoh T, Yamanoi K, Kudo T, Ohkuma M, Takashina T. 2011. Aciditerrimonas ferrireducens gen. nov., sp. nov., an iron-reducing thermoacidophilic actinobacterium isolated from a solfataric field. Int. J. Syst. Evol. Microbiol. 61: 1281-1285.
36. Kurahashi M, Fukunaga Y, Sakiyama Y, Harayama S, Yokota A. 2009. Iamia majanohamensis gen. nov., sp. nov., an actinobacterium isolated from sea cucumber Holothuria edulis, and proposal of Iamiaceae fam. nov. Int. J. Syst. Evol. Microbiol. 59: 869-873.
37. Matsushika A, Inoue H, Kodaki T, Sawayama S. 2009. Ethanol production from xylose in engineered Saccharomyces cerevisiae strains: current state and perspectives. Appl. Microbiol. Biotechnol. 84: 37-53.
38. Richter M, Rossello-Mora R. 2009. Shifting the genomic gold standard for the prokaryotic species definition. Proc. Natl. Acad. Sci. USA 106: 19126-19131.
39. Kim M, Oh HS, Park SC, Chun J. 2014. Towards a taxonomic coherence between average nucleotide identity and 16S rRNA gene sequence similarity for species demarcation of prokaryotes. Int. J. Syst. Evol. Microbiol. 64: 346-351.
