J. Microbiol. Biotechnol. (2014), 24(11), 1490–1494 http://dx.doi.org/10.4014/jmb.1405.05039 Research Article Review jmb

Genome of Betaproteobacterium Caenimonas sp. Strain SL110 Contains a Coenzyme F420 Biosynthesis Gene Cluster

Xiuling Li1, Fuying Feng2, and Yonghui Zeng3*

1Shandong Provincial Key Laboratory of Water and Soil Conservation & Environmental Protection, Linyi University, Linyi 276000, P.R. China
2Institute for Applied and Environmental Microbiology, Inner Mongolia Agriculture University, Huhhot 010018, P.R. China
3Guangdong Provincial Key Laboratory of Pathogenic Biology and Epidemiology for Aquatic Economic Animals, Guangdong Ocean University, Zhanjiang 524025, P.R. China

To probe the genomic properties of microbes thriving in desert lakes, we sequenced the full genome of a betaproteobacterial strain (SL110) belonging to the understudied genus Caenimonas of the family Comamonadaceae. This strain was isolated from a freshwater lake in the western Gobi Desert, Northern China. Its genome contains genes encoding carbon monoxide dehydrogenase, nitrate reductase, nitrite reductase, nitric oxide reductase, and sulfur oxidation enzymes, highlighting the potentially important contribution of this group of bacteria to the cycling of inorganic elements in nature. Unexpectedly, a coenzyme F420 biosynthesis gene cluster was identified. A further search for F420 biosynthesis gene homologs in genomic databases suggests the possible widespread presence of F420 biosynthesis gene clusters in proteobacterial genomes.

Keywords: Genome sequencing, Caenimonas, F420 biosynthesis gene cluster, horizontal gene transfer, Actinobacteria

Received: May 19, 2014
Revised: July 7, 2014
Accepted: July 16, 2014
First published online: July 22, 2014

*Corresponding author
Phone: +420-384 340 454; Fax: +420-384 340 415; E-mail: [email protected]
Present Address: Department of Phototrophic Microorganisms, Institute of Microbiology, Academy of Sciences CAS, Opatovický mlýn, Třeboň 37981, Czech Republic

Supplementary data for this paper are available on-line only at http://jmb.or.kr.

pISSN 1017-7825, eISSN 1738-8872

Copyright© 2014 by The Korean Society for Microbiology and Biotechnology

The enzyme cofactor F420 is a naturally occurring deazaflavin derivative. So far, F420 has been proposed to play important roles in the cellular metabolism of methanogenic archaea [3] and the mycobacteria of Actinobacteria [2, 22]. The lower redox potential of F420 compared with NADP may enable it to reduce many recalcitrant molecules that are widely distributed in polluted environments, such as picric acid, malachite green, and aflatoxin [6, 23]. F420 biosynthesis pathways have largely been elucidated in Archaea [13-16] and in Actinobacteria [4, 5]. However, the F420 biosynthesis pathway in bacteria other than Actinobacteria has received little attention. To improve our understanding in this respect, we sequenced the genome of a novel betaproteobacterium strain, SL110, and report the finding of an F420 biosynthesis gene cluster as well as other potential capabilities predicted from its genome sequence.

As part of a project that aims to protect and preserve the microbial resources in many endangered and vanishing desert lakes in northern China, we isolated a novel strain from the surface water of the freshwater Swan Lake on 15 December 2011 (42.005° N, 101.585° E, approx. 900 m a.s.l.), located at the northern margin of the Badain Jaran Desert, a part of the Gobi Desert. This light-pink isolate, named SL110, was purified from a 1/2 strength R2A agar plate (Difco). The 16S rRNA gene phylogeny placed it into the genus Caenimonas of the family Comamonadaceae of the Betaproteobacteria (Fig. 1). The family Comamonadaceae is a phylogenetically coherent but physiologically heterogeneous group of prokaryotes encompassing over 15 genera [10]. Its members are regularly isolated from water, soil, and polluted environments [24]. So far, the genus Caenimonas comprises only two type strains: Caenimonas koreensis [21] and Caenimonas terrae [17]. Twenty complete genome sequences and 48 shotgun genome sequences are available in this family (NCBI Genome database records up to Jan. 2014; for their taxa distribution refer to Fig. 1). However, genome sequences from Caenimonas have not yet been reported, which limits our ability to assess the physiological potential of the members of this genus.

Fig. 1. Phylogenetic tree of 16S rRNA genes from cultured members of the family Comamonadaceae. To the right is the summarized current status of genome sequencing in each genus.

To elucidate the genomic features of the Caenimonas bacteria, the genomic DNA of strain SL110 was extracted using a genomic DNA purification kit (Tiangen, Beijing, China) and then used to construct a 300 bp DNA library, followed by sequencing on an Illumina HiSeq 2000 platform at Meiji Biotech Inc., Shanghai, China. Approximately 2.04 GB of raw data of 101-bp-long paired-end reads were generated. A total of 7,718,442 reads were subjected to end trimming (quality score <20) and filtering by quality (90% of positions with quality score ≥20) using the Galaxy server [11]. Finally, 6,821,277 (88.4%) high-quality reads were de novo assembled into contigs using the Velvet program (ver. 1.2.08) [25]. Annotation was performed with the online server RAST [1]. The final genome assembly data have been submitted to GenBank under BioProject PRJNA235880 (Accession No. JANU00000000). Raw Illumina reads were deposited at the NCBI Sequence Read Archive under BioSample SAMN02586129 with the accession number SRR1137869. The associated 16S rRNA gene sequence from the assembled SL110 genome was deposited in GenBank under accession number KJ210008.

The draft genome consisted of 74 contigs, with an N50 value of 151,450 bp and an N90 value of 37,954 bp. The total genome contained 4,721,438 bases, with 4,359 open reading frames and 46 tRNAs predicted. The G+C content was 63.4%. One complete rRNA operon was identified on the longest contig (Contig13, 429,022 bases). Its 16S rRNA gene showed 98.8% sequence identity to that of Caenimonas koreensis strain EMB320T. Various enzymes that are closely related to an aerobic lifestyle were identified, such as catalase and superoxide dismutase (Table S1). The genome contains an operon coding for flagella biosynthesis genes and a set of genes encoding twitching motility proteins (Table S1), suggesting that this strain is highly motile and may be able to switch its lifestyle between free living and attachment to particles. The genome contains genes encoding enzymes that are related to the cycling of inorganic elements in nature, including carbon monoxide dehydrogenase (CO utilization); nitrate reductase, nitrite reductase, and nitric oxide reductase (NO3⁻ → NO2⁻ → NO → N2O, a denitrification process); and sulfur oxidation proteins (sulfur assimilation) (Table S1). These findings highlight the potentially important contribution of the Caenimonas bacteria to the inorganic carbon, nitrogen, and sulfur cycling in the environment.

Fig. 2. F420 biosynthesis pathway elucidated in Archaea and Bacteria and the distribution of F420 biosynthesis gene clusters in Bacteria. (A) Distribution of F420 biosynthesis genes in the SL110 genome. For enzymes encoded by each gene, see Table S2. (B) F420 biosynthesis pathway elucidated in Archaea and Actinobacteria. (C) Comparison of F420 biosynthesis gene clusters from Alphaproteobacteria, Betaproteobacteria, and Actinobacteria.

We compared the SL110 genome with all other genomes available in the family Comamonadaceae and found that one distinct feature in SL110 was the presence of a coenzyme F420 biosynthesis gene cluster (Fig. 2A). In the SL110 genome, all essential genes for F420 biosynthesis were identified (Fig. 2B, Table S2). So far, only one study
conducted by Selengut and Haft [22] has addressed the possible distribution of F420 biosynthesis genes and the genes encoding F420-dependent enzymes in the genomes of prokaryotes other than Archaea and Actinobacteria. In that pioneering study, the authors found the co-clustering of F420 biosynthesis enzymes and enzymes using F420 in the alphaproteobacterium Phenylobacterium zucineum HLK1 and suggested horizontal gene transfer (HGT) as a probable cause. Here, we further searched the fast-growing NCBI whole genome and reference genome databases with the aims of explicitly exploring the distribution of F420 biosynthesis gene clusters in bacterial genomes and finding phylogenetic evidence for the HGT of F420 biosynthesis genes among bacterial taxa.

We found that 39 alphaproteobacterial genomes and 7 betaproteobacterial genomes contain a complete set of homologs of the cofCGHED genes that encode the four key enzymes involved in the F420 biosynthesis pathway (2-phospho-L-lactate guanylyltransferase, cofC; 7,8-didemethyl-8-hydroxy-5-deazariboflavin synthase, cofGH; deazariboflavin derivative 2-phospho-L-lactate transferase, cofD; coenzyme F420 L-glutamate ligase, cofE; see Fig. 2B, Table S2). Phylogenetic analysis of concatenated CofCGHED protein sequences showed that proteobacterial sequences formed a tight cluster independent of Actinobacteria and Archaea (Fig. 3). Surprisingly, this proteobacterial cluster contains one actinobacterium, Rubrobacter xylanophilus DSM 9941, and a gram-positive thermophile, Thermobaculum terrenum ATCC BAA-798 (Fig. 3), strongly signaling the occurrence of horizontal transfers of F420 biosynthesis genes between Archaea/Actinobacteria and Proteobacteria. In addition, F420 biosynthesis genes are organized into a gene cluster in many proteobacterial genomes, similar to what is observed in Actinobacteria (Fig. 2C), suggesting that they may have retained their original functions. Indeed, we found that the protein sequences of SL110 CofD (Fig. S1) and CofGH (Fig. S2) contain almost all key amino acid residues previously identified in Archaea and Actinobacteria [9, 12]. Bacterial genes in the same metabolic pathway are often organized into gene clusters or operons in bacterial genomes [7, 20], probably facilitating their co-regulation or horizontal transfer among different taxa [8, 18, 19]. Nevertheless, direct experimental evidence for the horizontal gene transfer of F420 biosynthesis gene clusters between Proteobacteria and Actinobacteria is needed to prove the in vivo presence and functioning of F420 biosynthesis enzymes and products. As the first step, this work provides strong genomic and phylogenetic evidence for the presence of a coenzyme F420 biosynthesis pathway in various alpha- and beta-proteobacteria, providing new insights into the distribution and possible evolutionary origins of F420 biosynthesis genes in Proteobacteria.
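The read-processing step described in the methods (end trimming at quality score <20, then keeping reads in which 90% of positions have a quality score ≥20) can be illustrated with a minimal sketch. This is not the Galaxy server's implementation: the function names, the choice to trim both ends of a read, and the toy quality values are assumptions made for illustration. Only the thresholds and the read counts come from the text.

```python
def trim_ends(quals, min_q=20):
    """Return (start, end) slice bounds after removing low-quality bases
    from both ends of a read. Trimming both ends is an assumption here."""
    start, end = 0, len(quals)
    while start < end and quals[start] < min_q:
        start += 1
    while end > start and quals[end - 1] < min_q:
        end -= 1
    return start, end


def passes_filter(quals, min_q=20, min_percent=90):
    """Keep a read if at least min_percent of its positions reach min_q.
    Integer arithmetic avoids floating-point edge cases at the cutoff."""
    if not quals:
        return False
    good = sum(1 for q in quals if q >= min_q)
    return good * 100 >= min_percent * len(quals)


def retained_percent(kept, total):
    """Percentage of reads retained, rounded to one decimal place."""
    return round(100 * kept / total, 1)


# Hypothetical per-base quality scores for one read.
toy_read = [5, 10, 30, 35, 12, 40, 8]
start, end = trim_ends(toy_read)
trimmed = toy_read[start:end]          # [30, 35, 12, 40]
print(trimmed, passes_filter(trimmed))

# Read counts reported in the text: 6,821,277 of 7,718,442 reads kept.
print(retained_percent(6_821_277, 7_718_442))  # 88.4
```

The final line reproduces the 88.4% retention reported for the high-quality reads.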

November 2014 ⎪ Vol. 24⎪ No. 11 1493 Li et al.
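The N50 and N90 values reported for the draft assembly are contiguity statistics: Nx is the length of the shortest contig in the smallest set of longest contigs that together cover at least x% of the assembly. The sketch below shows one common way to compute them. The function name and the toy contig lengths are hypothetical, since the lengths of the 74 SL110 contigs are not listed in the text.

```python
def nx_value(contig_lengths, percent):
    """Return the Nx statistic for a list of contig lengths.

    Contigs are sorted from longest to shortest and accumulated until
    they cover at least `percent` of the total assembly length; the
    length of the last contig added is the Nx value.
    """
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range (0, 100]")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding at the cutoff.
        if running * 100 >= total * percent:
            return length
    return min(contig_lengths)  # not reached for valid input


# Hypothetical contig lengths (total = 300), not the SL110 assembly.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(nx_value(toy_contigs, 50))  # 70: 80 + 70 = 150 covers half of 300
print(nx_value(toy_contigs, 90))  # 30: 80+70+50+40+30 = 270 covers 90%
```

Because N90 requires covering more of the assembly with progressively shorter contigs, it is never larger than N50, which is consistent with the reported values of 37,954 bp and 151,450 bp.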

Fig. 3. Phylogenetic analysis of key F420 biosynthesis genes. The aligned protein sequences of each gene were concatenated and subjected to the inference of neighbor-joining and maximum likelihood trees using the ClustalW program embedded in the MEGA 5.0 package. Bootstrap values over 50% are shown on the tree. Betaproteobacterial sequences are highlighted in bold.

Acknowledgments

This work was supported by the Open Research Fund of Shandong Provincial Key Laboratory of Water and Soil Conservation and Environmental Protection (Linyi University, China; No. stkf201202) and the Natural Science Foundation of China (NSFC) project no. 30900045 granted to Y.Z. The Czech project Algain CZ.1.07/2.3.00/30.0059 supported the postdoctoral research of Y.Z. in the Czech Republic. The field work was supported by the NSFC project no. 30860015 granted to F.F.

References

1. Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, et al. 2008. The RAST server: Rapid annotations using subsystems technology. BMC Genomics 9: 75.
2. Bashiri G, Baker EN. 2012. Biology of cofactor F420 in mycobacteria. FEBS J. 279: 434-434.
3. Berk H, Thauer RK. 1997. Function of coenzyme F-420-dependent NADP reductase in methanogenic archaea containing an NADP-dependent alcohol dehydrogenase. Arch. Microbiol. 168: 396-402.
4. Choi KP, Bair TB, Bae YM, Daniels L. 2001. Use of transposon Tn5367 mutagenesis and a nitroimidazopyran-based selection system to demonstrate a requirement for fbiA and fbiB in coenzyme F-420 biosynthesis by Mycobacterium bovis BCG. J. Bacteriol. 183: 7058-7066.
5. Choi KP, Kendrick N, Daniels L. 2002. Demonstration that fbiC is required by Mycobacterium bovis BCG for coenzyme F-420 and FO biosynthesis. J. Bacteriol. 184: 2420-2428.
6. Ebert S, Rieger PG, Knackmuss HJ. 1999. Function of coenzyme F-420 in aerobic catabolism of 2,4,6-trinitrophenol and 2,4-dinitrophenol by Nocardioides simplex FJ2-1A. J. Bacteriol. 181: 2669-2674.
7. Ermolaeva MD, White O, Salzberg SL. 2001. Prediction of operons in microbial genomes. Nucleic Acids Res. 29: 1216-1221.
8. Fondi M, Emiliani G, Fani R. 2009. Origin and evolution of operons and metabolic pathways. Res. Microbiol. 160: 502-512.
9. Forouhar F, Abashidze M, Xu HM, Grochowski LL, Seetharaman J, Hussain M, et al. 2008. Molecular insights into the biosynthesis of the F-420 coenzyme. J. Biol. Chem. 283: 11832-11840.
10. Garrity GM, Lilburn TG, Cole JR, Harrison SH, Euzeby J, Tindall BJ. 2007. Taxonomic Outline of the Bacteria and Archaea. Release 7.7. Michigan State University Board of Trustees. DOI: 10.1601/TOBA1607.1607.
11. Goecks J, Nekrutenko A, Taylor J, Team G. 2010. Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Genome Biol. 11: R86.
12. Graham DE, Xu HM, White RH. 2003. Identification of the 7,8-didemethyl-8-hydroxy-5-deazariboflavin synthase required for coenzyme F-420 biosynthesis. Arch. Microbiol. 180: 455-464.
13. Graupner M, White RH. 2001. Biosynthesis of the phosphodiester bond in coenzyme F-420 in the methanoarchaea. Biochemistry 40: 10859-10872.
14. Graupner M, White RH. 2003. Methanococcus jannaschii coenzyme F-420 analogs contain a terminal alpha-linked glutamate. J. Bacteriol. 185: 4662-4665.
15. Graupner M, Xu HM, White RH. 2002. Characterization of the 2-phospho-L-lactate transferase enzyme involved in coenzyme F-420 biosynthesis in Methanococcus jannaschii. Biochemistry 41: 3754-3761.
16. Grochowski LL, Xu HM, White RH. 2008. Identification and characterization of the 2-phospho-L-lactate guanylyltransferase involved in coenzyme F420 biosynthesis. Biochemistry 47: 3033-3037.
17. Kim SJ, Weon HY, Kim YS, Moon JY, Seok SJ, Hong SB, Kwon SW. 2012. Caenimonas terrae sp nov., isolated from a soil sample in Korea, and emended description of the genus Caenimonas Ryu et al. 2008. J. Microbiol. 50: 864-868.
18. Lawrence J. 1999. Selfish operons: the evolutionary impact of gene clustering in prokaryotes and eukaryotes. Curr. Opin. Genet. Dev. 9: 642-648.
19. Price MN, Arkin AP, Alm EJ. 2006. The life-cycle of operons. Plos Genetics 2: 859-873.
20. Price MN, Huang KH, Alm EJ, Arkin AP. 2005. A novel method for accurate operon predictions in all sequenced prokaryotes. Nucleic Acids Res. 33: 880-892.
21. Ryu SH, Lee DS, Park M, Wang Q, Jang HH, Park W, Jeon CO. 2008. Caenimonas koreensis gen. nov., sp nov., isolated from activated sludge. Int. J. Syst. Evol. Microbiol. 58: 1064-1068.
22. Selengut JD, Haft DH. 2010. Unexpected abundance of coenzyme F-420-dependent enzymes in Mycobacterium tuberculosis and other Actinobacteria. J. Bacteriol. 192: 5788-5798.
23. Singh BK, Walker A. 2006. Microbial degradation of organophosphorus compounds. FEMS Microbiol. Rev. 30: 428-471.
24. Willems A, Gillis M. 2005. Family IV. Comamonadaceae, pp. 686-759. In Brenner DJ, Krieg NR, Staley JT (eds.). Bergey's Manual of Systematic Bacteriology. Springer, Germany.
25. Zerbino DR, Birney E. 2008. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18: 821-829.
