Wu et al. Standards in Genomic Sciences (2017) 12:88
DOI 10.1186/s40793-017-0300-0
SHORT GENOME REPORT Open Access

Complete genome sequence of esterase-producing bacterium Croceicoccus marinus E4A9T

Yue-Hong Wu, Hong Cheng, Ying-Yi Huo, Lin Xu, Qian Liu, Chun-Sheng Wang and Xue-Wei Xu*

* Correspondence: [email protected]
Key Laboratory of Marine Ecosystem and Biogeochemistry, Second Institute of Oceanography, State Oceanic Administration, 36th North BaoChu Road, Hangzhou 310012, China

Abstract

Croceicoccus marinus E4A9T was isolated from deep-sea sediment collected from the East Pacific polymetallic nodule area. The strain is able to produce esterase, which is widely used in the food, perfume, cosmetic, chemical, agricultural and pharmaceutical industries. Here we describe the characteristics of strain E4A9, including the genome sequence and annotation, presence of esterases, and metabolic pathways of the organism. The genome of strain E4A9T comprises 4,109,188 bp, with one chromosome (3,001,363 bp) and two large circular plasmids (761,621 bp and 346,204 bp, respectively). The complete genome contains 3653 coding sequences, 48 tRNAs, two operons of 16S–23S–5S rRNA genes and three ncRNAs. Strain E4A9T encodes 10 genes related to esterase, and three of the esterases (E3, E6 and E10) were successfully cloned and expressed in Escherichia coli Rosetta in a soluble form, revealing their potential application in the biotechnology industry. Moreover, the genome provides clues to the metabolic pathways of strain E4A9T, reflecting its adaptations to the ambient environment. The genome sequence of C. marinus E4A9T now provides fundamental information for future studies.

Keywords: Croceicoccus marinus E4A9T, Genome sequence, Esterase, Alphaproteobacteria

© The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Introduction

Lipolytic enzymes, including esterase (EC 3.1.1.1) and lipase (EC 3.1.1.3), are a general class of carboxylic ester hydrolases (EC 3.1.1), which catalyze the hydrolytic cleavage and formation of ester bonds [1, 2]. Esterase shows a preference for water-soluble short chain fatty acids (< 10 carbon atoms), while lipase prefers water-insoluble longer chain fatty acids (> 10 carbon atoms) [3, 4]. Many esterases do not require cofactors and have high stereospecificity toward chemicals, broad substrate specificity and high stability in organic solvents [4]. They are extensively used in the food, perfume, cosmetic, chemical, agricultural and pharmaceutical industries [5].

Croceicoccus [6], a genus of the family Erythrobacteraceae [7], can be found in marine environments, including deep-sea sediment, surface seawater and marine biofilm from a boat shell [6, 8, 9]. C. marinus E4A9T, the type strain of the genus Croceicoccus, was isolated from deep-sea sediment collected from the East Pacific polymetallic nodule area [6]. The strain was able to produce esterase as well as lipase [6]. To gain insight into its capability for esterase production, we recently obtained the complete genome of C. marinus E4A9T and detected esterase genes. This is the first genome report for a strain of the genus Croceicoccus. We also describe the genome sequencing and annotation, which help in understanding the strain's metabolic and ecological functions in the environment.

Organism information

Classification and features

C. marinus E4A9T was isolated from a deep-sea sediment sample collected from the East Pacific polymetallic nodule area (8°22′38″ N, 145°23′56″ W) at a depth of 5280 m (temperature 2 °C, salinity 3.4%). Strain E4A9T was obtained and routinely cultured on marine broth 2216 (MB, BD) at 30 °C. Subsequently, a polyphasic study of strain E4A9T was performed, and a new genus and species, Croceicoccus marinus gen. nov., sp. nov., was proposed. Strain E4A9T is the type strain of the species C. marinus [6], and was deposited into the China General Microbiological Culture Collection (CGMCC 1.6776T).

C. marinus [6] is a valid species belonging to the family Erythrobacteraceae [7], in the order Sphingomonadales [10, 11], class Alphaproteobacteria [11, 12] and phylum Proteobacteria [13]. C. marinus E4A9T is a Gram-staining-negative and cocci-shaped bacterium (Fig. 1). It grew aerobically and used a range of organic carbon compounds, such as L-arabinose, D-cellobiose, D-galactose and xylose, as sole sources of carbon and energy [6, 8].

Based on phylogenetic analysis of the 16S rRNA gene sequence, the strain falls into the cluster comprising the Croceicoccus species with a high bootstrap value (Fig. 2). Interestingly, strain E4A9T could hydrolyze Tween 20, Tween 80 and tributyrin, indicating the presence of esterase as well as lipase [6]. The API ZYM system also supported the result that esterase (C4) and esterase lipase (C8) activities are present. The general features of strain E4A9T are summarized in Table 1.

Fig. 1 Transmission electron microscopy showing the cell morphology (a) and ultrastructure (b) of Croceicoccus marinus E4A9T. The flagella are present. Bars represent scales of 0.5 μm (a) and 0.2 μm (b), respectively

Fig. 2 Phylogenetic tree based on 16S rRNA gene sequences, constructed with the neighbor-joining algorithm. Related sequences were aligned with Clustal W. Evolutionary distances were calculated according to the Kimura two-parameter model. Bootstrap values (> 60%) based on 1000 replications are shown at branch nodes. Filled circles indicate that the corresponding nodes were also recovered in the trees generated with the maximum-likelihood and maximum-parsimony algorithms. Bar, 0.01 substitutions per nucleotide position

Table 1 Classification and general features of Croceicoccus marinus E4A9T according to the MIGS recommendations [30]

MIGS ID    Property               Term                                                      Evidence code (a)
           Classification         Domain Bacteria                                           TAS [31]
                                  Phylum Proteobacteria                                     TAS [12]
                                  Class Alphaproteobacteria                                 TAS [11]
                                  Order Sphingomonadales                                    TAS [10]
                                  Family Erythrobacteraceae                                 TAS [7]
                                  Genus Croceicoccus                                        TAS [6]
                                  Species Croceicoccus marinus                              TAS [6]
                                  (Type) strain: Strain E4A9T (CGMCC 1.6776T = JCM 14846T)
           Gram stain             Negative                                                  TAS [6]
           Cell shape             Coccus                                                    TAS [6]
           Motility               Motile                                                    TAS [6]
           Sporulation            Non-sporulation                                           TAS [6]
           Temperature range      4–42 °C                                                   TAS [6]
           Optimum temperature    28–30 °C                                                  TAS [6]
           pH range; Optimum      6.0–9.0; 7.0                                              TAS [6]
           Carbon source          Organic carbon                                            TAS [6]
MIGS-6     Habitat                Deep-sea sediment                                         TAS [6]
MIGS-6.3   Salinity               Moderately halophilic, 0.5–10% NaCl                       TAS [6]
MIGS-22    Oxygen requirement     Aerobic                                                   TAS [6]
MIGS-15    Biotic relationship    Free-living                                               TAS [6]
MIGS-14    Pathogenicity          Non-pathogen                                              NAS
MIGS-4     Geographic location    East Pacific polymetallic nodule area                     TAS [6]
MIGS-5     Sample collection      Not reported
MIGS-4.1   Latitude               8°22′38″ N                                                TAS [6]
MIGS-4.2   Longitude              145°23′56″ W                                              TAS [6]
MIGS-4.4   Altitude               −5280 m                                                   TAS [6]

(a) Evidence codes - IDA: Inferred from Direct Assay; TAS: Traceable Author Statement (i.e., a direct report exists in the literature); NAS: Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [32]

Genome sequencing information

Genome project history

C. marinus E4A9T [6] was selected for sequencing because of its relevance to genomic sequencing of the whole family Erythrobacteraceae [7] and to esterase production. The complete genome sequence was finished on May 29, 2015. The gap closure and annotation processes were performed by the authors. The GenBank accession numbers of the genome are CP019602, CP019603 and CP019604. The main genome sequence information is presented in Table 2 and Table 3.

Growth conditions and genomic DNA preparation

C. marinus E4A9T was aerobically cultivated in Marine Broth (MB, BD Difco™) at 30 °C and stored at −80 °C with 30% (v/v) glycerol. High-quality genomic DNA was extracted using the Qiagen DNA extraction kit, according to its protocol.

Genome sequencing and assembly

The genome of strain E4A9T was sequenced using SMRT technology with a PacBio RS II platform (Zhejiang Tianke Co. Ltd., China).

[…] genes, 132 were assigned to pseudogenes. The summary of features and statistics of the genome is shown in Table 4, and genes belonging to COG functional categories are listed in Table 5.

Three replicons were detected in the genome of strain E4A9: a circular chromosome and two large plasmids. Two plasmid replication initiator protein genes

Table 2 Genome sequencing project information

MIGS ID     Property                      Term
MIGS-31     Finishing quality             Finished
MIGS-28     Libraries used                10 kb
MIGS-29     Sequencing platforms          A PacBio RS II platform
MIGS-31.2   Fold coverage                 248-fold
MIGS-30     Assemblers                    HGAP Assembly version 2, Pacific Biosciences
MIGS-32     Gene calling method           GeneMarkS+ (NCBI)
            Locus Tag                     A9D14
            Genbank ID                    CP019602, CP019603, and CP019604
            GenBank Date of Release       June 13, 2017
            GOLD ID                       Go0030822
            BIOPROJECT                    PRJNA322659
MIGS-13     Source Material Identifier    CGMCC (China General Microbiological Culture Collection)
            Project relevance             Esterases production

Table 3 Summary of genome: one chromosome and two plasmids

Label                    Size (Mb)   Topology   INSDC identifier   RefSeq ID
Chromosome               3.001363    Linear     CP019602.1         NZ_CP019602.1
Plasmid 1 (pCME4A9I)     0.761621    Linear     CP019603.1         NZ_CP019603.1
Plasmid 2 (pCME4A9II)    0.346204    Linear     CP019604.1         NZ_CP019604.1
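
As a quick consistency check, the replicon sizes listed in Table 3 (in Mb) can be converted to base pairs and summed, and the result compared with the 4,109,188 bp total given in the abstract. The following is a minimal sketch in Python; the dictionary and function names are illustrative assumptions and are not part of the original report.

```python
from decimal import Decimal

# Replicon sizes in Mb, copied from Table 3 of the report.
# Kept as strings so Decimal can convert them without float rounding.
REPLICON_SIZES_MB = {
    "Chromosome": "3.001363",
    "Plasmid 1 (pCME4A9I)": "0.761621",
    "Plasmid 2 (pCME4A9II)": "0.346204",
}

# Total genome size in bp as stated in the abstract.
REPORTED_TOTAL_BP = 4_109_188


def mb_to_bp(size_mb: str) -> int:
    """Convert a size given in Mb (decimal string) to whole base pairs."""
    return int(Decimal(size_mb) * 1_000_000)


def total_genome_bp(sizes_mb: dict) -> int:
    """Sum all replicon sizes, in base pairs."""
    return sum(mb_to_bp(size) for size in sizes_mb.values())


def replicon_fractions(sizes_mb: dict) -> dict:
    """Return each replicon's share of the total genome length."""
    total = total_genome_bp(sizes_mb)
    return {label: mb_to_bp(size) / total for label, size in sizes_mb.items()}


if __name__ == "__main__":
    total = total_genome_bp(REPLICON_SIZES_MB)
    print(f"Sum of replicons: {total} bp "
          f"(matches reported total: {total == REPORTED_TOTAL_BP})")
    for label, share in replicon_fractions(REPLICON_SIZES_MB).items():
        print(f"{label}: {share:.1%} of the genome")
```

Run as a script, this prints the summed length and each replicon's share of the genome. The three sizes add up to the reported total, and the chromosome accounts for roughly 73% of it.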