Zhong et al. Biotechnol Biofuels (2018) 11:193 https://doi.org/10.1186/s13068-018-1201-1 Biotechnology for Biofuels

RESEARCH Open Access Pan‑genome analyses of 24 Shewanella strains re‑emphasize the diversifcation of their functions yet evolutionary dynamics of metal‑reducing pathway Chaofang Zhong, Maozhen Han, Shaojun Yu, Pengshuo Yang, Hongjun Li and Kang Ning*

Abstract Background: Shewanella strains are important dissimilatory metal-reducing which are widely distributed in diverse habitats. Despite eforts to genomically characterize Shewanella, knowledge of the molecular components, functional information and evolutionary patterns remain lacking, especially for their compatibility in the metal- reducing pathway. The increasing number of genome sequences of Shewanella strains ofers a basis for pan-genome studies. Results: A comparative pan-genome analysis was conducted to study genomic diversity and evolutionary relation- ships among 24 Shewanella strains. Results revealed an open pan-genome of 13,406 non-redundant genes and a core-genome of 1878 non-redundant genes. Selective pressure acted on the invariant members of core genome, in which purifying selection drove evolution in the housekeeping mechanisms. Shewanella strains exhibited extensive genome variability, with high levels of gene gain and loss during the evolution, which afected variable gene sets and facilitated the rapid evolution. Additionally, genes related to metal reduction were diversely distributed in Shewanella strains and evolved under purifying selection, which highlighted the basic conserved functionality and specifcity of respiratory systems. Conclusions: The diversity of genes present in the accessory and specifc genomes of Shewanella strains indicates that each strain uses diferent strategies to adapt to diverse environments. Horizontal gene transfer is an important evolutionary force in shaping Shewanella genomes. Purifying selection plays an important role in the stability of the core-genome and also drives evolution in mtr–omc cluster of diferent Shewanella strains. Keywords: Pan-genome, Shewanella, Metal-reducing, Evolution

Background to their metal-reducing capability, some members of Shewanella are well known for their extensively res- Shewanella such as Shewanella oneidensis MR-1, She- piratory versatility and widely distributed in a range wanella loihica and Shewanella putrefaciens have been of aquatic habitats [1, 2]. Currently, the genus She- recognized as useful tools for bioremediation [6–8]. wanella is composed of more than 50 species [1], over Tey were used to bio-remediate toxic elements and 20 of which have abilities to reduce metal [3–5]. Due heavy metals contamination [9], and serve as biocata- lyst in microbial fuel cells to produce H­ 2 [10]. Despite their close evolutionary relationship, Shewanella *Correspondence: [email protected] strains from diverse habitats have great diferences in Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular‑imaging, genetic content [11]. S. oneidensis MR-1, a well-studied Department of Bioinformatics and Systems Biology, College model of Shewanella, discovered to have the capabil- of Science and Technology, Huazhong University of Science ity of heavy metal reduction in lakes, is the frst species and Technology, 1037 Luoyu Road, Wuhan 430074, Hubei, China

© The Author(s) 2018. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creat​iveco​mmons​.org/licen​ses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Dedication waiver (http://creat​iveco​mmons​.org/ publi​cdoma​in/zero/1.0/) applies to the data made available in this article, unless otherwise stated. Zhong et al. Biotechnol Biofuels (2018) 11:193 Page 2 of 13

of Shewanella to be sequenced and assembled [12]. within this genus. Furthermore, genetic organization and Some genes identifed in S. oneidensis MR-1, such as evolution of mtr–omc clusters were investigated for their mtrBAC and omcA, are essential for metal reduction evolutionary patterns at the genomic level. [13, 14], but the features of their variations are still poorly understood. S. putrefaciens W3-18-1, a kind of Results and discussion psychrophile having the ability to reduce metals, has Core and pan‑genome of Shewanella diferent molecular mechanisms of iron reduction from A large proportion (94%) of genes from 24 Shewanella S. oneidensis MR-1 [15]. S. piezotolerans WP3 living at strains were grouped into 7830 homologous clusters elevated hydrostatic pressures [16], also shows diverse (Additional fle 1: Table S1). Te pan-genome pos- electron transport pathways from S. oneidensis MR-1, sessed 13,406 gene families, members of gene families although they have a close evolutionary relationship in these strains were divided into three categories (core, [17]. accessory, specifc) based on their appearance in difer- Te pan-genome analysis provides a systematic way to ent genomes. Among these gene families, 1878 families assess the genomic diversity and evolution across diverse existed in all 24 genomes and hence represented the [18, 19]. Te availability of whole genome core gene complements (core genome). And 1788 fami- sequences for a number of Shewanella strains makes it lies from core genome harbored only one gene from possible to examine both the pan and core genome and each strain (single-copy core families). Although these provides insights into gene clusters of metal reduction in strains shared a core gene set, there were individual dif- other members of this genus. A series of genome analyses ferences among the subsets of genes. Te 5801 families of Shewanella strains have demonstrated gene diversity existed in only one genome comprising the unique genes of electron acceptors for respiration [11, 20–22]. Investi- or singletons (strain-specifc genome) and the remaining gations into several Shewanella baltica isolates have been 5727 families (accessory genome) were present in more conducted to analyze the complete genomic sequences than one, but not all 24 genomes. Te number of non- and expressed transcriptomes [23], but only a few of redundant strain-specifc genes across diferent genomes the lineages are considered in genetic exchange analy- varied from 58 to 782, and S. piezotolerans WP3 had the sis. Genome analysis of 10 closely related Shewanella largest number of unique gene families (782) while S. strains suggested that variations in expressed proteomes sp. MR-4 possessed the smallest number of unique gene correlated positively with the extent of environmental families (58) (Fig. 1a). Tis was consistent with the pre- adaptation, but adaptive evolution in the core compo- vious study that S. piezotolerans WP3 had the largest nents of other species has still not been well studied [24]. genome size among the sequenced Shewanella genomes Genome-wide molecular selection analyses, designed to [26]. Te large proportion of specifc genes suggested assess selection pressure across the entire core-genome that the Shewanella strains harbored a high level of of diferent strains of Shewanella have not been reported, genomic diversity and uniqueness of each strain, showing and also no comprehensive reports have attempted to their ability to survive in diferent environments. Te size address the important role of selection functions in the of pan-genome got large unboundedly with the increase diversifcation of the core-genome of Shewanella. Most of new genomes even including 13,406 non-redundant studies focus on identifcation of electron transport genes (Fig. 1b), which indicated that the Shewanella system and genes which are essential to the diferent pan-genome was still “open”. Tis open pan-genome metabolism [25]. To date, the whole genome sequence showed great potential for discovering novel genes of certain Shewanella strains have been sequenced and with more Shewanella strains sequenced. In contrast to some of their genetic features well characterized. How- the pan-genome, the size of core genome appeared to ever, only a limited number of strains have been exten- reach a steady-state approximation after including 1878 sively studied, the metabolic and genetic diversity of non-redundant genes. In addition, only 14% of the pan- many species remain unknown. Terefore, a comparative genome was found to be kept constant, while the remain- pan-genome analysis in Shewanella is very necessary to ing 86% was variable across the strains, which indicated study evolution and genetic diversity. that the pan-genome exhibited a high level of genome In this study, we estimated both the sizes of pan and variability. core genomes, functional features, phylogeny, and hori- zontal gene transfer (HGT) to characterize population Phylogeny of Shewanella diversity and determine the forces driving adaptive evo- To gain insights into similarity and distance of the lution in 24 Shewanella strains. In addition, we attempted strains, two phylogenetic trees were constructed, one was to assess selection pressure and selection functions in based on the concatenated alignment of 1788 single-copy the diversifcation across the single copy core genomes core genes (Fig. 1c), and another was based on absence or Zhong et al. Biotechnol Biofuels (2018) 11:193 Page 3 of 13

S. baltica OS155 a S. baltica OS185 S. sp. MR-4 b 60 S. baltica OS195 109 58 S. sp. MR-7 Pan−genome 13406 72 99 14000 S. baltica OS678 S. sp.ANA-3 Core−genome 74 145 S. frigidimarina NCIMB 400 S. baltica OS223 357 118 10000 S. woodyi ATCC 51908 S. baltica BA175 91 650 Core 8000 731 S. violacea DSS12 S. baltica OS117 61 genomes 6000 1878 Genome si ze 215 S. amazonensis SB2B S. denitrificans OS217 476 400 0 220 159 S. oneidensis MR-1 S. pealeana ATCC 700345 1878 200 0 188 225 S. halifaxensis HAW-EB4 S. loihica PV-4 0 297 385 782 S. sediminis HAW-EB3 S. putrefaciens 200 91 138 0510 15 20 25 S. putrefaciens CN-32 S . piezotolerans WP3 S. sp. W3-18-1 Number of genomes

Water Sediment Other

cdS. loihica PV-4 S. violacea DSS12 S. sediminis HAW-EB3 S. woodyi ATCC 51908 S. violacea DSS12 S. piezotolerans WP3 S. woodyi ATCC 51908 I I S. sediminis HAW-EB3 S. piezotolerans WP3 S. halifaxensis HAW-EB4 S. halifaxensis HAW-EB4 S. pealeana ATCC 700345 S. pealeana ATCC 700345 S. denitrificans OS217 S. amazonensis SB2B S. frigidimarina NCIMB 400 S. frigidimarina NCIMB 400 S. amazonensis SB2B S. denitrificans OS217 S. loihica PV-4 S. oneidensis MR-1 S. putrefaciens 200 S. sp. ANA-3 S. putrefaciens CN-32 S. sp. MR-7 S. sp. W3-18-1 S. sp. MR-4 S. oneidensis MR-1 S. putrefaciens 200 II S. sp. ANA-3 S. sp. W3-18-1 II S. sp. MR-4 S. putrefaciens CN-32 S. sp. MR-7 S. baltica OS185 S. baltica BA175 S. baltica OS678 S. baltica OS117 S. baltica OS195 S. baltica OS155 S. baltica OS223 S. baltica OS223 S. baltica BA175 S. baltica OS185 S. baltica OS155 S. baltica OS678 S. baltica OS117 S. baltica OS195 Fig. 1 Genomic diversity of strains in Shewanella. a Core and specifc families of 24 Shewanella genomes. Each oval represents a strain and is colored according to its isolation site (blue: water, red: sediment and yellow: other). The number of orthologous coding sequences (core genome) shared by all strains is in the center. The number of specifc families is in non-overlapping portions of each oval. b Increase and decrease in gene families in the pan-genome (red) and core genome (blue), respectively. c Maximum likelihood phylogenetic tree based on 1788 single-copy gene families using 100 bootstrap replications. d Pan-genome tree based on presence/absence of gene family. The strain names are colored according to their isolation sites

presence of each gene families (Fig. 1d). In the two trees, gene tree, but they were no longer the sisters on the pan- strains were grouped into two clades. Te frst clade was genome tree, which indicated that non-core genes were made up of most sediment strains, and the second one likely to make them diverged. In addition, the S. loihica consisted mostly of water strains. S. loihica PV-4, iso- PV-4 was the sister with the S. amazonensis SB2B on the lated from iron-rich microbial mats [27], was phyloge- pan-genome tree, but had substantial distance on the sin- netically distant from the rest strains, suggesting it had gle-copy core gene tree, suggesting that the variable genes a low degree of similarity within core genes to the other made up a dominant proportion of the phylogenetic sig- strains. Te separation tendency of diferent subgroups nal and played a role in the evolution of these two strains. bore a high resemblance in the two trees in spite of the It was also noted that those divergent strains, including strains having diferent relative positions. S. frigidima- S. frigidimarina NCIMB 400, S. denitrifcans OS217, S. rina NCIMB 400 showed much evolutionary related- amazonensis SB2B, S. violacea DSS12, S. woodyi ATCC ness to the S. denitrifcans OS217 on the single-copy core 51908, S. sediminis HAW-EB3, S. halifaxensis HAW-EB4, Zhong et al. Biotechnol Biofuels (2018) 11:193 Page 4 of 13

S. pealeana ATCC 700345, S. piezotolerans WP3 and S. lyases (PL) (Fig. 2a). In addition, these CAZymes were loihica PV-4, contained more specifc genes. Hence, it more abundant in accessory and specifc genome than was likely that the larger diferences of genes lead to evo- in core genome. Tis diversifcation of the Shewanella lutionary divergence, which suggested that diferent gene CAZyme-encoding genes allowed for diferent electron- gain and loss might play important roles in the evolution transporting roles in cellular metabolism. In addition, of some Shewanella strains and make them apart from strains varied in their number of unique CAZyme-encod- their relatives. ing genes, suggesting the metabolic abilities of carbohy- drates were specialized among Shewanella strains. Tere CAZyme identifcation and profling were 18 strains possessing specifc genes for CAZymes, Carbohydrate metabolism was one of the most impor- and GTs and GHs were most abundant (Fig. 2b). tant metabolic activities [28]. Members of Shewanella Additionally, we found that strains isolated from the can anaerobically transfer electrons from cell metabo- water seemed to contain less specifc CAZyme-encod- lism to versatile electron acceptors [6]. Te process of ing genes than other strains from more diverse environ- obtaining electrons from carbohydrates by Shewanella ments. Te diversity of such specifc CAZyme-encoding was the hydrolysis of complex carbohydrates present in genes was important to the strategies of carbohydrate biomass [29]. Tis was achieved through the presence metabolism that would result in metabolism specializa- of a repertoire of secreted or complexed carbohydrate tion and environment adaptation. Te S. woodyi ATCC active enzymes (CAZymes). To further understand the 51908 had the most strain-specifc CAZymes (28), fol- mechanism of polysaccharide hydrolysis in Shewanella, lowed by S. violacea DSS12 (17) and S. sediminis HAW- we identifed the CAZymes in pan-genome. Tese strains EB3 (13). Te three strains had more strain-specifc contained an abundance of glycosyltransferases (GT), CAZyme-encoding genes and they were isolated from glycoside hydrolases (GH), carbohydrate esterases (CE), detritus or sediment. Such high number of strain- carbohydrate binding molecules (CBM), and a small specifc CAZyme-encoding genes might contribute to number of auxiliary activities (AA) and polysaccharide their metabolic diversity in more diverse environments.

ab 10 Specific 2 1 1 9 7 8 S. woodyi ATCC 51908 GT Accessory 10 0 1 1 0 1 S. sediminis HAW−EB3 8 Core 0 0 1 1 1 1 S. loihica PV−4 6 0 0 0 2 1 0 S. pealeana ATCC 700345 GH 1 0 0 2 0 0 S. oneidensis MR−1 4 3 0 0 0 1 0 S. baltica OS223 2 3 0 0 0 0 0 S. putrefaciens 200 CE 1 0 0 0 0 1 S. sp. MR−7 0 1 0 0 0 0 0 S. sp. ANA−3 1 0 0 0 0 0 S. sp. W3−18−1 CBM 2 0 0 0 0 0 S. baltica BA175 2 0 0 0 0 0 S. sp. MR−4 5 0 1 0 1 0 S. halifaxensis HAW−EB4 AA 2 1 2 1 1 2 S. denitrificans OS217 3 0 1 2 2 2 S. piezotolerans WP3 6 0 1 1 3 6 S. violacea DSS12 PL 2 2 0 2 4 1 S. frigidimarina NCIMB 400 1 0 0 1 6 3 S. amazonensis SB2B GT PL AA CE GH CB M 0 10 20 30 40 50 Gene families Fig. 2 CAZyme distribution. a Distribution of CAZymes in pan-genome. The counts of orthologous genes assigned by CAZymes in the core genome (blue bars), the accessory genome (green bars) and the specifc genome (red bars) are shown. b Distribution of specifc genes assigned by CAZymes in each Shewanella strain. Strains are colored according to their isolation sites (blue: water, red: sediment and yellow: other). GT glycosyltransferases, GH glycoside hydrolases, CE carbohydrate esterases, CBM carbohydrate-binding molecules, AA auxiliary activities, PL polysaccharide lyases Zhong et al. Biotechnol Biofuels (2018) 11:193 Page 5 of 13

Furthermore, GTs were the most common in the Functional profling of pan‑genome unique genes, such patterns suggested that GTs might To gain insights into functional diversity in these strains, be important in the environmental adaptation of these the pan-genome from the 24 genomes was also com- strains. However, GEs, CBMs and GHs were the most pared using functional gene ontology categories. All of abundant among the S. woodyi ATCC 51908 unique the GO terms fell into three categories: biological pro- genes, suggesting that GEs, CBMs and GHs might be cesses (BP), molecular function (MF) and cellular com- important in the detritus adaptation of S. woodyi ATCC ponent (CC) (Fig. 3a). Genes with biosynthetic process, 51908. transport and signal transduction were more abundant in the BP category. Genes related to ion binding, DNA

a Biological process Molecular function Cellular component

biosynthetic process ion binding cytoplasm transport DNA binding signal transduction oxidoreductase activity cellular nitrogen compound metabolic process DNA binding transcription factor activity intracellular DNA metabolic process transmembrane transporter activity

small molecule metabolic process signal transducer activity protein complex transmembrane transport peptidase activity cellular amino acid metabolic process kinase activity ribosome carbohydrate metabolic process ATPase activity response to stress transferase activity, transferring acyl groups plasma membrane cofactor metabolic process lyase activity lipid metabolic process methyltransferase activity extracellular region catabolic process isomerase activity transposition hydrolase activity, acting on carbon−nitrogen (but not peptide) bonds generation of precursor metabolites and energy nuclease activity cell wall tRNA metabolic process RNA binding

homeostatic process hydrolase activity, acting on glycosyl bonds organelle sulfur compound metabolic process ligase activity cellular protein modification process nucleotidyltransferase activity chromosome locomotion protein transporter activity

Core 0 0 0 30 60 90 120 Accessory 100 200 300 100 200 300 400 Specific Gene families Gene families Gene families

b Biological process Molecular function Cellular component

translation structural constituent of ribosome protein folding RNA binding ribosome cellular nitrogen compound metabolic process protein transporter activity tRNA metabolic process isomerase activity cell wall catabolic process transmembrane transporter activity homeostatic process ligase activity cofactor metabolic process nucleotidyltransferase activity intracellular response to stress transferase activity, transferring acyl groups transmembrane transport oxidoreductase activity chromosome biosynthetic process methyltransferase activity DNA metabolic process peptidase activity protein complex cellular amino acid metabolic process ion binding small molecule metabolic process hydrolase activity, acting on carbon−nitrogen (but not peptide) bonds lipid metabolic process nuclease activity cytoplasm generation of precursor metabolites and energy ATPase activity transport signal transducer activity organelle sulfur compound metabolic process kinase activity signal transduction DNA binding carbohydrate metabolic process lyase activity plasma membrane cellular component assembly DNA binding transcription factor activity 0.00 0.05 0.10 0.15 0.20 0.05 0.10 0.15 0.20 0.04 0.08 0.12 0.16 dN/dS dN/dS dN/dS Fig. 3 GO annotation of gene families. a GO annotation of pan-genome in biological processes, molecular function and cellular component. Core clusters (blue), accessory clusters (green) and specifc genes (red). b Distribution of selection pressures by categories of biological processes, molecular function and cellular component. Values of dN/dS assigned by GO categories in the single-copy core genome are showed Zhong et al. Biotechnol Biofuels (2018) 11:193 Page 6 of 13

binding and oxidoreductase activity were enriched in MF of nitrate reduction using nitrate as the electron accep- category. For the CC category, genes involved in cyto- tor under certain conditions [30]. Tese fve strains had plasm, intracellular and protein complex were dominant. such strain-specifc genes related to nitrogen utilization, Pan-genome processed a great number of genes involved which indicated that they had characteristics that were in various metabolic pathways and enzymes. Te pres- more complex in response to nitrogen. In addition, S. ence of so many carbon and energy utilization pathways, amazonensis SB2B and S. sediminis HAW-EB3 possessed material transport and synthesis systems was a refection genes encoding inorganic diphosphatase. Te specifcity of the high-pressure and deep-sea extreme environment. of this enzyme has been reported to vary with the source In addition, the existence of so many genes involved in and the activating metal ion [31]. S. violacea DSS12 had transport, transmembrane transport and ion binding pro- a specifc gene-encoding related cytochrome oxidase vided supports for iron reduction. Te vast majority of that was responsible for oxidative phosphorylation. Tis genes in the core genome were involved in housekeeping gene was adjacent to other electron transfer genes in S. functions, such as biosynthetic process, metabolic func- violacea DSS12 genome, suggesting that it might inter- tions, cytoplasm, intracellular and ribosome (Fig. 3a). act with those adjacent genes and function in the related Tese categories were also present, but hardly appeared pathways. in the accessory and specifc genome, whereas the acces- sory and specifc genes were enriched in transport, sig- Selective pressure analysis nal transduction, binding and enzymatic activity (Fig. 3a). To investigate conservation and evolution of house- Te non-core genes could be important for adaptation to keeping genes, functional diversifcation and evolution- specifc habitats. Such functional enrichment patterns of ary pressure of single-copy core genes were detected. In accessory and specifc genes refected their notable res- addition, the non-synonymous (dN) to synonymous (dS) piratory diversity. substitution rates (dN/dS) were estimated for each sin- In addition, cluster of orthologous group (COG) func- gle-copy core family. Te analysis showed that all of the tions of the pan-genome were compared. Genes with single-copy core genes encountered a strong purifying predicted functions and unknown functions were more selection (Fig. 3b). Such an important pattern of strong abundant in the pan-genome (Additional fle 2: Fig- purifying selection indicated that the single-copy core ure S1). Te majority of genes in the core genome were genes were highly conservative and purifying selection involved in ‘Energy production and conversion’ (C), contributing to their long-term stability. ‘Amino acid transport and metabolism’ (E) and ‘Transla- Te combination of selective pressure and the func- tion, ribosomal structure and biogenesis’ (J) (Additional tional categories could reveal the functions involved in fle 2: Figure S1). Whereas the accessory and specifc rapid evolution. We estimated the strength of purifying genome had a slight increase in genes involved in ‘Amino selection for single-copy genes in diferent functions. acid transport and metabolism’ (E), ‘Transcription’ (K) Genes tended to exhibit diferent degrees of conservative and ‘Signal transduction mechanisms’ (T) (Additional evolutionary directions, and some of them experienced fle 2: Figure S1). weaker purifying selection especially those involved in To reveal the uniqueness of the strains, we detected translation, protein folding and structural constituent of several specifc genes that might be related to the sur- ribosome (Fig. 3b). However, genes involved in biological vival environment. Among the strain-specifc genes processes of cellular component assembly evolved under in pan-genome, S. loihica PV-4 had a gene encoding strongest purifying selection. For diferent categories menaquinone biosynthesis protein MenD that facili- of molecular function, genes undergoing the strongest tated electron transfer. Previous study suggested that purifying selection were those involved in DNA bind- menaquinone was the electron transporter in the res- ing transcription factor activity, while genes involved piratory chain and was essential for the survival of S. in structural constituent of ribosome underwent more loihica PV-4 [27]. Tis menD gene did not exist in other relaxed purifying pressure. Furthermore, genes involved strains, which might be important for the survival of S. in ribosome exhibited higher evolutionary rates, while loihica PV-4 in the iron-rich environment. In addition, S. genes involved in plasma membrane exhibited lower woodyi ATCC 51908, S. pealeana ATCC 700345, S. sedi- evolutionary rates. Tese results suggested that genes minis HAW-EB3 and S. halifaxensis HAW-EB4 possessed undergoing higher purifying selection, played dominant one or two strain-specifc genes encoding nitrite reduc- roles in the evolutionary rates among the Shewanella tase that were involved in nitrogen oxide reduction. Te strains, making them retain the original functional pro- S. amazonensis SB2B, also had a specifc gene encoding cess of cellular component assembly, plasma membrane, glutamine synthetase that was a key enzyme of nitrogen etc. Tis observation of purifying selection in single-copy metabolism. It was reported that Shewanella was capable core genes suggested that purifying selection might drive Zhong et al. Biotechnol Biofuels (2018) 11:193 Page 7 of 13

evolution in the essential life functions across all 24 She- gene families contracted in each strain branch, indicating wanella strains. that loss function played an important role in dynamic evolution. Te gene families expanded in those branches Gene gain and loss of strains were mainly enriched in categories of oxida- Gene gain and loss during evolution can increase the ft- tion–reduction process, catalytic activity and membrane. ness of bacteria within habitats [32, 33]. To obtain deeper (Additional fle 3: Figure S2A). Gene enrichment analysis insight into the evolution of gene families over a phylog- showed that these strains have lost the majority of genes eny, we performed an analysis of gene expansion and con- involved in oxidation–reduction process, DNA bind- traction at each branch of speciation. Diversifcation at ing and membrane (Additional fle 3: Figure S2B). Te branches was associated with a large number of changes enrichment of changed genes in reduction process and in gene family size across the Shewanella tree. Te 3594 membrane seemed to be ecologically important and con- gene families were more likely to have been inherited tribute to the successful adaptation of the strains. from the most recent common ancestor (MRCA) of She- Due to recent acquisition and deletion, the gene that wanella, and 1805 of which (50.22%) families experienced is not in the MRCA is more variable than the core gene size changes (Fig. 4a). In the tree, each strain branch of [34]. HGT is the main driver of bacterial evolution and the phylogeny has changed in evolution (Fig. 4a). In addi- adaptation [35]. To infer the recent evolutionary dynam- tion, S. loihica PV-4, the most distantly related strain ics of Shewanella on a larger scope of strains, we exam- based on single-copy phylogeny, had only six changed ined all horizontally acquired genes and obtained 1800 genes. Furthermore, contractions outnumbered expan- genes that were of horizontal origin and the number sions on all branches except the branch of S. loihica PV-4, of HGT-origin genes detected each strain was difer- S. putrefaciens CN-32, and S. baltica OS195. An average ent (Fig. 4b). Gene transfer occurred in many genes, of 15 gene families was under expansion, whereas 123 which indicated that horizontal transfer genes played

a bc Gene families +3 /-36 Expansion / Contraction S. baltica BA175 6 +4 /-30 +10 /-44 +3 /-36 S. baltica OS155 58 +0 /-55 +0 /-39 S. baltica OS117 42 +6 /-68 +9 /-39 S. baltica OS223 292 +0 /-10 +3 /-33 S. baltica OS678 30 +4 /-2 +1 /-53 S. baltica OS195 36 +1 /-74 +3 /-18 S. baltica OS185 375 +4 /-41 +6 /-24 S. sp. W3-18-1 28 +20 /-16 +6 /-266 S. putrefaciens CN-32 27 +10 /-61 +2 /-179 S. putrefaciens 200 35 +3 /-33 +1 /-44 S. sp. MR-7 183 +11 /-30 +5 /-22 S. sp. MR-4 11 +0 /-99 +3 /-102 +13 /-63 S. sp. ANA-3 21 +10 /-223 S. oneidensis MR-1 59 +2 /-181 +41 /-140 +4 /-357 S. frigidimarina NCIMB 400 13 +27 /-492 S. denitrificans OS217 14 +19 /-484 S. amazonensis SB2B 4 +0 /-37 +24 /-427 +4 /-193 S. violacea DSS12 7 +3 /-93 +38 /-104 S. woodyi ATCC 51908 204 +56 /-273 S. sediminis HAW-EB3 160 MRCA +13 /-150 +21 /-107 S. halifaxensis HAW-EB4 6 (3594) +22 /-177 +8 /-280 +19 /-77 S. pealeana ATCC 700345 2 +32 /-176 S. piezotolerans WP3 187 +3 /-3 S. loihica PV-4 0 s s s 0 100 200 300 s brionales Actiniaria Opitutales Vi Thiotrichale Chromatiale Micrococcale Cellvibrionales Burkholderiales Aeromonadales Oceanospirillales Enterobacteriale Xanthomonadales Pseudomonadales Fig. 4 Dynamic evolution of orthologous gene family. a Gene family expansion and contraction in each evolutionary branch. Phylogenetic tree is constructed by 1788 single-copy gene families. The number of expanded (green) or contracted (red) gene families in each strain is on the corresponding branch. MRCA: most recent common ancestor. b Distribution of recent horizontal genes in each strain. c The presence of HGT genes in donor bacteria Zhong et al. Biotechnol Biofuels (2018) 11:193 Page 8 of 13

an important role in the gene fow and acquisition and degree of similarity with the homologues described in S. contributed to the open pan-genome of the Shewanella. oneidensis MR-1, the number and composition carried Among these strains, S. baltica OS185 appeared to gain out by diferent Shewanella strains were still changed the largest number of genes, whereas S. loihica PV-4 did (Fig. 5a). Te complete mtrBAC–omcA–mtrFED cluster not detect the horizontal gene. S. loihica PV-4 showed was conserved in 15 out of the 24 strains. In addition, closest evolutionary relatedness in core gene similarity to the mtrABC operon was conserved in all the genomes the nearest common ancestor, and the numbers of gene except for S. denitrifcans OS217 and S. violacea DSS12. expansion, contraction and HGT genes were minimum, Te mtrDEF operon, homologues of mtrABC, was virtu- probably because the strain’s living environment is simi- ally absent from S. sp. W3-18-1, S. putrefaciens CN-32, lar to the ancestor. To infer the contribution of donors S. putrefaciens 200, S. frigidimarina NCIMB 400, S. deni- to horizontal gene transfer, we identifed donors for trifcans OS217, S. violacea DSS12 and S. woodyi ATCC these horizontal transfer genes (Fig. 4c, Additional fle 4: 51908. Te mtr–omc clusters of 17 strains were consid- Table S2). Tese transferred genes appeared to origi- erably diferent from that of S. oneidensis MR-1 since nate from members of Vibrionales and Enterobacteriales there were gene duplication, gene loss, or new gene gains family, particularly the genera Vibrio, which suggested within it. Such change of this cluster refected the evolu- that HGT events were more efective to occur in closely tionary history and possibly metal respiratory specializa- related organisms rather than in distantly related organ- tion, which agreed well with the previous study that the isms. In addition, these HGT-origin genes were enriched absence of genes in this mtr–omc cluster would result in DNA metabolic process, transposition and DNA bind- in slower iron reduction [38]. For example, S. denitri- ing (Additional fle 3: Figure S2C). Te acquisition of fcans OS217 and S. violacea DSS12, which both lacked genes in DNA metabolic process, transposition and DNA the entire cluster, showed limited anaerobic growth binding could aid in their survival in diferent environ- capacity that may be due to gene loss in the process of ments. For example, the S. woodyi ATCC 51908, S. sedi- ecological specialization [4]. In addition, expansion of minis HAW-EB3, S. sp. MR-4 and S. sp. MR-7 acquired omcA gene occurred in S. amazonensis SB2B, S. sedi- two or three genes encoding the nitrate reductase. Te S. minis HAW-EB3, S. pealeana ATCC 700345 and S. loi- woodyi ATCC 51908, S. sediminis HAW-EB3, S. baltica hica PV-4, which refected gene acquisition and dynamic OS185, S. baltica OS223 and S. piezotolerans WP3 strains evolution in metal respiration. Te tree of genes within acquired a gene encoding glutamine synthetase. Tese the cluster and the cymA gene showed that each gene genes were involved in the nitrogen cycle, which was formed a group with its orthologs from diferent strains apparently important for the successful adaptation. Tese (Additional fle 5: Figure S3). Branches of each gene clus- results suggested that HGT was an important driver of tered into two distinct groups, which imply two distinct the evolution of these genomes. Terefore, the available evolutionary paths for diferent strains. In addition, the gene pool for HGT could explain the diference in non- evolutionary relationship of omcA between these strains MRCA genomes and the increased pan-genome size of was determined according to their genetic characteris- these strains. Tese fndings suggested that gain and loss tics. Tese S. baltica strains showed much evolutionary of genes that were apparently accessory for the majority relatedness in each gene, and the omcA genes from multi- of the strains of a genus might be a successful strategy for locus in S. loihica PV-4, S. sediminis HAW-EB3 and S. biological diversifcation, rapid evolution and environ- pealeana ATCC 700345 formed the relatively independ- mental adaptation. ent branch, showing the duplication events (Additional fle 6: Figure S4). Evolution of the mtr clusters in Shewanella Te presence of mtr–omc cluster in Shewanella was Previous studies on S. oneidensis MR-1 indicated that supposed to correlate with its deep-sea habitat, this genes within a cluster (mtrBAC–omcA–mtrFED) were cluster presumably shared with other deep-sea micro- involved in the metal-reducing pathway, while the S. organisms [39]. To examine extensive genetic exchange denitrifcans OS217 did not possess this cluster [17, between dissimilatory metal-reducing bacteria, the 36]. In particular, this cluster comprised seven genes genetic occurrence of the mtr homologues was pre- that were clustered in sequential order of mtrD–mtrE– dicted (Fig. 5b). Te apparent widespread distribution mtrF–omcA–mtrC–mtrA–mtrB [37]. Te feoA–feoB of metal-reducing pathways in other bacteria indicated operon involved in ferrous iron transport was common the importance of electron-transfer pathway in micro- to all strains and adjacent to the mtr–omc cluster. To bial oxidation–reduction of iron [37]. Te cluster facili- explore new features of metal reduction pathways, we tating the iron respiration was conserved among closely compared this kind of mtr–omc gene cluster in these 24 related species such as and Vibrio strains. Although genes within this cluster shared a high vulnifcus, and showed the same sequential order of Zhong et al. Biotechnol Biofuels (2018) 11:193 Page 9 of 13

ab1000 mtrD mtrE mtrF omc A mtrC mtrA mtrB feoB feoA mtrD mtrE mtrF omcA mtrC mtrA mtrB Escherichia coli K12 MG1655 Escherichia albertii Citrobacter rodentium Salmonella enterica CT18 S. baltica BA175 Salmonella enterica RSK2980 Shigella flexneri Edwardsiella tarda S. baltica OS155 Edwardsiella ictaluri Pectobacterium atrosepticum S. baltica OS117 Providencia stuartii Manheimia succiniciproducens Haemophilus influenzae S. baltica OS223 Actinobacillus pleuropneumoniae 1 Actinobacillus pleuropneumoniae 5b S. baltica OS678 Aggregatibacter actinomycetemcomitans Pasteurella dagmatis Mannheimia haemolytica S. baltica OS195 Gallibacterium anatis Shewanella oneidensis S. baltica OS185 Shewanella putrefaciens Ferrimonas balearica Vibiro vulnificus S. sp. W3-18-1 Aliivibrio salmonicida Nitrosococcus halophilus S. putrefaciens CN-32 Halorhodospira halophila Aeromonas veronii Pelobacter propionicus S. putrefaciens 200 Desulfuromonas acetoxidans Geobacter metallireducens S. sp. MR-7 Anaeromyxobacter dehalogenans Deltaproteobacteria Haliangium ochraceum Desulfurivibrio alkaliphilus S. sp. MR-4 Desulfobacterium autotrophicum Desulfonatronospira thiodismutans S. sp. ANA-3 Rubrivivax benzoatilyticus Leptothrix cholodnii Rhodoferax ferrireducens S. oneidensis MR-1 Sutterella wadsworthensis Betaproteobacteria Gallionella capsiferriformans S. frigidimarina NCIMB 400 Sideroxydans lithotrophicus Dechloromonas aromatica S. denitrificans OS217 Rhodopseudomonas palustris DX1 Rhodomicrobium vannielii Alphaproteobacteria Magnetosirillum magneticum S. amazonensis SB2B Thermincola potens Dethiobacter alkaliphilus group S. violacea DSS12 Carboxydothermus hydrogenoformans Dehalogenimonas lykanthroporepellens Terriglobus saanensis S. woodyi ATCC 51908 Koribacter versatilis Acidobacterium capsulatum S. sediminis HAW-EB3 Solibacter usitatus Thermovibrio ammonificans Desulfurobacterium thermolithotrophum S. halifaxensis HAW-EB4 Persephonella marina desulfuricans S. pealeana ATCC 700345 Denitrovibrio acetiphilus Flexistipes sinusarabici Rhodopirellula baltica PVC group S. piezotolerans WP3 Opitutus terrae Thermodesulfatator indicus S. loihica PV-4 Muricauda ruestringensis Thermodesulfovibrio yellowstonii Fig. 5 Global comparison of the mtr–omc clusters in Shewanella and orthologous gene clusters in other strains. a Genetic comparative organization of the gene locus associated with the metal-reducing pathway of 24 Shewanella strains. Arrows indicate genes and their orientations. The length of the colored box indicates gene homology with those in MR-1. The double slash indicates the existence of extra genes. b The genetic occurrence of the mtr homologous clusters in diverse bacteria. The tree on the left is a taxonomic common tree, and the bars on the right represent the classifcations of bacteria. S. oneidensis MR-1 is colored in red

mtrC–mtrA–mtrB, indicating that the order of the hom- Te metal-reducing pathway was generally consid- ologues genes was highly conserved. Tis was in accord- ered as one of the most ancient microbial metabo- ance with the previous study that homologues of the lisms on earth a long time ago [39]. Previous study had metal-reducing pathway were found in a series of other described that Shewanella contained regions, which dissimilatory metal-reducing bacteria and might have were likely to be horizontally acquired from members co-evolved in these bacteria [40]. Te protein products of the Enterobacteriaceae family bacteria [4]. To infer of these mtr genes worked together to facilitate electron the evolutionary relationships of mtr–omc cluster on a transfer across the cell envelop [37], which had been rec- larger scope of strains, individual phylogenetic trees of ognized as an early form of respiration, although it was mtr genes from diferent organisms were constructed widespread among the bacteria [41], there were still (Additional fle 7: Figure S5). Phylogenetic trees of fewer strains containing complete orthologous genes of mtrE, mtrF, mtrA and mtrD showed that Shewanella metal reduction system. Te genes mtrA and mtrD were strains exhibited very similar evolutionary relatedness present in most strains, but few mtrBCEF and omcA among each other and were relatively independent from genes were present. Te lack of apparent mtr genes in other bacteria. However, several Shewanella strains the most metal-reducing strains was consistent with pre- were scattered to diferent branches in phylogenetic vious studies that the electron-transfer pathways used trees of mtrB, mtrC, omcA (Additional fle 7: Figure by metal-reducing strains for extracellular reduction of S5). Tese mtr genes of Shewanella strains showed evo- Fe(III)-containing minerals had evolved independently lutionary relatedness to Ferrimonas species, probably [17]. because of the extensive horizontal gene transfer events Zhong et al. Biotechnol Biofuels (2018) 11:193 Page 10 of 13

a Shewanella Bacterial donor Ferrimonas mtr cluster HGT Vibrio HGT Aeromonas Photobacterium . . . Ancestor mtr cluster

b e Fe(III) Fe(II) BA175, OS117, OS155, OS185, putrefaciens 200 OS195, OS223, OS678, OS217 NCIMB 400 MtrC OmcA W3-18-1 HAW-EB3, HAW-EB4 MR-4, MR-7 OM MR-1 MtrB ANA-3 CN-32 ATCC 51908 DSS12 MtrA

eˉ PV-4 WP3 Periplasm CymA SB2B

MQH2 Water IM MQ Sediment ATCC 700345 eˉ Other

c d f Fe(III) Fe(II) DMSO DMS dN/dS gene insertion 0.14 0.033 feoB MtrC OmcA DmsA DmsB 0.12 0.054 feoA OM MtrB 0.026 mtrD 0.1 NO2ˉ NH4+ MtrA 0.045 mtrE 0.08 eˉ DmsE 0.053 mtrF Shewanella NrfA 0.06 0.066 omcA Gene lose Periplasm CymA paralogs 0.04 0.148 mtrC MQH2 0.052 mtrA IM MQ 0.054 mtrB Duplication eˉ Fig. 6 A hypothetical model for the evolution of the Mtr pathway in Shewanella. a Shewanella may acquire the mtr genes from the ancestor by HGT and exchange with other bacteria. b The geographic origin of 24 Shewanella genomes used in this study. c Selection pressure of mtr genes, with each box showing the ratio of nonsynonymous (dN) over synonymous substitution (dS) rate. Red boxes represent genes that with high ratios, blue boxes represent genes that with low ratios. d Dynamic changed of mtr clusters. Gene gains and losses derived from HGT events and genes duplication. e The metal-reducing pathway of S. oneidensis MR-1 (OM outer membrane, IM inner membrane, MQH2 reduced form of menaquinone, MQ oxidized form of menaquinone). f The depiction of multiple electron transfer pathways of respiration reported in Shewanella

(Fig. 6a). In addition, the diferences of the metal- diverse range of metabolic processes (Fig. 6b). And the reducing pathway of Shewanella strains might be due to mtr genes had changed during the evolution since they the result of multiple gene gains and loss events during were under strong purifying selection (Fig. 6c). Te the evolution. As these isolates originated from diverse strong purifying selection played a key role in select- geographic locations and habitats, such as marine ing for such iron respiration changes in their evolution. water, sediments and subsurface, they carried out a Te mtrC had lower similarity with the homologues Zhong et al. Biotechnol Biofuels (2018) 11:193 Page 11 of 13

compared with other genes, underwent more diversify- the impact of HGT on the evolution. It could also con- ing selection. Besides, due to the constraint of purifying tribute to research on applications of metal-reducing selection, some of the mtr experienced gene loss and bacteria in bioremediation and metabolic engineering. gene duplication, as well as new gene gains (Fig. 6d). In the metal-reducing pathway, electrons from the inner Methods membrane passed through the periplasm and across Datasets the outer membrane to the extracellular minerals via Te genomic features, geographical origin and isola- the protein components encoding by mtr–omc cluster tion site characteristics of the genomic sequences used [42] (Fig. 6e). With the gain and loss of genes, genes of in this study are provided in Additional fle 8: Table S3. mtr–omc cluster were no longer core genes, and the Genomes and protein sequences were downloaded from ability for Shewanella to adapt to environmental con- National Center for Biotechnology Information (NCBI), ditions by metal-reducing pathway was presumably no representing 24 diferent strains. longer essential for survival. Another possible explana- tion could be that other pathways had been developed Gene family to utilize diferent types of electron acceptors during To understand the evolutionary relationship of She- their evolution to facultative anaerobic environments wanella, we conducted systematic comparative genomic [43] (Fig. 6f). In particular, mtr genes experienced addi- studies. Full protein-coding genes of 24 Shewanella tional contraction events with some of them lost during strains were used to construct gene families using evolution, which indicated that respiration had become OrthoMCL (v2) [44] with a BLAST E-value cut-of of less competitive as the environment changed. 1e−5 and an infation parameter of 1.5. Here, the clus- tering results yielded 7830 homologous clusters, 1788 of which were single-copy gene families, and then they were Conclusions parsed and concatenated. In this study, a comparative pan-genome analysis has been conducted to study the genomic diversity and Phylogeny construction evolutionary relationships among these 24 Shewanella Protein sequences for these single-copy gene families strains. Te pan-genome exhibited a high level of were concatenated and aligned by MUSCLE(v3.8) [45] genome variability, about 86% of which was variable. with default parameters. Te alignments were curated And this pan-genome was found to be open and had by GBlock (v0.91b) [46] to flter out poorly aligned posi- a high percentage of strain-specifc genes within it. tions. Te phylogenetic tree was constructed using the Moreover, the existence of several strain-specifc genes maximum likelihood algorithm with 100 bootstraps as and abundant CAZymes in the accessory and specifc implemented in Phylip (v3.696). genomes refected the functional diversity and meta- Te pan-genome tree based on the absence or pres- bolic diversity of the pan-genome. And also the essen- ence of each gene family in all genomes used the Manhat- tial function of single-copy core genes would evolve tan distance to measure the evolutionary relationship of under purifying selection, which illustrated the impor- strains. Each gene in the genome was scored on basis of tance of the housekeeping function to each strain for the presence (1 value) or absence (0 value), a 0/1 matrix survival. We also found that HGT played a key role in was built and Manhattan distance was calculated, then a the shaping of the Shewanella accessory and specifc phylogenetic tree of the pan-genome was generated using genome and accelerated the evolution of strains. Fur- MEGA (v5). thermore, purifying selection played an important role To determine the evolutionary origin of mtr–omc clus- in the stability of the core-genome and facilitated the ters, the co-occurrence of mtr–omc clusters in bacteria evolution of mtr–omc clusters in diferent Shewanella was obtained from the STRING [47] database. In addi- strains to diferent habitats. We re-emphasized that tion, homologous mtr genes identifcation from other the number of Shewanella strains was expanded to 24, genus were based on the best match of the alignment which had not been reported by others in the previous to the NCBI non-random protein database using the studies. BLASTp program with an E-value cut-of of 1e−5. Ten Te work presented here would help carry out further the Phylip program was used to infer the phylogenetic research on the genetic basis of Shewanella, and better relationships among these mtr genes. enhance understanding of metal-reducing pathway and Zhong et al. Biotechnol Biofuels (2018) 11:193 Page 12 of 13

Gene distribution expansion Additional fles To gain a great insight into the evolutionary dynamics of the genes, the expansion and contraction of the gene Additional fle 1: Table S1. Summary of homologous genes identifed by families among these 24 Shewanella strains were deter- OrthoMCL. Protein sequences from 24 Shewanella genomes were used as mined. CAFE (v2.1) [48] was used to infer the change in input for OrthoMCL orthologous gene clustering (Input Genes). Clusters gene family size in each branch. Gene gain and loss were containing single genes or genes from a single strain were eliminated along with each branch of the single-copy gene family (Clusters). Genes from a single strain (Clustered as single strain). tree, and signifcant levels of expansion and contraction Additional fle 2: Figure S1. Distribution of genes based on COG category of the pan-genome. Distribution of COG categories between were determined at 0.05. the core (blue bars), accessory (green bars) and specifc genes (red bars) of Shewanella. Detection of horizontal gene transfer Additional fle 3: Figure S2. GO annotation of gained and lost genes. (A) To infer mobile elements of Shewanella genome, all non- Function enrichment of expanded gene families. (B) Function enrichment of contracted gene families. (C) Function enrichment of HGT-origin genes. recent common ancestral genes were aligned to plasmid Additional fle 4: Table S2. Possible donors of HGT genes in each strain. sequences, insertion sequences, phage sequences availa- ble in the NCBI RefSeq database, ISfnder and ACLAME Additional fle 5: Figure S3. Phylogenetic tree of mtr–omc gene clusters in 24 Shewanella genomes. The phylogenetic tree was constructed using a database, respectively. A gene was used as input for multiple sequence alignment of the full mtr–omc clusters and cymA genes HGTector (v0.2.1) [49] searches in terms of its best of all 24 Shewanella. BLASTp hit with a sequence identity of above 50% and a Additional fle 6: Figure S4. Comparison of omcA genes in 24 Shewanella cutof E value of 10­ −5. Te identifcation of HGT-origin genomes. Left: Phylogenetic tree of omcA genes. Right: omcA genes similarities of pairwise comparison. The omcA from multi-locus that are genes used HGTector with BLAST parameter thresh- not located in the mtr–omc cluster are colored red, omcA from multi-locus 5 olds 90% identity, an E-value of 10­ − and 500 top-scoring that located in the mtr–omc cluster that no longer sisters are colored matches. Te Shewanella and Shewanellaceae were set as green, respectively. self-group and close group, respectively. Additional fle 7: Figure S5. Phylogenetic relationship of mtr–omc clusters within diferent bacteria. The phylogenic trees built by protein sequences were showed for (A) mtrB, (B) mtrC, (C) mtrE, (D) mtrF, (E) omcA, Functional annotations (F) mtrA and (G) mtrD, respectively. The branches of Shewanella and clade Hmmscan was used to determine carbohydrate activity of S. oneidensis MR-1 are marked in red. enzymes (CAZymes) by comparing all gene families to Additional fle 8: Table S3. Genomes and strains used in this study. dbCAN [50] database. Gene ontology (GO) terms were information is collected from NCBI. identifed using InterProScan (v.5) [51]. GO term assign- ments for each of the genes were retrieved from Interpro- Authors’ contributions Scan results. Since GO slims were particularly useful for The whole study was designed by KN. CFZ collected datasets. CFZ and SJY analyzed the data. CFZ, MZH explained the results. CFZ, SJY, MZH, PSY, HJL and giving a summary of the genome-wide GO annotation, all KN wrote the initial draft of the manuscript. All authors read and approved the of the GO terms were mapped to GO slim (http://www. fnal manuscript. geneo​ntolo​gy.org/GO.slims​.shtml​). Additionally, gene annotation was based on COG databases, using BLAST Acknowledgements with a cutof E-value of 1e−5. The authors are grateful to Zhiwei Song who provided assistance during preparation of this report. Selective pressure analysis Competing interests To estimate the rate of evolution and test the selection The authors declare that they have no competing interests. pressure on each single-copy orthologous genes, the Availability of data and materials program PAML (v. 4.4c) [52] was used. For each pair of All data generated or analyzed during this study are included in this manu- strains, the Codeml model was used to calculate dN and script and its additional fles. dS values. Te single ω (dN/dS, the ratio of non-synon- Consent for publication ymous to synonymous divergence) across sites was esti- Not applicable. mated using M0, M7, M8 models that were fxed across the phylogeny for each alignment (referred to as ω of a Ethics approval and consent to participate Not applicable. gene). To avoid convergence problems, each analysis was repeated three times with diferent initial values of Funding ω and adopted results from the analysis with the highest This work was partially supported by National Science Foundation of China Grant 31671374, Ministry of Science and Technology’s high-tech (863) Grant likelihood. 2014AA021502. Zhong et al. Biotechnol Biofuels (2018) 11:193 Page 13 of 13

Publisher’s Note 24. Konstantinidis KT, et al. Comparative systems biology across an evolu- Springer Nature remains neutral with regard to jurisdictional claims in pub- tionary gradient within the Shewanella genus. Proc Natl Acad Sci USA. lished maps and institutional afliations. 2009;106(37):15909–14. 25. Shi L, et al. Molecular underpinnings of Fe(III) oxide reduction by Received: 28 April 2018 Accepted: 10 July 2018 Shewanella oneidensis MR-1. Front Microbiol. 2012;3:50. 26. Wang F, et al. Environmental adaptation: genomic analysis of the piezotolerant and psychrotolerant deep-sea iron reducing bacterium Shewanella piezotolerans WP3. PLoS ONE. 2008;3(4):e1937. 27. Gao H, et al. Shewanella loihica sp. nov., isolated from iron-rich microbial References mats in the Pacifc Ocean. Int J Syst Evol Microbiol. 2006;56(8):1911–6. 1. Janda JM, Abbott SL. The genus Shewanella: from the briny depths below 28. Maughan R. Carbohydrate metabolism. Surgery (Oxford). 2009;27(1):6–10. to human pathogen. Crit Rev Microbiol. 2014;40(4):293–312. 29. Serres MH, Riley M. Genomic analysis of carbon source metabolism of 2. Nealson KH, Scott J. Ecophysiology of the genus Shewanella. In: Dworkin Shewanella oneidensis MR-1: predictions versus experiments. J Bacteriol. M, editor. The : : Gamma subclass, vol. 6. New 2006;188(13):4601–9. York: Springer; 2006. p. 1133–51. 30. Chen Y, Wang F. Insights on nitrate respiration by Shewanella. Front Mar 3. Lovley D. Dissimilatory Fe(III)-and Mn(IV)-reducing prokaryotes. In the Sci. 2015;1:80. prokaryotes. Berlin: Springer; 2006. p. 635–58. 31. Owen A, Larry GB. Yeast inorganic pyrophosphatase. J Biol Chem. 4. Konstantinidis KT, et al. Comparative systems biology across an 1972;247:7315–9. evolutionary gradient within the Shewanella genus. Proc Natl Acad Sci. 32. Ochman H, Moran NA. Genes lost and genes found: evolution of bacterial 2009;106(37):15909–14. pathogenesis and symbiosis. Science. 2001;292(5519):1096–9. 5. Lovley DR. Dissimilatory Fe(III) and Mn(IV) reduction. Microbiol Rev. 33. Oh PL, et al. Diversifcation of the gut symbiont Lactobacillus reuteri as a 1991;55(2):259–87. result of host-driven evolution. ISME J. 2010;4(3):377. 6. Tiedje JM. Shewanella—the environmentally versatile genome. London: 34. Li Y-H, et al. De novo assembly of soybean wild relatives for pan- Nature Publishing Group; 2002. genome analysis of diversity and agronomic traits. Nat Biotechnol. 7. Kim HJ, et al. A microbial fuel cell type lactate biosensor using a metal- 2014;32(10):1045. reducing bacterium, Shewanella putrefaciens. J Microbiol Biotechnol. 35. Gyles C, Boerlin P. Horizontally transferred genetic elements and their role 1999;9(3):365–7. in pathogenesis of bacterial disease. Vet Pathol. 2014;51(2):328–40. 8. Gorby YA, et al. Electrically conductive bacterial nanowires produced by 36. Gao H, et al. Probing regulon of ArcA in Shewanella oneidensis MR-1 by Shewanella oneidensis strain MR-1 and other microorganisms. Proc Natl integrated genomic analyses. BMC Genomics. 2008;9(1):42. Acad Sci. 2006;103(30):11358–63. 37. Shi L, et al. Mtr extracellular electron-transfer pathways in Fe(III)-reducing 9. Bendall ML, et al. Exploring the roles of DNA methylation in the metal- or Fe(II)-oxidizing bacteria: a genomic perspective. London: Portland reducing bacterium Shewanella oneidensis MR-1. J Bacteriol. 2013. https​:// Press Limited; 2012. doi.org/10.1128/jb.00935​-13. 38. Yang Y, et al. Roles of UndA and MtrC of Shewanella putrefaciens W3-18-1 10. Dehaut A, et al. Phenotypic and genotypic characterization of H2S-posi- in iron reduction. BMC Microbiol. 2013;13(1):267. tive and H2S-negative strains of Shewanella baltica isolated from spoiled 39. Vargas M, et al. Microbiological evidence for Fe(III) reduction on early whiting (Merlangius merlangus). Lett Appl Microbiol. 2014;59(5):542–8. Earth. Nature. 1998;395(6697):65–7. 11. Shanafeld HA. Evolution of the set of signal transduction proteins in 10 40. Cairns-Smith AG, Hall AJ, Russell MJ. Mineral theories of the origin of life species of Shewanella. 2008. and an iron sulfde example. Berlin: Springer; 1992. p. 161–80. 12. Camila Odio JS. Shewanella oneidensis MR-1: background and applica- 41. Lonergan DJ, et al. Phylogenetic analysis of dissimilatory Fe(III)-reducing tions. 2016. bacteria. J Bacteriol. 1996;178(8):2402. 13. Coursolle D, Gralnick JA. Modularity of the Mtr respiratory pathway of 42. Clarke TA, Richardson DJ. Structure of a bacterial cell surface decaheme Shewanella oneidensis strain MR-1. Mol Microbiol. 2010;77(4):995–1008. electron conduit. Proc Natl Acad Sci USA. 2011;108(23):9384–9. 14. Shi L, et al. Isolation of a high-afnity functional protein com- 43. Chen Y, Wang F. Insights on nitrate respiration by Shewanella. Front Mar plex between OmcA and MtrC: two outer membrane decaheme Sci. 2014;1:80. c-type cytochromes of Shewanella oneidensis MR-1. J Bacteriol. 44. Li L, Stoeckert CJ Jr, Roos DS. OrthoMCL: identifcation of ortholog groups 2006;188(13):4705–14. for eukaryotic genomes. Genome Res. 2003;13(9):2178–89. 15. Yang Y, et al. Roles of UndA and MtrC of Shewanella putrefaciens W3-18-1 45. Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and in iron reduction. BMC Microbiol. 2013;13:267. high throughput. Nucleic Acids Res. 2004;32(5):1792–7. 16. Wang F, et al. Isolation of extremophiles with the detection and retrieval 46. Talavera G, Castresana J. Improvement of phylogenies after removing of Shewanella strains in deep-sea sediments from the west Pacifc. Extre- divergent and ambiguously aligned blocks from protein sequence align- mophiles. 2004;8(2):165–8. ments. Syst Biol. 2007;56(4):564–77. 17. Fredrickson JK, et al. Towards environmental systems biology of 47. Snel B, et al. STRING: a web-server to retrieve and display the Shewanella. Nat Rev Microbiol. 2008;6(8):592. repeatedly occurring neighbourhood of a gene. Nucleic Acids Res. 18. Vernikos G, et al. Ten years of pan-genome analyses. Curr Opin Microbiol. 2000;28(18):3442–4. 2015;23:148–54. 48. De Bie T, et al. CAFE: a computational tool for the study of gene family 19. Bentley S. Sequencing the species pan-genome. London: Nature Publish- evolution. Bioinformatics. 2006;22(10):1269–71. ing Group; 2009. 49. Zhu Q, Kosoy M, Dittmar K. HGTector: an automated method facilitat- 20. Dikow RB. Genome-level homology and phylogeny of Shewanella ing genome-wide discovery of putative horizontal gene transfers. BMC (Gammaproteobacteria: lteromonadales: Shewanellaceae). BMC Genomics. Genomics. 2014;15:717. 2011;12(1):237. 50. Yin Y, et al. dbCAN: a web resource for automated carbohydrate-active 21. Caro-Quintero A, et al. Unprecedented levels of horizontal gene transfer enzyme annotation. Nucleic Acids Res. 2012;40(W1):W445–51. among spatially co-occurring Shewanella bacteria from the Baltic Sea. 51. Finn RD, et al. InterPro in 2017—beyond protein family and domain ISME J. 2011;5(1):131. annotations. Nucleic Acids Res. 2016;45(D1):D190–9. 22. Ong WK, et al. Comparisons of Shewanella strains based on genome 52. Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol annotations, modeling, and experiments. BMC Syst Biol. 2014;8(1):31. Evol. 2007;24(8):1586–91. 23. Caroquintero A, et al. Unprecedented levels of horizontal gene transfer among spatially co-occurring Shewanella bacteria from the Baltic Sea. Isme J Emultidiscip J Microbial Ecol. 2011;5(1):131–40.