bioRxiv preprint doi: https://doi.org/10.1101/2021.03.29.437629; this version posted March 31, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Genome sequencing and functional genes comparison between Sphingopyxis USTB-05 and Sphingomonas morindae NBD5

Chao Liu1, Qianqian Xu1, Zhenzhen Zhao1, Shahbaz Ahmad1, Haiyang Zhang1, Yufan Zhang1, Yu Pang1, Abudumukeyiti Aikemu1, Yang Liu1,*, Hai Yan1,*

1 School of Chemistry and Biological Engineering, University of Science and Technology Beijing, Beijing 100083, China

*Corresponding author: Prof Hai Yan; mail: [email protected]. Yang Liu; mail: [email protected].

Keywords: Sphingomonadaceae; Genome; Lutein; Hepatotoxin; Microcystins; Biodegradation


ABSTRACT

The family Sphingomonadaceae contains a large number of strains that can biodegrade hepatotoxins or environmental pollutants. Recent research has reported that certain strains can also produce lutein. Using third-generation sequencing technology, we analyzed the whole genome sequences and compared the related functional genes of two Sphingomonadaceae strains isolated from different habitats. The genome of Sphingopyxis USTB-05 was 4,679,489 bp and contained 4312 protein coding genes. The 4,239,716 bp chromosomal genome of Sphingomonas morindae NBD5, harboring 3882 protein coding genes, comprised two chromosomes. Both strains had a lutein synthesis pathway and shared identical synthetic genes: crtB, crtE, crtI, crtQ, crtL, crtR, atoB, dxs, dxr, ispD, ispE, ispDF, gcpE, ispG, ispH, ispA, ispB and ispU. Sphingopyxis USTB-05 had metabolic pathways for the hepatotoxins microcystins and nodularin involving 16 genes (ald, ansA, gdhA, crnA, phy, ocd, hypdh, spuC, nspC, speE, murI, murD, murC, hmgL, bioA and glsA), whereas these genes were not found in Sphingomonas morindae NBD5. Strain NBD5 and strain USTB-05 had 155 and 199 unique protein sequences, respectively. The whole-genome analysis of the two Sphingomonadaceae strains provides insights into prokaryote evolution, new pathways for lutein production and new genes for environmental pollutant biodegradation.

IMPORTANCE

Understanding the functional genes that underlie the special functions of strains is essential for utilizing microbial resources. The ability of Sphingopyxis USTB-05 to degrade the hepatotoxins microcystins and nodularin has been studied in depth; however, the complete metabolic process still needs further elucidation. Sphingomonas morindae NBD5 can produce lutein, and it is necessary to determine whether it uses a new lutein synthesis pathway. In this study, whole genome sequencing of Sphingopyxis USTB-05 and Sphingomonas morindae NBD5 was performed for the first time. Lutein synthesis metabolic pathways and synthetic genes were discovered in Sphingomonadaceae. We predicted the existence of new lutein synthesis pathways and revealed most of the genes of these pathways. A comparative analysis of the functional genes of the two strains revealed that Sphingopyxis USTB-05 contains a large number of functional genes related to the biodegradation of hepatotoxins or hexachlorocyclohexane. Among them, the functional genes related to the biodegradation and metabolism of hexachlorocyclohexane had not been previously reported. These findings lay the foundation for the biosynthesis of lutein using Sphingomonas morindae NBD5 or Sphingopyxis USTB-05 and for the application of Sphingopyxis USTB-05 to the biodegradation of the hepatotoxins microcystins and nodularin or of environmental pollutants.

INTRODUCTION

Lutein is a carotenoid that exists widely in vegetables, fruits and other plants. It is also the main pigment in the macular area of the human eye (1), and it cannot be synthesized by the body itself. Although it must be obtained from daily food, most people's daily intake is seriously insufficient. Recent research showed that a daily intake of 10 mg lutein and 2 mg zeaxanthin could improve visual function and delay the development of age-related macular degeneration (AMD) (2). Many studies have focused on the production of lutein by eukaryotes, and its synthesis pathway has already been clarified. However, the production of lutein by prokaryotes has rarely been reported and is confined to a few specific organisms, for example, Erwinia, Agrobacterium and Rhodobacter capsulatus (2). Few studies have reported lutein production by Sphingomonas strains. In our previous research, Sphingomonas morindae NBD5 was shown to produce lutein (3). Phylogenetic analysis showed that Sphingomonas morindae NBD5 and Sphingopyxis USTB-05 belong to two closely related genera of Sphingomonadaceae (4) (5).

Sphingomonadaceae has a large number of strains that can biodegrade hepatotoxins or environmental pollutants. In the 1990s, Sphingomonas wittichii RW1 was reported to biodegrade polychlorinated derivatives of dibenzo-p-dioxin (DD) under aerobic conditions in contaminated soil and water (6). Other Sphingomonas species can also biodegrade a series of recalcitrant compounds that are harmful to the environment (biphenyl, herbicide dichlorohaloperidin, γ-hexachlorocyclohexane, aromatic hydrocarbons, chlorophenol, pentachlorophenol, naphthalenesulfonic acid, N,N-dimethylaniline, diphenyl ether, dibenzofuran) (7). The hepatotoxins microcystins (MCs) and nodularin (NOD) are derived from algae and are highly toxic and potentially harmful to humans and aquatic animals. The World Health Organization (WHO) stipulated that the concentration of MCs in drinking water should not exceed 1.0 μg/L (8). Biodegradation is a very promising method for removing hepatotoxins. At present, many strains of the Sphingomonadaceae family are known to biodegrade hepatotoxins. However, to fully clarify the metabolic process of biodegradation, new hepatotoxin biodegradation genes need to be discovered.

Sphingomonas morindae NBD5 is a new species that was assigned to this genus only a few years ago (4). It can produce a high yield of lutein, but the metabolic genes involved need further study. Sphingopyxis USTB-05 was isolated from Dianchi Lake in China and can biodegrade MCs and NOD (5) (9) (10). The functional genes USTB-05-A, USTB-05-B, and USTB-05-C of Sphingopyxis sp. USTB-05 have been verified by heterologous expression in Escherichia coli to biodegrade MCs (11)(12). The purified first recombinant enzyme of Sphingopyxis sp. USTB-05 was found to have a strong ability to catalyze the degradation of hepatotoxins (13). Given the lutein production and hepatotoxin biodegradation characteristics of these two strains, we searched for functional genes in order to relate gene to function. Here, whole genome sequencing combined with bioinformatic analysis was used to compare the shared and unique characteristics of Sphingomonas morindae NBD5 and Sphingopyxis USTB-05 and to analyze their functional genes, especially those associated with lutein synthesis and hepatotoxin biodegradation.

RESULTS

General features of the chromosomal genome

The Sphingomonas morindae NBD5 genome contained two circular chromosomes and two circular plasmids (Figure 1). Multiple chromosomes are common in some genera but rare in Sphingomonas. The two circular chromosomes totaled 4,239,716 bases with a G + C content of 70%, and 3882 protein coding sequences (CDSs) were predicted, accounting for 62.39% of the total coding sequence. Strain NBD5 had 3 complete copies of the 16S rRNA gene (Table 1). The Sphingopyxis USTB-05 genome contained one chromosome (Figure 2). Its circular chromosome was 4,679,489 bases with a G + C content of 64%, and 4312 CDSs were predicted, accounting for 62.39% of the total coding sequence. Strain USTB-05 had a single copy of the 16S rRNA gene (Table 1) and no CRISPR site. Compared with strain USTB-05, the genome of strain NBD5 was much richer in GC and contained plasmids, which indicated that some genetic transfer events had happened in strain NBD5 (Table 1). The number of functional genes annotated in strain NBD5 was much greater than that in strain USTB-05, although the chromosome of strain USTB-05 was longer than that of strain NBD5.
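The genome-level figures above can be turned into simple derived statistics. The following is a minimal sketch, not the authors' pipeline: it computes CDS density from the base and CDS counts reported in Table 1 and shows how a G + C percentage is calculated. The function names and the toy sequence are hypothetical and are used only for illustration.

```python
def gc_content(seq: str) -> float:
    """Return the G + C percentage of a nucleotide sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


def cds_per_kb(n_cds: int, genome_bp: int) -> float:
    """Return the number of predicted CDSs per 1,000 bp of genome."""
    return 1000.0 * n_cds / genome_bp


# Base and CDS counts copied from Table 1.
genomes = {
    "Sphingomonas morindae NBD5": {"bases": 4_239_716, "cds": 3882},
    "Sphingopyxis USTB-05": {"bases": 4_679_489, "cds": 4312},
}

for name, g in genomes.items():
    density = cds_per_kb(g["cds"], g["bases"])
    print(f"{name}: {density:.3f} CDS per kb")

# Toy sequence (hypothetical, not taken from either genome).
print(f"GC of toy sequence: {gc_content('GGCCATGC'):.1f}%")
```

On these numbers the two strains have nearly the same CDS density, so the larger CDS count of USTB-05 mostly reflects its longer chromosome.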

Figure 1 Genome circle of Sphingomonas morindae NBD5

Figure 2 Genome circle of Sphingopyxis USTB-05

Table 1 Comparison of genome characteristics between Sphingomonas morindae NBD5 and Sphingopyxis USTB-05

category      Sphingomonas morindae NBD5    Sphingopyxis USTB-05
bases         4239716                       4679489
tmRNA         1                             1
tRNA          61                            48
CDS           3882                          4312
GC(%) both    70%                           64%
plasmid       2                             0

Genome annotation and genome wide comparative analysis

GO annotation

The distribution of genes across Gene Ontology (GO) terms gives an intuitive view of how the target genes are distributed at the secondary level of GO terms. A total of 798 genes were annotated in the Sphingomonas morindae NBD5 genome through the GO database (Figure 3). Under the biological process classification, metabolic process accounted for the largest share of annotated genes (43.8%), followed by intracellular processes (40.8%). Response to stimulus, biological regulation, localization, regulation of biological process, multi-organism process, cellular component organization or biogenesis, signaling, and growth accounted for 15%, 14%, 11.8%, 11.8%, 7.3%, 6.3%, 4.1%, and 3%, respectively. Under the cellular component classification, cell accounted for the largest share (38%), followed by cell part (37.9%). Membrane, membrane part, protein-containing complex, organelle, extracellular region, organelle part, nucleoid, and other organism accounted for 22.6%, 14.8%, 6.2%, 3.2%, 1.9%, 1.5%, 0.4%, and 0.2%, respectively. Under the molecular function classification, catalytic activity accounted for the largest share, with 1623 genes (40.6%), followed by binding (37.5%). Transporter activity, transcription regulator activity, structural molecule activity, molecular transducer activity, obsolete signal transducer activity, antioxidant activity, molecular function regulator, and molecular carrier activity accounted for 6.9%, 4.1%, 1.7%, 1.6%, 1.4%, 0.7%, 0.4%, and 0.1%, respectively.

A total of 786 genes were annotated in the Sphingopyxis USTB-05 genome through the GO database (Figure 3). Under the biological process classification, cellular process accounted for the largest share of annotated genes (14.5%), followed by metabolic process (13.9%). Response to stimulus, cellular component organization or biogenesis, biological regulation, regulation of biological process, growth, localization, negative regulation of biological process, and positive regulation of biological process accounted for 4.1%, 3.2%, 3.2%, 2.6%, 2.4%, 1.6%, 0.9%, and 0.8%, respectively. Under the cellular component classification, cell and cell part each accounted for 14.4%. Membrane, protein-containing complex, organelle, organelle part, membrane part, extracellular region, membrane-enclosed lumen, and extracellular region part accounted for 5.5%, 3.4%, 2.2%, 1.8%, 1.6%, 0.4%, 0.2%, and 0.1%, respectively. Under the molecular function classification, catalytic activity accounted for the largest share (11.9%), followed by binding (7.9%). Structural molecule activity, transporter activity, transcription regulator activity, molecular function regulator, and antioxidant activity accounted for 1.3%, 0.9%, 0.5%, 0.2%, and 0.1%, respectively.

Although Sphingopyxis USTB-05 had more bases in its genome than Sphingomonas morindae NBD5, Sphingomonas morindae NBD5 had significantly more genes in most GO classifications than Sphingopyxis USTB-05 (Figure 3). Sphingomonas morindae NBD5 had 66 genes in two GO terms that were absent from Sphingopyxis USTB-05: carbohydrate utilization (GO:0009758; 11 genes) and obsolete signal transducer activity (GO:0004871; 55 genes). Conversely, Sphingopyxis USTB-05 had genes in six GO terms that were absent from Sphingomonas morindae NBD5: behavior (GO:0007610; 1 gene), cell proliferation (GO:0008283; 2 genes), multicellular organismal process (GO:0032501; 2 genes), obsolete transcription factor activity, protein binding (GO:0000988; 4 genes), obsolete transcription factor activity, transcription factor binding (GO:0000989; 2 genes), and translation regulator activity (GO:0045182; 1 gene).
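The strain-specific GO terms described above amount to a set difference over per-strain annotation tables. The sketch below illustrates that comparison under stated assumptions: the strain-specific counts are the ones given in the text, while the shared "metabolic process" entry (GO:0008152) carries placeholder counts that are hypothetical and included only so that the example has a term common to both strains.

```python
# GO term -> number of annotated genes, per strain.
# Strain-specific counts come from the text; the "metabolic process"
# counts are hypothetical placeholders, not values from the paper.
nbd5 = {
    "GO:0009758 carbohydrate utilization": 11,
    "GO:0004871 obsolete signal transducer activity": 55,
    "GO:0008152 metabolic process": 100,  # placeholder
}
ustb05 = {
    "GO:0007610 behavior": 1,
    "GO:0008283 cell proliferation": 2,
    "GO:0032501 multicellular organismal process": 2,
    "GO:0000988 obsolete transcription factor activity, protein binding": 4,
    "GO:0000989 obsolete transcription factor activity, transcription factor binding": 2,
    "GO:0045182 translation regulator activity": 1,
    "GO:0008152 metabolic process": 90,  # placeholder
}


def strain_specific(own: dict, other: dict) -> dict:
    """Return the GO terms (with gene counts) present in `own` but not in `other`."""
    return {term: n for term, n in own.items() if term not in other}


only_nbd5 = strain_specific(nbd5, ustb05)
only_ustb05 = strain_specific(ustb05, nbd5)

print("NBD5-specific terms:", len(only_nbd5), "genes:", sum(only_nbd5.values()))
print("USTB-05-specific terms:", len(only_ustb05), "genes:", sum(only_ustb05.values()))
```

Summing the strain-specific counts reproduces the 66 genes reported for NBD5 and gives the corresponding total for the six USTB-05-specific terms.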

Figure 3 Comparison of GO functional classification of Sphingomonas morindae NBD5 and Sphingopyxis USTB-05

COG annotation

Clusters of Orthologous Groups (COG) is a database built on the phylogenetic relationships of bacteria, archaea and eukaryotes. The assembled Sphingomonas morindae NBD5 genes were analyzed against the COG database, and 62.39% of the genes were annotated in COG (Figure 4). A total of 3668 genes were classified into the 25 COG functional categories. The main categories were: transcription (COG category K) (7.66%, as a percentage of all functionally assigned genes), cell wall/membrane/envelope biogenesis (M) (7.06%), carbohydrate transport and metabolism (G) (6.82%), amino acid transport and metabolism (E) (6.73%), signal transduction mechanisms (T) (6.05%), inorganic ion transport and metabolism (P) (5.62%), energy production and conversion (C) (5.37%), translation, ribosomal structure and biogenesis (J) (4.80%), replication, recombination and repair (L) (4.77%), and lipid transport and metabolism (I) (4.58%). The proportion of genes classified as transcription (K) was high (7.66%), which was consistent with the fact that this strain contained two circular chromosomes and had complex gene expression.

Analysis of the assembled Sphingopyxis USTB-05 genes against the COG database showed that 62.39% of the genes were annotated in COG. A total of 4,040 genes were classified into the 25 COG functional categories (Figure 4). The main categories were: amino acid transport and metabolism (E) (8.71%), transcription (K) (7.65%), lipid transport and metabolism (I) (6.76%), inorganic ion transport and metabolism (P) (6.14%), energy production and conversion (C) (5.87%), cell wall/membrane/envelope biogenesis (M) (5.50%), secondary metabolite biosynthesis, transport and catabolism (Q) (5.05%), carbohydrate transport and metabolism (G) (4.85%), translation, ribosomal structure and biogenesis (J) (4.65%), replication, recombination and repair (L) (4.38%), and posttranslational modification, protein turnover, chaperones (O) (4.06%).

Most COG categories showed similar distributions in Sphingomonas morindae NBD5 and Sphingopyxis USTB-05 (Figure 4). Sphingopyxis USTB-05 had more genes in most COG categories; however, it had fewer genes in the COG G, M, and T categories, which represent carbohydrate transport and metabolism, cell wall/membrane/envelope biogenesis and signal transduction mechanisms, respectively. These data suggested that Sphingomonas morindae NBD5 differed significantly in carbohydrate-related metabolic synthesis and signal transduction. This was consistent with the fact that Sphingomonas morindae NBD5 could produce more lutein than Sphingopyxis USTB-05, and that the synthesis of lutein was related to the metabolic synthesis of carbohydrates. Sphingopyxis USTB-05 had more genes in the COG C, E, I, and Q categories, which represent energy production and conversion; amino acid transport and metabolism; lipid transport and metabolism; and secondary metabolites biosynthesis, transport and catabolism, respectively. These data suggested that Sphingopyxis USTB-05 differed significantly in catabolism. This was consistent with the fact that Sphingopyxis USTB-05 had the molecular basis for decomposing MCs, which are cyclic polypeptides composed of seven amino acids, in the environment.
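The category-level comparison above can be approximated from the percentages reported in the text. The following sketch assumes that each percentage is taken relative to the number of genes classified into COG categories (3668 for NBD5 and 4040 for USTB-05); that reading is an assumption, so the resulting gene counts are estimates rather than the values underlying Figure 4. Categories reported for only one strain (T, Q and O) are omitted.

```python
# COG category -> (percent in NBD5, percent in USTB-05), copied from the text.
COG_PERCENT = {
    "K": (7.66, 7.65),
    "M": (7.06, 5.50),
    "G": (6.82, 4.85),
    "E": (6.73, 8.71),
    "P": (5.62, 6.14),
    "C": (5.37, 5.87),
    "J": (4.80, 4.65),
    "L": (4.77, 4.38),
    "I": (4.58, 6.76),
}
# Genes classified into COG categories (assumed denominators).
N_NBD5 = 3668
N_USTB05 = 4040


def estimate_count(percent: float, total: int) -> int:
    """Convert a percentage of classified genes into an estimated gene count."""
    return round(percent / 100 * total)


estimates = {
    cat: (estimate_count(p_nbd5, N_NBD5), estimate_count(p_ustb, N_USTB05))
    for cat, (p_nbd5, p_ustb) in COG_PERCENT.items()
}
higher_in_nbd5 = sorted(cat for cat, (a, b) in estimates.items() if a > b)
higher_in_ustb05 = sorted(cat for cat, (a, b) in estimates.items() if b > a)

for cat, (a, b) in estimates.items():
    print(f"{cat}: NBD5 ~{a}, USTB-05 ~{b}")
print("Higher in NBD5:", higher_in_nbd5)
print("Higher in USTB-05:", higher_in_ustb05)
```

Under this assumption, G and M are the only shared categories with more estimated genes in NBD5, which agrees with the comparison described in the text.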

Figure 4 Comparison of COG functional classification of Sphingomonas morindae NBD5 and Sphingopyxis USTB-05

KEGG annotation

Biological functions usually require the coordination of different genes. Therefore, to identify the representative biological pathways of strain NBD5, individual genes were annotated. A total of 1,839 genes were annotated to 112 metabolic pathways, and some genes could be matched to multiple metabolic pathways. In the Kyoto Encyclopedia of Genes and Genomes (KEGG) database, genes annotated to 23 major pathways accounted for more than half of all annotated genes. The top ten pathways were: amino acid biosynthesis, carbon metabolism, two-component system, purine metabolism, flagellar assembly, oxidative phosphorylation, ribosome, ABC transporters, bacterial chemotaxis, and pyrimidine metabolism (Figure 5). These annotations provided important information about the specific biological processes and pathways of strain NBD5.

A total of 1927 genes of strain USTB-05 were annotated to 121 metabolic pathways, and some genes could be matched to multiple metabolic pathways. In the KEGG database, genes annotated to 28 major pathways accounted for more than half of all annotated genes. The top ten pathways were: amino acid biosynthesis, carbon metabolism, two-component system, ribosome, purine metabolism, oxidative phosphorylation, pyruvate metabolism, glyoxylate and dicarboxylate metabolism, quorum sensing, and glycine, serine and threonine metabolism (Figure 5). These annotations provided important information about the specific biological processes and pathways of strain USTB-05.

A comparative analysis of the KEGG functions of the two Sphingomonadaceae strains was also performed. Among the top ten pathways, strain NBD5 had clearly more genes than strain USTB-05 in two-component system (ko02020), purine metabolism (ko00230), flagellar assembly (ko02040) (correlated with the neatness of colony edges on the plate), and bacterial chemotaxis (ko02030) (correlated with the neatness of colony edges on the plate). Strain USTB-05 had clearly more genes than strain NBD5 in amino acid biosynthesis (ko01230) (synthesis of enzymes related to algal toxin degradation), carbon metabolism (ko01200) (synthesis of enzymes related to algal toxin degradation), pyruvate metabolism (ko00620), glyoxylate and dicarboxylate metabolism (ko00630), quorum sensing (ko02024), and glycine, serine and threonine metabolism (ko00260) (synthesis of enzymes related to the degradation of algal toxins). In addition, according to the KEGG annotation, the metabolic processes unique to strain NBD5 included other glycan degradation (ko00511), plant-pathogen interaction (ko04626) (consistent with its origin as an endophyte of the noni plant), longevity regulating pathway - multiple species (ko04213), FoxO signaling pathway (ko04068), sphingolipid metabolism (ko00600), and longevity regulating pathway (ko04211); the metabolic processes unique to strain USTB-05 included β-alanine metabolism (ko00410), chloroalkane and chloroalkene degradation (ko00625), taurine and hypotaurine metabolism (ko00430), xylene degradation (ko00622), caprolactam degradation (ko00930), dioxin degradation (ko00621), limonene and pinene degradation (ko00903), chlorocyclohexane and chlorobenzene degradation (ko00361), PPAR signaling pathway (ko03320), apoptosis (ko04214), D-glutamine and D-glutamate metabolism (ko00471), naphthalene degradation (ko00626), styrene degradation (ko00643), and toluene degradation (ko00623). This was consistent with the fact that strain NBD5 was a lutein-producing endophyte from the noni plant, whereas strain USTB-05 was a functional bacterium that could biodegrade complex organic matter in the environment.
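The overlap between the two top-ten pathway lists can be quantified directly. The sketch below is an illustration only: the pathway names are transcribed from the text and normalized to a single spelling per pathway, and the Jaccard index is a similarity measure introduced here for convenience, not one used in the study.

```python
# Top ten KEGG pathways per strain, in the order given in the text.
nbd5_top10 = [
    "amino acid biosynthesis",
    "carbon metabolism",
    "two-component system",
    "purine metabolism",
    "flagellar assembly",
    "oxidative phosphorylation",
    "ribosome",
    "ABC transporters",
    "bacterial chemotaxis",
    "pyrimidine metabolism",
]
ustb05_top10 = [
    "amino acid biosynthesis",
    "carbon metabolism",
    "two-component system",
    "ribosome",
    "purine metabolism",
    "oxidative phosphorylation",
    "pyruvate metabolism",
    "glyoxylate and dicarboxylate metabolism",
    "quorum sensing",
    "glycine, serine and threonine metabolism",
]

# Pathways in both lists, and pathways that appear in only one list.
shared = [p for p in nbd5_top10 if p in ustb05_top10]
only_nbd5 = [p for p in nbd5_top10 if p not in ustb05_top10]
only_ustb05 = [p for p in ustb05_top10 if p not in nbd5_top10]

# Jaccard index: size of the intersection divided by size of the union.
jaccard = len(shared) / len(set(nbd5_top10) | set(ustb05_top10))

print("Shared:", shared)
print("NBD5 only:", only_nbd5)
print("USTB-05 only:", only_ustb05)
print(f"Jaccard index: {jaccard:.3f}")
```

The pathways found in only one list are the motility-related and transport pathways for NBD5 and the catabolism-related pathways for USTB-05, which matches the contrast drawn in the text.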

Figure 5 Comparison of KEGG pathway distribution of Sphingomonas morindae NBD5 and Sphingopyxis USTB-05

Comparison of the biosynthesis pathways and genes of lutein

Lutein contains two ionone rings in its chemical structure and is a carotenoid with vitamin A activity. COG analysis of the Sphingomonas morindae NBD5 genome showed that its core carbon skeleton is built through carbohydrate transport and metabolism (G) (6.82%). The proportion of cell wall/membrane/envelope biogenesis (M) (7.06%), which is related to the endocytosis and exocytosis of compounds, is significantly higher. In addition, energy production and conversion (C) (5.37%) also plays an important role (Figure 4).

The genes of the terpenoid backbone biosynthesis pathway and the carotenoid biosynthesis pathway were the same in Sphingomonas morindae NBD5 and Sphingopyxis USTB-05. The corresponding enzyme genes for lutein synthesis were also the same. β-Carotene 3-hydroxylase, which is involved in the last step of lutein synthesis and is encoded by CrtR-b, was found in both strains. Lycopene exists as a synthetic intermediate and is synthesized through the glycolysis/gluconeogenesis pathway (Figure 6), the terpenoid backbone biosynthesis pathway (Figure 7) and the carotenoid biosynthesis pathway (Figure 8). However, no complete lutein synthesis pathway was found. Combined with the mass spectrometry results (3), which showed that the two Sphingomonadaceae strains could synthesize lutein, this suggests that they probably have new pathways of lutein synthesis.

In eukaryotes, especially higher plants, lutein is primarily synthesized from terpenoids. Two molecules of geranylgeranyl diphosphate (GGPP) are used to synthesize phytoene. Phytoene is then converted to ζ-carotene, which is in turn converted into lycopene. Lycopene is then converted to α-carotene through a cyclization reaction catalysed by lycopene cyclase (εLCY/βLCY). Finally, the formation of lutein from α-carotene is catalysed by α-carotene hydroxylase (βCHX/εCHX) (14). In the carotenoid biosynthesis pathway of the two Sphingomonadaceae strains, enzymes related to lycopene synthesis were found (Figure 8), but the cyclases needed for the steps from lycopene to lutein were lacking. Therefore, it was inferred that these two bacteria have new cyclase genes in their lutein metabolism pathways.

The lutein synthesis genes in the Sphingomonas morindae NBD5 and Sphingopyxis USTB-05 genomes mainly existed in the terpenoid backbone biosynthesis pathway and the carotenoid biosynthesis pathway, and were completely the same in the two strains. These synthetic genes are crtB, crtE, crtI, crtQ, crtL, crtR, atoB, dxs, dxr, ispD, ispE, ispDF, gcpE, ispG, ispH, ispA, ispB and ispU. In the glycolysis/gluconeogenesis pathway, only the genes ackA, pgm, gpmI, and pckA were unique to Sphingomonas morindae NBD5, and the genes porB, meh, and fldA were unique to Sphingopyxis USTB-05 (Figure 9).

Figure 6 The glycolysis/gluconeogenesis metabolic pathway of Sphingomonas morindae NBD5 and Sphingopyxis USTB-05. Enzymes or genes in orange boxes are present in both strains, those in blue boxes are present in neither strain, those in pink boxes are present only in Sphingomonas morindae NBD5, and those in bright yellow boxes are present only in Sphingopyxis USTB-05.

Figure 7 The terpenoid backbone biosynthesis pathway of Sphingomonas morindae NBD5 and Sphingopyxis USTB-05. Enzymes or genes in orange boxes are present in both strains, and those in blue boxes are present in neither strain.

Figure 8 The carotenoid biosynthesis pathway of Sphingomonas morindae NBD5 and Sphingopyxis USTB-05. Enzymes or genes in orange boxes are present in both strains, and those in blue boxes are present in neither strain.

Figure 9 Comparison of genes related to lutein synthesis in the genomes of Sphingomonas morindae NBD5 and Sphingopyxis USTB-05. Gray boxes indicate that the gene is present in the strain, and white boxes indicate that it is absent.

Comparison of the pathways and genes of hepatotoxin biodegradation

The mlr gene cluster encoding functional enzymes has now been identified in many hepatotoxin-biodegrading bacteria (15). The mlr gene cluster contains mlrC, mlrA, mlrB and mlrD, corresponding to the enzymes MlrC, MlrA, MlrB and MlrD that biodegrade hepatotoxins. Comparison of the genomes of the two strains showed that Sphingopyxis USTB-05 contained highly homologous mlrC, mlrA, mlrB and mlrD genes, whereas Sphingomonas morindae NBD5 did not have the mlr gene cluster (Suppl Table S1). This was consistent with the fact that Sphingopyxis USTB-05 is an MC-biodegrading strain, whereas Sphingomonas morindae NBD5 is not.

Since the hepatotoxins MCs and NOD are monocyclic heptapeptide and pentapeptide compounds, respectively, some amino acid metabolism processes may be involved in MC and NOD biodegradation. According to the general chemical structures of MCs and NOD, their constituents are D-alanine, a variable L-amino acid, D-isoleucine, D-erythro-β-methylaspartic acid, N-methyldehydroalanine, L-arginine, D-glutamic acid, and Adda. Adda is the unusual C20 β-amino acid (2S,3S,8S,9S)-3-amino-9-methoxy-2,6,8-trimethyl-10-phenyldeca-4(E),6(E)-dienoic acid (10). Among them, the variable L-amino acids are leucine and arginine. Sphingopyxis USTB-05 had the following metabolic processes: alanine, aspartate and glutamate metabolism; arginine and proline metabolism; degradation of aromatic compounds; valine, leucine and isoleucine biosynthesis; and D-glutamine and D-glutamate metabolism.

In the genome of Sphingopyxis USTB-05, the genes potentially related to MCs and NOD biodegradation were mainly found in ABC transporters; alanine, aspartate and glutamate metabolism; arginine and proline metabolism; D-glutamine and D-glutamate metabolism; and valine, leucine and isoleucine degradation. The genes ald, ansA and gdhA in alanine, aspartate and glutamate metabolism were unique to Sphingopyxis USTB-05, as were crnA, phy, ocd, hypdh, spuC, nspC and speE in arginine and proline metabolism and murI, murD and murC in D-glutamine and D-glutamate metabolism. The genes hmgL, bioA and glsA in the valine, leucine and isoleucine degradation pathway may also be involved in the biodegradation of MCs (Figure 10).
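The presence and absence pattern of these genes can be tabulated as a simple matrix of the kind a heat map displays. The following Python sketch is illustrative only: it encodes just the genes that this section reports as unique to Sphingopyxis USTB-05, and the names `UNIQUE_TO_USTB05` and `presence_matrix` are hypothetical, not part of the annotation workflow used in this study.

```python
# Genes reported in the text as unique to Sphingopyxis USTB-05,
# grouped by the KEGG pathway in which they were annotated.
UNIQUE_TO_USTB05 = {
    "alanine, aspartate and glutamate metabolism": ["ald", "ansA", "gdhA"],
    "arginine and proline metabolism": [
        "crnA", "phy", "ocd", "hypdh", "spuC", "nspC", "speE",
    ],
    "D-glutamine and D-glutamate metabolism": ["murI", "murD", "murC"],
}

STRAINS = ("NBD5", "USTB-05")


def presence_matrix(unique_genes):
    """Return {gene: {strain: 0 or 1}} for genes unique to USTB-05."""
    matrix = {}
    for genes in unique_genes.values():
        for gene in genes:
            # 1 = gene annotated in the strain, 0 = gene not annotated.
            matrix[gene] = {"NBD5": 0, "USTB-05": 1}
    return matrix


matrix = presence_matrix(UNIQUE_TO_USTB05)
print("gene", *STRAINS, sep="\t")
for gene, row in matrix.items():
    print(gene, *(row[strain] for strain in STRAINS), sep="\t")
```

Genes shared by both strains would simply carry a 1 in both columns; they are omitted here because the text does not list them individually.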

Figure 10 Comparison of genes related to microcystin degradation in the genomes of Sphingomonas morindae NBD5 and Sphingopyxis USTB-05. A gray box indicates that the gene is present in the strain, and a white box indicates that it is absent.

Genome sequencing data comparison and protein prediction

Protein sequences were predicted from the genome sequencing data of Sphingomonas morindae NBD5 and Sphingopyxis USTB-05. Identical protein sequences were merged into clusters, and the unique protein sequences were then annotated. In total, 1983 protein sequences were predicted in Sphingopyxis USTB-05 and 1939 in Sphingomonas morindae NBD5. Of these, 1784 protein sequences were homologous between the two strains, 199 were unique to Sphingopyxis USTB-05 and 155 were unique to Sphingomonas morindae NBD5 (Figure 11).
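The counts reported here can be cross-checked with simple arithmetic, since each strain's total should equal the shared sequences plus its unique sequences. The Python sketch below uses only the numbers stated in the text; the variable names are illustrative assumptions.

```python
# Counts taken from the text (Figure 11).
shared = 1784          # protein sequences showing homology between the strains
unique_ustb05 = 199    # unique to Sphingopyxis USTB-05
unique_nbd5 = 155      # unique to Sphingomonas morindae NBD5

# Each strain's total is the shared part plus its unique part.
total_ustb05 = shared + unique_ustb05
total_nbd5 = shared + unique_nbd5

# Size of the union of both sets in the Venn diagram.
union = shared + unique_ustb05 + unique_nbd5

# Fraction of each strain's predicted sequences that is strain-specific.
unique_fraction_ustb05 = unique_ustb05 / total_ustb05
unique_fraction_nbd5 = unique_nbd5 / total_nbd5

print(total_ustb05, total_nbd5, union)
print(f"{unique_fraction_ustb05:.1%} {unique_fraction_nbd5:.1%}")
```

The two totals reproduce the 1983 and 1939 sequences given above, so the three regions of the Venn diagram are mutually consistent.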

Figure 11 Venn diagram comparing the proteins predicted from the genomes of Sphingomonas morindae NBD5 and Sphingopyxis USTB-05

DISCUSSION

The two strains were isolated from different habitats and were similar in some functions but different in others. Sphingomonas morindae NBD5 had two chromosomes, which is relatively rare in Sphingomonas. Genome and plasmid prediction revealed two chromosomes and two plasmids, indicating that Sphingomonas morindae NBD5 is an unusual strain of Sphingomonas. According to Bergey's Manual of Systematic Bacteriology, the GC content of Sphingomonas is 59-68 mol% (7), whereas the GC content of the Sphingomonas morindae NBD5 genome was 70%.

Gene exchange between species occurs frequently. The GC content of both chromosomes of Sphingomonas morindae NBD5 was 70%, whereas that of its plasmids was 63%. This difference suggests that the plasmids of strain NBD5 may have been acquired from other species during evolution. In addition, the GC content of Sphingopyxis USTB-05 was 64%, which was close to that of the plasmids of strain NBD5.
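GC content is the percentage of G and C bases among all unambiguous bases of a sequence. A minimal Python sketch is given below; the function name `gc_content` and the toy sequences are assumptions for illustration, and the reported values (70%, 63% and 64%) come from the genome annotation, not from this code.

```python
def gc_content(seq: str) -> float:
    """Return the GC percentage of a DNA sequence, ignoring non-ACGT symbols."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * gc / (gc + at)


# Hypothetical toy fragments, not real NBD5 sequence data.
toy_chromosome = "GCGCGCGCAT"
toy_plasmid = "GCGCATATGC"

for name, seq in (("chromosome", toy_chromosome), ("plasmid", toy_plasmid)):
    print(name, gc_content(seq))
```

Applied to each replicon of an assembly, a marked difference between chromosome and plasmid values is the kind of signal the paragraph above interprets as possible horizontal acquisition.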

Erwinia uredovora is a representative carotenoid-producing prokaryote, and the genes encoding its carotenoid synthesis have been studied. Its carotenoid biosynthesis gene cluster, which uses GGPP as the precursor, contains six open reading frames (ORFs): crtE, crtX, crtY, crtI, crtB and crtZ. β-Carotene is converted into zeaxanthin by β-carotene 3-hydroxylase, which is encoded by crtZ in Erwinia uredovora and by CrtR-b in the two strains of this study (16). In addition, the lutein synthesis pathway of both strains contained a protein-coding gene, CrtL-b, whose product catalyzes the interconversion of different types of carotene (Figure 8).

By combining the synthase genes found in the terpenoid backbone biosynthesis pathway (Figure 7) and the carotenoid biosynthesis pathway (Figure 8), the most probable lutein synthesis pathway was predicted. In these two strains of Sphingomonadaceae, 1-deoxy-D-xylulose 5-phosphate (DXP) is synthesized from pyruvate and D-glyceraldehyde 3-phosphate via the 2-C-methyl-D-erythritol 4-phosphate (MEP) pathway. After seven enzymatic steps, geranyl diphosphate (GPP) or isopentenyl pyrophosphate (IPP) is generated. With GGPP as the precursor, lutein is synthesized in nine enzymatic steps. The intermediates are prephytoene pyrophosphate, phytoene, phytofluene, ζ-carotene, neurosporene and lycopene, followed by δ-carotene or α-carotene and then zeaxanthin or α-cryptoxanthin, which is hydroxylated by a hydroxylase to form lutein. However, the metabolic steps for synthesizing GGPP from GPP or IPP are missing from the predicted pathway. The molecular structures of zeaxanthin and lutein differ only in the position of the double bond in the left six-membered carbon ring, and both can be synthesized via the MEP pathway, so the synthesis pathway of zeaxanthin can serve as an example. The condensation of IPP with dimethylallyl diphosphate (DMAPP) to form GPP is catalysed by geranyl diphosphate synthase (GPS). GPP is condensed with one molecule of IPP by FPP synthase (FPS) to form farnesyl diphosphate (FPP). One molecule of FPP then condenses with one molecule of IPP to form GGPP under the catalysis of GGPP synthase (CrtE) (17, 18). Finally, whether new enzymes participate in the synthesis of carotene from lycopene is worthy of further investigation.
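The three condensation steps described for zeaxanthin each add one C5 unit from IPP, extending the chain from C5 (DMAPP) to C20 (GGPP). The short Python sketch below makes this bookkeeping explicit; the `Step` class and `chain_lengths` function are hypothetical names, and the code models only carbon counts, not enzyme kinetics or the annotated genes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    substrate: str  # allylic diphosphate that is extended
    product: str    # diphosphate formed after adding one IPP
    enzyme: str     # enzyme named in the text


IPP_CARBONS = 5  # each condensation adds one C5 isopentenyl unit

# Order of condensations as described in the text.
STEPS = [
    Step("DMAPP", "GPP", "GPS"),
    Step("GPP", "FPP", "FPS"),
    Step("FPP", "GGPP", "CrtE"),
]


def chain_lengths(steps, start="DMAPP", start_carbons=5):
    """Return the carbon count of every intermediate along the chain."""
    lengths = {start: start_carbons}
    for step in steps:
        lengths[step.product] = lengths[step.substrate] + IPP_CARBONS
    return lengths


for name, carbons in chain_lengths(STEPS).items():
    print(f"{name}: C{carbons}")
```

The C20 product GGPP is the precursor from which the carotenoid steps listed above proceed, which is why the absence of an annotated GPP-to-GGPP route is notable.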

Hepatotoxin MC-degrading bacteria have been found in Arthrobacter spp., Brevibacterium sp. and Rhodococcus sp. (19). However, MC-biodegrading bacteria have been reported more frequently in Sphingomonadaceae (15). Through cloning and gene library screening, the gene cluster for biodegrading MC-LR (mlrA, mlrB, mlrC and mlrD) was first identified by Bourne et al. (20). It was later discovered that all or some of these four genes exist in many MC-biodegrading strains. Molecular research found that Sphingopyxis USTB-05 contained the genes USTB-05-A, USTB-05-B and USTB-05-C, which have high homology to mlrA, mlrB and mlrC, respectively (21). Hashimoto et al. (22) speculated that far more than these four genes are involved in the degradation of MCs. According to the KEGG metabolic pathway annotation, the following genes may be involved in the biodegradation of the hepatotoxins MCs and NOD by Sphingopyxis USTB-05. The genes ald, ansA and gdhA in alanine, aspartate and glutamate metabolism may be involved in the biodegradation of D-alanine, D-erythro-β-methylaspartate and D-glutamate. The genes crnA, phy, ocd, hypdh, spuC, nspC and speE in arginine and proline metabolism may be involved in the biodegradation of L-arginine. The genes murI, murD and murC in D-glutamine and D-glutamate metabolism may be involved in the biodegradation of D-glutamate. The genes hmgL, bioA and glsA in the metabolic pathways of valine, leucine and isoleucine may be involved in the biodegradation of D-isoleucine (Figure 10). Sphingopyxis USTB-05 can biodegrade MCs, but Sphingomonas morindae NBD5 cannot. Sphingopyxis USTB-05 has 199 unique protein sequences, including sequences potentially related to MC degradation that are annotated with aspartic-type endopeptidase activity (GO:0004190), metallopeptidase activity (GO:0004181) and carboxylate hydrolase activity (GO:0052689).

The pathways by which Sphingopyxis USTB-05 biodegrades MC-YR and NOD have been clarified. In both, the first step converts the cyclic structure into a linear structure and is catalyzed by the same enzyme, and Adda is the final product of the last enzymatic reaction. However, further biodegradation of Adda by Sphingopyxis USTB-05 has not yet been reported. Recently, genes and transposable elements that may be involved in the biodegradation of phenylacetate have been observed near the mlr gene cluster (23). Sphingopyxis sp. YF1, which relies on the mlr biodegradation pathway, can biodegrade Adda through phenylacetic acid metabolism (24). In the KEGG metabolic pathway annotation of this study, evidence for further biodegradation of Adda was found: it may be related to degradation genes in the metabolic pathways of styrene, toluene and xylene (Suppl Figure S1-3).

Studies of Sphingobium indicum B90A, Sphingobium japonicum UT26 and Sphingobium francense Sp+ show that they can transform β- and δ-hexachlorocyclohexane (β- and δ-HCH, respectively), the most recalcitrant hexachlorocyclohexane isomers, to pentachlorocyclohexanols, but only Sphingobium indicum B90A can further transform the pentachlorocyclohexanol intermediates to the corresponding tetrachlorocyclohexanediols (25). When the heterologously expressed LinB protein of Sphingobium indicum B90A was incubated with γ- and β-hexachlorocyclohexane, the pentachlorocyclohexanol product was further transformed and eventually disappeared from the culture medium (25). The linB gene was also found in the annotation of chlorocyclohexane and chlorobenzene metabolism of Sphingopyxis USTB-05 (Suppl Figure S4). In addition, many genes related to the degradation of environmental pollutants, including dioxins, toluene, xylene, chlorocyclohexane and chlorobenzene, chloroalkane and chloroalkene, styrene and naphthalene, were found in Sphingopyxis USTB-05 (Suppl Figure S5-7). Although the functions of these genes in the Sphingopyxis USTB-05 genome still need to be verified, the results suggest that Sphingopyxis USTB-05 has great potential as a pollutant-degrading bacterium.

MATERIALS AND METHODS

Bacterial strains and culturing conditions

Sphingopyxis sp. USTB-05 was isolated from the sediment of Dianchi Lake in China and identified (5). Sphingomonas morindae NBD5 was isolated from a Noni (Morinda citrifolia L.) branch and identified (4). Both strains were grown in LB medium at 30 °C.

DNA extraction, identification and sequencing

Glycerol stocks of the original strains were used as inoculum for regrowth on the original solid isolation media at 30 °C. Single colonies were picked and cultured in LB medium. Genomic DNA was extracted using the Bacterial Genomic DNA Kit (CoWin Biosciences, China) according to the manufacturer's instructions. DNA quality and integrity were checked by NanoDrop nucleic acid quantification (Thermo Fisher Scientific, USA) and gel electrophoresis.

To confirm the identity of the strains, 16S ribosomal RNA (rRNA) gene amplicons were generated by PCR using primers 27F (5'-AGAGTTTGGATCMTGGCTCAG-3') and 1492R (5'-GGTTACCTTGTTACGACTT-3'). The PCR reaction mixture contained 25 μL PCR Mix, 2 μL primer 27F (10 mM), 2 μL primer 1492R (10 mM) and 1 μL of the extracted DNA. Nuclease-free water was added to reach a total reaction volume of 50 μL. The 16S rRNA gene was amplified under the following conditions: initial denaturation at 94 °C for 5 min, followed by 30 cycles of denaturation at 94 °C for 40 s, annealing at 55 °C for 40 s and elongation at 72 °C for 45 s, and a final extension step at 72 °C for 10 min. PCR products were purified by 1% gel electrophoresis. The purified PCR products were sent for sequencing on a DYY-8C DNA sequencer. The sequencing results were submitted to EzBioCloud alignment to determine their homology with known sequences.
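The reaction setup and thermal program above imply two quantities that are not stated explicitly: the volume of nuclease-free water per reaction and the total hold time of the program. The Python sketch below derives both from the values in the text; the names `components_ul` and `program_seconds` are illustrative assumptions, and instrument ramp times between temperatures are deliberately ignored.

```python
# Reaction components in microlitres, as listed in the text.
components_ul = {
    "PCR Mix": 25,
    "primer 27F": 2,
    "primer 1492R": 2,
    "template DNA": 1,
}
total_volume_ul = 50

# Water makes up the remainder of the 50 uL reaction.
water_ul = total_volume_ul - sum(components_ul.values())


def program_seconds(initial=5 * 60, cycles=30, cycle_steps=(40, 40, 45), final=10 * 60):
    """Sum the programmed hold times (seconds), excluding ramping."""
    return initial + cycles * sum(cycle_steps) + final


total_s = program_seconds()
print(f"water per reaction: {water_ul} uL")
print(f"programmed hold time: {total_s} s ({total_s / 60:.1f} min)")
```

The real run time on a thermocycler will be somewhat longer than this figure because heating and cooling between steps take additional time.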

Nanopore single-molecule sequencing libraries were constructed from the genomic DNA of the two strains according to the standard protocol provided by Oxford Nanopore Technologies (ONT). Large DNA fragments were recovered using the BluePippin automatic nucleic acid recovery system (Sage Science, USA) (26). DNA damage repair and end repair, magnetic bead purification and adapter ligation were performed using the official SQK-LSK109 ligation kit (Oxford Nanopore, UK). After Qubit quantification, the libraries were loaded onto the sequencer. A second-generation sequencing library was also constructed for the genomic DNA and plasmid DNA of the two strains, and strains USTB-05 and NBD5 were sequenced on the Illumina MiSeq platform (paired end, 2 × 300 bp reads) (27).

Genome assembly and quality control

For genome assembly, the reads generated by second-generation sequencing were first filtered with fastp to obtain high-quality reads. Reads containing adapter fragments were removed, as were reads in which bases with a quality value Q < 25 accounted for more than half of the read length.
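The quality criterion described above can be stated as a simple per-read rule. The Python sketch below illustrates that rule only; it is not the fastp implementation, the function names `keep_read` and `phred33` are hypothetical, and Phred+33 quality encoding is assumed.

```python
def phred33(quality_string):
    """Decode a Phred+33 FASTQ quality string into integer quality scores."""
    return [ord(char) - 33 for char in quality_string]


def keep_read(qualities, min_q=25, max_low_fraction=0.5):
    """Keep a read unless more than half of its bases have quality below min_q."""
    if not qualities:
        return False  # an empty read carries no information
    low = sum(1 for q in qualities if q < min_q)
    return low / len(qualities) <= max_low_fraction


# Toy quality strings: 'I' encodes Q40 and '+' encodes Q10 in Phred+33.
good_read = "I" * 8 + "+" * 2
poor_read = "I" * 3 + "+" * 7

print(keep_read(phred33(good_read)))
print(keep_read(phred33(poor_read)))
```

Adapter removal, which the text also mentions, is a separate step and is not modelled here.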

For the nanopore data, the raw FASTQ files were obtained by base calling the FAST5 files with the Albacore software in the MinKNOW software package. To obtain a more accurate assembly, the raw data were filtered to obtain reliable subreads, which included removing polymerase reads shorter than 1000 bp.
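A length filter of this kind is straightforward to express in code. The following Python sketch is a minimal illustration, assuming plain four-line FASTQ records with no blank lines; `read_fastq` and `filter_by_length` are hypothetical helper names and do not correspond to any tool named in the text.

```python
import io


def read_fastq(handle):
    """Yield (header, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        quality = handle.readline().rstrip("\n")
        yield header, sequence, quality


def filter_by_length(records, min_len=1000):
    """Yield only the records whose sequence is at least min_len bases long."""
    for header, sequence, quality in records:
        if len(sequence) >= min_len:
            yield header, sequence, quality


# Toy input: one 1200 bp read and one 500 bp read (synthetic data).
toy_fastq = io.StringIO(
    "@read1\n" + "A" * 1200 + "\n+\n" + "I" * 1200 + "\n"
    "@read2\n" + "C" * 500 + "\n+\n" + "I" * 500 + "\n"
)

kept = list(filter_by_length(read_fastq(toy_fastq)))
print([header for header, _, _ in kept])
```

Because both functions are generators, reads are processed one at a time, which matters for long-read data sets that do not fit comfortably in memory.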

Genome annotation and comparative genomic analysis

The online NMPDR-RAST server was used to predict the genes and coding sequences (CDSs) of the assembled sequences. The predicted protein sequences were searched against the KEGG (Kyoto Encyclopedia of Genes and Genomes), COG (Clusters of Orthologous Groups) and GO (Gene Ontology) databases using Blastall to predict gene functions and metabolic information (28). Circos was used to integrate the COG annotation results, methylation results, RNA annotation results, GC content and GC skew into a map of the entire genome of each strain. In addition, CRISPRFinder was used to predict the clustered regularly interspaced short palindromic repeat (CRISPR) structures of the genomes (29). The coding sequences of the two genomes were aligned using MUMmer and analyzed in conjunction with the results of the genome annotation (30).
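GC skew, one of the tracks drawn on the genome maps, is conventionally computed per window as (G - C) / (G + C). The Python sketch below shows that calculation under stated assumptions: the function name `gc_skew`, the window size and the step size are illustrative choices and are not the settings used to produce the figures in this study.

```python
def gc_skew(seq, window=1000, step=1000):
    """Return a list of (window_start, skew) with skew = (G - C) / (G + C)."""
    seq = seq.upper()
    results = []
    # Always evaluate at least one window, even for sequences shorter than `window`.
    last_start = max(len(seq) - window + 1, 1)
    for start in range(0, last_start, step):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        skew = (g - c) / (g + c) if (g + c) else 0.0
        results.append((start, skew))
    return results


# Toy sequence: a G-rich half followed by a C-rich half (synthetic data).
toy = "GGGGCCCC"
for start, skew in gc_skew(toy, window=4, step=4):
    print(start, skew)
```

The resulting list of window positions and values is the kind of table that a plotting tool can take as a track for a circular genome map.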

Drawing tool

The heat map was generated with the online software Hiplot (https://hiplot.com.cn). Adobe Illustrator CS6 was used to generate the other figures.

ABBREVIATIONS

AMD: age-related macular degeneration; MCs: microcystins; NOD: nodularin; WHO: World Health Organization; DD: dibenzo-p-dioxin; LPS: lipopolysaccharide; CDSs: coding sequences; CRISPR: clustered regularly interspaced short palindromic repeats; GO: Gene Ontology; COG: Clusters of Orthologous Groups; KEGG: Kyoto Encyclopedia of Genes and Genomes; VFDB: Virulence Factors Database; CARD: Comprehensive Antibiotic Resistance Database; ORF: open reading frame; NMEP: non-mevalonate pathway; GPP: geranyl pyrophosphate; IPP: isopentenyl pyrophosphate; GGPP: geranylgeranyl pyrophosphate.

ACKNOWLEDGMENTS

This work was supported by the National Natural Science Foundation of China (21677011) and the Fundamental Research Funds for the Central Universities (FRF-TP-20-044A2; FRF-BR-19-003B; FRF-TP-18-012A1; FRF-TP-17-009A2).

REFERENCES

1. Fuad NIN, Sekar M, Gan SH, Lum PT, Vaijanathappa J, Ravi S. 2020. Lutein: A comprehensive review on its chemical, biological activities and therapeutic potentials. Pharmacognosy Journal 12:1769-1778.
2. Scripsema NK, Hu DN, Rosen RB. 2015. Lutein, zeaxanthin, and meso-zeaxanthin in the clinical management of eye disease. Journal of Ophthalmology 2015:1-13.
3. Zhang Y, Li M, Zhao Z, Liu C, Liu Y, Yan H. 2020. A novel production of lutein by Sphingomonas morindae NBD5. Fourrages 244:99-109.
4. Liu Y, Yao S, Lee Y, Cao Y, Zhai L, Zhang X, Su J, Ge Y, Kim S, Cheng C. 2015. Sphingomonas morindae sp. nov., isolated from Noni (Morinda citrifolia L.) branch. Int J Syst Evol Microbiol 65:2817-2823.
5. Wang J, Wu P, Chen J, Yan H. 2010. Biodegradation of microcystin-RR by a new isolated Sphingopyxis sp. USTB-05. Chinese Journal of Chemical Engineering 18:108-112.
6. Hong H, Chang Y, Nam I, Fortnagel P, Schmidt S. 2002. Biotransformation of 2,7-dichloro- and 1,2,3,4-tetrachlorodibenzo-p-dioxin by Sphingomonas wittichii RW1. Appl Environ Microbiol 68:2584-2588.
7. Garrity G, Brenner DJ, Krieg N, Staley JT. 2005. Bergey's manual of systematic bacteriology, vol. 2, part C: the alpha-, beta-, delta-, and epsilonproteobacteria.

8. Falconer IR. 1999. An overview of problems caused by toxic blue-green algae (cyanobacteria) in drinking and recreational water. Environmental Toxicology 14:5-12.
9. Xu H, Wang H, Xu Q, Lv L, Yin C, Liu X, Du H, Yan H. 2015. Pathway for biodegrading microcystin-YR by Sphingopyxis sp. USTB-05. PLoS One 10:1-13.
10. Feng N, Yang F, Yan H, Yin C, Liu X, Zhang H, Xu Q, Lv L, Wang H. 2016. Pathway for biodegrading nodularin (NOD) by Sphingopyxis sp. USTB-05. Toxins 8:116.
11. Yan H, Wang J, Chen J, Wei W, Wang H, Wang H. 2012. Characterization of the first step involved in enzymatic pathway for microcystin-RR biodegraded by Sphingopyxis sp. USTB-05. Chemosphere 87:12-18.
12. Wang H, Yan H, Ma S, Liu X, Yin C, Wang H, Xu Q, Lv L. 2015. Characterization of the second and third steps in the enzymatic pathway for microcystin-RR biodegradation by Sphingopyxis sp. USTB-05. Annals of Microbiology 65:495-502.
13. Xu Q, Ma H, Zhang H, Fan J, Yin C, Liu X, Liu Y, Wang H, Yan H. 2020. Purification and activity of the first recombinant enzyme for biodegrading hepatotoxin by Sphingopyxis sp. USTB-05. Algal Research 47:101863.
14. Lado J, Zacarías L, Rodrigo MJ. 2016. Regulation of carotenoid biosynthesis during fruit development. In: Stange C (eds) Carotenoids in Nature, vol 79. Springer, Cham.
15. Dexter J, McCormick AJ, Fu P, Dziga D. 2021. Microcystinase - a review of the natural occurrence, heterologous expression, and biotechnological application of MlrA. Water Research 189.
16. Ti J, Zhang L, Liu M, Wang J, Zhang X, Guo D, Li R. 2011. Studies on the interaction between geranylgeranyl pyrophosphate synthase and phytoene synthase of Erwinia uredovora. Journal of Qingdao University (Engineering & Technology Edition) 26:66-70.

17. Zhang Y, Liu Z, Sun J, Xue C, Mao X. 2017. Biotechnological production of zeaxanthin by microorganisms. Trends in Food Science & Technology 71:225-234.
18. Buhaescu I, Izzedine H. 2007. Mevalonate pathway: A review of clinical and therapeutical implications. Clinical Biochemistry 40:575-584.
19. Manage PM, Edwards C, Singh BK, Lawton LA. 2009. Isolation and identification of novel microcystin-degrading bacteria. Appl Environ Microbiol 75:6924-6928.
20. Bourne DG, Riddles P, Jones GJ, Smith W, Blakeley RL. 2010. Characterisation of a gene cluster involved in bacterial degradation of the cyanobacterial toxin microcystin LR. Environ Toxicol 16:523-534.
21. Wang H. 2014. Pathway and molecular mechanism for the biodegradation of microcystin-LR by Sphingopyxis sp. USTB-05. University of Science and Technology Beijing.
22. Hashimoto EH, Kato H, Kawasaki Y, Nozawa Y, Tsuji K, Hirooka EY, Harada KI. 2009. Further investigation of microbial degradation of microcystin using the advanced Marfey method. Chemical Research in Toxicology 22:391-398.
23. Zhang X, Yang F, Chen L, Feng H, Yin S, Chen M. 2020. Insights into ecological roles and potential evolution of Mlr-dependent microcystin-degrading bacteria. Science of the Total Environment 710.
24. Yang F, Huang F, Feng H, Wei J, Massey IY, Liang G, Zhang F, Yin L, Kacew S, Zhang X, Pu Y. 2020. A complete route for biodegradation of potentially carcinogenic cyanotoxin microcystin-LR in a novel indigenous bacterium. Water Research 174.

25. Sharma P, Raina V, Kumari R, Malhotra S, Dogra C, Kumari H, Kohler HPE, Buser HR, Holliger C, Lal R. 2006. Haloalkane dehalogenase LinB is responsible for β- and δ-hexachlorocyclohexane transformation in Sphingobium indicum B90A. Appl Environ Microbiol 72:5720-5727.
26. Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. 2013. MEGA6: Molecular evolutionary genetics analysis version 6.0. Molecular Biology and Evolution 30:2725-2729.
27. Versluis D, McPherson K, van Passel MWJ, Smidt H, Sipkema D. 2017. Recovery of previously uncultured bacterial genera from three Mediterranean sponges. Marine Biotechnology 19:454-468.
28. Haas BJ, Salzberg SL, Zhu W, Pertea M, Allen JE, Orvis J, White O, Robin CR, Wortman JR. 2008. Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biology 9.
29. Bland C, Ramsey TL, Sabree F, Lowe M, Brown K, Kyrpides NC, Hugenholtz P. 2007. CRISPR Recognition Tool (CRT): a tool for automatic detection of clustered regularly interspaced palindromic repeats. BMC Bioinformatics 8.
30. Delcher AL, Salzberg SL, Phillippy AM. 2003. Using MUMmer to identify similar regions in large sequence sets. Current Protocols in Bioinformatics 00.


Tables

Table 1. Comparison of genome characteristics between Sphingomonas morindae NBD5 and Sphingopyxis USTB-05

Figure Legends

Figure 1: Genome circle of Sphingomonas morindae NBD5

Figure 2: Genome circle of Sphingopyxis USTB-05

Figure 3: Comparison of GO functional classification of Sphingomonas morindae NBD5 and Sphingopyxis USTB-05

Figure 4: Comparison of COG functional classification of Sphingomonas morindae NBD5 and Sphingopyxis USTB-05

Figure 5: Comparison of KEGG pathway distribution of Sphingomonas morindae NBD5 and Sphingopyxis USTB-05

Figure 6: The metabolic pathway of glycolysis/gluconeogenesis of Sphingomonas morindae NBD5 and Sphingopyxis USTB-05
An orange box indicates an enzyme or gene present in both strains, a blue box indicates one absent from both strains, a pink box indicates one present only in Sphingomonas morindae NBD5, and a bright yellow box indicates one present only in Sphingopyxis USTB-05.

Figure 7: The terpenoid backbone biosynthesis pathway of Sphingomonas morindae NBD5 and Sphingopyxis USTB-05
An orange box indicates an enzyme or gene present in both strains, and a blue box indicates one absent from both strains.

Figure 8: The carotenoid biosynthesis pathway of Sphingomonas morindae NBD5 and Sphingopyxis USTB-05
An orange box indicates an enzyme or gene present in both strains, and a blue box indicates one absent from both strains.

Figure 9: Comparison of genes related to lutein synthesis in the genomes of Sphingomonas morindae NBD5 and Sphingopyxis USTB-05
A gray box indicates that the gene is present in the strain, and a white box indicates that it is absent.

Figure 10: Comparison of genes related to microcystin degradation in the genomes of Sphingomonas morindae NBD5 and Sphingopyxis USTB-05
A gray box indicates that the gene is present in the strain, and a white box indicates that it is absent.

Figure 11: Venn diagram comparing the proteins predicted from the genomes of Sphingomonas morindae NBD5 and Sphingopyxis USTB-05

Table 1. Comparison of genome characteristics between Sphingomonas morindae NBD5 and Sphingopyxis USTB-05

Category    Sphingomonas morindae NBD5    Sphingopyxis USTB-05
Bases       4239716                       4679489
tmRNA       1                             1
tRNA        61                            48
CDS         3882                          4312
GC (%)      70 (both)                     64
Plasmid     2                             0

802 Figure 1: Genome circle of Sphingomonas morindae NBD5

803

804

805

806

807

808

809

810

811

812

813

814 39 bioRxiv preprint doi: https://doi.org/10.1101/2021.03.29.437629; this version posted March 31, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

815

816

817

818

819

820

821

822

823

824

825 Figure 2: Genome circle of Sphingopyxis USTB-05

826

827

828

829

830

831

832

833

834

40 bioRxiv preprint doi: https://doi.org/10.1101/2021.03.29.437629; this version posted March 31, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

835

836

837

838

839

840

841 Figure 3: Comparison of GO functional classification of Sphingomonas morindae NBD5 and

842 Sphingopyxis USTB-05

843

844

845

846

847

848

849

850

851

41 bioRxiv preprint doi: https://doi.org/10.1101/2021.03.29.437629; this version posted March 31, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

852

853

854

855

856

857

858 Figure 4: Comparison of COG functional classification of Sphingomonas morindae NBD5 and

859 Sphingopyxis USTB-05

Figure 5: Comparison of KEGG pathway distribution of Sphingomonas morindae NBD5 and Sphingopyxis USTB-05

Figure 6: The glycolysis/gluconeogenesis pathway in Sphingomonas morindae NBD5 and Sphingopyxis USTB-05

Enzymes or genes in orange boxes are present in both strains; those in blue boxes are absent from both strains; those in pink boxes are present only in Sphingomonas morindae NBD5; and those in bright yellow boxes are present only in Sphingopyxis USTB-05.

Figure 7: The terpenoid backbone biosynthesis pathway in Sphingomonas morindae NBD5 and Sphingopyxis USTB-05

Enzymes or genes in orange boxes are present in both strains; those in blue boxes are absent from both strains.

Figure 8: The carotenoid biosynthesis pathway in Sphingomonas morindae NBD5 and Sphingopyxis USTB-05

Enzymes or genes in orange boxes are present in both strains; those in blue boxes are absent from both strains.

Figure 9: Comparison of genes related to lutein synthesis in the genomes of Sphingomonas morindae NBD5 and Sphingopyxis USTB-05

Gray boxes indicate that the gene is present in the strain; white boxes indicate that the gene is absent from the strain.

Figure 10: Comparison of genes related to microcystin degradation in the genomes of Sphingomonas morindae NBD5 and Sphingopyxis USTB-05

Gray boxes indicate that the gene is present in the respective strain; white boxes indicate that the gene is absent from it.

Figure 11: Venn diagram comparing the proteins predicted from the genomes of Sphingomonas morindae NBD5 and Sphingopyxis USTB-05

Supplementary Tables and Figures

Suppl Table S1. The sequences of USTB-A, USTB-B, USTB-C, and USTB-D in the Sphingopyxis USTB-05 genome are similar to mlrA, mlrB, mlrC and mlrD, respectively

Suppl Figure S1. The metabolic pathway of styrene degradation in Sphingopyxis USTB-05

Suppl Figure S2. The metabolic pathway of toluene degradation in Sphingopyxis USTB-05

Suppl Figure S3. The metabolic pathway of xylene degradation in Sphingopyxis USTB-05

Suppl Figure S4. The metabolic pathway of chlorocyclohexane and chlorobenzene degradation in Sphingopyxis USTB-05

Suppl Figure S5. The metabolic pathway of dioxin degradation in Sphingopyxis USTB-05

Suppl Figure S6. The metabolic pathway of chloroalkane and chloroalkene degradation in Sphingopyxis USTB-05

Suppl Figure S7. The metabolic pathway of naphthalene degradation in Sphingopyxis USTB-05

Suppl Table S1. The sequences of USTB-A, USTB-B, USTB-C, and USTB-D in the Sphingopyxis USTB-05 genome are similar to mlrA, mlrB, mlrC and mlrD, respectively

Suppl Figure S1. The metabolic pathway of styrene degradation in Sphingopyxis USTB-05. Enzymes or genes in red boxes are present in both strains; those in blue or white boxes are absent from both strains.

Suppl Figure S2. The metabolic pathway of toluene degradation in Sphingopyxis USTB-05. Enzymes or genes in red boxes are present in both strains; those in blue or white boxes are absent from both strains.

Suppl Figure S3. The metabolic pathway of xylene degradation in Sphingopyxis USTB-05. Enzymes or genes in red boxes are present in both strains; those in blue or white boxes are absent from both strains.

Suppl Figure S4. The metabolic pathway of chlorocyclohexane and chlorobenzene degradation in Sphingopyxis USTB-05. Enzymes or genes in red boxes are present in both strains; those in blue or white boxes are absent from both strains.

Suppl Figure S5. The metabolic pathway of dioxin degradation in Sphingopyxis USTB-05. Enzymes or genes in red boxes are present in both strains; those in blue or white boxes are absent from both strains.

Suppl Figure S6. The metabolic pathway of chloroalkane and chloroalkene degradation in Sphingopyxis USTB-05. Enzymes or genes in red boxes are present in both strains; those in blue or white boxes are absent from both strains.

Suppl Figure S7. The metabolic pathway of naphthalene degradation in Sphingopyxis USTB-05. Enzymes or genes in red boxes are present in both strains; those in blue or white boxes are absent from both strains.


Tables


Table 1. Comparison of genome characteristics between Sphingomonas morindae NBD5 and Sphingopyxis USTB-05

Category          Sphingomonas morindae NBD5    Sphingopyxis USTB-05
Bases (bp)        4,239,716                     4,679,489
tmRNA             1                             1
tRNA              61                            48
CDS               3882                          4312
GC content (%)    70                            64
Plasmids          2                             0