Ohta Yukari (Orcid ID: 0000-0001-5645-2311)

Degradation of ester linkages in rice straw components by Sphingobium species recovered from the sea bottom using a non-secretory tannase-family α/β hydrolase

Yukari Ohta1, Madoka Katsumata1, Kanako Kurosawa2, Yoshihiro Takaki2, Hiroshi Nishimura3, Takashi Watanabe3, Ken-ichi Kasuya1,4

1 Gunma University Center for Food Science and Wellness, 4-2 Aramaki, Maebashi, Gunma 371-8510, Japan

2 Super-cutting-edge Grand and Advanced Research Program, JAMSTEC, 2-15, Natsushima, Yokosuka, Kanagawa 237-0061, Japan

3 Biomass Conversion, Research Institute for Sustainable Humanosphere, Kyoto University, Gokasho, Uji, Kyoto 611-0011, Japan

4 Division of Molecular Science, Faculty of Science and Technology, Gunma University, 1-5-1 Tenjin, Kiryu, Gunma 376-8515, Japan

Correspondence to: Yukari Ohta

Gunma University Center for Food Science and Wellness, Gunma University, 4-2 Aramaki, Maebashi, Gunma 371-8510, Japan

Telephone: +81-27-220-7641

E-mail: [email protected]

Running title: Non-secretory α/β hydrolase for plant degradation

This article has been accepted for publication and undergone full peer review but has not been through the copyediting, typesetting, pagination and proofreading process which may lead to differences between this version and the Version of Record. Please cite this article as doi: 10.1111/1462-2920.15551 This article is protected by copyright. All rights reserved.

Originality-significance statement

We show that an unrecognized non-secretory hydrolase of the strain functions as a ferulic acid esterase. Before our study, the gene for the enzyme was annotated as mono(2-hydroxyethyl) terephthalic acid hydrolase (MHETase). MHETase is involved in the microbial biodegradation pathway of polyethylene terephthalate. This study demonstrated that the enzyme cleaved hydroxycinnamic acid esters, such as ferulic and p-coumaric esters, but did not degrade mono(2-hydroxyethyl) terephthalate (MHET). Further, we showed that the enzyme was responsible for critical phenotypic features of the strain for degrading terrestrial plants in an aquatic environment. The findings are critically important for understanding the global carbon cycle and assessing the biodegradability of commodity plastics incidentally imported into aquatic environments. Also, the sunken wood used for screening of biomass-degrading bacteria in this study was retrieved from the sea bottom in a region that has been seriously influenced by the huge inputs of terrestrial organic materials from the Tohoku earthquake and tsunami in 2011.

This study uncovers the molecular bases of microbial actions on recalcitrant terrestrial plant components, which have an increasing impact on the global carbon cycle, climate change, and environmental disturbances, such as heavy rains, typhoons, and the discharge of harmful anthropogenic chemicals.

Summary

Microbial decomposition of allochthonous plant components imported into the aquatic environment is one of the vital steps of the carbon cycle on earth. To expand the knowledge of the biodegradation of complex plant materials in aquatic environments, we recovered a piece of sunken wood from the bottom of Otsuchi Bay, in northeastern Japan, in 2012. We isolated a Sphingobium strain with high ferulic acid esterase activity. The strain, designated as OW59, grew on various aromatic compounds and sugars that occur naturally in terrestrial plants. A genomic study of the strain suggested its role in degrading hemicelluloses. We identified a gene encoding a non-secretory tannase-family α/β hydrolase, which exhibited ferulic acid esterase activity. This enzyme shares the consensus catalytic triad (Ser-His-Asp) within the tannase family block X in the ESTHER database. Molecules with the same calculated elemental compositions were produced consistently in both the enzymatic and microbial degradation of rice straw crude extracts. The non-secretory tannase-family α/β hydrolase activity may confer an important phenotypic feature on the strain to accelerate plant biomass degradation. Our study provides insights into the underlying biodegradation process of terrestrial plant polymers in aquatic environments.

Introduction

The components of terrestrial plants mainly comprise three polymers: cellulose, hemicellulose, and lignin. These components are associated with each other, have different degrees of modification, and function as physical and chemical barriers to degradation. Ferulic, p-coumaric, and p-hydroxybenzoic acids are the frequently detected hydroxycinnamic acids (HACs) in grass plants. The HACs incorporated into polysaccharides (Wong, 2006; Karlen et al., 2016; Iiyama et al., 1994) and lignin (Hatfield et al., 1999; Regner et al., 2018) provide plants with persistency (Hartley et al., 1972; Grabber et al., 2009; de Oliveira et al., 2015). Despite this recalcitrance, an array of microbial enzymes has evolved to efficiently degrade plant biomasses (Guerriero et al., 2015). Decomposition of plant biomass is accelerated by treatment with HAC esterases, including ferulic acid esterases (Faes) (Wong, 2006; Faulds et al., 2010).

Currently, dozens of Faes have been identified and characterized at the protein level. These enzymes release ferulic acid and p-coumaric acid from aquatic slurry or plant biomass extracts (Wong, 2006; Mathew and Abraham, 2004). Most are enzymes from fungal strains belonging to genera such as Anaeromyces, Aspergillus, Neurospora, and Thermothelomyces, and from several bacterial strains of Cellulosilyticum, Cellvibrio, Prevotella, and Ruminiclostridium. In an earlier study (Crepin et al., 2004), Faes were classified into types A, B, C, and D based on substrate specificity. Hundreds of putative Fae sequences have been deposited in databases, such as CAZy (Levasseur et al., 2013) and the ESTHER database (Lenfant et al., 2013) of the α/β hydrolase fold superfamily of proteins (Marchot and Chatonnet, 2012). In the ESTHER database, Faes are classified into three groups within block X, namely, the A85-feruloyl-esterase, FaeC, and tannase families. The A85-feruloyl-esterase family is closely related to CAZy CE family 1 (Makela et al., 2018). Enzymes of these two families can hydrolyze the cross-links between plant polysaccharide chains, releasing diferulic acids (Dilokpimol et al., 2016). The FaeC group proteins often have non-catalytic domains, such as cellulose-binding domains or Ricin B lectin motifs, and hydrolyze the feruloyl esters linked to sugar moieties, usually arabinose in 'natural' substrates.

The tannase-family enzymes show diverse substrate specificity, acting not only on galloyl and feruloyl esters but also on the terephthalic acid monoester with ethylene glycol (mono(2-hydroxyethyl) terephthalic acid; MHET) (Palm et al., 2019). Furthermore, more than 1000 putative Faes have been computationally predicted from fungal genome sequences by database mining and divided into 13 subfamilies, including uncharacterized subfamilies (Banerjee et al., 2012; Underlin et al., 2020). These studies suggest the potential for discovering new Faes that degrade complex natural substrates.

Sunken wood is a large organic input from land that is ubiquitously distributed on the ocean floor (Wolff, 1979) and is involved in microbial diversity and dispersal in the marine environment (Fagervold et al., 2012; Bienhold 2013). It develops high diversity in the microbial community, including primary degraders of wood components and successive generations of heterotrophic and chemosynthetic organisms. Earlier studies have suggested that dissolved organic carbon (DOC) in marine ecosystems is sourced from complex materials, including terrestrial organic matter from plants, which is discharged by rivers (McNichol and Aluwihare, 2007; Raymond and Bauer, 2001), and other sources, such as chemosynthesis, sedimentary methane, and anthropogenic activities. A significant part of the marine DOC is probably refractory carbon (Griffith et al., 2012). Its composition, transformation, and turnover remain poorly understood (Bianchi, 2012; Follett et al., 2014). The main processes for the decomposition of plants in shallow marine environments include photodegradation and biodegradation (Miller and Zepp, 1995; Benner and Opsahl, 2001; Ward et al., 2013; Fichot and Benner, 2014). Previous studies have shown that biodegradation dominates over photodegradation in river-influenced ocean margins (Ward et al., 2013; Fichot and Benner, 2014).

Here, we detected dozens of bacteria capable of degrading feruloyl ester linkages on a piece of sunken wood recovered from the sea bottom in the Tohoku district, northeastern Japan. To the best of our knowledge, only a few bacterial Faes have been identified in water-logged environments. We investigated the enzymatic and microbial properties of a selected isolate on synthetic esterase substrates and natural plant biomass components. This study sheds light on microbial processes that have been overlooked in terrestrial plant biomass degradation in aquatic environments.

Results

Screening of ethyl ferulate (EF)-degrading bacteria from sunken wood

Bacteria living on the sunken wood were grown on an agar medium containing rice straw meal as the nutrient source in artificial seawater. Sixty-six isolates were obtained based on differences in colony morphology. Subsequently, these isolates were inoculated on a solid agar medium containing EF (Fig. S1, compound 1). EF has low solubility in water; therefore, the medium is turbid. After a day of incubation, transparent regions were observed. About two-thirds of all isolates (45/66) showed detectable transparent zones around the colonies (Fig. S2), which indicated the decomposition of EF. One isolate that formed a large transparent zone, designated as strain OW59 (Fig. 1a), was subjected to further study. The Fae activities in the culture supernatants and cell lysates were measured separately. A higher activity was detected in the latter fraction, thereby indicating that the enzyme existed in the cell or was associated with its membrane (Fig. 1b). The addition of EF to the medium was not required for the expression of Fae activity.

Carbon source for the growth of strain OW59

To test the utilization of plant components as carbon sources by strain OW59, it was cultured in a minimal medium supplemented with a defined compound, selected from naturally occurring aromatic monomers and sugars, as the sole carbon source. The strain grew when sinapinic, ferulic, caffeic, syringic, vanillic, protocatechuic, benzoic, chlorogenic, or quinic acid (Fig. S1, compounds 2–10), L-arabinose (Fig. S1, compound 11), or D-xylose (Fig. S1, compound 12) was supplied as the sole carbon source. The relative cell densities after 3-day cultivation were 105.5 ± 20.5, 125.4 ± 35.6, 84.9 ± 11.4, 90.5 ± 14.8, 97.8 ± 16.8, 80.3 ± 14.2, 87.4 ± 3.3, 86.2 ± 26.4, 98.0 ± 30.5, 70.7 ± 21.8, and 77.1 ± 8.1 (in %, mean ± standard deviation), respectively, when the growth on D-glucose (Fig. S1, compound 13) was taken as 100%. Strain OW59 assimilated all the tested compounds. Among the tested substrates, ferulic acid supported the highest growth.
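The pairing of substrates with the reported relative cell densities can be checked with a short script. The following is a minimal sketch that assumes the eleven values are listed in the same order as the eleven substrates (as "respectively" implies); the variable names are illustrative only.

```python
# Substrates in the order given in the text (Fig. S1, compounds 2-12).
substrates = [
    "sinapinic acid", "ferulic acid", "caffeic acid", "syringic acid",
    "vanillic acid", "protocatechuic acid", "benzoic acid",
    "chlorogenic acid", "quinic acid", "L-arabinose", "D-xylose",
]
# Relative cell density (% of growth on D-glucose) and standard deviation,
# copied from the text; the one-to-one ordering is an assumption.
mean_pct = [105.5, 125.4, 84.9, 90.5, 97.8, 80.3, 87.4, 86.2, 98.0, 70.7, 77.1]
sd_pct = [20.5, 35.6, 11.4, 14.8, 16.8, 14.2, 3.3, 26.4, 30.5, 21.8, 8.1]

growth = {name: (m, s) for name, m, s in zip(substrates, mean_pct, sd_pct)}

# Substrate with the highest mean relative density.
best = max(growth, key=lambda name: growth[name][0])
# Substrates whose mean exceeds the D-glucose reference (100%).
above_glucose = [name for name, (m, _) in growth.items() if m > 100.0]

print(best)
print(above_glucose)
```

Because the standard deviations are large relative to the differences between means, this ranking is descriptive rather than a statistical comparison.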

Draft genome analysis of strain OW59

To assess the genetic potential of strain OW59 for plant biomass degradation, its whole genome was sequenced (GenBank: BBQY00000000.1). The assembly comprised 94 contigs with a total sequence length of 4,645,154 base pairs (bp). Functional assignments were manually conducted by homology searches against the nonredundant (nr) NCBI protein database and the Kyoto Encyclopedia of Genes and Genomes protein database. Strain OW59 was affiliated with Sphingobium xenophagum within the class Alphaproteobacteria based on the 16S rDNA sequence and phylogenetic analysis. The 16S rDNA sequence of OW59 showed 99% identity with that of S. xenophagum BN6T (type strain, DSM 6383) (GenBank: X94098). Among the published genome sequences of S. xenophagum, it showed 100% identity with S. xenophagum QYY (DSM 28059) (GenBank: GCA_000277525.1), S. xenophagum NBRC107872 (the representative genome of S. xenophagum; GenBank: GCA_000367345.1), and S. hydrophobicum C1 (KCTC42740, currently amended to S. xenophagum) (Feng et al., 2019) (GenBank: GCA_002288285.1). To assess the relatedness of the S. xenophagum strains, orthologous gene analysis was performed on the four genomes of strains OW59, QYY, NBRC107872, and C1 (Table S1). The common orthologs conserved in the four strains (2780 orthologs) accounted for 63.2% to 70.1% of all protein genes in each genome, whereas 14.4%–35.4% of the genes were unique to each strain (Table S1, Fig. S3). To assess the genetic potential of OW59 for plant biomass decomposition, carbohydrate-active enzyme (CAZyme) genes were searched using the dbCAN meta-server (Zhang et al., 2018). The results of the search are listed in Table 1. The genome of strain OW59 encoded multiple hemicellulose-degrading enzymes, such as xylan 1,4-β-xylosidase (GenBank: GBH28846, GBH28847) and α-D-xyloside xylohydrolase (GenBank: GBH28856, GBH28868). The total number of CAZymes in strain OW59 was higher than that in the other strains. In this search, feruloyl esterase was not assigned as a CAZyme, although feruloyl esterases are classified into carbohydrate esterase families 1, 4, and 6.

Additionally, genes for the enzymes involved in the degradation of lignin-derived compounds were searched in the OW59 genome annotation by the EC numbers of the enzymes described in earlier studies (Jackson et al., 2017; Janusz et al., 2017) (Table 2). Some small aromatic molecules arising from lignin breakdown may be metabolized oxidatively, mainly through peroxidases (GenBank: GBH28845, GBH29097, and GBH29105) and oxygenases (GenBank: GBH28757 and GBH28762) (Table 2).

An additional BLASTP search (version 2.8.0+) using the retrieved primary amino acid sequence against the UniProtKB/Swiss-Prot database revealed a protein (GenBank: GBH29579) with an identity of 41% (219/534 amino acids) and an e-value of 3e-114 to mono(2-hydroxyethyl) terephthalate hydrolase (MHETase) (PDB: 6JTU) of Ideonella sakaiensis (Betaproteobacteria class), and an identity of 24% (114/479 amino acids) with an e-value of 6e-26 to feruloyl esterase (Fae) B (PDB: 3WMT) of Aspergillus oryzae (Eurotiomycetes class). Prediction of protein localization sites (PSORT WWW Server, version 6.4; https://psort.hgc.jp/) suggested that the protein was presumably an inner/outer cell membrane protein with a signal peptide consisting of 24 amino acids. A multiple sequence alignment was constructed using Geneious software (Biomatters, Ltd., Auckland, New Zealand) employing the Clustal Omega program (Sievers and Higgins, 2018) (Fig. S4). The protein sequences used for the alignment were as follows: GBH29579 from S. xenophagum strain OW59 (GenBank: GBH29579), MHETase from Ideonella sakaiensis (PDB: 6JTU), FaeB from Aspergillus oryzae (PDB: 3WMT), FaeC from A. oryzae (PDB: 6G21), and FaeC from Fusarium oxysporum (PDB: 6TAT). The alignment revealed that these proteins have an α/β hydrolase domain sharing a conserved catalytic triad (Ser217-His508-Asp471 in GBH29579 numbering) within the tannase family block X in the ESTHER database. Also, a part of the oxyanion hole (Gly125) and a disulfide linkage (Cys216-Cys509), which are important for catalysis (Suzuki et al., 2014; Palm et al., 2019; Dimarogona et al., 2020), were highly conserved throughout the five proteins. The proteins are considerably similar in the α/β hydrolase domain, especially in the vicinity of the catalytic residues, and dissimilar in the lid domains.
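The two ratios quoted in parentheses for the BLASTP hits (219/534 and 114/479) reproduce the stated percent identities when read as identical residues over aligned positions. The sketch below shows that arithmetic; interpreting the ratios this way is an assumption, and the function name is illustrative.

```python
def percent_identity(identical: int, aligned: int) -> float:
    """Percent identity = identical residues / aligned positions x 100."""
    if aligned <= 0:
        raise ValueError("aligned length must be positive")
    return 100.0 * identical / aligned


# (identical residues, aligned positions) as quoted in the text.
hits = {
    "MHETase, I. sakaiensis (PDB: 6JTU)": (219, 534),
    "FaeB, A. oryzae (PDB: 3WMT)": (114, 479),
}

# Round to whole percent, as reported in the text.
rounded = {name: round(percent_identity(i, n)) for name, (i, n) in hits.items()}

for name, pct in rounded.items():
    print(f"{name}: {pct}% identity")
```

Both values round to the percentages given in the text (41% and 24%).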

The recombinant production, purification, and biochemical characterization of GBH29579

A gene for GBH29579 (locus tag: MBESOW_P0833) was heterologously expressed using an expression plasmid (GenBank: LC586980) and Bacillus subtilis strain ISW1214 as the host. The Bacillus strain was used to examine whether the intrinsic signal peptide directs the protein across the cellular membrane. The recombinant protein (rGBH29579) was produced extracellularly and purified from the culture supernatants (Table S2) to give a single band on SDS-polyacrylamide gel electrophoresis (Fig. S5). The apparent molecular weight (MW) of rGBH29579 was calculated as 53.9 ± 0.2 kDa (Fig. S6) from a linear fit of the log MW of size markers vs. the relative migration distance (Rf) on the SDS-polyacrylamide gel. It was smaller than that of the protein encoded by the GBH29579 gene (MW 60.5 kDa, amino acid numbers 1–575), which indicated that dozens of amino acids were cleaved from the gene product. Next, we analyzed the N-terminal amino acids of the purified rGBH29579, determining them to be Gly/Ala, followed by Ala/Pro, and Pro. Taking into account the apparent MW and the N-terminal amino acids, rGBH29579 may be a mixture produced by cleavage between amino acids 50–51 and 51–52. The calculated MWs were 55.6 and 55.5 kDa (amino acid numbers 51–575 and 52–575, respectively), close to the apparent MW on the SDS-polyacrylamide gel. To assess the Fae activity, the reaction mixture was analyzed using liquid chromatography/mass spectrometry (LC/MS) with full-scan electrospray ionization (ESI) in negative mode (Fig. 2a). In this mode, the observed m/z of the peaks corresponds to the deprotonated molecules, [M–H]–. rGBH29579 degraded EF, and the resulting reaction product was detected on a total ion chromatogram (TIC) at retention time (Rt) 2.8 min of reverse-phase column chromatography (Fig. 2a, upper TIC). No degradation product was detected in the absence of enzyme (Fig. 2a, lower TIC). The observed m/z of the reaction product was 193.05. By comparing the Rt and the fragmentation pattern of the mass spectrum with those of the standard chemical (Fig. 2, inset), the reaction product was confirmed to be ferulic acid. The optimum pH and temperature of the purified enzyme, determined using EF as a substrate, were pH 7.5 (Fig. 3a) and 55°C (Fig. 3b).
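The apparent MW estimate described above rests on a linear fit of log MW against Rf for the size markers. A minimal sketch of that calculation follows; the marker MWs, marker Rf values, and sample Rf are hypothetical placeholders (the actual values are in Fig. S6 and are not reproduced here), and the helper names are illustrative.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_mw_kda(marker_mw_kda, marker_rf, sample_rf):
    """Interpolate a sample MW from the log10(MW)-vs-Rf standard line."""
    log_mw = [math.log10(mw) for mw in marker_mw_kda]
    slope, intercept = fit_line(marker_rf, log_mw)
    return 10 ** (slope * sample_rf + intercept)


# Hypothetical marker ladder and migration distances (NOT from the paper).
marker_mw_kda = [100.0, 75.0, 50.0, 37.0, 25.0]
marker_rf = [0.20, 0.30, 0.45, 0.56, 0.72]
sample_rf = 0.42  # hypothetical Rf of the purified protein band

estimate = estimate_mw_kda(marker_mw_kda, marker_rf, sample_rf)
print(f"apparent MW: {estimate:.1f} kDa")
```

In practice the fit would be repeated over replicate gels to obtain a mean and standard deviation such as the 53.9 ± 0.2 kDa reported in the text.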

To assess the reactivity of the enzyme toward HACs with different substitutions on the aromatic ring, the enzyme was incubated with methyl esters of p-coumaric, sinapinic, and ferulic acids (Fig. S1, compounds 14–16). The enzyme cleaved the ester linkages in the methyl esters of p-coumaric and sinapinic acids, showing 80.2 ± 14.7 and 4.0 ± 1.0 mol% activity, respectively, relative to the activity for the methyl ester of ferulic acid. Additionally, the enzyme reactivity was tested for synthetic substrates, such as p-nitrophenyl acetate (pNP-C2), p-nitrophenyl butyrate (pNP-C4), and p-nitrophenyl octanoate (pNP-C8) (Fig. S1, compounds 17–19). The enzyme hydrolyzed pNP-C2 and pNP-C4 but did not degrade pNP-C8. The activities for pNP-C2 and pNP-C4 were 19.5 ± 4.2 and 23.1 ± 1.4 mol%, respectively, relative to that for EF.

Based on the observed substrate preference, GBH29579 was considered to be a Fae.

Furthermore, the activities for mono(2-hydroxyethyl) terephthalate (MHET) and bis(2-hydroxyethyl) terephthalate (BHET) (Fig. S1, compounds 20 and 21) were tested because the amino acid sequence of GBH29579 showed similarity to MHETase. Unexpectedly, rGBH29579 did not degrade MHET even after prolonged incubation for 16 h (Fig. 4a), although its overall similarity to the reported MHETase was higher than that to the reported Faes. In contrast, a small amount of product was detected after the enzymatic reaction with BHET (Fig. 4b). The reaction product was considered to be MHET based on the comparison of the mass spectrum with that of authentic MHET (Fig. 4b, inset). Under the same conditions, EF was completely degraded into ferulic acid (Fig. 4c).

The action of GBH29579 and OW59 on crude plant biomass extracts

To examine the activity of rGBH29579 and strain OW59 toward plant biomass, rice straw components extracted using acidified 1,4-dioxane (acid-dioxane extract of rice straw meal; ADRS) were used as the substrate. After 16 h of incubation of rGBH29579 with ADRS, the reaction mixture was analyzed using LC/MS (Fig. 5). The observed m/z of the predominant ions of the two major peaks (labeled with ** on the mass chromatogram in Fig. 5) generated by the enzyme reaction were consistent with the theoretical values of ferulic acid ([M–H]– 193.05) and p-coumaric acid ([M–H]– 163.04). The Rts and mass spectra matched well with those of the authentic compounds (Fig. 2, lower inset; Fig. 5, upper-right inset).
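The theoretical [M–H]– values quoted above can be reproduced from monoisotopic atomic masses. The sketch below assumes the standard molecular formulas of ferulic acid (C10H10O4) and p-coumaric acid (C9H8O3), which are not stated in the text, and subtracts the mass of a proton from the neutral monoisotopic mass; the function names are illustrative.

```python
# Monoisotopic masses (u) of the most abundant isotopes.
MONOISOTOPIC = {"C": 12.000000, "H": 1.00782503, "O": 15.99491462}
PROTON_MASS = 1.00727647  # mass of H+ (u)


def neutral_mass(composition):
    """Monoisotopic mass of a neutral molecule from its element counts."""
    return sum(MONOISOTOPIC[element] * count for element, count in composition.items())


def mz_deprotonated(composition):
    """m/z of the singly charged [M-H]- ion (neutral mass minus one proton)."""
    return neutral_mass(composition) - PROTON_MASS


# Standard molecular formulas (assumed; not given in the text).
ferulic_acid = {"C": 10, "H": 10, "O": 4}
p_coumaric_acid = {"C": 9, "H": 8, "O": 3}

for name, composition in [("ferulic acid", ferulic_acid),
                          ("p-coumaric acid", p_coumaric_acid)]:
    print(f"{name}: [M-H]- = {mz_deprotonated(composition):.2f}")
```

Rounded to two decimals, the results agree with the values of 193.05 and 163.04 given in the text.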

To detect unidentified reaction products and substrates of rGBH29579 and metabolites of strain OW59 on the natural plant component, ADRS, multivariate data analysis was conducted. The score plot of principal component analysis (PCA) showed a clear difference between rGBH29579-treated ADRS and untreated ADRS (Fig. 6a), suggesting the critical role of Fae in ADRS degradation. To identify the major discriminating molecules between treated and untreated ADRS, the loading data from orthogonal partial least squares discriminant analysis were plotted based on their contribution to the correlation and variation within the dataset (Fig. 6b). Candidate ions changed by the enzymatic treatment of the ADRS were chosen under the criteria of an absolute correlation value > 0.6 and an absolute loading variation > 0.0015. The selected ions (shown in the dotted box in Fig. 6b) contributing to significant differences were subjected to elemental composition analysis. The ions increased and decreased by rGBH29579 activity are shown separately in Fig. 6c and d, respectively. The change in the abundance of the selected ions was quantified from the area of the extracted mass chromatograms. The two ions labeled as Rt 3.03 min_m/z 163.04 and Rt 3.18 min_m/z 193.05 (** in Fig. 6c) were the mass ions derived from p-coumaric and ferulic acids, respectively. They were the two major reaction products of rGBH29579 with ADRS. The ions shown in Fig. 6d correspond to the substrates for the enzyme. Although their structures could not be determined by mass analysis, the elemental compositions of the three decreased ions were calculated to be C15H17O8 (Rt 2.87 min_m/z 325.09), C20H27O5 (Rt 4.55 min_m/z 347.19), and C18H31O2 (Rt 7.49 min_m/z 279.23) based on the high-resolution mass spectra at the respective Rts. Thus, the enzyme was capable of degrading synthetic HAC esters, lipase substrates, and various unknown substructures of natural plant components to produce p-coumaric and ferulic acids as the predominant products.
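The candidate-ion selection described above (absolute correlation above 0.6 and absolute loading above 0.0015) amounts to a two-threshold filter on the loading plot. The following is a minimal sketch under stated assumptions: the ion labels and values are hypothetical, positive values are taken to mean "increased by treatment" (the sign convention depends on how the classes were coded), and the names are illustrative rather than the authors' software.

```python
from typing import NamedTuple


class Ion(NamedTuple):
    label: str
    correlation: float  # correlation of the ion with the class separation
    loading: float      # loading (magnitude of contribution) of the ion


def select_candidates(ions, corr_cut=0.6, loading_cut=0.0015):
    """Split ions passing both absolute cutoffs into increased/decreased sets."""
    increased, decreased = [], []
    for ion in ions:
        if abs(ion.correlation) <= corr_cut or abs(ion.loading) <= loading_cut:
            continue  # fails at least one threshold
        if ion.correlation > 0 and ion.loading > 0:
            increased.append(ion)
        elif ion.correlation < 0 and ion.loading < 0:
            decreased.append(ion)
    return increased, decreased


# Hypothetical ions (NOT data from the paper).
example = [
    Ion("ion_A", 0.92, 0.0040),
    Ion("ion_B", -0.85, -0.0031),
    Ion("ion_C", 0.75, 0.0009),    # fails the loading cutoff
    Ion("ion_D", -0.30, -0.0050),  # fails the correlation cutoff
]

increased, decreased = select_candidates(example)
print([ion.label for ion in increased], [ion.label for ion in decreased])
```

Requiring both thresholds keeps ions that are both reliable (high correlation) and large in magnitude (high loading), which is the usual rationale for this kind of selection.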

In addition to the enzymatic conversion of ADRS, the metabolism of plant components by strain OW59 was examined by adding ADRS to the strain cultures. After a 96-h culture in ADRS-supplemented medium, the metabolites in the culture supernatant were analyzed using LC/MS. The characteristic ions for microbial metabolism were selected using the same methods as described above. The increased and decreased ions are shown in Fig. 7c and d, respectively. Whereas ferulic and p-coumaric acids were produced enzymatically in vitro (** in Fig. 5 and 6c), the two compounds were consumed by strain OW59 in vivo during cultivation (** in Fig. 7d). Based on the Rts and calculated elemental compositions as identifiers, the three major ions consumed by strain OW59, C15H17O8 (Rt 2.94 min_m/z 325.09), C20H27O5 (Rt 4.59 min_m/z 347.19), and C18H31O2 (Rt 7.56 min_m/z 279.23) (labeled with * in Fig. 7d), were considered identical to the above-mentioned three ions degraded by rGBH29579 treatment. Thus, strain OW59 was considered to degrade the ADRS mainly via Fae action.
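Matching ions between the enzymatic and culture experiments by Rt and m/z can be expressed as a tolerance-based comparison. The sketch below uses the Rt and m/z values quoted in the text, with the first culture retention time taken as 2.94 min; the tolerances (0.1 min and 0.01 m/z units) and the function name are assumptions chosen for illustration, not parameters reported by the authors.

```python
# (Rt in min, m/z) of the three ions decreased by rGBH29579 treatment.
enzyme_ions = [(2.87, 325.09), (4.55, 347.19), (7.49, 279.23)]
# (Rt in min, m/z) of the three ions consumed by strain OW59 in culture.
culture_ions = [(2.94, 325.09), (4.59, 347.19), (7.56, 279.23)]


def match_features(reference, query, rt_tol=0.1, mz_tol=0.01):
    """Pair each reference ion with the first query ion within both tolerances."""
    matches = []
    for ref_rt, ref_mz in reference:
        for qry_rt, qry_mz in query:
            if abs(ref_rt - qry_rt) <= rt_tol and abs(ref_mz - qry_mz) <= mz_tol:
                matches.append(((ref_rt, ref_mz), (qry_rt, qry_mz)))
                break
    return matches


matches = match_features(enzyme_ions, culture_ions)
for ref, qry in matches:
    print(f"enzyme Rt {ref[0]} / m/z {ref[1]}  <->  culture Rt {qry[0]} / m/z {qry[1]}")
```

With these tolerances all three ions pair up; the largest Rt shift between runs is 0.07 min, so a tighter Rt window would leave some ions unmatched.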

Discussion

Environmental settings and the role of the bacterial isolate in ferulic acid esterase activity

Strain OW59 was isolated from sunken wood recovered from the sea bottom off Otsuchi Bay in 2012. Earlier studies have demonstrated that the residence time of water in Otsuchi Bay is relatively short (0.5–1 year) (Lu et al., 2016). The offshore area was assumed to be strongly affected by riverine flows. Additionally, the riverine flows are affected by land usage and seasonal change.

HAC esters between lignin and hemicelluloses are present in non-woody angiosperms (Iiyama et al., 1994). These include major agricultural plant species, such as rice, wheat, corn, and sugar beet (Mathew and Abraham, 2004), and bamboo. A previous study reported that the terrestrial organic carbon flowing into Otsuchi Bay from 2012 to 2013 originated predominantly from non-woody angiosperms (Lu et al., 2016). In Otsuchi Bay, active biodegradation of plant biomass was observed, with a relatively high proportion (67%) of the lignin in DOC biologically degraded.

Based on carbon source utilization, strain OW59 appears adapted to growth on low-molecular-weight compounds that occur naturally in plants. Considering the environmental settings together with the phenotypic and genomic features of strain OW59 and closely related strains, Sphingomonas strains may promote the degradation of terrestrial plant components using extracellular hemicellulolytic enzymes together with the Fae in submerged environments. Strain OW59 exhibited Fae activity around the colony on solid agar medium, whereas the enzymatic activity was associated with cells in liquid culture under our experimental settings. Enzyme activity outside the cell is advantageous for degrading polymeric plant biomass because such polymers are too large to be transported across the cell membrane into living cells. In this study, the recombinant enzyme was produced extracellularly when a Gram-positive bacterium was used as the host. This result suggests that GBH29579 could be sorted across the inner membrane into the periplasmic space of the Gram-negative strain. Proteins in the periplasmic space can be transported after being incorporated into outer membrane vesicles in response to environmental and physical conditions (Kulp and Kuehn, 2010). To the best of our knowledge, this is the first identification and characterization of a non-secretory Fae. Different lengths of N-terminal amino acids were cleaved from the gene product in the purified rGBH29579, indicating that the cleavage position was not tightly regulated in the heterologous protein expression system used in this study. Still, the localization, the underlying mechanism of transport, and the physiological significance of the enzyme in strain OW59 remain to be elucidated.

Biochemical characteristics and substrate preference of OW59

The optimal temperature for rGBH29579 was much higher than the growth temperature of mesophilic bacteria, which indicates the high stability of rGBH29579. This high stability may explain why the enzyme remained active after its dispersion outside the cells.

rGBH29579 could efficiently hydrolyze ester linkages of HACs in crude extracts of plant biomass. In the test with synthetic substrates, rGBH29579 displayed a broad substrate preference, acting not only on methyl esters of HACs with different substitutions on the aromatic ring but also on aliphatic lipase substrates. Referring to the ABCD classification (Crepin et al., 2004), the substrate preference of rGBH29579 was similar to that of type B and C enzymes. Unlike type A and D enzymes, rGBH29579 did not release any diferulic acid, thereby indicating no reaction with dimeric HAC linkages between two polymeric chains in plant biomass polymers. rGBH29579 showed only low activity on sinapinic acid methyl ester, which is not a preferred substrate for type B Faes. rGBH29579 shared substrate preferences with some enzymes, such as AniFaeC from Aspergillus nidulans and AsFaeC from A. sydowii in subfamily 5 and AsFaeE from A. sydowii in subfamily 6 within the proposed classification for fungal Faes (Dilokpimol et al., 2018; Underlin et al., 2020).

According to structure-based amino acid sequence alignment, the catalytic machinery is typical for α/β hydrolases classified into the tannase family block X in the ESTHER database (Suzuki et al., 2014; Palm et al., 2019; Dimarogona et al., 2020). The amino acid sequence of GBH29579 showed higher similarity with the mono(2-hydroxyethyl) terephthalate (MHET) hydrolase of Ideonella sakaiensis than with reported Faes. Unexpectedly, rGBH29579 did not show detectable activity on MHET. Quite recently, I. sakaiensis MHETase was converted to a bis(2-hydroxyethyl) terephthalate hydrolase (BHETase) by protein engineering for better degradation of poly(ethylene terephthalate) film (Sagong et al., 2020). Future studies on the 3D structure of GBH29579 will expand our knowledge of substrate recognition among the MHETases, BHETases, and Faes, leading to the rational design of enzymes with a higher ability to degrade recalcitrant organic materials, including plastics.

Metabolism of aromatic compounds by strain OW59 and comparison with phylogenetically related strains

Previous studies have reported that members of S. xenophagum degrade xenobiotics and recalcitrant aromatic compounds, including anthraquinone dyes and chlorophenols, using them as their sole carbon source, nitrogen source, or both. The metabolic pathway and the genes responsible for this activity have not been elucidated. This study demonstrated that Fae activity is an important phenotypic feature of a Sphingobium strain for accelerating biomass degradation. Our study highlighted a previously overlooked involvement of a Sphingobium strain and the relevant enzymes in the biodegradation of HACs linked to hemicelluloses in terrestrial plant biomasses.

Furthermore, genetic and biochemical studies of related strains and degrading enzymes of anthropogenically and naturally occurring complex aromatic compounds will help decipher the involvement of microorganisms in recalcitrant organic materials degradation in aquatic environments.

Experimental Procedures

Sampling of sunken wood as a screening source of bacteria

Sunken wood was collected during the Research Vessel (R/V) Natsushima NT12-12 cruise (May 2012) of the Japan Agency for Marine-Earth Science and Technology (JAMSTEC). It was recovered off Sendai and Sanriku, Northern Japan (39°14.409N, 142°13.433E), at a depth of 492 m during the Hyperdolphin Dive #1384.

Culture conditions for screening, isolation, and carbon utilization tests for the bacteria

Bacterial isolation was conducted based on the methods described in our previous study (Ohta et al., 2012), using a medium containing 2% (w/v) rice straw as a nutrient source. The rice straw was a gift from rice farmers in Fukushima, Japan. To detect ethyl ferulate (EF)-degrading activity, the isolates were inoculated and incubated for a week on TM agar, which was composed of half-strength trypticase soy broth without dextrose (Thermo Fisher Scientific, Göteborg, Sweden), half-strength marine broth 2216 (Thermo Fisher Scientific), and 2% agar. Then, the cells were transferred onto the TM agar medium plus 10 mM EF (Fig. S1, compound 1). After incubation for 24 h at 25°C, clear zones around the colonies were assessed visually.

To test for carbon utilization, a defined mineral medium containing 1 mM of the test substrate as the sole carbon source was used. The mineral medium (100 mL) consisted of 20 mL basal salt solution (33.9 g Na2HPO4, 15.0 g KH2PO4, 10.0 g NaCl, and 5.0 g NH4Cl per liter of deionized H2O), 0.5 mL of 1 M MgSO4, 0.5 mL of 1 M CaCl2, 1 mL of 0.25% (w/v) Daigo's IMK medium (FUJIFILM Wako Chemicals, Osaka, Japan), 1 mL trace vitamin solution, 1 mL of 100 mM substrate stock solution, and 86.5 mL deionized H2O. The trace vitamin solution was prepared according to Balch et al. (1979). Before use, the medium was sterilized using a 0.22-μm-pore membrane filter. Stock solutions of 100 mM sinapinic, ferulic, caffeic, syringic, and vanillic acids (Fig. S1, compounds 2–6) were prepared using N,N-dimethylformamide (DMF) as a solvent. Protocatechuic, benzoic, chlorogenic, and quinic acids (Fig. S1, compounds 7–10), L-arabinose, D-xylose, and D-glucose (Fig. S1, compounds 11–13) were prepared in deionized H2O. A mineral medium containing 1 mM glucose with/without 1% (v/v) DMF was used as a positive control for growth. The growth of strain OW59 was not affected by supplementation with 1% DMF. Cell growth was monitored optically by measuring the absorbance at 600 nm using each culture's supernatant as a reference.

Analysis of bacterial metabolites using HPLC and LC/MS

Strain OW59 was cultured in TM broth containing 1.0 mM EF at 30°C with shaking for 4 d. Daily sampling was conducted. The samples were separated into supernatants and cell pellets by centrifugation. The cells were lysed for 10 min at 20°C using the BugBuster protein extraction reagent (Merck KGaA, Darmstadt, Germany) according to the manufacturer's instructions. Then, the culture supernatants and cell lysates were analyzed using reversed-phase HPLC and the method described in our previous study (Ohta et al., 2015). The amounts of substrate and metabolites in the culture supernatant were calculated based on the areas of the corresponding chromatographic peaks using 0.01–1.0 mM EF and ferulic acid as the standards. Uninoculated medium incubated under the same conditions as the test cultures was used as a blank sample to assess the effect of abiotic degradation of the respective substrate. High-resolution LC/MS data were generated using a Xevo G2 quadrupole time-of-flight mass spectrometer (Waters, MA, USA) operated in the negative ion ESI mode (Ohta et al., 2017). The mass data were processed using a Waters MarkerLynx XS Application Manager. The inlet system was a Waters Acquity H-Class UPLC system. It was operated at a flow rate of 0.4 mL/min with a BEH C18 reversed-phase column (1.8-μm particle size, 100 × 2.1 mm) (Waters) and mobile phases A (2 mM sodium acetate and 0.05% formic acid) and B (95% acetonitrile/H2O) under the following gradient conditions: 0–6 min, 95% to 5% A with B as the remainder; 6–7 min, 100% B.
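The peak-area quantification against EF and ferulic acid standards can be illustrated with a minimal sketch of an external linear calibration. The standard concentrations lie within the 0.01–1.0 mM range stated above, but the peak areas, the function names `fit_line` and `area_to_conc`, and the assumption of a strictly linear detector response are illustrative and are not taken from this study.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def area_to_conc(area, slope, intercept):
    """Convert a chromatographic peak area to a concentration (mM)."""
    return (area - intercept) / slope


# Hypothetical standards (not measured values from this work).
std_conc_mM = [0.01, 0.1, 0.5, 1.0]
std_area = [12.0, 120.0, 600.0, 1200.0]

slope, intercept = fit_line(std_conc_mM, std_area)

# Hypothetical peak area of ferulic acid in a culture supernatant.
sample_conc = area_to_conc(300.0, slope, intercept)
print(f"ferulic acid: {sample_conc:.3f} mM")
```

In practice, a separate calibration line would be fitted for each analyte (EF and ferulic acid), and the blank sample would be processed in the same way to correct for abiotic degradation.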

Genome sequence analysis of the isolate

Total genomic DNA of strain OW59 was extracted using a NucleoSpin Plant II Midi kit (Takara Bio, Shiga, Japan) according to the manufacturer's instructions. An Illumina shotgun library was constructed using the KAPA Hyper Prep kit for Illumina (Kapa Biosystems, MA, USA) and sequenced as 2 × 300 bp paired-end reads on an Illumina MiSeq platform (CA, USA). Raw reads were cleaned using Trimmomatic v. 0.36 (Bolger et al., 2014) to remove adapter and low-quality sequences with the following parameters: "ILLUMINACLIP:TruSeq3-PE-2.fa:2:30:10:8:true, LEADING:3, TRAILING:3, SLIDINGWINDOW:4:20, and MINLEN:100."
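To make the SLIDINGWINDOW:4:20 and MINLEN:100 settings concrete, the following sketch illustrates the general idea of sliding-window quality trimming: the read is scanned from the 5′ end and cut at the first window of four bases whose mean Phred quality falls below 20, after which reads shorter than 100 bases are discarded. This is a simplified illustration, not a reimplementation of Trimmomatic, and the function names and example quality scores are hypothetical.

```python
def sliding_window_trim(quals, window=4, min_avg=20):
    """Return how many 5' bases to keep.

    The read is cut at the start of the first window whose mean
    Phred quality is below min_avg (simplified behaviour).
    """
    for start in range(0, len(quals) - window + 1):
        win = quals[start:start + window]
        if sum(win) / window < min_avg:
            return start
    return len(quals)


def passes_minlen(kept_length, minlen=100):
    """True if the trimmed read is long enough to be retained."""
    return kept_length >= minlen


# Hypothetical read: 120 good bases followed by 30 poor ones.
quals = [30] * 120 + [10] * 30
kept = sliding_window_trim(quals)
print(kept, passes_minlen(kept))
```

With these example scores, the window starting at position 118 still averages exactly 20 and is accepted, so the cut falls one base later.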

De novo assembly was performed using CLC Genomics Workbench v. 11.0 software (Qiagen, Hilden, Germany) with the following parameter settings: a k-mer value of 64 bp, a bubble size of 500 bp, and mapping of reads back to the contigs with a length fraction of 0.9 and a similarity fraction of 0.9. This assembly yielded 94 contigs with a total length of 4,645,154 bp. The N50 was 328,600 bp, and the average read coverage was 105. Protein coding regions in the contigs were identified using MetaGeneMark (Zhu et al., 2010) and Glimmer-MG (Kelley et al., 2012). Functional assignments were manually conducted by homology searches against the nonredundant (nr) NCBI protein database and the Kyoto Encyclopedia of Genes and Genomes protein database. The rRNA and tRNA genes were predicted using Barrnap v. 0.9 (https://github.com/tseemann/barrnap) and tRNAscan-SE v. 1.13 (Lowe and Chan, 2016), respectively. Other non-coding RNAs were identified by searching the genome for the corresponding Rfam profiles.
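For readers unfamiliar with the N50 statistic reported for the assembly, the sketch below shows how it is conventionally computed: contigs are sorted from longest to shortest, and the N50 is the length of the contig at which the cumulative length first reaches half of the total assembly length. The contig lengths in the example are hypothetical and do not correspond to the 94 contigs of strain OW59.

```python
def n50(lengths):
    """Length of the contig at which the cumulative sum of contig
    lengths (longest first) reaches at least half of the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no contig lengths given")


# Hypothetical contig lengths (bp), for illustration only.
contigs = [80, 70, 50, 40, 30, 20]
print(n50(contigs))
```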

The carbohydrate-active enzyme (CAZyme) genes were searched using the dbCAN meta server (Zhang et al., 2018) with the tools HMMER (E-value < 1e–15, coverage > 0.35) (Finn et al., 2011), DIAMOND (E-value < 1e–102) (Buchfink et al., 2015), and Hotpep (frequency > 2.6, hits > 6) (Busk et al., 2017). For comparison with other S. xenophagum strains at the whole-genome level, orthology analysis was performed among the following four S. xenophagum strains (GenBank assembly accessions): OW59 (GCA_004305355.1), QYY (GCA_000277525.1), NBRC107872 (GCA_000367345.1), and C1 (GCA_002288285.1). Orthologous gene families were identified using all-against-all BLAST followed by the InParanoid ver. 4.1 program (O'Brien et al., 2005) with cutoff values of bit score > 50 and overlap > 70%. The Venn diagram was generated using a web service (http://bioinformatics.psb.ugent.be/webtools/Venn); the numbers in the diagram indicate the number of shared orthologous gene families.
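The numbers shown in a four-way Venn diagram correspond to counts of gene families grouped by the exact set of strains in which they occur. A minimal sketch of this bookkeeping is given below; the family identifiers (`fam1` to `fam4`) and the function name `venn_region_counts` are hypothetical, and the sketch assumes that ortholog family membership has already been determined.

```python
from collections import Counter


def venn_region_counts(families_by_strain):
    """Count gene families per Venn region.

    Each region is keyed by the sorted tuple of strains that
    contain the family.
    """
    strains = sorted(families_by_strain)
    all_families = set().union(*families_by_strain.values())
    counts = Counter()
    for fam in all_families:
        members = tuple(s for s in strains if fam in families_by_strain[s])
        counts[members] += 1
    return counts


# Hypothetical family assignments, for illustration only.
families = {
    "OW59": {"fam1", "fam2", "fam3", "fam4"},
    "QYY": {"fam1", "fam2"},
    "NBRC107872": {"fam1", "fam3"},
    "C1": {"fam1"},
}
regions = venn_region_counts(families)
for members, count in sorted(regions.items()):
    print(", ".join(members), count)
```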

Recombinant expression and purification of GBH29579

The expression plasmid for enzyme production was constructed using pHY300PLK as a parent plasmid (Ishiwa and Shibaharasone, 1986). The whole coding sequence for GBH29579 was inserted between 393-bp and 235-bp stretches of promoter and terminator sequences, respectively. The full nucleotide sequence of the expression plasmid was submitted to the GenBank database under the accession number LC586980. The plasmid, designated as pHG29579, was introduced into Bacillus subtilis ISW1214. B. subtilis ISW1214 harboring pHG29579 was incubated at 30°C in PPS medium (Table S3) for 96 h with shaking at 120 rpm. Cells and supernatants of the cultures were separated by centrifugation at 9,100 ×g for 10 min. The supernatant was used for enzyme purification. All procedures for enzyme purification were performed at temperatures below 4°C. The culture supernatant (100 mL) was diluted to 4 L with 10 mM Tris/HCl (pH 7.5). After removing insoluble materials by centrifugation at 9,100 ×g for 10 min, the clarified solution was applied to a TOYOPEARL Super Q-650 (TOSOH, Tokyo, Japan) anion exchange column equilibrated with 50 mM Tris/HCl (pH 7.5). The pass-through fraction was divided into four fractions of equal volume and loaded onto four hydroxyapatite columns (1.0 × 10 cm) previously equilibrated with 2 mM sodium phosphate (pH 7.5). The enzyme was eluted with 50 mL of 50 mM sodium phosphate for each column. The eluted fractions were loaded onto a Super Q-650 anion exchange column equilibrated with 10 mM Tris/HCl (pH 7.5). After washing the column with 200 mL of 10 mM Tris/HCl (pH 7.5), the enzyme was eluted with 50 mL of 50 mM NaCl in the same buffer. The active fractions were combined and concentrated by ultrafiltration on a centrifugal filter unit (Amicon Ultra-15, 10 kDa cutoff) (Merck KGaA). The concentrated fraction was used as the final preparation of the purified recombinant enzyme throughout the experiments. The N-terminal amino acids of rGBH29579 were analyzed using a Procise 492 HT protein sequencer (Applied Biosystems, Foster City, CA, USA) after desalting using a ProSorb cartridge (Applied Biosystems).

Activity measurement and biochemical characterization of ferulic acid esterase

The enzyme activity was measured in 50 mM Tris/HCl (pH 7.5) containing 5% (v/v) DMF using 5 mM EF as a substrate. The reactions were performed in triplicate. Unless otherwise stated, the reaction mixture was incubated at 50°C for 30 min, and the enzyme was then inactivated by heating at 95°C for 10 min. The formation of the reaction product, ferulic acid, was quantified using HPLC as described above.

The optimum pH for the enzymatic activity was determined using 50 mM Britton–Robinson buffer. The optimal temperature was determined by measuring the product formation after 30 min of incubation at the optimal pH and a temperature of 55°C. For determination of substrate specificity, methyl esters of p-coumaric, sinapinic, and ferulic acids (compounds 14–16, Fig. S1), mono(2-hydroxyethyl) terephthalate (MHET), and bis(2-hydroxyethyl) terephthalate (BHET) (compounds 20 and 21, Fig. S1) were used at 1 mM in 50 mM Tris/HCl (pH 7.5) containing 5% (v/v) DMF. After incubation for 1 or 16 h, the enzyme was inactivated by heating at 95°C for 10 min. The product formation was monitored using LC/MS as described above. The collision energy in selected ion monitoring (SIM) mode was set at 10 V. For specific detection of the hydrolytic products of MHET, BHET, and EF, the ions of m/z 165 (m/z of terephthalic acid: 165.02), 209 (m/z of MHET: 209.05), and 193 (m/z of ferulic acid: 193.05) were selected, respectively, as target ions by quadrupole settings.
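The target m/z values used for SIM can be checked from the elemental compositions of the expected products. The sketch below computes the monoisotopic m/z of the deprotonated molecule, [M–H]–, as the neutral monoisotopic mass minus the mass of a proton. The molecular formulas (terephthalic acid, C8H6O4; MHET, C10H10O5; ferulic acid, C10H10O4) and the atomic masses are standard values supplied here for illustration and are not stated in the text; the function name `mz_deprotonated` is hypothetical.

```python
# Monoisotopic atomic masses (u) and the proton mass (u).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
PROTON = 1.00727647


def mz_deprotonated(formula):
    """Monoisotopic m/z of the singly charged [M-H]- ion.

    formula maps element symbols to atom counts of the neutral molecule.
    """
    neutral = sum(MONOISOTOPIC[el] * n for el, n in formula.items())
    return neutral - PROTON


products = {
    "terephthalic acid": {"C": 8, "H": 6, "O": 4},
    "MHET": {"C": 10, "H": 10, "O": 5},
    "ferulic acid": {"C": 10, "H": 10, "O": 4},
}
mz = {name: mz_deprotonated(f) for name, f in products.items()}
for name, value in mz.items():
    print(f"{name}: {value:.2f}")
```

Rounded to two decimal places, the computed values agree with the target ions given above (165.02, 209.05, and 193.05).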

The hydrolytic activities toward the synthetic lipase substrates p-nitrophenyl (pNP) acetate, butyrate, and octanoate (compounds 17–19, Fig. S1) were measured using HPLC as described above. The activities toward the tested substrates were expressed relative to the activity toward EF by comparing the molar amount of the product, p-nitrophenol, with that of the product released from EF in parallel experiments under standard conditions.

The plant materials for ferulic acid release tests were prepared as follows: rice straw was dried at 60°C for 16 h and milled using a wonder-blender D3V-10 (Osaka Chemical, Osaka, Japan) for 2 min at 25,000 rpm. The meal was suspended at 10% (w/v) in 1,4-dioxane containing 20 mM HCl for a day at room temperature. The slurry was filtered, and the filtrate was evaporated to dryness at 40°C under reduced pressure. The dry material was used as an acid-dioxane rice straw extract (ADRS). To detect unidentified reaction products and substrates of rGBH29579 and metabolites of strain OW59 from ADRS, a multivariate data matrix was analyzed using EZinfo software (Waters). Each peak was assigned using Rt and m/z pairs as identifiers. The resulting 2D data, labeled by Rt_m/z pair, were subjected to principal component analysis (PCA) and orthogonal partial least squares discriminant analysis. The intensities of the detected peaks were then normalized to the sum of the peak intensities of the respective sample.
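The normalization of each peak to the summed peak intensity of its sample can be sketched as follows. The Rt_m/z labels and intensities are hypothetical examples, and the function name `normalize_to_total` is an assumption for illustration; the actual processing in this study was carried out in the Waters software.

```python
def normalize_to_total(sample_peaks):
    """Scale each peak intensity by the sum of all peak intensities
    in the same sample, so that the values of a sample sum to 1."""
    total = sum(sample_peaks.values())
    if total == 0:
        raise ValueError("sample has no signal to normalize")
    return {label: value / total for label, value in sample_peaks.items()}


# Hypothetical peaks of one sample, keyed by Rt_m/z label.
sample = {
    "3.21_193.05": 300.0,
    "2.85_163.04": 100.0,
    "4.10_209.05": 600.0,
}
normalized = normalize_to_total(sample)
for label, value in normalized.items():
    print(label, round(value, 3))
```

Normalizing to the total signal makes samples with different overall intensities comparable before multivariate analysis, at the cost of expressing each peak as a relative rather than absolute abundance.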

Material and sequence data availability

The genomic and plasmid nucleotide sequences have been deposited in the DNA Data Bank of Japan, European Molecular Biology Laboratory, and GenBank databases under the accession numbers BBQY01000001–BBQY01000094 and LC586980, respectively. The strain was deposited in the culture collection of the National Institute of Technology and Evaluation as NBRC 114557.

Acknowledgments

This work was supported by a Grant-in-Aid for Scientific Research (15KT00123) from the Japan Society for the Promotion of Science (JSPS) to Y. O., H. N., and T. W. K. K. and Y. O. are grateful for the support of grant number JP19H04311 from the JST-Mirai program of the Japan Science and Technology Agency. We would like to thank the captain (Dr. Kazumasa Oguri) and the crew of the Hyperdolphin team, the scientists who were onboard the R/V Natsushima during the NT12-12 cruise, Dr. Takuma Haga, Dr. Yuji Hatada, and Mr. Shinro Nishi, and members of the Research and Development Center for Marine Biosciences (JAMSTEC) for their support.

Author contributions

Y. O. designed this study. Y. O., M. K., and K. K. performed experiments and data curation. Y. T. analyzed the genomic data. Y. O., H. N., and M. K. wrote the paper, guided by T. W. and K. K. All authors read, revised, and approved the final manuscript.

Conflict of interest

The authors declare they have no conflict of interest.

References

Balch, W.E., Fox, G.E., Magrum, L.J., Woese, C.R., and Wolfe, R.S. (1979) Methanogens:

reevaluation of a unique biological group. Microbiol Rev 43: 260-296.

Banerjee, A., Jana, A., Pati, B.R., Mondal, K.C., and Das Mohapatra, P.K. (2012)

Characterization of tannase protein sequences of bacteria and fungi: an in

silico study. Protein J 31: 306-327.

Benner, R., and Opsahl, S. (2001) Molecular indicators of the sources and transformations of

dissolved organic matter in the Mississippi river plume. Org Geochem 32: 597-611.

Bianchi, T.S. (2011) The role of terrestrially derived organic carbon in the coastal ocean: A

changing paradigm and the priming effect. Proc Natl Acad Sci U S A 108: 19473-19481.

Bienhold, C., Ristova, P.P., Wenzhofer, F., Dittmar, T., and Boetius, A. (2013) How Deep-Sea

Wood Falls Sustain Chemosynthetic Life. PLOS One 8.

Bolger, A.M., Lohse, M., and Usadel, B. (2014) Trimmomatic: a flexible trimmer for Illumina

sequence data. Bioinformatics 30: 2114-2120.

Buchfink, B., Xie, C., and Huson, D.H. (2015) Fast and sensitive protein alignment using

DIAMOND. Nat Methods 12: 59-60.

Busk, P.K., Pilgaard, B., Lezyk, M.J., Meyer, A.S., and Lange, L. (2017) Homology to peptide

pattern for annotation of carbohydrate-active enzymes and prediction of function. BMC

Bioinformatics 18.

Crepin, V.F., Faulds, C.B., and Connerton, I.F. (2004) Functional classification of the microbial

feruloyl esterases. Appl Microbiol Biotechnol 63: 647-652.

de Oliveira, D.M., Finger-Teixeira, A., Mota, T.R., Salvador, V.H., Moreira-Vilar, F.C.,

Molinari, H.B. et al. (2015) Ferulic acid: a key component in grass lignocellulose

recalcitrance to hydrolysis. Plant Biotechnol J 13: 1224-1232.

Dilokpimol, A., Makela, M.R., Aguilar-Pontes, M.V., Benoit-Gelber, I., Hilden, K.S., and de

Vries, R.P. (2016) Diversity of fungal feruloyl esterases: updated phylogenetic classification,

properties, and industrial applications. Biotechnol Biofuels 9.

Dimarogona, M., Topakas, E., Christakopoulos, P., and Chrysina, E.D. (2020) The crystal

structure of a Fusarium oxysporum feruloyl esterase that belongs to the tannase family. FEBS

Lett 594: 1738-1749.

Fagervold, S.K., Galand, P.E., Zbinden, M., Gaill, F., Lebaron, P., and Palacios, C. (2012)

Sunken woods on the ocean floor provide diverse specialized habitats for

microorganisms. FEMS Microbiol Ecol 82: 616-628.

Faulds, C.B. (2010) What can feruloyl esterases do for us? Phytochem Rev 9: 121-132.

Feng, G.D., Chen, M.B., Zhang, X.J., Wang, D.D., and Zhu, H.H. (2019) Whole genome

sequences reveal the presence of 11 heterotypic synonyms in the genus Sphingobium and

emended descriptions of Sphingobium indicum, Sphingobium fuliginis, Sphingobium

xenophagum and Sphingobium cupriresistens. Int J Syst Evol Microbiol 69: 2161-2165.

Fichot, C.G., and Benner, R. (2014) The fate of terrigenous dissolved organic carbon in a river-

influenced ocean margin. Global Biogeochem Cycles 28: 300-318.

Finn, R.D., Clements, J., and Eddy, S.R. (2011) HMMER web server: interactive sequence

similarity searching. Nucleic Acids Res 39: W29-W37.

Follett, C.L., Repeta, D.J., Rothman, D.H., Xu, L., and Santinelli, C. (2014) Hidden cycle of

dissolved organic carbon in the deep ocean. Proc Natl Acad Sci U S A 111: 16706-16711.

Gontikaki, E., Thornton, B., Cornulier, T., and Witte, U. (2015) Occurrence of priming in the

degradation of lignocellulose in marine sediments. PLOS ONE 10: e0143917.

Grabber, J.H., Mertens, D.R., Kim, H., Funk, C., Lu, F.C., and Ralph, J. (2009) Cell wall

fermentation kinetics are impacted more by lignin content and ferulate cross-linking than by

lignin composition. J Sci Food Agric 89: 122-129.

Griffith, D.R., McNichol, A.P., Xu, L., McLaughlin, F.A., Macdonald, R.W., Brown, K.A., and

Eglinton, T.I. (2012) Carbon dynamics in the western Arctic Ocean: insights from full-depth

carbon isotope profiles of DIC, DOC, and POC. Biogeosciences 9: 1217-1224.

Guerriero, G., Hausman, J.F., Strauss, J., Ertan, H., and Siddiqui, K.S. (2015) Destructuring

plant biomass: Focus on fungal and extremophilic cell wall hydrolases. Plant Sci 234: 180-

193.

Hartley, R.D. (1972) p-Coumaric and ferulic acid components of cell walls of ryegrass and their

relations with lignin and digestibility. J Sci Food Agric 23: 1347-1354.

Hatfield, R.D., Ralph, J., and Grabber, J.H. (1999) Cell wall cross-linking by ferulates and

diferulates in grasses. J Sci Food Agric 79: 403-407.

Iiyama, K., Lam, T.B.T., and Stone, B.A. (1994) Covalent cross-links in the cell-wall. Plant

Physiol 104: 315-320.

Ishiwa, H., and Shibaharasone, H. (1986) New shuttle vectors for Escherichia coli and Bacillus

subtilis .4. The Nucleotide-sequence of PHY300PLK and some properties in relation to

transformation. Jpn J Genet 61: 515-528.

Jackson, C.A., Couger, M.B., Prabhakaran, M., Ramachandriya, K.D., Canaan, P., and

Fathepure, B.Z. (2017) Isolation and characterization of Rhizobium sp. strain YS-1r that

degrades lignin in plant biomass. J Appl Microbiol 122: 940-952.

Janusz, G., Pawlik, A., Sulej, J., Swiderska-Burek, U., Jarosz-Wilkolazka, A., and Paszczynski,

A. (2017) Lignin degradation: microorganisms, enzymes involved, genomes analysis and

evolution. FEMS Microbiol Rev 41: 941-962.

Karlen, S.D., Zhang, C.C., Peck, M.L., Smith, R.A., Padmakshan, D., Helmich, K.E. et al.

(2016) Monolignol ferulate conjugates are naturally incorporated into plant lignins. Sci Adv 2:

e1600393.

Kelley, D.R., Liu, B., Delcher, A.L., Pop, M., and Salzberg, S.L. (2012) Gene prediction with

Glimmer for metagenomic sequences augmented by classification and clustering. Nucleic

Acids Res 40: e9.

Kulp, A., and Kuehn, M.J. (2010) Biological Functions and Biogenesis of Secreted Bacterial

Outer Membrane Vesicles. In Annual Review of Microbiology, Vol 64, 2010. Gottesman, S.,

and Harwood, C.S. (eds). Palo Alto: Annual Reviews, pp. 163-184.

Lenfant, N., Hotelier, T., Velluet, E., Bourne, Y., Marchot, P., and Chatonnet, A. (2013)

ESTHER, the database of the α/β-hydrolase fold superfamily of proteins: tools to explore

diversity of functions. Nucleic Acids Res 41: D423-D429.

Levasseur, A., Drula, E., Lombard, V., Coutinho, P.M., and Henrissat, B. (2013) Expansion of

the enzymatic repertoire of the CAZy database to integrate auxiliary redox

enzymes. Biotechnol Biofuels 6: 41.

Lowe, T.M., and Chan, P.P. (2016) tRNAscan-SE On-line: integrating search and context for

analysis of transfer RNA genes. Nucleic Acids Res 44: W54-W57.

Lu, C.J., Benner, R., Fichot, C.G., Fukuda, H., Yamashita, Y., and Ogawa, H. (2016) Sources

and transformations of dissolved lignin phenols and chromophoric dissolved organic matter

in Otsuchi Bay, Japan. Front Mar Sci 3: 85.

Makela, M.R., Dilokpimol, A., Koskela, S.M., Kuuskeri, J., de Vries, R.P., and Hilden, K.

(2018) Characterization of a feruloyl esterase from Aspergillus terreus facilitates the division

of fungal enzymes from Carbohydrate Esterase family 1 of the carbohydrate-active enzymes

(CAZy) database. Microb Biotechnol 11: 869-880.

Marchot, P., and Chatonnet, A. (2012) Hydrolase versus other functions of members of the α/β-

hydrolase fold superfamily of proteins. Protein Pept Lett 19: 130-131.

Mathew, S., and Abraham, T.E. (2004) Ferulic acid: An antioxidant found naturally in plant cell

walls and feruloyl esterases involved in its release and their applications. Crit Rev

Biotechnol 24: 59-83.

McNichol, A.P., and Aluwihare, L.I. (2007) The power of radiocarbon in biogeochemical

studies of the marine carbon cycle: Insights from studies of dissolved and particulate organic

carbon (DOC and POC). Chem Rev 107: 443-466.

Miller, W.L., and Zepp, R.G. (1995) Photochemical production of dissolved inorganic carbon

from terrestrial organic matter: Significance to the oceanic organic carbon cycle. Geophys

Res Lett 22: 417-420.

O'Brien, K.P., Remm, M., and Sonnhammer, E.L.L. (2005) Inparanoid: a comprehensive

database of eukaryotic orthologs. Nucleic Acids Res 33: D476-D480.

Ohta, Y., Nishi, S., Hasegawa, R., and Hatada, Y. (2015) Combination of six enzymes of a

marine Novosphingobium converts the stereoisomers of β-O-4 lignin model dimers into the

respective monomers. Sci Rep 5: 15105.

Ohta, Y., Hasegawa, R., Kurosawa, K., Maeda, A.H., Koizumi, T., Nishimura, H. et al. (2017)

Enzymatic specific production and chemical functionalization of phenylpropanone platform

monomers from lignin. ChemSusChem 10: 425-433.

Ohta, Y., Nishi, S., Haga, T., Tsubouchi, T., Hasegawa, R., Konishi, M. et al. (2012) Screening

and phylogenetic analysis of deep-sea bacteria capable of metabolizing lignin-derived

aromatic compounds. Open J Mar Sci 2: 11.

Palm, G.J., Reisky, L., Bottcher, D., Muller, H., Michels, E.A.P., Walczak, M.C. et al. (2019)

Structure of the plastic-degrading Ideonella sakaiensis MHETase bound to a substrate. Nat

Commun 10: 1717.

Ralph, J. (2010) Hydroxycinnamates in lignification. Phytochem Rev 9: 65-83.

Raymond, P.A., and Bauer, J.E. (2001) Use of 14C and 13C natural abundances for evaluating

riverine, estuarine, and coastal DOC and POC sources and cycling: a review and

synthesis. Org Geochem 32: 469-485.

Regner, M., Bartuce, A., Padmakshan, D., Ralph, J., and Karlen, S.D. (2018) Reductive

Cleavage Method for Quantitation of Monolignols and Low-Abundance Monolignol

Conjugates. ChemSusChem 11: 1600-1605.

Sagong, H.Y., Seo, H., Kim, T., Son, H.F., Joo, S., Lee, S.H. et al. (2020) Decomposition of the

PET Film by MHETase Using Exo-PETase Function. ACS Catal 10: 4805-4812.

Sievers, F., and Higgins, D.G. (2018) Clustal Omega for making accurate alignments of many

protein sequences. Protein Sci 27: 135-145.

Suzuki, K., Hori, A., Kawamoto, K., Thangudu, R.R., Ishida, T., Igarashi, K. et al. (2014)

Crystal structure of a feruloyl esterase belonging to the tannase family: A disulfide bond near

a catalytic triad. Proteins 82: 2857-2867.

Underlin, E.N., Frommhagen, M., Dilokpimol, A., van Erven, G., de Vries, R.P., and Kabel,

M.A. (2020) Feruloyl Esterases for Biorefineries: Subfamily Classified Specificity for

Natural Substrates. Front Bioeng Biotechnol 8: 17.

Ward, N.D., Keil, R.G., Medeiros, P.M., Brito, D.C., Cunha, A.C., Dittmar, T. et al. (2013)

Degradation of terrestrially derived macromolecules in the Amazon river. Nat Geosci 6: 530-

533.

Waterhouse, A., Bertoni, M., Bienert, S., Studer, G., Tauriello, G., Gumienny, R. et al. (2018)

SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids

Res 46: W296-W303.

Wolff, T. (1979) Macrofaunal utilization of plant remains in the deep-sea. SARSIA 64: 117-+.

Wong, D.W.S. (2006) Feruloyl esterase - A key enzyme in biomass degradation. Appl Biochem

Biotechnol 133: 87-112.

Zhang, H., Yohe, T., Huang, L., Entwistle, S., Wu, P.Z., Yang, Z.L. et al. (2018) dbCAN2: a

meta server for automated carbohydrate-active enzyme annotation. Nucleic Acids Res 46:

W95-W101.

Zhu, W.H., Lomsadze, A., and Borodovsky, M. (2010) Ab initio gene identification in

metagenomic sequences. Nucleic Acids Res 38: e132.

Figure 1. Detection of ethyl ferulate (EF)-degrading activity of the bacteria isolated from the sunken wood. (a) Clear zone formation on the solid media containing 10 mM EF by the selected isolates (designated as OW55–60). The results for 66 isolates (designated as OW1–66) for EF-degrading activity are shown in Figure S2. (b) Quantification and localization of ferulic acid esterase (Fae) activity of the selected isolate (strain OW59) during liquid culture. Periodical sampling was conducted for 4 d after the inoculation of the strain into the media (0 d) with/without EF. The Fae activity in the cell lysate and supernatant of culture broth obtained from cultivation with EF (open circle and triangle) and without EF (closed circle and closed triangle) were quantified using HPLC. Error bars represent standard error of the mean of duplicate experiments.

Figure 2. Identification of the reaction product of the recombinant gene product of GBH29579 (rGBH29579) with EF using liquid chromatography/mass spectrometry (LC/MS). (a) Total ion chromatogram (TIC) obtained by LC/MS to monitor the reaction of rGBH29579 with EF (upper TIC) using the full-scan electrospray ionization (ESI) method in negative mode. Rt denotes the retention time of reversed-phase column chromatography. The detected peaks of the reaction products are labeled with the observed m/z that corresponds to the deprotonated molecules, [M–H]–. The TIC from the parallel experiment without the addition of enzyme (no enzyme control) is shown below that from the enzyme reaction (lower TIC). (b) TIC of authentic ferulic acid obtained under identical conditions for reference. The insets are mass spectra obtained under identical conditions.

Figure 3. Biochemical characterization of rGBH29579. (a) The pH–activity curve of the purified rGBH29579. The activity was measured in 50 mM Britton–Robinson buffer at 50°C. The values are shown as percentages of the maximal activity of rGBH29579 observed at pH 7.5, which is taken as 100%. (b) The temperature–activity curve of the purified rGBH29579. The values are shown as percentages of the maximal activity of rGBH29579 observed at 55°C, which is taken as 100%. Error bars represent the standard error of the mean of duplicate experiments.

Figure 4. LC/UV monitoring of the rGBH29579 reactivity with the terephthalate esters and LC/MS operating in selected ion monitoring (SIM) mode to identify the products. (a) Reaction mixtures of mono(2-hydroxyethyl) terephthalate (MHET) with or without rGBH29579 were analyzed using LC/UV coupled to a mass detector after 16-h incubation at 50°C. Chromatograms of the UV detector response at 240 nm are shown on the left side. Mass chromatograms acquired in SIM mode are shown on the right side. The target mass was set at the deprotonated terephthalic acid molecule ([M–H]– m/z 165), which is the hydrolytic product of MHET. To examine the non-enzymatic degradation of the substrate, a parallel experiment without the addition of enzyme (no enzyme control) was conducted. The mass chromatogram is shown below that from the enzyme reactions. (b) Bis(2-hydroxyethyl) terephthalate (BHET) was used as the substrate and analyzed by the same method as in (a). The target mass was set at the deprotonated molecule of MHET ([M–H]– m/z 209), which is the hydrolytic product of BHET. The mass spectra of the detected product (upper inset) and of authentic MHET (lower inset) were acquired under the same conditions. (c) EF was used as the substrate and analyzed by the same method as in (a) as a positive control for enzyme activity. The target mass was set at the deprotonated molecule of ferulic acid ([M–H]– m/z 193), which is the hydrolytic product of EF.

Figure 5. LC/MS analysis of major products from ADRS by rGBH29579 reaction. (a) Mass chromatogram based on base peak ions (BPI) of rGBH29579-treated ADRS (upper chromatogram). BPI represents the ions of the highest intensity among the ions detected at the respective time. The major peaks and ions produced by the rGBH29579 reaction are labeled with double asterisks (**) and the m/z of the BPI. Mass spectra of the two major products and authentic p-coumaric acid were obtained under identical conditions and are shown in the insets for comparison of m/z and fragmentation patterns. To compare the mass spectrum of the peak of m/z 195.3 with that of authentic ferulic acid, refer to the spectrum in the Fig. 2 inset. The BPI chromatogram of untreated ADRS was obtained in parallel control experiments without the addition of rGBH29579 (lower chromatogram). The peaks reduced by the rGBH29579 reaction are labeled with asterisks (*) and the respective m/z of the BPI.

Figure 6. Multivariate analysis of LC/MS data for detecting unidentified reaction products and substrates for rGBH29579 in ADRS. (a) The principal component analysis (PCA) score plot was constructed based on loading data of mass ions from rGBH29579-treated ADRS (black-line circle), untreated ADRS (dotted-line circle), and the rGBH29579 reaction mixture without ADRS (gray-line circle). (b) Possible discriminating ions (dotted-line box) for the enzyme reaction were selected from the loading plot of the orthogonal partial least squares discriminant analysis (OPLS-DA) model for rGBH29579-treated ADRS and untreated ADRS under the limitation of p-value > |0.6| and loadings > |0.0015|. The abundances of the selected ions were calculated from the area of the extracted mass chromatogram. The ion peak areas for rGBH29579-treated ADRS (gray bar) and untreated ADRS (black bar) are depicted in the bar chart. The increased and decreased ions after rGBH29579 treatment are shown in (c) and (d), respectively. The ions with double and single asterisks (**, *) are considered identical to those in Figs. 5 and 7. Error bars represent the standard error of the mean of triplicate experiments.

Figure 7. Multivariate analysis of LC/MS data for detecting unidentified substrates and metabolites of strain OW59 from ADRS. (a) The PCA score plot was constructed based on loading data of mass ions from strain OW59-treated ADRS (black-line circle), untreated ADRS (dotted-line circle), and strain OW59 culture broth (gray-line circle). (b) Possible discriminating ions (dotted-line box) for the enzyme reaction were selected from the loading plots of the OPLS-DA model for strain OW59-treated ADRS and untreated ADRS under the same limitation as in Fig. 6. The abundances of the selected ions were calculated from the area of the extracted mass chromatogram. The ion peak areas for strain OW59-treated ADRS (gray bar) and untreated ADRS (black bar) are depicted in the bar chart. The increased and decreased ions after strain OW59 treatment are shown in (c) and (d), respectively. The ions with double and single asterisks (**, *) are considered identical to those in Figs. 5 and 6. Error bars represent the standard error of the mean of triplicate experiments.

Table 1. Putative genes of carbohydrate-active enzymes (CAZymes) in strain OW59 genome and the orthologs in closely related strains of Sphingobium xenophagum.

Gene ID (locus_tag) | GenBank accession | CAZy family | Annotation | EC number | Ortholog gene ID (locus_tag) in QYY | Ortholog gene ID (locus_tag) in NBRC | Ortholog gene ID (locus_tag) in C1
MBESOW_P0099 | GBH28846 | GH43_12 | xylan 1,4-β-xylosidase / α-N-arabinofuranosidase | EC:3.2.1.37 / EC:3.2.1.55 | NA | SX1_RS22870 | NA
MBESOW_P0100 | GBH28847 | GH39 | xylan 1,4-β-xylosidase | EC:3.2.1.37 | NA | SX1_RS17250 | NA
MBESOW_P0108 | GBH28855 | GH29 | α-L-fucosidase | EC:3.2.1.51 | NA | NA | NA
MBESOW_P0109 | GBH28856 | GH31 | α-D-xyloside xylohydrolase | EC:3.2.1.177 | NA | NA | NA
MBESOW_P0110 | GBH28857 | GH35 | hypothetical protein | NA | NA | NA | NA
MBESOW_P0111 | GBH28858 | GH3 | β-glucosidase | EC:3.2.1.21 | NA | NA | NA
MBESOW_P0113 | GBH28860 | GH97 | α-glucosidase | EC:3.2.1.20 | NA | SX1_RS17280 | NA
MBESOW_P0114 | GBH28861 | GH97 | α-glucosidase | EC:3.2.1.20 | NA | SX1_RS17285 | NA
MBESOW_P0119 | GBH28866 | GH2 | β-galactosidase | EC:3.2.1.23 | NA | SX1_RS17310 | NA
MBESOW_P0121 | GBH28868 | GH31 | α-D-xyloside xylohydrolase | EC:3.2.1.177 | NA | SX1_RS17320 | NA
MBESOW_P0172 | GBH28919 | GT51 | penicillin-binding protein 1A | EC:2.4.1.129; EC:3.4.16.4 | QYYP_RS0112080 | SX1_RS11720 | CJD35_RS05820
MBESOW_P0384 | GBH29131 | GH43_26 | hypothetical protein | NA | QYYP_RS0115410 | SX1_RS11645 | CJD35_RS07395
MBESOW_P0385 | GBH29132 | GH43_5 | arabinan endo-1,5-α-L-arabinosidase | EC:3.2.1.99 | QYYP_RS0115415 | SX1_RS11640 | CJD35_RS07400
MBESOW_P0386 | GBH29133 | GH51 | α-N-arabinofuranosidase | EC:3.2.1.55 | QYYP_RS0115420 | SX1_RS11635 | CJD35_RS07405
MBESOW_P0497 | GBH29244 | GH3 | β-N-acetylhexosaminidase | EC:3.2.1.52 | QYYP_RS0115970 | NA | CJD35_RS07945
MBESOW_P0503 | GBH29250 | GH3 | β-glucosidase | EC:3.2.1.21 | QYYP_RS0116000 | SX1_RS09620 | CJD35_RS07975
MBESOW_P0709 | GBH29455 | GH102 | membrane-bound lytic murein transglycosylase A | EC:4.2.2.- | QYYP_RS0100685 | SX1_RS15715 | CJD35_RS08915
MBESOW_P0738 | GBH29484 | GT4 | hypothetical protein | NA | QYYP_RS0100830 | SX1_RS15570 | CJD35_RS09060
MBESOW_P0869 | GBH29615 | GT2 | polyisoprenyl-phosphate glycosyltransferase | EC:2.4.-.- | QYYP_RS0103685 | SX1_RS13225 | CJD35_RS09715
MBESOW_P0996 | GBH29742 | GH3 | β-N-acetylhexosaminidase | EC:3.2.1.52 | QYYP_RS0117025 | SX1_RS00885 | CJD35_RS10345
MBESOW_P1016 | GBH29762 | GT28 | UDP-N-acetylglucosamine--N-acetylmuramyl-(pentapeptide) pyrophosphoryl-undecaprenol N-acetylglucosamine transferase | EC:2.4.1.227 | QYYP_RS0117130 | SX1_RS00785 | CJD35_RS10445
MBESOW_P1272 | GBH30019 | GT4 | hypothetical protein | NA | QYYP_RS0103110 | SX1_RS14345 | CJD35_RS02435
MBESOW_P1291 | GBH30038 | GT51 | penicillin-binding protein 1A | EC:2.4.1.129; EC:3.4.16.4 | QYYP_RS0103010 | SX1_RS14245 | CJD35_RS02340
MBESOW_P1589 | GBH30335 | GT51 | penicillin-binding protein 1A | EC:2.4.1.129; EC:3.4.16.4 | QYYP_RS0102525 | SX1_RS02590 | CJD35_RS00875
MBESOW_P1623 | GBH30369 | GT51 | monofunctional glycosyltransferase | EC:2.4.1.129 | QYYP_RS0117615 | SX1_RS16100 | CJD35_RS00690
MBESOW_P1710 | GBH30455 | GT4 | hypothetical protein | NA | QYYP_RS0109865 | SX1_RS08355 | CJD35_RS00235
MBESOW_P1807 | GBH30552 | GT35 | glycogen phosphorylase | EC:2.4.1.1 | QYYP_RS0119850 | SX1_RS07860 | CJD35_RS14650
MBESOW_P1808 | GBH30553 | GH13_9 | 1,4-α-glucan branching enzyme | EC:2.4.1.18 | QYYP_RS0119845 | SX1_RS07855 | NA
MBESOW_P1810 | GBH30555 | GT5 | starch synthase | EC:2.4.1.21 | QYYP_RS0119835 | SX1_RS07845 | NA
MBESOW_P1811 | GBH30556 | GH13_11 | glycogen debranching enzyme | EC:3.2.1.196 | QYYP_RS0119830 | SX1_RS07840 | CJD35_RS14630
MBESOW_P2010 | GBH30755 | GH23 | hypothetical protein | NA | QYYP_RS0120745 | SX1_RS18910 | CJD35_RS19690
MBESOW_P2085 | GBH30830 | GH130 | hypothetical protein | NA | NA | NA | CJD35_RS18790
MBESOW_P2309 | GBH31048 | GT4 | hypothetical protein | NA | NA | NA | NA
MBESOW_P2410 | GBH31149 | GH23 | hypothetical protein | NA | QYYP_RS0118935 | SX1_RS04355 | CJD35_RS12800
MBESOW_P2479 | GBH31218 | GH103 | hypothetical protein | EC:4.2.2.- | QYYP_RS0107700 | SX1_RS04745 | CJD35_RS13155
MBESOW_P2535 | GBH31274 | GT4 | hypothetical protein | NA | QYYP_RS0107035 | SX1_RS17560 | CJD35_RS13450
MBESOW_P2554 | GBH31293 | GH15 | hypothetical protein | NA | QYYP_RS0107495 | SX1_RS15465 | CJD35_RS13860
MBESOW_P2555 | GBH31294 | GT20 | trehalose 6-phosphate synthase | EC:2.4.1.15; EC:2.4.1.347 | QYYP_RS0107500 | SX1_RS15460 | CJD35_RS13865
MBESOW_P2622 | GBH31366 | GH3 | β-glucosidase | EC:3.2.1.21 | NA | NA | NA
MBESOW_P2723 | GBH31467 | GH25 | lysozyme | EC:3.2.1.17 | QYYP_RS0108495 | SX1_RS12445 | CJD35_RS11940
MBESOW_P2766 | GBH31510 | CE0 | hypothetical protein | NA | QYYP_RS0102310 | SX1_RS15770 | CJD35_RS11725
MBESOW_P2880 | GBH31623 | CE1 | hypothetical protein | NA | NA | SX1_RS09295 | CJD35_RS16935
MBESOW_P2907 | GBH31650 | GH68 | levansucrase | EC:2.4.1.10 | NA | SX1_RS09505 | CJD35_RS17160
MBESOW_P2917 | GBH31660 | GH3 | β-glucosidase | EC:3.2.1.21 | NA | SX1_RS22015 | NA
MBESOW_P3039 | GBH31778 | GH23 | soluble lytic murein transglycosylase | EC:4.2.2.- | QYYP_RS0110170 | SX1_RS10640 | CJD35_RS02885
MBESOW_P3383 | GBH32122 | GT4 | hypothetical protein | NA | QYYP_RS0113395 | SX1_RS06795 | CJD35_RS19280
MBESOW_P3436 | GBH32197 | GT4 | hypothetical protein | NA | QYYP_RS0102235 | SX1_RS05920 | CJD35_RS15105
MBESOW_P3439 | GBH32200 | GH97 | α-glucosidase | EC:3.2.1.20 | QYYP_RS0102220 | SX1_RS05935 | CJD35_RS15090
MBESOW_P3440 | GBH32201 | GH13_23 | α-glucosidase | EC:3.2.1.20 | QYYP_RS0102215 | SX1_RS05940 | CJD35_RS15085
MBESOW_P3441 | GBH32202 | GH13 | hypothetical protein | NA | QYYP_RS0102210 | SX1_RS05945 | CJD35_RS15080
MBESOW_P3499 | GBH32267 | AA3 | choline dehydrogenase | EC:1.1.99.1 | QYYP_RS0105280 | SX1_RS12750 | CJD35_RS16680
MBESOW_P3569 | GBH32337 | AA4 | 4-cresol dehydrogenase (hydroxylating) flavoprotein subunit, vanillyl-alcohol oxidases | EC:1.17.99.1 | QYYP_RS0114110 | SX1_RS21055 | NA
MBESOW_P3575 | GBH32343 | AA3 | choline dehydrogenase, glucose-methanol-choline (GMC) oxidoreductases family protein | EC:1.1.99.1 | QYYP_RS0114080 | SX1_RS21085 | CJD35_RS16290
MBESOW_P3656 | GBH32425 | CE1 | hypothetical 
protein NA QYYP_RS0101910 SX1_RS06245 CJD35_RS17910

This article is protected by copyright. All rights reserved. MBESOW_P3709 GBH32478 GH13_23 α-glucosidase EC:3.2.1.20 QYYP_RS0108060 SX1_RS06505 CJD35_RS17660

Note: CAZymes were searched in the OW59 genome (GenBank assembly accession: GCA_004305355.1) using the dbCAN meta server with HMMER (E-value < 1e-15, coverage > 0.35), DIAMOND (E-value < 1e-102), and Hotpep (frequency > 2.6, hits > 6). Abbreviations (GenBank assembly accessions in parentheses): QYY, S. xenophagum strain QYY (GCA_000277525.1); NBRC, S. xenophagum NBRC107872 (GCA_000367345.1); C1, S. hydrophobicum strain C1 (GCA_002288285.1); AA, auxiliary activity; GH, glycoside hydrolase; CE, carbohydrate esterase; GT, glycosyl transferase; NA, not assigned.
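The per-tool cutoffs quoted in the note can be illustrated with a minimal Python sketch. The dictionary layout of a hit and the function names are hypothetical, chosen only for illustration; they are not the dbCAN output format, and the note does not state how results from the three tools were combined, so the sketch only reports which tools support a given hit.

```python
# Cutoffs taken from the table note. Everything else (record layout,
# function names) is a hypothetical illustration, not the dbCAN format.
HMMER_MAX_EVALUE = 1e-15
HMMER_MIN_COVERAGE = 0.35
DIAMOND_MAX_EVALUE = 1e-102
HOTPEP_MIN_FREQUENCY = 2.6
HOTPEP_MIN_HITS = 6


def passes_hmmer(evalue, coverage):
    """HMMER hit: E-value below and coverage above the stated cutoffs."""
    return evalue < HMMER_MAX_EVALUE and coverage > HMMER_MIN_COVERAGE


def passes_diamond(evalue):
    """DIAMOND hit: E-value below the stated cutoff."""
    return evalue < DIAMOND_MAX_EVALUE


def passes_hotpep(frequency, hits):
    """Hotpep hit: frequency and hit count above the stated cutoffs."""
    return frequency > HOTPEP_MIN_FREQUENCY and hits > HOTPEP_MIN_HITS


def tools_supporting(hit):
    """Return the names of the tools whose cutoffs the hit satisfies.

    `hit` is an assumed dict with optional keys:
      "hmmer":   (evalue, coverage)
      "diamond": evalue
      "hotpep":  (frequency, hits)
    """
    supported = []
    if "hmmer" in hit and passes_hmmer(*hit["hmmer"]):
        supported.append("hmmer")
    if "diamond" in hit and passes_diamond(hit["diamond"]):
        supported.append("diamond")
    if "hotpep" in hit and passes_hotpep(*hit["hotpep"]):
        supported.append("hotpep")
    return supported


# Made-up values: the DIAMOND E-value misses the 1e-102 cutoff.
example_hit = {"hmmer": (1e-40, 0.9), "diamond": 1e-50, "hotpep": (3.0, 7)}
print(tools_supporting(example_hit))
```

The example values are invented to show one tool failing its cutoff; they do not correspond to any gene in the table.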



Table 2. Putative genes in strain OW59 genome related to degradation of lignin or lignin breakdown compounds

Gene ID (locus_tag) | GenBank accession | Annotation | EC number | Ortholog in QYY (locus_tag) | Ortholog in NBRC (locus_tag) | Ortholog in C1 (locus_tag)
MBESOW_P0009 | GBH28757 | catechol 1,2-dioxygenase | EC:1.13.11.1 | NA | SX1_RS23985 | NA
MBESOW_P0014 | GBH28762 | gentisate 1,2-dioxygenase | EC:1.13.11.4 | NA | NA | NA
MBESOW_P0098 | GBH28845 | non-heme chloroperoxidases | EC:1.11.1.10 | NA | SX1_RS17240 | NA
MBESOW_P0350 | GBH29097 | glutathione peroxidase | EC:1.11.1.9 | QYYP_RS0113955 | SX1_RS03685 | CJD35_RS07210
MBESOW_P0358 | GBH29105 | non-heme chloroperoxidases | EC:1.11.1.10 | QYYP_RS0115265 | SX1_RS03725 | CJD35_RS07255

Note: Genes for the enzymes involved in the degradation of lignin-derived compounds were searched in the OW59 genome annotation (GenBank assembly accession: GCA_004305355.1) by the EC numbers of the enzymes described in earlier studies (Jackson et al., 2017; Janusz et al., 2017). Abbreviations (GenBank assembly accessions in parentheses): QYY, Sphingobium xenophagum strain QYY (GCA_000277525.1); NBRC, S. xenophagum NBRC107872 (GCA_000367345.1); C1, S. hydrophobicum strain C1 (GCA_002288285.1); NA, not assigned.
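The EC-number lookup described in the note can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the tuple layout of an annotation record and the function name are hypothetical, and the target set holds only EC numbers that appear in the table rows, not the full lists from Jackson et al. (2017) or Janusz et al. (2017).

```python
# Assumed target set: only EC numbers that appear in the table rows,
# not the complete lists from the cited studies.
TARGET_ECS = {"1.13.11.1", "1.13.11.4", "1.11.1.10", "1.11.1.9"}

# Hypothetical annotation records: (locus_tag, annotation, EC numbers).
# The first two rows come from Table 2; the third is a Table 1 row that
# should not match.
annotations = [
    ("MBESOW_P0009", "catechol 1,2-dioxygenase", ["1.13.11.1"]),
    ("MBESOW_P0350", "glutathione peroxidase", ["1.11.1.9"]),
    ("MBESOW_P0111", "β-glucosidase", ["3.2.1.21"]),
]


def select_by_ec(records, target_ecs):
    """Keep records that carry at least one EC number in target_ecs."""
    return [rec for rec in records if any(ec in target_ecs for ec in rec[2])]


selected = select_by_ec(annotations, TARGET_ECS)
for locus_tag, annotation, ecs in selected:
    print(locus_tag, annotation, ",".join("EC:" + ec for ec in ecs))
```

An exact string match on the EC number is used here; partial EC numbers such as EC:4.2.2.- would need separate handling.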
