Bacterial degradation pathways for xenobiotic and natural organosulfonates

Dissertation

zur Erlangung des akademischen Grades des Doktors der Naturwissenschaften (Dr. rer. nat.)

an der Universität Konstanz

Mathematisch-Naturwissenschaftliche Sektion

Fachbereich Biologie

vorgelegt von Michael Weiß

Tag der mündlichen Prüfung: 17. Dezember 2014

1. Referent: Prof. Dr. Bernhard Schink

2. Referent: Prof. Dr. Dieter Spiteller

3. Referent: Prof. Dr. Jörg Hartig

Konstanzer Online-Publikations-System (KOPS) URL: http://nbn-resolving.de/urn:nbn:de:bsz:352-0-267999

DANKSAGUNG Mein herzlichster Dank geht an Dr. David Schleheck für die Möglichkeit meine Doktorarbeit zu diesem Thema anzufertigen, für seine herausragende Betreuung, für sein Vertrauen und die akademische Freiheit, die ich genießen durfte. Vielen Dank auch für die Gastfreundschaft bei zahlreichen Grillabenden.

Vielen Dank an Herrn Prof. Dr. Bernhard Schink für die Übernahme des Referats, seine Betreuung im Rahmen des Graduiertenprogamms und der als kompetenter Ansprechpartner immer ein offenes Ohr für meine Fragen hatte.

Vielen Dank an Herrn Prof. Dr. Jörg Hartig, dass er sich für mein Betreuungs- und Prüfungskomitee zur Verfügung gestellt hat.

Vielen Dank an Herrn Prof. Dr. Dieter Spiteller für die spontane Übernahme des Referats, für die Möglichkeit, seine äußerst hilfreichen Geräte (GC-MS, LC-MS, NanoPhotometer) mitbenutzen zu können sowie für seine stete Diskussionsbereitschaft.

Vielen herzlichen Dank an Ann-Katrin Felux. Es ist schön, mit einer zuverlässigen und hilfsbereiten Kollegin zusammenzuarbeiten, mit der man auch viel Spaß haben kann.

Vielen Dank an die AG Spiteller (Daniela, Dieter, Karin, Kathrin und Ralf) für das freundschaftliche Verhältnis in unserem Flügel; insbesondere an Karin Denger für ihre tatkräftige Unterstützung vor allem während meiner Startphase an der Universität Konstanz sowie für Racletteabende und dafür, dass sie nicht vergisst, dass ihre Kollegen rechtzeitig in die Mensa kommen, an Ralf Schlesiger für seine kritische Durchsicht dieser Arbeit, für seine großartige Hilfe bei dem Bedienen sämtlicher Geräte und für glutenfreie Crêpes und Kuchen, an Kathrin Schmidt, besonders für ihre Gastfreundschaft bei manch torreichen WM-Spielen.

Vielen Dank an Herrn Prof. Dr. Alasdair M. Cook, dessen Geist als Graue Eminenz die Grundlage manch erfolgreicher Arbeiten darstellte.

Vielen Dank an die eifrigen Teilnehmer des Fehlerfindespiels (Ralf, Ann-Katrin, Anna und Karin).

Vielen Dank an Sabine Lehmann für das ausgesprochen gute und freundschaftliche Verhältnis, selbst als wir um die letzten Sauerstoffanteile in der Denkzelle konkurrierten.

Vielen Dank an meine Vertiefungskurs- und Hiwi-Studentinnen Lena Wurmthaler, Anna Rist und Silvie Berthelot für ihre tatkräftige Mitarbeit.

Vielen Dank an Dr. Thomas Huhn, der mit seinen chemischen Synthesen (3-C4-SPC, 4-C6-SPC, SQ) erst die Möglichkeit eröffnete, bestimmte Fragestellungen zu bearbeiten. Meinen herzlichen Dank der Graduiertenschule KoRS-CB für die Möglichkeit, im Rahmen eines Doktorandenstipendiums, diese Arbeit anzufertigen, für finanzielle Unterstützungen für Sach- und Reisemittel sowie für die Möglichkeit auf den Retreats und insbesondere beim Besuch der Harvard University interessante Bekanntschaften und Erfahrungen zu machen. Vielen Dank Frau Dr. Heike Brandstädter, Frau Renée Rummel und Frau Katharina Magerkurth für ihre Hilfe und Unterstützung.

Vielen Dank an Herrn Dr. Roland Kissmehl, für seine vielen kompetenten Antworten.

Vielen Dank dem Zukunftskolleg der Universität Konstanz, insbesondere für die finanzielle Unterstützung in der Endphase meiner Arbeit.

Vielen Dank an die Arbeitsgruppe van Kleunen für die freundschaftliche Nachbarschaft, der Mitbenutzung ihrer Räumlichkeiten und dass ich ihre botanische Exkursion begleiten durfte.

Vielen herzlichen Dank den Arbeitsgruppen von M9 (AG Adamska, AG Kroth, AG Schink) für ihre freundliche Inklusion in ihre Aktivitäten, insbesondere in ihre Weihnachtsfeier sowie für das freundliche Teilen von Geräten (Ultrazentrifuge, Elektroporator, French Press), von Chemikalien und für Plasmid pGWB408 (Dr. Dietmar Funck). Der AG Schink sei auch besonders herzlich für die Aufnahme in ihr Seminarprogramm (inkl. lehrreicher Nachsitzungen) und zu ihren Betriebsausflügen gedankt.

Vielen Dank an Tobias Erb (Zürich) und an Ivan Berg (Freiburg), bei denen ich immer Antworten auf meine Fragen finden konnte und dass sie sich die Zeit genommen haben, Vorträge zu verschiedenen Anlässen zu halten. Tobi, vielen Dank auch für die ASKA-Klone und die zahlreichen Publikationen, zu denen ich keinen Zugang hatte.

Vielen Dank an Maike Voges für ihre Begleitung nach Boston. Trotz unserer Interessen, die nicht unterschiedlicher sein könnten, waren es drei äußerst harmonische und erfahrungsreiche Wochen.

Vielen Dank an Caro Bogs, die mir durch ihren Eindruck, dass ich doch schon immer hier sei, zum schnellen Einleben und Wohlfühlen verhalf.

Vielen Dank an alle meine Konstanzer Mitbewohner, an Lisa, Bernd, Anke, Björn, Verena, Gerald, Judith und Hakan. Es war mir immer eine große Freude mit euch zusammenzuwohnen.

Vielen Dank an meine ganze Familie, insbesondere an meine Eltern Gertrud und Karl, an meine Geschwister Monika, Christian und Fabian und an meine Tante Rosemarie, die mich immer unterstützt haben.

Vielen Dank Anna, schön, dass es dich gibt! CHAPTERS OF THIS THESIS ARE PUBLISHED IN:

Denger, K., M. Weiss, A. K. Felux, A. Schneider, C. Mayer, D. Spiteller, T. Huhn, A. M. Cook and D. Schleheck (2014). Sulfoglycolysis in Escherichia coli K-12 closes a gap in the biogeochemical sulfur cycle. Nature 507(7490): 114-117.

Weiss, M., A. I. Kesberg, K. M. LaButti, S. Pitluck, D. Bruce, L. Hauser, A. Copeland, T. Woyke, S. Lowry, S. Lucas, M. Land, L. Goodwin, S. Kjelleberg, A. M. Cook, M. Buhmann, T. Thomas and D. Schleheck (2013). Permanent draft genome sequence of Comamonas testosteroni KF-1. Stand. Genomic Sci. 8(2): 239-254.

Weiss, M., K. Denger, T. Huhn and D. Schleheck (2012). Two enzymes of a complete degradation pathway for linear alkylbenzenesulfonate (LAS) surfactants: 4-sulfoacetophenone Baeyer-Villiger monooxygenase and 4-sulfophenylacetate esterase in Comamonas testosteroni KF-1. Appl. Environ. Microbiol. 78: 8254-8263.

Schleheck, D., M. Weiss, S. Pitluck, D. Bruce, M. L. Land, S. Han, E. Saunders, R. Tapia, C. Detter, T. Brettin, J. Han, T. Woyke, L. Goodwin, L. Pennacchio, M. Nolan, A. M. Cook, S. Kjelleberg and T. Thomas (2011). Complete genome sequence of Parvibaculum lavamentivorans type strain (DS-1T). Stand. Genomic Sci. 5(3): 298-310.

FURTHER PUBLICATIONS NOT INCLUDED IN THIS THESIS:

Felux, A.-K., K. Denger, M. Weiss, A. M. Cook and D. Schleheck (2013). Paracoccus denitrificans PD1222 utilizes hypotaurine via transamination followed by spontaneous desulfination to yield acetaldehyde, and finally acetate for growth. J. Bacteriol. 195: 2921- 2930.

TABLE OF CONTENTS I

TABLE OF CONTENTS

SUMMARY ...... 1

ZUSAMMENFASSUNG ...... 3

CHAPTER 1 ...... 5

General introduction

CHAPTER 2 ...... 11

Complete genome sequence of Parvibaculum lavamentivorans type strain DS-1T

CHAPTER 3 ...... 25

Permanent draft genome sequence of Comamonas testosteroni KF-1

CHAPTER 4 ...... 41

Complete genome sequence of Delftia acidovorans SPH-1

CHAPTER 5 ...... 55

Two enzymes of a complete degradation pathway for linear alkylbenzenesulfonate (LAS) surfactants: 4-sulfoacetophenone Baeyer-Villiger monooxygenase and 4-sulfophenylacetate esterase in Comamonas testosteroni KF-1

CHAPTER 6 ...... 83

Characterization of a steroid Baeyer-Villiger monooxygenase and esterase in Comamonas testosteroni KF-1

CHAPTER 7 ...... 119

A proteomic approach to unravel enzymes of the sulfophenylcarboxylate degradation pathway in Comamonas testosteroni KF-1 and identification of a hydroxyquinol ring cleavage dioxygenase

CHAPTER 8 ...... 149

Sulfoglycolysis in Escherichia coli K-12 closes a gap in the biogeochemical sulfur cycle

CHAPTER 9 ...... 171

General discussion II TABLE OF CONTENTS

CHAPTER 10 ...... 177

Appendix

Abbreviations ...... 177

Record of Contributions ...... 188

CHAPTER 11 ...... 189

General references

SUMMARY 1

SUMMARY

The complete degradation of organosulfonates, regardless of their natural or anthropogenic origin, by is important, or these compounds would accumulate in the environment. In the course of this thesis, several aspects of bacterial degradation of organosulfonates have been examined, i.e., for organosulfonates that are produced either by industrial synthesis in the case of the xenobiotic laundry surfactants linear alkylbenzenesulfonates (LAS), or naturally in the case of 6-deoxy-6-sulfoglucose (sulfoquinovose, SQ).

LAS are the most important anionic surfactants in respect to their global consumption and can be degraded completely in two tiers by bacterial communities. A three-member model bacterial community has been genome-sequenced, and in this thesis, their genomes were analyzed. These organisms are Parvibaculum lavamentivorans DS-1T, Comamonas testosteroni KF-1 and Delftia acidovorans SPH-1. P. lavamentivorans utilizes the alkyl chains of the LAS molecules as a carbon and energy source and excretes 4-sulfophenylcarboxylates (SPCs), which are utilized and completely degraded by the second tier organisms, C. testosteroni KF-1 and D. acidovorans SPH-1. The P. lavamentivorans genome encodes a wealth of genes for degradation of alkanes, but only few candidates for sugar or amino acid catabolism, hence, shows a pronounced niche adaptation for degradation of alkyl-group containing substrates. The genomes of C. testosteroni KF-1 and D. acidovorans SPH-1 encode various sets of genes for the degradation of aromatic substrates, and the genome of strain KF-1 also for the degradation of steroids.

C. testosteroni KF-1 utilizes a major SPC from LAS for growth, 3-(4-sulfophenyl)butyrate

(3-C4-SPC), and a pathway for its degradation has been proposed, involving a 4-sulfoacetophenone (SAP) Baeyer-Villiger monooxygenase (BVMO). Transcriptional analyses showed that only one of four BVMO candidate genes in the genome of strain KF-1 is specifically induced during growth with 3-C4-SPC, and this gene was heterologously overexpressed. The purified enzyme catalyzed a BVMO reaction with SAP, but also with other aromatic and aliphatic ketones as substrates. The product of the SAP-BVMO reaction,

4-sulfophenyl acetate, is cleaved by a 3-C4-SPC inducible esterase encoded directly next to the SAP-BVMO gene, yielding acetate and 4-sulfophenol. The 4-sulfophenol was initially thought to be hydroxylated to 4-sulfocatechol for aromatic ring cleavage and desulfonation. However, proteomic analyses of 3-C4-SPC-grown cells revealed two identical gene copies of a highly induced aromatic ring cleavage dioxygenase, for which the recombinant proteins did not 2 SUMMARY convert 4-sulfocatechol, but hydroxyquinol. Hence, hydroxyquinol is the ʻtrueʼ intermediate for ring cleavage in the 3-C4-SPC/4-sulfophenol degradation pathway, and this implies yet unknown desulfonating oxygenase system(s) in order to convert 4-sulfophenol to hydroxyquinol. In addition, differential proteomics revealed sets of 3-C4-SPC inducible genes that are most likely responsible for the conversion of 3-C4-SPC to SAP involving CoA-esters and reactions in analogy to short-chain fatty acid degradation, as well as candidates for the implied desulfonating oxygenase system(s) and aromatic ring cleavage pathway; these can be characterized in future work. Interestingly, all identified 3-C4-SPC degradative (candidate) genes are encoded in the same genome region and within a complex arrangement of mobile genetic elements (IS1071 elements). It is easy to rationalize that these genes have only recently been mobilized from a different location into the C. testosteroni genome in order to allow for the metabolism of xenobiotic 3-C4-SPC.

The other three BVMO candidates in the strain KF-1 genome were also heterologously overexpressed and the substrate range of the purified enzymes examined. One enzyme showed high BVMO activity with the steroid substrates progesterone and pregna-1,4-diene-3,20-dione, producing each the corresponding acetate esters, testosterone acetate and boldenone acetate, respectively. This BVMO was specifically induced during growth of strain KF-1 with progesterone. The corresponding acetate esters are hydrolytically cleaved by an inducible esterase encoded directly next to the steroid-BVMO gene, yielding testosterone and boldenone, respectively, which enter the well-known steroid degradation pathway of C. testosteroni.

SQ is the polar headgroup of the plant sulfolipid sulfoquinovosyldiacylglycerol (SQDG) and represents, with an estimated annual production of about 10 billion tons, a major portion of the organo-sulfur in nature. Escherichia coli K-12, the most widely studied prokaryotic model organism, is able to utilize SQ as sole carbon and energy source. SQ is catabolized in analogy to the glycolytic pathway, hence, the SQ pathway was termed ‘sulfoglycolysis’. Sulfoglycolysis involves a SQ isomerase, a 6-deoxy-6-sulfofructose kinase, a 6-deoxy-6-sulfofructose- 1-phosphate aldolase, and a 3-sulfolactaldehyde reductase. The enzymes are encoded in a ten- gene cluster, including transport and regulation, and are specifically induced during growth with SQ. The pathway was reconstituted in vitro using recombinant proteins, and all intermediates formed were identified by HPLC-mass spectrometry. E. coli K-12 utilizes only half of the carbon of SQ and excretes 2,3-dihydroxypropane-1-sulfonate (DHPS), which is degraded completely by other bacteria. ZUSAMMENFASSUNG 3

ZUSAMMENFASSUNG

Der vollständige Abbau von Organosulfonaten natürlichen oder anthropogenen Ursprungs durch Bakterien ist von großer Wichtigkeit, da sich diese Verbindungen sonst in der Umwelt anreichern. Im Zuge dieser Arbeit wurden verschiedene Aspekte des bakteriellen Abbaus von Organosulfonaten untersucht, die einerseits in industrieller Synthese hergestellt werden, wie im Fall der Waschmittel-Tenside, Lineare Alkylbenzensulfonate (LAS), oder natürlich entstehen, wie im Fall von 6-Desoxy-6-sulfoglukose (Sulfoquinovose, SQ).

LAS sind hinsichtlich ihres weltweiten Verbrauchs die wichtigsten anionischen Tenside und können von bakteriellen Gemeinschaften in zwei Stufen abgebaut werden. Eine Modell- Gemeinschaft aus drei Bakterienstämmen wurde Genom-sequenziert und in dieser Arbeit untersucht. Die Organismen sind Parvibaculum lavamentivorans DS-1T, Comamonas testosteroni KF-1 und Delftia acidovorans SPH-1. In der ersten Stufe nutzt P. lavamentivorans die Alkylketten der LAS-Moleküle als Kohlenstoff- und Energiequelle und scheidet 4-Sulfophenylcarboxylate (SPCs) aus, die von den Organismen der zweiten Stufe, C. testosteroni KF-1 und D. acidovorans SPH-1, vollständig abgebaut werden. Das Genom von P. lavamentivorans kodiert eine Vielzahl an Genen für einen Abbau von Alkanen, jedoch nur wenige für eine Verwertung von Zuckern und Aminosäuren, und ist somit stark an eine Verwertung von Substraten, die Alkylketten enthalten, angepasst. Die Genome von C. testosteroni KF-1 und D. acidovorans SPH-1 kodieren eine Vielzahl an Genen für einen Abbau aromatischer Substrate und das von Stamm KF-1 auch einen Abbauweg für Steroide.

C. testosteroni KF-1 kann ein aus LAS gebildetes SPC für sein Wachstum nutzen,

3-(4-Sulfophenyl)butyrat (3-C4-SPC), wofür ein Abbauweg unter Beteiligung einer 4-Sulfoacetophenon (SAP) Baeyer-Villiger Monooxygenase (BVMO) vorgeschlagen wurde. Transkriptionsanalysen zeigten, dass nur eines von vier BVMO-Kandidatengenen aus dem

Genom während des Wachstums mit 3-C4-SPC induziert vorliegt. Dieses Gen wurde heterolog überexprimiert. Das gereinigte Enzym katalysiert eine BVMO-Reaktion mit SAP, aber auch mit anderen aromatischen und aliphatischen Ketonen. Das Produkt der SAP-BVMO Reaktion,

4-Sulfophenylacetat, wird durch eine 3-C4-SPC induzierte und direkt neben der SAP-BVMO kodierte Esterase hydrolytisch gespalten, wodurch Acetat und 4-Sulfophenol entstehen. Für den weiteren Abbau von 4-Sulfophenol wurde ursprünglich eine Hydroxylierung zu 4-Sulfocatechol als Ringspalt-Substrat angenommen. Durch proteomische Analysen wurden zwei identische Gen-Kopien einer stark induzierten Ringspalt-Dioxygenase identifiziert, deren 4 ZUSAMMENFASSUNG rekombinante Proteine jedoch keine Aktivität mit 4-Sulfocatechol, aber mit Hydroxychinol zeigten. Somit ist Hydroxychinol das Ringspalt-Substrat im 3-C4-SPC/4-Sulfophenol Abbauweg, was eine Beteiligung eines noch unbekannten desulfonierenden Oxygenase- Systems impliziert, um 4-Sulfophenol zu Hydroxychinol umzusetzen. Durch proteomische

Analysen wurden weitere 3-C4-SPC induzierte Gene identifiziert, die wahrscheinlich am Abbau von 3-C4-SPC zu SAP über SPC CoA-Ester, analog zum Abbau kurzkettiger Fettsäuren, beteiligt sind. Weiter könnten für das desulfonierende Oxygenasesystem und eine Verwertung des Ringspalt-Produkts kodieren, dies muss jedoch erst in zukünftigen Arbeiten bestätigt werden. Interessanterweise sind alle identifizierten (Kandidaten-) Gene des 3-C4-SPC Abbauwegs gemeinsam mit mobilen genetischen Elementen (IS1071 Elemente) in der gleichen Genom-Region kodiert, weshalb gut vorstellbar ist, dass diese Gene erst kürzlich in das Genom von Stamm KF-1 transferiert wurden, um eine Verwertung des xenobiotischen Substrats

3-C4-SPC zu ermöglichen.

Die anderen drei BVMO Kandidaten im Genom von Stamm KF-1 wurden ebenfalls heterolog überexprimiert und das Substratspektrum der gereinigten Enzyme untersucht. Ein Enzym zeigte BVMO-Aktivität mit den Steroid-Substraten Progesteron und Pregna-1,4-dien-3,20-dion unter Bildung der entsprechenden Acetat-Ester Testosteronacetat und Boldenoneacetat. Diese BVMO ist spezifisch während des Wachstums von Stamm KF-1 mit Progesteron induziert. Die Acetat-Ester werden durch eine induzierte und direkt neben der Steroid-BVMO kodierten Esterase hydrolytisch gespalten, wodurch Testosteron und Boldenon entstehen, die über den bereits bekannten Steroid-Abbauweg von C. testosteroni verwertet werden.

SQ ist die polare Kopfgruppe des Pflanzen-Sulfolipids Sulfoquinovosyldiacylglycerin (SQDG) und stellt mit einer geschätzten jährlichen Produktion von circa 10 Milliarden Tonnen einen Hauptteil des organisch gebundenen Schwefels in der Natur dar. Escherichia coli K-12, der am besten untersuchte prokaryotische Modellorganismus, ist in der Lage, SQ als alleinige Kohlenstoff- und Energiequelle zu nutzen. Der Abbau erfolgt in Analogie zur Glykolyse und wurde deshalb ‚Sulfoglykolyse‘ genannt. Die Sulfoglykolyse beinhaltet eine SQ Isomerase, eine 6-Desoxy-6-sulfofruktose Kinase, eine 6-Desoxy-6-sulfofruktose-1-phosphat Aldolase und eine 3-Sulfolaktaldehyd Reduktase. Die Enzyme sind in einem 10-Gen Cluster, inklusive Transport und Regulation, kodiert und werden während des Wachstums mit SQ induziert. Der Abbauweg wurde anhand rekombinanter und gereiniger Enzyme in vitro rekonstituiert und alle dabei gebildeten Intermediate mittels HPLC-Massenspektrometrie nachgewiesen. E. coli K-12 nutzt dabei nur die Hälfte des Kohlenstoffs von SQ und scheidet 2,3-Dihydroxypropan- 1-sulfonat (DHPS) aus, welches durch andere Bakterien vollständig abgebaut wird. CHAPTER 1 5

CHAPTER 1

General introduction

Organosulfonates - Organosulfonates are organic compounds that contain a sulfono group (-SO3 ) directly linked - to a carbon atom, as opposed to organo-sulfate esters (-O-SO3 ). The carbon-sulfur bond of most organosulfonates is remarkably stable against hydrolysis (Wagner and Reid 1931). Therefore, microorganisms had to acquire more sophisticated ways than a facile hydrolysis for desulfonation e.g. in order to make use of the sulfur and/or completely utilize the carbon moiety of these compounds for their growth (Cook and Denger 2002).

Several desulfonating strategies and enzymes are known: (i) desulfonating monooxygenation of alkanesulfonates (Thysse and Wanders 1974, Cook et al. 1998, Kelly and Murrell 1999, Erdlenbruch et al. 2001); (ii) desulfonating dioxygenation, e.g. of para-sulfobenzoate (Junker et al. 1996); (iii) a sulfoacetaldehyde acetyltransferase (Xsc), which forms acetyl phosphate and sulfite from sulfoacetaldehyde (Ruff et al. 2003); (iv) a 3-sulfolactate sulfo-lyase (SuyAB), which forms pyruvate and sulfite from 3-sulfolactate (Rein et al. 2005, Denger and Cook 2010); and (v) a cysteate sulfo-lyase (CuyA), which forms pyruvate, ammonium and sulfite from cysteate (Denger et al. 2006). Notably, in order to circumvent the negative effects of the toxic product of intracellular desulfonation, of sulfite, bacteria can protect themselves by extrusion of sulfite through specific sulfite exporters (e.g., Felux et al. 2013) and by sulfite-oxidizing enzymes (SorA and SorT) (Kappler et al. 2000, Denger et al. 2008, Wilson and Kappler 2009). Further, organosulfonates/organosulfonic acids are strong acids and therefore at a physiological pH deprotonated and charged (Cook et al. 1998), which, on the one hand, requires import (and export) systems in bacteria to transport organosulfonates across the cell membranes. On the other hand, sulfonated groups are often used in organic synthesis to make an organic compound more water soluble and, thus, many industrial chemicals are organosulfonates, e.g. most of the surfactants in commercial detergents, dyes, or pharmaceuticals (Cook et al. 2007).

Importantly, there are also heterotrophic organosulfonate-degrading bacteria that are unable to catalyze a desulfonation, and such organisms utilize only parts of the carbon moiety of organosulfonates and release corresponding organosulfonate degradation intermediates. These intermediates are then completely degraded (and desulfonated) by other bacteria. Hence, organosulfonates may also be degraded completely, inclusive desulfonation, in a cooperated 6 CHAPTER 1 effort by bacterial communities. Different aspects of organosulfonate degradation catalyzed by bacterial communities are the subject of this thesis (see below).

In conclusion, the bacterial organosulfonate degradation process represent a major component of both the biological sulfur and carbon cycle (e.g. Harwood and Nicholls 1979) and are important to maintain a healthy environment. Further, they may provide interesting and novel biochemistry, and in respect to the occurrence of both natural and xenobiotic organosulfonates in the environment, they may allow a comparison for supposedly ʻancientʼ and ʻnewʼ degradation pathways, respectively. For the latter, it is especially exciting to define the enzymes and genes that have been recruited in these bacteria to assemble such novel pathways and to, ultimately, explore the fundamental molecular adaptation strategies of microbes to changing environmental conditions (van der Meer 2008).

Linear Alkylbenzenesulfonates: Prominent xenobiotic organosulfonates Organosulfonates that contain a non-polar hydrophobic alkane moiety and the negatively charged, hydrophilic sulfonated group, are amphipathic and powerful surface active compounds, i.e., for their use as surfactants in household detergents and in industrial applications. The most relevant compounds within the class of anionic synthetic surfactants are the linear alkylbenzenesulfonates (LAS, Figure 1) with an annual consumption of about 3 x 106 tons worldwide (Knepper and Berna 2003). Notably, LAS replaced branched chain alkylbenzenesulfonates (Figure 1), which resisted effective biodegradation, whereas LAS are completely biodegradable, e.g. in sewage treatment plants, as known since many years (Sawyer and Ryckman 1957, Knepper and Berna 2003). For the complete degradation of LAS, a process involving bacterial communities was proposed: First, bacteria catalyze an initial degradation of LAS and second, different bacteria completely degrade the LAS degradation intermediates (Schleheck et al. 2004a). As a representative for the initial degradation step, the Alphaproteobacterium Parvibaculum lavamentivorans DS-1T converts commercial LAS via omega-oxygenation and fatty-acid beta-oxidations of the alkyl chains to a complex set of short- chain sulfophenylcarboxylates (SPCs), which are excreted (Schleheck et al. 2000, Dong et al. 2004, Schleheck et al. 2004a). In the second step, these SPCs are completely utilized by different organisms, which involve a removal of the carboxylate side chains from the aromatic rings of SPCs, aromatic ring cleavage and desulfonation (Schleheck et al. 2004a). CHAPTER 1 7

Figure 1. A LAS congener, here 2-(4-sulfophenyl)dodecane (2-C12-LAS, top), one of its major degradation intermediates, 3-(4-sulfophenyl)butyrate (3-C4-SPC, center), and of a branched chain C12-alkylbenzenesulfonate (bottom).The degradation pathway for 3-C4-SPC was a major aspect of this study. The structures are taken from Knepper et al. 2003 and from Schleheck et al. 2004a.

Due to their xenobiotic character and since LAS has been introduced only few decades ago (Knepper and Berna 2003), the bacteria were faced with a new, potential carbon and energy source. This fact raises the questions how these bacteria have adapted to utilize these new substrates for their growth, which enzymes and genes have been recruited in these bacteria to assemble such novel pathways, and what distinguishes these organisms to their relatives that are unable to utilize these compounds. Completely sequenced genomes greatly facilitate answering these questions. The genome of the heterotrophic LAS-degrading bacterial model community, hence, of P. lavamentivorans DS-1T and the second tier members Comamonas testosteroni KF-1 and Delftia acidovorans SPH-1 were kindly provided by the Joint Genome Institute (JGI), in order to elucidate the pathways, enzymes and genes involved in LAS and SPC degradation. The present thesis focused mainly on the experimentally well-accessible pathway for a SPC in C. testosteroni KF-1, for 3-(4-sulfophenyl)butyrate (3-C4-SPC; Figure 1). 8 CHAPTER 1

Sulfoquinovose: A prominent natural organosulfonate Natural organosulfonates can also fulfill functions as amphipathic compounds. A sulfonate containing lipid, 1,2-diacyl-3-sulfoquinovosylglycerol (sulfoquinovosyldiacylglycerol, SQDG, Figure 2), is found in the photosynthetic membranes of all higher plants and represents, with an estimated annual production of 3.6 x 1010 tons, a major component of the organosulfur in nature (Benson et al. 1959, Benson 1963, Harwood and Nicholls 1979). The polar headgroup of this sulfolipid is a sulfonated sugar, sulfoquinovose (SQ, 6-deoxy-6-sulfoglucose) (Daniel et al. 1961). Notably, the biosynthetic route of SQDG is still not completely understood, but it is thought that uridinediphosphate (UDP)-glucose and sulfite are the precursors for an UDP-SQ molecule (Pugh et al. 1995), which afterwards is fused with a diacylglycerol molecule, forming SQDG along with recovery of the UDP-moiety (Seifert and Heinz 1992). Also the physiological function of SQDG is not completely understood. Although it occurs in high amounts in photosynthetic tissues (Harwood and Nicholls 1979), SQDG is likely not involved in the photosynthetic process, since there are photosynthetic organisms without SQDG (Selstam and Campbell 1996), as well as organisms and plant tissues not involved with photosynthesis that also contain SQDG (Benson 1963, Isono et al. 1967, Anderson et al. 1978, Cedergren and Hollingsworth 1994). However, SQDG is thought to represent an alternative to phospholipids under phosphate limiting growth conditions (Yu and Benning 2003), as well as a sulfur storage compound under sulfur limited growth conditions (Sugimoto et al. 2007).

Figure 2. Structure of sulfoquinovosyldiacylglycerol (SQDG). R1 and R2 of the diacylglycerol indicate acyl chains of variable length and degree of unsaturation. The sulfoquinovose (SQ) and diacylglycerol moieties are connected via a glycosidic bond to the anomeric carbon of SQ (here, the alpha-anomer is depicted); the structure is taken from Benning 1998.

Importantly, the complete degradation of SQDG and SQ by bacteria and the recycling of the organosulfur as inorganic sulfate (or sulfite) as part of the global sulfur cycle, is not well understood, although a glycolytic-type pathway for the bacterial degradation of SQ has been proposed (Roy et al. 2003). Further, it is known that SQ can be degraded completely by bacterial CHAPTER 1 9

communities, where a first tier utilizes half of the carbon of the C6-organosulfonate SQ for growth and excretes an C3-organosulfonate, either 2,3-dihydroxy-1-propanesulfonate (DHPS) or 3-sulfolactate, which are then completely utilized, inclusive desulfonation, by a second tier of bacteria (Denger et al. 2012). However, no enzymatic or genetic information on the initial SQ degradation pathway(s) has been made available until today.

Aims of this study With the genome sequences of the three members of the LAS-degrading bacterial model community being available (see above), bioinformatic analyses and examinations at the genome level in respect to LAS and SPC degradation are possible, as well as the consolidation of the knowledge established in the literature, in order to publish genome reports.

Further, 3-C4-SPC is available from chemical synthesis as growth substrate for C. testosteroni KF-1, as well as its proposed degradation intermediates 4-sulfoacetophenone (SAP) and 4-sulfophenol from commercial sources (Schleheck et al. 2010). Hence, it can be examined which enzymes and genes are involved in the postulated 3-C4-SPC pathway in strain KF-1 in order to yield 4-sulfophenol; an involvement of a SAP Baeyer-Villiger type monooxygenase (SAP-BVMO) was already proposed (Schleheck et al. 2010). In addition it can be examined how the pathway proceeds further from 4-sulfophenol, for instance, which ring cleavage reaction and which ring cleavage pathway is involved. Finally, a differential proteomics approach may reveal other enzymes and genes involved in the 3-C4-SPC pathway in strain KF-1.

In respect to the unknown pathway(s) for the initial degradation of the natural organosulfonate SQ in bacteria, relevant amounts of SQ were also made available by chemical synthesis (Denger et al. 2012). Further, it was found that Escherichia coli K-12 is able to utilize SQ as sole source of carbon and energy for growth. Hence its SQ degradation pathway, and the involved enzymes and genes can be examined by physiological, biochemical, bioinformatic and molecular methods.

CHAPTER 2 11

CHAPTER 2

Complete genome sequence of Parvibaculum lavamentivorans type strain DS-1T

David Schleheck1, Michael Weiss1, Sam Pitluck2, David Bruce3, Miriam L. Land4, Shunsheng Han3, Elizabeth Saunders3, Roxanne Tapia3, Chris Detter3, Thomas Brettin4, James Han2, Tanja Woyke2, Lynne Goodwin3, Len Pennacchio2, Matt Nolan2, Alasdair M. Cook1, Staffan Kjelleberg5, Torsten Thomas5

1Department of Biological Sciences and Research School Chemical Biology, University of Konstanz, Germany

2DOE Joint Genome Institute, Walnut Creek, California, USA

3Los Alamos National Laboratory, Bioscience Division, Los Alamos, New Mexico, USA

4Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA

5Centre for Marine Bio-Innovation and School of Biotechnology and Biomolecular Science, University of New South Wales, Sydney, Australia

This chapter was published in Standards in Genomic Sciences 2011, 5:298-310

12 CHAPTER 2

ABSTRACT Parvibaculum lavamentivorans DS-1T is the type species of the novel genus Parvibaculum in the novel family Rhodobiaceae (formerly Phyllobacteriaceae) of the order Rhizobiales of . Strain DS-1T is a non-pigmented, aerobic, heterotrophic bacterium and represents the first tier member of environmentally important bacterial communities that catalyze the complete degradation of synthetic laundry surfactants. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 3,914,745 bp long genome with its predicted 3,654 protein coding genes is the first completed genome sequence of the genus Parvibaculum, and the first genome sequence of a representative of the novel family Rhodobiaceae.

INTRODUCTION

Parvibaculum lavamentivorans strain DS-1T (DSM13023 = NCIMB13966) was isolated for its ability to degrade linear alkylbenzenesulfonate (LAS), a major laundry surfactant with a world- wide use of 2.5 million tons per annum (http://www.lasinfo.org). Strain DS-1T was difficult to isolate, is difficult to cultivate, and represents a novel genus in the Alphaproteobacteria (Schleheck et al. 2000, Schleheck et al. 2004b). Strain DS-1 catalyzes not only the degradation of LAS, but also of 16 other commercially important anionic and non-ionic surfactants (hence the species name lavamentivorans = consuming (chemicals) used for washing (Schleheck et al. 2004b)). The initial degradation as catalyzed by strain DS-1T involves the activation and shortening of the alkyl-chain of the surfactant molecules, and the excretion of short-chain degradation intermediates. These intermediates are then completely utilized by other bacteria in the community (Schleheck et al. 2003, Schleheck et al. 2004a). P. lavamentivorans DS-1T is therefore an example of a first tier member of a two-step process that mineralizes environmentally important surfactants.

Other representatives of the novel genus Parvibaculum have been recently isolated. Parvibaculum sp. strain JP-57 was isolated from seawater (Eilers et al. 2001) and is difficult to cultivate (Schleheck et al. 2004b). Parvibaculum indicum sp. nov. was also isolated from seawater, via an enrichment culture that degraded polycyclic aromatic hydrocarbons (PAH) and crude oil (Lai et al. 2010). Another Parvibaculum sp. strain was isolated from a PAH-degrading enrichment culture, using river sediment as inoculum (Hilyard et al. 2008). Parvibaculum species were also reported in a study on marine alkane-degrading bacteria (Wang et al. 2010). Parvibaculum species are frequently detected by cultivation-independent methods, predominantly in habitats or settings with hydrocarbon degradation. These include a bacterial CHAPTER 2 13 community on marine rocks polluted with diesel oil (Alonso-Gutiérrez et al. 2009), a bacterial community from diesel-contaminated soil (Paixão et al. 2010), a petroleum-degrading bacterial community from seawater (Li et al. 2009), an oil-degrading cyanobacterial community (Sánchez et al. 2005) and biofilm communities in pipes of a district heating system (Kjeldsen et al. 2007). Parvibaculum species have also been detected in denitrifying, linear-nonylphenol (NP)-degrading enrichment cultures from NP-polluted river sediment (De Weert et al. 2011) and in groundwater that had been contaminated by linear alkyl benzenes (LABs; non-sulfonated LAS) (Martínez-Pascual et al. 2010). Additionally, Parvibaculum species were detected in biofilms that degraded polychlorinated biphenyls (PCBs) using pristine soil as inoculum (Macedo et al. 2007), and in a PAH-degrading bacterial community from deep-sea sediment of the West Pacific (Wang et al. 2008). Finally, Parvibaculum species were detected in an autotrophic Fe(II)-oxidizing, nitrate-reducing enrichment culture (Blothe and Roden 2009), as well as in Tunisian geothermal springs (Sayeh et al. 2010). The widespread occurrence of Parvibaculum species in habitats or settings related to hydrocarbon degradation implies an important function and role of these organisms in environmental biodegradation, despite their attribute as being difficult to cultivate in a laboratory.

Here we present a summary classification and a set of features for P. lavamentivorans DS-1T, together with the description of a complete genome sequence and annotation. The genome sequencing and analysis was part of the Microbial Genome Program of the DOE Joint Genome Institute.

CLASSIFICATION AND FEATURES P. lavamentivorans DS-1T is a Gram-negative, non-pigmented, very small (approx. 1.0 x 0.2 µm), slightly curved, rod-shaped bacterium that can be motile by means of a polar flagellum (Figure 1, Table 1). Strain DS-1T grows very slowly on complex medium (e.g. on LB- or peptone-agar plates) and forms pinpoint colonies only after more than two weeks of incubation. The organism can be quickly overgrown by other organisms. Larger colonies are obtained when the complex medium is supplemented with a surfactant, e.g. Tween 20 (see DSM-medium 884; http://www.dsmz.de) or LAS (Schleheck et al. 2004b). When cultivated in liquid culture with mineral-salts medium, strain DS-1T grows within one week with the single carbon sources acetate, ethanol, or succinate, or alkanes, alkanols and alkanoates (C8 - C16); no sugars tested were utilized (Schleheck et al. 2004b).

To allow for growth of the organism in liquid culture with most of the 16 different surfactants at high concentrations (e.g. for LAS, >1 mM; see Schleheck et al. 2004b), the culture fluid 14 CHAPTER 2 needs to be supplemented with a solid surface, e.g. polyester fleece or glass fibers (Schleheck et al. 2000, Schleheck et al. 2004b). The additional solid surface is believed to support biofilm formation, especially in the early growth phase when the surfactant concentration is high, and the organism grows then as single, suspended cells (non-motile) during the later growth phase. Growth with a non-membrane toxic substrate (e.g. acetate) is independent of a solid surface, and constitutes suspended, single cells (motile). We presume that the biofilm formation by strain DS-1T is a protective response to the exposure to membrane-solubilizing agents (cf. Klebensberger et al. 2006).

Based on the 16S rRNA gene sequence, strain DS-1T was described as the novel genus Parvibaculum, which was originally placed in the family Phyllobacteriaceae within the order Rhizobiales of Alphaproteobacteria (Schleheck et al. 2004b, Euzéby 2005). The nearest well- described organism to strain DS-1T is Afifella marina (formerly Rhodobium marinum) (92% 16S rRNA gene sequence identity), a photosynthetic purple, non-sulfur bacterium. The genus Rhodobium was later re-classified to the novel family Rhodobiaceae (Garrity et al. 2005c, Validation-List-107 2006), together with two novel genera of other photosynthetic purple non- sulfur bacteria (Afifella and Roseospirillum), as well as with two novel genera of heterotrophic aerobic bacteria, represented by the red-pigmented Anderseniella baltica (gen. nov., sp. nov.) (Brettar et al. 2007, Euzéby 2008) and non-pigmented Tepidamorphus gemmatus (gen. nov., sp. nov.) (Albuquerque et al. 2010, Euzéby 2010). A phylogenetic tree (Figure 2) was constructed with the 16S rRNA gene sequence of P. lavamentivorans DS-1T and that of (i) other isolated Parvibaculum strains, (ii) representatives of other genera within the family Rhodobiaceae, (iii) representatives of the genera in the family Phyllobacteriaceae, as well as, (iv) representatives of other families within the order Rhizobiales. The phylogenetic tree confirmed the placement of Parvibaculum species within the family Rhodobiaceae, and that the Parvibaculum sequences clustered as a distinct evolutionary lineage within this family (Figure 2). This classification of Parvibaculum has been adopted in the Ribosomal Database Project (RDP) and SILVA rRNA Database Project, but not in the GreenGenes database. The family Rhodobiaceae has also not been included in the NCBI-, IMG-taxonomy, and GOLD databases.

Currently, 360 genome sequences of members of the order Rhizobiales of Alphaproteobacteria have been made available (GOLD database; August 2011), and within the family Phyllobacteriaceae there are 21 genome sequences available (Chelativorans sp. BNC1, Hoeflea phototrophica DFL-43, and 18 Mesorhizobium strains). No genome sequences CHAPTER 2 15 currently exist for a representative of the novel family Rhodobiaceae, except of the genome of P. lavamentivorans DS-1T.

Figure 1. Scanning electron micrograph of P. lavamentivorans DS-1T. Cells derived from a liquid culture that grew in acetate/mineral salts medium.

Chemotaxonomy Examination of the respiratory lipoquinone composition of strain DS-1T showed that ubiquinones are the sole respiratory quinones present, and the major lipoquinone is ubiquinone 11 (Q11) (Schleheck et al. 2004b). The fatty acids of P. lavamentivorans are straight chain saturated and unsaturated, as well as ester- and amide-linked hydroxy-fatty acids, in membrane fractions (Schleheck et al. 2004b). The major polar lipids are phosphatidyl glycerol, diphosphatidyl glycerol, phosphatidyl ethanolamine, phosphatidyl choline, and two, unidentified aminolipids; the presence of the two additional aminolipids appears to be distinctive of the organism (Schleheck et al. 2004b). The G + C content of the DNA was determined at 64% (Schleheck et al. 2004b), which corresponds well to the GC content observed for the complete genome sequence (see below).

16 CHAPTER 2

Rhodobiaceae

RHIZOBIALES

Phyllobacteriaceae

Figure 2. Phylogenetic tree of 16S rRNA gene sequences showing the position of P. lavamentivorans DS-1T relative to other type strains within the families Rhodobiaceae, Phyllobacteriaceae and other families in the order Rhizobiales (see the text). Strains within the Rhodobiaceae and Phyllobacteriaceae shown in bold have genome projects underway or completed. The corresponding 16S rRNA gene accession numbers (or draft genome sequence identifiers) are indicated. The sequences were aligned using the GreenGenes NAST alignment tool (DeSantis et al. 2006); neighbor-joining tree building and visualization involved the CLUSTAL and DENDROSCOPE software (Huson et al. 2007). Caulobacterales sequences were used as outgroup. Bootstrap values >30% are indicated; bar, 0.01 substitutions per nucleotide position.

CHAPTER 2 17

) )

)

)

2003,

107 2006 107 2006

- -

2004

1072006

-

et al. et

)

List List

- -

List

et et al.

-

2010

from the from Gene Ontology

) ) ) ) ) ) ) ) ) ) ) ) ) )

, Dong ,

et et al.

) ) ) ) )

)

)

2000, Schleheck 2000 2004b 2004b 2004b 2004b 2004b 2004b 2004b 2004b 2004b 2004b 2000 2004b 2004b 2004b 2004b 2000 2000 2000 2000

2005e 2005a, Validation 2005c, Validation

1990

traceable Author Statement (i.e., not

-

et al. et al. et et et al. et al. et al. et al. et al. et al. et al. et al. et al. et al. et al. et al. et al. et al. et al. et al. et al. et al. et al.

2004b, Lai

a

et et al. et al. et al.

et et al.

et et al.

leheck

hleheck hleheck hleheck

Kuykendall 2005, Kuykendall Validation Schleheck Schleheck Woese Garrity Garrity Garrity Sc Schleheck Schleheck Schleheck Schleheck Sc Schleheck Schleheck Schleheck Schleheck Schleheck Schleheck Schleheck Sc Schleheck Sch Schleheck Schleheck Schleheck

( ( ( ( ( ( ( (

TAS TAS ( TAS TAS Evidence code TAS TAS TAS TAS TAS( TAS( TAS( TAS( TAS( TAS( TAS( TAS( Schleheck TAS( TAS( TAS( TAS( TAS( TAS( TAS( TAS TAS( TAS( TAS

), anionic various

16

C

T

1

8

-

DS

degrading laboratory trickling trickling degradinglaboratory filterthat was

-

of an industrial an of sewage treatmentplant in

accepted property for the species, or anecdotal evidence). These evidence codes are

1

-

ionic surfactants

Bacteria

Proteobacteria

Parvibaculum lavamentivorans

-

Rhodobiaceae

Parvibaculum

Rhizobiales

Alphaproteobacteria

Parvibaculum lavamentivorans living

sporulating

-

3%NaCl

-

Order pyruvate, acetate, ethanol, alkanessuccinate, (C a surfactant from isolated Term Domain Phylum Class Family Genus Species TypeDS strain negative smallrod motile non mesophile 30ºC and non chemoorganotroph molecular oxygen aerobic habitat 0 aerobic free none inoculated sludgewith Ludwigshafen, Germany 1999 49.48 8.44 96 m

).

source

2000

IDA: IDA: Inferred Directfrom Assay; TAS: Traceable StatementAuthor (i.e., a direct report exists in the literature); NAS: Non

-

et et al.

Carbon Carbon location Geographic Property Currentclassification stainGram shapeCell Motility Sporulation Temperature range temperatureOptimum Energy source electronTerminal receptor Habitat Salinity Oxygen requirement Biotic relationship Pathogenicity collectionSample time Latitude Longitude Depth Altitude

codes

Classification and general of features general and Classification

4 6 6.3 22 15 14 5 4.1 4.2 4.3 4.4

Ashburner

------

MIGS MIGS MIGS MIGS MIGS MIGS MIGS MIGS MIGS MIGS MIGS

) Evidence ) Evidence

Table1. directly observed for the living, isolated sample, but based on a generally project( a 18 CHAPTER 2

GENOME SEQUENCING INFORMATION

Genome project history The genome was selected for sequencing as part of the U.S. Department of Energy - Microbial Genome Program 2006. The DNA sample was submitted in April 2006 and the initial sequencing phase was completed in October 2006. The genome finishing and assembly phase was completed in June 2007, and presented for public access on December 2007; a modified version was presented in February 2011. Table 2 presents the project information and its association with MIGS version 2.0 compliance (Field et al. 2008).

Table 2. Project information MIGS ID Property Term MIGS-31 Finishing quality Finished MIGS-28 Libraries used 3.5 kb, 9 kb and 37 kb DNA libraries MIGS-29 Sequencing platforms Sanger MIGS-31.2 Fold coverage 16x MIGS-30 Assemblers Phred/Phrap/Consed MIGS-32 Gene calling method Glimmer/Criteria Genbank ID 17639 Genbank Date of Release July 31, 2007 GOLD ID Gc00631 MIGS-13 Source material identifier DSM 13023 = NCIMB 13966 Project relevance Biodegradation, biotechnological

Growth conditions and DNA isolation

P. lavamentivorans DS-1T was grown on LB agar plates (2 weeks) and pinpoint colonies were transferred into selective medium (1 mM LAS/minimal salts medium; with glass-fiber supplement, 5-ml scale; (Schleheck et al. 2004b)). This culture was sub-cultivated to larger scale (100-ml and 1-liter scale) in 30 mM acetate/minimal salts medium; cell pellets were stored frozen until DNA preparation. DNA was prepared following the JGI DNA Isolation Bacterial CTAB Protocol (http://my.jgi.doe.gov/general/index.html).

Genome sequencing and assembly

The genome of P. lavamentivorans DS-1T was sequenced at the Joint Genome Institute (JGI) using a combination of 3.5 kb, 9 kb and 37 kb DNA libraries. All general aspects of library construction and sequencing performed at the JGI can be found at the JGI website (http://www.jgi.doe.gov). Draft assemblies were based on 76,870 reads. Combined, the reads from all three libraries provided 16-fold coverage of the genome. The Phred/Phrap/Consed software package (http://www.phrap.com) was used for sequence assembly and quality CHAPTER 2 19 assessment (Ewing and Green 1998, Ewing et al. 1998, Gordon et al. 1998). After the shotgun stage, reads were assembled with parallel phrap (High Performance Software, LLC). Possible mis-assemblies were corrected with Dupfinisher (Han and Chain 2006), PCR amplification, or transposon bombing of bridging clones (Epicentre Biotechnologies, Madison, WI, USA). Gaps between contigs were closed by editing in Consed, custom primer walk or PCR amplification (Roche Applied Science, Indianapolis, IN, USA). A total of 24 primer walk reactions were necessary to close gaps and to raise the quality of the finished sequence. The completed genome assembly contains 76,885 reads, achieving an average of 16-fold sequence coverage per base with an error rate less than 5 in 100,000.

Genome annotation

Genes were identified using a combination of Critica (Badger and Olsen 1999) and Glimmer (Delcher et al. 2007) as part of the genome annotation pipeline at Oak Ridge National Laboratory (ORNL), Oak Ridge, TN, USA, followed by a round of manual curation. The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) non-redundant database, UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro databases; miscellaneous features were predicted using TMHMM (Krogh et al. 2001) and signalP (Dyrløv Bendtsen et al. 2004). These data sources were combined to assert a product description for each predicted protein. The tRNAScanSE tool (Lowe and Eddy 1997) was used to find tRNA genes, whereas ribosomal RNAs were found by using BLASTn against the ribosomal RNA databases. The RNA components of the protein secretion complex and the RNaseP were identified by searching the genome for the corresponding Rfam profiles using INFERNAL (http://infernal.janelia.org). Additional gene prediction analysis and manual functional annotation was performed within the Integrated Microbial Genomes (IMG) platform (http://img.jgi.doe.gov) developed by the Joint Genome Institute, Walnut Creek, CA, USA (Markowitz et al. 2008).

GENOME PROPERTIES The genome of P. lavamentivorans DS-1T comprises one circular chromosome of 3,914,745 bp (62.33% GC content) (Figure 3), for which a total number of 3,714 genes were predicted. Of these predicted genes, 3,654 are protein-coding genes, and 2,723 of the protein-coding genes were assigned to a putative function and the remaining annotated as hypothetical proteins; 18 pseudogenes were also identified. A total of 60 RNA genes and one rRNA operon are predicted; the latter is reflective of the slow growth of P. lavamentivorans DS-1T (Schleheck and Cook 2005, Schleheck et al. 2007). Furthermore, one Clustered Regularly Interspaced Short 20 CHAPTER 2

Palindromic Repeats element (CRISPR) including associated protein genes were predicted. The properties and the statistics of the genome are summarized in Table 3, and the distribution of genes into COGs functional categories is presented in Table 4.

Table 3. Nucleotide and gene count levels of the genome of P. lavamentivorans DS-1 Attribute Value % of totala Genome size (bp) 3,914,745 100 Coding region (bp) 3,535,064 90.30 G+C content (bp) 2,439,986 62.33 Number of replicons 1 Extrachromosomal elements 0 Total genes 3,714 100 RNA genes 60 1.62 rRNA operons 1 Protein-coding genes 3,654 98.38 Pseudo genes 18 0.48 Genes with function prediction 2,723 73.32 Genes in paralog clusters 620 16.69 Genes assigned to COGs 2,904 78.19 Genes assigned to Pfam domains 3,054 82.23 Genes with signal peptides 717 19.31 Genes connected to KEGG pathways 1,085 29.21 Genes with transporter classification 430 11.58 Genes with transmembrane helices 782 21.06 CRISPR count 1 % of totala a) The total is based on either the size of the genome in base pairs or the total number of protein coding genes in the annotated genome.

METABOLIC FEATURES The genome of P. lavamentivorans encodes complete pathways for synthesis of all proteinogenic amino acids and essential co-factors, and the central metabolism is represented by complete pathway for the citrate cycle, glycolysis/gluconeogenesis, and the non-oxidative branch of the pentose-phosphate pathway; no candidate genes for the oxidative branch of the pentose-phosphate pathway or for the Entner–Doudoroff pathway are predicted.

P. lavamentivorans DS-1T does not grow on D-glucose, D-fructose, maltose, D-mannitol, D-mannose, and N-acetylglucosamine (Schleheck et al. 2004b, Lai et al. 2010), and there are no valid candidate genes predicted in the genome for ATP-dependent sugar uptake systems or for D-glucose uptake via a phosphotransferase system. Similarly, no valid candidate genes were predicted for ATP-dependent amino-acid and di/oligo-peptide transport systems or for other CHAPTER 2 21 amino-acid/peptide transporters, which reflects the poor growth of strain DS-1T in complex medium (LB-medium).

Table 4. Number of genes associated with the general COG functional categories in P. lavamentivorans DS-1T Code Value %age Description J 163 5.07 Translation, ribosomal structure and biogenesis A 1 0.0 RNA processing and modification K 243 7.0 Transcription L 137 3.9 Replication, recombination and repair B 1 0.0 Chromatin structure and dynamics D 25 0.7 Cell cycle control, mitosis and meiosis Y 0 0 Nuclear structure Z 0 0 Cytoskeleton W 0 0 Extracellular structures V 85 2.4 Defense mechanisms T 118 3.4 Signal transduction mechanisms M 131 3.8 Cell wall/membrane biogenesis N 6 0.2 Cell motility U 39 1.1 Intracellular trafficking and secretion O 77 2.2 Posttranslational modification, protein turnover, chaperones C 153 4.4 Energy production and conversion G 294 8.4 Carbohydrate transport and metabolism E 214 6.1 Amino acid transport and metabolism F 79 2.3 Nucleotide transport and metabolism H 110 3.2 Coenzyme transport and metabolism I 73 2.1 Lipid transport and metabolism P 152 4.4 Inorganic ion transport and metabolism Q 30 0.9 Secondary metabolites biosynthesis, transport and catabolism R 318 9.1 General function prediction only S 200 5.7 Function unknown - 1082 31.0 Not in COGs

For the assimilation of acetyl-CoA from the degradation of alkanes and surfactants (Schleheck et al. 2000, Schleheck et al. 2003, Schleheck et al. 2004b), or during growth with acetate, the genome of P. lavamentivorans encodes the glyoxylate cycle (isocitrate lyase, Plav_0592; malate synthase, Plav_0593) to generate succinate for the synthesis of carbohydrates. The genome also encodes the complete ethyl-malonyl-CoA pathway to assimilate acetate (Erb et al. 2007). This observation, i.e. glyoxylate cycle and ethyl-malonyl-CoA pathway in the same organism, has been made before (Erb et al. 2009), and these two pathways in 22 CHAPTER 2

P. lavamentivorans DS-1T might be differentially expressed under varying environmental conditions.

Figure 3. Graphical circular map of the genome of P. lavamentivorans DS-1T. From outside to center: Genes on forward strand (color by COG categories), genes on reverse strand (color by COG categories), RNA genes (tRNA, green; rRNA, red; other RNAs, black), GC content, GC skew.

For the degradation of alkanes and surfactants through abstraction of acetyl-CoA (Schleheck and Cook 2005), the genome contains a wealth of candidate genes for the entry into alkyl-chain degradation (omega-oxygenation to activate the chain) supplemented by a wealth of genes predicted for omega-oxidations (to generate the corresponding fatty-acids) and fatty-acid beta- oxidations (to excise acetyl-CoA units). We are currently exploring this high abundance of genes for alkane/alkyl-utilization in strain DS-1T by transcriptional and translational analysis (unpublished). For example, at least 9 cytochrome-P450 (CYP) alkane monooxygenase (COG2124), 44 alcohol dehydrogenase (COG1028), 11 aldehyde dehydrogenase (COG1012), 20 acyl-CoA synthetase (COG0318), 40 acyl-CoA dehydrogenase (COG1960), 31 enoyl-CoA CHAPTER 2 23 hydratase (COG1024), 14 acyl-CoA acetyl-transferase (COG0183), six thioesterase (COG0824), and 17 putative long-chain acyl-CoA thioester hydrolase (PF03061) candidate genes are predicted in the genome.

Other predicted oxygenase genes comprise three putative Baeyer-Villiger-type FAD-binding monooxygenase genes (COG2072). Cyclohexanone and hydroxyacetophenone, which are putative substrates for such oxygenases (e.g. Chen et al. 1988, Rehdorf et al. 2009) were tested T as carbon source for growth of strain DS-1 , as well as cycloalkanes (C6, C8, C12), however, none supported growth. The terpenoids camphor (for the involvement of a cytochrome-P450 oxygenase in the degradation pathway; Hedegaard and Gunsalus 1965) and geraniol, citronellol, linalool, menthol and eucalyptol (for the involvement of acyl-CoA interconversion enzymes in the degradation pathways) as substrates for growth were also tested negative.

In contrast to the high abundance of genes for aliphatic-hydrocarbon degradation, the genome contains few genes for aromatic-hydrocarbon degradation. One gene set for an aromatic ring dioxygenase component (Plav_1761 and 1762, BenAB-type), three aromatic ring monooxygenase component genes (Plav_1541 and 0131, MhpA-type; Plav_1785, HpaB-type), and three valid candidate genes for extradiol ring cleavage dioxygenase (Plav_1539 (Sipila et al. 2008) and 1787, BphC-type; Plav_0983, LigB-type) were predicted in the genome. Strain DS-1T did not grow with benzoate, protocatechuate, phenylacetate, phenylpropionate, or phenylalanine and tyrosine as carbon source when tested.

Finally, P. lavamentivorans DS-1T is predicted to store carbon in form of intracellular polyhydroxyalkanoate/butyrate (PHB) as its genome encodes a PHB-synthase (PhbC) gene (Plav_1129), PHB-depolymerase (PhaZ) gene (Plav_0012), and PHB-synthesis repressor (PhaR) gene (Plav_1572).

ACKNOWLEDGEMENTS We thank Joachim Hentschel for SEM operation. The work was supported by the University of Konstanz and the Konstanz Research School Chemical Biology, the University of New South Wales and the Centre for Marine Bio-Innovation, and the Deutsche Forschungsgemeinschaft (DFG grant SCHL 1936/1-1 to D.S.). The work conducted by the U.S. Department of Energy Joint Genome Institute was supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231, and that of the University of California, Lawrence Livermore National Laboratory under Contract No. W-7405-Eng-48, Lawrence Berkeley National Laboratory under contract No. DE-AC03-76SF00098, and Los Alamos National Laboratory under contract No. W-7405-ENG-36.

CHAPTER 3 25

CHAPTER 3

Permanent draft genome sequence of Comamonas testosteroni KF-1

Michael Weiss1,2, Anna I. Kesberg1, Kurt M. LaButti3, Sam Pitluck3, David Bruce4, Loren Hauser5, Alex Copeland3, Tanja Woyke3, Stephen Lowry3, Susan Lucas3, Miriam Land5, Lynne Goodwin3,4, Staffan Kjelleberg6, Alasdair M. Cook1, Matthias Buhmann1, Torsten Thomas6, and David Schleheck1,2

1Department of Biological Sciences, University of Konstanz, Germany

2Konstanz Research School Chemical Biology, University of Konstanz, Germany

3DOE Joint Genome Institute, Walnut Creek, California, USA

4Los Alamos National Laboratory, Bioscience Division, Los Alamos, New Mexico, USA

5Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA

6Centre for Marine Bio-Innovation and School of Biotechnology and Biomolecular Science, University of New South Wales, Sydney, Australia

This chapter was published in Standards in Genomic Sciences 2013, 8:239-254

26 CHAPTER 3

ABSTRACT Comamonas testosteroni KF-1 is a model organism for the elucidation of the novel biochemical degradation pathways for xenobiotic 4-sulfophenylcarboxylates (SPCs) formed during biodegradation of synthetic 4-sulfophenylalkane surfactants (linear alkylbenzenesulfonates, LAS) by bacterial communities. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 6,026,527 bp long chromosome (one sequencing gap) exhibits an average G+C content of 61.79% and is predicted to encode 5,492 protein-coding genes and 114 RNA genes.

INTRODUCTION Comamonas testosteroni strain KF-1 (DSM14576) was isolated for its ability to degrade xenobiotic sulfophenylcarboxylates (SPCs), which are degradation intermediates of the synthetic laundry surfactants linear alkylbenzenesulfonates (LAS) (Schleheck et al. 2004a). LAS is in use worldwide (appr. 3 x 106 tons per year; Knepper et al. 2003) and consists of a complex mixture of linear alkanes (C10-C13) sub-terminally substituted by 4-sulfophenyl rings (i.e., 38 different compounds) (Knepper et al. 2003). Commercial LAS is completely biodegradable, as known for more than 50 years (Swisher 1970), e.g., in sewage treatment plants, and its degradation is catalyzed by heterotrophic aerobic bacterial communities in two steps. First, an initial degradation step is catalyzed by bacteria such as Parvibaculum lavamentivorans DS-1T (Schleheck et al. 2011) through activation and shortening of the alkyl- chains of LAS, and many short-chain degradation intermediates are excreted by these organisms, i.e., approximately 50 different SPCs and related compounds (Schleheck et al. 2000, Dong et al. 2004, Schleheck et al. 2004a, Schleheck et al. 2004b, Schleheck et al. 2007). Secondly, the ultimate degradation step, i.e., mineralization of all SPCs, is catalyzed by other bacteria in the community, and one representative of these is Comamonas testosteroni KF-1. In particular, strain KF-1 was isolated from a laboratory trickling filter that had been used to enrich a bacterial community from sewage sludge that completely degraded commercial LAS and SPCs (Dong et al. 2004, Schleheck et al. 2004a). Strain KF-1 is able to utilize four individual

SPCs (both enantiomers), namely R/S-3-(4-sulfophenyl)butyrate (3-C4-SPC), enoyl-3-C4-SPC,

R/S-3-(4-sulfophenyl)pentanoate (3-C5-SPC), and enoyl-3-C5-SPC (see therefore also below), as novel carbon an energy sources for its heterotrophic aerobic growth (Schleheck et al. 2004a, Schleheck et al. 2010, Weiss et al. 2012).

The first Comamonas testosteroni (formerly Pseudomonas testosteroni (Tamaoka et al. 1987)) strain, type-strain ATCC 11996, was enriched from soil and isolated in 1952 for its ability to degrade testosterone (Talalay et al. 1952, Talalay 2005). Since then, the physiology, CHAPTER 3 27 biochemistry, genetics, and regulation of steroid degradation in this and in other C. testosteroni strains have been elucidated in great detail (e.g., Marcus and Talalay 1956, Shaw et al. 1965, Horinouchi et al. 2001, Horinouchi et al. 2003b, Xiong et al. 2003, Rösch et al. 2008, Horinouchi et al. 2010, Gong et al. 2012b). Most recently, the genome of C. testosteroni ATCC 11996T has been sequenced in order to further improve the understanding of the molecular basis for the degradation of steroids (Gong et al. 2012a).

In the environment, members of the genus Comamonas may also be important degraders of aromatic compounds other than steroids, especially of xenobiotic pollutants, since they have frequently been enriched and isolated for their ability to utilize (xenobiotic) aromatic compounds. For example, Comamonas sp. strain JS46 is able to grow with 3-nitrobenzoate (Providenti et al. 2006b), Comamonas sp. strain CNB-1 with 4-chloronitrobenzene (Ma et al. 2009), C. testosteroni T-2 with 4-toluenesulfonate and 4-sulfobenzoate (Locher et al. 1989), C. testosteroni WDL7 with chloroaniline (Krol et al. 2012), Comamonas sp. strain JS765 with nitrobenzene (Nishino and Spain 1995), Comamonas sp. strain B-9 with lignin-polymer fragments (Chen et al. 2012), C. testosteroni B-356 with biphenyl and 4-chlorobiphenyl (Ahmad et al. 1990), Comamonas sp. strain KD-7 with dibenzofuran (Wang et al. 2004), Comamonas sp. strain 4BC with naphthalene-2-sulfonate (Song et al. 2005), or C. testosteroni SPB-2 (as well as strain KF-1) with 4-sulfophenylcarboxylates (Schleheck et al. 2004a). In several C. testosteroni strains, the physiology, biochemistry, genetics, and/or regulation of the utilization of aromatic compounds have been elucidated (e.g., Ornston and Ornston 1972, Locher et al. 1989, Ahmad et al. 1990, Schläfli et al. 1994, Nishino and Spain 1995, Sylvestre et al. 1996, Teramoto et al. 1999, Providenti et al. 2001, Tralau et al. 2001, Lessner et al. 2002, Tralau et al. 2003a, Tralau et al. 2003b, Mampel et al. 2004, Mampel et al. 2005, Providenti et al. 2006a, Providenti et al. 2006b, Sasoh et al. 2006, Fukuhara et al. 2010, Kamimura et al. 2010, Kasai et al. 2010, Ni et al. 2012, Weiss et al. 2012). Furthermore, the genome sequence of (plasmid-cured) C. testosteroni CNB-2 has been published (Ma et al. 2009), and the sequence of its plasmid pCNB1 (of C. testosteroni CNB-1) (Ma et al. 2007), in order to further improve the understanding of the molecular basis for the ability of C. testosteroni to degrade such a large array of aromatic compounds.

Members of the genus Comamonas are able to cope with harsh environmental conditions such as high concentrations of arsenate (Zhang et al. 2007, Cai et al. 2009), zinc (Xiong et al. 2011), cobalt and nickel (Siunova et al. 2009), or phenol (Turek et al. 2011), and can exhibit increased resistance to oxidative stress (Godocíková et al. 2005) or antibiotics (Oppermann et al. 1996). Another C. testosteroni genome sequence, of strain S44, has recently been established in order 28 CHAPTER 3 to improve the understanding of the molecular basis for its resistance to increased concentrations of zinc (Xiong et al. 2011). Notably, an increased antibiotic resistance (and enhanced insecticide catabolism) as consequences of induction of the steroid degradation pathway has been shown for C. testosteroni ATCC 11996T (Oppermann et al. 1996).

Here, we present a summary classification and a set of features for another C. testosteroni strain, strain KF-1, which has been genome-sequenced in order to improve the understanding of the molecular basis for its ability to degrade xenobiotic compounds, particularly xenobiotic, chiral

3-C4-SPC, and how this novel degradation pathway has been assembled in this organism, together with the description of its draft genome sequence and annotation. The genome sequence and its annotation have been established as part of the Microbial Genomics Program 2006 of the DOE Joint Genome Institute, and are accessible via the IMG platform (Markowitz et al. 2012).

CLASSIFICATIONS AND FEATURES

Morphology and growth conditions C. testosteroni KF-1 is a rod-shaped (size, appr. 0.5 x 2 µm, Figure 1) Gram-negative bacterium that can be motile and grows strictly aerobically with complex medium (e.g., in LB- or peptone medium) or in a prototrophic manner when cultivated in mineral-salts medium (Thurnheer et al. 1986) with a single carbon source (e.g., acetate). Strain KF-1 grows overnight on LB-agar plates and forms whitish-beige colonies (Table 1). The strain grew with all amino acids tested (D-alanine, L-alanine, L-aspartate, L-phenylalanine, L-valine, glycine, L-histidine, L-methionine), but not with any of the sugars tested (D-glucose, D-fructose, D-galactose, D-arabinose, and D-maltose). Strain KF-1 utilized the following alcohols and carboxylic acids when tested (in this study): Ethanol, acetate, glycerol, glycolate, glyoxylate, butanol, butyrate, isobutyrate, succinate, meso-tartaric acid, D- and L-malate, mesaconate, and nicotinate. Furthermore, strain KF-1 was positive for growth with poly-beta-hydroxybutyrate (this study). Strain KF-1 is able to utilize the steroids testosterone and progesterone (confirmed in this study), as well as taurocholate and cholate (and taurine and N-methyl taurine) (Rösch et al. 2008), and taurodeoxycholate; strain KF-1 was tested negative for growth with cholesterol, ergosterol, 17β-estradiol and ethinylestradiol (this study), correlating with the findings for C. testosteroni strain TA441 (Horinouchi et al. 2010).

In respect to other aromatic compounds, strain KF-1 is known to utilize benzoate, 3- and 4-hydroxybenzoate, protocatechuate (3,4-dihydroxybenzoate), gentisate (2,5-dihydroxybenzoate), phthalate, terephthalate, vanillate, isovanillate, veratrate, 2- and CHAPTER 3 29

3-hydroxyphenylacetate (tested in this study, and ref. Schleheck et al. 2004a). Xenobiotic aromatic substrates for strain KF-1 known are the 4-sulfophenylcarboxylates

R/S-3-(4-sulfophenyl)butyrate (R/S-3-C4-SPC), 3-(4-sulfophenyl)-∆2-enoylbutyrate (enoyl-

3-C4-SPC), R/S-3-(4-sulfophenyl)pentanoate (R/S-3-C5-SPC), 3-(4-sulfophenyl)-

∆2-enoylpentanoate (enoyl-3-C5-SPC), as well as the three xenobiotic metabolites in the

3-C4-SPC-pathway, 4-sulfoacetophenone (4-acetylbenzenesulfonate), 4-sulfophenyl acetate, and 4-sulfophenol (Schleheck et al. 2004a, Schleheck et al. 2010). Finally, strain KF-1 did not utilize the following, other carbon sources tested (this study and refs. Schleheck et al. 2004a,

Schleheck et al. 2010): n-alkanes (C6-C12), cycloalkanes (C8-C12), secondary- 4-sulfophenylalkanes (LAS surfactants), secondary alkanesulfonates (SAS surfactants), dodecylsulfate (SDS surfactant), benzene sulfonate, 4-toluenesulfonate, 4-sulfobenzoate, phenylacetate, 3-phenylpropionate, 3- and 4-phenylbutyrate, 4-sulfostyrene, 4-sulfocatechol, cyclohexanone, 4-aminoacetophenone, gallic acid (3,4,5-trihydroxybenzoic acid) and gallotannic acid, pentanesulfonate, isethionate, sulfoacetate, D-tartaric acid, acetamide, gamma-aminobutyrate, oxalate, methanol, methylamine, methanesulfonate or formate, and not

2-C4-SPC (2-(4-sulfophenyl)butyrate), 4-C5-SPC, 4-C6-SPC, 5-C6-SPC, or any of the C7 – C9 SPCs generated during commercial LAS surfactant degradation.

C. testosteroni KF-1 has been recognized for its poor ability to form structured biofilms on surfaces (Buhmann 2008, see also Li et al. 2008), or micro- or macroscopic cellular aggregates in liquid cultures (Schleheck et al. 2009), in direct comparison to ‘good’ biofilm forming organisms such as Delftia acidovorans SPH-1 (Buhmann 2008), Pseudomonas aeruginosa PAO1 (Schleheck et al. 2009), or C. testosteroni SPB-2 (Schleheck et al. 2004a).

No significant production of siderophores could be observed for C. testosteroni KF-1 when grown in presence of non-inhibitory levels of iron chelator 2,2'-dipyridyl (see Stelzer et al. 2006), in comparison to siderophore-producing Delftia acidovorans SPH-1, Pseudomonas aeruginosa PAO1, and Pseudoalteromonas tunicata D2 (Thomas et al. 2008) (reported in this study, data not shown).

Finally, strain KF-1 is able to grow in the presence of up to 500 µg ml-1 ampicillin or 600 µg ml-1 kanamycin in liquid cultures, as tested in this study.

Phylogeny Based on its 16S rRNA gene sequence, strain KF-1 is a member of the genus Comamonas, which is placed in the family Comamonadaceae within the order Burkholderiales of Betaproteobacteria, as illustrated by a phylogenetic tree shown in Figure 2. Currently, 686 30 CHAPTER 3 genome sequences of members of the order Burkholderiales of Betaproteobacteria, and 147 genome sequences within the family Comamonadaceae, have been, currently are, or are targeted to be established (GOLD database; May 2013).

1 µm

Figure 1. Scanning electron micrograph of Comamonas testosteroni KF-1. Cells derived from a liquid culture that grew in LB medium.

GENOME SEQUENCING INFORMATION

Genome project history The genome was selected for sequencing as part of the U.S. Department of Energy - Microbial Genomics Program 2006. The DNA sample was submitted in February 2006 and the initial sequencing phase was completed in July 2006. After the finishing and assembly phase the genome was presented for public access on January 2009; a modified version was presented (IMG) in August 2011. Table 2 presents the project information and its association with MIGS version 2.0 compliance (Field et al. 2008).

Growth conditions and DNA isolation Comamonas testosteroni KF-1, obtained from the Leibniz-Institut DSMZ-Deutsche Sammlung von Mikroorganismen und Zellkulturen (DSM14576), was grown on LB agar plates and CHAPTER 3 31 transferred into selective medium (6 mM 4-sulfophenol/mineral-salts medium) in the 3-ml scale, and this culture was sub-cultivated in larger scale; cell pellets were stored frozen until DNA preparation. DNA was prepared following the JGI’s DNA Isolation Bacterial CTAB Protocol.

Genome sequencing and assembly The genome of Comamonas testosteroni KF-1 was sequenced at the Joint Genome Institute (JGI) using a combination of 3.5 kb, 9 kb and 37 kb DNA libraries. All general aspects of library construction and sequencing performed at the JGI can be found at JGI website (www.jgi.doe.gov). In total, 66.91 Mbp of Sanger sequence data were generated for the assembly from all three libraries, which provided for a 12.8-fold coverage of the genome. The Phred/Phrap/Consed software package was used for sequence assembly and quality assessment (Ewing and Green 1998, Ewing et al. 1998, Gordon et al. 1998). After the shotgun stage, reads were assembled with parallel phrap (High Performance Software, LLC). Possible mis- assemblies were corrected with Dupfinisher (Han and Chain 2006), PCR amplification, or transposon bombing of bridging clones (Epicentre Biotechnologies, Madison, WI, USA). Gaps between contigs were closed by editing in Consed, custom primer walk or PCR amplification (Roche Applied Science, Indianapolis, IN, USA). The genome could not be closed due to clone viability issues, however, several clones circularized the contig, and a PCR product was obtained that spanned the ends, but all attempts at primer walking and transforming the amplicon were unsuccessful. At this time no additional work is planned for this project (labeled as Permanent Draft; one linear contig).

Genome annotation Genes were identified using Prodigal (Hyatt et al. 2010) as part of the genome annotation pipeline at Oak Ridge National Laboratory (ORNL), Oak Ridge, TN, USA, followed by a round of manual curation using the JGI GenePRIMP pipeline (Pati et al. 2010). The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) non-redundant database, UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro databases. Non-coding genes and miscellaneous features were predicted using tRNAscan-SE (Lowe and Eddy 1997), RNAMMer (Lagesen et al. 2007), Rfam (Griffiths-Jones et al. 2003), TMHMM (Krogh et al. 2001), and signalP (Dyrløv Bendtsen et al. 2004). Additional gene prediction analysis and manual functional annotation was performed within the Integrated Microbial Genomes (IMG) platform (http://img.jgi.doe.gov) developed by the Joint Genome Institute, Walnut Creek, CA, USA (Markowitz et al. 2008).

32 CHAPTER 3

Table 1. Table project project a

)

MIGS MIGS MIGS MIGS MIGS MIGS MIGS MIGS MIGS ID MIGS

Evidence

(

------

Ashburner

4.4 6 4.2 4.1 5 4 14 15 22

Classificationand general features of

codes codes

Altitude Habitat Longitude Latitude time collection Sample location Geographic Pathogenicity relationship Biotic requirement Oxygen receptor Terminalelectron source Energy source Carbon Optimumtemperature range Temperature Sporulation Motility Cellshape stain Gram classification Current Property

et et al.

-

IDA: Inferred from Direct Assay; TAS: Traceable Author Statement; NAS: Non NAS: Statement; Author Traceable TAS: Assay; Direct from Inferred IDA:

2000

)

. . If

the evidence is IDA, the then was property for directly a observed live isolate by one of the authors or an mentioned expert

440 m 440 habitat aerobic 9° 11' 16.25" 11' 9° 27.24" 41' 47° 1999 Switzerland). (Herisau, plant treatment sewage fromcommunal a with sludge inoculated been had that Germany) Konstanz, (Universityof LASsurfactant from a isolated TRBA) German to according (classification 1 group Risk nonpathogenic, free aerobic oxygen molecular chemoorganotroph isovanillate vanillate, 4 taurine, benzoate, cholate, taurocholate, progesterone, 4 3 ºC 30 mesophile non motile rod small negative StrainKF Species Genus Family Order Class Phylum Domain Term

- -

sulfoacet (4

-

-

-

sporulating

living

sulfophenyl)butyrate (3 sulfophenyl)butyrate

Betaproteobacteria

Burkholderiales

Comamonas

Comamonadaceae

Comamonas Comamonas

Proteobacteria

Bacteria

-

1

ophenone, 4 ophenone,

Comamonas testosteroni Comamonas

testosteroni

-

sulfophenyl acetate, 4 acetate, sulfophenyl

-

C

-

4

degrading laboratory trickling filter filter trickling laboratory degrading

-

SPC) and other SPCs ( SPCs other and SPC)

KF

-

-

1 according to the MIGS recommendations according 1 the to MIGS

sulfophenol, testosterone, testosterone, sulfophenol,

see text see

-

hydroxybenzoate, hydroxybenzoate,

-

traceable Author Statement. These evidence codes are from the Gene Ontology Ontology Gene the from are codes evidence These Statement. Author traceable

)

, ,

TAS TAS TAS TAS TAS TAS TAS TAS TAS TAS TAS ( IDA,TAS TAS TAS TAS TAS TAS al. et TAS TAS TAS TAS TAS TAS code Evidence

( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( (

1987

Dong Dong Dong Dong Dong Dong Dong Dong Dong Dong Schleheck Schleheck Schleheck Schleheck Tamaoka 1962 Park Davisand Willems Garrity Garrity Garrity Woese

, ,

Schleheck

Willems

et al. et al. et et al. et al. et al. et al. et al. et al. et al. et al. et

et al. et

et al. et al. et al. et

a

et al. et

et al. et

et al. et al. et al. et al. et

2004, Schleheck 2004, Schleheck 2004, 2004, Schleheck 2004, Schleheck 2004, Schleheck 2004, Schleheck 2004, Schleheck 2004, Schleheck 2004, Schleheck 2004, Schleheck 2004,

1990

2005d, Validation 2005d, Validation 2005b, 2005e

(

1991a

et al. et

Field

1987

2004a 2004a 2004a 2004a

et al. et

)

)

1991b

, ,

)

, ,

Willems

De Vos De

2004a, Rösch 2004a,

etal.

) ) ) )

, ,

in the acknowledgements.

Zhang

et al. et al. et et al. et al. et al. et al. et al. et al. et al. et al. et

2008

et al. et

et al. et

- -

List List

2004a 2004a 2004a 2004a 2004a 2004a 2004a 2004a 2004a 2004a

et al. et

)

1985

- -

et al. et

1991b

107 2006 107 2006 107

2013

) ) ) ) ) ) ) ) ) )

, ,

2008

Tamaoka

)

)

) )

)

CHAPTER 3 33

Figure 2. Illustration of the phylogenetic position of Comamonas testosteroni KF-1 within the order Burkholderiales of Betaproteobacteria. The 16S rRNA gene alignment included the three other C. testosteroni strains whose genome sequences have been published, strain S44 (Xiong et al. 2011), strain CNB-2 (Ma et al. 2009), and type-strain ATCC 11996 (Gong et al. 2012a), and some of other genome-sequenced representatives of the family Comamonadaceae or of other families within the order Burkholderiales. The corresponding genome- project accession numbers, or 16S rRNA gene accession numbers, are indicated. ʻTʼ indicates a type strain. The sequences were aligned using the RDP tree builder (Cole et al. 2009) and displayed using MEGA4 (Tamura et al. 2007). Bootstrap values are indicated; bar, 0.02 substitutions per nucleotide position.

GENOME PROPERTIES The genome of C. testosteroni KF-1 comprises a chromosome of 6,026,527 bp (61.76% GC content) (Table 3), for which a total number of 5,606 genes were predicted. Of these predicted genes, 5,492 are protein-coding genes, and 4,009 of the protein-coding genes were assigned to a putative function and the remaining annotated as hypothetical proteins. Genome analysis predicted 114 RNA genes and six rRNA operons. The properties and the statistics of the genome are summarized in Table 3, the distribution of genes into COGs functional categories is presented in Table 4, and the chromosome map of the genome of C. testosteroni KF-1 is illustrated in Figure 3.

The chromosome of C. testosteroni KF-1 (6.03 Mb) is larger in comparison to these of the three other C. testosteroni strains whose sequences have been published, of strain S44 (Xiong et al. 2011) (5.53 Mb), strain CNB-2 (Ma et al. 2009) (5.46 Mb), and strain ATCC 11996 (Gong et 34 CHAPTER 3 al. 2012a) (5.41 Mb), and in comparison to that of C. testosteroni NBRC 100989 (5.59 Mb) whose draft sequence has not yet been published (BioProject ID PRJNA70139). Upon genomic BLAST comparison however, the strain NBRC 100989 chromosome showed the highest similarity to the chromosome of C. testosteroni KF-1.

For the three C. testosteroni genomes accessible within the IMG platform for direct comparison (Markowitz et al. 2012), strains KF-1, S44 and CNB-2, the gene abundance profile indicated, most strikingly, a much higher abundance of transposases (COG2801, COG2826 and COG4644) in strain KF-1 (42 total) in comparison to strains S44 (4 total) and CNB-2 (9 total); retroviral integrases (pfam00665) are more abundant in strain KF-1 (36 total) in comparison to strains S44 (none) and CNB-2 (13 total), and hemagluttinin repeat proteins (pfam05594) implicated in cell aggregation are more abundant (10 total) in comparison to strains S44 (none) and CNB-2 (none).

Table 2. Project information MIGS ID Property Term MIGS-31.1 Sequencing status Complete MIGS-28 Libraries used 3.5 kb, 9 kb and 37 kb DNA libraries MIGS-29 Sequencing platforms Sanger MIGS-31.2 Sequencing depth 12.8× MIGS-30 Assemblers Phred/Phrap/Consed MIGS-32 Gene calling method Prodigal Genbank ID 17465 Genbank Date of Release January 14, 2009 GOLD ID Gi01330 MIGS-13 Source material identifier DSM 14576 Project relevance Biotechnological

In respect to candidate genes encoding the metabolic features of C. testosteroni KF-1 (see above), almost identical (syntenic) gene clusters were found for the main steroid degradation genes characterized in C. testosteroni TA441 (Horinouchi et al. 2001, Horinouchi et al. 2003b, Horinouchi et al. 2010), including the genes characterized in C. testosteroni ATCC 11996 (Möbus and Maser 1998, Xiong and Maser 2001, Xiong et al. 2003, Pruneda-Paz et al. 2004, Xiong et al. 2011, Gong et al. 2012b); the strain KF-1 genes are up to 98% identical in their amino-acid sequences. Candidate genes for the degradation of the acyl side chain of cholate in Pseudomonas sp. strain Chol1 (Birkenmaier et al. 2007, Birkenmaier et al. 2011) were also found (thiolase, locus tag CtesDRAFT_PD3654; acyl-CoA dehydrogenase, PD3666), and the genes for inversion of the cholate-stereochemistry in Comamonas testosteroni TA441 (Horinouchi et al. 2008) (PD3740-44). In respect to the complete degradation of taurocholate CHAPTER 3 35

(Rösch et al. 2008), several candidate genes for bile-salts hydrolase (taurocholate hydrolase) and candidate genes for the complete degradation of the taurine-moiety (2-aminoethanesulfonate) (Rösch et al. 2008), e.g., for sulfoacetaldehyde acetyltransferase (Xsc, PD0776), were found.

Table 3. Nucleotide and gene count levels of the genome of C. testosteroni KF-1 Attribute Value % of totala Genome size (bp) 6,026,527 100 DNA coding region (bp) 5,275,818 87.54 DNA G+C content (bp) 3,723,913 61.79 Number of replicons 1 Extrachromosomal elements 0 Genes total number 5,606 100 Protein-coding genes 5,492 97.97 RNA genes 114 2.03 rRNA operon count 6 Genes with function prediction 4,009 71.51 Genes in paralog clusters 1314 23.44 Genes assigned to COGs 4,131 73.69 Genes assigned to Pfam domains 4,375 78.04 Genes connected to KEGG pathways 1,502 26.79 Genes with transmembrane helices 1,265 22.57 Genes with signal peptides 1,410 25.15 a) The total is based on either the size of the genome in base pairs or the total number of protein coding genes in the annotated genome.

Strain KF-1 has acquired the ability to utilize xenobiotic 3-C4-SPC, 3-C4-SPC-2H, 3-C5-SPC and 3-C5-SPC-2H, 4-sulfoacetophenone (SAP), and 4-sulfophenol (SP) (see above) (Schleheck et al. 2004a, Schleheck et al. 2010). The 3-C4-SPC is converted to SAP (Schleheck et al. 2010) and further to 4-sulfophenol acetate (SPAc) by a recently identified Baeyer-Villiger monooxygenase (‘SAPMO’, PD5437), and SPAc hydrolyzed by a recently identified carboxylester hydrolase encoded by the next gene in the genome (PD5438), to yield acetate and SP (Weiss et al. 2012). The two identified genes, together with other (predicted) catabolic genes, are framed by IS1071 insertion sequence elements (Tn3-family transposase genes), which suggests that these genes have only recently been acquired, possibly in the form of a ‘catabolic composite transposon’ through horizontal gene transfer (Weiss et al. 2012). Genes for other sections of the proposed 3-C4-SPC degradation pathway in strain KF-1, i.e., the 36 CHAPTER 3

‘upper’ and ‘lower’ pathway, from 3-C4-SPC to SAP and from SP further to central metabolites, respectively (Schleheck et al. 2010), are examined in our present work (unpublished).

Table 4. Number of genes associated with the general COG functional categories in C. testosteroni KF-1 Code Value %age Description J 187 4.02 Translation, ribosomal structure and biogenesis A 2 0.04 RNA processing and modification K 407 8.75 Transcription L 212 4.56 Replication, recombination and repair B 2 0.04 Chromatin structure and dynamics D 32 0.69 Cell cycle control, cell division, chromosome partioning Y - - Nuclear structure V 53 1.14 Defense mechanisms T 263 5.65 Signal transduction mechanisms M 239 5.14 Cell wall/membrane/envelope biogenesis N 102 2.19 Cell motility Z - - Cytoskeleton W Extracellular structures U 159 3.42 Intracellular trafficking, secretion, and vesicular transport O 154 3.31 Posttranslational modification, protein turnover, chaperones C 304 6.54 Energy production and conversion G 170 3.66 Carbohydrate transport and metabolism E 361 7.76 Amino acid transport and metabolism F 90 1.94 Nucleotide transport and metabolism H 164 3.53 Coenzyme transport and metabolism I 283 6.08 Lipid transport and metabolism P 303 6.51 Inorganic ion transport and metabolism Q 154 3.31 Secondary metabolites biosynthesis, transport and catabolism R 546 11.74 General function prediction only S 464 9.98 Function unknown NA 1475 26.31 Not in COGs

C. testosteroni KF-1 encodes a wealth of genes for aromatic ring cleavage oxygenases and aromatic ring-hydroxylating oxygenase (systems), as commonly observed for members of the order Burkholderiales (Pérez-Pantoja et al. 2012). Firstly, the complete protocatechuate 4,5-cleavage (meta) degradation operon (pmd-operon) characterized in C. testosteroni strain BR6020 (Providenti et al. 2001, Providenti et al. 2006a), strain E6 (Kamimura et al. 2010) and CNB-1 (Ni et al. 2012) involved in the degradation pathways for vanillate, isovanillate and 3- and 4-hydroxybenzoate, was found in strain KF-1 (pmdB, PD1898) (and two pmdB paralogs, PD1614 and 1810). An ortholog of the 3-hydroxybenzoate monooxygenase characterized in CHAPTER 3 37

C. testosteroni GZ39 (Chang and Zylstra 2008) was found in strain KF-1 (PD1242), as were the genes for conversion of vanillate and isovanillate (vanA/ivaA: PD0400/PD0403) (Providenti et al. 2006a).

Gene clusters of the meta-pathway enzymes for degradation of phenol as characterized in C. testosteroni TA441, i.e., aphCEFGHJI Arai et al. 2000 and aphKLMNOPQB (Arai et al. 1998), were not found in strain KF-1, but in strains S44 and CNB-2. However, homologs for all meta-pathway enzymes (corresponding to aphCEFGHJI) seem to be encoded distributed at different locations in the strain KF-1 genome, but a valid candidate gene cluster of the phenol hydroxylase components (aph- (Arai et al. 1998) or phcKLMNOP (Teramoto et al. 1999) genes) and catechol 2,3-dioxygenase (aphB) could not be found in the strain KF-1 genome. Also the gene cluster for the 3-(3-hydroxyphenyl)propionic acid degradation pathway (mhp-operon) characterized in Comamonas testosteroni TA441 (Arai et al. 1999) was not found in the genome of strain KF-1, and not in strain CNB-2, but in strain S44; homologs for all pathway enzymes (corresponding to mhpABDFE) seem to be distributed at different locations in the strain KF-1 genome.

Figure 4. Chromosome map of the genome of C. testosteroni KF-1. From bottom to top: Genes on forward strand (color by COG categories), genes on reverse strand (color by COG categories), RNA genes (tRNA, green; rRNA, red; other RNAs, black), GC content.

An almost identical gene cluster for the terephthalate (benzene-1,4-dicarboxylic acid) pathway (tph-cluster) as characterized in C. testosteroni YZW-D (Wang et al. 1995) and strain E6 (Sasoh et al. 2006, Kasai et al. 2010) was found in strain KF-1 (tphA, PD2130). The gene cluster for the isophthalate (benzene-1,3-dicarboxylic acid) pathway of C. testosteroni YZW-D (Wang et al. 1995) and strain E6 (Fukuhara et al. 2010) was also found in strain KF-1 (iphA, PD2139), encoded directly upsteam of the tph-cluster. Notably, at least nine other Rieske-domain ring- hydroxylating oxygenase component genes (COG4638) similar to tpaA/iphA 38 CHAPTER 3

(PD2130/PD2139) and vanA/ivaA (see above, PD0400/PD0403), seem to be encoded in strain KF-1 (PD2042, 1888, 4205, 2022, 0968, 3693, 1612, 2032, 5293).

No ortholog of the catechol 2,3-ring cleavage dioxygenase (non-heme Fe2+) of the phenol- pathway gene cluster (aphB) (Arai et al. 1998) was found in strain KF-1, but two other class I/II extradiol ring cleavage dioxygenase candidates (PD0021, 5290) in addition to a (decarboxylating) 4-hydroxyphenylpyruvate dioxygenase candidate (PD0347) (also in CNB-2 and S44), tesB of the steroid gene cluster (PD3739), and the class-III type extradiol ring cleavage dioxygenases mentioned above (PmdAB) were found.

In respect to intradiol ring cleavage dioxygenases, three candidates for (non-heme Fe3+) catechol 1,2-dioxygenase/protocatechuate 3,4-dioxygenase beta subunit/hydroxyquinol 1,2-dioxygenase were found in strain KF-1, i.e., PD0424, 5469, and 5471; notably, the latter two candidates are not represented in strains CNB-2 and S44.

Not represented in the C. testosteroni KF-1 genome is the nitrobenzene (nbz) degradation gene cluster of Comamonas sp. JS765 (Lessner et al. 2002), the 3-nitrobenzoate (mnb) degradation cluster of C. testosteroni BR6020 (Providenti et al. 2006b), the 4-chlorobenzoate uptake and degradation cluster of Comamonas sp. strain DJ-12 (Chae and Zylstra 2006, Cai et al. 2009), and not the 4-chloronitrobenzene (cnb) cluster on plasmid pCNB1 in C. testosteroni CNB-1 (Ma et al. 2007) and the upper-pathway chloroaniline (dca) cluster on plasmid pWDL7 in C. testosteroni WDL7 (Krol et al. 2012). Finally, an ortholog of the aliphatic nitrilase/cyanide hydratase (NitA) characterized in a C. testosteroni soil isolate (Levy-Schil et al. 1995) was also not found in the genome of strain KF-1, nor in those of CNB-2 or S44.

Strain KF-1 utilized none of the sugars tested (see above), and this observation is reflected by an absence of appropriate candidate genes in strain KF-1 for hexokinase and glucokinase in glycolysis, as well as of genes of the oxidative branch of the pentose phosphate pathway, as reported also for C. testosteroni CNB-2 (Ma et al. 2009).

Strain KF-1 is able to utilize nicotinate for growth and encodes an orthologous set of genes for the nicotinate dehydrogenase /hydroxylase complex (PD0815-13) characterized in C. testosteroni JA1 (Yang et al. 2010).

The poly(3-hydroxybutyrate) (PHB) biosynthesis and utilization operon of Comamonas sp. EB172 (Yee et al. 2012) is also encoded in strain KF-1 (e.g., PD2272). Furthermore, strain KF-1 was tested positive for growth with extracellular poly(3-hydroxybutyrate) (this study), and strain KF-1 encodes an ortholog (PD3795) of the characterized poly(3-hydroxybutyrate) CHAPTER 3 39 depolymerase precursor (PhaZ) of Comamonas sp. strain 31A (Jendrossek et al. 1995); notably, the ortholog was also found in C. testosteroni ATCC 11996T, but not in strains S44 and CNB-2.

In respect to the ampicillin (beta-lactam) antibiotic resistance of strain KF-1, the genome encodes at least two beta-lactamase class A (PD2722, 4357) and one beta-lactamase class B (PD0340) candidates, and with respect to kanamycin (aminoglycoside) resistance, two aminoglycoside phosphotransferase candidates (PD3717, 1418); notably, the latter two are not represented in strains CNB-2 and S44.

All four heavy metal exporter ATPase genes (zntA) and five CzcA-family exporter gene clusters described for highly zinc-resistant C testosteroni S44 (Xiong et al. 2011) were found in strain KF-1, and in total eight zntA and 11 cntA candidates. Two arsenical resistance gene clusters (PD1708-06 and 3544-42), each with candidates for arsenical pump (ArsB), arsenate reductase (ArsC), NADPH:FMN oxidoreductases (ArsH), and transcriptional regulator (ArsR), and a third arsC candidate (PD0567), were found in strain KF-1.

ACKNOWLEDGEMENTS We thank Joachim Hentschel for SEM operation, and several students of our practical classes for testing growth substrates. The work was financially supported by the University of Konstanz and the Konstanz Research School Chemical Biology, the University of New South Wales and the Centre for Marine Bio-Innovation, and by the Deutsche Forschungsgemeinschaft (DFG grant SCHL 1936/1-1 to D.S.). The work conducted by the U.S. Department of Energy Joint Genome Institute was supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231, and by the University of California, Lawrence Berkeley National Laboratory under contract No. W-7405-Eng-48, Lawrence Berkeley National Laboratory under contract No. DE-AC03-76SF00098 and Los Alamos National Laboratory under contract No. W-7405-ENG-36.

CHAPTER 4 41

CHAPTER 4

Complete genome sequence of Delftia acidovorans SPH-1

Michael Weiss1,2, Anna I. Kesberg1, Kurt M. LaButti3, Sam Pitluck3, David Bruce4, Loren Hauser5, Alex Copeland3, Tanja Woyke3, Stephen Lowry3, Susan Lucas3, Miriam Land5, Lynne Goodwin3,4, Staffan Kjelleberg6, Alasdair M. Cook1, Matthias Buhmann1, Torsten Thomas6, and David Schleheck1,2

1Department of Biological Sciences, University of Konstanz, Germany

2Konstanz Research School Chemical Biology, University of Konstanz, Germany

3DOE Joint Genome Institute, Walnut Creek, California, USA

4Los Alamos National Laboratory, Bioscience Division, Los Alamos, New Mexico, USA

5Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA

6Centre for Marine Bio-Innovation and School of Biotechnology and Biomolecular Science, University of New South Wales, Sydney, Australia

42 CHAPTER 4

ABSTRACT Delftia acidovorans SPH-1 is the second genome-sequenced model bacterium for the elucidation of novel biodegradation pathways for xenobiotic sulfophenylcarboxylates (SPCs), which are formed during initial biodegradation of commercial surfactants linear alkylbenzenesulfonates (LAS). Here, we describe Delftia acidovorans SPH-1 together with its complete genome sequence. The 6,767,514 bp long genome with its 6,048 protein-coding genes and 98 RNA genes exhibits an average G+C content of 66.48%.

INTRODUCTION Delftia acidovorans strain SPH-1 (DSM 14801) was isolated from an enrichment culture with the sulfophenylcarboxylate (SPC) 4-(4-sulfophenyl)hexanoate (abbreviated 4-C6-SPC) as the sole carbon and energy source for aerobic growth. The enrichment culture was inoculated with sludge from a sewage treatment plant (Schleheck et al. 2004a). The substrate, 4-C6-SPC, had been generated biologically, through growth of Parvibaculum lavamentivorans DS-1T (Schleheck et al. 2011) with a single surfactant congener of linear alkylbenzenesulfonate

(LAS), i.e., with 3-(4-sulfophenyl)dodecane (abbreviated 3-C12-LAS): P. lavamentivorans DS-1T catalyzes a primary degradation of all LAS congeners that are present in a commercial

LAS-surfactant mixture (i.e., 3-C12-LAS and 19 other congeners of C10-C13 LAS), through shortening of the LAS alkyl chains by fatty-acid beta-oxidation, and produces a complex mixture of short-chain SPCs and other related compounds (appr. 50 compounds) (Dong et al. 2004, Schleheck et al. 2004a, Schleheck et al. 2004b, Schleheck et al. 2007).

In fact, commercial LAS are synthetic, xenobiotic compounds of petrochemical origin, and are completely degraded in a cooperated effort by complex bacterial communities, e.g., in sewage treatment plants. In order to elucidate how the environmental bacteria have adapted so quickly to a complete utilization of LAS and SPCs as growth substrates and, hence, to their complete degradation, two representative bacterial strains have been isolated and genome-sequenced, i.e., the LAS-degrading isolate P. lavamentivorans DS-1 (Schleheck et al. 2000, Schleheck et al. 2011) and the SPC-degrading isolate Comamonas testosteroni KF-1 (Schleheck et al. 2004a, Weiss et al. 2013). D. acidovorans SPH-1 is a second SPC-degrading isolate and has a different SPC degradation pathway (Schleheck et al. 2004a). In particular, D. acidovorans SPH-1 has only a very narrow substrate range for SPCs: It degrades the 4-C6-SPC in the SPC mixture generated from commercial LAS and 4-C5-SPC (4-(4-sulfophenyl)pentanoate), as well as the corresponding alpha/beta-unsaturated SPCs of 4-C6-SPC and 4-C5-SPC, but none of the many other SPCs (see therefore Schleheck et al. 2004a), for example not 3-C4-SPC

(3-(4-sulfophenyl)butyrate). The 3-C4-SPC, however, can be degraded by the other SPC- CHAPTER 4 43 degrading genome-sequenced isolate, Comamonas testosteroni KF-1, but this strain, in turn, can not utilize the 4-C6-SPC (Schleheck et al. 2004a). Hence, despite the intriguing question of how these two bacterial strains have established their SPC degradation pathways, another interesting question regards their substrate specificity for SPC degradation: Why is

C. testosteroni KF-1 not able to utilize also 4-C6-SPC (or any other SPC) and why is

D. acidovorans SPH-1 not able to utilize also 3-C4-SPC (or any other SPC)?

2 µm

Figure 1. Scanning electron micrograph of Delftia acidovorans SPH-1. Cells derived from a liquid culture that grew in LB medium.

Here, we describe some features of D. acidovorans SPH-1 and a summary classification of this organism together with its complete genome sequence, which has been established as part of the Microbial Genomic Program 2006 of the DOE Joint Genome Institute. The genome is accessible in GenBank and its annotation in IMG (Integrated Microbial Genomes; Markowitz et al. 2012) platform.

D. acidovorans was formerly known as Comamonas acidovorans, and even prior to that as Pseudomonas acidovorans. The type strain of D. acidovorans is strain ATCC 15668T (Tamaoka et al. 1987, Wen et al. 1999). Other members of the genus Delftia have been recognized for their potential to degrade aromatic compounds (Pérez-Pantoja et al. 2012), for instance strain 44 CHAPTER 4

MBLF, which is able to utilize cocaine due to the presence of a cocaine esterase yielding benzoate and ecgonine methyl ester (Lister et al. 1996), strains MC1 and P4a, which are able to utilize chlorophenoxy herbicides (Müller et al. 1999, Hoffmann et al. 2003), or other members that utilize anilines and chloroanilines (Boon et al. 2001, Urata et al. 2004). Further, D. acidovorans strains have been recognized to utilize diethylphosphate and diethylthiophosphate as sole phosphorous source (Cook et al. 1980) and to express a phosphodiesterase (PdeA) (Tehara and Keasling 2003).

ORGANISM INFORMATION AND CLASSIFICATION Morphology and growth conditions

D. acidovorans SPH-1 is a rod-shaped (size, appr. 0.8 x 3 µm, Figure 1) Gram-negative bacterium forming beige colonies on LB agar-plates.

Phylogeny

On the basis of 16S rRNA gene sequence, strain SPH-1 is classified as a member of the genus Delftia within the family Comamonadaceae and the order Burkholderiales of Betaproteobacteria, as illustrated by a phylogenetic tree (Figure 2). Currently, 959 genome sequences of members of the order Burkholderiales of Betaproteobacteria and 219 genome sequences within the family Comamonadaceae, have been, currently are, or are targeted to be established (GOLD database; September 2014).

GENOME SEQUENCING INFORMATION

Genome project history The genome was selected for sequencing as part of the U.S. Department of Energy - Microbial Genomic Program 2006. The DNA sample was submitted in February 2006 and the initial sequencing phase was completed in August 2006. After the finishing and assembly phase, the genome was presented for public access in November 2007. Table 2 presents the project information and its association with MIGS version 2.0 compliance (Field et al. 2008). CHAPTER 4 45

Figure 2. Phylogenetic tree of 16S rRNA gene sequences showing the position of D. acidovorans SPH-1 relative to genome-sequenced Comamonacea and other families within the order Burkholderiales. The corresponding 16S rRNA gene accession numbers or project accession numbers are indicated. Parvibaculum lavamentivorans DS-1 was used as outgroup. The sequences were aligned using the RDP tree builder (Cole et al. 2009) and displayed using MEGA4 (Tamura et al. 2007).

46 CHAPTER 4

Table 1. Table project project a

) Evidence codes codes Evidence )

MIGS MIGS MIGS MIGS MIGS MIGS MIGS MIGS MIGS ID MIGS

(

------

Ashburner

4.1 4.1 4.4 4.2 5 4 14 15 22 6

Classificationand general features of

Latitude receptor Cellshape Altitude Longitude time collection Sample location Geographic Pathogenicity relationship Biotic requirement Oxygen Habitat Terminalelectron source Energy source Carbon Optimumtemperature range Temperature Sporulation Motility stain Gram classification Current Property

-

et et al.

IDA: Inferred from Direct Assay; TAS: Traceable Author Statement; NAS: Non NAS: Statement; Author Traceable TAS: Assay; Direct from Inferred IDA:

2000

)

. . If the evidence is IDA, then the property was directly for observed a live isolate by one of the authors or an expert menti

47° 40' 41.96" 47° smallrod 399 m 399 8' 26.87" 9° 2001 Germany). (Konstanz, plant treatment sewage communal of a tank fromanaeration with sludge inoculated been had that Germany) Konstanz, (Universityof from isolated TRBA) German to according (classification 1 group Risk nonpathogenic, biofilms in or in aggregates living, free aerobic habitat aerobic oxygen molecular chemoorganotroph 4 4 2 phenylacetate, 4 30ºC mesophile non motile negative StrainSPH Species Genus Family Order Class Phylum Domain Term

- - -

hydroxybenzoate, protocatechuate hydroxybenzoate, benzoate, ethanol, taurocholate, taurine, phenylbutyrate, (4

-

-

sporulating

sulfophenyl)hexanoate (4 sulfophenyl)hexanoate

Betaproteobacteria

Burkholderiales

Delftia

Comamonadaceae

Delftia acidovorans Delftia

Proteobacteria

Bacteria

-

Delftia acidovorans Delftia

1

a 4 a

-

-

hydroxyphenylacetate, 3 hydroxyphenylacetate,

C

6

-

SPC

-

degrading laboratory enrichment culture culture enrichment laboratory degrading

-

C

6

-

SPC) and SPC)

SPH

-

-

hydroxyphenylacetate, hydroxyphenylacetate,

1 1 accordingrecommendations the to MIGS

other SPCs ( SPCs other

-

see text see

traceable Author Statement. These evidence codes are from the Gene Ontology Ontology Gene the from are codes evidence These Statement. Author traceable

)

, ,

TAS TAS TAS TAS TAS TAS ( TAS TAS TAS TAS TAS ( IDA,TAS TAS TAS TAS TAS TAS TAS TAS TAS TAS TAS code Evidence

( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( (

Schleheck Schleheck Schleheck Schleheck Schleheck Schleheck Schleheck Schleheck Schleheck Schleheck Schleheck Schleheck Schleheck Tamaoka Tamaoka Willems Garrity Garrity Garrity Woese

Schleheck

et al. et

et al. et al. et al. et

a

et al. et

et al. et al. et

(

et al. et et al. et al. et al. et al. et al. et al. et al. et al. et al. et al. et al. et al. et

Field

1990

2005d, Validation 2005d, Validation 2005b, 2005e

1991a

1987 1987

2004a 2004a 2004a 2004a 2004a 2004a 2004a 2004a 2004a 2004a 2004a 2004a 2004a

et al. et

)

etal.

)

, , ,

)

oned oned in the acknowledgements.

Wen Wen

2004a, Rösch 2004a,

) ) ) ) ) ) ) ) ) ) ) ) )

2008

et al. et al. et

)

- -

List List

1999 1999

- -

et al. et

107 2006 107 2006 107

) )

2008

) )

)

CHAPTER 4 47

GROWTH CONDITIONS AND DNA ISOLATION D. acidovorans SPH-1 was grown on LB agar plates and colonies were transferred into selective medium (1 mM 4-C6-SPC-salts medium; 3-ml scale; Schleheck et al. 2004a). This culture was sub-cultivated to larger scale (100-ml and 1-liter scale) in 30 mM acetate/minimal salts medium; cell pellets were stored frozen until DNA preparation. DNA was prepared following the JGI’s DNA Isolation Bacterial CTAB Protocol (http://my.jgi.doe.gov/general/index.html).

Table 2. Project information MIGS ID Property Term MIGS-31.1 Sequencing status Complete Sequencing quality Finished MIGS-28 Libraries used 3.5 kb, 9 kb and 37 kb DNA libraries MIGS-29 Sequencing platforms Sanger MIGS-31.2 Sequencing depth 12.8x MIGS-30 Assemblers Phred/Phrap/Consed MIGS-32 Gene calling method Glimmer/Criteria Genbank ID CP000884 Genbank Date of Release November, 2007 GOLD ID Gc00683 MIGS-13 Source material identifier DSM 14801 Project relevance Biotechnological

GENOME SEQUENCING AND ASSEMBLY The genome of Delftia acidovorans SPH-1 was sequenced at the Joint Genome Institute (JGI) using a combination of 3.5 kb, 9 kb and 37 kb DNA libraries. All general aspects of library construction and sequencing performed at the JGI can be found at the JGI website (www.jgi.doe.gov). Sanger sequence data were generated for the assembly from all three libraries, which provided for a 12.8-fold coverage of the genome. The Phred/Phrap/Consed software package was used for sequence assembly and quality assessment (Ewing and Green 1998, Ewing et al. 1998, Gordon et al. 1998). After the shotgun stage, reads were assembled with parallel phrap (High Performance Software, LLC). Possible misassemblies were corrected with Dupfinisher (Han and Chain 2006), PCR amplification, or transposon bombing of bridging clones (Epicentre Biotechnologies, Madison, WI, USA). Gaps between contigs were closed by editing in Consed, custom primer walk or PCR amplification (Roche Applied Science, Indianapolis, IN, USA).

GENOME ANNOTATION Genes were identified using Prodigal (Hyatt et al. 2010) as part of the genome annotation pipeline at Oak Ridge National Laboratory (ORNL), Oak Ridge, TN, USA, followed by a round 48 CHAPTER 4 of manual curation using the JGI GenePRIMP pipeline (Pati et al. 2010). The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) non-redundant database, UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro databases. Non-coding genes and miscellaneous features were predicted using tRNAscan-SE (Lowe and Eddy 1997), RNAMMer (Lagesen et al. 2007), Rfam (Griffiths-Jones et al. 2003), TMHMM (Krogh et al. 2001), and signalP (Dyrløv Bendtsen et al. 2004). Additional gene prediction analysis and manual functional annotation were performed within the Integrated Microbial Genomes (IMG) platform (http://img.jgi.doe.gov) developed by the JGI, Walnut Creek, CA, USA (Markowitz et al. 2008).

Table 3. Nucleotide and gene count levels of the genome of D. acidovorans SPH-1 Attribute Value % of totala Size (bp) 6,767,514 100 G+C content (bp) 4,498,790 66.48 Coding region (bp) 6,102,591 90.17 Number of replicons 1 Extrachromosomal elements 0 Genes total number 6,146 100 Protein-coding genes 6,048 98.41 Pseudo genes 9 0.15 RNA genes 98 1.59 rRNA operon count 5 Genes with function prediction 4,422 71.95 Genes in paralog clusters 1,439 23.41 Genes assigned to COGs 4,627 75.28 Genes assigned to Pfam domains 4,870 79.24 Genes connected to KEGG pathways 1,664 27.07 Genes with transmembrane helices 1,372 22.32 Genes with signal peptides 1,567 25.50 a) The total is based on either the size of the genome in base pairs or the total number of protein coding genes in the annotated genome.

Genome properties The genome of D. acidovorans SPH-1 comprises a chromosome of 6,767,514 bp (66.48% GC content) (Table 3), for which a total number of 6,146 genes is predicted. For these predicted genes, 6,048 are protein-coding genes and 4,422 of the protein-coding genes were assigned to a putative function, whereas the remaining are annotated as hypothetical proteins. 98 RNA genes and five rRNA operons are predicted. The properties and the statistics of the genome are CHAPTER 4 49 summarized in Table 3, and the distribution of genes into COGs functional categories is presented in Table 4.

GENOME PROPERTIES AND METABOLIC FEATURES Delftia acidovorans SPH-1 is a prototrophic organism and able to utilize aromatic carbon sources, such as benzoate, 4-hydroxybenzoate, 3,4-dihydroxybenzoate (protocatechuate), phenylacetate, 2-hydroxyphenylacetate, 3-hydroxyphenylacetate, 4-phenylbutyrate,

4-sulfocatechol, and the LAS degradation intermediates 4-C5-SPC, 4-C6-SPC, 4-C5-2en-SPC, and 4-C6-2en-SPC (this study and Schleheck et al. 2004a).

In respect to degradation of aromatic substrates, strain SPH-1 harbors candidate genes for phenylpropionate dioxygenases and related ring-hydroxylating dioxygenases (COG4638; Daci_0610, 0964, 1352, 1355, 2143, 2239). Two of them are located in a gene cluster almost identical to a cluster of the C. testosteroni KF-1 genome for vanillate and isovanillate degradation (vanA: Daci_1352; ivaA: 1355). These oxygenases catalyze the demethylation of vanillate, isovanillate, and veratrate to protocatechuate, which is a central intermediate of the aerobic degradation of aromatic compounds (Providenti et al. 2006a). The genes responsible for the 4,5-meta-fission pathway of protocatechuate (pmdABCDEFK) (Providenti et al. 2001) were also found in strain SPH-1 (Daci_4446, 4445, 4444, 4447, 4449, 4448, 4450). Interestingly, there are two other paralogs of pmdAB encoded in the genome of SPH-1 (Daci_3814, 3813 and Daci_0915, 0914).

Gene candidates for an ortho-degradation of catechol are also present (catechol 1,2-dioxygenase, Daci_1867; muconate cycloisomerase, Daci_1870; muconolactone delta- isomerase, Daci_1869; 3-oxoadipate enol-lactonase, Daci_1996; 3-oxoacid coenzyme A (CoA)-transferase, Daci_5561, 5562), and candidates for a CoA-dependent aerobic benzoate degradation pathway (boxABC, regulator, and a benzoate CoA ligase; Daci_0072-0076; Gescher et al. 2002).

For phenylacetate (Paa) degradation, the phenylacetyl-CoA monooxygenase genes of an aerobic phenylacetate catabolic pathway paaA-paaE (Teufel et al. 2010) are encoded (Daci_0810-0814). This gene cluster organization is commonly found in almost all bacteria that harbor the paa-encoded Paa degradation pathway (Fernandez et al. 2006). The multicomponent monooxygenase PaaABCE regenerates phenylacetyl-CoA from the toxic epoxide intermediate (Teufel et al. 2012). Phenylacetyl-CoA, which is formed by a Paa-CoA ligase (PaaK, Daci_0809 and Daci_0570), is hydrolytically cleaved in a backwards reaction into Paa and HS- CoA by a phenylacetyl-CoA thioesterase (PaaI, Daci_0569) (Teufel et al. 2012). The 50 CHAPTER 4 epoxidized phenylacetyl-CoA is isomerized by an enoyl-CoA isomerase (PaaG, Daci_0568) to oxepin-CoA, which is subject of a ring cleavage reaction (PaaZ). Notably, D. acidovorans SPH-1 does not encode the responsible fusion protein PaaZ, but the aldehyde dehydrogenase (PacL, Daci_0808). The aldehyde producing hydratase is encoded outside the gene cluster, as also known for some other Paa-degrading bacteria (Teufel et al. 2010, Teufel et al. 2011).

Table 4. Number of genes associated with the general COG functional categories in D. acidovorans SPH-1 Code Value %age Description J 188 3.56 Translation, ribosomal structure and biogenesis A 1 0.02 RNA processing and modification K 567 10.75 Transcription L 170 3.22 Replication, recombination and repair B 3 0.06 Chromatin structure and dynamics D 33 0.63 Cell cycle control, cell division, chromosome partioning Y - - Nuclear structure V 72 1.37 Defense mechanisms T 350 6.64 Signal transduction mechanisms M 255 4.84 Cell wall/membrane/envelope biogenesis N 135 2.56 Cell motility Z - - Cytoskeleton W Extracellular structures U 178 3.38 Intracellular trafficking, secretion, and vesicular transport O 162 3.07 Posttranslational modification, protein turnover, chaperones C 316 5.99 Energy production and conversion G 220 4.17 Carbohydrate transport and metabolism E 455 8.63 Amino acid transport and metabolism F 86 1.63 Nucleotide transport and metabolism H 183 3.47 Coenzyme transport and metabolism I 250 4.74 Lipid transport and metabolism P 328 6.22 Inorganic ion transport and metabolism Q 162 3.07 Secondary metabolites biosynthesis, transport and catabolism R 588 11.15 General function prediction only S 571 10.83 Function unknown NA 1519 24.72 Not in COGs

There is no hexokinase candidate gene for a complete glycolysis encoded in the D. acidovorans SPH-1 genome, and no PTS or other transport system for glucose is predicted. However, there are candidates for a PQQ-dependent glucose dehydrogenase (Daci_0342), a gluconolactonase (Daci_0257), a glucokinase (Daci_3761), a 6-phosphogluconate dehydratase (Daci_3765), and CHAPTER 4 51 a 2-keto-3-deoxy phosphogluconate aldolase (Daci_3766) encoded. Further, there are candidates for fructose permease (fruA, Daci_2665), phosphoenolpyruvate-protein phosphotransferase (fruB, Daci_2663), and 1-phosphofructokinase (Daci_2664). Indeed, strain SPH-1 is able to grow with D-fructose, but not with D-glucose (this study). Strain SPH-1 harbours candidates for isocitrate lyase (Daci_4617) and malate synthase (Daci_0665), hence, most likely uses the glyoxylate cycle for anaplerosis.

Figure 2. Graphical circular map of the genome of D. acidovorans SPH-1. From outside to center: Genes on forward strand (color by COG categories), genes on reverse strand (color by COG categories), RNA genes (tRNA, green; rRNA, red; other RNAs, black), GC content, GC skew.

D. acidovorans SPH-1 utilizes taurine as sole carbon and energy source and encodes a taurine dehydrogenase (TauXY, Daci_2019/2020), sulfoacetaldehyde acetyltransferase (Xsc, Daci_1992), phosphotransacetylase (Pta, Daci_1991), sulfite exporter (TauZ, Daci_1990) and a sulfite dehydrogenase (SorAB, Daci_0054/0055; Denger et al. 2008, Rösch et al. 2008). Strain SPH-1 utilizes also taurocholate through initial hydrolytic cleavage into taurine and 52 CHAPTER 4 cholate catalyzed by a periplasmic bile salt hydrolase (Bsh, Daci_3467); cholate is not subject of further degradation (Rösch et al. 2008). Interestingly, D. acidovorans SPH-1 does not encode an ABC-taurine transport system (TauABC, Eichhorn et al. 2000 or TauKLM, Brüggemann et al. 2004), but a taurine permease (TauP, Daci_2021; Krejčík et al. 2012). Candidate genes for a ketoglutarate-dependent taurine hydroxylase (TauD) are also encoded in the genome (e.g. Daci_2006, Daci_2012, Daci_3041, Daci_4757, and Daci_4891).

The D. acidovorans SPH-1 genome harbours the del gene cluster (Daci_4753-4759) including a nonribosomal peptide synthetase/polyketide synthase responsible for the production of a linear polyketide-nonribosomal peptide (delftibactin), which protects the organism from toxic soluble gold ions (Au3+) by complexation and precipitation of solid gold particles (Johnston et al. 2013).

With respect to heavy metal ion resistance, there is a genomic island present in strain SPH-1, which contains all accessory genes for metal resistance, and which is organized in five clusters (Van Houdt et al. 2009): (i) resistance to copper (copKDCFOGBARS, Daci_0443-0452); (ii) resistance to silver/copper (silABCDR, Daci_0473-0477); (iii) resistance to cadmium/lead (pbrRACBT, Daci_0478-0482); (iv) resistance to mercury (merAPTR, Daci_0483- 0486); (v) resistance to arsenic (arsRIC2BC1H, Daci_ 0487-0492) (Hynninen et al. 2009, Van Houdt et al. 2009). The cluster for copper resistance is flanked by a conjugative module (traR-trb, Daci_0428-0442) and the parAB (Daci_0453-0468) maintenance genes, whereas the other four clusters are framed by parAB and the int gene (Daci_0497; Van Houdt et al. 2009). Ten candidate genes for heavy metal efflux pumps of the CzcA-family (COG3696; Daci_0018, 0473, 1087, 1550, 1716, 1742, 2709, 2752, 3918, 4763) and 17 candidates for efflux transporter of the RND family, MFP subunit (COG 0845; Daci_0019, 0163, 0474, 0501, 1088, 1549, 1715, 2256, 2420, 2574, 2685, 2710, 3367, 4764, 5137, 5654, 5858) are encoded; and further, 29 annotated outer membrane (efflux) proteins (COG1538; Daci_0017, 0125, 0165, 0475, 0499, 0761, 1089, 1551, 1714, 1739, 2077, 2259, 2315, 2417, 2681, 2707, 2751, 2911, 3056, 3284, 3369, 3744, 3917, 3958, 4247, 4572, 4765, 5135, 5656), candidate genes for a macrolide specific efflux protein MacA (Daci_2256, 2317, 3282), drug resistance transporter of the subfamily EmrB/QacA (Daci_3742, 5911), drug resistance transporter of the subfamily Bcr/CflA (Daci_3365, 5152, 5334), and further 36 candidate genes of COG2814.

D. acidovorans SPH-1 grows in LB Medium supplemented with ampicillin in concentrations up to 1 mg ml-1. For protection against beta-lactam antibiotics, strain SPH-1 encodes a set of predicted beta-lactamases and penicillin binding proteins (COG2367: Daci_4574; COG1680: Daci_0349, 0805, 3034, 3665, 3787, 5145; COG2602: Daci_2332, 4544). CHAPTER 4 53

For aminoglycoside resistance, strain SPH-1 encodes a predicted streptomycin kinase gene (COG3570; Daci_2795).

In Delftia sp. HT23, a D-threo-3-hydroxyaspartate dehydratase was characterized with activity towards D-threo-3-hydroxyaspartate, L-threo-3-hydroxyaspartate, L-erythro- 3-hydroxyaspartate, and D-serine; an ortholog is found in strain SPH-1 (Daci_0971; Maeda et al. 2010).

Strain SPH-1 is highly tolerant to surfactant stress in concentrations up to 0.4 mM LAS, by increased aggregate formation in liquid culture or biofilm formation on surfaces (Buhmann 2008). The genome of strain SPH-1 harbours components for microbial surface attachement, for example at least one cluster related to pili biogenesis, putatively containing type IV pilus assembly protein PilE (Daci_5192 and Daci_5198) and type IV pilus assembly protein PilV (Daci_5188 and Daci_5194; Thomas et al. 2008).

ACKNOWLEDGEMENTS We thank Joachim Hentschel for SEM operation. The work was financially supported by the University of Konstanz, the Konstanz Research School Chemical Biology, and the Zukunftskolleg of the University of Konstanz, the University of New South Wales and the Centre for Marine Bio-Innovation, and by the Deutsche Forschungsgemeinschaft (DFG grant SCHL 1936/1-1 to D.S.). The work conducted by the U.S. Department of Energy Joint Genome Institute was supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231, and by the University of California, Lawrence Berkeley National Laboratory under contract No. W-7405-Eng-48, Lawrence Berkeley National Laboratory under contract No. DE-AC03-76SF00098 and Los Alamos National Laboratory under contract No. W-7405-ENG-36.

CHAPTER 5 55

CHAPTER 5

Two enzymes of a complete degradation pathway for linear alkylbenzenesulfonate (LAS) surfactants: 4-sulfoacetophenone Baeyer-Villiger monooxygenase and 4-sulfophenylacetate esterase in Comamonas testosteroni KF-1

Michael Weiss1,2, Karin Denger1, Thomas Huhn2,3, and David Schleheck1,2

1Department of Biology, University of Konstanz, Germany

2Konstanz Research School Chemical Biology, University of Konstanz, Germany

3Department of Chemistry, University of Konstanz, D-78457 Konstanz, Germany

This chapter was published in Applied and Environmental Microbiology. 2012, 78(23):8254- 8263.

56 CHAPTER 5

ABSTRACT Complete biodegradation of the surfactant linear alkylbenzenesulfonate (LAS) is accomplished by complex bacterial communities in two steps. First, all LAS congeners are degraded into about 50 sulfophenylcarboxylates (SPCs), one of which is 3-(4-sulfophenyl)butyrate

(3-C4-SPC). Second, these SPCs are mineralized. 3-C4-SPC is mineralized by Comamonas testosteroni KF-1 in a process involving 4-sulfoacetophenone (SAP) as a metabolite and an unknown inducible Baeyer-Villiger monooxygenase (BVMO) to yield 4-sulfophenyl acetate (SPAc) from SAP (SAPMO enzyme); hydrolysis of SPAc to 4-sulfophenol and acetate is catalyzed by an unknown inducible esterase (SPAc esterase). Transcriptional analysis showed that one of four candidate genes for BVMOs in the genome of strain KF-1, as well as a SPAc esterase candidate gene directly upstream, was inducibly transcribed during growth with

3-C4-SPC. The same genes were identified by enzyme purification and peptide fingerprint-mass spectrometry when SAPMO was enriched, and SPAc esterase purified to homogeneity by protein chromatography. Heterologously overproduced pure SAPMO converted SAP to SPAc and was active with phenylacetone and 4-hydroxyacetophenone but not with cyclohexanone and progesterone. SAPMO showed highest sequence homology to archetypal phenylacetone BVMO (57%) followed by steroid BVMO (55%) and 4-hydroxyacetophenone BVMO (30%). Finally, the two pure enzymes added sequentially, SAPMO with NADPH and SAP, and then SPAc esterase, catalyzed the conversion of SAP via SPAc to 4-sulfophenol and acetate in a 1:1:1:1 molar ratio. Hence, the first two enzymes of a complete LAS degradation pathway were identified, giving evidence for the recruitment of members of the very versatile type I BVMO and carboxylester hydrolase enzyme families for the utilization of a xenobiotic compound by bacteria.

INTRODUCTION Linear alkylbenzenesulfonate (LAS) is the major synthetic laundry surfactant in worldwide use, with an annual production of approximately 3 million tons per year (Knepper et al. 2003). The commercial preparation of LAS is nominally constituted of a mixture of 20 congeners of secondary 4-sulfophenylalkanes (secondary C10 to C13-LAS), 18 of which are chiral (Knepper et al. 2003). This complex mixture of congeners and enantiomers is completely degraded by complex heterotrophic aerobic bacterial communities in two steps (Figure 1) (van Ginkel 1996, Dong et al. 2004, Schleheck et al. 2004a). In the first step, bacteria utilize the alkyl chains of LAS for growth, by acquisition of acetyl-coenzyme A (acetyl-CoA) through beta-oxidation of the alkyl moiety, and release a complex mixture of approximately 50 short-chain

4-sulfophenylcarboxylates (SPCs, C4 to C9 chain lengths) and related SPC-like intermediates, CHAPTER 5 57 as observed with our model organism for primary LAS degradation, Parvibaculum lavamentivorans DS-1T (Schleheck et al. 2000, Schleheck et al. 2004b, Schleheck and Cook 2005, Schleheck et al. 2011). For an example relevant in this paper, the primary degradation of T a C10-LAS congener, (R,S)-4-sulfophenyldecane (2-C10-LAS), by P. lavamentivorans DS-1 yields (R,S)-3-(4-sulfophenyl)butyrate (3-C4-SPC) as major product (see Figure 1 for the structures).

In the second step, the SPCs (and the other products) are completely degraded by other heterotrophic bacteria to cell material, CO2, water, and sulfate (Schulz et al. 2000, Schleheck et al. 2004a, Schleheck et al. 2010). For this ultimate degradation step, our work with pure cultures of SPC-utilizing bacteria has shown that many different organisms must be active, because all known SPC-degrading representatives have a narrow substrate spectrum of only 3 to 4 individual SPC-like compounds (Schulz et al. 2000, Schleheck et al. 2004a).

Any detailed information on the LAS and SPC degradation pathways, particularly the enzymes and genes involved, is still absent. In this study, we used the SPC-degrading bacterium Comamonas testosteroni KF-1 (Schleheck et al. 2004a) and its available genome sequence

(Genome Online Database (GOLD) ID Gi01330), the proposed 3-C4-SPC degradation pathway (see below), and the available single SPC as substrate for strain KF-1 (chemically synthesized

3-C4-SPC; Schleheck et al. 2010), as our model system in the attempt to reveal the first enzymes and genes of a complete degradation pathway for LAS.

Our recent work (Schleheck et al. 2010) strongly suggested that an NADPH-dependent Baeyer-

Villiger monooxygenase reaction is employed in the degradation pathway for 3-C4-SPC in C. testosteroni KF-1 (Figure 1). Baeyer-Villiger monooxygenases (BVMOs) are a specific type of oxidoreductases (EC 1.14.13.x), found in many different organisms for the catalysis of a variety of oxidation reactions, including Baeyer-Villiger oxidations (Baeyer and Villiger 1899) in which aliphatic, cyclic and/or aromatic ketones are converted to esters or lactones by insertion of an oxygen atom into the higher substituted carbon-carbon bond of the carbonylic substrate (e.g. Kamerbeek et al. 2003). In the 3-C4-SPC pathway in C. testosteroni KF-1, the identified carbonylic metabolite 4-sulfoacetophenone (SAP; 4-acetylbenzenesulfonate) (Figure 1) was indicated to be converted by an unknown, strictly NADPH-dependent SAP BVMO enzyme (SAPMO) to 4-sulfophenyl acetate (SPAc) (Figure 1) (Schleheck et al. 2010). The ester, which is stable, was shown to be cleaved by an unknown esterase to yield 4-sulfophenol (4-hydroxybenzenesulfonate) and acetate (Figure 1) (Schleheck et al. 2010). The precedents for this reaction sequence in strain KF-1 (Figure 1) can be found in Pseudomonas strains that convert a structural analog of SAP, carbonylic 4-hydroxyacetophenone (HAP), to 58 CHAPTER 5

4-hydroxyphenyl acetate (HPAc) using an NADPH-dependent HAP BVMO enzyme (HAPMO), which is subsequently hydrolyzed to hydroquinone and acetate by an esterase (HPAc-esterase) (Kamerbeek et al. 2001, Moonen et al. 2008a, Rehdorf et al. 2009, Rehdorf et al. 2012). HAPMO belongs to a large family of so-called type I BVMOs within the family of flavoprotein monooxygenases; these enzymes are NADPH dependent, contain flavin adenine dinucleotide and (FAD), and share a highly conserved ʻBVMO fingerprintʼ sequence motif (Fraaije et al. 2002). Type II BVMOs use NADH and flavin mononucleotide (FMN) and contain no BVMO fingerprint motif (Leisch et al. 2011).

Figure 1. Schematic representation of the complete degradation of LAS surfactant by pairs of heterotrophic bacteria and illustration of the two enzyme reactions that are the topic of this study, involving a 4-sulfoacetophenone (SAP) Baeyer-Villiger-type monooxygenase (SAPMO) and a 4-sulfophenyl acetate (SPAc) esterase (SPAc-esterase) in Comamonas testosteroni KF-1. LAS (here 2-C10-LAS; 2-(4-sulfophenyl)decane) is degraded in a first step to 4-sulfophenylcarboxylates (SPCs) (here, to 3-C4-SPC; 3-(4-sulfophenyl)butyrate), by e.g. Parvibaculum lavamentivorans DS-1T (Schleheck et al. 2004b, Schleheck et al. 2007). In a second step, the SPCs are mineralized by other bacteria, represented here by Comamonas testosteroni KF-1, which is able to utilize 3-C4-SPC for growth (Schleheck et al. 2004a). None of the enzymes or genes for LAS and SPC-degradation has been identified thus far.

The BVMO-esterase reaction sequence from SAP via SPAc to 4-sulfophenol and acetate in C. testosteroni KF-1 (Figure 1) allowed for the first proposition of a reasonable SPC CHAPTER 5 59

degradation pathway in bacteria (Schleheck et al. 2010). Briefly, the upper part of the 3-C4-SPC pathway leading to SAP is thought to involve the utilization of a first C2 unit from the C4 side chain of 3-C4-SPC, most likely by a reaction sequence in analogy to beta-oxidation and, thus, an abstraction of acetyl-CoA (see schematic representation in Figure 1) (for details and structures, see Schleheck et al. 2010). The lower 3-C4-SPC pathway, i.e., after the proposed BVMO-esterase reaction sequence and from 4-sulfophenol further via aromatic ring cleavage to central metabolites, is thought to proceed via 4-sulfophenol 2-monooxygenation to 4-sulfocatechol, followed by 4-sulfocatechol ortho-ring cleavage (Contzen et al. 2001, Halak et al. 2006, Halak et al. 2007) (see schematic representation in Figure 1) (for details and structures see reference Schleheck et al. 2010).

In the present study, we focused on the two readily accessible enzyme reactions in the 3-C4-SPC degradation pathway in C. testosteroni KF-1, catalyzed by predicted SAPMO and SPAc esterase, and report that the genome sequence of C. testosteroni KF-1, kindly made available by the Joint Genome Institute of the U.S. Department of Energy (DOE-JGI), contains four valid candidate genes for BVMOs and that one of these genes encodes SAPMO. The SPAc-esterase was also identified.

MATERIALS AND METHODS

Growth conditions C. testosteroni KF-1 (DSM 14576) was isolated in our laboratory (Schleheck et al. 2004a). A phosphate-buffered mineral-salts medium (Thurnheer et al. 1986) supplemented with 3-(4-sulfophenyl)butyrate (6 mM) or succinate (15 mM) as the sole carbon source was used. Cultures in the 3-ml scale were incubated in glass tubes (Corning) in a roller and cultures in the 0.1- to 2-l scale in Erlenmeyer flasks on a shaker at 30°C in the dark. Cultures were inoculated (1%) with cultures pre-grown with the same substrate.

Chemicals Standard chemicals were obtained from Sigma-Aldrich, Fluka, or Merck. 4-sulfoacetophenone (4-acetylbenzenesulfonate) was purchased from ABCR (Karlsruhe, Germany) and 4-sulfophenyl acetate (1-phenol-4-sulfonate-acetate) from SYNCHEM (Felsberg-Altenburg, Germany), and biochemicals (NADH, NADPH, NAD+, and NADP+) from Biomol (Hamburg, Germany). Racemic 3-(4-sulfophenyl)butyrate was synthesized as described previously (Schleheck et al. 2004a). 60 CHAPTER 5

RNA-preparation and RT-PCR RNA preparation and reverse transcription (RT)-PCR was performed as described previously (Ruff et al. 2010) with the following modifications. After cells were grown in the appropriate selective medium (5 ml) and harvested in the mid-exponential growth phase (optical densitiy at

580 nm (OD580) ≈ 0.3), the cell pellets were stored at -20°C in RNAlater RNA stabilisation solution (Ambion, Applied Biosystems). Total RNA was prepared using the E.Z.N.A. Bacterial RNA Kit (Omega Bio-Tek) following the manufacturer’s instructions; the obtained RNA preparation (40 µl) was treated with RNase-free DNase (2 U for 30 min at 37°C) (Fermentas). For cDNA synthesis, the Moloney murine leukemia virus (MMLV) reverse transcriptase (Fermentas) was used following the manufacturer’s instructions; the reactions contained 0.2 µg total RNA and 20 pmol sequence-specific primer. The used primer sequences are provided in Table S1 in the supplemental material and were purchased from Microsynth (Balgach, Switzerland). PCRs (20-µl scale) were done using Taq DNA polymerase (Fermentas) and the manufacturer’s standard reaction mixture (including 2.5 mM MgCl2). As the template, cDNA from RT reactions (2 µl of RT reaction mixture), genomic DNA (4 ng DNA) for PCR-positive controls, or non-reverse-transcribed total RNA (2 µl) for the confirmation of absence of DNA impurities in the RNA preparations was used.

Preparation of cell extracts and purification of native enzymes C. testosteroni KF-1 cells were harvested in the exponential growth phase by centrifugation (15,000 x g for 20 min at 4°C) and stored frozen (-20°C). Cells were resuspended in 50 mM -1 Tris-H2SO4 buffer (pH 8.0 or pH 9.0) containing 0.03 mg ml DNase I (Sigma) and 2 mM

MgCl2, and disrupted by three to four passages through a French pressure cell (140 MPa at 4°C) (Newport Scientific, Inc./AMINCO, Silver Spring, MD). Whole cells and debris were removed by centrifugation (15,000 x g for 20 min at 4°C) to obtain crude extract, and membranes were removed by ultracentrifugation (60,000 x g for 30 min at 4°C) to obtain the soluble fraction; the membrane pellets were resuspended in 50 mM Tris-H2SO4 buffer (pH 8.0 or pH 9.0) to obtain the membrane fraction.

For purification of native SPAc-esterase, the salinity of the soluble fraction (in 50 mM Tris-

H2SO4 buffer, pH 9.0) was increased to 1.7 M ammonium sulfate through dropwise addition of ammonium sulfate solution (3.4 M) while stirring on ice. The precipitated proteins were collected (15,000 x g for 20 min at 4°C), and the supernatant was passed over a PD10 column

(Pharmacia) equilibrated with 50 mM Tris-H2SO4 buffer, pH 9.0, for desalting. The protein solution was then subject to anion-exchange chromatography on MonoQ HR (high-resolution)

10/10 column (Pharmacia) equilibrated with 50 mM Tris-H2SO4 buffer, pH 9.0, at a flow rate CHAPTER 5 61

-1 of 1 ml min . Bound proteins were eluted by using a linear Na2SO4 gradient (to 0.2 M in 45 min, and to 0.5 M in 10 min), and fractions (3 ml) were collected; the esterase activity eluted at about

80 mM Na2SO4 in two fractions. Active fractions were pooled, the salinity increased to 1.7 M ammonium sulfate (see above), and they were subjected to hydrophobic interaction chromatography on phenyl-Superose HR 10/10 column (Pharmacia) equilibrated with 1.7 M ammonium sulfate in 50 mM Tris-H2SO4 buffer, pH 8.5; the esterase activity eluted at about

150 mM Na2SO4 in two fractions, which were pooled and concentrated using a Vivaspin (10-kDa cutoff, Sartorius, Goettingen, Germany). Finally, gel filtration on a Superose 12

(HR 10/30; Pharmacia) was performed in 50 mM Tris-H2SO4 buffer, pH 9.0, including 150 mM sodium sulfate, at a flow rate of 0.4 ml min-1. Standard high-molecular weight proteins (aprotinin, RNase A, carbonic anhydrase, ovalbumin, conalbumin, and aldolase) were used to calibrate the column.

For separation of native SAPMO, the resuspension-buffer (50 mM Tris-H2SO4, pH 8.0) was supplemented with FAD and NADP+, each at 50 µM concentration, and 10% (v/v) glycerol. Anion-exchange chromatography on MonoQ and gel filtration were performed as described + above, when using 50 mM Tris-H2SO4 buffer, pH 8.0, supplemented by FAD and NADP (100 µM each).

Heterologous expression of the Comamonas testosteroni SAPMO gene in Escherichia coli and purification of the recombinant protein Chromosomal DNA was isolated by phenol-chloroform extraction and the gene (locus tag CtesDRAFT_PD5437) amplified by PCR using Phusion HF DNA Polymerase (Finnzymes) and the primer pair that is given in the supplemental material (see Table S1); the PCR conditions were 30 cycles of 20 s denaturation at 98°C, 20 s annealing at 58°C, and 90 s elongation at 72°C. The PCR product was separated by agarose gel electrophoresis, excised, and purified using the QIAquick gel extraction kit (Qiagen), followed by ligation into an N-terminally His6- tagged expression vector using the Champion pET100 directional TOPO expression kit (Invitrogen). OneShot TOP10 E. coli cells (Invitrogen) were transformed with the construct and the correct integration of the insert confirmed by sequencing (GATC-Biotech, Konstanz, Germany). For expression, BL21 Star (DE3) OneShot E. coli cells (Invitrogen) were transformed with the construct and grown at 37°C in LB medium containing 100 mg liter-1 ampicillin. At an OD580 ≈ 0.6, the culture was induced by addition of 0.5 mM isopropyl-β-D- thiogalactopyranoside. The cells were grown for additional 5 h at 20°C, harvested by centrifugation (15,000 x g for 20 min at 4°C) and stored frozen (-20°C). Cells were resuspended in buffer A (25 mM Tris-HCl pH 8.0, 50 µM FAD, 50 µM NADP+, 10% (vol/vol) glycerol) 62 CHAPTER 5 containing 0.03 mg ml-1 DNase I (Sigma), and disrupted by four passages through a precooled French pressure cell (140 MPa; Newport Scientific, Inc./AMINCO, Silver Spring, MD). Cell debris was removed by centrifugation (15,000 x g for 10 min at 4°C), and membranes by ultracentrifugation (60,000 x g for 1 h at 4°C). The soluble fraction was loaded on a Ni2+- chelating Agarose affinity column (1-ml column volume, Macherey-Nagel, Germany) preequilibrated with buffer A (see above), followed by a washing step using 20 mM imidazole in buffer A. The His-tagged protein was eluted using 200 mM imidazole in buffer A, concentrated using a Vivaspin (50-kDa cutoff; Sartorius, Germany), and stored at -20°C after the addition of glycerol to a 30% (vol/vol) final concentration.

Enzyme assays Activity of native and recombinant SAPMO was followed routinely as decrease of absorption of co-substrate NADPH at 365 nm (ε = 3.5 x 10³ M-1cm-1) (Bergmeyer 1975) to avoid interference with substrates 4-hydroxyacetophenone, 4-aminoacetophenone and 4-hydroxypropiophenone at 340 nm, or occasionally as substrate disappearance by revesed- phase high-pressure liquid chromatography (HPLC) with UV detection (HPLC-UV; see below) or substrate-dependent oxygen consumption with a Clark-type oxygen electrode (Schleheck et al. 2010). The standard reaction mixture (1 ml) contained 50 mM Tris-HCl (pH 8.0), 0.5 mM NADPH, 0.5 mM SAP, and SAPMO as present in cell extracts or in partially-purified preparations (2 to 50 µg total protein) or in form of the purified recombinant enzyme (13 µg total protein). The Km for NADPH was determined when its concentration was varied (from

0.005 to 0.3 mM) while keeping the SAP concentration constant (0.5 mM). The Kms for SAP and other carbonylic substrates (Table 1) were determined when their concentrations were varied (SAP, 0.01 to 1 mM; phenylacetone (PA), 0.01 to 0.75 mM; 4-aminoacetophenone (AAP), 0.01 to 4 mM; 4-nitroacetophenone (NAP), 0.01 to 1 mM; 4-hydroxyacetophenone (HAP), 0.01 to 1 mM; 4-hydroxypropiophenone (HPP), 0.05 to 5 mM; acetophenone (AP), 0.5 to 20 mM; and benzaldehyde (BA), 0.05 to 5 mM) while keeping the NADPH concentration constant (0.5 mM). All substrate stocks were dissolved in ethanol at concentrations such that final ethanol concentrations in the reaction mixtures did not exceed 2% (vol/vol). The activities were plotted using hyperbolic fit in Origin (Microcal Software, Inc.). The pH-optimum of SAPMO was determined when the pH of the reaction buffer was varied from pH 5.2 to 10.5 (Cleland 1982). The thermostability of SAPMO was determined as described for PAMO (Reetz and Wu 2009), with the enzyme incubated at different temperatures (22 to 47°C) for 1 h, followed by activity determination under routine conditions (see above). CHAPTER 5 63

SPAc-esterase was routinely spectrophotometrically measured as the increase of absorption of the reaction product 4-sulfophenol at 280 nm (ε = 0.9 mM-1 cm-1) or, occasionally as substrate disappearance and product formation by HPLC, as described previously (Schleheck et al. 2010).

Analytical methods SAP and other aromatic ketones tested (Table 1), and SPAc, NADPH, and NADP+ were analyzed by reversed-phase HPLC with UV-detection (HPLC-UV) using a Nucleosil C18 column (125 by 3 mm with a particle size of 5 µm, Macherey-Nagel, Germany) and a gradient system (mobile phase A, 20 mM potassium phosphate buffer, pH 2.2; eluent B, 100% methanol; with a flow rate of 0.5 ml min-1); the gradient program was 100% A for 3 min, to 35% B in 3 min, to 75% B in 11 min, 75% B for 6 min, to 100% A in 1 min, followed by reequilibration for 10 min. The retention times of the analytes were as follows: SAP, 8.1 min; SPAc, 8.4 min; 4 sulfophenol, 2.4 min; PA, 16.1 min; AAP, 10.2 min; HAP, 13.7 min; NAP, 14.8 min; HPP, 15.7 min; AP, 16.3 min; BA, 15.2 min; NADPH, 2.4 min; NADP+, 6.7 min. Acetate was determined by gas chromatography (Laue et al. 1997). Soluble protein was assayed by protein- dye binding (Bradford 1976). Denatured proteins were separated by 16% or 13% SDS-PAGE gels and stained with Coomassie brillant Blue R250 (Laemmli 1970). Stained protein bands were cut from the gel and subjected to peptide fingerprinting-mass spectrometry (PF-MS) at the Proteomics Facility of the University of Konstanz (www.proteomics-facility.uni- konstanz.de) to identify the corresponding genes; the MASCOT engine (Matrix Science, London, United Kingdom) was used to search against a local amino-acid sequence database of all annotated genes (DOE-JGI Integrated Microbial Genomes (IMG), 16 August 2011 version) and a local database of the translated (6 frames) nucleotide sequence, of the genome of Comamonas testosteroni KF-1.

RESULTS

BVMO candidate genes in the genome of C. testosteroni KF-1 SAPMO is strictly NADPH dependent (Schleheck et al. 2010), and four valid candidate genes for NADPH-dependent (type I) BVMOs (Figure 2A) were found in the unclosed, draft genome sequence of strain KF-1 (6.026 Mbp, 5,492 protein-coding genes; unpublished data) when prototype NADPH-dependent (type I) Baeyer-Villiger monooxygenase sequences (e.g. of HAPMO) were used for the search (IMG BLASTp); the candidates share up to 57% amino acid sequence identity with representatives of the type I Baeyer-Villiger monooxygenases (see Discussion; for sequence alignment see Figure S1 in the supplemental material). Three of the 64 CHAPTER 5 four predicted BVMO genes (BVMO1 to BVMO3 genes) are located next to a candidate gene for a corresponding esterase (alpha/beta-fold hydrolase), either directly downstream on the same strand (est1 and est2) or directly upstream in divergent orientation (est3) (Figure 2A). Notably, the BVMO2 and BVMO3 gene candidates are each clustered with predicted genes for acyl-CoA metabolism, and the BVMO3 candidate gene cluster appears to be framed by two Tn3 family transposase genes (Figure 2A) that share >99% nucleotide sequence identity (see Discussion). Finally, the BVMO1 candidate is located next to the protocatechuate meta-ring cleavage pathway operon (Pmd-operon) identified in a different C. testosteroni strain (Providenti et al. 2001), and the BVMO4 gene candidate is clustered with putative genes for arsenic resistance (see Figure 2A).

Transcriptional analysis of BVMO-candidate genes SAPMO is inducibly expressed (Schleheck et al. 2010), and reverse transcription-PCR (RT- PCR) was used to test whether any of the attributed candidate BVMO genes is inducibly transcribed during growth with 3-C4-SPC. The results (Figure 2B) suggested that candidate

BVMO3 gene was strongly induced during growth with 3-C4-SPC but not during growth with succinate, whereas no signals for the candidate BVMO1 and BVMO2 genes were detectable under any growth conditions tested; the BVMO4 gene candidate exhibited negligible transcription during both growth conditions tested. Furthermore, when tested by RT-PCR, a strong induction of esterase gene candidate est3 was indicated (Figure 2A) (data not shown) during growth with 3-C4-SPC but not during growth with succinate. Hence, we had first experimental support that the candidate BVMO3 and est3 genes could encode SAPMO and SPAc-esterase, respectively, in C. testosteroni KF-1.

CHAPTER 5 65

Figure 2. Illustration of clusters of genes in the C. testosteroni KF-1 genome that contain BVMO candidate genes (A), and analysis by reverse-transcription PCR of inducible transcription of the BVMO candidate genes in cells grown with 3-C4-SPC in comparison to their transcription in succinate-grown cells (B). (A) The four BVMO candidates are indicated, and their locus tag numbers, not including the prefix CtesDRAFT_PD, are shown. Colocalized esterase candidate genes (est1 to est3), as well as colocated genes for meta-ring cleavage dioxygenase (pmdAB) and degradation of protocatechuate, for acyl-CoA metabolism, for arsenic resistance, and for transposases of insertion (IS) elements (see the text), are also shown. (B) Agarose gel illustrating the strong PCR signal observed for the reverse transcript (cDNA) of the BVMO3 candidate gene specifically during growth with 3-C4-SPC. Also shown are positive controls for PCR with genomic DNA as template (PCR control), and the RT- PCR positive controls (+) when testing for cDNA of a constitutively expressed gene (CtesDRAFT_PD5114; succinyl-CoA synthetase alpha subunit). Length marker (M) in bp.

Partial purification, some characteristics, and identification of SAPMO

The SAPMO activity in cell extracts of 3-C4-SPC-grown C. testosteroni KF-1 appeared to be labile, as observed previously (Schleheck et al. 2010), but a supplementation of all buffers with glycerol, FAD, and NADP+ (van den Heuvel et al. 2005) increased the activity and the stability. Under these conditions, membrane-free soluble fraction exhibited a specific SAPMO activity of up to 0.12 µmol min-1 mg-1, about 10-fold higher than observed previously (Schleheck et al. 2010) and, hence, closer to the calculated in vivo activity of 0.29 µmol min-1 mg-1 (Schleheck et al. 2010). Furthermore, the enzyme could be stored frozen without significant loss of activity, whereas about 50% of the activity was lost after 1 day at 4°C, no matter it was kept under air or nitrogen. After a first purification step (anion-exchange chromatography, MonoQ), about 66 CHAPTER 5

50% of the total activity was lost, but the specific activity had increased 8-fold to 0.96 µmol min-1 mg-1; a second MonoQ step at a higher pH was unsuccessful (no activity) as well as a gel filtration step (no purification) (data not shown).

We used the partially purified, native SAPMO (MonoQ fraction) to determine apparent Km values for SAP and NADPH of 23 µM (± 2 µM) and 6.3 µM (± 1.2 µM) (means ± standard deviations), respectively, which confirmed a high affinity of the enzyme to the substrates (see also below); NADH could not replace NADPH as the cosubstrate (Schleheck et al. 2010). Furthermore, the activity of SAPMO with some structural analogues of SAP was tested (see below): 4-hydroxyacetophenone (HAP), 4-aminoacetophenone, and 4-nitroacetophenone appeared to be good substrates, whereas neither cyclohexanone nor acetone were accepted, and these substrates were not inhibitory, as judged by normal activity upon addition of SAP.

We anticipated a 60-kDa protein to be enriched in SAPMO active fractions compared to SAPMO inactive fractions when observed by denaturing SDS-PAGE, hence, a protein band representative of the monomer encoded by the BVMO3 candidate (predicted molecular mass, 59.8 kDa); the other three BVMO gene candidates encoded proteins with lower molecular masses (approximately 38 to 55 kDa). A valid band (at around 60 kDa) was observed, cut from the gel, and subjected to peptide fingerprint-mass spectrometry (PF-MS) (data not shown), which attributed locus-tag CtesDRAFT_PD5437 to it (score, >500, and coverage, >45%), i.e. the BVMO3 gene candidate (Figure 2A). Three other prominent protein bands visible in the active fraction at lower molecular weight (approximately 58, 52 and 46 kDa) were also identified by PF-MS, but the fingerprints did not match to any BVMO candidate genes (the locus tags identified were CtesDRAFT_PD5455, _PD3982, and _PD1824 for acetyl-CoA acetyltransferase, threonine synthase and isocitrate dehydrogenase, respectively).

Heterologous expression of the SAPMO candidate gene and characteristics of the purified enzyme The SAPMO gene (BVMO3) was cloned and overexpressed in E. coli as a soluble, N-terminally

His6-tagged protein that could be purified via nitrilotriacetic acid (NTA)-agarose affinity chromatography (see Material and Methods). SDS-PAGE (Figure 3) confirmed that a prominent protein band at approximately 60 kDa had appeared in cell extracts after isopropyl- β-D-thiogalactopyranoside (IPTG) induction and that a single protein band at approximately 60 kDa (with negligible impurities at around 30 kDa) was obtained from the affinity chromatography step. The purified recombinant protein of C. testosteroni catalyzed an NADPH-dependent oxygenase reaction with SAP, as assayed routinely as the photometrical decrease of absorption of the cosubstrate NADPH. Furthermore, the disappearance of SAP and CHAPTER 5 67

NADPH and the formation of SPAc and NADP+ during the reaction was confirmed by HPLC analysis of samples taken during the reaction (see below); SAP-specific, NADPH-dependent oxygen consumption was confirmed with an oxygen electrode (Schleheck et al. 2010; data not shown). The activity of the recombinant enzyme showed an apparent Km of 61.7 µM (and a -1 -1 Vmax of 2.8 µmol min mg ) for the reaction with SAP (Table 1); therefore, the Km value was slightly higher than determined for the native, semipurified enzyme (see above). As illustrated by the kinetic parameters shown in Table 1, the recombinant SAPMO of C. testosteroni also catalyzed the conversion of several other aromatic ketones, during which the uncoupling rate (NADPH-disappearance in the absence of a substrate) was negligible (0.03 µmol min-1 mg-1). Interestingly, the conversion of phenylacetone (phenyl-2-propanone) occurred at a rate (and affinity) that was similar to the conversion of SAP (Table 1), whereas specific activities were lower for the other substrates, decreasing in the order 4-aminoacetophenone, 4-hydroxyacetophenone (HAP), 4-nitroacetophenone, 4-hydroxypropiophenone, acetophenone, and benzaldehyde (Table 1). However, the highest substrate affinity was determined for the reaction with HAP (Km, 35.9 µM). Notably, progesterone, cyclohexanone, and acetone did not affect any measurable activity when tested under the conditions we used, and these substrates were not inhibitory, as judged by normal activity upon addition of SAP. The pH optimum of the SAPMO reaction, around pH 8, was determined to be broad. Furthermore, the temperature stability of SAPMO was determined as described for phenylacetone BVMO (PAMO) (Reetz and Wu 2009), however, loss of about half of the activity after 1 h at a temperature of 30°C compared to the control reaction with enzyme that had been stored at 4°C indicated that SAPMO of C. testosteroni is not heat stable. 68 CHAPTER 5

Figure 3. Analysis by SDS-PAGE of SAPMO (encoded by BVMO3 gene candidate) overexpression in E. coli and of the purification of the N-terminally His6-tagged SAPMO protein. Lane 1, whole cells prior to IPTG induction (approximately 60 µg total protein); lane 2, crude extract 5 h after induction (100 µg protein); lane 3, soluble protein fraction after ultracentrifugation (80 µg protein); lane 4, protein fraction obtained from Ni-NTA purification (30 µg protein); lane M, molecular mass markers (kDa).

CHAPTER 5 69

1

-

)

1

-

KF

NADPH

m

M

K

1

/

-

s

2.5 8.2 0.7 1.7

10.5 46.6 53.7 30.4

cat

3

k

substrate

(10

-

ance ance of co

)

1

-

cat

1.8 0.4 2.9 2.8 1.1 1.1 0.4 0.4

k

(s

Comamonas testosteroni Comamonas

b

Determined as disappear

m m

b

K

(µM)

52.0 ± 8.0 ± 52.0 9.6 ± 35.9

61.7 ± 11.5 ± 61.7

168.5 ± 23.7 ± 168.5 32.9 ± 134.5 173.0 ± 45.2 ± 173.0 90.6 ± 558.8 90.8 ± 240.8

)

1

-

mg

1

b

-

max

UV UV (data not shown).

-

V

1.7 ± 0.5 ± 1.7 0.1 ± 0.4 2.8 ± 0.8 ± 2.8 0.8 ± 2.7 0.3 ± 1.1 0.3 ± 1.1 0.1 ± 0.4 0.1 ± 0.4

(µmol min (µmol

structure

a

HClbuffer (pH 8.0, 50 at 25° mM) C.

-

Substrate specificity and kinetic parameters determined recombinant SAPMO for parameters kinetic determined from and specificity Substrate

substrate

acetophenone acetophenone benzaldehyde

phenylacetone phenylacetone

nitroacetophenone nitroacetophenone

sulfoacetophenone

aminoacetophenone aminoacetophenone

-

-

-

hydroxyacetophenone hydroxyacetophenone

4

4

mM) in Tris in mM)

hydroxypropiophenone hydroxypropiophenone

-

4

-

4

4

Substrate Substrate disappearance during the reactions was confirmed by HPLC

TABLE1. (0.5 a 70 CHAPTER 5

Purification, identification and characterization of SPAc esterase The SPAc esterase was highly active (1.5 µmol min-1 mg-1) (Table 2) in comparison to the calculated in vivo activity (0.3 µmol min-1 mg-1) (Schleheck et al. 2010). The enzyme was in the soluble fraction, and a three-step purification protocol (Table 2) was sufficient to obtain an essentially pure protein, as observed by SDS-PAGE (Figure 4, lane 3); a further slight increase in purity was achieved by gel-filtration (Figure 4, lane 4) to, finally, a 125-fold-purified homogenous protein with a yield of 12% (Table 2). The protein band was cut from the gel and analyzed by PF-MS, which identified locus tag CtesDRAFT_PD5438 (score, >800, and coverage, >60%) in the C. testosteroni KF-1 genome; thus, this is the esterase gene candidate est3 (Figure 2A). The predicted molecular mass of the gene product was 33.604 kDa, which matched the mass observed for the denatured protein by SDS-PAGE (Figure 4). Furthermore, a native molecular weight of about 30 to 40 kDa was indicated during the calibrated gel filtration-chromatography run; thus, a monomeric structure of the enzyme is very likely. The purified enzyme catalyzed the conversion of the substrate SPAc to the products 4-sulfophenol and acetate in a 1:1:1 stoichiometry (not shown), which confirmed our earlier observations made with crude extracts (Schleheck et al. 2010). An apparent Km value of 27 µM (± 6) for -1 -1 SPAc (Vmax = 255 µmol min mg ) was determined, showing a high affinity of the enzyme to its substrate. Finally, preincubation of the purified enzyme for 5 min at 30°C, 37°C, or 45°C did not result in a significant loss of activity (Rehdorf et al. 2012), and 25% of the initial activity remained after 5 min of incubation at 60°C (see Discussion). CHAPTER 5 71

Figure 4. SDS-PAGE analysis of the different stages of purification of SPAc esterase of C. testosteroni KF-1. Lane 1, soluble protein fraction after ultra-centrifugation; lane 2, active fraction after anion-exchange chromatography; lane M, molecular mass markers (with masses in kDa flanking the gel); lane 3, active fraction after hydrophobic-interaction chromatography; lane 4, active fraction after gel-filtration chromatography (from a different purification run).

Reconstitution of the reaction sequence from SAP via SPAc to 4-sulfophenol The two pure enzymes, SAPMO and SPAc esterase, were used to demonstrate quantitative NADPH-dependent formation of SPAc from SAP over time, followed by quantitative formation of 4-sulfophenol from SPAc and, hence, a stoichiometric conversion of SAP to 4-sulfophenol. Recombinant SAPMO was incubated with 0.5 mM SAP and 1.25 mM NADPH, and samples for HPLC-UV analysis were taken at intervals. After 30 minutes, purified esterase was added, and the reaction mixture was monitored further by HPLC-UV. The data obtained (Figure 5) confirmed quantitative conversion of SAP to 4-sulfophenol through SAPMO and SPAc-esterase, inclusive transient appearance of quantitative amounts of SPAc. Furthermore, the HPLC-UV data confirmed quantitative NADPH conversion to NADP+ during the SAPMO reaction, and a quantitative formation of acetate was confirmed by gas-chromatography in a sample taken after the SPAc-esterase reaction (data not shown). 72 CHAPTER 5

Figure 5. Reconstitution of the reaction sequence from SAP via SPAc to 4-sulfophenol using the purified SAPMO and SPAc esterase enzymes, as followed by HPLC-UV analysis of samples taken at intervals during the reactions. SAPMO (15 µg protein ml-1) and SPAc-esterase (0.7 µg protein ml-1) were added sequentially (arrows) to a reaction mixture that contained 0.5 mM SAP (■) and 1.25 mM NADPH (not shown), which led to formation of SPAc () and 4-sulfophenol (), respectively. For clarity of the illustration, the dataset showing a stoichiometric NADPH-to-NADP+ conversion during the first reaction (determined by HPLC-UV), was omitted in this graph.

DISCUSSION LAS is on the market and effectively degraded by microbes for over 50 (Sawyer and Ryckman 1957). It has also been known for many years that SPCs are intermediates in the degradation of LAS, but the microbial degradation pathway for short-chain SPCs resisted all attempts at elucidation (Dodgson and White 1983, Cain 1987, White and Russell 1994, Simoni et al. 1996) until we established that SAP, SPAc, and 4-sulfophenol (Figure 1) are intermediates in the

3-C4-SPC degradation pathway in C. testosteroni KF-1 and that a BVMO and an esterase must be involved (Schleheck et al. 2010). Now, the first two enzymes (and genes) in the 3-C4-SPC degradation pathway are known, a type I Baeyer-Villiger monooxygenase (EC 1.14.13.x) as SAPMO (locus tag CtesDRAFT_PD5437) and a carboxylesterase (EC 3.1.1.1) as SPAc esterase (locus tag CtesDRAFT_PD5438).

Research on BVMOs is progressing at an impressive pace in recent years (reviewed in Leisch et al. 2011), since an increasing number of BVMOs from various sources has been cloned and characterized, with several crystal structures available, for the aim of understanding their CHAPTER 5 73 complex reaction mechanisms. In addition, there is interest in exploiting their diverse substrate preferences and extraordinary enantio-, regio-, and chemoselectivities for biotechnological applications, as well as in altering their characteristics through directed evolution (reviewed in Leisch et al. 2011). In the present study, we revealed an example of how bacteria are exploiting BVMOs to access a novel carbon- and energy source for their growth; in this case, in

C. testosteroni KF-1 for the utilization of xenobiotic 3-C4-SPC derived from primary degradation of commercial LAS surfactants. Interestingly, it appeared that the identified SAPMO and SPAc esterase gene cluster (Figure 2A) could have been mobilized recently from a different location into the genome of strain KF-1 as part of a catabolic transposon (e.g. see Top and Springael 2003). First, the identified genes (together with other predicted catabolic genes, discussed below) are framed by two Tn3 family transposase genes (Figure 2A) with nearly identical sequences (99.9%) that, upon closer inspection (not shown), appear as complete insertion sequence (IS) elements of the IS1071 family (i.e. the 110-bp inverted repeats; 99.8% identical to archetypal IS1071 of C. testosteroni BR60 (Nakatsu et al. 1991)). IS1071 elements have frequently been identified in close proximity to various xenobiotic degradation genes in environmental bacteria (Nakatsu et al. 1991, Junker and Cook 1997, Boon et al. 2001, Ruff et al. 2010) and have been shown to transpose at high frequencies in C. testosteroni strains (Sota et al. 2006). Second, homologous gene clusters of SAPMO and SPAc esterase genes framed by IS1071 elements were not found in the other two C. testosteroni genomes available, those of strains S44 and CNB-2 (Ma et al. 2009, Xiong et al. 2011) (nor on plasmid pCNB1 of C. testosteroni CNB-1 (Ma et al. 2007)). Given the wide substrate range of BVMOs (Leisch et al. 2011), such as observed for SAPMO (Table 1), and given the anticipated wide substrate range of SPAc esterase (discussed below), it is easy to rationalize that such a multi-functional BVMO-esterase gene module mobilized via IS1071-mediated transposition would have been maintained in a bacterial genome if the genes became part of a novel biochemical pathway for exploiting novel growth substrates.

The sequence of SAPMO is typical of type I BVMOs, as it contains the BVMO fingerprint motif (FXGXXXHXXXWP) and two Rossmann fold motifs (GXGXXG) for FAD and NADPH binding (Fraaije et al. 2002); the SAPMO sequence does not include the N-terminal extension (120 amino acids) of HAPMO-like type I BVMO sequences (see the sequence alignment in Figure S1). SAPMO showed highest BLASTp hits to BVMO candidate genes encoded in Sorangium cellulosum 56 (sce4944, 60% identical amino acids), Rhodopseudomonas palustris BisB5 (RPD_1860, 58% identity) and, surprisingly, in our LAS-degrading Parvibaculum lavamentivorans DS-1T (Plav_0813; 58% identity) (Schleheck et al. 2011). With respect to 74 CHAPTER 5 cloned and well-characterized BVMOs, the highest homology of SAPMO was observed to phenylacetone BVMO (PAMO) (Fraaije et al. 2005) of Thermobifida fusca (57% identical amino acids) and steroid BVMO (STMO) (Morii et al. 1999) of Rhodococcus rhodochrous (55% identity). Cyclohexanone BVMO (CHMO) (Sheng et al. 2001) and HAPMO sequences (Kamerbeek et al. 2001, Rehdorf et al. 2009) appeared to be more distantly related to SAPMO (around 45 and 30% identity, respectively). The observed phylogenetic relationship of SAPMO with the characterized type I BVMO PAMO described above appeared to be directly reflected in the substrate-dependent SAPMO activities determined (Table 1). PAMO appears to prefer substrates bearing phenyl groups, whereas other carbonylic compounds such as cyclohexanone are not accepted. SAPMO converted also 4-amino- and 4-nitro-acetophenone with efficiencies similar to its conversion of as HAP; acetophenone, 4-hydroxypropiophenone, and benzaldehyde were less effectively converted; and cyclohexanone and acetone were not substrates (Kamerbeek et al. 2003, Bocola et al. 2005). Furthermore, SAPMO did not convert progesterone, an observation that is in accordance with the substrate preference observed for PAMO (Franceschini et al. 2012). Finally, SAPMO exhibited a broad pH optimum as typically observed for type I BVMOs and, in contrast to PAMO, was not heat-stable (Fraaije et al. 2005). Overall, we identified and characterized a novel member of the type I BVMO enzyme family that plays a key role in a degradation pathway for a xenobiotic compound in bacteria.

The SPAc esterase sequence showed highest BLASTp hits to sequences encoded in Burkholderia xenovorans LB400 (Bxe_A3606, 59% identical amino acids), Cupriavidus pinatubonensis JMP134 (Reut_B5462, 43% identity) and Burkholderia sp. CCGE1002 (BC1002_5819, 41% identity); the latter two homologs are, like the SPAc esterase gene, encoded in the immediate vicinity to a BVMO candidate gene. Furthermore, SPAc esterase showed 38% amino acid identity to HPAc esterase (ACA50464) in Pseudomonas fluorescence ACB, which is also coencoded with a corresponding BVMO gene, in this case HAPMO (Moonen et al. 2008b), as is the isofunctional, recently characterized (see below), carboxylester hydrolase in P. putida JD1 (termed PPE in Rehdorf et al. 2012), which shows 94% amino acid identity to HPAc esterase of strain ACB (Rehdorf et al. 2012) (the PPE sequence was not deposited in Genbank). The latter gene (for PPE-esterase) has been cloned and expressed and characterized as a purified enzyme (Rehdorf et al. 2012). PPE hydrolyzed a wide range of aryl esters, cyclic esters, and tertiary-alcohol esters with sterically more-demanding side-groups at the quaternary carbon atom, making this type of carboxylester hydrolase useful for biotechnological applications (Schmidt et al. 2009). Besides the alpha/beta-hydrolase fold and catalytic triad (Ser/Asp/His) including the classical pentapeptide (GDSAG) around the CHAPTER 5 75 catalytic-site serine (Bornscheuer 2002), the PPE esterase, SPAc esterase, as well as all of the other above-mentioned homologs, contain the additional GGGX motif (not shown). This motif has been recognized to represent an enlarged oxyanion-pocket in the enzyme, allowing sterically more-demanding substrates to enter the active site than is possible for esterases with the more widespread GX motif and smaller active site (Henke et al. 2002, Rehdorf et al. 2012). In this study, we had only limited amounts of purified, native SPAc-esterase available; however, it will be interesting to see in future work whether the enzyme also exhibits such a promiscuous behaviour to a wide range of acetyl esters. Notably, SPAc esterase appeared to be rather temperature insensitive in comparison to PPE esterase, which suffered 66% loss of activity after only 5 min at 37°C and was completely inactivated at 45°C (Rehdorf et al. 2012).

Access to a genome sequence and protein identification through PF-MS greatly facilitated this work, and this is particularly encouraging since questions remain regarding the nature of other important steps in the 3-C4-SPC pathways in C. testosteroni KF-1. First, how is the acetyl-side chain removed from 3-C4-SPC in strain KF-1 (Figure 1; upper pathway) to yield SAP? This reaction sequence has been postulated to involve 3-C4-SPC CoA-esters and reactions in analogy to short-chain fatty acid CoA oxidation, e.g. a beta-ketoacid lyase (discussed in Schleheck et al. 2010). The SAPMO and SPAc esterase genes are clustered with predicted genes for acyl- CoA metabolism (Figure 2A) (e.g., acyl-CoA synthetase, CoA-transferase, and acetyl-CoA acetyltransferase candidate genes), and we are currently exploring if these genes play a role in the 3-C4-SPC degradation pathway or not. Second, how is 4-sulfophenol converted to (desulfonated) central metabolites in strain KF-1? We found evidence for the involvement of 4-sulfocatechol as substrate for a desulfonative ortho-ring cleavage pathway (Schulz et al. 2000, Schleheck et al. 2004a, Schleheck et al. 2010), as has been characterized in other bacteria (Contzen et al. 2001, Halak et al. 2006, Halak et al. 2007); however, we have preliminary evidence to suggest that the pathway may also (or instead?) involve 2-hydroxyquinol (1,2,4- trihydroxybenzene) ortho-ring cleavage. Finally, the different biochemical strategies and pathways employed in other organisms that degrade other SPCs, e.g., for 4-C6-SPC in Delftia acidovorans SPH-1 (Schleheck et al. 2004a), remain completely unexplored, as are the enzymes of primary LAS degradation in P. lavamentivorans DS-1 (Figure 1), but genome sequences for these organisms have been made available (Genomes Online Database (GOLD) identification numbers Gc00683 and Gc00631) (Schleheck et al. 2011)). Hence, it is obvious that much more work lies ahead to completely uncover the metabolic pathways for SPCs and LAS in bacteria. 76 CHAPTER 5

ACKNOWLEDGMENTS We are grateful to Alasdair McLeod Cook and to Hans-Peter Kohler for their continuing support and helpful discussion, to DOE-JGI for making the C. testosteroni KF-1 genome available, and to several practical students for their help on individual experiments.

The work of M.W. was supported by the Konstanz Research School Chemical Biology (KoRS-CB), and the work of D.S. by a Deutsche Forschungsgemeinschaft (DFG) grant (grant SCHL 1936/1-1) and by the University of Konstanz and the Konstanz Young Scholar Fund (YSF). CHAPTER 5 77

SUPPLEMENTAL MATERIAL for manuscript submitted to Applied Environmental Microbiology (AEM02412-12)

Two Enzymes of a Complete Degradation Pathway for Linear Alkylbenzenesulfonate (LAS) Surfactants: 4-Sulfoacetophenone Baeyer-Villiger Monooxygenase and 4-Sulfophenylacetate Esterase in Comamonas testosteroni KF-1

MICHAEL WEISS1,2, KARIN DENGER1, THOMAS HUHN2,3, AND DAVID SCHLEHECK1,2 1Department of Biology, 2Konstanz Research School Chemical Biology, and 3Department of Chemistry of University of Konstanz, D-78457 Konstanz, Germany

Table S1. PCR primers product primer gene locus tag primer sequence (5’ -3’) length name (binding position [bp]) (bp) bvmo1-for CtesDRAFT_PD1901 (413–432) GCGCCGGCTATTACGACCAC 640 bvmo1-rev a CtesDRAFT_PD1901 (1052–1034) CCGCCGAACCGCTTGAGAT bvmo2-for CtesDRAFT_PD3136 (677-695) TGCAGCGCGAACCCACCTA 556 bvmo2-rev a CtesDRAFT_PD3136 (1232-1213 ) TGCTCGCCGCGTTCTATCAG bvmo3-for CtesDRAFT_PD5437 (51-71) CTTCGCCGGCCTCTACCAACT 544 bvmo3-rev a CtesDRAFT_PD5437 (594–577) GGCGATCACGGGCACCAT bvmo4-for CtesDRAFT_PD1705 (462-483) CGTATCTCCCGAGCCGTTCAAA 498 bvmo4-rev a CtesDRAFT_PD1705 (960–939) ACCATAGCCCACCAGCCACAAG sucD-for CtesDRAFT_PD5114 (118-137) GCAGGCGTGAACCCCAAGAA 451 sucD-rev a CtesDRAFT_PD5114 (568-550) CGC CAC CGA TAC CGA CAG C est3-for CtesDRAFT_PD5438 (88–106) AGCGGCCCGATGTTGATGG 437 est3-rev a CtesDRAFT_PD5438 (524–506) GCGGGCTTGGCGTTGTTGT TOPO-for b CtesDRAFT_PD5437 (1–20) CACCATGAGCAATTCAACCACCCG 1616 TOPO-rev c CtesDRAFT_PD5437 (1616-1596) TTTCAGGCCTTCGCAAAGCCC a primer used for reverse-transcription reaction. b the adaptor-sequence for directional TOPO-cloning is underlined in the primer sequence and the start codon in bold letters. c the stop codon is indicated in bold letters.

78 CHAPTER 5

GG GG GG GG

VGG L VGG VGG VGG VGG

DG KT GD SG ND TD

G MDGF A EEVG A A G G

K A T K K

------D E EA E E

LEA LEA LE L Y Y

V I VI

A II H RAF VI VV

TQ L F VLL V V F F

------

K S T P P

L KK L RS L V V

G G G G G G

GL EQFADRC NS RTN EL SQ QA QA

R K R K K

A LR L LR F F F

R RLR R R R

RH IH YF L L

G A LY IH

AA A A A AA AA

YQLH G L SV Y Y I I

I I M M

GL GL GL

SG SG SGL A SG SG

FA I M QA F I E V

G

C

------GXGXXG GAG GAG GAG

V IG

IIGAG IIGAG VV

VI L IV I LVV

L F VVIIGAG VVIIGAG

D D DV DV DV DVV K K

F Y Y F F

QV EV

MK

MSFH

------MAQH

-----

------SVVTAPDATTGTTS APRWHKDHVASGRD APRWHKDHVAAGRE

R R R R

------MAGQTTVDSRRQPPE

437] 437]

vmo4 [CtesDRAFT_PD1705] [CtesDRAFT_PD1705] vmo4

bvmo2 [CtesDRAFT_PD3136] [CtesDRAFT_PD3136] bvmo2 [CtesDRAFT_PD5437] bvmo3/SAPMO [CtesDRAFT_PD1901] bvmo1 [CtesDRAFT_PD3136] bvmo2 b [Q47PU3] PAMO [BAA24454] STMO MSAFNTTLPSLDYDDDTLREHLQGADIPTLLLTVAHLTGDLQILKPNWKPSIAMGVARSG 1 [AAK54073] HAPMO MRTYNTTLASLECDDETLRAHLQGADIPTLLLTVAHLTGDLNVLKPAWKPVVAMGVAHSG 1 [ACJ37423] HAPMO bvmo3/SAPMO_[CtesDRAFT_PD5 [CtesDRAFT_PD1901] bvmo1 [CtesDRAFT_PD1705] bvmo4 [Q47PU3] PAMO [BAA24454] STMO MDLETEAQVREFCLQRLIDFRDSGQPAPGRPTSDQLHILGTWLMGPVIEPYLPLIAEEAV 61 [AAK54073] HAPMO MTPEVEAEVREDCLQKLLAFRDSGLPVPARPSSEQLHALGTWLMGPVIEPYLPLVAEELV 61 [ACJ37423] HAPMO MSNSTT 1 bvmo3/SAPMO_[CtesDRAFT_PD5437] 1 [CtesDRAFT_PD1901] bvmo1 1 [CtesDRAFT_PD3136] bvmo2 1 [CtesDRAFT_PD1705] bvmo4 1 [Q47PU3] PAMO MNGQHP 1 [BAA24454] STMO TAEEDL 121 [AAK54073] HAPMO TAEEDP 121 [ACJ37423] HAPMO

CHAPTER 5 79

E I I I I I I G R T T Q S R G

KD TS KD D E E VE DE TP TP

KD YEF SG RD L M I K PA P PA PA

GLE APL VDFS R YEH YDH GLE T G GL V LPA LP V I LP LP

LR I LR LR P PGL PG PG S F IF A S L L

D GL G ET D GL GL D F F F Y W W

QIK KFK EW EVE NF AFD AI AI N QW N N N

P EN KY KY P P P P P P P A PT A PH A T T

Y DRF E DKF REH RDK DRF P V V L T I I S S

FL RT RT

-- VA V VA VA VA VA ------QR A A

N H ---- F A A Q F Q N M M L FQRT

P AETI R EH P GRGY P P P P VFQRT M VFQRT VFQR VF VF

H IN L FEG L IN MQ MQ L H R R T T RH F F K K

H Y Y Y Y Y Y Y Y SI SV SNA N N L V V L L L L

L R D R A L L L L L H H K E E D

V L RH QQ L FA FS L C YYD TWRN Q P Q Q G A ESTTWVTLEPPAF EQ

G VV V V G G G G G G G AA A - A AA A AA AA

T EI EI Q Q T T S A V V

A PEI D A PEI P P PEI A CA A A A A A EQ PD SR KQ EQ QT KT

A Q A G A G V C V F F A SLV A A

V T P SQ TQ W V V V V I LA LA V I I LA LA

V A PSRDH AP AP V V V V S Q I Q Q

V YAP Y YA F F YA YLIM FL A P SA AE P

K R Q R C C K Q R RYLIM RFL IP S IP IP IP

A E RVLP E E D D E A A A MV L IV IL V S F F

S GNAI S T D D S Q Q Q Q Q Q Q

W W SG W W W W WHS V AT V A I I T A

QYQ E I T KFD RIR EVS G A A

TQVDSNV TRVDSNV TA T SG S S

A N A A

-- -- G G

------S G

------GQ GXGXXG G

K E D D E IGTGSSG IGTGS V

G G SGSDEVQTFTTG G G L VIG V VIGTG

H SDELQQEWN KPWT EPNP LMPP SEEVLQEWN ARG ARS SPELEQEWN H H R V M A

F W I IAI

G SF SF RVGVIGTGSSG

QDQ Y YSF YSF F F ETEA EVT NAGATS DTN RTD LYRDS LCRDSAGQ Q

L G T C V L V V V L L GKRVG GKR GKR GKRV G GKRVGVIGTGSSG GKRVGVIGTG GKRV

YSYSF Y Y Y YS YS YSYSF Q Q Q T E K S T

WT T SSIAG W F F WT Y F WS WS

LD F HA W IE F F ID RWT RWT G T RWT RW RW D PLD

EA S L S S S S S S Q N Q Q VDF L I EP VDF VDF VD VD

D P N N G - - - P G - -

-- VE S V IE I I VE SRV AT EGL ST DR E E S E D D N

D D D PTE R Q H H H

TG S CD V I DE SSEDR E DE DE D VSP D D

W R R F W F WDE WDE WP WP W WPH WPH W W

E LT C C T R A VL H H D R QY R Q Q

A A A A A A A A

R A S D D PQH TGN T

YPG YPG YPG KS QS LVTSV S HSA H HSA HSA H H HSA HSA

VT RYPGAR T T T V V V VT VT V VT I I Y V F

E NRYPGARCD N NRYPGARCD N N NRYPGARCD R T R E E QM I M

T W TH W W S T T T T KWY QV NL Q NL D P PAF

D F W RD QHGWDSLRLFSPA Y RE RE Y N GHK H D N N G G G G

F W W W F Y F F F F G A T Q

TW TW TW TW TW FXGXXXHXXXW FKG FKG F FKG F F FKG F

54 V 54

[Q47PU3] 114 T 114 [Q47PU3]

bvmo2 [CtesDRAFT_PD3136] 96 Q 96 [CtesDRAFT_PD3136] bvmo2 46 [CtesDRAFT_PD5437] bvmo3/SAPMO 44 [CtesDRAFT_PD1901] bvmo1 40 [CtesDRAFT_PD3136] bvmo2 A 41 [CtesDRAFT_PD1705] bvmo4 [Q47PU3] PAMO 179 [AAK54073] HAPMO 179 [ACJ37423] HAPMO V 59 [BAA24454] STMO Q 106 [CtesDRAFT_PD5437] bvmo3/SAPMO R 98 [CtesDRAFT_PD1901] bvmo1 LIERP 92 [CtesDRAFT_PD1705] bvmo4 PAMO R 119 [BAA24454] STMO R 234 [AAK54073] HAPMO Q 234 [ACJ37423] HAPMO 159 [CtesDRAFT_PD5437] bvmo3/SAPMO 158 [CtesDRAFT_PD1901] bvmo1 147 [CtesDRAFT_PD3136] bvmo2 144 [CtesDRAFT_PD1705] bvmo4 167 [Q47PU3] PAMO 172 [BAA24454] STMO 290 [AAK54073] HAPMO 290 [ACJ37423] HAPMO

80 CHAPTER 5

G G G G I F V L I I V I

EG LG TN ID S N N D H G D E N N

D PN VPP D D D M M L I L I I

YSPD GAGG ------V V M T R R

R K R R C V I LS RP ES FK DK MP MP

LI L IV LILE L V L L L L L

IA Y F F

KR KR R KR KRIV KRIV KRI ------

C QKICQ ERYWQ AVSA AVSA T FGE T A G S R R K K

G FEFRW I L L V G G G G G G MTGA LTGA MTGS S S

E E E E F WQQ G F V V A TD PALQP A A A A

RA P P P P P P D NLK K R D D H H

R L Y HAI

-- RLK QIQ LVET RRAVYEERW ------

Y E E E E YPPT EDL G D TGF TGF TGF

H S D K GY EN K K DS DS ATG ATGF ATGF ATGF C ATGF ATGF

P DE RQHPEAMRAFLLG D DE EE P P P P P P P T L L L YG YG

Q S DVGYPPT DV C V T I I IF IW V

L VS VS VS V V L L L L L I IV IV I L IV IV IV

E QT LDP E E M V GKD R L V V A I I A S M L L

V V V V V DL E E E E D D D D D D D D

RQK A I G AL AL D D DRHFN Q A A R RN L A V L F F

L E E E E L V L L D E DL

VKA DI A EV A D D HE HE HE HE

AA F P P E AKAF E V

D DP DP DP S GHL VL S REY AHY T T

- G - -

T G DG DG G DG DG

T T S SE T

FLMANDKS ANVERFT PQSVGFL PQSVGFL T T T T

M THRYQGPKS SPHRPHPKS M R QL R R V CCV CCI

GV G G AM L V AQWA V

------G GI G G G GI GI GI

PTY TKV N S S R T K T

- P AE P P P

EAVT E NTPG ESGG ARSKIRAVVK IRNKIRNTVR WEEKIRAVVD ADR ADR EK DE R

R R R R F F F F F VT V LS IT M IT I

G Q S WYRVAM WYRA RFDG T V A

L L L L L TE AQ VG

IE I FAR

------PI P PIE PI PIE PIE

DT EA Q Q

------

T SNKHAAD PEG SAWMEPQ CAWMEPQ

V YAERRA RDLLVY ------YAERRR V L I I ------

N N Q N DVKAN T T G DTLSA DLRST R R

A S W KRYAEFREES A NP SH DRDANERVAE QD RD V I V

DAS K LLKPLNLPEDWYHEIMRRAFIARTD R R K K R TDPAANDTARA LV V L LV LV MI MI

L LR I LR L LR LR D HS H E S S

N QV EF TA DL EQ L E I Q R Q L

D A A A A A D D D D D

N N GV

QPDSE RDNV RDNV

------NRDNV REGKAS NR NRDNV NRDNV K K

F IQ Y L L

SEETL VP DPEFL DDATR T SM AR MF T T T

E Q QA D E E

PL PL PL PL IS IS

E PAFSEEQR A V YY FF YY VL YY YY W W

------N N N N ------

260 260 300 D 300

bvmo2 [CtesDRAFT_PD3136] [CtesDRAFT_PD3136] bvmo2 219 [CtesDRAFT_PD5437] bvmo3/SAPMO EH 217 [CtesDRAFT_PD1901] bvmo1 205 [CtesDRAFT_PD3136] bvmo2 VDGRVLFER 203 [CtesDRAFT_PD1705] bvmo4 227 [Q47PU3] PAMO 232 [BAA24454] STMO DLHEKISDSCKWLLAHVPHYS 349 [AAK54073] HAPMO DLHEKISDSCKWLLANLPNYS 349 [ACJ37423] HAPMO GFRMLRAFN 279 [CtesDRAFT_PD5437] bvmo3/SAPMO 267 [CtesDRAFT_PD1901] bvmo1 218 [CtesDRAFT_PD1705] bvmo4 GPDILAAYR 285 [Q47PU3] PAMO GVLFSKAFP 290 [BAA24454] STMO 405 [AAK54073] HAPMO 405 [ACJ37423] HAPMO 339 [CtesDRAFT_PD5437] bvmo3/SAPMO [CtesDRAFT_PD1901] bvmo1 292 [CtesDRAFT_PD3136] bvmo2 241 [CtesDRAFT_PD1705] bvmo4 344 [Q47PU3] PAMO 349 [BAA24454] STMO T 447 [AAK54073] HAPMO T 447 [ACJ37423] HAPMO

CHAPTER 5 81

W D Q D D D D Y Y F F F

CR SE A V V

LS V VT V

PNT W W YIS TA W W YI YI FMP FML FMP TQN TQN

V D D E D QPWRH

RV R RV RV RV RV

NG VC QWG HV TAS P P P G G

THVN M QHV L M FTAT K K R K K

E DL E

SE SE SE SE

----- HSI DT GVMRTAR VSI LH QF QF NMPG MFGQGD NVPG NIPG ------

I I I I A A A A

VV WTLRC ML MV

LDR N S SL TL N TV TI VG TG LG KNS KNS

S S A S S S

L LAN Y Y

SWY V V V V SWY

S VGNF SAL S NSWY NSWY NSWY NSWY

P P P P

------GLV GLV

S S S S S T T PVG PMTA SKV SRV

G T A G G N N Y Y F F

YFR

GP G GP GP GP GP GP GP

A N A T Y Y

I M I L M M

KGSVFGSGCS V F EWT F N C C ANQTL LRAWG LRAWG

A AYTQ Y F F A A G

F L F E WIEADNFNAGYVLRAQD EIADETL NRAEASLLNSA

---

PNM G ------

L Y FPNMF N S N DEGN DQAN --

ERLEA GFPNMF G GFPN G GFPN GFPN GFPNMF V L V

R A E P V D PQ P H R H ECR RV RV ------

D I L TA I Q -----

- M W G M NQ NQ

Y VSI V ISV L LS L MTV MTV WVD WVE WVE Y Y ------

G G K AG E S S

R KS M D D D E E

YLG Y Y RPS YLG YLG YLG Y EGFAKA IHHQTEELAA EGFVLT KGFAILEG QLG CLS

T S T T A A DALAYR Y Y

TA R R DY D DY DY DY

PR P PR PR

RPEALAQ ------EQAHE VLEKE TPEAVAD

A G G G DDA A A G VKTPVF IKAQVF AEK AAKG AESG EPT QAT

S A SD A A G E RPTVAAA E E E E V V V V V

L SDV A K KDDDA I V I M M

WA W WA WA W W EG PAASLQ SKPVWQ DE TE

K K T V V M C

VAG E E E E IQR HAE ------LTRS AAG HQS HQS II VL I II

G R EHA EGTR K K HD R G G G G G G R V QE Q E

L L V L R R R R R WQRTHS WHRTHE

A AL AL AL Y F Y Y F F

QT AT V V RT V L REQ SRSE FKN DAR LEG LEG

GEQA G FSVDDAPVDFRERI A N G G G TRAQ L M L L L

R G A G G R HM H

E T I LN V R RD RD E A AY AY RL RL

G G G G G G G AQ I I V V

LI LL LL I ------

503 T 503

[CtesDRAFT_PD1901] 350 G 350 [CtesDRAFT_PD1901]

bvmo2_[CtesDRAFT_PD3136] 405 405 bvmo2_[CtesDRAFT_PD3136] R 397 [CtesDRAFT_PD5437] bvmo3/SAPMO bvmo1 K 347 [CtesDRAFT_PD3136] bvmo2 VNDSK 296 [CtesDRAFT_PD1705] bvmo4 R 402 [Q47PU3] PAMO V 407 [BAA24454] STMO [AAK54073] HAPMO T 503 [ACJ37423] HAPMO 455 [CtesDRAFT_PD5437] bvmo3/SAPMO 405 bvmo1_[CtesDRAFT_PD1901] 344 bvmo4_[CtesDRAFT_PD1705] H 460 PAMO_[Q47PU3] A 465 STMO_[BAA24454] A 563 HAPMO_[AAK54073] A 563 HAPMO_[ACJ37423] VAGVPA 513 [CtesDRAFT_PD5437] bvmo3/SAPMO DLEYAEE 463 [CtesDRAFT_PD1901] bvmo1 PWSQSR 459 [CtesDRAFT_PD3136] bvmo2 [CtesDRAFT_PD1705] bvmo4 VGGFHR 518 [Q47PU3] PAMO LGGFGV 523 [BAA24454] STMO PFTAVE 619 [AAK54073] HAPMO PFNAVE 619 [ACJ37423] HAPMO

82 CHAPTER 5

Figure S1. Alignment of the amino-acid sequences of the four type-I BVMO candidate genes (bvmo1 – bvmo4) in the genome of C. testosteroni KF-1 and of four characterized type-I BVMOs for comparison. The characterized type-I BVMOs included in the alignment are phenylacetone monooxygenase of Thermobifida fusca YX (PAMO [Q47PU3]), steroid monooxygenase of Rhodococcus rhodochrous (STMO [BAA24454]), and 4-hydroxyacetophenone monooxygenase of Pseudomonas fluorescens ACB (HAPMO [AAK54073]) and of Pseudomonas putida JD1 (HAPMO [ACJ37423]). The characteristic ‘BVMO-fingerprint’ sequence motif (Fraaije et al. 2002) is indicated in red and the two Rossman-fold motifs for FAD and NADPH-binding are indicated in blue; other characteristic flavoprotein sequence motifs are also indicated (in black). Note that the sequence of the bvmo4 candidate contains two substitutions in the BVMO-fingerprint motif (FXGXXXHXXXY-V instead of FXGXXXHXXXW-(P/D)). CHAPTER 6 83

CHAPTER 6

Characterization of a steroid Baeyer-Villiger monooxygenase and esterase in Comamonas testosteroni KF-1

Michael Weiss, Ann-Katrin Felux, Dieter Spiteller & David Schleheck

Department of Biology and Research School Chemical Biology, University of Konstanz, Germany

84 CHAPTER 6

ABSTRACT Comamonas testosteroni is well-known for its ability to utilize steroids for growth, such as cholate, progesterone and testosterone, through a well-defined, inducible steroid-ring cleavage pathway. Here, we report the identification and functional characterization of a steroid Baeyer- Villiger monooxygenase (BVMO) and esterase in C. testosteroni KF-1 that catalyze an abstraction of the C17-acetyl side chain of progesterone in order to yield testosterone for an entry into the steroid-ring cleavage pathway. The known FAD- and NADPH-dependent (type-I family) BVMO in strain KF-1, 4-sulfoacetophenone BVMO, showed no activity with progesterone, and therefore, three other candidate genes for type I BVMOs in the strain KF-1 genome were also expressed recombinantly, and the proteins purified and tested. Only one of the candidates converted progesterone, hence, is a steroid BVMO. Only this BVMO gene was inducibly and strongly transcribed as mRNA during growth of strain KF-1 with progesterone, in a transcriptional unit (as polycistronic mRNA) with a steroid esterase candidate gene. The esterase candidate was also expressed recombinantly and purified. With progesterone as substrate, the steroid BVMO and esterase catalyzed an NADPH-dependent formation of testosterone acetate followed by formation of testosterone and acetate, respectively. With pregna-1,4-diene-3,20-dione (PDD) as substrate, an NADPH-dependent formation of boldenone acetate, followed by formation of boldenone and acetate, was observed. The latter reactions for a progesterone utilization, acetyl-abstraction from PDD after unsaturation of the steroid A-ring of progesterone, may be prevailing in strain KF-1, since PDD was observed as a transient metabolite in liquid cultures during growth with progesterone.

INTRODUCTION Baeyer-Villiger-type monooxygenases (BVMOs) (EC 1.14.13.x) of the family of flavin- dependent monooxygenases (van Berkel et al. 2006) are found in many different organisms for the catalysis of oxidation reactions including Baeyer-Villiger oxidations (Baeyer and Villiger 1899), in which aliphatic, cyclic, or aromatic ketones or aldehydes are converted to esters (or lactones) by insertion of an oxygen atom into the higher substituted carbon-carbon bond of the carbonylic substrate (Kamerbeek et al. 2003) (see Figure 1). Members of the BVMO families are remarkably versatile in their substrate range and, due to their extraordinary enantio-, regio- and chemoselectivities, are subject of intensive research in the recent years, e.g., for their biotechnological, ‘green chemistry’ application (Leisch et al. 2011, Balke et al. 2012). Members of the type-I family BVMOs, i.e., strictly NADPH-dependent and FAD-containing enzymes, which share highly conserved, distinctive type-I BVMO sequence motifs (Fraaije et al. 2002, Leisch et al. 2011, Riebel et al. 2012), represent the best characterized BVMOs thus CHAPTER 6 85 far. For example, cyclohexanone BVMOs (e.g., Chen et al. 1988, Yachnin et al. 2012) and 4-hydroxyacetophenone BVMOs (‘HAPMOs’) (Tanner and Hopper 2000, Rehdorf et al. 2009) in different bacterial strains, and a phenylacetone BVMO (‘PAMO’) in Thermobifida fusca (Malito et al. 2004) and a steroid BVMO (‘STMO’) of Rhodococcus rhodochrous (Franceschini et al. 2012), have been cloned, expressed and characterized; this yielded important insight into the structure of the catalytic site of type-I BVMOs and their highly sophisticated reaction mechanism (e.g., Franceschini et al. 2012, Yachnin et al. 2012).

A

CH3 H3C O O O OH

SAPMO SPAc esterase (PD5437) (PD5438) - - - SO3 SO3 SO3 4-sulfoacetophenone 4-sulfophenyl acetate 4-sulfophenol

B O O O OH O O H2O 2 + NADPH NADP+ acetate NAD NADH

STMO SEST hydroxysteroid (PD3136) (PD3135) dehydrogenase O O O O progesterone testosterone acetate testosterone androsta-3,17-dione

O

O O OH O

STMO SEST hydroxysteroid (PD3136) (PD3135) dehydrogenase O O O O pregna-1,4-diene-3,20-dione boldenone acetate boldenone androsta-1,4-diene-3,17-dione (PDD) (ADD)

9,10-seco- pathway Figure 1. Illustration of a known (A) and a predicted (B) Baeyer-Villiger monooxygenase (BVMO) and esterase reaction sequence in C. testosteroni KF-1 for an abstraction of acetyl-side chains from carbonylic growth substrates. (A) The known reaction of 4-sulfoacetophenone BVMO and 4-sulfophenyl acetate esterase to yield 4-sulfophenol and acetate from 4-sulfoacetophenone in a pathway for a catabolism of xenobiotic 3-(4-sulfophenyl)butyrate (for details on this pathway, see Weiss et al. 2012). (B) Postulated steroid BVMO and steroid esterase reactions in C. testosteroni KF-1 for a conversion of progesterone via testosterone acetate into testosterone, as implied from the ability of strain KF-1 to utilize progesterone for growth; the testosterone is catabolised via androsta-1,4-diene-3,17-dione through the well-defined steroid-ring cleavage pathway (9,10-seco- pathway), as commonly found in C. testosteroni species (see text). Continuous horizontal arrows indicate the enzyme reactions that were defined in this study, of progesterone/ pregna-1,4-diene-3,20-dione (PDD) Baeyer- Villiger monooxygenase (steroid BVMO, STMO) and testosterone acetate/boldenone acetate esterase (steroid esterase, SEST), and the identified genes are denoted by their locus tags (prefix CtesDRAFT_). The vertical dotted arrows indicate observed oxidations of the steroid A-rings (see text). Abbreviations used: SAPMO, 4-sulfoacetophenone Baeyer-Villiger monooxygenase; SPAc, 4-sulfophenyl acetate; STMO, steroid Baeyer- Villiger monooxygenase; SEST, steroid esterase; PDD, pregna-1,4-diene-3,20-dione; ADD, androsta-1,4-diene- 3,17-dione. 86 CHAPTER 6

For the heterotrophic bacteria that apply BVMO reactions as key steps of catabolic pathways, the esters formed by the BVMO are cleaved by an esterase, e.g., by a carboxylester hydrolase / alpha/beta-fold hydrolase-family enzyme (Moonen et al. 2008a, Weiss et al. 2012). For example in Comamonas testosteroni KF-1, the carbonylic substrate 4-sulfoacetophenone (SAP) (Figure 1A) is converted to 4-sulfophenyl acetate by a BVMO (‘SAPMO’), and the 4-sulfophenyl acetate is hydrolyzed by an esterase to yield 4-sulfophenol and acetate, which are further catabolized (Weiss et al. 2012). The transformation of a carbonylic substrate to an ester by a BVMO, followed by ester cleavage to yield an alcohol and carboxylic acid, is a widespread biochemical strategy in bacteria for an abstraction of substituents from various aromatic or aliphatic carbonylic growth substrates, or from carbonylic intermediates of growth substrates (e.g., Higson and Focht 1990, Kotani et al. 2007, Moonen et al. 2008a, Rehdorf et al. 2009, Weiss et al. 2012). The genes for such a BVMO-esterase reaction sequence can be encoded directly next to each other (Moonen et al. 2008a , Rehdorf et al. 2009, Weiss et al. 2012, Kotani et al. 2007, Van Beilen et al. 2003) and can represent transcriptional units (Weiss et al. 2012); such ‘BVMO-esterase gene pairs’ are highly abundant in the genomic and metagenomic databases.

The genome of C. testosteroni KF-1 (Weiss et al. 2013) contains, in total, four valid candidate genes for type-I BVMO enzymes, based on their sequence motives, and three of them are encoded in pair with an esterase candidate gene (Weiss et al. 2012). The physiological function of one of these BVMO-esterase gene pairs has recently been identified (Figure 1A): they encode a 4-sulfoacetophenone BVMO (‘SAPMO’, locus tag CtesDRAFT_ PD5437) and 4-sulfophenyl acetate esterase (CtesDRAFT_ PD5438), respectively, for a catabolic pathway for the xenobiotic substrate 3-(4-sulfophenyl)butyrate (see therefore Schleheck et al. 2010, Weiss et al. 2012). The 4-sulfoacetophenone BVMO showed a preference for phenylacetone (phenyl- 2-propanone) and other substituted acetophenones as substrates, however, the enzyme showed no activity with the carbonylic steroid progesterone (Weiss et al. 2012).

C. testosteroni species have long been known for their ability to completely degrade the steroid testosterone via its inducible steroid-ring cleavage pathway, which has been defined in great detail in the recent years (e.g., Horinouchi et al. 2001, Horinouchi et al. 2003a, Horinouchi et al. 2003b, Horinouchi et al. 2004a, Horinouchi et al. 2004b , Horinouchi et al. 2005). In this study, we explored specifically the BVMO and esterase reaction sequence, i.e., the enzymes and genes, which are implied in the ability of C. testosteroni to utilize also the carbonylic steroid progesterone for growth: an abstraction of the acetyl-side chain of progesterone to yield testosterone, which is further catabolized via the well-characterized steroid-ring cleavage CHAPTER 6 87 pathway (Figure 1B). No progesterone BVMO of C. testosteroni, and only one other steroid BVMO, the STMO of Rhodococcus rhodochrous (Franceschini et al. 2012), have been cloned and characterized thus far.

MATERIALS AND METHODS

Chemicals All standard chemicals were obtained from Sigma-Aldrich, Fluka or Merck, and 4-sulfoacetophenone (4-acetylbenzenesulfonate) was purchased from ABCR (Karlsruhe, Germany), boldenone acetate from Mondophor GmbH (Munich, Germany), and NADH, NADPH, NAD+ and NADP+ from Biomol (Hamburg, Germany).

Growth conditions C. testosteroni KF-1 (DSM 14576) was isolated in our lab (Schleheck et al. 2004a) and grown in a mineral-salts medium (Thurnheer et al. 1986) with succinate (15 mM) or steroids (0.05 – 0.15% (w/v)) as sole source of carbon and energy. Cultures in the 5 – 10 ml scale were incubated shaken (150 rpm) in Corning tubes, or in the 50 ml – 1 l scale in Erlenmeyer flasks, at 30°C in the dark. Cultures were inoculated with outgrown, homologous pre-cultures (1%). Due to precipitation of the steroid substrates in the culture fluid, the growth was monitored routinely as optical density (OD 580 nm) after removal of the steroid particles by filtration (folded filter paper MN 615 ¼, Macherey-Nagel, Germany), or by cell counting using a Helber counting chamber (Saaringia, Germany). Cells for enzyme assays, total RNA extraction, and proteomics, were harvested by centrifugation in the mid-exponential growth phase after the residue steroid precipitate had been separated from the cellular biomass by filtration (see above). Samples to determine the total growth yield (total cellular protein, see below) were taken in the stationary phase, and hence, unfiltered samples were used.

Analytical chemistry Steroids in the culture medium or in enzyme reactions were determined against authentic standards by gas chromatography-mass spectrometry (GC-MS) in ethyl acetate extracts; therefore, 250 µl of aqueous sample and 250 µl ethyl acetate were vortexed, centrifuged, and 0.2 µl of the ethyl acetate extract were injected into the GC-MS. A Trace GC Ultra - ISQ series chromatograph connected to a Single Quadrupole mass spectrometer (Thermo Scientific) fitted with a TG-5MS capillary column (15 m x 0.25 mm, 0.25 µm, Thermo Scientific) was used. The GC temperature program started at 100°C (1 min isotherm) followed by gradients (30°C min-1) to 200°C and (15°C min-1) to 290°C (5 min isotherm). The retention times were as follows: progesterone, 9.17 min; testosterone acetate, 8.93 min; testosterone, 8.39 min; pregna- 88 CHAPTER 6

1,4-diene-3,20-dione (PDD), 9.37 min; boldenone acetate, 8.70 min; boldenone, 8.56 min. The MS conditions were as follows: temperature of transfer line, 290°C; temperature of ion source, 200°C; liner temperature, 280°C; helium was the carrier gas (1.2 ml min-1); split ratio 12:1. Electron impact ionization (70 eV) was used to record mass spectra within the mass range 41- 500 amu.

Total cellular protein was determined following a Lowry-based protocol (Kennedy and Fewson 1968) and soluble protein by protein dye binding (Bradford 1976); bovine serum albumin was used as the standard.

Heterologous expression of BVMOs and the steroid esterase (SEST) in E. coli and purification of the recombinant proteins Chromosomal DNA of C. testosteroni KF-1 was isolated using the Illustra bacteria genomicPrep Mini Spin Kit (GE Healthcare). The genes were amplified by PCR using Phusion HF DNA Polymerase (Finnzymes); primer sequences are given in Table S1.

For CtesDRAFT_PD1705, the PCR conditions were: 90 s initial denaturation at 98°C; 30 cycles of 15 s denaturation at 98°C, 20 s annealing at 65°C, and 60 s elongation at 72°C. For CtesDRAFT_PD1901, the PCR conditions were: 90 s initial denaturation at 98°C; 30 cycles of 15 s denaturation at 98°C, 20 s annealing at 65°C, and 70 s elongation at 72°C. For CtesDRAFT_PD3136, the PCR conditions were: 90 s initial denaturation at 98°C; 30 cycles of 15 s denaturation at 98°C, 20 s annealing at 60°C, and 80 s elongation at 72°C. The PCR conditions for CtesDRAFT_PD3135 were: 90 s initial denaturation of at 98°C; 30 cycles of 15 s denaturation at 98°C, 20 s annealing at 62.8°C, and 45 s elongation at 72°C. The PCR products were purified (MinElute PCR Purification Kit, Qiagen) and ligated into pET100, an N-terminal

His6-tag expression vector, using the Champion pET100 Directional TOPO Expression Kit and OneShot TOP10 E. coli (Invitrogen); correct constructs were confirmed by sequencing (GATC- Biotech, Konstanz). For expression, of CtesDRAFT_PD1705, CtesDRAFT_PD3136, and CtesDRAFT_PD3135, BL21 Star (DE3) OneShot E. coli (Invitrogen) cells were transformed -1 with the construct and grown in LB medium (100 mg l ampicillin) at 37°C. At OD580nm ≈ 0.7, the cultures were induced (0.5 mM Isopropyl β-D-1-thiogalactopyranoside (IPTG)) and grown for additional 5 h at 20°C; cells were harvested by centrifugation (15,000 x g, 20 min, 4°C) and stored at -20°C. For the expression of CtesDRAFT_PD1901, BL21 Star (DE3) OneShot E. coli (Invitrogen) cells were also transformed with the construct, but the LB medium (containing 100 mg l-1 ampicillin) contained also 2% ethanol, and the expression was carried out at 18°C for approximately 36 h without additional IPTG induction. The cells were harvested by centrifugation (15,000 x g, 20 min, 4°C) and stored at -20°C. CHAPTER 6 89

Preparation of cell extracts and purification of recombinant proteins C. testosteroni KF-1 cells grown with either progesterone or succinate were harvested in the mid-exponential growth phase, centrifuged, and washed with buffer A (20 mM Tris-HCl pH 8.0). The cell pellets were resuspended in buffer A containing 50 µg ml-1 DNase I and 12.5 µg ml-1 RNase. E. coli cells from the heterologous expression of CtesDRAFT_PD1705, CtesDRAFT_PD1901, and CtesDRAFT_PD3136 (STMO) were resuspended in buffer B (25 mM Tris-HCl pH 8.0, 50 µM FAD, 50 µM NADP+, 10% (v/v) glycerol), and cells from the heterologous expression of CtesDRAFT_PD3135 (SEST) in buffer C (25 mM Tris-HCl pH 8.0, 10% (v/v) glycerol, and 100 mM KCl), each containing 30 µg ml-1 DNase I. All cell suspensions were disrupted by four passages through a chilled French pressure cell (140 MPa) (Newport Scientific, Inc./AMINCO, Silver Spring). Cell debris was removed by centrifugation (15,000 x g, 10 min, 4°C) followed by ultracentrifugation (100,000 x g, 1 h, 4°C) in order to obtain the membrane pellet and the soluble protein fraction. Soluble fractions from heterologous expressions were loaded onto a pre-equilibrated (with buffer B or C) Ni2+- chelating agarose affinity column (Macherey-Nagel). After washing with buffer (B or C) containing 30 mM imidazole, the His-tagged proteins were eluted with buffer (B or C) containing 250 mM imidazole. The eluted protein was concentrated using Vivaspin (30 kDa cut-off, Sartorius, Göttingen), glycerol was added (to a final concentration of 30% (v/v)), and the preparation was stored at -20°C.

Enzyme assays The activity of native STMO in crude extract from either progesterone- or succinate-grown cells was followed discontinuously by GC-MS, when from reactions with 500 µg protein ml-1 in 50 mM Tris-HCl (pH 8.0; stirred) in the presence of 25 µM progesterone and 0.5 mM NADPH samples were taken at intervals for ethyl acetate extraction prior to GC-MS (see above). The activity of native testosterone acetate esterase in crude extracts from either progesterone- or succinate-grown cells was also followed discontinuously by GC-MS, when reactions with 100 µg protein ml-1 in 50 mM MOPS-NaOH (pH 7.0; stirred) in the presence of 100 µM testosterone acetate were sampled at intervals, the samples extracted with ethyl acetate, and the extracts analyzed.

The activity of the recombinant, purified BVMOs were routinely followed as decrease of the absorption of the co-substrate NADPH at 365 nm (ε = 3.5 x 10³ M-1cm-1; Bergmeyer 1975); the reaction contained 50 mM Tris-HCl (pH 8.0), 0.5 mM NADPH, 1 µM to 400 µM progesterone or PDD, 0.01 to 5 mM 4-hydroxyacetophenone, 0.01 to 1 mM cyclohexanone, 0.01 to 4 mM 4-sulfoacetophenone, 0.1 to 5 mM acetone, 0.005 to 5 mM 2-decanone, or 0.1 to 10 mM 90 CHAPTER 6 phenylacetone, and typically between 8 and 24 µg recombinant BVMO per ml. For the reaction with ethionamide, the formation of its oxide product was measured at 400 nm (ε = 1.0 x 10³ M-1cm-1; Fraaije et al. 2004). Ethionamide concentrations in the reaction mixture ranged from 0.01 to 5 mM. The activities determined were plotted using hyperbolic fitting in Origin (Microcal Software, Inc.).

The activity of recombinant, purified SEST was examined discontinuously by GC-MS (see above); the reaction mixture contained 50 mM MOPS-NaOH (pH 7.0), 100 µM testosterone acetate or boldenone acetate, and 5 µg SEST ml-1. In order to confirm also the release of acetate during the reactions, 1-ml samples were taken at intervals and analyzed using an acetic acid kit (NZYTech, Portugal). The recombinant SEST was also tested with 4-sulfophenyl acetate (SPAc) as substrate; the reaction contained 0.5 mM SPAc and 10 µg SEST ml-1, for which samples were taken at intervals and analyzed by reversed-phase HPLC as described previously (Weiss et al. 2012).

For in vitro reconstitution of the reaction sequences from progesterone via testosterone acetate to testosterone and actetate, or from PDD via boldenone acetate to boldenone and acetate, the recombinant purified STMO (15 µg ml-1), and after 30 min the recombinant purified SEST (15 µg ml-1), were added sequentially to stirred reactions that contained 50 mM MOPS-NaOH (pH 7.0), 0.5 mM NADPH, and 20 µM progesterone or PDD. The reactions were analyzed discontinuously by GC-MS and by the acetic acid determination kit (see above).

RNA-preparation and reverse-transcription (RT) PCR

Cells were harvested at OD580nm ≈ 0.4 and the cell pellets were stored at –20°C in RNAlater RNA stabilization solution (Ambion Biosystems). Total RNA was prepared using the E.Z.N.A. bacterial RNA kit (Omega Bio-Tek) following the manufacturer´s instructions; the obtained RNA preparation was treated with DNase I (100 U, 30 min, 37°C) (Fermentas/Thermo Scientific). Reverse transcription for cDNA synthesis (20 µl-scale) was performed using Maxima reverse transcriptase (100 U, 60 min, 50°C) (Fermentas/Thermo Scientific) with 200 ng of total RNA and 0.5 µM of the sequence-specific reverse primer (see Table S1). The cDNA (2 µl) was used as template in PCR reactions (20-µl scale) using Taq DNA polymerase (Thermo Scientific). The following PCR program was used: Initial denaturation, 3 min, 95°C; 10 cycles of 30 s denaturation at 95°C, 30 s annealing at 60°C - 0.5°C per cycle, and 40 s elongation at 72°C; 15 cycles of 30 s denaturation at 95°C, 30 s annealing at 55°C, and 40 s elongation at 72°C. For positive controls, genomic DNA of strain KF-1 was used (10 ng). To confirm the absence of DNA contamination in the RNA preparations, the non-reverse transcribed total RNA preparation (2 µl) served as template. The cDNA was also used as template for a PCR assay to CHAPTER 6 91 test for polycistronically transcribed mRNA (see the Results). The PCR conditions using Taq DNA polymerase were: Initial denaturation, 95°C, 3 min; 10 cycles of 30 s denaturation at 95°C, 30 s annealing at 60°C - 0.5°C per cycle, and 40 s elongation at 72°C; 20 cycles of 30 s denaturation at 95°C, 30 s annealing at 55°C, and 40 s elongation at 72°C. All primers used were purchased from Microsynth (Balgach, Switzerland).

Protein gel electrophoresis and proteomics Proteins in crude extract of C. testosteroni KF-1, or proteins obtained from the heterologous expressions, were analyzed on 12% SDS-PAGE gels with Coomassie brilliant blue R-250 staining (Laemmli 1970). Proteins in the soluble fraction of C. testosteroni KF-1 were analyzed by two dimensional gel electrophoresis (2D-PAGE; IEF/SDS-PAGE) as described previously (Schmidt et al. 2013) with the following modifications: An IEF strip of the range pH 4-7 was used; one mg of total protein was used; the IEF separation involved 30,000 Vh. Protein bands or spots of interest were excised and analyzed by peptide fingerprinting-mass spectrometry (PF- MS) at the Proteomics Facility of the University of Konstanz. The MASCOT engine (Matrix Science, London, UK) was used to search against an amino-acid sequence database of all annotated C. testosteroni KF-1 genes (IMG version 2011-08-16); the parameters for searching and scoring were set as described previously (Schmidt et al. 2013).

RESULTS

Physiology of growth of C. testosteroni KF-1 with progesterone and identification of a transient metabolite, pregna-1,4-diene-3,20-dione, in the culture fluid In a detailed growth experiment with 1.5 mM progesterone as sole source of carbon and energy (Figure 2A), strain KF-1 grew exponentially (µ = 0.6 h-1) and the culture reached stationary phase at about 40 h, at which time progesterone had completely been utilized, as confirmed by GC-MS. The molar growth yield was 4.9 g protein per mol of carbon, a value which indicated quantitative incorporation of progesterone-carbon into biomass (Cook et al. 1983, Cook 1987). Further, a novel peak was observed in the GC-chromatograms (not shown) and indicated that a degradation intermediate was excreted into the culture medium in a first growth phase, and in a later phase completely utilized: The transient metabolite was pregna-1,4-diene-3,20-dione (PDD) (Figure 2B inset), as identified by an identical GC separation and MS fragmentation pattern in comparison to commercially available, authentic PDD (not shown). Hence, it appeared that the first reaction on progesterone in C. testosteroni KF-1 involves also a dehydrogenation of the A-ring of progesterone to PDD, prior to a steroid BVMO and esterase 92 CHAPTER 6 reaction (Figure 1B), thus, a 3-oxosteroid delta 1-dehydrogenase activity (EC 1.3.99.4) (see the Discussion).

1.5 A 1.5

1.2 1.2

0.9 0.9

580

OD 0.6 0.6

0.3 0.3

progesterone (mM)

0 0 B O 0.2

O 0.1

0

pregna-1,4-diene-3,20-dione (mM) 10 20 30 40 50 60 70 80 time (hours)

Figure 2. Growth of Comamonas testosteroni KF-1 in a liquid culture with progesterone as the sole source of carbon and energy (A) and of the observed transient appearance of a degradation intermediate in the culture fluid (B). (A) The mineral-salts medium contained 1.5 mM progesterone (0.05% w/v) () and substrate utilization was followed by GC-MS analysis, and growth () as optical density (OD 580 nm). (B) In a first growth phase, a degradation intermediate () accumulated in the culture medium that, in a second phase, was completely utilized; the intermediate was identified as pregna-1,4-diene-3,20-dione (PDD; inset).

Progesterone and testosterone acetate conversion in cell-free extracts of progesterone- grown cells Evidence for activity of an inducible, strictly NADPH-dependent steroid BVMO (STMO) in C. testosteroni KF-1 was obtained when the reaction was followed as disappearance of progesterone by GC-MS, and when cell-free extract of progesterone-grown cells was used, but not with extract of succinate-grown cells (see Supplemental material, Figure S1A). When we tried to follow the BVMO reaction as progesterone- and NADPH-dependent oxygen consumption in an oxygen electrode, or as progesterone-dependent NADPH conversion in a spectrophotometer, no significant activities were detectable in comparison to the high background activities observed (not shown); these observations are in contrast to the well- detectable activities (in all three assays) of the first BVMO that we characterized in CHAPTER 6 93

C. testosteroni KF-1, of 4-sulfoacetophenone BVMO (Schleheck et al. 2010, Weiss et al. 2012). Nevertheless, the STMO could be assayed in cell-free extracts, discontinuously by GC-MS, and therefore, we expected that also a recombinantly produced STMO candidate enzyme (from a BVMO candidate gene) could be assayed under these conditions (see below).

A testosterone acetate esterase activity was also progesterone-inducibly expressed, as confirmed when the reaction was followed as substrate disappearance by GC-MS with extract of progesterone-grown cells in comparison to succinate-grown cells (see the Supplemental file, Figure S1B). Two novel peaks in the GC chromatograms (not shown) suggested the formation of testosterone and boldenone during the reaction, as identified by GC-MS in comparison to authentic standards. Hence, also these observation implied the activity of a steroid A-ring dehydrogenase (see Figure 1B and the Discussion)

BVMO candidate genes, their heterologous expression and characterization of the recombinant proteins The genome of C. testosteroni KF-1 contains, in total, four candidate genes for BVMOs, i.e., for NADPH-dependent type-I BVMO (genes bvmo1 – bvmo4) (Weiss et al. 2012), and three of these are co-located with a candidate esterase gene (bvmo1 - bvmo3). One encodes 4-sulfoacetophenone BVMO (gene bvmo3), is inducibly transcribed with the co-located 4-sulfophenyl acetate esterase gene (Weiss et al. 2012), and the heterologously over-expressed, purified BVMO3 enzyme converted 4-sulfoacetophenone, other substituted acetophenones and phenylacetone, but not progesterone (Weiss et al. 2012). The three other candidate genes for type-I BVMOs in C. testosteroni KF-1 were also cloned and over-expressed in E. coli, the His- tagged proteins purified (Figure S2), and the purified proteins tested for an NADPH-dependent activity with progesterone as substrate. Only one of the candidate enzymes, BVMO2 (locus tag CtesDRAFT_PD3136), showed activity with progesterone: The disappearance of progesterone, concomitant to a formation of testosterone acetate, and a NADPH-dependent disappearance of PDD concomitant to a formation of boldenone acetate (see Figure 1B), was confirmed by GC- MS (see below).

The reaction of recombinant BVMO2 could be followed also spectrophotometrically by the conversion of NADPH; the NADPH disappearance in the absence of steroid (uncoupling rate) was 0.035 µmol min-1 mg-1. The highest specific activities, each at around 0.7 µmol min-1 mg-1, were observed with 20 µM progesterone and PDD as substrates, but these activities decreased with higher substrate concentrations (Figure 3), which was attributed to steroid-solubility limitation (Franceschini et al. 2012) or substrate inhibition effects; therefore, the kinetic parameters for the reaction with progesterone and PDD could not be determined. The enzyme 94 CHAPTER 6 showed also activity with phenylacetone (phenyl-2-propanone), but not with

4-sulfoacetophenone, 4-hydroxyacetophenone, cyclohexanone, or acetone when tested; the Km -1 for the reaction with phenylacetone was 182 ± 38 µM, and Vmax was 0.73 ± 0.03 µmol min mg-1.

0.8

)

-1 0.7

mg

-1 0.6

0.5

0.4

0.3

0.2

specific activity (µmol min 0.1

0.0 0.1 0.2 0.3 0.4

substrate concentration (mM]

Figure 3. Specific activities of recombinant STMO determined for progesterone () and PDD () as substrates. Note that the activities did not follow a Michaelis-Menten kinetic (see text).

The protein expressed from candidate gene bvmo1 (CtesDRAFT_PD1901) showed no activity with progesterone, PDD, 4-sulfoacetophenone, 4-hydroxyacetophenone, cyclohexanone, and acetone as substrates. However, the enzyme was active with phenylacetone (Km, 2.2 ± 0.5 mM -1 -1 and Vmax, 0.85 ± 0.07 µmol min mg ). Interestingly, the BVMO1 sequence in a phylogenetic tree (Figure. S6) clustered together with a BVMO from Pseudomonas putida KT2440 and a BVMO from Mycobacterium tuberculosis. Both genes were already cloned and characterized (Fraaije et al. 2004, Rehdorf et al. 2007) and showed no, or hardly any, activity with most of the aromatic or cyclic ketones tested, except for aliphatic ketones (Fraaije et al. 2004, Rehdorf et al. 2007) and the antitubercular prodrug ethionamide (Fraaije et al. 2004). Correspondingly, recombinant BVMO1 showed activity with 2-decanone and ethionamide as substrates (Km, 29.7 -1 -1 ± 8.9 µM and Vmax, 1.89 ± 0.11 µmol min mg for 2-decanone, and Km, 5.31 ± 1.21 mM and -1 -1 Vmax, 0.93 ± 0.13 µmol min mg for ethionamide). Notably, also the STMO of C. testosteroni CHAPTER 6 95

KF-1 (BVMO2) was active with 2-decanone as substrate when tested (Km, 0.84 ± 0.55 mM and -1 -1 Vmax, 0.36 ± 0.08 µmol min mg ), however there was no significant activity measurable with ethionamide as substrate. The BVMO3 (‘SAPMO’, Figure 1A; see also Weiss et al. 2012) was -1 also tested positive with 2-decanone (Km, 76.31 ± 14.43 µM and Vmax, 2.13 ± 0.09 µmol min -1 mg ), as well as with ethionamide as substrate (Km, 89.71 ± 15.00 µM and Vmax, 1.76 ± 0.07 µmol min-1 mg-1).

Finally, the recombinant protein expressed from candidate gene bvmo4 (CtesDRAFT_PD1705) was, in our hands, only active with the substrates 2-decanone (Km 1.84 ± 1.25 mM and Vmax, -1 -1 0.14 ± 0.04 µmol min mg ) and ethionamide (Km, 1.52 ± 0.48 mM and Vmax, 0.39 ± 0.05 µmol min-1 mg-1).

Transcriptional and proteomic analyses STMO is inducibly expressed in strain KF-1, as observed with cell-free extracts of progesterone-grown strain KF-1 cells (see above), and therefore, the transcription of the four BVMO candidate genes in progesterone-grown cells in comparison to succinate-grown cells was analyzed by reverse-transcription PCR (RT-PCR) (Weiss et al. 2012). The results (Figure 4) confirmed that the bvmo2 (STMO) candidate was strongly transcribed during growth with progesterone, but not during growth with succinate, whereas no significant transcription of the candidates bvmo1, bvmo3 and bvmo4 was detectable under the growth conditions tested. Hence, we had experimental support that the candidate gene bvmo2 encodes an STMO (above), and is specifically and strongly induced during growth with progesterone in C. testosteroni KF-1 (Figure 4). Further, the STMO gene is located in a three-gene cluster (locus tags CtesDRAFT_PD3135-37) together with an esterase (carboxylester-hydrolase) candidate gene downstream (PD3135) and a short-chain dehydrogenase/reductase (SDR) gene upstream (PD3137), each on the same DNA strand (see Figure S3A). The results of a RT-PCR experiment that mapped cDNA spanning over the whole PD3135-37 locus (Figure S3AB) suggested that the STMO, esterase and SDR gene are inducibly co-transcribed as a polycistronic mRNA during growth with progesterone. 96 CHAPTER 6

BVMO candidates

Figure 4. Reverse transcription-PCR (RT-PCR) analysis for an inducible transcription of the four BVMO candidate genes in C. testosteroni KF-1. The agarose gel shows the strong PCR signal obtained for the reverse transcribed RNA (cDNA) of the BVMO2 candidate gene specifically for progesterone-grown strain KF-1 cells. Genomic DNA of strain KF-1 was used as template for PCR positive controls, and for RT-PCR positive controls (+) the cDNA generated of a constitutively expressed gene (succinyl-CoA synthetase alpha subunit gene; Weiss et al. 2012 ). The length marker (M) is indicated in bp. The lane numbering 1-4 refers to BVMO candidate genes 1-4, respectively.

A proteomics approach (the data are shown and described in the Supplemental file) yielded significant hits for the esterase (PD3135) and SDR locus (PD3137) specifically for progesterone-grown cells in comparison to succinate-grown cells and, thus, further evidence for a progesterone-inducible expression of the three-gene locus PD3135-37; no significant hits could yet be detected for the bvmo2 locus by proteomics. In addition, significant hits for several genes of the well-known steroid-ring cleavage pathway, that is, in the steroid gene hot spot PD3640–3746 in strain KF-1 (Horinouchi et al. 2012, Weiss et al. 2013), and specifically for progesterone-grown cells, could be obtained by proteomics (see the Supplemental file), hence, also the testosterone/boldenone generated from progesterone is metabolized in strain KF-1 most likely via the ‘normal’ steroid-ring cleavage pathway route.

Heterologous expression of the steroid esterase candidate and characterization of the recombinant enzyme The steroid esterase candidate (‘SEST’; PD3135) was also cloned and over-expressed in E. coli (Figure S2). The protein catalyzed the hydrolysis of testosterone acetate to testosterone and acetate, as confirmed in a discontinuous enzyme assay (Figure 5); the enzyme hydrolyzed also boldenone acetate (see below), but not 4-sulfophenyl acetate, as determined by HPLC. CHAPTER 6 97

100

80

60

40

concentration (µM) 20

0

0 3 6 9 12 15 18 incubation time (min)

Figure 5. Discontinuous enzyme assay demonstrating the quantitative conversion of testosterone acetate () to testosterone () and acetate () by the recombinant steroid esterase. The reaction was started by the addition of esterase (15 µg protein ml-1) and followed by GC-MS analysis of samples taken at intervals.

In vitro reconstitution of the steroid BVMO and esterase reaction sequence The purified recombinant STMO (15 µg ml-1) was incubated with 20 µM progesterone and 0.5 mM NADPH, and after 30 min, the purified recombinant esterase was added (15 µg ml-1); samples were taken at intervals for GC-MS analysis throughout the reactions. As expected, STMO converted progesterone to testosterone acetate, and the addition of steroid esterase resulted in a rapid conversion of testosterone acetate to testosterone (Figure 6A). Furthermore, recombinant STMO (15 µg ml-1) was incubated with 20 µM PDD and 0.5 mM NADPH (Figure 6B). The GC-MS analysis indicated disappearance of PDD concomitant to formation of a novel reaction product, boldenone acetate (Figure 1B), as identified against authentic standard (not shown); boldenone acetate was formed in quantitative amounts (Figure 6B). After addition of recombinant steroid esterase (SEST; 15 µg ml-1) (Figure 6B), boldenone acetate was rapidly converted to boldenone (Figure 1), as identified against authentic standard, as well as to acetate, each in quantitative amounts. 98 CHAPTER 6

STMO 18 A esterase 15

12

9

6

concentration(µM) 3

0 STMO 25 esterase B

20

15

10

concentration (µM) 5

0

0 10 20 30 40 50 60

incubation time (min)

Figure 6. In-vitro reconstitution of the conversion of (A) progesterone () via testosterone acetate () to testosterone (), and of (B) PDD () via boldenone acetate () to boldenone () and acetate (), with recombinant STMO and steroid esterase, respectively. The reactions were started by addition of STMO (15 µg protein ml-1), and after 30 min, the esterase (15 µg protein ml-1) was added.

DISCUSSION In 1952, Talalay and coworkers (Talalay et al. 1952, Talalay 2005) isolated an aerobic bacterium that grew with testosterone as its sole source of carbon and energy, Comamonas testosteroni (formerly Pseudomonas testosteroni; Tamaoka et al. 1987), as a model organism to explore its corresponding testosterone (steroid) degradation pathway. Since then, the involved enzymes, genes, and their regulation have been explored in great detail in several C. testosteroni strains (e.g., Marcus and Talalay 1956, Shaw et al. 1965, Plesiat et al. 1991, Abalain et al. 1993, Florin et al. 1996, Möbus et al. 1997, Cabrera et al. 2000, Horinouchi et al. 2001, Horinouchi et al. 2003b, Horinouchi et al. 2010, Gong et al. 2012b, Horinouchi et al. 2012). That the catabolism of the carbonylic steroid progesterone involves Baeyer-Villiger’s CHAPTER 6 99 reaction (Baeyer and Villiger 1899) has been recognized more than 50 years ago (Fonken et al. 1960, Carlström 1966, Rahim and Sih 1966, Itagaki 1986a, Itagaki 1986b). Later, a steroid BVMO in steroid-induced Rhodococcus rhodochrous was isolated (Miyamoto et al. 1995), cloned and over-expressed (Morii et al. 1999), and crystallized (Franceschini et al. 2012). A testosterone acetate hydrolase has been described 50 years ago, in cell extracts of Rhodococcus equi (Sih et al. 1963), but no protein or gene information had been made available since.

In C. testosteroni KF-1, the steroid BVMO (locus tag) CtesDRAFT_PD3136 and steroid esterase CtesDRAFT_PD3135 catalyze a C17-acetyl side chain removal from progesterone and PDD and yield acetate, and testosterone or boldenone, respectively, during growth with progesterone. The STMO and esterase gene pair (see Introduction) are co-encoded together with a predicted short-chain dehydrogenase/reductase (SDR) gene (CtesDRAFT_PD3137), and these genes appear to be co-transcribed and co-expressed as an operon (on a single mRNA) (see Figure S3). Syntenic three-gene cluster were found in all other genomes of C. testosteroni strains that are currently available (IMG database; strains ATCC 11996, CNB-1, NBRC 100989, and S44), and hence, the three-gene cluster may be a feature of the core genome of C. testosteroni species.

During growth of strain KF-1 with progesterone, a transient intermediate was detected in the culture fluid and identified as PDD (Figure 2). The recombinant STMO converted progesterone, as well as PDD, to the corresponding steroid-acetate esters (Figure 6), each with similar activity (Figure 3). Furthermore, both steroid esters were hydrolyzed by the recombinant steroid esterase (Figure 6). A catabolism of progesterone via PDD, boldenone acetate and boldenone and acetate (Figure 1) has also been described for the progesterone pathways in the ascomycete Cylindrocarpon radicicola and the actinobacteria Streptomyces lavendulae and Nocardioides simplex (Fried et al. 1953, Peterson et al. 1957, Mahato et al. 1988). The transient excretion of PDD (Figure 2B, see also Liu et al. 2013), may reflect that STMO is active and/or expressed at much lower level in comparison to a 3-oxo-steroid delta 1-dehydrogenase in C. testosteroni, thus, corresponding to the observations that in crude extracts of strain KF-1, only low STMO activity was detectable (by GC-MS) in comparison to 4-sulfoacetophenone BVMO (Weiss et al. 2012), and that any BVMO locus was detectable by proteomics. Another possible explanation for such metabolic overflow (Figure 2B) is molecular-oxygen limitation of STMO, particularly in dense cell suspension of exponential cultures, a phenomenon that has previously been observed for 4-sulfoacetophenone BVMO of strain KF-1, for which an insufficient aeration of the culture fluid led to transient excretion of 4-sulfoacetophenone during growth with 4-sulfophenylbutyrate (Schleheck et al. 2010). 100 CHAPTER 6

The SDR gene (CtesDRAFT_PD3137) of the three-gene cluster was also cloned, over- expressed, and the purified protein tested negative (not shown) as 3-/17-hydroxysteroid dehydrogenase (Marcus and Talalay 1956, Benach et al. 2002) with the substrates progesterone, testosterone, androsta-3,17-dione and cholate and the co-substrates NAD(P)+ or NAD(P)H, under the conditions we used; alternative functions of this SDR, e.g., to produce the acyl- carbonyl group on progesterone for a subsequent BVMO reaction (e.g., from the substrate pregn-4-en-20-ol-3-one (Hashimoto et al. 1968)), need to be explored in future work. Notably, a second SDR candidate was found to be highly expressed by proteomics (CtesDRAFT_PD3138), which is located directly upstream of the three gene cluster, in divergent orientation, whereat another SDR candidate downstream (CtesDRAFT_PD3103) showed the highest identity (92%) to the 3-hydroxysteroid dehydrogenase identified in C. testosteroni ATCC 11996 (3BHD_COMTE; Abalain et al. 1993); the second 3-hydroxysteroid dehydrogenase/carbonyl reductase from strain ATCC 11996 (DIDH_COMTE; Möbus and Maser 1998) shows 95% identity to yet another SDR (CtesDRAFT_PD3679), which is part of a putative steroid degradation gene hotspot (Horinouchi et al. 2010, Horinouchi et al. 2012). The genome of strain KF-1 contains >20 valid SDR candidates, several of which we identified by proteomics (see the Supplemental file; CtesDRAFT_PD3137, 3138, 3084, 1327, and 4155). We presume that SDRs might be expressed from redundant loci in C. testosteroni KF-1, and maybe differently in the different C. testosteroni strains; hence, we postponed any attempts at disentangling the identities and functions of these many (putative steroid) dehydrogenases in strain KF-1, and focused solely on the steroid BVMO and esterase in this study.

The C. testosteroni STMO amino acid sequence is only 26% identical to that of archetype R. rhodochrous STMO (and 25% to phenylacetone BVMO, ‘PAMO’) and, as illustrated by a phylogenetic tree (Figure S6), did not cluster in the same branch with the R. rhodochrous STMO (i.e., not in the CHMO/PAMO/STMO-group, see Riebel et al. 2012), but clustered with the 4-hydroxyacetophenone BVMOs (‘HAPMO’) (see Figure S6), while being only 27-28% identical to HAPMO sequences. HAPMO has been reported not to convert progesterone at significant rates (Rehdorf et al. 2009), and the C. testosteroni STMO did not convert 4-hydroxyacetophenone but accepted phenylacetone, progesterone, PDD, and 2-decanone as substrates. In contrast, the previously characterized 4-sulfoacetophenone BVMO (‘SAPMO’) of C. testosteroni KF-1, which seems acquired through horizontal gene transfer (discussed in Weiss et al. 2012), and which seems to be phylogenetically more closely related to the R. rhodochrous STMO and PAMO (55-57% identity) than to the HAPMOs (30%) (Figure S6), CHAPTER 6 101 converted 4-sulfoacetophenone, 4-hydroxyacetophenone and phenylacetone, but not progesterone (Weiss et al. 2012), as confirmed in this study (using 20 µM progesterone, cf. Figure 4). Hence, an analysis of the phylogenetic relationship of type I BVMOs (Figure S6) gives no indication of the substrate specificity of these enzymes, at least in respect to a conversion of carbonylic steroids. Notably, the factors in the architectures of type I BVMOs that distinguish the specific substrate recognition and specificity have not been clearly identified yet, though the recent resolution of the crystal structure of the R. rhodochrous STMO (and of mutant forms) indicated that the substrate preference may result rather from nonspecific hydrophobic interactions between the substrate and the binding site surface (Franceschini et al. 2012). The availability of a second STMO, of C. testosteroni KF-1, which is phylogenetically more distantly related to the STMO/PAMO than to the HAPMO-group of type-I BVMOs (Figure S6), might be valuable in future attempts to distinguish possible common factors of steroid-substrate binding in these enzymes and, more generally, why BVMOs bind and convert substrates, or not.

The steroid esterase is about 35% identical to that of 4-sulfophenyl acetate esterase in strain KF-1 (Weiss et al. 2012) and contains the characteristic alpha/beta fold, a catalytic triad (S,D,H), a pentapeptide (GDSAG), as well as a GGGX motif, of carboxylester hydrolases (Bornscheuer 2002, Henke et al. 2002, Rehdorf et al. 2012). Recombinant steroid esterase showed activity with testosterone acetate (Figures 5, 6A) and boldenone acetate (Figure 6B), but not with 4-sulfophenyl acetate (Weiss et al. 2012).

Hence, the observations that these ‘core’ steroid-degradation enzymes in C. testosteroni, STMO and steroid esterase, do not convert 4-sulfoacetophenone and 4-sulfophenyl acetate, respectively (and also BVMO1 and BVMO4 not 4-sulfoacetophenone when tested), correspond to the observation that strain KF-1 had to acquire two novel genes, for 4-sulfoacetophenone BVMO and 4-sulfophenyl acetate esterase, through horizontal gene transfer in order to catalyze the analogous acetyl-side chain removal reactions in a newly established pathway for a utilization of xenobiotic 4-sulfophenylcarboxylates (Weiss et al. 2012).

ACKNOWLEDGMENTS We would like to thank Andreas Marquardt for proteomics, the DOE-JGI for the C. testosteroni KF-1 genome sequence, and Anna I. Kesberg for her help in individual experiments. The work of M.W. was funded by the Research School Chemical Biology (KoRS-CB) and the Zukunftskolleg of University of Konstanz. The work of A.-K.F. was funded by the KoRS-CB. The work of D.S. was funded by a German Research Foundation (DFG) grant (SCHL 1936/1) 102 CHAPTER 6 and by the University of Konstanz, and supported by the KoRS-CB and by the Konstanz Young Scholar Fund (YSF). CHAPTER 6 103

SUPPLEMENTAL MATERIAL

Characterization of a steroid Baeyer-Villiger monooxygenase and esterase in Comamonas testosteroni KF-1

MICHAEL WEISS, ANN-KATRIN FELUX, DIETER SPITELLER, AND DAVID SCHLEHECK

Department of Biology and Research School Chemical Biology of the University of Konstanz

Table S1. PCR primers product gene locus tag primer name primer sequence (5’ -3’) length (binding position [bp]) (bp) bvmo1-for CtesDRAFT_PD1901 (413–432) GCGCCGGCTATTACGACCAC 640 bvmo1-rev a CtesDRAFT_PD1901 (1052–1034) CCGCCGAACCGCTTGAGAT 3136for, bvmo2-for CtesDRAFT_PD3136 (677-695) TGCAGCGCGAACCCACCTA 556 3136rev, bvmo2-rev a CtesDRAFT_PD3136 (1232-1213 ) TGCTCGCCGCGTTCTATCAG bvmo3-for CtesDRAFT_PD5437 (51-71) CTTCGCCGGCCTCTACCAACT 544 bvmo3-rev a CtesDRAFT_PD5437 (594–577) GGCGATCACGGGCACCAT bvmo4-for CtesDRAFT_PD1705 (462-483) CGTATCTCCCGAGCCGTTCAAA 498 bvmo4-rev a CtesDRAFT_PD1705 (960–939) ACCATAGCCCACCAGCCACAAG (+), sucD-for CtesDRAFT_PD5114 (118-137) GCAGGCGTGAACCCCAAGAA 451 (+), sucD-rev a CtesDRAFT_PD5114 (568-550) CGCCACCGATACCGACAGC 3135for, est2-for CtesDRAFT_PD3135 (248-265) GCTGGCATGGCAGGGTGATAGT 421 3135rev, est2-rev a CtesDRAFT_PD3135 (668-647) GCGGCGGCTGGTGCATAG 3137for, sdr-for CtesDRAFT_PD3137 (99-115) CGGCGCCAGGGTGGTGT 450 3137rev, sdr-rev CtesDRAFT_PD3137 (548-526) ATATAGCCGGGGCTGATGCTGTT TOPO_Ct_1705-for b CtesDRAFT_PD1705 CACCCGCGTAGAGGGAGCGCTTGAGG

TOPO_Ct_1705-rev CtesDRAFT_PD1705 CTGGCCGGGAGGCTAGTGTCTCAG TOPO_Ct_1901-for b CtesDRAFT_PD1901 CACCATGGCACAGCATTTCGACTTTC

TOPO_Ct_1901-rev CtesDRAFT_PD1901 CTAGCGATAGGCCAAGGCATCGTC TOPO_Ct_3135-for b CtesDRAFT_PD3135 CACCATGCCGATACACCCCGAAA

TOPO_Ct_3135-rev CtesDRAFT_PD3135 TCAGGACCGAACACGCGCCAAGG TOPO_Ct_3136-for b CtesDRAFT_PD3136 CACCATGAAATATGATGTCATTGTGATT

TOPO_Ct_3136-rev CtesDRAFT_PD3136 GGCTG CGCGCCCGCTCGGTTCT

a primer used for reverse-transcription reaction. b the adaptor-sequence for directional TOPO-cloning is underlined in the primer sequence. 104 CHAPTER 6

Figure S1. Progesterone (A) and testosterone acetate (B) disappearance as followed by GC-MS in crude extract of C. testosteroni KF-1 cells grown with progesterone (solid asterisk) or succinate (open asterisk). The reaction with progesterone contained NADPH. CHAPTER 6 105

Figure S2. SDS-PAGE analysis of recombinant enzymes (each 20 µg total protein per lane). Lane 1, molecular mass marker (in kDa); lane 2, purified CtesDRAFT_PD1705 (BVMO4); lane 3, purified CtesDRAFT_PD1901 (BVMO1); lane 4, purified CtesDRAFT_PD3136 (BVMO2; STMO); lane 5, purified CtesDRAFT_PD3135 (steroid esterase; SEST). 106 CHAPTER 6

Figure S3. Schematic representation of the RT-PCR assay (A) that was used to analyze a co-transcription of the three-gene cluster with the esterase candidate, STMO (BVMO2), and SDR gene (as shown; locus tags PD3135- 37) and (B) illustration of the result obtained. (A) An open triangle represents the primers used to generate cDNA in an individual RT reaction and closed triangles represent the primers used for the mapping of the length of the cDNA obtained by these RT reactions. The labeling of the PCR products (R1-R5 in A) corresponds to the labeling of the lanes of the agarose gel (B); genomic DNA of strain KF-1 was used as template for the PCR positive controls (+). The length markers are indicated (in bp).

Identification of progesterone-inducible proteins by proteomics We used differential proteomics to obtain further evidence for an inducible expression of locus tags PD3135-37 during growth with progesterone, and to identify also other progesterone- inducible proteins. Firstly, soluble proteins of progesterone- and succinate-grown strain KF-1 cells were separated by 2D-PAGE (Figure S4), and the most prominent protein spots that were visible specifically on the gel from progesterone-grown cells were excised and analyzed by peptide fingerprinting-mass spectrometry (PF-MS) (Table S2). Secondly, cell-free crude extracts (i.e., soluble and membrane proteins) of progesterone- and succinate-grown strain KF- 1 cells were separated by 1D-PAGE (Figure S5), and the proteins in the molecular mass ranges CHAPTER 6 107

45-65 kDa (for steroid BVMO; ‘STMO’) and 30-40 kDa (for steroid esterase, ‘SEST’) were excised (see Figure S5) and analyzed by high-resolution PF-MS (Orbitrap MS) (Tables S3-S6).

Figure S4. 2D-PAGE of soluble protein fractions of C. testosteroni KF-1 cells grown with progesterone and succinate. Protein spots that were excised are labeled by the locus tag number (CtesDRAFT_PD0000) obtained from PF-MS analysis (Table S2).

None of the prominent protein spots from the 2D-PAGE could identify the STMO or SEST, but the SDR locus (PD3137) of the gene cluster (PD3135-37) was identified by a prominent spot visible exclusively for progesterone-grown cells. In the samples of the 1D-gels, the SEST candidate (PD3135) was identified, specifically for progesterone-grown cells, but still not STMO (Figure S5, Table S3). Interestingly, locus tag PD3138, which is located directly upstream of the PD3135-37 locus in divergent orientation, was attributed by a different prominent spot on the 2D-gel, specifically for progesterone-grown cells; PD3138 is annotated to encode another SDR.

Furthermore, several genes of the steroid degradation gene hot spot (PD3640–3746 in strain KF-1; Weiss et al. 2013) described for C. testosteroni (Horinouchi et al. 2012) were attributed as progesterone-inducibly expressed (if appropriate, ORF/protein names were adopted from Horinouchi et al. 2012). Briefly, on the 2D-gel were identified (Figures S4 and Table S2): 108 CHAPTER 6

PD3646, 4,5-9,10-diseco-3-hydroxy-5,9,17-trioxoandrosta-1(10),2-diene-4-oate hydrolase (TesD); PD3648, aldehyde dehydrogenase (TesF); PD3666, acyl-CoA dehydrogenase; PD3693, 3-ketosteroid 9-monooxygenase subunit A (designated KshA (van der Geize et al. 2002)); PD3724, acetyl-CoA acetyltransferase (ORF33); PD3734, enoyl-CoA hydratase/isomerase (ORF5). The analysis of the 1D-gel samples confirmed and expanded upon these identifications (Figure S5 and Tables S3-S6). Confirmed was the progesterone-inducible expression of PD3646 (TesD), PD3648 (TesF) and PD3666 (acyl-CoA dehydrogenase). In addition, the following loci were identified as progesterone-inducibly expressed. PD3643, 3- oxosteroid 1-dehydrogenase (TesH); PD3644, 3-hydroxy-9,10-secoandrosta-1,3,5(10)-triene- 9,17-dione monooxygenase (TesA2), PD3647, 2-oxo-3-hexenedioate decarboxylase (TesE); PD3653, protein of unknown function (DUF35); PD3655, glycosyl hydrolase; PD3659, protein of unknown function DUF1302; PD3660, putative regulatory protein DUF1329; PD3667, lipid- transfer protein; PD3669, acyl-CoA dehydrogenase; PD3670, putative DNA binding protein (DUF925); PD3730, amidase (ORF25); PD3732, acyl-CoA dehydrogenase (ORF22); PD3737 CoA-transferase subunit B (ORF2); PD3739, 3,4-dihydroxy-9,10-secoandrosta-1,3,5(10)- triene-9,17-dione 4,5-dioxygenase (TesB).

Table S2. Identities of 2D-PAGE protein spots (see Figure S4)

spot / locus tag annotation score coverage (%) (CtesDRAFT_PD)

3137 short-chain dehydrogenase/reductase SDR 452 89

3138 short-chain dehydrogenase/reductase SDR 104 47

3646 alpha/beta hydrolase fold protein 171 67

3648 acetaldehyde dehydrogenase (acetylating) 165 72

3666 acyl-CoA dehydrogenase domain protein 224 44

3693 Rieske (2Fe-2S) domain protein 221 34

3724 acetyl-CoA acetyltransferase 466 70

3734 enoyl-CoA hydratase/isomerase 139 63

CHAPTER 6 109

kDa kDa A B 170 170 130 130

1 70 2 70 55 55

3 40 4 40

35 35

25 25

15 15

10 10 progesterone succinate

Figure S5. 1D-PAGE of crude extracts of progesterone (A) and succinate-grown cells (B). The sections of the gels that were excised and submitted to PF-MS are labeled 1-4; for the results, see Tables S3-S6 below). 110 CHAPTER 6

Figure S6. Illustration of the phylogenetic relationship of C. testosteroni STMO (solid arrow) and of archetype Rhodococcus rhodochrous STMO (open arrow) within the type-I BVMO family of enzymes. A collection of cloned BVMO sequences (Leipold et al. 2012) was used, inclusive the Rhodococcus jostii RHA1 BVMOs (Szolkowy et al. 2009, Riebel et al. 2012) and best BLASTp hits (grey) for STMO and the previously identified SAPMO of C. testosteroni (Weiss et al. 2012). The BVMO1 and BVMO4 of C. testosteroni KF-1 are indicated with arrows with stripes or diamonds, respectively. The collection was aligned and a phylogenetic tree generated using ClustalX2 (Larkin et al. 2007) and the tree visualized using Dendroscope (Huson et al. 2007). Two FAD- dependent monooxygenase superfamily enzymes, L-ornithine 5-monooxygenase (NMO; Q51548) and 4-hydroxybenzoate 3-monooxygenase (HBMO; Pput_2237), were used to root the tree (not shown). Enzyme abbreviations: ACMO, acetone monooxygenase; CAMO, cycloalkanone monooxygenase; CDMO, cyclododecanone monooxygenase; CHMO, cyclohexanone monooxygenase; CPMO, cyclopentanone monooxygenase; CPDMO, cyclopentadecanone monooxygenase; ETaA, ethionamide BVMO; FMO, flavin- dependent monooxygenase; HAPMO, 4-hydroxyacetophenone monooxygenase; MAKMO, methylalkylketone monooxygenase; PAMO, phenylacetone monooxygenase; SAPMO, 4-sulfoacetophenone monooxygenase; STMO, steroid monooxygenase. CHAPTER 6 111

Table S3. Results obtained from high resolution PF-MS of gel slice no. 1 (see Figure S5); cut-off was set at a minimal score of 300

Accession Description Score Coverage

645106180 CtesDRAFT_PD4426 chaperonin GroEL [Comamonas testosteroni KF-1] 6772,27 47,35 645106887 CtesDRAFT_PD5113 succinyl-CoA synthetase, beta subunit [Comamonas testosteroni KF-1] 2491,46 43,78 645106513 CtesDRAFT_PD4744 translation elongation factor Tu [Comamonas testosteroni KF-1] 2443,18 47,73 645106532 CtesDRAFT_PD4759 ATP synthase F1, beta subunit [Comamonas testosteroni KF-1] 2201,51 43,71 645106534 CtesDRAFT_PD4761 ATP synthase F1, alpha subunit [Comamonas testosteroni KF-1] 1632,28 30,44 645105389 CtesDRAFT_PD3660 putative sigma E regulatory protein, MucB/RseB [Comamonas testosteroni KF-1] 1586,88 25,49 645105388 CtesDRAFT_PD3659 protein of unknown function DUF1302 [Comamonas testosteroni KF-1] 1541,82 33,73 645103997 CtesDRAFT_PD2294 carboxyl transferase [Comamonas testosteroni KF-1] 1268,20 31,51 645104352 CtesDRAFT_PD2649 dihydrolipoamide dehydrogenase [Comamonas testosteroni KF-1] 1188,15 23,16 645105373 CtesDRAFT_PD3644 Acyl-CoA dehydrogenase type 2 domain protein [Comamonas 1083,86 27,64 645107266 CtesDRAFT_PD5488 acetyl-CoA acetyltransferase [Comamonas testosteroni KF-1] 1014,35 25,53 645105740 CtesDRAFT_PD3995 chaperone protein DnaK [Comamonas testosteroni KF-1] 999,10 20,09 645103900 CtesDRAFT_PD2197 inosine-5'-monophosphate dehydrogenase [Comamonas testosteroni 947,39 25,25 645102959 CtesDRAFT_PD1274 citrate synthase I [Comamonas testosteroni KF-1] 900,88 21,10 645101942 CtesDRAFT_PD0269 heat shock protein Hsp90 [Comamonas testosteroni KF-1] 877,46 15,72 645102591 CtesDRAFT_PD0908 ATPase AAA-2 domain protein [Comamonas testosteroni KF-1] 782,99 6,94 645105825 CtesDRAFT_PD4080 argininosuccinate lyase [Comamonas testosteroni KF-1] 758,83 15,35 CtesDRAFT_PD0544 phosphoribosylaminoimidazolecarboxamide formyltransferase/IMP 645102221 741,64 17,86 cyclohydrolase [Comamonas testosteroni KF-1] 645105653 CtesDRAFT_PD3909 Glutamate dehydrogenase (NADP(+)) [Comamonas testosteroni KF-1] 704,66 20,98 645106035 CtesDRAFT_PD4283 extracellular solute-binding protein family 1 [Comamonas testosteroni 670,29 18,76 CtesDRAFT_PD3643 fumarate reductase/succinate dehydrogenase flavoprotein domain protein 645105372 666,26 12,15 [Comamonas testosteroni KF-1] 645105287 CtesDRAFT_PD3573 ribosomal protein S1 [Comamonas testosteroni KF-1] 642,64 17,68 645104778 CtesDRAFT_PD3074 NusA antitermination factor [Comamonas testosteroni KF-1] 623,42 17,17 645103897 CtesDRAFT_PD2194 GMP synthase, large subunit [Comamonas testosteroni KF-1] 614,95 14,87 645105628 CtesDRAFT_PD3884 Phosphopyruvate hydratase [Comamonas testosteroni KF-1] 612,86 14,25 645105700 CtesDRAFT_PD3955 Polyribonucleotide nucleotidyltransferase [Comamonas testosteroni 576,60 10,17 645104006 CtesDRAFT_PD2303 aminotransferase class I and II [Comamonas testosteroni KF-1] 538,80 14,07 645102062 CtesDRAFT_PD0387 acyl-CoA dehydrogenase domain protein [Comamonas testosteroni 536,67 9,70 645106565 CtesDRAFT_PD4792 translation elongation factor G [Comamonas testosteroni KF-1] 506,61 10,68 645102970 CtesDRAFT_PD1285 aconitate hydratase 2 [Comamonas testosteroni KF-1] 505,00 9,52 645101706 CtesDRAFT_PD0034 beta-lactamase domain protein [Comamonas testosteroni KF-1] 501,16 12,17 645103099 CtesDRAFT_PD1414 Glycine hydroxymethyltransferase [Comamonas testosteroni KF-1] 491,96 16,14 645106592 CtesDRAFT_PD4819 glutamyl-tRNA(Gln) amidotransferase, B subunit [Comamonas 462,59 10,31 645105459 CtesDRAFT_PD3730 Amidase [Comamonas testosteroni KF-1] 459,21 17,69 645106859 CtesDRAFT_PD5085 DNA polymerase III, beta subunit [Comamonas testosteroni KF-1] 444,66 11,80 645103434 CtesDRAFT_PD1735 acyl-CoA dehydrogenase domain protein [Comamonas testosteroni 417,98 8,58 645104319 CtesDRAFT_PD2616 Extracellular ligand-binding receptor [Comamonas testosteroni KF-1] 373,30 9,12 645103520 CtesDRAFT_PD1820 isocitrate dehydrogenase, NADP-dependent [Comamonas testosteroni 372,18 5,25 645105776 CtesDRAFT_PD4031 protease Do [Comamonas testosteroni KF-1] 355,41 6,20 645105395 CtesDRAFT_PD3666 acyl-CoA dehydrogenase domain protein [Comamonas testosteroni 338,86 11,45 645103073 CtesDRAFT_PD1388 glutamine synthetase, type I [Comamonas testosteroni KF-1] 334,23 9,13 645106130 CtesDRAFT_PD4376 glutamate synthase, NADH/NADPH, small subunit [Comamonas 319,98 7,94 645103057 CtesDRAFT_PD1372 histidyl-tRNA synthetase [Comamonas testosteroni KF-1] 300,71 11,98

112 CHAPTER 6

Table S4. Results obtained from high resolution PF-MS of gel slice no. 2 (see Figure S5); cut-off was set at a minimal score of 300

Accession Description Score Coverage

645106180 CtesDRAFT_PD4426 chaperonin GroEL [Comamonas testosteroni KF-1] 5151,22 51,92 645106513 CtesDRAFT_PD4744 translation elongation factor Tu [Comamonas testosteroni KF-1] 4682,64 50,25 645106532 CtesDRAFT_PD4759 ATP synthase F1, beta subunit [Comamonas testosteroni KF-1] 2652,85 46,48 645106887 CtesDRAFT_PD5113 succinyl-CoA synthetase, beta subunit [Comamonas testosteroni KF-1] 2490,42 43,78 645104352 CtesDRAFT_PD2649 dihydrolipoamide dehydrogenase [Comamonas testosteroni KF-1] 1950,04 30,53 645106534 CtesDRAFT_PD4761 ATP synthase F1, alpha subunit [Comamonas testosteroni KF-1] 1739,96 30,64 645103900 CtesDRAFT_PD2197 inosine-5'-monophosphate dehydrogenase [Comamonas testosteroni KF-1] 1569,67 33,60 645102959 CtesDRAFT_PD1274 citrate synthase I [Comamonas testosteroni KF-1] 1542,43 23,39 645105498 CtesDRAFT_PD3764 trigger factor [Comamonas testosteroni KF-1] 1312,09 38,53 CtesDRAFT_PD2650 2-oxoglutarate dehydrogenase, E2 subunit, dihydrolipoamide succinyltransferase 645104353 1284,50 24,03 [Comamonas testosteroni KF-1] 645105729 CtesDRAFT_PD3984 Homoserine dehydrogenase [Comamonas testosteroni KF-1] 1164,31 27,25 645105653 CtesDRAFT_PD3909 Glutamate dehydrogenase (NADP(+)) [Comamonas testosteroni KF-1] 1137,82 21,88 645105287 CtesDRAFT_PD3573 ribosomal protein S1 [Comamonas testosteroni KF-1] 1126,37 25,54 645105553 CtesDRAFT_PD3809 aspartate kinase [Comamonas testosteroni KF-1] 1002,65 22,80 CtesDRAFT_PD0544 phosphoribosylaminoimidazolecarboxamide formyltransferase/IMP 645102221 996,92 23,50 cyclohydrolase [Comamonas testosteroni KF-1] CtesDRAFT_PD1539 type I secretion outer membrane protein, TolC family [Comamonas testosteroni 645103236 958,02 21,50 KF-1] 645105628 CtesDRAFT_PD3884 Phosphopyruvate hydratase [Comamonas testosteroni KF-1] 737,02 14,25 645104006 CtesDRAFT_PD2303 aminotransferase class I and II [Comamonas testosteroni KF-1] 685,68 10,69 645105776 CtesDRAFT_PD4031 protease Do [Comamonas testosteroni KF-1] 667,43 20,20 645105825 CtesDRAFT_PD4080 argininosuccinate lyase [Comamonas testosteroni KF-1] 626,73 11,83 645107265 CtesDRAFT_PD5487 2-nitropropane dioxygenase NPD [Comamonas testosteroni KF-1] 618,47 21,12 645103897 CtesDRAFT_PD2194 GMP synthase, large subunit [Comamonas testosteroni KF-1] 598,20 15,61 645102535 CtesDRAFT_PD0852 efflux transporter, RND family, MFP subunit [Comamonas testosteroni KF-1] 556,68 20,61 645103067 CtesDRAFT_PD1382 Adenylosuccinate synthase [Comamonas testosteroni KF-1] 553,15 14,19 645106859 CtesDRAFT_PD5085 DNA polymerase III, beta subunit [Comamonas testosteroni KF-1] 540,66 11,80 645105625 CtesDRAFT_PD3881 CTP synthase [Comamonas testosteroni KF-1] 535,87 14,69 CtesDRAFT_PD3161 O-acetylhomoserine/O-acetylserine sulfhydrylase [Comamonas testosteroni KF- 645104866 505,67 10,59 1] 645106591 CtesDRAFT_PD4818 glutamyl-tRNA(Gln) amidotransferase, A subunit [Comamonas testosteroni KF-1] 500,45 5,65 645106225 CtesDRAFT_PD4471 Aldehyde Dehydrogenase [Comamonas testosteroni KF-1] 484,76 13,13 645103099 CtesDRAFT_PD1414 Glycine hydroxymethyltransferase [Comamonas testosteroni KF-1] 478,74 16,14 645107266 CtesDRAFT_PD5488 acetyl-CoA acetyltransferase [Comamonas testosteroni KF-1] 463,01 16,25 645105727 CtesDRAFT_PD3982 threonine synthase [Comamonas testosteroni KF-1] 453,54 14,35 CtesDRAFT_PD0854 RND efflux system, outer membrane lipoprotein, NodT family [Comamonas 645102537 447,32 15,40 testosteroni KF-1] 645103057 CtesDRAFT_PD1372 histidyl-tRNA synthetase [Comamonas testosteroni KF-1] 433,05 11,06 645101942 CtesDRAFT_PD0269 heat shock protein Hsp90 [Comamonas testosteroni KF-1] 419,55 5,70 645103073 CtesDRAFT_PD1388 glutamine synthetase, type I [Comamonas testosteroni KF-1] 417,09 14,01 645102699 CtesDRAFT_PD1015 malate/quinone oxidoreductase [Comamonas testosteroni KF-1] 414,06 8,33 645103551 CtesDRAFT_PD1851 methylmalonate-semialdehyde dehydrogenase [Comamonas testosteroni KF-1] 411,97 10,45 645106592 CtesDRAFT_PD4819 glutamyl-tRNA(Gln) amidotransferase, B subunit [Comamonas testosteroni KF-1] 395,83 10,31 645105330 CtesDRAFT_PD3603 Argininosuccinate synthase [Comamonas testosteroni KF-1] 395,08 15,73 645102136 CtesDRAFT_PD0461 cobalamin synthesis protein P47K [Comamonas testosteroni KF-1] 391,65 10,46 645105322 CtesDRAFT_PD3595 amidophosphoribosyltransferase [Comamonas testosteroni KF-1] 380,99 7,37 645105345 CtesDRAFT_PD3618 fumarate hydratase, class II [Comamonas testosteroni KF-1] 376,02 16,63 645103163 CtesDRAFT_PD1472 transcription termination factor Rho [Comamonas testosteroni KF-1] 368,65 11,43 645103434 CtesDRAFT_PD1735 acyl-CoA dehydrogenase domain protein [Comamonas testosteroni KF-1] 367,58 8,58 645106035 CtesDRAFT_PD4283 extracellular solute-binding protein family 1 [Comamonas testosteroni KF-1] 367,05 12,59 645102962 CtesDRAFT_PD1277 succinate dehydrogenase, flavoprotein subunit [Comamonas testosteroni KF-1] 331,08 8,32 645102238 CtesDRAFT_PD0561 amidohydrolase [Comamonas testosteroni KF-1] 330,22 12,09 CtesDRAFT_PD5269 peptidase M24B X-Pro dipeptidase/aminopeptidase domain protein [Comamonas 645107043 306,90 10,02 testosteroni KF-1] CHAPTER 6 113

Table S5. Results obtained from high resolution PF-MS of gel slice no. 3 (see Figure S5); cut- off was set at a minimal score of 80

Accession Description Score Coverage

645106183 CtesDRAFT_PD4429 porin Gram-negative type [Comamonas testosteroni KF-1] 3782,58 43,82 645106513 CtesDRAFT_PD4744 translation elongation factor Tu [Comamonas testosteroni KF-1] 2227,49 44,19 645106180 CtesDRAFT_PD4426 chaperonin GroEL [Comamonas testosteroni KF-1] 1872,22 37,29 CtesDRAFT_PD1198 Electron transfer flavoprotein alpha/beta-subunit [Comamonas testosteroni KF- 645102883 1788,67 48,59 1] 645102966 CtesDRAFT_PD1281 malate dehydrogenase [Comamonas testosteroni KF-1] 1781,97 34,65 645107270 CtesDRAFT_PD5492 conserved hypothetical protein [Comamonas testosteroni KF-1] 1624,19 56,17 645102884 CtesDRAFT_PD1199 Electron transfer flavoprotein alpha subunit [Comamonas testosteroni KF-1] 1613,40 33,55 645103114 CtesDRAFT_PD1423 translation elongation factor Ts [Comamonas testosteroni KF-1] 1366,18 41,75 645105377 CtesDRAFT_PD3648 acetaldehyde dehydrogenase (acetylating) [Comamonas testosteroni KF-1] 1260,62 46,58 645102784 CtesDRAFT_PD1099 TRAP dicarboxylate transporter- DctP subunit [Comamonas testosteroni KF-1] 1103,53 19,17 645102085 CtesDRAFT_PD0410 extracellular solute-binding protein family 3 [Comamonas testosteroni KF-1] 999,50 17,73 645102716 CtesDRAFT_PD1032 Extracellular ligand-binding receptor [Comamonas testosteroni KF-1] 904,69 21,43 645106457 CtesDRAFT_PD4691 L-asparaginase, type II [Comamonas testosteroni KF-1] 889,41 21,72 645102081 CtesDRAFT_PD0406 TRAP dicarboxylate transporter, DctP subunit [Comamonas testosteroni KF-1] 889,08 26,38 645106887 CtesDRAFT_PD5113 succinyl-CoA synthetase, beta subunit [Comamonas testosteroni KF-1] 842,59 33,94 645106877 CtesDRAFT_PD5103 conserved hypothetical protein [Comamonas testosteroni KF-1] 821,75 25,85 645103927 CtesDRAFT_PD2224 protein of unknown function DUF534 [Comamonas testosteroni KF-1] 767,38 29,50 CtesDRAFT_PD2729 sulfate ABC transporter, periplasmic sulfate-binding protein [Comamonas 645104432 729,57 15,13 testosteroni KF-1] 645106533 CtesDRAFT_PD4760 ATP synthase F1, gamma subunit [Comamonas testosteroni KF-1] 663,88 27,08 CtesDRAFT_PD3739 Glyoxalase/bleomycin resistance protein/dioxygenase [Comamonas testosteroni 645105468 625,26 31,67 KF-1] 645104943 CtesDRAFT_PD3234 ketol-acid reductoisomerase [Comamonas testosteroni KF-1] 624,41 16,86 645102651 CtesDRAFT_PD0967 basic membrane lipoprotein [Comamonas testosteroni KF-1] 607,42 24,42 645102650 CtesDRAFT_PD0966 basic membrane lipoprotein [Comamonas testosteroni KF-1] 533,29 26,25 645106534 CtesDRAFT_PD4761 ATP synthase F1, alpha subunit [Comamonas testosteroni KF-1] 520,13 11,56 645106319 CtesDRAFT_PD4565 porin Gram-negative type [Comamonas testosteroni KF-1] 509,92 15,13 645106532 CtesDRAFT_PD4759 ATP synthase F1, beta subunit [Comamonas testosteroni KF-1] 489,14 20,47 645104537 CtesDRAFT_PD2833 lipoprotein, YaeC family [Comamonas testosteroni KF-1] 486,03 24,52 645105715 CtesDRAFT_PD3970 ATP-dependent chaperone ClpB [Comamonas testosteroni KF-1] 477,10 6,67 645104840 CtesDRAFT_PD3135 Alpha/beta hydrolase fold-3 domain protein [Comamonas testosteroni KF-1] 430,42 15,02 645101932 CtesDRAFT_PD0259 conserved hypothetical protein [Comamonas testosteroni KF-1] 424,67 30,40 645106508 CtesDRAFT_PD4740 ribosomal protein L1 [Comamonas testosteroni KF-1] 399,48 17,75 CtesDRAFT_PD5265 glyceraldehyde-3-phosphate dehydrogenase, type I [Comamonas testosteroni 645107039 374,08 12,05 KF-1] 645104352 CtesDRAFT_PD2649 dihydrolipoamide dehydrogenase [Comamonas testosteroni KF-1] 371,46 11,79 645105231 CtesDRAFT_PD3519 porin Gram-negative type [Comamonas testosteroni KF-1] 355,24 8,12 645107268 CtesDRAFT_PD5490 Enoyl-CoA hydratase/isomerase [Comamonas testosteroni KF-1] 349,95 14,07 645102959 CtesDRAFT_PD1274 citrate synthase I [Comamonas testosteroni KF-1] 327,16 11,24 645106760 CtesDRAFT_PD4986 branched-chain amino acid aminotransferase [Comamonas testosteroni KF-1] 322,55 9,94 645105375 CtesDRAFT_PD3646 alpha/beta hydrolase fold protein [Comamonas testosteroni KF-1] 313,55 24,01 645102113 CtesDRAFT_PD0438 cysteine synthase A [Comamonas testosteroni KF-1] 299,23 16,99 645107265 CtesDRAFT_PD5487 2-nitropropane dioxygenase NPD [Comamonas testosteroni KF-1] 290,60 10,16 CtesDRAFT_PD5277 fructose-bisphosphate aldolase, class II, Calvin cycle subtype [Comamonas 645107051 285,27 12,43 testosteroni KF-1] CtesDRAFT_PD4863 oxidoreductase FAD/NAD(P)-binding domain protein [Comamonas testosteroni 645106636 276,03 12,84 KF-1] 645105287 CtesDRAFT_PD3573 ribosomal protein S1 [Comamonas testosteroni KF-1] 269,78 8,93 645104333 CtesDRAFT_PD2630 conserved hypothetical protein [Comamonas testosteroni KF-1] 267,41 15,96 645106559 CtesDRAFT_PD4786 ribosomal protein L2 [Comamonas testosteroni KF-1] 264,73 11,31 645105382 CtesDRAFT_PD3653 protein of unknown function DUF35 [Comamonas testosteroni KF-1] 262,06 8,97 645102911 CtesDRAFT_PD1226 Pili assembly chaperone [Comamonas testosteroni KF-1] 259,59 18,36 645105466 CtesDRAFT_PD3737 coenzyme A transferase [Comamonas testosteroni KF-1] 250,15 7,81 645106313 CtesDRAFT_PD4559 TRAP dicarboxylate transporter, DctP subunit [Comamonas testosteroni KF-1] 245,65 8,23 114 CHAPTER 6

645104788 CtesDRAFT_PD3084 short-chain dehydrogenase/reductase SDR [Comamonas testosteroni KF-1] 237,71 13,53 645105396 CtesDRAFT_PD3667 lipid-transfer protein [Comamonas testosteroni KF-1] 237,24 7,95 CtesDRAFT_PD2268 2,3,4,5-tetrahydropyridine-2,6-dicarboxylate N-succinyltransferase [Comamonas 645103971 236,81 14,80 testosteroni KF-1] 645105384 CtesDRAFT_PD3655 glycosyl hydrolase [Comamonas testosteroni KF-1] 224,79 9,25 645104934 CtesDRAFT_PD3225 translation elongation factor P [Comamonas testosteroni KF-1] 222,77 13,59 645106128 CtesDRAFT_PD4374 extracellular solute-binding protein family 3 [Comamonas testosteroni KF-1] 218,04 8,55 645105398 CtesDRAFT_PD3669 acyl-CoA dehydrogenase domain protein [Comamonas testosteroni KF-1] 213,90 8,99 645102596 CtesDRAFT_PD0913 2-alkenal reductase [Comamonas testosteroni KF-1] 207,70 11,46 645103900 CtesDRAFT_PD2197 inosine-5'-monophosphate dehydrogenase [Comamonas testosteroni KF-1] 207,56 4,68 645104429 CtesDRAFT_PD2726 sulfate ABC transporter, ATPase subunit [Comamonas testosteroni KF-1] 203,24 6,40 645101708 CtesDRAFT_PD0036 ribose-phosphate pyrophosphokinase [Comamonas testosteroni KF-1] 193,15 9,40 645103099 CtesDRAFT_PD1414 Glycine hydroxymethyltransferase [Comamonas testosteroni KF-1] 187,41 6,51 645105399 CtesDRAFT_PD3670 protein of unknown function DUF35 [Comamonas testosteroni KF-1] 171,69 12,42 645105727 CtesDRAFT_PD3982 threonine synthase [Comamonas testosteroni KF-1] 170,83 5,91 645106145 CtesDRAFT_PD4391 transaldolase [Comamonas testosteroni KF-1] 169,77 10,28 645104480 CtesDRAFT_PD2777 lipoprotein, YaeC family [Comamonas testosteroni KF-1] 168,57 8,65 645103220 CtesDRAFT_PD1524 conserved hypothetical protein [Comamonas testosteroni KF-1] 165,96 9,02 645102757 CtesDRAFT_PD1072 cysteine synthase B [Comamonas testosteroni KF-1] 163,90 8,31 645105641 CtesDRAFT_PD3897 conserved hypothetical protein [Comamonas testosteroni KF-1] 153,20 8,28 645102378 CtesDRAFT_PD0700 phosphoglycerate mutase 1 family [Comamonas testosteroni KF-1] 145,61 11,74 645105580 CtesDRAFT_PD3836 Hemin-degrading family protein [Comamonas testosteroni KF-1] 140,35 5,14 645104781 CtesDRAFT_PD3077 heat shock protein DnaJ domain protein [Comamonas testosteroni KF-1] 136,18 7,35 645102790 CtesDRAFT_PD1105 Carbon-monoxide dehydrogenase (acceptor) [Comamonas testosteroni KF-1] 132,18 8,43 645105498 CtesDRAFT_PD3764 trigger factor [Comamonas testosteroni KF-1] 128,17 11,47 645105537 CtesDRAFT_PD3802 conserved hypothetical protein [Comamonas testosteroni KF-1] 126,40 8,41 645104710 CtesDRAFT_PD3006 conserved hypothetical protein [Comamonas testosteroni KF-1] 125,96 7,00 CtesDRAFT_PD2650 2-oxoglutarate dehydrogenase, E2 subunit, dihydrolipoamide succinyltransferase 645104353 125,85 4,61 [Comamonas testosteroni KF-1] 645104798 CtesDRAFT_PD3094 band 7 protein [Comamonas testosteroni KF-1] 116,15 12,75 645103231 CtesDRAFT_PD1535 phenylalanyl-tRNA synthetase, alpha subunit [Comamonas testosteroni KF-1] 115,24 10,86 645102639 CtesDRAFT_PD0955 efflux transporter, RND family, MFP subunit [Comamonas testosteroni KF-1] 110,67 5,18 645105376 CtesDRAFT_PD3647 4-oxalocrotonate decarboxylase [Comamonas testosteroni KF-1] 110,67 9,33 645105653 CtesDRAFT_PD3909 Glutamate dehydrogenase (NADP(+)) [Comamonas testosteroni KF-1] 106,98 5,36 CtesDRAFT_PD1952 Alcohol dehydrogenase zinc-binding domain protein [Comamonas testosteroni 645103654 106,01 12,43 KF-1] CtesDRAFT_PD1276 succinate dehydrogenase and fumarate reductase iron-sulfur protein 645102961 99,96 16,24 [Comamonas testosteroni KF-1] 645106696 CtesDRAFT_PD4923 porin Gram-negative type [Comamonas testosteroni KF-1] 98,00 7,55 645101722 CtesDRAFT_PD0050 secretion protein HlyD family protein [Comamonas testosteroni KF-1] 94,65 6,69 645105699 CtesDRAFT_PD3954 NAD(P)H quinone oxidoreductase, PIG3 family [Comamonas testosteroni KF-1] 91,31 7,04 645106225 CtesDRAFT_PD4471 Aldehyde Dehydrogenase [Comamonas testosteroni KF-1] 90,58 7,29 645105461 CtesDRAFT_PD3732 acyl-CoA dehydrogenase domain protein [Comamonas testosteroni KF-1] 89,99 7,12 645103064 CtesDRAFT_PD1379 HflC protein [Comamonas testosteroni KF-1] 89,31 6,42 645102970 CtesDRAFT_PD1285 aconitate hydratase 2 [Comamonas testosteroni KF-1] 87,27 2,67

CHAPTER 6 115

Table S6. Results obtained from high resolution PF-MS of gel slice no. 4 (see Figure S5); cut- off was set at a minimal score of 80

Accession Description Score Coverage

645106183 CtesDRAFT_PD4429 porin Gram-negative type [Comamonas testosteroni KF-1] 9519,47 52,53 645102966 CtesDRAFT_PD1281 malate dehydrogenase [Comamonas testosteroni KF-1] 3820,85 40,12 645107270 CtesDRAFT_PD5492 conserved hypothetical protein [Comamonas testosteroni KF-1] 3150,49 56,17 645103114 CtesDRAFT_PD1423 translation elongation factor Ts [Comamonas testosteroni KF-1] 2638,23 37,04 645106513 CtesDRAFT_PD4744 translation elongation factor Tu [Comamonas testosteroni KF-1] 2623,16 39,90 CtesDRAFT_PD1198 Electron transfer flavoprotein alpha/beta-subunit [Comamonas testosteroni KF- 645102883 2551,40 48,59 1] 645106877 CtesDRAFT_PD5103 conserved hypothetical protein [Comamonas testosteroni KF-1] 2109,32 48,00 645102651 CtesDRAFT_PD0967 basic membrane lipoprotein [Comamonas testosteroni KF-1] 1259,19 24,16 CtesDRAFT_PD5265 glyceraldehyde-3-phosphate dehydrogenase, type I [Comamonas testosteroni 645107039 1197,68 27,71 KF-1] 645102884 CtesDRAFT_PD1199 Electron transfer flavoprotein alpha subunit [Comamonas testosteroni KF-1] 1161,91 27,42 645102113 CtesDRAFT_PD0438 cysteine synthase A [Comamonas testosteroni KF-1] 1106,33 64,05 645102650 CtesDRAFT_PD0966 basic membrane lipoprotein [Comamonas testosteroni KF-1] 1102,60 26,25 CtesDRAFT_PD2729 sulfate ABC transporter, periplasmic sulfate-binding protein [Comamonas 645104432 1100,24 26,67 testosteroni KF-1] 645106508 CtesDRAFT_PD4740 ribosomal protein L1 [Comamonas testosteroni KF-1] 1026,34 33,77 CtesDRAFT_PD2268 2,3,4,5-tetrahydropyridine-2,6-dicarboxylate N-succinyltransferase [Comamonas 645103971 1012,89 23,47 testosteroni KF-1] 645102085 CtesDRAFT_PD0410 extracellular solute-binding protein family 3 [Comamonas testosteroni KF-1] 990,38 23,08 645107268 CtesDRAFT_PD5490 Enoyl-CoA hydratase/isomerase [Comamonas testosteroni KF-1] 918,83 14,07 645106319 CtesDRAFT_PD4565 porin Gram-negative type [Comamonas testosteroni KF-1] 827,12 22,55 645103927 CtesDRAFT_PD2224 protein of unknown function DUF534 [Comamonas testosteroni KF-1] 821,39 23,60 CtesDRAFT_PD0383 two component transcriptional regulator, winged helix family [Comamonas 645102058 809,25 31,38 testosteroni KF-1] 645105537 CtesDRAFT_PD3802 conserved hypothetical protein [Comamonas testosteroni KF-1] 808,22 27,10 645102716 CtesDRAFT_PD1032 Extracellular ligand-binding receptor [Comamonas testosteroni KF-1] 807,14 16,93 CtesDRAFT_PD1937 phosphonate ABC transporter, periplasmic phosphonate-binding protein 645103639 780,62 21,84 [Comamonas testosteroni KF-1] 645104570 CtesDRAFT_PD2866 extracellular solute-binding protein family 3 [Comamonas testosteroni KF-1] 763,95 34,67 645104943 CtesDRAFT_PD3234 ketol-acid reductoisomerase [Comamonas testosteroni KF-1] 759,42 16,86 645105904 CtesDRAFT_PD4155 short-chain dehydrogenase/reductase SDR [Comamonas testosteroni KF-1] 736,24 20,00 645106568 CtesDRAFT_PD4795 Serine-type D-Ala-D-Ala carboxypeptidase [Comamonas testosteroni KF-1] 705,26 20,77 645106888 CtesDRAFT_PD5114 succinyl-CoA synthetase, alpha subunit [Comamonas testosteroni KF-1] 674,57 26,42 645102214 CtesDRAFT_PD0537 nicotinate-nucleotide pyrophosphorylase [Comamonas testosteroni KF-1] 673,28 41,87 645106792 CtesDRAFT_PD5018 septum site-determining protein MinD [Comamonas testosteroni KF-1] 667,80 18,15 645102596 CtesDRAFT_PD0913 2-alkenal reductase [Comamonas testosteroni KF-1] 640,65 28,65 645102347 CtesDRAFT_PD0669 conserved hypothetical protein [Comamonas testosteroni KF-1] 636,15 23,19 645101838 CtesDRAFT_PD0166 DNA-directed RNA polymerase, alpha subunit [Comamonas testosteroni KF-1] 581,02 29,22 645103059 CtesDRAFT_PD1374 outer membrane assembly lipoprotein YfgL [Comamonas testosteroni KF-1] 576,62 19,69 645104707 CtesDRAFT_PD3003 conserved hypothetical protein [Comamonas testosteroni KF-1] 562,62 12,64 645104934 CtesDRAFT_PD3225 translation elongation factor P [Comamonas testosteroni KF-1] 545,38 35,33 645103113 CtesDRAFT_PD1422 ribosomal protein S2 [Comamonas testosteroni KF-1] 534,26 14,40 645104480 CtesDRAFT_PD2777 lipoprotein, YaeC family [Comamonas testosteroni KF-1] 532,90 18,80 645106128 CtesDRAFT_PD4374 extracellular solute-binding protein family 3 [Comamonas testosteroni KF-1] 531,69 19,70 CtesDRAFT_PD1489 sulfate ABC transporter, periplasmic sulfate-binding protein [Comamonas 645103185 519,58 15,16 testosteroni KF-1] 645104537 CtesDRAFT_PD2833 lipoprotein, YaeC family [Comamonas testosteroni KF-1] 517,52 18,77 645102704 CtesDRAFT_PD1020 phosphoribosylformylglycinamidine cyclo-ligase [Comamonas testosteroni KF-1] 514,81 19,14 645106533 CtesDRAFT_PD4760 ATP synthase F1, gamma subunit [Comamonas testosteroni KF-1] 513,76 30,21 645106886 CtesDRAFT_PD5112 N-acetyl-gamma-glutamyl-phosphate reductase [Comamonas testosteroni KF-1] 507,31 19,68 645106556 CtesDRAFT_PD4783 ribosomal protein S3 [Comamonas testosteroni KF-1] 484,66 17,30 645103493 CtesDRAFT_PD1794 Nucleoside-triphosphate--adenylate kinase [Comamonas testosteroni KF-1] 479,62 18,81 645102378 CtesDRAFT_PD0700 phosphoglycerate mutase 1 family [Comamonas testosteroni KF-1] 471,54 22,27 645101722 CtesDRAFT_PD0050 secretion protein HlyD family protein [Comamonas testosteroni KF-1] 468,77 19,50 645102237 CtesDRAFT_PD0560 aspartate carbamoyltransferase [Comamonas testosteroni KF-1] 443,56 11,56 116 CHAPTER 6

645103220 CtesDRAFT_PD1524 conserved hypothetical protein [Comamonas testosteroni KF-1] 443,28 21,05 645104211 CtesDRAFT_PD2508 conserved hypothetical protein [Comamonas testosteroni KF-1] 429,45 19,79 645102081 CtesDRAFT_PD0406 TRAP dicarboxylate transporter, DctP subunit [Comamonas testosteroni KF-1] 424,99 13,33 645106008 CtesDRAFT_PD4256 ribose-phosphate pyrophosphokinase [Comamonas testosteroni KF-1] 424,52 13,79 645106444 CtesDRAFT_PD4678 extracellular solute-binding protein family 1 [Comamonas testosteroni KF-1] 387,07 13,45 645106313 CtesDRAFT_PD4559 TRAP dicarboxylate transporter, DctP subunit [Comamonas testosteroni KF-1] 384,01 8,23 645104788 CtesDRAFT_PD3084 short-chain dehydrogenase/reductase SDR [Comamonas testosteroni KF-1] 383,70 19,17 645105602 CtesDRAFT_PD3858 NlpBDapX family lipoprotein [Comamonas testosteroni KF-1] 381,71 10,27 645106605 CtesDRAFT_PD4832 orotate phosphoribosyltransferase [Comamonas testosteroni KF-1] 364,30 19,31 645102911 CtesDRAFT_PD1226 Pili assembly chaperone [Comamonas testosteroni KF-1] 358,77 12,89 645102784 CtesDRAFT_PD1099 TRAP dicarboxylate transporter- DctP subunit [Comamonas testosteroni KF-1] 353,00 11,39 645102008 CtesDRAFT_PD0335 2OG-Fe(II) oxygenase [Comamonas testosteroni KF-1] 349,43 15,04 645106559 CtesDRAFT_PD4786 ribosomal protein L2 [Comamonas testosteroni KF-1] 345,82 11,31 645103012 CtesDRAFT_PD1327 short-chain dehydrogenase/reductase SDR [Comamonas testosteroni KF-1] 344,92 17,82 645104798 CtesDRAFT_PD3094 band 7 protein [Comamonas testosteroni KF-1] 328,60 39,87 645105601 CtesDRAFT_PD3857 dihydrodipicolinate synthase [Comamonas testosteroni KF-1] 314,43 10,40 645102885 CtesDRAFT_PD1200 Extracellular ligand-binding receptor [Comamonas testosteroni KF-1] 307,55 11,70 645102057 CtesDRAFT_PD0382 acetylglutamate kinase [Comamonas testosteroni KF-1] 300,45 10,44 CtesDRAFT_PD4863 oxidoreductase FAD/NAD(P)-binding domain protein [Comamonas testosteroni 645106636 296,09 12,84 KF-1] 645102709 CtesDRAFT_PD1025 conserved hypothetical protein [Comamonas testosteroni KF-1] 295,28 15,22 CtesDRAFT_PD1341 protein of unknown function DUF306 Meta and HslJ [Comamonas testosteroni 645103026 293,37 12,64 KF-1] CtesDRAFT_PD5246 phosphoribosylaminoimidazole-succinocarboxamide synthase [Comamonas 645107020 291,86 13,77 testosteroni KF-1] 645103562 CtesDRAFT_PD1861 selenide, water dikinase [Comamonas testosteroni KF-1] 288,48 7,58 645107089 CtesDRAFT_PD5315 thioredoxin [Comamonas testosteroni KF-1] 287,65 9,49 645106822 CtesDRAFT_PD5048 conserved hypothetical protein [Comamonas testosteroni KF-1] 281,07 10,91 645106165 CtesDRAFT_PD4411 GTP-binding protein HSR1-related [Comamonas testosteroni KF-1] 264,19 15,47 645104425 CtesDRAFT_PD2722 Beta-lactamase [Comamonas testosteroni KF-1] 261,96 7,64 645102306 CtesDRAFT_PD0629 conserved hypothetical protein [Comamonas testosteroni KF-1] 254,62 18,30 645103231 CtesDRAFT_PD1535 phenylalanyl-tRNA synthetase, alpha subunit [Comamonas testosteroni KF-1] 251,94 8,57 645102757 CtesDRAFT_PD1072 cysteine synthase B [Comamonas testosteroni KF-1] 250,18 16,61 645103064 CtesDRAFT_PD1379 HflC protein [Comamonas testosteroni KF-1] 247,65 8,45 CtesDRAFT_PD4008 acetyl-CoA carboxylase, carboxyl transferase, beta subunit [Comamonas 645105753 245,64 9,28 testosteroni KF-1] 645106532 CtesDRAFT_PD4759 ATP synthase F1, beta subunit [Comamonas testosteroni KF-1] 244,07 5,12 CtesDRAFT_PD1952 Alcohol dehydrogenase zinc-binding domain protein [Comamonas testosteroni 645103654 238,63 12,43 KF-1] 645106333 CtesDRAFT_PD4579 Enoyl-CoA hydratase/isomerase [Comamonas testosteroni KF-1] 235,63 11,92 645102630 CtesDRAFT_PD0946 2-hydroxy-3-oxopropionate reductase [Comamonas testosteroni KF-1] 230,51 8,91 645106180 CtesDRAFT_PD4426 chaperonin GroEL [Comamonas testosteroni KF-1] 230,36 4,39 645106486 CtesDRAFT_PD4720 conserved hypothetical protein [Comamonas testosteroni KF-1] 228,88 9,52 645106846 CtesDRAFT_PD5072 putative exonuclease RdgC [Comamonas testosteroni KF-1] 225,36 12,17 645106534 CtesDRAFT_PD4761 ATP synthase F1, alpha subunit [Comamonas testosteroni KF-1] 224,94 5,01 645104781 CtesDRAFT_PD3077 heat shock protein DnaJ domain protein [Comamonas testosteroni KF-1] 224,31 7,87 645103604 CtesDRAFT_PD1903 aldo/keto reductase [Comamonas testosteroni KF-1] 220,52 7,65 CtesDRAFT_PD2534 Methylenetetrahydrofolate dehydrogenase (NADP(+)) [Comamonas testosteroni 645104237 219,20 12,32 KF-1] 645102790 CtesDRAFT_PD1105 Carbon-monoxide dehydrogenase (acceptor) [Comamonas testosteroni KF-1] 219,03 12,26 CtesDRAFT_PD3811 acetyl-CoA carboxylase, carboxyl transferase, alpha subunit [Comamonas 645105555 215,16 8,62 testosteroni KF-1] CtesDRAFT_PD4351 phosphoribosylformimino-5-aminoimidazole carboxamide ribotide isomerase 645106105 214,57 10,12 [Comamonas testosteroni KF-1] 645105505 CtesDRAFT_PD3770 conserved hypothetical protein [Comamonas testosteroni KF-1] 214,33 15,29 645105996 CtesDRAFT_PD4245 conserved hypothetical protein [Comamonas testosteroni KF-1] 213,13 7,54 645102639 CtesDRAFT_PD0955 efflux transporter, RND family, MFP subunit [Comamonas testosteroni KF-1] 210,08 11,76 645106896 CtesDRAFT_PD5122 conserved hypothetical protein [Comamonas testosteroni KF-1] 197,72 9,48 CtesDRAFT_PD1491 aliphatic sulfonates family ABC transporter, periplsmic ligand-binding protein 645103187 190,52 7,06 [Comamonas testosteroni KF-1] 645104483 CtesDRAFT_PD2780 ABC transporter related [Comamonas testosteroni KF-1] 188,13 10,06 CHAPTER 6 117

645103273 CtesDRAFT_PD1576 band 7 protein [Comamonas testosteroni KF-1] 185,26 7,80 645105843 CtesDRAFT_PD4098 guanosine monophosphate reductase [Comamonas testosteroni KF-1] 182,35 7,69 645103082 CtesDRAFT_PD1397 Thymidylate synthase [Comamonas testosteroni KF-1] 180,47 8,70 645102474 CtesDRAFT_PD0792 protein of unknown function DUF1732 [Comamonas testosteroni KF-1] 173,03 7,07 645102211 CtesDRAFT_PD0534 GTP-binding protein YchF [Comamonas testosteroni KF-1] 171,69 9,07 645106850 CtesDRAFT_PD5076 efflux transporter, RND family, MFP subunit [Comamonas testosteroni KF-1] 171,66 7,24 645103572 CtesDRAFT_PD1871 nucleotide sugar dehydrogenase [Comamonas testosteroni KF-1] 163,62 7,46 CtesDRAFT_PD0184 6-phosphogluconate dehydrogenase NAD-binding [Comamonas testosteroni KF- 645101857 163,42 10,89 1] 645102038 CtesDRAFT_PD0363 glucose-1-phosphate thymidylyltransferase [Comamonas testosteroni KF-1] 160,52 8,45 645101804 CtesDRAFT_PD0132 Indole-3-glycerol-phosphate synthase [Comamonas testosteroni KF-1] 156,77 8,27 645101932 CtesDRAFT_PD0259 conserved hypothetical protein [Comamonas testosteroni KF-1] 148,39 8,21 645104325 CtesDRAFT_PD2622 conserved hypothetical protein [Comamonas testosteroni KF-1] 146,88 9,17 645106826 CtesDRAFT_PD5052 oxidoreductase molybdopterin binding [Comamonas testosteroni KF-1] 145,50 5,71 645103954 CtesDRAFT_PD2251 fumarylacetoacetate (FAA) hydrolase [Comamonas testosteroni KF-1] 145,44 14,72 645106118 CtesDRAFT_PD4364 ABC transporter related [Comamonas testosteroni KF-1] 144,86 10,83 645107127 CtesDRAFT_PD5349 ABC transporter related [Comamonas testosteroni KF-1] 140,83 8,85 645103913 CtesDRAFT_PD2210 Pirin domain protein [Comamonas testosteroni KF-1] 139,45 13,30 645105641 CtesDRAFT_PD3897 conserved hypothetical protein [Comamonas testosteroni KF-1] 131,68 8,28 645105785 CtesDRAFT_PD4040 fatty acid/phospholipid synthesis protein PlsX [Comamonas testosteroni KF-1] 126,89 9,76 645103327 CtesDRAFT_PD1628 Thiamine-phosphate diphosphorylase [Comamonas testosteroni KF-1] 126,11 12,14 645105631 CtesDRAFT_PD3887 Hsp33 protein [Comamonas testosteroni KF-1] 125,05 6,57 645107105 CtesDRAFT_PD5331 glutathione synthetase [Comamonas testosteroni KF-1] 118,73 9,21 CtesDRAFT_PD1721 Pyridoxal-5'-phosphate-dependent protein beta subunit [Comamonas 645103420 113,98 11,11 testosteroni KF-1] CtesDRAFT_PD2650 2-oxoglutarate dehydrogenase, E2 subunit, dihydrolipoamide succinyltransferase 645104353 110,54 4,37 [Comamonas testosteroni KF-1] 645104937 CtesDRAFT_PD3228 transcriptional regulator, IclR family [Comamonas testosteroni KF-1] 107,51 7,22 645104333 CtesDRAFT_PD2630 conserved hypothetical protein [Comamonas testosteroni KF-1] 104,67 8,13 645106066 CtesDRAFT_PD4313 protein of unknown function DUF1342 [Comamonas testosteroni KF-1] 96,58 10,36 CtesDRAFT_PD1276 succinate dehydrogenase and fumarate reductase iron-sulfur protein 645102961 84,60 8,12 [Comamonas testosteroni KF-1]

CHAPTER 7 119

CHAPTER 7

A proteomic approach to unravel enzymes of the sulfophenylcarboxylate degradation pathway in Comamonas testosteroni KF-1 and identification of a hydroxyquinol ring cleavage dioxygenase

Michael Weiss and David Schleheck

Department of Biological Sciences and Research School Chemical Biology, University of Konstanz, Germany

120 CHAPTER 7

ABSTRACT Xenobiotic sulfophenylcarboxylates (SPCs) are formed as intermediates during bacterial degradation of laundry surfactant linear alkylbenzenesulfonates (LAS). In a first degradation step, bacteria shorten the alkyl chains of LAS by fatty-acid beta-oxidation and SPCs are formed, one of which is 3-(4-sulfophenyl)butyrate (3-C4-SPC). In a second step, the SPCs are completely degraded by other bacteria, with 3-C4-SPC being utilized by Comamonas testosteroni KF-1. For the latter pathway, a first C2 unit is cleaved off the C4-side chain of

3-C4-SPC, most likely in form of acetyl-coenzyme A (acetyl-CoA) in order to yield 4-sulfoacetophenone. Further, 4-sulfoacetophenone has been shown to be oxygenated to 4-sulfophenyl acetate and subsequently hydrolyzed to 4-sulfophenol and acetate. Finally, 4-sulfophenol was presumed to be converted to 4-sulfocatechol for intradiol ring cleavage followed by a desulfonative ortho-ring cleavage pathway. In this study, we identified candidate genes for several enzymatic steps of the anticipated 3-C4-SPC degradation pathway by two dimensional polyacrylamide gel electrophoresis (2D-PAGE) and peptide fingerprinting-mass spectrometry (PF-MS) of proteins which were specifically induced during growth of strain

KF-1 with 3-C4-SPC and 4-sulfophenol. One of them was an intradiol ring cleavage dioxygenase encoded by two nearly identical gene copies. However, the purified recombinant proteins showed no activity with 4-sulfocatechol and catechol, but high activity with hydroxyquinol (1,2,4-trihydroxybenzene); correspondingly, whole cells of strain KF-1 showed much higher, inducible substrate-dependent oxygen uptake with hydroxyquinol than with 4-sulfocatechol when tested. Additionally, a superoxide dismutase was identified, which protects hydroxyquinol from autoxidation to 2-hydroxy-1,4-benzoquinone. Hence, the present data implies the involvement of a desulfonative 4-sulfophenol ring-hydroxylating oxygenase in order to yield hydroxyquinol for ring cleavage, rather than 4-sulfocatechol and an involvement of a desulfonative ortho-ring cleavage pathway.

INTRODUCTION LAS are the major laundry surfactants with an annual consumption of 3 x 106 tons worldwide. After their application, they are disposed, at best, into sewage treatment plants, or directly into aquatic environments (Knepper and Berna 2003). Under aerobic conditions, bacteria can completely degrade LAS (Sawyer and Ryckman 1957), but the degradation pathways, and how these pathways have been assembled in these bacteria, is still not completely understood.

Commercial LAS is composed of a congeneric and isomeric mixture of compounds with linear

C10-C13 alkanes sub-terminally substituted by a 4-sulfophenyl moiety (20 congeners and 18 pairs of enantiomers; Knepper and Berna 2003). Its overall degradation involves communities CHAPTER 7 121 of microorganisms (van Ginkel 1996) and a two-tier model was proposed that explains the degradation of four individual LAS congeners (Schleheck et al. 2004a). The first-tier organism of such communities is represented by Parvibaculum lavamentivorans DS-1T (Schleheck et al. 2000, Schleheck et al. 2004b), which is able to utilize all LAS congeners as carbon and energy source through shortening of the alkyl side chains, and which excretes many different SPCs (>50 intermediates; Schleheck et al. 2004a, Schleheck et al. 2007). A second-tier organism in this community is represented by C. testosteroni KF-1, which is able to utilize only a narrow range of SPCs produced from commercial LAS, that is 3-C4-SPC (3-(4-sulfophenyl)butyrate; see Figure 1, compound I), 3-C5-SPC (3-(4-sulfophenyl)pentanoate), and alpha/beta- unsaturated 3-C4-SPC and 3-C5-SPC (Schleheck et al. 2004a).

Figure 1. Postulated degradation pathway for 3-C4-SPC in C. testosteroni KF-1. Compounds: I, 3-C4-SPC (3-(4-sulfophenyl)butyrate); II, 3-C4-SPC CoA; III, 3-C4-2en-SPC-CoA; IV, 3-OH-3-C4-SPC-CoA (3-hydroxy- 3-(4-sulfophenyl)butyryl-CoA); V, 4-sulfoacetophenone, VI, 4-sulfophenyl acetate; VII, 4-sulfophenol; VIII, 4-sulfocatechol; IX, 3-sulfomuconate; X, 4-sulfomuconolactone; XI, maleylacetate; XII, 3-oxoadipate; XIII, 3-oxoadipyl-CoA (adapted from Schleheck et al. 2010).

During growth of C. testosteroni KF-1 with 3-C4-SPC in the laboratory, two transient metabolites appeared in the culture fluid, and were identified as 4-sulfoacetophenone and 4-sulfophenol. On this basis, a degradation pathway was postulated (Figure 1), since the metabolites 4-sulfoacetophenone and 4-sulfophenol allowed for the conclusion that the C4 side chain of 3-C4-SPC is removed stepwise as C2-units. Firstly, a C2-unit is abstracted from

3-C4-SPC, most likely in form of acetyl-CoA by a sequence of short-chain fatty acid degradation reactions, in order to yield 4-sulfoacetophenone (Figure 1, compound V). Secondly, 122 CHAPTER 7 acetate and 4-sulfophenol are cleaved from a metabolite of 4-sulfoacetophenone, identified as 4-sulfophenyl acetate (Figure 1, compound VI). The 4-sulfophenyl acetate is generated from 4-sulfoacetophenone by a 4-sulfoacetophenone Baeyer-Villiger-type monooxygenase (BVMO) and hydrolyzed by a 4-sulfophenyl acetate esterase, both of which have been identified previously (Weiss et al. 2012).

Measurable substrate-dependent oxygenase activity with 4-sulfocatechol (Figure 1, compound VIII) in whole cells and cell extract (Schleheck et al. 2010) supported the proposed model of an aromatic ortho-ring cleavage of 4-sulfocatechol and a subsequent desulfonation reaction in the ortho-ring cleavage pathway, of 4-sulfomuconolactone (Figure 1, compound X), as observed previously in other bacteria (Contzen et al. 2001, Halak et al. 2006, Halak et al. 2007). Hence, we anticipated a 4-sulfophenol 2-monooxygenase in strain KF-1 in order to yield 4-sulfocatechol for an entry into the anticipated desulfonative 4-sulfocatechol ortho-ring cleavage pathway. However, 4-sulfophenol was also found as an intermediate in the bacterial degradation pathway for the plant herbicide Asulam (methyl- (4-aminobenzenesulfonyl)carbamate), and thought to be metabolized further to hydroxyquinol (1,2,4-trihydroxybenzene; Balba et al. 1979). This possibility, an involvement of hydroxyquinol, has never been explored for the 3-C4-SPC and 4-sulfophenol pathway in strain KF-1.

In the present study, we used differential proteomics to identify abundant proteins expressed specifically during growth of C. testosteroni KF-1 with 3-C4-SPC and 4-sulfophenol compared to cells grown with succinate. The proteins were separated by denaturing 2D-PAGE and proteins of interest were excised and identified by PF-MS. By this means, we identified an elevated expression of a predicted intradiol ring cleavage dioxygenase, which is encoded twice in the genome of strain KF-1. We heterologously expressed the corresponding genes and characterized the purified enzymes.

MATERIALS AND METHODS

Chemicals Standard chemicals were obtained from Sigma-Aldrich, Fluka or Merck. 4-Sulfoacetophenone (4-acetylbenzenesulfonate) was purchased from ABCR (Karlsruhe, Germany), 4-sulfophenyl acetate (1-phenol-4-sulfonate-acetate) from SYNCHEM (Felsberg-Altenburg, Germany), and biochemicals (NADH, NADPH, NAD+ and NADP+) from Biomol (Hamburg, Germany). Racemic 3-(4-sulfophenyl)butyrate was synthesized as described previously (Schleheck et al. 2004a). 4-Sulfocatechol was a gift of B. J. Feigel (Stuttgart). CHAPTER 7 123

Growth conditions C. testosteroni KF-1 (DSM 14576) was grown in a phosphate-buffered, mineral-salts medium (Thurnheer et al. 1986) supplemented with 6 mM 3-(4-sulfophenyl)butyrate, 10 mM 4-sulfophenol or 15 mM succinate as sole carbon source. Cultures were incubated in glass tubes (Corning, 3-ml scale) in a roller, or in Erlenmeyer flasks (0.1 – 2-l scale) on a shaker at 30°C in the dark. Cultures were inoculated (1%) with cultures pre-grown with the same substrate.

Preparation of cell extracts and enrichment of hydroxyquinol dioxygenase and superoxide dismutase C. testosteroni KF-1 cells were harvested in the exponential growth phase by centrifugation (15,000 x g, 20 min, 4°C) and stored frozen (-20°C). Cells were resuspended in 20 mM Tris- HCl buffer (pH 8.0) containing 0.03 mg ml-1 DNase I (Sigma) and disrupted by four passages through a chilled French pressure cell (140 MPa, 4°C) (Aminco, Silver Spring, USA). Whole cells and debris were removed by centrifugation (15,000 x g, 20 min, 4°C) to obtain crude extract, and membranes were removed by ultracentrifugation (60,000 x g, 60 min, 4°C) to obtain the soluble fraction.

The soluble protein solution was subjected to anion exchange chromatography on, either a MonoQ HR (high-resolution) 10/10 column (Pharmacia), or on a DEAE Sepharose fast flow column (Pharmacia). The MonoQ column was pre-equilibrated with 20 mM Tris-HCl containing 10% glycerol at a flow rate of 1 ml min-1. Bound proteins were eluted with a NaCl solution gradient (10 min without NaCl, in 50 min to 0.3 M NaCl, and in further 30 min to 1 M NaCl). The fractions were collected in 3 ml volumes; the hydroxyquinol dioxygenase activity eluted at about 350 mM NaCl as assayed photospectrometrically by an increase of its proposed product maleylacetate or by oxygen consumption in a Clark-type electrode (see below).

The DEAE column was pre-equilibrated with 20 mM Tris-HCl containing 10% glycerol at a flow rate of 1 ml min-1. Bound proteins were eluted using a linear NaCl gradient (20 min without NaCl, in 30 min to 1 M NaCl), and fractions (3 ml) were collected; the ʻhydroxyquinol protecting activityʼ (see below) eluted at about 300 mM NaCl.

Analytical chemistry Total cellular protein was determined following a Lowry-based protocol (Kennedy and Fewson 1968) and soluble protein by protein dye binding (Bradford 1976); bovine serum albumin was used as a reference standard. The release of sulfate during growth was quantified turbidimetrically (Sörbo 1987) as a suspension of BaSO4. 124 CHAPTER 7

Protein gel electrophoresis and proteomics Proteins of whole cells, soluble protein fractions, or proteins obtained from heterologous gene expressions, were analyzed on 12% SDS-PAGE with Coomassie brilliant blue R-250 staining (Laemmli 1970). Proteins in the soluble fraction of C. testosteroni KF-1 cells were analyzed by two dimensional gel elctrophoresis (2D-PAGE; IEF/SDS-PAGE) as described previously (Schmidt et al. 2013). Protein spots of interest were excised and analyzed by PF-MS at the Proteomics Facility of the University of Konstanz. The MASCOT engine (Matrix Science, London, UK) was used to search against an amino-acid sequence database of C. testosteroni KF-1 genes (IMG version 2011-08-16). The parameters for searching and scoring were set as described previously (Schmidt et al. 2013).

Heterologous expression of the two dioxygenase candidate genes in E. coli and purification of the recombinant proteins Chromosomal DNA of C. testosteroni KF-1 was isolated using the Illustra bacteria genomicPrep Mini Spin Kit (GE Healthcare). The genes were amplified by PCR using Phusion HF DNA Polymerase (Finnzymes). The PCR primers were purchased from Microsynth (Balgach, Switzerland). The sequences of the primer pairs were: PD5469 forward and PD5471 forward, 5´-CACCATGCGCAACATCAACGAAGACACC-3´, PD5469 reverse, 5´-TCATGATTGCTGCGCGCTCGGCTTGGTGGG-3´, and PD5471 reverse, 5´-TCAAGTGGAAACCTTGAGTGGG-3´.

For CtesDRAFT_PD5469 and CtesDRAFT_PD5471, the PCR conditions were: 90 s initial denaturation at 98°C; 30 cycles of 15 s denaturation at 98°C, 20 s annealing at 61.6°C, and 60 s elongation at 72°C.

The PCR products were separated by agarose gel electrophoresis, excised, and purified using a

QIAquick gel extraction kit (Qiagen) and ligated into pET100, an N-terminal His6-tag expression vector, by using the Champion pET100 Directional TOPO Expression Kit and OneShot TOP10 E. coli (Invitrogen); correct constructs were confirmed by sequencing (GATC- Biotech, Konstanz). For the gene expression, BL21 Star (DE3) OneShot E. coli (Invitrogen) cells were transformed with the construct and grown in LB medium (100 mg l-1 ampicillin) at

37°C. At OD580nm ≈ 0.5, the cultures were induced (0.5 mM Isopropyl β-D-1- thiogalactopyranoside; IPTG) and grown for additional 5 h at 20°C; cells were harvested by centrifugation (15,000 x g, 20 min, 4°C) and stored at -20°C.

Cells were resuspended in buffer A (20 mM Tris-HCl pH 8.0, 100 mM KCl, 10% (v/v) glycerol) containing 0.03 mg ml-1 DNase I (Sigma), and disrupted by four passages through a pre-cooled CHAPTER 7 125

French pressure cell (140 MPa; Aminco, Silver Spring, USA). Cell debris was removed by centrifugation (15,000 x g, 10 min, 4°C), and membranes by ultracentrifugation (60,000 x g, 60 min, 4°C). The soluble fraction was loaded on a Ni2+-chelating Agarose affinity column (1-ml column volume, Macherey-Nagel, Germany), pre-equilibrated with buffer A, followed by a washing step using 30 mM imidazole in buffer A. The His-tagged protein was eluted using 250 mM imidazole in buffer A, concentrated using Vivaspin (10 kDa cut-off; Sartorius, Germany), and stored at -20°C after addition of glycerol to 30% (v/v) final concentration.

Enzyme assays The activities of native and recombinant hydroxyquinol dioxygenase were measured as substrate dependent oxygen consumption with a Clark-type oxygen electrode (Schleheck et al. 2010) in either 20 mM Tris-HCl buffer (pH 8.0) or in 20 mM MES-NaOH buffer (pH 6.4).

Specific activities of native and recombinant hydroxyquinol dioxygenase were determined spectrophotometrically as decrease of hydroxyquinol at 289 nm (ε = 3.3 x 103 M-1 cm-1; Perry and Zylstra 2007) and as increase of absorption of the putative reaction product maleylacetate at 242 nm (ε = 4.74 x 103 M-1 cm-1; Halak et al. 2007). The activities were plotted using hyperbolic fit in Origin (Microcal Software Inc.). The ʻhydroxyquinol protecting activityʼ was determined as qualitative method; therefore protein samples were added to 50 mM Tris-HCl (pH 8.0) and 1 mM freshly prepared hydroxyquinol. The oxidation of hydroxyquinol, or its protection, was judged visually, by the appearance or absence of a red color.

RESULTS AND DISCUSSION

C. testosteroni KF-1 was grown with 3-C4-SPC, or with 4-sulfophenol as described further below, and the patterns of soluble proteins were compared by 2D-PAGE with the pattern of cells grown with succinate. Protein spots of interest were excised (see Figures S2-S6) and submitted to PF-MS identification (see corresponding Tables S1-S5).

Candidate genes for a conversion of 3-C4-SPC to 4-sulfoacetophenone

Several prominent protein spots were observed specifically on the gels for 3-C4-SPC-grown cells, and several of these were identified repeatedly, i.e., in independent gel electrophoresis runs. Two very prominent 3-C4-SPC-inducible proteins (e.g., spots 2 and 3 in Figure S2, and B7 and B27 in Figure S4) are encoded by locus tags (predicted genes) CtesDRAFT_PD5437 and PD5438 (Tables S1 and S3) (in the following, the locus tag prefixes CtesDRAFT_ are omitted). These genes have previously been identified to encode the 3-C4-SPC-inducible 4-sulfoacetophenone Baeyer-Villiger monooxygenase (SAPMO) and 4-sulfophenyl acetate esterase (Weiss et al. 2012); hence, the proteomic data confirmed our earlier results. 126 CHAPTER 7

Several of the other prominent spots that were observed exclusively for 3-C4-SPC-grown cells, identified candidate genes that are located in close vicinity to the yet identified SAPMO- and 4-sulfophenyl acetate esterase genes, i.e., within the PD54xx loci of the genome. Further, the annotation of several of these genes suggested enzymes that may be involved in CoA-ester metabolism, hence, in support of the hypothesis that 3-C4-SPC may be converted to 4-sulfoacetophenone in a reaction sequence in analogy to short-chain fatty acid CoA degradation, thus, involving a 3-C4-SPC CoA-ester (see Figure 1).

For the formation of the predicted 3-C4-SPC CoA-ester, an acyl-CoA transferase candidate, PD5460, was identified (n = 2) by prominent spots (spot 10 in Figure S2 and spot B21 in Figure S4) and an acyl-CoA synthetase candidate, PD5439 (spot B3 in Figure S4). For an alpha/beta-unsaturation of predicted 3-C4-SPC CoA, the acyl-CoA dehydrogenase candidate PD5491 was identified by another prominent spot (spot B2 in Figure S4). The next gene in the genome, PD5490, was identified (n = 2) by prominent spots (spot 2 in Figure S2 and B28 in Figure S4) and is annotated to encode an enoyl-CoA hydratase. Another enoyl-CoA hydratase candidate in the same genome region, PD5454, was also identified (spot B29 in Figure S4). For the next enzyme of the postulated reaction sequence, NAD-binding 3-hydroxyacyl-CoA dehydrogenase, no candidate(s) could be identified by proteomics, i.e., not within the PD54xx loci and not elsewhere in the genome. However, for an abstraction of acetyl-CoA from a 3-ketoacyl-CoA intermediate, the acyl-CoA acetyltransferase (thiolase) candidate PD5455 was identified by a very prominent spot (spot B14 in Figure S4) and the thiolase candidate is encoded directly next to the identified enoyl-CoA hydratase PD5454 (see above).

Notably, a beta-ketoadipyl-CoA thiolase gene (PD5466; spot A3 in Figure S3) and a 3-oxoacid CoA transferase subunit gene (PD5467) were also identified, but these candidates are attributed to contribute rather to the aromatic ring cleavage pathway (described further below). In addition, there were also other candidates for CoA-ester metabolism identified specifically on the gels for 3-C4-SPC-grown cells that are not located within the PD54xx loci, as follows: Two other enoyl-CoA hydratase candidates, PD0528 and PD1197, one acetyl-CoA synthetase, PD3605, two other predicted acyl-CoA dehydrogenases, PD0387 and PD4604, another beta- ketoadipyl-CoA thiolase gene, PD4382, and two other 3-oxoacid CoA transferase subunit genes, PD2590 and PD4381.

Finally, other candidates located within the PD54xx loci were also identified specifically on the gels for 3-C4-SPC-grown cells, but are not related to CoA-ester metabolism, as follows: Predicted aryl-alcohol dehydrogenase, PD5443; predicted dehydrogenase, PD5475; short-chain dehydrogenase/reductase (SDR), PD5461; iron-containing alcohol dehydrogenase, PD5474; CHAPTER 7 127 predicted nitropropane dioxygenase (NPD), PD5453; predicted catechol 1,2-dioxygenase paralogs PD5469 and PD5471 (see below). Furthermore, several other genes were identified, or co-identified, elsewhere in the genome but are most likely attributed to encode enzymes for central metabolism, and are not further described here (see Tables S1-S5).

Another protein identified on the gels of 3-C4-SPC-grown cells, which could indeed play a role in the postulated 3-C4-SPC degradation pathway, was a predicted Rieske (2Fe2S) ring- hydroxylating dioxygenase, PD4205, which was identified twice (spot 6 in Figure S2 and B15 in Figure S4), and is discussed below in the context of the 4-sulfophenol degradation pathway.

Candidate genes for a conversion of 4-sulfophenol, for aromatic ring cleavage and for a ring cleavage pathway C. testosteroni KF-1 is able to utilize also the intermediate 4-sulfophenol as sole source of carbon and energy for growth, and gratuitously induces the complete 3-C4-SPC pathway during growth with 4-sulfophenol (Schleheck et al. 2004a, Schleheck et al. 2010). Quantitative growth of strain KF-1 with 10 mM 4-sulfophenol was confirmed in a growth experiment (not shown) by the observed molar growth yield of 5.8 g of protein per mol of carbon, representing mass balance of 4-sulfophenol carbon as biomass and CO2 (Cook 1987); a stoichiometric release of the sulfur of 4-sulfophenol as sulfate (9.8 mM) was observed, and the growth rate (µ) determined was 0.06 h-1, which translated into a specific degradation rate for 4-sulfophenol of 288 nmol min-1 mg-1.

In the first step of the postulated pathway for 4-sulfophenol (Figure 1), the substrate would be converted by a 4-sulfophenol 2-monooxygenase to 4-sulfocatechol, and the 4-sulfocatechol would be the subject of aromatic ring cleavage (Schleheck et al. 2010). Notably, 4-sulfocatechol is not a growth substrate for strain KF-1 (Schleheck et al. 2010), however, a similar observation was made for Pseudomonas sp. strain WBC-3, which is able to utilize 4-nitrophenol for growth but not its degradation intermediate 4-nitrocatechol (Wei et al. 2010).

An inducible NAD(P)H-dependent 4-sulfophenol oxygenase activity was observed in cell-free extracts of strain KF-1, grown with 3-C4-SPC (Schleheck et al. 2010), and on the 2D-PAGE gels for 3-C4-SPC-grown cells, a ring-hydroxylating dioxygenase candidate gene (Rieske (2Fe-2S)-type oxygenase component) was identified (see above, PD4205). The same candidate was identified when 4-sulfophenol-grown cells were analyzed by 2D-PAGE (see spot 24 in Figure S5 and Table S4). This oxygenase component gene is co-encoded with a ferredoxin- reductase component gene, PD4202, and a ferredoxin gene, PD4203, but these could not be identified by proteomics. However, also a second Rieske-type oxygenase component gene, 128 CHAPTER 7

PD0403, was found for 4-sulfophenol-grown cells (spot 8 in Figure S5 and Table S4), as well as the corresponding ferredoxin-reductase component, PD0404 (spot 27 in Figure S6 and Table S5). Hence, we identified two putative aromatic ring hydroxylating oxygenase components (PD4205 and PD0403), but only one set of candidate genes for a complete oxygenase system (PD0403 with PD0404). However, this gene set, PD0403 with PD0404, has been characterized in C. testosteroni BR6020 to represent the vanillate and isovanillate demethylase system (Providenti et al. 2006a). On the other hand, only one component, PD4205, was identified for both 3-C4-SPC and 4-sulfophenol-grown cells. Hence, further analyses are necessary in order to clarify which candidate might be responsible for 4-sulfophenol conversion in strain KF-1, e.g., by transcriptional, mutational and/or heterologous-expression and activity assays.

An inducible 4-sulfocatechol oxygenase activity was detected in unamended crude extract of

3-C4-SPC-grown cells (Schleheck et al. 2010), hence, a ring cleavage dioxygenase activity which was attributed by that time as 4-sulfocatechol 1,2-dioxygenase leading to 3-sulfomuconate (see Figure 1). Two candidate genes for ring cleavage dioxygenase were identified by 2D-PAGE in 3-C4-SPC-grown cells (see above, PD5469 and PD5471). Importantly, both proteins include a catechol (intradiol) 1,2-dioxygenase domain and are completely identical in their sequence, except of three amino acids at the C-terminus of the polypeptide chain. The same candidates were identified when 4-sulfophenol-grown cell were analyzed by 2D-PAGE. Here, the intradiol ring cleavage dioxygenase protein was co-located in the same protein spot with an electron transfer flavoprotein alpha subunit (etfA, PD1199). This spot was also found on the gels from succinate-grown cells (Figure S5, spot S1) but with lower intensity, however, when the spot was excised and identified by PF-MS, no hit for genes PD5469 and PD5471, but only for etfA, was obtained (Table S4). Interestingly, the attributed intradiol ring cleavage dioxygenase genes are clustered with predicted genes for an intradiol ring cleavage pathway (beta-ketoadipate pathway), and the beta-ketoadipyl-CoA thiolase gene, PD5466, was also identified as well as the 3-oxoacid CoA transferase subunit gene, PD5467 (see above). In addition, this gene cluster is framed by transposase genes (Figure S7) and none of these genes is found in any other genome-sequenced C. testosteroni strain (not shown). Thus, it is tempting to speculate if these genes were only recently mobilized from a different location into the C. testosteroni genome in order to allow for the metabolism of these xenobiotic substrates.

Both intradiol ring cleavage dioxygenase candidate genes were cloned, expressed in E. coli and the proteins purified (Figure S1). There was no activity measurable with 4-sulfocatechol as substrate, as followed in a Clark-type oxygen electrode, and no activity with catechol, CHAPTER 7 129

4-nitrocatechol, protocatechuate, hydrochinone, 3-methylcatechol, 4-methylcatechol, or 3-chlorocatechol, when tested. However, both recombinant enzymes showed high oxygenase activity with hydroxyquinol (1,2,4-trihydroxybenzene). The preference of this type of ortho- fission dioxygenase for trihydroxylated aromatic compounds, compared to dihydroxylated aromatic compounds, might originate from a reduced activation energy of hydroxyquinol with its increased semiquinone character (Zaborina et al. 1999).

Hydroxyquinol 1,2-dioxygenases are known to be involved in the 4-aminophenol catabolism in Burkholderia sp. strain AK-5 (Takenaka et al. 2003), and in the catabolism of 4-nitrophenol and 4-nitrocatechol in Arthrobacter sp. strain JS443 (Jain et al. 1994, Perry and Zylstra 2007) and Pseudomonas sp. strain WBC-3 (Wei et al. 2010). In these pathways, hydroxyquinol is cleaved into maleylacetate, which is further reduced to beta-ketoadipate by a maleylacetate reductase. Importantly, a candidate gene for a maleylacetate reductase is encoded in close proximity to genes PD5469 and PD5471, and this gene, PD5474, was also identified for 4-sulfophenol-grown cells by 2D-PAGE (Figure S4, Table S3), hence, in addition (see above) to the beta-ketoadipyl-CoA thiolase gene (PD5466) and 3-oxoacid CoA transferase subunit gene (PD5467) at the same locus in the strain KF-1 genome. The strain KF-1 hydroxyquinol 1,2-dioxygenases share high sequence similarity to the known hydroxyquinol 1,2-dioxygenases; PD5469 / PD5471 share 50% amino acid identity to NpdB (Perry and Zylstra 2007), and 47% to PnpG (Zhang et al. 2009), and the proposed strain KF-1 maleylacetate reductase (PD5474) shares 44% identity to NpdC (Perry and Zylstra 2007) and 57% to PnpF (Zhang et al. 2009).

The kinetic parameters for the recombinant hydroxyquinol dioxygenases were determined spectrophotometrically: For the substrate disappearance (hydroxyquinol), the Km was 26.65 ± -1 -1 5.49 µM and Vmax of 3.36 ± 0.38 µmol min mg ; when measured as product formation -1 -1 (maleylacetate), the Km was 29.54 ± 13.75 µM and Vmax, 3.10 ± 0.26 µmol min mg . The two enzymes were, within the standard deviation, identical in their specific activities. The hydroxyquinol 1,2-dioxygenases were also measured in the oxygen electrode with whole cells of strain KF-1 grown with 4-sulfophenol, and the activity was much higher (124.98 ± 22.75 nmol min-1 mg-1) in comparison to the oxygenase activities detectable with 4-sulfocatechol (18.64 ± 3.81 nmol min-1 mg-1), 4-sulfophenol (56.26 ± 2.65 nmol min-1 mg-1) and protocatechuate (16.81 ± 5.08 nmol min-1 mg-1) (Figure 2); succinate-grown cells showed no hydroxyquinol 1,2-dioxygenases activity, hence, the enzyme was confimed to be inducible.

130 CHAPTER 7

160

140

)

1 -

mg 120

1 -

min 100 nmol

( 80

60 activity

40 specific 20

0 4-sulfocatechol hydroxyquinol 4-sulfophenol protocatechuate

Figure 2. Substrate-dependent oxygen uptake rates observed with whole cells of C. testosteroni KF-1. The cells were grown in mineral salts medium supplemented with 4-sulfophenol (10 mM) as carbon source.

The native hydroxyquinol 1,2-dioxygenase activity in 4-sulfophenol-grown cells was enriched by anion exchange chromatography (MonoQ column). One fraction exhibited high hydroxyquinol-dependent oxygen consumption in the oxygen electrode, and a protein band enriched in this active fraction (see Figure S7) was excised and analyzed by PF-MS. The protein identified, again, the intradiol dioxygenase candidate genes PD5469/PD5471 (score, 419; coverage, 46%).

Interestingly, when the enzyme assay was performed in Tris-HCl buffer pH 8.0, the hydroxyquinol solution turned from colorless to red before the addition of enzyme, due to the autoxidation of hydroxyquinol to 2-hydroxy-1,4-benzoquinone (Figure 3, compound XV) (Zaborina et al. 1999), which is a dead-end product in the pathway to maleylacetate (Takenaka et al. 2011). A protein fraction from an independent anion exchange chromatography run (DEAE column) prevented the formation of red color and, hence, the formation of 2-hydroxy- 1,4-benzoquinone. A protein band enriched in this active fraction (see Figure S8) was excised and identified by PF-MS. The protein was a superoxide dismutase (SOD), PD1809 (score, 519; coverage, 64%). Notably, this protein was identified also as 4-sulfophenol inducible protein on 2D-PAGE (Figure S6, Table S5). The involvement of SOD in hydroxyquinol degradation was CHAPTER 7 131 already observed for the 4-aminophenol pathway in Burkholderia sp. strain AK-5 (Takenaka et al. 2011), where the enzyme is thought to savage and detoxify reactive species that promote an autoxidation of hydroxyquinol to 2-hydroxy-1,4-benzoquinone. The SOD amino acid sequence of strain AK-5 shares 67% identity with the sequence of the SOD identified in strain KF-1.

Figure 3. Transformation steps considerable for 4-sulfophenol involving hydroxyquinol as ring cleavage substrate, and transformations of hydroxyquinol. Compounds: VII, 4-sulfophenol; VIII, 4-sulfocatechol; XI, maleylacetate; XIV, hydroxyquinol; XV, 2-hydroxy-1,4-benzoquinone.

In addition, it is very likely that strain KF-1 encodes also a reductase that actively converts the autoxidation product 2-hydroxy-1,4-benzoquinone back to hydroxyquinol (see Figure 3), as proposed for resorcinol transformation in Rhodococcus sp. BPG-8 (Armstrong et al. 1993) and 4-aminophenol degradation of Burkholderia sp. AK-5 (Takenaka et al. 2011), since we have preliminary evidence (not shown) for an NADH-dependent enzyme that decolorizes 2-hydroxy- 1,4-benzoquinone / oxidized hydroxyquinol solutions.

Conclusion and outlook The annotation of several of the identified genes, which were specifically induced during growth of C. testosteroni KF-1 with 3-C4-SPC, suggested enzymes that are involved in a metabolism of CoA-esters. This is an observation in support of the hypothesis that 3-C4-SPC may be converted to 4-sulfoacetophenone in a reaction sequence in analogy to short-chain fatty acid CoA degradation, thus, involving a 3-C4-SPC CoA-ester (Figure 1). However, much more work lies ahead in order to confirm this postulated pathway section, i.e., the ‘upper’ 3-C4-SPC pathway (Figure 1), for example by transcriptional and mutational analyses, and heterologous expression and activity assays once the SPC CoA-esters as substrates were synthesized.

Furthermore, it was shown that the ‘lower’ 3-C4-SPC pathway involves hydroxyquinol as substrate for aromatic ring cleavage, rather than (or in addition to?) 4-sulfocatechol and the 132 CHAPTER 7 modified, desulfonative ortho-degradation pathway, as initially hypothesized (see Figure 1; Schleheck et al. 2010). The hydroxyquinol 1,2-dioxygenase was unequivocally identified and characterized in this study, and proteomics as well as the genome sequence gave no indication for a second intradiol dioxygenase candidate and, hence, putative 4-sulfocatechol 1,2-dioxygenase that may be present in strain KF-1; both of the two identical copies of hydroxyquinol 1,2-dioxygenase did not convert 4-sulfocatechol when tested.

In future work, the desulfonative (di)oxygenase reaction(s) that convert the intermediates 4-sulfophenol to hydroxyquinol need(s) to be identified. Considerable options, as illustrated in Figure 3, are: (i) A single, desulfonative dioxygenase reaction is catalyzed by one dioxygenase enzyme system, in analogy to the desulfonating multicomponent dioxygenase system of the 4-toluenesulfonate degradation pathway in other C. testosteroni species, which converts the intermediate 4-sulfobenzoate to the ring cleavage substrate protocatechuate (3,4-dihydroxybenzoate) (Junker et al. 1996); (ii) two individual monooxygenase reactions and a pathway that may still involve 4-sulfocatechol, hence, a pathway in analogy to 4-nitrophenol degradation in Arthrobacter sp. i.e., from 4-nitrophenol to 4-nitrocatechol to hydroxyquinol (Jain et al. 1994, Perry and Zylstra 2007); this option presumably involves 2-hydroxy- 1,4-benzoquinone as an intermediate, which is converted to hydroxyquinol by a yet unidentified reductase; (iii) a monooxygenase reaction forming 4-sulfocatechol, which undergoes a nucleophilic aromatic substitution and leads to hydroxyquinol; (iv) some completely novel and unexpected reactions.

The current information suggests that the 3-C4-SPC pathway has been assembled as ‘patchwork’ combination of genes recruited from at least four different pre-existing pathways, from (i) short-chain fatty acid CoA degradation for the ‘upper’ 3-C4-SPC pathway, (ii) 4-hydroxyacetophenone/phenylacetone degradation for the conversion of 4-sulfoacetophenone to 4-sulfophenol in the ‘central’ 3-C4-SPC pathway, and for the ‘lower’ 3-C4-SPC pathway, (iii), a yet unknown precursor pathway/enzymes for the conversion of 4-sulfophenol to hydroxyquinol, and (v) hydroxyquinol degradation. Notably, hydroxyquinol is the first non- xenobiotic, desulfonated, natural intermediate in this pathway, and the second major SPC utilized by strain KF-1, 3-C5-SPC (see introduction), could proceed along the same pathway (first, the abstraction of acetyl-CoA) but involve 4-sulfopropiophenone (4-propionylbenzenesulfonate) as intermediate instead of 4-sulfoacetophenone.

Most excitingly, many of the genes which we identified to be specifically induced during growth of strain KF-1 with 3-C4-SPC in this study, as well as the previously identified 4-sulfoacetophenone-BVMO and esterase genes (Weiss et al. 2012), are all encoded in the same CHAPTER 7 133 genome region in strain KF-1, in the CtesDRAFT_PD54xx loci. These genes are located within a complex arrangement of sets of transposase genes of mobile genetic elements (IS1071 elements), and this arrangement is not found in any other genome-sequenced C. testosteroni strain (not shown). Therefore, it is tempting to speculate that we have identified a genome region with recently mobilized gene modules that encode SPC degradation in combination. Although bacteria have at least started the conversion of LAS to SPCs right after its introduction on the market 50 years ago, the genetic assembly and optimization of these pathways seems to be still in progress.

ACKNOWLEDGEMENTS We thank Frederick von Netzer for initial growth- and whole cell experiments, Lena Wurmthaler, Aylin Metzner and Manuel Serif for their help with growth experiments, protein preparation and 2D-PAGE analysis, as well as Andreas Marquardt of the Proteomics Facility (University of Konstanz) for PF-MS operation, and the DOE-JGI for the C. testosteroni KF-1 genome sequence. The work of M.W. was supported by the Konstanz Research School Chemical Biology and the Zukunftskolleg (University of Konstanz). D.S. is grateful for funding by a Deutsche Forschungsgemeinschaft (DFG) grant (SCHL 1936/1-1), by the University of Konstanz and by the Konstanz Young Scholar Fund.

134 CHAPTER 7

SUPPLEMENTAL DATA

Figure S1. Heterologously expressed intradiol ring cleavage dioxygenase genes. Lanes 1 and 9, whole cells prior to IPTG induction; lanes 2 and 8 whole cells after IPTG induction; lanes 3 and 7 soluble protein fraction after ultracentrifugation (80 µg protein); lanes 4 and 6, protein fraction obtained from Ni2+-NTA purification (30 µg protein); lane 5, molecular mass marker (in kDa).

CHAPTER 7 135

spots were excised and

SPC (left) or succinate (right). Protein

-

4

C

-

1 cells grown with 3

-

KF

testosteroni

C.

sults aresults summarized Table in S1.

MS. The re

PAGE of soluble fractions of

-

-

2D

analyzed by analyzed PF Figure S2.

136 CHAPTER 7

S1. Table

Identities2D of

-

PAGEprotein Figure (see spots S2)

CHAPTER 7 137

SPC (left) or succinate (right). Protein spots were excised and

-

4

C

-

1 cells grown with 3

-

KF

.

testosteroni

C.

MS. The results are summarized Table in S2

PAGE of soluble fractions of

-

-

2D

analyzed by analyzed PF Figure S3.

138 CHAPTER 7

S2. Table

Identities2D of

-

PAGEprotein Figure (see spots S3)

CHAPTER 7 139

SPC (left) or succinate (right). Protein spots were excised and

-

4

C

-

1 cells grown with 3

-

KF

.

testosteroni

C.

marized in Table in marized S3

MS. The MS. are results sum

PAGE of soluble fractions of

-

-

2D

analyzed by PF by analyzed Figure S4.

140 CHAPTER 7

2

28 39 19 19 38 19 26 12 27 15 14 16 18 34 14 79 64 26 23 49 40 69 51 65

(%)

Coverage

62 59

741 476 148 111 693 325 100 254 141 109 113 108 538 164 744 655 230 225 733 486 662 262 433

Score PMFScore

(Da)

MW

48789 82196 77538 78700 74033 73348 92581 72369 59865 65557 65059 59865 60271 56081 57624 59865 54936 49596 59583 47708 37806 41981 45748 40446

105151

annotated annotated

inducible protein F protein inducible

-

dependent

-

NADP

and II

Annotation

carboxyvinyltransferase

-

2 protein domain

-

2S) domain protein 2S) domain

-

dehydratase / bile acid dehydratase

CoA acetyltransferase

-

dependent synthetase and dependentligase synthetase

CoA dehydrogenase domaindehydrogenase CoA protein domaindehydrogenase CoA protein

binding protein domain

- thiolase CoA ketoadipyl

containing alcohol dehydrogenase alcohol containing

- -

-

-

-

carnitine

phosphoshikimate 1 phosphoshikimate BVMO Sulfoacetophenone BVMO Sulfoacetophenone BVMO Sulfoacetophenone NPD dioxygenase nitropopane

-

- - - - -

Isocitrate dehydrogenase, dehydrogenase, Isocitrate G Translation elongation factor Acyl 3 CoA 2 hydratase Aconitate AAA ATPase ligase Acetate/CoA 4 flavoprotein subunit Succinatedehydrogenase, Acyl 4 AMP transferase Carboxyl domain TPPpyrophosphateprotein Thiamine binding 4 lyase Isocitrate Acetyl (2Fe Rieske class I Aminotransferase Trigger factor Iron Beta L 2

Locus tag Locus

CtesDRAFT_PD1530 CtesDRAFT_PD1820 CtesDRAFT_PD4792 CtesDRAFT_PD5491 CtesDRAFT_PD3572 CtesDRAFT_PD5439 CtesDRAFT_PD1285 CtesDRAFT_PD0908 CtesDRAFT_PD3605 CtesDRAFT_PD5437 CtesDRAFT_PD1277 CtesDRAFT_PD0387 CtesDRAFT_PD5437 CtesDRAFT_PD1644 CtesDRAFT_PD2294 CtesDRAFT_PD0525 CtesDRAFT_PD5437 CtesDRAFT_PD5455 CtesDRAFT_PD4205 CtesDRAFT_PD2303 CtesDRAFT_PD3764 CtesDRAFT_PD5474 CtesDRAFT_PD4382 CtesDRAFT_PD5460 CtesDRAFT_PD5453

PAGE protein spots (see Figure S4) spots Figure (see protein PAGE

-

85 75 70 97 70 60 63 58 55 55 60 50 55 53 55 55 45 45 50 45

105

MW

(kDa)

apparent apparent

B1 B2 B3 B4 B5 B6 B7 B8 B9

B10 B11 B12 B13 B14 B15 B16 B17 B19 B20 B21 B23

Spot No Spot

Identities of 2D of Identities

TableS3.

CHAPTER 7 141

9 7

13 32 44 39 28 39 64 23 13 52 43 49 36 43 46 43 32 60 43

96 47 82 84 73

255 475 467 230 430 806 177 263 135 156 229 337 149 181 448 180

40732 23869 35959 39650 33605 28700 28783 26620 36566 28098 24695 29761 26620 24865 27958 27390 22413 29425 22409 23247 21544

subunit subunit

- -

beta beta

/ /

alpha alpha

lactonase

phosphate dehydrogenase, type I type phosphatedehydrogenase,

-

-

transcriptional regulator, LuxR family LuxR regulator, transcriptional

3

transferase, subunittransferase, A subunittransferase, A B subunittransferase, B subunittransferase,

-

- - - -

hydratase/isomerase

dehydrogenase domain protein dehydrogenase

CoA hydratase/isomerase CoA hydratase/isomerase CoA CoA hydratase/isomerase CoA

- - - -

CoA CoA

-

Sulfophenyl acetate esterase acetate Sulfophenyl oxoacid CoA oxoacid CoA oxoacid CoA enol oxoadipate oxoacid CoA

------

Acyl component Two Glyceraldehyde proteinOxidoreductase domain 4 Enoyl Enoyl flavoproteinElectron transfer reductase Aldo/keto Enoyl 3 protein II aldolase/adducinClass family flavoproteinElectron transfer 3 Enoyl helix componentregulator, Two transcriptional winged 3 3 3 family componentregulator, Two transcriptional LuxR proteinshock Heat Hsp20

CtesDRAFT_PD4604 CtesDRAFT_PD2533 CtesDRAFT_PD5265 CtesDRAFT_PD5475 CtesDRAFT_PD5438 CtesDRAFT_PD5490 CtesDRAFT_PD5454 CtesDRAFT_PD1198 CtesDRAFT_PD5443 CtesDRAFT_PD1197 CtesDRAFT_PD4380 CtesDRAFT_PD5459 CtesDRAFT_PD1198 CtesDRAFT_PD2590 CtesDRAFT_PD0528 CtesDRAFT_PD2814 CtesDRAFT_PD4381 CtesDRAFT_PD4384 CtesDRAFT_PD5467 CtesDRAFT_PD0696 CtesDRAFT_PD0907

45 27 45 40 33 33 31 30 34 28 27 29 29 29 29 29 27 28 27 26 22

B25 B42 B24 B26 B27 B28 B29 B30 B31 B32 B33 B34 B35 B36 B37 B38 B39 B40 B41 B43 B44

142 CHAPTER 7

and analyzed by PF analyzed and S5. Figure

2D

-

PAGE of soluble fractions of of fractions soluble of PAGE

-

MS. The results are summarized in Table S4 in Table summarized are results The MS.

C.

testosteroni

KF

-

.

1 c 1

ells grown with 4 with grown ells

-

sulfophenol

(left) or succinate (right). Protein spots were excised excised were spots Protein (right). succinate or (left)

CHAPTER 7 143

see Figure S5) seeFigure

PAGE protein spots ( protein PAGE

-

Identities of 2D of Identities

TableS4.

144 CHAPTER 7

and analyzed by andanalyzed PF Figure S6. Figure

2D

-

PAGE of soluble fractions of of fractions soluble of PAGE

-

MS. The results are summarized in Table S5 in summarized results Table are The MS.

C.

testosteroni

KF

-

.

1 cells grown with 4 with grown cells 1

-

sulfophenol

(left) or succinate (right). Protein spots were excised excised were spots Protein (right). succinate or (left)

CHAPTER 7 145

, ,

family family

3

Tn

5470, 5472

,

5464, 5464, 5465, 5476

XX):

.

diol ring cleavage dioxygenases;

clavage dioxygenases are highlighted in yellow

-

intra

,

oxidoreductase

,

5475

5469, 5471

cted cted reductase;maleylacetate

edi

transferase (A and B subunit);

, pr ,

-

The identified genes encoding the aromatic ring

A

5474

oxoacid Co

-

3

,

barrel protein;

beta

-

5467, 5468

alpha

PAGE protein spots (see Figure S6) spots Figure (see protein PAGE

stress responsivestress

-

,

related to the degradation of aromatic compunds.

ketoadipyl CoA thiolase;

5473

-

beta

,

5466

Identities of 2D of Identities

Cluster Cluster of genes

. .

S7

ansposases;

TableS5. and the framing transposasse genes are shown in tr orange. The annotated gene functions are as follows transcriptional regulator; (locus tag CtesDRAFT_PDXX Figure 146 CHAPTER 7

1 2 3 4

170

130

72

55

43

34

26

17

10

Figure S8. SDS-gel of protein fractions obtained after a MonoQ run. A specific protein band accelerated in fraction 21 (box-framed, lane 3), which was analyzed by PF-MS and resulted in the intradiol dioxygenases (locus tags: CtesDRAFT_PD5469/PD5471). Lane 1, molecular mass marker (kDa); lane 2, proteins from fraction 20 (40 µg); lane 3, proteins from fraction 21 (40 µg); lane 4, proteins from fraction 23 (40 µg). CHAPTER 7 147

1 2 3 4 5 6

170

130

72

55

43

34

26

17

10

Figure S9. SDS-gel of protein fractions obtained after a DEAE-run. A specific protein band accelerated in fractions 12 and 13 (lanes 2 and 3). The band was excised from lane 3 (box-framed) and analyzed by PF-MS and resulted in a superoxide dismutase (locus tag: CtesDRAFT_PD1809). Lane 1, proteins from fraction 14 (50 µg); lane 2, proteins from fraction 13 (50 µg); lane 3, proteins from fraction 12 (50 µg); lane 4, proteins from fraction 11 (50 µg); lane 5, proteins from the soluble fraction (80 µg); lane 6, molecular mass marker (kDa).

CHAPTER 8 149

CHAPTER 8

Sulfoglycolysis in Escherichia coli K-12 closes a gap in the biogeochemical sulfur cycle

Karin Denger1, Michael Weiss2, Ann-Katrin Felux2, Alexander Schneider3, Christoph Mayer3, Dieter Spiteller1, Thomas Huhn4, Alasdair M. Cook1 & David Schleheck1

1Department of Biology, University of Konstanz, Germany

2Konstanz Research School Chemical Biology, University of Konstanz, Germany

3Interfaculty Institute of Microbiology and Infection Medicine, University of Tübingen, Germany

4Department of Chemistry, University of Konstanz, Germany

This chapter was published in Nature. 2014, 507(7490):114-117.

150 CHAPTER 8

ABSTRACT Sulfoquinovose (SQ, 6-deoxy-6-sulfoglucose) has been known for 50 years as the polar headgroup of the plant sulfolipid (Benson 1963, Benning 2007) in the photosynthetic membranes of all higher plants, mosses, ferns, algae, and most photosynthetic bacteria (Benning 1998). It is also found in some non-photosynthetic bacteria (Harwood and Nicholls 1979), and SQ is part of the surface layer of some Archaea (Meyer et al. 2011). The estimated annual production of SQ (Harwood and Nicholls 1979) is 10,000,000,000 tonnes (10 petagrams), thus it comprises a major portion of the organo-sulfur in nature, where SQ is degraded by bacteria (Martelli 1967, Roy et al. 2003). However, despite evidence for at least three different degradative pathways in bacteria (Martelli 1967, Roy et al. 2003, Denger et al. 2012), no enzymic reaction or gene in any pathway has been defined, although a sulfoglycolytic pathway has been proposed (Roy et al. 2003).

Here we show that Escherichia coli K-12, the most widely studied prokaryotic model organism, performs sulfoglycolysis, in addition to standard glycolysis. SQ is catabolised through four newly discovered reactions that we established using purified, heterologously expressed enzymes: SQ isomerase, 6-deoxy-6-sulfofructose (SF) kinase, 6-deoxy-6-sulfofructose- 1-phosphate (SFP) aldolase, and 3-sulfolactaldehyde (SLA) reductase. The enzymes are encoded in a ten-gene cluster, which probably also encodes regulation, transport and degradation of the whole sulfolipid; the gene cluster is present in almost all (>91%) available E. coli genomes, and is widespread in Enterobacteriaceae. The pathway yields dihydroxyacetone phosphate (DHAP), which powers energy conservation and growth of E. coli, and the sulfonate product 2,3-dihydroxypropane-1-sulfonate (DHPS), which is excreted. DHPS is mineralized by other bacteria, thus closing the sulfur cycle within a bacterial community.

INTRODUCTION, RESULTS, AND DISCUSSION Recent work showed that environmental isolates of Klebsiella spp. (Enterobacteriaceae) convert SQ quantitatively to DHPS (Roy et al. 2003, Denger et al. 2012), and we hypothesized that utilization of SQ might be a property of Enterobacteriaceae. We found that four genome- sequenced E. coli K-12 substrains (BW25113, DH1, MG1655 and W3100), after subculturing, grew with SQ within 1 to 3 days. We chose to work (largely) with the fastest-growing substrain, MG1655. The organism used SQ as a sole source of carbon and energy with a molar-growth yield of 3 g of protein per mol of SQ carbon, whereas glucose gave about 6 g of protein per mol of carbon; the latter value represented mass balance of carbon as biomass and CO2 (Cook 1987). However, approximately 1 mol of DHPS per mol of SQ was released into the growth medium CHAPTER 8 151

(Figure 1a), as observed with Klebsiella oxytoca (Denger et al. 2012). Thus, there was complete mass balance for carbon and for sulfur from SQ. The growth rate with SQ was 0.13 h-1 (0.5 h-1 with glucose), and the specific degradation rate for SQ in vivo was 120 mU per mg of protein -1 (1 mU = 1 nmol min ). We concluded that SQ is metabolized to a C3 sulfonate, which is excreted as DHPS, and that the remainder of the molecule is utilized for growth (Figure 2a).

The out-grown culture was filter-sterilized and inoculated with Cupriavidus pinatubonensis JMP134, which can utilize DHPS for growth (Mayer et al. 2010), but cannot utilize SQ (Denger et al. 2012). C. pinatubonensis grew with the DHPS formed from SQ by E. coli, and released its sulfonate-sulfur quantitatively as sulfate (Figure 1b) using a pathway described elsewhere (Mayer et al. 2010). We thus demonstrated mineralization of SQ in a laboratory model system.

Figure 1. Complete degradation of sulfoquinovose during growth. a, Growth of E. coli K-12 substrain MG1655 with SQ and excretion of 2,3-dihydroxypropane-1-sulfonate (DHPS). b, Growth of C. pinatubonensis JMP134 with the DHPS formed from SQ by E. coli. Data from representative growth experiments (n = 3) are shown. To allow a compact graph, sulfate release and not total sulfate is shown.

Proteins from whole cells of E. coli K-12 grown with glucose or SQ were subjected to two dimensional polyacrylamide gel electrophoresis (2D-PAGE) (Extended Data Figure 1) and examined by peptide fingerprinting-mass spectrometry (PF-MS) (Extended Data Table 1). The immediately relevant, apparently SQ-inducible proteins (see Extended Data Figure 1 and Extended Data Table 1) were attributed to b3878 (also known as yihQ (b numbers are locus 152 CHAPTER 8 tags); predicted to be an -glucosidase), b3879 (also known as yihR; predicted to be an epimerase), b3880 (yihS; predicted to be an isomerase), b3881 (yihT; predicted to be an aldolase), and b3882 (yihU; predicted to be an NAD+/NADH-linked dehydrogenase/reductase). Transcriptional analyses for the gene cluster b3879-b3882, as well as for b3883 (also known as yihV; predicted to be a sugar kinase), confirmed a strong inducible transcription during growth with SQ, but not during growth with glucose (Extended Data Figure 2). Furthermore, single- gene knockouts (in substrain BW25113; Baba et al. 2006) in genes b3876 (also known as yihO; predicted to be a major facilitator superfamily (MFS)-type transporter), b3880, b3881 and b3883 did not grow with SQ, which confirmed and expanded on the proteomic and transcriptional data (Figure 2b).

Figure 2. The four core enzyme reactions of sulfoglycolysis, with transport, and the corresponding genes in a ten- gene cluster in E. coli K-12. a, SQ is metabolized by four enzymes (shown in color) to a C3 sulfonate, DHPS, which is excreted, and the remainder of the molecule is used for growth. For comparison, the analogous enzyme reactions for the catabolism of (unsubstituted) glucose through the glycolytic pathway in E. coli, are also shown (dashed arrows). Fba, fructose bisphosphate aldolase; GAP, glyceraldehyde-3-phosphate; Pfk, phosphofructokinase; Pgi, phosphoglucose isomerase; PTS, phosphotransferase system permease; Tpi, triose phosphate isomerase. b, The EcoGene E. coli website (http://www.EcoGene.org) uses the abbreviation ‘yih’ for most of these genes; we have retained this nomenclature. Vertical stripes, genes confirmed as being essential for growth with SQ by mutational analysis; horizontal stripes, genes confirmed as being inducible for growth with SQ by proteomic and/or transcriptional analyses; box-framed genes, genes encoding the four core enzymes of the pathway (shown in a) that were subject of heterologous expression purification, and functional characterization.

We thus identified a gene cluster in E. coli K-12 that contained SQ-inducible, essential genes for catabolism of SQ, but we still did not know which pathway was involved. A sulfoglycolytic pathway would involve a hypothetical 3-sulfolactaldehyde (SLA) reductase to yield DHPS in the final reaction (apart from export) (Figures 1a and 2a), whereas a hypothetical SQ CHAPTER 8 153 dehydrogenase as the first reaction would lead into hypothetical Entner-Doudoroff-type or pentose-phosphate-type pathways, or another novel pathway. An SLA-reductase was detected (assayed as DHPS oxidation) in cell-free extracts of SQ-grown substrain MG1655 at a specific activity of 420 mU per mg of protein, which exceeds the specific degradation rate for SQ in vivo and, thus, was sufficient to explain growth. This enzyme activity was not detected in extracts of glucose-grown cells. The enzyme was, thus, confirmed to be inducible, and it was specific for NAD+; NADP+ was not a substrate. Furthermore, SQ did not lead to reduction of NAD+ or of NADP+ in the extracts of SQ- or glucose-grown cells, hence, hypothetical SQ dehydrogenase was not detectable. These data led us to predict the sulfoglycolytic pathway depicted in Figure 2a, including the requirement for sulfonate import and export across the cell membrane (Graham et al. 2002, Mampel et al. 2004, Mayer and Cook 2009).

The four predicted core enzymes of the pathway (Figure 2a) were heterologously expressed and purified as His-tagged proteins, b3880 (putative isomerase), b3883 (putative sugar kinase), b3881 (putative aldolase) and b3882 (putative reductase) (Extended Data Figure 3). Protein b3882 was shown to encode SLA reductase. First, we partially purified and identified (PF-MS) the wild type enzyme in cell-free extracts of substrain MG1655 (see above), and second, we examined the recombinant protein (see below). In both cases, we identified that b3882 represents an SLA reductase; the enzyme showed no activity with 4-hydroxybutyrate (Saito et al. 2009).

The heterologously expressed and purified putative isomerase (b3880) caused about one-fourth of the SQ in the reaction mixture to disappear, as observed by high-pressure liquid chromatography-mass spectrometry (HPLC-MS), and a new peak was formed that eluted with shorter retention time, but exhibited the same relative mass (Mr = 244 Dalton (Da); observed as a quasi-molecular ion in the negative ion mode ([M-H]-) at a mass-to-charge ratio (m/z) of 243) (Figure 3a, b). The new peak was confirmed to represent 6-deoxy-6-sulfofructose (SF), as proposed elsewhere (Roy et al. 2003), by the HPLC separation pattern (Extended Data Figure 4), by the matching exact mass of the [M-H]- ion (Extended Data Figure 4), and by its MS/MS fragmentation pattern (Extended Data Figure 5). Thus we confirmed that b3880 catalysed the SQ isomerase reaction.

The reaction mixture was augmented with ATP and the putative sugar kinase (b3883). The peaks of SQ and SF partially disappeared and a new peak was formed (Figure 3c). This new peak was confirmed to represent 6-deoxy-6-sulfofructose-1-phosphate (SFP), as proposed elsewhere (Roy et al. 2003), by the matching exact mass of the [M-H]- ion (observed mass, - 322.9877 Da; theoretical mass of C6H12O11PS , 322.9843 Da) and by its fragmentation pattern 154 CHAPTER 8

(Extended Data Figure 6). HPLC confirmed that ATP disappeared and ADP was formed during the reaction and, furthermore, that SFP was converted back to SF when alkaline phosphatase was added to a preparation of SFP (not shown). Thus, with b3883, we expressed an ATP- dependent kinase that phosphorylated SF to SFP.

Figure 3. Illustration of the reactions of the four core enzymes of sulfoglycolysis in vitro. The transformation of SQ to SF, SFP, DHAP and SLA, and DHPS, by successive addition of recombinantly expressed pathway enzymes was followed by HPLC-ESI-MS. a, Sample of SQ in reaction buffer (t = 0 min). b, Sample after addition of isomerase (b3880) (t = 60 min). c, Sample after addition of ATP and kinase (b3883) (t = 120 min). d, Sample after addition of aldolase (b3881) (t = 180 min). e, Sample after addition of NADH and reductase (b3882) (t = 240 min). f, Sample after extended incubation of the four-enzyme reaction (t = 360 min). The total-ion chromatograms (TICs) recorded in the negative-ion mode from the MS-MS fragmentation of the quasi-molecular ions [M-H]- of SQ and SF and SFP, DHAP, SLA and DHPS, from a representative experiment (n = 5) are shown. For representative MS- MS fragmentation patterns of the [M-H]- ions of SQ and SF, SFP, and SLA and DHPS, see the Extended Data Figures 5, 6 and 7, respectively.

The reaction mixture was augmented with the putative aldolase (b3881). The peak for SFP partially disappeared, and two new peaks were formed (Figure 3d). The first new peak was identified to represent DHAP, as proposed elsewhere (Roy et al. 2003), with an authentic DHAP standard. The second new peak was confirmed to represent SLA, as proposed elsewhere - - (Roy et al. 2003), by the matching mass of the [M-H] ion (Mr = 154; observed as [M-H] ion at m/z = 153) and by its fragmentation pattern (Extended Data Figure 7); the same peak was observed when we used recombinant SLA reductase in reverse to oxidize DHPS to SLA (see above). Thus, with b3881, we expressed an aldolase that cleaved SFP into DHAP and SLA. CHAPTER 8 155

The SFP turnover was incomplete (Figure 3d); the equilibrium of the corresponding enzyme reaction in glycolysis (fructose-1,6-bisphosphate aldolase) lies far to the left (Cornish-Bowden 1981), that is, hardly any products are formed.

However, when NADH and the recombinant SLA-reductase (b3882) were added, the peak for SFP was further diminished, as was the peak for SQ, and that for the aldolase-reaction product DHAP, was further increased (Figure 3e). In addition, the peak for SLA had disappeared, and the peak for the anticipated sulfonate product, DHPS, was formed (Figure 3e and Extended Data Figure 7). After an extended incubation of the four-enzyme reaction (see Figure 3e, f), the peaks for SQ and SFP had almost completely disappeared, and the peaks for DHAP and DHPS, had further increased.

Together, the results show that the SQ-pathway in E. coli K-12 (Figure 2a) does not involve a desulfonation reaction and that no substrate-level phosphorylation of the sulfonated C3 intermediate occurs, which has been used previously (Roy et al. 2003) as a default hypothesis. Furthermore, we deduce that there are ten genes in the gene cluster (Figure 2b). The core pathway comprises a SQ transporter (for example, b3876, YihO), SQ isomerase (b3880, YihS), SF kinase (b3883, YihV), SFP aldolase (b3881, YihT), SLA reductase (b3882, YihU) and a DHPS exporter (for example, b3877, YihP), which could be under the putative control of repressor b3884 (YihW). We propose a sulfolipid porin (b3875, OmpL), a sulfolipid - glucosidase (b3878, YihQ), and an epimerase (b3879, YihR) to funnel other SQ derivatives into the pathway, for example, the whole sulfolipid (see Extended Data Figure 8).

The gene cluster is found in at least 1,009 (>91%) of the 1,110 genome sequences of commensal E. coli, as well as pathogenic E. coli (for example, EHEC) strains, that were available in November 2013 (finished and draft genome sequences) in the Integrated Microbial Genomes (IMG) and Human Microbiome Project (HMP) databases (that is, gene clusters with syntenic yihTUVW and collinear homologs of yihSRQPO and ompL in variable order). Hence, the gene cluster is a feature of the core-genome of E. coli species. It can also be found in a wide range of other Enterobacteriaceae (for example, Chronobacter sakazakii ATCC BAA-894, Klebsiella oxytoca 10-5242, Pantoea ananatis LMG 20103 and Salmonella enterica LT2). We therefore suspect that the pathway has a significant role in bacteria in the alimentary tract of all omnivores and herbivores, that the pathway occurs in excrement from these animals, and in plant pathogens, which would explain part of the widespread occurrence of microbial degradation observed (Martelli 1967, Roy et al. 2003, Denger et al. 2012).

SQ is produced in huge amounts in nature and, thus, represents a significant proportion of the organic sulfur cycle (Harwood and Nicholls 1979), and it is degraded in similar amounts by 156 CHAPTER 8 both bacteria (Martelli 1967, Roy et al. 2003, Denger et al. 2012) and algae (Sugimoto et al. 2007), or it would accumulate in the environment. We see here that the Enterobacteriaceae use one pathway (Figure 2a) to initiate degradation of SQ, and that a community is required for complete degradation (Figure 1b) (Denger et al. 2012). This covers a variety of habitats, but we know that other pathways exist. A previous paper (Roy et al. 2003) presented evidence for SQ dehydrogenase, which we also observe in our SQ-using strain of Pseudomonas putida (A.-K.F. unpublished observations). Notably, another group (Martelli 1967) reported complete SQ degradation, including desulfonation, in a single organism; however, this organism has been lost (Cook and Denger 2002).

In summary, we have established that sulfoglycolysis, which was named but not defined in a previous report (Benson and Shibuya 1961), converts SQ to DHPS in the most widely-studied prokaryotic model organism, E. coli K-12, representing many Enterobacteriaceae (Figure 2a). We have identified a gene cluster in E. coli K-12 (Figure 2b), which encodes the pathway. The core pathway, for SQ, involves four newly discovered enzymes, two newly identified transporters and three newly characterized intermediates (Figures 2a, b). We know that the pathway is regulated (Extended Data Figures 1 and 2) and we suspect that it includes the degradation of the whole sulfolipid (Extended Data Figure 8). The pathway represents a substantial part of the biogeochemical sulfur cycle, and the pathway is likely to have a significant role in bacteria in the alimentary tract of all omnivores and herbivores, and in plant pathogens. We and others (Martelli 1967, Roy et al. 2003, Denger et al. 2012) anticipate other degradative pathways for SQ in nature; for example, in bacteria of all marine, freshwater and terrestrial habitats where SQ is produced and degraded. We now provide the tools to elucidate these degradative pathways. CHAPTER 8 157

METHODS SUMMARY SQ and DHPS were synthesized chemically and identified by NMR and mass spectrometry (Mayer et al. 2010, Denger et al. 2012). Cultivation, preparation of cell-free extracts, enzyme purification, 2D-PAGE and PF-MS, RNA preparation and RT-PCR, and expression and purification of His-tagged proteins, are described in the Methods. SQ, SF, SFP, SLA, DHAP and DHPS were separated using hydrophilic interaction liquid chromatography (HILIC) (Denger et al. 2012) and detected by an evaporative light scattering detector (ELSD) (Denger et al. 2012) or electrospray ionization (ESI)-time-of-flight (TOF)-MS or ESI-iontrap-MS (see Methods). The enzyme reaction mixture (Figure 3) was 3 mM SQ in 50 mM ammonium acetate buffer (pH 8.7), and 8 mM ATP, 0.5 mM MgCl2 and 8 mM NADH supplemented with the corresponding enzymes (each 50 µg ml-1).

ACKNOWLEDGEMENTS We thank E. Deuerling for substrain MG1655, J. Klebensberger for substrain BW25113 and its knock-out mutants, and K. Leitner for help with growth experiments. The work of M.W. and A.-K.F. was supported by the Konstanz Research School Chemical Biology (KoRS-CB), the work of C.M. by German Research Foundation (DFG) grants (MA2436/4 and SFB766/A15) and by the Baden-Württemberg Stiftung (P-BWS-Glyko11), and the work of D.Sch. by a DFG grant (SCHL 1936/1-1) and by the University of Konstanz and the Konstanz Young Scholar Fund.

158 CHAPTER 8

METHODS

Chemicals SQ and DHPS were synthesized chemically and identified by NMR and MS as described previously (Mayer et al. 2010, Denger et al. 2012). Dihydroxyacetone phosphate dilithium salt, D-fructose 6-phosphate disodium salt hydrate, D-glucose 6-phosphate disodium salt hydrate, and D-fructose 1,6-bisphosphate trisodium salt octahydrate were from Sigma. Other biochemicals (NADH, NADPH, NAD+, NADP+, ATP, ADP) were purchased from Sigma, Fluka, Merck or Biomol.

Bacteria and growth conditions Escherichia coli K-12 substrains W3100 (DSM 5911, ATCC 27325) and DH1 (DSM 4235, ATCC 33849), and Cupriavidus pinatubonensis JMP134 (DSM 4058) (Sato et al. 2006) were purchased from the Leibniz Institute DSMZ - Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH. E. coli K-12 substrain MG1655 (DSM 18039) was a gift from E. Deuerling, and E. coli K-12 substrain BW25113 and its single-gene knockouts from the E. coli Keio Knockout Collection (Baba et al. 2006) were a gift from J. Klebensberger. The growth medium was a phosphate-buffered mineral salts medium (Thurnheer et al. 1986) (pH 7.2) with SQ or glucose as the sole carbon sources. Cultures were inoculated (1%) with pre- culture grown with the same substrate, and grown aerobically at 30°C. Cultures in 3 ml volume were grown in screw-cap tubes (30 ml) in a roller, and cultures in the 50 ml or 200 ml volume in capped Erlenmeyer flasks (0.3 or 1.0 litre volume, respectively) on a horizontal shaker; for the latter, 0.8-ml samples were taken at intervals to determine optical density (attenuance D at

580 nm; D580nm) and substrate and product concentrations (HILIC-HPLC, see below). For the growth experiments to demonstrate mineralization of SQ (see Figure 1), E. coli K-12 substrain MG1655 was grown with SQ (4 mM; 50-ml scale), and after growth had been completed, the cellular biomass was removed from the culture fluid by centrifugation (20,000g, 30 min, 4°C) followed by filter-sterilization (pore size, 0.2 µm). The culture fluid was then inoculated with C. pinatubonensis JMP134. During the growth experiments, samples were taken at intervals to monitor the growth (D580) and to determine total protein, substrate, and product concentrations (see below). All growth experiments were replicated twice (n = 3).

Preparation of crude extract, soluble fraction and enzyme enrichment

E. coli cells from growth experiments with SQ or glucose were harvested at an D580 of approximately 0.4 by centrifugation (20,000g, 15 min, 4°C) and disrupted by three passages through a chilled French pressure cell (140 megapascals (MPa); Aminco) in the presence of CHAPTER 8 159

DNase (25 µg ml-1). Cell debris was removed by centrifugation (11,000g, 5 min, 4°C) and the membrane fragments collected by ultracentrifugation (70,000g, 1 h, 4°C); the supernatant was called the soluble fraction. For enzyme enrichment, soluble fraction was loaded onto an anion- exchange chromatography column (MonoQ HR 10/10 column, Pharmacia) equilibrated with -1 20 mM Tris/H2SO4 buffer, pH 9.0, at a flow rate of 1 ml min , and bound proteins eluted by a linear Na2SO4 gradient (from 0 M to 0.2 M in 45 min, and to 0.5 M in 10 min) and fractions

(2 ml) collected; the SLA reductase activity eluted at about 0.12 M Na2SO4.

Two dimensional gel electrophoresis and peptide fingerprinting-mass spectrometry 2D-PAGE and PF-MS were done according to our previously published protocols (Schmidt et al. 2013). In brief, soluble protein fractions from E. coli cells grown with SQ or glucose (see above) were desalted (PD-10 Desalting Columns, GE Healthcare Life Sciences) and precipitated by addition of acetone (four volumes 100% acetone, -20°C, overnight); each 1 mg of precipitated protein was solubilised in rehydration buffer (300 µl) and loaded onto isoelectric focussing (IEF) strips (BioRad ReadyStrip IPG system) overnight; IEF involved a voltage ramp to 10,000 V during 3 h, and a total focussing of 40,000 Volt-hours (Vh); the strips were equilibrated in SDS-equilibration buffers I and II (with DTT and iodoacetamide, respectively) and placed onto SDS-PAGE gels using an overlay of SDS-gel buffer solidified with agarose (0.5%); SDS-PAGE gels contained 12% polyacrylamide (no stacking gel), and were stained with Coomassie brilliant blue R-250 (Laemmli 1970). Stained protein spots of interest were excised from gels and submitted to PF-MS at the Proteomics Facility of the University of Konstanz to identify the corresponding genes; the MASCOT engine (Matrix Science, London, UK) was used to match each peptide fingerprint against a local database of all predicted protein sequences of the annotated E. coli K-12 substrain MG1655 genome (IMG version 2011-08-16).

Total RNA preparation and PCR with reverse transcription RNA preparation and RT-PCR were done according to our previously published protocols (Weiss et al. 2012). In brief, cells were grown in the appropriate selective medium (3 ml) and harvested in the mid-exponential growth phase (D580 ≈ 0.3); the cell pellets were stored at -20°C in RNAlater RNA stabilization solution (Applied Biosystems); total RNA was prepared using the E.Z.N.A. Bacterial RNA Kit (Omega Bio-Tek) following the manufacturer’s instructions; the RNA preparation (40 µl) was treated with RNase-free DNase (2 U, 30 min, 37°C) (Fermentas). For complementary DNA (cDNA) synthesis, the Maxima reverse transcriptase (Fermentas) was used following the manufacturer’s instructions; the reactions contained 0.2 µg total RNA and 20 pmol sequence-specific primer (see below). PCR reactions (20 µl volume) were done using Taq DNA polymerase (Fermentas) and the manufacturer’s standard reaction 160 CHAPTER 8

mixture (including 2.5 mM MgCl2); cDNA from reverse transcription reactions was used as template (2 µl of reverse transcription reaction mixture), or genomic DNA (4 ng DNA) for PCR-positive controls, or non-reverse transcribed total RNA (2 µl) for the confirmation of an absence of DNA impurities in the RNA preparations.

Heterologous expression and purification of His-tagged proteins Heterologous expression of candidate genes and purification of the recombinant proteins were done according to our previously published protocol (Felux et al. 2013). In brief, chromosomal DNA was isolated using the Illustra bacteria genomicPrep Mini Spin Kit (GE Healthcare) and the target genes amplified by PCR using Phusion HF DNA Polymerase (Finnzymes) and the primer pairs given below; the PCR conditions were 30 cycles of 18 s denaturation at 98°C, 20 s annealing at 58°C, and 60 s elongation at 72°C for gene b3880, or 45 s elongation at 72°C for genes b3881, b3882, and b3883; the PCR products were then separated by agarose-gel electrophoresis, excised, and purified using the QIAquick gel extraction kit (Qiagen), and ligated into the amino-terminal His6-tag expression vector pET100 (Invitrogen); correct integration of the inserts was confirmed by sequencing (GATC-Biotech). For expression, BL21 Star (DE3) OneShot E. coli cells (Invitrogen) were transformed with the constructs and grown -1 at 37°C in lysogeny broth medium containing 100 mg l ampicillin; at an D580 ≈ 0.6, the cultures were induced by addition of 0.5 mM IPTG (isopropyl-β-D-thiogalactoside), and the cells grown for additional 4 to 5 h at 20°C, collected by centrifugation (15,000g, 20 min, 4°C), and stored frozen (-20°C). Cells were resuspended in buffer A (20 mM Tris/HCl, pH 8.0, 100 mM KCl) that contained 50 µg ml-1 DNase I, and disrupted by four passages through a pre-cooled French pressure cell (140 MPa). The cell extracts were centrifuged (15,000g, 10 min, 4°C) and ultracentrifuged (70,000g, 1 h, 4°C), and the soluble protein fractions loaded onto 1-ml Ni2+- chelating Agarose affinity columns (Macherey-Nagel) pre-equilibrated with buffer A (see above). After a washing step (30 mM imidazole in buffer A), the His-tagged proteins were eluted (200 mM imidazole in buffer A), concentrated in a Vivaspin concentrator (Sartorius), and, after addition of 30% glycerol (v/v), stored in aliquots at -20°C.

Enzyme assays SLA reductase activity was assayed photometrically at 365 nm in 1-ml cuvettes in 50 mM Tris/HCl buffer, pH 9.0, with 1 mM NAD+ and 5 mM DHPS as substrates. The reaction was started with the addition of protein (0.01 – 0.1 mg ml-1) and the reduction of NAD+ was recorded. Enzyme assays with recombinant proteins (each 50 µg protein ml-1) for analysis by HILIC-HPLC were carried out in the 1 ml volume in 50 mM ammonium acetate buffer, pH 8.7, stirred at room temperature (approximately 20 – 23°C); samples were taken at intervals, for CHAPTER 8 161 which the reactions were stopped by addition of 30% acetonitrile. SQ (3 mM) and recombinant isomerase were incubated for 60 min, after which ATP (8 mM), MgCl2 (0.5 mM), and recombinant kinase were added. After additional 60 min, recombinant aldolase was added, and after another 60 min NADH (8 mM) and the recombinant reductase.

Analytical methods Total protein was determined according to a protocol based on the method reported previously (Kennedy and Fewson 1968), and soluble protein by protein dye binding (Bradford 1976), each using bovine serum albumin (BSA) as the standard. Sulfate release during growth was quantified turbidimetrically (Sörbo 1987) as a suspension of BaSO4. For HPLC-ESI-MS-MS, an Agilent 1100 HPLC system fitted with a ZIC-HILIC column (5 µm, 200 Å, 150 x 4.6 mm; Merck) was connected to an LCQ ion trap mass spectrometer (Thermofisher). The HPLC conditions for the LCQ ion trap were: From 90% B to 65% B in 25 min, 65% B for 10 min, in

0.5 min back to 90% B, 90% B column equilibration for 9.5 min; solvent A, 90% 0.1 M NH4Ac, 10% acetonitrile; solvent B, acetonitrile; flow rate, 0.75 ml min-1. The mass spectrometer was run in ESI negative mode. The retention times and ESI-MS-MS fragmentation patterns of the analytes were observed as follows: SQ retention time, 25.4 min; SQ ESI-MS m/z (per cent base- peak) 243 (100); SQ ESI-MS-MS of [M-H]- 243: 243 (4), 225 (11), 207 (34), 183 (100), 153 (54), 143 (1), 123 (16), 101 (8), 81 (6). SF retention time, 21.9 min; SF ESI-MS, 243 (100); SF ESI-MS-MS of [M-H]- 243: 243(1), 225 (37), 207 (38), 183 (21), 153 (100), 143 (3), 123 (24), 101 (13), 81 (5). SFP retention time, 33.4 min; SFP ESI-MS, 323 (100); SFP ESI-MS-MS of [M-H]- 323: 305 (34), 287 (3), 233 (2), 225 (100), 207 (32), 153 (4). SLA retention time, 21.0 min; SLA ESI-MS, 153 (100); SLA ESI-MS-MS of [M-H]- 153: 153 (9), 81 (100), 71 (18). DHPS retention time, 18.0 min; DHPS ESI-MS, 155 (100), 95 (4); DHPS ESI-MS-MS of [M-H]- 155: 155 (100), 137 (18), 95 (40). DHAP retention time, 30.2 min; DHAP ESI-MS: 169 (100); DHAP ESI-MS-MS of [M-H]- 169: 169 (2), 125 (2), 97 (100). HPLC for ESI-TOF-MS (MicrO-TOF II, Bruker) involved the same column and gradient system, but a different gradient program (from 90% B to 65% B in 20 min, and further to 55% B in 20 min), which resulted in retention times (see Extended Data Figure 4) of 22.4 min for SQ, 21.8 min for SF, 16.3 min for fructose, 18.5 min for glucose, 35.2 min for fructose-6-phosphate, and 39.8 min for glucose- 6-phosphate.

PCR primers Primers were purchased from Microsynth (Balgach). The sequences of the primers pairs (for/rev) for RT–PCR (see above) were (product length in bp): b3879 forward, 5’-CCTTATGGCGTGGGTATTCATCC-3’, b3879 reverse, 5’-TTAGGCGGGCAACTCAT- 162 CHAPTER 8

AGGTTC-3’ (353); b3880 forward, 5’-ACGCGGTGGAAGCTTTCTTGAT-3’, b3880 reverse, 5’-CACGGTGGCGTTAAACAGACCTT-3’ (332); b3881 forward, 5’-TGTCGCC- GCCGATGAGTTC-3’, b3881 reverse, 5’-CTTTGTAGAGGTCAGCGCCACTGT-3’ (320); b3882 forward, 5’-GGCGCAGGCCGCTAAAGA-3’, b3882 reverse, 5’-AAGATTCAGG- GCTTCGCACAAAA-3’ (439); b3883 forward, 5’-GGCACGACGGCGCTAAAAA-3’, b3883 reverse, 5’-TGACTCCGCTAAATCCCCACTTG-3’ (374); b0720 forward, 5’-CGCT- GGCGGCGTTCTATCA-3’, b0720 reverse, 5’-ATTTTCAGCGCCGCTTCGTTAG-3’ (403). The sequences of the primers pairs for TOPO-cloning and heterologous expression (see above) were (the directional overhang is underlined): b3880 forward, 5’-CACCGGAATGAAATGGTTTAACACCCTAAG-3’, b3880 reverse, 5’-AACCCGCACC- CTATTTTCAG-3’; b3881 forward, 5’-CACCATGAATAAGTACACCATCAACGACATT- ACG-3’, b3881 reverse, 5’-ACCATTTCATTCCTTTTATCCTCATCTT-3’; b3882 forward, 5’-CACCATGGCAGCAATCGCGTTTATCG-3’, b3882 reverse, 5’-CGCGTAATGTCGTT- GATGGTGTA-3’; b3883 forward, 5’-CACCATGATTCGTGTTGCTTGTGTAGGT-3’, b3883 reverse, 5’-TGAAAATTCCTCGAAAAACCATCA-3’.

Genome analyses Analysis of genomes for orthologous gene clusters was carried out through the gene cassette search and neighborhood regions search options of the Integrated Microbial Genomes (IMG) and IMG Human Microbiome Project (IMG HMP) platforms (http://img.jgi.doe.gov/). Basic sequence analyses were done using NCBI’s BLAST tools (http://blast.ncbi.nlm.nih.gov) and the Lasergene DNAstar software package (www.dnastar.com).

Enzyme nomenclature We suggest that sulfolactaldehyde reductase belongs to the NC-IUBMB subgroup EC 1.1.1., with the name sulfolactaldehyde 3-reductase (systematic name 2,3-dihydroxypropane- 1- sulfonate:NAD+ 3-oxidoreductase). The sulfofructose kinase would then belong to EC 2.7.1., with the name sulfofructose 1-kinase (systematic name ATP:6-deoxy-6-sulfofructose 1- phosphotransferase). The sulfofructosephosphate aldolase would belong to EC 4.1.2., with the name sulfofructosephosphate aldolase (systematic name 6-deoxy-6-sulfofructose 1-phosphate 2-hydroxy-3-oxopropane-1-sulfonate-lyase (glycerone phosphate forming)). Sulfoquinovose isomerase would belong to EC 5.3.1., with the name sulfoquinovose isomerase (systematic name 6-deoxy-6-sulfoglucose aldose-ketose-isomerase). CHAPTER 8 163

EXTENDEND DATA FIGURES AND TABLES

Extended Data Figure 1. Soluble proteins in glucose- or SQ-grown cells of E. coli K-12 MG1655 separated by 2D-PAGE. All prominent protein spots on the gel from SQ-grown cells that suggested inducibly expressed proteins were excised and submitted to PF-MS (see Extended Data Table 1). The PF-MS identifications were replicated in an independent growth experiment and gel electrophoresis run.

Extended Data Table 1. Identifications by peptide fingerprinting-mass spectrometry of protein spots excised from 2D-PAGE gels of SQ-grown E. coli cells

Protein spots were sorted according to their apparent molecular mass on the gel (see Extended Data Figure 1). 164 CHAPTER 8

Extended Data Figure 2. Differential transcriptional analysis of the genes encoding the central pathway of sulfoglycolysis in E. coli K-12. RT-PCR of the inducible transcription of genes b3879 (epimerase), b3880 (isomerase), b3883 (kinase), b3881 (aldolase) and b3882 (reductase) in cells grown with SQ in comparison to glucose-grown cells. A positive control was a constitutively expressed gene (b0720; gltA, citrate synthase), and a negative control was a PCR without reverse transcription (RNA) to confirm the absence of DNA contamination in the RNA preparations used. The results were replicated when starting from an independent growth experiment. CHAPTER 8 165

Extended Data Figure 3. Purity of heterologously expressed enzymes using SDS-PAGE. M, marker proteins; 1, b3880 (isomerase); 2, b3883 (sugar kinase); 3, b3881 (aldolase), 4, b3882 (reductase). A representative SDS gel is shown (n = 2). Each enzyme, 20 µg. 166 CHAPTER 8

Extended Data Figure 4. Separation of SQ, SF and analogues by HILIC-HPLC and detection by ESI-TOF-MS. a, Extracted ion chromatogram of SQ in reaction buffer before (top) and after (bottom) addition of recombinant isomerase b3880 to generate SF. The exact masses determined for the [M-H]- ions of SQ and SF are indicated (in - - Da); the theoretical exact masses (monoisotopic masses) for the [M-H] ions of both SQ and SF (each C6H11O8S ) is 243.0180 Da. The data from a representative experiment are shown; the results were replicated (n = 3) with samples from independent enzyme reactions. b, Samples of reference substances fructose and glucose (top), and fructose-6-phosphate and glucose-6-phosphate (bottom). The reference substances illustrate the chromatographic separation by HILIC, that is, the ketoses (for example, SF) elute before the aldoses (for example, SQ). CHAPTER 8 167

Extended Data Figure 5. MS-MS fragmentation of SQ and SF. a, Fragment ions of the [M-H]- ions of SQ. b, Fragment ions of the [M-H]- ions of SF. The fragmentation pattern of both compounds, SQ and SF, is dominated - by the loss of water (-18), C2H4O2 (-60), C3H6O3 (-90) and C4H8O4 (-120), and by the formation of HSO3 (81) ions. The MS-MS spectra of SQ and SF differ in their peak heights, in particular in their base peak for SQ (183) - - and of SF (153) corresponding to a preferred formation of C4H7O6S and C3H5O5S , respectively; both fragments might be formed by McLafferty-like rearrangement reactions corresponding to the different positions of the keto- groups in SQ and SF. Representative data are shown (n = 5; see Figure 3). 168 CHAPTER 8

Extended Data Figure 6. MS-MS fragmentation of SFP and FBP. a, Fragment ions of the [M-H]- ions of SFP. b, Fragment ions of the [M-H]- ions of the analogue fructose-1,6-bisphosphate (FBP) for comparison. Fragmentations led to loss of water (-18) and/or loss of phosphoric acid (-98), and to the formation of dihydrogen phosphate (97) and pyrophosphate (177). Representative data are shown (n = 5; see Figure 3). CHAPTER 8 169

Extended Data Figure 7. MS-MS fragmentation of SLA and DHPS. a, Fragment ions of the [M-H]- ions of SLA. b, Fragment ions of the [M-H]- ions of DHPS. Fragmentation of SLA led to a cleavage of the carbon-sulfur bond - - and formation of HSO3 (81) and [C3H3O2] (71) ions. The fragmentation of DHPS is characterized by an initial loss of water (-18) and concomitant fragmentation of the enol (or the methylketone) to ketene (-42) and formation - of [CH3O3S] (95) ions; the same fragmentation pattern was observed for authentic DHPS standard (not shown). Representative data are shown (n = 5; see Figure 3). 170 CHAPTER 8

Extended Data Figure 8. Hypothetical degradation of the sulfolipid sulfoquinovosyl diacylglycerol in E. coli K- 12. Presumed functions of OmpL (b3875) and YihQ (b3878); apart from OmpL, the subcellular location of this pathway is unknown. R1 and R2 indicate acyl chains of different length and degree of unsaturation. CHAPTER 9 171

CHAPTER 9

General discussion

The exploration of yet unknown catabolic pathways, in respect to the degradation of xenobiotic compounds, in microbes and microbial communities is of great importance: (i) for the general understanding of the global cycling of matter and (ii) for a more detailed understanding of the mechanisms behind the metabolic flexibility of microbes and their adaptation to a utilization of growth substrates.

There are plenty of examples where microbes fail to provide the major ‘ecosystem service’, of an effective recycling of organic matter into, ultimately, CO2, for synthetic industrial chemicals. A ‘success story’ in this respect is the complete degradation of linear alkylbenzenesulfonates (LAS), which are entirely synthetic bulk laundry surfactants of petrochemical origin. LAS are disposed into the environment in massive amounts without causing problems. These surfactants replaced branched-chain alkylbenzenesulfonates, which are only very poorly degraded in wastewater treatment plants (Sawyer and Ryckman 1957, Kölbener et al. 1995), and it has been shown exhaustively that LAS are completely degradable. However, the details on how LAS and their degradation intermediates, sulfophenylcarboxylates (SPCs), are degraded, remained elusive, until a defined LAS- and SPC-degrading laboratory model community had been established for experimental work (Schleheck et al. 2004a). Now, this community is genome- sequenced and their genomes have been analyzed, Parvibaculum lavamentivorans DS-1T (chapter 2), Comamonas testosteroni KF-1 (chapter 3) and Delftia acidovorans SPH-1 (chapter 4).

Access to genome-sequenced organisms is very helpful for studies of bacterial degradation pathways on a molecular level. Particularly, the application of modern proteomics greatly enhances the straightforwardness of candidate-gene calling, the production of recombinant protein in adequate amount and purity, and enzymatic confirmation of their predicted function and role in the degradative pathway. Another striking advantage of working with genome- sequenced organisms is the fact that paralogs or copies of genes for certain enzymes under consideration (e.g., for BVMO reactions) encoded in the same organism, can be identified and tested. Comparative studies in order to discriminate the genes desired (e.g. by reverse transcription-PCR) are based on these studies at a genomic level. However, the function of the identified gene products has to be checked with biochemical and enzymatic studies. Therefore, the access to the enzyme’s substrates is essential. For the substrates which are not commercially 172 CHAPTER 9 available, this issue has been addressed by chemical synthesis, e.g., for the growth substrates

3-(4-sulfophenyl)butyrate (3-C4-SPC) and sulfoquinovose (SQ). The second inducible SPC degradation pathway, for 4-(4-sulfophenyl)hexanoate (4-C6-SPC), is only now becoming experimentally accessible, because the substrate for generating active biomass and proteomics is now available in reasonable amounts and purity. Another possibility to generate rare substrates for enzyme testing is in-vitro reconstitution of the complete pathway, thus, one recombinant enzyme is used to generate the substrate for the next enzyme in the pathway, e.g., as for the intermediates of sulfoglycolysis and when starting with the available SQ and dihydroxypropanesulfonate (DHPS).

The degradation of LAS and SPCs is an intriguing example of interplay between non-specific and highly specific enzymatic attack in order to access and utilize substrates completely. At the first degradation level, P. lavamentivorans DS-1T utilizes the alkyl chains of LAS for growth but releases the inaccessible remainder, many short-chain SPCs and related compounds (Schleheck et al. 2000, Schleheck et al. 2004a, Schleheck et al. 2007). This initial, non-specific attack on the alkyl chains of LAS is a hydroxylation of the terminal carbon atom (omega-oxygenation), an oxidation of the hydroxyl-carbon to a carbonyl-carbon, and its activation as coenzyme A (CoA)-ester followed by several rounds of acetyl-CoA abstractions through fatty-acid beta-oxidations. However, the non-specific chain-shortening stops, when the 4-sulfophenyl groups on the many different SPCs become sterically hindering, thus, when specific attack is required. The CoA is recovered and the short-chain SPCs are released (Schleheck and Cook 2005). In fact, P. lavamentivorans DS-1T is also able to degrade 16 other alkyl-chain containing commercially important surfactants, as well as alkanes; the latter was found also for the other members of the genus Parvibaculum, which were, e.g., isolated from alkane contaminated environments (see chapter 2).

The omega-oxygenation of this wide range of alkyl-substrates is most likely catalyzed by cytochrome-P450 (CYP) alkane monooxygenase (MO; Schleheck and Cook 2005) and the genome analysis indicated that indeed several of these enzymes might act in concert in order to catalyze such a wide substrate range: The genome of strain DS-1 encodes, in total, ten valid candidates for this type of oxygenase (chapter 2). Whereas the LAS-oxygenases, as measured with whole cells in the oxygen electrode, appear to be active also in acetate-grown cells, hence, are constitutively expressed, some of the individual CYP-MOs appear to be 3-fold to 10-fold up regulated during growth with LAS in comparison to acetate-grown cells, as determined by transcriptional analyses (Serif 2012). However, all attempts of heterologously overexpressing CHAPTER 9 173 such a candidate CYP-MO system (i.e., the oxygenase component, ferredoxin and ferredoxin- reductase) in an active state did fail so far (Serif 2012).

The representatives for the complete degradation of SPCs are the specialists in specificity, Comamonas testosteroni KF-1 and Delftia acidovorans SPH-1 (Schleheck et al. 2004a). They are able to utilize only a very narrow range of the many SPCs available as potential substrates. Clearly, the specificity lies in the yet unknown mechanisms of abstractions of the different branched carboxylate side chains of SPCs. However, only the 3-C4-SPC degradation pathway was experimentally accessible in this study, since so far only 3-C4-SPC was chemically synthesized in order to follow its inducible degradation pathway in C. testosteroni KF-1. A

3-C4-SPC degradation pathway was postulated (Schleheck et al. 2010) after the identification of degradation intermediates (4-sulfoacetophenone (SAP) and 4-sulfophenol), which allowed for the suggestion of a Baeyer-Villiger-type monooxygenase (BVMO) reaction leading to 4-sulfophenyl acetate and the subsequent hydrolytic cleavage into acetate and 4-sulfophenol. This was confirmed in this work through the identification of the responsible BVMO gene by reverse transcription-PCR (RT-PCR), the heterologous expression of the identified gene, and the confirmation of the function of the purified enzyme. Further, the corresponding esterase, which catalyzes the hydrolytic cleavage of 4-sulfophenyl acetate, was purified to homogeneity and identified (chapter 5).

The concept of using special sets of genes and enzymes to make different growth substrates available is not to be reinvented, but rather slightly modified in order to fulfill the new duties. For example with BVMO2, strain KF-1 uses also the successful concept of inserting an oxygen atom between a C-C bond next to a keto group, followed by the subsequent hydrolytic cleavage of an acetate moiety, as it was shown for the abstraction of the acetyl side chain of progesterone, when KF-1 thrives on that substrate. Notably, in consideration of kinetic aspects of more favorable reactions, the identified intermediate of the progesterone degradation pathway, pregna-1,4-diene-3,20-dione seems to be the ʻtrueʼ substrate for BVMO2 under physiological conditions (see chapter 6).

The same concept seems also true for short-chain aliphatic ketones (Fraaije et al. 2004, Rehdorf et al. 2007), for which one was tested (2-decanone) as substrate for all four BVMO candidates of strain KF-1. Indeed, 2-decanone is a substrate for all four BVMOs of strain KF-1, however, with differing activities. Short-chain aliphatic ketones have been reported as substrates for BVMOs, e.g., in Mycobacterium tuberculosis and in Pseudomonas putida KT2440 (Fraaije et al. 2004, Rehdorf et al. 2007), and both of these enzymes, share on the phylogenetic level, closest homology to BVMO1 of C. testosteroni KF-1 (44% and 42% amino acid identity, 174 CHAPTER 9 respectively), but less to BVMO3 (SAPMO; 30% and 32% amino acid identity, respectively), although the latter exhibits the highest specific activities with 2-decanone and ethionamide. Further, BVMO3 is phylogenetically more closely related to the steroid BVMO of Rhodococcus rhodochrous, but accepts no steroids as substrates (see chapter 6). Hence, the phylogenetic relationships of BVMOs seem to give no indication of their substrate specificities, and there are both, rather promiscuous enzymes (BVMO3) as well as comparatively specified enzymes (BVMO2) existing, as far as the substrate range tested in this study is concerned. The physiological function for BVMO4 is still unclear; it exhibited activity with 2-decanone and ethionamide only, however, with very poor kinetic magnitudes (chapter 6).

Table 1. Substrate range tested for the four BVMOs of strain KF-1 (, measurable activity; , no measurable activity; n.d., not determined)

BVMO 1              

BVMO 2   n.d.  n.d. n.d. n.d. n.d.      

BVMO 3              

BVMO 4              

The physiological reaction product of BVMO3, 4-sulfophenyl acetate, was shown to be hydrolytically cleaved into acetate and 4-sulfophenol (chapter 5). The latter was initially thought to be converted by a 4-sulfophenol 2-monooxygenase to 4-sulfocatechol (Schleheck et al. 2010). The cleavage of 4-sulfocatechol, leading to 3-sulfomuconate, was already shown to be performed by a modified protocatechuate 3,4-dioxygenase in Hydrogenophaga intermedia S1 (Contzen et al. 2001). This type of ring cleavage has been used as a default hypothesis for the 3-C4-SPC degradation in C. testosteroni KF-1, since there is a measurable 4-sulfocatechol dependent oxygen consumption with induced strain KF-1 whole cells, as well as with cell extracts (Schleheck et al. 2010, chapter 7). A suitable candidate that might be responsible for the ring cleavage reaction was found quickly by proteomics, whose gene product, however, did not convert 4-sulfocatechol. It turned out that hydroxyquinol is the substrate for the ring cleavage dioxygenase, and this finding rejects the initially proposed pathway via a desulfonation after a ring cleavage of 4-sulfocatechol; the 4-sulfocatechol, however, may still be an appropriate intermediate in the formation of hydroxyquinol (see chapter 7). CHAPTER 9 175

Additionally, the proteomic approach identified several other induced candidates that would fit nicely into the proposed route of 3-C4-SPC degradation in C. testosteroni KF-1. Clearly, there is still a lot more work to do in order to close the gaps in understanding the complete degradation of this individual SPC, of 3-C4-SPC, on the enzymatic and genetic level. However, most of the identified genes and candidate genes are encoded in a contiguous arrangement of gene clusters that is unique in C. testosteroni KF-1, and that shows clear remnants of their recent mobilization, i.e., through the transposable elements (IS1071 elements), which are co-encoded in this genome region (see chapter 7). Overall, it seems as if the pathway for xenobiotic

3-C4-SPC has been assembled only very recently (recently in the evolutionary time scale) in C. testosteroni KF-1.

In contrast, the bacterial degradation of the natural and supposedly ‘ancient’ organosulfonate SQ has had much more time to fully shape and optimize, e.g., its pathway(s), gene organization and regulation. This is exemplified by the first SQ pathway that is defined in such detail, sulfoglycolysis in Escherichia coli K-12 (see chapter 8). However, the basic concept is the same: slight modification of the pre-existing in order to fulfill the new metabolic duties, rather than reinvention. The substrate SQ is degraded in analogy to an also very ancient pathway, the ‘normal’ glycolysis (Embden-Meyerhof-Parnas (EMP) pathway). The genes for SQ are encoded in one extra operational unit, and this gene cluster (presumed operon) is widespread in Enterobacteriaceae. Sulfoglycolysis operates with a first isomerization reaction of SQ to sulfofructose, followed by a phosphorylation yielding sulfofructosephosphate, the substrate for the subsequent aldolytic cleavage, whose products are dihydroxyacetonephosphate and sulfolactaldehyde. The latter is reduced to DHPS, which is excreted and can serve as growth substrate for other organisms, e.g. Cupriavidus pinatubonensis JMP134 (Denger et al. 2012, chapter 8). In fact, the remarkable analogy of sulfoglycolysis to the EMP pathway is not surprising, since the EMP pathway is evolutionary optimized in regard to product formation and achieved by the fewest possible reaction steps (Bräsen et al. 2014). Additionally, the EMP pathway is, at least in parts, present in all organisms (Fothergill-Gilmore and Michels 1993), and these, which are able to utilize SQ did not have to ‘reinvent the wheel’ for its catabolism. This applies also to other options for glucose catabolism (Roy et al. 2003): For instance, the SQ-utilizing Pseudomonas putida SQ1, which was isolated from Lake Constance, excretes a different sulfonated product, 3-sulfolactate, and preliminary evidence suggests that a SQ degradation pathway analogous to the glucose dissimilation route via the Entner-Doudoroff pathway, is operative in strain SQ-1 (Ann-Katrin Felux, personal communication). Notably, the 176 CHAPTER 9 excreted 3-sulfolactate is the growth substrate for other second-tier organisms, e.g. Paracoccus pantotrophus NKNCYSA (Denger et al. 2012).

The bacteria’s motivation for each catabolic pathway is the same: gorge whatever is possible, based on their metabolic capabilities and relative to opportunity. In this thesis, two examples for catabolic pathways for natural (SQ) and of xenobiotic (LAS/SPCs) organosulfonates have been examined, and in both cases, there seem to be two tiers of bacteria necessary in order to catalyze these pathways completely. Based on their metabolic capabilities, the members of the first tier can utilize only a part of the carbon, provided by LAS and SQ, and the remainder is excreted (SPCs and DHPS or 3-sulfolactate), but this opens up the opportunity for the second tier of organisms to thrive on these excreted, still sulfonated, remainder molecules. Hence, the work with defined bacterial communities as laboratory model systems mirrors, even though in a highly simplified version, the interactions between the members of the environmental microbial communities and their complex catabolic network. On the other hand, from the evolutionary perspective, it would be surprising if there had not been a strong selection for complete pathways in the same organism: Such organisms may not yet exist for the ‘young’ substrate LAS, or may not yet be available as isolated laboratory model organisms, but such organisms do exist for SQ (Martelli 1967), however, the respective strain, isolated by that time, has not been maintained in any culture collection (Cook and Denger 2002).

CHAPTER 10 177

CHAPTER 10

Appendix

Abbreviations °C degree Celsius

1D-PAGE one dimensional polyacrylamide gel electrophoresis

2-C4-SPC 2-(4-sulfophenyl)butyrate

2D-PAGE two dimensional polyacrylamide gel electrophoresis

2Fe-2S Rieske type iron-sulfur cluster

3-C4-SPC 3-(4-sulfophenyl)butyrate

3-C4-2en-SPC 3-(4-sulfophenyl)-Δ2-butyrate

3-C5-SPC 3-(4-sulfophenyl)pentanoate

3-C5-2en-SPC 3-(4-sulfophenyl)-Δ2-pentanoate

3-C12-LAS 3-(4-sulfophenyl)dodecane

4-C5-SPC 4-(4-sulfophenyl)pentanoate

4-C5-2en-SPC 4-(4-sulfophenyl)-Δ2-pentanoate

4-C6-SPC 4-(4-sulfophenyl)hexanoate

4-C6-2en-SPC 4-(4-sulfophenyl)-Δ2-hexanoate

5-C6-SPC 5-(4-sulfophenyl)hexanoate

Å angstrom (10-10m)

AAP 4-aminoacetophenone

ADD androsta-1,4-diene-3,17-dione

ADP adenosine diphosphate amu atomic mass unit

AP acetophenone aph genes responsible for the meta degradation of phenol appr./approx. approximately 178 CHAPTER 10 ars genes responsible for arsenical resistance

ATP adenosine triphosphate

Au gold (aurum)

BA benzaldehyde

Bcr/CflA drug resistance transporter subfamily (including bicyclomycin resistance protein)

BenAB benzoate dioxygenase

BLASTn Basic Local Alignment Search Tool (for nucleotide sequences)

BLASTp Basic Local Alignment Search Tool (for amino acid sequences) box genes responsible for CoA-dependent aerobic benzoate degradation bp base pairs

BphC biphenyl dioxygenase

BSA bovine serum albumin

Bsh bile salt hydrolase

BVMO Baeyer-Villiger monooxygenase c centi (10-2, as prefix)

CA California cDNA complementary deoxyribonucleic acid

CDSs Coding DNA Sequences cf. to be compared with (confer)

CHMO cyclohexanone Baeyer-Villiger monooxygenase cnb genes responsible for the 4-chloronitrobenzene degradation cntA gene for a cobalt nickel transporter

CoA coenzyme A

COG Clusters of Orthologous Groups cop genes responsible for copper resistance

CRISPR clustered regularly interspaced short palindromic repeats

CTAB cetyltrimethylammonium bromide

C. testosteroni Comamonas testosteroni CHAPTER 10 179

CuyA cysteate sulfo-lyase

CYP cytochrome-P450

CzcA heavy metal resistance protein (for cobalt, zinc and cadmium resistance)

D attenuance

Da Dalton

D. acidovorans Delftia acidovorans dca genes responsible for dichloroaniline degradation

DEAE diethylethanolamine del genes responsible for delftibaction production

DFG Deutsche Forschungsgemeinschaft

DHAP dihydroxyacetone phosphate

DHPS 2,3-dihydroxypropanesulfonate

DNA deoxyribonucleic acid

DOE Department of Energy

DSM(Z) Deutsche Sammlung von Mikroorganismen (und Zellkulturen)

DTT dithiothreitol

DUF domain of unknown function

EC Enzyme Commission number

E. coli Escherichia coli e.g. for example (exempli gratia)

EHEC enterohemorrhagic Escherichia coli

ELSD evaporative light scattering detector

EMP Embden-Meyerhof-Parnas

EmrB/QacA drug resistance transporter subfamily (including multiple drug resistance efflux pump)

ESI electrospray ionisation est gene encoding an esterase et al. and others (et alii) etfA electron transfer flavoprotein alpha subunit 180 CHAPTER 10 eV electron volt

FAD flavin adenine dinucleotide

Fba fructose bisphosphate aldolase

Fe iron (ferrum)

FMN flavin mononucleotide for forward fruA fructose PTS permease fruA subunit fruB fructose PTS permease fruB subunit g grams g earth´s standard acceleration

GAP glyceraldehyde-3-phosphate

GC gas chromatography

GC content guanine-cytosine content

GC skew (guanine – cytosine)/(guanine + cytosine) gen. nov. new genus (genus novum) gltA gene encoding a citrate synthase

GOLD Genomes OnLine Database h hours

HAP 4-hydroxyacetophenone

HAPMO 4-hydroxyacetophenone Baeyer-Villiger monooxygenase

His histidine

HMP Human Microbiome Project

HpaB 4-hydroxyphenylacetate 3-monooxygenase

HPAc 4-hydroxyphenyl acetate

HPLC high-performance liquid chromatography

HPP 4-hydroxypropiophenone

HR high resolution

ID identifier

IDA Inferred from Direct Assay CHAPTER 10 181 i.e. that is (id est)

IEF isoelectric focusing

IMG Integrated Microbial Genomes

IN Indiana iph genes responsible for the isophthalate degradation

IPTG isopropyl-β-D-thiogalactopyranoside

IS insertion sequence ivaA gene responsible for the demethylation of isovanillate

JGI Joint Genome Institute k kilo (103, as prefix) kcat turnover rate

KEGG Kyoto Encyclopedia of Genes and Genomes

Km Michaelis constant

KoRS-CB Konstanz Research School Chemical Biology

KshA 3-ketosteroid 9-monooxygenase subunit A l liter/litre

LABs linear alkylbenzenes

LAS linear alkylbenzenesulfonate

LB lysogeny broth

LigB extradiol ring cleavage dioxygenase class III protein subunit B

µ micro (10-6, as prefix)

µ specific growth rate m meter m milli (10-3, as prefix)

M molar

M mega (106, as prefix)

MacA macrolide export protein

MD Maryland

MEGA Molecular Evolutionary Genetics Analysis 182 CHAPTER 10 mer genes responsible for mercury resistance

MES 2-(N-morpholino)ethanesulfonic acid

MFP membrane fusion protein

MFS major facilitator superfamily

(M-H)- deprotonated molecule ion measured in negative mode mhp genes responsible for the 3-(3-hydroxyphenyl)propionic acid degradation

MhpA 3-(3-hydroxyphenyl)propionate hydroxylase

MIGS minimum information about a genome sequence min minute

MMLV moloney murine leukemia virus mnb genes responsible for the 3-nitrobenzoate degradation

MO monooxygenase

MOPS 3-morpholinopropane-1-sulfonic acid

MS mass spectrometry

MS-MS tandem mass spectrometry

Mr relative mass mRNA messenger ribonucleic acid m/z mass-to-charge ratio n nano (10-9, as prefix)

NAD+ nicotinamide adenine dinucleotide (oxidized)

NADH nicotinamide adenine dinucleotide (reduced)

NADP+ nicotinamide adenine dinucleotide phosphate (oxidized)

NADPH nicotinamide adenine dinucleotide phosphate (reduced)

NAP 4-nitroacetophenone

NAS non-traceable Author Statement

NAST Nearest Alignment Space Termination nbz genes responsible for the nitrobenzene degradation

NCBI National Center for Biotechnology Information

NCIMB National Collection of Industrial, Food and Marine Bacteria CHAPTER 10 183

NC-IUBMB Nomenclature Committee of the International Union of Biochemistry and Molecular Biology

Ni nickel

NMR Nuclear magnetic resonance no. number

NP nonylphenol

NpdB hydroxyquinol dioxygenase in the para-nitrophenol catabolism

NpdC maleylacetate reductase in the para-nitrophenol catabolism

NTA nitrilotriacetic acid

OD580 optical density at a wavelength of 580 nanometers

OmpL outer membrane protein L

ORF open reading frame

ORNL Oak Ridge National Laboratory p pico (10-12, as prefix)

Pa Pascal

PA phenylacetone paa genes responsible for aerobic phenylacetate degradation

Paa phenylacetate

PAGE polyacrylamide gel electrophoresis

PAH polycyclic aromatic hydrocarbons

PAMO phenylacetone Baeyer-Villiger monooxygenase parAB partitioning proteins (act in chromosome segregation) pbr genes responsible for cadmium/lead resistance

PCBs polychlorinated biphenyls

PCR polymerase chain reaction

PdeA phosphodiesterase

PDD pregna-1,4-diene-3,20-dione

Pfam protein families database

Pfk phosphofructokinase 184 CHAPTER 10

PF-MS peptide fingerprinting-mass spectrometry

Pgi phosphoglucose isomerase pH pondus Hydrogenium

PhaR polyhydroxyalkanoate synthesis repressor

PhaZ polyhydroxyalkanoate depolymerase

PHB polyhydroxybutyrate

PhbC polyhydroxybutyrate synthase phc genes responsible for phenol catabolism

Pil pilus assembly protein

P. lavamentivorans Parvibaculum lavamentivorans pmd genes responsible for the protocatechuate meta degradation

PmdAB Protocatechuate 4,5-dioxygenase

PnpF maleylacetate reductase in the para-nitrophenol catabolism

PnpG hydroxyquinol dioxygenase in the para-nitrophenol catabolism

PPE carboxylester hydrolase in Pseudomonas putida

P. putida Pseudomonas putida

PQQ pyrroloquinoline quinone

PRIAM profils pour l'Identification automatique du métabolisme

Pta phosphotransacetylase

PTS phosphotransferase system

Q11 ubiquinone 11

RDP Ribosomal Database Project rev reverse

Rfam RNA families database

RNaseP ribonuclease P (a ribozyme)

RNA ribonucleic acid

RND efflux transporter superfamily (resistance-nodulation-cell division) rpm revolutions per minute rRNA ribosomal ribonucleic acid CHAPTER 10 185

RT reverse transcription s second

SAP 4-sulfoacetophenone

SAPMO 4-sulfoacetophenone Baeyer-Villiger monooxygenase

SAS secondary alkanesulfonates

SDR short-chain dehydrogenase/reductase

SDS sodium dodecyl sulfate

SEM scanning electron microscope

SEST steroid esterase

SF 6-deoxy-6-sulfofructose

SFP 6-deoxy-6-sulfofructose-1-phosphate sil genes responsible for silver/copper resistance

SLA 3-sulfolactaldehyde

SOD superoxide dismutase

SorAB sulfite dehydrogenase (SorA, catalytic unit; SorB, cytochrome c)

SorT sulfite dehydrogenase sp. species

SP 4-sulfophenol

SPAc 4-sulfophenyl acetate

SPCs sulfophenylcarboxylates sp. nov. new species (species nova)

SQ sulfoquinovose

SQDG sulfoquinovosyldiacylglycerol

STMO steroid Baeyer-Villiger monooxygenase sucD succinate CoA ligase alpha subunit

SuyAB 3-sulfolactate sulfo-lyase

Taq Thermus aquaticus

TAS Traceable Author Statement tau genes attributed to taurine uptake and degradation 186 CHAPTER 10

Tau proteins attributed to taurine uptake and degradation tes genes attributed to testosterone uptake and degradation

Tes proteins attributed to testosterone uptake and degradation

TIC total-ion chromatogram

TMHMM transmembrane prediction using hidden Markov models

TN Tennessee

Tn3 a mobile genetic element

TOF time of flight tph genes responsible for the terephthalate degradation

Tpi triose phosphate isomerase tra conjungative transfer modules trb mating pair formation modules

TRBA Technische Regeln für Biologische Arbeitsstoffe

Tris tris(hydroxymethyl)aminomethane tRNA transfer ribonucleic acid

U unit (µmol min-1)

UDP uridinediphosphate

UK United Kingdom

UniProt Universal Protein Resource

U.S. United States

USA United States of America

UV ultraviolet v volume

V volts vanA gene responsible for the demethylation of vanillate

Vmax maximum velocity w weight

WI Wisconsin

Xsc sulfoacetaldehyde acetyltransferase CHAPTER 10 187

YihO sulfoquinovose transporter

YihP putative DHPS exporter yihQ gene encoding an alpha-glucosidase

YihQ sulfolipid alpha-glucosidase yihR gene encoding an epimerase

YihR epimerase yihS gene encoding an isomerase

YihS sulfoquinovose isomerase yihT gene encoding an aldolase

YihT SFP aldolase yihU gene encoding a dehydrogenase/reductase

YihU SLA reductase yihV gene encoding sugar kinase

YihV SF kinase

YihW putative repressor

YSF Young Scholar Fund of the University of Konstanz zntA gene attributed to a heavy metal translocating ATPase

188 CHAPTER 10

Record of Contributions

Chapter 2, 3 and 4: I contributed to the growth experiments, literature research, genome analyses and writing of the manuscripts.

Chapter 5: I did the molecular cloning, heterologous expression and purification of the recombinant BVMO enzyme, substrate screening and activity assays, and the in vitro reconstitution of the BVMO and esterase reaction sequence. Karin Denger purified the esterase. David Schleheck did the reverse transcription-PCR. Thomas Huhn synthesized 3-C4-SPC. I contributed to the writing of the manuscript.

Chapter 6: I did the growth experiments, activity assays with cells and cell-free extracts, molecular cloning, expression and purification of the BVMOs, as well as the substrate screening and all in vitro activity assays. I did the transcriptional and proteomic analyses. I did the cloning, expression and purification of the esterase together with Anna Kesberg, and the analytical chemistry together with Ann-Katrin Felux and Dieter Spiteller. I wrote the manuscript draft.

Chapter 7: I did the molecular cloning, expression and purification of the ring cleavage dioxygenase, and all kinetic studies for the recombinant and native enzymes. Some growth experiments were performed by Frederick von Netzer. I did the 2D-PAGE analyses supported by our Vertiefungskurs-students Manuel Serif, Aylin Metzner and Lena Wurmthaler. I wrote the manuscript.

Chapter 8: I did the molecular cloning, expression and purification of the candidate genes / enzymes and contributed to the transcriptional analyses and activity assays. CHAPTER 11 189

CHAPTER 11

General references

Abalain, J. H., S. Distefano, Y. Amet, E. Quemener, M. L. Abalaincolloc and H. H. Floch (1993). Cloning, DNA sequencing and expression of (3-17)beta hydroxysteroid dehydrogenase from Pseudomonas testosteroni. J. Steroid. Biochem. Mol. Biol. 44(2): 133-139. Ahmad, D., R. Massé and M. Sylvestre (1990). Cloning and expression of genes involved in 4-chlorobiphenyl transformation by Pseudomonas testosteroni: homology to polychlorobiphenyl-degrading genes in other bacteria. Gene 86: 53-61. Albuquerque, L., F. A. Rainey, A. Pena, I. Tiago, A. Veríssimo, M. F. Nobre and M. S. da Costa (2010). Tepidamorphus gemmatus gen. nov., sp. nov., a slightly thermophilic member of the Alphaproteobacteria. Syst. Appl. Microbiol. 33: 60-66. Alonso-Gutiérrez, J., A. Figueras, J. Albaigés, N. Jiménez, M. Viñas, A. M. Solanas and B. Novoa (2009). Bacterial communities from shoreline environments (Costa da Morte, Northwestern Spain) affected by the Prestige oil spill. Appl. Environ. Microbiol. 75(11): 3407-3418. Anderson, R., M. Kates and B. E. Volcani (1978). Identification of sulfolipids in non- photosynthetic diatom Nitzschia alba. Biochim. Biophys. Acta 528(1): 89-106. Arai, H., S. Akahira, T. Ohishi, M. Maeda and T. Kudo (1998). Adaptation of Comamonas testosteroni TA441 to utilize phenol: organization and regulation of the genes involved in phenol degradation. Microbiology 144 (Pt 10): 2895-2903. Arai, H., T. Ohishi, M. Y. Chang and T. Kudo (2000). Arrangement and regulation of the genes for meta-pathway enzymes required for degradation of phenol in Comamonas testosteroni TA441. Microbiology 146 (Pt 7): 1707-1715. Arai, H., T. Yamamoto, T. Ohishi, T. Shimizu, T. Nakata and T. Kudo (1999). Genetic organization and characteristics of the 3-(3-hydroxyphenyl)propionic acid degradation pathway of Comamonas testosteroni TA441. Microbiology 145 (Pt 10): 2813-2820. Armstrong, S., T. R. Patel and M. Whalen (1993). Detoxification mechanisms for 1,2,4- benzenetriol employed by a Rhodococcus sp. BPG-8. Arch. Microbiol. 159(2): 136-140. Ashburner, M., C. A. Ball, J. A. Blake, D. Botstein, H. Butler, J. M. Cherry, A. P. Davis, K. Dolinski, S. S. Dwight, J. T. Eppig, M. A. Harris, D. P. Hill, L. Issel-Tarver, A. Kasarskis, S. Lewis, J. C. Matese, J. E. Richardson, M. Ringwald, G. M. Rubin and G. Sherlock (2000). Gene ontology: tool for the unification of biology. The gene ontology consortium. Nat. Genet. 25: 25-29. Baba, T., T. Ara, M. Hasegawa, Y. Takai, Y. Okumura, M. Baba, K. A. Datsenko, M. Tomita, B. L. Wanner and H. Mori (2006). Construction of Escherichia coli K-12 in- frame, single-gene knockout mutants: the Keio collection. Mol. Syst. Biol. 2: 2006.0008. Badger, J. H. and G. J. Olsen (1999). CRITICA: Coding region identification tool invoking comparative analysis. Mol. Biol. Evol. 16: 512-524. Baeyer, A. and V. Villiger (1899). Einwirkung des Caro'schen Reagens auf Ketone. Ber. Dtsch. Chem. Ges. 32: 3625-3633. 190 CHAPTER 11

Balba, M. T., M. R. Khan and W. C. Evans (1979). The microbial degradation of a sulphanilamide-based herbicide (Asulam). Biochem. Soc. Trans. 7: 405-407. Balke, K., M. Kadow, H. Mallin, S. Sass and U. T. Bornscheuer (2012). Discovery, application and protein engineering of Baeyer-Villiger monooxygenases for organic synthesis. Org. Biomol. Chem. 10(31): 6249-6265. Benach, J., C. Filling, U. C. T. Oppermann, P. Roversi, G. Bricogne, K. D. Berndt, H. Jornvall and R. Ladenstein (2002). Structure of bacterial 3-beta/17-beta- hydroxysteroid dehydrogenase at 1.2 angstrom resolution: a model for multiple steroid recognition. Biochemistry 41(50): 14659-14668. Benning, C. (1998). Biosynthesis and function of the sulfolipid sulfoquinovosyl diacylglycerol. Annu. Rev. Plant Physiol. Plant Mol. Biol. 49: 53-75. Benning, C. (2007). Questions remaining in sulfolipid biosynthesis: a historical perspective. Photosyn. Res. 92: 199-203. Benson, A. A. (1963). The plant sulfolipid. Adv. Lipid Res. 1: 387-394. Benson, A. A., H. Daniel and R. Wiser (1959). A sulfolipid in plants. Proc. Natl. Acad. Sci. U.S.A. 45(11): 1582-1587. Benson, A. A. and I. Shibuya (1961). Sulfocarbohydrate metabolism. Fed. Proc. 20: 79. Bergmeyer, H. U. (1975). New values for the molar extinction coefficients of NADH and NADPH for the use in routine laboratories (author's transl). Z Klin Chem Klin Biochem 13(11): 507-508. Birkenmaier, A., J. Holert, H. Erdbrink, H. M. Moeller, A. Friemel, R. Schoenenberger, M. J. Suter, J. Klebensberger and B. Philipp (2007). Biochemical and genetic investigation of initial reactions in aerobic degradation of the bile acid cholate in Pseudomonas sp. strain Chol1. J. Bacteriol. 189: 7165-7173. Birkenmaier, A., H. M. Möller and B. Philipp (2011). Identification of a thiolase gene essential for beta-oxidation of the acyl side chain of the steroid compound cholate in Pseudomonas sp. strain Chol1. FEMS Microbiol. Lett. 318: 123-130. Blothe, M. and E. E. Roden (2009). Composition and activity of an autotrophic Fe(II)- oxidizing, nitrate-reducing enrichment culture. Appl. Environ. Microbiol. 75(21): 6937- 6940. Bocola, M., F. Schulz, L. F., A. Vogel, M. W. Fraaije and M. T. Reetz (2005). Converting phenylacetone monooxygenase into phenylhexanone monooxygenase by rational design: towards practical Baeyer-Villiger monooxygenases. Adv. Synth. Catal. 347: 979-986. Boon, N., J. Goris, P. De Vos, W. Verstraete and E. M. Top (2001). Genetic diversity among 3-chloroaniline- and aniline-degrading strains of the Comamonadaceae. Appl. Environ. Microbiol. 67(3): 1107-1115. Bornscheuer, U. T. (2002). Microbial carboxyl esterases: classification, properties and application in biocatalysis. FEMS Microbiol. Rev. 26(1): 73-81. Bradford, M. M. (1976). A rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding. Anal. Biochem. 72: 248-254. Bräsen, C., D. Esser, B. Rauch and B. Siebers (2014). Carbohydrate metabolism in Archaea: current insights into unusual enzymes and pathways and their regulation. Microbiol. Mol. Biol. Rev. 78(1): 89-175. CHAPTER 11 191

Brettar, I., R. Christen, J. Bötel, H. Lünsdorf and M. G. Höfle (2007). Anderseniella baltica gen. nov., sp. nov., a novel marine bacterium of the Alphaproteobacteria isolated from sediment in the central Baltic Sea. Int. J. Syst. Evol. Microbiol. 57: 2399-2405. Brüggemann, C., K. Denger, A. M. Cook and J. Ruff (2004). Enzymes and genes of taurine and isethionate dissimilation in Paracoccus denitrificans. Microbiology 150: 805-816. Buhmann, M. (2008). Charakterisierung der Biofilmbildung durch eine Tensid-abbauende Bakteriengemeinschaft Master's thesis, University of Konstanz. Cabrera, J. E., J. L. P. Paz and S. Genti-Raimondi (2000). Steroid-inducible transcription of the 3beta/17beta-hydroxysteroid dehydrogenase gene (3beta/17beta-hsd) in Comamonas testosteroni. J. Steroid. Biochem. Mol. Biol. 73(3-4): 147-152. Cai, L., G. Liu, C. Rensing and G. Wang (2009). Genes involved in arsenic transformation and resistance associated with different levels of arsenic-contaminated soils. BMC Microbiol. 9. Cain, R. B. (1987). Biodegradation of anionic surfactants. Biochem. Soc. Trans. 15: 7-22. Carlström, K. (1966). Metabolism of progesterone by Streptomyces griseus. Acta Chem. Scand. 20(9): 2620-2622. Cedergren, R. A. and R. I. Hollingsworth (1994). Occurrence of sulfoquinovosyl diacylglycerol in some members of the family Rhizobiaceae. J. Lipid Res. 35(8): 1452- 1461. Chae, J. C. and G. J. Zylstra (2006). 4-Chlorobenzoate uptake in Comamonas sp. strain DJ- 12 is mediated by a tripartite ATP-independent periplasmic transporter. J. Bacteriol. 188(24): 8407-8412. Chang, H. K. and G. J. Zylstra (2008). Examination and expansion of the substrate range of m-hydroxybenzoate hydroxylase. Biochem. Biophys. Res. Commun. 371(1): 149-153. Chen, Y. C., O. P. Peoples and C. T. Walsh (1988). Acinetobacter cyclohexanone monooxygenase: gene cloning and sequence determination. J. Bacteriol. 170: 781-789. Chen, Y. H., L. Y. Chai, Y. H. Zhu, Z. H. Yang, Y. Zheng and H. Zhang (2012). Biodegradation of kraft lignin by a bacterial strain Comamonas sp. B-9 isolated from eroded bamboo slips. J. Appl. Microbiol. 112(5): 900-906. Cleland, W. W. (1982). The use of pH studies to determine chemical mechanisms of enzyme- catalyzed reactions. Meth. Enzymol. 87: 390-405. Cole, J. R., Q. Wang, E. Cardenas, J. Fish, B. Chai, R. J. Farris, A. S. Kulam-Syed- Mohideen, D. M. McGarrell, T. Marsh, G. M. Garrity and J. M. Tiedje (2009). The Ribosomal Database Project: improved alignments and new tools for rRNA analysis. Nucleic Acids Res. 37: D141-D145. Contzen, M., S. Bürger and A. Stolz (2001). Cloning of the genes for a 4-sulphocatechol- oxidizing protocatechuate 3,4-dioxygenase from Hydrogenophaga intermedia S1 and identification of the amino acid residues responsible for the ability to convert 4- sulphocatechol. Mol. Microbiol. 41: 199-205. Cook, A. M. (1987). Biodegradation of s-triazine xenobiotics. FEMS Microbiol. Rev. 46: 93- 116. Cook, A. M., C. G. Daughton and M. Alexander (1980). Desulfuration of dialkyl thiophosphoric acids by a pseudomonad. Appl. Environ. Microbiol. 39(2): 463-465.

Cook, A. M. and K. Denger (2002). Dissimilation of the C2 sulfonates. Arch. Microbiol. 179: 1-6. 192 CHAPTER 11

Cook, A. M., H. Grossenbacher and R. Hutter (1983). Isolation and cultivation of microbes with biodegradative potential. Experientia 39(11): 1191-1198. Cook, A. M., H. Laue and F. Junker (1998). Microbial desulfonation. FEMS Microbiol. Rev. 22(5): 399-419. Cook, A. M., T. H. Smits and K. Denger (2007). Sulfonates and organotrophic sulfite metabolism. Microbial Sulfur Metabolism. C. Dahl and T. Friedrich. Berlin, Springer: 170-183. Cornish-Bowden, A. (1981). Thermodynamic aspects of glycolysis. Biochem. Educ. 9: 133- 137. Daniel, H., M. Miyano, R. O. Mumma, T. Yagi, M. Lepage, I. Shibuya and A. A. Benson (1961). The plant sulfolipid. Identification of 6-sulfo-quinovose. J. Am. Chem. Soc. 83(7): 1765-1766. Davis, G. H. G. and R. W. A. Park (1962). Taxonomic study of certain bacteria currently classified as Vibrio species. J. Gen. Microbiol. 27(1): 101-119. De Vos, P., K. Kersters, E. Falsen, B. Pot, M. Gillis, P. Segers and J. Deley (1985). Comamonas Davis and Park 1962 gen. nov., nom. rev. emend., and Comamonas terrigena Hugh 1962 sp. nov., nom. rev. Int. J. Syst. Bacteriol. 35(4): 443-453. De Weert , J. P. A., M. V. T. Grotenhuis, H. H. M. Rijnaarts and A. A. M. Langenhoff (2011). Degradation of 4-n-nonylphenol under nitrate reducing conditions. Biodegradation 22: 175-187. Delcher, A. L., K. Bratke, E. Powers and S. Salzberg (2007). Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics 23: 673-679. Denger, K. and A. M. Cook (2010). Racemase activity effected by two dehydrogenases in sulfolactate degradation by Chromohalobacter salexigens: purification of (S)- sulfolactate dehydrogenase. Microbiology 156(Pt 3): 967-974. Denger, K., T. Huhn, K. Hollemeyer, D. Schleheck and A. M. Cook (2012). Sulfoquinovose degraded by pure cultures of bacteria with release of C3 organosulfonates: complete degradation in 2-member communities. FEMS Microbiol. Lett. 328: 39-45. Denger, K., T. H. Smits and A. M. Cook (2006). L-cysteate sulpho-lyase, a widespread pyridoxal 5'-phosphate-coupled desulphonative enzyme purified from Silicibacter pomeroyi DSS-3T. Biochem. J. 394(Pt 3): 657-664. Denger, K., S. Weinitschke, T. H. Smits, D. Schleheck and A. M. Cook (2008). Bacterial sulfite dehydrogenases in organotrophic metabolism: separation and identification in Cupriavidus necator H16 and in Delftia acidovorans SPH-1. Microbiology 154(Pt 1): 256-263. DeSantis, T. Z., P. Hugenholtz, K. Keller, E. L. Brodie, N. Larsen, Y. M. Piceno, R. Phan and G. L. Andersen (2006). NAST: a multiple sequence alignment server for comparative analysis of 16S rRNA genes. Nucleic Acids Res. 34: 394-399. Dodgson, K. S. and F. F. White (1983). Some microbial enzymes involved in the biodegradation of sulphated surfactants. Topics in Enzyme and Fermentation Biotechnology 7: 90-155. Dong, W., P. Eichhorn, S. Radajewski, D. Schleheck, K. Denger, T. P. Knepper, J. C. Murrell and A. M. Cook (2004). Parvibaculum lavamentivorans converts linear alkylbenzenesulfonate (LAS) surfactant to sulfophenylcarboxylates, ,-unsaturated sulfophenylcarboxylates and sulfophenyldicarboxylates, which are degraded in communities. J. Appl. Microbiol. 96: 630-640. CHAPTER 11 193

Dyrløv Bendtsen, J. D., H. Nielsen, G. von Heijne and S. Brunak (2004). Improved prediction of signal peptides: SignalP 3.0. J. Mol. Biol. 340: 783-795. Eichhorn, E., J. R. Van der Ploeg and T. Leisinger (2000). Deletion analysis of the Escherichia coli taurine and alkanesulfonate transport systems. J. Bacteriol. 182(10): 2687-2695. Eilers, H., J. Pernthaler, J. Peplies, F. O. Glöckner, G. Gerdts and R. Amann (2001). Isolation of novel pelagic bacteria from the German Bight and their seasonal contributions to surface picoplankton. Appl. Environ. Microbiol. 67: 5134-5142. Erb, T. J., I. A. Berg, V. Brecht, M. Müller, G. Fuchs and B. E. Alber (2007). Synthesis of C5-dicarboxylic acids from C2-units involving crotonyl-CoA carboxylase/reductase: the ethylmalonyl-CoA pathway. Proc. Natl. Acad. Sci. U.S.A. 104(25): 10631-10636. Erb, T. J., G. Fuchs and B. E. Alber (2009). (2S)-Methylsuccinyl-CoA dehydrogenase closes the ethylmalonyl-CoA pathway for acetyl-CoA assimilation. Mol. Microbiol. 73(6): 992-1008. Erdlenbruch, B. N., D. P. Kelly and J. C. Murrell (2001). Alkanesulfonate degradation by novel strains of Achromobacter xylosoxidans, Tsukamurella wratislaviensis and Rhodococcus sp., and evidence for an ethanesulfonate monooxygenase in A. xylosoxidans strain AE4. Arch. Microbiol. 176(6): 406-414. Euzéby, J. (2005). Notification that new names and new combinations have appeared in volume 54, part 5, of the IJSEM. Int. J. Syst. Evol. Microbiol. 55: 3-5. Euzéby, J. (2008). Notification that new names and new combinations have appeared in volume 57, part 10, of the IJSEM. Int. J. Syst. Evol. Microbiol. 58: 3-4. Euzéby, J. (2010). List of new names and new combinations previously effectively, but not validly, published. Int. J. Syst. Evol. Microbiol. 60: 1477-1479. Ewing, B. and P. Green (1998). Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res. 8: 186-194. Ewing, B., L. Hillier, M. C. Wendl and P. Green (1998). Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res. 8: 175-185. Felux, A.-K., K. Denger, M. Weiss, A. M. Cook and D. Schleheck (2013). Paracoccus denitrificans PD1222 utilizes hypotaurine via transamination followed by spontaneous desulfination to yield acetaldehyde, and finally acetate for growth. J. Bacteriol. 195: 2921-2930. Fernandez, C., A. Ferrandez, B. Minambres, E. Diaz and J. L. Garcia (2006). Genetic characterization of the phenylacetyl-coenzyme A oxygenase from the aerobic phenylacetic acid degradation pathway of Escherichia coli. Appl. Environ. Microbiol. 72(11): 7422-7426. Field, D., G. M. Garrity, T. Gray, N. Morrison, J. Selengut, P. Sterk, T. Tatusova, N. Thomson, M. J. Allen, S. V. Angiuoli, M. Ashburner, N. Axelrod, S. Baldauf, S. Ballard, J. Boore, G. Cochrane, J. Cole, P. Dawyndt, P. De Vos, C. DePamphilis, R. Edwards, N. Faruque, R. Feldman, J. Gilbert, P. Gilna, F. O. Glöckner, P. Goldstein, R. Guralnick, D. Haft, D. Hancock, H. Hermjakob, C. Hertz-Fowler, P. Hugenholtz, I. Joint, L. Kagan, M. Kane, J. Kennedy, G. Kowalchuk, R. Kottmann, E. Kolker, S. Kravitz, N. C. Kyrpides, J. Leebens-Mack, S. E. Lewis, K. Li, A. L. Lister, P. Lord, N. Maltsev, V. M. Markowitz, J. Martiny, B. Methe, I. Mizrachi, R. Moxon, K. Nelson, J. Parkhill, L. Proctor, O. White, S. A. Sansone, A. Spiers, R. Stevens, P. Swift, C. Taylor, Y. Tateno, A. Tett, S. Turner, D. Ussery, B. Vaughan, N. Ward, T. Whetzel, I. San Gil, G. Wilson and A. Wipat (2008). The 194 CHAPTER 11

minimum information about a genome sequence (MIGS) specification. Nat. Biotechnol.: 541-547. Florin, C., T. Kohler, M. Grandguillot and P. Plesiat (1996). Comamonas testosteroni 3- ketosteroid-4 (5)-dehydrogenase: gene and protein characterization. J. Bacteriol. 178(11): 3322-3330. Fonken, G. S., H. C. Murray and L. M. Reineke (1960). Pathway of progesterone oxidation by Cladosporium resinae. J. Amer. Chem. Soc. 82(20): 5507-5508. Fothergill-Gilmore, L. A. and P. A. Michels (1993). Evolution of glycolysis. Prog. Biophys. Mol. Biol. 59(2): 105-235. Fraaije, M. W., N. M. Kamerbeek, A. J. Heidekamp, R. Fortin and D. B. Janssen (2004). The prodrug activator EtaA from Mycobacterium tuberculosis is a Baeyer-Villiger monooxygenase. J. Biol. Chem. 279(5): 3354-3360. Fraaije, M. W., N. M. Kamerbeek, W. J. van Berkel and D. B. Janssen (2002). Identification of a Baeyer-Villiger monooxygenase sequence motif. FEBS Lett. 518(1- 3): 43-47. Fraaije, M. W., J. Wu, D. P. Heuts, E. W. van Hellemond, J. H. Spelberg and D. B. Janssen (2005). Discovery of a thermostable Baeyer-Villiger monooxygenase by genome mining. Appl. Microbiol. Biotechnol. 66(4): 393-400. Franceschini, S., H. L. van Beek, A. Pennetta, C. Martinoli, M. W. Fraaije and A. Mattevi (2012). Exploring the structural basis of substrate preferences in Baeyer-Villiger monooxygenases: insight from steroid monooxygenase. J. Biol. Chem. 287(27): 22626- 22634. Fried, J., R. W. Thoma and A. Klingsberg (1953). Oxidation of steroids by microorganisms. III. Side chain degradation, ring D-cleavage and dehydrogenation in ring A. J. Amer. Chem. Soc. 75(22): 5764-5765. Fukuhara, Y., K. Inakazu, N. Kodama, N. Kamimura, D. Kasai, Y. Katayama, M. Fukuda and E. Masai (2010). Characterization of the isophthalate degradation genes of Comamonas sp. strain E6. Appl. Environ. Microbiol. 76(2): 519-527. Garrity, G. M., J. A. Bell and T. Lilburn (2005a). Class I. Alphaproteobacteria class. nov. Bergey's Manual of Systematic Bacteriology, second edition, part C. G. M. Garrity, D. J. Brenner, N. R. Krieg and J. T. Staley. New York, Springer. 2: p. 1. Garrity, G. M., J. A. Bell and T. Lilburn (2005b). Class II. Betaproteobacteria class. nov. Bergey's Manual of Systematic Bacteriology, second edition, part C. G. M. Garrity, D. J. Brenner, N. R. Krieg and J. T. Staley. New York, Springer. 2: p. 575. Garrity, G. M., J. A. Bell and T. Lilburn (2005c). Family X. Rhodobiaceae fam. nov. Bergey's Manual of Systematic Bacteriology, second edition (The Proteobacteria), part C (The Alpha-, Beta-, Delta-, and Epsilonproteobacteria). D. J. Brenner, N. R. Krieg, J. T. Staley and G. M. Garrity. New York, Springer. 2: 571. Garrity, G. M., J. A. Bell and T. Lilburn (2005d). Order I. Burkholderiales ord. nov. Bergey's Manual of Systematic Bacteriology, second edition, part C. G. M. Garrity, D. J. Brenner, N. R. Krieg and J. T. Staley. New York, Springer. 2: p. 575. Garrity, G. M., J. A. Bell and T. Lilburn (2005e). Phylum XIV. Proteobacteria phyl. nov. Bergey's Manual of Systematic Bacteriology, second edition, part B. G. M. Garrity, D. J. Brenner, N. R. Krieg and J. T. Staley. New York, Springer. 2: p. 1. Gescher, J., A. Zaar, M. Mohamed, H. Schagger and G. Fuchs (2002). Genes coding for a new pathway of aerobic benzoate metabolism in Azoarcus evansii. J. Bacteriol. 184(22): 6301-6315. CHAPTER 11 195

Godocíková, J., V. Bohácová, M. Zámocký and B. Polek (2005). Production of catalases by Comamonas spp. and resistance to oxidative stress. Folia Microbiol. 50(113-118). Gong, W., M. Kisiela, M. B. Schilhabel, G. Xiong and E. Maser (2012a). Genome sequence of Comamonas testosteroni ATCC 11996, a representative strain involved in steroid degradation. J. Bacteriol. 194(6): 1633-1634. Gong, W., G. Xiong and E. Maser (2012b). Identification and characterization of the LysR- type transcriptional regulator HsdR for steroid-inducible expression of the 3- hydroxysteroid dehydrogenase/carbonyl reductase gene in Comamonas testosteroni. Appl. Environ. Microbiol. 78(4): 941-950. Gordon, D., C. Abajian and P. Green (1998). Consed: a graphical tool for sequence finishing. Genome Res. 8: 195-202. Graham, D. E., H. Xu and R. H. White (2002). Identification of coenzyme M biosynthetic phosphosulfolactate synthase: a new family of sulfonate biosynthesizing enzymes. J. Biol. Chem. 277: 13421-13429. Griffiths-Jones, S., A. Bateman, M. Marshall, A. Khanna and S. R. Eddy (2003). Rfam: an RNA family database Nucleic Acids Res. 31: 439-441. Halak, S., T. Basta, S. Burger, M. Contzen and A. Stolz (2006). Characterization of the genes encoding the 3-carboxy-cis,cis-muconate-lactonizing enzymes from the 4-sulfocatechol degradative pathways of Hydrogenophaga intermedia S1 and Agrobacterium radiobacter S2. Microbiology 152(Pt 11): 3207-3216. Halak, S., T. Basta, S. Burger, M. Contzen, V. Wray, D. H. Pieper and A. Stolz (2007). 4- sulfomuconolactone hydrolases from Hydrogenophaga intermedia S1 and Agrobacterium radiobacter S2. J. Bacteriol. 189(19): 6998-7006. Han, C. S. and P. Chain (2006). Finishing repeat regions automatically with Dupfinisher. Proceedings of the 2006 international conference on bioinformatics & computational biology. H. R. Arabnia and H. Valafar, CSREA Press: 141-146. Harwood, J. L. and R. G. Nicholls (1979). The plant sulpholipid- a major component of the sulphur cycle. Biochem. Soc. Trans. 7: 440-447. Hashimoto, I., D. M. Henricks, L. L. Anderson and R. M. Melampy (1968). Progesterone and pregn-4-en-20 alpha-ol-3-one in ovarian venous blood during various reproductive states in the rat. Endocrinology 82(2): 333-341. Hedegaard, J. and I. C. Gunsalus (1965). Mixed function oxidation. IV. An induced methylene hydroxylase in camphor oxidation. J. Biol. Chem. 240: 4038-4043. Henke, E., J. Pleiss and U. T. Bornscheuer (2002). Activity of lipases and esterases towards tertiary alcohols: insights into structure-function relationships. Angew. Chem. Int. Ed. Engl. 41(17): 3211-3213. Higson, F. K. and D. D. Focht (1990). Bacterial degradation of ring-chlorinated acetophenones. Appl. Environ. Microbiol. 56(12): 3678-3685. Hilyard, E. J., J. M. Jones-Meehan, B. J. Spargo and R. T. Hill (2008). Enrichment, isolation, and phylogenetic identification of polycyclic aromatic hydrocarbon-degrading bacteria from Elizabeth River sediments. Appl. Environ. Microbiol. 74(4): 1176-1182. Hoffmann, D., S. Kleinsteuber, R. H. Müller and W. Babel (2003). A transposon encoding the complete 2,4-dichlorophenoxyacetic acid degradation pathway in the alkalitolerant strain Delftia acidovorans P4a. Microbiology 149: 2545-2556. Horinouchi, M., T. Hayashi, H. Koshino, T. Kurita and T. Kudo (2005). Identification of 9,17-dioxo-1,2,3,4,10,19-hexanorandrostan-5-oic acid, 4-hydroxy-2-oxohexanoic acid, and 2-hydroxyhexa-2,4-dienoic acid and related enzymes involved in testosterone 196 CHAPTER 11

degradation in Comamonas testosteroni TA441. Appl. Environ. Microbiol. 71(9): 5275- 5281. Horinouchi, M., T. Hayashi, H. Koshino, M. Malon, T. Yamamoto and T. Kudo (2008). Identification of genes involved in inversion of stereochemistry of a C-12 hydroxyl group in the catabolism of cholic acid by Comamonas testosteroni TA441. J. Bacteriol. 190(16): 5545-5554. Horinouchi, M., T. Hayashi, H. Koshino, T. Yamamoto and T. Kudo (2003a). Gene encoding the hydrolase for the product of the meta-cleavage reaction in testosterone degradation by Comamonas testosteroni. Appl. Environ. Microbiol. 69(4): 2139-2152. Horinouchi, M., T. Hayashi and T. Kudo (2004a). The genes encoding the hydroxylase of 3- hydroxy-9,10-secoandrosta-1,3,5(10)-triene-9,17-dione in steroid degradation in Comamonas testosteroni TA441. J. Steroid. Biochem. Mol. Biol. 92(3): 143-154. Horinouchi, M., T. Hayashi and T. Kudo (2012). Steroid degradation in Comamonas testosteroni. J. Steroid. Biochem. Mol. Biol. 129(1-2): 4-14. Horinouchi, M., T. Hayashi, T. Yamamoto and T. Kudo (2003b). A new bacterial steroid degradation gene cluster in Comamonas testosteroni TA441 which consists of aromatic- compound degradation genes for seco-steroids and 3-ketosteroid dehydrogenase genes. Appl. Environ. Microbiol. 69: 4421-4430. Horinouchi, M., T. Kurita, T. Hayashi and T. Kudo (2010). Steroid degradation genes in Comamonas testosteroni TA441: Isolation of genes encoding a 4(5)-isomerase and 3- and 3-dehydrogenases and evidence for a 100 kb steroid degradation gene hot spot. J. Steroid Biochem. Mol. Biol. 122(4): 253-263. Horinouchi, M., T. Kurita, T. Yamamoto, E. Hatori, T. Hayashi and T. Kudo (2004b). Steroid degradation gene cluster of Comamonas testosteroni consisting of 18 putative genes from meta-cleavage enzyme gene tesB to regulator gene tesR. Biochem. Biophys. Res. Co. 324(2): 597-604. Horinouchi, M., T. Yamamoto, K. Taguchi, H. Arai and T. Kudo (2001). Meta-cleavage enzyme gene tesB is necessary for testosterone degradation in Comamonas testosteroni TA441. Microbiology 147: 3367-3375. Huson, D. H., D. C. Richter, C. Rausch, T. Dezulian, M. Franz and R. Rupp (2007). Dendroscope: An interactive viewer for large phylogenetic trees. BMC Bioinformatics 8: 460. Hyatt, D., G. L. Chen, P. F. Locascio, M. L. Land, F. W. Larimer and L. J. Hauser (2010). Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11: e119. Hynninen, A., T. Touze, L. Pitkanen, D. Mengin-Lecreulx and M. Virta (2009). An efflux transporter PbrA and a phosphatase PbrB cooperate in a lead-resistance mechanism in bacteria. Mol. Microbiol. 74(2): 384-394. Isono, Y., H. Mohri and Y. Nagai (1967). Effect of egg sulpholipid on respiration of sea urchin spermatozoa. Nature 214(5095): 1336-1338. Itagaki, E. (1986a). Studies on steroid monooxygenase from Cylindrocarpon radicicola ATCC 11011. Oxygenative lactonization of androstenedione to testololactone. J. Biochem. 99(3): 825-832. Itagaki, E. (1986b). Studies on steroid monooxygenase from Cylindrocarpon radicicola ATCC 11011. Purification and characterization. J. Biochem. 99(3): 815-824. Jain, R. K., J. H. Dreisbach and J. C. Spain (1994). Biodegradation of p-nitrophenol via 1,2,4-benzenetriol by an Arthrobacter sp. Appl. Environ. Microbiol. 60(8): 3030-3032. CHAPTER 11 197

Jendrossek, D., M. Backhaus and M. Andermann (1995). Characterization of the extracellular poly(3-hydroxybutyrate) depolymerase of Comamonas sp. and of its structural gene. Can. J. Microbiol. 41 Suppl 1: 160-169. Johnston, C. W., M. A. Wyatt, X. Li, A. Ibrahim, J. Shuster, G. Southam and N. A. Magarvey (2013). Gold biomineralization by a metallophore from a gold-associated microbe. Nature Chem. Biol. 9(4): 241-243. Junker, F. and A. M. Cook (1997). Conjugative plasmids and the degradation of arylsulfonates in Comamonas testosteroni. Appl. Environ. Microbiol. 63(6): 2403-2410. Junker, F., E. Saller, H. R. S. Oppenberg, P. M. H. Kroneck, T. Leisinger and A. M. Cook (1996). Degradative pathways for p-toluenecarboxylate and p-toluenesulfonate and their multicomponent oxygenases in Comamonas testosteroni strains PSB-4 and T-2. Microbiology 142: 2419-2427. Kamerbeek, N. M., M. J. Moonen, J. G. Van Der Ven, W. J. Van Berkel, M. W. Fraaije and D. B. Janssen (2001). 4-Hydroxyacetophenone monooxygenase from Pseudomonas fluorescens ACB. A novel flavoprotein catalyzing Baeyer-Villiger oxidation of aromatic compounds. Eur. J. Biochem. 268(9): 2547-2557. Kamerbeek, N. M., A. J. Olsthoorn, M. W. Fraaije and D. B. Janssen (2003). Substrate specificity and enantioselectivity of 4-hydroxyacetophenone monooxygenase. Appl. Environ. Microbiol. 69(1): 419-426. Kamimura, N., T. Aoyama, R. Yoshida, K. Takahashi, D. Kasai, T. Abe, K. Mase, Y. Katayama, M. Fukuda and E. Masai (2010). Characterization of the protocatechuate 4,5-cleavage pathway operon in Comamonas sp. strain E6 and discovery of a novel pathway gene. Appl. Environ. Microbiol. 76(24): 8093-8101. Kappler, U., B. Bennett, J. Rethmeier, G. Schwarz, R. Deutzmann, A. G. McEwan and C. Dahl (2000). Sulfite:Cytochrome c oxidoreductase from Thiobacillus novellus. Purification, characterization, and molecular biology of a heterodimeric member of the sulfite oxidase family. J. Biol. Chem. 275(18): 13202-13212. Kasai, D., M. Kitajima, M. Fukuda and E. Masai (2010). Transcriptional regulation of the terephthalate catabolism operon in Comamonas sp. strain E6. Appl. Environ. Microbiol. 76(18): 6047-6055. Kelly, D. P. and J. C. Murrell (1999). Microbial metabolism of methanesulfonic acid. Arch. Microbiol. 172(6): 341-348. Kennedy, S. I. T. and C. A. Fewson (1968). Enzymes of the mandelate pathway in bacterium N.C.I.B. 8250. Biochem. J. 107: 497-506. Kjeldsen, K. U., B. V. Kjellerup, K. Egli, B. Frolund, P. H. Nielsen and K. Ingvorsen (2007). Phylogenetic and functional diversity of bacteria in biofilms from metal surfaces of an alkaline district heating system. FEMS Microbiol. Ecol. 61: 384-397. Klebensberger, J., O. Rui, E. Fritz, B. Schink and B. Philipp (2006). Cell aggregation of Pseudomonas aeruginosa strain PAO1 as an energy-dependent stress response during growth with sodium dodecyl sulfate. Arch. Microbiol. 185(6): 417-427. Knepper, T. P. and J. L. Berna (2003). Surfactants: properties, production, and environmental aspects. Analysis and fate of surfactants in the aquatic environment. T. P. Knepper, D. Barceló and P. de Voogt. Amsterdam, Elsevier: 1-50. Knepper, T. P., P. Eichhorn and L. S. Bonnington (2003). Environmental processes. Analysis and fate of surfactants in the aquatic environment. T. P. Knepper, D. Barceló and P. de Voogt. Amsterdam, Elsevier: 525-653. 198 CHAPTER 11

Kölbener, P., U. Baumann, T. Leisinger and A. M. Cook (1995). Non-degraded metabolites arising from the biodegradation of commercial linear alkylbenzenesulfonate (LAS) surfactants in a laboratory trickling filter. Environ. Toxicol. Chem. 14: 561-569. Kotani, T., H. Yurimoto, N. Kato and Y. Sakai (2007). Novel acetone metabolism in a propane-utilizing bacterium, Gordonia sp. strain TY-5. J. Bacteriol. 189(3): 886-893. Krejčík, Z., D. Schleheck, K. Hollemeyer and A. M. Cook (2012). A five-gene cluster involved in utilization of taurine-nitrogen and excretion of sulfoacetaldehyde by Acinetobacter radioresistens SH164. Arch. Microbiol. 194(10): 857-863. Krogh, A., B. Larsson, G. von Heijne and E. L. L. Sonnhammer (2001). Predicting transmembrane protein topology with a hidden Markov model: Application to complete genomes. J. Mol. Biol. 305: 567-580. Krol, J. E., J. T. Penrod, H. McCaslin, L. M. Rogers, H. Yano, A. D. Stancik, W. Dejonghe, C. J. Brown, R. E. Parales, S. Wuertz and E. M. Top (2012). Role of IncP-1 plasmids pWDL7::rfp and pNB8c in chloroaniline catabolism as determined by genomic and functional analyses. Appl. Environ. Microbiol. 78(3): 828-838. Kuykendall, L. D. (2005). Order VI. Rhizobiales ord. nov. Bergey's Manual of Systematic Bacteriology, second edition, part C. G. M. Garrity, D. J. Brenner, N. R. Krieg and J. T. Staley. New York, Springer. 2: p. 324. Laemmli, U. K. (1970). Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature 227(5259): 680-685. Lagesen, K., P. F. Hallin, E. Rødland, H. H. Stærfeldt, T. Rognes and D. W. Ussery (2007). RNammer: consistent annotation of rRNA genes in genomic sequences. Nucleic Acids Res. 35: 3100-3108. Lai, Q., L. Wang, Y. Liu, J. Yuan, F. Sun and Z. Shao (2010). Parvibaculum indicum sp. nov., isolated from deep sea water of Indian Ocean. Int. J. Syst. Evol. Microbiol. Larkin, M. A., G. Blackshields, N. P. Brown, R. Chenna, P. A. McGettigan, H. McWilliam, F. Valentin, I. M. Wallace, A. Wilm, Lopez, R. , J. D. Thompson, T. J. Gibson and D. G. Higgins (2007). Clustal W and Clustal X version 2.0. Bioinformatics 23: 2947- 2948. Laue, H., K. Denger and A. M. Cook (1997). Taurine reduction in anaerobic respiration of Bilophila wadsworthia RZATAU. Appl. Environ. Microbiol. 63: 2016-2021. Leipold, F., R. Wardenga and U. T. Bornscheuer (2012). Cloning, expression and characterization of a eukaryotic cycloalkanone monooxygenase from Cylindrocarpon radicicola ATCC 11011. Appl. Microbiol. Biotechnol. 94(3): 705-717. Leisch, H., K. Morley and P. C. Lau (2011). Baeyer-Villiger monooxygenases: more than just green chemistry. Chem. Rev. 111(7): 4165-4222. Lessner, D. J., G. R. Johnson, R. E. Parales, J. C. Spain and D. T. Gibson (2002). Molecular characterization and substrate specificity of nitrobenzene dioxygenase from Comamonas sp. strain JS765. Appl. Environ. Microbiol. 68(2): 634-641. Levy-Schil, S., F. Soubrier, A. M. Crutz-Le Coq, D. Faucher, J. Crouzet and D. Petre (1995). Aliphatic nitrilase from a soil-isolated Comamonas testosteroni sp.: gene cloning and overexpression, purification and primary structure. Gene 161(1): 15-20. Li, B., L. Shen and D.-M. Zhang (2009). Dynamics of petroleum-degrading bacterial community with biodegradation of petroleum contamination in seawater. J. Ningbo University (Natural Science & Engineering Edition) 4. CHAPTER 11 199

Li, M., L. Peng, Z. Ji, J. Xu and S. Li (2008). Establishment and characterization of dual- species biofilms formed from a 3,5-dinitrobenzoic-degrading strain and bacteria with high biofilm-forming capabilities. FEMS Microbiol. Lett. 278(1): 15-21. Lister, D. L., R. F. Sproule, A. J. Britt, C. R. Lowe and N. C. Bruce (1996). Degradation of cocaine by a mixed culture of Pseudomonas fluorescens MBER and Comamonas acidovorans MBLF. Appl. Environ. Microbiol. 62(1): 94-99. Liu, S., G. G. Ying, Y. S. Liu, F. Q. Peng and L. Y. He (2013). Degradation of norgestrel by bacteria from activated sludge: comparison to progesterone. Environ. Sci. Technol. 47(18): 10266-10276. Locher, H. H., T. Leisinger and A. M. Cook (1989). Degradation of p-toluenesulphonic acid via sidechain oxidation, desulphonation and meta-ring cleavage in Pseudomonas (Comamonas) testosteroni T-2. J. Gen. Microbiol. 135: 1969-1978. Lowe, T. M. and S. R. Eddy (1997). tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25: 955-964. Ma, Y. F., J. F. Wu, S. Y. Wang, C. Y. Jiang, Y. Zhang, S. W. Qi, L. Liu, G. P. Zhao and S. J. Liu (2007). Nucleotide sequence of plasmid pCNB1 from Comamonas strain CNB-1 reveals novel genetic organization and evolution for 4-chloronitrobenzene degradation. Appl. Environ. Microbiol. 73(14): 4477-4483. Ma, Y. F., Y. Zhang, J. Y. Zhang, D. W. Chen, Y. Zhu, H. Zheng, S. Y. Wang, C. Y. Jiang, G. P. Zhao and S. J. Liu (2009). The complete genome of Comamonas testosteroni reveals its genetic adaptations to changing environments. Appl. Environ. Microbiol. 75(21): 6812-6819. Macedo, A. J., K. N. Timmis and W. R. Abraham (2007). Widespread capacity to metabolize polychlorinated biphenyls by diverse microbial communities in soils with no significant exposure to PCB contamination. Environ. Microbiol. 9(8): 1890-1897. Maeda, T., Y. Takeda, T. Murakami, A. Yokota and M. Wada (2010). Purification, characterization and amino acid sequence of a novel enzyme, D-threo-3- hydroxyaspartate dehydratase, from Delftia sp. HT23. J. Biochem. 148(6): 705-712. Mahato, S. B., S. Banerjee and S. Podder (1988). Oxidative side-chain and ring fission of pregnanes by Arthrobacter simplex. Biochem. J. 255(3): 769-774. Malito, E., A. Alfieri, M. W. Fraaije and A. Mattevi (2004). Crystal structure of a Baeyer- Villiger monooxygenase. Proc. Natl. Acad. Sci. U.S.A. 101(36): 13157-13162. Mampel, J., E. Maier, T. Tralau, J. Ruff, R. Benz and A. M. Cook (2004). A novel outer- membrane anion channel (porin) as part of the putatively two-component transport system for p-toluenesulfonate in Comamonas testosteroni T-2. Biochem. J. 383: 91-99. Mampel, J., M. A. Providenti and A. M. Cook (2005). Protocatechuate 4,5-dioxygenase from Comamonas testosteroni T-2: biochemical and molecular properties of a new subgroup within class III of extradiol dioxygenases. Arch. Microbiol. 183(2): 130-139. Marcus, P. I. and P. Talalay (1956). Induction and purification of -and -hydroxysteroid dehydrogenases. J. Biol. Chem. 218(2): 661-674. Markowitz, V. M., I.-M. A. Chen, Palaniappan, K. Chu, E. Szeto, Y. Grechkin, A. Ratner, B. Jacob, J. Huang, P. Williams, M. Huntemann, I. Anderson, K. Mavromatis, N. N. Ivanova and N. C. Kyrpides (2012). IMG: the integrated microbial genomes database and comparative analysis system. Nucleic Acids Res. 40: D115-D122. Markowitz, V. M., E. Szeto, K. Palaniappan, Y. Grechkin, K. Chu, I. M. Chen, I. Dubchak, I. Anderson, A. Lykidis, K. Mavromatis, N. N. Ivanova and N. C. 200 CHAPTER 11

Kyrpides (2008). The integrated microbial genomes (IMG) system in 2007: data content and analysis tool extensions. Nucleic Acids Res. 36(Database issue): D528-533. Martelli, H. L. (1967). Oxidation of sulphonic compounds by aquatic bacteria isolated from rivers of the Amazon region. Nature 216: 1238-1239. Martínez-Pascual, E., N. Jiménez, G. Vidal-Gavilan, M. Vinas and A. M. Solanas (2010). Chemical and microbial community analysis during aerobic biostimulation assays of non-sulfonated alkyl-benzene-contaminated groundwater. Appl. Microbiol. Biotechnol. 88(4): 985-995. Mayer, J. and A. M. Cook (2009). Homotaurine metabolized to 3-sulfopropanoate in Cupriavidus necator H16: enzymes and genes in a patchwork pathway. J. Bacteriol. 191: 6052-6058. Mayer, J., T. Huhn, M. Habeck, K. Denger, K. Hollemeyer and A. M. Cook (2010). 2,3- Dihydroxypropane-1-sulfonate degraded by Cupriavidus pinatubonensis JMP134: purification of dihydroxypropanesulfonate 3-dehydrogenase. Microbiology 156: 1556- 1564. Meyer, B. H., B. Zolghadr, E. Peyfoon, M. Pabst, M. Panico, H. R. Morris, S. M. Haslam, P. Messner, C. Schäffer, A. Dell and S.-V. Albers (2011). Sulfoquinovose synthase - an important enzyme in the N-glycosylation pathway of Sulfolobus acidocaldarius. Mol. Microbiol. 82: 1150-1163. Miyamoto, M., J. Matsumoto, T. Iwaya and E. Itagaki (1995). Bacterial steroid monooxygenase catalyzing the Baeyer-Villiger oxidation of C-21-ketosteroids from Rhodococcus rhodochrous: the isolation and characterization. Biochim. Biophys. Acta 1251(2): 115-124. Möbus, E., M. Jahn, R. Schmid, D. Jahn and E. Maser (1997). Testosterone-regulated expression of enzymes involved in steroid and aromatic hydrocarbon catabolism in Comamonas testosteroni. J. Bacteriol. 179(18): 5951-5955. Möbus, E. and E. Maser (1998). Molecular cloning, overexpression, and characterization of steroid-inducible 3-hydroxysteroid dehydrogenase/carbonyl reductase from Comamonas testosteroni. A novel member of the short-chain dehydrogenase/reductase superfamily. J. Biol. Chem. 273(47): 30888-30896. Moonen, M. J., N. M. Kamerbeek, A. H. Westphal, S. A. Boeren, D. B. Janssen, M. W. Fraaije and W. J. van Berkel (2008a). Elucidation of the 4-hydroxyacetophenone catabolic pathway in Pseudomonas fluorescens ACB. J. Bacteriol. 190(15): 5190-5198. Moonen, M. J., S. A. Synowsky, W. A. van den Berg, A. H. Westphal, A. J. Heck, R. H. van den Heuvel, M. W. Fraaije and W. J. van Berkel (2008b). Hydroquinone dioxygenase from Pseudomonas fluorescens ACB: a novel member of the family of nonheme-iron(II)-dependent dioxygenases. J. Bacteriol. 190(15): 5199-5209. Morii, S., S. Sawamoto, Y. Yamauchi, M. Miyamoto, M. Iwami and E. Itagaki (1999). Steroid monooxygenase of Rhodococcus rhodochrous: sequencing of the genomic DNA, and hyperexpression, purification, and characterization of the recombinant enzyme. J. Biochem. 126(3): 624-631. Müller, R. H., S. Jorks, S. Kleinsteuber and W. Babel (1999). Comamonas acidovorans strain MC1: a new isolate capable of degrading the chiral herbicides dichlorprop and mecoprop and the herbicides 2,4-D and MCPA. Microbiol. Res. 154(3): 241-246. Nakatsu, C., J. Ng, R. Singh, N. Straus and C. Wyndham (1991). Chlorobenzoate catabolic transposon Tn5271 is a composite class I element with flanking class II insertion sequences. Proc. Natl. Acad. Sci. U.S.A. 88(19): 8312-8316. CHAPTER 11 201

Ni, B., Y. Zhang, D. W. Chen, B. J. Wang and S. J. Liu (2012). Assimilation of aromatic compounds by Comamonas testosteroni: characterization and spreadability of protocatechuate 4,5-cleavage pathway in bacteria. Appl. Microbiol. Biotechnol. 97(13): 6031-6041. Nishino, S. F. and J. C. Spain (1995). Oxidative pathway for the biodegradation of nitrobenzene by Comamonas sp. strain JS765. Appl. Environ. Microbiol. 61: 2308-2313. Oppermann, U. C., I. Belai and E. Maser (1996). Antibiotic resistance and enhanced insecticide catabolism as consequences of steroid induction in the gram-negative bacterium Comamonas testosteroni. J. Steroid Biochem. Mol. Biol. 58: 217-223. Ornston, M. K. and L. N. Ornston (1972). Regulation of -ketoadipate pathway in Pseudomonas acidovorans and Pseudomonas testosteroni. J. Gen. Microbiol. 73: 455- 464. Paixão, D. A., M. R. Dimitrov, R. M. Pereira, F. R. Accorsini, M. B. Vidotti and E. G. Lemos (2010). Molecular analysis of the bacterial diversity in a specialized consortium for diesel oil degradation. Rev. Bras. Cienc. Solo 34 773-781. Pati, A., N. Ivanova, N. Mikhailova, G. Ovchinikova, S. D. Hooper, A. Lykidis and N. C. Kyrpides (2010). GenePRIMP: A Gene Prediction Improvement Pipeline for microbial genomes. Nat. Methods 7: 6. Pérez-Pantoja, D., R. Donoso, L. Agulló, M. Córdova, M. Seeger, D. H. Pieper and B. González (2012). Genomic analysis of the potential for aromatic compounds biodegradation in Burkholderiales. Environ. Microbiol. 14(5): 1091-1117. Perry, L. L. and G. J. Zylstra (2007). Cloning of a gene cluster involved in the catabolism of p-nitrophenol by Arthrobacter sp strain JS443 and characterization of the p-nitrophenol monooxygenase. J. Bacteriol. 189(21): 7563-7572. Peterson, G. E., R. W. Thoma, D. Perlman and J. Fried (1957). Metabolism of progesterone by Cylindrocarpon radicicola and Streptomyces lavendulae. J. Bacteriol. 74(5): 684- 688. Plesiat, P., M. Grandguillot, S. Harayama, S. Vragar and Y. Michelbriand (1991). Cloning, sequencing, and expression of the Pseudomonas testosteroni gene encoding 3- oxosteroid delta-1-dehydrogenase. J. Bacteriol. 173(22): 7219-7227. Providenti, M. A., J. Mampel, S. MacSween, A. M. Cook and R. C. Wyndham (2001). Comamonas testosteroni BR6020 possesses a single genetic locus for extradiol cleavage of protocatechuate. Microbiology 147(Pt 8): 2157-2167. Providenti, M. A., J. M. O'Brien, J. Ruff, A. M. Cook and I. B. Lambert (2006a). Metabolism of isovanillate, vanillate, and veratrate by Comamonas testosteroni strain BR6020. J. Bacteriol. 188(11): 3862-3869. Providenti, M. A., R. E. Shaye, K. D. Lynes, N. T. McKenna, J. M. O'Brien, S. Rosolen, R. C. Wyndham and L. B. Lambert (2006b). The locus coding for the 3-nitrobenzoate dioxygenase of Comamonas sp. strain JS46 is flanked by IS1071 elements and is subject to deletion and inversion events. Appl. Environ. Microbiol. 72(4): 2651-2660. Pruneda-Paz, J. L., M. Linares, J. E. Cabrera and S. Genti-Raimondi (2004). TeiR, a LuxR-type transcription factor required for testosterone degradation in Comamonas testosteroni. J. Bacteriol. 186(5): 1430-1437. Pugh, C. E., A. B. Roy, T. Hawkes and J. L. Harwood (1995). A new pathway for the synthesis of the plant sulpholipid, sulphoquinovosyldiacylglycerol. Biochem. J. 309: 513-519. 202 CHAPTER 11

Rahim, M. A. and C. J. Sih (1966). Mechanisms of steroid oxidation by microorganisms. XI. Enzymatic cleavage of pregnane side chain. J. Biol. Chem. 241(15): 3615-3623. Reetz, M. T. and S. Wu (2009). Laboratory evolution of robust and enantioselective Baeyer- Villiger monooxygenases for asymmetric catalysis. J. Am. Chem. Soc. 131(42): 15424- 15432. Rehdorf, J., G. A. Behrens, G. S. Nguyen, R. Kourist and U. T. Bornscheuer (2012). Pseudomonas putida esterase contains a GGG(A)X-motif confering activity for the kinetic resolution of tertiary alcohols. Appl. Microbiol. Biotechnol. 93(3): 1119-1126. Rehdorf, J., A. Kirschner and U. T. Bornscheuer (2007). Cloning, expression and characterization of a Baeyer-Villiger monooxygenase from Pseudomonas putida KT2440. Biotechnol. Lett. 29(9): 1393-1398. Rehdorf, J., C. L. Zimmer and U. T. Bornscheuer (2009). Cloning, expression, characterization, and biocatalytic investigation of the 4-hydroxyacetophenone monooxygenase from Pseudomonas putida JD1. Appl. Environ. Microbiol. 75(10): 3106-3114. Rein, U., R. Gueta, K. Denger, J. Ruff, K. Hollemeyer and A. M. Cook (2005). Dissimilation of cysteate via 3-sulfolactate sulfo-lyase and a sulfate exporter in Paracoccus pantotrophus NKNCYSA. Microbiology 151(Pt 3): 737-747. Riebel, A., H. M. Dudek, G. de Gonzalo, P. Stepniak, L. Rychlewski and M. W. Fraaije (2012). Expanding the set of rhodococcal Baeyer-Villiger monooxygenases by high- throughput cloning, expression and substrate screening. Appl. Microbiol. Biotechnol. 95(6): 1479-1489. Rösch, V., K. Denger, D. Schleheck, T. H. Smits and A. M. Cook (2008). Different bacterial strategies to degrade taurocholate. Arch. Microbiol. 190(1): 11-18. Roy, A. B., M. J. E. Hewlins, A. J. Ellis, J. L. Harwood and G. F. White (2003). Glycolytic breakdown of sulfoquinovose in bacteria: a missing link in the sulfur cycle. Appl. Environ. Microbiol. 69: 6434-6441. Ruff, J., K. Denger and A. M. Cook (2003). Sulphoacetaldehyde acetyltransferase yields acetyl phosphate: purification from Alcaligenes defragrans and gene clusters in taurine degradation. Biochem. J. 369(Pt 2): 275-285. Ruff, J., T. H. Smits, A. M. Cook and D. Schleheck (2010). Identification of two vicinal operons for the degradation of 2-aminobenzenesulfonate encoded on plasmid pSAH in Alcaligenes sp. strain O-1. Microbiol. Res. 165(4): 288-299. Saito, N., M. Robert, H. Kochi, G. Matsuo, Y. Kakazu, T. Soga and M. Tomita (2009). Metabolite profiling reveals YihU as a novel hydroxybutyrate dehydrogenase for alternative succinic semialdehyde metabolism in Escherichia coli. J. Biol. Chem. 284(24): 16442-16451. Sánchez, O., E. Diestra, I. Esteve and J. Mas (2005). Molecular characterization of an oil- degrading cyanobacterial consortium. Microb. Ecol. 50( 4): 580-588. Sasoh, M., E. Masai, S. Ishibashi, H. Hara, N. Kamimura, K. Miyauchi and M. Fukuda (2006). Characterization of the terephthalate degradation genes of Comamonas sp. strain E6. Appl. Environ. Microbiol. 72(3): 1825-1832. Sato, Y., H. Nishihara, M. Yoshida, M. Watanabe, J. D. Rondal, R. N. Concepcion and H. Ohta (2006). Cupriavidus pinatubonensis sp. nov. and Cupriavidus laharis sp. nov., novel hydrogen-oxidizing, facultatively chemolithotrophic bacteria isolated from volcanic mudflow deposits from Mt. Pinatubo in the Philippines. Int. J. Syst. Evol. Microbiol. 56: 973-978. CHAPTER 11 203

Sawyer, C. N. and D. W. Ryckman (1957). Anionic synthetic detergents and water supply problems. J. Am. Water Works Assoc. 49: 480-490. Sayeh, R., J. L. Birrien, K. Alain, G. Barbier, M. Hamdi and D. Prieur (2010). Microbial diversity in Tunisian geothermal springs as detected by molecular and culture-based approaches. Extremophiles 14(6): 501-514. Schläfli, H. R., M. A. Weiss, T. Leisinger and A. M. Cook (1994). Terephthalate 1,2- dioxygenase system from Comamonas testosteroni T-2: purification and some properties of the oxygenase component. J. Bacteriol. 176(21): 6644-6652. Schleheck, D., N. Barraud, J. Klebensberger, J. S. Webb, D. McDougald, S. A. Rice and S. Kjelleberg (2009). Pseudomonas aeruginosa PAO1 preferentially grows as aggregates in liquid batch cultures and disperses upon starvation. PLoS ONE 4(5): e5513. Schleheck, D. and A. M. Cook (2005). -Oxygenation of the alkyl sidechain of linear alkylbenzenesulfonate (LAS) surfactant in Parvibaculum lavamentivoransT. Arch. Microbiol. 183: 369-377. Schleheck, D., W. Dong, K. Denger, E. Heinzle and A. M. Cook (2000). An - proteobacterium converts linear alkylbenzenesulfonate (LAS) surfactants into sulfophenylcarboxylates, and linear alkyldiphenyletherdisulfonate surfactants into sulfodiphenylethercarboxylates. Appl. Environ. Microbiol. 66: 1911-1916. Schleheck, D., T. Knepper, P. Eichhorn and A. M. Cook (2007). Parvibaculum lavamentivorans DS-1T degrades centrally-substituted congeners of commercial linear alkylbenzenesulfonate (LAS) to sulfophenyl carboxylates and sulfophenyl dicarboxylates. Appl. Environ. Microbiol. 73: 4725-4732. Schleheck, D., T. P. Knepper, K. Fischer and A. M. Cook (2004a). Mineralization of individual congeners of linear alkylbenzenesulfonate (LAS) by defined pairs of heterotrophic bacteria. Appl. Environ. Microbiol. 70: 4053-4063. Schleheck, D., M. Lechner, R. Schönenberger, M. J. Suter and A. M. Cook (2003). Desulfonation and degradation of the disulfodiphenylethercarboxylates from linear alkyldiphenyletherdisulfonate surfactants. Appl. Environ. Microbiol. 69(2): 938-944. Schleheck, D., B. Tindall, R. Rosselló-Mora and A. M. Cook (2004b). Parvibaculum lavamentivorans gen. nov., spec. nov., a novel heterotroph that initiates catabolism of linear alkylbenzenesulfonate. Int. J. Syst. Evol. Microbiol. 54: 1489-1497. Schleheck, D., F. von Netzer, T. Fleischmann, D. Rentsch, T. Huhn, A. M. Cook and H. P. Kohler (2010). The missing link in linear alkylbenzenesulfonate surfactant degradation: 4-sulfoacetophenone as a transient intermediate in the degradation of 3-(4- sulfophenyl)butyrate by Comamonas testosteroni KF-1. Appl. Environ. Microbiol. 76(1): 196-202. Schleheck, D., M. Weiss, S. Pitluck, D. Bruce, M. L. Land, S. Han, E. Saunders, R. Tapia, C. Detter, T. Brettin, J. Han, T. Woyke, L. Goodwin, L. Pennacchio, M. Nolan, A. M. Cook, S. Kjelleberg and T. Thomas (2011). Complete genome sequence of Parvibaculum lavamentivorans type strain (DS-1T). Stand. Genomic Sci. 5(3): 298-310. Schmidt, A., N. Müller, B. Schink and D. Schleheck (2013). A proteomic view at the biochemistry of syntrophic butyrate oxidation in Syntrophomonas wolfei. PLoS ONE 8(2): e56905. Schmidt, M., D. Böttcher and U. T. Bornscheuer (2009). Protein engineering of carboxyl esterases by rational design and directed evolution. Protein Pept. Lett. 16(10): 1162- 1171. 204 CHAPTER 11

Schulz, S., W. Dong, U. Groth and A. M. Cook (2000). Enantiomeric degradation of 2-(4- sulfophenyl)butyrate via 4-sulfocatechol in Delftia acidovorans SPB1. Appl. Environ. Microbiol. 66: 1905-1910. Seifert, U. and E. Heinz (1992). Enzymatic characteristics of UDP-sulfoquinovose - diacylglycerol sulfoquinovosyltransferase from chloroplast envelopes. Bot. Acta 105(3): 197-205. Selstam, E. and D. Campbell (1996). Membrane lipid composition of the unusual cyanobacterium Gloeobacter violaceus sp. PCC 7421, which lacks sulfoquinovosyl diacylglycerol. Arch. Microbiol. 166(2): 132-135. Serif, M. (2012). Characterization of P450 monooxygenases involved in the LAS-degradation pathway in Parvibaculum lavamentivorans DS-1 Master's thesis, University of Konstanz. Shaw, D. A., L. F. Borkenhagen and P. Talalay (1965). Enzymatic oxidation of steroids by cell-free extracts of Pseudomonas testosteroni: isolation of cleavage products of ring A. Proc. Natl. Acad. Sci. U.S.A. 54: 837-844. Sheng, D., D. P. Ballou and V. Massey (2001). Mechanistic studies of cyclohexanone monooxygenase: chemical properties of intermediates involved in catalysis. Biochemistry 40(37): 11156-11167. Sih, C. J., J. Laval and M. A. Rahim (1963). Purification and properties of steroid esterase from Nocardia restrictus. J. Biol. Chem. 238: 566-571. Simoni, S., S. Klinke, C. Zipper, W. Angst and H. E. Kohler (1996). Enantioselective metabolism of chiral 3-phenylbutyric acid, an intermediate of linear alkylbenzene degradation, by Rhodococcus rhodochrous PB1. Appl. Environ. Microbiol. 62(3): 749- 755. Sipila, T. P., A. K. Keskinen, M. L. Akerman, C. Fortelius, K. Haahtela and K. Yrjala (2008). High aromatic ring cleavage diversity in birch rhizosphere: PAH treatment- specific changes of I.E.3 group extradiol dioxygenases and 16S rRNA bacterial communities in soil. ISME J 2(9): 968-981. Siunova, T. V., A. V. Siunov, V. V. Kochetkov and A. M. Boronin ( 2009). The cnr-like operon in strain Comamonas sp. encoding resistance to cobalt and nickel. Genetika 45: 336-341. Song, Z., S. R. Edwards and R. G. Burns (2005). Biodegradation of naphthalene-2-sulfonic acid present in tannery wastewater by bacterial isolates Arthrobacter sp. 2AC and Comamonas sp. 4BC. Biodegradation 16(3): 237-252. Sörbo, B. (1987). Sulfate: turbidimetric and nephelometric methods. Meth. Enzymol. 143: 3-6. Sota, M., H. Yano, Y. Nagata, Y. Ohtsubo, H. Genka, H. Anbutsu, H. Kawasaki and M. Tsuda (2006). Functional analysis of unique class II insertion sequence IS1071. Appl. Environ. Microbiol. 72(1): 291-297. Stelzer, S., S. Egan, M. R. Larsen, D. H. Bartlett and S. Kjelleberg (2006). Unravelling the role of the ToxR-like transcriptional regulator WmpR in the marine antifouling bacterium Pseudoalteromonas tunicata. Microbiology 152(5): 1385-1394 Sugimoto, K., N. Sato and M. Tsuzuki (2007). Utilization of a chloroplast membrane sulfolipid as a major internal sulfur source for protein synthesis in the early phase of sulfur starvation in Chlamydomonas reinhardti. FEBS Lett. 581(23): 4519-4522. Swisher, R. D. (1970). Surfactant biodegradation. New York, Marcel Dekker. Sylvestre, M., M. Sirois, Y. Hurtubise, J. Bergeron, D. Ahmad, F. Shareck, D. Barriault, I. Guillemette and J. M. Juteau (1996). Sequencing of Comamonas testosteroni strain CHAPTER 11 205

B-356-biphenyl/chlorobiphenyl dioxygenase genes: evolutionary relationships among Gram-negative bacterial biphenyl dioxygenases. Gene 174(2): 195-202. Szolkowy, C., L. D. Eltis, N. C. Bruce and G. Grogan (2009). Insights into sequence-activity relationships amongst Baeyer-Villiger monooxygenases as revealed by the intragenomic complement of enzymes from Rhodococcus jostii RHA1. Chembiochem. 10(7): 1208-1217. Takenaka, S., J. Koshiya, S. Okugawa, A. Takata, S. Murakami and K. Aoki (2011). Fe- superoxide dismutase and 2-hydroxy-1,4-benzoquinone reductase preclude the auto- oxidation step in 4-aminophenol metabolism by Burkholderia sp. strain AK-5. Biodegradation 22(1): 1-11. Takenaka, S., S. Okugawa, M. Kadowaki, S. Murakami and K. Aoki (2003). The metabolic pathway of 4-aminophenol in Burkholderia sp strain AK-5 differs from that of aniline and aniline with C-4 substituents. Appl. Environ. Microbiol. 69(9): 5410-5413. Talalay, P. (2005). A fascination with enzymes: The journey not the arrival matters. J. Biol. Chem. 280(32): 28829-28847. Talalay, P., M. M. Dobson and D. F. Tapley (1952). Oxidative degradation of testosterone by adaptive enzymes. Nature 170(4328): 620-621. Tamaoka, J., D. M. Ha and K. Komagata (1987). Reclassification of Pseudomonas acidovorans Den Dooren De Jong 1926 and Pseudomonas testosteroni Marcus and Talalay 1956 as Comamonas acidovorans comb. nov. and Comamonas testosteroni comb. nov., with an emended description of the genus Comamonas. Int. J. Syst. Bacteriol. 37(1): 52-59. Tamura, K., J. Dudley, M. Nei and S. Kumar (2007). MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Mol. Biol. Evol. 24(8): 1596-1599. Tanner, A. and D. J. Hopper (2000). Conversion of 4-hydroxyacetophenone into 4-phenyl acetate by a flavin adenine dinucleotide-containing Baeyer-Villiger-type monooxygenase. J. Bacteriol. 182(23): 6565-6569. Tehara, S. K. and J. D. Keasling (2003). Gene cloning, purification, and characterization of a phosphodiesterase from Delftia acidovorans. Appl. Environ. Microbiol. 69(1): 504- 508. Teramoto, M., H. Futamata, S. Harayama and K. Watanabe (1999). Characterization of a high-affinity phenol hydroxylase from Comamonas testosteroni R5 by gene cloning, and expression in Pseudomonas aeruginosa PAO1c. Mol. Gen. Genet. 262(3): 552-558. Teufel, R., T. Friedrich and G. Fuchs (2012). An oxygenase that forms and deoxygenates toxic epoxide. Nature 483(7389): 359-362. Teufel, R., C. Gantert, M. Voss, W. Eisenreich, W. Haehnel and G. Fuchs (2011). Studies on the mechanism of ring hydrolysis in phenylacetate degradation: a metabolic branching point. J. Biol. Chem. 286(13): 11021-11034. Teufel, R., V. Mascaraque, W. Ismail, M. Voss, J. Perera, W. Eisenreich, W. Haehnel and G. Fuchs (2010). Bacterial phenylalanine and phenylacetate catabolic pathway revealed. Proc. Natl. Acad. Sci. U.S.A. 107(32): 14390-14395. Thomas, T., F. F. Evans, D. Schleheck, A. Mai-Prochnow, C. Burke, A. Penesyan, D. S. Dalisay, S. Stelzer-Braid, N. Saunders, J. Johnson, S. Ferriera, S. Kjelleberg and S. Egan (2008). Analysis of the Pseudoalteromonas tunicata genome reveals properties of a surface-associated life style in the marine environment. PLoS ONE 3(9): e3252. 206 CHAPTER 11

Thurnheer, T., T. Kohler, A. M. Cook and T. Leisinger (1986). Orthanilic acid and analogs as carbon-sources for bacteria: Growth physiology and enzymic desulfonation. J. Gen. Microbiol. 132: 1215-1220. Thysse, G. J. E. and T. H. Wanders (1974). Initial steps in degradation of alkane-1-sulfonates by Pseudomonas. Antonie Van Leeuwenhoek 40(1): 25-37. Top, E. M. and D. Springael (2003). The role of mobile genetic elements in bacterial adaptation to xenobiotic organic compounds. Curr. Opin. Biotechnol. 14(3): 262-269. Tralau, T., A. M. Cook and J. Ruff (2001). Map of the IncP1 plasmid pTSA encoding the widespread genes (tsa) for p-toluenesulfonate degradation in Comamonas testosteroni T-2. Appl. Environ. Microbiol. 67(4): 1508-1516. Tralau, T., A. M. Cook and J. Ruff (2003a). An additional regulator, TsaQ, is involved with TsaR in regulation of transport during the degradation of p-toluenesulfonate in Comamonas testosteroni T-2. Arch. Microbiol. 180(5): 319-326. Tralau, T., J. Mampel, A. M. Cook and J. Ruff (2003b). Characterization of TsaR, an oxygen-sensitive LysR-type regulator for the degradation of p-toluenesulfonate in Comamonas testosteroni T-2. Appl. Environ. Microbiol. 69(4): 2298-2305. Turek, M., L. Vilimkova, V. Kremlackova, J. J. Paca, M. Halecky, J. Paca and M. Stiborova (2011). Isolation and partial characterization of extracellular NADPH- dependent phenol hydroxylase oxidizing phenol to catechol in Comamonas testosteroni. Neuro Endocrinol. Lett. 32: 137-145. Urata, M., E. Uchida, H. Nojiri, T. Omori, R. Obo, N. Miyaura and N. Ouchiyama (2004). Genes involved in aniline degradation by Delftia acidovorans strain 7N and its distribution in the natural environment. Biosci. Biotechnol. Biochem. 68(12): 2457- 2465. Validation-List-107 (2006). List of new names and new combinations previously effectively, but not validly, published. Int. J. Syst. Evol. Microbiol. 56(Pt 1): 1-6. Van Beilen, J. B., F. Mourlane, M. A. Seeger, J. Kovac, Z. Li, T. H. Smits, U. Fritsche and B. Witholt (2003). Cloning of Baeyer-Villiger monooxygenases from Comamonas, Xanthobacter and Rhodococcus using polymerase chain reaction with highly degenerate primers. Environ. Microbiol. 5(3): 174-182. van Berkel, W. J., N. M. Kamerbeek and M. W. Fraaije (2006). Flavoprotein monooxygenases, a diverse class of oxidative biocatalysts. J. Biotechnol. 124(4): 670- 689. van den Heuvel, R. H., N. Tahallah, N. M. Kamerbeek, M. W. Fraaije, W. J. van Berkel, D. B. Janssen and A. J. Heck (2005). Coenzyme binding during catalysis is beneficial for the stability of 4-hydroxyacetophenone monooxygenase. J. Biol. Chem. 280(37): 32115-32121. van der Geize, R., G. I. Hessels, R. van Gerwen, P. van der Meijden and L. Dijkhuizen (2002). Molecular and functional characterization of kshA and kshB, encoding two components of 3-ketosteroid 9alpha-hydroxylase, a class IA monooxygenase, in Rhodococcus erythropolis strain SQ1. Mol. Microbiol. 45(4): 1007-1018. van der Meer, J. R. (2008). A genomic view on the evolution of catabolic pathways and bacterial adaptation to xenobiotic compounds. Microbial biodegradation: genomics and molecular biology. E. Diaz. Norfolk, England, Caister Academic Press. van Ginkel, C. G. (1996). Complete degradation of xenobiotic surfactants by consortia of aerobic organisms. Biodegradation 7: 151-164. CHAPTER 11 207

Van Houdt, R., S. Monchy, N. Leys and M. Mergeay (2009). New mobile genetic elements in Cupriavidus metallidurans CH34, their possible roles and occurrence in other bacteria. Antonie Van Leeuwenhoek 96(2): 205-226. Wagner, F. C. and E. E. Reid (1931). The stability of the carbon-sulfur bond in some aliphatic sulfonic acids. J. Am. Chem. Soc. 53: 3407-3413. Wang, B., Q. Lai, Z. Cui, T. Tan and Z. Shao (2008). A pyrene-degrading consortium from deep-sea sediment of the West Pacific and its key member Cycloclasticus sp. P1. Environ. Microbiol. 10(8): 1948-1963. Wang, L., W. Wang, Q. Lai and Z. Shao (2010). Gene diversity of CYP153A and AlkB alkane hydroxylases in oil-degrading bacteria isolated from the Atlantic Ocean. Environ. Microbiol. 12(5): 1230-1242. Wang, Y., A. Yamazoe, S. Suzuki, C.-T. Liu, T. Aono and H. Oyaizu (2004). Isolation and characterization of dibenzofuran-degrading Comamonas sp. strains isolated from white clover roots. Curr. Microbiol. 49: 288-294. Wang, Y. Z., Y. Zhou and G. J. Zylstra (1995). Molecular analysis of isophthalate and terephthalate degradation by Comamonas testosteroni YZW-D. Environ. Health Perspect. 103: 9-12. Wei, M., J. J. Zhang, H. Liu and N. Y. Zhou (2010). para-Nitrophenol 4-monooxygenase and hydroxyquinol 1,2-dioxygenase catalyze sequential transformation of 4- nitrocatechol in Pseudomonas sp strain WBC-3. Biodegradation 21(6): 915-921. Weiss, M., K. Denger, T. Huhn and D. Schleheck (2012). Two enzymes of a complete degradation pathway for linear alkylbenzenesulfonate (LAS) surfactants: 4- sulfoacetophenone Baeyer-Villiger monooxygenase and 4-sulfophenylacetate esterase in Comamonas testosteroni KF-1. Appl. Environ. Microbiol. 78: 8254-8263. Weiss, M., A. I. Kesberg, K. M. LaButti, S. Pitluck, D. Bruce, L. Hauser, A. Copeland, T. Woyke, S. Lowry, S. Lucas, M. Land, L. Goodwin, S. Kjelleberg, A. M. Cook, M. Buhmann, T. Thomas and D. Schleheck (2013). Permanent draft genome sequence of Comamonas testosteroni KF-1. Stand. Genomic Sci. 8(2): 239-254. Wen, A. M., M. Fegan, C. Hayward, S. Chakraborty and L. I. Sly (1999). Phylogenetic relationships among members of the Comamonadaceae, and description of Delftia acidovorans (den Dooren de Jong 1926 and Tamaoka et al. 1987) gen. nov., comb. nov. Int. J. Syst. Bacteriol. 49: 567-576. White, G. F. and N. J. Russell (1994). Biodegradation of anionic surfactants and related molecules. Biochemistry of microbial degradation. C. Ratledge. Dordrecht, The Netherlands, Kluwer Academic Publishers: 143-177. Willems, A., J. Deley, M. Gillis and K. Kersters (1991a). Comamonadaceae, a new family encompassing the Acidovorans ribosomal-RNA Complex, including Variovorax paradoxus gen. nov., comb. nov., for Alcaligenes paradoxus (Davis 1969). Int. J. Syst. Bacteriol. 41(3): 445-450. Willems, A., B. Pot, E. Falsen, P. Vandamme, M. Gillis, K. Kersters and J. Deley (1991b). Polyphasic taxonomic study of the emended genus Comamonas: Relationship to Aquaspirillum aquaticum, E. Falsen group 10, and other clinical isolates. Int. J. Syst. Bacteriol. 41(3): 427-444. Wilson, J. J. and U. Kappler (2009). Sulfite oxidation in Sinorhizobium meliloti. Biochim. Biophys. Acta 1787(12): 1516-1525. 208 CHAPTER 11

Woese, C. R., O. Kandler and M. L. Wheelis (1990). Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya. Proc. Natl. Acad. Sci. U.S.A. 87(12): 4576-4579. Xiong, G., H. J. Martin and E. Maser (2003). Identification and characterization of a novel translational repressor of the steroid-inducible 3-hydroxysteroid dehydrogenase/carbonyl reductase gene in Comamonas testosteroni. J. Biol. Chem. 278: 47400-47407. Xiong, G. and E. Maser (2001). Regulation of the steroid-inducible 3-hydroxysteroid dehydrogenase/carbonyl reductase gene in Comamonas testosteroni. J. Biol. Chem. 276(13): 9961-9970. Xiong, J., D. Li, H. Li, M. He, S. J. Miller, L. Yu, C. Rensing and G. Wang (2011). Genome analysis and characterization of zinc efflux systems of a highly zinc-resistant bacterium, Comamonas testosteroni S44. Res. Microbiol. 162(7): 671-679. Yachnin, B. J., T. Sprules, M. B. McEvoy, P. C. Lau and A. M. Berghuis (2012). The substrate-bound crystal structure of a Baeyer-Villiger monooxygenase exhibits a Criegee-like conformation. J. Am. Chem. Soc. 134(18): 7788-7795. Yang, Y., T. Chen, P. Ma, G. Shang, Y. Dai and S. Yuan (2010). Cloning, expression and functional analysis of nicotinate dehydrogenase gene cluster from Comamonas testosteroni JA1 that can hydroxylate 3-cyanopyridine. Biodegradation(21): 593-602. Yee, L. N., J. A. Chuah, M. L. Chong, L. Y. Phang, A. R. Raha, K. Sudesh and M. A. Hassan (2012). Molecular characterisation of phaCAB from Comamonas sp. EB172 for functional expression in Escherichia coli JM109. Microbiol. Res. 167(9): 550-557. Yu, B. and C. Benning (2003). Anionic lipids are required for chloroplast structure and function in Arabidopsis. Plant J. 36(6): 762-770. Zaborina, O., H. J. Seitz, I. Sidorov, J. Eberspacher, E. Alexeeva, L. Golovleva and F. Lingens (1999). Inhibition analysis of hydroxyquinol-cleaving dioxygenases from the chlorophenol-degrading Azotobacter sp. GP1 and Streptomyces rochei 303. J. Basic Microbiol. 39(1): 61-73. Zhang, J., Y. Q. Wang, S. G. Zhou, C. Y. Wu, J. He and F. B. Li (2013). Comamonas guangdongensis sp. nov., isolated from subterranean forest sediment, and emended description of the genus Comamonas. Int. J. Syst. Evol. Microbiol. 63: 809-814. Zhang, J. J., H. Liu, Y. Xiao, X. E. Zhang and N. Y. Zhou (2009). Identification and characterization of catabolic para-nitrophenol 4-monooxygenase and para- benzoquinone reductase from Pseudomonas sp. strain WBC-3. J. Bacteriol. 191(8): 2703-2710. Zhang, Y., Y. F. Ma, S. W. Qi, B. Meng, M. T. Chaudhry, S. Q. Liu and S. J. Liu (2007). Responses to arsenate stress by Comamonas sp strain CNB-1 at genetic and proteomic levels. Microbiology 153: 3713-3721.