DISCOVERY AND CHARACTERIZATION OF HYDROLYTIC DEHALOGENASES FROM GENOMIC DATA

by

Max Wong

A thesis submitted in conformity with the requirements for the degree of Master of Applied Science, Graduate Department of Chemical Engineering and Applied Chemistry, University of Toronto

Copyright © 2008 by Max Wong

ISBN: 978-0-494-38866-2

Abstract

Discovery and characterization of hydrolytic dehalogenases from genomic data

Max Wong

Master of Applied Science

Graduate Department of Chemical Engineering and Applied Chemistry

University of Toronto

2008

Halogenated organic compounds are prevalent environmental contaminants as a result of their widespread use in industry, and they are cited for deleterious health effects. In situ bioremediation offers an alternative to existing processes for removing these substances from contaminated sites.

Hydrolytic dehalogenases catalyze the reaction of organohalogens with water, replacing the halide with a hydroxyl group. This project's principal objective was to find new dehalogenases from genomic data, using previously characterized dehalogenases as search templates.

Putative haloalkane, haloacid and fluoroacetate dehalogenases were identified by BLAST searches of a selection of genomes from environmental organisms and scrutinized for the presence of known critical residues. They were then recombinantly expressed and screened for activity.

Out of 27 targets examined, eleven were true dehalogenases. Three haloacid dehalogenases were discovered with previously uncharacterized activity against fluoroacetate, and one haloalkane dehalogenase exhibited moderate activity against 1,2-dichloroethane. The success rate was family-dependent, and sequence similarity to characterized dehalogenases was generally a good indicator of a target's dehalogenation ability.

Acknowledgements

First, my supervisor, Dr. Elizabeth Edwards: thank you for taking me onto the team and allowing me to work on this unique project. It's been quite a ride!

Second, the talented people I worked with these two years. There's quite a list...

• In the Best lab: Drs. Alexander Iakounine and Alexei Savchenko, Greg Brown and Michael Proudfoot for taking me into their lab and offering their expertise with protein expression and purification - and Thursday night bouts of sanity.

• In EdLab: in alphabetical order, Winnie Chan, Angelika Duffy, Melanie Duhamel, Ariel Grostern, Laura Hug, Ahsan Islam, Eve Moore, Allie Simmonds, Alison Waller, Jennifer Wang and Cheryl Washer for acting as sounding boards for ideas and for giving me a home in Wallberg.

• In Emil Pai's lab: Peter Chan for assistance in target selection and numerous training sessions, and Terence To for guidance with kinetic assays.

Then there are the people who kept me (mostly) happy and hale during my stay...

• Nadine Lam, for making sure I knew when to work, when not to work, for making sure I was always fed, for tending to the things I didn't, for listening to senseless ranting, and for letting a lot of my mistakes slide.

• Jen Wang, for treating me like family and for giving me a wisp of Calgary.

• Raymond Choi, for injecting sardonic realism into my life.

• Stanley Wong, for listening to my frustrations on this side of the country.

Most importantly, I would like to thank my parents, Paul and Eliza, for standing by me during the past six years. It wouldn't have happened this way without their support.

Contents

1 Motivation

1.1 Introduction

1.2 Existing remediation methods

1.2.1 Excavation

1.2.2 Pump-and-treat

1.2.3 Bioremediation

1.3 Opportunities from genomic data

1.4 Research objectives

1.5 Outline of document

2 Background

2.1 Haloalkane dehalogenases

2.2 L-2-haloacid dehalogenases

2.3 Fluoroacetate dehalogenases

3 Materials and Methods

3.1 Selection of targets

3.2 Cloning and expression of gene targets

3.3 Expression vector preparation

3.4 Protein expression

3.5 Protein purification by affinity chromatography

3.6 Rapid colourimetric dehalogenation assay

3.7 Further purification of confirmed dehalogenases

3.8 Optimization of reaction conditions

3.9 Determination of enzyme kinetic parameters

3.9.1 Quantitative determination of halide production

4 Results and Discussion

4.1 Selection and purification of targets

4.2 Biochemical screening

4.2.1 Identification of HADs with defluorination activity

4.3 Kinetic characterization of haloacid dehalogenases

4.4 Rate estimate of defluorination

4.5 Rate estimate of 1,2-DCA dechlorination by Jann2620

4.6 Discussion

4.6.1 Confirmation of HAD annotations

4.6.2 Confirmation of HAn dehalogenase annotations

4.6.3 Confirmation of fluoroacetate dehalogenase annotations

5 Conclusions

5.1 Contributions

5.2 Future Work

Appendices

A Standard curves

A.1 Standard curves for spectrophotometric assay

A.2 Standard curves for ion chromatographic assay

B Kinetic data for HADs

B.1 Kinetics against chloroacetate

B.2 Kinetics against fluoroacetate

List of Tables

4.1 List of source organisms

4.2 List of putative dehalogenase targets

4.3 Existing annotations of putative dehalogenases

4.4 Protein yields

4.5 Results of general screens

4.6 Kinetic characteristics of HADs with ClAc

4.7 Comparison of turnover rates of HAD-FAcs and FAc dehalogenases

4.8 Pairwise alignment statistics of HAD targets

4.9 Pairwise alignment statistics of HAn targets

4.10 Pairwise alignment statistics of FAc dehalogenase targets

List of Figures

2.1 Structure of DhaA of Rhodococcus rhodochrous NCIMB 13064

2.2 Reaction scheme of 1,2-DCA with DhlA

2.3 Structure of L-DEX YL from Pseudomonas sp. YL

3.1 p15TV-L cloning plasmid

3.2 IC elution profile for separating and fluoroacetate

4.1 Visualization of proteins by SDS-PAGE

4.2 Example of colourimetric screening results

4.3 Defluorination by HAD dehalogenase Adeh3811

4.4 Example of kinetic characterization, here of Adeh3811 of Anaeromyxobacter dehalogenans 2CP-C. 0.1 µg/mL enzyme in 25 mM CAPS-Na, pH 10.5

4.5 Example of defluorination as observed via IC

4.6 IC confirmation of Jann2620 dechlorination of 1,2-DCA

4.7 Alignment of all HAD dehalogenase targets

4.8 Unrooted phylogenetic tree of HAD targets

4.9 Alignment of positive HAn dehalogenases and closely related negatives

4.10 Unrooted phylogenetic tree of HAn targets

4.11 Domain structure annotation of Bpro2447

4.12 Unrooted phylogenetic tree of FAc dehalogenase targets

4.13 Alignment of FAc dehalogenase targets

B.1 Michaelis-Menten curve for BC2051

B.2 Michaelis-Menten curve for Bpro0530

B.3 Michaelis-Menten curve for Bpro4516

B.4 Michaelis-Menten curve for GMI1362

B.5 Michaelis-Menten curve for Jann1658

B.6 Defluorination of FAc by Adeh3811

B.7 Defluorination of FAc by Bpro0530

B.8 Defluorination of FAc by Bpro4516

Chapter 1

Motivation

1.1 Introduction

Industrial societies produce and consume many halogenated organic compounds. Chlorinated organic compounds are common feedstocks for industrial chemical synthesis: for example, over one million tonnes of 1,2-dichloroethane (1,2-DCA) are produced every year, much of which is used in the synthesis of vinyl chloride (VC), the monomer of the common plastic PVC. Other chlorinated compounds are used as solvents, fumigants and pesticides, among other applications. Brominated organic compounds are used as flame-retardant materials. Fluorinated organics have been used extensively in refrigeration (chlorofluorocarbons; CFCs), stain-resistant coatings, non-stick coatings (such as polytetrafluoroethylene, commonly known as Teflon) and lubricants.

Humans were not the first to incorporate halogens into organic compounds. Over 4000 naturally occurring chlorinated compounds of both physicochemical and biological origin have been identified (1). Marine microorganisms in particular have evolved to use the abundant chloride and bromide present in seawater (2). A number of antibiotics in use today (for example, vancomycin) were first isolated from soil-based bacteria, or are halogenated derivatives. However, in terms of sheer quantity, anthropogenic sources in aggregate far outstrip production by natural processes.

In addition to being important feedstocks for the chemical industries and constituents of many products in modern society, some halogenated organic compounds have been cited for their deleterious health effects. Unfortunately, some of these substances have been released, accidentally or intentionally, into soils and waters, raising the risk of exposure to the general population. Of the 28 organic compounds currently classified as priority pollutants (those that are persistent, bioaccumulative and inherently toxic), 17 contain one or more halogen atoms. A number of chlorinated compounds, including 1,2-DCA, 1,2-dichloropropane, and o- and p-dichlorobenzenes, are regulated by the United States Environmental Protection Agency (USEPA) because of known or suspected carcinogenic or hepatotoxic effects. Trihalomethanes and haloacetates (THMs and HAAs), byproducts of drinking water chlorination (the most common method of deactivating water-borne pathogens), are suspected carcinogens. The government of Ontario has passed legislation regulating the allowable levels of these and a number of other halogenated compounds in water (3).

Because of their widespread use, contamination by halogenated organics is prevalent: of approximately 1200 USEPA Superfund priority cleanup sites, 287 have 1,2-DCA contamination, 375 have VC and 259 contain polychlorinated biphenyls (PCBs), a highly toxic, bioaccumulative and possibly carcinogenic group of compounds. It is in the interest of both environmental protection and public health that these contaminants be removed.

1.2 Existing remediation methods

A number of methods are currently available for the remediation of contaminated sites. A brief description of these technologies is presented here.

1.2.1 Excavation

If contaminant compounds are sorbed to a solid matrix, such as soil, the contaminated portion can be physically excavated or dredged (USEPA website). Once removed from the site, the matrix can be transported elsewhere for treatment and/or proper disposal. Further treatment may involve thermal stripping, incineration, oxidation or other processes. This form of physical removal is conceptually the simplest of remediation methods and has been widely deployed. The clear disadvantage of excavation is that it severely alters a site's physical character. Furthermore, removing, transporting and treating large quantities of material ex situ adds significantly to the monetary and energy cost of such an undertaking.

1.2.2 Pump-and-treat

Similar in principle to excavation is "pump-and-treat", a technique in which contaminated groundwater is pumped from the aquifer and treated in a reactor above ground to remove or reduce the quantity of dissolved contaminant. The treated effluent may be reinjected to recharge the aquifer or sent elsewhere. Between 1982 and 1992, pump-and-treat was used at nearly three-quarters of the USEPA Superfund sites where the remediation work involved the treatment of groundwater (4). Its efficacy has been called into question due to (a) longer-than-predicted treatment times caused by declining removal rates (known as tailing); and (b) the resurgence of contaminants in the water after treatment is concluded (5). Depending on the nature and configuration of the contamination, pump-and-treat alone may not remove the actual contaminant source and the problem may persist despite extensive treatment.

1.2.3 Bioremediation

In situ bioremediation is a newer approach to the problem of contaminated sites. The basic strategy is to exploit organisms' natural metabolic activity toward what humans consider pollutants to remove the contaminant over a period of time. The removal of chlorine from halogenated compounds generally lowers their toxicity, though there are notable exceptions. A number of microorganisms are capable of growing on halogenated organic carbon sources, including some considered to be xenobiotic; some of these have been cultured and/or characterized (5, 6). These organisms may already exist at the site, or may be introduced to the site. The clear advantage of a biological solution is that it is a continuing process; dehalogenating microorganisms will grow given the appropriate conditions and can consume the organohalogen as electron donor or acceptor, depending on the specific compound. There is less physical site disruption associated with an in situ method as compared to other alternatives, but biological options are expected to require more time for completion. With the general public's increasing concern about the environment in general and energy scarcity in particular, biological methods potentially offer low-cost, less energy-intensive "greener" alternatives to the methods currently employed for the treatment of halogenated organic wastes.

As contaminated sites and waters vary in their contaminant content and physicochemical conditions, it is advantageous for the engineering practitioner to have a variety of organisms and proteins with different degradative capabilities and tolerances. Given the immense biodiversity that remains uncharacterized (and untapped), it is preferable to seek out existing solutions in nature.

1.3 Opportunities from genomic data

Genome sequencing has produced vast quantities of genetic data. This has allowed the study of organisms in detail and offers the possibility of identifying previously unknown metabolic activities, such as dehalogenation. Furthermore, sequencing efforts thus far have focussed on individual cultured organisms; the vast majority of organisms (99% by some estimates) have not been cultured, and some organisms may be difficult or impossible to culture, or there may be inseparable interdependencies between organisms in mixed cultures. New culture-independent techniques, in particular genome and metagenome sequencing projects, allow for the examination of these previously uncharacterizable organisms, adding to the potential of finding novel enzymatic activities.

A sequenced genome must be analyzed to identify (a) the location of genes and (b) their biochemical function in vivo, a process known as annotation. The simplest method of gene annotation is to compare, typically by sequence alignment (BLAST or some variant), primary sequence information derived from in silico translation of genetic data to the existing database of characterized proteins. By this strategy it has proven relatively straightforward to ascribe general catalytic function (e.g. hydrolysis) to some sequences, but as there is currently no general method for predicting tertiary structure from primary structure information, more specific information (e.g. preferred substrate or substrate range) is difficult to glean.

While conceptually straightforward and easily automated as a part of the genome sequencing process, sequence similarity clearly applies only to proteins with well-characterized phylogenetic relatives; many sequences do not have any characterized homologues and cannot be identified in this way. Furthermore, similarity suffers from an additional, potentially severe shortcoming: annotations are entered into the database with no biochemical verification, and future searches are then tested against them. Errors in the database remain hidden until the protein is characterized (if ever), and they contaminate the database and, in turn, affect the ability of future similarity searches to identify and locate important conserved residues, the discovery of which has the potential to improve annotations.

With the decreasing cost of sequencing, it is unlikely the rate of sequence production will plateau or decrease in the foreseeable future. As of early January 2008, there are 702 completed genome sequencing projects, with 2587 more at various stages of completion (7). Most of these organisms have been cultured in the laboratory, and therefore this collection only represents a small fraction of global genetic diversity, but even now, biochemical studies have not kept pace: the latest revision of the UniProt consolidated protein database (version 12.4 at the time of writing) contains 290484 verified protein sequences at various levels of characterization and 5072048 in silico-translated sequences with no biochemical data. Characterized sequences make up less than 10% of this database, and this proportion looks likely to decrease. Although the most accurate method of annotation is biochemistry, the reality of the situation is such that automated annotation is an important enabling tool for mining the wealth of available genomic information.
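As a quick check of the proportion quoted above, the two UniProt counts can be combined directly. This is only an arithmetic sketch; the variable names are illustrative and are not part of the original analysis.

```python
# Entry counts quoted in the text for UniProt release 12.4.
verified = 290_484        # verified sequences at various levels of characterization
translated = 5_072_048    # in silico-translated sequences with no biochemical data

total = verified + translated
fraction = verified / total   # share of the database that is characterized

print(f"{verified:,} of {total:,} entries are characterized ({fraction:.1%})")
```

The share works out to between 5% and 6%, consistent with the "less than 10%" stated in the text.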

A number of other methods are currently being employed to attempt to improve automatic annotation. Databases such as Pfam, PROSITE, BLOCKS and InterPro attempt to organize the wealth of data on recurring protein families, motifs and domains (8); newly discovered sequences may be searched against these databases and compared with those previously analyzed to identify conserved features, providing hints to the biochemistry of the putative protein. A different approach is taken by SEED, an undertaking to examine complete metabolic pathways (rather than the individual components) across a number of genomes (9).

By virtue of their simplicity, BLAST searches and multiple alignments remain an attractive choice for identifying the putative function of newly discovered sequences. Furthermore, many characterized proteins (dehalogenases, in this case) have been subjected to site-directed mutagenesis and some information about catalytically critical residues is available. The use of this knowledge to narrow the search field for uncovering new dehalogenases motivates this research project.

1.4 Research objectives

The intent of this research is as follows:

1. To identify new dehalogenases from publicly available sequenced genomes by sequence similarity and manual examination for catalytically critical residues;

2. To test the reliability of computerized annotation of haloalkane, haloacid and fluoroacetate dehalogenases; and

3. To explore the diversity of enzyme activities in the new dehalogenases.

1.5 Outline of document

Chapter 2 will offer an introduction to the biochemical characteristics of haloalkane, haloacid and fluoroacetate dehalogenases. Chapter 3 will detail the experimental methods used in this research. The results of the enzyme screening and characterization will be presented and discussed in Chapter 4. Finally, Chapter 5 concludes with a summary of the lessons learned and proposals for future work. The Appendices provide experimental data not included in the main text.

Chapter 2

Background

Bacteria have evolved a number of different mechanisms for the dehalogenation of organic compounds, including reductive dehalogenation, oxidative dehalogenation and thiolytic dehalogenation. These are reviewed elsewhere (10).

The focus of this research is on hydrolytic dehalogenases, which catalyze the following reaction:

$$\mathrm{R{-}X + H_2O \xrightarrow{\text{enzyme}} R{-}OH + H^+ + X^-}$$

where R-X is any halogenated organic compound. The biochemistry of three families of hydrolytic dehalogenases is detailed here. They have been classified according to their active substrates.

2.1 Haloalkane dehalogenases

Haloalkane dehalogenases (EC 3.8.1.5) remove halide from halogenated alkanes.

The first protein identified as able to dechlorinate 1,2-DCA was DhlA from Xanthobacter autotrophicus GJ10 (11). To date, only one additional enzyme has been identified with dechlorination activity against 1,2-DCA (12). Dechlorination of 1,2-DCA is of particular interest for a number of reasons: first, 1,2-DCA is believed to have been first introduced into the environment in 1922, the year of its initial synthesis, and hence it is intriguing that a protein already exists whose best substrate is this recent addition to the biosphere. Additionally, there are concerns about 1,2-DCA's potential carcinogenicity and its prevalence as an environmental pollutant that continues to be widely used in industrial processes, both as a chemical feedstock and as an organic solvent.

To date, over ten haloalkane dehalogenases have been discovered, with activity against a variety of halogenated compounds. The crystal structures of three of these - DhlA, LinB of Sphingomonas paucimobilis UT26 and DhaA of Rhodococcus sp. - have been published. These belong to the α/β-hydrolase superfamily, a large group spanning diverse hydrolytic functionalities, including esterases, thioesterases, epoxide hydrolases (epoxidases) and dehalogenases (13). Members of this superfamily have a two-domain tertiary structure: a core domain composed of a sheet of β-strands (a linear protein secondary structure; see Horton et al. (14)) surrounded by a number of α-helices (a helical secondary structure; also see Horton et al. (14)), and an α-helical cap domain with significant structural variability (refer to Figure 2.1 for a representative α/β-hydrolase structure). A nucleophilic aspartate, a second acidic residue (Asp or Glu) and a histidine general base constitute the conserved catalytic triad, which resides in a region of the protein between the core and cap domains. This interdomain region, which is comparatively hydrophobic, forms the cavity to bind haloalkanes. The cap domain has some movement potential due to the flexible connection between it and the core domain.

Figure 2.1: Structure of haloalkane dehalogenase DhaA from Rhodococcus rhodochrous NCIMB 13064 (PDB 1BN6)

Unique to the known haloalkane dehalogenases is the exclusive use of an aspartate nucleophile located at a sharp turn after β-strand 5 (β5), in a configuration known as the nucleophile elbow. Furthermore, the residues of the catalytic triad consistently appear in the order Asp-Asp/Glu-His. The position of the acidic Asp or Glu appears to vary, as it has been found after strand β6 (putatively identified in LinB and DhaA), but also after strand β7 (as in DhlA). Despite this topological variation, the haloalkane dehalogenases exhibit a significant degree of amino acid identity, on the order of 30-40%.
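To make the identity figure concrete, the following minimal sketch computes percent identity from one pairwise alignment. It assumes that identity is counted over columns where both sequences have a residue; other conventions (for example, dividing by the full alignment length) give different numbers, and the text does not state which was used. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither row has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue              # skip columns containing a gap
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Toy alignment (made-up residues): 4 identical columns out of 5 compared.
toy_identity = percent_identity("MKT-AD", "MRTLAD")
print(toy_identity)
```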

In general, α/β-hydrolases employ a two-step reaction mechanism: a nucleophilic attack on the carbon on which the leaving group resides to form an enzyme-substrate intermediate, followed by hydrolysis of the intermediate and the regeneration of the enzyme for continued catalysis (Figure 2.2). In the characterized haloalkane dehalogenases, the initial attack on the halogenated carbon is by a nucleophilic aspartate, whose action releases the halide and forms the enzyme-ester intermediate. A water molecule, activated by interaction with the catalytic acid and histidine, subsequently attacks the carboxyl group of the nucleophilic aspartate to yield a haloalcohol (halohydrin) and a proton. The hydrolysis step also restores the enzyme to its original state by reforming the carboxylate group of the nucleophilic aspartate (15).

Figure 2.2: Reaction scheme of 1,2-DCA with DhlA. This mechanism extends generally to all haloalkane dehalogenases of this family. From Chan (16).

A number of structural features enable the enzyme to handle the chemical structures generated during the dehalogenation process. In the initial nucleophilic attack, the negatively charged leaving group halide is held in the hydrophobic pocket by the edges of one or more aromatic rings which, in contrast with their electron-rich flat faces, are slightly positively charged. A tryptophan residue, containing an aromatic indole side chain, is absolutely conserved immediately downstream of the nucleophile in all examples of haloalkane dehalogenases and serves to bind the halide. A second halide-binding residue is either a downstream tryptophan (DhlA) or an upstream asparagine (LinB, DhaA). The second step, hydrolysis, generates a negatively charged oxyanion intermediate, which is stabilized by Glu56 and Trp125 (17).

The three haloalkane dehalogenases crystallized to date have distinct substrate specificities and ranges, as identified by their catalytic efficiency. DhlA's natural substrate (as identified by its kcat/Km) is 1,2-DCA, a small, straight-chain haloalkane, while LinB's is 1,3,4,6-tetrachloro-1,4-cyclohexadiene, a comparatively large polychlorinated cyclic alkene (though it exhibits activity on an alkene, LinB removes chloride from one of two sp3 carbons). LinB is also the only characterized hydrolytic dehalogenase with activity against β-halogenated aliphatics (18). The Rhodococcus strain carrying DhaA was isolated on media containing short-chain (three- to ten-carbon) 1-haloalkanes (19). The differences in substrate specificity stem partly from the size of the active site cavity: DhlA's active site is conspicuously occupied by a phenylalanine residue, which makes it significantly smaller than the active site cavities of LinB and DhaA (17). The size and configuration of the active site cavity are defined by the enzyme's cap domain, which is the least conserved region among the dehalogenases. Sequence comparison of the cap domains of different haloalkane dehalogenases reveals little similarity among them.
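For reference, the catalytic efficiency kcat/Km used above to rank substrates comes from the standard Michaelis-Menten rate law. The symbols v (initial rate), [E]_0 (total enzyme concentration) and [S] (substrate concentration) are notation introduced here for illustration and do not appear in the text:

$$v = \frac{k_{cat}\,[E]_0\,[S]}{K_m + [S]}, \qquad v \approx \frac{k_{cat}}{K_m}\,[E]_0\,[S] \quad \text{when } [S] \ll K_m$$

At low substrate concentration the rate is therefore proportional to kcat/Km, which is why this ratio is commonly taken as a measure of how well an enzyme handles a given substrate.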

2.2 L-2-haloacid dehalogenases

Haloacid dehalogenases (HADs) catalyze the dehalogenation of halocarboxylic acids. At least two phylogenetically distinct families of HADs exist: L-2-haloacid dehalogenases (L-DEXs; EC 3.8.1.2), which catalyze the transformation of L-2-haloacids to stereochemically inverted D-2-hydroxyacids, and D-2-haloacid dehalogenases (D-DEXs; EC 3.8.1.9), which transform D-2-haloacids into L-2-hydroxyacids. A third type of HAD, the DL-DEX (EC 3.8.1.10, 3.8.1.11), reacts with both enantiomers and appears to be phylogenetically related to D-DEXs (20).

L-DEXs are by far the best characterized of the haloacid dehalogenases and only they are examined further in this research. Unless otherwise noted, all references to HADs herein pertain exclusively to L-DEXs.

HADs are members of the eponymous haloacid dehalogenase superfamily, whose members primarily act on phosphate groups: magnesium-dependent phosphatases, P-type ATPases, phosphonatases and phosphoglucomutases. HAD-superfamily enzymes have a two-domain structure, with a core domain in a Rossmann fold arrangement with a six-stranded fully parallel β-sheet flanked by α-helices, linked to a cap domain composed entirely of α-helices (see Figure 2.3). The active site residues are located between the two domains. Although the structural descriptions appear similar, there is neither sequence nor structural similarity between haloalkane and haloacid dehalogenases (21).

The chemistry of HAD-mediated dehalogenation (and, in fact, all hydrolysis reactions by this family) is similar to that of haloalkane dehalogenases; that is, a nucleophilic aspartate attacks the halogenated α-carbon, creating an ester intermediate that is subsequently hydrolyzed by a water molecule in the vicinity of the aspartate, concomitantly liberating the α-hydroxyacid and regenerating the enzyme. All HAD superfamily hydrolases appear to use an aspartate nucleophile near the N-terminal end; however, L-DEXs are unique among HAD-superfamily members in requiring no cofactors for activity.

Figure 2.3: Structure of L-DEX YL haloacid dehalogenase from Pseudomonas sp. YL, with chloroacetate positioned in the active site (PDB 1ZRN).

Though the mechanism is ostensibly identical to that employed by haloalkane dehalogenases, there currently is no clearly identified catalytic triad responsible for the activity, as there is in the haloalkane dehalogenases. In its place one finds instead a collection of residues around the active site which have been found to be important for enzyme activity. The nucleophilic aspartate appears after the strand β1, near the junction between the core and cap domains. A relatively motile arginine residue (Arg39; L-DEX YL coordinates) is thought to be responsible for the import of the halocarboxylate substrate and its positioning in the active site; its movement also allows for egress of the product. At least three residues (Ser175, Asn177, Asp180) participate in the positioning of a water molecule for the hydrolysis of the covalent intermediate. The leaving group halide is thought to be supported by a binding cradle formed by Arg39, Asn115 and Phe (L-DEX YL) or Trp (DhlB).

2.3 Fluoroacetate dehalogenases

The C-F bond has one of the highest bond energies found in organic compounds (for instance, in CH3-X, where X is a halide, C-F = 451 kJ/mol versus C-Cl = 349 kJ/mol and C-Br = 285 kJ/mol (22)). Fluorinated compounds are very stable (23) and thus persist in the environment.

Fluoroacetate (FAc) is structurally similar to chloroacetate and bromoacetate, with the obvious difference in halides between the three haloacetates. However, the two crystallized enzymes known to dehalogenate fluoroacetate are not haloacid dehalogenases. DehH1 of Moraxella sp. B and FAc-DEX of Burkholderia sp. FA1 (24) both belong to the same α/β-hydrolase superfamily as haloalkane dehalogenases. Primary sequence similarity is low (below 20% amino acid identity) between haloalkane and fluoroacetate dehalogenases, but the folded proteins are structurally very similar, with the β-sheet core domain and α-helical cap domain present in both. Despite the similarities, the two functionalities appear mutually exclusive and no promiscuous proteins catalyzing both reactions have been discovered to date. One notable structural difference is that fluoroacetate dehalogenases appear to exist as homodimers, while haloalkane dehalogenases have been found as monomers.

Compared with haloalkane dehalogenases, fluoroacetate dehalogenases have a clearly identified nucleophilic Asp105 (DehH1 sequence numbering) and a general base His272. However, they appear to lack the catalytic acid present in all haloalkane dehalogenases. Additional conserved residues include Arg106 and Trp151, presumed to be involved with leaving group stabilization, and Arg109, likely interacting with the substrate carboxylate. Mass spectrometric and mutational studies suggest that defluorination proceeds in the same manner as in haloalkane dehalogenases; Asp105 displaces the halogen, and water activated by His272 hydrolyzes the acyl-enzyme (25). Further elucidation of the reaction mechanism and the structural features or changes that enable C-F bond cleavage awaits detailed mutagenesis and crystallization of reaction intermediates.

Chapter 3

Materials and Methods

3.1 Selection of gene targets

Organisms which had one or both of the following features were selected for genetic examination: first, organisms were selected if they have known whole-cell activity against halogenated organic compounds; or second, they were selected if they were soil-based or marine organisms.

The draft or completed genome of each selected organism was subject to a translated protein BLAST search. Each genome was searched with the following characterized proteins of the three dehalogenase families under examination (here with SwissProt accession numbers):

• Haloalkane dehalogenase: DhlA (P22643), LinB (P51698), DhaA (P0A3G2)

• L-2-haloacid dehalogenase: L-DEX YL (Q53464), DhlB (Q60099)

• Fluoroacetate dehalogenase: DehH1 (Q01398), FAc-DEX FA1 (Q1JU72)

All resulting contiguous alignments of significant length were examined for the presence of known conserved residues in each of the protein families. In all cases, only sequences in which the alignment showed an aspartate nucleophile at the active site were selected as targets. This a priori assumption was based on a separate screen from previous work by Chan (16).
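The aspartate criterion above can be sketched as a check on a pairwise alignment. This is only an illustration under stated assumptions: the function name, the gapped-string representation and the toy sequences are hypothetical, and the thesis does not say how the inspection was actually carried out (it may well have been done by eye).

```python
def has_aspartate_nucleophile(query_aln, hit_aln, query_site):
    """Return True if the hit carries 'D' opposite the query's nucleophile.

    query_aln, hit_aln: equal-length gapped alignment strings ('-' = gap).
    query_site: 1-based position of the nucleophile, counted over the
    ungapped query residues covered by this alignment (an assumption).
    """
    if len(query_aln) != len(hit_aln):
        raise ValueError("aligned strings must have equal length")
    position = 0
    for q_res, h_res in zip(query_aln, hit_aln):
        if q_res == "-":
            continue  # gap in the query: no query residue consumed
        position += 1
        if position == query_site:
            return h_res.upper() == "D"
    return False  # nucleophile position not covered by the alignment


# Toy sequences, not taken from any real dehalogenase.
keep = has_aspartate_nucleophile("MKD-GT", "MRDAGS", 3)
drop = has_aspartate_nucleophile("MKDGT", "MKEGT", 3)
```

A hit that aligns any residue other than aspartate, or a gap, at the nucleophile position would be discarded under this sketch.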

Multiple sequence alignment was performed by CLUSTALW and CLUSTALWPROF (26) on Biology Workbench (San Diego State University, San Diego, CA). Unrooted phylogenetic trees were generated from entire sequences by the program proml, as implemented on T-REX (UQAM, Montreal, QC) (27). proml compares protein alignments by CLUSTALW with alignments generated by bootstrapping of the sequences, and trees are generated based on the maximum-likelihood estimation. Trees were generated including outgroups of epoxide hydrolases (for haloalkane and fluoroacetate dehalogenases) or HAD-superfamily phosphatases (L-2-haloacid dehalogenases).

3.2 Cloning and expression of gene targets

Genomic DNAs for the organisms were obtained from their source as listed on their respective genome sequencing websites, and the target genes isolated by the polymerase chain reaction (PCR) with specific primers for each target. In addition to the gene-specific region, each pair of primers was modified at the 5'-terminus with 15 base pairs homologous to that of the gene insertion site on the expression vector p15Tv-L (see below). Each PCR reaction contained Pfu reaction buffer, 0-10% DMSO (percentage was reaction-dependent), 200 nM of both primers, 25 nM dNTP mix, 0.1-1 µg genomic DNA and 0.5 µL of Pfu. Pfu is a DNA polymerase with proofreading activity, resulting in superior replication fidelity compared with conventional Taq polymerase. An automated thermocycler was programmed for 32 cycles of the following: 30 s at 95°C (denaturation), 45 s at 59°C (anneal), 2.5 min at 72°C (extension). After the completion of the final cycle, the PCR mixture was heated to 72°C for an additional 10 minutes to ensure complete amplification. The annealing temperature of 59°C was chosen because it was empirically determined to yield product in most cases. The lengths of the amplification products were verified by agarose gel electrophoresis.
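As a quick check on the thermocycler program described above, the total hold time can be computed from the stated step durations. This is a minimal sketch: the function name is hypothetical, and ramp times between temperatures are ignored, so the actual run would take somewhat longer.

```python
def pcr_program_minutes(cycles, step_seconds, final_extension_seconds):
    """Total programmed hold time in minutes, ignoring temperature ramps."""
    per_cycle = sum(step_seconds)
    return (cycles * per_cycle + final_extension_seconds) / 60


# 32 cycles of 30 s / 45 s / 2.5 min, then a 10 min final extension.
total_minutes = pcr_program_minutes(32, [30, 45, 150], 600)
print(total_minutes)  # 130.0
```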

3.3 Expression vector preparation

PCR products were placed into p15Tv-L by homologous recombination. p15Tv-L (Figure 3.1) is a pET15(b)-derived, ColE1-based plasmid featuring a bacteriophage T7 promoter driving expression of sacB, the levansucrase gene from Bacillus subtilis which causes cell lysis in Gram-negative bacteria in the presence of sucrose (28, 29). sacB is flanked at both 5' and 3' ends by BseRI restriction sites. p15Tv-L also contains an N-terminal fusion (MGSSHHHHHHSSGRENLYFQG) with a hexahistidine (6xHis) tag sequence and a cleavage site for tobacco etch virus (TEV) protease. Cells harbouring unaltered plasmid are resistant to ampicillin and are incapable of growing on media with 5% sucrose.

[Figure: plasmid map of p15Tv-L, showing restriction sites and the lacI gene.]

Figure 3.1: p15Tv-L cloning plasmid.

p15Tv-L was digested with 4 units of BseRI (New England Biolabs) at 37°C. After one hour, the reaction mixture was supplemented with 4 units of enzyme and incubated at 37°C for an additional hour. The digested vector was purified with a PCR purification kit (Qiagen) and stored in deionized water (dH2O).

Cleaned, digested p15Tv-L was used to rehydrate a lyophilized pellet of InFusion recombination enzyme (BD Biosciences), a one-step substitute for the conventional digestion-ligation method for joining vector and insert. The vector and insert were mixed in 1:2 proportion by mass, as recommended by the manufacturer, and the recombination reaction was allowed to proceed at room temperature for 30 minutes. 1 µL of this reaction was transformed into rubidium-competent E. coli DH5

Verified plasmids were frozen at -20°C until subsequent use.

30-40 ng of sequenced plasmid was transformed into rubidium-competent E. coli BL21(DE3) Gold cells by the heat shock method and plated on LB agar supplemented with ampicillin and kanamycin (LBA/K). Successful transformants formed colonies overnight on LBA/K agar plates, and one colony from each plate was picked and grown in LBA/K. 1 mL of cell suspension was mixed with 400 µL of 80% glycerol to establish a frozen stock.

3.4 Protein expression

Cells were grown overnight at 37°C in 5 mL LBA/K. The entire culture volume was used to inoculate 1 L of Terrific Broth (TB) containing the same concentrations of ampicillin and kanamycin as in the overnight culture. The 1 L culture was grown at 37°C until OD600 reached 0.8-1.0, at which point the incubation temperature was lowered to 16°C and IPTG was added to a working concentration of 1 mM to induce recombinant protein production.

Growth at 16°C proceeded for 16-18 h and then the cells were harvested by centrifugation. The cell pellet was retained and resuspended in 15-20 mL of Binding Buffer (BB; 50 mM HEPES-Na, 5% glycerol, 0.5 M NaCl, 5 mM imidazole, pH 7.5), as appropriate, and then flash frozen in liquid nitrogen. The cells were stored at -20°C until subsequent use.

3.5 Protein purification by affinity chromatography

Frozen cells were thawed overnight at 4°C. Cells were lysed by sonication with a rectangular waveform. Sonication occurred in five-second cycles: during the first two seconds, the sonicating probe was powered and active; for the latter three seconds the probe was turned off to allow the cells to cool. Total sonication time was 3 active minutes (7.5 minutes total time in sonicator). Sonicated cells were centrifuged at 24,000 rpm for 45 minutes to pellet cell debris.
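The relationship between active and total sonication time follows directly from the duty cycle of the pulse program. The short sketch below, with a hypothetical function name, reproduces the 7.5 minute figure from the 2 s on / 3 s off cycle.

```python
def total_sonication_minutes(active_minutes, on_seconds, off_seconds):
    """Wall-clock time needed to accumulate the requested active time."""
    # Duty cycle is on / (on + off); total = active / duty cycle.
    return active_minutes * (on_seconds + off_seconds) / on_seconds


total = total_sonication_minutes(3, 2, 3)
print(total)  # 7.5
```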

All subsequent steps were carried out at 4°C. The cleared cell lysate was loaded onto a protein purification column prepared with 6 mL of nickel nitrilotriacetate (Ni-NTA) beads which had been washed with 30 mL BB prior to lysate loading to remove residual ethanol from the storage solution. After the lysate had completely passed through the column, an additional 30 mL of BB was passed over the beads to remove nonspecifically bound proteins. 200 mL of Wash Buffer (WB; 50 mM HEPES-Na, 5% glycerol, 0.5 M NaCl, 30 mM imidazole, pH 7.5) was then used to displace stronger nonspecific binders. All column flowthroughs up to this point were discarded.

After the wash step, Elution Buffer (EB; 50 mM HEPES-Na, 5% glycerol, 0.5 M NaCl, 250 mM imidazole, pH 7.5) was used to strip the target protein from the Ni-NTA beads. Elution progress was monitored by combining 5 µL samples of the flowthrough with 95 µL 0.2X diluted Bradford reagent (Bio-Rad). If protein was present in the flowthrough, the mixture changed from the reagent's original maroon-brown to a bright blue colour. EB was added until flowthrough samples no longer caused a colour change of the Bradford reagent.

All flowthrough due to EB was collected in membrane concentrator columns of 10 kDa molecular weight cutoff (MWCO) and concentrated by centrifugation. The proteins were flash-frozen in liquid nitrogen and stored at -80°C for future use or were subjected immediately to TEV digestion.

The final concentration of the protein sample was determined by spectrophotometric assay with Bradford Reagent. Purity of the purified target was evaluated by SDS-PAGE.

3.6 Rapid colourimetric dehalogenation assay

All Ni-NTA purified proteins were screened for dehalogenase activity by a colourimetric pH shift assay (31). 5 µL of thawed protein was mixed into a solution of 1 mM HEPES (pH 8.2), 1 mM EDTA, 20 ppm phenol red (sodium salt) and 20 mM of test substrate, to a total volume of 200 µL. 1,2-dichloroethane (1,2-DCA), 1,2-dibromoethane (1,2-DBA), sodium chloroacetate (ClAc), sodium bromoacetate (BrAc) and sodium fluoroacetate (FAc) were tested against all targets, regardless of their putative classification generated by BLAST searches. The reactions were allowed to proceed overnight at room temperature. Targets were declared true dehalogenases if the reaction mixture changed in colour from pink to yellow, indicating depressed pH. Colour change was evaluated by visual inspection. Negative control was provided by addition of EB in place of enzyme, which elicits no appreciable colour change. Positive control was provided by addition of sulphuric acid in place of enzyme, or by the reaction of purified DhlA with His-tag against 1,2-DBA.

3.7 Further purification of confirmed dehalogenases

True dehalogenases from the above assay were further purified through the removal of the hexahistidine tag used for the initial purification, and subsequently by gel filtration chromatography. Finally, the protein was buffer exchanged into a separate storage buffer.

An aliquot of Ni-NTA purified protein was diluted tenfold in BB. For each 10 mg of protein in the aliquot (as determined by the Bradford assay) 1 mg of TEV protease was added from a stock preparation of 2.5 mg/mL, which was purified over a Ni-NTA column as described above, except that elution buffer contained 20% glycerol. The digestion was allowed to proceed overnight at room temperature. Each digest was visualized by SDS-PAGE to verify completion; undigested protein was loaded for size comparison.
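The protease dosing rule above (1 mg of TEV per 10 mg of target, drawn from a 2.5 mg/mL stock) reduces to a one-line volume calculation. The sketch below uses a hypothetical function name; the 18.3 mg example is simply an illustrative aliquot size.

```python
def tev_stock_volume_ml(target_mg, target_per_tev=10.0, stock_mg_per_ml=2.5):
    """Volume of TEV stock (mL) for a given mass of target protein (mg)."""
    tev_mg = target_mg / target_per_tev  # 1 mg TEV per 10 mg target
    return tev_mg / stock_mg_per_ml


volume = tev_stock_volume_ml(18.3)  # about 0.73 mL of stock
```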

The entire digest volume was passed over 6 mL of charged Ni-NTA beads, and then washed with 20 mL of WB. All flowthrough was collected and concentrated by centrifugation in membrane concentrator columns (10 kDa MWCO; Amicon). The retentate was either directly used for gel filtration (see below) or frozen at -80°C.

Fast protein liquid chromatography (FPLC) was used to separate the target protein from remaining contaminants of dissimilar molecular weight. The chromatographic system (Amersham/GE Biosciences) was connected to a Superdex 200 dextrose column (column volume: 60 mL) appropriate for the separation of polypeptides of 10 to 75 kDa. Concentrated retentate from the previous step was manually injected into the system in 2 mL aliquots, and an eluant of degassed 50 mM Tris-SO4, 150 mM NaCl (pH 8) was delivered isocratically through the column at 0.8-1 mL/min, to a total volume of 120 mL. 1.5 mL fractions were automatically collected by the integrated fraction collector. Elution progress was monitored by ultraviolet absorbance and recorded by computer. The correct protein peak was manually identified from the chromatogram, and the appropriate fractions were pooled, transferred into a membrane concentrator (10 kDa MWCO; Amicon) and concentrated by centrifugation.

The storage solution of the protein was changed to 50 mM Tris-SO4 (pH 8.5) by buffer exchange. The concentrated protein from above was diluted with 15 mL of the storage solution and concentrated by centrifugation in a membrane concentrator (10 kDa MWCO; Amicon). This process was repeated until the cumulative dilution factor exceeded 1000.
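The number of dilute-and-concentrate rounds needed to pass a 1000-fold cumulative dilution depends on the retentate volume left after each spin, which the text does not state. The sketch below therefore treats the retentate volume as an assumed input (0.5 mL and 1 mL are hypothetical examples) and assumes the sample is concentrated back to the same volume every round.

```python
def rounds_to_exceed(retentate_ml, added_ml=15.0, target=1000.0):
    """Count buffer-exchange rounds until cumulative dilution exceeds target.

    Assumes each round dilutes `retentate_ml` with `added_ml` of storage
    buffer and then concentrates back down to `retentate_ml`.
    """
    per_round = (retentate_ml + added_ml) / retentate_ml
    rounds, cumulative = 0, 1.0
    while cumulative <= target:
        cumulative *= per_round
        rounds += 1
    return rounds, cumulative


print(rounds_to_exceed(0.5))  # (3, 29791.0)
print(rounds_to_exceed(1.0))  # (3, 4096.0)
```

Under either assumed retentate volume, two rounds fall short of 1000-fold (961 and 256, respectively) and a third is required.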

3.8 Optimization of enzyme reaction conditions

The protein concentration at which activity became substrate-limited was determined for each enzyme, to ensure that maximum enzyme activity was assessed under conditions in which the target substrate was in excess. To determine the appropriate concentration of enzyme for use, a range of protein concentrations was assayed at a constant substrate concentration in 25 mM N-cyclohexyl-2-aminoethanesulfonic acid/NaOH (CHES-Na) at pH 9.0 and 30°C. Reactions were initiated by adding protein at different concentrations and quenched by the addition of 200 µL of 125 mM FeNH4(SO4)2 in 6 N HNO3. The halide produced was determined by the methods outlined in Section 3.9.1. The protein concentration used in subsequent experiments was chosen to be in the range where reaction velocity was determined solely by the amount of protein added.

The pH optimum for each enzyme was determined by testing each one at pH values of 7.5 to 10.5, in 0.5-unit increments, at an identical substrate concentration. The reaction buffers used for this purpose were 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid/NaOH (HEPES-Na; pH 7.5 and 8.0), 2-amino-2-hydroxymethyl-propane-1,3-diol/H2SO4 (Tris-SO4; pH 8.0 and 8.5), N-cyclohexyl-2-aminoethanesulfonic acid (CHES-Na; pH 9.0 and 9.5) and N-cyclohexyl-3-aminopropanesulfonic acid (CAPS-Na; pH 10.0 and 10.5). The working concentration of all buffers was 25 mM. Reactions were carried out as above for 20 minutes, and halide production was quantified as in Section 3.9.1. The optimum pH for the enzyme was that which resulted in the most halide production during the test period.
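The selection rule for the pH optimum is simply the condition with the highest halide production in the fixed test period. A minimal sketch follows; the readings are invented purely for illustration, and keying conditions by (buffer, pH) is an assumption made here so that the two buffers tested at pH 8.0 remain distinct.

```python
def ph_optimum(halide_by_condition):
    """Return the (buffer, pH) condition with the highest halide production."""
    return max(halide_by_condition, key=halide_by_condition.get)


# Hypothetical halide readings (arbitrary units), not data from this work.
readings = {
    ("HEPES-Na", 7.5): 12.0,
    ("HEPES-Na", 8.0): 18.5,
    ("Tris-SO4", 8.0): 20.1,
    ("Tris-SO4", 8.5): 31.4,
    ("CHES-Na", 9.0): 27.9,
    ("CHES-Na", 9.5): 22.3,
    ("CAPS-Na", 10.0): 15.0,
    ("CAPS-Na", 10.5): 9.7,
}
best_buffer, best_ph = ph_optimum(readings)
```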

3.9 Determination of enzyme kinetic parameters

Once the optimum pH and appropriate enzyme concentration were determined, full kinetic assays were carried out at each enzyme's respective optimum. Reactions were quenched at 2.5, 5, 7.5 and 10 minutes (2 and 4 hours for defluorination), either as described above or with 70 µL 2 M H2SO4 if the samples were intended for ion chromatography. The initial velocity of reaction was taken while turnover was linearly proportional to reaction time. The data obtained were fitted to the Michaelis-Menten equation by non-linear regression in GraphPad Prism 4.00 for Windows (GraphPad Software, San Diego, CA) to determine Km and vmax.
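For readers without access to Prism, the least-squares fit to v = vmax·[S]/(Km + [S]) can be sketched in plain Python. This is not the algorithm Prism uses; it is a simple stand-in that scans Km on a log grid, refines it by golden-section search, and solves vmax in closed form for each trial Km. The function names, search bounds and synthetic data are all assumptions made for illustration.

```python
import math


def _vmax_and_sse(km, substrate, velocity):
    """Best-fit vmax (closed form) and squared error for a trial Km."""
    f = [s / (km + s) for s in substrate]
    vmax = sum(v * fi for v, fi in zip(velocity, f)) / sum(fi * fi for fi in f)
    sse = sum((v - vmax * fi) ** 2 for v, fi in zip(velocity, f))
    return vmax, sse


def fit_michaelis_menten(substrate, velocity, log_km_min=-3.0, log_km_max=4.0):
    """Least-squares estimate of (Km, vmax); Km is searched on a log10 scale."""
    def sse_at(x):
        return _vmax_and_sse(10.0 ** x, substrate, velocity)[1]

    # Coarse scan to bracket the minimum.
    n = 141
    step = (log_km_max - log_km_min) / (n - 1)
    grid = [log_km_min + i * step for i in range(n)]
    best = min(range(n), key=lambda i: sse_at(grid[i]))
    a = grid[max(best - 1, 0)]
    b = grid[min(best + 1, n - 1)]

    # Golden-section refinement inside the bracket.
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(80):
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if sse_at(c) < sse_at(d):
            b = d
        else:
            a = c
    km = 10.0 ** ((a + b) / 2.0)
    vmax, _ = _vmax_and_sse(km, substrate, velocity)
    return km, vmax


# Synthetic, noise-free data generated with Km = 10 and vmax = 25.
s_values = [2.0, 5.0, 10.0, 20.0, 40.0, 80.0]
v_values = [25.0 * s / (10.0 + s) for s in s_values]
km_fit, vmax_fit = fit_michaelis_menten(s_values, v_values)
```

With real, noisy initial rates the estimates would carry uncertainty that this sketch does not report; a dedicated fitting package would also return standard errors.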

3.9.1 Quantitative determination of halide production

Spectrophotometric measurement (Cl-, Br-)

The method developed by Iwasaki et al. (32) was used to measure chloride and bromide generated by the enzymatic reaction. Briefly, 100 µL of Hg(SCN)2, saturated in 95% ethanol, was added to the above quenched reaction and mixed. Halide displaces thiocyanate from Hg(SCN)2, and the liberated thiocyanate combines with Fe(III) from the FeNH4(SO4)2 in the quenching solution to form a reddish-brown iron thiocyanate complex, the absorbance of which was measured at 460 nm after 10 minutes of colour development. The measured absorbance was compared to standard curves generated by measurement of sodium halide in the appropriate buffering solutions.
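Converting an absorbance reading into a halide concentration through a standard curve can be sketched as a straight-line fit followed by inversion. The linear response, the function names and the standard values below are assumptions for illustration only; the actual curves may have been fitted differently.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def halide_from_absorbance(a460, slope, intercept):
    """Invert the standard curve to estimate halide concentration."""
    return (a460 - intercept) / slope


# Hypothetical sodium halide standards (concentration, A460).
standards_conc = [0.0, 50.0, 100.0, 200.0]
standards_a460 = [0.05, 0.25, 0.45, 0.85]
slope, intercept = fit_line(standards_conc, standards_a460)
unknown = halide_from_absorbance(0.65, slope, intercept)  # about 150
```

Because the standards are prepared in the same buffer as the samples, a separate curve would be fitted for each buffering solution.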

Ion chromatographic separation (Cl-, Br-, F-)

Halide concentrations were further verified by anion-exchange chromatography. A Dionex chromatographic system fitted with an AS19 anion-exchange column with AG19 guard column and AS40 autosampler allowed for the separation of haloacids and halides. Samples were first diluted 10x in water and centrifuged to pellet particulates. 0.5 mL of the dilution was injected into a 20 µL sample loop to flush and load the sample.

To identify Cl-, Br-, chloroacetate and bromoacetate, an isocratic eluent of 20 mM NaOH was delivered at 1 mL/min for 20 minutes.

To separate F- and fluoroacetate, the gradient elution profile shown in Figure 3.2 was used.

[Figure: IC elution profile for fluoride and fluoroacetate; x-axis: Time (min), 0 to 45.]

Figure 3.2: Ion chromatography profile to separate fluoride and fluoroacetate in assay samples.

In both cases, measured peak times and areas were compared to those generated with standard curves to determine the nature and concentration of the moiety.

Chapter 4

Results and Discussion

4.1 Selection and purification of targets

A total of 33 targets were selected from 12 sequenced bacterial genomes. The genomes were taken from the list of draft or complete genomes as of July 2006 (Tables 4.1, 4.2, 4.3). Existing annotations and Enzyme Commission (EC) designation were not considered in the selection of targets.

Potential targets were obtained by the program tblastn as implemented on each individual genome's public web portal. HAn dehalogenase targets obtained using DhlA as the search template were distinct from those obtained using LinB and DhaA. Although members of the same

Table 4.1: List of organisms whose genomes were screened for putative dehalogenases.

Organism | Source | Gram | Notes
Anabaena sp. PCC7120 | Water | negative | Photosynthetic organism. Model organism for pattern formation.
Anaeromyxobacter dehalogenans 2CP-C | Soil | negative | Reductively dechlorinates many halogenated compounds. Also reduces metals, including U(VI) and Fe(III).
Azotobacter vinelandii AvOP | Soil | negative | Nitrogen fixing bacterium.
Burkholderia cenocepacia HI2424 | Soil | negative | Also known as strain BCC1. Strain isolated from agricultural soil.
Burkholderia vietnamiensis G4 | Water | negative | Cometabolizes TCE.
Chromohalobacter salexigens DSM 3043 | Water | negative | Degrades some aromatic compounds. Most halophilic moderate halophile known.
Dechloromonas aromatica RCB | Water | negative | Anaerobically oxidizes benzene, toluene and chlorobenzoate.
Jannaschia sp. CCS1 | Marine | negative | Prototrophic organism isolated from seawater. Member of Roseobacter clade, which has diverse degradative activities.
Polaromonas sp. JS666 | Water | negative | Can grow aerobically on cis-DCE as sole carbon source. Optimum growth 20-25°C.
Pseudomonas fluorescens Pf-5 | Soil | negative | Plant commensal.
Ralstonia eutropha JMP134 | Soil | negative | Contains plasmid enabling degradation of 2,4-dichlorophenoxyacetic acid and 3-chlorobenzoic acid.
Ralstonia solanacearum GMI1000 | Soil | negative | Pervasive plant pathogen.

Table 4.2: List of targets pursued during cloning and purification phases. HAn = haloalkane dehalogenase; HAD = haloacid dehalogenase; FAc = fluoroacetate dehalogenase.

Organism (Abbreviation) | Gene No. | Predicted activity | ExPASy Accession | Successful clone / Expressed in culture / Purified | Notes
Anabaena sp. PCC7120 (PCC) | 0039 | FAc | Q8Z0Q1 | ✓ ✓ ✓ |
Anabaena sp. PCC7120 (PCC) | 1353 | HAn | U8YX62 | ✓ ✓ ✓ |
Anabaena sp. PCC7120 (PCC) | 4221 | HAn | Q8YPH3 | ✓ ✓ ✓ |
Anaeromyxobacter dehalogenans 2CP-C (Adeh) | 0522 | HAn | Q2INB6 | ✓ ✓ ✓ |
Burkholderia cenocepacia HI2424 (BC) | 2964 | FAc | A0KB35 | ✓ |
Burkholderia cenocepacia HI2424 (BC) | 4182 | HAn | A0AZU1 | ✓ |
Burkholderia cenocepacia HI2424 (BC) | 6682 | HAD | A0KDZ8 | ✓ ✓ ✓ |
Burkholderia vietnamiensis G4 (BV) | 3053 | FAc | A4JIE1 | |
Burkholderia vietnamiensis G4 (BV) | 5158 | HAn | A4JPA6 | |
Chromohalobacter salexigens DSM 3043 (DSM) | 2506 | FAc | Q44IV8 | |
Chromohalobacter salexigens DSM 3043 (DSM) | 2565 | HAD | Q44IP7 | ✓ |
Chromohalobacter salexigens DSM 3043 (DSM) | 3227 | HAn | Q44K90 | |
Dechloromonas aromatica RCB (Daro) | 3835 | FAc | Q479B8 | ✓ ✓ ✓ |
Jannaschia sp. CCS1 (Jann) | 1658 | HAD | Q28RT7 | ✓ ✓ |
Jannaschia sp. CCS1 (Jann) | 2620 | HAn | Q28P25 | ✓ ✓ |
Polaromonas sp. JS666 (Bpro) | 0521 | HAn | Q12G58 | ✓ ✓ ✓ |
Polaromonas sp. JS666 (Bpro) | 0530 | HAD | Q12G50 | ✓ ✓ ✓ |
Polaromonas sp. JS666 (Bpro) | 0547 | HAn | Q12G35 | ✓ ✓ ✓ |
Polaromonas sp. JS666 (Bpro) | 2447 | HAn | Q12AS6 | ✓ ✓ ✓ |
Polaromonas sp. JS666 (Bpro) | 3067 | FAc | Q128R1 | ✓ ✓ ✓ |
Polaromonas sp. JS666 (Bpro) | 4478 | FAc | Q123C8 | ✓ ✓ ✓ |
Polaromonas sp. JS666 (Bpro) | 4516 | HAD | Q122Z0 | ✓ ✓ ✓ |
Pseudomonas fluorescens Pf-5 (PF) | 0960 | HAn | Q4KI42 | ✓ ✓ |
Pseudomonas fluorescens Pf-5 (PF) | 4714 | FAc | Q4K7I6 | ✓ ✓ |
Ralstonia eutropha JMP134 (RE) | 0165 | HAn | Q476Y6 | | could not PCR
Ralstonia eutropha JMP134 (RE) | 1952 | FAc | Q46ZW7 | | could not PCR
Ralstonia solanacearum GMI1000 (GMI) | 0256 | FAc | Q8Y2S9 | |
Ralstonia solanacearum GMI1000 (GMI) | 1362 | HAD | Q8XZN3 | ✓ |
Ralstonia solanacearum GMI1000 (GMI) | 1770 | HAn | Q8XYI8 | ✓ ✓ ✓ |
Xanthobacter autotrophicus GJ10 | DhlA | HAn | P22643 | ✓ ✓ ✓ | HAn positive control

Of the initial targets, 27 were successfully expressed and purified from E. coli BL21(DE3) cells cloned for recombinant expression. In addition, the confirmed 1,2-DCA degrading enzyme DhlA was purified to act as positive control to confirm the validity of the haloalkane dehalogenation assay. The Ni-NTA affinity method employed for protein purification was highly selective for the target proteins (Figure 4.1).

Figure 4.1: Protein visualization by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE). Left: protein content of recombinant cell lysate. Cells were induced to express target protein, which is the thickest band visible in each lane. Right: column passthrough from reverse Ni-NTA step after His-tag cleavage by TEV protease. L = molecular weight ladder.

Table 4.3: Public annotation of targets from UniProt. These automated annotations were not considered in target selection. EC 3.-.-.- denotes general hydrolase.

Target | Annotation (verbatim) | EC prediction

Putative HAD dehalogenases
Adeh3811 | Haloacid dehalogenase, type II | 3.8.1.2
BC2051 | Haloacid dehalogenase, type II | 3.8.1.2
BC6682 | Haloacid dehalogenase, type II | none
Bpro0530 | Haloacid dehalogenase, type II | 3.8.1.2
Bpro4516 | Haloacid dehalogenase, type II | 3.8.1.2
GMI1362 | Putative dehalogenase-like hydrolase; protein | 3.8.1.2
Jann1658 | Haloacid dehalogenase type II | 3.8.1.2

Putative HAn dehalogenases
Adeh0522 | Alpha/beta hydrolase fold-1 [Precursor] | 3.3.2.3
AVOP5010 | Alpha/beta hydrolase fold | 3.8.1.5
BC4182 | Alpha/beta hydrolase fold | 3.8.1.5
Bpro0521 | Alpha/beta hydrolase | 3.8.1.5
Bpro0547 | Twin-arginine translocation pathway signal [Precursor] | 3.8.1.5
Bpro2447 | CMP/dCMP deaminase, zinc-binding | 3.8.1.5
BV5158 | Alpha/beta hydrolase fold | 3 37 3
GMI1770 | Putative hydrolase/acyltransferase (Alpha/beta hydrolase superfamily) protein | 3.8.1.3
Jann2620 | Alpha/beta hydrolase | 3.8.1.5
PCC1353 | All1353 protein | 3.-.-.-
PCC4221 | All4221 protein | 3.3.2.3
PF0960 | Hydrolase, alpha/beta fold family | none

Putative FAc dehalogenases
BC2964 | Alpha/beta hydrolase fold | 3.8.1.3
Bpro3067 | Alpha/beta hydrolase fold | 3.8.1.3
Bpro4478 | Alpha/beta hydrolase fold | 3.8.1.3
BV3053 | Alpha/beta hydrolase fold | 3.8.1.3
Daro3835 | Alpha/beta hydrolase fold | 3.8.1.3
GMI0256 | Hypothetical haloacetate dehalogenase h-1 protein | 3.8.1.3
PCC0039 | Alr0039 protein | 3.8.1.3
PF4714 | Hydrolase, alpha/beta fold family | none

Protein yields from one litre of TB culture ranged from 2 mg to over 100 mg. The positive targets which underwent TEV treatment and buffer exchange experienced significant losses (from 30% to over 90%) during the process (see Table 4.4). The loss between pre- and post-cleavage can be attributed to the loss of the His-tag (2.4 kDa per protein, representing approximately 10% of typical HAD molecular weight, or 6% of typical HAn/FAc dehalogenase molecular weight), transfer losses between tubes and retention on the membrane of the membrane concentrator column. No attempt was made to quantify the loss due to each individual factor, but enough enzyme remained in each case to perform kinetic assays for some level of characterization. In some cases the amount of protein obtained from a purification step (as determined by Bradford assay) was more than that used for the purification in the first place; though it was not investigated why this may have occurred, in order to ensure accurate specific activity data each aliquot of enzyme used in the kinetic assays was well-mixed by pipetting and quantified immediately prior to assaying.
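To put the His-tag contribution in perspective, the mass expected after tag removal alone can be compared with what was actually recovered. The sketch below is illustrative only: the function names are hypothetical, the 10% tag fraction is the approximate HAD figure quoted above, and the example masses (18.3 mg digested, 8.9 mg recovered after TEV treatment) are the Adeh3811 values from Table 4.4.

```python
def mass_after_tag_removal(mass_mg, tag_fraction):
    """Protein mass expected if tag cleavage were the only source of loss."""
    return mass_mg * (1.0 - tag_fraction)


def loss_beyond_tag_mg(mass_before_mg, mass_after_mg, tag_fraction):
    """Loss not explained by removal of the tag (transfer, membrane, etc.)."""
    return mass_after_tag_removal(mass_before_mg, tag_fraction) - mass_after_mg


expected = mass_after_tag_removal(18.3, 0.10)   # about 16.5 mg
unexplained = loss_beyond_tag_mg(18.3, 8.9, 0.10)  # about 7.6 mg
```

On these numbers the tag accounts for under 2 mg of the observed loss, so most of it would have to come from the other factors named above.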

4.2 Biochemical screening

All successfully purified targets were subject to the rapid colourimetric screening assay (Figure 4.2). Targets capable of turnover of any test substrate, as evaluated by phenol red colour change, were considered to be true dehalogenases and investigated.

As concentrated enzyme solutions were used in the assays (0.25-1.25 mg/mL reaction), and the reaction was weakly buffered, dehalogenation could generally be readily identified, i.e. within 30 minutes a colour difference was observed between proteins that tested positive and negative as dehalogenases. After this initial examination, the same assay reactions were covered to prevent evaporation and allowed to sit for a further 8 to 12 hours for additional turnover. Some reactions that did not achieve sufficient turnover in 30 minutes to elicit colour change did reveal enzymatic activity upon the extended incubation.

A total of 11 active dehalogenases were found amongst the targets (see Table 4.2).

Figure 4.2: Colourimetric screen of targets with chloroacetate. Positives changed original pink colour to yellow/orange.

Table 4.4: Yields from purification process: (a) Yields from initial Ni-NTA purification, (b) Yields after treatment for positive target proteins. For positives, concentrations after TEV cleavage and buffer exchange were verified for accuracy prior to all quantitative kinetic assays.

a) Protein | Concentrated volume (mL) | Concentration (mg/mL) | Total (mg)
PCC0039 | 0.8 | 16.9 | 13.5
PCC1353 | 2 | 0.9 | 1.8
PCC4221 | 1.5 | 0.9 | 1.4
Adeh0522 | 6.3 | 3.3 | 20.8
Adeh3811 | 2.5 | 7.3 | 18.3
AVOP5010 | 10 | 13.4 | 134.0
BC2051 | 1.8 | 19.3 | 34.7
BC2964 | 11.6 | 12.5 | 145.0
BC4182 | 11 | 20.7 | 227.7
BC6682 | 9 | 7.5 | 67.5
BV3053 | 15 | 12.2 | 183.0
BV5158 | 12 | 2.5 | 30.0
Daro3835 | 2.6 | 51.5 | 133.9
Jann1658 | 2 | 10.3 | 20.6
Jann2620 | 2.6 | 25.1 | 65.3
Bpro0521 | 10.2 | 4.2 | 42.8
Bpro0530 | 1.3 | 39.2 | 51.0
Bpro0547 | 8.4 | 2.5 | 21.0
Bpro2447 | 11 | 2.3 | 25.3
Bpro3067 | 3.5 | 5.2 | 18.2
Bpro4478 | 1.4 | 42.8 | 59.9
Bpro4516 | 1.9 | 24.4 | 46.4
PF0960 | 7.5 | 3.2 | 24.0
PF4714 | 1 | 5.5 | 5.5
GMI0256 | 8 | 5.9 | 47.2
GMI1362 | 8.4 | 6.8 | 57.1
GMI1770 | 5 | 12.9 | 64.5
DhlA | 0.6 | 39 | 23.4

b) Protein | Total protein used for TEV digest (mg) | Total protein post-TEV (mg) | Total protein post-buffer exchange (mg) | Overall loss (initial-final, mg)
PCC0039 | 12.7 | 2.9 | 1.4 | -11.3
Adeh3811 | 18.3 | 8.9 | 2.2 |
BC2051 | 34.7 | 93.9 | 35.1 | 0.4
BC2964 | 49.9 | 63.6 | 18.9 | -31.0
Daro3835 | 51.5 | precipitated | n/a | n/a
Jann1658 | 20.6 | 28.3 | 32 | 11.4
Jann2620 | 65.3 | 24.1 | 28.6 | -35.7
Bpro0530 | 51 | 15.3 | 8.3 | -42.7
Bpro4478 | 59.9 | 86.6 | 39.3 | -20.6
Bpro4516 | 46.4 | 43.9 | 29.7 | -16.7
GMI1362 | 20.5 | 10.2 | 12.4 | -8.1

Table 4.5: Results of general enzymatic screens of putative targets with halogenated compounds. 1,2-DCA = 1,2-dichloroethane; 1,2-DBA = 1,2-dibromoethane; ClAc = chloroacetate; BrAc = bromoacetate; FAc = fluoroacetate.

Target | Screening substrates (1,2-DCA, 1,2-DBA, ClAc, BrAc, FAc) | Notes

Putative HAD dehalogenases
Adeh3811 | | Good digest/buffer exchange
BC2051 | | Good digest/buffer exchange
BC6682 | | no activity detected, not pursued
Bpro0530 | | Good digest/buffer exchange
Bpro4516 | | Good digest/buffer exchange
GMI1362 | | Good digest/buffer exchange
Jann1658 | | Good digest/buffer exchange

Putative HAn dehalogenases
Adeh0522 | | no activity detected, not pursued
AVOP5010 | | precipitated on thaw, not pursued
BC4182 | | precipitated on thaw, not pursued
Bpro0521 | | no activity detected, not pursued
Bpro0547 | | no activity detected, not pursued
Bpro2447 | | no activity detected, not pursued
BV5158 | | no activity detected, not pursued
GMI1770 | | no activity detected, not pursued
Jann2620 | | Good digest/buffer exchange
PCC1353 | | no activity detected, not pursued
PCC4221 | | no activity detected, not pursued
PF0960 | | precipitated on thaw, not pursued

Putative FAc dehalogenases
BC2964 | | Good digest/buffer exchange
Bpro3067 | | no activity detected, not pursued
Bpro4478 | | Good digest/buffer exchange
BV3053 | | no activity detected, not pursued
Daro3835 | | precipitated on digest
GMI0256 | | no activity detected, not pursued
PCC0039 | | Good digest/buffer exchange
PF4714 | | no activity detected, not pursued

Verified dehalogenases were reacted with TEV protease to remove the 6xHis binding tag and buffer-exchanged with Tris-SO4 to remove chloride. During the tag cleavage, Daro3835 FAc dehalogenase precipitated in solution. A test cleavage at 4°C for 12 h appeared successful (i.e. no precipitate) but was not characterized further.

4.2.1 Identification of HADs with defluorination activity

Three of six positive HADs were observed to change the pH of the buffering solution when the sole available substrate in the solution was FAc. This was interpreted as a novel defluorination activity in HADs and has not been previously reported in the literature. It was consistently observed over multiple replicates that compared to dechlorination, defluorination by HADs occurs at a significantly lower rate, qualitatively observed as a longer time lag between the addition of enzyme and the time colour change could be observed.

[Figure: ion chromatograms of the reaction at t = 0 and at a later time point; x-axis: Minutes, 0 to 45. Panel annotations: [FAc-] = 954 nM (t = 0); [FAc-] = 811 nM, [F-] = 142 nM (later time point).]

Figure 4.3: Defluorination by Adeh3811 HAD, performed at 20 µg/mL. F- = fluoride, Gly- = glycolate, FAc- = fluoroacetate. The decrease in FAc- is stoichiometric with the increase in F-.

To verify the colour change was caused by defluorination, and not by the unexpected degradation of another moiety, sample reactions were separated by ion chromatography to distinguish between different anions (Figure 4.3). Mass balances of fluoride and fluoroacetate in each reaction volume demonstrate that fluoride has been generated in the reaction and fluoroacetate has been consumed. This confirms defluorination indeed occurred and was mediated by the HADs. These fluoroacetate-capable HADs are herein referred to as HAD-FAcs.
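The mass balance can be expressed as the ratio of fluoride formed to fluoroacetate consumed, which should be close to 1 for stoichiometric defluorination. The sketch below uses a hypothetical function name and the concentrations annotated on Figure 4.3, taken as printed; since all three values share one unit, the unit cancels in the ratio.

```python
def fluoride_recovery(fac_initial, fac_final, fluoride_final):
    """Ratio of fluoride produced to fluoroacetate consumed (same units)."""
    consumed = fac_initial - fac_final
    if consumed <= 0:
        raise ValueError("no fluoroacetate was consumed")
    return fluoride_final / consumed


recovery = fluoride_recovery(954.0, 811.0, 142.0)  # 142 / 143, about 0.99
```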

4.3 Kinetic characterization of haloacid dehalogenases

A number of the confirmed HADs were biochemically characterized for Km and specific activity, with ClAc as substrate (Table 4.6). The reactions were performed at 30°C at the optimal pH of each enzyme (see Figure 4.4 for an example and Appendix B for all data). For these experiments, each concentration of ClAc was tested once, and between 8 and 12 concentrations were studied for their initial rates of halide production. The Km values of the HADs with ClAc are around 10 to 40 µM, which is very low in comparison to the values in the literature. Mass-normalized specific activities spanned one order of magnitude, from approximately 10 to 100 µmol Cl-/(min·mg protein), and lie in the range of values for HADs in the literature. The comparison does create some cause for concern and these enzymes should be assayed together with DhlB and L-DEX YL by the same detection method (see Future Work, Section 5.2). Chan (16) has also found an HAD with Km in the low-µM range, though with the same colourimetric detection method used here.

Table 4.6: Kinetic characteristics of HADs with ClAc, mean ± SD. Data for L-DEX YL from Liu et al. (33), DhlB from van der Ploeg et al. (34), others from BRENDA (35).

Enzyme | Assay buffer/pH | KM (µM) | Specific Activity (µmol/min·mg protein)
Adeh3811 | CAPS 10.5 | 9.8 ± 3.9 | 25 ± 2.5
BC2051 | Tris-SO4 8.5 | 17 ± 2.4 | 12 ± 0.3
Bpro0530 | Tris-SO4 8.5 | 4.0 ± 0.5 | 29 ± 0.7
Bpro4516 | CHES 9.5 | 11 ± 2.0 | 64 ± 3.6
GMI1362 | CAPS 10 | 34 ± 8.5 | 100 ± 12
Jann1658 | Tris-NO3 8.5 | 6.7 ± 4.1 | 10 ± 3.1
DhlB | pH 9.5 | -- | 55
L-DEX YL | pH 9.5 | 1100 | 120
HAD of Azotobacter sp. | pH 8 | 100 | 0.7
HAD of Pseudomonas putida | pH 10.5 | 1000 | 120

No decrease in the initial reaction velocity was observed at the concentrations used to determine the enzymes' kinetic parameters (generally below 100 µM, but up to 400 µM in one case), i.e. no inhibition was observed. However, because inhibition was not specifically investigated in this study, results from these kinetic experiments do not preclude the possibility of substrate- or product-based inhibition at higher substrate concentrations.

[Figure 4.4 plot: Adeh3811 with ClAc; horizontal axis [ClAc] (µM).]

Figure 4.4: Example of kinetic characterization, here of Adeh3811 of Anaeromyxobacter dehalogenans 2CP-C. 0.1 µg/mL enzyme in 25 mM CAPS-Na pH 10.5.

4.4 Rate estimate of defluorination

[Figure 4.5 chromatograms: two panels with time axes in minutes (0 to 45); one panel is labelled t = 9 h. Panel annotations: [FAc⁻] = 1077 µM; [FAc⁻] = 535 µM, [F⁻] = 453 µM.]

Figure 4.5: Example of defluorination as observed via IC, by Bpro4478. F⁻ = fluoride, Gly⁻ = glycolate, FAc⁻ = fluoroacetate. IC used was a Dionex system equipped with AG19 guard column, AS19 separation column and conductivity detector. Elution was by NaOH gradient per Materials and Methods 26.

To compare defluorination for the FAc dehalogenases and HAD-FAcs, extended assays were performed with 10 mM FAc. The samples were quenched with H₂SO₄ and separated by IC, which allows for unambiguous resolution of the anionic moieties expected in the reaction (fluoride, fluoroacetate, glycolate and sulfate). Chromatograms clearly show the development of fluoride and glycolate peaks, and the decline of the fluoroacetate peak (Figure 4.5).

In comparison to the dehalogenation rates of ClAc, defluorination is very slow by both FAc dehalogenases and HAD-FAcs (Table 4.7), and this necessitated high protein concentrations and extended incubation times. At equal concentrations of enzyme and substrate, the HAD-FAcs appear to be generally competitive with the new FAc dehalogenases in terms of fluoroacetate turnover, but HAD-FAcs are much faster in turning over ClAc. All of the measured rates from both families of enzymes are much lower than defluorination values from the literature.

Table 4.7: Kinetic characteristics of HAD-FAcs and FAc dehalogenases with fluoroacetate. All assays performed with protein at 20 µg/mL in 25 mM HEPES-Na pH 8.0 buffer. Lower rate estimate is average velocity over 4 hour incubation; upper estimate is velocity over 2 hours. Relative rate is the activity against ClAc divided by the upper estimate against FAc. nd = not determined. Rates for DehH1 and FAc-DEX FA1 respectively taken from Liu et al. (25) and Kurihara et al. (24). All activities are in µmol/(min·mg protein).

Enzyme        Enzyme   Activity, lower       Activity, upper       Est. activity   Relative rate
              type     estimate (Δt = 4 h)   estimate (Δt = 2 h)   against ClAc    (ClAc/FAc)
BC2964        FAc      0.094                 0.11                  nd
Bpro4478      FAc      0.34                  0.41                  0.041           0.10
PCC0039       FAc      0.10                  0.115                 0.12            1.04
Adeh3811      HAD      0.33                  0.40                  25              62.76
Bpro0530      HAD      0.42                  0.53                  29              55.15
Bpro4516      HAD      0.16                  0.19                  64              331.75
DehH1         FAc      9.2 (ref. 25)                               0.36            0.039
FAc-DEX FA1   FAc      11 (ref. 24)                                2.31            0.21
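The relative rate column of Table 4.7 (headed ClAc/FAc) can be reproduced approximately from the rounded activities printed in the table. The short Python sketch below does this; the dictionary layout and function name are illustrative assumptions. Because it starts from rounded table entries, its results differ slightly from the tabulated ratios, which were presumably computed from unrounded data.

```python
# Activities in umol/(min*mg protein), copied from Table 4.7:
# (upper estimate against FAc, estimated activity against ClAc)
activities = {
    "Bpro4478": (0.41, 0.041),
    "PCC0039": (0.115, 0.12),
    "Adeh3811": (0.40, 25),
    "Bpro0530": (0.53, 29),
    "Bpro4516": (0.19, 64),
}


def relative_rate(fac_upper, clac):
    """Ratio of ClAc activity to the upper-estimate FAc activity."""
    return clac / fac_upper


relative = {name: relative_rate(*pair) for name, pair in activities.items()}
for name, value in relative.items():
    print(f"{name}: {value:.2f}")
```

A ratio near 1 (PCC0039) indicates similar turnover of both substrates, while ratios in the tens to hundreds (the HADs) indicate a strong preference for ClAc.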

FAc dehalogenases exhibited varied activity toward chloroacetate. PCC0039 showed approximately the same activity against ClAc and FAc at equal concentrations, while Bpro4478 had a tenfold lower activity against ClAc than against FAc. There was difficulty in obtaining consistent dechlorination data for BC2964 and it was not pursued further. The literature offers little specific guidance for comparison, but FAc dehalogenases characterized thus far have been observed to have lower activity against ClAc than against FAc.

4.5 Rate estimate of 1,2-DCA dechlorination by Jann2620

Jann2620 was the only HAn dehalogenase found among 13 putative targets. To confirm its dechlorination activity, 20 µg/mL of enzyme was incubated with 40 mM 1,2-DCA in HEPES pH 8.0. Chloride clearly increases over the course of the assay, as observed by the increase in chloride peak area at successive sample timepoints (Figure 4.6). The specific activity was calculated to be 6 × 10⁻² µmol Cl⁻/(min·mg protein) at this concentration.
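A specific activity of this kind follows from the chloride released over a time interval, normalized to the protein concentration in the assay. The Python sketch below shows the unit conversion only; the function name and the example reading (39.6 µM chloride over 33 min) are hypothetical and are not measurements from this work.

```python
def specific_activity(delta_halide_uM, minutes, enzyme_ug_per_mL):
    """Specific activity in umol halide/(min*mg protein).

    delta_halide_uM  : halide released over the interval (uM = nmol/mL)
    minutes          : length of the interval
    enzyme_ug_per_mL : protein concentration in the assay
    """
    nmol_per_mL_per_min = delta_halide_uM / minutes
    mg_per_mL = enzyme_ug_per_mL / 1000.0
    nmol_per_min_per_mg = nmol_per_mL_per_min / mg_per_mL
    return nmol_per_min_per_mg / 1000.0  # nmol -> umol


# Hypothetical reading: 39.6 uM chloride released in 33 min at 20 ug/mL enzyme.
example = specific_activity(39.6, 33, 20)
print(round(example, 3))
```

With these illustrative inputs the result is of the same order as the value reported for Jann2620, which shows how small the chloride increase is at this protein concentration.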

In comparison, under the same reaction conditions (20 µg/mL, 40 mM 1,2-DCA, HEPES pH 8.0), DhlA of Xanthobacter autotrophicus, whose preferred substrate is 1,2-DCA (as defined by its higher kcat/Km compared to all other tested substrates), saturated the spectrophotometric assay by the first time point at t = 11 min, with greater than 500 nmol Cl⁻ production. Because the spectrophotometric Cl⁻ measurement method requires sacrificial consumption of assay samples, the samples could not be separated by ion chromatography. However, based on this information, the lower limit of DhlA's specific activity is calculated to be 4.5 µmol Cl⁻/(min·mg protein). This compares with the published specific activity of 11 µmol Cl⁻/(min·mg protein).

[Figure 4.6 chromatograms: four panels at t = 0 min, t = 11 min, t = 23.5 min and t = 33 min; signal in µS against time (0 to 20 minutes), with the Cl⁻ peak marked in each panel.]

Figure 4.6: Determination of dechlorination rate of 1,2-DCA by Jann2620. Green arrows denote chloride peak at each time point.

4.6 Discussion

This work is an extension of previous work to identify key residues or sequence-level identifiers of true dehalogenases in order to distinguish them from other similar sequences. An initial screen performed by Chan (16) tested 118 putative α/β-hydrolases and HAD-superfamily hydrolases, and found seven with dehalogenase activities (success rate: 6%), while the others were phosphatases, esterases, thioesterases, or exhibited no activity against the test substrates. This work provided substantial guidance for the refined target selection conducted here.

The organisms selected for study originated from soil and water environments. The rationale for this is that these environments are more likely to have had exposure to halogenated compounds from intentional application (e.g. pesticides from agricultural soil) and accidental release. Only one organism examined (Jannaschia sp. CCS1) was isolated from salt water, another environment rich in chlorinated and brominated compounds.

From these organisms, 33 putative dehalogenases were identified, of which 27 were successfully expressed. Eleven tested positive for one of the dehalogenase activities, for an overall screening success rate of 40%. For the individual families, the success rates were:

• Haloacid dehalogenase: 6/7 (86%)

• Haloalkane dehalogenase: 1/12 (8.3%)

• Fluoroacetate dehalogenase: 4/8 (50%)
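The success rates listed above can be checked with a few lines of Python. The counts are taken directly from the list; the variable and function names are illustrative. The per-family totals add up to the 27 expressed targets and 11 positives, and the overall ratio 11/27 is about 40.7%, quoted above as 40%.

```python
# (positives, expressed targets) per family, as listed above.
families = {
    "Haloacid dehalogenase": (6, 7),
    "Haloalkane dehalogenase": (1, 12),
    "Fluoroacetate dehalogenase": (4, 8),
}


def percent(hits, total):
    """Success rate as a percentage."""
    return 100.0 * hits / total


total_hits = sum(hits for hits, _ in families.values())
total_tested = sum(total for _, total in families.values())

for name, (hits, total) in families.items():
    print(f"{name}: {hits}/{total} = {percent(hits, total):.1f}%")
print(f"Overall: {total_hits}/{total_tested} = {percent(total_hits, total_tested):.1f}%")
```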

[Figure 4.7 alignment image: L-DEX YL, DhlB, Adeh3811, Bpro0530, Bpro4516, BC2051, GMI1362, Jann1658 and BC6682.]

Figure 4.7: Alignment of all HAD dehalogenase targets. Cross denotes nucleophile; circles denote catalytically critical residues.

4.6.1 Confirmation of HAD annotations

Notably, we were more successful in selecting true HADs than the other dehalogenases, particularly haloalkane dehalogenases. The specific reason for this is unclear; however, based on the previous work it was known a priori that the amino acids immediately downstream of the nucleophile of HADs are important for activity. Essentially all confirmed HAD-superfamily phosphatases have a DxD motif (x = any amino acid) in strand β1 (36); the second Asp in the motif is conspicuously absent from all characterized HAD-family dehalogenases, including all those found in this research. The backbone atoms of the second Asp are known to be involved in Mg²⁺ binding; no known HAD-family dehalogenases require divalent ion cofactors, while many HAD-family phosphatases require Mg²⁺ or other divalent cations for enzymatic activity (21).

Instead of DxD, nearly all positive HADs identified in this study contained a nucleophilic DxY motif (see Figure 4.7 for alignment), which is present in both previously characterized HADs, DhlB (DAY) and L-DEX YL (DLY). Five of six positive HADs contained one of these varieties. One target (BC2051) had a Phe in place of a Tyr in the third motif position; since both Phe and Tyr have aromatic side chains, this mutation is the most biochemically conservative possible. Although only crystallized HADs were used as BLAST search templates in this research, BC2051 is unique in having this mutation when compared with the nine biochemically verified HADs listed in SwissProt.

Based on the results of protein assays, the absence of the second Asp appears to be a necessary condition, but is insufficient by itself to absolutely determine haloacid dehalogenase activity. This is unsurprising since by examining only the nucleophilic motif, one ignores many other residues which have been identified as critical. Indeed, a putative HAD-superfamily protein from Saccharomyces cerevisiae containing DAY did not exhibit dechlorination activity with ClAc in the screening assay (data not shown).
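The motif rule discussed above (DxD in phosphatases, DxY or the conservative DxF variant in the confirmed HADs) can be written as a simple pattern check. The Python sketch below is a hypothetical helper, not a tool used in this work; it takes the three residues beginning at the nucleophilic Asp and treats a DxY/DxF match only as grounds for candidacy, since the yeast DAY protein mentioned above was inactive. The inputs DAF and DMD are made-up examples.

```python
import re

# Patterns over the tripeptide that starts at the nucleophilic Asp.
PHOSPHATASE_LIKE = re.compile(r"^D.D$")       # DxD: second Asp present
DEHALOGENASE_LIKE = re.compile(r"^D.[YF]$")   # DxY, or the conservative DxF


def classify_nucleophile_motif(tripeptide):
    """Label a three-residue motif; a match is a hint, not proof of activity."""
    motif = tripeptide.upper()
    if PHOSPHATASE_LIKE.match(motif):
        return "phosphatase-like (DxD)"
    if DEHALOGENASE_LIKE.match(motif):
        return "candidate dehalogenase (DxY/DxF)"
    return "unclassified"


# DAY (DhlB) and DLY (L-DEX YL) are from the text; DAF and DMD are made up.
for motif in ["DAY", "DLY", "DAF", "DMD"]:
    print(motif, "->", classify_nucleophile_motif(motif))
```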

Table 4.8: Pairwise alignment statistics of HADs, identical/positive residues. Dark shading denotes positive target; light shading denotes negative target; na = no alignment.

Protein    DhlB   L-DEX YL  Adeh3811  BC2051  Bpro0530  Bpro4516  GMI1362  Jann1658  BC6682
DhlB       -      47/63     42/54     42/59   51/67     37/53     43/60    42/57     26/43
L-DEX YL   -      -         45/57     42/58   54/72     42/56     46/59    40/52     25/38
Adeh3811   -      -         -         44/56   39/57     39/53     47/58    51/63     24/38
BC2051     -      -         -         -       40/60     49/64     59/72    38/53     26/42
Bpro0530   -      -         -         -       -         42/55     44/60    36/54     26/44
Bpro4516   -      -         -         -       -         -         57/70    39/50     na
GMI1362    -      -         -         -       -         -         -        42/57     24/39
Jann1658   -      -         -         -       -         -         -        -         28/43
BC6682     -      -         -         -       -         -         -        -         -

The single putative HAD which was determined not to be a dehalogenase (BC6682) did have a number of mutations not present in the others, and they may have served to eliminate activity. In addition to the comparatively lower sequence identity between this and the positive dehalogenases (near 25% versus above 35% for the others; see Table 4.8), three mutations of catalytically critical amino acids (37) are present (L-DEX YL coordinates): A41F, Y157F and N173S. If indeed an amino-acid substitution caused the loss of dehalogenase activity in an otherwise well-conserved sequence, the first and third mutations would appear the more likely causes, owing to the more drastic biochemical changes they impart. One additional feature that is unique to this target is the presence of cysteine in the second position of the DxY motif; though this residue has not been shown to be important for activity, it is possible that the high side chain reactivity of Cys (as opposed to the relatively inert aliphatic side chains of Ala and Leu) may have influenced the tertiary protein structure and thereby affected activity.

HADs and fluoroacetate reactivity

Even though the L-2-haloacetates are structurally similar, there have been no reports of activity against fluoroacetate by HADs. The discovery of HADs capable of hydrolyzing fluoroacetate (herein referred to as HAD-FAcs) demonstrates that this dehalogenation reaction is not limited to α/β-hydrolase type enzymes such as FAc-DEX FA1. Alignments amongst the HAD-FAcs do not reveal higher similarity amongst themselves, nor are there any mutations of conserved residues that are not present in another non-FAc degrading HAD. The three residues of the halide-binding cradle (Arg39, Asn115, Phe175), which are responsible for supporting the leaving halide, appear identical to those of L-DEX YL in all three cases of HAD-FAcs, although one might reasonably expect bulkier residues owing to the presence of a smaller halide in fluoroacetate. It is possible that the residues affording fluoroacetate degradation activity are not residues currently identified as present in the active site.

[Figure 4.8 tree image; legible labels include Bpro0530, Bpro4516, BC2051, BC6682 and OUTGROUP.]

Figure 4.8: Unrooted phylogenetic tree of HAD targets. Positives underlined.

Qualitatively, the rate of defluorination by HAD-FAcs was lower than their respective rates of dechlorination and debromination, as observed by the overnight incubation required to reveal this activity. This was verified by the preliminary kinetic characterization, which showed specific activities roughly 55 to 330 times lower than with chloroacetate (Table 4.7).

The activity difference is not surprising for two reasons. First, the C-F bond is significantly stronger and is expected to be more difficult to break than other C-X bonds (22). Secondly, a similar difference in activity has been observed with FAc dehalogenases: their activity is severalfold higher against the native substrate than against the other haloacetates.

Given that the C-F bond is significantly stronger than other carbon-halide linkages, it appears that there may be a physical feature of fluoroacetate dehalogenases that makes fluoroacetate, although comparatively more recalcitrant, a better substrate for them than chloroacetate. By the same reasoning, the much lower activity of HAD-FAcs against FAc (as compared with ClAc) may suggest that there is no particular physical feature that lends them to react with FAc; rather, it may be a fortuitous configuration of the halide-binding residues that renders the binding site slightly smaller compared with regular, non-FAc degrading HADs.

Elucidation of the physical basis for fluoroacetate activity in HADs will likely require structural visualization of these enzymes soaked in different halides, crystallization of reaction intermediates or homology modelling against existing HAD structures.

4.6.2 Confirmation of HAn dehalogenase annotations

Although nearly half of the 27 expressed targets were putative haloalkane dehalogenases, only one (Jann2620) was found to be a true dehalogenase in this study based on screening with 1,2-dibromoethane, which is degraded by all previously characterized haloalkane dehalogenases, and 1,2-dichloroethane, for which hydrolytic degradation has rarely been observed. In light of the screening results, the conditions for the selection of targets may have been overly relaxed, or the screening compounds selected did not have sufficient chemical diversity; most likely it was a combination of both factors. The principal selection criteria employed were the presence of (a) the nucleophilic Asp residue and (b) the His general base. The presence of the catalytic acid Asp260 (DhlA coordinates) was not enforced because it has been observed to take at least two different positions (38).

The quality of target alignments against known dehalogenases was low: alignments typically had long gaps, or extended for only a portion (40-50%) of the total protein length. Those that did have sizable alignments saw low percentages of amino acid identity (see Table 4.9). The sole positive HAn dehalogenase identified here (Jann2620) was aligned essentially through the whole protein, had 30% identity with DhlA, but above 40% identity with both LinB and DhaA. The remaining (negative) targets generally had identities around or below 30% against the confirmed HAn dehalogenases and/or had short alignments covering under 50% of the total protein length. However, in general the negatives each had at least one case in which the alignment was sufficiently long as to warrant testing.

[Figure 4.9 alignment image: DhlA, DhaA, LinB, Jann2620, Bpro2447 and BC4182.]

Figure 4.9: Alignment of positive HAn dehalogenases and closely related negatives. Circles indicate catalytically important residues.

Table 4.9: (a) Pairwise alignment statistics (identity/positive) of Jann2620 with crystallized HAn dehalogenases; (b) Pairwise alignment statistics of enzymes from (a) against all negative HAn targets. Dark shading denotes positive target; light shading denotes negative target. * = alignments not longer than 150 residues; ** = alignments not longer than 120 residues.

(a)
Protein    Jann2620  DhlA     LinB     DhaA
Jann2620   -         30/44    46/61    48/64
DhlA       -         -        37/50    32/46
LinB       -         -        -        47/62
DhaA       -         -        -        -

(b)
Protein    Jann2620  DhlA     LinB     DhaA
Adeh0522   23/39     23/38    26/38    24/36
AVOP5010   28/43     36/53**  34/50*   27/41
BC3948     24/44     28/47    25/43    28/44
BC4182     30/43     27/41    27/43    45/66**
Bpro0521   22/39     35/55*   31/50*   25/42
Bpro0547   31/47     36/51**  34/47*   41/58**
Bpro2447   31/44     48/62    35/47    46/56*
PCC1353    27/42     26/43    28/45    31/48
PCC4221    21/39     24/41    33/49**  23/40
GMI1770    27/41     30/44    28/40    26/40
PF0960     32/53**   29/45**  34/51**  33/51**
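Two quantities carry the argument about alignment quality here: the percentage of identical residues and the fraction of the protein that the alignment covers. The Python sketch below shows one way to compute both from a gapped pairwise alignment. It is a simplified illustration with made-up toy sequences and a hypothetical function name, and it is not the BLAST implementation itself; identity is taken over all alignment columns, including gapped ones.

```python
def alignment_stats(aligned_query, aligned_subject, query_length):
    """Percent identity over alignment columns and percent of the query covered.

    Both aligned strings must have equal length and use '-' for gaps.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    columns = len(aligned_query)
    identical = sum(
        1 for q, s in zip(aligned_query, aligned_subject) if q == s and q != "-"
    )
    query_residues = sum(1 for q in aligned_query if q != "-")
    identity = 100.0 * identical / columns
    coverage = 100.0 * query_residues / query_length
    return identity, coverage


# Toy example (made-up sequences): a 10-column alignment from a 20-residue query.
identity, coverage = alignment_stats("MKT-AILGDA", "MRTQAI-GDS", query_length=20)
print(f"{identity:.0f}% identity, {coverage:.0f}% coverage")
```

A target can therefore score a respectable identity over a short stretch while covering less than half of the protein, which is the situation described for most of the negative HAn targets.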

[Figure 4.10 tree image; legible labels include BV5158, GMI1770, PF0960 and OUTGROUP.]

Figure 4.10: Unrooted phylogenetic tree of HAn targets. Positives underlined. Closely related targets, as referred to in text, are circled.

Phylogenetic analysis suggests that two of the inactive targets (BC4182, Bpro2447) are more closely related than others to true haloalkane dehalogenases (Figure 4.10). Upon closer examination of the conserved residues in these two cases, BC4182 contains notable mutations: a Phe (instead of Trp) in the location of the first halide-binding residue, and Ile (instead of Trp or Phe) in the location of the second halide-binding residue. It also contains a 6 amino acid insertion that is absent from the dehalogenases, between helices ad and al. By sequence homology, this insertion is expected to be part of the cap domain, a region expected to show low similarity as it is the primary determinant of substrate specificity.

Figure 4.11: Domain structure annotation of Bpro2447. From Sanger Institute, http://pfam.sanger.ac.uk

The situation with Bpro2447 is more complicated: although its putative catalytic residues are identical to those of DhlA (Asp-His-Asp catalytic triad + Trp-Trp halide pocket), at 50 kDa it is significantly larger than other HAn dehalogenases (35 kDa). The extra mass is attributable to 150 residues at the N-terminus, appearing before the putative α/β-hydrolase fold. Despite recognition of the hydrolase domain, Bpro2447 is annotated as "CMP/dCMP deaminase, zinc-binding", presumably based on the N-terminal recognition (Figure 4.11). BLAST search of this target on UniProt reveals the annotation is taken from other electronic annotations of other two-domain proteins, all of which are predicted (and therefore unconfirmed) to exist. The highest-scoring characterized protein match is, in fact, DhlA.
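The attribution of the extra mass to the N-terminal residues can be sanity-checked with the common rule of thumb of roughly 110 Da per amino acid residue. That average is an assumption introduced here, not a figure from this work, and the function name is illustrative.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average; an assumption


def approx_mass_kda(n_residues):
    """Rough protein mass in kDa from a residue count."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


extra_domain_kda = approx_mass_kda(150)   # the additional N-terminal residues
observed_difference_kda = 50 - 35         # Bpro2447 versus other HAn dehalogenases
print(extra_domain_kda, observed_difference_kda)
```

The estimate of about 16.5 kDa for 150 residues is close to the 15 kDa difference between Bpro2447 and typical HAn dehalogenases, which is consistent with the extra domain accounting for the size difference.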

Although Bpro2447 did not test as a true dehalogenase against 1,2-DBA and 1,2-DCA, the possibility remains that it is active against halogenated compounds not tested here. However, this case does highlight the difficulty of predicting activity for proteins with multiple domains.

Jann2620 - a new 1,2-DCA degrading enzyme

Jann2620, from Jannaschia sp. CCS1, was found to have activity against the haloalkanes 1,2-DBA and 1,2-DCA.

Referring to the alignment with previously characterized HAn dehalogenases (Figure 4.9), the main catalytic residues are Asp133, His304 and Glu157. Both the identity and the topological positioning of the catalytic acid match those in LinB and DhaA. Pairwise alignments of Jann2620 with existing HAn dehalogenases suggest that these three HAn dehalogenases are more closely related to each other than to DhlA. The phylogenetic tree supports this reasoning.

From repeated colourimetric screenings, it was observed that this enzyme's activity against 1,2-DBA is far higher than that against 1,2-DCA. Using a saturated solution of 1,2-DCA in water as the reaction stock solution (expected concentration: 88 mM), the specific activity was determined to be approximately 6 × 10⁻² µmol Cl⁻/(min·mg protein) with 40 mM 1,2-DCA. This specific activity is between that of DhlA (10 µmol Cl⁻/(min·mg protein); from (11)) and DhmA (1.34 × 10⁻³ µmol Cl⁻/(min·mg protein); from (12)), an uncrystallized and relatively unstable dehalogenase from Mycobacterium avium N85. Another report indicates that LinB of S. paucimobilis has activity toward 1,2-DCA comparable to that of DhmA (39).

This dechlorination activity of Jann2620 appears to be somewhat higher than that of some previously characterized enzymes, but does not approach that of DhlA, for which 1,2-DCA is considered the natural substrate.
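To put the three specific activities quoted above on a common footing, the fold differences can be computed directly. The values in the Python sketch below are those given in the text; the variable names are illustrative.

```python
# Specific activities with 1,2-DCA in umol Cl-/(min*mg protein), as quoted in the text.
activity = {"DhlA": 10.0, "Jann2620": 6e-2, "DhmA": 1.34e-3}

fold_below_dhla = activity["DhlA"] / activity["Jann2620"]
fold_above_dhma = activity["Jann2620"] / activity["DhmA"]

print(f"Jann2620 is about {fold_below_dhla:.0f}-fold below DhlA")
print(f"Jann2620 is about {fold_above_dhma:.0f}-fold above DhmA")
```

On these figures Jann2620 sits a little over two orders of magnitude below DhlA and between one and two orders of magnitude above DhmA.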

4.6.3 Confirmation of fluoroacetate dehalogenase annotations

Out of eight putative FAc dehalogenase targets, four were determined to be active against fluoroacetate.

The disparity between the relative success of finding FAc dehalogenases versus HAn dehalogenases is surprising, and the reason for this is unclear. They are members of the same α/β-hydrolase superfamily, and there are actually fewer characterized examples of FAc dehalogenases (two, as opposed to three, and only one crystal structure). The selection criteria used for FAc dehalogenases were identical; that is, the presence of the nucleophilic Asp and the general base His. However, BLAST alignments of the FAc dehalogenase targets tended to be better, both because they had higher levels of residue conservation (identical and/or chemically similar residues) and because they extended through the majority of the protein.

[Figure 4.12 tree image; legible labels include OUTGROUP, DehH1 and Bpro3067.]

Figure 4.12: Unrooted phylogenetic tree of FAc dehalogenase targets. Positives underlined.

Phylogenetically, both positive and negative targets were more closely related to each other than to the epoxide hydrolase outgroup (Figure 4.12), but the positives are not clearly separated from the negatives. All targets contain the catalytically critical Asp105 nucleophile, His272 general base and Arg106 putative fluoride binding residue (all DehH1 coordinates) (Figure 4.13). All except one (PF4714) contain Arg109, which is speculated to bind the leaving group fluoride. In arginine's place in PF4714 is Tyr which, unlike Arg, is uncharged but could possibly (though less effectively) bind halide at the edges of its aromatic side chain, similar to the second halide-binding residue in LinB and DhaA haloalkane dehalogenases.

Regarding overall sequence similarity between the selected targets, there does not appear to be a marked difference between positive and negative targets (see Table 4.10). With the sole exception of PF4714, all negative sequences had identities above 40% with confirmed dehalogenases, with some similarities above 50%. Positives have much the same similarity characteristics amongst themselves. Based on the sequence comparison alone, it appears difficult to distinguish between true FAc dehalogenases and related proteins inactive against fluoroacetate.

This difficulty is particularly evident in the case of BV3053: it has a near-perfect 92% identity with BC2964 and well above 40% identity with other positive targets, but it did not show activity against FAc in multiple screening assays. The few differences in primary sequence have not affected the critical residues noted above, all of them having been absolutely conserved. One possibility is that BV3053 is not folded properly in E. coli, and its production and purification should be attempted in other expression systems. Alternatively, if alternate purification and assay arrangements do not reveal activity, BV3053 could serve as a useful starting point for site-directed mutagenesis experiments which may reveal additional critical residue(s) involved in haloacetate dehalogenation by FAc dehalogenases.

[Figure 4.13 alignment image: FAc-DEX FA1, DehH1, BC2964, Bpro4478, Daro3835, PCC0039, BV3053, GMI0256, Bpro3067 and PF4714.]

Figure 4.13: Alignment of FAc dehalogenase targets. Circles indicate residues speculated to be catalytically important.

Table 4.10: Pairwise alignment statistics of FAc dehalogenase targets, identical/positive residues. Dark shading denotes positive target; light shading denotes negative target.

Protein       FAc-DEX FA1  DehH1   PCC0039  BC2964  Daro3835  Bpro4478  GMI0256  BV3053  Bpro3067  PF4714
FAc-DEX FA1   -            63/72   46/62    44/57   40/57     42/55     45/57    44/56   42/54     39/54
DehH1         -            -       45/63    44/59   42/59     39/55     45/58    45/58   41/55     36/50
PCC0039       -            -       -        57/71   55/71     48/63     57/71    55/69   53/67     41/54
BC2964        -            -       -        -       47/63     50/64     67/77    92/95   60/70     38/53
Daro3835      -            -       -        -       -         43/59     49/63    47/62   48/63     40/54
Bpro4478      -            -       -        -       -         -         48/60    48/62   47/62     34/52
GMI0256       -            -       -        -       -         -         -        66/76   62/72     41/57
BV3053        -            -       -        -       -         -         -        -       59/71     39/54
Bpro3067      -            -       -        -       -         -         -        -       -         40/53
PF4714        -            -       -        -       -         -         -        -       -         -

The specific activities of FAc dehalogenases discovered here were at least one order of magnitude lower than those of FAc-DEX FA1 and DehH1 (see Table 4.7). However, because Km values were not investigated for these proteins, it is possible that these dehalogenases have Km's much greater than 10 mM, the concentration of substrate used in the assays. This allows for the possibility of a higher maximum turnover rate than was determined here.

Chapter 5

Conclusions

The original intent of this work was to test the annotation of dehalogenases, with the general objective of developing rules for distinguishing biochemically active dehalogenases from inactive homologues. Of 27 expressed putative dehalogenases, selected only with consideration of a limited number of conserved residues, 11 previously uncharacterized proteins were confirmed to be active against a limited test suite of halogenated compounds. The others remain uncharacterized but cannot be disregarded as potential dehalogenases against other compounds not tested in this work.

The success rate of HAD discovery was markedly higher than that of the other dehalogenase types. HADs possess residues that are highly conserved across all known examples. The N-terminal nucleophilic motif DxY appears to predict dehalogenase activity quite accurately. Additionally, DxF, a conservative mutation of DxY, was observed in one positive dehalogenase. One target with a DCY motif did not prove to be a dehalogenase; however, its inactivity may be explained by the absence of other residues. Dehalogenase-like sequences with this N-terminus should not be excluded from further investigation for dehalogenase activity.
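The motif rule described in this paragraph can be written as a simple pattern check. The sketch below is illustrative only: the 30-residue N-terminal window, the function name and the three example sequences are assumptions invented for the illustration and are not taken from the screening carried out in this work.

```python
import re

# Assumption: the motif is searched for within the first 30 residues.
# This window size is illustrative, not a value from the thesis.
N_TERMINAL_WINDOW = 30

# D-x-Y is the motif discussed in the text; D-x-F is its conservative variant.
MOTIF = re.compile(r"D.[YF]")


def find_nucleophile_motif(sequence, window=N_TERMINAL_WINDOW):
    """Return (1-based position, matched motif) for the first DxY/DxF match
    in the N-terminal window, or None if there is no match."""
    match = MOTIF.search(sequence[:window].upper())
    if match is None:
        return None
    return match.start() + 1, match.group()


# Hypothetical sequences, made up for this example.
candidates = {
    "hypothetical_A": "MIKAVVFDAYGTLFDVQSVAD",
    "hypothetical_B": "MSRIKAIAFDLFGTLVDW",
    "hypothetical_C": "MTTPLKAVIWGAGNTLVEL",
}

for name, seq in candidates.items():
    print(name, find_nucleophile_motif(seq))
```

As the inactive DCY target shows, a match is best treated as a reason to keep a sequence in the candidate pool rather than as proof of activity.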

The second-highest success rate was achieved with FAc dehalogenases. Despite the lack of research into the structural and mechanistic details of dehalogenation by these enzymes (compared with HAD and HAn dehalogenases), their relatively high sequence identity (40-50%) appears to have made their identification easier. No clear delineation in the degree of similarity could be ascertained between positive and negative targets, nor were there clear differences in the residues. The automated annotations did not distinguish between targets that registered as FAc and HAn dehalogenases (both were annotated as α/β-hydrolases), but BLAST clearly distinguishes between them, since searches with one dehalogenase type produced only putative targets of the same type.

The search for HAn dehalogenases was the least successful, with only one positive identified. This rate of success does not offer any confidence in the effectiveness of this method for finding true HAn dehalogenases at this point, and the single success may simply be attributable to fortuitous selection. The sole positive, Jann2620 of Jannaschia sp. CCS1, was by far the most similar to the existing haloalkane dehalogenases; for the other (negative) targets, alignments were generally poorer and contained many gaps. To the credit of the existing database entries, descriptive annotations of putative HAn dehalogenases were in most cases unspecific and did not speculate beyond a proposed enzyme fold family, perhaps in recognition of the low sequence similarity between the targets and α/β-hydrolases generally, and characterized HAn dehalogenases specifically.

The general result of the biochemical verification exercise is that the level of sequence identity and similarity is useful and may be indicative of activity, but it is at best an imperfect predictor; this was particularly evident in the case of FAc dehalogenases. Factors such as the length and continuity of the alignment and the presence of critical catalytic residues can help to narrow the range of targets for which biochemical assays should be pursued. However, any requirement for specific residues should allow for some mutational flexibility, particularly for more conservative mutations. This two-stage approach (BLAST plus manual curation) has been shown here to enrich successfully for true dehalogenases among the genes selected from this set of organisms. Although each sequence was manually selected in this case, in principle the curation could be automated on the basis of protein families, as each family examined here has a unique set of conserved residues.
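One way the curation step might be automated is sketched below. It is a hypothetical illustration, not the procedure used in this work: the `Hit` record, the identity and coverage thresholds, and the position labels are all assumptions, and a real rule set would have to be defined separately for each protein family.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    name: str
    percent_identity: float  # identity reported for the BLAST alignment
    coverage: float          # aligned fraction of the query, from 0 to 1
    residues: dict           # critical position label -> residue in the target


# Hypothetical rule set for one family. Thresholds and labels are illustrative.
RULES = {
    "min_identity": 30.0,
    "min_coverage": 0.8,
    # Each critical position lists the residues accepted there, which leaves
    # room for conservative substitutions (for example Y or F).
    "allowed": {"nucleophile": {"D"}, "motif_third": {"Y", "F"}},
}


def passes_curation(hit, rules=RULES):
    """Stage two of the filter: similarity, alignment coverage, key residues."""
    if hit.percent_identity < rules["min_identity"]:
        return False
    if hit.coverage < rules["min_coverage"]:
        return False
    return all(
        hit.residues.get(position) in accepted
        for position, accepted in rules["allowed"].items()
    )


# Made-up hits for demonstration.
hits = [
    Hit("target_1", 45.0, 0.95, {"nucleophile": "D", "motif_third": "F"}),
    Hit("target_2", 52.0, 0.40, {"nucleophile": "D", "motif_third": "Y"}),
    Hit("target_3", 41.0, 0.90, {"nucleophile": "N", "motif_third": "Y"}),
]
print([hit.name for hit in hits if passes_curation(hit)])
```

Listing a set of accepted residues per position, rather than a single residue, reflects the point made above that residue requirements should tolerate conservative mutations.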

The success in finding HADs and FAc dehalogenases justifies some confidence that more precise annotations can be assigned automatically; for HADs, this is already done with a high level of accuracy. At this point, however, the curated search method clearly cannot serve as a surrogate for biochemical testing of haloalkane dehalogenases, and further testing of these will be necessary to establish refined search rules.

5.1 Contributions

• Discovery of 11 previously uncharacterized dehalogenases.

• Preliminary kinetic characterization of six HAD dehalogenases against chloroacetate.

• Confirmation of novel enzymatic activity of three HAD dehalogenases against fluoroacetate.

• Discovery of one new haloalkane dehalogenase with moderate activity against 1,2-DCA.

5.2 Future Work

• Expansion of the search space to newly available sequence information, in particular to dehalogenating organisms but also to organisms with no known dehalogenation activity from pristine locales. A comparison of the rates of dehalogenase discovery would indicate whether selective examination of organisms is useful or unnecessary.

• Crystallization and structural studies of HAD-FAcs, to study the structural features enabling activity with fluoroacetate.

• Broad-range substrate testing of all purified enzymes, to further reduce the chance of false negatives and to identify the actual biochemical activity (if any) of the negatives. This would include β-halogenated alkanes, cyclic alkanes, alkenes, and long-chain and polyhalogenated acids. Metallic cofactors could be included in the screening buffers to further diversify and generalize the testing conditions.

• Screening of enzymes with non-halogenated compounds. Colourimetric screening for phosphatase, esterase and thioesterase activity can be conducted using techniques demonstrated in Kuznetsova et al. (40).

• Testing of DhlB and L-DEX YL with the same colourimetric assay method employed in this study to verify their kinetic properties, which were found to differ by up to two orders of magnitude from those of the new HADs examined in this study.

• Full kinetic characterization of 1,2-DCA dechlorination by Jann2620.

Appendices

Appendix A

Standard curves

A.1 Standard curves for spectrophotometric assay

[Figure: Spectrophotometric standard curves for Cl⁻ in HEPES pH 7.5, Tris pH 8, CHES pH 9, CHES pH 9.5, CAPS pH 10 and CAPS pH 10.5; x-axis: [Cl⁻] (µM).]

A.2 Standard curves for ion chromatographic assay

[Figure: Bromide calibration, 10-2000 µM. Linear fit: y = 1594241.649x, R² = 1.000; x-axis: ion concentration (µM).]

[Figure: Bromoacetate calibration, 10-2000 µM; x-axis: ion concentration (µM).]

[Figure: Chloride calibration, 10-2000 µM. Linear fit: y = 1560344.233x, R² = 1.000; x-axis: ion concentration (µM).]

[Figure: Chloroacetate calibration, 10-2000 µM. Linear fit: y = 1158918.792x, R² = 0.997; x-axis: ion concentration (µM).]
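As a worked illustration of how such through-origin calibrations are applied, the sketch below converts a detector response into a concentration by dividing by the fitted slope. The three slopes are those shown on the bromide, chloride and chloroacetate plots. The function name, the example response value, and the reading of y as detector response per µM are assumptions made for the example.

```python
# Slopes of the through-origin fits (detector response per µM of analyte).
CALIBRATION_SLOPES = {
    "bromide": 1594241.649,
    "chloride": 1560344.233,
    "chloroacetate": 1158918.792,
}

# Range covered by the calibration standards for these three analytes (µM).
CALIBRATED_RANGE_UM = (10.0, 2000.0)


def response_to_concentration(analyte, response):
    """Return (concentration in µM, True if inside the calibrated range)."""
    slope = CALIBRATION_SLOPES[analyte]
    concentration = response / slope
    low, high = CALIBRATED_RANGE_UM
    return concentration, low <= concentration <= high


# Hypothetical detector response, chosen only for demonstration.
concentration, in_range = response_to_concentration("chloride", 7.8e8)
print(f"{concentration:.1f} µM, within calibrated range: {in_range}")
```

Returning the range flag with the value makes it harder to report a concentration extrapolated beyond the standards without noticing.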

[Figure: Fluoride calibration curve, 10-1000 µM; x-axis: ion concentration (µM).]

[Figure: Fluoroacetate calibration, 10-500 µM; x-axis: ion concentration (µM).]

[Figure: Glycolate calibration, 10-100 µM; x-axis: ion concentration (µM).]

Appendix B

Kinetic data for HADs

B.1 Kinetics against chloroacetate

B.2 Kinetics against fluoroacetate


[Plot: BC2051 with ClAc; x-axis: [ClAc] (µM).]

Figure B.1: Michaelis-Menten curve for BC2051 from Burkholderia cenocepacia HI2424. Reaction conditions: 25 mM Tris-SO4 pH 8.5, 0.4 µg/mL protein, 30°C at the indicated substrate concentrations. Total reaction volume for each time point at each concentration was 1 mL, for a total of 4 mL for each point on the Michaelis-Menten curve.
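For readers who wish to reproduce this kind of analysis, the sketch below estimates Vmax and Km from initial-rate data using the Hanes-Woolf linearization, [S]/v = [S]/Vmax + Km/Vmax. This is an assumed approach and not necessarily the fitting procedure used for the curves in this appendix. The data points are synthetic, generated from made-up parameters, and the function names are illustrative.

```python
def michaelis_menten(s, vmax, km):
    """Michaelis-Menten rate law: v = Vmax * [S] / (Km + [S])."""
    return vmax * s / (km + s)


def hanes_woolf_fit(substrate, rates):
    """Fit a least-squares line to ([S], [S]/v) and return (Vmax, Km).

    For the fitted line, slope = 1/Vmax and intercept = Km/Vmax.
    """
    xs = list(substrate)
    ys = [s / v for s, v in zip(substrate, rates)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km


# Synthetic, noise-free data from made-up parameters (not measured values).
true_vmax, true_km = 50.0, 120.0
concentrations = [25, 50, 100, 200, 400, 800]  # µM
rates = [michaelis_menten(s, true_vmax, true_km) for s in concentrations]

vmax, km = hanes_woolf_fit(concentrations, rates)
print(f"Vmax = {vmax:.2f}, Km = {km:.2f} µM")
```

The same rate law shows why an assay at a single substrate concentration can understate the maximum rate: with a hypothetical Km of 40 mM, an enzyme assayed at 10 mM substrate would run at 10/(40 + 10), or 20%, of Vmax.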

[Plot: Bpro0530 with ClAc; x-axis: [ClAc] (µM).]

Figure B.2: Michaelis-Menten curve for Bpro0530 from Polaromonas sp. JS666. Reaction conditions: 25 mM Tris-SO4 pH 8.5, 0.1 µg/mL protein, 30°C at the indicated substrate concentrations. Total reaction volume for each time point at each concentration was 1 mL, for a total of 4 mL for each point on the Michaelis-Menten curve.

[Plot: Bpro4516 with ClAc; x-axis: [ClAc] (µM).]

Figure B.3: Michaelis-Menten curve for Bpro4516 from Polaromonas sp. JS666. Reaction conditions: 25 mM CHES-Na pH 9.5, 0.1 µg/mL protein, 30°C at the indicated substrate concentrations. Total reaction volume for each time point at each concentration was 1 mL, for a total of 4 mL for each point on the Michaelis-Menten curve.

[Plot: GMI1362 with ClAc; x-axis: [ClAc] (µM).]

Figure B.4: Michaelis-Menten curve for GMI1362 from Ralstonia solanacearum GMI1000. Reaction conditions: 25 mM CAPS pH 10, 0.05 µg/mL protein, 30°C at the indicated substrate concentrations. Total reaction volume for each time point at each concentration was 1 mL, for a total of 4 mL for each point on the Michaelis-Menten curve.

[Plot: Jann1658 with ClAc; x-axis: [ClAc] (µM).]

Figure B.5: Michaelis-Menten curve for Jann1658 from Jannaschia sp. CCS1. Reaction conditions: 25 mM Tris-NO3 pH 8.5, 0.1 µg/mL protein, 30°C at the indicated substrate concentrations. Total reaction volume for each time point at each concentration was 1 mL, for a total of 4 mL for each point on the Michaelis-Menten curve.

[Plot: Adeh3811 defluorination, 10 mM FAc; x-axis: reaction time (min).]

Figure B.6: Defluorination of FAc by Adeh3811, single trial only. Reaction conditions: 25 mM HEPES-Na pH 8.0, 20 µg/mL protein, 10 mM fluoroacetate, 30°C. Reaction volume at each time point was 1 mL.
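A time course of this kind can be reduced to a specific activity by taking the slope of product concentration against time and dividing by the protein concentration. The sketch below shows the arithmetic, assuming the release is linear over the sampled interval. The time points and fluoride concentrations are synthetic and are not read from the figure; only the 20 µg/mL protein concentration comes from the caption, and the function names are illustrative.

```python
def linear_slope(xs, ys):
    """Ordinary least-squares slope of y against x (intercept is fitted)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def specific_activity(times_min, product_um, protein_ug_per_ml):
    """Return specific activity in µmol product per min per mg protein.

    The rate is in µM/min, i.e. µmol/(L min). A protein concentration in
    µg/mL is numerically equal to mg/L, so the ratio is µmol/(min mg).
    """
    rate_um_per_min = linear_slope(times_min, product_um)
    return rate_um_per_min / protein_ug_per_ml


# Synthetic time course (not data from the figure): fluoride released in µM.
times = [0, 50, 100, 150]        # min
fluoride = [0, 100, 200, 300]    # µM
print(specific_activity(times, fluoride, protein_ug_per_ml=20.0))
```

Because each curve in this section comes from a single trial, a value obtained this way is best regarded as a preliminary estimate.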

[Plot: Bpro0530 defluorination, 10 mM FAc; x-axis: reaction time (min).]

Figure B.7: Defluorination of FAc by Bpro0530, single trial only. Reaction conditions: 25 mM HEPES-Na pH 8.0, 20 µg/mL protein, 10 mM fluoroacetate, 30°C. Reaction volume at each time point was 1 mL.

[Plot: Bpro4516 defluorination, 10 mM FAc; x-axis: reaction time (min).]

Figure B.8: Defluorination of FAc by Bpro4516, single trial only. Reaction conditions: 25 mM HEPES-Na pH 8.0, 20 µg/mL protein, 10 mM fluoroacetate, 30°C. Reaction volume at each time point was 1 mL.

References

[1] Gordon W Gribble. Eurochlor Science Dossier: Natural Organohalogens, 2004. URL http://www.eurochlor.org/upload/documents/document67.pdf.

[2] Michiel Kotterman, Ike van der Veen, Judith van Hesselingen, Pim Leonards, Ronald Osinga, and Jacob de Boer. Preliminary study on the occurrence of brominated organic compounds in Dutch marine organisms. Biomolecular Engineering, 20(4-6):425-427, July 2003. URL http://www.sciencedirect.com/science/article/B6VRM-492W13D-1/2/b9ed0d02a482dde993dc

[3] Ontario. Ontario Drinking Water Quality Standards, O. Reg 169/03. In Statutes of Ontario, 2003.

[4] Committee on Ground Water Cleanup Alternatives, National Research Council, editor. Alternatives for Ground Water Cleanup. National Academies Press, 1994.

[5] Charles W. Fetter. Contaminant Hydrogeology. Prentice Hall, 1993.

[6] Derek R. Lovley. Cleaning up with genomics: applying molecular biology to bioremediation. Nat Rev Micro, 1(1):35-44, October 2003. ISSN 1740-1526. URL http://dx.doi.org/10.1038/nrmicro731.

[7] Genomes Online. Internet resource. URL http://www.genomesonline.org/gold.cgi.

[8] Andreas D. Baxevanis and B. F. Francis Ouellette, editors. Bioinformatics: A practical guide to the analysis of genes and proteins. John Wiley and Sons, 2005.

[9] Ross Overbeek, Tadhg Begley, Ralph M. Butler, Jomuna V. Choudhuri, Han-Yu Chuang, Matthew Cohoon, Valerie de Crecy-Lagard, Naryttza Diaz, Terry Disz, Robert Edwards, Michael Fonstein, Ed D. Frank, Svetlana Gerdes, Elizabeth M. Glass, Alexander Goesmann, Andrew Hanson, Dirk Iwata-Reuyl, Roy Jensen, Neema Jamshidi, Lutz Krause, Michael Kubal, Niels Larsen, Burkhard Linke, Alice C. McHardy, Folker Meyer, Heiko Neuweger, Gary Olsen, Robert Olson, Andrei Osterman, Vasiliy Portnoy, Gordon D. Pusch, Dmitry A. Rodionov, Christian Ruckert, Jason Steiner, Rick Stevens, Ines Thiele, Olga Vassieva, Yuzhen Ye, Olga Zagnitko, and Veronika Vonstein. The Subsystems Approach to Genome Annotation and its Use in the Project to Annotate 1000 Genomes. Nucl. Acids Res., 33(17):5691-5702, 2005. doi: 10.1093/nar/gki866. URL http://nar.oxfordjournals.org/cgi/content/abstract/33/17/5691.

[10] S Fetzner and F Lingens. Bacterial dehalogenases: biochemistry, genetics, and biotechnological applications. Microbiol. Mol. Biol. Rev., 58(4):641-685, 1994. URL http://mmbr.asm.org/cgi/content/abstract/58/4/641.

[11] S Keuning, D B Janssen, and B Witholt. Purification and characterization of hydrolytic haloalkane dehalogenase from Xanthobacter autotrophicus GJ10. J. Bacteriol., 163(2):635-639, 1985. URL http://jb.asm.org/cgi/content/abstract/163/2/635.

[12] Andrea Jesenska, Milan Bartos, Vladimira Czernekova, Ivan Rychlik, Ivo Pavlik, and Jiri Damborsky. Cloning and Expression of the Haloalkane Dehalogenase Gene dhmA from Mycobacterium avium N85 and Preliminary Characterization of DhmA. Appl. Environ. Microbiol., 68(8):3724-3730, 2002. doi: 10.1128/AEM.68.8.3724-3730.2002. URL http://aem.asm.org/cgi/content/abstract/68/8/3724.

[13] M. Holmquist. Alpha/Beta-hydrolase fold enzymes: structures, functions and mechanisms. Curr Protein Pept Sci, 1(2):209-235, Sep 2000.

[14] H. Robert Horton, Laurence A. Moran, Raymond S. Ochs, David J. Rawn, and K. Gray Scrimgeour. Principles of Biochemistry. Pearson Higher Education, 3rd edition, 2002.

[15] Zbynek Prokop, Marta Monincova, Radka Chaloupkova, Martin Klvana, Yuji Nagata, Dick B. Janssen, and Jiri Damborsky. Catalytic Mechanism of the Haloalkane Dehalogenase LinB from Sphingomonas paucimobilis UT26. J. Biol. Chem., 278(46):45094-45100, 2003. doi: 10.1074/jbc.M307056200. URL http://www.jbc.org/cgi/content/abstract/278/46/45094.

[16] W. Y. Chan. Identification of Improved Dehalogenase Sequence Fingerprints through Screening of Microbial Genomes for Dehalogenase Activities (draft title). Master's thesis, University of Toronto, 2008.

[17] Jiri Damborsky and Jaroslav Koca. Analysis of the reaction mechanism and substrate specificity of haloalkane dehalogenases by sequential and structural comparisons. Protein Eng., 12(11):989-998, 1999. doi: 10.1093/protein/12.11.989. URL http://peds.oxfordjournals.org/cgi/content/abstract/12/11/989.

[18] Y Nagata, K Miyauchi, J Damborsky, K Manova, A Ansorgova, and M Takagi. Purification and characterization of a haloalkane dehalogenase of a new substrate class from a gamma-hexachlorocyclohexane-degrading bacterium, Sphingomonas paucimobilis UT26. Appl. Environ. Microbiol., 63(9):3707-3710, 1997. URL http://aem.asm.org/cgi/content/abstract/63/9/3707.

[19] AN Kulakova, MJ Larkin, and LA Kulakov. The plasmid-located haloalkane dehalogenase gene from Rhodococcus rhodochrous NCIMB 13064. Microbiology, 143(1):109-115, 1997. URL http://mic.sgmjournals.org/cgi/content/abstract/143/1/109.

[20] Julian R. Marchesi and Andrew J. Weightman. Comparing the Dehalogenase Gene Pool in Cultivated alpha-Halocarboxylic Acid-Degrading Bacteria with the Environmental Metagene Pool. Appl. Environ. Microbiol., 69(8):4375-4382, 2003. doi: 10.1128/AEM.69.8.4375-4382.2003. URL http://aem.asm.org/cgi/content/abstract/69/8/4375.

[21] Karen N. Allen and Debra Dunaway-Mariano. Phosphoryl group transfer: evolution of a catalytic scaffold. Trends in Biochemical Sciences, 29(9):495-503, September 2004. URL http://www.sciencedirect.com/science/article/B6TCV-4D09GNK-1/2/6f13a0bd5db4c916ac91

[22] Francis Carey. Organic Chemistry. McGraw-Hill, 6th edition, 2005.

[23] Peter Goldman. The Enzymatic Cleavage of the Carbon-Fluorine Bond in Fluoroacetate. J. Biol. Chem., 240(8):3434-3438, 1965. URL http://www.jbc.org.

[24] Tatsuo Kurihara, Takahiro Yamauchi, Susumu Ichiyama, Hiroyuki Takahata, and Nobuyoshi Esaki. Purification, characterization, and gene cloning of a novel fluoroacetate dehalogenase from Burkholderia sp. FA1. Journal of Molecular Catalysis B: Enzymatic, 23:347-355, 2003.

[25] Ji-Quan Liu, Tatsuo Kurihara, Susumu Ichiyama, Masaru Miyagi, Susumu Tsunasawa, Haruhiko Kawasaki, Kenji Soda, and Nobuyoshi Esaki. Reaction Mechanism of Fluoroacetate Dehalogenase from Moraxella sp. B. J. Biol. Chem., 273(47):30897-30902, 1998. doi: 10.1074/jbc.273.47.30897. URL http://www.jbc.org/cgi/content/abstract/273/47/30897.

[26] Julie D. Thompson, Desmond G. Higgins, and Toby J. Gibson. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucl. Acids Res., 22(22):4673-4680, 1994. URL http://nar.oxfordjournals.org/cgi/content/abstract/22/22/4673.

[27] Vladimir Makarenkov. T-REX: reconstructing and visualizing phylogenetic trees and reticulation networks. Bioinformatics, 17(7):664-668, 2001. doi: 10.1093/bioinformatics/17.7.664. URL http://bioinformatics.oxfordjournals.org/cgi/content/abstract/17/7/664.

[28] p15TvL Vector information sheet. Clinical Genomics Centre.

[29] G Recorbet, C Robert, A Givaudan, B Kudla, P Normand, and G Faurie. Conditional suicide system of Escherichia coli released into soil that uses the Bacillus subtilis sacB gene. Appl. Environ. Microbiol., 59(5):1361-1366, 1993. URL http://aem.asm.org/cgi/content/abstract/59/5/1361.

[30] T. Maniatis, E.F. Fritsch, and J Sambrook. Molecular cloning: a laboratory manual. Cold Spring Harbor Laboratory Press, 2001.

[31] Paul Holloway, Jack T. Trevors, and Hung Lee. A colorimetric assay for detecting haloalkane dehalogenase activity. Journal of Microbiological Methods, 32(1):31-36, 1998.

[32] Iwaji Iwasaki, Satori Utsumi, and Takejiro Ozawa. New Colorimetric Determination of Chloride using Mercuric Thiocyanate and Ferric Ion. Bull Chem Soc Japan, 3:226, 1952.

[33] J Q Liu, T Kurihara, A K Hasan, V Nardi-Dei, H Koshikawa, N Esaki, and K Soda. Purification and characterization of thermostable and nonthermostable 2-haloacid dehalogenases with different stereospecificities from Pseudomonas sp. strain YL. Appl. Environ. Microbiol., 60(7):2389-2393, 1994. URL http://aem.asm.org/cgi/content/abstract/60/7/2389.

[34] J. van der Ploeg, G. van Hall, and D. B. Janssen. Characterization of the haloacid dehalogenase from Xanthobacter autotrophicus GJ10 and sequencing of the dhlB gene. J Bacteriol, 173(24):7925-7933, Dec 1991.

[35] BRENDA - The Comprehensive Enzyme Information System. Internet resource. URL http://www.brenda-enzymes.info/.

[36] A. Maxwell Burroughs, Karen N. Allen, Debra Dunaway-Mariano, and L. Aravind. Evolutionary Genomics of the HAD Superfamily: Understanding the Structural Adaptations and Catalytic Diversity in a Superfamily of Phosphoesterases and Allied Enzymes. Journal of Molecular Biology, 361(5):1003-1034, September 2006. URL http://www.sciencedirect.com/science/article/B6WK7-4KBWXK0-7/2/41ecc609ecb497cf83f1

[37] Tatsuo Kurihara, Ji-Quan Liu, Vincenzo Nardi-Dei, Hiromoto Koshikawa, Nobuyoshi Esaki, and Kenji Soda. Comprehensive Site-Directed Mutagenesis of L-2-Halo Acid Dehalogenase to Probe Catalytic Amino Acid Residues. J Biochem (Tokyo), 117(6):1317-1322, 1995. URL http://jb.oxfordjournals.org/cgi/content/abstract/117/6/1317.

[38] Dick B Janssen. Evolving haloalkane dehalogenases. Curr Opin Chem Biol, 8(2):150-159, Apr 2004. doi: 10.1016/j.cbpa.2004.02.012. URL http://dx.doi.org/10.1016/j.cbpa.2004.02.012.

[39] A.J. Oakley, Z. Prokop, M. Bohac, J. Kmunicek, T. Jedlicka, M. Monincova, I. Kuta-Smatanova, Y. Nagata, J. Damborsky, and M.C.J. Wilce. Exploring the Structure and Activity of Haloalkane Dehalogenase from Sphingomonas paucimobilis UT26: Evidence for Product- and Water-Mediated Inhibition. Biochemistry, 41(15):4847-4855, April 2002. ISSN 0006-2960. URL http://pubs3.acs.org/acs/journals/doilookup?in_doi=10.1021/bi015734z.

[40] Ekaterina Kuznetsova, Michael Proudfoot, Stephen A Sanders, Jeffrey Reinking, Alexei Savchenko, Cheryl H Arrowsmith, Aled M Edwards, and Alexander F Yakunin. Enzyme genomics: Application of general enzymatic screens to discover new enzymes. FEMS Microbiol Rev, 29(2):263-279, Apr 2005. doi: 10.1016/j.femsre.2004.12.006. URL http://dx.doi.org/10.1016/j.femsre.2004.12.006.