MASARYK UNIVERSITY FACULTY OF SCIENCE

DEPARTMENT OF EXPERIMENTAL BIOLOGY

LOSCHMIDT LABORATORIES

Molecular modeling of enzymes’ substrate specificity

Doctoral dissertation

Lukáš Daniel

Supervisors:

Prof. Mgr. Jiří Damborský, Dr.

Mgr. Jan Brezovský, Ph.D.

Brno 2016

Acknowledgements

I would like to thank my supervisor Jiří Damborský not only for making it possible for me to study at the Loschmidt Laboratories, but above all for his expert guidance during my academic maturation, for his encouragement, and for the willingness and time he devoted to our numerous discussions.

I am also very grateful to Jan Brezovský for his patience and for the valuable advice he provided during my studies.

I also thank all current and former colleagues who have shaped the Loschmidt Laboratories, both scientifically and personally, into a place that is very welcoming and inspiring for everyday work.

My greatest thanks, however, go to my parents and loved ones for their unfailing support and trust in both my studies and my personal life.

Bibliographic entry

Author: Mgr. Lukáš Daniel, Loschmidt Laboratories, Department of Experimental Biology, Faculty of Science, Masaryk University

Title of dissertation: Molecular modeling of enzymes’ substrate specificity

Study programme: Biology

Supervisor: prof. Mgr. Jiří Damborský, Dr.

Supervisor-specialist: Mgr. Jan Brezovský, Ph.D.

Year of defence: 2016

Keywords: biocatalysis, substrate specificity, dehalogenases, virtual screening, protein engineering, protein tunnel

Bibliographic entry (Czech record)

Author: Mgr. Lukáš Daniel, Loschmidt Laboratories, Department of Experimental Biology, Faculty of Science, Masaryk University

Title of dissertation: Molekulové modelování substrátové specificity enzymů (Molecular modeling of enzymes’ substrate specificity)

Field of study: Biology

Supervisor: prof. Mgr. Jiří Damborský, Dr.

Supervisor-specialist: Mgr. Jan Brezovský, Ph.D.

Year of defence: 2016

Keywords: biocatalysis, substrate specificity, haloalkane dehalogenases, virtual screening, protein engineering, protein tunnel

Whoever perseveres will prevail.

© Lukáš Daniel, Masaryk University 2016

CONTENTS

MOTIVATION

SUMMARY

SHRNUTÍ

INTRODUCTION

1. Enzyme biotechnology
2. Haloalkane dehalogenases
3. Engineering compounds – virtual screening
4. Engineering enzymes – design of specificity

SYNOPSIS OF RESULTS

CHAPTER 1 Mechanism-based discovery of novel substrates of haloalkane dehalogenases using in silico screening

Supplementary information

CHAPTER 2 Discovery of novel haloalkane dehalogenase inhibitors

Supplementary information

CHAPTER 3 Structural and functional analysis of a novel haloalkane dehalogenase with two halide-binding sites

Supplementary information

CHAPTER 4 CAVER Analyst 1.0: Graphic tool for interactive visualization and analysis of tunnels and channels in protein structures

Supplementary information

CHAPTER 5 Structural basis of paradoxically thermostable dehalogenase from psychrophilic bacterium

Supplementary information

CHAPTER 6 Enzyme tunnels and gates as relevant targets in drug design

REFERENCES

CURRICULUM VITAE

LIST OF PUBLICATIONS

MOTIVATION

Haloalkane dehalogenases (HLDs) emerged in enzyme biotechnology more than twenty-five years ago. Although they catalyze a reaction of environmental, pharmaceutical, and industrial importance, the known range of compounds able to bind to them has not changed much since then. HLDs have been found in many different organisms, yet the natural substrates of these enzymes remain mostly elusive. In HLDs, a network of access tunnels connects the deeply buried active site with the surrounding solvent. Therefore, in silico screening of chemical databases and modulation of the access tunnels represent an efficient strategy for broadening the range of compounds that bind to these enzymes and might reveal novel entities utilized by HLDs.

The objectives of the Ph.D. project:

1. Development of the in silico screening platform for substrate identification
2. Systematic identification of novel chemical scaffolds binding into haloalkane dehalogenases
3. Rational engineering of access tunnels and active sites in haloalkane dehalogenases
4. Development of advanced tools for tunnel analysis
5. Critical review of the importance of access tunnels in pharmaceutical targets


SUMMARY

This Thesis describes the application of molecular modeling tools to study and modify the substrate specificity of haloalkane dehalogenases (HLDs), hydrolytic enzymes that cleave carbon-halogen bonds in a variety of halogenated hydrocarbons. HLDs are industrially attractive biocatalysts with a well-understood reaction mechanism and often serve as a benchmark for testing molecular modeling protocols. The Introduction of the Thesis provides an overview of the structure, function and industrial applications of HLDs as well as two sets of methods for broadening the substrate scope of enzymes: (i) identification of novel substrates or inhibitors of natural enzymes by virtual screening and (ii) rational engineering of the enzymes’ active sites and their access pathways.

Despite the broad interest in the identification and characterization of novel HLDs, the reported substrate range covers only short-chain linear or cyclic halogenated alkanes and their derivatives. To further explore the substrate scope, a new in silico method was developed and tested by screening more than 40,000 halogenated compounds against the previously uncharacterized HLD DmmA (Chapter 1). Of the 16 compounds proposed for experimental testing, 50 % were correctly predicted as substrates and 100 % as binders. The new substrates contained aromatic moieties and had on average a 50 % higher molecular weight than the common substrates of HLDs. DmmA converted these molecules with comparable or higher catalytic activity than other substrates. Three novel substrates were also converted by three other HLDs, suggesting the catalytic robustness of these enzymes. Derivatives of the identified substrates possess the highest affinity ever observed in this protein family.

HLDs have recently been discovered in a number of bacteria, including symbionts and pathogens of both plants and humans. However, the biological roles of HLDs in these organisms are unclear. The development of efficient HLD inhibitors serving as molecular probes to explore their function would represent an important step toward a better understanding of these interesting enzymes. Chapter 2 describes the first systematic search for HLD inhibitors using two different approaches. The first was built on the structures of the enzymes’ known substrates and led to the discovery of less potent, nonspecific HLD inhibitors. The second approach involved the virtual screening of 140,000 potential inhibitors against the crystal structure of the HLD from the human pathogen Mycobacterium tuberculosis H37Rv. The best inhibitor exhibited high specificity for the target structure, with an inhibition constant of 3 µM and a molecular architecture that clearly differs from those of all known HLD substrates. The new inhibitors will be used to study the natural functions of HLDs in bacteria, to probe their mechanisms, and to achieve their stabilization.


Active-site engineering was used to assess the role of a unique second halide-binding site identified in the crystal structure of HLD DbeA (Chapter 3). Since determination of the structure of the two-point mutant lacking this site (DbeA ΔCl) was not successful, disruption of the second halide-binding site was studied by molecular modeling. The modeling suggested that binding of a chloride ion in the second halide-binding site was more favorable in the wild-type enzyme than in DbeA ΔCl, which was subsequently confirmed by stopped-flow fluorescence measurements. Elimination of the second halide-binding site in DbeA ΔCl shifted the substrate specificity, lowered the catalytic activity and thermostability, and eliminated substrate inhibition. The changes in catalytic activity, studied by molecular modeling and kinetic experiments, were attributed to deceleration of the rate-limiting hydrolytic step caused by the lower basicity of the catalytic histidine in DbeA ΔCl.

Fine-tuning of the substrate specificity of many enzymes has been achieved by forcing the incoming substrate to pass through an access tunnel before reaching the deeply buried active site. The newly proposed “lock-keyhole-key” model adds the geometry and physicochemical properties of the access tunnels to the traditional models of catalysis to account for the selection of substrates. Chapter 4 describes the development of CAVER Analyst 1.0, a software tool for the analysis and visualization of protein tunnels that facilitates the study of molecular transport and the design of novel catalysts or effective drugs.

The concept of tunnel engineering was applied in structure-function studies of the psychrophilic HLD DmxA, which originates from a bacterium naturally occurring in an Antarctic lake (Chapter 5). Paradoxically, DmxA possesses the highest thermostability (Tm = 65.9 °C) ever observed in any wild-type HLD. Analysis of the crystal structure of DmxA revealed narrow access tunnels in this protein. To open the main tunnel, four residues forming the tunnel bottleneck were mutated in silico to smaller amino acids. The effect of two substitutions was predicted to be destabilizing, while the effect of the other two was predicted to be neutral. Experimental characterization of a mutant carrying the destabilizing mutations revealed improved overall activity, shifted substrate specificity and lower thermostability (Tm = 56.9 °C) compared with the wild-type DmxA.

Chapter 6 describes the importance of access tunnels in pharmaceutically relevant biomolecules. Many protein families possess the same catalytic residues and highly conserved active sites, but the composition of their access tunnels differs. Thus, inhibitors specifically targeting protein tunnels might represent an unexploited and efficient route to the development of novel drugs. The chapter shows examples of inhibitors blocking the tunnels of pharmacological targets that might be utilized as novel drugs against inflammation, neurodegenerative disorders, pathogens, atherosclerosis, immunosuppression and various cancers.

SHRNUTÍ (SUMMARY)

The dissertation describes the use of molecular modeling techniques to study and modify the substrate specificity of haloalkane dehalogenases (HLDs), hydrolytic enzymes that cleave the carbon-halogen bond in a range of halogenated hydrocarbons. HLDs are industrially attractive biocatalysts with a well-studied reaction mechanism, which is why they are widely used for testing molecular modeling protocols. Besides the structure, function and new biotechnological applications of HLDs, the introductory chapter presents two types of methods for broadening the substrate specificity of enzymes: (i) identification of novel substrates or inhibitors of natural enzymes by virtual screening and (ii) rational engineering of the enzymes’ binding sites and their access pathways.

Interest in the identification and experimental characterization of novel HLDs for industrial use has grown recently. Nevertheless, the spectrum of their substrates described so far has comprised only short linear or cyclic halogenated alkanes and their derivatives. To gain a better understanding of the substrate specificity of these enzymes, a new in silico method was developed. The method was used to screen more than 40,000 halogenated compounds against the previously uncharacterized HLD DmmA (Chapter 1). Of the 16 molecules selected for experimental testing, 50 % were correctly predicted as substrates and 100 % as molecules able to bind in the active site. In contrast to the known HLD substrates, the newly identified molecules had on average a 50 % higher molecular weight and contained aromatic groups. DmmA converted these molecules with the same or higher catalytic activity than the known substrates. Three of the new molecules were also substrates of three other HLDs, which indicates the catalytic versatility of these enzymes. Molecules derived from the newly discovered substrates have the highest affinity for HLDs of all substrates tested so far.

HLDs have been identified in bacteria including symbionts and pathogens of both plants and humans. However, the biological role of HLDs in these organisms is not clear. The development of HLD inhibitors serving as molecular probes for exploring their function therefore represents an important step toward understanding the role of this enzyme family. Chapter 2 describes the first systematic search for HLD inhibitors using two different approaches. The first was based on the structures of known substrates and led to the discovery of weakly potent, nonspecific inhibitors. The second approach involved virtual screening of 140,000 potential inhibitors against the HLD from the human pathogen Mycobacterium tuberculosis H37Rv. The best inhibitor discovered showed high affinity for the tested HLD, an inhibition constant of 3 µM, and a molecular architecture different from that of the known substrates. The newly discovered inhibitors will find use in studying the functions of HLDs in bacteria and their catalytic mechanisms, as well as in stabilizing HLDs during storage and crystallization.


Analysis of the crystal structure of the HLD DbeA revealed a unique second binding site for a chloride anion. To verify its function, active-site engineering of DbeA was used, in which a two-point mutant (DbeA ΔCl) lacking this binding site was designed and constructed (Chapter 3). Because no crystal structure of DbeA ΔCl was available, molecular modeling was used to study the effect of the mutations on the second chloride-binding site. The modeling suggested that chloride binding is more favorable in the wild-type DbeA than in DbeA ΔCl, which was subsequently confirmed by stopped-flow fluorescence measurements. Elimination of the second chloride-binding site in the DbeA ΔCl mutant led to a change in substrate specificity, a decrease in catalytic activity and thermodynamic stability, and the elimination of substrate inhibition. The changes in catalytic activity, studied by molecular modeling and kinetic experiments, resulted from a slowdown of the hydrolytic step, which is attributed to the lower basicity of the catalytic histidine in DbeA ΔCl.

Many enzymes in nature tune their substrate specificity by forcing the substrate to pass through access tunnels into the active site. On this basis, the “lock-keyhole-key” model was defined, which introduced into the traditional models of enzyme catalysis elements describing the importance of the shape and physicochemical properties of the access pathways. Chapter 4 describes the development of the bioinformatics tool CAVER Analyst 1.0, which enables the analysis and visualization of protein tunnels with the aim of facilitating the study of molecular transport and the design of new biocatalysts and pharmaceuticals.

The concept of protein tunnel engineering was applied in the study of structure-function relationships of the psychrophilic HLD DmxA, which originates from a bacterium that colonizes a lake in Antarctica (Chapter 5). Paradoxically, DmxA exhibits the highest thermal stability (Tm = 65.9 °C) of all wild-type HLDs characterized so far. Analysis of the crystal structure of DmxA revealed very narrow access tunnels leading to the active site of this protein. To open the main tunnel, four amino acids forming its narrowest point were mutated in silico to smaller residues. The effect of two substitutions was predicted to be structurally destabilizing, whereas that of the remaining two was predicted to be neutral. Experimental characterization of the two-point mutant carrying the structurally destabilizing mutations revealed improved activity, altered substrate specificity, and lower thermal stability (Tm = 56.9 °C) compared with the original DmxA.

Chapter 6 describes the importance of access tunnels in pharmaceutically relevant biomolecules. Many protein families share the same catalytic residues and highly conserved active sites, but their access tunnels differ. Inhibitors specifically targeting protein tunnels thus represent an unexplored and efficient route to the development of new drugs. The chapter presents examples of inhibitors that block tunnels in pharmacologically interesting targets, with possible use as new drugs against inflammatory and neurodegenerative diseases, pathogens, atherosclerosis, and various types of cancer.


INTRODUCTION

1. Enzyme biotechnology

The use of biocatalysts by humans started thousands of years ago with the fermentation of food to produce cheese, beer, vinegar, and wine. However, the cornerstones of modern biotechnology were laid in the 1800s, when scientists proved the utility of living cells for biological transformations and pioneered research into metabolic pathways.1–3 At the beginning of the last century, this paradigm enabled the application of whole cells, cell extracts and purified enzymes for the conversion of natural substrates into useful chemicals, even at the industrial scale.4–6 Another milestone in biocatalysis was reached in the early 1950s by pharmaceutical companies such as Pfizer and Merck, which utilized regio- and stereoselective microbes to avoid multistep chemical synthesis of drugs.4 Despite its growing importance, the use of enzyme biotechnology remained limited in this era for two major reasons: (i) Enzymes were traditionally obtained by screening a broad range of sources including microorganisms, fungi, insects, plants or mammalian species. This was often demanding and did not provide the quantities required for practical applications.4,7 (ii) Enzymes suffered from narrow substrate specificity, insufficient stability under operating conditions and poor regio- and stereoselectivity. However, recombinant DNA systems allowing both the isolation of a gene encoding any protein existing in Nature and its overexpression in a suitable host organism,8 the discovery of the polymerase chain reaction, which generates millions of copies of a specific DNA sequence in vitro,9 and DNA sequencing10 made large amounts of the desired enzymes accessible.

Molecular biology methods allowing alteration of the amino acids encoded by a cloned gene via an in vitro version of Darwinian evolution8,11,12 enabled the construction of stable enzymes accepting previously inert substrates.13 The recent development of advanced DNA technologies,14,15 de novo design of enzymes and biosynthetic pathways,16,17 resurrection of ancestral biocatalysts,18 protein engineering techniques,19 and high-throughput methods20 has allowed the development of enzymes and reactions that simplify the chemical processes currently needed for the conversion of biomass into the next generation of biofuels21,22 and for the production of new materials23 and chemicals.4,5,24,25 Enzyme biotechnology presents an attractive alternative to chemical synthesis, especially with regard to environmental aspects. Enzymes and microbes act as non-toxic catalysts that originate from renewable sources and provide high yields under mild operating conditions. Moreover, biocatalysts have been employed both to decrease the production of toxic wastes generated by classical chemical production and to remove them from contaminated environments.5,24,26,27 Nonetheless, enzyme biotechnology is not an ultimate solution for every chemical process, but it could flourish together with advanced organic synthesis.7,24


2. Haloalkane dehalogenases

Haloalkane dehalogenases (EC 3.8.1.5; HLDs) are predominantly microbial enzymes that structurally belong to the versatile α/β-hydrolase fold and catalyze the conversion of halogenated aliphatic hydrocarbons to the corresponding alcohol, a halide, and a proton. HLDs are relatively stable and robust proteins with a widely studied and well-understood reaction mechanism.28–31 Consequently, they are often used as targets in fundamental enzymology and protein engineering and as benchmarks for testing molecular modeling protocols.32–38 Although HLDs were originally isolated from bacteria living in contaminated soil, in which they enabled the use of halogenated compounds as carbon and energy sources,39–41 their natural function in rhizobial strains,42 a plant parasite,43 animal pathogens,44 marine strains45–47 and eukaryotes48 remains unknown. In addition, more than 6,000 putative HLDs have been identified by database mining.49 HLDs are attractive biocatalysts that can be employed for the bioremediation of environmental pollutants,50–52 construction of biosensors,53,54 decontamination of warfare agents,55 synthesis of optically pure alcohols,56 imaging of cells,57–60 protein purification,61 chemical knockdown of fusion proteins62,63 and identification of cellular targets for bioactive compounds.64

2.1 Reaction mechanism

The dehalogenation reaction follows a two-step mechanism involving a covalently bound ester intermediate. The ester is formed in the first step by bimolecular nucleophilic substitution of a halogen atom mediated by the catalytic triad. The transition state formed in this step is stabilized by two halide-stabilizing residues donating hydrogen bonds to the leaving halide to stabilize the developing charge. Subsequently, the ester intermediate is hydrolyzed in the second reaction step by a hydroxyl group generated from water by the catalytic histidine (Figure 1). Finally, an alcohol product and a halide are released from the buried active site through the access tunnels.28,29 The main difference in the kinetic mechanism of HLDs is the rate-limiting step that depends on the particular enzyme-substrate pair and has been attributed to halide release,30 cleavage of the carbon-halide bond and alcohol release,37,65 and hydrolysis of the ester intermediate.29 The observed differences are governed by the arrangement of the catalytic residues, geometry, and hydration of the active site cavity and anatomy of the access tunnels.66
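The mechanism described above can be condensed into a minimal kinetic scheme. The notation is an assumption introduced here for illustration only: E denotes the enzyme, RX the halogenated substrate, E–R the covalent alkyl-enzyme (ester) intermediate, and k_1 to k_4 are hypothetical labels for the individual steps. Product release is lumped into a single step, and the proton is written with the released products for mass balance only. The arrows assume the amsmath package.

```latex
\begin{align*}
\mathrm{E} + \mathrm{RX}
  &\;\underset{k_{-1}}{\overset{k_{1}}{\rightleftharpoons}}\;
     \mathrm{E \cdot RX}
     && \text{substrate binding through the access tunnel} \\
\mathrm{E \cdot RX}
  &\;\xrightarrow{\;k_{2}\;}\;
     \mathrm{E\text{--}R \cdot X^{-}}
     && \text{S$_\mathrm{N}$2 attack, ester intermediate formed} \\
\mathrm{E\text{--}R \cdot X^{-}} + \mathrm{H_2O}
  &\;\xrightarrow{\;k_{3}\;}\;
     \mathrm{E \cdot ROH \cdot X^{-}} + \mathrm{H^{+}}
     && \text{hydrolysis of the ester intermediate} \\
\mathrm{E \cdot ROH \cdot X^{-}}
  &\;\xrightarrow{\;k_{4}\;}\;
     \mathrm{E} + \mathrm{ROH} + \mathrm{X^{-}}
     && \text{release of the alcohol and the halide}
\end{align*}
```

In this notation, the rate-limiting steps reported for different enzyme-substrate pairs correspond to k_2 (cleavage of the carbon-halide bond), k_3 (hydrolysis of the ester intermediate) or k_4 (release of the halide or the alcohol).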


Figure 1. The general scheme of the reaction mechanism of HLDs. Bimolecular nucleophilic substitution of a halogen atom bonded to an sp3-hybridized carbon atom (left). Hydrolysis of the ester intermediate by a water molecule (right).

2.2 Substrate specificity

HLDs possess a broad substrate specificity and are able to convert chlorinated, brominated and iodinated alkanes, alkenes, alcohols, hydrins, ethers, esters, acetamides, acetonitriles, and cycloalkanes. However, conversion of compounds that are fluorinated, multiply halogenated at a single atom, aromatic, or halogenated at an sp2-hybridized carbon has not been reported.67 Moreover, the natural substrate of most HLDs is unknown. The substrate specificity differs for each member of the family, and a systematic analysis of the relationship between the evolution and function of HLDs revealed that the substrate specificity of putative HLDs cannot be inferred from their amino acid sequences alone.68 The differences are determined by spatial and dynamic attributes of the active-site cavity and the access tunnels as well as by the geometry of the halide-stabilizing residues.49,69–71 Statistical analysis of a uniform set of 30 compounds divided the experimentally characterized HLDs into distinct substrate-specificity groups and revealed five universal substrates of HLDs: 1-bromobutane, 1-iodopropane, 1-iodobutane, 1,2-dibromoethane, and 4-bromobutyronitrile.49 However, the number of biochemically characterized HLDs is still limited to 17 members, so the set of universal substrates might still change once more members of the HLD family are biochemically characterized.72,73

HLDs were used as model enzymes for the development of an experimentally validated computational method for the discovery of novel substrates of enzymes with a known reaction mechanism. The method was applied to the previously uncharacterized HLD DmmA and revealed eight novel substrates that contain aromatic moieties, are converted with comparable or higher catalytic activity, and have on average a 50 % higher molecular weight than the universal substrates of HLDs (Chapter 1, Table 2 and Figure 2). Three novel substrates were additionally converted by three different HLD enzymes, suggesting the catalytic robustness of these enzymes.74 Derivatives of the identified substrates possess the highest affinity ever observed in this protein family (not published).

In silico screening of more than 140,000 compounds against HLD DmbA expanded the set of known binders by 17 newly identified inhibitors (Chapter 2). This study described the first systematic search for inhibitors in an attempt to decipher the unclear biological role of HLDs. The best inhibitor exhibited high specificity for the target structure, with an inhibition constant of 3 µM and a molecular architecture that clearly differs from those of all known HLD substrates (Chapter 2, Figure 2). The new inhibitors will be used to study the natural functions of HLDs in bacteria, to probe their mechanisms, and to stabilize them in support of crystallization, long-term storage, and protection during immobilization or industrial processes.

2.3 Structural features of HLDs

Structurally, HLDs belong to the versatile α/β-hydrolase fold, which has become one of the largest groups of structurally related proteins with diverse catalytic functions and phylogenetic origins.75,76 The three-dimensional structure of HLDs consists of a conserved α/β-hydrolase core domain and a helical cap domain, which varies in the number and arrangement of secondary structure elements and is known to shape the substrate specificity (Figure 2).77 The active site is deeply buried in a predominantly hydrophobic cavity at the interface between these two domains. The tertiary structure has been determined for eleven native HLDs (Table 1).

Figure 2. The tertiary structure of HLDs. The general topology of the structure elements (left) consists of a variable cap domain (A), access tunnels (B) and a conserved α/β-hydrolase core domain (C). The position of the active site is represented by a black star.


Table 1. PDB-IDs of HLDs with experimentally determined structure.

Enzyme   Source organism                         PDB-ID   Reference
DhlA     Xanthobacter autotrophicus GJ10         2HAD     78
DhaA     Rhodococcus rhodochrous NCIMB 13064     1CQW     79
LinB     Sphingobium japonicum UT26              1CV2     80
DmbA     Mycobacterium tuberculosis H37Rv        2QVB     81
DbjA     Bradyrhizobium japonicum USDA110        3A2M     56
DppA     Plesiocystis pacifica SIR-1             2XTO     45
DmmA     unknown marine bacterium                3U1T     46
DbeA     Bradyrhizobium elkanii USDA94           4K2A     82
HanR     marine Rhodobacteraceae                 4BRZ     83
DatA     Agrobacterium tumefaciens C58           3WI7     43
DmrA     Mycobacterium rhodesiae JS60            4MJ3     73

The catalytic residues of HLDs include a nucleophile, a base and an acid, which are essential for the reactions catalyzed by members of the α/β-hydrolase fold, and two halide-stabilizing residues (Figure 3). The composition of the catalytic residues and their location within the structure are not conserved: five different catalytic pentads have been identified in HLDs so far.38,43,68,84

Figure 3. Generalized topology of HLDs. The α/β-hydrolase core domain (light blue) and the variable cap domain (dark blue) are distinguished. The position of the catalytic residues is represented by symbols and functional groups.


The exchange of the substrate and the products between the active site and the bulk solvent is mediated by access tunnels (Figure 4). The tunnels differ in number, size, shape and physicochemical properties among individual HLDs, which marks them as one of the important determinants of substrate specificity. The tunnels in HLDs can be either permanent or ligand-induced and, together with the flexibility of the cap domain, guide the solvation and desolvation of the active site.67,85,86

Figure 4. The topology of the cap domain (light blue) and the access tunnels (orange) of various HLDs. A) DbjA from Bradyrhizobium japonicum USDA110, B) DhaA from Rhodococcus rhodochrous NCIMB 13064, C) DhlA from Xanthobacter autotrophicus GJ10 and D) LinB from Sphingobium japonicum UT26. The position of the active site is represented by the black star. The tunnels were calculated by CAVER 3.02 (ref. 87) with the same settings for all aligned structures.


3. Engineering compounds – virtual screening

Virtual screening (VS) is the application of in silico methods for the knowledge-driven mining of chemical databases in an attempt to find novel compounds or chemotypes with a desired biological activity. It is regarded as a computational counterpart to experimental high-throughput screening (HTS), which was the main source of chemical entities during early drug discovery campaigns. However, the growing availability of compounds in chemical databases and of their associated bioactivity data, together with biological targets derived from the human genome and microbiome projects and their experimentally determined tertiary structures, has shifted the priority toward cost- and time-efficient in silico methods for the discovery of novel lead compounds. Moreover, VS does not require the physical existence of the tested compounds and avoids experimental obstacles such as limited solubility or aggregate formation. The development of computational methods that search large chemical databases and select small sets of candidates for experimental testing is academically sound, and the application of VS strategies also attracts considerable interest outside the medicinal community (Chapters 1 and 2).88–90

VS can be divided into two categories according to the required input data: ligand-based virtual screening (LBVS), which uses structure-activity data of known active compounds as inputs to query chemical databases, and structure-based virtual screening (SBVS), which uses the tertiary structure of a macromolecule to dock compounds into the active site and prioritize them by binding affinity or complementarity to the active site.88,89 LBVS and SBVS are nowadays used in synergy to identify novel drugs,88,91 chemical probes (Chapter 2),89 and substrates (Chapter 1) and to assign functions to enzymes with unknown catalytic activity.92–94

Given the excess of available VS protocols, the main challenge is judging their performance, because community standards are lacking for both protocol evaluation and experimental validation.95 In addition, the increasing amount of biological and structural data demands more computationally efficient VS approaches. From the hardware point of view, this issue has been addressed by the use of computational clouds, graphical processing units (GPUs) and parallel computing strategies. Besides these technological developments, prefiltering of the large compound databases used in both LBVS and SBVS is commonly applied to reduce the computational costs.89 Nonetheless, the outlook for VS in the post-genomic era is rather promising. Since a vast number of compounds and targets is available, many compounds can be screened against a large number of targets either to identify off-targets that might cause unwanted side effects or to repurpose approved drugs for the treatment of different diseases.88 Moreover, the increased number of potential macromolecular targets will require the identification of molecular probes for their validation and functional studies. The probes are not expected to be drug-like, which favors VS over experimental screens, which will soon be impossible to set up for many targets of interest at once.89
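The prefiltering step mentioned above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Compound` record, its precomputed descriptors, the rule names and the 400 g/mol limit are hypothetical and do not reproduce the filters used in this Thesis. The sketch only shows the general idea of discarding compounds with cheap property rules before any expensive screening step.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Sequence


@dataclass(frozen=True)
class Compound:
    """Hypothetical database record with precomputed descriptors."""
    name: str
    mol_weight: float        # molecular weight in g/mol
    n_halogens: int          # number of Cl, Br and I atoms
    flagged_reactive: bool   # True if an upstream reactivity alert was raised


# A rule takes a compound and returns True if the compound passes.
Rule = Callable[[Compound], bool]


def has_halogen(c: Compound) -> bool:
    """Keep only compounds carrying at least one halogen atom."""
    return c.n_halogens >= 1


def max_weight(limit: float) -> Rule:
    """Build a rule that rejects compounds heavier than `limit` (illustrative)."""
    def rule(c: Compound) -> bool:
        return c.mol_weight <= limit
    return rule


def not_reactive(c: Compound) -> bool:
    """Reject compounds flagged as chemically reactive."""
    return not c.flagged_reactive


def prefilter(compounds: Iterable[Compound], rules: Sequence[Rule]) -> List[Compound]:
    """Return the compounds that pass every rule, preserving input order."""
    return [c for c in compounds if all(rule(c) for rule in rules)]


# Toy library with made-up descriptor values.
library = [
    Compound("cmpd_1", 137.0, 1, False),
    Compound("cmpd_2", 612.4, 2, False),   # too heavy for the illustrative limit
    Compound("cmpd_3", 250.1, 0, False),   # no halogen
    Compound("cmpd_4", 199.0, 1, True),    # flagged as reactive
]
rules = [has_halogen, max_weight(400.0), not_reactive]
kept = prefilter(library, rules)
print([c.name for c in kept])
```

Keeping each rule as a separate function makes it easy to switch individual filters on or off, for example to drop a drug-likeness rule when screening for molecular probes.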

3.1 Ligand-based virtual screening

LBVS uses molecular characteristics extracted from a set of known active compounds as inputs to query chemical databases in an attempt to identify structurally diverse compounds with similar characteristics. The nature of the extracted characteristics depends on the chemical representation of the compounds stored in the database and on the database search strategy applied.96

The existing plethora of chemical representations encodes molecular characteristics into either two-dimensional (2D) or three-dimensional (3D) fingerprints (Figure 5). The fingerprints can be derived from molecular graphs or conformations, pharmacophore models or molecular shape queries.97 Systematic analysis of different chemical representations revealed how difficult it is to obtain a generally applicable form of fingerprint.96 Nevertheless, 2D fingerprints are very popular because their fast calculation and easy storage allow rapid VS campaigns. Thus, LBVS is essential at the beginning of drug discovery efforts, especially where little 3D information is available for the target macromolecule.88

Figure 5. Common types of molecular fingerprints used in LBVS to query compound databases. Adopted from 97.

The database searching applied in LBVS can be divided into two broad categories, namely similarity search and compound classification methods. Similarity search compares the fingerprints of reference compounds with those of database compounds in a pair-wise manner using a similarity metric to produce a ranking based on decreasing similarity to the reference compounds. From this ranking, candidate compounds are selected for experimental validation. The fingerprint overlap is quantified by similarity coefficients, frequently by the Tanimoto coefficient.90 Moreover, analysis of the similarity between compounds is often used to filter large databases efficiently and thus reduce the computational costs. Compounds that are not at least partially similar to known actives (LBVS) or are chemically or spatially incompatible with the active site (SBVS) are usually filtered out. Furthermore, compounds possessing undesired chemical properties are removed based on property-rule filters such as drug-likeness, chemical reactivity, association with metabolism-based toxicity, autofluorescence and interference with detection techniques.88 The compound classification methods consist of clustering and machine learning techniques (Figure 6). Both approaches allow discrimination of active compounds from inactive ones based on models derived from training sets, provide a ranking of database compounds according to the probability of activity, and prioritize compound classes for building target-oriented databases. Although compound classification methods require rich information content, as many active and inactive compounds are needed for training and subsequent testing of the models, the increased availability of big data collections of all sorts has prompted interest in the development of these methods within academic and commercial communities. The frequently used methods include support vector machines, decision trees, k-nearest neighbors, naïve Bayesian methods and artificial neural networks.90,96
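The similarity search described above can be illustrated by a minimal sketch. It assumes that fingerprints are stored as sets of on-bit indices and uses the Tanimoto coefficient T = c / (a + b − c), where a and b are the numbers of bits set in the two fingerprints and c is the number of bits they share. The compound names, the toy fingerprints and the function names are hypothetical and serve only as an illustration.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    common = len(fp_a & fp_b)
    return common / (len(fp_a) + len(fp_b) - common)


def rank_by_similarity(reference, database):
    """Return (name, similarity) pairs sorted by decreasing similarity to the reference."""
    scored = [(name, tanimoto(reference, fp)) for name, fp in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical toy fingerprints (illustrative only, not real compounds)
reference = {1, 4, 7, 9}
database = {
    "cmpd_A": {1, 4, 7, 9},
    "cmpd_B": {1, 4, 8},
    "cmpd_C": {2, 3, 5},
}
ranking = rank_by_similarity(reference, database)
```

In such a ranking, the compounds at the top of the list would be the candidates forwarded to experimental validation; a similarity threshold could equally serve as a database prefilter.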

The scientific limitation of LBVS is the lack of methods reliably linking the computed molecular similarity and the observed biological activity of tested compounds. The insufficient correlation between these two properties requires additional analyses and represents an avenue for methodological breakthroughs.89 Ripphausen et al.95 analyzed 115 state-of-the-art LBVS applications in detail and assessed them according to scientific, quality and structural criteria. The analysis revealed that one-third of the studies lacked rigorous or unquestionable verification of the computational predictions. Moreover, about 10 % of the successful LBVS studies preselected by the authors described the computational parts insufficiently for understanding and/or reproduction of the calculations. However, the majority (68) of current LBVS studies were able to identify novel active compounds, with about 30 % falling into the submicromolar range. Additionally, the consideration of multiple quality criteria such as dose-response ratio, multistep experimental validation, controls for non-specific binding, identification of multiple scaffolds and preliminary structure-activity relationship information revealed the nine most advanced studies of high scientific quality.95

Figure 6. Examples of the classification methods. A) Support vector machine. Active (green points) and inactive (red points) compounds are separated by a maximum-margin hyperplane (thick black line). The screened compounds are ranked by their distance from this hyperplane (blue line). B) Decision tree. The screened compound follows a path from the root to a leaf node (green circles), which allows its classification based on the descriptor value at each split node. C) k-nearest neighbors. The dataset comprises active (green points) and inactive (red points) compounds (left). The new data point (empty circle) is classified according to the majority of its closest neighbors: as inactive for k=1, active for k=3 and inactive for k=5. Adopted from 90.
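The k-nearest neighbors classification shown in panel C can be sketched as a simple majority vote. The example below is only an illustration under stated assumptions: compounds are represented by hypothetical two-dimensional descriptor vectors, the Euclidean distance is used, and the toy training set was constructed so that the outcome changes with k in the same way as in the figure.

```python
import math
from collections import Counter


def knn_classify(query, training, k):
    """Classify a query point by the majority label of its k nearest neighbors."""
    neighbors = sorted(training, key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical descriptor vectors; distances from the query are 1, 2, 3, 4 and 5
training = [
    ((1.0, 0.0), "inactive"),
    ((0.0, 2.0), "active"),
    ((-3.0, 0.0), "active"),
    ((0.0, -4.0), "inactive"),
    ((5.0, 0.0), "inactive"),
]
query = (0.0, 0.0)

labels = {k: knn_classify(query, training, k) for k in (1, 3, 5)}
```

The dependence of the predicted class on k is the reason why k is usually treated as a parameter to be tuned on a test set.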

3.2 Structure-based virtual screening

SBVS utilizes the 3D arrangement of recognition elements extracted from the tertiary structure of a macromolecular target to guide docking of compounds into the active site and their prioritization by binding affinity or complementarity to the active site. In general, SBVS consists of four main components: (i) macromolecular target selection; (ii) compound database selection; (iii) molecular docking; and (iv) post-docking analysis (Figure 7).88,98 Analysis of more than 250 SBVS campaigns performed by Ripphausen et al.99 revealed that among all macromolecular targets, protein kinases were both the most screened and the most successfully targeted ones.

Figure 7. Overview of a typical SBVS campaign. A) Selection of a macromolecular target and identification of the binding site. B) Selection and filtering of a compound database. C) Molecular docking. D) Post-docking analyses, selection, and experimental testing.

Once a macromolecular target either with biotechnological prospects or with expected pharmacological relation to a disease has been validated thoroughly, it is essential to obtain its 3D structure. The largest repository of structures determined by X-ray crystallography, NMR or cryo-electron microscopy is the Protein Data Bank, which currently includes more than 100,000 records.100 Nonetheless, the gap between the available protein sequences and their corresponding 3D structures is very large due to the rapid development of DNA sequencing techniques. If no 3D structure of the target macromolecule exists, in silico homology modeling can be used for its prediction based on the sequence similarity with a related homologous protein with an experimentally determined tertiary structure.98,101 Several SBVS studies using homology models successfully retrieved novel inhibitors operating in the micromolar or nanomolar range.98,101 The experimentally determined or predicted 3D structure is then examined for its capability of binding compounds with drug-like properties by identification of buried or surface binding sites. These sites can be indicated either by the position of ligands present in the experimentally solved structure or calculated and ranked by a variety of tools such as LIGSITE,102 SURFNET,103 Fpocket,104 Q-SiteFinder,105 PocketPicker106 and others. Finally, the target structure is prepared for molecular docking by general steps including solvent removal, addition of hydrogen atoms, structural refinement removing clashes and tensions, analysis of water molecules mediating key interactions and preparation of metal sites. These last steps have often been neglected, although they can significantly improve the compound interaction with the active site and thus considerably affect the SBVS campaign.98,101

Although VS possesses the capacity to test rapidly a large number of compounds without much effort, a screening of the entire chemical space is beyond our current capability. Fortunately, only a small fraction of chemical space is expected to be soluble in water and carry appropriate chemical groups achieving complementarity to the macromolecular active site. Several chemical compound databases have been developed in recent years to store not just the structure but also important chemical and biological data in one place. They usually contain a mixture of natural and synthetic compounds, known drugs and their metabolites and targeted subsets.88,98 The most popular and freely available databases include ZINC (95 million compounds),107 PubChem (61 million compounds),108 ChemSpider (35 million compounds),109 and CoCoCo (7 million compounds)110. Many more public and commercial ones exist as well.88 Moreover, small specialized databases focused on antibiotics, protein kinase inhibitors, receptor modulators, antimalarials and others have also been reported, and all major pharmaceutical companies have proprietary databases comprising several millions of compounds.88 The database is usually stripped of compounds not sharing the bioavailability characteristics of the majority of known drugs, or those possessing toxic, chemically reactive or autofluorescent groups. The stripping utilizes simple counting methods based on biophysical properties, e.g. Lipinski’s Rule of Five111 and its modifications,112 or functional group filters.88,98 However, the filtering methods should not be considered indisputable, because good drug candidates may not obey these rules and fulfilling all of them does not guarantee passing through all stages of clinical trials. In addition, high database variety is desirable in any SBVS campaign. Finally, prior to the molecular docking, representative tautomers, conformers and stereoisomers at the pH of interest are assigned to the compounds.101
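The counting filters mentioned above can be sketched with Lipinski’s Rule of Five, which flags compounds with a molecular weight above 500 Da, a logP above 5, more than 5 hydrogen-bond donors or more than 10 hydrogen-bond acceptors. The example assumes that these properties were precomputed by a cheminformatics tool; the compound records, their values and the `max_violations` parameter are hypothetical and only illustrate the principle.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str
    mol_weight: float   # molecular weight in Da
    logp: float         # octanol-water partition coefficient
    h_donors: int       # number of hydrogen-bond donors
    h_acceptors: int    # number of hydrogen-bond acceptors


def ro5_violations(c):
    """Count how many of the four Rule of Five criteria a compound violates."""
    return sum([
        c.mol_weight > 500,
        c.logp > 5,
        c.h_donors > 5,
        c.h_acceptors > 10,
    ])


def filter_database(compounds, max_violations=1):
    """Keep compounds with at most max_violations violations (an adjustable assumption)."""
    return [c for c in compounds if ro5_violations(c) <= max_violations]


# Hypothetical precomputed properties (illustrative values only)
library = [
    Compound("cmpd_A", 320.4, 2.1, 2, 5),
    Compound("cmpd_B", 612.8, 6.3, 3, 9),
    Compound("cmpd_C", 540.0, 3.0, 1, 8),
]
kept = filter_database(library)
```

Making the tolerated number of violations a parameter reflects the caveat stated above: the rules are heuristics, so a campaign may prefer a lenient setting to preserve database variety.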

The molecular docking is designed to predict a preferred conformation of a molecule within the desired binding site (referred to here as a target) to form either a protein-compound or a protein-protein complex. A mathematical algorithm (referred to here as a scoring function) is then used to evaluate the strength of the association or the binding affinity between both parts of the complex. The best-scored conformation should correspond to the energy minimum and be close to the experimentally observed one when such information is available. The molecular docking is the most computationally demanding part of SBVS. Many assumptions and simplifications are typically applied during the development of the docking software to make it sufficiently fast for the screening of large databases. This efficiency is often increased by limiting the compound conformation search to the region of the expected binding site and by decreasing the complexity of the macromolecular target. Instead of using an all-atom representation, descriptors of geometry and interaction patterns of the binding site or a grid representation are used.113,114 After its development in the late 20th century, the molecular docking became essential in drug discovery efforts and now dominates the field.98,99,115 The existing excess of molecular docking software performing SBVS98 can be categorized into three groups according to the implementation of compound conformational search: (i) rigid-body; (ii) flexible compound; and (iii) flexible compound and target. During the conformational search, torsional, translational and rotational degrees of freedom of a compound are incrementally modified to find its best fit within the defined binding site.

Rigid-body search algorithms are based solely on the geometrical and/or interaction complementarity between the compound and the target since they consider the flexibility of neither of them (Figure 8). The algorithms are basic and with limited prediction accuracy, while their major strength is speed. They were widely applied in the earlier protein–compound docking studies. Currently, they are employed in protein–protein docking protocols or in the initial stages of VS to filter large databases by removing compounds incompatible with the binding site.98,116 Implementations of this algorithm are found in ZDOCK,117 DOCK,118 MS-DOCK,119 LigandFit,120 and Glide121.

Figure 8. A schematic representation of rigid-body docking. A) Compound shape and interaction characteristics are mapped on a rigid frame. B) Binding site shape and interaction characteristics (left) are matched with compound ones by distance matrices. In the case of a match, rotational and translational vectors are applied to the compound (in the middle) yielding its final placement within the binding site (right).

Flexible compound search algorithms sample the conformational space of the compound while keeping the target rigid. They achieved the highest popularity in the field and are implemented in a broad range of docking software. The flexibility of the compound during a search can be achieved by two types of methods: (i) systematic; and (ii) stochastic.98,115,116 Implementation of these methods is found in AutoDock,122 GOLD,123 Glide,121 LigandFit,120 ICM,124 and in many other software tools115.

• Systematic algorithms aim to explore all combinations of the structural parameters of a compound from a database of pre-generated conformers. To avoid the demanding computational costs arising from the increased number of degrees of freedom of a compound, fragmentation-based methods were developed. The compound is split into several fragments that are either sequentially docked into the binding site until the entire compound is constructed (incremental construction method) or simultaneously docked into the binding site and then linked together (fragment placing and linking method).115,116,125 The conformational search is performed only for the added fragment, which significantly reduces the computational costs (Figure 9).

Figure 9. Molecular docking by the incremental construction of a compound. A) The compound is initially broken into several fragments. B) An anchor fragment (red sticks) is docked into the target binding site. C) A second fragment is docked on the anchor fragment. D) The remaining fragments are sequentially docked to build the entire compound in a particular conformation within the binding site.

• Stochastic algorithms perform the conformational sampling by random changes of structural parameters of the compound (or a population of compounds) which can be accepted or rejected based on a probability function. The stochastic conformational search populates a wide range of the energy landscape to avoid trapping of the final conformation in a local energy minimum, and to increase the probability of reaching the global energy minimum. Nonetheless, the computational cost associated with stochastic searching is demanding and forms a bottleneck for the performance of SBVS campaigns. The most widely used implementations of stochastic search are (i) genetic algorithm; and (ii) Monte Carlo.98,115,116

The genetic algorithms are designed to mimic the principles of the Darwinian theory. All initial structural parameters of a compound (torsional angles, rotational and translational states) are represented by genes that are encoded into a chromosome. The algorithm then generates a random population of chromosomes corresponding to the multiple conformational states covering a wide range of the energy landscape. The chromosomes evolve via two random genetic operators – mutation and crossover. Their population is then evaluated to find the most adapted ones (e.g. those with the lowest binding energy). Finally, the genetic information of the fittest chromosomes is transmitted into the next generation of chromosome population. This reduces the conformational space needed for sampling because only the most favorable structural parameters are transmitted from one chromosome population to another. The generation of chromosome population and its energy evaluation is performed iteratively and converges after a given number of cycles into a chromosome (a compound conformation) corresponding to the global energy minimum (Figure 10).

Figure 10. The genetic algorithm. A) The encoding of the initial structural parameters of a compound (color arrows) into genes (color bars) and their assembly into a chromosome (in the middle). Subsequently, a random population of chromosomes is generated (right). B) The evolution of chromosome population by genetic operators. C) The most adapted chromosomes are selected as templates for generation of the next population. After many cycles of conformational search and evaluation, the final compound conformation is obtained (right).
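The cycle of selection, crossover and mutation described above can be sketched on a toy problem. The example is not the implementation of any docking program: a chromosome is assumed to be a list of torsion angles in radians, `toy_energy` is a hypothetical stand-in for a scoring function, and the population size, mutation rate and other parameters are arbitrary illustrative choices. Because the parents survive unchanged, the sketch only guarantees that the best energy does not increase between generations, not that the global minimum is reached.

```python
import math
import random


def toy_energy(chromosome, target=(1.0, -2.0, 0.5)):
    """Hypothetical stand-in for a scoring function: zero at the target torsions."""
    return sum(1.0 - math.cos(g - t) for g, t in zip(chromosome, target))


def evolve(n_genes=3, pop_size=30, generations=40, n_parents=10,
           mutation_rate=0.2, seed=0):
    """Return (best chromosome, its energy, best energy of the initial population)."""
    rng = random.Random(seed)
    # Random initial population covering the whole torsional range
    population = [[rng.uniform(-math.pi, math.pi) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    initial_best = min(map(toy_energy, population))
    for _ in range(generations):
        # Selection: the lowest-energy chromosomes become parents and survive unchanged
        parents = sorted(population, key=toy_energy)[:n_parents]
        children = []
        while len(parents) + len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, n_genes)        # one-point crossover
            child = mother[:cut] + father[cut:]
            for i in range(n_genes):               # random mutation of single genes
                if rng.random() < mutation_rate:
                    child[i] += rng.gauss(0.0, 0.3)
            children.append(child)
        population = parents + children
    best = min(population, key=toy_energy)
    return best, toy_energy(best), initial_best


best, best_energy, initial_best_energy = evolve()
```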

The Monte Carlo method alters the structural parameters of a compound by many sequential steps, e.g. random bond rotations, translations, or rigid-body rotations to avoid trapping of the compound’s conformation in a local energy minimum. Each generated conformation is accepted or rejected based on a Boltzmann probability function (Figure 11).115,116

Figure 11. The Monte Carlo algorithm. Adopted from 116.
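The acceptance step of the Monte Carlo method can be sketched with the Metropolis criterion, which is assumed here as the concrete form of the Boltzmann probability function: a move that lowers the energy is always accepted, while a move that raises it by ΔE is accepted with probability exp(−ΔE / kT). The energy function `double_well`, the step size and the value of kT (in the same arbitrary units as the energy) are hypothetical choices made only for illustration.

```python
import math
import random


def metropolis_accept(delta_e, kT, rng):
    """Accept downhill moves always and uphill moves with Boltzmann probability."""
    if delta_e <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e / kT)


def double_well(x):
    """Hypothetical energy landscape with two minima per coordinate."""
    return sum((xi * xi - 1.0) ** 2 + 0.3 * xi for xi in x)


def monte_carlo_search(energy, start, steps=2000, step_size=0.5, kT=0.6, seed=0):
    """Random walk over the parameters; returns the best state visited and its energy."""
    rng = random.Random(seed)
    current = list(start)
    current_e = energy(current)
    best, best_e = list(current), current_e
    for _ in range(steps):
        trial = list(current)
        i = rng.randrange(len(trial))                    # pick one parameter
        trial[i] += rng.uniform(-step_size, step_size)   # random perturbation
        trial_e = energy(trial)
        if metropolis_accept(trial_e - current_e, kT, rng):
            current, current_e = trial, trial_e
            if current_e < best_e:
                best, best_e = list(current), current_e
    return best, best_e


best_state, best_state_energy = monte_carlo_search(double_well, [2.0, 2.0])
```

The occasional acceptance of uphill moves is what allows the walk to leave a local minimum; lowering kT makes the search greedier.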

Flexible compound-target search methods attempt to mimic the structural changes upon compound binding (induced fit) or occurring in pre-existing target conformers (conformational selection). The accurate modeling of these collective motions is very demanding even in the era of high-performance GPU computing, which correlates with the popularity of flexible compound-rigid target search algorithms. Nonetheless, several methodologies considering the flexibility of both the compound and the target have been implemented and successfully applied.98,101 These algorithms are based on three approaches: (i) soft docking; (ii) rotamer libraries; and (iii) target ensembles (Figure 12).

• Soft docking methods allow a certain structural overlap between a compound and the target by employing a “tolerant” scoring function for binding energy evaluation. The main advantage is the low computational cost. Additionally, a local energy minimization can be performed to simulate adjustment of binding site residues. The backbone rearrangements are omitted, which is particularly suitable for targets with a fairly static binding site.101

• Rotamer libraries describe the conformational space of the target by preferred conformational states of each residue side-chain observed in a set of experimentally determined structures. This approach is suitable for the targets with limited motions prior to and upon compound binding.98

• Docking to target ensembles attempts to model all degrees of freedom within the compound-target complex. The residue side-chain and backbone flexibility are considered by different conformation states of the target generated by molecular dynamics, Monte Carlo, NMR, X-ray crystallography or homology modeling. Despite the biological accuracy, the high-dimensionality of the search space explored requires substantial computational costs.98,101

Figure 12. Flexible molecular docking. A) Soft docking allowing overlap (black stars) of a compound (solid spheres) with active-site residues (transparent spheres). B) The flexibility of a target achieved by rotamer libraries of active-site residues (orange sticks). C) The ensemble of target structures produced by solution NMR

An essential part of molecular docking is the evaluation of the association or the binding affinity between the target and the compound conformation provided by the conformational search algorithms. Since the same algorithms (scoring functions) that are used for this evaluation are frequently also employed during the conformational search to distinguish the most feasible compound conformation from the others, their implementation has to be computationally efficient while still describing the phenomena involved in molecular recognition. However, the scoring functions are usually optimized towards high throughput by employing a simplified physical description or predefined parameters obtained from experimental observations or quantum chemical calculations, which may compromise the accuracy of the results.98,116 In terms of accuracy, the scoring functions can be evaluated by many criteria: (i) the reported strength of the compound-target interaction should correspond to the actual free energy of binding; (ii) the compound conformation most closely resembling the experimental one should also be the best-scored one; (iii) true binders and non-binders should be discriminated during docking of multiple compounds; and (iv) a function should run at sufficient speed for implementation in the compound conformational search. The number of available scoring functions is large, and they differ in speed and complexity to suit any SBVS campaign. The scoring functions can be classified into three groups: (i) force-field-based; (ii) knowledge-based; and (iii) empirical.

• Force-field-based scoring functions utilize the classical molecular mechanics energy functions of well-established force-fields such as AMBER126 or CHARMM127. Their major limitations are the inaccurate treatment of entropic effects and desolvation, and high computational costs.115,116,128

• Knowledge-based scoring functions utilize the pairwise energy potentials derived from the statistical analysis of atom-pair frequencies observed in the experimentally determined protein-compound complexes. These potentials favor contacts with the most frequently occurring interatomic distances and penalize repulsive interactions.115,116 Although designed to reproduce experimental structures rather than binding energies, knowledge-based scoring functions offer a suitable speed-to-accuracy ratio. Their major limitation is the lack of information on interactions involving certain atom types that are underrepresented in the available protein-compound complexes, such as metals or halogens, and on the directionality of the interactions, e.g. hydrogen bonding.98,116 DrugScore,129 SMoG,130 PoseScore,131 and RF-Score132 represent current implementations of the knowledge-based scoring functions.

• Empirical scoring functions represent the binding energy by a sum of simple scalable interaction terms such as hydrogen bonding, electrostatic and hydrophobic effects, entropic changes and interactions with metal ions. The scaling factors parameterize the terms as favorable/unfavorable based on multiple linear regression analysis or machine learning used to fit the experimental binding affinity data.98,128 Since the main strength of empirical scoring functions is their speed, they are widely used and implemented in many docking programs. However, their major limitations are dependence on the accuracy of the experimental data, arbitrary inclusion/exclusion of energy terms and selection of training data used for derivation of the scaling factors.98,115 AutoDock,133 ChemScore,134 SFCscore135 and HYDE136 are examples of popular empirical scoring functions.
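The principle of an empirical scoring function, a weighted sum of interaction terms whose scaling factors are fitted by multiple linear regression, can be sketched as follows. The example does not reproduce any of the scoring functions named above: the three descriptor columns, the synthetic "affinities" and the reference weights used to generate them are hypothetical, and the least-squares fit is solved with a small pure-Python routine to keep the sketch self-contained.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_weights(terms, affinities):
    """Least-squares scaling factors from the normal equations (X^T X) w = X^T y."""
    n = len(terms[0])
    xtx = [[sum(row[i] * row[j] for row in terms) for j in range(n)] for i in range(n)]
    xty = [sum(row[i] * y for row, y in zip(terms, affinities)) for i in range(n)]
    return solve_linear_system(xtx, xty)


def score(weights, term_values):
    """Empirical score: weighted sum of the interaction terms."""
    return sum(w * t for w, t in zip(weights, term_values))


# Hypothetical terms per complex: [hydrogen bonds, hydrophobic contact, rotatable bonds]
terms = [[2, 30, 4], [1, 55, 2], [4, 20, 6], [0, 70, 1], [3, 45, 5]]
reference_weights = [-1.2, -0.05, 0.3]   # arbitrary values used to generate synthetic data
affinities = [score(reference_weights, row) for row in terms]

fitted = fit_weights(terms, affinities)
```

Because the synthetic affinities are noise-free, the fit recovers the reference weights; with real experimental data the fitted factors would depend on the training set, which is the limitation noted above.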

Once all of the compounds from the database have been docked into the desired binding site of the macromolecular target, the most promising compounds are selected for experimental validation. However, the list naturally comprises many false positive hits as well as compounds resembling each other. To overcome these issues and promote true hits to the top of the list, many post-docking analyses have been employed including (i) visual inspection; (ii) clustering methods; (iii) consensus scoring; and (iv) refinement of the binding energy.98

• Visual inspection aims both to identify binding artifacts such as clashes, wrong directionality of hydrogen bonds or coordination of metal ions and to verify the binding according to the assumed catalytic mechanism, if such information is available. However, this analysis is not suitable for large-scale application. Instead, an application of automatic filters prior to visual inspection is advisable.98 Mechanism-based geometric filters for reactivity estimation can identify novel substrates outside the chemical scaffolds of experimentally characterized ones (Chapter 1).

• Clustering methods aim to exclude compounds resembling each other prior to the experimental testing since there is no point in testing them all. Usually, a representative compound of each cluster defined by chemical scaffold or molecular interactions with the target is selected. Based on the analysis of interatomic contacts between the compound and the target, Bouvier et al.137 developed a clustering tool discriminating active compounds from inactive ones.

• Consensus scoring aims to balance the limitations of a single scoring function by a combination of scores produced by several scoring functions. The number of false positive hits is usually decreased if the results of individually well-performing scoring functions are intersected.138,139

• Refinement of the binding energy aims to treat neglected or simplified terms in scoring functions in a more physically realistic way, mainly by including solvation or entropy effects. For this purpose, molecular mechanics Poisson-Boltzmann surface area or molecular mechanics generalized-Born surface area are usually employed.140,141 However, this method is not suitable for large-scale application due to its high computational costs.
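Of the post-docking analyses listed above, consensus scoring by intersection is the simplest to sketch. The example assumes that each scoring function returns a dictionary of scores in which a lower (more negative) value is better; the score tables, the compound names and the cut-off n are hypothetical.

```python
def top_hits(scores, n):
    """Names of the n best-scored compounds (lower score is assumed to be better)."""
    return set(sorted(scores, key=scores.get)[:n])


def consensus_hits(score_tables, n):
    """Compounds ranked among the top n by every scoring function."""
    hit_sets = [top_hits(scores, n) for scores in score_tables]
    return set.intersection(*hit_sets)


# Hypothetical scores from three scoring functions (illustrative values only)
score_tables = [
    {"cmpd_A": -9.1, "cmpd_B": -8.7, "cmpd_C": -6.0, "cmpd_D": -8.9},
    {"cmpd_A": -7.5, "cmpd_B": -5.2, "cmpd_C": -7.9, "cmpd_D": -7.7},
    {"cmpd_A": -10.2, "cmpd_B": -9.5, "cmpd_C": -9.0, "cmpd_D": -9.9},
]
consensus = consensus_hits(score_tables, n=2)
```

Only compounds favored by all scoring functions survive the intersection, so the hit list shrinks as n decreases or as more scoring functions are added.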

The scientific limitation of SBVS is the inability of scoring functions to consistently and correctly predict the binding energies of diverse compounds. The major consequence is the necessity of time-consuming post-processing analyses aimed at removing the false positive hits as well as the selection of hits for experimental testing based on knowledge or intuition. In a perfect scenario, active molecules would score significantly better than inactive ones without any researcher’s intervention, which represents an avenue for methodological breakthroughs.89,98 The methods accounting for the target flexibility and dynamic inclusion of water molecules within the binding site have also been improved recently. However, their contributions to the enhanced reliability of SBVS cannot be easily evaluated due to their strong system-dependence and the unavailability of a standard dataset.142 Nonetheless, the contribution of SBVS to drug discovery and biocatalysis is obvious from many successful applications already reported.99,115,128

4. Engineering enzymes – design of specificity

Nature has provided enzymes catalyzing a myriad of chemical reactions. Although biotechnological applications of enzymes were designed in the past according to their limitations, nowadays, the enzymes are usually engineered to match the specifications of the process of interest.5,19 Modern enzyme engineering methods are able to construct a biotechnologically useful catalyst even for non-natural reactions,26,143 faster than ever before and with proven applications in pharmaceuticals, green chemistry industry, and biofuels.5,19,144 The enzyme engineering strategies could be divided into two broad categories based on known information about the target enzyme and underlying technology: (i) directed evolution; and (ii) rational design19,144 (Figure 13).

Directed evolution either randomly recombines a set of related DNA sequences or introduces random mutations into a single gene to create a library of variants which is screened for a desired property, e.g., activity, substrate specificity or thermostability. Although no structural information is required and mutations can be introduced into variable places, the drawback is that several rounds of evolution usually have to be applied and many enzyme variants screened to yield a useful catalyst.5,19,144

The rational design utilizes structural, sequence and biochemical data to propose mutations that are subsequently introduced by site-specific mutagenesis. These mutations possess an increased probability of affecting the desired property and at the same time they reduce the effort devoted to the library screening. The rational design is vital when no high-throughput screening platform is available. However, the construction and testing of only a handful of mutants might miss the best substitutions.19,144

Both strategies can be combined in semi-rational design that attempts to create small libraries for the screening based on the knowledge extracted from structural or experimental data.19,144 Although all strategies used by enzyme engineers should eventually yield improved catalysts, the semi-rational design may reach this goal with less effort than applying directed evolution and rational design separately. The semi-rational design is more efficient because it substitutes pre-selected positions with less effort than is usual for random mutagenesis or for a reliable calculation predicting the best combination of substitutions.19

The following chapters are focused on semi-rational design of substrate specificity by engineering an enzyme active site and its access pathways.

Figure 13. The illustration of rational design and directed evolution. A) The structural analysis guiding rational exchange of a specific amino acid by defined base mutations in the coding sequence. B) Random mutagenesis of the coding sequence followed by a functional screening of the generated enzyme variants. An improved coding sequence found in one round of the design or evolution may be used as a starting point for further cycles. Adopted from 13.

4.1 Active site engineering

The active sites of enzymes are composed of precisely positioned amino acids mediating the catalytic function. The active site residues thus represent an obvious target for altering enzymatic properties. Morley et al.145 performed a survey of published single amino acid substitutions improving enzyme properties. The analysis of their locations relative to the active site revealed that substitutions close to the active site were more effective for enantioselectivity, substrate specificity, and promiscuous activity than distant ones. In contrast, substitutions improving the native catalytic activity and thermostability were similarly effective irrespective of their distance to the active site.

The active site residues interacting with the desired compound can be detected from an experimentally determined enzyme-compound complex. If such information is unavailable, molecular docking can be employed to predict the compound’s binding mode within the active site. The effective analysis of important enzyme-compound interactions can be facilitated by software such as LigPlot+146 or PoseView147 which generate easily readable 2D diagrams (Figure 15). In addition, even without the precise knowledge of compound-interacting amino acids, their position can be accurately predicted by analysis of an active site cavity (Section 3.2).148

Figure 15. The illustration presenting the protein-compound interactions. A) PoseView B) LigPlot+. The illustration was generated by using the structure of human HMG-CoA reductase with bound drug Lipitor (PDB-ID 1HWK).

The active site residues can be subjected to evolutionary analysis to focus the mutagenesis on the amino acids occurring within the enzyme family and/or to omit conserved residues. Although mutating the conserved residues might be beneficial for a dramatic reshaping of the active site required for unnatural compounds,149 targeting the highly variable positions was proven to alter enzyme properties effectively with a decreased risk of abolished activity.150 This analysis can be performed automatically using the HotSpot Wizard web server,150 which combines evolutionary and structural information extracted from bioinformatic databases and software to propose feasible positions for mutagenesis. Once the location of substitutions is set, single or multiple substitutions are introduced by established molecular biology techniques.151 Reetz et al.152 have developed the combinatorial active site saturation test (CAST) combining the structural data with combinatorial amino acid randomization within the active site. The CAST method groups the sequentially close amino acids into pairs that are randomized separately in small focused libraries. The improved variants from each library are subsequently used as templates for randomization at one or more of the other libraries (Figure 16). CAST has proven versatile and efficient in many applications focused on enantioselectivity and conversion of unnatural substrates.4,153

Figure 16. The illustration of combinatorial active site saturation test. A) Sequentially associated amino acids surrounding the active site are divided into one- to two-member libraries. B) Iterative saturation mutagenesis. Improved variants from each library are used as templates for mutagenesis at remaining sites. Adopted from 153.

The possible aims of active site engineering include (i) introduction of new activity; (ii) partitioning of reaction intermediates; and (iii) improvement of promiscuous enzyme activities.154 The new activity can be introduced into the suitable enzyme scaffold either by addition or removal of the essential catalytic amino acids while keeping the other structural features intact.154,155 Wu et al.156 converted a protease subtilisin into a peroxidase just by mutating the catalytic serine into selenocysteine. The thiol-dependent reduction of peroxides was unprecedented within this scaffold and was performed with many secondary and tertiary hydroperoxides to yield the corresponding alcohols in a stereospecific manner. Jiang et al.157 computationally designed retro-aldolases by inserting four proposed catalytic motifs into different protein scaffolds. 32 designs were able to break a carbon-carbon bond in a substrate not found in biological systems. The new activity can also be gained by the interconversion of homologous enzymes which share the same scaffold but catalyze different reactions, indicating their descent from a common evolutionary ancestor.154 Schmidt et al.158 engineered both L-Ala-D/L-Glu epimerase and muconate lactonizing enzyme II to catalyze the o-succinylbenzoate synthase reaction which cannot be detected for the natural enzymes. These enzymes are members of the mechanistically diverse enolase superfamily sharing the triose-phosphate isomerase scaffold. Chaloupkova et al. (Chapter 3) identified a chloride binding site in the crystal structure of haloalkane dehalogenase DbeA which is unique within the α/β-hydrolase fold. The substitution of amino acids to those found in homologous enzymes led to the successful removal of the unique site and revealed shifted substrate specificity, decreased thermostability and catalytic activity as well as eliminated substrate inhibition.
The examples mentioned above relied on single- or double-point substitutions, while multiple-point mutants are usually needed to introduce new activities.5 Sun et al.159 reshaped the active site of limonene epoxide hydrolase by smart saturation mutagenesis with single amino acid alphabets at ten positions. This strategy kept the screening effort at a minimum and yielded variants with improved or even inverted enantioselectivity.

The partitioning of reaction intermediates into complex mixtures of products would occur if enzymes did not provide an environment restricting the reaction pathways. The promotion of disfavored pathways and/or inhibition of favorable ones has been used in many active site engineering studies focused on enzymes such as hydrolases and amide- or ester-bond ligases, which catalyze acyl-transfer reactions. These enzymes favor transfer of the covalent acyl-enzyme intermediate either to water (hydrolases) or to a different acceptor (ligases), and simple active site engineering can have a dramatic effect on the processing of these adducts.154 Witkowski et al.160 mutated the catalytic nucleophile S101 and its surrounding base H237 in thioesterase II, resulting in a hydrolase which gained acyltransferase activity in the presence
of a thiol acceptor. Although the hydrolase activity was severely affected, the mutant exhibited higher catalytic efficiency for the acyltransferase reaction than the natural enzyme did for hydrolysis. Some hydrolase inhibitors are known to form a covalent intermediate with the catalytic nucleophile, yielding a locked catalytic cycle. However, active site engineering can be used to process such inhibitors as substrates.154 Organophosphorus compounds used as nerve agents (such as sarin or VX), pesticides or drugs form a hydrolysis-resistant phosphoester bond with the catalytic serine of cholinesterases. Millard et al.161 introduced a histidine residue close to the catalytic serine in human butyrylcholinesterase. The G117H mutant underwent spontaneous reactivation upon sarin and VX treatment and regained 100% of its esterase activity. Moreover, the mutant acquired the ability to hydrolyze the antiglaucoma drug echothiophate and the pesticide paraoxon.162 The manipulation of reaction paths can also be applied to control product formation by engineering the size of the active site or the position of the catalytic residues relative to the substrate. This strategy keeps the natural substrates and catalytic mechanisms of the enzyme intact and controls the length of polymerization products including their stereo- and regioselectivity.154

Besides substrate promiscuity (acceptance of structurally distinct substrates converted by the same chemical reaction) and product promiscuity (formation of different products starting from the same substrate), enzymes can also exhibit catalytic promiscuity (an alternative catalytic ability performed by a reaction mechanism different from the native one). Catalytic promiscuity can either appear accidentally when the native enzyme is exposed to unnatural substrates or be induced by mutating the catalytic residues or by binding unnatural cofactors or metals.154,163 Vongvilai et al.164 accidentally discovered a racemase activity involving carbon–carbon bond breaking and forming in a lipase-catalyzed kinetic resolution of N-substituted α-aminonitriles. This hidden activity allowed a high level of substrate conversion that would not otherwise be possible. Hederos et al.165 rationally designed human glutathione S-transferase, which normally transfers compounds to glutathione, to hydrolyze a glutathione thioester for the production of benzoic acid. A substitution of an active site alanine by histidine eliminated the formation of the irreversible enzyme-substrate complex observed in the native enzyme. Leitgeb et al.166 turned a β-diketone-cleaving dioxygenase into an esterase by a single-point mutation and metal ion substitution. The O2-dependent carbon-carbon bond cleavage performed by the nonheme Fe2+ center was abolished by the substitution of one of its three metal-coordinating histidines by glutamate. However, the accidentally observed catalytic promiscuity for hydrolysis of 4-nitrophenyl esters was improved. Moreover, the replacement of the Fe2+ ion by Zn2+ further improved the catalytic efficiency of the mutant.

4.2 Tunnel engineering

The fine-tuning of substrate specificity by controlling the passage of a substrate through an access tunnel to the buried active site occurs across all enzyme classes.167,168 The recently proposed lock-keyhole-key model167 extends the traditional models of catalysis, i.e., Fischer’s lock-and-key,2 Koshland’s induced-fit,169 and conformational selection.170 The important properties of the access tunnel, i.e., the keyhole, include: (i) geometry; (ii) physicochemical properties; (iii) gating elements; and (iv) dynamics. In contrast to active sites, access tunnels are often not conserved within enzyme families.168 Besides selecting the proper substrate from the complex mixture of compounds located outside the enzyme, tunnels can also facilitate the flux of solvent molecules and the exchange of products and cofactors. Examples demonstrating the impact of tunnel engineering on enzyme properties, e.g., catalytic activity, substrate specificity, enantioselectivity and thermostability, are reviewed in Chapter 6.

The tunnel geometry mirrors the active site in terms of controlling the maximal size and shape of the incoming compound and its proper stabilization in the activated complex.168 The range of tunnel geometries is broad and includes I-shaped tunnels, L-shaped tunnels, U-shaped tunnels, tunnels with adjacent sub-sites and many combined forms.167 Tunnels can form single or multiple pathways connecting the buried active site with the bulk solvent, or connect different catalytic sites in multifunctional enzymes or enzyme complexes (Chapter 6, Figure 2).167

The physicochemical properties of a tunnel favor the passage of complementary compounds into the active site. Tunnels provide key interactions and barriers encountered on the way to the active site and represent a level of selectivity additional to the tunnel geometry itself. Even properly sized and shaped compounds would not pass through the tunnel if they had non-complementary properties, e.g., a hydrophilic compound passing through a hydrophobic tunnel.168

The gating elements represent a dynamic system of reversible conformational changes between two distinct states, an open and a closed one, regulating either the access of compounds into and out of the protein or the flux of intermediate products among different parts of the protein.171 The gates can consist of individual residues, loops, secondary structure elements, or even entire domains and can be classified into five groups according to their structural basis and amplitude (Chapter 6, Figure 3). The gates are usually located at the tunnel mouth or at the tunnel bottleneck.171 Their functional mode can be either stochastic or induced by stimuli, such as voltage changes or the binding of certain ligands.172

Due to the dynamic motion of enzymes, the tunnel geometry may vary significantly over time. Moreover, transient tunnels may appear depending on the instantaneous protein conformation or can be induced by the incoming compound. Although such tunnels might not be observed in experimentally determined structures, several studies revealed their importance.168,172 This is particularly true for the CYP enzyme family, which is responsible for the metabolism of naturally occurring compounds, xenobiotics, and drugs.173 Cojocaru et al.174 performed a survey of available crystal structures of CYP enzymes to identify tunnels connecting the deeply buried active site with the bulk solvent. The analysis revealed transient tunnels observed only in a fraction of crystal structures as well as merging of tunnels caused by dynamic movements of the flexible F–G helix–loop–helix and the B–C loop.

The complexity of biological systems limits the efficacy of visual inspection of tunnels even in a single structure. However, the tunnels can be efficiently calculated by established methods using either: (i) the protein structure alone or (ii) the protein structure and a probe. The former employs geometric algorithms such as Voronoi diagrams to describe the skeleton of tunnels within the structure. In the latter, a compound is used to probe the protein structure in molecular dynamics (MD) simulations for possible access or exit pathways. Thus, the major difference between the two approaches is that the latter identifies compound-specific tunnels instead of all possible pathways.87,168,175 However, the main disadvantage of methods identifying compound-specific tunnels is the high computational cost of extensive MD simulations.168

The geometric methods reduce tunnel identification to a mathematical problem (Figure 17). The initial approaches approximated the protein structure by a grid in which all grid points not overlapping with protein atoms revealed the accessible space.176 However, this approximation led to high computational and memory costs and was limited to smaller systems or allowed only crude exploration.87,177 Nowadays, most software tools use Voronoi diagrams constructed from the protein atoms. The vertices of the diagram define a series of interconnected edges representing the accessible space. Methods such as Dijkstra’s algorithm can be employed to navigate through linked edges to find the paths from the defined active site to the bulk solvent. The main limitation of the geometric methods lies in identifying which tunnels are most relevant for biological processes. The tunnel length, width and curvature are commonly employed metrics quantifying the tunnel priority based on the assumption that wider and shorter tunnels are more likely to be favorable for the incoming substrates.168,177 The commonly used geometric software for tunnel calculation includes MOLE 2.0,178 MolAxis,179 and CAVER 3.0 (ref. 87). Only the last one was developed to analyze multiple access pathways using large ensembles of protein structures to consider the tunnel dynamics. In addition, Kozlikova et al.180 (Chapter 4) have developed CAVER Analyst 1.0, an interactive visualization and analytical tool
enabling calculation of tunnels in both static structures and their ensembles. In contrast to the other tools, it simplifies comparative analysis of homologous structures and allows accurate analysis of protein voids (Chapter 4, Supplementary Figures 2 and 5). CAVER Analyst has an embedded graphical user interface that facilitates fine-tuning of the calculation, real-time animation of tunnels in protein ensembles, interactive analysis of tunnel characteristics, plotting of graphs and export of results. The existing web servers calculating tunnels and internal protein voids in static structures include CHEXVIS,181 BetaCavityWeb,182 MOLEonline 2.0 (ref. 183) and CAVERweb (http://loschmidt.chemi.muni.cz/caverweb).
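
The geometric search described above can be illustrated with a minimal Python sketch that runs Dijkstra's algorithm on a small hand-made graph standing in for Voronoi edges. The node names, edge lengths, bottleneck radii and the cost function (edge length divided by the squared radius, so that long and narrow segments are penalized) are illustrative assumptions and do not reproduce the scoring used by CAVER, MOLE or MolAxis.

```python
import heapq

def cheapest_tunnel(edges, start, exits):
    """Dijkstra search from the active-site node to the cheapest surface node.

    edges: dict mapping node -> list of (neighbor, length_A, radius_A).
    Edge cost = length / radius**2 (an assumed, illustrative cost function).
    Returns (total_cost, path), or (inf, []) if no exit is reachable.
    """
    best = {start: 0.0}
    previous = {}
    queue = [(0.0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        if node in exits:
            # Walk back through the predecessors to rebuild the path.
            path = [node]
            while path[-1] != start:
                path.append(previous[path[-1]])
            return cost, path[::-1]
        for neighbor, length, radius in edges.get(node, []):
            new_cost = cost + length / radius ** 2
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    return float("inf"), []

# Hypothetical toy graph: a wide route via "a" and a narrow route via "b".
toy_edges = {
    "active_site": [("a", 3.0, 1.5), ("b", 2.0, 0.8)],
    "a": [("surface_1", 4.0, 1.4)],
    "b": [("surface_2", 2.0, 0.7)],
}
cost, path = cheapest_tunnel(toy_edges, "active_site", {"surface_1", "surface_2"})
print(path, round(cost, 2))
```

In this toy case the wider route through node "a" is preferred even though it is longer, which mirrors the assumption that wider and shorter tunnels are more favorable.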

Figure 17. The illustration of geometric approaches of tunnel calculation. A) The mapping of protein atoms (light blue circles) on a discrete grid (black squares). The solid line represents the boundary between protein interior and outer space. B) The idealized Voronoi diagram projecting protein atoms (light blue circles) on a plane. The solid lines represent edges, the dashed orange line represents the most probable route connecting the tunnel starting point (black star) and the outer space. Adopted from 176,177.

HLDs were used as model systems in the initial tunnel engineering efforts that revealed important phenomena. Chaloupkova et al.184 performed a systematic engineering of a bottleneck residue of an access tunnel of HLD LinB. This analysis revealed a strong dependence

36

Introduction

of substrate specificity and activity on the size and physicochemical properties of the introduced mutations (Figure 18). The mutated position (L177) was selected based on a combined structural and phylogenetic analysis. L177 was phylogenetically the most variable residue among those forming either the active site or the access tunnels of HLDs. This study demonstrated the advantages of rational tunnel engineering, generating a number of mutants with modified catalytic properties and substrate specificities. Similar mutagenesis of a single amino acid located at the tunnel mouth, improving enzyme activity and enantioselectivity, was reported for epoxide hydrolases.185,186 Biedermannova et al.187 carried out a follow-up study with the LinB L177W mutant, which showed the most dramatic changes in substrate specificity. They performed transient and steady-state kinetic experiments together with MD simulations and tunnel detection to reveal a marked slowdown of the product release. In fact, the substitution by the bulky tryptophan residue changed both the mechanism of bromide ion release and the rate-limiting step of the catalytic cycle. This study demonstrated the influence of access tunnels on the mechanism and reaction kinetics of an enzymatic catalytic cycle.

Figure 18. Mutagenesis of tunnel bottleneck. The structural analysis revealed that the exchange of leucine to tryptophan closed and shortened the main access tunnel (orange spheres) which affected the connection between the active site (black star) and the bulk solvent. The dashed orange line represents the missing part of the shortened access tunnel.

Pavlova et al.37 performed a computer-assisted redesign of the access tunnels of HLD DhaA which yielded a mutant with 32-fold higher activity towards a toxic and recalcitrant anthropogenic substrate 1,2,3-trichloropropane (TCP). They identified the key tunnel-lining residues by molecular modeling, verified their mutability by conservation analysis and randomized them by directed evolution (Figure 19). The most active mutant DhaA31 revealed narrow access tunnels that enabled stabilization of bound substrate in its productive conformation and in this way enhanced the rate of carbon-chlorine bond cleavage. This chemical step limits the catalytic cycle of the native enzyme, while product release was found limiting in the mutant. This study demonstrated the importance of access tunnels for balancing catalysis against the ligand transport.

Figure 19. Mutagenesis of tunnel-lining residues. Rationally selected residues (red spheres) lining the access tunnels (orange spheres) of the native HLD DhaA (left). The most active mutant DhaA31 (right) revealed introduction of large hydrophobic residues into the access tunnels (red sticks) restricting both the water access and product release. The active site is represented as a black star.

Koudelakova et al.188 prepared a systematic set of HLD DhaA variants with substantially improved thermostability and resistance to organic co-solvent dimethylsulfoxide by a combination of random mutagenesis and focused directed evolution. The structural and biochemical analysis revealed that the improvement was governed by four tunnel-lining residues. Substitutions of these residues narrowed the access tunnel, sealed the active site and improved intramolecular hydrophobic packing (Figure 20A, B). However, the catalytic activity of
this four-point mutant (DhaA80) was severely reduced in the buffer environment. To validate the concept of tunnel engineering for protein stabilization, all possible mutations in 26 different proteins spanning all enzyme classes were constructed in silico and analyzed. The analysis revealed a twofold higher occurrence of highly stabilizing mutations in the tunnel-lining residues than in the other protein regions. Liskova et al.189 performed a follow-up study aimed at enhancing the poor catalytic activity of the highly thermostable DhaA80 in buffer. Saturation mutagenesis of two of the four tunnel-lining residues originally replaced in DhaA80 yielded DhaA106, differing from the template by the single mutation F176G (Figure 20C). Interestingly, DhaA106 exhibited 32-fold higher catalytic activity towards the target substrate in the buffer environment while sacrificing only 4 °C of its thermostability. Moreover, this variant showed improved activity towards 26 out of 30 tested halogenated compounds, replicating the substrate specificity of the native DhaA. The improvement in activity was linked to the increased diameter of the access tunnel and the mobility of the adjacent structural elements. Both studies demonstrated a novel engineering concept for balancing the thermostability-activity trade-off by saturation mutagenesis of the access tunnel residues.

Chrast et al. (Chapter 5) analyzed the thermostability-activity trade-off in the paradoxically thermostable HLD DmxA originating from a bacterium naturally occurring in an Antarctic lake. Surprisingly, DmxA exhibits the highest thermostability ever observed in a native HLD. Although DmxA acts as a dimer and possesses a unique halide-stabilizing residue, neither of these structural features contributed to the protein thermostability. The analysis of the DmxA crystal structure revealed narrow access tunnels (Chapter 5, Supplementary Table 4). Mutagenesis of two residues forming a bottleneck of the main access tunnel increased the opening of this tunnel and decreased the thermostability by 9 °C, matching the value common for other native HLDs. This study highlighted the contribution of narrow access tunnels to improved thermostability of enzymes.

Rational engineering of a structurally defined and functional path representing a new protein tunnel remains a challenge. Brezovsky et al.190 described the computational design and directed evolution of a de novo transport tunnel in HLD LinB. Initially, they closed the main access tunnel either by a tryptophan residue or by a disulphide bridge. The tunnel closing was followed by the opening of a new tunnel in a distinct part of the structure that was confirmed by the crystallographic analysis. The mutants possessing newly introduced tunnel exhibited dramatically modified substrate specificity, substrate inhibition and activity surpassing even the most proficient HLD reported so far.

Figure 20. Tunnel mutants of HLD DhaA with improved thermostability. A) A 10-point mutant combined from all possible single-point mutations constructed by Gene Site Saturation Mutagenesis191. Mutated tunnel-lining and other protein residues are represented as red and blue spheres, respectively. B) Mutant DhaA80 revealing that only the four residues forming the tunnel bottleneck (red sticks) governed the thermostability improvement observed in the 10-point mutant. C) Mutant DhaA106 carrying F176G mutation (black arrow) shows two-times increased bottleneck radius of the main access tunnel as well as improved activity and substrate specificity. This mutant lost only 4 °C of its thermostability compared to the 10-point mutant and DhaA80. The tunnel is represented as orange spheres and the active site by a black star.

The concept of tunnel engineering is generally applicable to enzymes with buried active sites.167 The access tunnels have been shown to affect the substrate specificity of CYP enzymes involved in procarcinogen activation and drug metabolism.174,192,193 Furthermore, access tunnels are of growing interest in drug design campaigns. One of the innovative attempts to use tunnel data to design novel drugs was described by Stsiapanava et al. (Chapter 6, Example Nr. 8, Table 2).194 They focused on the bifunctional enzyme leukotriene A4 hydrolase/aminopeptidase (LTA4H), which is involved in inflammatory processes by catalyzing both the formation of the proinflammatory lipid mediator leukotriene B4 (LTB4) and the inactivation of the neutrophil chemotactic tripeptide Pro-Gly-Pro (PGP). The authors managed to design a specific inhibitor blocking only one of the two access tunnels connecting the surface with the buried active site. This left the enzyme able to perform the second catalytic process, which is thought to be important, and thus to prevent accumulation of the neutrophil chemoattractant PGP. This would not be possible with inhibitors directly blocking the active site. Chapter 6 describes the general concepts and classification systems of tunnels and gating elements and highlights their potential as targets for the binding of small molecules. The different types of binding and the possible pharmacological benefits of such targeting are presented and discussed. Moreover, 12 complexes involving small molecules bound to those features in pharmaceutically relevant biomolecules are presented as case studies. The aim is to facilitate the development of new inhibitors or modulators that specifically target these unexploited structural features.

SYNOPSIS OF RESULTS

The results comprise five original articles and one review article:

1. Daniel, L., Buryska, T., Prokop, Z., Damborsky, J., Brezovsky, J., 2015: Mechanism-Based Discovery of Novel Substrates of Haloalkane Dehalogenases using in Silico Screening. Journal of Chemical Information and Modeling 55: 54-62, DOI:10.1021/ci500486y

2. Buryska, T.*, Daniel, L.*, Kunka, A., Brezovsky, J., Damborsky, J., Prokop, Z., 2016: Discovery of Novel Haloalkane Dehalogenase Inhibitors. Applied and Environmental Microbiology 82: 1958-1965, DOI:10.1128/AEM.03916-15

3. Chaloupkova, R., Prudnikova, T., Rezacova, P., Prokop, Z., Koudelakova, T., Daniel, L., Brezovsky, J., Ikeda-Ohtsubo, W., Sato, Y., Kuty, M., Nagata, Y., Kuta Smatanova, I., Damborsky, J., 2014: Structural and Functional Analysis of a Novel Haloalkane Dehalogenase with Two Halide-Binding Sites. Acta Crystallographica D70: 1884-1897, DOI:10.1107/S1399004714009018

4. Kozlikova, B., Sebestova, E., Sustr, V., Brezovsky, J., Strnad, O., Daniel, L., Bednar, D., Pavelka, A., Manak, M., Bezdeka, M., Benes, P., Kotry, M., Gora, A., Damborsky, J., Sochor, J., 2014: CAVER Analyst 1.0: Graphic Tool for Interactive Visualization and Analysis of Tunnels and Channels in Protein Structures. Bioinformatics 30: 2684-2685, DOI:10.1093/bioinformatics/btu364

5. Chrast, L., Tratsiak, K., Daniel, L., Sebestova, E., Brezovsky, J., Kuta Smatanova, I., Damborsky, J., Chaloupkova, R., 2016: Structural Basis of Paradoxically Thermostable Dehalogenase from Psychrophilic Bacterium (under review)

6. (Review) Marques, S. M.*, Daniel, L.*, Buryska, T., Prokop, Z., Brezovsky, J., Damborsky, J., 2016: Enzyme Tunnels and Gates as Relevant Targets in Drug Design (under review)

* These authors contributed equally to this work

Contribution to the articles:

1. Designed modeling experiments, conducted molecular docking and rescoring, interpreted the data, wrote the manuscript

2. Designed in silico screening, conducted molecular docking, rescoring and clustering, interpreted the data

3. Designed modeling experiments, ran molecular dynamics simulations, interpreted the data, wrote the modeling part of the manuscript

4. Tested the software, contributed to the development, wrote the supplementary information

5. Designed modeling experiments, ran molecular dynamics simulations, calculated the tunnel network, performed in silico design of mutated enzymes, interpreted the data, wrote the modeling part of the manuscript

6. Searched and wrote the case studies, calculated the tunnel network, made graphics

CHAPTER 1

Mechanism-based discovery of novel substrates of haloalkane dehalogenases using in silico screening

J. Chem. Inf. Model. 2015, 55, 54−62

DOI: 10.1021/ci500486y

Abstract

Substrate specificity is a key feature of enzymes determining their applicability in biomaterials and biotechnologies. Experimental testing of activities with novel substrates is a time-consuming and inefficient process, typically resulting in many failures. Here, we present an experimentally validated in silico method for the discovery of novel substrates of enzymes with a known reaction mechanism. The method was developed for a model system of biotechnologically relevant enzymes, haloalkane dehalogenases. On the basis of the parametrization of six different haloalkane dehalogenases with 30 halogenated substrates, mechanism-based geometric criteria for reactivity approximation were defined. These criteria were subsequently applied to the previously experimentally uncharacterized haloalkane dehalogenase DmmA. The enzyme was computationally screened against 41,366 compounds, yielding 548 structurally unique compounds as potential substrates. Eight out of 16 experimentally tested top-ranking compounds were active with DmmA, indicating a 50% success rate for the prediction of substrates. The remaining eight compounds were able to bind to the active site and inhibit enzymatic activity. These results confirmed good applicability of the method for prioritizing active compounds - true substrates and binders - for experimental testing. All validated substrates were large compounds often containing polyaromatic moieties, which have never before been considered as potential substrates for this enzyme family. Whereas four of these novel substrates were specific to DmmA, two substrates showed activity with three other tested haloalkane dehalogenases, i.e., DhaA, DbjA, and LinB. 
Additional validation of the developed screening strategy with the data set of over 200 known substrates of Candida antarctica lipase B confirmed its applicability for the identification of novel substrates of other biotechnologically relevant enzymes with an available tertiary structure and known reaction mechanism.

Introduction

Enzymatic catalysis has matured into an important tool for various biotechnological applications. The high enantioselectivity and specificity of enzymes allow efficient production of many valuable chemicals.4,7 However, engineering a successful catalyst remains a challenging task and often requires the construction of many variants to achieve the desired activity and stability.4,5 Identification of conversions of novel substrates by natural enzymes is an important way for discovering practically useful reactions. In silico screening by molecular docking has been used to complement experimental in vitro screening of chemicals against biological targets in drug discovery.88,91 Recently, in silico screening was also employed for functional assignment of enzymes with unknown function by prediction of their putative substrates.92–94 This approach can be further extended for the discovery of novel substrates of enzymes used in
biotechnology to broaden their scope. Although molecular docking has been used to predict substrates of short-chain dehydrogenase,195 laccase,196 lipase B197 and sulfotransferase,198 those studies largely focused on small libraries of similar compounds, employed basic molecular docking for consideration of substrate binding only without addressing the reactivity, and sometimes lacked experimental validation.197

In a comprehensive study, Irwin et al.199 employed molecular docking to identify new substrates of Zn-dependent phosphothioesterase, an enzyme hydrolyzing phosphate esters, including the insecticide paraoxon and nerve gases sarin, soman, and VX. The screening of 167,000 compounds revealed a known substrate ranked as the eighth hit. After visual inspection, seven compounds resembling known substrates were found among the 100 top-ranked hits and were subsequently experimentally verified as substrates. The authors did not explicitly address the reactivity of docked compounds. Reactivity was considered by docking substrates in high-energy forms in a study by Hermann et al.92 who docked 4,207 compounds and experimentally verified three of them as substrates. Xu et al.197 addressed reactivity by considering distance-based screening criteria based on the sum of van der Waals radii of atoms involved in the nucleophilic attack. The authors correctly identified compounds reported in the literature, but no external validation was carried out. Moreover, the essential hydrogen bonds were kept fixed during the docking procedure, which may have disfavored compounds showing different binding patterns.

Here, we present an in silico screening method calibrated on experimentally verified substrates for better prediction accuracy. We focused on biotechnologically interesting enzymes from the family of haloalkane dehalogenases (EC 3.8.1.5; HLDs) as model enzymes. HLDs are microbial enzymes structurally belonging to the α/β-hydrolase fold that catalyze the degradation of halogenated aliphatic hydrocarbons to the corresponding alcohol, a halide and a proton. They can be utilized in the bioremediation of environmental pollutants,50 construction of biosensors,53 decontamination of warfare agents,55 synthesis of optically pure alcohols,56 and imaging of cells and protein analysis.60,61,200 The reaction mechanism of HLDs has been widely studied and is well understood.28–31 Consequently, HLDs are often employed as benchmarks for testing of various molecular modeling protocols.32–36 Dehalogenation by HLDs follows a two-step reaction mechanism involving a covalently bound ester intermediate. The ester is formed in the first step by SN2 nucleophilic displacement of a halogen atom mediated by the catalytic triad. The transition state formed in this step is stabilized by two halide-stabilizing residues, which donate H-bonds to the leaving halide group to saturate the developing charge. The ester intermediate is then hydrolyzed in the second reaction step by a hydroxyl group, which is generated from water by the catalytic histidine. Eventually, an alcohol product and halide ion are released from the buried active site through the access tunnels.
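
The SN2 step described above suggests simple geometric checks on a docked pose, for example the distance between the nucleophilic oxygen and the attacked carbon and the angle formed by the oxygen, the carbon and the leaving halogen. The Python sketch below illustrates such a check. The function names and the cutoffs (3.5 Å and 150°) are hypothetical placeholders chosen for illustration only; they are not the calibrated criteria derived in this study.

```python
import math

def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with atom b at the apex."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    cos = dot / (math.hypot(*v1) * math.hypot(*v2))
    cos = max(-1.0, min(1.0, cos))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(cos))

def is_near_attack(nuc_o, carbon, halogen, max_dist=3.5, min_angle=150.0):
    """True if the pose roughly satisfies an SN2-like attack geometry.

    max_dist (Å) and min_angle (degrees) are assumed, illustrative cutoffs.
    """
    close_enough = math.dist(nuc_o, carbon) <= max_dist
    in_line = angle_deg(nuc_o, carbon, halogen) >= min_angle
    return close_enough and in_line

# Hypothetical coordinates (Å): a nearly collinear pose and a perpendicular one.
good_pose = is_near_attack((0.0, 0.0, 0.0), (3.2, 0.0, 0.0), (5.0, 0.3, 0.0))
bad_pose = is_near_attack((0.0, 0.0, 0.0), (3.2, 0.0, 0.0), (3.2, 1.8, 0.0))
print(good_pose, bad_pose)
```

A real screening protocol would also need to account for the hydrogen bonds donated by the halide-stabilizing residues, which this sketch leaves out.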

In this study, the newly developed in silico method was tested by screening 41,366 halogenated compounds against the previously uncharacterized enzyme DmmA, revealing eight experimentally validated novel substrates and eight inhibitors. In the case of the top-ranking compounds, the success rate of the method was 50% for the prediction of catalysis and 100% for the prediction of binding. We note that these rates will most probably decline with an increasing number of tested compounds. The broader applicability of the method was further validated with 206 experimentally tested substrates of Candida antarctica lipase B (CALB). The method could be used to enrich the pool of converted substrates for other biotechnologically relevant enzymes with known tertiary structure and reaction mechanism.

Methods

Preparation of experimentally verified substrates

Thirty known substrates of HLDs49 were prepared in both ground and high-energy states. The tertiary structures in the ground state were taken from our in-house database.77 For the preparation of high-energy states of substrates, the Antechamber module of AMBER 12 (ref. 201) was employed to convert the PDB files to the z-matrix format compatible with the Gaussian 09 program, revision D.01.202 The structures were then minimized using the HF/6-31G(d) wave function for all atoms except iodine, for which the LANL2DZ basis set203–205 was employed. The following variables were constrained during the minimization: (i) the length of the carbon−halogen bond, increased by 0.5 Å in comparison to its respective ground-state length; and (ii) four angles between the halogen, carbon, and three atoms bound to the carbon, set to 90°. Restrained electrostatic potential partial charges206 were derived using the Antechamber module. In cases where multiple halogen atoms were present in the structure, the preparation of high-energy states was performed individually for each halogen atom. Input files were converted to an AutoDock4.0-compliant format by MGLTools.207
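
The first constraint above, a carbon−halogen bond lengthened by 0.5 Å, can be pictured with a short Python sketch that moves the halogen along the bond vector. This only shows how a starting geometry with the stretched bond might be generated; it does not reproduce the constrained quantum-chemical minimization performed in Gaussian, and the function name and example coordinates are assumptions made for illustration.

```python
import math

def stretch_bond(carbon, halogen, extra=0.5):
    """Return halogen coordinates with the C-X distance increased by `extra` Å.

    The halogen is displaced along the existing C->X direction; all other
    atoms are left untouched (a simplification for illustration).
    """
    vec = [h - c for c, h in zip(carbon, halogen)]
    length = math.hypot(*vec)
    scale = (length + extra) / length
    return tuple(c + v * scale for c, v in zip(carbon, vec))

# Hypothetical ground-state geometry with a 1.79 Å carbon-chlorine bond.
carbon = (0.0, 0.0, 0.0)
chlorine = (1.79, 0.0, 0.0)
stretched = stretch_bond(carbon, chlorine)
print(round(math.dist(carbon, stretched), 2))
```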

Preparation of screening database

The three-dimensional structures of 41,366 compounds were extracted from the EDULISS database.208 The criteria for selection of these compounds were as follows: presence of a halogen atom at an sp3 carbon atom, MlogP < 8, molecular weight < 500 g mol−1, atom types parametrized for AutoDock4.0, and fewer than 15 torsional degrees of freedom. Input files in the Sybyl mol2 format were converted into an AutoDock4.0-compliant format by MGLTools.
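
The selection criteria listed above amount to a simple filter over precomputed compound descriptors. The following Python sketch applies them to a small made-up library. The `Compound` record, its field names and the example entries are hypothetical; how the descriptors were actually computed and stored is not specified in the text.

```python
from dataclasses import dataclass

@dataclass
class Compound:
    name: str
    mlogp: float             # Moriguchi logP
    mol_weight: float        # g/mol
    n_torsions: int          # torsional degrees of freedom
    has_sp3_halogen: bool    # halogen attached to an sp3 carbon
    autodock_types_ok: bool  # all atom types parametrized for AutoDock4.0

def passes_filters(c):
    """Apply the selection criteria stated in the text to one compound."""
    return (
        c.has_sp3_halogen
        and c.mlogp < 8
        and c.mol_weight < 500
        and c.autodock_types_ok
        and c.n_torsions < 15
    )

# Hypothetical library entries with invented descriptor values.
library = [
    Compound("small_haloalkane", 2.1, 157.0, 2, True, True),
    Compound("too_heavy", 5.0, 612.4, 9, True, True),
    Compound("too_flexible", 4.2, 380.5, 17, True, True),
    Compound("aryl_halide_only", 3.3, 250.1, 4, False, True),
]
selected = [c.name for c in library if passes_filters(c)]
print(selected)
```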


Preparation of receptors

The coordinates of the crystal structures of seven HLDs were downloaded from the Protein Data Bank209 (Table 1). Only chain A was used when screening proteins with multiple chains. All water molecules were deleted from the structures with the exception of “structured water”, that is, molecules important for the catalytic mechanism of HLDs (Table 1). Hydrogen atoms were added to the receptor with the H++ server210 at pH 7.5. The hydrogen atom bound to the NE2 atom of the catalytic histidine was deleted to reflect the catalytic mechanism of HLDs. Gasteiger charges were subsequently assigned by MGLTools. During the docking procedure, the receptor was represented by a set of atomic and electrostatic maps calculated by AutoGrid4.0.114 The grid maps comprised 80 × 76 × 80 grid points with a spacing of 0.25 Å, which were centered at a position near the nucleophilic oxygen and catalytic base to cover the whole active site and main access tunnel (Supplementary Figure 1).
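The physical size of the docking box follows directly from the grid dimensions and spacing. The sketch below assumes that the quoted numbers are counts of grid points, so that each edge spans (n − 1) intervals; if they are instead interval counts, as in the `npts` keyword of AutoGrid parameter files, each edge is n × spacing. The function name is hypothetical.

```python
def box_edges(counts, spacing, counts_are_intervals=False):
    """Return the box edge lengths in Å for a regular grid."""
    if counts_are_intervals:
        return tuple(n * spacing for n in counts)
    return tuple((n - 1) * spacing for n in counts)


# HLD grid from the text: 80 x 76 x 80 with 0.25 Å spacing
print(box_edges((80, 76, 80), 0.25))
print(box_edges((80, 76, 80), 0.25, counts_are_intervals=True))
```

Under either reading the box measures roughly 20 × 19 × 20 Å.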

Table 1. PDB codes of crystal structures of HLDs and IDs of ‘structured water’ molecules.

Crystal structure | Source | PDB code | ID of ‘structured water’ molecules
DbjA | Bradyrhizobium japonicum USDA110 | 3A2M | 544
DmbA | Mycobacterium tuberculosis H37Rv | 2QVB | 316, 341
LinB | Sphingomonas paucimobilis UT26 | 1MJ5 | 3050, 3508
DhaA | Rhodococcus rhodochrous NCIMB 13064 | 1CQW | 565, 566
DhlA | Xanthobacter autotrophicus GJ10 | 2YXP | 559, 610
DbeA | Bradyrhizobium elkanii USDA94 | 4K2A | NA
DmmA | metagenome of marine microbial consortium | 3U1T | 26, 622

NA – not applicable; structured catalytic waters are missing in this structure, and the coordinates of the water molecules were taken from LinB.

Docking protocol

The energy of the unbound system was estimated as the internal energy of the unbound extended conformation determined from a Lamarckian Genetic Algorithm search. A total of 250 docking calculations were performed for each compound using the Lamarckian Genetic Algorithm with the following parameters: initial population size 300, maximum number of generations 27,000, elitism value 1, mutation rate 0.02, and crossover rate 0.8. The maximum number of energy evaluations was set to $(1.5 \times 10^{5} \cdot N_{\mathrm{tor}}^{2}) + 1.5 \times 10^{6}$, where $N_{\mathrm{tor}}$ is the number of torsional degrees of freedom of a docked compound. Each local search was based on the pseudo Solis and Wets algorithm with a maximum of 300 iterations per search.211 Final conformations from all 250 docking calculations were clustered into groups with a tolerance of 2 Å for their root-mean-square positional deviation.
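The evaluation budget grows quadratically with ligand flexibility. The sketch below evaluates the stated expression, 1.5 × 10^5 · Ntor^2 + 1.5 × 10^6, for a few torsion counts; the function name is hypothetical.

```python
def max_energy_evaluations(n_torsions: int) -> int:
    """Maximum number of energy evaluations for a ligand with n_torsions torsions."""
    if n_torsions < 0:
        raise ValueError("number of torsions must be non-negative")
    return 150_000 * n_torsions ** 2 + 1_500_000


for n_tor in (0, 5, 14):
    print(n_tor, max_energy_evaluations(n_tor))
```

For the most flexible compounds admitted to the screening database (14 torsions), this gives about 3.1 × 10^7 evaluations per docking run.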

Mechanism-based geometric criteria for reactivity estimation

The largest cluster obtained from molecular docking of each compound was evaluated for proper positioning of the halogen atom between the two halogen-stabilizing residues using a 4.5 Å cutoff distance. Compounds with their halogens properly stabilized were investigated for a possible SN2 displacement reaction. The −COO−···C−Cl reactive distance (RD) and −COO−···C···Cl reactive angle (RA) parameters measured for a set of 30 known substrates of six characterized HLDs were recorded, and values applicable to 70% of the known substrates were set as the cutoff for the description of reactivity (Figure 1).

Figure 1. The geometric parameters used for estimation of reactivity. The reactive distance (RD) between the catalytic aspartate (cyan sticks) and attacked carbon atom of the substrate molecule (yellow sticks) is represented by a thick dashed line. The reactive angle (RA) between the catalytic aspartate, the attacked carbon and the leaving halogen atoms is represented by a thin dashed line. The two hydrogen-bonding distances (H1 and H2) between the halide-stabilizing residues (cyan lines) and the leaving halogen atom of the substrate are represented by dashed lines.
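The four geometric parameters in Figure 1 can be computed from the Cartesian coordinates of a docked pose. The following is a minimal sketch, not the script used in this work: the function names are hypothetical, the atoms representing the halide-stabilizing residues are assumed to be their hydrogen-bond donor atoms, and stabilization is assumed to require both H1 and H2 to lie within the 4.5 Å cutoff.

```python
import math


def distance(a, b):
    """Euclidean distance between two 3D points."""
    return math.dist(a, b)


def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with b at the vertex."""
    v1 = [p - q for p, q in zip(a, b)]
    v2 = [p - q for p, q in zip(c, b)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding
    return math.degrees(math.acos(cosine))


def reactive_parameters(nucleophile_o, carbon, halogen, donor1, donor2):
    """Return RD, RA, H1 and H2 for one docked pose."""
    return {
        "RD": distance(nucleophile_o, carbon),
        "RA": angle_deg(nucleophile_o, carbon, halogen),
        "H1": distance(donor1, halogen),
        "H2": distance(donor2, halogen),
    }


def is_stabilized(params, cutoff=4.5):
    """Assumed criterion: both stabilizing contacts within the cutoff."""
    return params["H1"] <= cutoff and params["H2"] <= cutoff


# Made-up coordinates describing an ideal in-line attack geometry
pose = reactive_parameters(
    nucleophile_o=(0.0, 0.0, 0.0),
    carbon=(3.0, 0.0, 0.0),
    halogen=(4.8, 0.0, 0.0),
    donor1=(5.8, 2.5, 0.0),
    donor2=(5.8, -2.5, 0.0),
)
print(pose, is_stabilized(pose))
```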


Rescoring of binding affinity

The scoring functions employed during virtual screening are generally optimized for speed at the expense of accuracy, so the application of a more sophisticated scoring scheme is often advantageous for evaluating the binding of docked compounds. The binding affinity of the docked compounds was therefore rescored using two principally different methods: (i) prediction of the compound’s binding affinity using the neural network-based scoring function NNScore 2.0 tool212 and (ii) calculation of the free energy change between the bound and free states of the protein by the molecular mechanics/generalized Born surface area (MM/GBSA) method. In the last step, a consensus rank for each docked compound was calculated by averaging the data obtained by MM/GBSA and NNScore 2.0.
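One way to combine two scores on different scales is to average the ranks they produce. The sketch below illustrates this under explicit assumptions: that the consensus was formed from ranks rather than raw values, that more negative MM/GBSA energies are better, and that higher NNScore values are better. The function names and the numbers are illustrative only.

```python
def rank(values, higher_is_better=False):
    """Return 1-based ranks (1 = best); ties are broken by input order."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=higher_is_better)
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def consensus_rank(mmgbsa_energies, nnscore_values):
    """Average the MM/GBSA rank and the NNScore rank of each compound."""
    r_mmgbsa = rank(mmgbsa_energies)                          # lower is better
    r_nnscore = rank(nnscore_values, higher_is_better=True)   # higher is better
    return [(a + b) / 2 for a, b in zip(r_mmgbsa, r_nnscore)]


# Made-up scores for three compounds
mmgbsa = [-30.0, -45.0, -20.0]   # kcal/mol
nnscore = [6.1, 5.0, 4.2]
print(consensus_rank(mmgbsa, nnscore))
```

Compounds are then sorted by the averaged rank, with the lowest value considered the most promising.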

NNScore 2.0 is based on the scores obtained from the 20 best-performing neural networks that were developed and validated by the authors of this tool.212 To predict the binding affinity of the docked compound, the average value of the normalized scores from these 20 networks was employed for each compound according to the established protocol.212

The binding free energies were calculated by combining the gas phase energies of the protein, compounds, and protein−compound complexes with their respective solvation free energies (both polar and nonpolar) calculated with an implicit solvent model. Force field parameters for the docked compounds were prepared with the Antechamber and parmchk modules of AmberTools 1.5.201 AM1-BCC charges213 were assigned to individual atoms of compounds using the Antechamber module of AmberTools 1.5. Molecular topologies of protein, compounds, and protein−compound complexes were prepared with the Leap module of AmberTools 1.5 using the ff03.r1 force field for proteins214,215 and the general Amber force field216 for compounds. The PB radii were set to mbondi2 as required for generalized Born model 2.217 To partially account for an induced fit upon the binding of a compound, the structures of the protein−compound complexes were minimized in two consecutive rounds using the Sander module of AMBER 12. In the first round, the fast minimization consisted of 250 steps of the steepest descent method followed by 750 steps of the conjugate gradient method using a nonbonded cutoff of 50 Å and a distance-dependent dielectric multiplicative constant of 4r_ij for the electrostatic interactions to simulate solvation effects.
The second round was carried out in actual implicit solvent with the following parameters: 100 steps of steepest descent followed by 400 steps of conjugate gradient energy minimization, nonbonded cutoff of 16 Å, interior dielectric constant of 2,218 and generalized Born model 2.217,219 The convergence criterion for the energy gradient was set to 0.1 kcal mol−1 Å−1 for both minimization rounds. The nonpolar contribution to the solvation energy was computed using atomic surfaces calculated from the linear combinations of pairwise overlaps (LCPO) method.220 Finally, the MM/GBSA calculation of the binding energy was performed on the minimized structure by the Python script MMPBSA221 of AmberTools 1.5 using settings consistent with the second round of minimization.
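The energy terms described above can be summarized in the usual single-structure MM/GBSA form. The notation below is introduced here for illustration and is an assumption; no entropic term is included because none is mentioned in the protocol.

```latex
\begin{align}
\Delta G_{\mathrm{bind}} &= G_{\mathrm{complex}} - G_{\mathrm{protein}} - G_{\mathrm{compound}}, \\
G_{x} &= E_{\mathrm{gas},x} + G_{\mathrm{GB},x} + G_{\mathrm{SA},x},
\qquad x \in \{\mathrm{complex},\ \mathrm{protein},\ \mathrm{compound}\}.
\end{align}
```

Here $E_{\mathrm{gas}}$ is the gas phase molecular mechanics energy, $G_{\mathrm{GB}}$ the polar solvation free energy from the generalized Born model, and $G_{\mathrm{SA}}$ the nonpolar solvation free energy derived from the LCPO surface area.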

Protein expression and purification

The recombinant plasmid pET21 carrying the corresponding His-tagged dehalogenase gene was introduced into E. coli BL21(DE3) (Zymo Research, Irvine, CA, U.S.A.) cells by heat shock transformation. Transformed cells were cultivated on agar plates containing 100 μg mL−1 of kanamycin as a selection marker. Plates were first incubated overnight at 37 °C. The next day, 10 mL of LB medium containing 100 μg mL−1 of kanamycin was inoculated with a single colony and cultivated overnight at 37 °C. Finally, this overnight culture was used to inoculate 1 L of LB medium with 100 μg mL−1 of kanamycin, which was cultivated at 37 °C until an OD600 of 0.6 was reached. Protein expression was induced by the addition of IPTG to a final concentration of 0.5 mM, and the culture was then cultivated overnight at 20 °C. Cells were harvested by centrifugation at 14,000 g for 10 min at 4 °C, washed with 20 mM phosphate buffer, resuspended in 20 mL of the same buffer, and frozen at −80 °C. The biomass was thawed on ice, whereupon DNase (2 μL for each 1 mL of sample) was added. Cells were disrupted by sonication on ice with a Hielscher UP200S ultrasonic processor (Hielscher, ). The cell lysate was centrifuged at 21,000 g for 1 h at 4 °C. Crude extracts were manually loaded onto a Ni-NTA Superflow cartridge (Qiagen, Germany) charged with Ni2+ ions and equilibrated with a purification buffer (16.4 mM K2HPO4, 3.6 mM KH2PO4, 0.5 M NaCl, and 10 mM imidazole) at pH 7.5. After unbound and weakly bound fractions had been washed out, the histidine-tagged protein was eluted by increasing the imidazole concentration up to 250 mM. The collected protein was dialyzed against 50 mM phosphate buffer (pH 7.5). The homogeneity and purity of the prepared enzyme were verified by SDS-PAGE on 15% polyacrylamide gels. Staining was performed with Coomassie brilliant blue R-250 dye (Fluka, Switzerland), and the approximate molecular weight was determined on the basis of the protein molecular weight marker (Fermentas, Canada).

Activity and inhibition assay

Stock solutions were prepared by dissolving 1 mg of powder in 200 μL of 99% DMSO, giving a concentration of 5 mg mL−1. The reaction progress was monitored using a modified Holloway assay.222 Briefly, phenol red serving as a pH indicator was added to 1.0 mM HEPES buffer (pH 8.0) to a final concentration of 60 μM. The reaction mixture was prepared in a transparent 96-well microtiter plate and contained 120 μL of buffer with indicator, 15 μL of stock compound in DMSO, and 15 μL of enzyme. The enzymatic reaction was monitored by measuring the absorbance change at 550 nm associated with a decrease in pH using a FLUOstar Optima spectrophotometer (BMG Labtech, U.S.A.). The final concentration of DMSO was 10% to achieve comparable concentrations even for less soluble substrates. The inhibition assay was performed with 1,2-dibromoethane as substrate at a concentration of 9.8 mM and the tested compounds at concentrations from 270 to 490 µM. The results were related to data for the reaction without addition of inhibitor.

Validation using external data set

Structures of 233 compounds extracted from a data set published by Xu et al.197 were converted to an AutoDock4.0 compliant format by Open Babel 2.3223 and MGLTools. The coordinates of the crystal structure of CALB were downloaded from the Protein Data Bank (PDB ID: 1TCA). Hydrogen atoms were added to the receptor with the H++ server at pH 7.5. Gasteiger charges were subsequently assigned by MGLTools. During the docking procedure, the receptor was represented by a set of atomic and electrostatic maps calculated by AutoGrid4.0. The grid maps comprised 70 × 80 × 70 grid points with a spacing of 0.25 Å, which were centered at a position near the nucleophilic oxygen and catalytic base to cover the whole active site. The docking protocol was the same as described previously for HLDs. The largest cluster obtained from the molecular docking of each compound was evaluated for proper positioning of the carbonyl between the two stabilizing residues (Thr40 and Gln106) using 4.5 Å as the cutoff distance. Properly stabilized compounds had at least one of the distances to each of the stabilizing residues within this cutoff. The −O···C=O reactive distance one (D1) and −H···O reactive distance two (D2) parameters measured for the set of 22 known substrates of CALB were recorded, and values applicable to 64% of the known substrates were set as the cutoff for the description of reactivity (Supplementary Figure 2).

Results and Discussion

Development of screening protocol

Enzymes preferably recognize compounds in their transition (or so-called high-energy) states over the ground state structures.224,225 Therefore, high-energy intermediates of substrates should have better affinity toward active sites than nonsubstrates and should be easily recognized by their lower binding energies. However, this is largely influenced by the accuracy of binding energies calculated from the docking. On the other hand, compounds in the ground state could be subjected to further minimization and rescoring procedures, allowing more precise prediction of binding. Moreover, they do not have to be exhaustively prepared for molecular docking because many databases of compounds in ready-to-dock formats are available.110,226–228 The drawback of using ground states is a need for evaluation of their reactivity in addition to their binding. To compare both approaches and select the most beneficial one, we initially docked 32 ground states and 43 corresponding high-energy states of known substrates to the DbjA, DmbA, LinB, DhaA, DhlA, and DbeA enzymes. Substrates with a halogen atom stabilized within 4.5 Å from the halogen-stabilizing residues were assessed for a possible SN2 displacement reaction, and the −COO−···C−Cl reactive distance (RD) and −COO−···C···Cl reactive angle (RA) parameters were recorded (Supplementary Tables 1−4).

Substrates docked in their ground state showed better predicted affinity and better correspondence between their predicted and experimental reactivity for six evaluated enzymes (Supplementary Tables 5 and 6). For these reasons and the aforementioned practical benefits of using ground states, we employed them for further screening. Using the data on the docked geometry of the known substrates in their ground states, strict cutoff values for reactivity applying to 70% of known substrates were identified as follows: RD < 3.3 Å and RA > 140° (Supplementary Figure 3).
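To make the reactivity cutoffs concrete, the sketch below applies RD < 3.3 Å and RA > 140° to the ground-state poses listed for DhlA in Supplementary Table 1 and reports the fraction that satisfies both. It is only an illustration for a single enzyme; the function names are hypothetical, and the 70% figure quoted above refers to the full set of known substrates.

```python
RD_CUTOFF = 3.3    # Å, reactive distance must be below this value
RA_CUTOFF = 140.0  # degrees, reactive angle must be above this value


def is_reactive(rd, ra, rd_cutoff=RD_CUTOFF, ra_cutoff=RA_CUTOFF):
    """Check one docked pose against both geometric criteria."""
    return rd < rd_cutoff and ra > ra_cutoff


def coverage(poses):
    """Fraction of (RD, RA) pairs that satisfy both criteria."""
    return sum(is_reactive(rd, ra) for rd, ra in poses) / len(poses)


# (RD, RA) pairs of the nine substrates stabilized in DhlA (Supplementary Table 1)
dhla_ground = [
    (3.1, 165), (3.0, 145), (3.1, 162), (3.0, 133), (2.9, 165),
    (3.2, 172), (3.0, 146), (2.9, 108), (3.0, 107),
]
print(f"{coverage(dhla_ground):.0%} of DhlA poses meet both cutoffs")
```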

Identification and characterization of novel substrates

Altogether, 41,366 compounds were docked to the active site of DmmA. A total of 11,273 compounds had the halogen positioned within 4.5 Å from the halide-stabilizing residues. Their binding energies predicted by AutoDock ranged from −12.6 to −2.1 kcal mol−1 (Supplementary Figure 4). A total of 548 structurally unique compounds fulfilling the cutoff values for reactivity were selected for rescoring by NNScore 2.0 and MM/GBSA. Their binding energies calculated by MM/GBSA ranged from −51.4 to −12.4 kcal mol−1 confirming the potential of these compounds to bind the active site of DmmA (Supplementary Figure 5). Fifty top-ranked compounds were assessed for their commercial availability in the PubChem database and visually inspected in PyMOL229 to verify proper binding according to the assumed catalytic mechanism (Supplementary Table 7). Sixteen compounds were selected (Table 2) and experimentally assayed for their activities with DmmA. Eight of these compounds were confirmed as substrates of DmmA. The new substrates comprised aromatic moieties with on average 50% higher molecular weight than common substrates of HLDs. The conversion of such bulky aromatic compounds has not previously been reported for HLDs but is consistent with the large active site of DmmA.46,49 The new substrates also exhibited new types of interactions within the mostly hydrophobic active site. For example, compounds C01, C08, and C12 created a hydrogen bond with the backbone oxygen of N78. The catalytic activities of DmmA with these novel substrates were compared to the activity of this enzyme with five universal substrates of HLDs49 (Figure 2), that is, 1-bromobutane (BRB), 1-iodopropane (IOP), 1-iodobutane (IOB), 1,2-dibromoethane (DBE), and 4-bromobutyronitrile (BBN).


Table 2. Structural representation of experimentally tested compounds.α

Compound ID | Molecular Structure
[Structure drawings not available for compounds C01–C16 and for the controls BRB, IOP, IOB, DBE, and BBN.]

αAll compounds were experimentally confirmed as either substrates (unshaded) or inhibitors (shaded). Five universal substrates of HLDs used as positive controls are shown at the end of the table.

DmmA showed a 2-fold higher specific activity with C15 in comparison to BBN, which is the best universal substrate of HLDs, whereas the specific activity with C01 was comparable to BBN. The experimentally determined specific activity of DmmA toward the remaining six new substrates was within the activity range of BRB, IOB, and IOP.


Figure 2. Experimentally determined activities of DmmA with sixteen compounds identified by virtual screening as potential substrates. Each bar represents an average of three independent experiments. Activities with five universal substrates of HLDs are shown for comparison: 1,2-dibromoethane (DBE), 1-bromobutane (BRB), 1-iodopropane (IOP), 1-iodobutane (IOB) and 4-bromobutyronitrile (BBN). The activity with 1,2-dibromoethane was set to 100%. The same substrate is commonly used for kinetic and mechanistic characterization of HLDs.

Eight novel substrates of DmmA were assayed against three further enzyme family members (Supplementary Figure 6). DhaA was found to be active with C11, C15, and C16. LinB was active with C02, C11, C15, and C16. DbjA was active with C15. The activity of C11, C15, and C16 with three out of the four tested HLDs might be a consequence of the catalytic robustness of these enzymes, which belong to the same substrate specificity group.49 Members of this group are generally regarded as the most active HLDs and are widely used for mechanistic studies,29 as well as biotechnological applications.37,55,56,60,61,200,230 The activity of DmmA with the substrates unique to this enzyme may be due to the unusual size and shape of its active site or architecture of the access tunnels.49,85


Characterization of inactive compounds

To understand the origin of unsuccessful prediction for the eight inactive compounds, we probed their ability to bind the active site of DmmA by measuring their inhibitory effect on the activity of DmmA with DBE. The determined residual activities were in the range of 21−93% (Figure 3). The observed inhibitory effects confirmed that all eight inactive compounds were able to bind to the active site of DmmA. Taking the results for both the experimentally confirmed substrates and inhibitors together, the prediction method achieved 100% success rate in the prediction of binding for the evaluated set of 16 compounds.

Figure 3. Experimentally determined inhibitory effect of eight compounds identified by virtual screening and showing no activity with DmmA. The inhibition effect was apparent by decreased conversion of 1,2-dibromoethane. The initial activity of DmmA with 1,2-dibromoethane in the absence of inhibitor was set to 100%.

Visual analysis of the bound inhibitors revealed that five of them were able to form a hydrogen bond with the reactive oxygen of the catalytic aspartate, hindering nucleophilic attack (Figure 4). We were not able to discern any obvious pattern for the three remaining compounds. Nonetheless, the knowledge of interactions between inhibitors and the active site can be applied for subsequent screening of new substrates or inhibitors of HLDs. The compounds showing no activity with DmmA were also assayed for activity with three other HLDs, DhaA, LinB, and DbjA (Supplementary Figure 6). No activity was detected with these enzymes either, suggesting that hydrogen bonding with the nucleophile may be a common inhibitory motif preventing conversion of these potential substrates by different HLDs.


Figure 4. Binding modes of inhibitors in the active site of DmmA. Ligands are represented by lines, whereas the catalytic residues are represented by green sticks. The dashed lines represent the hydrogen bonds between the ligands and nucleophile.

Validation using external data set

To evaluate the applicability of our method with other systems, we tried to identify potential substrates of CALB using the experimental data set197 containing 206 substrates and 27 nonsubstrates (Supplementary Tables 8 and 9). Twenty-two substrates were used for development of mechanism-based geometric criteria to estimate the reactivity (Supplementary Figure 2). The selected cutoff values for the reactivity, applying to 64% of selected substrates, were D1 < 4.1 Å and D2 < 5.1 Å (Supplementary Table 8). The remaining 211 compounds were evaluated by the method. In this experiment, 158 compounds were predicted to be correctly stabilized, and 80 compounds were predicted as potential substrates (Supplementary Table 9). Depending on the number of compounds selected for experimental testing, the portion of successfully identified substrates would range from 96% for the 50 top-ranked potential substrates to 89% for all 80 predicted substrates (Supplementary Figure 7). While the high success rates obtained for the CALB data set are partially due to the bias in the data set toward real substrates, they support the ability of our approach to enrich true substrates among selected compounds.
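The extent of the bias toward real substrates can be estimated from the numbers given above. The sketch below computes the fraction of substrates expected in a random selection from the 211 evaluated compounds and compares it with the reported success rates; the enrichment factor is a derived quantity introduced here for illustration and is not reported in the original analysis.

```python
total_substrates = 206
total_nonsubstrates = 27
used_for_cutoffs = 22          # substrates used to derive D1 and D2 cutoffs

remaining_substrates = total_substrates - used_for_cutoffs
remaining_compounds = remaining_substrates + total_nonsubstrates

# Fraction of true substrates expected in a random selection
baseline = remaining_substrates / remaining_compounds

# Success rates reported for the 50 top-ranked and all 80 predicted substrates
reported = {50: 0.96, 80: 0.89}

print(f"evaluated compounds: {remaining_compounds}")
print(f"random baseline: {baseline:.1%}")
for n_selected, rate in reported.items():
    print(f"top {n_selected}: success rate {rate:.0%}, "
          f"enrichment factor {rate / baseline:.2f}")
```

Because roughly 87% of the evaluated compounds are substrates, the achievable enrichment on this data set is bounded by about 1.15.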


Conclusions

Substrate specificity is one of the most important features of enzymes. Here, we show that the substrate scope of an enzyme can be efficiently explored by in silico screening, requiring only knowledge of the tertiary structure, reaction mechanism, and several experimentally verified substrates.

The newly developed method applied to DmmA correctly predicted 50% of substrates and 100% of binders from 16 compounds proposed for experimental testing, confirming the possible utilization of this strategy for prioritizing active compounds. These statistics should be interpreted with caution because selection of more compounds for experimental testing would reduce these success rates.

The activities of eight novel substrates were found to be comparable to those of five universal substrates of HLDs; C15 was about 2-fold more active with DmmA than BBN, the best universal substrate of this enzyme family, and the activity with C01 was comparable to that with BBN. Three novel substrates, C11, C15, and C16, were additionally converted by other HLDs: C11 and C16 by DhaA and LinB, and C15 by DhaA, LinB, and DbjA.

The developed method was successfully evaluated with an external data set of 206 substrates of CALB, which confirmed its broader applicability for selection of true substrates. Our study demonstrates the potential of in silico screening for finding novel substrates of enzymes outside the chemical space of experimentally characterized compounds.


CHAPTER 1

Mechanism-based discovery of novel substrates of haloalkane dehalogenases using in silico screening

Supplementary information

J. Chem. Inf. Model. 2015, 55, 54−62

DOI: 10.1021/ci500486y


Supplementary Table 1. The reactive parameters of docked substrates in the ground states.

Columns: ID, Name, followed by RD [Å] and RA [°] for DbjA, DmbA, LinB, DhaA, DhlA, and DbeA (in this order).

4 1-chlorobutane 3.2 161 3.3 163 3.2 158 3.1 144 3.1 165 3.3 168

6 1-chlorohexane 3.4 170 3.3 161 3.4 152 3.1 156 - - - -

18 1-bromobutane 3.1 160 3.2 137 3.1 147 3.1 140 - - - -

20 1-bromohexane - - 3.2 139 3.2 161 3.0 153 - - - -

28 1-iodopropane - - - - - - 3.2 92 - - - -

29 1-iodobutane 3.0 110 - - 3.0 109 3.4 89 - - - -

31 1-iodohexane - - - - - - 2.9 121 - - - -

37 1,2-dichloroethane 3.2 172 * * * * 3.1 141 3.0 145 * *

38 1,3-dichloropropane 3.3 168 3.5 145 3.3 166 3.2 140 3.1 162 * *

40 1,5-dichloropentane 3.3 171 3.5 142 3.6 166 3.1 155 - - 3.4 164

47 1,2-dibromoethane 3.1 148 3.0 154 3.2 164 2.9 152 3.0 133 3.1 146

48 1,3-dibromopropane 3.1 157 3.1 152 3.2 173 3.2 126 2.9 165 2.9 118

52 1-bromo-3-chloropropane 3.3 161 3.4 155 3.6 167 3.0 134 3.2 172 3.3 160

54 1,3-diiodopropane 3.1 111 * * 3.1 104 3.3 94 - - 3.1 82

64 2-iodobutane 2.8 133 3.5 85 - - 3.6 86 - - - -

67s (S)-1,2-dichloropropane 3.2 173 * * * * * * * * * *

67r (R)-1,2-dichloropropane 3.1 163 * * * * * * * * * *

72s (S)-1,2-dibromopropane 3.1 163 3.0 149 3.8 127 3.1 132 - - - -

72r (R)-1,2-dibromopropane 3.1 161 3.1 169 3.2 169 3.3 100 - - 3.1 166

76 2-bromo-1-chloropropane 3.0 162 3.0 149 3.2 173 2.9 148 - - 3.2 158

80 1,2,3-trichloropropane 3.2 166 * * * * 3.2 140 * * * *

111 bis(2-chloroethyl)ether - - - - - - - - * * - -

115 chlorocyclohexane 3.7 94 * * 3.6 141 - - * * * *

117 bromocyclohexane 3.2 127 - - 3.5 141 - - - - - -

119 1-bromomethylcyclohexane * * * * 3.9 118 3.2 96 - - 3.1 148

137 1-bromo-2-chloroethane 3.4 148 3.4 153 3.4 164 3.1 163 3.0 146 3.3 151

138 chlorocyclopentane 3.3 167 3.3 172 3.4 172 3.2 176 - - 3.2 177

141 4-bromobutyronitrile - - - - - - - - - - - -

154 1,2,3-tribromopropane 3.1 158 3.0 147 3.2 166 3.3 100 - - 3.0 111

155 1,2-dibromo-3-chloropropane * * * * * * 3.2 102 - - 3.7 87

209 3-chloro-2-methylpropene 3.2 164 3.5 146 3.5 172 3.2 140 2.9 108 - -

225 (E)-2,3-dichloropropene 3.3 160 3.4 144 3.5 155 3.2 138 3.0 107 3.4 163

Average 3.2 153 3.3 148 3.4 153 3.2 129 3.0 145 3.2 143

Maximum 3.7 173 3.5 172 3.9 173 3.6 176 3.2 172 3.7 177

Median 3.2 161 3.2 151 3.3 163 3.2 137 3.0 162 3.2 151

Minimum 2.8 94 3.0 85 3.0 104 2.9 86 2.9 107 2.9 82

Standard deviation 0.2 21 0.2 18 0.2 21 0.2 26 0.1 22 0.2 28

- not successfully stabilized halogen

* activity not detectable by experimental test

Supplementary Table 2. The distribution of reactive parameters of substrates in the ground states.

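Columns 1–6: RD [Å]; columns 7–12: RA [°]. Values are sorted in ascending order within each column; empty cells indicate that no further values are available for that enzyme.

DbjA | DmbA | LinB | DhaA | DhlA | DbeA | DbjA | DmbA | LinB | DhaA | DhlA | DbeA
2.8 | 3.0 | 3.0 | 2.9 | 2.9 | 2.9 | 94 | 85 | 104 | 86 | 107 | 82
3.0 | 3.0 | 3.1 | 2.9 | 2.9 | 3.0 | 110 | 137 | 109 | 89 | 108 | 87
3.0 | 3.0 | 3.1 | 2.9 | 3.0 | 3.1 | 111 | 139 | 118 | 92 | 133 | 111
3.1 | 3.0 | 3.2 | 3.0 | 3.0 | 3.1 | 127 | 142 | 127 | 94 | 145 | 118
3.1 | 3.1 | 3.2 | 3.0 | 3.0 | 3.1 | 133 | 144 | 141 | 96 | 146 | 146
3.1 | 3.1 | 3.2 | 3.0 | 3.0 | 3.1 | 148 | 145 | 141 | 100 | 162 | 148
3.1 | 3.2 | 3.2 | 3.1 | 3.1 | 3.2 | 148 | 146 | 147 | 100 | 165 | 151
3.1 | 3.2 | 3.2 | 3.1 | 3.1 | 3.2 | 157 | 147 | 152 | 102 | 165 | 158
3.1 | 3.3 | 3.2 | 3.1 | 3.2 | 3.3 | 158 | 149 | 155 | 121 | 172 | 160
3.1 | 3.3 | 3.2 | 3.1 |  | 3.3 | 160 | 149 | 158 | 126 |  | 163
3.1 | 3.3 | 3.3 | 3.1 |  | 3.3 | 160 | 152 | 161 | 132 |  | 164
3.2 | 3.3 | 3.3 | 3.1 |  | 3.3 | 161 | 153 | 164 | 134 |  | 164
3.2 | 3.4 | 3.4 | 3.1 |  | 3.4 | 161 | 154 | 164 | 138 |  | 166
3.2 | 3.4 | 3.4 | 3.2 |  | 3.4 | 161 | 155 | 166 | 140 |  | 168
3.2 | 3.4 | 3.4 | 3.2 |  | 3.7 | 162 | 160 | 166 | 140 |  | 177
3.2 | 3.5 | 3.5 | 3.2 |  |  | 163 | 161 | 166 | 140 |  | 
3.2 | 3.5 | 3.5 | 3.2 |  |  | 163 | 163 | 167 | 140 |  | 
3.3 | 3.5 | 3.5 | 3.2 |  |  | 164 | 169 | 167 | 141 |  | 
3.3 | 3.5 | 3.6 | 3.2 |  |  | 165 | 172 | 169 | 144 |  | 
3.3 |  | 3.6 | 3.2 |  |  | 166 |  | 172 | 148 |  | 
3.3 |  | 3.6 | 3.2 |  |  | 167 |  | 172 | 152 |  | 
3.3 |  | 3.8 | 3.2 |  |  | 168 |  | 173 | 153 |  | 
3.3 |  | 3.9 | 3.3 |  |  | 170 |  | 173 | 153 |  | 
3.4 |  |  | 3.3 |  |  | 171 |  |  | 155 |  | 
3.4 |  |  | 3.3 |  |  | 172 |  |  | 156 |  | 
3.7 |  |  | 3.4 |  |  | 173 |  |  | 163 |  | 
 |  |  | 3.6 |  |  |  |  |  | 176 |  | 

Summary over all enzymes (RD [Å], RA [°]):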

Average 3.2 146

Standard Deviation 0.2 24

Minimum 2.8 82

Maximum 3.9 177

Median 3.2 153


Supplementary Table 3. The reactive parameters of docked substrates in the high-energy states.

Columns: ID, Name, followed by RD [Å] and RA [°] for DbjA, DmbA, LinB, DhaA, DhlA, and DbeA (in this order). For compounds with several halogen atoms, each halogen was treated separately (suffixes _br, _cl, _1, _2, _3).

4 1-chlorobutane 2.9 148 3.0 144 2.9 170 - - 2.8 166 3.0 149

6 1-chlorohexane 2.9 150 3.0 143 3.1 151 2.9 136 - - 3.0 150

18 1-bromobutane 2.8 143 3.0 126 2.8 156 - - - - - -

20 1-bromohexane 2.8 133 3.0 122 2.8 144 2.9 124 - - 2.9 132

28 1-iodopropane 2.8 117 3.0 100 3.0 111 3.2 102 - - - -

29 1-iodobutane 2.8 117 - - - - 3.2 103 - - - -

31 1-iodohexane - - - - - - 3.1 106 - - - -

37 1,2-dichloroethane 3.0 157 * * * * - - - - * *

38 1,3-dichloropropane - - - - 2.9 169 - - 2.8 165 * *

40 1,5-dichloropentane 2.9 152 3.0 153 3.1 151 2.9 138 - - 3.0 152

47 1,2-dibromoethane 2.8 135 3.1 129 3.0 149 - - - - 2.9 130

48 1,3-dibromopropane - - - - 2.9 157 - - - - 2.9 130

52 1-bromo-3-chloropropane_br - - - - 2.8 155 - - - - - -

52 1-bromo-3-chloropropane_cl - - - - 2.9 169 - - 2.8 165 3.1 139

54 1,3-diiodopropane 2.8 100 * * 2.9 106 3.3 95 - - 2.9 91

64 2-iodobutane 2.7 120 - - 2.9 108 2.8 107 - - - -

67r (R)-dichloropropane_1 - - * * * * * * * * * *

67r (R)-dichloropropane_2 3.1 127 * * * * * * * * * *

67s (S)-dichloropropane_1 2.9 160 * * * * * * * * * *

67s (S)-dichloropropane_2 3.0 160 * * * * * * * * * *

72r (R)-dibromopropane_1 - - 2.9 128 2.9 148 3.1 109 - - - -

72r (R)-dibromopropane_2 3.2 127 - - 2.9 151 - - - - 3.4 120

72s (S)-dibromopropane_1 - - - - - - - - - - 2.8 140

72s (S)-dibromopropane_2 3.1 127 3.3 114 2.9 160 2.8 121 - - 3.2 118

76 2-bromo-1-chloropropane_br 2.7 167 - - 2.9 149 - - - - - -

76 2-bromo-1-chloropropane_cl 3.0 162 3.0 145 3.0 175 2.9 137 - - 2.8 158

80 1,2,3-trichloropropane_1 - - * * * * - - * * * *

80 1,2,3-trichloropropane_2 3.2 126 * * * * 3.4 112 * * * *

111 bis(2-chloroethyl)ether - - 3.0 137 3.0 151 3.0 124 * * 2.9 153

115 chlorocyclohexane 2.8 162 * * 3.1 162 2.8 143 * * * *

117 bromocyclohexane 2.9 141 2.9 145 3.0 134 2.8 131 - - - -

119 1-bromomethylcyclohexane * * * * 2.9 159 2.9 126 - - 2.8 152

137 1-bromo-2-chloroethane_br 2.8 126 3.1 122 2.9 149 3.2 110 - - - -

137 1-bromo-2-chloroethane_cl 3.0 159 3.0 140 3.0 151 - - 2.8 134 2.9 153

138 chlorocyclopentane 2.9 161 2.9 162 2.9 155 2.9 163 - - 2.8 167

141 4-bromobutyronitrile - - - - - - - - - - - -

154 1,2,3-tribromopropane_1 - - - - - - - - - - 2.8 156

154 1,2,3-tribromopropane_2 - - - - 3.0 149 - - - - - -

155 1,2-dibromo-3-chloropropane_1 * * * * * * 2.9 156 - - 3.0 154

155 1,2-dibromo-3-chloropropane_2 * * * * * * - - - - - -

155 1,2-dibromo-3-chloropropane_3 * * * * * * - - - - 2.9 132

209 3-chloro-2-methylpropene 3.0 161 2.9 160 3.1 147 2.9 164 - - 3.0 152

225 (E)-2,3-dichloropropene 3.0 159 3.0 144 3.1 156 2.9 148 - - 3.1 161

Average 2.9 142 3.0 136 3.0 150 3.0 126 2.8 158 3.0 142

Standard deviation 0.14 19 0.10 16 0.09 17 0.18 21 0.00 16 0.15 18

Minimum 2.7 100 2.9 100 2.8 106 2.8 95 2.8 134 2.8 91

Maximum 3.2 167 3.3 162 3.1 175 3.4 164 2.8 166 3.4 167

Median 2.9 146 3.0 140 2.9 151 2.9 124 2.8 165 2.9 150

- not successfully stabilized halogen

* activity not detectable by experimental test


Supplementary Table 4. The distribution of reactive parameters of substrates in the high-energy states

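Columns 1–6: RD [Å]; columns 7–12: RA [°]. Values are sorted in ascending order within each column; empty cells indicate that no further values are available for that enzyme.

DbjA | DmbA | LinB | DhaA | DhlA | DbeA | DbjA | DmbA | LinB | DhaA | DhlA | DbeA
2.7 | 2.9 | 2.8 | 2.8 | 2.8 | 2.8 | 100 | 100 | 106 | 95 | 134 | 91
2.7 | 2.9 | 2.8 | 2.8 | 2.8 | 2.8 | 117 | 114 | 108 | 102 | 165 | 118
2.8 | 2.9 | 2.8 | 2.8 | 2.8 | 2.8 | 117 | 122 | 111 | 103 | 165 | 120
2.8 | 2.9 | 2.9 | 2.8 | 2.8 | 2.8 | 120 | 122 | 134 | 106 | 166 | 130
2.8 | 3.0 | 2.9 | 2.9 |  | 2.8 | 126 | 126 | 144 | 107 |  | 130
2.8 | 3.0 | 2.9 | 2.9 |  | 2.9 | 126 | 128 | 147 | 109 |  | 132
2.8 | 3.0 | 2.9 | 2.9 |  | 2.9 | 127 | 129 | 148 | 110 |  | 132
2.8 | 3.0 | 2.9 | 2.9 |  | 2.9 | 127 | 137 | 149 | 112 |  | 139
2.8 | 3.0 | 2.9 | 2.9 |  | 2.9 | 127 | 140 | 149 | 121 |  | 140
2.8 | 3.0 | 2.9 | 2.9 |  | 2.9 | 133 | 143 | 149 | 124 |  | 149
2.9 | 3.0 | 2.9 | 2.9 |  | 2.9 | 135 | 144 | 149 | 124 |  | 150
2.9 | 3.0 | 2.9 | 2.9 |  | 2.9 | 141 | 144 | 151 | 126 |  | 152
2.9 | 3.0 | 2.9 | 2.9 |  | 3.0 | 143 | 145 | 151 | 131 |  | 152
2.9 | 3.0 | 2.9 | 3.0 |  | 3.0 | 148 | 145 | 151 | 136 |  | 152
2.9 | 3.1 | 2.9 | 3.1 |  | 3.0 | 150 | 153 | 151 | 137 |  | 153
2.9 | 3.1 | 2.9 | 3.1 |  | 3.0 | 152 | 160 | 151 | 138 |  | 153
3.0 | 3.3 | 3.0 | 3.2 |  | 3.0 | 157 | 162 | 155 | 143 |  | 154
3.0 |  | 3.0 | 3.2 |  | 3.1 | 159 |  | 155 | 148 |  | 156
3.0 |  | 3.0 | 3.2 |  | 3.1 | 159 |  | 156 | 156 |  | 158
3.0 |  | 3.0 | 3.3 |  | 3.2 | 160 |  | 156 | 163 |  | 161
3.0 |  | 3.0 | 3.4 |  | 3.4 | 160 |  | 157 | 164 |  | 167
3.0 |  | 3.0 |  |  |  | 161 |  | 159 |  |  | 
3.1 |  | 3.0 |  |  |  | 161 |  | 160 |  |  | 
3.1 |  | 3.1 |  |  |  | 162 |  | 162 |  |  | 
3.2 |  | 3.1 |  |  |  | 162 |  | 169 |  |  | 
3.2 |  | 3.1 |  |  |  | 167 |  | 169 |  |  | 
 |  | 3.1 |  |  |  |  |  | 170 |  |  | 
 |  | 3.1 |  |  |  |  |  | 175 |  |  | 

Summary over all enzymes (RD [Å], RA [°]):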

Average 3.0 141

Standard deviation 0.1 20

Minimum 2.7 91

Maximum 3.4 175

Median 2.9 145


Supplementary Table 5. The binding energies of molecules in ground and high-energy states.

Columns: ID, Name, followed by the binding energy [kcal/mol] in the high-energy state and in the ground state for DbjA, DmbA, LinB, DhaA, DhlA, and DbeA (in this order).

4 1-chlorobutane -2.6 -3.1 -2.7 -3.3 -2.6 -3.1 - -3.3 -2.8 -3.7 -2.6 -3.2

6 1-chlorohexane -3.1 -3.7 -3.3 -4.0 -3.1 -3.7 -3.3 -4.0 - - -3.2 -

18 1-bromobutane -2.8 -3.4 -3.0 -3.6 -2.9 -3.4 - -3.5 - - - -

20 1-bromohexane -3.3 - -3.6 -4.3 -3.3 -3.9 -3.6 -4.3 - - -3.4 -

28 1-iodopropane -2.8 - -2.9 - -2.8 - -3.0 -3.5 - - - -

29 1-iodobutane -3.0 -3.6 - - - -3.7 -3.3 -3.8 - - - -

31 1-iodohexane - - - - - - -3.9 -4.6 - - - -

37 1,2-dichloroethane -2.3 -2.8 * * * * - -2.9 - -3.2 * *

38 1,3-dichloropropane - -3.1 - -3.3 -2.7 -3.1 - -3.3 -2.9 -3.6 * *

40 1,5-dichloropentane -3.2 -3.8 -3.5 -4.0 -3.2 -3.7 -3.4 -4.0 - - -3.4 -4.0

47 1,2-dibromoethane -2.9 -3.3 -3.1 -3.5 -2.9 -3.3 - -3.4 - -3.8 -2.9 -3.3

48 1,3-dibromopropane - -3.6 - -3.9 -3.1 -3.6 - -3.8 - -3.9 -3.1 -3.7


52 1-bromo-3-chloropropane - -3.4 - -3.6 -2.9 -3.4 - -3.6 -3.0 -3.9 -2.9 -3.5

54 1,3-diiodopropane -3.6 -4.1 * * -3.7 -4.2 -4.0 -4.5 - - -3.6 -4.2

64 2-iodobutane -3.3 -3.9 - -4.1 -3.3 - -3.5 -4.1 - - - -

67s (S)-1,2-dichloropropane -2.8 -3.4 * * * * * * * * * *

67r (R)-1,2-dichloropropane -2.8 -3.4 * * * * * * * * * *

72s (S)-1,2-dibromopropane -3.4 -3.9 -3.5 -4.1 -3.5 -3.9 -3.6 -4.0 - - -3.4 -

72r (R)-1,2-dibromopropane -3.4 -3.9 -3.6 -4.1 -3.5 -3.9 -3.5 -4.1 - - -3.5 -3.9

76 2-bromo-1-chloropropane -3.1 -3.6 -3.2 -3.8 -3.2 -3.6 -3.2 -3.8 - - -3.2 -3.7

80 1,2,3-trichloropropane -3.1 -3.7 * * * * -3.2 -3.9 * * * *

111 bis(2-chloroethyl)ether - - -2.7 - -2.4 - -2.6 - * * -2.6 -

115 chlorocyclohexane -4.2 -4.7 * * -4.2 -4.7 -4.5 - * * * *

117 bromocyclohexane -5.0 -4.9 -5.3 - -5.0 -4.9 -5.2 - - - - -

119 1-bromomethylcyclohexane * * * * -4.8 -5.3 -5.0 -5.7 - - -4.9 -5.5

137 1-bromo-2-chloroethane -2.6 -3.1 -2.7 -3.2 -2.6 -3.0 -2.7 -3.2 -2.9 -3.5 -2.7 -3.1

138 chlorocyclopentane -3.8 -4.3 -3.8 -4.4 -3.7 -4.2 -3.8 -4.4 - - -3.7 -4.3

141 4-bromobutyronitrile ------

154 1,2,3-tribromopropane - -4.3 - -4.6 -3.9 -4.5 - -4.7 - - -3.9 -4.5

155 1,2-dibromo-3-chloropropane * * * * * * -3.8 -4.5 - - -3.8 -4.3

209 3-chloro-2-methylpropene -2.7 -3.2 -2.9 -3.4 -2.7 -3.2 -2.8 -3.4 - -3.5 -2.8 -

225 (E)-2,3-dichloropropene -2.9 -3.5 -3.0 -3.6 -2.9 -3.5 -3.0 -3.6 - -3.5 -3.0 -3.6

Average -2.9 -3.7 -3.3 -3.8 -3.3 -3.8 -3.5 -3.9 -2.9 -3.6 -3.3 -3.9

Difference between ground and high-energy states [kcal/mol] -0.7 -0.5 -0.5 -0.4 -0.7 -0.6

Supplementary Table 6. The percentage of successfully stabilized active substrates of HLDs in ground and high-energy states.

Columns: two values per enzyme (one for the ground state and one for the high-energy state) for DbjA, DmbA, LinB, DhaA, DhlA, and DbeA

Successfully stabilized molecules [%] 77 83 64 72 89 81 70 87 15 33 73 54

Supplementary Table 7. Fifty top-ranked molecules obtained from the virtual screening. The molecules selected for experimental testing are shaded.

Columns: EDULISS ID; RD [Å]; RA [°]; consensus rank; PubChem CID; vendor ID; compound ID

SPH1-168-893 3.16 161 1 414603 -

SPH1-240-048 3.04 159 2 494207 -

SPH1-112-111 2.97 147 3 4606569 -

SPH1-078-750 2.83 156 4 3557884 -

SPH1-013-027 2.99 162 5 945631 -

SPH1-179-465 3.27 165 6 97254 -

SPH1-177-022 3.29 154 7 415721 -

SPH1-031-089 3.24 169 8 40547482 -

SPH1-055-755 3.30 174 9 3271268 -

SPH1-164-424 3.12 177 10 943991 -

SPH1-010-322 2.91 151 11 622507 -

SPH1-035-173 3.26 173 12 1075625 -

SPH1-210-819 3.28 172 13 205513 -

SPH1-243-756 3.10 174 14 2393546 MolPort-002-465-087 C01

SPH1-225-410 3.20 161 15 3410181 -

SPH1-221-478 3.29 178 16 4321188 -

SPH1-115-805 3.09 159 17 2110811 MolPort-002-466-871 C02

SPH1-135-285 3.26 174 18 1741312 MolPort-001-666-016 C03

SPH1-015-208 3.22 145 19 7092313 -

SPH1-136-887 3.20 158 20 124109 -

SPH1-090-685 3.10 152 21 4087903 -

SPH1-212-728 3.14 170 22 5139625 -

SPH1-179-467 3.28 156 23 57550 -

SPH1-045-832 3.11 176 24 943534 MolPort-000-451-281 C04

SPH1-016-177 3.14 176 25 14870014 -

SPH1-039-309 3.19 178 26 57549 -

SPH1-044-459 3.26 170 27 919087 MolPort-000-469-738 C05

SPH1-207-006 2.92 166 28 286258

SPH1-102-333 3.17 143 29 16813824 MolPort-003-015-503 C06

SPH1-015-763 2.86 147 30 713079 MolPort-000-692-283 C07

SPH1-175-417 3.17 177 31 15758400 MolPort-000-479-874 C08

SPH1-207-003 2.89 163 32 286253

SPH1-192-829 3.13 150 33 1714004 MolPort-000-224-530 C09

SPH1-113-455 3.17 157 34 418034

SPH1-036-068 3.07 165 35 324872

SPH1-279-727 3.12 168 36 2442996

SPH1-143-410 3.21 157 37 153049

SPH1-170-579 3.26 172 38 414867

SPH1-028-421 3.24 164 39 5324672 MolPort-005-313-263 C10

SPH1-005-016 3.07 176 40 691093 MolPort-001-012-050 C11

SPH1-172-689 3.28 164 41 415141

SPH1-242-436 3.13 167 42 2392384 MolPort-002-462-950 C12

SPH1-163-729 3.13 157 43 244371

SPH1-034-348 3.16 169 44 38349

SPH1-160-035 3.16 172 45 940907

SPH1-151-527 3.28 177 46 1666398 MolPort-000-137-943 C13

SPH1-307-501 3.12 164 47 2470491 MolPort-002-466-005 C14

SPH1-017-212 3.18 163 48 1484767 MolPort-002-878-041 C15

SPH1-256-282 2.96 162 49 2405225

SPH1-057-739 3.26 177 50 222275 MolPort-002-469-005 C16

Supplementary Table 8. Reactive distances and energies of substrates selected for development of geometric criteria in CALB. The numbering of substrates corresponds with data published by Xu et al. (Xu, T.; Zhang, L.; Wang, X.; Wei, D.; Li, T. Structure-Based Substrate Screening for an Enzyme. BMC Bioinformatics 2009, 10, 257).

Columns: substrate; reactive distance D1 [Å]; reactive distance D2 [Å]; energy [kcal/mol]

LIGD1 3.4 3.5 -4.3

LIGD11 5.0 6.1 -5.0

LIGD30 5.7 5.6 -4.9

LIGD50 5.4 7.2 -5.2

LIGD60 5.5 5.4 -5.6

LIGD70 4.0 4.8 -6.1

LIGD80 3.4 3.4 -4.2

LIGD90 5.2 6.7 -5.4

LIGD100 3.3 5.4 -5.7

LIGD111 3.4 2.7 -6.3

LIGD120 3.5 3.7 -4.3

LIGD130 3.3 3.5 -3.8

LIGD140 3.2 2.8 -6.5

LIGD151 3.4 3.4 -3.4

LIGD160 3.8 5.7 -4.5

LIGD170 3.6 3.4 -5.3

LIGD181 5.9 6.7 -6.1

LIGD190 3.0 2.5 -5.6

LIGD200 4.1 3.6 -6.2

LIGD212 3.1 3.4 -7.3

LIGD223 5.0 4.6 -5.5

LIGD230 3.0 2.7 -6.1

Supplementary Table 9. The reactive parameters of compounds from the external dataset published by Xu et al. (Xu, T.; Zhang, L.; Wang, X.; Wei, D.; Li, T. Structure-Based Substrate Screening for an Enzyme. BMC Bioinformatics 2009, 10, 257).

Columns: compound; experimentally verified substrate; stabilized; reactive distance D1 [Å]; reactive distance D2 [Å]; energy [kcal/mol]

LIGD45 X X 3.2 3.0 -9.0

LIGD43 X - -8.8

LIGD44 X X 4.0 4.2 -8.5

LIGD102 X - -7.4

LIGD83 X - -7.2

LIGD21 X - -7.1

LIGD22 X - -7.1

LIGD46 X - -7.0

LIGD19 X - -7.0

LIGD208 X - -7.0

LIGD24 X - -6.9

LIGD23 X - -6.8

LIGD185 X X 5.8 6.7 -6.8

LIGD20 X - -6.7

LIGD103 X - -6.6

LIGD101 X - -6.5

LIGD211 X - -6.5

LIGD191 X X 3.3 3.4 -6.5

LIGD206 X X 5.4 5.2 -6.5

LIGD18 X - -6.5

LIGD141 - X 3.1 2.7 -6.5

LIGD104 X - -6.5

LIGD202 X X 5.6 5.3 -6.5

LIGD203 X X 5.7 5.1 -6.5

LIGD106 X - -6.5

LIGD142 - X 4.4 3.5 -6.4

LIGD73 X X 5.6 5.7 -6.4

LIGD184 X X 5.2 6.4 -6.4

LIGD93 X - -6.4

LIGD96 X - -6.4

LIGD186 X - -6.4

LIGD144 - - -6.3

LIGD178 X X 5.9 5.6 -6.3

LIGD146 X X 3.1 2.9 -6.3

LIGD204 X X 5.3 5.0 -6.3

LIGD199 X X 5.5 5.2 -6.2

LIGD28 X X 3.8 3.5 -6.2

LIGD94 X - -6.2

LIGD114 X X 3.0 3.2 -6.2

LIGD97 X X 3.3 5.4 -6.2

LIGD221 X - -6.2

LIGD231 X - -6.2

LIGD105 X X 5.3 5.0 -6.2

LIGD148 - - -6.2

LIGD201 X X 3.8 3.5 -6.2

LIGD229 X X 3.2 2.7 -6.2

LIGD149 X - -6.1

LIGD95 X - -6.1

LIGD222 X - -6.1

LIGD139 X X 4.4 4.0 -6.1

LIGD220 X - -6.1

LIGD145 X - -6.1

LIGD91 X X 5.3 7.1 -6.1

LIGD72 X X 5.7 5.5 -6.1

LIGD78 X X 3.1 2.2 -6.1

LIGD183 X X 3.7 3.6 -6.1

LIGD213 - X 4.8 4.5 -6.1

LIGD68 X X 4.0 4.8 -6.1

LIGD232 X X 4.6 4.6 -6.0

LIGD182 X X 6.0 6.7 -6.0

LIGD110 - X 3.8 4.1 -6.0

LIGD99 X X 2.9 2.3 -6.0

LIGD136 X - -6.0

LIGD143 - X 5.1 5.6 -6.0

LIGD210 X - -6.0

LIGD67 X X 4.2 5.2 -6.0

LIGD227 X X 4.9 4.5 -6.0

LIGD27 - X 4.4 3.4 -6.0

LIGD108 X - -6.0

LIGD219 X - -6.0

LIGD205 X X 5.3 5.2 -6.0

LIGD134 X X 3.5 3.4 -5.9

LIGD209 X - -5.9

LIGD107 X X 2.9 2.7 -5.9

LIGD228 X X 3.1 1.9 -5.9

LIGD98 X X 4.3 5.7 -5.9

LIGD174 X X 3.2 2.3 -5.9

LIGD187 X X 2.9 3.4 -5.9

LIGD76 X X 3.3 3.7 -5.9

LIGD180 X - -5.9

LIGD216 X - -5.9

LIGD169 X X 3.5 3.3 -5.9

LIGD77 X X 3.1 3.3 -5.8

LIGD207 X X 5.5 5.2 -5.8

LIGD59 X X 5.2 6.6 -5.8

LIGD113 X X 3.0 3.3 -5.8

LIGD215 X X 2.9 3.1 -5.8

LIGD225 X X 4.9 4.6 -5.8

LIGD65 X X 4.0 4.7 -5.8

LIGD122 X X 4.9 5.5 -5.8

LIGD63 X X 4.2 5.1 -5.8

LIGD171 X X 3.1 2.8 -5.8

LIGD163 X - -5.7

LIGD147 - - -5.7

LIGD109 X - -5.7

LIGD71 X X 3.9 4.6 -5.7

LIGD61 X X 5.2 6.8 -5.7

LIGD198 X - -5.7

LIGD226 X X 3.2 1.9 -5.7

LIGD69 X X 3.4 3.0 -5.7

LIGD115 X X 6.7 5.6 -5.7

LIGD218 X X 5.2 5.1 -5.7

LIGD89 X X 5.3 7.2 -5.6

LIGD112 - X 4.5 4.5 -5.6

LIGD196 X X 4.5 5.9 -5.6

LIGD150 X - -5.6

LIGD172 X X 6.0 5.7 -5.6

LIGD92 X X 3.0 2.8 -5.5

LIGD175 X X 3.0 2.6 -5.5

LIGD64 X X 3.8 4.3 -5.5

LIGD126 - X 5.3 4.7 -5.5

LIGD138 X X 3.6 2.8 -5.5

LIGD176 X X 3.0 2.9 -5.5

LIGD168 X X 3.0 2.6 -5.5

LIGD58 X X 5.2 6.6 -5.4

LIGD164 X X 3.8 3.2 -5.4

LIGD173 X X 3.1 2.8 -5.4

LIGD32 X X 5.8 5.3 -5.4

LIGD137 X X 5.2 4.7 -5.4

LIGD42 - X 6.0 5.8 -5.4

LIGD66 X X 4.0 4.9 -5.3

LIGD166 X X 2.9 2.4 -5.3

LIGD51 X X 5.5 7.2 -5.3

LIGD162 X - -5.3

LIGD188 X X 3.9 4.8 -5.3

LIGD224 X X 5.0 4.6 -5.3

LIGD41 - - -5.3

LIGD34 X X 5.9 5.6 -5.3

LIGD88 X X 5.2 6.8 -5.3

LIGD189 X X 3.3 3.4 -5.3

LIGD135 X X 3.7 3.6 -5.3

LIGD177 X X 3.0 2.5 -5.3

LIGD214 X - -5.2

LIGD57 X X 5.7 7.2 -5.2

LIGD192 X X 3.6 4.3 -5.2

LIGD26 X X 3.6 3.3 -5.2

LIGD40 - X 5.9 5.7 -5.2

LIGD87 X X 6.5 5.9 -5.2

LIGD31 X X 5.7 5.7 -5.2

LIGD48 X X 5.3 7.2 -5.1

LIGD47 X X 5.3 7.0 -5.1

LIGD82 X X 3.2 2.4 -5.1

LIGD116 X - -5.1

LIGD119 X X 4.7 5.1 -5.1

LIGD39 - X 5.9 5.8 -5.1

LIGD123 X - -5.1

LIGD165 X X 3.7 3.7 -5.1

LIGD133 X X 4.5 4.9 -5.1

LIGD36 - X 5.8 5.6 -5.1

LIGD62 X X 3.0 3.1 -5.1

LIGD233 X X 3.8 4.3 -5.1

LIGD33 - X 5.7 5.6 -5.0

LIGD49 X X 5.3 7.2 -5.0

LIGD117 X X 5.3 4.8 -5.0

LIGD12 - X 3.1 5.4 -5.0

LIGD35 X X 5.7 5.5 -5.0

LIGD86 X - -5.0

LIGD132 X X 3.3 3.4 -5.0

LIGD167 X X 3.1 2.6 -5.0

LIGD25 - X 3.7 3.2 -5.0

LIGD84 X X 6.0 5.7 -4.9

LIGD37 - X 5.7 6.0 -4.9

LIGD118 X X 3.9 4.4 -4.9

LIGD29 X X 5.6 5.6 -4.9

LIGD54 X X 3.9 3.8 -4.9

LIGD195 X X 5.4 6.2 -4.9

LIGD85 X X 5.3 7.0 -4.8

LIGD128 X X 3.2 3.2 -4.8

LIGD81 X X 5.5 5.7 -4.8

LIGD121 X X 3.6 4.0 -4.7

LIGD127 X X 3.3 3.4 -4.7

LIGD217 X - -4.7

LIGD38 X X 5.7 5.9 -4.6

LIGD10 - X 4.6 6.2 -4.5

LIGD193 X X 4.2 3.8 -4.5

LIGD179 X X 3.2 3.0 -4.5

LIGD194 X X 5.1 5.7 -4.5

LIGD129 X - -4.4

LIGD9 X X 3.3 5.5 -4.4

LIGD2 - X 3.6 3.5 -4.4

LIGD159 X X 3.1 3.1 -4.4

LIGD161 X - -4.3

LIGD131 X X 3.8 4.1 -4.3

LIGD197 X X 6.4 7.4 -4.2

LIGD125 X - -4.2

LIGD124 X - -4.1

LIGD158 X X 3.0 3.1 -4.1

LIGD4 - X 3.2 3.0 -4.1

LIGD79 X X 3.2 3.4 -4.1

LIGD6 - X 3.3 3.2 -4.1

LIGD3 X X 3.7 4.1 -4.0

LIGD5 X X 3.3 3.4 -4.0

LIGD17 X X 5.9 5.6 -4.0

LIGD157 X X 3.1 3.1 -4.0

LIGD156 X X 3.1 3.2 -3.9

LIGD53 X X 5.7 5.5 -3.9

LIGD155 X X 3.3 2.3 -3.8

LIGD16 - X 3.6 3.6 -3.8

LIGD153 X X 3.7 3.4 -3.8

LIGD8 - X 3.3 3.1 -3.8

LIGD7 X X 3.3 3.5 -3.8

LIGD15 X X 3.3 3.5 -3.8

LIGD56 X X 5.7 5.6 -3.7

LIGD154 X X 3.1 3.2 -3.6

LIGD52 X X 4.4 4.5 -3.6

LIGD152 X X 3.4 3.4 -3.6

LIGD75 X X 4.5 4.0 -3.4

LIGD55 X X 3.3 3.5 -3.3

LIGD14 - X 3.3 3.4 -3.3

LIGD13 X X 3.5 5.8 -3.3

LIGD74 X X 3.2 4.5 -3.2

Supplementary Figure 1. A region of the DmmA enzyme selected for the molecular docking. The region is represented by the gray box with its centre shown as a black cross, the structure of DmmA by a green cartoon, and the catalytic nucleophile and the catalytic base by green sticks. The catalytic acid and the two halide-stabilizing residues are represented by gray lines.

Supplementary Figure 2. The geometric parameters used for estimation of reactivity in CALB (PDB ID: 1TCA). The reactive distances (D1 and D2) are represented by black dashed lines. The hydrogen bonds providing stabilization of the carbonyl are represented by orange dashed lines.

Supplementary Figure 3. Cumulative distribution functions of a) reactive distance and b) reactive angle in HLDs. The dashed line represents the values recalling 70% of active substrates.

Supplementary Figure 4. Distribution of binding energies of 11,273 properly stabilized molecules according to AutoDock.

Supplementary Figure 5. Distribution of binding energies of 548 molecules which were selected for rescoring.

Supplementary Figure 6. Activity assay of LinB (green), DhaA (blue) and DbjA (orange) towards compounds identified by the virtual screening. Each bar represents an average of three independent experiments. 100% represents the activity of DmmA towards 1,2-dibromoethane (DBE).

Supplementary Figure 7. False positive rate for groups of compounds selected for experimental validation of activity. The expected false positive rate for the whole dataset is marked by the black line.

CHAPTER 2

Discovery of novel haloalkane dehalogenase inhibitors

Appl. Environ. Microbiol. 2016, 82, 1958–1965

DOI: 10.1128/AEM.03916-15

Abstract

Haloalkane dehalogenases (HLDs) have recently been discovered in a number of bacteria, including symbionts and pathogens of both plants and humans. However, the biological roles of HLDs in these organisms are unclear. The development of efficient HLD inhibitors serving as molecular probes to explore their function would represent an important step toward a better understanding of these interesting enzymes. Here we report the identification of inhibitors for this enzyme family using two different approaches. The first builds on the structures of the enzymes’ known substrates and led to the discovery of less potent nonspecific HLD inhibitors. The second approach involved the virtual screening of 150,000 potential inhibitors against the crystal structure of an HLD from the human pathogen Mycobacterium tuberculosis H37Rv. The best inhibitor exhibited high specificity for the target structure, with an inhibition constant of 3 µM and a molecular architecture that clearly differs from those of all known HLD substrates. The new inhibitors will be used to study the natural functions of HLDs in bacteria, to probe their mechanisms, and to achieve their stabilization.

Introduction

Haloalkane dehalogenases (HLDs; EC 3.8.1.5) are enzymes that catalyze the hydrolytic cleavage of carbon-halogen bonds in halides (Figure 1), with a wide range of potential applications in biocatalysis, biodegradation, biosensing, decontamination, and cell imaging [231]. In structural terms, they belong to the α/β-hydrolase superfamily [39,40,232]. The HLDs have broad substrate specificity, enabling them to catalyze conversion of diverse chlorinated, brominated and iodinated alkanes, alkenes, cycloalkanes, alcohols, epoxides, carboxylic acids, esters, ethers, amides, and nitriles [30,49]. The first known members of this family were isolated from the bacteria Xanthobacter autotrophicus GJ10, Sphingomonas paucimobilis UT26, and Rhodococcus rhodochrous NCIMB13064 [40,233,234], which colonize environments that have been heavily contaminated with halogenated pollutants, such as 1,2-dichloroethane, 1,2,3,4,5,6-hexachlorocyclohexane, and 1-chlorobutane. In these microorganisms, the HLDs were found to be components of metabolic pathways that enable the microbes to utilize otherwise toxic haloalkanes as their sole source of carbon and energy. The HLDs have recently been discovered in a wider range of organisms, including symbiotic bacteria such as Bradyrhizobium japonicum USDA110 and Mesorhizobium loti MAFF303099 [42], pathogenic bacteria such as Agrobacterium tumefaciens C58 and Mycobacterium spp. [235,236], and the eukaryotic organism Strongylocentrotus purpuratus [48]. Even though HLDs have been studied intensively over the last 25 years and isolated from many different environments and species, most of their biological functions remain elusive. For instance, it is undoubtedly interesting that the plant pathogen Agrobacterium tumefaciens C58 carries the HLD-encoding gene datA on its tumor-inducing plasmid, although there is no clear link between HLD activity and tumorigenesis [237]. Similarly, the presence of three different HLD genes in the genome of the human pathogen Mycobacterium tuberculosis H37Rv [44] suggests that HLDs are important for its survival, but it is not immediately obvious why an organism that colonizes human tissues would require enzymes that cleave carbon-halogen bonds. One way to study the natural function of an enzyme is to use specific inhibitors. However, no attempts to systematically identify HLD inhibitors have yet been reported. Such molecules would be extremely useful in studies on the natural functions of HLDs in bacteria, but they may also find use in detailed kinetic and mechanistic studies and as enzyme stabilizers. Inhibitors may also facilitate enzyme crystallization by increasing internal structural stability, and noncovalent inhibitors are useful during long-term storage of proteins.

Here we present the first systematic search for HLD inhibitors. Two different and complementary search techniques were adopted: a ligand-based approach and a structure-based approach. The ligand-based approach involved the rational design of inhibitors based on the structures of the enzymes’ known substrates, while the structure-based approach relied on a virtual screen of candidate inhibitors against an experimentally determined HLD structure, specifically targeting molecules that bind noncovalently to the enzyme active site. Both approaches were expected to provide noncovalent competitive inhibitors. The set of molecules selected by these theoretical approaches was tested for inhibitory effects using conventional activity and kinetic assays. The discovered inhibitors show a wide range of binding affinities with interesting selectivity for individual HLDs.

Figure 1. General scheme of the reaction mechanism of HLDs. Enz, enzyme.

Materials and Methods

Preparation of ligands for molecular docking

The clean drug-like subset of the ZINC database [226] was searched for molecules satisfying the following selection criteria: xlogP value of ≤5, molecular weight between 150 and 500, number of H bond donors of ≤5, and number of H bond acceptors of ≤10. This yielded 149,662 hit molecules, whose three-dimensional structures were downloaded and filtered to remove those with Tanimoto similarity coefficients in excess of 0.8. The resulting molecular library was enriched with six ligand-based inhibitors that were chosen to act as positive controls. The structures of these six inhibitors were built in Avogadro [238]. Input files in Sybyl mol2 format were converted into an AutoDock-compatible format using MGLTools [207].
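The selection criteria and the similarity cutoff described above can be sketched as follows. This is a minimal illustration, not the workflow used in the study: the `Molecule` record, the set-based fingerprint, and the greedy, order-dependent removal of similar molecules are all assumptions, since the text does not state which fingerprint or which filtering procedure was applied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Molecule:
    """Hypothetical record holding the properties used for filtering."""
    name: str
    xlogp: float
    mol_weight: float
    h_donors: int
    h_acceptors: int
    fingerprint: frozenset  # indices of the bits set in a binary fingerprint


def passes_property_filter(m):
    """Apply the selection criteria quoted in the text."""
    return (m.xlogp <= 5
            and 150 <= m.mol_weight <= 500
            and m.h_donors <= 5
            and m.h_acceptors <= 10)


def tanimoto(a, b):
    """Tanimoto coefficient of two bit sets: |a & b| / |a | b|."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def select_diverse(molecules, threshold=0.8):
    """Keep molecules that pass the property filter and are not more similar
    than `threshold` to any molecule already kept (greedy; an assumption)."""
    kept = []
    for m in molecules:
        if not passes_property_filter(m):
            continue
        if all(tanimoto(m.fingerprint, k.fingerprint) <= threshold for k in kept):
            kept.append(m)
    return kept
```

Because the greedy pass depends on input order, a different ordering of the library can retain a different representative of each group of similar molecules.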

Preparation of the receptor structure for molecular docking

The crystal structure of HLD DmbA (PDB ID 2QVB) was selected for use as a receptor in virtual screening [81]. Gasteiger charges and AutoDock atom types were assigned using MGLTools, and hydrogen atoms were added using the H++ server [210] at pH 7.5. The hydrogen atom bound to the NE2 atom of His 280 was deleted to reflect the catalytic mechanism of the HLDs. Before performing binding energy calculations, the structure was subjected to energy minimization using the Sander module of AMBER 11 [239] with the ff03.r1 force field [215]. The geometry optimization protocol involved performing 250 steepest descent steps followed by 750 conjugate gradient energy minimization steps. The convergence criterion for the energy gradient was set to 0.1 kcal · mol-1 · Å-1. The nonbonded cutoff and dielectric multiplicative constant for electrostatic interactions were set to 50 Å and 4 rij, respectively.

Molecular docking, rescoring, and clustering

The active site of DmbA was selected as the target for molecular docking, which was performed using AutoDock Vina [240]. The region of the active site selected for molecular docking was set to 20 x 19 x 20 Å, centered at (20.45; 14.59; 12.77) Å. The center of this region was located between the positions of the nucleophile and the catalytic histidine. The docked conformations were rescored using NNScore 2.0 [212], which evaluates the conformation of a molecule with 20 distinct neural-network scoring functions. The final score was obtained by averaging the scores given by these 20 functions. The docked conformations were clustered according to the common features of their binding modes. The clustering analysis was performed using AuPosSOM [137] with the default settings for all parameters other than map_size, which was changed to 6 x 5 in order to increase the maximal number of clusters. A tree representation of the clustering results was generated using Dendroscope [241].
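For orientation, the extent of the docking region and the averaging of the rescoring functions can be written out explicitly. The sketch below only restates the arithmetic implied by the text; the function names are hypothetical, and the box is assumed to be axis-aligned in the coordinate frame of the receptor structure.

```python
def box_bounds(center, size):
    """Return the minimum and maximum corners of an axis-aligned box."""
    lo = tuple(c - s / 2 for c, s in zip(center, size))
    hi = tuple(c + s / 2 for c, s in zip(center, size))
    return lo, hi


def inside(point, center, size):
    """Check whether a point (x, y, z) lies within the docking box."""
    lo, hi = box_bounds(center, size)
    return all(l <= p <= h for p, l, h in zip(point, lo, hi))


def ensemble_score(scores):
    """Average the scores returned by the individual scoring functions."""
    return sum(scores) / len(scores)


# Box used in the text: 20 x 19 x 20 Å centered at (20.45; 14.59; 12.77) Å
CENTER = (20.45, 14.59, 12.77)
SIZE = (20.0, 19.0, 20.0)
lower_corner, upper_corner = box_bounds(CENTER, SIZE)
```

With these values the box spans roughly 10.45 to 30.45 Å in x, 5.09 to 24.09 Å in y, and 2.77 to 22.77 Å in z.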

Calculation of binding energies

The free-energy differences between the bound and free states of the receptor and various ligands were calculated by the molecular-mechanics/generalized-Born surface area (MM/GBSA) method. Free-energy differences were calculated by combining gas-phase energy contributions with solvation free-energy components calculated using an implicit solvent model for each species. Force field parameters for the docked conformations of ligands were prepared using the Antechamber and Parmchk modules of AmberTools 1.5. AM1-BCC charges [242] were assigned to individual atoms of ligands with the Antechamber module of AmberTools 1.5. Input topologies for the receptor, ligands, and receptor-ligand complexes were prepared with the Leap module of AMBER 11, using the ff03.r1 force field for proteins and the general AMBER force field [216] for ligands. The PBradii were set to mbondi2 [217]. The structures of receptor-ligand complexes were subjected to two rounds of energy minimization using the Sander module of AMBER 11. The first round involved a short minimization with 250 steepest descent steps followed by 750 conjugate gradient minimization steps. The convergence criterion for the energy gradient was set to 0.1 kcal · mol-1 · Å-1, while the nonbonded cutoff and dielectric multiplicative constant for the electrostatic interactions were set to 50 Å and 4 rij, respectively. The second round of minimization was done in an implicit solvent with the following parameters: 100 steps of steepest descent followed by 400 steps of conjugate gradient energy minimization, a nonbonded cutoff of 16 Å, an interior dielectric constant of 2 [218], and the generalized-Born model parameter set to 2 [217,219]. The convergence criterion for the energy gradient was set to 0.1 kcal · mol-1 · Å-1. The nonpolar contribution to the solvation energy was computed with the LCPO (linear combination of pairwise overlaps) model [220]. Next, an MM/GBSA refinement of the binding energy was performed on the minimized structure using the Python script MMPBSA.py [221] from AmberTools 1.5, with the settings from the second round of minimization. Finally, a consensus score for each conformation of the docked ligands was calculated by averaging the ranks obtained using MM/GBSA and NNScore 2.0.
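The consensus scoring step can be illustrated with a short sketch. It assumes that a more negative MM/GBSA energy and a higher NNScore both indicate stronger predicted binding, and that ties are broken by sort order; neither detail is specified in the text, and the function names are hypothetical.

```python
def rank(values, best_is_low):
    """Map each key to its 1-based rank (1 = best predicted binder)."""
    ordered = sorted(values, key=values.get, reverse=not best_is_low)
    return {key: position for position, key in enumerate(ordered, start=1)}


def consensus(mmgbsa, nnscore):
    """Average the two ranks of every docked conformation.

    mmgbsa:  conformation -> MM/GBSA binding energy (lower is better)
    nnscore: conformation -> averaged NNScore 2.0 value (assumed: higher is better)
    """
    by_energy = rank(mmgbsa, best_is_low=True)
    by_nnscore = rank(nnscore, best_is_low=False)
    return {key: (by_energy[key] + by_nnscore[key]) / 2 for key in mmgbsa}
```

Conformations are then ordered by ascending consensus value, so that a compound ranked highly by both methods appears at the top of the list.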

Enzyme expression and purification

Four optimized recombinant genes, linB-His6, dhaA-His6, dbjA-His6, and dmbA-His6, were subcloned into the expression vector pET21b. Escherichia coli BL21(DE3) cells were grown at 37°C in LB medium with ampicillin as a selection marker (final concentration, 100 µg · ml-1) and induced by adding isopropyl-β-D-thiogalactopyranoside (IPTG) to a final concentration of 1 mM once the culture reached an optical density at 600 nm (OD600) of 0.4. Overexpression was carried out at 20°C for 8 h. Harvested cells were disrupted by sonication using a Hielscher UP200S sonicator (Teltow, Germany) set at 0.3-s pulses and 85% amplitude. Crude extracts were purified by metalloaffinity chromatography on a 5-ml nickel-nitrilotriacetic acid (Ni-NTA) Superflow column (Qiagen, Germany) as reported elsewhere [243]. Eluted fractions containing the desired proteins were dialyzed against 50 mM phosphate buffer. The homogeneity, purity, and expression level of each protein were evaluated by SDS-PAGE; all of the proteins used in subsequent experiments had purities above 95%.

Enzyme activity assay for ligand-based inhibitors

All reactions were performed at 25°C in 25-ml Reacti flasks sealed with Mininert valves. The reaction mixtures consisted of 10 ml of 100 mM (pH 8.6) glycine buffer and 10 µl of 1,2-dibromoethane (DBE) as the substrate, together with 10 µl of the inhibitor being tested, and were initiated by injection of 200 µl of 0.2-mg · ml-1 enzyme. The progress of the enzymatic reaction was monitored by withdrawing 1-ml samples of the reaction mixture 0, 7, 14, 21, and 28 min after the start of the experiment and promptly quenching the samples by mixing them with 100 µl of 35% nitric acid. The released halide ions in the acid-quenched samples were then complexed with mercuric thiocyanate and ferric ammonium sulfate and determined on a Sunrise microplate reader (TECAN, Austria) at 460 nm [244]. The enzymes’ dehalogenation activity was quantified as the rate of product formation over time after correction for the rate of abiotic hydrolysis.

Enzyme activity assay for structure-based inhibitors

Compounds identified by virtual screening as potential DmbA inhibitors and selected for experimental evaluation were purchased in powder form at crystalline purity from MolPort (MolPort, Latvia). Library stock solutions were prepared by dissolving 1 mg of the relevant compound to a concentration of 5 mg · ml-1 in 200 µl of 99% dimethyl sulfoxide (DMSO). Serial dilutions of the stock solutions were then prepared using pure DMSO to reach a final inhibitor concentration of 0.039 mg · ml-1. The library of structure-based inhibitors identified by virtual screening was evaluated using a microtiter plate screening assay. Reaction progress was monitored using a modification of Holloway’s assay [222]. The pH indicator phenol red was added to 1.0 mM HEPES buffer (pH 8.0) to a final concentration of 60 µM and used to prepare reaction mixtures in transparent 96-well microtiter plates. The mixtures comprised 120 µl of buffer with indicator and DBE (9.3 mM final concentration), 15 µl of inhibitor in DMSO, and 15 µl of enzyme. The total volume of each reaction mixture was thus 150 µl. The enzyme-catalyzed haloalkane hydrolysis reaction caused a gradual decrease in the pH of the reaction mixture, which was detected by monitoring the resulting change in absorbance at 550 nm using a FLUOstar Optima spectrometer (BMG Labtech, USA). The level of inhibition achieved with each compound was calculated with reference to control reactions performed in the absence of inhibitor. The average reaction rates observed in three independent experiments were fitted to a logistic dose-response model (equation 1) using Origin 9.1 (OriginLab, USA), and the IC50 for each inhibitor was determined from the inflection point of the corresponding fitted curve.
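Assuming the standard four-parameter logistic function available in Origin, equation 1 can be written in the following form, which is consistent with the symbols defined below but is a reconstruction rather than a quotation:

```latex
v = A_2 + \frac{A_1 - A_2}{1 + \left( \dfrac{[I]}{\mathrm{IC}_{50}} \right)^{p}}
\qquad (1)
```

In this form v approaches the upper limit A1 as [I] approaches zero, approaches the lower limit A2 at high [I], and equals (A1 + A2)/2 at [I] = IC50.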

In equation 1, v is the observed reaction velocity, A2 and A1 are the lower and upper limits, respectively, p is the slope, [I] is the concentration of the inhibitor, and IC50 is the inhibitor concentration causing 50% inhibition of enzymatic activity.
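The dose-response model and the dilution scheme can be sketched numerically. Two assumptions are made here: the model is taken to be the standard four-parameter logistic function with the symbols defined above, and the serial dilution is taken to be two-fold, which is not stated in the text but reproduces the quoted end point (5 mg · ml-1 halved seven times gives 0.039 mg · ml-1).

```python
def logistic_rate(conc, a1, a2, ic50, p):
    """Reaction velocity at inhibitor concentration `conc`.

    a1: upper limit (velocity without inhibitor)
    a2: lower limit (velocity at saturating inhibitor)
    ic50: concentration giving the midpoint between a1 and a2
    p: slope factor (assumed positive)
    """
    return a2 + (a1 - a2) / (1.0 + (conc / ic50) ** p)


def twofold_series(start, steps):
    """Concentrations obtained by halving `start` repeatedly, `steps` times."""
    return [start / 2 ** i for i in range(steps + 1)]


# Hypothetical dilution series starting from the 5 mg/ml stock
concentrations = twofold_series(5.0, 7)
```

In practice the parameters a1, a2, ic50, and p would be obtained by nonlinear least-squares fitting of the measured rates, which the study carried out in Origin.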

Protein stability by differential scanning calorimetry

Thermal unfolding of 1 mg · ml-1 DmbA was conducted in the absence and presence of 93 µM inhibitor 22 in 1 mM HEPES buffer (pH 8.0). The system’s heat capacity was monitored using a VP-capillary differential scanning calorimetry system (MicroCal, USA). Experiments were performed at temperatures of 20 to 80°C, ramping at 1°C · min-1. Baseline subtraction and peak maximum determination were performed using Origin 7.0 (OriginLab, USA) with the DSC plugin provided by MicroCal.

Analysis of binding by isothermal titration calorimetry

All calorimetry reactions were performed at 25°C in 1 mM HEPES buffer (pH 8.0) containing 10% DMSO, with final enzyme and inhibitor 22 concentrations of 43 µM and 300 µM, respectively. All measurements were performed using a MicroCal VP-isothermal titration calorimetry system (GE Healthcare Life Sciences, Sweden). An inhibitor solution prepared by dissolving 1 mg of compound 22 in 200 µl of 99% DMSO was injected stepwise (28 injections of 10 µl each) into a solution of the enzyme in the HEPES buffer inside the reaction chamber. The injection duration was 20 s, with 150-s intervals between injections. The stirring speed was set to 502 rpm, and the feedback mode was set to high. The calorimetric data were evaluated using a single-site binding model in Origin 7.0 (OriginLab, USA).

Analysis of inhibition mechanism by isothermal titration calorimetry

The dependence of the rate of the DmbA reaction on the concentration of DBE was measured in the presence of 0, 2.5, 5, and 10 µM compound 22. All reactions were performed at 25°C in 1 mM HEPES buffer (pH 8.0) containing 10% DMSO. The substrate solution was prepared by adding 10 µl of DBE to 4 ml of reaction buffer containing an appropriate inhibitor concentration and incubated in a water bath at 37°C for 30 min. The final concentration of substrate for each measurement was determined by gas chromatography with flame ionization detection (GC-FID) on a Cyclosil-B LTM-II capillary column (Agilent, USA). DBE was extracted from the substrate solution into methanol containing 1,2-dichloroethane as an internal standard. Injection of 1-µl samples on the column was done at 250°C; the oven temperature was increased from 40 to 220°C at a 30°C · min-1 gradient followed by 6 min at a constant temperature of 220°C. The flow of carrier gas was constant at 2.0 ml · min-1. The kinetic measurements were performed using a MicroCal isothermal titration calorimetry (ITC) system (GE Healthcare Life Sciences, Sweden). The substrate solution was titrated into the measurement cell containing enzyme solution of identical composition to avoid generation of a dilution heat. Each injection increased the amount of substrate in the reaction cell while pseudo-first-order conditions were maintained. A total of 28 injections of 10 µl, each lasting 20 s, were carried out during titration. Instrumental feedback mode was set to none. The protein concentration used was between 0.9 and 5.0 µM, depending on the amount of inhibitor present during the reaction. The reaction rates reached after every injection (in units of thermal power) were converted to enzyme turnover by using apparent molar enthalpy (ΔHapp):
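A form of this relation consistent with the symbol definitions that follow, assuming that Q denotes thermal power as stated there, is:

```latex
\frac{d[P]}{dt} = \frac{Q}{V \, \Delta H_{\mathrm{app}}}
```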

where [P] is the molar concentration of product generated, V is the volume of the solution in the reaction cell, and Q is enzyme-generated thermal power. Apparent molar enthalpy of the DBE conversion by DmbA (ΔHapp = 0.094 kcal · mol-1) was determined in a separate experiment that allowed the reaction to proceed to completion. In the experiment, 6.86 nmol of DBE was fully converted by DmbA, and the total heat of conversion was obtained by integration of the ITC signal:
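Under the same assumption that Q denotes thermal power, integrating the signal over the full course of the reaction gives a relation of the following form (a reconstruction from the surrounding definitions):

```latex
\Delta H_{\mathrm{app}} = \frac{1}{[S] \, V} \int_{0}^{\infty} Q(t) \, dt
```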

where [S] is the molar concentration of substrate converted. The average from the three experiments was used. Observed reaction rates were globally fitted using Origin 9.0 software (OriginLab, USA) according to equation 2:

where v is the reaction velocity, Vmax is the limiting maximal velocity, Ks is the substrate concentration producing half occupation of the binding site, [S] is the substrate concentration, [I] is the inhibitor concentration, Ksi is the dissociation constant of the SES complex (substrate inhibition constant), Ki is the dissociation constant of the enzyme-inhibitor (EI) complex, n is the Hill coefficient describing cooperative binding of substrate, m is the Hill coefficient describing the cooperative mode of substrate inhibition, α is the factor by which Ks changes when I occupies the enzyme (interaction factor) or Ki changes when the enzyme is occupied by S, and β is the factor by which the productivity (rate of breakdown of the ESI complex to EI + P) is affected.

Results

Ligand-based inhibitors

The first set of inhibitors was rationally designed based on structural similarity to the best known HLD substrates. Due to their similarity with substrates, these molecules are expected to bind to the enzyme active site and competitively affect the enzymatic reaction. Since HLDs cannot cleave carbon-fluorine bonds [70], we selected fluorinated analogues of common HLD substrates as potential noncovalent inhibitors. The set of ligand-based fluorinated analogues consisted of 1-fluoropropane (FPE), 1-fluorohexane (FHE), 1-fluorocyclohexane (FCH), and 1,3-difluoropropane (DFP). This set was augmented with two other halogenated compounds, bromocyclopropane (BCP) and 1-iodo-2,2-dimethylpropane (NPI), which are also very similar to known HLD substrates but do not readily undergo bimolecular nucleophilic substitution in the active sites of these enzymes.

We assayed the inhibitory properties of these six ligand-based inhibitors (Figure 2A) against four widely studied HLDs, namely DbjA from Bradyrhizobium japonicum USDA110 [42], DhaA from Rhodococcus rhodochrous NCIMB13064 [245], DmbA from Mycobacterium tuberculosis H37Rv [44], and LinB from Sphingobium japonicum UT26 [243], using DBE as the test substrate. Testing concentrations were close to solubility limits (5 to 15 mM), and all of the ligand-based inhibitors exhibited significant effects on most of the selected HLDs. The tested dehalogenases varied in their specificity toward the ligand-based inhibitors. The widest specificity was observed for LinB, which was significantly inhibited by all of the tested inhibitors. On the other hand, narrow specificity was observed for DmbA, whose activity was significantly reduced by only one of the tested inhibitors (Figure 3 and Supplementary Table 1). FCH was the inhibitor with the broadest impact, reducing the activities of all of the tested enzymes by between 40% and 90%. DFP was the weakest inhibitor, reducing activity by no more than 20% for any of the tested enzymes. The strongest and the most specific inhibition was observed for FHE, which caused complete inhibition of DbjA when present at its saturated concentration. For this best case, subsequent experiments with decreasing concentrations of FHE provided an IC50 of 2.7 mM (Supplementary Figure 1). A trend relating the specificity of the inhibitors and substrates can be deduced from the correlation matrix comparing the enzymes and their inhibition levels (Supplementary Table 2). While the more inhibited LinB, DhaA, and DbjA belong to substrate specificity group 1, the least inhibited DmbA belongs to substrate specificity group 2. The same trend can be observed in the correlation table, where the correlation coefficients for LinB, DhaA, and DbjA are significantly higher than for DmbA.

Figure 2. Chemical structures of ligand-based (A) and structure-based (B) HLD inhibitors. While the ligand-based inhibitors are structurally similar to known substrates, the structure-based inhibitors differ strongly from any substrate of the target enzymes. FPE, 1-fluoropentane; FHE, 1-fluorohexane; FCH, 1-fluorocyclohexane; DFP, 1,3-difluoropropane; BCP, bromocyclopropane; NPI, 1-iodo-2,2-dimethylpropane. A complete list of structure-based inhibitors and corresponding PubChem codes can be found in Supplementary Table 3.


Chapter 2

Figure 3. Effects of ligand-based inhibitors on the activity of DmbA (gray), DbjA (white), LinB (black), and DhaA (cross hatched). Each value represents the average from at least three independent experiments; error bars indicate standard deviations. In each case, 100% activity corresponds to the enzyme’s activity toward DBE in the absence of inhibitors. FPE, 1-fluoropropane (8.6 mM); FHE, 1-fluorohexane (7.5 mM); FCH, 1-fluorocyclohexane (8.8 mM); DFP, 1,3-difluoropropane (11.3 mM); BCP, bromocyclopropane (14.9 mM); NPI, 1-iodo-2,2-dimethylpropane (7.7 mM).

Structure-based inhibitors

Since the ligand-based approach provided only weak inhibitors with a millimolar range of effective concentration, we decided to apply a structure-based approach using virtual screening. DmbA was selected as the target enzyme because (i) its crystal structure has been solved at a high resolution, (ii) it originates from the pathogenic bacterium Mycobacterium tuberculosis H37Rv, which primarily colonizes human tissues and whose role in catalysis of dehalogenation reactions is not obvious,44,246,247 (iii) genetic engineering of mycobacteria is complicated due to the extremely slow growth and high pathogenicity of the organisms, and (iv) it was the enzyme least affected by the ligand-based inhibitors.

In total, we docked 142,662 structurally diverse molecules into the active site of DmbA. The predicted binding energies ranged from -10.7 to 43.8 kcal·mol⁻¹. The 10,000 molecules with the lowest binding energies, ranging from -10.7 to -7.8 kcal·mol⁻¹, were selected for further investigation. Improved binding-energy estimates for this set of molecules were obtained by the molecular mechanics/generalized-Born surface area (MM/GBSA) method, and the corresponding enzyme-ligand complexes were rescored using NNScore 2.0. The ligands were divided into 30 separate clusters based on their interactions with individual active-site residues, with each cluster containing between 117 and 691 molecules. One hundred molecules were selected from the best-ranked hits, with the number of molecules taken from individual clusters proportional to the cluster size. Each cluster was represented by at least one molecule to ensure that the selected set reflected the diversity of identified interaction patterns. The 100 selected molecules were then assessed to determine their availability for experimental testing; the DmbA complexes of those found to be available were inspected visually using PyMol229 to determine whether the ligands effectively blocked the active site of DmbA (Supplementary Figure 2). This visual analysis revealed that the selected molecules have structures very different from those of previously known HLD ligands (Figure 2B). Additionally, all six ligand-based inhibitors were added to the virtual screening library as positive hits. However, none of these molecules ranked in the final top 100 list; the best, 1-chloro-2,2-dimethylpropane, ranked 5,392nd. The low ranking may have two origins: first, the dissociation constants of substrates are in the millimolar range, and second, the ligand-based molecules are much smaller and therefore probably form fewer interactions than the structure-based molecules.
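The selection rule described above (a fixed number of picks spread over clusters in proportion to their size, with at least one pick per cluster) can be sketched as follows. This is only an illustrative allocation scheme, not the procedure actually used in the study: the function name `allocate_picks`, the largest-remainder rounding, and the five example cluster sizes are assumptions made for the example.

```python
def allocate_picks(cluster_sizes, total_picks):
    """Split total_picks across clusters in proportion to cluster size,
    guaranteeing at least one pick per cluster (hypothetical helper)."""
    if total_picks < len(cluster_sizes):
        raise ValueError("need at least one pick per cluster")
    n_total = sum(cluster_sizes)
    # Reserve one pick for every cluster, then share the rest proportionally.
    picks = [1] * len(cluster_sizes)
    remaining = total_picks - len(cluster_sizes)
    quotas = [remaining * size / n_total for size in cluster_sizes]
    for i, quota in enumerate(quotas):
        picks[i] += int(quota)
    # Hand the picks lost to rounding to the largest fractional remainders.
    leftover = total_picks - sum(picks)
    by_remainder = sorted(range(len(quotas)),
                          key=lambda i: quotas[i] - int(quotas[i]),
                          reverse=True)
    for i in by_remainder[:leftover]:
        picks[i] += 1
    return picks


# Illustrative cluster sizes only; the real study used 30 clusters.
example_sizes = [691, 400, 250, 150, 117]
example_picks = allocate_picks(example_sizes, 20)
print(example_picks)
```

With 30 clusters and 100 picks, the same call would reserve 30 picks up front and distribute the remaining 70 by size, so even the smallest cluster stays represented.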

Twenty-five of the compounds identified by virtual screening, one from each cluster whenever possible, were tested experimentally. Five clusters were not represented because the corresponding compounds were not commercially available. IC50s were determined for each of the 25 compounds by measuring their effects on the rate of DBE hydrolysis catalyzed by DmbA at various inhibitor concentrations. Any molecule with an IC50 below 1,000 µM was defined as an inhibitor (Figure 4). Using this threshold, 17 of the 25 tested compounds were found to be inhibitors, giving a hit rate of 68%. All of the compounds identified by virtual screening had structures that differed strongly from the enzymes’ known substrates, and their binding affinities greatly exceeded those of both the enzymes’ native substrates and the best ligand-based inhibitors. Kinetic analysis revealed that all of these compounds were partial inhibitors (Supplementary Figure 3). By partial inhibition we mean cases in which the enzyme-inhibitor complexes retain a certain level of catalytic activity. The decreased activity levels off at a plateau where any further increase in inhibitor concentration does not significantly affect the reaction rate.
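The plateau behavior of a partial inhibitor can be illustrated with a simple logistic dose-response curve that has a nonzero lower asymptote. The sketch below is a generic model, not the fitting function used for Supplementary Figure 3; the parameter names, the Hill slope of 1, and the residual activity of 0.3 are assumptions, and the IC50 here is defined as the midpoint between full and residual activity. Only the IC50 of 5 µM is taken from Supplementary Table 3 (compound 22).

```python
def relative_activity(conc_um, ic50_um, residual=0.0, hill=1.0):
    """Relative enzyme activity (1.0 = uninhibited) at a given inhibitor
    concentration, for a logistic dose-response curve.

    residual is the activity retained at saturating inhibitor;
    residual > 0 describes partial inhibition.
    """
    if conc_um < 0 or ic50_um <= 0:
        raise ValueError("concentrations must be non-negative, IC50 positive")
    return residual + (1.0 - residual) / (1.0 + (conc_um / ic50_um) ** hill)


# IC50 from Supplementary Table 3 (compound 22); residual is hypothetical.
for conc in (0.0, 5.0, 50.0, 500.0, 5000.0):
    print(conc, round(relative_activity(conc, ic50_um=5.0, residual=0.3), 3))
```

At concentrations far above the IC50 the curve flattens at the residual level instead of falling to zero, which is the signature of partial inhibition described in the text.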

No significant correlation was observed between the predicted docking scores, the MM/GBSA results, and the inhibition constants. The examined inhibitors did not vary significantly in their predicted energies, as all of them ranked in the top 100. The process of molecule identification used in this study cannot predict whether a discovered inhibitor will be capable of complete or partial inhibition. More rigorous approaches involving molecular dynamics would be necessary to predict such quantities with a certain degree of confidence. Implementing additional computational steps in the workflow would be possible at the expense of the ease and throughput of the method.

Figure 4. Observed IC50s for the structure-based inhibitors. All compounds with IC50s below 1,000 µM were assigned as inhibitors.

The binding of the best structure-based inhibitor was studied in detail by molecular docking and isothermal titration calorimetry (ITC). Molecular docking revealed a binding mode in which the triazole part of the molecule is located between the halide-stabilizing residues (Asn39 and Trp110) at distances enabling the formation of a hydrogen bond (Figure 5A and Supplementary Figure 4). A similar binding motif has been observed in the crystal structure of the HLD-based HaloTag with its stabilizer, where the molecule has a tetrazole part located at a similar position.200 Analysis of the enzyme-inhibitor complex by ITC indicates a single-binding-site model with an n of 0.91 ± 0.01, Kd of 3.37 ± 0.12 µM, ΔH of -5.24 ± 0.12 kcal·mol⁻¹, and ΔS of 8.14 ± 0.12 cal·mol⁻¹·°C⁻¹, where n is the number of binding sites, Kd is the dissociation constant, ΔH is the change in enthalpy, and ΔS is the change in entropy (Figure 5B).
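As a quick consistency check on the ITC parameters, the binding free energy can be computed in two ways: from the dissociation constant (ΔG = RT ln Kd) and from the enthalpy and entropy (ΔG = ΔH - TΔS). The sketch below assumes a measurement temperature of 25 °C, which is not stated in this section; the function names are illustrative. Under that assumption the two estimates come out at roughly -7.5 and -7.7 kcal·mol⁻¹.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal·mol⁻¹·K⁻¹


def delta_g_from_kd(kd_molar, temp_k):
    """Binding free energy (kcal/mol) from a dissociation constant in mol/L."""
    return R_KCAL * temp_k * math.log(kd_molar)


def delta_g_from_enthalpy_entropy(dh_kcal, ds_cal_per_k, temp_k):
    """Binding free energy (kcal/mol) from ΔH (kcal/mol) and ΔS (cal/mol/K)."""
    return dh_kcal - temp_k * ds_cal_per_k / 1000.0


TEMP_K = 298.15  # ASSUMPTION: 25 °C; the experimental temperature is not given here

dg_kd = delta_g_from_kd(3.37e-6, TEMP_K)                     # Kd = 3.37 µM
dg_hs = delta_g_from_enthalpy_entropy(-5.24, 8.14, TEMP_K)   # ΔH, ΔS from ITC
print(f"ΔG from Kd:      {dg_kd:.2f} kcal/mol")
print(f"ΔG from ΔH, ΔS:  {dg_hs:.2f} kcal/mol")
```

Both contributions are favorable here: the negative ΔH and the positive ΔS each lower the free energy of binding.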

Next, we performed a series of kinetic measurements to unveil the inhibition mechanism of the best inhibitor. The steady-state kinetic constants were determined at four different concentrations of inhibitor 22 (Figure 5C). Kinetic data were subjected to a global fit analysis testing three standard inhibition models: (i) competitive, (ii) noncompetitive, and (iii) mixed inhibition. The best fit was obtained by using the partial mixed-type inhibition model (equation 2). The mixed-type mechanism supports the formation of an enzyme-inhibitor complex (EI) but also the simultaneous binding of both substrate and inhibitor, resulting in an enzyme-substrate-inhibitor (ESI) complex with reduced production efficiency (β = 0.066 ± 0.018) compared to that of the ES complex. The interaction factor value of 0.864 ± 0.073 indicates that the substrate and the inhibitor bind in a cooperative manner. The resulting equilibrium dissociation constant for the enzyme-inhibitor complex, 3.58 ± 0.29 µM, matches the value obtained from the ITC binding experiment and is 3 orders of magnitude lower than that of the best ligand-based inhibitor.
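For orientation, the textbook rate law for partial (hyperbolic) mixed-type inhibition is sketched below. It is a simplified form and not necessarily identical to equation 2, since the full mechanism reported for inhibitor 22 also includes substrate inhibition and cooperativity. Ki, the interaction factor α, and β are taken from the text; Vmax and Km are hypothetical placeholders set to 1.

```python
def partial_mixed_rate(s, i, vmax, km, ki, alpha, beta):
    """Textbook partial mixed-type inhibition rate law.

    v = Vmax*[S]*(1 + beta*[I]/(alpha*Ki))
        / (Km*(1 + [I]/Ki) + [S]*(1 + [I]/(alpha*Ki)))

    alpha: interaction factor between substrate and inhibitor binding
    beta:  relative catalytic efficiency of the ESI complex (0 < beta < 1)
    """
    numerator = vmax * s * (1.0 + beta * i / (alpha * ki))
    denominator = km * (1.0 + i / ki) + s * (1.0 + i / (alpha * ki))
    return numerator / denominator


KI_UM, ALPHA, BETA = 3.58, 0.864, 0.066   # values reported in the text
VMAX, KM = 1.0, 1.0                       # hypothetical, for illustration only

for inhibitor_um in (0.0, 2.5, 5.0, 10.0, 1000.0):
    rate = partial_mixed_rate(10.0, inhibitor_um, VMAX, KM, KI_UM, ALPHA, BETA)
    print(inhibitor_um, round(rate, 4))
```

Without inhibitor the expression reduces to the Michaelis-Menten equation, and at saturating inhibitor and substrate the rate approaches β·Vmax rather than zero, which is why the inhibition remains partial.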

We also used differential scanning calorimetry to investigate the effect of inhibitor 22 on protein stability. The presence of inhibitor 22 at a concentration of 93 µM raised the melting temperature of DmbA from 52°C to 58°C, demonstrating that the inhibitor’s binding stabilized the protein significantly (Figure 5D). The same experimental design with the other five best inhibitors (no. 1, 18, 9, 15, and 8) yielded an increase in melting temperature of between 1.2°C and 2.1°C.

We also examined the specificity of the three best inhibitors (no. 22, 1, and 18) by investigating their effects on three extensively studied HLDs: DbjA, DhaA, and LinB. None of the three inhibitors had any discernible effect on the catalytic activities of these enzymes, indicating that the structure-based inhibitors do not bind to other dehalogenases despite their high sequence identity with DmbA: 68% for LinB, 44% for DhaA, and 41% for DbjA. Molecular docking revealed that compounds 22, 1, and 18 all bound unfavorably to these enzymes, without forming hydrogen bonds to the halide-stabilizing residues as in the case of DmbA (Supplementary Figure 5). We therefore conclude that the structure-based inhibitors identified by virtual screening against the structure of DmbA are highly specific for this enzyme.


Figure 5. Molecular docking, calorimetry, and kinetic analysis of binding of the best structure-based inhibitor 22 to DmbA. (A) The most probable binding mode of the best structure-based inhibitor (gray sticks) in the DmbA active site (gray surface) according to the molecular docking. The residues of the catalytic pentad are shown as sticks. (B) Integrated heat change of ITC data for the interaction between inhibitor 22 and DmbA. The solid line represents the best fit using a single-site binding model. (C) DmbA inhibition kinetics in the presence of 0 µM (squares), 2.5 µM (triangles), 5.0 µM (diamonds), and 10.0 µM (circles) concentrations of the best structure-based inhibitor. Solid lines represent a global fit to the data according to equation 1. (D) Differential scanning calorimetry data, showing the difference in the melting temperature of DmbA in the absence (dashed line) and in the presence (solid line) of inhibitor 22.


Discussion

Here we have conducted the first systematic search for competitive noncovalent inhibitors of HLD enzymes. Two very different approaches were used: the ligand-based approach and the structure-based approach. The ligand-based approach focused on candidate inhibitors whose structures resemble those of the enzymes’ known substrates. Such inhibitors are expected to bind in the active site in a way similar to that of the substrates, but they cannot undergo enzyme-catalyzed dehalogenation because they either contain strong carbon-fluorine bonds or are not amenable to nucleophilic displacement. The ligand-based inhibitors affected the tested enzymes, but they exhibited a significant inhibitory effect only at millimolar concentrations. This finding is in good agreement with mechanistic analysis of the HLD reaction mechanism, showing binding constants in the millimolar range.29 The most potent of the six tested molecules was 1-fluorohexane, which had an IC50 of 5.4 mM toward DbjA. Interestingly, the substrate specificity mirrored the specificity for the ligand-based inhibitors. Enzymes belonging to the same specificity group were also similarly affected by ligand-based inhibitors. The major advantage of the ligand-based inhibitors is their nonspecificity, which allows them to modulate the reactivity of many related HLDs. Nonspecificity is, however, also reflected in their low affinity. Nevertheless, the best ligand-based inhibitor outperformed a recently reported additive that binds to the active site and facilitates crystallization of DatA 248 (Kd > 20 mM). We note that inhibitors with millimolar-range effectiveness are still unsuitable for most practical applications, which usually require inhibition constants in the micro- to nanomolar range.249,250

The second approach exploited the crystal structure of DmbA from Mycobacterium tuberculosis H37Rv, which had previously been determined to atomic resolution.81 By virtually screening a large library of inhibitors against this structure, we obtained 17 novel inhibitors from 25 tested molecules, representing a 68% hit rate. The determined inhibition constants are up to 3 orders of magnitude lower than that of the best ligand-based inhibitor. All of the compounds identified in this way had structures that differed strongly from the enzymes’ known substrates, and their binding affinities greatly exceeded those of both the enzymes’ native substrates and the best ligand-based inhibitors. Based on the isothermal titration calorimetry binding analysis, we concluded that the best structure-based inhibitor occupies a single binding site with a dissociation constant of 3.37 ± 0.12 µM. The mechanism of inhibition was studied in more detail by steady-state inhibition kinetics. The kinetic data indicated that the best structure-based inhibitor follows a hyperbolic mixed-type inhibition pattern with a dissociation constant of 3.58 ± 0.29 µM. The mixed-type inhibition suggests the possibility of simultaneous binding of substrate and inhibitor to the enzyme active site and provides an explanation for the partial inhibition observed with several tested inhibitors. The overall kinetic mechanism for the best inhibitor includes substrate inhibition and enzyme cooperativity. We are aware that the determined inhibition mechanism is valid for inhibitor 22 only and cannot be extrapolated to the other molecules described in this study.

Interestingly, all structure-based inhibitors were strictly specific for DmbA, having no effect on the other tested HLDs. This suggests that the structure-based approach may be a general way of obtaining inhibitors that are unique to the targeted protein. Highly specific molecules can be used as molecular probes in chemical biology studies. They could, for example, be useful for studying the biological role of DmbA, one of three different HLDs in Mycobacterium tuberculosis H37Rv strain K and one of the most abundantly expressed proteins after phagocytosis by the human monocytic cell line U-937.247 Additionally, the best structure-based inhibitor showed a strong stabilizing effect, raising the melting temperature of DmbA by 6°C. Similar effects have been observed with HaloTag stabilizers200 and with the DatA enzyme248 during crystallization experiments. Increasing the thermostability of the enzyme while retaining its activity represents an interesting strategy for extending its half-life during industrial biocatalysis.

In summary, 6 ligand-based and 17 structure-based HLD inhibitors show an interesting range of binding affinities and selectivities for individual target proteins. The best inhibitors are expected to find use in the analysis of the enzymes’ biological role and reaction mechanism, in protein stabilization supporting crystallization and long-term storage, and in protection during immobilization or industrial processes.


CHAPTER 2

Discovery of novel haloalkane dehalogenase inhibitors

Supplementary information

Appl. Environ. Microbiol. 2016, 82, 1958–1965

DOI: 10.1128/AEM.03916-15


Supplementary Table 1. Inhibition effect of ligand-based inhibitors on four tested haloalkane dehalogenases.

Enzyme  BCP       FPE       FHE       FCH       DFP        NPI
DmbA    84 ± 4%   86 ± 6%   83 ± 2%   26 ± 8%   87 ± 3%    97 ± 6%
DbjA    89 ± 9%   98 ± 6%   0 ± 4%    14 ± 5%   104 ± 6%   10 ± 2%
LinB    68 ± 7%   43 ± 8%   27 ± 3%   38 ± 2%   87 ± 4%    20 ± 1%
DhaA    85 ± 2%   95 ± 8%   86 ± 9%   60 ± 4%   83 ± 6%    38 ± 7%

Inhibitory effects of the ligand-based inhibitors on DmbA, DbjA, LinB and DhaA, expressed as relative activity. Relative activity of DbjA in the presence of 1-fluorohexane (FHE) was observed at half concentration, while for the higher concentration complete inhibition was observed. Bromocyclopropane (BCP), 1-fluoropentane (FPE), fluorocyclohexane (FCH), 1,3-difluoropropane (DFP) and 1-iodo-2,2-dimethylpropane (NPI). The inhibitor concentrations used were as follows: BCP 14.9 mM, FPE 8.6 mM, FHE 7.5 mM, FCH 8.8 mM, DFP 11.3 mM and NPI 7.7 mM.

Supplementary Table 2. Pearson correlation matrix of the ligand-based inhibitors' effects on the four tested HLDs. The displayed correlations are within the 0.95 confidence interval.

Enzyme  DmbA   DbjA   LinB   DhaA
DmbA    1.00   0.32   0.10   0.16
DbjA    0.32   1.00   0.83   0.62
LinB    0.10   0.83   1.00   0.51
DhaA    0.16   0.62   0.51   1.00
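The correlation matrix can be recomputed directly from the relative activities in Supplementary Table 1. The sketch below is a plain-Python recalculation and not the software used in the study; the variable and function names are illustrative. Using the tabulated mean values, it reproduces the coefficients above to within about 0.01, the small differences presumably reflecting rounding of the tabulated activities.

```python
import math

# Relative activities (%) from Supplementary Table 1,
# in the column order BCP, FPE, FHE, FCH, DFP, NPI.
activity = {
    "DmbA": [84, 86, 83, 26, 87, 97],
    "DbjA": [89, 98, 0, 14, 104, 10],
    "LinB": [68, 43, 27, 38, 87, 20],
    "DhaA": [85, 95, 86, 60, 83, 38],
}


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Print the full symmetric matrix, one enzyme per row.
for row_name, row_values in activity.items():
    cells = [f"{pearson(row_values, col_values):5.2f}"
             for col_values in activity.values()]
    print(row_name, " ".join(cells))
```

The low coefficients in the DmbA row against the higher ones among DbjA, LinB, and DhaA are what separates DmbA from the other three enzymes in this comparison.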

Supplementary Table 3. Inhibition effect of structure-based inhibitors on DmbA.

ID   PubChem ID   Mw      Observed IC50 (µM)
1    40217358     298.3   13
2    40074990     323.8   ND
3    37146834     288.3   ND
4    16582453     299.7   34
5    29867428     240.3   114
6    18166507     316.4   ND
7    39009089     291.3   230
8    39818298     265.3   24
9    40171433     346.2   20
10   42957181     310.3   1321
11   4726718      243.3   1339
12   25355567     268.2   509
13   934252       311.3   ND
14   16413572     369.2   ND
15   45155817     288.3   23
16   32201146     332.4   58
17   9262370      284.3   86
18   5188312      380.4   14
19   16344495     281.3   ND
20   6473065      269.3   30
21   39854836     268.3   ND
22   37859493     262.3   5
23   47018060     316.4   ND
24   12553739     285.3   360
25   47018060     252.2   758

Observed inhibition effects of inhibitors from virtual screening on DmbA with 1,2-dibromoethane as a substrate. ND – no inhibition effect detected.

Supplementary Figure 1. Inhibition of DbjA by 1-fluorohexane. Dependence of DbjA relative activity on 1-fluorohexane (FHE) concentration with 1,2-dibromoethane as a substrate. Fit represents a logistic dose-response model.

[Supplementary Figure 2 panels: inhibitors 1, 4, 5, 7, 8, 9, 10, 11, 12, 15, 16, 17, 18, 20, 22, 24, and 25]

Supplementary Figure 2. Docking of 17 discovered structure-based inhibitors into the DmbA active site. Residues forming the catalytic pentad are depicted as green sticks. Binding orientations of the inhibitors in the active site of DmbA (grey surface) were identified by virtual screening and visualised with PyMol. Each cluster is represented by the top-ranked molecule in the visualization.


Supplementary Figure 3. Dose-response curves of 17 discovered inhibitors. The activity has been normalized to the activity without inhibitor present. The solid line is a dose-response fit to the data. Each point represents the average of three independent experiments.

Supplementary Figure 4. Interactions of inhibitor 22 with the DmbA active site as predicted by PoseView. The catalytic residues contributing to the interaction are Trp144 and His273. All the remaining amino acids are located in the neighboring area. Green circles connected by a dashed line indicate a pi-pi stacking interaction.


Supplementary Figure 5. Binding modes of the top three structure-based inhibitors. Docking of the three best inhibitors (22, 1, 18) to DbjA (top), DhaA (middle) and LinB (bottom). The halide-stabilizing residues of the active site are shown as green sticks. Phenyl rings located in the active site are not capable of forming sufficiently strong interactions.


CHAPTER 3

Structural and functional analysis of a novel haloalkane dehalogenase with two halide-binding sites

Acta Crystallogr. D. Biol. Crystallogr. 2014, 70, 1884–1897

DOI: 10.1107/S1399004714009018


Abstract

The crystal structure of the novel haloalkane dehalogenase DbeA from Bradyrhizobium elkanii USDA94 revealed the presence of two chloride ions buried in the protein interior. The first halide-binding site is involved in substrate binding and is present in all structurally characterized haloalkane dehalogenases. The second halide-binding site is unique to DbeA. To elucidate the role of the second halide-binding site in enzyme functionality, a two-point mutant lacking this site was constructed and characterized. These substitutions resulted in a shift in the substrate-specificity class and were accompanied by a decrease in enzyme activity and stability and by the elimination of substrate inhibition. The changes in enzyme catalytic activity were attributed to deceleration of the rate-limiting hydrolytic step mediated by the lower basicity of the catalytic histidine.

Introduction

Haloalkane dehalogenases (HLDs; EC 3.8.1.5) are microbial enzymes that catalyze the hydrolytic conversion of halogenated aliphatic alkanes and their derivatives into three reaction products: an alcohol, a halide anion and a proton.68 Structurally, HLDs belong to the superfamily of α/β-hydrolases and consist of a conserved α/β-hydrolase core domain and a helical cap domain.75,251,252 The active sites of HLDs are buried in a predominantly hydrophobic cavity at the interface between these two domains. The tertiary structures of HLDs have been determined for DhlA from Xanthobacter autotrophicus GJ10,78 DhaA from Rhodococcus rhodochrous NCIMB 13064,79 LinB from Sphingobium japonicum UT26,80 DmbA from Mycobacterium tuberculosis H37Rv,81 DbjA from Bradyrhizobium japonicum USDA110,56 DppA from Plesiocystis pacifica SIR-1,45 and DmmA from an unknown marine bacterium46. The majority of known HLD structures contain a halide anion bound in the active site. This anion is coordinated by the side chains of a highly conserved amino acid pair, i.e. tryptophan–tryptophan28,253–255 or tryptophan–asparagine56,79–81,256–259. It has been proposed that these two amino acids stabilize the halogen atom of the substrate in the activated complex and the halide anion formed during the dehalogenation reaction.253 The halide-binding site inside the active site of HLDs identified by protein crystallography has been independently confirmed by site-directed mutagenesis,70,260 steady-state fluorescence quenching,253 stopped-flow fluorescence29,30,254,260 and steady-state kinetic measurements29,261.

Besides the two halide-stabilizing residues, the active sites of all HLDs contain three other amino acids essential for catalysis: a nucleophilic aspartate, a basic histidine and a catalytic aspartic or glutamic acid, known together as the catalytic triad.38,68 The overall kinetic mechanism of the dehalogenase reaction catalyzed by HLDs proceeds via four main steps: (i) substrate binding, (ii) bimolecular nucleophilic substitution resulting in the formation of a halide anion and an alkyl-enzyme intermediate, (iii) nucleophilic addition of a water molecule to the ester intermediate and (iv) release of the reaction products.28–30,65,71 It has been shown that the rate-limiting step in the catalytic cycle is halide release in the case of DhlA reacting with 1,2-dichloroethane and 1,2-dibromoethane,30 release of an alcohol and cleavage of the carbon–halogen bond for DhaA with 1,3-dibromopropane65 and 1,2,3-trichloropropane,37 and hydrolysis of the alkyl-enzyme intermediate for LinB with 1-chlorohexane, chlorocyclohexane and bromocyclohexane29. The observed differences in the rate-limiting step suggest that the catalytic efficiency of HLDs is governed by the particular enzyme–substrate pair and depends on: (i) the composition of the catalytic residues, (ii) the geometry and solvation of the enzyme active-site cavity and (iii) the geometry and dynamics of the access tunnels connecting the buried enzyme active site to the surrounding solvent.29,37,38,254

Since the first HLD was isolated in 1985,40 more than 200 putative HLDs have been identified by phylogenetic analysis and 16 of them have been biochemically characterized.39,40,42–46,245,262–264 Phylogenetic analysis of HLDs and their putative relatives has revealed that the enzyme family can be divided into three subfamilies, HLD-I, HLD-II and HLD-III, which differ mainly in the composition of the catalytic residues and the anatomy of the cap domain.68 HLDs are broad substrate-specificity enzymes. Individual members of the HLD family are able to convert a wide spectrum of chlorinated, brominated and iodinated alkanes, alcohols, amides, ethers and esters.67 Systematic exploration of their substrate specificity using a uniform set of halogenated substrates led to the classification of the enzymes into four distinct substrate-specificity groups.49 Simultaneous investigation of the relationship between the function and evolution of HLDs showed that it is not possible to predict the substrate specificity of putative HLDs on the basis of sequence similarities with experimentally characterized family members.49 Currently recognized structural determinants of HLD substrate specificity are (i) the composition of the catalytic residues, (ii) the size, shape and physicochemical properties of the active-site cavity and (iii) the shape, physicochemical properties and dynamics of the access tunnels.49,69–71,80,265,266

In the present study, a novel haloalkane dehalogenase, DbeA from B. elkanii USDA94, was structurally and biochemically characterized. Crystallographic analysis revealed the presence of two halide-binding sites in the DbeA structure: one in the active site and the second buried in the protein core. The observed spatial proximity of the second halide-binding site to the active-site cavity of DbeA suggests that it may play an important role in enzyme functionality. Detailed insight into the role of the second halide-binding site was obtained by its elimination from the DbeA structure using site-directed mutagenesis. Steady-state kinetics, pre-steady state kinetics and circular dichroism spectroscopy revealed significant changes in substrate specificity, catalytic activity and stability in the presence of salts.


Materials and methods

Gene isolation, synthesis and cloning

The dbeA gene was isolated from B. elkanii USDA94 and its nucleotide sequence was deposited in DDBJ/GenBank/EMBL under accession No. AB478942. The C-terminus of the dbeA gene was fused to the sequence encoding a hexahistidine tag, enabling purification by metal-affinity chromatography. The recombinant gene dbeA ΔCl-His6 (I44L+Q102H) was synthesized artificially (Entelechon, Regensburg, Germany) according to the wild-type (wt) sequence. The synthesized gene was subcloned into the expression vector pET-21b (Novagen, Madison, USA) using restriction endonucleases NdeI and XhoI (Fermentas, Burlington, Canada) and T4 DNA ligase (Promega, Madison, USA).

Protein overexpression and purification

To overproduce DbeA wt and DbeA ΔCl (internal No. DbeA 03) in Escherichia coli BL21 (DE3) cells, the corresponding genes were transcribed by T7 RNA polymerase, which is expressed under the control of the isopropyl β-d-1-thiogalactopyranoside (IPTG)-inducible lac UV5 promoter. Cells containing these plasmids were cultured in Luria broth medium at 37 °C. When the culture reached an optical density of 0.6 at a wavelength of 600 nm, enzyme expression (at 20 °C) was induced by the addition of IPTG to a final concentration of 0.5 mM. The cells were subsequently harvested and disrupted by sonication using a Soniprep 150 (Sanyo Gallenkamp PLC, Loughborough, England). The supernatant was collected after centrifugation at 100 000 g for 1 h. The crude extract was further purified on a HiTrap Chelating HP 5 ml column charged with Ni2+ ions (GE Healthcare, Uppsala, Sweden). The His-tagged enzyme was bound to the resin in the presence of equilibration buffer (20 mM potassium phosphate buffer pH 7.5, 0.5 M sodium chloride, 10 mM imidazole). Unbound and nonspecifically bound proteins were washed out with buffer containing 37.5 mM imidazole. The target enzyme was eluted with buffer containing 300 mM imidazole. The active fractions were pooled and dialyzed against 50 mM potassium phosphate buffer pH 7.5 overnight. The enzymes, both of which contained a C-terminal hexahistidyl tail, were stored at 4 °C in 50 mM potassium phosphate buffer prior to analysis.

Specific activity measurements

Enzymatic activity towards 30 halogenated substrates was assayed using the colorimetric method developed by Iwasaki et al.244 The release of halide ions was analyzed spectrophotometrically at 460 nm using a SUNRISE microplate reader (Tecan, Grodig/Salzburg, Austria) after reaction with mercuric thiocyanate and ferric ammonium sulfate.

The dehalogenation reaction was performed at 37 °C in 25 ml Reacti-flasks with Mininert valves. The reaction mixture contained 10 ml glycine buffer (100 mM, pH 8.6) and 10 ml of an appropriate halogenated substrate at a concentration of 0.1–10 mM depending on the substrate solubility. The reaction was initiated by the addition of enzyme. The reaction was monitored by withdrawing 1 ml samples at periodic intervals from the reaction mixture and immediate mixing of the samples with 0.1 ml 35% nitric acid to terminate the reaction. Dehalogenation activity was quantified as the rate of product formation with time.

Principal component analysis

A matrix containing the activity data for nine wt and one mutant HLDs with 30 substrates was analyzed by principal component analysis (PCA).267 The aim of the analysis was to uncover relationships between individual HLDs based on their activities towards the set of substrates.49 In brief, two PCAs were performed using Statistica 9.0 (StatSoft, Tulsa, USA). In the first analysis, raw data of the specific activities were used as the primary input data. This analysis compared the overall activity of the mutant enzyme with the overall activity of wt HLDs. In the second analysis, the raw data were log-transformed and weighted relative to the individual enzyme's activity towards other substrates prior to performing PCA in order to better discern the enzyme specificity profiles.49 These transformed data were used to identify substrate specificity groups of enzymes that exhibited similar specificity profiles regardless of their overall specific activities.

Steady state kinetic measurements

The catalytic properties of the enzymes were described by steady state kinetic parameters determined with selected substrates. Steady state kinetic constants for the reaction between DbeA wt and 1-chlorobutane or 1,3-dibromopropane were evaluated by measuring the substrate and product concentrations using a Trace GC 2000 gas chromatograph (Finnigan, San Jose, USA) equipped with a flame ionization detector and a DB-FFAP 30 m × 0.25 mm × 0.25 µm capillary column (J&W Scientific, Folsom, USA) and the colorimetric method described by Iwasaki et al.,244 respectively. Dehalogenation was performed at 37 °C in 25 ml Reacti-flasks with Mininert valves in a shaking water bath. The reaction mixture consisted of 10 ml glycine buffer (100 mM, pH 8.6) and various concentrations of substrate (0.01–5 mM for 1-chlorobutane and 0.01–7 mM for 1,3-dibromopropane). The enzymatic reaction was initiated by the addition of DbeA wt to final concentrations of 2.09 and 0.14 µM for the reactions with 1-chlorobutane and 1,3-dibromopropane, respectively. The reaction was terminated by the addition of 0.1 ml 35% nitric acid at different times after initiation (0, 10, 20, 30 and 40 min). All data points corresponded to the mean of three independent replicates.


Kinetic parameters were determined by nonlinear curve-fitting of the data points using the Origin 6.1 software (OriginLab, Massachusetts, USA). The steady state kinetics of DbeA wt and DbeA ΔCl with 1-bromobutane were measured using a VP-ITC isothermal titration microcalorimeter (MicroCal, Piscataway, USA). The substrate was dissolved in 100 mM glycine buffer pH 8.6 and the solution was allowed to reach thermal equilibrium in the reaction cell (1.4 ml). The reaction was initiated by injecting 10 µl enzyme solution containing either 26 µM DbeA wt or 825 µM DbeA ΔCl into the reaction cell. The enzymes were dialyzed overnight against the same glycine buffer as was used to dissolve the substrate. The measured rate of heat change was assumed to be directly proportional to the velocity of the enzymatic reaction,

$$\frac{dQ}{dt} = -\Delta H \, V \, \frac{d[S]}{dt}$$

where ΔH is the enthalpy of the reaction, [S] is the substrate concentration and V is the volume of the cell. ΔH was determined by titrating the substrate into the reaction cell containing the enzyme. Each reaction was allowed to proceed to completion. The integrated total heat of reaction was divided by the amount of injected substrate. The evaluated rate of substrate depletion (-d[S]/dt) and the corresponding substrate concentrations were then fitted by nonlinear regression to kinetic models using Origin 6.1 (OriginLab, Massachusetts, USA).
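The conversion from calorimetric signal to reaction velocity can be sketched as below, assuming the relation dQ/dt = -ΔH·V·d[S]/dt so that the substrate depletion rate is -d[S]/dt = (dQ/dt)/(ΔH·V). The function names, the unit choices, and all numerical values except the 1.4 ml cell volume are hypothetical and serve only to show the arithmetic.

```python
def reaction_enthalpy(total_heat_ucal, substrate_nmol):
    """Apparent molar enthalpy in kcal/mol: integrated heat of a reaction
    run to completion divided by the amount of substrate injected.
    (µcal / nmol is numerically equal to kcal / mol.)"""
    return total_heat_ucal / substrate_nmol


def substrate_depletion_rate(heat_rate_ucal_per_s, dh_kcal_per_mol, cell_volume_ml):
    """Return -d[S]/dt in µM/s from the heat rate dQ/dt.

    Units: (µcal/s) / (kcal/mol × ml)
         = 1e-6 cal/s / (1e3 cal/mol × 1e-3 L) = 1e-6 mol/(L·s) = µM/s.
    """
    return heat_rate_ucal_per_s / (dh_kcal_per_mol * cell_volume_ml)


# Hypothetical numbers: 200 nmol substrate releasing 2000 µcal in total,
# and a momentary heat rate of -7 µcal/s in the 1.4 ml cell.
dh = reaction_enthalpy(total_heat_ucal=-2000.0, substrate_nmol=200.0)
rate = substrate_depletion_rate(heat_rate_ucal_per_s=-7.0,
                                dh_kcal_per_mol=dh,
                                cell_volume_ml=1.4)
print(f"ΔH = {dh:.1f} kcal/mol, -d[S]/dt = {rate:.2f} µM/s")
```

Pairs of such rates and the corresponding remaining substrate concentrations are what would then be passed to a nonlinear fit of the kinetic model.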

Size exclusion chromatography

The molecular weight of DbeA wt was analyzed using an ÄKTA FPLC system equipped with UV280 detection (GE Healthcare, Uppsala, Sweden) and a Superdex™ 200 10/300 GL column (GE Healthcare, Uppsala, Sweden). A total volume of 100 µl of the protein sample was applied to the column and separated at a constant flow rate of 0.5 ml/min. The elution buffer comprised 50 mM Tris-HCl and 150 mM NaCl. For calibration against molecular weight standards, a gel filtration calibration kit (GE Healthcare, Uppsala, Sweden) was used, containing ribonuclease A (13.7 kDa), chymotrypsinogen A (25 kDa), ovalbumin (43.0 kDa), conalbumin (75.0 kDa) and aldolase (158 kDa). All molecular weight standards, as well as the protein samples, were dialyzed against the elution buffer prior to analysis.
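
A calibration of this kind is commonly evaluated by regressing log10 of the molecular weight of the standards against their elution volume. The sketch below assumes that approach; the text does not state how the calibration curve was fitted. The molecular weights are those of the standards listed above, whereas the elution volumes and all function names are hypothetical.

```python
import math

# Molecular weights of the calibration standards named in the text (kDa).
STANDARDS_KDA = [13.7, 25.0, 43.0, 75.0, 158.0]
# Hypothetical elution volumes (ml), invented for illustration only.
ELUTION_ML = [17.6, 16.6, 15.5, 14.5, 13.2]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


SLOPE, INTERCEPT = fit_line(ELUTION_ML, [math.log10(m) for m in STANDARDS_KDA])


def estimate_mw_kda(elution_ml):
    """Molecular weight (kDa) read from the log-linear calibration line."""
    return 10 ** (SLOPE * elution_ml + INTERCEPT)


print(f"estimated MW at 15.0 ml: {estimate_mw_kda(15.0):.1f} kDa")
```

The estimate is only meaningful for elution volumes within the range spanned by the standards.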

Native gel electrophoresis

The oligomeric state of DbeA wt was also investigated by native polyacrylamide gel electrophoresis performed with 10% gels lacking sodium dodecyl sulfate. The electrophoresis tank was maintained at 4 °C during the experiment. Gels were stained with Coomassie brilliant blue R-250 dye (Fluka, Buchs, Switzerland). The molecular mass of DbeA wt was estimated by comparing its mobility with that of two molecular weight standards, ovalbumin (43.0 kDa) and albumin (67.0 kDa), and three HLDs: DhaA (33.2 kDa), LinB (33.1 kDa) and DmbA (33.7 kDa).

Circular dichroism spectroscopy and thermal denaturation

To assess the secondary structure and correct folding of the enzymes, circular dichroism (CD) spectra were recorded at room temperature using a Jasco J-810 spectropolarimeter (Jasco, Tokyo, Japan). Data were collected from 185 to 260 nm (in pure 50 mM phosphate buffer, pH 7.5) or from 200 to 260 nm (in the presence of sodium chloride), at a scan rate of 100 nm/min, 1 s response time and 2 nm bandwidth, using a 0.1 cm quartz cuvette containing the enzyme. Each spectrum shown represents an average of ten individual scans and has been corrected for absorbance caused by the buffer. CD data were expressed in terms of the mean residue ellipticity (ΘMRE). Thermal unfolding of the studied enzymes was followed by monitoring the ellipticity at 222 nm over a temperature range of 20 to 80 °C, using a resolution of 0.1 °C and a heating rate of 1 °C/min. The resulting thermal denaturation curves were roughly normalized to represent a signal change between approximately 1 and 0 and fitted to sigmoidal curves using the software Origin 6.1 (OriginLab, Massachusetts, USA). The melting temperature (Tm) was evaluated as the midpoint of the normalized thermal transition.
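
The evaluation of Tm as the midpoint of a normalized transition can be illustrated with a short sketch. It is a simplification: the curves in this work were fitted to sigmoidal functions in Origin, whereas the example below locates the midpoint by linear interpolation on a synthetic two-state curve. The curve shape, its width parameter and the function names are assumptions made for the example.

```python
import math


def two_state_signal(temp_c, tm_c, width_c):
    """Normalized unfolding signal falling from 1 (folded) to 0 (unfolded)."""
    return 1.0 / (1.0 + math.exp((temp_c - tm_c) / width_c))


def midpoint_temperature(temps_c, signal):
    """Temperature at which the normalized signal crosses 0.5 (linear interpolation)."""
    points = list(zip(temps_c, signal))
    for (t0, y0), (t1, y1) in zip(points, points[1:]):
        if y0 >= 0.5 >= y1:
            if y0 == y1:
                return t0
            return t0 + (y0 - 0.5) * (t1 - t0) / (y0 - y1)
    raise ValueError("signal never crosses 0.5")


# Synthetic curve sampled from 20 to 80 degC at 0.1 degC resolution, as in the text.
temps = [20.0 + 0.1 * i for i in range(601)]
curve = [two_state_signal(t, tm_c=58.5, width_c=1.5) for t in temps]
tm = midpoint_temperature(temps, curve)
print(f"Tm = {tm:.1f} degC")
```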

Crystallization and data collection

DbeA wt was crystallized by the sitting-drop vapour-diffusion procedure from a solution consisting of 100 mM Tris-HCl pH 7.5, 20% (w/v) PEG 3350 or 4000 and 150 mM calcium acetate, as described previously by Prudnikova et al.⁶⁶ Diffraction data were collected on beamline MX14.2 operated by the Helmholtz-Zentrum Berlin (HZB) at the BESSY II electron storage ring (Berlin-Adlershof, Germany)²⁶⁸ at a fixed monochromatic wavelength of 0.918 Å. The collected data were processed using the HKL-3000 package.²⁶⁹ The crystals exhibited the symmetry of space group P2₁2₁2₁ and contained four molecules in the asymmetric unit, with a solvent content of ~50 %. The crystals diffracted anisotropically; the low completeness and the high I/σ(I) in the highest-resolution shell are consequences of this anisotropy. Crystal parameters and data collection statistics are given in Table 1.


Table 1. Diffraction data collection and refinement statistics.

X-ray diffraction data collection statistics
  Space group                                   P2₁2₁2₁
  Cell parameters (Å, °)                        a = 62.7, b = 121.9, c = 161.9; α = β = γ = 90
  Number of molecules in AU                     4
  Wavelength (Å)                                0.918
  Resolution (Å)                                2.2
  Number of unique reflections                  63,890
  Redundancy                                    5.8 (3.4)
  Completeness (%)                              92.3 (62.8)
  Rmergeᵃ                                       6.6 (24.8)
  Average I/σ(I)                                25.0 (4.3)
  Wilson B (Å²)                                 25.672

Refinement statistics
  Resolution range (Å)                          50–2.2 (2.28–2.2)
  No. of reflections in working set             55,927 (2,690)
  R value (%)ᵇ                                  14.53
  Rfree value (%)ᶜ                              20.66
  RMSD bond length (Å)                          0.013
  RMSD angle (°)                                1.592
  No. of atoms in AU                            9,983
  No. of protein atoms in AU                    9,222
  No. of water molecules in AU                  749
  No. of acetate ions in AU                     3
  No. of chloride ions in AU                    8
  Mean B value protein/ion Cl1/ion Cl2 (Å²)     23.98/22.28/24.06

Ramachandran plot statistics
  Residues in favored regions (%)               95.16 (1144/1203)
  Residues in allowed regions (%)               96.13 (1156/1203)

PDB code                                        4k2a

ᵃ The data in parentheses refer to the highest-resolution shell. $R_{merge} = \sum_{hkl}\sum_i |I_i(hkl) - \langle I(hkl)\rangle| / \sum_{hkl}\sum_i I_i(hkl)$, where $I_i(hkl)$ is the intensity of the i-th observation of reflection hkl and $\langle I(hkl)\rangle$ is the average intensity of reflection hkl, with summation over all data.
ᵇ $R = \sum ||F_o| - |F_c|| / \sum |F_o|$, where $F_o$ and $F_c$ are the observed and calculated structure factors, respectively.
ᶜ Rfree is equivalent to the R value, but is calculated for 5 % of the reflections chosen at random and omitted from the refinement process.

Structure determination and refinement

The structure of DbeA wt was solved by the molecular replacement method using the program Molrep²⁷⁰ with the structure of DhaA from Rhodococcus species (PDB code: 1bn6) as a search model.⁷⁹ Model refinement was carried out using the program REFMAC 5.2²⁷¹ from the CCP4 package,²⁷² interspersed with manual adjustments using COOT 0.5.²⁷³ The final steps included translation-libration-screw (TLS) refinement.²⁷⁴ The refinement statistics are given in Table 1. The following services were used to analyze the structures: Molprobity,²⁷⁵ the PISA server²⁷⁶,²⁷⁷ and the Protein-Protein Interaction Server.²⁷⁸ Figures showing the structural representations were prepared using the program PyMOL.²²⁹ Atomic coordinates and experimental structure factors have been deposited in the Worldwide Protein Data Bank under the PDB ID code 4k2a.

Pre-steady state kinetic measurements

To identify the rate-determining step for the catalytic conversion of 1-bromobutane by the enzymes, rapid quench flow experiments were performed at 37 °C in a glycine buffer (pH 8.6) using a model QFM 400 instrument (Bio-Logic, Claix, France). The reaction was initiated by rapid mixing of 70 μl of enzyme with 70 μl of the substrate solution and quenched with 100 μl of 0.8 M H₂SO₄ at times ranging from 2 ms to 1.2 s. The quenched mixture was directly injected into 0.5 ml of ice-cold diethyl ether containing 1,2-dichloroethane as an internal standard. After extraction, the diethyl ether layer, containing non-covalently bound substrate and alcohol product, was collected, dried on a short column packed with anhydrous Na₂SO₄ and analyzed using a Trace 2000 gas chromatograph equipped with MS detection (Finnigan, San Jose, USA) and a DB-5MS capillary column (J&W Scientific, Folsom, USA). The amount of bromide ion in the aqueous phase was measured by ion chromatography using an 861 Advanced Compact IC equipped with a METROSEP A Supp 5 column (Metrohm, Herisau, Switzerland). Stopped-flow fluorescence experiments were used to study the kinetics of halide binding to the enzymes. Binding experiments were performed using an SFM-20 stopped-flow instrument (Bio-Logic, Claix, France) combined with a Jasco J-810 spectropolarimeter (Jasco, Tokyo, Japan) equipped with a Xe arc lamp with excitation at 295 nm, and an SFM-300 stopped-flow instrument combined with a MOS-200 spectrometer (Bio-Logic, Claix, France). Fluorescence emission from tryptophan residues was observed through a 320 nm cut-off filter upon excitation at 295 nm. Reactions were performed at 37 °C in a glycine buffer at pH 8.6. Dissociation constants were calculated from the amplitudes of fluorescence quenching for the rapid equilibrium phase. Amplitudes and observed rate constants for the slow kinetic phase were evaluated by fitting to single exponentials using the software Origin 6.1 (OriginLab, Massachusetts, USA).

Calculation of binding energies

Molecular dynamics (MD) simulations were used to analyze the binding energies of ions to their respective binding sites. The crystal structure of DbeA wt was employed as a starting model, while the double mutant DbeA ΔCl (I44L+Q102H) was constructed in PyMOL.²²⁹ The orientation of the introduced side chains was chosen to be the same as in the aligned structure of haloalkane dehalogenase LinB (PDB ID 1mj5), which contains the target residues naturally. Hydrogen atoms were added to the structures of DbeA wt and DbeA ΔCl with the H++ server²¹⁰ at pH 7.5. A chloride ion was placed at the second halide-binding site of DbeA ΔCl based on its position in the crystal structure of DbeA wt. All water molecules present in the initial crystal structure were retained in both systems. Cl⁻ and Na⁺ ions were added to a concentration of 0.1 M using the Tleap module of AMBER 11.¹²⁶ Using the same module, an octahedral box of TIP3P water molecules²⁷⁹ extending to a distance of 10 Å from any solute atom in the system was also added. The systems were minimized in five rounds consisting of 250 steepest descent steps and 750 conjugate gradient steps with a decreasing restraint on the protein backbone (500, 125, 50, 25 and 0 kcal mol⁻¹ Å⁻²). Subsequent MD simulations employed periodic boundary conditions, the particle mesh Ewald method for treatment of electrostatic interactions,²⁸⁰ a 10 Å cutoff for nonbonded interactions and a 2 fs time step coupled with the SHAKE algorithm²⁸¹ to fix all bonds that involved hydrogen.
Equilibration simulations consisted of two steps: (i) 20 ps of gradual heating from 0 to 310 K under constant volume using a Langevin thermostat with a collision frequency of 1.0 ps⁻¹ and harmonic restraints of 5.0 kcal mol⁻¹ Å⁻² on the positions of all protein and ligand atoms, and (ii) 2000 ps of unrestrained MD simulation at 310 K using the Langevin thermostat, a constant pressure of 1.0 bar and a pressure coupling constant of 1.0 ps. Finally, production MD simulations were run for 20 ns with the same settings as the second step of the equilibration MD simulations. Coordinates were saved at 1 ps intervals. The resulting trajectories were analyzed using the Ptraj module of AMBER 11 and visualized in PyMOL and VMD 1.9.1.²⁸² All calculations were carried out in the PMEMD module of AMBER 11 using the ff10 force field.²⁸³ In total, four independent MD simulations were run for each system. Binding energies were calculated from the trajectories using the molecular mechanics/generalized Born surface area method.²¹⁷,²⁸⁴ Every 10th frame in the last 15 ns of each trajectory was selected for the analysis. Energy differences were calculated by combining the gas phase energy contributions with solvation free energy components obtained using the implicit solvent model for each species. Input topologies of receptors, ligands and receptor-ligand complexes were prepared with the Leap module of AMBER 11 using the ff10 force field. The following settings were used for the calculation: PBradii were set to mbondi2, generalized Born model = 5 and saltcon = 0.1.²¹⁷ The analysis was performed with the Python script MMPBSA.py implemented in AmberTools 11.²³⁹
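
The frame selection and the way the energy terms are combined can be sketched as follows. This is an illustration of the bookkeeping only and does not reproduce MMPBSA.py; the function names and the per-frame energies are hypothetical, while the frame counts follow from the settings stated above (20 ns saved every 1 ps, last 15 ns, every 10th frame).

```python
def select_frames(n_frames, n_last, stride):
    """Indices of every `stride`-th frame within the last `n_last` frames."""
    return list(range(n_frames - n_last, n_frames, stride))


def total_energy(gas_phase, solvation):
    """Free energy estimate of one species: gas-phase energy plus solvation term."""
    return gas_phase + solvation


def binding_energy(complex_e, receptor_e, ligand_e):
    """Binding energy from (gas_phase, solvation) pairs in kcal/mol."""
    return (total_energy(*complex_e)
            - total_energy(*receptor_e)
            - total_energy(*ligand_e))


# 20 ns saved every 1 ps gives 20,000 frames; analyse the last 15 ns, every 10th frame.
frames = select_frames(n_frames=20_000, n_last=15_000, stride=10)

# Hypothetical energies for a single frame (kcal/mol), invented for illustration only.
example = binding_energy(complex_e=(-5230.0, -3110.0),
                         receptor_e=(-5100.0, -3160.0),
                         ligand_e=(0.0, -75.0))
print(len(frames), example)
```

In practice the per-frame values are averaged over all selected frames and over the independent trajectories.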

Computational prediction of pKa

The pKa of the catalytic base in DbeA wt and DbeA ΔCl was calculated from MD simulations. The input file controlling the simulation was generated by the AmberTools 12 script cpinutil.py with the flag for the generalized Born model set to 5. The systems were minimized in five rounds consisting of 250 steepest descent steps and 750 conjugate gradient steps with a decreasing restraint on the protein backbone (500, 125, 50, 25 and 0 kcal mol⁻¹ Å⁻²). The settings used for all subsequent implicit solvent MD simulations were as follows: interior dielectric constant = 1, saltcon = 0.1 M, nonbonded cutoff = 999 and generalized Born model = 5. The particle mesh Ewald method was used to treat electrostatic interactions and a 2 fs time step coupled with the SHAKE algorithm was used to fix all bonds involving hydrogen atoms. Equilibration simulations consisted of two steps: (i) 20 ps of gradual heating from 0 to 310 K, and (ii) 5 ns of MD simulation at 310 K. During both steps, the Langevin thermostat with a collision frequency of 10 ps⁻¹ was used to control the temperature, and harmonic restraints of 1.0 kcal mol⁻¹ Å⁻² were applied to the positions of the protein backbone atoms and the chloride ion. Finally, a 15 ns constant pH MD simulation was run for the whole system with the positions of the protein backbone atoms and the chloride ion restrained by 0.1 kcal mol⁻¹ Å⁻².²⁸⁵ The pH of the solvent was set to 8.6 and the protonation state was sampled by Monte Carlo every 10 steps. Coordinates were saved every 1 ps. All calculations were carried out in the Sander module of AMBER 11 using the constph force field (a modified ff10 force field). In total, two independent MD simulations were run for each system.
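
One common way to turn the protonation states sampled in a constant pH simulation into a pKa estimate is the Henderson-Hasselbalch relation. The sketch below assumes that approach; the text does not specify how the sampled states were converted into pKa values. The function names and the sampled states are hypothetical.

```python
import math


def protonated_fraction(states):
    """Fraction of sampled frames in which the residue is protonated."""
    states = list(states)
    return sum(1 for s in states if s) / len(states)


def pka_from_fraction(ph, fraction_protonated):
    """Henderson-Hasselbalch estimate: pKa = pH + log10(f / (1 - f))."""
    if not 0.0 < fraction_protonated < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    return ph + math.log10(fraction_protonated / (1.0 - fraction_protonated))


# Hypothetical sampling: protonated in 10 of 11 frames at the simulated pH of 8.6.
sampled = [True] * 10 + [False]
estimate = pka_from_fraction(8.6, protonated_fraction(sampled))
print(f"estimated pKa = {estimate:.2f}")
```

A residue that is protonated in exactly half of the frames has a pKa equal to the simulated pH; the estimate becomes unreliable when one state is sampled only rarely.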

Results

Isolation and biochemical characterization of the novel haloalkane dehalogenase DbeA

An open reading frame encoding a putative HLD, designated dbeA, was identified in the genome of Bradyrhizobium elkanii USDA94 by comparing sequences of known HLDs with sequences deposited in genetic databases. The dbeA gene was subcloned into pET21b and overexpressed in E. coli BL21 (DE3), and the His-tagged protein was purified to homogeneity with a yield of 40 mg per litre of cell culture. The specific activity of purified DbeA was tested against a set of 30 different halogenated aliphatic substrates, representing a wide range of structures and physicochemical properties.⁴⁹ DbeA generally exhibited high activity towards terminally substituted iodinated and brominated compounds but poor activity towards chlorinated compounds (Supplementary Table 1). The optimal alkyl chain length of substrates converted by DbeA was between three and four carbon atoms. The highest enzyme activity (0.0926 µmol s⁻¹ mg⁻¹) was observed in the reaction with 1,3-diiodopropane. Principal component analysis revealed that the level of DbeA activity was moderate compared to other HLDs and classified DbeA into substrate specificity group IV (SSG-IV), whose members (including DatA and DmbC) preferentially convert terminally brominated and iodinated propanes and butanes.⁴⁹ The catalytic properties of DbeA were assessed by measuring steady-state kinetic constants with 1-chlorobutane, 1-bromobutane and 1,3-dibromopropane as substrates (Supplementary Table 2). DbeA kinetics with 1-chlorobutane followed a simple hyperbolic Michaelis-Menten relationship. Surprisingly, changing the substituent in the substrate from chlorine to bromine resulted in sigmoidal kinetics accompanied by weak substrate inhibition in the reaction with 1-bromobutane. This suggests that DbeA is able to accommodate 1-bromobutane inside the catalytic pocket better than 1-chlorobutane and that binding of 1-bromobutane in the enzyme pocket occurs in a cooperative manner. Examination of the kinetics of DbeA towards 1,3-dibromopropane revealed another interesting feature, i.e., strong substrate inhibition with Ksi < Km combined with a cooperative mechanism. These results indicate that the mechanism by which DbeA catalyzes the conversion of halogenated compounds is strongly dependent on the specific halogen and the chemical structure of the substrate. The size, secondary structure and thermostability of DbeA were also investigated. Gel filtration chromatography (Supplementary Figure 1A) and native gel electrophoresis (Supplementary Figure 1B) were employed to determine the oligomeric state of DbeA. DbeA was found to exist as a dimer in solution under the tested conditions. CD spectroscopy was used to investigate the correct folding and thermostability of DbeA. Far-UV CD spectra of DbeA and other related HLDs (Supplementary Figure 2) exhibited one positive peak at 195 nm and two negative features at 208 and 222 nm, characteristic of α-helical content and implying correct folding of DbeA.²⁸⁶ Thermally induced denaturation of DbeA indicated a melting temperature of Tm = 58.5 ± 0.2 °C (Supplementary Table 3).

Structural characterization of DbeA and identification of two halide-binding sites

The crystal structure of DbeA was solved by protein crystallography to 2.2 Å resolution (Table 1). The domain organization of DbeA was found to be very similar to those of other structurally known HLD enzymes. Two distinct domains, the α/β hydrolase core domain and the α-helical cap domain, are separated by a deep cleft (Figure 1). The core domain consists of a central eight-stranded β-sheet with β2 lying in an antiparallel orientation with respect to the direction of the β-sheet. The β-sheet is flanked by five α-helices on both sides. The cap domain, protecting the upper surface of the active site cavity, consists of six short α-helices linked by seven loop insertions. Similar to other HLDs, DbeA contains five catalytic residues in the active site: the nucleophile Asp 103, the catalytic base His 271 and the catalytic acid Glu 127, forming the so-called catalytic triad, and two halide-stabilizing residues, Asn 38 and Trp 104. Analysis of the intermolecular contacts in the crystal structure suggested that DbeA dimerization is mediated through interaction of the C-terminal α-helices formed by residues 273–303 (Figure 1).

Figure 1. Overall structure of DbeA. Cα ribbon trace representing the elements of the protein secondary structure. α-helices are shown in red for the main domain and orange for the cap domain; β-strands are shown in yellow; loops are shown in green; chloride ions are shown as cyan spheres; chloride ion in the canonical product-binding site is labelled as Cl1; the chloride ion bound in the second halide-binding site is labelled as Cl2; the catalytic triad residues are shown as sticks.

During refinement of the crystal structure, two peaks in electron density were detected, indicating the presence of two ions in the vicinity of the DbeA active site. Since sodium chloride was the only halide-containing compound used in purification and crystallization, two chloride anions were modeled in the electron density, both with an occupancy of 1.0 in all four molecules of the asymmetric unit. The positions of the two chloride anions and the coordinating amino acid residues are shown in Figure 2A. The identity of the ions was corroborated by the presence of strong peaks in an anomalous difference map calculated from diffraction data collected at a wavelength of 1.5 Å. The first chloride anion (designated Cl1) in the DbeA structure occupies the product-binding site and interacts with nitrogen atoms of the conserved halide-binding residues, Asn 38 Nδ2 and Trp 104 Nε1, at distances of 3.33 Å and 3.17 Å, respectively (Figure 2B). Further coordination is provided by a water molecule mediating contact with the side chain of the catalytic residue Asp 103. The distance between the water molecule oxygen and the chloride ion Cl1 is 3.33 Å. This halide-binding site is common to all members of the HLD family.³⁸,²⁸⁷ The second chloride anion (designated Cl2) is located about 10 Å from the product-binding site and is buried deep in the protein core. It is coordinated by Gln 274 Nε2, Gln 102 Nε2, Gly 37 N and Thr 40 Oγ1 at distances of 3.50 Å, 3.23 Å, 3.19 Å and 2.90 Å, respectively (Figure 2C). This second halide-binding site is unique to DbeA and has not previously been observed in any other crystal structure of related HLDs. The full occupancy of this second halide-binding site and its location in close proximity (~10 Å, Figure 2A) to the active site suggest that it might have biological relevance.

Figure 2. Active site of DbeA (stereoview). (A) Overview of the active site with two chloride ions. Residues coordinating the chloride ions are represented by sticks with carbon atoms shown in yellow; the carbon atoms of the catalytic triad are shown in green. The two chloride ions are shown as cyan spheres with coordinating interactions represented by cyan dashed lines. A water molecule and the water-mediated interaction are shown in red; the distance between the two chloride ions is indicated by the number (in Å) over the black dashed line. (B) Expanded view of a chloride ion in the canonical halide-binding region (Cl1) of the active site. The 2Fo-Fc electron density map for ions and interacting residues contoured at 1.5σ is shown in blue; the numbers indicate the coordination distances (in Å). (C) Expanded view of a chloride ion bound to the unique second halide-binding site (Cl2). The distance between the catalytic histidine and the chloride anion is indicated by the number (in Å) over the black dashed line.

Chloride binding to DbeA wt in solution was monitored by stopped-flow fluorescence, revealing two distinct phases: a rapid equilibrium phase followed by a slow exponential phase (Figure 3A). The first phase reached equilibrium within the dead time of the instrument (0.5–5.0 ms), and therefore only a difference in the initial fluorescence signal was observed after mixing DbeA wt with increasing concentrations of halide. The second kinetic phase was characterized by a single exponential decrease of the fluorescence signal with time. The chloride concentration dependence of the amplitudes of the initial and the exponential kinetic phase (Figure 3C and 3E) indicated the presence of two independent binding sites for chloride ions in the enzyme, with derived dissociation constants for the fast and the slow binding interactions of Kd1 = 0.10 ± 0.05 M and Kd2 = 0.37 ± 0.06 M, respectively. The observed rate constants for the second kinetic phase showed a linear dependence on chloride concentration (Figure 3F), indicating that the slow kinetic phase is associated with chloride binding. The kinetic data were fitted to a linear equation, yielding the association rate constant (6.1 ± 0.9 M⁻¹ s⁻¹) and the dissociation rate constant (5.3 ± 1.0 s⁻¹) for binding of chloride to the second halide-binding site. Similar results were observed when bromide binding to DbeA wt was monitored by stopped-flow fluorescence (Supplementary Figure 3). Previous studies investigating the binding of halide ions to haloalkane dehalogenases revealed the presence of a single halide-binding site in DhlA,³⁰ LinB²⁹ and DatA.⁴³


Figure 3. Stopped-flow fluorescence analysis of chloride binding to DbeA wt and DbeA ΔCl. (A) Fluorescence traces obtained upon mixing 30 μM DbeA wt with chloride. (B) Fluorescence traces obtained upon mixing 30 μM DbeA ΔCl with chloride. (C) Chloride concentration dependence of the rapid equilibrium fluorescence quench of DbeA wt. (D) Chloride concentration dependence of the rapid equilibrium fluorescence quench of DbeA ΔCl. (E) Chloride dependence of the amplitude of the slow exponential fluorescence quench of DbeA wt. Solid lines represent the best fits to the data based on the Stern-Volmer equation $F/F_0 = (1 + f K_q [\mathrm{Cl}^-]) / (1 + K_{Cl} [\mathrm{Cl}^-])$, in which $F/F_0$ is the relative fluorescence; $f$ is the relative fluorescence intensity of the enzyme-chloride complex; $K_{Cl}$ is the association equilibrium constant of specific binding of chloride; and $K_q$ is the quenching constant, which is the apparent association equilibrium constant of the non-specific quenching interaction between chloride and the fluorophore. (F) Chloride dependence of the observed rate constants of the slow exponential fluorescence quench of DbeA wt. The solid line represents the best fit to the data using the equation $k_{obs} = k_{assoc} [\mathrm{Cl}^-] + k_{dissoc}$, where $k_{assoc}$ and $k_{dissoc}$ are the association and dissociation rate constants, respectively.
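
The linear analysis of the observed rate constants (panel F) can be sketched as an ordinary least-squares fit in which the slope gives the association rate constant and the intercept the dissociation rate constant. The chloride concentrations below are hypothetical and the data are synthetic, generated from the rate constants reported in the text; the fitting routine is a generic illustration rather than the procedure used in Origin.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical chloride concentrations (M); k_obs generated from the reported
# constants (6.1 M^-1 s^-1 and 5.3 s^-1) purely to illustrate the fit.
chloride_m = [0.25, 0.5, 1.0, 1.5, 2.0]
k_obs = [6.1 * c + 5.3 for c in chloride_m]

k_assoc, k_dissoc = fit_line(chloride_m, k_obs)
print(f"k_assoc = {k_assoc:.2f} M^-1 s^-1, k_dissoc = {k_dissoc:.2f} s^-1")
```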

Construction of the variant DbeA ΔCl with eliminated second halide-binding site

Comparison of the structure and amino acid sequence of DbeA with other related enzymes revealed that the second halide-binding site is lined by five amino acid residues: Gly 37, Thr 40, Ile 44, Gln 102 and Gln 274 (Figure 4). Ile 44 and Gln 102 are unique to DbeA, and their substitution by the Leu and His residues found in other HLDs reduced the volume of the second halide-binding site by 10%. The double-point mutant I44L+Q102H (designated DbeA ΔCl) was constructed and characterized to gain a deeper understanding of the role of the second halide-binding site in determining the structure and function of DbeA. Since attempts to crystallize DbeA ΔCl were not successful, disruption of the second halide-binding site in DbeA ΔCl was investigated by MD simulations, comparing the binding energy of a chloride ion bound to the second halide-binding site in the wt with that of a chloride ion in the putative site in DbeA ΔCl. The calculated difference in the binding energies between DbeA wt and DbeA ΔCl was 8.7 ± 2.7 kcal mol⁻¹. Binding of a chloride ion to the second halide-binding site was clearly more energetically favourable in the wt enzyme than in the double-point mutant. Stopped-flow fluorescence analysis of halide binding to DbeA ΔCl showed only a rapid equilibrium phase (Figure 3B and Supplementary Figure 3B), confirming the absence of the second halide-binding site in this protein. The dissociation constant calculated from the dependence of the fluorescence quench amplitude of DbeA ΔCl on chloride concentration was Kd = 2.30 ± 1.00 M (Figure 3D).

Figure 4. Comparison of DbeA structure and halide-binding sites with other HLDs. (A) Stereoview of the structure of DbeA from B. elkanii (red, PDB code: 4k2a) superposed with DhaA from Rhodococcus sp. (yellow, PDB code: 1bn6), DmbA from M. tuberculosis Rv2579 (blue, PDB code: 2qvb) and LinB from S. japonicum UT26 (green, PDB code: 1cv2). Chloride ions bound to DbeA are shown as cyan spheres; halide-binding residues coordinating Cl1 are shown as red sticks; the residues coordinating Cl2 are shown as dark grey sticks. (B) Superposition of the halide-binding sites of DbeA (carbon atoms in pink), DhaA (carbon atoms in yellow), DmbA (carbon atoms in light blue) and LinB (carbon atoms in green). Chloride ions coordinated in DbeA are represented by cyan spheres; the 2Fo-Fc electron density map contoured at 1.5σ is shown in blue.


Effect of elimination of the second halide-binding site on substrate specificity of DbeA ΔCl

The specific activity of DbeA ΔCl was assayed with the set of 30 halogenated compounds originally used for characterization of the wt enzyme (DbeA wt). DbeA ΔCl exhibited significantly lower activity than DbeA wt with almost all tested compounds (Figure 5 and Supplementary Table 1). An approximately 40-fold decrease in activity was observed in the case of the iodinated substrates 1-iodobutane and 1,3-diiodopropane, and the long-chain chlorinated substrates 1,5-dichloropentane and 1-chlorohexane. Statistical analysis of the transformed activity dataset clustered DbeA ΔCl into the substrate specificity group SSG-I (Figure 5), owing to its reduced preference for iodinated and brominated compounds. In contrast, DbeA wt has been found to cluster into SSG-IV.⁴⁹ The data suggest that elimination of the second halide-binding site dramatically affects the catalytic activity and substrate specificity of the enzyme.

Figure 5. Statistical analyses of the substrate specificity data. (A) The score-contribution plot t1 from PCA with the untransformed dataset. The plot shows differences in the overall activities of individual HLDs. The overall activity of DbeA ΔCl is similar to that of the least active enzymes, DrbA and DmbC. The plot explains 56 % of the variance in the dataset. (B) The score plot t1/t2 from PCA with the transformed dataset. The plot is a two-dimensional window into the multidimensional space, where the objects (enzymes) with similar properties (specificity profiles) are collocated. The t1/t2 plot, describing 46 % of the variance in the dataset, shows the enzymes clustered in individual substrate specificity groups (SSGs). Unlike DbeA wt, DbeA ΔCl lies in SSG-I, together with LinB, DbjA, DhaA and DhlA. All members of SSG-I possess broad substrate specificity. (C) The corresponding plot p1/p2 from PCA with the transformed dataset, showing the main substrates for each SSG. Numbering of the substrates has been adopted from Koudelakova et al.⁴⁹


Effect of elimination of the second halide-binding site on the kinetics of DbeA ΔCl

The catalytic efficiency of the mutant enzyme was assessed by determining the steady-state kinetic constants for 1-bromobutane conversion using isothermal titration calorimetry. DbeA ΔCl exhibited a much lower Km, as well as a sizeable decrease in kcat (Table 2), suggesting that the low specific activities of DbeA ΔCl observed during substrate screening are due to a low catalytic rate. Moreover, substrate inhibition was not observed with this variant. To identify the kinetic steps affected by elimination of the second halide-binding site, transient kinetic experiments were conducted with DbeA wt and DbeA ΔCl, using 1-bromobutane as a substrate. Upon rapid mixing of DbeA wt or DbeA ΔCl with an excess of substrate, a clear pre-steady-state burst of bromide production was observed (Figure 6). This indicates that all steps leading to the formation of the halide ion, namely substrate binding and subsequent cleavage of the carbon-halogen bond, are fast and do not limit the overall steady-state turnover. On the other hand, only linear formation of the alcohol product, with no sign of a burst, was observed even from the early transient phase of the reaction (Figure 6). Since all steps before hydrolysis of the alkyl-enzyme intermediate are fast, the absence of an alcohol burst suggests that hydrolysis of the alkyl-enzyme intermediate is the rate-determining step for the catalytic conversion of 1-bromobutane. Rate constants calculated for the burst phase of bromide formation (kobs) were similar for both enzymes, suggesting that elimination of the second halide-binding site did not affect the rate of substrate binding or cleavage of the carbon-halogen bond (Table 2).

However, a significant reduction in the rate of the linear steady-state phase (kss) was observed, indicating that the hydrolytic step of the catalytic cycle was severely affected by the introduced mutations (Table 2). The catalytic histidine plays an essential role in the hydrolysis of the alkyl-enzyme intermediate by abstracting a proton from the catalytic water. The pKa of the catalytic histidine in DbeA wt with and without a chloride anion bound to the second halide-binding site was calculated at constant pH and compared to the pKa of the catalytic histidine in DbeA ΔCl.

The pKa in DbeA wt without a chloride anion present (pKa = 7.1 ± 1.4) was comparable to the pKa in DbeA ΔCl (pKa = 7.3 ± 0.5), which lacks the second halide-binding site. In DbeA wt with a chloride anion present, the pKa of the catalytic histidine was notably increased (pKa = 9.6 ± 0.8), implying that it is a stronger base. It is worth mentioning that these calculations are sensitive to the choice of the relative dielectric constant for the protein interior. If the site is more solvated than is apparent from the crystal structure, then the use of a higher relative dielectric constant would be more appropriate, leading to a smaller difference in pKa.


Table 2. Steady-state and pre-steady-state kinetic constants of DbeA wt and DbeA ΔCl with the substrate 1-bromobutane.

Enzyme      kcat (s⁻¹)        K0.5 (mM)         n               Ksi (mM)         kobs (s⁻¹)    kss (s⁻¹)
DbeA wt     3.91 ± 0.16ᵃ      0.51 ± 0.02ᵃ      1.33 ± 0.09ᵃ    27.06 ± 3.05ᵃ    71 ± 39ᵈ      2.54 ± 0.07ᵈ
DbeA ΔCl    0.038 ± 0.002ᵇ    0.011 ± 0.001ᵇ    2.11 ± 0.01ᵇ    -ᶜ               77 ± 15ᵈ      0.063 ± 0.005ᵈ

K0.5 is the concentration of substrate at half-maximal velocity, kcat is the catalytic constant, n is the Hill coefficient, Ksi is the substrate inhibition constant, kobs is the rate constant for the burst phase of product formation and kss is the rate constant for the steady-state phase of product formation. All measurements were performed at pH 8.6 and 37 °C.
ᵃ Kinetic parameters were calculated using the Hill equation modified by an additional term for substrate inhibition: $v/V_{lim} = [S]^n / (K_{0.5}^n + [S]^n (1 + [S]/K_{si}))$.
ᵇ Kinetic parameters were calculated using the Hill equation: $v/V_{lim} = [S]^n / (K_{0.5}^n + [S]^n)$.
ᶜ Not applicable.
ᵈ Kinetic parameters were calculated using the burst equation: $[P]/[E]_0 = A_0 [1 - \exp(-k_{obs} t)] + k_{ss} t$.
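
The three rate laws given in the footnotes can be written as plain functions, which makes it easy to check how the fitted parameters shape the curves. This is a minimal sketch: the function names are illustrative, the parameter values are taken from Table 2, and the burst amplitude A0 used in the example is a hypothetical value, since it is not reported in the table.

```python
import math


def hill(s, k_half, n):
    """Relative velocity v/Vlim from the Hill equation."""
    return s ** n / (k_half ** n + s ** n)


def hill_with_substrate_inhibition(s, k_half, n, k_si):
    """Hill equation with an additional term for substrate inhibition."""
    return s ** n / (k_half ** n + s ** n * (1.0 + s / k_si))


def burst(t, a0, k_obs, k_ss):
    """Product formed per enzyme, [P]/[E]0: exponential burst plus linear phase."""
    return a0 * (1.0 - math.exp(-k_obs * t)) + k_ss * t


# DbeA wt with 1-bromobutane (Table 2): K0.5 = 0.51 mM, n = 1.33, Ksi = 27.06 mM.
for s_mm in (0.1, 0.51, 5.0, 50.0):
    v = hill_with_substrate_inhibition(s_mm, k_half=0.51, n=1.33, k_si=27.06)
    print(f"[S] = {s_mm:5.2f} mM  v/Vlim = {v:.3f}")

# Burst curve for DbeA wt; a0 = 1.0 is a hypothetical amplitude.
print(f"[P]/[E]0 after 0.5 s: {burst(0.5, a0=1.0, k_obs=71.0, k_ss=2.54):.2f}")
```

Without the inhibition term the relative velocity at [S] = K0.5 is exactly 0.5; with it, the velocity at the same concentration is slightly lower and declines again at high substrate concentrations.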

Figure 6. Rapid quench flow analysis of the burst phase of DbeA wt and DbeA ΔCl reactions. The burst in reaction was monitored upon mixing (A) 155 μM DbeA wt with 1400 μM 1-bromobutane, and (B) 140 μM DbeA ΔCl with 600 μM 1-bromobutane. Solid lines represent the best fits to the bromide ion (empty circles) and 1-butanol (filled circles) kinetic data.


Effect of elimination of the second halide-binding site on stability of DbeA ΔCl

Proper folding of DbeA ΔCl was verified by CD spectroscopy. No differences between the CD spectra of DbeA wt and its variant were observed (Figure 7A), suggesting that the introduced substitutions have no effect on the secondary structure of DbeA ΔCl. To explore the structural resistance of DbeA wt and DbeA ΔCl towards salts, far-UV CD spectra were measured in the presence of various concentrations of sodium chloride. The secondary structure of each protein was preserved at all tested concentrations of the salt (Supplementary Figure 4). In addition, thermally induced denaturation was tested in the presence and absence of various concentrations of sodium chloride to investigate the effect of the second halide-binding site on the thermal stability of the tested enzymes. The introduced mutations had no effect on the stability of DbeA ΔCl in the absence of salt. Its melting temperature (Tm = 58.0 ± 0.2 °C) was almost identical to that of the wt enzyme (Tm = 58.5 ± 0.2 °C). However, the thermal stability of both enzymes changed significantly in the presence of chloride salts (Figure 7B). The Tm values of DbeA wt increased in a concentration-dependent manner over the whole range of sodium chloride concentrations used. The highest thermostability of DbeA wt (Tm = 66.4 ± 0.1 °C) was obtained at the highest tested concentration of sodium chloride (3000 mM). In contrast, Tm values of DbeA ΔCl slightly decreased for sodium chloride concentrations in the range 0-500 mM but increased at higher concentrations (1000-1500 mM). Increasing the salt concentration further caused a drop in the thermal stability of DbeA ΔCl. Finally, at a sodium chloride concentration of 3000 mM, the stability of DbeA ΔCl was almost comparable with that measured in pure buffer (Figure 7B). The second halide-binding site in DbeA wt appears to be responsible for the higher resistance of this enzyme towards thermally induced denaturation in the presence of a high concentration of chloride salts (1000-3000 mM).


Figure 7. Secondary structure and stability of DbeA wt and DbeA ΔCl in the presence of chloride ions. (A) Far-UV CD spectra of DbeA wt and DbeA ΔCl, and (B) thermal stability of DbeA wt and DbeA ΔCl in the presence of various concentrations of sodium chloride.

Discussion

Structural analysis of the newly isolated haloalkane dehalogenase DbeA from B. elkanii USDA94 revealed the presence of two halide-binding sites, both fully occupied by a chloride anion and buried in the protein core. The first halide-binding site, formed by the two halide-stabilizing residues Asn 38 and Trp 104, is responsible for stabilization of the halide ion after the carbon-halogen bond cleavage. This halide-binding site, which is common to all HLDs, has been found to be occupied by various halide ions in a number of crystal structures: chloride, bromide or iodide ion in DhlA (PDB codes: 1b6g, 1edb, 1edd, 2dhd, 2dhe, 1cij, 2eda, 2edc),28,253–255 iodide ion in DhaA (PDB code: 1cqw),79 chloride or bromide ion in LinB (PDB codes: 2g42, 2g4h, 1g5f, 1iz7, 1k5p, 1mj5, 2bfn, 1k63, 1k6e, 1do7, 1iz8),80,256–259 chloride or bromide ion in DmbA (PDB codes: 2qvb, 2o2h, 2o2i),81 and chloride ion in DbjA (PDB codes: 3afi, 3a2n, 3a2m).56

The second site is unique to DbeA and has not previously been observed in crystal structures of related enzymes. The second halide-binding site of DbeA is buried in the protein core and lined by five amino acid residues: Gly 37, Thr 40, Ile 44, Gln 102 and Gln 274. The same halide-binding motif, G-T-I-Q-Q, has been identified in the sequences of evolutionarily closely related (sequence identity 56-76 %) enzymes DbjA from Bradyrhizobium japonicum USDA110 (ref. 42), DmlA from Mesorhizobium loti MAFF303099 (ref. 42), DmhA from Mesorhizobium huakuii subsp. rengei (unpublished data) and three other putative dehalogenase sequences from Mesorhizobium ciceri biovar biserrulae WSM1271 (GI number 319779915), Burkholderia sp. H160 (GI number 209502163) and Chthoniobacter flavus Ellin428 (GI number 196221892). Among these proteins, the tertiary structure has so far only been solved for DbjA,56,288 but the presence of two halide anions in the structure of DbjA was not reported.

The successful removal of the second halide-binding site in the DbeA structure was experimentally confirmed by stopped-flow analysis of chloride binding to DbeA wt and DbeA ΔCl. Elimination of the second halide-binding site dramatically changed the substrate specificity of the enzyme, to the extent that the engineered DbeA ΔCl clustered in a different substrate specificity group (SSG-I) than its parental enzyme, which belongs to SSG-IV. Enzymes in SSG-I are active towards most of the tested substrates, including poorly degradable compounds, e.g., 1,2-dichloroethane, 1,2-dichloropropane, 1,2,3-trichloropropane and chlorocyclohexane. On the other hand, enzymes in SSG-IV are more selective for specific halogenated substrates and predominantly exhibit activity towards terminally substituted brominated and iodinated propanes and butanes.49 Unlike DbeA wt, DbeA ΔCl was clustered to SSG-I due to its decreased preference for iodinated and brominated compounds.
There have been several previous attempts to modify the substrate specificity of HLDs by engineering their access tunnels and active sites,37,49,184 since the architecture and dynamics of these two structural elements are believed to influence the enzyme’s preference for a particular type of substrate.49,69–71,80,265,266 Although changes in the substrate specificity have been achieved by mutagenesis, a switch in the substrate specificity class has not previously been observed.49 In the present study, the marked switch in specificity class caused by elimination of the second halide-binding site suggests that halide ions buried inside the protein core are an important determinant of the substrate specificity of HLDs.

Elimination of the second halide-binding site from the protein interior considerably altered the thermal stability of DbeA in the presence of halide ions, without affecting its secondary structure. The wild-type enzyme was more stable in the presence of chloride salts than its variant. The melting temperature of DbeA wt increased with increasing concentration of salt, whereas only small changes in stability were observed for DbeA ΔCl. Lower stabilization of DbeA ΔCl in the presence of chloride ions is consistent with weaker binding of chloride ions to this enzyme, and therefore a smaller binding energy compared to that of DbeA wt. Our data demonstrate the importance of the second halide-binding site for enzyme stability in the presence of salts. Enzyme stabilization by anions bound to the protein surface as well as the protein core is a phenomenon described in the scientific literature.289–292 Investigation of the role of halide ions in shaping the protein architecture revealed that buried halide motifs are generally associated with high protein stability.293 It was shown that the majority of the stabilization energy in halide motifs buried in the protein interior is not due to electrostatics, but originates from dispersion forces. Besides protein stability, the halide anions buried in the protein core can affect other properties, such as catalytic activity, solubility, crystallizability and allosteric propensity.293

DbeA wt exhibited substantially higher catalytic activity than DbeA ΔCl. The mutant was classified as belonging to the least active HLDs, such as DrbA and DmbC, in contrast to the parental enzyme, which exhibited an overall activity similar to that of DhaA, DmbA and DhlA.49 Pre-steady state kinetic burst experiments showed that the high activity of DbeA wt is connected with acceleration of the hydrolytic step. The second reaction step of hydrolytic dehalogenation catalyzed by HLDs is initiated by attack of an activated water molecule on the alkyl-enzyme intermediate. The catalytic histidine serves as a proton carrier and activates the water molecule. We hypothesize, based on pKa calculations, that a chloride anion bound in close proximity to the catalytic histidine may increase its basicity, and thus may accelerate nucleophilic addition of water to the alkyl-enzyme intermediate.

Several structures in the Protein Data Bank contain halides in the interior or at the protein surface.293,294 For the majority of these structures, the halide ions are likely to be an experimental artifact.294 Two buried halide-binding sites have been reported for a few catalysts belonging to different enzyme classes, e.g., human myeloperoxidase (EC 1.11.1.7),295,296 human testicular angiotensin-converting enzyme (EC 3.4.15.1)297 and photosystem II (EC 1.10.3.9).298,299 In human myeloperoxidase, halides bind to distal and proximal cavities within the catalytic site. The chloride ion bound to the distal cavity serves as a reaction co-substrate and participates in the production of hypochlorous acid. Binding of the second ion to the proximal cavity allows the distal pocket to generate a low-spin heme iron. When hypochlorous acid is generated, the second ion bound to the proximal cavity moves to replace the first one. This action is accompanied by expulsion of a hypochlorous acid molecule. Chloride ions in human myeloperoxidase serve as both a substrate and an inhibitor, and modulate the heme microenvironment.294–296,300,301

In angiotensin-converting enzyme, two buried chloride ions have been found that are separated by 20.3 Å. The two chloride ions are located 20.7 Å and 10.4 Å away from the zinc ion of the active site. The first chloride ion is bound to two arginines, a tryptophan and a water molecule, and is surrounded by a hydrophobic shell of four tryptophan residues. The first chloride is important for stabilization of the substrate in the binding groove, while the second halide ion serves as an ionic “switch” and activates the enzyme by positioning the amino acid residues for the enzyme-substrate complex.297,302–304


Two chloride-binding sites have also been identified in two out of three crystal structures of photosystem II.298,299,304,305 The two chloride ions are separated by approximately 14 Å and are located approximately 6-7 Å from the metal ion of the enzyme active site. Activation of photosystem II by chloride binding to site 1 involves a structural change that results in an optimal framework of amino acids around the oxygen-evolving complex, as well as fine-tuning of the pKa of the residues involved in proton transport. A chloride ion may also be present at the lower-affinity site 2, which, together with the chloride at site 1, is required for maintaining the coordination structure of the oxygen-evolving complex and the opening of the proton channel.298,304

In summary, we show that the newly isolated haloalkane dehalogenase DbeA from B. elkanii USDA94 possesses two fully occupied chloride-binding sites buried in the protein interior. The first halide-binding site, Cl1, is common to all members of the haloalkane dehalogenase family, whereas the second halide-binding site, Cl2, was observed in DbeA for the first time. Elimination of the second halide-binding site by introduction of the double-point mutation I44L+Q102H (i) dramatically changed the substrate specificity, (ii) decreased the catalytic activity by an order of magnitude, (iii) eliminated the substrate inhibition with 1-bromobutane, and (iv) reduced the stability of the enzyme in the presence of high concentrations of chloride salts by 8 °C. Switching of the substrate-specificity class by mutagenesis is demonstrated for the first time for this enzyme family.


CHAPTER 3

Structural and functional analysis of a novel haloalkane dehalogenase with two halide-binding sites

Supplementary information

Acta Crystallogr. D. Biol. Crystallogr. 2014, 70, 1884–1897

DOI: 10.1107/S1399004714009018


Supplementary Table 1. Specific activities of DbeA wt and its variant DbeA ΔCl towards the set of 30 different halogenated substrates. Specific activity is given in µmol s⁻¹ mg⁻¹ of enzyme.

| Substrate | DbeA wt | DbeA ΔCl |
|---|---|---|
| 1-chlorobutane | 0.0010 | 0.0008 |
| 1-chlorohexane | 0.0211 | 0.0006 |
| 1-bromobutane | 0.0297 | 0.0011 |
| 1-bromohexane | 0.0176 | 0.0006 |
| 1-iodopropane | 0.0274 | 0.0019 |
| 1-iodobutane | 0.0236 | 0.0005 |
| 1-iodohexane | 0.0155 | 0.0005 |
| 1,2-dichloroethane | -ᵃ | -ᵃ |
| 1,3-dichloropropane | -ᵃ | 0.0011 |
| 1,5-dichloropentane | 0.0247 | 0.0006 |
| 1,2-dibromoethane | 0.0102 | 0.0098 |
| 1,3-dibromopropane | 0.0357 | 0.0034 |
| 1-bromo-3-chloropropane | 0.0416 | 0.0033 |
| 1,3-diiodopropane | 0.0926 | 0.0021 |
| 2-iodobutane | 0.0017 | 0.0010 |
| 1,2-dichloropropane | -ᵃ | -ᵃ |
| 1,2-dibromopropane | 0.0033 | 0.0023 |
| 2-bromo-1-chloropropane | 0.0013 | 0.0028 |
| 1,2,3-trichloropropane | -ᵃ | 0.0002 |
| bis(2-chloroethyl)ether | 0.0012 | 0.0031 |
| chlorocyclohexane | -ᵃ | -ᵃ |
| bromocyclohexane | 0.0054 | 0.0010 |
| (1-bromomethyl)cyclohexane | 0.0012 | 0.0001 |
| 1-bromo-2-chloroethane | 0.0094 | 0.0010 |
| chlorocyclopentane | 0.0033 | 0.0012 |
| 4-bromobutyronitrile | 0.0412 | 0.0045 |
| 1,2,3-tribromopropane | 0.0059 | 0.0057 |
| 1,2-dibromo-3-chloropropane | 0.0025 | 0.0033 |
| 3-chloro-2-methylpropene | 0.0111 | 0.0050 |
| 2,3-dichloropropene | 0.0025 | 0.0032 |

ᵃ Activity not detectable under tested conditions.
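The effect of the mutations on individual substrates can be read off the table as a ratio of specific activities. The sketch below does this for a few rows copied from Supplementary Table 1. The helper name and the choice of rows are illustrative only, and a simple ratio is not the statistical procedure used to assign substrate specificity groups.

```python
# Specific activities (umol s^-1 mg^-1) copied from Supplementary Table 1
# as (DbeA wt, DbeA dCl); None marks "activity not detectable".
activities = {
    "1-chlorobutane": (0.0010, 0.0008),
    "1-bromobutane": (0.0297, 0.0011),
    "1-iodobutane": (0.0236, 0.0005),
    "1,3-dichloropropane": (None, 0.0011),
    "1,3-diiodopropane": (0.0926, 0.0021),
    "1,2,3-trichloropropane": (None, 0.0002),
    "2-bromo-1-chloropropane": (0.0013, 0.0028),
}

def fold_change(wt, mutant):
    """Return wt/mutant activity, or None if either value is undetectable."""
    if wt is None or mutant is None:
        return None
    return wt / mutant

for substrate, (wt, mutant) in activities.items():
    ratio = fold_change(wt, mutant)
    label = "n/a (not detectable in one enzyme)" if ratio is None else f"{ratio:.1f}x"
    print(f"{substrate:26s} wt/dCl = {label}")
```

For the selected rows, the largest losses of activity are seen with brominated and iodinated substrates, while two chlorinated substrates are converted only by the variant.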


Supplementary Table 2. Steady state kinetic constants of DbeA and other HLDs with 1-chlorobutane, 1-bromobutane and 1,3-dibromopropane.

| Substrate | Enzyme | kcat (s⁻¹) | K0.5 (mM) | n | Ksi (mM) | m |
|---|---|---|---|---|---|---|
| 1-chlorobutane | DbeA | 0.17 ± 0.01 | 3.23 ± 0.39 | -ᵃ | -ᵃ | -ᵃ |
| | DbjAᵇ | 1.40 ± 0.42 | 5.62 ± 0.41 | -ᵃ | -ᵃ | -ᵃ |
| | DhaA | 0.48 ± 0.01 | 0.24 ± 0.01 | -ᵃ | -ᵃ | -ᵃ |
| | LinBᶜ | 1.11 ± 0.03 | 0.23 ± 0.02 | -ᵃ | -ᵃ | -ᵃ |
| | DmbAᵇ | 0.08 ± 0.01 | 0.16 ± 0.04 | -ᵃ | -ᵃ | -ᵃ |
| 1-bromobutane | DbeA | 3.91 ± 0.16 | 0.51 ± 0.02 | 1.33 ± 0.09 | 27.06 ± 3.05 | -ᵃ |
| | DbjAᵇ | 1.14 ± 0.08 | 0.01 ± 0.004 | -ᵃ | 2.40 ± 0.79 | -ᵃ |
| | DhaAᵈ | 0.98 ± 0.05 | 0.35 ± 0.02 | -ᵃ | -ᵃ | -ᵃ |
| | LinB | 2.26 ± 0.01 | 0.12 ± 0.01 | 0.44 ± 0.01 | -ᵃ | -ᵃ |
| | DmbAᵇ | 0.24 ± 0.03 | 2.71 ± 0.97 | -ᵃ | -ᵃ | -ᵃ |
| 1,3-dibromopropane | DbeA | 7.62 ± 1.48 | 1.99 ± 0.49 | -ᵃ | 1.64 ± 0.31 | 2.29 ± 0.21 |
| | DbjAᵇ | 3.60 ± 0.49 | 0.22 ± 0.07 | 6.98 ± 2.91 | | |
| | DhaA | 2.50 ± 0.32 | 0.14 ± 0.06 | -ᵃ | 1.70 ± 0.40 | -ᵃ |
| | LinB | 40.9 ± 5.20 | 24.1 ± 3.23 | -ᵃ | 0.49 ± 0.06 | -ᵃ |
| | DmbAᵇ | 9.20 ± 1.17 | 4.52 ± 0.71 | -ᵃ | 2.65 ± 0.49 | -ᵃ |

K0.5 - concentration of substrate at half maximal velocity, kcat - catalytic constant, n - Hill coefficient, Ksi - substrate inhibition constant, m - Hill coefficient in inhibitory mode. All measurements were performed at pH 8.6 and 37 °C. ᵃ Not applicable. ᵇ Data from ref. 306. ᶜ Data from ref. 184. ᵈ Data from ref. 261.

Supplementary Table 3. Thermal stability of DbeA and other biochemically characterized HLDsᵃ quantified by melting temperatures.

| HLD enzyme | Tm (°C) |
|---|---|
| DbeA | 58.5 ± 0.2 |
| DbjA | 53.6 ± 0.6 |
| DhaA | 50.4 ± 0.3 |
| LinB | 48.0 ± 0.5 |
| DmbA | 52.7 ± 0.2 |
| DhlA | 39.2 ± 0.0 |
| DatA | 48.3 ± 0.2 |
| DmbBᵇ | 57.4 ± 0.6 |
| DmbCᵇ | 45.8 ± 0.4 |
| DrbAᵇ | 39.4 ± 0.1 |

ᵃ Data from ref. 49. ᵇ The thermal stability of DmbB, DmbC and DrbA was determined under different experimental conditions (heating rate 0.5 °C/min) compared to the other HLDs (heating rate 1.0 °C/min).

Supplementary Figure 1. (A) Gel filtration chromatogram of DbeA and molecular weight calibration standards. Red line, DbeA; line 1, aldolase (158 kDa); line 2, conalbumin (75 kDa); line 3, ovalbumin (43 kDa); line 4, chymotrypsinogen A (25 kDa); line 5, ribonuclease A (14 kDa). (B) Native electrophoresis of DbeA, molecular weight standards and other haloalkane dehalogenases. Red lane, DbeA; lane 1, DbjA (68 kDa); lane 2, DhaA (33 kDa); lane 3, LinB (33 kDa); lane 4, DmbA (34 kDa); lane 5, albumin (67 kDa); lane 6, ovalbumin (43 kDa). The theoretical molecular weight (MW) of the DbeA monomer and dimer is 34 and 68 kDa, respectively. The experimentally determined MW of DbeA is 64 kDa, suggesting that DbeA forms a dimer under the tested conditions.
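The reasoning behind the dimer assignment is a simple ratio of the measured mass to the theoretical monomer mass. As a minimal illustration (the function name is hypothetical):

```python
def oligomeric_state(mw_experimental_kda, mw_monomer_kda):
    """Nearest whole number of subunits implied by the measured mass."""
    return round(mw_experimental_kda / mw_monomer_kda)

# 64 kDa measured, 34 kDa theoretical monomer: the ratio is about 1.9.
print(oligomeric_state(64, 34))  # prints 2
```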


Supplementary Figure 2. Far-UV CD spectra of DbeA and other related HLDs.


Supplementary Figure 3. Stopped flow fluorescence analysis of bromide binding to DbeA wt and DbeA ΔCl. (A) Fluorescence traces obtained upon mixing 30 μM DbeA wt with bromides. (B) Fluorescence traces obtained upon mixing 30 μM DbeA ΔCl with bromides. (C) Bromide concentration dependence of rapid equilibrium fluorescence quench of DbeA wt with equilibrium constant (Kd1) 0.33 ± 0.08 M. (D) Bromide concentration dependence of rapid equilibrium fluorescence quench of DbeA ΔCl with equilibrium constant (Kd) 0.09 ± 0.01 M. (E) Bromide dependence of the amplitude of the slow exponential fluorescence quench of DbeA wt with equilibrium constant (Kd2) 0.84 ± 0.13 M. Solid lines represent best fits to the data based on the Stern-Volmer equation $F/F_0 = (1 + f K_q [Br^-]) / (1 + K_{Br} [Br^-])$, in which F/F0 is the relative fluorescence; f is the relative fluorescence intensity of the enzyme-bromide complex; KBr is the association equilibrium constant of specific binding of bromide; Kq is the quenching constant, which is the apparent association equilibrium constant of the non-specific quenching interaction between bromide and the fluorophore. (F) Bromide dependence of the observed rate constants of the slow exponential fluorescence quench of DbeA wt. The solid line represents the best fit to the data using the equation $k_{obs} = k_{assoc} [Br^-] + k_{dissoc}$, where kassoc and kdissoc are the association (1.30 ± 0.08 M⁻¹ s⁻¹) and dissociation (0.21 ± 0.08 s⁻¹) rate constants, respectively.


Supplementary Figure 4. Far-UV CD spectra of DbeA wt (A) and its variant DbeA ΔCl (B) in the presence and absence of various concentrations of sodium chloride.


CHAPTER 4

CAVER Analyst 1.0: Graphic tool for interactive visualization and analysis of tunnels and channels in protein structures

Bioinformatics 2014, 30, 2684-2685

DOI: 10.1093/bioinformatics/btu364


Abstract

The transport of ligands, ions or solvent molecules into proteins with buried binding sites or through the membrane is enabled by protein tunnels and channels. CAVER Analyst is a software tool for calculation, analysis and real-time visualization of access tunnels and channels in static and dynamic protein structures. It provides an intuitive graphic user interface for setting up the calculation and interactive exploration of identified tunnels/channels and their characteristics.

Introduction

Binding sites of many proteins are deeply buried in the protein core and are connected with the surrounding environment by access tunnels.167 Protein channels are found in transmembrane proteins and are important for the traffic of small molecules through the membranes. Mutations in these structures can lead to serious hereditary diseases. The shape, physico-chemical properties and dynamics of tunnels/channels determine the accessibility of binding sites to ligands, ions or solvent molecules and thereby the biological activity of the protein.167,307 Tunnels/channels can be found in a wide range of proteins, making their analysis important and useful. Due to the intrinsic protein dynamics, the tunnels/channels significantly change their shape and properties over time.85,87,308 Interactive calculation and visualization of tunnels/channels with their characteristics can facilitate the study of important biochemical phenomena, as well as the design of new catalysts or effective drugs.37,167,171,188 To meet these needs, we have developed the CAVER Analyst (Figure 1), which complements the recently published command-line CAVER 3.0 for identification of tunnels in static and dynamic protein structures.87 Unlike other available graphic interfaces for tunnel analysis, such as MolAxis,309 PoreWalker310 and MOLEonline 2.0,183 which are limited to tunnel analysis in a single static structure,177 the CAVER Analyst enables comparative analysis of tunnels/channels in homologous structures and the study of their dynamics.


Figure 1. The graphical user interface of the CAVER Analyst 1.0.

Tunnel/channel Calculation

CAVER Analyst integrates CAVER 3.0 for identification of tunnels in static structures and molecular dynamic trajectories. The calculation can be set and performed directly using the CAVER Analyst interface. The most important calculation settings are available through the Tunnel Computation window (Supplementary Figure 1), while the advanced parameters can be set in the Tunnel Advanced Settings window or loaded from the configuration file. The starting point for the calculation can be derived from: (i) the catalytic residues automatically loaded from the databases, (ii) automatically identified cavities, (iii) interactive selections or (iv) manually specified atoms, residues or coordinates. The starting point can be manually adjusted.
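When the starting point is derived from several manually specified atoms or residues, the selection has to be reduced to a single coordinate. A simple way to do this is to take the geometric centre of the selected atoms, sketched below. Whether CAVER Analyst computes the point in exactly this way is an assumption here, and the coordinates are invented for illustration.

```python
def centroid(points):
    """Geometric centre of a list of (x, y, z) coordinates."""
    if not points:
        raise ValueError("at least one coordinate is required")
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

# Hypothetical coordinates (in angstroms) of atoms picked to define the start.
selected_atoms = [(10.0, 4.0, 2.0), (12.0, 6.0, 2.0), (11.0, 5.0, 5.0)]
print(centroid(selected_atoms))  # (11.0, 5.0, 3.0)
```

A point obtained this way may still fall inside an atom, which is one reason why a manual adjustment of the starting point, as mentioned above, can be useful.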

Cavity Calculation

CAVER Analyst integrates an efficient algorithm for computation and visualization of molecular surfaces and cavities. The algorithm is based on the additively weighted Voronoi diagram and allows users to access a real-time analysis of cavities in static structures (Supplementary Figure 2).


Visualization

Protein structures can be visualized using all standard visualization styles and colored according to various criteria (Supplementary Figure 3). The visualization of molecular surface can be customized by setting the level of the surface refinement and transparency. For exploration of the inner protein structure, CAVER Analyst offers clip planes for cutting off parts of the molecule. Tunnels and channels can be visualized by: (i) center-lines indicating tunnel location and curvature, (ii) spheres showing approximate tunnel geometry, omitting asymmetrical parts of the tunnel and (iii) detailed surface, which presents the tunnel geometry more accurately, including the asymmetrical parts (Supplementary Figure 3). Tunnels/channels identified throughout the trajectory can also be visualized in one snapshot as their center-lines.

Statistics

The summary table lists the statistics of individual tunnel clusters, including their priority, frequency, average bottleneck radius, length, curvature and throughput (Supplementary Figure 4). By selecting a particular cluster, characteristics of all individual tunnels/channels from a given cluster are depicted. Furthermore, each individual tunnel/channel can be explored at the level of its profile, showing changes in its radius and neighboring residues along its length. Additionally, the tables listing the tunnel-lining residues and the bottleneck residues are provided for each tunnel cluster. All tables can be sorted by user-selected characteristics and data can be exported as CSV or text files. All tables are interactively interconnected with the visualization window, enabling the interpretation of the results in the context of the three-dimensional protein structure. The tunnels and residues selected in the tables are automatically highlighted in the structure and thus can be simultaneously explored in the tables and visualization window.
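To make the listed characteristics concrete, the sketch below derives three of them from a tunnel represented as a sequence of spheres along its center-line. The data layout, the function name and the example profile are hypothetical. Curvature is taken here as the path length divided by the straight-line distance between the two ends, which is an assumed definition rather than one given in the text.

```python
import math

def tunnel_statistics(profile):
    """Summarise a tunnel given as a list of (x, y, z, radius) spheres.

    Curvature is taken as path length divided by the straight-line distance
    between the first and last sphere centres (an assumed definition).
    """
    centres = [p[:3] for p in profile]
    length = sum(math.dist(a, b) for a, b in zip(centres, centres[1:]))
    straight = math.dist(centres[0], centres[-1])
    return {
        "bottleneck_radius": min(p[3] for p in profile),
        "length": length,
        "curvature": length / straight,
    }

# Hypothetical profile (angstroms): an L-shaped path with a narrow point.
profile = [(0, 0, 0, 2.0), (3, 0, 0, 1.2), (3, 4, 0, 1.8)]
print(tunnel_statistics(profile))
```

For a molecular dynamics trajectory, the same function could be applied to the tunnel found in each snapshot and the results averaged per cluster.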

Graphs

Characteristics of individual tunnels/channels can be plotted in the form of 2D graphs. Users can plot in a single graph the profiles of different tunnels, or the profiles of the same tunnel identified in different structures or snapshots, which significantly facilitates tunnel comparison (Supplementary Figure 5). For molecular dynamics, the tunnel/channel profiles can be animated by switching between individual snapshots. Besides the profiles, this module enables users to display the time evolution of tunnel characteristics (bottleneck and average radius, length and curvature) in the form of 2D graphs or heat plots (Supplementary Figure 6). All graphs can be saved as images or exported as raw data.


Other

Besides the features specific to the tunnel/channel analysis, the CAVER Analyst provides a number of additional features: (i) structure alignment by combinatorial extension,311 (ii) addition of hydrogen atoms, (iii) changing the protonation of titratable residues,201 (iv) selections, (v) coloring and labeling, (vi) management of workspaces, etc.

Implementation

CAVER Analyst is a multi-platform JAVA-based software. It can run on both 32- and 64-bit system architectures with JAVA 1.7 or a later version (see Supplementary information for implementation).


CHAPTER 4

CAVER Analyst 1.0: Graphic tool for interactive visualization and analysis of tunnels and channels in protein structures

Supplementary information

Bioinformatics 2014, 30, 2684-2685

DOI: 10.1093/bioinformatics/btu364


Implementation Details

CAVER Analyst is a JAVA-based, platform-independent software. Thanks to its modular architecture, it can be easily customized and distributed as a standalone application or run from a web browser via JAVA Web Start technology. The second option does not require local installation. CAVER Analyst runs on common hardware without the need for special upgrades, which allows seamless integration into existing IT environments.

The application is supported on the following operating systems: Windows XP, Vista, 7 or 8; Mac OS X 10.6 or later; and major distributions of Linux, including Fedora Core, Red Hat and Ubuntu. It can run on both 32- and 64-bit system architectures and requires JAVA version 1.7 or later. For processing of small data sets, a 32-bit architecture with 2 to 4 GB RAM is a sufficient hardware configuration. Large data sets may require a 64-bit architecture and 8 to 46 GB RAM. To utilize the advanced visualization techniques present in CAVER Analyst, a dedicated graphics card is recommended (AMD Radeon or NVIDIA GeForce).

The installable version of CAVER Analyst is distributed as a complete package with all modules required for running the application, including example data sets and a user guide.


Supplementary Figure 1. Tunnel calculation settings. The starting point for the calculation (visualized as the origin of arrows) can be specified either automatically (loaded from the Catalytic Site Atlas) or manually (from selections, calculated cavities and from specified atoms, residues or coordinates). The derived starting point can be transferred to any other loaded structure. All calculation settings can be loaded/saved from/to an external file. Users are allowed to edit all advanced parameters used by CAVER 3.0. Layout of CAVER Analyst (top), detail of tunnel calculation settings (bottom).


Supplementary Figure 2. Cavity computation. Any cavity can be clipped, hidden or selected for obtaining starting point for tunnel calculation. Estimation of volume of each cavity is also provided. Layout of CAVER Analyst (top), detail of characteristics of the computed cavities (bottom left), detail of cavity computation settings (bottom right).


Supplementary Figure 3. Demonstration of visualization styles and coloring techniques. Visualization styles for the protein structure and calculated tunnels can be accessed in three ways: (i) from the Visualization button, (ii) from the symbolic buttons placed in the top bar and (iii) from the Visualization button placed in the Structure Overview tab. The coloring scheme is accessible from the top bar and from the Structure Overview tab. Users can adjust all coloring schemes. Layout of CAVER Analyst (top), detail of structure/tunnel visualization techniques (bottom left), detail of coloring schemes (bottom right).


Supplementary Figure 4. Advanced statistics during molecular dynamics. Overall characteristics can be displayed for all tunnel clusters or for any individual tunnels from a given cluster. Tunnel-lining and bottleneck residues of the main tunnel (green spheres) from a representative snapshot are depicted as green and blue bars, respectively. The tunnels and residues selected in the tables are automatically highlighted in the structure and thus can be simultaneously explored in the tables and visualization window. Layout of CAVER Analyst (top), overview of tunnel cluster statistics over a molecular dynamics simulation (bottom).


Supplementary Figure 5. Comparative analysis of the main access tunnels. Bottleneck radii across the tunnel length are plotted as 2D graphs. Tunnel characteristics can be exported either as figures (PNG) or as text (CSV). Layout of CAVER Analyst (top), detail of tunnel graphs and alignment tool (bottom).


Supplementary Figure 6. Analysis of molecular dynamics. The time-dependent evolution of the protein structure and its tunnel matrix can be visualized, and dynamic features of tunnels can be plotted either as 2D graphs or as heat plots. All characteristics can be saved as images or exported as raw data. Layout of CAVER Analyst (top), detail of heat plots (bottom).


CHAPTER 5

Structural basis of paradoxically thermostable dehalogenase from psychrophilic bacterium

under review


Summary

Haloalkane dehalogenases are enzymes with broad application potential in biocatalysis, bioremediation, biosensing and cell imaging. The novel haloalkane dehalogenase DmxA, originating from the psychrophilic bacterium Marinobacter sp. ELB17, possesses the highest thermal stability of all currently known wild-type dehalogenases. The enzyme was successfully expressed and its crystal structure was solved at a resolution of 1.45 Å. The structure of DmxA contains several features not common to other enzymes of the dehalogenase family, namely a unique composition of the catalytic pentad and a dimeric state mediated by a cysteine bridge. By a combination of in silico and experimental approaches, we discovered that the narrow tunnels detected in DmxA play a crucial role in the paradoxical stability of the enzyme.

Highlights

• A novel haloalkane dehalogenase of extremophilic origin was expressed and characterized.
• DmxA possesses the highest thermostability of all haloalkane dehalogenases characterized so far.
• The enzyme possesses a unique composition of the catalytic pentad and forms a dimer via covalent binding.
• Narrow tunnels were identified as the crucial feature behind the stability of DmxA.

Introduction

Haloalkane dehalogenases (HLDs; EC 3.8.1.5) are a family of enzymes that catalyse hydrolytic cleavage of the carbon-halogen bond in a wide range of aliphatic halogenated hydrocarbons and their derivatives via SN2 nucleophilic substitution followed by the addition of water, releasing a halide ion, a proton and the corresponding alcohol as the products of the reaction.38,287 Apart from a water molecule, the enzymes do not require any cofactors. HLDs belong to the superfamily of α/β-hydrolases, a well-distinguished group of structurally similar enzymes that includes luciferases, esterases, lipases and epoxide hydrolases.251 HLDs are important enzymes from the biotechnological point of view, with application potential in biodegradation,52,230,312 biocatalysis,56,313 biosensing,54 degradation of warfare agents55 and cell imaging,58 as reviewed recently.72,231 Dehalogenating enzymes can be found in various groups of microorganisms, including soil and marine microbes, plant pathogens and mycobacteria. Currently, there are over twenty biochemically characterized haloalkane dehalogenases, which can be divided into three subfamilies according to their primary structure similarities and active site compositions.68


The structure of HLDs is composed of a conserved main (or core) domain consisting of 8 β-strands and 6 α-helices, and a versatile cap domain formed mostly by α-helices. The active site containing the catalytic pentad is situated in a hydrophobic pocket buried between the main and cap domains. The active site of HLDs is accessible via a main tunnel and a slot tunnel,80,266 and both tunnels are crucial for determination of enzyme activity and substrate specificity.265 The catalytic pentad of HLDs consists of a nucleophile, a base, an acid and two halide-stabilizing residues.38 The composition of the catalytic pentad varies among the three HLD subfamilies: Asp-His-Asp+Trp-Trp in HLD-I, Asp-His-Glu+Asn-Trp in HLD-II, and Asp-His-Asp+Asn-Trp in HLD-III.68 Over 100 halogenated compounds have been found to be substrates of HLDs, including brominated, chlorinated and iodinated chemicals. Based on the preferences of each member of the HLD family, the enzymes can be divided into four substrate specificity groups,49 distinguishing enzymes with broad substrate specificity from HLDs with a strong preference towards a limited number of substrates. Because different structural features affect the activity, it is not possible to predict the substrate specificity of putative HLDs. The rate-limiting steps also differ among the enzymes, from halide release in DhlA,30 to release of the alcohol and cleavage of the carbon–halogen bond in DhaA,37 or hydrolysis of the alkyl-enzyme intermediate in LinB.29
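The subfamily-specific pentads listed above can be written as a small lookup table. The sketch below is illustrative only. It assumes that the residue order in the text (e.g. Asp-His-Asp+Trp-Trp) follows the order nucleophile, base, acid, then the two halide-stabilizing residues, and the second query is an invented, non-canonical pentad rather than that of any particular enzyme.

```python
# Catalytic pentad composition per subfamily, as listed in the text. Assumed
# order: (nucleophile, base, acid, halide-stabilizing 1, halide-stabilizing 2).
PENTADS = {
    "HLD-I": ("Asp", "His", "Asp", "Trp", "Trp"),
    "HLD-II": ("Asp", "His", "Glu", "Asn", "Trp"),
    "HLD-III": ("Asp", "His", "Asp", "Asn", "Trp"),
}

def subfamily_for(pentad):
    """Return the subfamily whose canonical pentad matches, else None."""
    for name, canonical in PENTADS.items():
        if tuple(pentad) == canonical:
            return name
    return None

print(subfamily_for(["Asp", "His", "Glu", "Asn", "Trp"]))  # HLD-II
print(subfamily_for(["Asp", "His", "Glu", "Gln", "Trp"]))  # None (hypothetical pentad)
```

A pentad that matches none of the three canonical compositions returns None, which is how an unusual active-site composition would show up in such a lookup.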

The first HLD was described in 1985;40 since then, many new members of the family have been biochemically characterized and over 6,000 putative genes have been discovered.49 HLDs can be found in many microorganisms inhabiting soil39,40,42,43 and marine environments,45,46,83,314 and even in mycobacterial strains.44,262,314 One haloalkane dehalogenase was characterized from a psychrophilic aquatic bacterium.264 Organisms inhabiting extreme environments (thermophiles, psychrophiles, halophiles, etc.) are an important source of enzymes with unique properties of high importance for biocatalysis and other biotechnological applications.315,316 Enzymes found in extremophiles show many structural and kinetic adaptations, distinct from their mesophilic counterparts, that enable their activity in hostile conditions. Of all extremozymes, thermostable enzymes have attracted the most attention over the past decades. Several features are generally shared by thermostable and thermotolerant enzymes and proteins. Among those, the inner hydrophobicity seems to be the most important for preserving the structure at elevated temperatures.317 This is supported by the observations of increased rigidity of thermophilic enzymes,318 although sometimes only local rigidity of the protein structure is sufficient for maintaining the structural integrity at high temperatures. The amino acid composition may be highly individual; nevertheless, thermophilic enzymes are characterized by an increased proportion of charged and hydrophobic residues.319 Other factors that can influence the stability of the protein structure are the presence of salt bridges,320 electrostatic interactions or hydrogen bonds.319 However, not all types of hydrogen bonds can provide stabilization of the protein structure and they are considered less effective than hydrophobic interactions.317,321 Some thermostable enzymes were found in psychrophilic and mesophilic organisms, e.g., dehydrogenase, aspartase and alcohol dehydrogenase from Flavobacterium frigidimaris KUC-1,322–324 endoglucanase from Fusarium oxysporum,325 isocitrate dehydrogenase from Desulfotalea psychrophila,326 or haloacid dehalogenase from Psychromonas ingrahamii.327 This paradox has not been conclusively explained yet, and it remains unclear why these organisms possess such stable enzymes. Nonetheless, the study performed by Novak et al.327 suggests horizontal gene transfer as the reason for the presence of a thermostable enzyme in a psychrophilic organism.

In the present study, we describe the biochemical and structural characterization of DmxA, a novel haloalkane dehalogenase from the psychrophilic bacterium Marinobacter sp. ELB17 isolated from an Antarctic lake. The crystallographic analysis revealed several unique features of DmxA: (i) dimerization of DmxA units occurs via a cysteine bridge, (ii) DmxA possesses an unusual composition of the catalytic pentad in which only Trp serves as a halide-stabilizing residue, and (iii) the tunnels of DmxA are very narrow and closed. Molecular dynamics and in silico tunnel analysis, followed by mutagenesis and functional analysis of DmxA variants, revealed changes in substrate specificity and provided detailed insight into the role of the tunnels in the paradoxical stability of the enzyme.

Results

Expression and biochemical characterization of a novel HLD DmxA

The gene encoding the putative HLD DmxA was identified in the genome of Marinobacter sp. ELB17, a psychrophilic gammaproteobacterium isolated from the east lobe of Lake Bonney in the Taylor Valley, Antarctica.328 The dmxA gene was subcloned into the pET21b vector, and the protein, equipped with a C-terminal His-tag, was overexpressed in E. coli BL21 (DE3) and purified to homogeneity, yielding 390 mg of soluble protein per liter of culture with a purity over 95 %. Circular dichroism (CD) spectroscopy was used to confirm the correct folding of the enzyme. The spectrum corresponded to the spectra of other related HLDs. The same technique was used to follow thermal unfolding. The melting temperature (Tm) was determined to be 65.9 ± 0.1 °C, which is higher than that of any other biochemically characterized HLD (Figure 1). Such stability of DmxA is rather paradoxical, since Marinobacter sp. ELB17, from which the dmxA gene originates, grows optimally at 12-15 °C.328

Figure 1. The comparison of melting temperatures of selected biochemically characterized haloalkane dehalogenases with novel dehalogenase DmxA and its variants. Of all known dehalogenases, DmxA exhibits the highest stability (Tm = 65.9 ± 0.1 °C). Error bars represent standard deviations from three independent experiments. The melting temperature of wild-type HLDs and DmxA variants is represented as grey and white bars, respectively.

The substrate specificity of DmxA was assayed with 30 selected halogenated substrates representing a wide range of structures and physicochemical properties.49 Out of this set, DmxA exhibited activity with 26 compounds (Figure 3A). The substrate spectrum of DmxA is relatively broad, with a slight preference towards brominated or bromo-chlorinated substrates and low or zero activity towards chlorinated substrates. Although the overall activity of the enzyme is low compared to other HLDs, the principal component analysis classified DmxA into SSG-I together with the DbjA, DhaA, DhlA, and LinB enzymes (Figure 3B). All these enzymes possess broad substrate specificity and usually convert even chlorinated substrates well.

Measurement of activity at different temperatures revealed that the highest activity of DmxA is achieved at 55 °C (Supplementary Figure 3). The enzyme retains ca. 20 % of its activity at 60 °C, which corresponds to the thermal stability of the enzyme. Unexpectedly, the enzyme was also active at 20 °C, although the activity gradually decreased with decreasing temperature. DmxA also exhibited a broad pH profile, with the highest activity at pH 8.99 and 95 % activity in the pH range 7.09-8.99 (Supplementary Figure 4). The activity rapidly decreased below pH 5.65 and above pH 9.53. Such pH tolerance is understandable given the thermal stability of the protein structure, but is rather unusual among HLDs. The only enzyme with a broader pH profile is DbjA.329

The steady-state kinetics determined with three substrates revealed a complex kinetic mechanism of the enzyme, with slight differences between substrates. The most complex mechanism was observed with 1,3-dibromopropane, which involved cooperativity together with hyperbolic substrate inhibition. Cooperativity and/or substrate inhibition was part of the kinetic mechanism also with remaining substrates. The detailed kinetic parameters are presented in Supplementary Table 1.

From a biotechnological point of view, the enantioselectivity might be the second most important feature of DmxA. The enzyme is highly selective with β-brominated alkanes as well as α-brominated esters. The determined E-values for 2-bromopentane and ethyl 2-bromopropionate are 100 and >200, respectively. Comparison of enantioselectivity of DmxA with other HLDs is provided in the Supplementary Table 2.

Structural characterization of DmxA

DmxA was crystallized as described previously330 and the crystal structure was solved at a resolution of 1.45 Å (Table 1). The overall structure and domain organization are very similar to those of other related HLD enzymes. The core domain adopts the typical α/β-hydrolase fold, consisting of eight β-strands, with β2 antiparallel to the others, flanked by four α-helices. The cap domain, inserted between β-strand β6 and α-helix α8, consists of five short α-helices connected by six loop insertions. The active site is centrally buried and protected by the cap domain. The catalytic pentad of DmxA contains the catalytic base H273, the nucleophile D105, the acid E129, and two halide-stabilizing residues: W106 and, surprisingly, Q40.

The peaks in the difference electron density map in the vicinity of the active site were interpreted as water molecules and an acetate ion, all with an occupancy of 1.0 in both chain A and chain B. One water molecule was situated in the canonical halide-binding pocket of the enzyme and interacted with the Nε1 atom of the halide-stabilizing residue W106 at a distance of 2.93 Å. Further coordination of this water molecule was provided by two oxygen atoms of the acetate molecule, including that of its hydroxyl group, at distances of 2.57 Å and 3.12 Å, respectively, and by the backbone oxygen atom O of P206 at a distance of 3.15 Å. The Nε2 atom of the second halide-stabilizing residue, Q40, was situated 5.33 Å away from the water molecule and was not involved in the halide-binding process of DmxA. The catalytic nucleophile D105 further interacted with the acetate anion through a hydrogen bond between D105 Oδ1 and the oxygen atom O of the hydroxyl group of the acetate ion, at a distance of 2.41 Å. DmxA was crystallized as a homodimer with the monomeric units connected via a disulfide bridge formed by the C294 residues. This feature explained our previous observations of DmxA on native gel electrophoresis, where it was present as a monomer-dimer mixture with varying proportions of the monomeric and dimeric forms (Supplementary Figure 2). Protonation of the cysteine is suggested as the reason for this behavior of the enzyme in solution.

Table 1. Diffraction data collection and refinement statistics. Values in parentheses are for the highest-resolution shell.

Data collection statistics
  Beamline                                      ID29
  Wavelength (Å)                                0.972
  Detector                                      Pilatus 6M-F
  Crystal-to-detector distance (mm)             265
  Rotation range per image (°)                  0.05
  Exposure time per image (s)                   0.037
  Resolution range (Å)                          100.0 - 1.45 (1.49 - 1.45)
  Space group                                   P 21 21 21 (19)
  Unit-cell parameters (Å; °)                   a = 43.37, b = 78.34, c = 150.5; α = β = γ = 90.0
  Mosaicity (°)                                 0.2
  Matthews coefficient (Å3/Da) /
    number of molecules in AU                   1.88 / 2
  Solvent content (%)                           34.67
  Total no. of measured intensities             484657 (37044)
  Number of unique reflections                  39029 (5978)
  Redundancy                                    5.28 (5.52)
  Average I/σ(I)                                8.02 (2.1)
  Completeness (%)                              99.7 (99.9)
  Rmeas (%) a                                   9.1 (71.9)
  Rmerge (%) b                                  11.2 (62.1)
  Wilson B (Å2)                                 21.048

Refinement
  No. of reflections in working set             86979 (6370)
  Maximum resolution (Å)                        1.45
  R value (%) c                                 19.56
  Rfree value (%) d                             22.30
  RMSD bond length (Å)                          0.0064
  RMSD angle (°)                                1.2196
  No. of atoms in AU                            5375
  No. of water molecules in AU                  439
  No. of acetate ions in AU                     13
  Mean B value (Å2)                             18.242
  Ramachandran plot statistics:
    Residues in preferred regions (%)           95.02
    Residues in allowed regions (%)             4.09
    Residues outliers (%)                       0.89

a Rmeas = Σhkl [N/(N-1)]^(1/2) Σi |Ii(hkl) - ⟨I(hkl)⟩| / Σhkl Σi Ii(hkl), where ⟨I(hkl)⟩ is the mean of the N(hkl) individual measurements Ii(hkl) of the intensity of reflection hkl.
b Rmerge = Σhkl Σi |Ii(hkl) - ⟨I(hkl)⟩| / Σhkl Σi Ii(hkl), where Ii(hkl) is the ith observation of reflection hkl and ⟨I(hkl)⟩ is the weighted average intensity for all observations of reflection hkl.
c R value = Σ||Fo| - |Fc|| / Σ|Fo|, where Fo and Fc are the observed and calculated structure factors, respectively.
d Rfree is equivalent to the R value but is calculated for 5 % of the reflections chosen at random and omitted from the refinement process.

Construction of variant DmxA C/S with eliminated cysteine bridge

Since cysteine bridges are known contributors to protein stability, we examined the effect of C294 on stabilization of the DmxA dimer by replacing it with serine (referred to here as the DmxA C/S variant, Figure 2). Native gel electrophoresis confirmed that this variant is present only as a monomer in solution (Supplementary Figure 2), and CD spectroscopy verified the correct folding and secondary structure of the mutant (Supplementary Figure 1). DmxA C/S exhibited almost the same thermostability (Tm = 65.3 ± 0.2 °C, Figure 1) as the wild-type, and only small changes in specific activity were observed. The PCA clustered DmxA C/S near the DmxA wild-type (Figure 3). No substantial change in enantioselectivity was observed (Supplementary Table 2). The cysteine bridge between the two DmxA units has, therefore, no substantial effect on enzyme stability or activity.

Construction of variant DmxA Q/N with substituted halide-stabilizing residue

Q40 in the active site of DmxA was replaced with asparagine (referred to here as the DmxA Q/N variant, Figure 2) to assess the effect of the unique halide-stabilizing residue on activity and stability. The introduced mutation led to a decrease in thermostability of 2 °C (Tm = 63.9 ± 0.2 °C; Figure 1), which indicates that the glutamine, in its disordered position, interacts with other residues in the active site and provides local stabilization of the enzyme. The replacement resulted in significant changes in the substrate specificity of the mutant (Figure 3A) and a shift from SSG-I to SSG-IV (Figure 3B). Such changes in the substrate specificity of HLDs have previously been achieved only by protein engineering, e.g., by elimination of one halide-binding site in DbeA.82 DmxA Q/N is the most active variant of DmxA, with increased activity towards 11 out of 30 tested substrates. Prominent changes were also observed in the steady-state kinetics, most importantly in the case of 1,3-dibromopropane, with a many-fold increase in kcat (Supplementary Table 1). The enantioselectivity of DmxA Q/N remained comparable with that of the wild-type (Supplementary Table 2).

Figure 2. The overall structure of DmxA (light blue surface) and its variants. Central part represents DmxA dimer formed via disulfide bridge between C294 residues close to C-terminus of the enzyme. Mutant DmxA C/S was constructed by replacing cysteine with serine to examine the effect of cysteine bridge on stability of DmxA. The mutant DmxA Q/N shows position of Q40 in the active site and its replacement with asparagine. DmxA possesses unique catalytic pentad among members of subfamily HLD-II, with glutamine interacting with neighboring residues and not serving as a halide-stabilizing residue. N40 in variant DmxA Q/N is in favorable position to bind the halide released during the SN2 reaction. DmxA TUN reveals narrow tunnels of DmxA and main bottlenecks in the main tunnel represented by residues M177 and F246. These residues were replaced by alanine in variant DmxA TUN, resulting in widely opened main tunnel. The blue star represents the position of the active site.

Figure 3. The substrate specificity of DmxA and its variants and principal component analysis. (A) Substrate specificity profile of DmxA and its variants assayed towards thirty halogenated substrates. (B) The score plot t1/t2 is a two-dimensional window into the multidimensional space, where the objects (enzymes) with similar properties (specificity profiles) are co-located. The score plot t1/t2 shows clustering of HLDs into individual substrate specificity groups (SSGs). This score plot describes 44 % of variance in the dataset. (C) The corresponding loading plot p1/p2 quantifies the contributions of the original variables to the created principal components (axes of the score plots). Variables localized further from the origin possessed a stronger effect on the principal component than the variables localized closer to the origin of the plot. Numbering of the substrates is as follows: 1-chlorobutane (4), 1-chlorohexane (6), 1-bromobutane (18), 1-bromohexane (20), 1-iodopropane (28), 1-iodobutane (29), 1-iodohexane (31), 1,2-dichloroethane (37), 1,3-dichloropropane (38), 1,5-dichloropentane (40), 1,2-dibromoethane (47), 1,3-dibromopropane (48), 1-bromo-3-chloropropane (52), 1,3-diiodopropane (54), 2-iodobutane (64), 1,2-dichloropropane (67), 1,2-dibromopropane (72), 2-bromo-1-chloropropane (76), 1,2,3-trichloropropane (80), bis(2-chloroethyl)ether (111), chlorocyclohexane (115), bromocyclohexane (117), (bromomethyl)cyclohexane (119), 1-bromo-2-chloroethane (137), chlorocyclopentane (138), 4-bromobutyronitrile (141), 1,2,3-tribromopropane (154), 1,2-dibromo-3-chloropropane (155), 3-chloro-2-methylpropene (209) and 2,3-dichloropropene (225).

Analysis of tunnel network and Q40 in DmxA wild-type

Four independent molecular dynamics (MD) simulations were run for each monomer. All analyses were performed on the last 100 ns of the MD simulations. Q40 was systematically observed to be rotated to a non-stabilizing position, forming hydrogen bonds with oxygen atoms of Y68 and/or L203. The only exception was run 1 of monomer B, where no such hydrogen bonds were observed (Supplementary Figure 5 and Supplementary Table 3).

The analysis of the tunnel network revealed two predominant pathways leading from the active site to the surface. The topology of these paths corresponds to the p1 (main) tunnel and the p2 (slot) tunnel previously reported by Klvana et al.85 Both the main and the slot tunnel were closed for water molecules for most of the MD simulations, with an average bottleneck radius of 1.1 Å (both tunnels) and an average opening for a 1.4 Å probe of 12% (main tunnel) and 8% (slot tunnel) (Supplementary Table 4). Analysis of the tunnel bottlenecks, i.e., the narrowest parts of the tunnels, revealed four important residues for each tunnel that formed the bottleneck in at least 80% of the simulation time on average (Supplementary Table 5).

Mutagenesis of tunnel bottlenecks and construction of DmxA TUN variant

In order to open the main tunnel, all four most frequently occurring bottleneck residues were mutated in silico to smaller nonpolar amino acids (T145A, I173V, M177A and F246A) in both monomers. The substituting residues were those most frequently occurring at the corresponding positions in the sequences of evolutionarily related HLDs (Supplementary Table 5). The effect of two substitutions (M177A and F246A) was predicted as destabilizing, while the effect of the other two (T145A and I173V) was predicted as neutral. The combination of the destabilizing mutations M177A and F246A was predicted as additive, with a total ΔΔG of 8.8 ± 1.2 kcal/mol. The destabilizing mutations M177A and F246A were introduced in silico into each monomer of DmxA wild-type. Four independent MD simulations were run for each mutated monomer. All analyses were performed on the last 100 ns of each MD simulation. The analysis of the tunnel network revealed two major pathways leading from the active site to the surface, which corresponded to those of DmxA wild-type. The main tunnel was predominantly open for water molecules in the MD simulations (average opening for a 1.4 Å probe of 67%), with an average bottleneck radius of 1.5 Å. Thus, the main tunnel was open approximately 5.5 times more often than in the wild-type. The slot tunnel was not affected by the mutations and remained closed for water molecules for almost the whole MD simulations (average opening for a 1.4 Å probe of 5%), with an average bottleneck radius of 1.1 Å (Supplementary Table 4).
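The tunnel statistics quoted here (average bottleneck radius and percentage of snapshots open for a 1.4 Å probe) can be illustrated with a minimal sketch. It assumes that the tunnel analysis yields one bottleneck radius per MD snapshot; the function names and the example radii are hypothetical and are not taken from the original analysis.

```python
from statistics import mean


def open_fraction(bottleneck_radii, probe_radius=1.4):
    """Fraction of snapshots whose bottleneck radius admits the probe."""
    if not bottleneck_radii:
        raise ValueError("need at least one snapshot")
    n_open = sum(1 for radius in bottleneck_radii if radius >= probe_radius)
    return n_open / len(bottleneck_radii)


def fold_change(variant_open, wild_type_open):
    """Ratio of tunnel opening in a variant relative to the wild-type."""
    return variant_open / wild_type_open


# Hypothetical per-snapshot bottleneck radii in Å (illustration only).
example_radii = [1.0, 1.2, 1.5, 1.4, 0.9, 1.6, 1.1, 1.3]
example_fraction = open_fraction(example_radii)   # 3 of 8 snapshots are open
example_mean_radius = mean(example_radii)

# Average main-tunnel openings reported in the text: 67% (TUN) vs 12% (wild-type).
reported_fold = fold_change(0.67, 0.12)
```

With the reported averages, the ratio comes out at roughly 5.6, in line with the approximately 5.5-fold improvement stated in the text.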

The double-point mutant M177A/F246A (referred to here as the DmxA TUN variant, Figure 2) was constructed by site-directed mutagenesis and exhibited a 9 °C decrease in thermostability (Tm = 56.9 ± 0.3 °C; Figure 1), which confirmed our hypothesis that the narrow tunnels are responsible for stabilization of the enzyme core by hydrophobic interactions. Although the changes in the substrate specificity profile were not as prominent as in the case of DmxA Q/N, the variant DmxA TUN was clustered within SSG-IV (Figure 3). Enantioselectivity was not changed with ethyl 2-bromopropionate (E >200), but dropped significantly with 2-bromopentane (E = 14), suggesting that either M177 or F246 is important for enantiodiscrimination of the enzyme towards brominated alkanes (Supplementary Table 2).

Discussion

Biochemical and structural analysis of the novel haloalkane dehalogenase DmxA from the Antarctic bacterium Marinobacter sp. ELB17 revealed a paradoxical stability of the enzyme. Despite its psychrophilic origin, DmxA exhibited the highest thermal stability of all biochemically characterized members of the HLD enzyme family. As mentioned earlier and reviewed by Oikawa et al.,331 several thermostable enzymes from psychrophiles and mesophiles have been isolated and characterized, although the evolutionary basis of their presence in those organisms remains unclear. Only two existing studies on paradoxically thermostable enzymes attempted to explain their stability,326,327 but both provided rather general conclusions that would not explain the properties of DmxA.

The melting temperature of DmxA was found to be 65.9 ± 0.1 °C, which is 7 °C higher than that of DbeA (Figure 1).82 Several different features of the enzyme were studied in detail to explain the improved stability of the protein structure. DmxA was found to form a dimer via a cysteine bridge. This is the first observation of dehalogenase units linked by a covalent bond. Previous studies of DbjA and DbeA suggested that dimerization of these enzymes is mediated by surface interactions.82,329 Although cysteines in protein structures often serve as stabilizing elements, the S-S bridge between DmxA units did not have any substantial effect on the stability or activity of the enzyme.

The crystal structure of DmxA revealed a unique composition of the catalytic pentad, containing glutamine instead of the asparagine that is typical for members of subfamily HLD-II.68 One dehalogenase with an unusual halide-stabilizing residue was described earlier.43 The position of Q40 in the DmxA active site is slightly disordered, and the residue seems to interact with neighboring residues in the active site. At the same time, the position of its side chain prevents it from functioning as a halide-stabilizing residue. When the glutamine was replaced with asparagine, the melting temperature decreased by 2 °C (Tm = 63.9 ± 0.2 °C), confirming a small stabilizing effect of the glutamine interactions in the protein core. With the variant DmxA Q/N, we observed an important shift in substrate specificity from SSG-I to SSG-IV. Such changes in the substrate specificity of haloalkane dehalogenases have otherwise been achieved only by sophisticated protein engineering.189,332 Mutagenesis of HLD DatA led to a change of the activity profile, but its assignment to the substrate specificity group was not affected.237 In our case, a single mutation was sufficient to change the substrate specificity group of the enzyme.

Deeper analysis of the structure revealed a small hydrophobic pocket containing the active site and narrow tunnels with bulky bottleneck residues. The presence of phenylalanine and methionine as bottleneck residues was not overly surprising, since they often provide stabilization of protein structures via hydrophobic interactions.317 Replacement of the bulky residues with alanine opened the main tunnel and removed the stabilizing interactions provided by the bottleneck residues. This led to destabilization of the protein core, manifested as a 9 °C decrease in the melting temperature, most probably related to the decreased hydrophobicity of the protein. In this study, we showed that the narrow tunnels play a crucial role in the paradoxical stability of DmxA.

In order to explain the high stability of the novel haloalkane dehalogenase DmxA, we systematically investigated three possible sources of stabilization. The system we described is well understood, and many effects of the introduced mutations in DmxA are predictable. Both the replacement of glutamine with asparagine and the opening of the main tunnel brought the expected outcomes. More importantly, with only two mutations we managed to turn the thermostable enzyme into a mesophilic one, with a melting temperature comparable to those of other related HLD enzymes. If the mutations of the DmxA Q/N and DmxA TUN variants were combined, the resulting enzyme would be expected to have the properties of a highly active mesophilic dehalogenase.

Experimental procedures

Gene synthesis, expression, and purification

The original sequence was taken from the GenBank database (accession number AAXY00000000). The gene re_dmxA was synthesized commercially (Mr.Gene, Germany). Before synthesis, the codon usage was adapted to the codon bias of E. coli and the restriction sites for EcoRI, HindIII and NdeI were added. The dmxA gene containing the restriction sites and His-tag was subsequently subcloned into the pET21b expression vector. Successful subcloning was confirmed by agarose gel electrophoresis and sequencing on both strands.

Heterologous expression of DmxA was performed in LB medium (Sigma-Aldrich, USA). The medium was composed of 10 g/L tryptone, 5 g/L yeast extract, and 5 g/L sodium chloride.333 Precultures were prepared by transferring one colony of transformed E. coli carrying the gene encoding the target enzyme into 10 mL of LB medium with ampicillin and incubating overnight at 37 °C and 200 rpm. One liter of LB medium supplemented with ampicillin was inoculated with the overnight culture and incubated at 37 °C and 120 rpm. When the culture reached an OD600 of 0.5, expression was initiated by the addition of IPTG to a final concentration of 0.5 mM and the cells were further incubated overnight at 25 °C. Cells were harvested by centrifugation at 6,000 g and 4 °C, stored at -70 °C and defrosted before purification.

Harvested cells were disrupted by sonication using a Soniprep 150 (Sanyo Gallenkamp PLC, UK). The supernatants were collected after centrifugation at 21,000 g for 1 h. The crude extracts were further purified on Ni-NTA Superflow Cartridge (Qiagen, Germany). The purified proteins were dialyzed against 50 mM phosphate buffer (pH 7.5) overnight. All enzymes were stored at 4 °C in a 50 mM potassium phosphate buffer prior to analysis. Expression profiles, solubility, and purity of the enzymes were checked by SDS-PAGE; the amount of target enzyme in the fractions on SDS gel was determined by a GS-800 Calibrated Densitometer (Bio-Rad, USA). The concentration of purified enzyme was determined by the Bradford method,334 using bovine serum albumin as a standard.

Analysis of secondary structure and thermostability

Circular dichroism (CD) spectra were recorded at room temperature using a Chirascan spectrometer (Applied Photophysics, UK) equipped with a Peltier thermostat. Data were collected from 185 to 260 nm, at a scan rate of 100 nm/min, 1 s response and 1 nm bandwidth using a 0.1 cm quartz cuvette containing 0.2 mg/mL enzyme in a 50 mM potassium phosphate buffer (pH 7.5). Each collected spectrum represents an average of five individual scans and has been corrected for absorbance caused by the buffer. CD data were expressed in terms of the mean residue ellipticity (ΘMRE) using the following equation:

ΘMRE = (Θobs · Mw · 100) / (n · c · l)

where Θobs is the observed ellipticity in degrees, Mw is the protein molecular weight, n is the number of residues, l is the cell path length, c is the protein concentration, and the factor of 100 originates from the conversion of the molecular weight to mg/mmol.

Thermal unfolding of DmxA was followed by monitoring the ellipticity at 222 nm over a temperature range of 20 to 80 °C, using a resolution of 0.1 °C and a heating rate of 1 °C/min. The resulting thermal denaturation curve was roughly normalized to represent a signal change between approximately 1 and 0 and fitted to sigmoidal curves using Origin 6.0 software (OriginLab, USA). The melting temperature (Tm) was evaluated as the midpoint of the normalized thermal transition.

Native polyacrylamide gel electrophoresis

Disulfide bond formation in DmxA was studied by comparing the size of reduced and non-reduced enzyme samples on the native polyacrylamide gel. Two well-known HLDs, LinB (34 kDa, monomer) and DbjA (66 kDa, dimer) were used as molecular weight standards. Prior to electrophoresis, the solution of DmxA was diluted to 4 mg/mL and dithiothreitol (DTT) was added to final concentration 10 mM. The sample was then degassed at room temperature for 30 min and saturated with the 20-fold volume of pure nitrogen. Protein samples were mixed with loading dye and applied to the gel wells. The electrophoresis experiments were performed at 4 °C and 115 V. The gel was stained with Coomassie Brilliant blue R250 (Sigma-Aldrich, USA) and analyzed by GS-800 Calibrated Densitometer (Bio-Rad, USA).

Specific activity assay

Enzymatic activity was assayed using the colorimetric method developed by Iwasaki et al.244 The release of halide ions was measured spectrophotometrically at 460 nm using a SUNRISE microplate reader (Tecan, Switzerland) after reaction with mercuric thiocyanate and ferric ammonium sulfate. Dehalogenation reactions were performed at 37 °C in 25-mL Reacti-flasks closed by Mininert valves. The reaction mixture contained 10 mL of glycine buffer (100 mM, pH 8.6) and 10 μL of the halogenated substrate. The reactions were initiated by the addition of an appropriate amount of the enzyme, depending on its activity. The reactions were monitored by withdrawing 1 mL samples from the reaction mixture at regular intervals and immediately mixing them with 0.1 mL of 35% nitric acid to terminate the reaction. Dehalogenation activities were quantified from the slope of product formation over time.
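Quantifying activity from the slope of product formation over time can be sketched as an ordinary least-squares fit. The sketch below is an illustration only: the function names, the units (µmol of halide, minutes, mg of enzyme) and the example data are assumptions and are not taken from the original protocol.

```python
def least_squares_slope(times, values):
    """Slope of the ordinary least-squares line through (time, value) points."""
    n = len(times)
    if n != len(values) or n < 2:
        raise ValueError("need at least two paired observations")
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    return sxy / sxx


def specific_activity(times_min, halide_umol, enzyme_mg):
    """Product formation rate per mg of enzyme (assumed units: µmol/min/mg)."""
    return least_squares_slope(times_min, halide_umol) / enzyme_mg


# Hypothetical time course: halide released (µmol) at each sampling time (min).
example_activity = specific_activity([0, 10, 20, 30], [0.0, 0.5, 1.0, 1.5], 0.1)
```

In practice only the initial, linear part of the time course would be passed to such a fit, and the abiotic background rate would be subtracted where relevant.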

Principal component analysis (PCA)

A matrix containing the activity data for ten wild-type haloalkane dehalogenases (HLDs) and three DmxA variants with 30 substrates was analyzed by PCA to uncover relationships between individual HLDs (cases) and their substrates (variables).49 In brief, two PCA were performed using Statistica 12.0 (StatSoft, USA). The raw data were log-transformed and weighted relative to the individual enzyme's activity towards other substrates prior to analysis, in order to better discern individual enzymes' specificity profiles. Thus: (i) each specific activity value was incremented by 1 unit; (ii) the log of this new value was taken; and (iii) this log value was then divided by the sum of all the log values for that particular enzyme. These transformed data were used to identify substrate-specificity groups of enzymes that exhibited similar specificity profiles regardless of their overall specific activities.
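The three-step transformation described above can be sketched for a single enzyme as follows. This is a minimal illustration, not the Statistica workflow itself; the function name and the example activities are hypothetical, and base-10 logarithms are an assumption, since the text does not state the base.

```python
import math


def transform_profile(specific_activities):
    """Log-transform and weight one enzyme's activities as in steps (i)-(iii)."""
    # (i) increment each value by 1 and (ii) take the logarithm
    logs = [math.log10(activity + 1.0) for activity in specific_activities]
    # (iii) divide by the sum of all log values for this enzyme
    total = sum(logs)
    if total == 0:
        # an enzyme with no measurable activity has no profile to scale
        return [0.0 for _ in logs]
    return [value / total for value in logs]


# Hypothetical specific activities of one enzyme towards three substrates.
example_profile = transform_profile([9.0, 99.0, 0.0])
```

Because each log value is divided by the sum of the log values, the choice of logarithm base cancels out, and every transformed profile sums to 1. This is what makes profiles comparable between enzymes with very different overall activities.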

Temperature and pH profile

The effect of temperature and pH on DmxA activity was determined by performing the specific activity measurement at different temperatures and pH values with 1,3-diiodopropane as substrate. The activity assay was carried out as described before at temperature ranging from 20 °C to 65 °C in 100 mM glycine buffer (pH 6.8). Effect of pH was assayed at 37 °C in Britton-Robinson buffer covering the pH range 5.03-10.25.

Steady-state kinetics

The steady-state kinetics of DmxA variants was determined with three halogenated substrates by using a VP-ITC isothermal titration calorimeter (MicroCal, USA) at 37 °C. The reaction mixture vessel of the microcalorimeter was filled with 1.4 ml of enzyme solution at a concentration of 0.005-0.06 mg/ml (the enzyme was dialyzed against 100 mM glycine buffer, pH 8.6). The substrate solution was prepared in the same buffer by the addition of 1,2-dibromoethane, 1,3-dibromopropane or 4-bromobutyronitrile to a final concentration of 22-28, 20-23, and 6-8 mM, respectively. The substrate concentration was verified by gas chromatography (Agilent, USA) prior to the experiment. In the kinetic experiment, the enzyme in the reaction mixture vessel was titrated in 150 s intervals with increasing amounts of the substrate, while pseudo-first-order conditions were maintained. Every 10 µL injection increased the substrate concentration, leading to a further increase in the enzyme reaction rate (an increase in the heat generated) until the enzymatic reaction was saturated. A total of 28 injections were performed during the titration. The reaction rates reached after every injection (in units of thermal power) were recalculated to enzyme turnover. The calculated enzyme turnover plotted against the actual concentration of the substrate after every injection was then fitted by nonlinear regression to different kinetic models using Origin 6.0 (OriginLab, USA).
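The substrate concentration reached after each injection can be approximated with a short sketch. It assumes ideal mixing in a fixed-volume cell in which every injection displaces an equal volume of the cell content, and negligible substrate consumption between injections; the function name is hypothetical and this is not the instrument software's own correction.

```python
def concentrations_after_injections(syringe_mM, cell_mL=1.4,
                                    injection_uL=10.0, n_injections=28):
    """Substrate concentration in the cell (mM) after each injection."""
    # fraction of the cell content replaced by one injection
    fraction = (injection_uL / 1000.0) / cell_mL
    concentration = 0.0
    series = []
    for _ in range(n_injections):
        # displaced volume leaves, the same volume of substrate solution enters
        concentration = concentration * (1.0 - fraction) + syringe_mM * fraction
        series.append(concentration)
    return series


# Example with a 20 mM substrate solution, the lower bound quoted
# for 1,3-dibromopropane in the text.
example_series = concentrations_after_injections(20.0)
```

The defaults mirror the cell volume, injection volume and number of injections given in the text; each value in the returned list would serve as the x-coordinate for the turnover measured after the corresponding injection.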

Measurement of enantioselectivity

Enantioselectivity measurement was performed at 20 °C in 25-ml Reacti-flasks closed by Mininert valves and containing 25 ml of Tris sulfate buffer (50 mM, pH 8.2). The racemic substrates (2-bromopentane, ethyl 2-bromopropionate) were added to the reaction mixture to a final concentration of 0.5-3 mM. The reaction was initiated by the addition of 1 ml of enzyme (5-7 mg/mL) to the reaction mixture. The reaction progress was monitored by periodically withdrawing samples from the reaction mixture. The samples were mixed with methanol containing 1,2-dichloroethane as an internal standard and analyzed using an Agilent Technologies 7890A gas chromatograph (Agilent, USA) equipped with a flame ionization detector and an Astec Chiraldex B-DM chiral capillary column (50 m x 0.25 mm x 0.12 μm film thickness) (Sigma-Aldrich, USA). The enantioselectivity was expressed as the E-value, defined as the ratio between the specificity constants (kcat/Km) for the two enantiomers. To estimate the kinetic parameters, the equation describing competitive Michaelis-Menten kinetics was fitted by numerical integration to the progress curves obtained from the kinetic resolution experiments using the software Scientist (MicroMath Research, USA).
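The definition of the E-value and the competitive Michaelis-Menten rate law can be written out in a brief sketch. The function names and the parameter values are hypothetical and serve only as illustration; the sketch evaluates the rate equations at one time point and does not reproduce the numerical fitting performed in Scientist.

```python
def e_value(kcat_fast, km_fast, kcat_slow, km_slow):
    """E-value: ratio of the specificity constants (kcat/Km) of two enantiomers."""
    return (kcat_fast / km_fast) / (kcat_slow / km_slow)


def competitive_rates(conc_a, conc_b, kcat_a, km_a, kcat_b, km_b, enzyme):
    """Conversion rates of two enantiomers competing for one active site."""
    denominator = 1.0 + conc_a / km_a + conc_b / km_b
    rate_a = kcat_a * enzyme * (conc_a / km_a) / denominator
    rate_b = kcat_b * enzyme * (conc_b / km_b) / denominator
    return rate_a, rate_b


# Hypothetical kinetic parameters (arbitrary but consistent units).
example_e = e_value(kcat_fast=2.0, km_fast=1.0, kcat_slow=1.0, km_slow=1.0)
example_rates = competitive_rates(1.0, 1.0, 2.0, 1.0, 1.0, 1.0, enzyme=1.0)
```

Under this rate law, the ratio of the two rates at equal enantiomer concentrations equals the E-value, which is why E can be estimated from progress curves of a racemic mixture.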

Enzyme crystallization

The purified DmxA enzyme in 50 mM Tris HCl buffer (pH 7.5) at a concentration of 10 mg/ml was used for the crystallization experiments. Crystals were grown as described previously,330 with one modification: the crystallization drop placed on the microbridge (Triana Science & Technology, Granada, Spain) consisted of 12 µl of the protein solution, 6 µl of precipitant solution (pH 5.9) and 3.6 µl of 0.1 M sarcosine (Hampton Research, Aliso Viejo, California, USA), and was equilibrated against 20 ml of reservoir solution in the crystallization mushroom (Triana Science & Technology, Granada, Spain). Rhombohedron-shaped crystals with dimensions of approximately 0.156 x 0.091 x 0.0156 mm appeared after 9 days of incubation at 293 K.

Data collection and processing

Single crystals of DmxA were mounted in nylon loops (Hampton Research, Aliso Viejo, California, USA) and MicroLoops (MiTeGen; Jena Bioscience GmbH, Jena, Germany) and directly flash-cooled in a liquid-nitrogen stream without additional cryoprotection. The diffraction data were collected at beamline ID29 at the ESRF (European Synchrotron Radiation Facility, Grenoble, France), equipped with a Pilatus 6M-F detector, at a wavelength of 0.972 Å and a temperature of 100 K. A complete diffraction data set of 3000 images with 0.05° oscillation and a crystal-to-detector distance of 265 mm was collected up to 1.45 Å resolution. The diffraction data were automatically processed by the XDS program package.335,336 Data-collection and refinement statistics are summarized in Table 1.

Structure determination and refinement

The phase problem was solved by molecular replacement with MOLREP270 using the DhaA structure from Rhodococcus sp. (PDB ID 4E46) as a template model. The structure was refined by restrained isotropic and TLS refinement274 with three groups for each chain in REFMAC5271 from the CCP4 software suite (Collaborative Computational Project, Number 4, 1994), alternating with manual building steps in COOT273. The last refinement step was carried out by PDB_REDO337 to optimize the structure model.

Structure validation and deposition

The R and Rfree values of the structure were reduced from 0.38 (38%) to 0.197 (19.70%) and from 0.41 (41%) to 0.223 (22.30%), respectively. The final model was validated using the internal tools of COOT, MOLPROBITY,338 SFCHECK339 and the wwPDB Validation Server209. A Ramachandran plot275 showed 96.7% of residues in favored regions (591/610 residues of the modeled structure) and five outliers, which are correctly fitted in the electron density map and whose geometry may be influenced by their proximity to the active site. The structure refinement and validation statistics are presented in Table 1. The average B-factor of the PDB file output by REFMAC was obtained with the program TLSANL from the CCP4 software suite; the resulting PDB file contains the total B-factor, including the equivalent isotropic contribution from TLS.

Molecular dynamics

DmxA occurs as a dimer in the crystal structure. Both monomeric units were selected as starting points for the molecular dynamics (MD) simulations. Hydrogen atoms were added to both structures separately with the H++ server at pH 7.5.210 All water molecules from the crystal structures were added to the systems. Cl⁻ and Na⁺ ions were added to a final concentration of 0.1 M using the Tleap module of AMBER 14126. Using the same module, an octahedron of TIP3P water molecules279 was added to a distance of 10 Å from any atom in the system. The systems were minimized in five rounds, each consisting of 5,000 steepest descent steps followed by 5,000 conjugate gradient steps, with a decreasing restraint on the protein backbone (500, 125, 50, 25 and 0 kcal·mol⁻¹·Å⁻²). The following MD simulations employed periodic boundary conditions, the particle mesh Ewald method for treatment of the electrostatic interactions,280 a 10 Å cutoff for nonbonded interactions, and a 2 fs time step with the SHAKE algorithm281 to fix all bonds containing hydrogens. The equilibration simulation consisted of two steps: (i) 20 ps of gradual heating from 0 to 310 K under constant volume, using a Langevin thermostat with a collision frequency of 1.0 ps⁻¹ and harmonic restraints of 5.0 kcal·mol⁻¹·Å⁻² on the positions of all protein atoms, and (ii) 2,000 ps of unrestrained MD at 310 K using the Langevin thermostat and a constant pressure of 1.0 bar with a pressure coupling constant of 1.0 ps⁻¹. Finally, production MD simulations were run for 150 ns with the same settings as the second step of the equilibration MD. Coordinates were saved at 2 ps intervals, and the trajectories were analyzed using the cpptraj module340 of AMBER 14 and visualized in PyMOL229 and VMD 1.9.1282. All calculations were carried out in the GPU (CUDA) PMEMD module341,342 of AMBER 14 using the ff14SB force field283,343,344.
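The step and frame counts implied by this protocol follow from simple arithmetic. The sketch below is a bookkeeping aid under the assumption that one frame is written at the end of every saving interval and that the starting structure is not counted. The helper name is hypothetical and integer femtoseconds are used to avoid rounding errors.

```python
FS_PER_PS = 1_000
FS_PER_NS = 1_000_000


def md_bookkeeping(length_ns, timestep_fs, save_interval_ps):
    """Return (total integration steps, steps between saved frames,
    number of saved frames) for one simulation."""
    total_steps = length_ns * FS_PER_NS // timestep_fs
    steps_per_frame = save_interval_ps * FS_PER_PS // timestep_fs
    return total_steps, steps_per_frame, total_steps // steps_per_frame


# Minimization: five restraint rounds of 5,000 + 5,000 steps each.
restraints = [500, 125, 50, 25, 0]          # kcal/(mol*A^2), from the text
minimization_steps = len(restraints) * (5_000 + 5_000)

# Equilibration: 20 ps heating and 2,000 ps unrestrained MD at 2 fs.
heating_steps = 20 * FS_PER_PS // 2
equilibration_steps = 2_000 * FS_PER_PS // 2

# Production: 150 ns at 2 fs, coordinates saved every 2 ps.
production = md_bookkeeping(150, 2, 2)

print(minimization_steps, heating_steps, equilibration_steps, production)
```

Under these assumptions each 150 ns production run corresponds to 75,000,000 integration steps and 75,000 saved frames.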

Tunnel analysis

Tunnel networks were analyzed by the stand-alone version of CAVER 3.0287. Each atom of the protein structure was approximated by 12 spheres. The starting point was specified by residues Asp 105, Trp 106, Asn 40 and Phe 169 followed by automatic optimization to prevent its collision with other protein atoms. The tunnel search was performed on every second frame from the MD by using probe radius 0.8 Å and the default settings. The redundant tunnels were automatically removed from each snapshot. The clustering of tunnels was performed by the average-link hierarchical Murtagh algorithm based on the calculated matrix of pairwise tunnel distances. The clustering threshold was set to 4.
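The idea behind average-link clustering with a distance threshold can be illustrated on a toy distance matrix. The sketch below is a naive agglomerative implementation written for clarity. It is not the Murtagh algorithm as implemented in CAVER, and the matrix values and function name are hypothetical.

```python
def average_link_clusters(dist, threshold):
    """Naive average-link agglomerative clustering.

    dist is a symmetric matrix (list of lists) of pairwise distances.
    The two clusters with the smallest average pairwise distance are
    merged repeatedly until that distance exceeds the threshold.
    Returns a sorted list of clusters, each a sorted list of indices.
    """
    clusters = [[i] for i in range(len(dist))]

    def average_distance(a, b):
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = average_distance(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        if best[0] > threshold:
            break
        _, x, y = best
        merged = sorted(clusters[x] + clusters[y])
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return sorted(clusters)


# Toy matrix for four tunnels: 0 and 1 are similar, 2 and 3 are similar.
toy = [
    [0.0, 1.0, 10.0, 10.0],
    [1.0, 0.0, 10.0, 10.0],
    [10.0, 10.0, 0.0, 2.0],
    [10.0, 10.0, 2.0, 0.0],
]
print(average_link_clusters(toy, threshold=4.0))
```

A lower threshold splits the tunnels into more, tighter clusters, while a higher threshold merges pathways that diverge near the protein surface into one cluster.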

Construction of mutants

All mutants were constructed by Rosetta 3.3. The structures of both DmxA monomers were first minimized by the Rosetta routine minimize_with_cst with the following parameters: optimization of both backbone and side chains was enabled (sc_min_only false), the distance for the full-atom pair potential was set to 9 Å (fa_max_dis 9.0), and standard weights were used for the individual terms of the energy function, with a constraint weight of 1 (constraint_weight 1.0). The output of the minimization was used by the script convert_to_cst_file.sh to create the constraint file.345 The mutants were constructed by the ddg_monomer module of Rosetta using the parameters employed in protocol 16 by Kellogg et al.345 The soft-repulsive design energy function (soft_rep_design weights) was used for side-chain repacking and backbone minimization (sc_min_only false). Optimization was performed on the whole protein without distance restriction (local_opt_only false). The previously created cst file was used as a constraint during backbone minimization (min_cst true). The optimization was performed in three rounds with increasing weight on the repulsive term (ramp_repulsive true). The minimum energy (mean false, min true) from 50 iterations (iterations 50) was employed as the final parameter describing the stability effect.
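The effect of the "mean false, min true" setting can be sketched as follows. This is a simplified illustration of the aggregation step only, assuming that the stability change is the difference between the lowest-scoring mutant and wild-type models. It does not reproduce the ddg_monomer code, and the scores below are invented.

```python
def stability_change(wt_scores, mut_scores, use_min=True):
    """Aggregate the scores of repeated optimizations into one value.

    With use_min=True the lowest-energy model of each state is compared
    (analogous to 'mean false, min true'); otherwise the means are used.
    A positive result indicates a destabilizing mutation.
    """
    if use_min:
        return min(mut_scores) - min(wt_scores)
    mean_mut = sum(mut_scores) / len(mut_scores)
    mean_wt = sum(wt_scores) / len(wt_scores)
    return mean_mut - mean_wt


# Invented scores from three iterations (the protocol above uses 50).
wild_type = [-10.0, -12.0, -11.0]
mutant = [-9.0, -11.5, -10.0]
print(stability_change(wild_type, mutant))
print(stability_change(wild_type, mutant, use_min=False))
```

Taking the minimum makes the estimate less sensitive to occasional poorly converged models than taking the mean over all iterations.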

Site directed mutagenesis

Primers for the preparation of genes encoding variants of DmxA were designed using CloneManager software (Sci-ED, USA). Sequences of the mutagenic and non-mutagenic primers are available in Supplementary Table 6. The site-directed mutagenesis was performed according to a modified protocol described by Sanchis et al.346. PCR mixtures were prepared by mixing 100 ng of the template plasmid pET21b::re_dmxA, 1 µL of each 10 mM primer, 1 µL of a 10 mM mixture of dNTPs, 1.25 µL of Phusion High fidelity polymerase (2.5 U), 10 µL of 5x Phusion High fidelity polymerase buffer, and sterile water to a final volume of 50 µL. In the first part of the PCR, the megaprimers were formed between mutagenic primers or between mutagenic and non-mutagenic primers. In the second part of the PCR, the megaprimers served as primers for replication of the whole plasmid. Template DNA was digested by DpnI endonuclease during a 1 h incubation at 37 °C. The endonuclease in the mixture was deactivated by incubation at 80 °C for 10 min. Then, 5 µL of the PCR mixture was digested by NdeI and HindIII restriction endonucleases and analyzed by agarose gel electrophoresis; 5-7 µL of the mixture was transformed into E. coli DH5α cells. After overnight cultivation at 37 °C, 4-6 colonies were picked from the agar plates and used for the preparation of 10-mL overnight cultures. Plasmids were isolated from the cultured cells using the GeneJET Plasmid Miniprep Kit (Fermentas, USA) and sequenced on both strands. Simultaneously, glycerol stocks were prepared from the overnight cultures by mixing with 30% glycerol (1:1).


CHAPTER 5

Structural basis of paradoxically thermostable dehalogenase from psychrophilic bacterium

Supplementary information

under review


Supplementary Figure 1. Circular dichroism spectra of selected haloalkane dehalogenases.


Supplementary Figure 2. Native PAGE of selected haloalkane dehalogenases and DmxA variants: A – LinB (monomer), B – DbjA (dimer), C – DmxA (monomer + dimer), D – DmxA with 10 mM DTT (monomer), E – DmxA C/S (monomer).


Supplementary Figure 3. Temperature profile of DmxA measured with 1,3-diiodopropane. The values represent the mean from three independent experiments.

Supplementary Figure 4. The pH profile of DmxA measured with 1,3-diiodopropane. The values represent the mean from three independent experiments.


Supplementary Figure 5. The orientation of the halide-stabilizing Q40 systematically observed in MD simulations. The hydrogen bonds formed between the polar hydrogens of Q40 and the oxygens of Y68 and L203 are represented as dashed lines.

Supplementary Table 1. Steady-state kinetic parameters of DmxA and its variants.

| substrate | variant | K0.5 | kcat | n¹ | KSI | b² | kcat/K0.5 |
|---|---|---|---|---|---|---|---|
| 1,2-dibromoethane | DmxA wt | 0.8 | 1.89 | -- | 1.66 | -- | 2.36 |
| | DmxA Q/N | 1.19 | 0.56 | 1.25 | -- | -- | 0.47 |
| | DmxA TUN | 1.34 | 0.43 | 1.52 | 18.61 | -- | 0.32 |
| 1,3-dibromopropane | DmxA wt | 0.03 | 2.61 | 1.77 | 3.24 | 0.29 | 88.09 |
| | DmxA Q/N | 0.13 | 30.91 | -- | 6.68 | -- | 244.12 |
| | DmxA TUN | 0.13 | 1.91 | 2.29 | -- | -- | 14.76 |
| 4-bromobutyrate | DmxA wt | 0.67 | 1.89 | -- | 5.72 | -- | 2.83 |
| | DmxA Q/N | 1.25 | 0.93 | 1.35 | -- | -- | 0.75 |
| | DmxA TUN | 0.93 | 1.25 | 1.53 | -- | -- | 1.35 |

¹ Hill index of cooperativity; ² factor of hyperbolic inhibition


Supplementary Table 2. Enantioselectivity (E-value) of DmxA and its variants and other haloalkane dehalogenases.

| substrate | DmxA wt | DmxA C/S | DmxA Q/N | DmxA TUN | DatA¹ | DhaA² | LinB² | DbjA²,³ |
|---|---|---|---|---|---|---|---|---|
| 2-bromopentane | 100 | 106 | 104 | 14 | > 200 | 7 | 16 | 132 |
| ethyl 2-bromopropionate | > 200 | > 200 | > 200 | > 200 | n.a. | n.a. | n.a. | n.a. |

n.a. – not analysed; ¹ Hasan et al.43; ² Prokop et al.56; ³ Chaloupkova et al.329

Supplementary Table 3. The distance between the polar hydrogens of Q40 and oxygen of Y68 and L203 in MD simulations.

| | monomer A run1 | monomer A run2 | monomer A run3 | monomer A run4 | monomer B run1 | monomer B run2 | monomer B run3 | monomer B run4 |
|---|---|---|---|---|---|---|---|---|
| Average distance to OH atom of Y68 [Å] | 2.2 | 2.2 | 2.3 | 3.5 | 6.6 | 3.7 | 3.9 | 4.3 |
| Average distance to O atom of L203 [Å] | 3.6 | 3.6 | 3.6 | 4.2 | 6.8 | 3.4 | 3.5 | 5.0 |

Supplementary Table 4. Characteristics of the top-ranked tunnel clusters in DmxA TUN variant using the probe radius 0.8 Å.

| | DmxA TUN main tunnel | DmxA TUN slot tunnel | DmxA wild-type main tunnel | DmxA wild-type slot tunnel |
|---|---|---|---|---|
| Average bottleneck radius [Å] | 1.5 ± 0.1 | 1.1 ± 0.0 | 1.1 ± 0.1 | 1.1 ± 0.1 |
| Average opening for 1.4 Å probe [%] | 67 ± 23 | 5 ± 2 | 12 ± 11 | 8 ± 5 |

Supplementary Table 5. List of residues forming the main-tunnel bottleneck in at least 80% on average. The mutability score and multiple sequence alignment were provided by HotSpot Wizard.150

| Residue | Average percentage of formed bottleneck during MD [%] | Mutability score | Multiple sequence alignment |
|---|---|---|---|
| T145 | 86 ± 13 | 8 | 12×A, 5×V, 3×F, 3×G, 3×S, 3×T, 2×K, 2×L, 2×M, 1×D, 1×E, 1×I, 1×P |
| I173 | 81 ± 20 | 5 | 12×I, 12×L, 12×V, 7×A, 2×G, 2×M, 2×T, 1×F |
| M177 | 93 ± 7 | 6 | 14×S, 12×A, 9×G, 4×T, 3×M, 2×F, 2×I, 1×D, 1×E, 1×K, 1×L |
| F246 | 80 ± 2 | 7 | 13×A, 12×F, 7×V, 6×P, 4×L, 2×S, 2×W, 1×C, 1×G, 1×R, 1×T |


Supplementary Table 6. Primers used for mutagenesis of DmxA.

Mutagenic primers

DmxA_C/S_Fw 5'-gtggatggatcgtctggcctgcatcatc-3'

DmxA_C/S_Rv 5'-gatgatgcaggccagaacgatccatccac-3'

DmxA_Q/N_Fw 5'-acagataagaccaagttggattgccatgcagaaacagcaga-3'

DmxA_Q/N_Rv 5'-tctgctgtttctgcatggcaatccaacttggtcttatctgt-3'

DmxA_TUN_Fw 5'-ttgtggaaaacatcctgccggcagcgatttgtcgtccgctggaac-3'

DmxA_TUN_Rv 5'-aacagccggtgccggaatcagagcaccaggttcggcatgaatc-3'

Non-mutagenic primers

pET_Fw 5'-taatacgactcactataggg-3'

pET_Rv 5'-gctagttattgctcagcgg-3'


CHAPTER 6

Enzyme tunnels and gates as relevant targets in drug design

under review


Abstract

Many enzymes contain tunnels and molecular gates that are important to their function. These structural features are widespread in biology, and many biochemical systems rely on them for proper functioning. However, numerous enzyme gates and transient tunnels have not yet been identified, and thus their functional roles and potential utility as drug targets have been overlooked. Herein we describe a set of general concepts and classification systems to address those structural elements and discuss their key features and components from the structural and functional points of view. We highlight the potential of enzyme tunnels and gates as targets for the binding of small molecules. The different types of binding and the possible pharmacological benefits of such targeting are presented and discussed. Several examples of ligands bound to the tunnels or gates of clinically relevant enzymes illustrate those binding modes and benefits, and inspire some of the strategies outlined here for the design of new drugs that target enzyme tunnels or gates. Overall, with this review we draw attention to the structural and functional importance of molecular tunnels and gates in biology. We also demonstrate, with examples, their relevance as targets for drug discovery, which may sometimes be overlooked. Some of the strategies presented here might help to overcome some of the problems currently faced by medicinal chemists and lead to the discovery of more effective drugs.

Introduction

The complexity of Nature and its operating modes is surprisingly high and varied. And so are the biomolecules that rule every single physiological process of life at the finest detail, from the simple hydration of a carbon dioxide molecule to the complex DNA replication by the synchronization and stepwise action of several different biomolecules. Likewise, the approaches studying and regulating those biochemical systems have to be inspired by that same diversity. Modern medicinal chemistry cannot remain wedded to the old paradigms and traditional approaches, but has to keep searching for new ones.

It is well known that most biomolecular systems contain voids, cavities, channels, tunnels, or grooves of some kind. In many cases, the tunnels and channels have functional roles, which typically consist of securing the transport of substances between different regions. They connect inner cavities with the surface, different inner cavities, distinct parts of the protein surface, or even different cellular environments (such as in membrane proteins). Many enzymes have buried active sites, which need to be connected to the bulk solvent through exchange pathways to carry out their reactions. The lock-keyhole-key model167 (Figure 1) has recently been proposed for enzymes containing buried active sites as a more realistic approach than the traditional Fischer’s lock-and-key model2 or even Koshland’s induced-fit model169. According to the lock-keyhole-key model, the substrate (key) needs to pass through a tunnel (keyhole) in order to reach the active site (lock). Hence, it becomes quite intuitive that the access tunnels and channels may represent important structural features that can account for regulating enzymatic functions and other biological processes.167,168 Therefore, those pathways can be regarded as potential hotspots for modulating the functioning of biomolecules inside living cells, and hence they deserve careful analysis.

Figure 1. Lock-keyhole-key model. The key represents the substrate that needs to pass through the keyhole, representing the tunnel, in order to reach the lock, the active site, and react. Adapted from 167. Copyright Wiley-VCH Verlag GmbH & Co. KGaA. Reproduced with permission.

Enzyme tunnels are often equipped with molecular gates, which can make the transport of substances through those biomolecular systems regulated and specific. Molecular gating is a dynamic process that opens or closes the access of substances to different regions of the macromolecules. According to different factors, the gating processes may have very different timescales and roles. While the gates of ion channels have been studied and targeted by therapeutic agents for a long time, the gates of other biomolecules, such as many enzymatic systems, are still far from sufficiently explored or understood.171 Gates exist due to the dynamic nature of all biomolecules, and they can be regarded as sophisticated means to accomplish important functions in those biological systems. Dynamics is nowadays recognized as a fundamental property of all biomolecules, contributing to their function. Hence, the protein’s flexibility can hardly be excluded from any deep study in structural biology or structure-based drug design.347–350


In this review, we aim to draw attention to the importance of enzyme tunnels and gates and, in particular, to their relevance in drug discovery. These structural features are often overlooked, and yet they may hold the key to overcoming old problems and enabling the discovery of highly active and selective drugs. In the following sections, we present the functional roles, structural bases and localization of tunnels and gates. Their classification is introduced whenever possible, and examples of therapeutic targets containing these features are given. We present and discuss representative case studies of pharmaceutically relevant targets complexed with synthetic inhibitors bound to their tunnels or gates. These examples illustrate the different types of binding and the benefits of targeting those structural elements; they also serve as a proof of concept for several possible strategies for designing new binders that target enzyme tunnels and gates.

Enzyme tunnels and gates

In this chapter, we provide a detailed classification and description of enzyme tunnels and gates as functional structural features. The terminology used in the scientific literature is quite diverse, and the terms tunnel and channel are often used with the same meaning. Herein we define an enzyme tunnel as a transport pathway with a functional role that connects two points located in different regions of the enzyme structure, either on the surface or in any cavity inside the enzyme. An enzyme gate is a dynamic system consisting of individual residues, loops, secondary structure elements, or even domains that reversibly switch between open and closed conformations and thereby control the traffic of small molecules – substrates, products, ions, or solvent – into or out of the enzyme structure. For each of these biomolecular features we describe the structural basis, functional roles, localization, and potential as targets for drug design.

Enzyme tunnels

Many enzymes contain catalytic or binding sites that are not exposed to the solvent, but rather buried within their cores. The most important advantage of this arrangement is thorough control of the catalytic process at different levels. In these cases, a communication system between the functional site and the bulk solvent is necessary, and that is the enzyme tunnel. The main function of an enzyme tunnel is to perform the exchange of substances, such as the transport of substrates, products, cofactors or solvent molecules, in order to ensure the efficiency of the enzymatic process. Sometimes the presence of water molecules may hinder the catalytic reaction, and in such cases, the access of solvent needs to be controlled. In other cases, it is imperative to avoid the release of toxic intermediates that might endanger the living cell. Selection of the proper substrate from the complex mixture of molecules co-localized in the cell also becomes easier with the existence of tunnels. Depending on the geometry and physicochemical properties of the tunnels, it may be possible to select the substrates that can pass through, and hence assure the substrate specificity. This can be intuitively understood from the lock-keyhole-key model (Figure 1), which implies the idea of complementarity between the key (substrate) and the keyhole (tunnel).

The presence of tunnels in enzymes is quite a widespread feature, as they can be found in all six enzyme classes (Table 1). They can be classified according to the structural elements involved and the molecular function167 (Figure 2):

(1) One single tunnel connecting a buried cavity to the bulk solvent. In this case, the existing tunnel is the only pathway for the exchange of reagents, products, solvent or ions between the buried cavity and enzyme surroundings.

(2) Two or more tunnels connecting the buried cavity to the bulk solvent. The substrates, products and solvent may have different preference for the different tunnels. In most cases there is one main tunnel and secondary tunnels which are used as alternative or auxiliary routes.

(3) Tunnels connecting different catalytic sites in multifunctional enzymes or enzyme complexes possessing more than one active center. These tunnels steer the intermediate products in the right direction and prevent them from escaping into the medium, thus enhancing the enzyme’s efficiency. Additionally, they can also avoid side reactions by keeping labile intermediates away from the solvent, or even prevent toxic products from being released into the cell.

The bottleneck of a tunnel – the narrowest point – is often a hotspot for selectivity, since it determines the maximum size and the chemical nature of the substances that can pass through. Another important part of the tunnel is its entrance (or mouth). This is the first point of interaction with the bulk solvent, and it may have a role of major importance in substrate recognition. Likewise, the group of residues forming the bottleneck or the first shell of residues at the entrance of a tunnel can play a major role in determining its function. Hence, several parameters need to be identified in order to investigate the function of any given tunnel: its length, bottleneck radius, average radius, bottleneck residues, entrance residues, and curvature.
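Several of these parameters can be computed directly once a tunnel is represented as a sequence of spheres along its centerline. The sketch below assumes such a representation, a list of (x, y, z, radius) tuples ordered from the active site to the surface. It takes the curvature as the ratio of tunnel length to the straight-line distance between its ends, which is one common convention, and uses an unweighted mean radius for simplicity. The profile values and the function name are hypothetical.

```python
import math


def tunnel_descriptors(profile):
    """Compute simple geometric descriptors of a tunnel.

    profile: list of (x, y, z, radius) tuples ordered along the tunnel
    centerline, with coordinates and radii in the same length unit.
    """
    centers = [p[:3] for p in profile]
    radii = [p[3] for p in profile]
    # Length along the centerline: sum of distances between neighbors.
    length = sum(math.dist(a, b) for a, b in zip(centers, centers[1:]))
    # Straight-line distance between the two ends of the tunnel.
    straight = math.dist(centers[0], centers[-1])
    # The bottleneck is the narrowest sphere of the profile.
    bottleneck_index = min(range(len(radii)), key=radii.__getitem__)
    return {
        "length": length,
        "bottleneck_radius": radii[bottleneck_index],
        "bottleneck_index": bottleneck_index,
        "average_radius": sum(radii) / len(radii),
        "curvature": length / straight,
    }


# Hypothetical three-sphere profile of a bent tunnel.
example = [(0.0, 0.0, 0.0, 2.0), (3.0, 0.0, 0.0, 1.0), (3.0, 4.0, 0.0, 1.5)]
print(tunnel_descriptors(example))
```

The bottleneck and entrance residues are then those whose atoms lie closest to the narrowest sphere and to the last sphere of the profile, respectively; identifying them requires the protein coordinates and is not shown here.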

The dynamic nature of the system also needs to be taken into account when investigating the function of the tunnels. Due to dynamics, the tunnel’s geometry may vary significantly with time. While the main tunnels are frequently permanent and can be observed straightforwardly from the crystal structure, identification of additional transient tunnels always requires the study of dynamic changes within the enzyme. Transient tunnels open and close, depending on the enzyme conformation. Nevertheless, they can be essential for the proper functioning of the enzyme. Many enzymes thought to possess only one single tunnel may, in fact, have other functional transient tunnels, which will be revealed only with deeper studies.167,168,351,352 In some cases, the nature of the tunnels can only be fully understood when studied in the presence of the ligands, which may induce their opening and influence their occurrence.353

The geometry, chemical nature, and dynamics of the tunnels can also influence many enzyme properties. This has been demonstrated in a wide variety of examples. For instance, mutagenesis studies on the tunnel-lining residues or bottlenecks have been reported to modify the enzyme activity (e.g., cholesterol oxidase,354 catalase,355 cytochrome P450,356,357 glucosamine-6-phosphate synthase,358 β-ketoacyl-acyl-carrier-protein synthase,359 RNA-dependent RNA polymerase,360 lipase,361 acetylcholinesterase,362 epoxide hydrolase,185 haloalkane dehalogenase,37,184 tryptophan synthase,363 3-hydroxydecanoyl-acyl carrier protein dehydratase,364 squalene-hopene cyclase,365 asparagine synthetase,366 carbamoyl phosphate synthetase367); substrate specificity and enantioselectivity (e.g., amine oxidase,368 cytochrome P450,356 octaprenyl pyrophosphate synthase,369 lipase,361,370 epoxide hydrolase,185,371 haloalkane dehalogenases,56,184 squalene-hopene cyclase365); and stability (e.g., haloalkane dehalogenases188). Examples of enzymes containing tunnels which are also pharmaceutical targets are listed in the Table 1. In many of them those tunnels have already been targeted by inhibitors (see Examples in Table 2).

Figure 2. Different types of enzyme tunnels. Examples of enzymes containing one (1) or more tunnels (2) connecting the active site to the surface or connecting several active sites (3). The stars indicate the location of the active sites. Adapted with permission from 167.


Table 1. Examples of enzymes with clinical relevance containing molecular tunnels or gates.

| Class | Enzyme | Clinical relevance | Structural feature | References |
|---|---|---|---|---|
| EC1 Oxidoreductases | Aldehyde dehydrogenase | Neurodegenerative disorders, cancer | Tunnel | 372 |
| | Catalase | Inflammation, tumor, anemia, diabetes mellitus, hypertension, vitiligo | Tunnel and gate | 373 |
| | Cholesterol oxidase | Bacterial pathogenesis | Tunnel and gate | 354,374 |
| | Choline oxidase | Bacterial pathogenesis | Gate | 375 |
| | Copper-containing amine oxidase | Wound healing, atherosclerosis, cell growth | Tunnel and gate | 376 |
| | Cyclooxygenase | Pain, inflammation, cancer | Tunnel | 377–379 |
| | Cytochrome P450 | Cancer, antibiotics, antiparasitic, drug metabolism | Tunnel and gate | 192,353,380–382 |
| | Dihydrofolate reductase | Cancer, antibiotics | Gate | 383,384 |
| | Dihydroorotate dehydrogenase | Autoimmune and parasitic diseases, immunosuppression, cancer, inflammation | Tunnel | 385,386 |
| | Enoyl-acyl carrier protein reductase | Antibacterial | Gate | 387 |
| | Lipoxygenase | Stroke therapy, inflammatory diseases | Tunnel and gate | 388 |
| | Monoamine oxidase B | Alzheimer, Parkinson and other neurodegenerative diseases | Tunnel and gate | 389,390 |
| | Nitric oxide synthase | Neurological diseases, inflammation, rheumatoid arthritis, immune-type diabetes, stroke, cancer, thrombosis, infection susceptibilities | Tunnel and gate | 391,392 |
| | Polyamine oxidase | Cell growth, proliferation, differentiation | Tunnel | 393 |
| | Proline utilization A | Bacterial pathogenesis | Tunnel and gate | 394 |
| | Xanthine oxidase | Cardiovascular and inflammatory diseases, chronic obstructive pulmonary disease, gout, ischemia | Tunnel and gate | 395 |
| EC2 Transferases | β-ketoacyl-acyl-carrier-protein synthase | Antibiotics | Tunnel | 359,396 |
| | Anthranilate phosphoribosyltransferase | Tuberculosis | Gate | 397 |
| | Aspartate transaminase | Antiparasitic, antibiotics, cancer | Gate | 398,399 |
| | Catechol-O-methyltransferase | Schizophrenia, depression, Parkinson’s disease | Gate | 400 |
| | DNA and RNA polymerases | Antivirals | Gate | 401–404 |
| | Fatty acid synthase type I | Antivirals, cancer | Tunnel and gate | 405 |
| | Glucosamine-6-phosphate synthase | Antifungal chemotherapy | Tunnel and gate | 358,406 |
| | Glutamine phosphoribosylpyrophosphate amidotransferase | Leukemia | Tunnel | 407,408 |
| | Glutathione S-transferase | Antibiotics, cancer | Tunnel | 409 |
| | Glycogen phosphorylase | Diabetes | Tunnel | 410 |
| | Imidazole glycerol phosphate synthase | Antibiotics, herbicides | Tunnel and gate | 408,411 |
| | Octaprenyl pyrophosphate synthase | Antibiotics | Tunnel | 369 |
| | Peptidyl transferase center (ribozyme) | Antibiotics | Tunnel | 412,413 |
| | Phospho-2-dehydro-3-deoxyheptonate aldolase | Antibiotics | Gate | 414 |
| | Polynucleotide kinase | Cancer | Tunnel and gate | 415 |
| | Protein kinases | Cellular metabolism, proliferation, survival, growth, angiogenesis | Gate | 416 |
| | Purine nucleoside phosphorylase | Gout, arthritis, cancer | Gate | 417 |
| | Sulfotransferase | Xenobiotic metabolism, cancer | Gate | 418 |
| | Thymidylate synthase | Cancer | Gate | 419 |
| | Transglutaminase 2 | Celiac sprue | Tunnel and gate | 420 |
| EC3 Hydrolases | β-lactamase | Antibiotics | Gate | 421,422 |
| | β-secretase | Alzheimer’s disease | Gate | 423 |
| | γ-secretase | Alzheimer’s disease | Gate | 424 |
| | Acetylcholinesterase | Obesity, Alzheimer’s disease, dyslipidemia | Tunnel and gate | 351,425,426 |
| | Aryl esterase | Coronary and heart disease | Tunnel | |
| | ATP-dependent protease HslVU | Chronic stress disease, aging | Gate | 427,428 |
| | Autotaxin | Arthritis, cancer, neurological and cardiovascular diseases | Tunnel | 429 |
| | ClpP serine protease | Antibiotics | Tunnel and gate | 430 |
| | Cysteine protease | Chagas disease, other parasitic diseases | Gate | 431,432 |
| | Deacetylase LpxC | Antibiotics | Tunnel | 433,434 |
| | Epoxide hydrolase | Vascular diseases | Tunnel and gate | 435–437 |
| | Histone deacetylase | Inflammation, cancer, neurodegenerative disorders | Tunnel | 438 |
| | HIV protease | HIV infection | Gate | 439,440 |
| | Leukotriene-A4 hydrolase | Inflammatory diseases | Tunnel | 194 |
| | Lipase | Atherosclerosis, chylomicronemia, obesity, Alzheimer’s disease, and dyslipidemia associated with diabetes, insulin resistance | Tunnel and gate | 441–443 |
| | Neurolysin | Nervous and endocrine systems disorders | Tunnel | 444 |
| | Phospholipase A2 | Nephropathy, vascular diseases, neurological disorders | Gate | 445 |
| | Prolyl endopeptidase | Neurological disorders, Chagas disease, cancer, celiac sprue (as therapeutics) | Tunnel and gate | 446,447 |
| | RNA triphosphatase | Anemia, Alzheimer's disease, leukemia, colitis, fungal infections | Tunnel and gate | 448,449 |
| | Urease | Hepatic coma, infection stones, and peptic ulceration | Gate | 450 |
| EC4 Lyases | 2-amino-2-desoxyisochorismate synthase PhzE | Tryptophan deficiency | Tunnel and gate | 451 |
| | β-hydroxyacyl-acyl carrier protein dehydratase FabZ | Gastric diseases | Tunnel and gate | 364,452,453 |
| | Carbonic anhydrase | Autoimmune disease, diuretics, anticancer, anti-obesity, Alzheimer's disease | Tunnel and gate | 454,455 |
| | Chondroitin AC lyase | Neurological disease | Tunnel and gate | 456 |
| | Aromatic L-amino acid decarboxylase | Parkinson’s disease, Tourette’s syndrome, schizophrenia, depression, cancer | Gate | 457 |
| | Tryptophan synthase | Tuberculosis, bacterial and protozoan infections | Tunnel and gate | 408,458,459 |
| EC5 Isomerases | Glutamate racemase | Antibiotics | Tunnel | 460,461 |
| | Methylmalonyl-CoA-mutase | Acidemia | Tunnel and gate | 462,463 |
| | Oxidosqualene cyclase | Antibiotics | Tunnel and gate | 464 |
| | Squalene-hopene cyclase | Anticholesterol | Tunnel | 465 |
| | Triosephosphate isomerase | Tropical diseases, tuberculosis, Alzheimer’s disease | Gate | 466–468 |
| EC6 Ligases | Asparagine synthetase | Lymphoma | Tunnel and gate | 366,469,408 |
| | Carbamoyl phosphate synthetase | Urea cycle defect | Tunnel and gate | 367,470,408 |
| | Cytidine triphosphate synthetase | Anticancer, antiparasitic | Tunnel and gate | 471,472 |
| | Ubiquitin-conjugating enzyme E2 | Cancer | Gate | 473 |


Enzyme gates

Enzyme tunnels are privileged biochemical features that allow regulating the access of small molecules and ions to the functional regions of the enzymes. This control can be performed through size exclusion and chemical complementarity, and driven by diffusion, electrochemical gradient, osmotic pressure, etc. Alternatively, such regulation can be performed by molecular gates. In the latter case, a more sophisticated mechanism is in charge of selecting how and when the substances are allowed to pass through a certain pathway, by reversibly switching between the open and closed conformations.168,171,474 The gates of ion channels have been largely investigated by biochemists and structural biologists for quite some time, whereas the research on the gates of enzymes is far more limited. The topic is dispersed in the literature without consensual terminology, and a systematic classification was introduced only recently.171 In this review, we will focus on the gates of enzymes although many concepts could be generalized to other biomolecular systems. There are many examples of gated enzymes reported in the literature, and all six classes of enzymes have members that contain some type of gate (Table 1).171 The molecular gates can play various roles in the proper functioning of biochemical systems. One of the main functions consists of controlling the access of substances through the tunnels or reaching certain regions. Depending on certain properties, such as the hydrophobic character, electrostatic profile, opening amplitude, opening/closing rates, etc., they can act as efficient filters for attaining selectivity and timeliness. Enzyme gates can also restrain the access of the solvent. In particular cases, the reactions performed by the enzymes are sensitive to the presence of water molecules, and thus solvent restriction is necessary. 
Some enzymes that form reactive intermediates during their catalytic cycles, such as cytochromes P450,174 carbamoyl phosphate synthetase470 and imidazole glycerol phosphate synthase,411 may use gates to prevent the reaction of these intermediates with water. Gates can also play a role in the synchronization of molecular events taking place in different parts of the enzyme. This can occur in enzymes containing several active sites, in which the flux of intermediate products needs to be regulated. The gates can also prevent the escape of toxic intermediates into the cell. Good examples are carbamoyl phosphate synthetase,470 asparagine synthetase,366 glucosamine 6-phosphate synthase358 and glutamate synthase.475 All these enzymes have tunnels for ammonia transportation; carbamoyl phosphate synthetase also has tunnels for the transportation of carbamate.

The key elements defining the enzyme gates are: (i) door residues – the ones that are displaced during gating and directly lead to its opening or closing; (ii) anchoring residues – which interact with the door residues and stabilize them either in the open or closed state; (iii) hinge residues – which make the structure flexible and allow it to move. The molecular gates can be classified according to their structural basis171; they may involve single residues, groups of residues, secondary-structure elements, or domains (Figure 3), which also determines their amplitude and time scales.

(1) The simplest gate is termed a wing, and results from the rotation of the side chain of a single residue. This is the most common type of molecular gate in enzymes. Its amplitude is small, in the range of a few angstroms, but its time scale can range from picoseconds to microseconds. Each state can be stabilized by anchoring residues, which interact with the gating residue and hold it for some time in a certain conformation. The most common residues involved in this type of gate are W, F and Y.

(2) The swinging door gate corresponds to the synchronized rotation of two side chains. The two residues interact in the closed state by π-π stacking (e.g., F-F, F-Y pairs), hydrophobic interactions (e.g., F-I, F-V, F-L, L-I, L-V, R-L pairs), ionic interactions (e.g., R-E, R-D pairs) or H-bonding (e.g., R-S pair). This is the second most common type of gate, and the most common swinging door pair of residues is F-F. The reported time scales range from picoseconds to microseconds.

(3) The aperture gate corresponds to the simultaneous movement of the backbone atoms of several residues in a sort of low-frequency “breathing” motion of the backbone, without the need for side-chain rotations. Such backbone movement brings those residues closer together or further apart, which results in the closure or opening of the gate, and has a larger amplitude than the gates employing only side-chain motion. The enzyme’s rigidity has a particular influence on the time scale of this type of gate, which ranges from nanoseconds to microseconds.

(4) Drawbridge and double drawbridge gates correspond to the motion of one or two secondary structure elements, respectively, and frequently involve loops. These are gates of larger amplitude than the previous ones and control the access of large ligands or cofactors to the binding cavities. Such movements can be part of a complex system that opens, closes, or merges existing tunnels, and can even operate in cooperation with smaller types of gates that provide fine tuning of the ligand accessibility. The time scales of such gates range from nanoseconds to microseconds.


(5) Finally, the shell gate is characterized by the movement of entire enzyme domains. This type of gate typically occurs when very large substrates are involved in enzymatic reactions, but it is also common in ion channels and ion pumps. Sometimes such large movements require an additional supply of energy, e.g., in the form of ATP. Due to the amplitude of these motions and the size of the elements involved, the time scales on which they occur can be quite broad, from hundreds of nanoseconds to seconds.
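The gate classes and their reported time scales can be collected in a small lookup structure. The following is only an illustrative sketch: the names `GateClass`, `GATE_CLASSES` and `classes_compatible_with` are hypothetical, and the numerical bounds are rough readings of the verbal ranges given in the text (for instance, "hundreds of nanoseconds to seconds" is assumed here to mean 100 ns to 1 s).

```python
from dataclasses import dataclass

PS, NS, US = 1e-12, 1e-9, 1e-6  # picosecond, nanosecond, microsecond in seconds


@dataclass(frozen=True)
class GateClass:
    name: str
    moving_element: str
    fastest_s: float  # assumed lower bound of the reported time scale (s)
    slowest_s: float  # assumed upper bound of the reported time scale (s)


# Bounds are approximate translations of the ranges quoted in the text.
GATE_CLASSES = [
    GateClass("wing", "side chain of a single residue", PS, US),
    GateClass("swinging door", "side chains of two residues", PS, US),
    GateClass("aperture", "backbone of several residues", NS, US),
    GateClass("drawbridge", "one secondary-structure element", NS, US),
    GateClass("double drawbridge", "two secondary-structure elements", NS, US),
    GateClass("shell", "entire domain", 100 * NS, 1.0),
]


def classes_compatible_with(timescale_s: float) -> list[str]:
    """Return the gate classes whose assumed range contains the given time scale."""
    return [g.name for g in GATE_CLASSES
            if g.fastest_s <= timescale_s <= g.slowest_s]
```

For example, a gating event observed on a 10 ps time scale would be compatible only with the wing and swinging door classes under these assumed bounds, whereas a millisecond event would point to a shell gate.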


Figure 3. Types of enzyme gates. Examples of enzymes containing different classes of gates: 1) wing; 2) swinging door; 3) aperture; 4) drawbridge; 5) double drawbridge; 6) shell. The closed and open conformations are represented with the respective schematic illustration, while the gating elements are depicted in red. Adapted with permission from ref. 171.

The operating mode of enzyme gates can be stochastic and follow the formalism introduced by McCammon and co-workers476,477 describing diffusion-controlled gates. These authors approximated the gating process as a stochastic switch between the fully closed and the fully open states, and expressed the overall binding rate as a function of the non-gated binding rate and the rates of opening and closing. Based on the comparison of these rates, two limiting situations, corresponding to fast or slow gating, could be predicted. The molecular gates can also be induced by stimuli, such as voltage changes or the binding of certain ligands. This case is very common in ion channels, but these lie beyond the scope of this review.
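The two limiting situations can be illustrated numerically. The sketch below is not taken from the cited works: the function names are hypothetical, and the expressions are the limits commonly quoted for stochastically gated, diffusion-controlled binding, stated here as an assumption. In the slow-gating limit the ungated rate is scaled by the fraction of time the gate is open, and in the fast-gating limit the gated rate approaches the ungated rate.

```python
def open_fraction(k_open: float, k_close: float) -> float:
    """Equilibrium probability of finding a two-state gate open."""
    return k_open / (k_open + k_close)


def gated_rate_limits(k_ungated: float, k_open: float, k_close: float):
    """Return (slow_gating_limit, fast_gating_limit) of the binding rate.

    Assumed limiting behaviour (not given explicitly in the text):
      slow gating: the ligand sees a static gate, so the rate is the
                   ungated rate weighted by the open fraction;
      fast gating: the gate flickers open during every encounter, so the
                   rate approaches the ungated rate.
    """
    slow = k_ungated * open_fraction(k_open, k_close)
    fast = k_ungated
    return slow, fast


# Hypothetical numbers: ungated rate 1e9 M^-1 s^-1, gate open half of the time.
slow, fast = gated_rate_limits(1e9, k_open=2e6, k_close=2e6)
```

With these illustrative inputs the slow-gating limit is half of the ungated rate, which shows why a gate that is mostly closed reduces binding only when its dynamics are slow relative to the diffusional encounter.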

Regarding their location in the structures, the enzyme gates can be found: (i) at the mouth or the bottleneck of the access tunnels and channels, (ii) at the entrance to the active site of the enzymes, or (iii) at the interface between the cofactor binding site and the active site itself.171 Tunnels are important systems for selecting the substances that can access certain regions of the enzymes. Hence, many of these structural systems also possess gates, which grant a finer level of molecular steering or regulation. In these cases, the mouth and the bottleneck of tunnels are the most common locations for the biomolecular gates. The entrance to the tunnel is the first point of contact with the bulk solvent and the substances, and consequently it is a natural candidate for the location of a gate that selects which molecules are allowed to enter. On the other hand, the tunnel bottleneck is the narrowest point and often dictates the tunnel's permeability. The entrance to the active site cavity of an enzyme can also be a suitable location for a gate. In this way, it can assist in synchronizing the access of all the reagents to the catalytic site, or in ensuring the proper orientation of the catalytic residues when the substrate enters. Enzyme gates can also be positioned at the cofactor cavity, taking advantage of the motion of enzyme residues interacting with the cofactor. In some cases, the gate involves the cofactor itself, which can adopt different conformations to open and close the access of the substrate. In spite of this generalization, there are still enzyme gates that do not fall within any of the previous location categories.171

Many enzymes targeted in pharmaceutical research have already been found to possess gating processes of some kind (Table 1). In some of them, the gates have already been recognized as good hotspots for interaction with small molecules (see selected examples in Table 2). In other cases, they still remain as potential targets for new drug design approaches.

Binding of small molecules to tunnels and gates

In this section, we present and discuss different aspects of the binding of small molecules to the tunnels or gates of clinically relevant enzymes, from the point of view of modifying their activity. Aspects such as the different types of binding, the benefits of this binding, and the strategies for designing effective binders are critically described. Finally, representative examples of pharmaceutical targets complexed with small molecules, taken from the Protein Data Bank, are presented to illustrate the points discussed.


Hereafter, unless clearly stated, the examples mentioned throughout the text refer to the ones found in Table 2.

Types of binding

There are different possible ways in which an inhibitor can bind to enzymes containing tunnels and gates in order to inhibit or modify their activity. These types of binding are schematically represented in Figure 4 and are: (i) binding to the catalytic site; (ii) binding to the tunnel; (iii) binding to a gate; and (iv) mixed binding.

Figure 4. A schematic representation of the binding modes of the inhibitors to enzymes containing tunnels or gates: a) binding to the catalytic site only; b) to the tunnel; c) to a gate. The star represents the catalytic site and the diamond the inhibitor; b) also shows a secondary tunnel.

When an inhibitor targets an enzyme, it can bind directly to the catalytic site in order to prevent the enzymatic reaction. This is the most common type of binding among enzyme inhibitors.

An inhibitor can bind to the tunnel of an enzyme and block its main function as a transport pathway. In this case, the inhibitor can be long enough to extend its structure along the whole tunnel, making a high number of contacts with the tunnel-lining residues. When this happens, it can result in a very stable complex due to the large number of interactions formed, and consequently in high inhibitory activity. This was the case of sterol 14α-demethylase, dihydroorotate dehydrogenase, leukotriene A4 hydrolase/aminopeptidase and peptidyl transferase center (Example Nr. 2, 4 and 8, respectively, Table 2). However, most commonly, the inhibitors bind to particular regions rather than the whole tunnel, provided that they form favorable interactions with some residues. These regions are typically the entrance to the catalytic site, the tunnel bottleneck, or the tunnel entrance. Interaction with the residues at the entrance to the catalytic site is perhaps the most common in inhibitors that bind to the catalytic site; however, on their own these interactions hardly provide selectivity, as these residues are often conserved among different variants. The binding at the tunnel entrance was observed for the acetylcholinesterase (Example Nr. 7, Table 2). In this case, the inhibitor interacts with a group of aromatic residues that provide strong affinity for blocking the tunnel and inactivating that enzyme with reasonable potency. As mentioned before, the tunnel mouth sometimes has features for substrate recognition, and specific interactions in this region can provide selectivity, e.g. with the sterol 14α-demethylase (Example Nr. 2, Table 2). The tunnel bottleneck is another potential hotspot for the binding of inhibitors. Being the narrowest part of the tunnel, a small inhibitor may be enough to provide strong interactions to form a stable adduct with the target enzyme and block the transport process through the tunnel. On the other hand, the bottleneck can be a source of specific interactions, and hence selectivity. This was the case of the cytochrome P450 17A1 and the peptidyl transferase center (Example Nr. 3 and 5, respectively, Table 2).

When inhibitors bind to the gate of enzymes, the most obvious and common binding position is the gate interface. This corresponds to the region of contact between the moving elements, which, depending on the type of gate, can be the door residues, the flexible secondary elements, or the entire domains. In this case, either the open or the closed conformations can be targeted by the inhibitors. In this type of binding, the inhibitor stabilizes and locks one of the two conformations, thus disrupting the gating mechanism, together with the biological function of the target. The serine/threonine protein kinase AKT1 (Example Nr. 6, Table 2) is one case where the enzyme was blocked in the closed conformation by the inhibitor, which bound between the two domains and blocked the enzymatic activity with high potency. In the case of the HIV-1 protease (Example Nr. 10, Table 2), the inhibitor was bound between the two flexible flaps, locking the enzyme in the open conformation, and inhibiting its function.

The binding modes mentioned above correspond to the pure types. However, most often the inhibitors bind the enzymes in more than one mode. For instance, they can bind at the catalytic site and to different regions of the tunnel. When the tunnels are gated, the inhibitor may bind to several of the tunnel residues and to the gate interface, or to the catalytic residues and the gating elements. The combination of binding modes can be very varied, and depends on the system and the drug design strategy. This was the case of the inducible nitric oxide synthase, cytochrome P450 17A1, prolyl endopeptidase, deacetylase LpxC and dehydratase FabZ (Example Nr. 1, 3, 9, 11 and 12, respectively, Table 2).
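The distinction between pure and mixed binding modes can be expressed as a simple rule over the structural regions an inhibitor contacts. The snippet below is a minimal sketch of that rule only; the function name `classify_binding` and the region labels are hypothetical, and in practice the contacted regions would have to be assigned from a structure of the complex.

```python
REGIONS = {"catalytic site", "tunnel", "gate"}


def classify_binding(contacted_regions) -> str:
    """Classify a binding mode from the set of contacted structural regions.

    One region gives a pure binding mode named after that region;
    two or more regions give the 'mixed' mode.
    """
    regions = set(contacted_regions) & REGIONS
    if not regions:
        raise ValueError("no recognized region contacted")
    if len(regions) == 1:
        return next(iter(regions))
    return "mixed"


# Region assignments below follow the 'Binding' field of Table 2.
examples = {
    "iNOS (Nr. 1)": classify_binding(["catalytic site", "tunnel", "gate"]),
    "DHODH (Nr. 4)": classify_binding(["tunnel"]),
    "AKT1 (Nr. 6)": classify_binding(["gate"]),
}
```

Under this rule the iNOS inhibitor is classified as mixed, while the DHODH and AKT1 inhibitors are pure tunnel and gate binders, respectively.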


Benefits of binding

When the enzymes possess tunnels or gates of some type, binding to these structural elements might bring important advantages. Some of these are: (i) target selectivity; (ii) binding affinity; (iii) broad spectrum activity; (iv) a wider space of binding modes; (v) tackling drug resistance; and (vi) selectively targeting one function.

Selectivity is the first advantage, and probably the most common one, found in binding to tunnels or gates. Depending on the reactions catalyzed and the type of substrates, the catalytic site and the surrounding residues can be highly conserved within some families of enzymes, and sometimes even across different families. The respective tunnels, on the contrary, are typically formed by many residues from different secondary elements, and are therefore more likely to present variability. The specific binding of inhibitors to those variable regions of the tunnels might result in selective inhibition of the targeted enzymes. Often the bottleneck contributes the most to the specific interactions within the tunnel (e.g., cytochrome P450 17A1 and peptidyl transferase center, Example Nr. 3 and 5, respectively, Table 2), or the residues at the tunnel mouth provide specificity to the inhibitors (e.g., sterol 14α-demethylase, acetylcholinesterase and dehydratase FabZ, Example Nr. 2, 7 and 12, respectively, Table 2). It is important to mention that enzyme dynamics can play a fundamental role in the binding of selective inhibitors. Due to dynamics, there is an intrinsic fluctuation in the geometry of the tunnel. Such fluctuation can differ between enzyme variants due to differences in the tunnel-lining residues or in the second- or third-shell residues (anchoring residues). The inducible nitric oxide synthase (iNOS) (Example Nr. 1, Table 2) is a very interesting case where those distant anchoring residues played an essential role in discovering selective inhibitors due to the different dynamics of the tunnels among the different NOS variants. The binding to enzyme gates can also lead to high selectivity. Being more or less complex systems, where several factors contribute to their properties (door, hinge and anchoring elements can affect the gate dynamics), the gates can be quite specific for each member of an enzyme family. The serine/threonine protein kinase AKT1 (Example Nr. 6, Table 2) showed how binding at the gate interface led to the discovery of selective inhibitors.

The binding affinity is the second main advantage. A tunnel can have a large accessible surface area available, and hence it may supply many contacts with an inhibitor. These contacts, when favorable and numerous, may form very stable enzyme-inhibitor complexes. This was the case of the inducible nitric oxide synthase, sterol 14α-demethylase, cytochrome P450 17A1, dihydroorotate dehydrogenase, peptidyl transferase center, leukotriene A4 hydrolase/aminopeptidase, and deacetylase LpxC (Example Nr. 1–5, 8, and 11, respectively, Table 2). When binding to the gate of an enzyme, an inhibitor can also gain highly stabilizing interactions with one of the possible gate conformations, and thus achieve high potency. This can occur when the gate is flexible enough to accommodate the inhibitor and maximize their interactions. This can be valid either for the closed (e.g., serine/threonine protein kinase AKT1 and prolyl endopeptidase, Example Nr. 6 and 9, respectively, Table 2) or the open conformation of the gate (e.g., HIV protease, Example Nr. 10, Table 2).

Contrasting with the aim for selectivity, in some cases broad spectrum activity is desirable. This is the case of antivirals or antibiotics, which are often required to block several strains of viruses or bacteria. Some enzymes and other biochemical systems are shared by many of those strains, with only small differences. If this is the case, they can be used as the common targets and their tunnels or gates as the primary binding motifs. Examples are the sterol 14α-demethylase, dihydroorotate dehydrogenase, peptidyl transferase center, deacetylase LpxC and dehydratase FabZ (Example Nr. 2, 4, 5, 11 and 12, respectively, Table 2).

The space of binding modes refers here to the number and diversity of interactions available for inhibitor binding. That space is larger for a tunnel than for a buried cavity, due to the number of residues available for contact, and this can be important when designing new inhibitors to target enzymes. The same occurs when binding to a gate; the knowledge of both conformational states increases the number of binding possibilities, as well as the chances of finding inhibitors with ideal pharmacophores for binding either conformation. The flexibility inherent to a tunnel or gate may also allow some adjustments to accommodate an inhibitor and optimize the binding interactions.

A practical application of binding the tunnels or gates can be to reduce drug resistance. In many cases this has been an issue due to the high mutability of some targets. In such cases, exploring new binding modes may result in higher chances of efficiently inhibiting the targets and overcoming the resistance problem. This was the case of the sterol 14α-demethylase, dihydroorotate dehydrogenase, peptidyl transferase center, HIV protease, deacetylase LpxC and dehydratase FabZ (Example Nr. 2, 4, 5, 10–12, respectively, Table 2).

Some enzymes have more than one function, and it may not be desirable to fully inhibit them, but rather to selectively target one function only. The leukotriene A4 hydrolase/aminopeptidase (Example Nr. 8, Table 2) is a good example where the binding of the selective inhibitor to one tunnel made it possible to block the hydrolysis of one substrate, while the catalytic site remained functional, as did the second tunnel used for the transport of a different substrate. One can imagine that a similar approach could be applied to other bifunctional enzymes containing different binding pockets or tunnels.


Prospective drug design strategies

The great majority of the inhibitors that bind to enzymes target their catalytic sites and the neighboring binding sites, in order to exert their action. However, sometimes this strategy does not yield the most effective results due to a number of possible reasons. According to the previous sections, important advantages may arise from the binding of inhibitors to the tunnels or gates of enzymes. Here we summarize some possible approaches for the design of new inhibitors to target those structural elements: (i) fill the tunnel with many contacts; (ii) bind to particular regions of the tunnel; (iii) bind to a secondary tunnel; and (iv) bind to the gating elements.

The strategies listed above highly correlate with the binding modes described previously. The approach of choice will depend on the system of study, the known structural features, and the pharmacological effect pursued. If the only issue to tackle is the inhibitory activity, then the highest number of favorable contacts is the key, and in this case the aim is to design an inhibitor which best complements the tunnel in geometry and chemical interactions.

A drug design that targets particular regions of the tunnel is another approach that can lead to achieving activity or selectivity. When the tunnel contains a certain group of residues that can optimally be complemented by specific molecular fragments, they can be used for designing inhibitors that bind this region with high affinity. These regions can be formed by non-conserved residues, and in this case, their binding can be a potential strategy for achieving selectivity. These regions are frequently the tunnel bottleneck, the tunnel mouth, or tunnel gating residues. More than one region can be targeted at a time.

As said before, some enzymes contain more than one access tunnel, and some of them can be secondary tunnels. This means that they are not essential for the transport of the substrate or product, but can be an alternative or more specialized pathway (e.g., for solvent). When a secondary tunnel is known to have relevant functional roles, it can be targeted for drug binding with inhibitory or modulating effects. The design of inhibitors to bind a secondary tunnel follows the same principles as discussed above. A particular case of secondary tunnels are the transient tunnels. These tunnels open only occasionally, and hence they can be said to be gated. Some cases of inhibitors or other ligands that bind to transient tunnels are known.192,382 A thorough description of the system’s dynamics will be very important in this case, but it may be attainable. It can represent one level further in exploring the differences between different enzyme variants in order to achieve selectivity. This approach can be very challenging, but it can also prove fruitful and bring new solutions to old problems.


Targeting an enzyme gate for drug design means designing new inhibitors to interact with the gating elements and block one conformation of the gate, either the closed, the open, or an intermediate state. The most obvious elements for such binding are the door residues or the moving elements. The rational design of such binders requires a thorough knowledge of both open and closed conformations. Ideally, it should be followed by the design validation with dynamics studies of the inhibitor-enzyme complexes, in order to confirm their stability. Although more difficult to rationally execute, another approach is to aim for the hinge and anchoring elements of the enzyme’s gate. It has been shown that allosteric binders could inhibit a cytochrome P450 by rigidifying the whole system.478 However, the function of the hinge and anchoring elements is usually complex and sometimes hard to fully understand, and, likewise, the result from their binding can be more unpredictable. Nonetheless, the development of tools for predicting allosteric binding sites is in progress.479 Many ion channels and other receptors have been successfully targeted on their orthosteric or allosteric binding sites. Indeed, through the binding of inhibitors to the key elements that define the flexibility of those gated proteins, it has been possible to block or modulate them.480,481 Therefore, we foresee that such approaches might also be possible for the design of novel enzyme inhibitors or modulators in a rational way.

Targeting the tunnels or gates for drug binding can reveal some drawbacks or difficulties. These are mainly related to the flexibility of the targets and the correct description of those features. As previously mentioned, the inherent dynamical properties of enzymes can significantly influence the variation of a tunnel’s geometry with time. Hence, a crystal structure can be very far from the average ensemble in solution, and thus result in unsuccessful drug designs. To tackle the problem of flexible tunnels, a reasonable exploration of the target’s conformational space is recommended prior to drug design, either with theoretical studies (e.g., molecular dynamics simulations) or with experimental methods (e.g., using ensembles of structures from nuclear magnetic resonance spectroscopy). Further in silico study of the dynamics of the enzyme-ligand complex may provide a theoretical validation of the design. On the other hand, detection of transient tunnels and molecular gates can prove difficult in some cases. Many gates have slow dynamics, which makes their study and understanding a difficult task.


Representative examples of tunnel and gate binding

Here we compile several case studies of small molecules that were found to inhibit or modulate the biological activity of pharmacological targets by binding to enzyme tunnels or gates. The presented examples illustrate the different binding modes and possible benefits of such types of binding and serve as the proof-of-concept for some of the drug design strategies mentioned above. The criteria for selecting the examples were: i) clinical relevance of the target; ii) availability of a crystal structure of the enzyme-inhibitor complex in the Protein Data Bank; iii) binding of the inhibitor to the tunnel or gate of the enzyme; iv) variety of binding modes/benefits; v) the inhibitory activity of the ligand; vi) resolution of the structure. The most important of these factors were the clinical relevance of the target, the existence of a crystal structure that clearly showed the inhibitor binding to the specific structural features (tunnel or gate), and the diversity of binding modes and benefits. The remaining factors were secondary, and were only used to discard similar cases. In total, this list contains 12 entries.


Table 2. Extended description of the selected examples of enzymes complexed with inhibitors that bind to their tunnels or gates. Cases Nr. 1 to 12 include brief information about each enzyme, their clinical relevance, relevant structural features (tunnels or gates), the inhibitors found in each crystal structure, their inhibitory activities, binding modes, and the main biological benefit attained with binding.

#1 Enzyme: Inducible nitric oxide synthase (iNOS)

PDB ID: 3EBF

E.C. 1.14.13.39

Function: Produces nitric oxide for signaling as response to cytokines or pathogens, in order to kill bacteria, viruses or tumor cells. However, overproduction of nitric oxide by iNOS has been associated with several diseases

Clinical relevance: Neurological diseases, inflammation, rheumatoid arthritis, immune-type diabetes, stroke, cancer, thrombosis, infection susceptibilities

Structural features: One access tunnel with a wing gate

Inhibitor: (3R)-3-(1,2,3,4-tetrahydroisoquinolin-7-yloxymethyl)-2,3-dihydrothieno[2,3-f][1,4]oxazepin-5-amine

IC50 = 0.4 µM

Binding: Catalytic site, tunnel and tunnel gate

Advantage: Selectivity. The active sites of the three NOS isozymes are structurally conserved. Stabilization of the tunnel gate by the inhibitor induces distant isozyme-specific conformational changes of the not-conserved 2nd and 3rd shell residues defining the inhibitor’s selectivity over other enzyme variants.

Structure: Blue: the overall structure of iNOS; orange: the tunnel; black star: the catalytic site. Magenta sticks: the inhibitor blocking the tunnel and interacting with the tunnel gate (red sticks) and the heme cofactor (cyan sticks).

References: 482–484


#2 Enzyme: Sterol 14α-demethylase (CYP51)

PDB ID: 3K1O

E.C. 1.14.13.70

Function: Catalyzes an essential step in the biosynthesis of ergosterol which is required for membrane construction, growth, development and division of parasites e.g. Trypanosoma cruzi, Trypanosoma brucei or Leishmania sp.

Clinical relevance: Antiparasitic

Structural features: Several access tunnels

Inhibitor: 4-[4-[4-[4-[[(2S,5S)-5-(2,4-difluorophenyl)-5-(1,2,4-triazol-1-ylmethyl)oxolan-2-yl]methoxy]phenyl]piperazin-1-yl]phenyl]-2-[(2S,3S)-2-hydroxypentan-3-yl]-1,2,4-triazol-3-one (posaconazole)

Kd = 73 nM

Binding: Catalytic site and tunnel

Advantage: Selectivity, potency, broad spectrum activity. Complete blockage of the tunnel results in the strong inhibition of the enzyme. However, the interactions at the tunnel mouth provide selectivity against the parasitic enzyme keeping the human CYP51 homolog unaffected.

Structure: Blue: the overall structure of CYP51; orange: the tunnel; black star: the catalytic site. Magenta sticks: posaconazole blocking the tunnel and coordinating the heme cofactor (cyan sticks).

References: 382,485,486


#3 Enzyme: Cytochrome P450 17A1 (CYP17A1)

PDB ID: 3RUK

E.C. 1.14.14.19; E.C. 4.1.2.30

Function: Catalyzes the synthesis of numerous steroid hormones in humans. CYP17A1 possesses dual function. Its hydroxylase activity enables production of glucocorticoids, and, in combination with the lyase activity, it catalyzes the biosynthesis of androgenic and estrogenic sex steroids.

Clinical relevance: Breast and prostate cancer

Structural features: One main access tunnel and many adjacent solvent tunnels

Inhibitor: (3S,8R,9S,10R,13S,14S)-10,13-dimethyl-17-pyridin-3-yl-2,3,4,7,8,9,11,12,14,15-decahydro-1H-cyclopenta[a]phenanthren-3-ol (abiraterone)

IC50 = 3 nM

Binding: Catalytic site and main tunnel

Advantage: Potency, selectivity. Interactions with the tunnel residues provided potency, while some of them, namely those at the bottleneck, provided the selectivity of the inhibitor.

Structure: Blue: the overall structure of CYP17A1; orange: the main tunnel; black star: the catalytic site. Magenta sticks: abiraterone blocking the main tunnel and coordinating the heme cofactor (cyan sticks); red sticks: tunnel-bottleneck residue Asn202 responsible for the inhibitor’s selectivity within the P450 enzyme family.

References: 174,381,487,488


#4 Enzyme: Dihydroorotate dehydrogenase (DHODH)

PDB ID: 1D3H

E.C. 1.3.5.2

Function: Catalyzes the rate-limiting step in the de novo biosynthesis of pyrimidines. Rapidly proliferating cells, e.g., human T cells or parasitic cells, have an exceptional requirement for de novo biosynthesis.

Clinical relevance: Autoimmune or parasitic diseases, cancer, immunosuppression

Structural features: One access tunnel

Inhibitor: 6-fluoro-3-methyl-2-(4-phenylphenyl)quinoline-4-carboxylic acid (brequinar analog)

Ki = 8 nM

Binding: Tunnel

Advantage: Selectivity, broad spectrum activity. The active sites of the various DHODHs closely resemble one another. However, the access tunnels in the human and pathogenic enzymes, e.g., from Plasmodium falciparum or Helicobacter pylori, differ markedly.

Structure: Blue: the overall structure of DHODH; orange: the tunnel; black star: the catalytic site. Magenta sticks: the inhibitor blocking the tunnel; cyan sticks: flavin cofactor.

References: 385,386


#5 Enzyme: Peptidyl transferase center (PTC)

PDB ID: 1NJI

E.C. 2.3.2.12

Function: Part of the large subunit of the ribosomal RNA (rRNA), it is essential for the ribosome’s main function, the protein synthesis. It catalyzes the bond formation between the amino acids, supplied stepwise by aminoacyl-tRNAs to build up the peptide chain bound at the peptidyl-tRNA molecule. Subsequently, it catalyzes the termination of such reaction and the release of the fully assembled polypeptide, by hydrolysis of the final peptidyl-tRNA

Clinical relevance: Antibiotics

Structural features: One tunnel necessary for product release

Inhibitor: 2,2-dichloro-N-((1R,2R)-1,3-dihydroxy-1-(4-nitrophenyl)propan-2-yl)acetamide (chloramphenicol)

Ki = 0.7 µM

Binding: Tunnel

Advantage: Affinity, broad spectrum, enlarged space of binding modes. Most PTC inhibitors compete with the substrates for binding, but can suffer from drug resistance. Targeting the exit tunnel may widen the space of binding modes, which can help tackling the problem of drug resistance to antibiotics.

Structure: Blue: the overall structure of PTC; orange: the tunnel; black star: the catalytic site. Magenta sticks: chloramphenicol bound at the bottleneck of the tunnel.

References: 412,413,489


#6 Enzyme: Serine/threonine protein kinase AKT1 (AKT1) PDB ID: 3O96

E.C. 2.7.11.1

Function: Regulates many cellular processes, including metabolism, proliferation, survival, growth and angiogenesis. The phosphoinositide 3-kinase/AKT pathway is possibly the most frequently activated signal transduction system in human cancer

Inhibitor: 1-(1-(4-(7-phenyl-1H-imidazo[4,5-g]quinoxalin-6-yl)benzyl)piperidin-4-yl)-1H-benzo[d]imidazol-2(3H)-one

IC50 = 58 nM

Clinical relevance: Cellular metabolism, proliferation, survival, growth, angiogenesis

Structural features: Shell gate

Binding: Gate

Advantage: Selectivity. Binding at the allosteric site (at the gate interface) blocks the gating mechanism needed for two regulation steps and has allowed selective inhibitors to be obtained. The majority of orthosteric-binding inhibitors are not selective.

Magenta sticks: inhibitor binding at the allosteric site, at the gate interface; blue: the kinase domain; red: the PH domain; black star: the orthosteric site; black dot: the allosteric site

References: 490–492

#7 Enzyme: Acetylcholinesterase (AChE) PDB ID: 2XI4

E.C. 3.1.1.7

Function: Plays a crucial role at cholinergic synapses in both the central and peripheral nervous systems. AChE terminates the impulse transmission by rapid hydrolysis of the neurotransmitter acetylcholine

Inhibitor: (6aR,9aS)-4-methoxy-2,3,6a,9a-tetrahydrocyclopenta[c]furo[3',2':4,5]furo[2,3-h]chromene-1,11-dione (aflatoxin B1)

Ki = 28 µM

Clinical relevance: Neurological diseases

Structural features: One main access tunnel and one backdoor tunnel with an aperture gate

Binding: Main tunnel mouth

Advantage: Selectivity, wider space of binding modes. The inhibitor has strong interactions at the tunnel mouth, providing inhibition by blocking substance exchange. Binding at the peripheral anionic site (tunnel mouth) may also provide selectivity for AChE with respect to butyrylcholinesterase

Magenta sticks: aflatoxin B1 blocking the main tunnel mouth; red sticks: the aperture gate permitting the transit of substrate, products or solvent. This gated backdoor tunnel explains the enzyme’s high efficiency and residual activity when inhibited by tunnel blockers

Blue: the overall structure of AChE; orange: the main tunnel; olive: the backdoor tunnel; black star: the catalytic site

References: 351,493,494

#8 Enzyme: Leukotriene A4 hydrolase/aminopeptidase (LTA4H) PDB ID: 4L2L

E.C. 3.3.2.6

Function: Catalyzes hydrolysis of leukotriene A4 (LTA4), an essential step in the formation of the proinflammatory lipid mediator, leukotriene B4 (LTB4), a chemoattractant during the innate immune response. But LTA4H also inactivates the neutrophil chemoattractant tripeptide Pro-Gly-Pro (PGP), which is a biomarker for chronic pulmonary disease

Inhibitor: 4-(4-benzylphenyl)thiazol-2-amine (ARM1)

Ki = 2 µM

Clinical relevance: Acute and chronic inflammatory diseases, such as nephritis, arthritis, dermatitis, chronic obstructive pulmonary disease and arteriosclerosis

Structural features: Two access tunnels, for the transport of LTA4 and PGP to the active site, respectively

Binding: One main tunnel, out of two

Advantage: Affinity, selectivity, target one enzymatic function. Binding at only one tunnel blocks the formation of LTB4, but leaves the enzyme operational for the second function of hydrolysis of the undesirable PGP

Magenta sticks: ARM1 inhibitor blocking one of the access tunnels

Blue: the overall structure of LTA4H; orange: the tunnel used for transport of LTA4; olive: the tunnel used for transport of PGP; black star: the active site

References: 194,495,496

#9 Enzyme: Prolyl endopeptidase (PREP) PDB ID: 2BKL

E.C. 3.4.21.26

Function: Cleaves peptides after proline residues. The human PREP is a cytosolic enzyme involved in the maturation and degradation of peptide hormones and neuropeptides.

Inhibitor: (2S)-1-[(2S)-2-phenylmethoxycarbonylaminopropanoyl]pyrrolidine-2-carboxylic acid (Z-Ala prolinal)

IC50 = 1 µM

Clinical relevance: Neurological disorders, Chagas disease, cancer, celiac sprue (as therapeutics)

Structural features: Shell gate

Binding: Catalytic site and gate

Advantage: Binding affinity. Inhibition of the gating mechanism prevents the substrate from binding in the active site.

Magenta sticks: Z-Ala prolinal binding at the catalytic site and locking the gate in the closed state

Unbound enzyme in the open conformation (PDB ID 1YR2); blue: the β-barrel domain; red: the catalytic domain; black star: the catalytic site

References: 446,497,498

#10 Enzyme: Human immunodeficiency virus 1 protease (HIV-1 protease) PDB ID: 3BC4

E.C. 3.4.23.16

Function: HIV protease is a retroviral aspartic endopeptidase that is essential for the life cycle of human immunodeficiency virus (HIV) because it cleaves the newly synthesized polyproteins into their functional units. Without this protease’s activity, the HIV virions remain non-infectious

Inhibitor: (3S,4S)-pyrrolidine-3,4-diyl bis(2-(naphthalen-1-yl)acetate)

Ki = 20 µM

Clinical relevance: AIDS/HIV epidemic

Structural features: Double drawbridge gate

Binding: Catalytic site and gate

Advantage: Affinity, enlarged space of binding modes. Most HIV protease inhibitors target the active site, but they are also vulnerable to drug resistance. Targeting the open or the closed conformation of the enzyme gate extends the space of binding modes, which increases the chances of finding inhibitors that bind to less mutable residues and tackle the issue of drug resistance.

Magenta sticks: inhibitor bound at the gate interface in the open conformation

Unbound enzyme in the closed conformation (PDB ID: 1HVR); blue: the core domain; red: the flexible gating flaps; black star: the catalytic site

References: 499,500

#11 Enzyme: UDP-3-O-[3-hydroxymyristoyl] N-acetylglucosamine deacetylase (LpxC) PDB ID: 3NZK

E.C. 3.5.1.33

Function: Performs a critical step in the biosynthesis of lipid A. Lipid A constitutes a membrane anchor of the outer leaflet of the outer membrane of Gram-negative bacteria and is responsible for the cell viability and toxicity in inflammatory responses

Inhibitor: N-[(2S,3R)-3-hydroxy-1-(hydroxyamino)-1-oxobutan-2-yl]-4-[2-[4-(morpholin-4-ylmethyl)phenyl]ethynyl]benzamide (CHIR-090)

Ki = 1 nM

Clinical relevance: Antibiotics

Structural features: One access tunnel formed by a βαβ subdomain which is conserved among many Gram-negative bacteria. The βαβ subdomain contains tunnel-lining residues which are essential for the time-dependent inhibition as well as for antibiotic resistance

Binding: Catalytic site and tunnel

Advantage: Broad spectrum activity, drug resistance. Blocking of the evolutionarily conserved tunnel might affect the viability of Gram-negative bacteria

Magenta sticks: CHIR-090 blocking the tunnel and chelating the catalytic zinc ion

Blue: the overall structure of LpxC; orange: the tunnel; black star: the catalytic site

References: 501–504

#12 Enzyme: β-hydroxyacyl-acyl carrier protein dehydratase (FabZ) PDB ID: 3D04

E.C. 4.2.1

Function: Catalyzes the essential step in biosynthesis of both saturated and unsaturated fatty acids in bacteria

Inhibitor: (2S)-5-hydroxy-2-(4-hydroxyphenyl)-7-methoxy-2,3-dihydrochromen-4-one (sakuranetin)

IC50 = 2 µM

Clinical relevance: Antibiotics

Structural features: L-shaped access tunnel with two wing gates located at the entrance and the exit of the tunnel

Binding: Tunnel mouth and tunnel gate

Advantage: Broad spectrum activity, drug resistance. The structural differences between the fatty acid biosynthesis systems in bacteria and mammals might yield effective inhibitors against pathogenic microbes

Magenta sticks: sakuranetin blocking the tunnel mouth; red sticks: the wing gating residues. The gates prevent the exposure of the active site to the bulk solvent, drive the product release and determine the length of the accommodated acyl chains.

Blue: the overall structure of FabZ; orange: the tunnel; black star: the catalytic site

References: 364,452,453

Conclusions

In this review, we aimed to highlight the potential of enzyme tunnels and gates in drug discovery. To that end, we first described the importance of molecular tunnels and gates to the regular functioning of enzymes, and gave examples of clinical targets containing those features. We then discussed the possible ways in which an inhibitor can bind to enzyme tunnels and gates, as well as the benefits that may result from such binding. The binding modes, benefits and design strategies were illustrated by 12 case studies of complexes containing inhibitors bound to the tunnels or gates of clinically relevant enzymes.

Several strategies for the design of new inhibitors targeting tunnels or gates were outlined and discussed. These strategies depend largely on the knowledge available about the targets and on the biological effect sought upon binding. Understanding the flexibility or dynamical behavior of the targets is highly recommended before carrying out new designs, since ignorance of such properties may hamper the success of a drug design strategy. Many enzymes contain flexible or transient tunnels and gates that have not yet been identified, and yet they may hold the key to discovering more efficient drugs.

In summary, we have given the functional and structural basis for drug design strategies that target the tunnels or gates of enzymes. Some of these approaches are bound to produce effective solutions and, in some cases, may help to overcome long-standing problems.

REFERENCES

(1) Pasteur, L. Comptes rendus l’Académie des Sci. 1858, 46, 615. (2) Fischer, E. Ber. Dtsch. Chem. Ges. 1894, 27, 2985. (3) Buchner, E. Ber. Dtsch. Chem. Ges. 1897, 30, 117. (4) Reetz, M. T. J. Am. Chem. Soc. 2013, 135 (34), 12480. (5) Bornscheuer, U. T.; Huisman, G. W.; Kazlauskas, R. J.; Lutz, S.; Moore, J. C.; Robins, K. Nature 2012, 485 (7397), 185. (6) Papagianni, M. Biotechnol. Adv. 2007, 25 (3), 244. (7) Schoemaker, H. E.; Mink, D.; Wubbolts, M. G. Science 2003, 299 (5613), 1694. (8) Glick, B. R.; Pasternak, J. J.; Patten, C. L. Molecular Biotechnology: Principles and Applications of Recombinant DNA; American Society of Microbiology, 2010; Vol. 4. (9) Mullis, K.; Faloona, F.; Scharf, S.; Saiki, R.; Horn, G.; Erlich, H. Cold Spring Harb. Symp. Quant. Biol. 1986, 51 (1), 263. (10) Sanger, F.; Nicklen, S.; Coulson, A. R. Proc. Natl. Acad. Sci. U. S. A. 1977, 74 (12), 5463. (11) Francis, J. C.; Hansche, P. E. Genetics 1972, 70 (1), 59. (12) Reetz, M. T. Angew. Chem. Int. Ed. Engl. 2011, 50 (1), 138. (13) Strohmeier, G. A.; Pichler, H.; May, O.; Gruber-Khadjawi, M. Chem. Rev. 2011, 111 (7), 4141. (14) Richmond, K. E.; Li, M.-H.; Rodesch, M. J.; Patel, M.; Lowe, A. M.; Kim, C.; Chu, L. L.; Venkataramaian, N.; Flickinger, S. F.; Kaysen, J.; Belshaw, P. J.; Sussman, M. R.; Cerrina, F. Nucleic Acids Res. 2004, 32 (17), 5011. (15) Gibson, D. G.; Glass, J. I.; Lartigue, C.; Noskov, V. N.; Chuang, R.-Y.; Algire, M. A.; Benders, G. A.; Montague, M. G.; Ma, L.; Moodie, M. M.; Merryman, C.; Vashee, S.; Krishnakumar, R.; Assad-Garcia, N.; Andrews-Pfannkoch, C.; Denisova, E. A.; Young, L.; Qi, Z.-Q.; Segall-Shapiro, T. H.; Calvey, C. H.; Parmar, P. P.; Hutchison, C. A.; Smith, H. O.; Venter, J. C. Science 2010, 329 (5987), 52. (16) Röthlisberger, D.; Khersonsky, O.; Wollacott, A. M.; Jiang, L.; DeChancie, J.; Betker, J.; Gallaher, J. L.; Althoff, E. A.; Zanghellini, A.; Dym, O.; Albeck, S.; Houk, K. N.; Tawfik, D. S.; Baker, D. Nature 2008, 453 (7192), 190.
(17) Medema, M. H.; van Raaphorst, R.; Takano, E.; Breitling, R. Nat. Rev. Microbiol. 2012, 10 (3), 191. (18) Thornton, J. W. Nat. Rev. Genet. 2004, 5 (5), 366. (19) Kazlauskas, R. J.; Bornscheuer, U. T. Nat. Chem. Biol. 2009, 5 (8), 526. (20) Becker, S.; Höbenreich, H.; Vogel, A.; Knorr, J.; Wilhelm, S.; Rosenau, F.; Jaeger, K.-E.; Reetz, M. T.; Kolmar, H. Angew. Chem. Int. Ed. Engl. 2008, 47 (27), 5085. (21) Ragauskas, A. J.; Williams, C. K.; Davison, B. H.; Britovsek, G.; Cairney, J.; Eckert, C. a; Frederick, W. J.; Hallett, J. P.; Leak, D. J.; Liotta, C. L.; Mielenz, J. R.; Murphy, R.; Templer, R.; Tschaplinski, T. Science 2006, 311 (5760), 484. (22) Steen, E. J.; Kang, Y.; Bokinsky, G.; Hu, Z.; Schirmer, A.; McClure, A.; Del Cardayre, S. B.; Keasling, J. D. Nature 2010, 463 (7280), 559. (23) Mohanty, A. K.; Misra, M.; Drzal, L. T. J. Polym. Environ. 2002, 10 (1/2), 19. (24) Wenda, S.; Illner, S.; Mell, A.; Kragl, U. Green Chem. 2011, 13 (11), 3007. (25) Nestl, B. M.; Hammer, S. C.; Nebel, B. a.; Hauer, B. Angew. Chem. Int. Ed. Engl. 2014, 53 (12), 3070. (26) Desai, A. A. Angew. Chem. Int. Ed. Engl. 2011, 50 (9), 1974. (27) Megharaj, M.; Ramakrishnan, B.; Venkateswarlu, K.; Sethunathan, N.; Naidu, R. Environ. Int. 2011, 37 (8), 1362. (28) Verschueren, K. H.; Seljée, F.; Rozeboom, H. J.; Kalk, K. H.; Dijkstra, B. W. Nature 1993, 363 (6431), 693. (29) Prokop, Z.; Monincová, M.; Chaloupková, R.; Klvaňa, M.; Nagata, Y.; Janssen, D. B.; Damborský, J. J. Biol. Chem. 2003, 278 (46), 45094. (30) Schanstra, J. P.; Kingma, J.; Janssen, D. B. J. Biol. Chem. 1996, 271 (25), 14747. (31) Lau, E. Y.; Kahn, K.; Bash, P. A.; Bruice, T. C. Proc. Natl. Acad. Sci. U. S. A. 2000, 97 (18), 9937. (32) Lightstone, F. C.; Zheng, Y.-J.; Bruice, T. C. Bioorg. Chem. 1998, 26 (3), 169. (33) Gao, J.; Devi-Kesavan, L. S.; Garcia-Viloca, M. Theor. Chem. Acc. 2003, 109 (3), 133. (34) Hur, S.; Kahn, K.; Bruice, T. C. Proc. Natl. Acad. Sci. U. S. A. 2003, 100 (5), 2215. 
(35) Soriano, A.; Silla, E.; Tuñón, I.; Ruiz-López, M. F. J. Am. Chem. Soc. 2005, 127 (6), 1946. (36) Shurki, A.; Štrajbl, M.; Villà, J.; Warshel, A. J. Am. Chem. Soc. 2002, 124 (15), 4097. (37) Pavlova, M.; Klvana, M.; Prokop, Z.; Chaloupkova, R.; Banas, P.; Otyepka, M.; Wade, R. C.; Tsuda, M.; Nagata,
Y.; Damborsky, J. Nat. Chem. Biol. 2009, 5 (10), 727. (38) Janssen, D. B. D. Curr. Opin. Chem. Biol. 2004, 8 (2), 150. (39) Nagata, Y.; Miyauchi, K.; Damborsky, J.; Manova, K.; Ansorgova, A.; Takagi, M. Appl. Environ. Microbiol. 1997, 63 (9), 3707. (40) Keuning, S.; Janssen, D. B.; Witholt, B. J. Bacteriol. 1985, 163 (2), 635. (41) Curragh, H.; Flynn, O.; Larkin, M. J.; Stafford, T. M.; Hamilton, J. T. G.; Harper, D. B. Microbiology 1994, 140 (6), 1433. (42) Sato, Y.; Monincová, M.; Chaloupková, R.; Prokop, Z.; Ohtsubo, Y.; Minamisawa, K.; Tsuda, M.; Damborsky, J.; Nagata, Y. Appl. Environ. Microbiol. 2005, 71 (8), 4372. (43) Hasan, K.; Fortova, A.; Koudelakova, T.; Chaloupkova, R.; Ishitsuka, M.; Nagata, Y.; Damborsky, J.; Prokop, Z. Appl. Environ. Microbiol. 2011, 77 (5), 1881. (44) Jesenská, A.; Pavlová, M.; Strouhal, M.; Chaloupková, R.; Těšínská, I.; Monincová, M.; Prokop, Z.; Bartoš, M.; Pavlík, I.; Rychlík, I.; Möbius, P.; Nagata, Y.; Damborský, J. Appl. Environ. Microbiol. 2005, 71 (11), 6736. (45) Hesseler, M.; Bogdanović, X.; Hidalgo, A.; Berenguer, J.; Palm, G. J.; Hinrichs, W.; Bornscheuer, U. T. Appl. Microbiol. Biotechnol. 2011, 91 (4), 1049. (46) Gehret, J. J.; Gu, L.; Geders, T. W.; Brown, W. C.; Gerwick, L.; Gerwick, W. H.; Sherman, D. H.; Smith, J. L. Protein Sci. 2012, 21 (2), 239. (47) Li, A.; Shao, Z. PLoS One 2014, 9 (2). (48) Fortova, A.; Sebestova, E.; Stepankova, V.; Koudelakova, T.; Palkova, L.; Damborsky, J.; Chaloupkova, R. Biochimie 2013, 95 (11), 2091. (49) Koudelakova, T.; Chovancova, E.; Brezovsky, J.; Monincova, M.; Fortova, A.; Jarkovsky, J.; Damborsky, J. Biochem. J. 2011, 435 (2), 345. (50) Erable, B.; Goubet, I.; Lamare, S.; Legoy, M. D.; Maugard, T. Chemosphere 2006, 65 (7), 1146. (51) Dvorak, P.; Bidmanova, S.; Damborsky, J.; Prokop, Z. Environ. Sci. Technol. 2014, 48 (12), 6859. (52) Lal, R.; Pandey, G.; Sharma, P.; Kumari, K.; Malhotra, S.; Pandey, R.; Raina, V.; Kohler, H.-P. E.; Holliger, C.; Jackson, C.; Oakeshott, J. G. 
Microbiol. Mol. Biol. Rev. 2010, 74 (1), 58. (53) Campbell, D. W.; Müller, C.; Reardon, K. F. Biotechnol. Lett. 2006, 28 (12), 883. (54) Bidmanova, S.; Chaloupkova, R.; Damborsky, J.; Prokop, Z. Anal. Bioanal. Chem. 2010, 398 (5), 1891. (55) Prokop, Z.; Opluštil, F.; DeFrank, J.; Damborský, J. Biotechnol. J. 2006, 1 (12), 1370. (56) Prokop, Z.; Sato, Y.; Brezovsky, J.; Mozga, T.; Chaloupkova, R.; Koudelakova, T.; Jerabek, P.; Stepankova, V.; Natsume, R.; Van Leeuwen, J. G. E.; Janssen, D. B.; Florian, J.; Nagata, Y.; Senda, T.; Damborsky, J. Angew. Chem. Int. Ed. Engl. 2010, 49 (35), 6111. (57) Hong, H.; Benink, H. a; Zhang, Y.; Yang, Y.; Uyeda, H. T.; Engle, J. W.; Severin, G. W.; McDougall, M. G.; Barnhart, T. E.; Klaubert, D. H.; Nickles, R. J.; Fan, F.; Cai, W. Am. J. Transl. Res. 2011, 3 (4), 392. (58) Los, G. V.; Encell, L. P.; McDougall, M. G.; Hartzell, D. D.; Karassina, N.; Zimprich, C.; Wood, M. G.; Learish, R.; Ohana, R. F.; Urh, M.; Simpson, D.; Mendez, J.; Zimmerman, K.; Otto, P.; Vidugiris, G.; Zhu, J.; Darzins, A.; Klaubert, D. H.; Bulleit, R. F.; Wood, K. V. ACS Chem. Biol. 2008, 3 (6), 373. (59) So, M. kyung; Yao, H.; Rao, J. Biochem. Biophys. Res. Commun. 2008, 374 (3), 419. (60) Mazzucchelli, S.; Colombo, M.; Verderio, P.; Rozek, E.; Andreata, F.; Galbiati, E.; Tortora, P.; Corsi, F.; Prosperi, D. Angew. Chem. Int. Ed. Engl. 2013, 52 (11), 3121. (61) Ohana, R. F.; Hurst, R.; Vidugiriene, J.; Slater, M. R.; Wood, K. V.; Urh, M. Protein Expr. Purif. 2011, 76 (2), 154. (62) Buckley, D. L.; Raina, K.; Darricarrerre, N.; Hines, J.; Gustafson, J. L.; Smith, I. E.; Miah, A. H.; Harling, J. D.; Crews, C. M. ACS Chem. Biol. 2015, 150612170759005. (63) Neklesa, T. K.; Tae, H. S.; Schneekloth, A. R.; Stulberg, M. J.; Corson, T. W.; Sundberg, T. B.; Raina, K.; Holley, S. A.; Crews, C. M. Nat. Chem. Biol. 2011, 7 (8), 538. (64) Friedman Ohana, R.; Kirkland, T. a.; Woodroofe, C. C.; Levin, S.; Uyeda, H. T.; Otto, P.; Hurst, R.; Robers, M. 
B.; Zimmerman, K.; Encell, L. P.; Wood, K. V. ACS Chem. Biol. 2015. (65) Bosma, T.; Pikkemaat, M. G.; Kingma, J.; Dijk, J.; Janssen, D. B. Biochemistry 2003, 42 (26), 8047. (66) Prudnikova, T.; Mozga, T.; Rezacova, P.; Chaloupkova, R.; Sato, Y.; Nagata, Y.; Brynda, J.; Kuty, M.; Damborsky, J.; Smatanova, I. K. Acta Crystallogr. Sect. F. Struct. Biol. Cryst. Commun. 2009, 65 (Pt 4), 353. (67) Damborský, J.; Rorije, E.; Jesenská, A.; Nagata, Y.; Klopman, G.; Peijnenburg, W. J. Environ. Toxicol. Chem. 2001, 20 (12), 2681. (68) Chovancová, E.; Kosinski, J.; Bujnicki, J. M.; Damborský, J. Proteins 2007, 67 (2), 305.

(69) Damborský, J.; Koca, J. Protein Eng. 1999, 12 (11), 989. (70) Boháč, M.; Nagata, Y.; Prokop, Z.; Prokop, M.; Monincová, M.; Tsuda, M.; Koča, J.; Damborský, J. Biochemistry 2002, 41 (48), 14272. (71) Silberstein, M.; Damborsky, J.; Vajda, S. Biochemistry 2007, 46 (32), 9239. (72) Nagata, Y.; Ohtsubo, Y.; Tsuda, M. Appl. Microbiol. Biotechnol. 2015. (73) Fung, H. K. H.; Gadd, M. S.; Drury, T. A.; Cheung, S.; Guss, J. M.; Coleman, N. V; Matthews, J. M. Mol. Microbiol. 2015, 97 (3), 439. (74) Daniel, L.; Buryska, T.; Prokop, Z.; Damborsky, J.; Brezovsky, J. J. Chem. Inf. Model. 2015, 55 (1), 54. (75) Nardini, M.; Dijkstra, B. W. Curr. Opin. Struct. Biol. 1999, 9 (6), 732. (76) Lenfant, N.; Hotelier, T.; Velluet, E.; Bourne, Y.; Marchot, P.; Chatonnet, A. Nucleic Acids Res. 2013, 41 (D1), 423. (77) Kmunícek, J.; Luengo, S.; Gago, F.; Ortiz, A. R.; Wade, R. C.; Damborský, J. Biochemistry 2001, 40 (30), 11288. (78) Franken, S. M.; Rozeboom, H. J.; Kalk, K. H.; Dijkstra, B. W. EMBO J. 1991, 10 (6), 1297. (79) Newman, J.; Peat, T. S.; Richard, R.; Kan, L.; Swanson, P. E.; Affholter, J. A.; Holmes, I. H.; Schindler, J. F.; Unkefer, C. J.; Terwilliger, T. C. Biochemistry 1999, 38 (49), 16105. (80) Marek, J.; Vevodova, J.; Smatanova, I. K.; Nagata, Y.; Svensson, L. A.; Newman, J.; Takagi, M.; Damborsky, J. Biochemistry 2000, 39 (46), 14082. (81) Mazumdar, P. A.; Hulecki, J. C.; Cherney, M. M.; Garen, C. R.; James, M. N. G. Biochim. Biophys. Acta 2008, 1784 (2), 351. (82) Chaloupkova, R.; Prudnikova, T.; Rezacova, P.; Prokop, Z.; Koudelakova, T.; Daniel, L.; Brezovsky, J.; Ikeda-Ohtsubo, W.; Sato, Y.; Kuty, M.; Nagata, Y.; Kuta Smatanova, I.; Damborsky, J. Acta Crystallogr. D. Biol. Crystallogr. 2014, 70 (Pt 7), 1884. (83) Novak, H. R.; Sayer, C.; Isupov, M. N.; Gotz, D.; Spragg, A. M.; Littlechild, J. A. FEBS Lett. 2014, 588 (9), 1616.
(84) Chrast, L.; Tratsiak, K.; Daniel, L.; Sebestova, E.; Prudnikova, T.; Brezovsky, J.; Kuta-Smatanova, I.; Damborsky, J.; Chaloupkova, R. In preparation 2016. (85) Klvana, M.; Pavlova, M.; Koudelakova, T.; Chaloupkova, R.; Dvorak, P.; Prokop, Z.; Stsiapanava, A.; Kuty, M.; Kuta-Smatanova, I.; Dohnalek, J.; Kulhanek, P.; Wade, R. C.; Damborsky, J. J. Mol. Biol. 2009, 392 (5), 1339. (86) Guazzaroni, M. E.; Corte, N. L. Handb. Hydrocarb. Lipid Microbiol. 2010, 1 (Iii), 1. (87) Chovancova, E.; Pavelka, A.; Benes, P.; Strnad, O.; Brezovsky, J.; Kozlikova, B.; Gora, A.; Sustr, V.; Klvana, M.; Medek, P.; Biedermannova, L.; Sochor, J.; Damborsky, J. PLoS Comput. Biol. 2012, 8 (10), e1002708. (88) Lavecchia, A.; Di Giovanni, C. Curr. Med. Chem. 2013, 20 (23), 2839. (89) Heikamp, K.; Bajorath, J. Chem. Biol. Drug Des. 2013, 81 (1), 33. (90) Lavecchia, A. Drug Discov. Today 2014, 20 (3), 318. (91) Shoichet, B. K. Nature 2004, 432 (7019), 862. (92) Hermann, J. C.; Marti-Arbona, R.; Fedorov, A. A.; Fedorov, E.; Almo, S. C.; Shoichet, B. K.; Raushel, F. M. Nature 2007, 448 (7155), 775. (93) Gerlt, J. A.; Allen, K. N.; Almo, S. C.; Armstrong, R. N.; Babbitt, P. C.; Cronan, J. E.; Dunaway-Mariano, D.; Imker, H. J.; Jacobson, M. P.; Minor, W.; Poulter, C. D.; Raushel, F. M.; Sali, A.; Shoichet, B. K.; Sweedler, J. V. Biochemistry 2011, 50 (46), 9950. (94) Zhao, S.; Kumar, R.; Sakai, A.; Vetting, M. W.; Wood, B. M.; Brown, S.; Bonanno, J. B.; Hillerich, B. S.; Seidel, R. D.; Babbitt, P. C.; Almo, S. C.; Sweedler, J. V.; Gerlt, J. A.; Cronan, J. E.; Jacobson, M. P. Nature 2013, 502 (7473), 698. (95) Ripphausen, P.; Nisius, B.; Bajorath, J. Drug Discov. Today 2011, 16 (9-10), 372. (96) Geppert, H.; Vogt, M.; Bajorath, J. J Chem Inf. Model 2010, 50 (2), 205. (97) Green, D. V. Expert Opin. Drug Discov. 2008, 3 (9), 1011. (98) Cerqueira, N. M. F. S. a.; Gesto, D.; Oliveira, E. F.; Santos-Martins, D.; Brás, N. F.; Sousa, S. F.; Fernandes, P. a.; Ramos, M. J. Arch. Biochem. Biophys. 
2015, 582, 56. (99) Ripphausen, P.; Stumpfe, D.; Bajorath, J. Future Med. Chem. 2012, 4 (5), 603. (100) Berman, H. M.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T. N.; Weissig, H.; Shindyalov, I. N.; Bourne, P. E. Nucleic Acids Res. 2000, 28 (1), 235. (101) Spyrakis, F.; Cavasotto, C. N. Arch. Biochem. Biophys. 2015, 583, 105. (102) Hendlich, M.; Rippmann, F.; Barnickel, G. J. Mol. Graph. Model. 1997, 15 (6), 359. (103) Laskowski, R. a. J. Mol. Graph. 1995, 13 (5), 323.

(104) Le Guilloux, V.; Schmidtke, P.; Tuffery, P. BMC Bioinformatics 2009, 10, 168. (105) Laurie, A. T. R.; Jackson, R. M. Bioinformatics 2005, 21 (9), 1908. (106) Weisel, M.; Proschak, E.; Schneider, G. Chem. Cent. J. 2007, 1, 7. (107) Irwin, J. J.; Sterling, T.; Mysinger, M. M.; Bolstad, E. S.; Coleman, R. G. J. Chem. Inf. Model. 2012, 52 (7), 1757. (108) Bolton, E. E.; Wang, Y.; Thiessen, P. A.; Bryant, S. H. Annu. Rep. Comput. Chem. 2008, 4, 217. (109) Pence, H. E.; Williams, A. J. Chem. Educ. 2010, 87 (11), 1123. (110) Del Rio, A.; Barbosa, A. J. M.; Caporuscio, F.; Mangiatordi, G. F. Mol. Biosyst. 2010, 6 (11), 2122. (111) Lipinski, C. A.; Lombardo, F.; Dominy, B. W.; Feeney, P. J. Adv. Drug Deliv. Rev. 1997, 23 (1-3), 3. (112) Veber, D. F.; Johnson, S. R.; Cheng, H.; Smith, B. R.; Ward, K. W.; Kopple, K. D. J. Med. Chem. 2002, 45, 2615. (113) Böhm, H. J. J. Comput. Aided. Mol. Des. 1992, 6 (6), 593. (114) Morris, G. M.; Goodsell, D. S.; Halliday, R. S.; Huey, R.; Hart, W. E.; Belew, R. K.; Olson, A. J. J. Comput. Chem. 1998, 19 (14), 1639. (115) Ferreira, L.; dos Santos, R.; Oliva, G.; Andricopulo, A. Molecules 2015, 20 (7), 13384. (116) Moitessier, N.; Englebienne, P.; Lee, D.; Lawandi, J.; Corbeil, C. R. Br. J. Pharmacol. 2008, 153 Suppl, S7. (117) Chen, R.; Li, L.; Weng, Z. Proteins 2003, 52 (1), 80. (118) Ewing, T. J.; Makino, S.; Skillman, A. G.; Kuntz, I. D. J. Comput. Aided. Mol. Des. 2001, 15 (5), 411. (119) Sauton, N.; Lagorce, D.; Villoutreix, B. O.; Miteva, M. A. BMC Bioinformatics 2008, 9 (1), 184. (120) Venkatachalam, C. M.; Jiang, X.; Oldfield, T.; Waldman, M. J. Mol. Graph. Model. 2003, 21, 289. (121) Friesner, R. A.; Banks, J. L.; Murphy, R. B.; Halgren, T. A.; Klicic, J. J.; Mainz, D. T.; Repasky, M. P.; Knoll, E. H.; Shelley, M.; Perry, J. K.; Shaw, D. E.; Francis, P.; Shenkin, P. S. J. Med. Chem. 2004, 47 (7), 1739. (122) Morris, G. M.; Huey, R.; Lindstrom, W.; Sanner, M. P.; Belew, R. K.; Goodsell, D. S.; Olson, A. J. J.
Comput. Chem. 2009, 30 (16), 2785. (123) Jones, G.; Willett, P.; Glen, R. C.; Leach, A. R.; Taylor, R. J. Mol. Biol. 1997, 267 (3), 727. (124) Abagyan, R.; Totrov, M.; Kuznetsov, D. J. Comput. Chem. 1994, 15 (5), 488. (125) DesJarlais, R. L.; Sheridan, R. P.; Dixon, J. S.; Kuntz, I. D.; Venkataraghavan, R. J. Med. Chem. 1986, 29 (11), 2149. (126) D.A. Case; T.A. Darden; T.E. Cheatham; C.L. Simmerling; J. Wang; R.E. Duke; R. Luo; R.C. Walker; W. Zhang; K.M. Merz; B.P. Roberts; B. Wang; S. Hayik; A. Roitberg; G. Seabra; I. Kolossváry; K.F. Wong; F. Paesani; J. Vanicek; J. Liu; X. Wu; S.R. Brozell; T. Steinbrecher; H. Gohlke; Q. Cai; X. Ye; J. Wang; M.-J. Hsieh; G. Cui; D.R.Roe; D.H. Mathews; M.G. Seetin; C. Sagui; V. Babin; T. Luchko; S. Gusarov; A. Kovalenko. 2015,. (127) Brooks, B. R.; Bruccoleri, R. E.; Olafson, B. D.; States, D. J.; Swaminathan, S.; Karplus, M. J. Comput. Chem. 1983, 4 (4), 187. (128) Grinter, S. Z.; Zou, X. Molecules 2014, 19 (7), 10150. (129) Velec, H. F. G.; Gohlke, H.; Klebe, G. J. Med. Chem. 2005, 48 (20), 6296. (130) Ishchenko, A. V; Shakhnovich, E. I. J. Med. Chem. 2002, 45 (13), 2770. (131) Fan, H.; Schneidman-Duhovny, D.; Irwin, J. J.; Dong, G.; Shoichet, B. K.; Sali, A. J. Chem. Inf. Model. 2011, 51, 3078. (132) Ballester, P. J.; Mitchell, J. B. O. Bioinformatics 2010, 26 (9), 1169. (133) Huey, R.; Morris, G. M.; Olson, A. J.; Goodsell, D. S. J. Comput. Chem. 2007, 28, 1145. (134) Eldridge, M. D.; Murray, C. W.; Auton, T. R.; Paolini, G. V; Mee, R. P. J. Comput. Aided. Mol. Des. 1997, 11, 425. (135) Sotriffer, C. a; Sanschagrin, P.; Matter, H.; Klebe, G. Proteins 2008, 73 (2), 395. (136) Reulecke, I.; Lange, G.; Albrecht, J.; Klein, R.; Rarey, M. ChemMedChem 2008, 3 (6), 885. (137) Bouvier, G.; Evrard-todeschi, N.; Girault, J. P.; Bertho, G. Bioinformatics 2009, 26 (1), 53. (138) Charifson, P. S.; Corkery, J. J.; Murcko, M. a.; Walters, W. P. J. Med. Chem. 1999, 42 (25), 5100. (139) Paul, N.; Rognan, D. Proteins 2002, 47 (January), 521. 
(140) Graves, A. P.; Shivakumar, D. M.; Boyce, S. E.; Jacobson, M. P.; Case, D. A.; Shoichet, B. K. J. Mol. Biol. 2008, 377 (3), 914. (141) Thompson, D. C.; Humblet, C.; Joseph-McCarthy, D. J. Chem. Inf. Model. 2008, 48 (5), 1081. (142) Therrien, E.; Weill, N.; Tomberg, A.; Corbeil, C. R.; Lee, D.; Moitessier, N. J. Chem. Inf. Model. 2014. (143) Savile, C. K.; Janey, J. M.; Mundorff, E. C.; Moore, J. C.; Tam, S.; Jarvis, W. R.; Colbeck, J. C.; Krebber, A.; Fleitz, F. J.; Brands, J.; Devine, P. N.; Huisman, G. W.; Hughes, G. J. Science 2010, 329 (5989), 305. (144) Steiner, K.; Schwab, H. Comput. Struct. Biotechnol. J. 2012, 2 (3), 1. (145) Morley, K. L.; Kazlauskas, R. J. Trends Biotechnol. 2005, 23 (5), 231.

(146) Laskowski, R. A.; Swindells, M. B. J. Chem. Inf. Model. 2011, 51 (10), 2778. (147) Stierand, K.; Rarey, M. ACS Med. Chem. Lett. 2010, 1 (9), 540. (148) Sebestova, E.; Bendl, J.; Brezovsky, J.; Damborsky, J. In Methods in molecular biology (Clifton, N.J.); 2014; Vol. 1179, pp 291–314. (149) Midelfort, K. S.; Kumar, R.; Han, S.; Karmilowicz, M. J.; McConnell, K.; Gehlhaar, D. K.; Mistry, A.; Chang, J. S.; Anderson, M.; Villalobos, A.; Minshull, J.; Govindarajan, S.; Wong, J. W. Protein Eng. Des. Sel. 2013, 26 (1), 25. (150) Pavelka, A.; Chovancova, E.; Damborsky, J. Nucleic Acids Res. 2009, 37. (151) Neylon, C. Nucleic Acids Res. 2004, 32 (4), 1448. (152) Reetz, M. T.; Bocola, M.; Carballeira, J. D.; Zha, D.; Vogel, A. Angew. Chem. Int. Ed. Engl. 2005, 44 (27), 4192. (153) Li, Y.; Cirino, P. C. Biotechnol. Bioeng. 2014, 111 (7), 1273. (154) Toscano, M. D.; Woycechowsky, K. J.; Hilvert, D. Angew. Chem. Int. Ed. Engl. 2007, 46 (18), 3212. (155) Kiss, G.; Çelebi-Ölçüm, N.; Moretti, R.; Baker, D.; Houk, K. N. Angew. Chem. Int. Ed. Engl. 2013, 52 (22), 5700. (156) Wu, Z. P.; Hilvert, D. J. Am. Chem. Soc. 1989, 111 (12), 4513. (157) Jiang, L.; Althoff, E. A.; Clemente, F. R.; Doyle, L.; Röthlisberger, D.; Zanghellini, A.; Gallaher, J. L.; Betker, J. L.; Tanaka, F.; Barbas, C. F.; Hilvert, D.; Houk, K. N.; Stoddard, B. L.; Baker, D. Science 2008, 319 (5868), 1387. (158) Schmidt, D. M. Z.; Mundorff, E. C.; Dojka, M.; Bermudez, E.; Ness, J. E.; Govindarajan, S.; Babbitt, P. C.; Minshull, J.; Gerlt, J. A. Biochemistry 2003, 42 (28), 8387. (159) Sun, Z.; Lonsdale, R.; Kong, X.-D.; Xu, J.-H.; Zhou, J.; Reetz, M. T. Angew. Chem. Int. Ed. Engl. 2015, 54 (42), 12410. (160) Witkowski, A.; Witkowska, H. E.; Smith, S. J. Biol. Chem. 1994, 269 (1), 379. (161) Millard, C. B.; Lockridge, O.; Broomfield, C. A.
Biochemistry 1995, 34 (49), 15925. (162) Lockridge, O.; Blong, R. M.; Masson, P.; Froment, M. T.; Millard, C. B.; Broomfield, C. A. Biochemistry 1997, 36 (4), 786. (163) Humble, M. S.; Berglund, P. European J. Org. Chem. 2011, 2011 (19), 3391. (164) Vongvilai, P.; Linder, M.; Sakulsombat, M.; Svedendahl Humble, M.; Berglund, P.; Brinck, T.; Ramström, O. Angew. Chem. Int. Ed. Engl. 2011, 50 (29), 6592. (165) Hederos, S.; Broo, K. S.; Jakobsson, E.; Kleywegt, G. J.; Mannervik, B.; Baltzer, L. Proc. Natl. Acad. Sci. U. S. A. 2004, 101 (36), 13163. (166) Leitgeb, S.; Nidetzky, B. Chembiochem 2010, 11 (4), 502. (167) Prokop, Z.; Gora, A.; Brezovsky, J.; Chaloupkova, R.; Stepankova, V.; Damborsky, J. In Protein Engineering Handbook; Lutz, S., Bornscheuer, U. T., Eds.; Wiley-VCH, Weinheim, 2012; pp 421–464. (168) Kingsley, L. J.; Lill, M. A. Proteins 2015, 83 (4), 599. (169) Koshland, D. E. Proc. Natl. Acad. Sci. U. S. A. 1958, 44 (2), 98. (170) Boehr, D. D.; Nussinov, R.; Wright, P. E. Nat. Chem. Biol. 2009, 5 (11), 789. (171) Gora, A.; Brezovsky, J.; Damborsky, J. Chem. Rev. 2013, 113 (8), 5871. (172) Marques, S. M.; Daniel, L.; Buryska, T.; Prokop, Z.; Brezovsky, J.; Damborsky, J. In review 2016. (173) Danielson, P. B. Curr. Drug Metab. 2002, 3 (6), 561. (174) Cojocaru, V.; Winn, P. J.; Wade, R. C. Biochim. Biophys. Acta 2007, 1770 (3), 390. (175) Lüdemann, S. K.; Lounnas, V.; Wade, R. C. J. Mol. Biol. 2000, 303 (5), 797. (176) Petrek, M.; Otyepka, M.; Banás, P.; Kosinová, P.; Koca, J.; Damborský, J. BMC Bioinformatics 2006, 7, 316. (177) Brezovsky, J.; Chovancova, E.; Gora, A.; Pavelka, A.; Biedermannova, L.; Damborsky, J. Biotechnol. Adv. 2012, 31 (1), 38. (178) Sehnal, D.; Vařeková, R. S.; Berka, K.; Pravda, L.; Navrátilová, V.; Banáš, P.; Ionescu, C. M.; Otyepka, M.; Koča, J. J. Cheminform. 2013, 5 (8), 1. (179) Yaffe, E.; Fishelovitch, D.; Wolfson, H. J.; Halperin, D.; Nussinov, R. Proteins 2008, 73 (1), 72.
(180) Kozlikova, B.; Sebestova, E.; Sustr, V.; Brezovsky, J.; Strnad, O.; Daniel, L.; Bednar, D.; Pavelka, A.; Manak, M.; Bezdeka, M.; Benes, P.; Kotry, M.; Gora, A.; Damborsky, J.; Sochor, J. Bioinformatics 2014, 30 (18), 2684. (181) Masood, T. Bin; Sandhya, S.; Chandra, N.; Natarajan, V. BMC Bioinformatics 2015, 16 (1), 1. (182) Kim, J.-K.; Cho, Y.; Lee, M.; Laskowski, R. a.; Ryu, S. E.; Sugihara, K.; Kim, D.-S. Nucleic Acids Res. 2015, 43 (W1), W413. (183) Berka, K.; Hanák, O.; Sehnal, D.; Banáš, P.; Navrátilová, V.; Jaiswal, D.; Ionescu, C. M.; Svobodová Vařeková,
R.; Koča, J.; Otyepka, M. Nucleic Acids Res. 2012, 40 (W1). (184) Chaloupková, R.; Sýkorová, J.; Prokop, Z.; Jesenská, A.; Monincová, M.; Pavlová, M.; Tsuda, M.; Nagata, Y.; Damborský, J. J. Biol. Chem. 2003, 278 (52), 52622. (185) Kotik, M.; Štěpánek, V.; Kyslík, P.; Marešová, H. J. Biotechnol. 2007, 132 (1), 8. (186) van Loo, B.; Spelberg, J. H. L.; Kingma, J.; Sonke, T.; Wubbolts, M. G.; Janssen, D. B. Chem. Biol. 2004, 11 (7), 981. (187) Biedermannová, L.; Prokop, Z.; Gora, A.; Chovancová, E.; Kovács, M.; Damborský, J.; Wade, R. C. J. Biol. Chem. 2012, 287 (34), 29062. (188) Koudelakova, T.; Chaloupkova, R.; Brezovsky, J.; Prokop, Z.; Sebestova, E.; Hesseler, M.; Khabiri, M.; Plevaka, M.; Kulik, D.; Kuta Smatanova, I.; Rezacova, P.; Ettrich, R.; Bornscheuer, U. T.; Damborsky, J. Angew. Chem. Int. Ed. Engl. 2013, 52 (7), 1959. (189) Liskova, V.; Bednar, D.; Prudnikova, T.; Rezacova, P.; Koudelakova, T.; Sebestova, E.; Smatanova, I. K.; Brezovsky, J.; Chaloupkova, R.; Damborsky, J. ChemCatChem 2015, 7 (4), 648. (190) Brezovsky, J.; Babkova, P.; Degtjarik, O.; Fortova, A.; Gora, A.; Rezacova, P.; Dvorak, P.; Kuta-Smatanova, I.; Prokop, Z.; Chaloupkova, R.; Damborsky, J. 2015. (191) Gray, K. A.; Richardson, T. H.; Kretz, K.; Short, J. M.; Bartnek, F.; Knowles, R.; Kan, L.; Swanson, P. E.; Robertson, E. Adv. Synth. Catal. 2001, 343 (6), 607. (192) Yu, X.; Cojocaru, V.; Wade, R. C. Biotechnol. Appl. Biochem. 2013, 60 (1), 134. (193) Urban, P.; Truan, G.; Pompon, D. Biochim. Biophys. Acta 2015, 1850 (4), 696. (194) Stsiapanava, A.; Olsson, U.; Wan, M.; Kleinschmidt, T.; Rutishauser, D.; Zubarev, R. a; Samuelsson, B.; Rinaldo-Matthis, A.; Haeggström, J. Z. Proc. Natl. Acad. Sci. U. S. A. 2014, 111 (11), 4227. (195) Favia, A. D.; Nobeli, I.; Glaser, F.; Thornton, J. M. J. Mol. Biol. 2008, 375 (3), 855. (196) Suresh, P. S.; Kumar, A.; Kumar, R.; Singh, V. P. J. Mol. Graph. Model. 2008, 26 (5), 845. (197) Xu, T.; Zhang, L.; Wang, X.; Wei, D.; Li, T. 
BMC Bioinformatics 2009, 10, 257. (198) Yalcin, E. B.; Stangl, H.; Pichu, S.; Mather, T. N.; King, R. S. ACS Chem. Biol. 2011, 6 (2), 176. (199) Irwin, J. J.; Raushel, F. M.; Shoichet, B. K. Biochemistry 2005, 44 (37), 12316. (200) Neklesa, T. K.; Noblin, D. J.; Kuzin, A.; Lew, S.; Seetharaman, J.; Acton, T. B.; Kornhaber, G.; Xiao, R.; Montelione, G. T.; Tong, L.; Crews, C. M. ACS Chem. Biol. 2013, 8 (10), 2293. (201) D.A. Case; T.A. Darden; T.E. Cheatham; C.L. Simmerling; J. Wang; R.E. Duke; R. Luo; R.C. Walker; W. Zhang; K.M. Merz; B.P. Roberts; B. Wang; S. Hayik; A. Roitberg; G. Seabra; I. Kolossváry; K.F. Wong; F. Paesani; J. Vanicek; J. Liu; X. Wu; S.R. Brozell; T. Steinbrecher; H. Gohlke; Q. Cai; X. Ye; J. Wang; M.-J. Hsieh; G. Cui; D.R.Roe; D.H. Mathews; M.G. Seetin; C. Sagui; V. Babin; T. Luchko; S. Gusarov; A. Kovalenko. Amber 12, University of California, San Francisco; 2010. (202) Frisch, M. J.; Trucks, G. W.; Schlegel, H. B.; Scuseria, G. E.; Robb, M. A.; Cheeseman, J. R.; Scalmani, G.; Barone, V.; Mennucci, B.; Petersson, G. A.; Nakatsuji, H.; Caricato, M.; Li, X.; Hratchian, H. P.; Izmaylov, A. F.; Bloino, J.; Zheng, G.; Sonnenberg, J. L.; Hada, M.; Ehara, M.; Toyota, K.; Fukuda, R.; Hasegawa, J.; Ishida, D. J.; Nakajima, T.; Honda, Y.; Kitao, O.; Nakai, H.; Vreven, T.; Montgomery, J. A., J.; Peralta, J. E.; Ogliaro, F.; Bearpark, M.; Heyd, J. J.; Brothers, E.; Kudin, K. N.; Staroverov, V. N.; Kobayashi, R.; Normand, J.; Raghavachari, K.; Rendell, A.; Burant, J. C.; Iyengar, S. S.; Tomasi, J.; Cossi, M.; Rega, N.; Millam, J. M.; Klene, M.; Knox, J. E.; Cross, J. B.; Bakken, V.; Adamo, C.; Jaramillo, J.; Gomperts, R.; Stratmann, R. E.; Yazyev, O.; Austin, A. J.; Cammi, R.; Pomelli, C.; Ochterski, J. W.; Martin, R. L.; Morokuma, K.; Zakrzewski, V. G.; Voth, G. A.; Salvador, P.; Dannenberg, J. J.; Dapprich, S.; Daniels, A. D.; Farkas, Ö.; Foresman, J. B.; Ortiz, J. V.; Cioslowski, J.; Fox, D. J. Wallingford CT 2009,. (203) Hay, P. J.; Wadt, W. R. J. 
Chem. Phys. 1985, 82 (1), 299. (204) Hay, P. J.; Wadt, W. R. J. Chem. Phys. 1985, 82 (1), 270. (205) Wadt, W. R.; Hay, P. J. J. Chem. Phys. 1985, 82 (1), 284. (206) Bayly, C. I.; Cieplak, P.; Cornell, W. D.; Kollman, P. A. J. Phys. Chem. 1993, 97 (40), 10269. (207) Sanner, M. F. J. Mol. Graph. Model. 1999, 17 (1), 57. (208) Hsin, K.-Y.; Morgan, H. P.; Shave, S. R.; Hinton, A. C.; Taylor, P.; Walkinshaw, M. D. Nucleic Acids Res. 2011, 39 (Database), D1042. (209) Berman, H. M.; Battistuz, T.; Bhat, T. N.; Bluhm, W. F.; Bourne, P. E.; Burkhardt, K.; Feng, Z.; Gilliland, G. L.; Iype, L.; Jain, S.; Fagan, P.; Marvin, J.; Padilla, D.; Ravichandran, V.; Schneider, B.; Thanki, N.; Weissig, H.; Westbrook, J. D.; Zardecki, C. Acta Crystallogr. D Biol. Crystallogr. 2002, 58 (Pt 6 No 1), 899. (210) Gordon, J. C.; Myers, J. B.; Folta, T.; Shoja, V.; Heath, L. S.; Onufriev, A. Nucleic Acids Res. 2005, 33 (Web

Server), W368. (211) Solis, F. J.; Wets, R. J.-B. Math. Oper. Res. 1981, 6 (1), 19. (212) Durrant, J. D.; McCammon, J. A. J. Chem. Inf. Model. 2011, 51, 2897. (213) Jakalian, A.; Bush, B. L.; Jack, D. B.; Bayly, C. I. J. Comput. Chem. 2000, 21 (2), 132. (214) Duan, Y.; Wu, C.; Chowdhury, S.; Lee, M. C.; Xiong, G.; Zhang, W.; Yang, R.; Cieplak, P.; Luo, R.; Lee, T.; Caldwell, J.; Wang, J.; Kollman, P. J. Comput. Chem. 2003, 24 (16), 1999. (215) Lee, M. C.; Duan, Y. Proteins 2004, 55 (3), 620. (216) Wang, J.; Wolf, R. M.; Caldwell, J. W.; Kollman, P. A.; Case, D. A. J. Comput. Chem. 2004, 25 (9), 1157. (217) Onufriev, A.; Bashford, D.; Case, D. A. Proteins 2004, 55 (2), 383. (218) Hou, T.; Wang, J.; Li, Y.; Wang, W. J. Comput. Chem. 2011, 32 (5), 866. (219) Feig, M.; Onufriev, A.; Lee, M. S.; Im, W.; Case, D. A.; Brooks, C. L. J. Comput. Chem. 2004, 25 (2), 265. (220) Weiser, J.; Shenkin, P. S.; Still, W. C. J. Comput. Chem. 1999, 20 (2), 217. (221) Miller, B. R.; McGee, T. D.; Swails, J. M.; Homeyer, N.; Gohlke, H.; Roitberg, A. E. J. Chem. Theory Comput. 2012, 8 (9), 3314. (222) Holloway, P.; Trevors, J. T.; Lee*, H. J. Microbiol. Methods 1998, 32 (1), 31. (223) O’Boyle, N. M.; Banck, M.; James, C. A.; Morley, C.; Vandermeersch, T.; Hutchison, G. R. J. Cheminform. 2011, 3 (1), 33. (224) Schramm, V. L. Curr. Opin. Struct. Biol. 2005, 15 (6), 604. (225) Warshel, A.; Florián, J. Proc. Natl. Acad. Sci. U. S. A. 1998, 95 (11), 5950. (226) Irwin, J. J.; Shoichet, B. K. J. Chem. Inf. Model. 2005, 45 (1), 177. (227) Gaulton, A.; Bellis, L. J.; Bento, A. P.; Chambers, J.; Davies, M.; Hersey, A.; Light, Y.; McGlinchey, S.; Michalovich, D.; Al-Lazikani, B.; Overington, J. P. Nucleic Acids Res. 2012, 40 (D1), D1100. (228) Law, V.; Knox, C.; Djoumbou, Y.; Jewison, T.; Guo, A. C.; Liu, Y.; MacIejewski, A.; Arndt, D.; Wilson, M.; Neveu, V.; Tang, A.; Gabriel, G.; Ly, C.; Adamjee, S.; Dame, Z. T.; Han, B.; Zhou, Y.; Wishart, D. S. Nucleic Acids Res. 
2014, 42 (Database issue), D1091. (229) Delano, W. T. The PyMOL molecular graphics system, version 1.5, Schrödinger, LLC; 2009. (230) Kurumbang, N. P.; Dvorak, P.; Bendl, J.; Brezovsky, J.; Prokop, Z.; Damborsky, J. ACS Synth. Biol. 2014, 3 (3), 172. (231) Koudelakova, T.; Bidmanova, S.; Dvorak, P.; Pavelka, A.; Chaloupkova, R.; Prokop, Z.; Damborsky, J. Biotechnology Journal. January 2013, pp 32–45. (232) Janssen, D. B.; Gerritse, J.; Brackman, J.; Kalk, C.; Jager, D.; Witholt, B. Eur. J. Biochem. 1988, 171 (1-2), 67. (233) Nagata, Y.; Nariya, T.; Ohtomo, R.; Fukuda, M.; Yano, K.; Takagi, M. J. Bacteriol. 1993, 175 (20), 6403. (234) Sallis, P. J.; Armfield, S. J.; Bull, A. T.; Hardman, D. J. J. Gen. Microbiol. 1990, 136 (1), 115. (235) Köhler, R.; Brokamp, A.; Schwarze, R.; Reiting, R. H.; Schmidt, F. R. J. Curr. Microbiol. 1998, 36 (2), 96. (236) Cole, S. T.; Brosch, R.; Parkhill, J.; Garnier, T.; Churcher, C.; Harris, D.; Gordon, S. V; Eiglmeier, K.; Gas, S.; Barry, C. E.; Tekaia, F.; Badcock, K.; Basham, D.; Brown, D.; Chillingworth, T.; Connor, R.; Davies, R.; Devlin, K.; Feltwell, T.; Gentles, S.; Hamlin, N.; Holroyd, S.; Hornsby, T.; Jagels, K.; Krogh, a; McLean, J.; Moule, S.; Murphy, L.; Oliver, K.; Osborne, J.; Quail, M. a; Rajandream, M. a; Rogers, J.; Rutter, S.; Seeger, K.; Skelton, J.; Squares, R.; Squares, S.; Sulston, J. E.; Taylor, K.; Whitehead, S.; Barrell, B. G. Nature 1998, 393 (6685), 537. (237) Hasan, K.; Gora, A.; Brezovsky, J.; Chaloupkova, R.; Moskalikova, H.; Fortova, A.; Nagata, Y.; Damborsky, J.; Prokop, Z. In FEBS Journal; 2013; Vol. 280, pp 3149–3159. (238) Hanwell, M. D.; Curtis, D. E.; Lonie, D. C.; Vandermeerschd, T.; Zurek, E.; Hutchison, G. R. J. Cheminform. 2012, 4 (1), 17. (239) D.A. Case; T.A. Darden; T.E. Cheatham; C.L. Simmerling; J. Wang; R.E. Duke; R. Luo; R.C. Walker; W. Zhang; K.M. Merz; B.P. Roberts; B. Wang; S. Hayik; A. Roitberg; G. Seabra; I. Kolossváry; K.F. Wong; F. Paesani; J. Vanicek; J. Liu; X. Wu; S.R. 
Brozell; T. Steinbrecher; H. Gohlke; Q. Cai; X. Ye; J. Wang; M.-J. Hsieh; G. Cui; D.R. Roe; D.H. Mathews; M.G. Seetin; C. Sagui; V. Babin; T. Luchko; S. Gusarov; A. Kovalenko. Amber 11, University of California, San Francisco; 2010. (240) Trott, O.; Olson, A. J. J. Comput. Chem. 2010, 31 (2), 455. (241) Huson, D. H.; Richter, D. C.; Rausch, C.; Dezulian, T.; Franz, M.; Rupp, R. BMC Bioinformatics 2007, 8 (1), 460. (242) Jakalian, A.; Jack, D. B.; Bayly, C. I. J. Comput. Chem. 2002, 23 (16), 1623. (243) Nagata, Y.; Prokop, Z.; Sato, Y.; Jerabek, P.; Kumar, A.; Ohtsubo, Y.; Tsuda, M.; Damborsky, J. Appl. Environ. Microbiol. 2005, 71 (4), 2183. (244) Iwasaki, I.; Utsumi, S.; Ozawa, T. Bull. Chem. Soc. Jpn. 1952, 226.

(245) Kulakova, A. N.; Larkin, M. J.; Kulakov, L. A. Microbiology 1997, 143, 109. (246) Jesenská, A.; Sedláček, I.; Damborský, J. Appl. Environ. Microbiol. 2000, 66 (1), 219. (247) Ryoo, S. W.; Park, Y. K.; Park, S.-N.; Shim, Y. S.; Liew, H.; Kang, S.; Bai, G.-H. J. Microbiol. 2007, 45 (3), 268. (248) Guan, L.; Yabuki, H.; Okai, M.; Ohtsuka, J.; Tanokura, M. Appl. Microbiol. Biotechnol. 2014, 98 (20), 8573. (249) Babine, R. E.; Bender, S. L. Chem. Rev. 1997, 97 (5), 1359. (250) Livnah, O.; Bayer, E. a; Wilchek, M.; Sussman, J. L. Proc. Natl. Acad. Sci. U. S. A. 1993, 90 (11), 5076. (251) Ollis, D. L.; Cheah, E.; Cygler, M.; Dijkstra, B.; Frolow, F.; Franken, S. M.; Harel, M.; Remington, S. J.; Silman, I.; Schrag, J. Protein Eng. 1992, 5 (3), 197. (252) Holmquist, M. Curr. Protein Pept. Sci. 2000, 1 (2), 209. (253) Verschueren, K. H. G.; Kingma, J.; Rozeboom, H. J.; Kalk, K. H.; Janssen, D. B.; Dijkstra, B. W. Biochemistry 1993, 32 (35), 9031. (254) Pikkemaat, M. G.; Ridder, I. S.; Rozeboom, H. J.; Kalk, K. H.; Dijkstra, B. W.; Janssen, D. B. Biochemistry 1999, 38 (37), 12052. (255) Ridder, I. S.; Rozeboom, H. J.; Dijkstra, B. W. Acta Crystallogr. D Biol. Crystallogr. 1999, 55 (Pt 7), 1273. (256) Oakley, A. J.; Prokop, Z.; Boháč, M.; Kmuníček, J.; Jedlička, T.; Monincová, M.; Kutá-Smatanová, I.; Nagata, Y.; Damborský, J.; Wilce, M. C. J. Biochemistry 2002, 41 (15), 4847. (257) Oakley, A. J.; Klvana, M.; Otyepka, M.; Nagata, Y.; Wilce, M. C. J.; Damborský, J. Biochemistry 2004, 43 (4), 870. (258) Streltsov, V. A.; Prokop, Z.; Damborský, J.; Nagata, Y.; Oakley, A.; Wilce, M. C. J. Biochemistry 2003, 42 (34), 10104. (259) Monincová, M.; Prokop, Z. Z.; Vévodová, J.; Nagata, Y.; Damborský, J.; Damborsky, J. Appl. Environ. Microbiol. 2007, 73 (6), 2005. (260) Krooshof, G. H.; Ridder, I. S.; Tepper, A. W. J. W.; Vos, G. J.; Rozeboom, H. J.; Kalk, K. H.; Dijkstra, B. W.; Janssen, D. B. Biochemistry 1998, 37 (43), 15013. (261) Schindler, J. F.; Naranjo, P. A.; Honaberger, D. 
A.; Chang, C. H.; Brainard, J. R.; Vanderberg, L. A.; Unkefer, C. J. Biochemistry 1999, 38 (18), 5772. (262) Jesenská, A.; Bartoš, M.; Czerneková, V.; Rychlík, I.; Pavlík, I.; Damborský, J. Appl. Environ. Microbiol. 2002, 68 (8), 3724. (263) Chan, W. Y.; Wong, M.; Guthrie, J.; Savchenko, A. V.; Yakunin, A. F.; Pai, E. F.; Edwards, E. A. Microb. Biotechnol. 2010, 3 (1), 107. (264) Drienovska, I.; Chovancova, E.; Koudelakova, T.; Damborsky, J.; Chaloupkova, R. Appl. Environ. Microbiol. 2012, 78 (14), 4995. (265) Otyepka, M.; Damborský, J. Protein Sci. 2002, 11 (5), 1206. (266) Kmuníček, J.; Hynková, K.; Jedlička, T.; Nagata, Y.; Negri, A.; Gago, F.; Wade, R. C.; Damborský, J. Biochemistry 2005, 44 (9), 3390. (267) Wold, S.; Esbensen, K.; Geladi, P. Chemom. Intell. Lab. Syst. 1987, 2 (1–3), 37. (268) Mueller, U.; Darowski, N.; Fuchs, M. R.; Förster, R.; Hellmig, M.; Paithankar, K. S.; Pühringer, S.; Steffien, M.; Zocher, G.; Weiss, M. S. J. Synchrotron Radiat. 2012, 19 (Pt 3), 442. (269) Minor, W.; Cymborowski, M.; Otwinowski, Z.; Chruszcz, M. Acta Crystallogr. D Biol. Crystallogr. 2006, 62 (Pt 8), 859. (270) Vagin, A.; Teplyakov, A. J. Appl. Crystallogr. 1997, 30 (6), 1022. (271) Murshudov, G. N.; Vagin, A. A.; Dodson, E. J. Acta Crystallogr. D Biol. Crystallogr. 1997, 53 (Pt 3), 240. (272) Collaborative Computational Project, Number 4. Acta Crystallogr. D Biol. Crystallogr. 1994, 50 (Pt 5), 760. (273) Emsley, P.; Cowtan, K. Acta Crystallogr. D Biol. Crystallogr. 2004, 60 (Pt 12 Pt 1), 2126. (274) Winn, M. D.; Isupov, M. N.; Murshudov, G. N. Acta Crystallogr. D Biol. Crystallogr. 2001, 57 (Pt 1), 122. (275) Lovell, S. C.; Davis, I. W.; Arendall, W. B.; De Bakker, P. I. W.; Word, J. M.; Prisant, M. G.; Richardson, J. S.; Richardson, D. C. Proteins 2003, 50 (3), 437. (276) Krissinel, E.; Henrick, K. In Lecture Notes in Computer Science; Berthold, M. R., Glen, R. 
C., Diederichs, K., Kohlbacher, O., Fischer, I., Eds.; Lecture Notes in Computer Science; Springer Berlin Heidelberg, 2005; Vol. 3695 LNBI, pp 163–174. (277) Krissinel, E.; Henrick, K. J. Mol. Biol. 2007, 372 (3), 774. (278) Jones, S.; Thornton, J. M. Proc. Natl. Acad. Sci. U. S. A. 1996, 93 (1), 13. (279) Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L. J. Chem. Phys. 1983, 79 (2), 926.

(280) Darden, T.; York, D.; Pedersen, L. J. Chem. Phys. 1993, 98 (12), 10089. (281) Ryckaert, J.-P.; Ciccotti, G.; Berendsen, H. J. . J. Comput. Phys. 1977, 23 (3), 327. (282) Humphrey, W.; Dalke, A.; Schulten, K. J. Mol. Graph. 1996, 14 (1), 33. (283) Hornak, V.; Abel, R.; Okur, A.; Strockbine, B.; Roitberg, A.; Simmerling, C. Proteins 2006, 65 (3), 712. (284) Still, W. C.; Tempczyk, A.; Hawley, R. C.; Hendrickson, T. J. Am. Chem. Soc. 1990, 112 (16), 6127. (285) Mongan, J.; Case, D. A.; McCammon, J. A. J. Comput. Chem. 2004, 25 (16), 2038. (286) Fasman, G. D. Circular Dichroism and the Conformational Analysis of Biomolecules; Springer US, 1996. (287) Damborsky, J.; Chaloupkova, R.; Pavlova, M.; Chovancova, E.; Brezovsky, J. Timmis, K. N., Ed.; Springer Berlin Heidelberg, 2010; pp 1081–1098. (288) Sato, Y.; Natsume, R.; Tsuda, M.; Damborsky, J.; Nagata, Y.; Senda, T. Acta Crystallogr. Sect. F. Struct. Biol. Cryst. Commun. 2007, 63 (Pt 4), 294. (289) Kohn, W. D.; Kay, C. M.; Hodges, R. S. J. Mol. Biol. 1997, 267 (4), 1039. (290) Zhang, Y.; Cremer, P. Curr. Opin. Chem. Biol. 2006, 10 (6), 658. (291) Tadeo, X.; Pons, M.; Millet, O. Biochemistry 2007, 46 (3), 917. (292) Ogawa, H.; Qiu, Y.; Philo, J. S.; Arakawa, T.; Ogata, C. M.; Misono, K. S. Protein Sci. 2010, 19 (3), 544. (293) Zhou, P.; Tian, F.; Zou, J.; Ren, Y. J. Phys. Chem. B 2010, 114 (47), 15673. (294) Blasiak, L. C.; Drennan, C. L. Acc. Chem. Res. 2009, 42 (1), 147. (295) Fiedler, T.; Davey, C.; Fenna, R. J. Biol. Chem. 2000, 275 (16), 11964. (296) Blair-Johnson, M.; Fiedler, T.; Fenna, R. Biochemistry 2001, 40 (46), 13990. (297) Natesh, R.; Schwager, S. L. U.; Sturrock, E. D.; Acharya, K. R. Nature 2003, 421 (6922), 551. (298) Murray, J. W.; Maghlaoui, K.; Kargul, J.; Ishida, N.; Lai, T.-L.; Rutherford, A. W.; Sugiura, M.; Boussac, A.; Barber, J. Energy Environ. Sci. 2008, 1 (1), 161. (299) Kawakami, K.; Umena, Y.; Kamiya, N.; Shen, J.-R. Proc. Natl. Acad. Sci. U. S. A. 2009, 106 (21), 8567. 
(300) Furtmüller, P. G.; Zederbauer, M.; Jantschko, W.; Helm, J.; Bogner, M.; Jakopitsch, C.; Obinger, C. Arch. Biochem. Biophys. 2006, 445 (2), 199. (301) Proteasa, G.; Tahboub, Y. R.; Galijasevic, S.; Raushel, F. M.; Abu-Soud, H. M. Biochemistry 2007, 46 (2), 398. (302) Tzakos, A. G.; Galanis, A. S.; Spyroulias, G. A.; Cordopatis, P.; Manessi-Zoupa, E.; Gerothanassis, I. P. Protein Eng. 2003, 16 (12), 993. (303) Moiseeva, N. A.; Binevski, P. V; Baskin, I. I.; Palyulin, V. A.; Kost, O. A. Biochemistry 2005, 70 (10), 1167. (304) Pokhrel, R.; McConnell, I. L.; Brudvig, G. W. Biochemistry 2011, 50 (14), 2725. (305) Guskov, A.; Kern, J.; Gabdulkhakov, A.; Broser, M.; Zouni, A.; Saenger, W. Nat. Struct. Mol. Biol. 2009, 16 (3), 334. (306) Monincova, M. Functional characterization and classification of haloalkane dehalogenases. Ph.D. Thesis, 2007. (307) Damborský, J.; Petřek, M.; Banáš, P.; Otyepka, M. Biotechnol. J. 2007, 2 (1), 62. (308) Karplus, M.; McCammon, J. A. Nat. Struct. Biol. 2002, 9 (9), 646. (309) Yaffe, E.; Fishelovitch, D.; Wolfson, H. J.; Halperin, D.; Nussinov, R. Nucleic Acids Res. 2008, 36 (Web Server issue). (310) Pellegrini-Calace, M.; Maiwald, T.; Thornton, J. M. PLoS Comput. Biol. 2009, 5 (7). (311) Shindyalov, I. N.; Bourne, P. E. Protein Eng. 1998, 11 (9), 739. (312) Fetzner, S. Appl. Microbiol. Biotechnol. 1998, 50 (6), 633. (313) Westerbeek, A.; Szymański, W.; Feringa, B. L.; Janssen, D. B. ACS Catal. 2011, 1 (12), 1654. (314) Jesenská, A.; Monincová, M.; Koudeláková, T.; Hasan, K.; Chaloupková, R.; Prokop, Z.; Geerlof, A.; Damborský, J. Appl. Environ. Microbiol. 2009, 75 (15), 5157. (315) Gomes, J.; Steiner, W. Food Technol. Biotechnol. 2004, 42 (4), 223. (316) Hough, D. W.; Danson, M. J. Curr. Opin. Chem. Biol. 1999, 3 (1), 39. (317) Pace, C. N.; Fu, H.; Fryar, K. L.; Landua, J.; Trevino, S. R.; Shirley, B. A.; Hendricks, M. M.; Iimura, S.; Gajiwala, K.; Scholtz, J. M.; Grimsley, G. R. J. Mol. Biol. 2011, 408 (3), 514. (318) Feller, G. J. 
Phys. Condens. Matter 2010, 22 (32), 323101. (319) Kumar, S.; Nussinov, R. Cell. Mol. Life Sci. 2001, 58 (9), 1216. (320) Lam, S. Y.; Yeung, R. C. Y.; Yu, T. H.; Sze, K. H.; Wong, K. B. PLoS Biol. 2011, 9 (3). (321) Ding, Y.; Cai, Y.; Han, Y.; Zhao, B. Extremophiles 2012, 16 (1), 67. (322) Yamanaka, Y.; Kazuoka, T.; Yoshida, M.; Yamanaka, K.; Oikawa, T.; Soda, K. Biochem. Biophys. Res. Commun. 2002, 298 (5), 632.

(323) Kazuoka, T.; Masuda, Y.; Oikawa, T.; Soda, K. J. Biochem. 2003, 133 (1), 51. (324) Kazuoka, T.; Oikawa, T.; Muraoka, I.; Kuroda, S.; Soda, K. Extremophiles 2007, 11 (2), 257. (325) Liu, S.; Duan, X.; Lu, X.; Gao, P. Chinese Sci. Bull. 2006, 51 (2), 191. (326) Fedøy, A. E.; Yang, N.; Martinez, A.; Leiros, H. K. S.; Steen, I. H. J. Mol. Biol. 2007, 372 (1), 130. (327) Novak, H. R.; Sayer, C.; Panning, J.; Littlechild, J. A. Mar. Biotechnol. 2013, 15 (6), 695. (328) Ward, B. B.; Priscu, J. C. Hydrobiologia 1997, 347, 57. (329) Chaloupkova, R.; Prokop, Z.; Sato, Y.; Nagata, Y.; Damborsky, J. FEBS J. 2011, 278 (15), 2728. (330) Tratsiak, K.; Degtjarik, O.; Drienovska, I.; Chrast, L.; Rezacova, P.; Kuty, M.; Chaloupkova, R.; Damborsky, J.; Kuta Smatanova, I. Acta Crystallogr. Sect. F Struct. Biol. Cryst. Commun. 2013, 69 (6), 683. (331) Oikawa, T.; Kazuoka, T.; Soda, K. J. Mol. Catal. B Enzym. 2003, 23 (2-6), 65. (332) Sykora, J.; Brezovsky, J.; Koudelakova, T.; Lahoda, M.; Fortova, A.; Chernovets, T.; Chaloupkova, R.; Stepankova, V.; Prokop, Z.; Smatanova, I. K.; Hof, M.; Damborsky, J. Nat. Chem. Biol. 2014, 10, 428. (333) Lennox, E. S. Virology 1955, 1 (2), 190. (334) Bradford, M. M. Anal. Biochem. 1976, 72, 248. (335) Kabsch, W. J. Appl. Crystallogr. 1993, 26 (pt 6), 795. (336) Kabsch, W. Acta Crystallogr. D Biol. Crystallogr. 2010, 66 (2), 125. (337) Joosten, R. P.; Long, F.; Murshudov, G. N.; Perrakis, A. IUCrJ 2014, 1 (4), 213. (338) Chen, V. B.; Arendall, W. B.; Headd, J. J.; Keedy, D. A.; Immormino, R. M.; Kapral, G. J.; Murray, L. W.; Richardson, J. S.; Richardson, D. C. Acta Crystallogr. D Biol. Crystallogr. 2010, 66 (1), 12. (339) Vaguine, A. a; Richelle, J.; Wodak, S. J. Acta Crystallogr. Sect. D Biol. Crystallogr. 1999, 55 (1), 191. (340) Roe, D. R.; Cheatham, T. E. J. Chem. Theory Comput. 2013, 9 (7), 3084. (341) Le Grand, S.; Götz, A. W.; Walker, R. C. Comput. Phys. Commun. 2013, 184 (2), 374. (342) Götz, A. W.; Williamson, M. 
J.; Xu, D.; Poole, D.; Le Grand, S.; Walker, R. C. J. Chem. Theory Comput. 2012, 8 (5), 1542. (343) Joung, S.; Cheatham, T. E. J. Phys. Chem. B 2009, 113 (40), 13279. (344) Joung, I. S.; Cheatham, T. E. J. Phys. Chem. B 2008, 112 (30), 9020. (345) Kellogg, E. H.; Leaver-Fay, A.; Baker, D. Proteins 2011, 79 (3), 830. (346) Sanchis, J.; Fernández, L.; Carballeira, J. D.; Drone, J.; Gumulya, Y.; Höbenreich, H.; Kahakeaw, D.; Kille, S.; Lohmer, R.; Peyralans, J. J.-P.; Podtetenieff, J.; Prasad, S.; Soni, P.; Taglieber, A.; Wu, S.; Zilly, F. E.; Reetz, M. T. Appl. Microbiol. Biotechnol. 2008, 81 (2), 387. (347) Teilum, K.; Olsen, J. G.; Kragelund, B. B. Biochim. Biophys. Acta - Proteins Proteomics 2011, 1814 (8), 969. (348) Craveur, P.; Joseph, A. P.; Esque, J.; Narwani, T. J.; Shinada, N.; Goguet, M.; Leonard, S.; Poulain, P.; Bertrand, O.; Faure, G.; Rebehmed, J.; Ghozlane, A.; Swapna, L. S.; Barnoud, J.; Jallu, V.; Cerny, J.; Schneider, B.; Etchebest, C.; Srinivasan, N.; Gelly, J.-C.; de Brevern, A. G. Front. Mol. Biosci. 2015, 2 (May), 20. (349) Sinko, W.; Lindert, S.; Mccammon, J. A. Chem. Biol. Drug Des. 2013, 81 (1), 41. (350) Feixas, F.; Lindert, S.; Sinko, W.; McCammon, J. A. Biophys. Chem. 2014, 186, 31. (351) Sanson, B.; Colletier, J.-P.; Xu, Y.; Lang, P. T.; Jiang, H.; Silman, I.; Sussman, J. L.; Weik, M. Protein Sci. 2011, 20 (7), 1114. (352) Oprea, T. I.; Hummer, G.; Garcia, A. E. Proc. Natl. Acad. Sci. USA 1997, 94 (6), 2133. (353) Kingsley, L. J.; Lill, M. A. J. Comput. Chem. 2014, 35 (24), 1748. (354) Chen, L.; Lyubimov, A. Y.; Brammer, L.; Vrielink, A.; Sampson, N. S. Biochemistry 2008, 47 (19), 5368. (355) Zamocky, M.; Herzog, C.; Nykyri, L. M.; Koller, F. FEBS Lett. 1995, 367 (3), 241. (356) Smith, G.; Modi, S.; Pillai, I.; Lian, L. Y.; Sutcliffe, M. J.; Pritchard, M. P.; Friedberg, T.; Roberts, G. C.; Wolf, C. R. Biochem J. 1998, 331, 783. (357) Carmichael, A. B.; Wong, L. L. Eur. J. Biochem. 2001, 268 (10), 3117. 
(358) Floquet, N.; Mouilleron, S.; Daher, R.; Maigret, B.; Badet, B.; Badet-Denisot, M. A. FEBS Lett. 2007, 581 (16), 2981. (359) Xu, Z.; Metsä-Ketelä, M.; Hertweck, C. J. Biotechnol. 2009, 140 (1-2), 107. (360) Labonte, P.; Axelrod, V.; Agarwal, A.; Aulabaugh, A.; Amin, A.; Mak, P. J Biol Chem 2002, 26, 26. (361) Lafaquière, V.; Barbe, S.; Puech-Guenot, S.; Guieysse, D.; Cortés, J.; Monsan, P.; Siméon, T.; André, I.; Remaud-Siméon, M. ChemBioChem 2009, 10 (17), 2760. (362) Boublik, Y.; Saint-Aguet, P.; Lougarre, A.; Arnaud, M.; Villatte, F.; Estrada-Mondaca, S.; Fournier, D. Protein Eng. 2002, 15 (1), 43. (363) Ruvinov, S. B.; Yang, X. J.; Parris, K. D.; Banik, U.; Ahmed, S. A.; Miles, E. W.; Sackett, D. L. J. Biol. Chem. 1995,

270 (11), 6357. (364) Zhang, L.; Liu, W.; Hu, T.; Du, L.; Luo, C.; Chen, K.; Shen, X.; Jiang, H. J. Biol. Chem. 2008, 283, 5370. (365) Dang, T.; Prestwich, G. D. Chem. Biol. 2000, 7 (8), 643. (366) Meyer, M. E.; Gutierrez, J. A.; Raushel, F. M.; Richards, N. G. J. Biochemistry 2010, 49 (43), 9391. (367) Kim, J.; Raushel, F. M. Arch. Biochem. Biophys. 2004, 425 (1), 33. (368) Matsuzaki, R.; Tanizawa, K. Biochemistry 1998, 37 (40), 13947. (369) Guo, R. T.; Kuo, C. J.; Ko, T. P.; Chou, C. C.; Liang, P. H.; Wang, A. H. J. Biochemistry 2004, 43 (24), 7678. (370) Schmitt, J.; Brocca, S.; Schmid, R. D.; Pleiss, J. Protein Eng. 2002, 15 (7), 595. (371) Reetz, M. T.; Wang, L. W.; Bocola, M. Angew. Chem. Int. Ed. Engl. 2006, 45 (8), 1236. (372) Moore, S. A.; Baker, H. M.; Blythe, T. J.; Kitson, K. E.; Kitson, T. M.; Baker, E. N. Structure 1998, 6 (12), 1541. (373) Kalko, S. G.; Gelpí, J. L.; Fita, I.; Orozco, M. J. Am. Chem. Soc. 2001, 123 (39), 9665. (374) Piubelli, L.; Pedotti, M.; Molla, G.; Feindler-Boeckh, S.; Ghisla, S.; Pilone, M. S.; Pollegioni, L. J. Biol. Chem. 2008, 283 (36), 24738. (375) Xin, Y.; Gadda, G.; Hamelberg, D. Biochemistry 2009, 48 (40), 9599. (376) Johnson, B. J.; Cohen, J.; Welford, R. W.; Pearson, A. R.; Schulten, K.; Klinman, J. P.; Wilmot, C. M. J. Biol. Chem. 2007, 282 (24), 17767. (377) Furse, K. E.; Pratt, D. A.; Schneider, C.; Brash, A. R.; Porter, N. A.; Lybrand, T. P. Biochemistry 2006, 45 (10), 3206. (378) Nandakishore, R.; Yalavarthi, P. R.; Kiran, Y. R.; Rajapranathi, M. Curr. Drug Discov. Technol. 2014, 11 (2), 127. (379) Gurpinar, E.; Grizzle, W. E.; Piazza, G. A. Clin. Cancer Res. An Off. J. Am. Assoc. Cancer Res. 2014, 20 (5), 1104. (380) Yap, T. A.; Carden, C. P.; Attard, G.; de Bono, J. S. Curr. Opin. Pharmacol. 2008, 8 (4), 449. (381) Zobniw, C. M.; Causebrook, A.; Fong, M. K. Res. Reports Urol. 2014, 6, 97. (382) Yu, X.; Nandekar, P.; Mustafa, G.; Cojocaru, V.; Lepesheva, G. I.; Wade, R. C. Biochim. Biophys. Acta - Gen. Subj. 
2016, 1860 (1), 67. (383) Osborne, M. J.; Schnell, J.; Benkovic, S. J.; Dyson, H. J.; Wright, P. E. Biochemistry 2001, 40 (33), 9846. (384) Bhabha, G.; Lee, J.; Ekiert, D. C.; Gam, J.; Wilson, I. A.; Dyson, H. J.; Benkovic, S. J.; Wright, P. E. Science 2011, 332 (6026), 234. (385) Liu, S.; Neidhardt, E. A.; Grossman, T. H.; Ocain, T.; Clardy, J. Structure 2000, 8 (1), 25. (386) Davies, M.; Heikkilä, T.; McConkey, G. A.; Fishwick, C. W. G.; Parsons, M. R.; Johnson, A. P. J. Med. Chem. 2009, 52 (9), 2683. (387) da Silva Lima, C. H.; de Alencastro, R. B.; Kaiser, C. R.; de Souza, M. V. N.; Rodrigues, C. R.; Albuquerque, M. G. Int. J. Mol. Sci. 2015, 16 (10), 23695. (388) Saam, J.; Ivanov, I.; Walther, M.; Holzhütter, H.-G.; Kuhn, H. Proc. Natl. Acad. Sci. U. S. A. 2007, 104 (33), 13319. (389) Milczek, E. M.; Binda, C.; Rovida, S.; Mattevi, A.; Edmondson, D. E. FEBS J. 2011, 278 (24), 4860. (390) McDonald, G. R.; Olivieri, A.; Ramsay, R. R.; Holt, A. Pharmacol. Res. 2010, 62 (6), 475. (391) Morris, S. M.; Billiar, T. R. Am. J. Physiol. 1994, 266 (6 Pt 1), E829. (392) Whited, C. A.; Warren, J. J.; Lavoie, K. D.; Weinert, E. E.; Agapie, T.; Winkler, J. R.; Gray, H. B. J. Am. Chem. Soc. 2012, 134 (1), 27. (393) Binda, C.; Bossi, R. T.; Wakatsuki, S.; Arzt, S.; Coda, A.; Curti, B.; Vanoni, M. A.; Mattevi, A. Struct. (London, Engl. 1993) 2000, 8 (12), 1299. (394) Singh, H.; Arentson, B. W.; Becker, D. F.; Tanner, J. J. Proc. Natl. Acad. Sci. U. S. A. 2014, 111 (9), 3389. (395) Kuwabara, Y.; Nishino, T.; Okamoto, K.; Matsumura, T.; Eger, B. T.; Pai, E. F.; Nishino, T. Proc. Natl. Acad. Sci. U. S. A. 2003, 100 (14), 8170. (396) Qiu, X.; Choudhry, A. E.; Janson, C. A.; Grooms, M.; Daines, R. A.; Lonsdale, J. T.; Khandekar, S. S. Protein Sci. 2005, 14 (8), 2087. (397) Cookson, T. V. M.; Evans, G. L.; Castell, A.; Baker, E. N.; Lott, J. S.; Parker, E. J. Biochemistry 2015, 54 (39), 6082. (398) Thornburg, J. M.; Nelson, K. K.; Clem, B. F.; Lane, A. 
N.; Arumugam, S.; Simmons, A.; Eaton, J. W.; Telang, S.; Chesney, J. Breast Cancer Res. 2008, 10 (5), R84. (399) Wrenger, C.; Müller, I. B.; Schifferdecker, A. J.; Jain, R.; Jordanova, R.; Groves, M. R. J. Mol. Biol. 2011, 405 (4), 956. (400) Ehler, A.; Benz, J.; Schlatter, D.; Rudolph, M. G. Acta Crystallogr. D Biol. Crystallogr. 2014, 70 (Pt 8), 2163.

(401) Doublié, S.; Sawaya, M. R.; Ellenberger, T. Structure 1999, 7 (2), R31. (402) Kensch, O.; Restle, T.; Wöhrl, B. M.; Goody, R. S.; Steinhoff, H.-J. J. Mol. Biol. 2000, 301 (4), 1029. (403) Labonté, P.; Axelrod, V.; Agarwal, A.; Aulabaugh, A.; Amin, A.; Mak, P. J. Biol. Chem. 2002, 277 (41), 38838. (404) Santoso, Y.; Joyce, C. M.; Potapova, O.; Reste, L. Le; Hohlbein, J.; Torella, J. P.; Grindley, N. D. F.; Kapanidis, A. N. Proc. Natl. Acad. Sci. U. S. A. 2010, 107 (2), 715. (405) Johansson, P.; Wiltschi, B.; Kumari, P.; Kessler, B.; Vonrhein, C.; Vonck, J.; Oesterhelt, D.; Grininger, M. Proc. Natl. Acad. Sci. U. S. A. 2008, 105 (35), 12803. (406) Teplyakov, A.; Obmolova, G.; Badet, B.; Badet-Denisot, M. A. J. Mol. Biol. 2001, 313 (5), 1093. (407) Krahn, J. M.; Kim, J. H.; Burns, M. R.; Parry, R. J.; Zalkin, H.; Smith, J. L. Biochemistry 1997, 36 (37), 11061. (408) Raushel, F. M.; Thoden, J. B.; Holden, H. M. Acc. Chem. Res. 2003, 36 (7), 539. (409) Rossjohn, J.; McKinstry, W. J.; Oakley, A. J.; Verger, D.; Flanagan, J.; Chelvanayagam, G.; Tan, K. L.; Board, P. G.; Parker, M. W. Struct. (London, Engl. 1993) 1998, 6 (3), 309. (410) Barford, D.; Hu, S.-H.; Johnson, L. N. J. Mol. Biol. 1991, 218 (1), 233. (411) Amaro, R. E.; Myers, R. S.; Davisson, V. J.; Luthey-Schulten, Z. a. Biophys. J. 2005, 89 (1), 475. (412) Voss, N. R.; Gerstein, M.; Steitz, T. A.; Moore, P. B. J. Mol. Biol. 2006, 360 (4), 893. (413) Hansen, J. L.; Moore, P. B.; Steitz, T. A. J. Mol. Biol. 2003, 330 (5), 1061. (414) Cross, P. J.; Dobson, R. C. J.; Patchett, M. L.; Parker, E. J. J. Biol. Chem. 2011, 286 (12), 10216. (415) Wang, L. K.; Lima, C. D.; Shuman, S. EMBO J. 2002, 21 (14), 3873. (416) Taylor, S. S.; Kornev, A. P. Trends Biochem. Sci. 2011, 36 (2), 65. (417) Furman, R. R.; Hoelzer, D. Semin. Oncol. 2007, 34 (6 Suppl 5), S29. (418) Cook, I.; Wang, T.; Almo, S. C.; Kim, J.; Falany, C. N.; Leyh, T. S. Biochemistry 2013, 52 (2), 415. (419) Stroud, R. M.; Finer-Moore, J. S. 
Biochemistry 2003, 42 (2), 239. (420) Pinkas, D. M.; Strop, P.; Brunger, A. T.; Khosla, C. PLoS Biol 2007, 5 (12), e327. (421) Lovering, A. L.; De Castro, L.; Lim, D.; Strynadka, N. C. J. Protein Sci. 2006, 15 (7), 1701. (422) Borra, P. S.; Samuelsen, Ø.; Spencer, J.; Walsh, T. R.; Lorentzen, M. S.; Leiros, H.-K. S. Antimicrob. Agents Chemother. 2013, 57 (2), 848. (423) Chakraborty, S.; Kumar, S.; Basu, S. Neurochem. Int. 2011, 58 (8), 914. (424) Tolia, A.; Horré, K.; Strooper, B. De. J. Biol. Chem. 2008, 283 (28), 19793. (425) Rydberg, E. H.; Brumshtein, B.; Greenblatt, H. M.; Wong, D. M.; Shaya, D.; Williams, L. D.; Carlier, P. R.; Pang, Y.-P.; Silman, I.; Sussman, J. L. J. Med. Chem. 2006, 49 (18), 5491. (426) Xu, Y.; Colletier, J.-P.; Weik, M.; Jiang, H.; Moult, J.; Silman, I.; Sussman, J. L. Biophys. J. 2008, 95 (5), 2500. (427) Wang, J.; Song, J. J.; Franklin, M. C.; Kamtekar, S.; Im, Y. J.; Rho, S. H.; Seong, I. S.; Lee, C. S.; Chung, C. H.; Eom, S. H. Struct. (London, Engl. 1993) 2001, 9 (2), 177. (428) Wang, J.; Song, J. J.; Seong, I. S.; Franklin, M. C.; Kamtekar, S.; Eom, S. H.; Chung, C. H. Struct. (London, Engl. 1993) 2001, 9 (11), 1107. (429) Norman, D. D.; Ibezim, A.; Scott, W. E.; White, S.; Parrill, A. L.; Baker, D. L. Bioorg. Med. Chem. 2013, 21 (17), 5548. (430) Liu, K.; Ologbenla, A.; Houry, W. A. Crit. Rev. Biochem. Mol. Biol. 2014, 49 (5), 400. (431) Durrant, J. D.; Keränen, H.; Wilson, B. A.; McCammon, J. A. PLoS Negl. Trop. Dis. 2010, 4 (5). (432) Hoelz, L. V. B.; Leal, V. F.; Rodrigues, C. R.; Pascutti, P. G.; Albuquerque, M. G.; Muri, E. M. F.; Dias, L. R. S. J. Biomol. Struct. Dyn. 2015, 1. (433) Whittington, D. a; Rusche, K. M.; Shin, H.; Fierke, C. a; Christianson, D. W. Proc. Natl. Acad. Sci. U. S. A. 2003, 100, 8146. (434) Coggins, B. E.; Li, X.; McClerren, A. L.; Hindsgaul, O.; Raetz, C. R. H.; Zhou, P. Nat. Struct. Biol. 2003, 10 (8), 645. (435) Nardini, M.; Ridder, I. S.; Rozeboom, H. J.; Kalk, K. H.; Rink, R.; Janssen, D. 
B.; Dijkstra, B. W. J. Biol. Chem. 1999, 274 (21), 14579. (436) Nardini, M.; Rink, R.; Janssen, D. B.; Dijkstra, B. W. J. Mol. Catal. B Enzym. 2001, 11 (4–6), 1035. (437) Biswal, B. K.; Morisseau, C.; Garen, G.; Cherney, M. M.; Garen, C.; Niu, C.; Hammock, B. D.; James, M. N. G. J. Mol. Biol. 2008, 381 (4), 897. (438) Lombardi, P. M.; Cole, K. E.; Dowling, D. P.; Christianson, D. W. Curr. Opin. Struct. Biol. 2011, 21 (6), 735. (439) Chang, C.-E. A.; Trylska, J.; Tozzini, V.; Andrew McCammon, J. Chem. Biol. Drug Des. 2007, 69 (1), 5. (440) Katoh, E.; Louis, J. M.; Yamazaki, T.; Gronenborn, A. M.; Torchia, D. A.; Ishima, R. Protein Sci. a Publ. Protein Soc. 2003, 12 (7), 1376.

(441) Grochulski, P.; Bouthillier, F.; Kazlauskas, R. J.; Serreqi, A. N.; Schrag, J. D.; Ziomek, E.; Cygler, M. Biochemistry 1994, 33 (12), 3494. (442) Ericsson, D. J.; Kasrayan, A.; Johansson, P.; Bergfors, T.; Sandström, A. G.; Bäckvall, J.-E.; Mowbray, S. L. J. Mol. Biol. 2008, 376 (1), 109. (443) Schrag, J. D.; Li, Y.; Cygler, M.; Lang, D.; Burgdorf, T.; Hecht, H.-J.; Schmid, R.; Schomburg, D.; Rydel, T. J.; Oliver, J. D.; Strickland, L. C.; Dunaway, C. M.; Larson, S. B.; Day, J.; McPherson, A. Structure 1997, 5 (2), 187. (444) Brown, C. K.; Madauss, K.; Lian, W.; Beck, M. R.; Tolbert, W. D.; Rodgers, D. W. Proc. Natl. Acad. Sci. U. S. A. 2001, 98 (6), 3127. (445) da Silva Giotto, M. T.; Garratt, R. C.; Oliva, G.; Mascarenhas, Y. P.; Giglio, J. R.; Cintra, A. C. O.; de Azevedo, W. F.; Arni, R. K.; Ward, R. J. Proteins 1998, 30 (4), 442. (446) Shan, L.; Mathews, I. I.; Khosla, C. Proc. Natl. Acad. Sci. U. S. A. 2005, 102 (10), 3599. (447) Kaszuba, K.; Róg, T.; Danne, R.; Canning, P.; Fülöp, V.; Juhász, T.; Szeltner, Z.; St. Pierre, J.-F.; García-Horsman, A.; Männistö, P. T.; Karttunen, M.; Hokkanen, J.; Bunker, A. Biochimie 2012, 94 (6), 1398. (448) Lima, C. D.; Wang, L. K.; Shuman, S. Cell 1999, 99 (5), 533. (449) Jain, R.; Shuman, S. J. Biol. Chem. 2008, 283 (45), 31047. (450) Roberts, B. P.; Miller, B. R.; Roitberg, A. E.; Merz, K. M. J. Am. Chem. Soc. 2012, 134 (24), 9934. (451) Li, Q.-A.; Mavrodi, D. V.; Thomashow, L. S.; Roessle, M.; Blankenfeldt, W. J. Biol. Chem. 2011, 286 (20), 18213. (452) Chen, J.; Zhang, L.; Zhang, Y.; Zhang, H.; Du, J.; Ding, J.; Guo, Y.; Jiang, H.; Shen, X. BMC Microbiol. 2009, 9, 91. (453) Zhang, L.; Kong, Y.; Wu, D.; Zhang, H.; Wu, J.; Chen, J.; Ding, J.; Hu, L.; Jiang, H.; Shen, X. U. 2008, 1971. (454) Rowlett, R. S. Biochim. Biophys. Acta 2010, 1804 (2), 362. (455) Teng, Y.-B.; Jiang, Y.-L.; He, Y.-X.; He, W.-W.; Lian, F.-M.; Chen, Y.; Zhou, C.-Z. BMC Struct. Biol. 2009, 9, 67. 
(456) Huang, W.; Boju, L.; Tkalec, L.; Su, H.; Yang, H. O.; Gunay, N. S.; Linhardt, R. J.; Kim, Y. S.; Matte, A.; Cygler, M. Biochemistry 2001, 40 (8), 2359. (457) Giardina, G.; Montioli, R.; Gianni, S.; Cellini, B.; Paiardini, A.; Voltattorni, C. B.; Cutruzzolà, F. Proc. Natl. Acad. Sci. U. S. A. 2011, 108 (51), 20514. (458) Hyde, C. C.; Ahmed, S. A.; Padlan, E. A.; Miles, E. W.; Davies, D. R. J. Biol. Chem. 1988, 263 (33), 17857. (459) Miles, E. W. Chem. Rec. 2001, 1 (2), 140. (460) May, M.; Mehboob, S.; Mulhearn, D. C.; Wang, Z.; Yu, H.; Thatcher, G. R. J.; Santarsiero, B. D.; Johnson, M. E.; Mesecar, A. D. J. Mol. Biol. 2007, 371 (5), 1219. (461) Mehboob, S.; Guo, L.; Fu, W.; Mittal, A.; Yau, T.; Truong, K.; Johlfs, M.; Long, F.; Fung, L. W.-M.; Johnson, M. E. Biochemistry 2009, 48 (29), 7045. (462) Thomä, N. H.; Leadlay, P. F. Protein Sci. A Publ. Protein Soc. 1996, 5 (9), 1922. (463) Mancia, F.; Evans, P. R. Structure 1998, 6 (6), 711. (464) Oliaro-Bosso, S.; Caron, G.; Taramino, S.; Ermondi, G.; Viola, F.; Balliano, G. PLoS One 2011, 6 (7), e22134. (465) Wendt, K. U.; Poralla, K.; Schulz, G. E. Science 1997, 277 (5333), 1811. (466) Wierenga, R. K.; Noble, M. E.; Postma, J. P.; Groendijk, H.; Kalk, K. H.; Hol, W. G.; Opperdoes, F. R. Proteins 1991, 10 (1), 33. (467) Rozovsky, S.; Jogl, G.; Tong, L.; McDermott, A. E. J. Mol. Biol. 2001, 310 (1), 271. (468) Massi, F.; Wang, C.; Palmer, A. G. Biochemistry 2006, 45 (36), 10787. (469) Tesson, A. R.; Soper, T. S.; Ciustea, M.; Richards, N. G. J. Arch. Biochem. Biophys. 2003, 413 (1), 23. (470) Fan, Y.; Lund, L.; Shao, Q.; Gao, Y.-Q.; Raushel, F. M. J. Am. Chem. Soc. 2009, 131 (29), 10211. (471) Endrizzi, J. A.; Kim, H.; Anderson, P. M.; Baldwin, E. P. Biochemistry 2004, 43 (21), 6447. (472) Endrizzi, J. A.; Kim, H.; Anderson, P. M.; Baldwin, E. P. Biochemistry 2005, 44 (41), 13491. (473) Wu, X.; Zhang, W.; Font-Burgada, J.; Palmer, T.; Hamil, A. S.; Biswas, S. K.; Poidinger, M.; Borcherding, N.; Xie, Q.; Ellies, L. 
G.; Lytle, N. K.; Wu, L.-W.; Fox, R. G.; Yang, J.; Dowdy, S. F.; Reya, T.; Karin, M. Proc. Natl. Acad. Sci. U. S. A. 2014, 111 (38), 13870. (474) Zhou, H. X.; McCammon, J. A. Trends in Biochemical Sciences. 2010, pp 179–185. (475) Van Den Heuvel, R. H. H.; Svergun, D. I.; Petoukhov, M. V.; Coda, A.; Curti, B.; Ravasio, S.; Vanoni, M. A.; Mattevi, A. J. Mol. Biol. 2003, 330 (1), 113. (476) Zhou, H. X.; Wlodek, S. T.; McCammon, J. a. Proc. Natl. Acad. Sci. USA 1998, 95 (16), 9280. (477) McCammon, J. A. BMC Biophys. 2011, 4, 4. (478) Sgrignani, J.; Bon, M.; Colombo, G.; Magistrato, A. J. Chem. Inf. Model. 2014, 54 (10), 2856.

248

References

(479) Panjkovich, A.; Daura, X. BMC Bioinformatics 2012, 13 (1), 273. (480) Pan, J.; Chen, Q.; Willenbring, D.; Mowrey, D.; Kong, X. P.; Cohen, A.; Divito, C. B.; Xu, Y.; Tang, P. Structure 2012, 20 (9), 1463. (481) Lemoine, D.; Jiang, R.; Taly, A.; Chataigneau, T.; Specht, A.; Grutter, T. Chem. Rev. 2012, 112 (12), 6285. (482) Garcin, E. D.; Arvai, A. S.; Rosenfeld, R. J.; Kroeger, M. D.; Crane, B. R.; Andersson, G.; Andrews, G.; Hamley, P. J.; Mallinder, P. R.; Nicholls, D. J.; St-Gallay, S. a; Tinker, A. C.; Gensmantel, N. P.; Mete, A.; Cheshire, D. R.; Connolly, S.; Stuehr, D. J.; Aberg, A.; Wallace, A. V; Tainer, J. a; Getzoff, E. D. Nat. Chem. Biol. 2008, 4 (11), 700. (483) Duncan, A. J.; Heales, S. J. R. Mol. Aspects Med. 2005, 26 (1-2 SPEC. ISS.), 67. (484) Fischmann, T. O.; Hruza, A.; Niu, X. D.; Fossetta, J. D.; Lunn, C. A.; Dolphin, E.; Prongay, A. J.; Reichert, P.; Lundell, D. J.; Narula, S. K.; Weber, P. C. Nat. Struct. Biol. 1999, 6 (3), 233. (485) Lepesheva, G. I.; Hargrove, T. Y.; Anderson, S.; Kleshchenko, Y.; Furtak, V.; Wawrzak, Z.; Villalta, F.; Waterman, M. R. J. Biol. Chem. 2010, 285, 25582. (486) Roberts, C. W.; McLeod, R.; Rice, D. W.; Ginger, M.; Chance, M. L.; Goad, L. J. Mol. Biochem. Parasitol. 2003, pp 129–142. (487) DeVore, N. M.; Scott, E. E. Nature 2012, 482 (7383), 116. (488) Miller, W. L.; Auchus, R. J. Endocr. Rev. 2011, 32 (1), 81. (489) Beringer, M. RNA 2008, 14 (5), 795. (490) Wu, W. I.; Voegtli, W. C.; Sturgis, H. L.; Dizon, F. P.; Vigers, G. P. a; Brandhuber, B. J. PLoS One 2010, 5 (9), e12913. (491) Bellacosa, A.; Chan, T. O.; Ahmed, N. N.; Datta, K.; Malstrom, S.; Stokoe, D.; McCormick, F.; Feng, J.; Tsichlis, P. Oncogene 1998, 17 (3), 313. (492) Liu, P.; Cheng, H.; Roberts, T. M.; Zhao, J. J. Nat. Rev. Drug Discov. 2009, 8 (8), 627. (493) Sussman, J. L.; Harel, M.; Silman, I. Chem. Biol. Interact. 1993, 87 (1-3), 187. (494) Johnson, G.; Moore, S. W. Curr. Pharm. Des. 2006, 12 (2), 217. (495) Haeggström, J. Z. J. Biol. 
Chem. 2004, 279 (49), 50639. (496) Snelgrove, R. J.; Jackson, P. L.; Hardison, M. T.; Noerager, B. D.; Kinloch, A.; Gaggar, A.; Shastry, S.; Rowe, S. M.; Shim, Y. M.; Hussell, T.; Blalock, J. E. Science 2010, 330 (6000), 90. (497) Petit, A.; Barelli, H.; Morain, P.; Checler, F. Br. J. Pharmacol. 2000, 130 (7), 1613. (498) Lawandi, J.; Gerber-Lemaire, S.; Juillerat-Jeanneret, L.; Moitessier, N. J. Med. Chem. 2010, 53, 3423. (499) Böttcher, J.; Blum, A.; Dörr, S.; Heine, A.; Diederich, W. E.; Klebe, G. ChemMedChem 2008, 3 (9), 1337. (500) Tiefenbrunn, T.; Forli, S.; Baksh, M. M.; Chang, M. W.; Happer, M.; Lin, Y.-C.; Perryman, A. L.; Rhee, J.-K.; Torbett, B. E.; Olson, A. J.; Elder, J. H.; Finn, M. G.; Stout, C. D. ACS Chem. Biol. 2013, 8 (6), 1223. (501) Cole, K. E.; Gattis, S. G.; Angell, H. D.; Fierke, C. a.; Christianson, D. W. Biochemistry 2011, 50, 258. (502) Barb, A. W.; Zhou, P. Curr. Pharm. Biotechnol. 2008, 9, 9. (503) Lee, C. J.; Liang, X.; Chen, X.; Zeng, D.; Joo, S. H.; Chung, H. S.; Barb, A. W.; Swanson, S. M.; Nicholas, R. a.; Li, Y.; Toone, E. J.; Raetz, C. R. H.; Zhou, P. Chem. Biol. 2011, 18 (1), 38. (504) Barb, A. W.; Jiang, L.; Raetz, C. R. H.; Zhou, P. Proc. Natl. Acad. Sci. U. S. A. 2007, 104 (47), 18433.

249

CURRICULUM VITAE

Mgr. Lukáš Daniel
Date and place of birth: September 29, 1986 in Zlín, Czech Republic
Nationality: Czech

Affiliation: Loschmidt Laboratories, Department of Experimental Biology and Research Centre for Toxic Compounds in the Environment, Faculty of Science, Masaryk University, Kamenice 5/A13, 625 00 Brno, Czech Republic; e-mail: [email protected]

Education
2011: M.Sc. in Biochemistry, Faculty of Science, Masaryk University, Brno, Czech Republic
2009: B.Sc. in Biochemistry, Faculty of Science, Masaryk University, Brno, Czech Republic

Awards
2015: Dean’s Award for publishing excellent research, Masaryk University, Brno, Czech Republic
2007 – 2010: Merit scholarships for excellent grades, Masaryk University, Brno, Czech Republic
3/2010 – 12/2010: Scholarship for student assistants, Masaryk University, Brno, Czech Republic

Research experience
01/2013 – present: International Clinical Research Centre, St. Anne's University Hospital, Brno, Czech Republic
10/2011 – 12/2013: Centre for Toxic Compounds in the Environment, Masaryk University, Brno, Czech Republic

Contracted research
6/2014: Enantis Ltd. (www.enantis.com), Molecular docking of ligands into human constitutive androstane, pregnane X and vitamin D receptors
11/2015 – 2/2016: Enantis Ltd. (www.enantis.com), Computational design of a cytochrome P450 for target reactions

Research stays
11/2014 – 12/2014: Group of Professor Federico Gago, Department of Biomedical Sciences, University of Alcalá, Alcalá de Henares, Spain


Pedagogical activities
2011 – present: Lecturer in the Bioinformatics practical course, Masaryk University, Brno, Czech Republic
06/2014: Lecturer in Bioinformatics at the 3rd Summer School of Protein Engineering, Masaryk University, Brno, Czech Republic (loschmidt.chemi.muni.cz/school/)
2013: Supervision of the B.Sc. thesis of Marta Surgentova: Computational analysis of protein tunnels, Masaryk University, Brno, Czech Republic

Research interests
Biocatalysis and biotechnology; rational design of enzymes; development of novel pharmaceuticals.


LIST OF PUBLICATIONS

* These authors contributed equally to this work

Buryska, T.*, Daniel, L.*, Kunka, A., Brezovsky, J., Damborsky, J., Prokop, Z., 2016: Discovery of Novel Haloalkane Dehalogenase Inhibitors. Applied and Environmental Microbiology 82: 1958-1965, DOI:10.1128/AEM.03916-15

Daniel, L., Buryska, T., Prokop, Z., Damborsky, J., Brezovsky, J., 2015: Mechanism-Based Discovery of Novel Substrates of Haloalkane Dehalogenases using in Silico Screening. Journal of Chemical Information and Modeling 55: 54-62, DOI:10.1021/ci500486y

Nehybova, T., Smarda, J., Daniel, L., Brezovsky, J., Benes, P., 2015: Wedelolactone Induces Growth of Breast Cancer Cells by Stimulation of Estrogen Receptor Signalling. Journal of Steroid Biochemistry and Molecular Biology 152: 76-83, DOI:10.1016/j.jsbmb.2015.04.019

Kozlikova, B., Sebestova, E., Sustr, V., Brezovsky, J., Strnad, O., Daniel, L., Bednar, D., Pavelka, A., Manak, M., Bezdeka, M., Benes, P., Kotry, M., Gora, A., Damborsky, J., Sochor, J., 2014: CAVER Analyst 1.0: Graphic Tool for Interactive Visualization and Analysis of Tunnels and Channels in Protein Structures. Bioinformatics 30: 2684-2685, DOI:10.1093/bioinformatics/btu364

Chaloupkova, R., Prudnikova, T., Rezacova, P., Prokop, Z., Koudelakova, T., Daniel, L., Brezovsky, J., Ikeda-Ohtsubo, W., Sato, Y., Kuty, M., Nagata, Y., Kuta Smatanova, I., Damborsky, J., 2014: Structural and Functional Analysis of a Novel Haloalkane Dehalogenase with Two Halide-Binding Sites. Acta Crystallographica D70: 1884-1897, DOI:10.1107/S1399004714009018

Marques, S. M.*, Daniel, L.*, Buryska, T., Prokop, Z., Brezovsky, J., Damborsky, J., 2016: Enzyme Tunnels and Gates as Relevant Targets in Drug Design (under review)

Chrast, L., Tratsiak, K., Daniel, L., Sebestova, E., Brezovsky, J., Kuta Smatanova, I., Damborsky, J., Chaloupkova, R., 2016: Structural Basis of Paradoxically Thermostable Dehalogenase from Psychrophilic Bacterium (under review)

Chaloupkova, R., Daniel, L., Liskova, V., Sebestova, E., Waterman, J., Brezovsky, J., Damborsky, J., 2016: Promiscuous Enzyme Activity by Phylogenetic Reconstruction: Light-emitting Dehalogenase or Dehalogenating Luciferase (in preparation)


Vanacek, P., Sebestova, E., Babkova, P., Bidmanova, S., Stepankova, V., Dvorak, P., Chaloupkova, R., Daniel, L., Brezovsky, J., Prokop, Z., Damborsky, J., 2016: Integrated Strategy Combining In Silico and Wet Lab Approaches for Functional Characterization of an Enzyme Family (in preparation)

LIST OF PATENTS

Damborsky, J., Nikulenkov, F., Sisakova, A., Havel, S., Krejci, L., Carbain, B., Brezovsky, J., Daniel, L., Paruch, K., 2015: Pyrazolotriazines as Inhibitors of Nucleases. Masaryk University, Brno, Czech Republic. Patent WO 2015/192817 A1.
