<<

Identification, characterization and application

of novel (R)-selective amine transaminases

Inauguraldissertation

zur

Erlangung des akademischen Grades

doctor rerum naturalium (Dr. rer. nat.)

an der Mathematisch-Naturwissenschaftlichen Fakultät

der

Ernst-Moritz-Arndt-Universität Greifswald

vorgelegt von

Sebastian Schätzle

geboren am 12.1.1983

in Duisburg

Greifswald, November 2011

Dekan: Prof. Dr. Klaus Fesser

1. Gutachter: Prof. Dr. Uwe T. Bornscheuer

2. Gutachter: Prof. Dr. Per Berglund

Tag der Promotion: 11.01.2012 The Contents

TABLE OF CONTENTS

TABLE OF CONTENTS I

LIST OF ABBREVIATIONS AND SYMBOLS II

SCOPE AND OUTLINE OF THIS THESIS III

1 THE BACKGROUND - 1 -

1.1 AMINES IN NATURE AND PHARMACEUTICALS - 2 - 1.2 CHEMICAL SYNTHESIS OF CHIRAL AMINES - 3 - 1.3 ENZYMATIC SYNTHESIS OF CHIRAL AMINES - 4 - 1.4 TRANSAMINASES – OF NOBLE FAMILY - 5 - 1.5 DISCOVERY – MANY WAYS LEAD TO ROME ARTICLE I - 7 -

2 THE ANALYTICS - 9 -

2.1 A FAST AND EASY ASSAY FOR SCREENING ARTICLE II - 11 - 2.2 A SLIGHTLY AMBITIOUS ASSAY FOR CHARACTERIZATION ARTICLE III - 12 -

3 EXPLORING NEW PATHS IN ENZYME DISCOVERY ARTICLE IV - 14 -

4 APPLICATION OF THE NEW BIOCATALYSTS ARTICLE V - 19 -

5 CONCLUSION - 23 -

6 REFERENCES - 24 -

ARTICLES - 28 -

AFFIRMATION - 83 -

CURRICULUM VITAE - 84 -

ACKNOWLEDGEMENTS - 85 -

I

The Abbreviations

LIST OF ABBREVIATIONS AND SYMBOLS ATA amine transaminase M mol per liter AspTer R-ATA from Aspergillus terreus mg milligram AspFum R-ATA from Aspergillus fumigatus min minute AspOry R-ATA from Aspergillus oryzea ml milliliter -MBA -methylbenzylamine mM millimol per liter -TA -amino acid transaminase MTP microtiter plate BCAT branched chain amino acid S microsiemens aminotransferase MycVan R-ATA from Mycobacterium BLAST basic local alignment search tool vanbaalenii c conversion NAD(P)+ nicotinamide adenine dinucleotide CE capillary electrophoreses (phosphate), oxidized form cm centimeter NAD(P)H nicotinamide adenine dinucleotide °C degree celsius (phosphate), reduced form DATA D-amino acid aminotransferase NeoFis R-ATA from Neosartorya fischeri DMSO dimethylsulfoxide OD optical density DNA desoxyribonucleic acid PAGE polyacrylamide gel electrophoresis E. coli Escherichia coli PCR polymerase chain reaction ee enantiomeric excess PDB Brookhaven protein database e.g. for example PDC pyruvate decarboxylase g gram PenChr R-ATA from Penicillium chrysogenum GC gas chromatography pH pondus hydrogenii GDH glucose dehydrogenase pI isoelectric point GibZea R-ATA from Gibberella zeae RhoSph S-ATA from Rhodobacter sphaeroides h hour sp. species HPLC high performance liquid t time chromatography TA transaminase i.e. that is U unit l liter VibFlu S-ATA from Vibrio fluvialis LB lysogeny broth LDH lactate dehydrogenase Furthermore, the usual codes for amino acids  wavelength were used.

II

The Outline

SCOPE AND OUTLINE OF THIS THESIS This thesis is about the identification of novel (R)-selective amine transaminases and their application in asymmetric synthesis of various chiral amines. Before a novel in silico search strategy (article IV) was developed for the identification of these enzymes out of sequence databases, two specific assay systems (article II & III) had been established, which allowed a fast activity screening and characterization of their catalytic properties. Seven of 21 initially identified proteins were applied later on in the asymmetric synthesis of enantiopure amines (article V). The review article I summarizes state-of-the-art methods for enzyme discovery and protein engineering, illustrating this thesis in a comprehensive context.

Article I Discovery and Protein Engineering of Biocatalysts for Organic Synthesis G. A. Behrens*, A. Hummel*, S. K. Padhi*, S. Schätzle*, U. T. Bornscheuer*, Adv. Synth. Catal. 2011, 353, 2191-2215 Modern tools for enzyme discovery and protein engineering substantially broadened the number of applicable enzymes for biocatalysis and enable to alter their properties such as substrate scope and enantioselectivity. Article I provides a summary of different concepts and technologies, which are exemplified for various enzymes.

Article II Rapid and Sensitive Kinetic Assay for Characterization of ω-Transaminases S. Schätzle, M. Höhne, E. Redestad, K. Robins, U. T. Bornscheuer, Anal. Chem. 2009, 81, 8244-8248 Article II describes the development of a fast kinetic assay for measuring the activity of amine transaminases (ATAs) based on the conversion of the widely used model substrate (R)- methylbenzylamine. The product of this reaction, acetophenone, can be detected spectrophotometrically at 245 nm with high sensitivity. As all low-absorbing ketones, or keto acids can be used as cosubstrates, the amino acceptor specificity of a given ATA can be obtained quickly. Furthermore, the assay allows the fast investigation of enzymatic properties like the optimum pH and temperature as well as the stability of the catalyst.

Article III Conductometric Method for the Rapid Characterization of the Substrate Specificity of Amine-Transaminases S. Schätzle, M. Höhne, K. Robins, U. T. Bornscheuer, Anal. Chem. 2010, 82, 2082-2086 While article II describes a method for characterizing the specificity for different amino acceptors, in article III a kinetic conductivity assay for investigation of the the amino donor specificity of a given ATA was established. The course of an ATA-catalyzed reaction can be followed conductometrically – and quantitatively evaluated – since the conducting substrates, a positively charged amine and a negatively charged keto acid, are converted to non-conducting products, a non-charged ketone and a zwitterionic amino acid.

III

The Outline

Articel IV Rational Assignment of Key Motifs for Function guides in silico Enzyme Identification M. Höhne*, S. Schätzle*, H. Jochens, K. Robins, U. T. Bornscheuer, Nature Chem. Biol. 2010, 6, 807–813 Biocatalysts are powerful tools for the synthesis of chiral compounds, but identifying novel enzymes that are able to catalyze new reactions is challenging. Article IV presents a new strategy to find existing (R)- selective amine transaminases (R-ATAs) based on rationally developed structural motifs, which were used to search sequence space for suitable candidates, yielding 17 active enzymes with 100% correct prediction of the enantiopreference.

Article V Enzymatic Asymmetric Synthesis of Enantiomerically Pure Aliphatic, Aromatic and Arylaliphatic Amines with (R)-Selective Amine Transaminases S. Schätzle, F. Steffen-Munsberg, A. Thontowi, M. Höhne, K. Robins, U. T. Bornscheuer, Adv. Synth. Catal. 2011, 353, 2439-2445 In article V seven of the newly identified R-ATAs were applied in the asymmetric synthesis of twelve aliphatic, aromatic and arylaliphatic (R)-amines starting from the corresponding ketones using a lactate dehydrogenase/glucose dehydrogenase system for the necessary shift of the thermodynamic equilibrium. For all ketones, at least one enzyme was found allowing complete conversion to the corresponding chiral amine with excellent optical purities >99% ee. Variations in substrate profiles are discussed based on the phylogenetic relationships between the seven R-ATAs.

* equal contribution

IV

The Background

1 THE BACKGROUND New biocatalytical processes can be based on the availability of new interesting enzymes, but more often a desired product raises the demand for a suitable biocatalyst. Sometimes such an enzyme is already commercially available or has been described in the literature. Alternatively, it will be necessary to screen organisms or enzymes that allow conversion of available reactants.1 With promising candidates at hand, a detailed biochemical characterization will help finding appropriate conditions for a successful application as biocatalysts under the requirements of a specific process. If this characterization should not yield satisfying results, several rounds of enzyme engineering might help creating and finding more suitable enzyme variants.

Figure 1| The biocatalysis cycle1.

In our case, the aforementioned desired products were chiral amines with (R)-configuration and the demand for novel biocatalysts initiated this project. The following chapters will give a brief introduction to the importance of chiral amines as natural products and building blocks for pharmaceuticals, possible methods for their synthesis – either chemical or enzymatic – and will introduce some general knowledge about the investigated class of enzymes – amine transaminases. Furthermore, some-state-of-the-art strategies for finding novel enzymes with desired properties will be summarized, placing this project in a comprehensive context.

- 1 -

The Background

1.1 AMINES IN NATURE AND PHARMACEUTICALS

Amines and amino acids are ubiquitous in nature. They are not only essential parts of all proteins and nucleic acids, but are also very important biologically active compounds themselves. Primary amines are biogenetically produced by decarboxylation of the corresponding amino acids and they can be further N- alkylated to yield secondary, tertiary and quaternary amines, which often results in heterocyclic structures. Primary amines have great importance as neurotransmitter (e.g. adrenaline and histamine) and as precursor of coenzymes (e.g. cysteamine of coenzyme A) and of complex lipids (e.g. ethanolamine of phosphatidylethanolamine)2. Especially the higher substituted amines – pharmaceutically classified as alkaloids – show an enormous variety of structures as well as biological effects found in all domains of life (Figure 2).

Figure 2| Examples for amines in biologically active compounds. The displayed compounds are from human, bacterial or herbal origin. Coniin is the neurotoxic main alkaloid of Conium maculatum, Poison Hemlock, which was used to poison condemned prisoners in ancient Greece.

This huge variety of biological effects – from antibiotic to antiarrhytmic, analgesic and neurotoxic – shows their potential as pharmaceuticals and thus makes them very promising candidates in the search for new drugs. In fact, almost 80% of the 200 most prescribed brand name drugs in 2010 in the US contain nitrogen, with the amino group [being] adjacent to a chiral carbon atom in more than 40% of these compounds3. The absolute configuration of the stereocenters is crucial for the interaction with biomolecules and thus for the type of effect on biological systems4. During the synthesis of the complex target compounds (see Figure 3 for examples), the generation of chirality is most often the actual challenge. Thus, optically pure amines gained an increased significance as building blocks in organic chemistry5.

Figure 3| Examples for chiral amines in pharmaceuticals. The displayed compounds rank among the 200 most prescribed brand name drugs in 2010 in the US3.

- 2 -

The Background

1.2 CHEMICAL SYNTHESIS OF CHIRAL AMINES

Most of the drugs mentioned above – and their precursors – are still produced chemically, and there are a lot of different possibilities for the chemical synthesis of enantiopure amines. As this thesis is about , the chemical methods are only introduced briefly and not discussed in detail. For further aspects, the reviews written by Breuer et. al5 and Nugent et. al6 are highly recommended.

Traditionally, the chiral resolution of a racemic mixture of a given amine can be achieved by precipitating one enantiomer as diastereomeric salt by addition of a chiral acid like (R)-mandelic acid or (R,R)-tartaric acid5. Unfortunately, the theoretical yield of the desired enantiomer with this method is limited to 50% (Figure 4). The better alternative with a theoretical yield of 100%, is to start from a prostereogenic compound in an asymmetric synthesis to yield a chiral, enantiopure product.

Figure 4| General chemical routes to chiral amines. Besides the chiral resolution of a racemic amine with chiral acids, asymmetric hydrogenation and addition are the two main strategies for the synthesis of -chiral primary amines. If chiral auxiliaries are used to generate enantioselectivity, R3 represents a chiral substituent.

The general approaches comprise asymmetric hydrogenation of enamides or ketimines and asymmetric addition to aldimines or ketimines. The latter two are produced at first by a condensation reaction of a ketone/ and an amine, whereas enamides can also be obtained from ketones by conversion to the oxime and subsequent reduction, for instance with iron in the presence of acetic anhydride5. An organocatalytic alternative is the proline-catalyzed direct asymmetric -amination of aldehydes with azodicarboxylates7. It is worth mentioning that cyclic secondary amines can only be synthesized – in one step – by asymmetric hydrogenation8. Furthermore, -tertiary amines with three different substituents

- 3 -

The Background

(unlike hydrogen) can only be produced by asymmetric addition to a ketimine9 as the chiral center is generated by C-C coupling rather than by reduction, which always yields a C-H bond.

However, more and more companies replace their chemical routes by fermentation or biocatalysis – for obvious reasons. Using biocatalysis, one is able to avoid toxic compounds like transition state catalysts, save a lot of energy as the processes run at low temperature and pressure – in contrast to most chemical processes, which run at rather high temperatures and often high pressure – and reduce waste, especially during downstream processing and product isolation10. The companies Merck & Co. and Codexis, for example, recently published an impressive case study on process chemistry, replacing the rhodium- catalyzed Sitagliptin manufacture by a biocatalytic process using a highly-evolved amine transaminase11. This process not only reduced the total waste (19%) and eliminated all need for heavy metals, but even increased the overall yield by 13% and the productivity (kg/L per day) by 53% compared to the chemical process. Both routes have recently been compared in detail12.

1.3 ENZYMATIC SYNTHESIS OF CHIRAL AMINES

For the biocatalytic production of chiral amines, there are in principle three options: the use of , or transferases13.

Figure 5| Enzymatic routes to chiral amines. The displayed examples show (a) the kinetic resolution and (b) dynamic kinetic resolution with hydrolases, (c) deracemization with monoamine oxidases (MAO) and (d) asymmetric synthesis with amine dehydrogenases (amine-DH).

Proteases, amidases and lipases from the class of hydrolases are limited to (dynamic) kinetic resolutions of a given racemic amine (Figure 5a and b) and monoamine oxidases (oxidoreductases) allow either a

- 4 -

The Background kinetic resolution or a deracemization. The latter is an efficient one-pot reaction, wherein the produced imine has to be reduced chemically to the racemic amine in order to accumulate the non-converted enantiomer14 (Figure 5c). To complete the picture, there is another promising group of enzymes – NAD(P)+ dependent amine dehydrogenases – which might be used for asymmetric syntheses in the future (Figure 5d). This is because the only NAD(P)+ dependent amine dehydrogenase described so far has indeed a broad substrate scope but is not yet applicable due to its low enantioselectivity15. Other known amine dehydrogenases use different artificial electron acceptors such as phenazine methosulfate16 or tryptophan tryptophylquinone17 and thus are not of biotechnological interest. Until today, only amine transaminases (ATAs) can be used for kinetic resolutions, one pot-two step- deracemizations and asymmetric syntheses.

Figure 6| Production of chiral amines using amine transaminases. In a kinetic resolution (a), the amine transaminase (ATA) converts in the ideal case only one of the amine enantiomers to the corresponding ketone. The remaining enantiomer can be isolated in high optical purity and a maximum yield of 50%. In an asymmetric synthesis (b), a prostereogenic ketone is aminated enantioselectively yielding directly the chiral amine. The most common co-substrates for ATAs are pyruvate/alanine. Since the equilibrium favors ketone formation, high yields in asymmetric synthesis can only be achieved by shifting the equilibrium, for example by enzymatic removal of the formed co-product pyruvate.

The kinetic resolution of the racemic amine with an (S)-selective transaminase produces the (R)- enantiomer. The disadvantage of this approach is a theoretical maximum yield of only 50 % (Figure 6a). The atom efficiency of such a process is low, as the ketone generated has little value and the recycling of this compound requires chemical reductive amination to produce the racemic amine for subsequent resolutions. Synthetically, an asymmetric synthesis is much more economical because yields of up to 100% are possible – in the case of an (S)-selective ATA the (S)-amine would be produced (Figure 6b). This fact will be addressed in chapter 3 in more detail, but first, the protagonists of this project will be introduced.

1.4 TRANSAMINASES – ENZYMES OF NOBLE FAMILY

Transaminases belong to the group of pyridoxal-5’-phosphate (PLP) dependent enzymes, whose members most likely catalyze the largest scope of chemical reactions and belong to , , and even hydrolases and oxidoreductases18. The different types of reactions can be organized according to the position at which the reaction occurs, namely  (transamination, - 5 -

The Background racemization, decarboxylation, elimination and substitution),  and  (both elimination and substitution). However, initial attempts to organize the enzymes into an -, - and -class failed, so they were classified with respect to their protein fold. Today, there are seven fold types of PLP-dependent enzymes known and transaminases belong to fold class I (aspartate aminotransferase family) and IV (D-amino acid aminotransferase family)19, 20. The transaminases of both fold types are homodimeric enzymes with two catalytic sites at the interface of the subunits.

The catalytic cycle of the transamination reaction consists of two half reactions and follows a typical ping pong bi bi kinetic. During the first half reaction the amino group of the amino donor is transferred to the PLP to form pyridoxaminphosphate (PMP) and the keto product, which is released. The amino group is then, in the second half reaction, transferred from the PMP to the amino acceptor, whereby the is recycled and the product is released (Figure 7). The cofactor stabilizes the intermediate formed after deprotonation of the external aldimine by delocalization of the negative charge through the  system, and thus initially enables the deprotonation. As mentioned above, PLP-dependent enzymes catalyze a huge variety of reactions and so does the cofactor itself in the absence of enzyme, even though very slowly21. The function of the protein apoenzyme, therefore, is to enhance this innate catalytic potential and to enforce mechanistic and substrate selectivity22. The latter will be discussed in more detail in chapter 3.

Figure 7| Simplified reaction scheme of the transamination. Illustrated is the first half reaction of the catalytic cycle. Sequentially, the PMP attacks the amino acceptor and finally the amine product is released and the PLP recycled22.

Because of their crucial role in the amino acid metabolism,-amino acid aminotransferases (EC 2.6.1, - transaminases) are ubiquitous enzymes found in all organisms. On the contrary, the small group of amine-pyruvate transaminases (EC 2.6.1.18, amine transaminases) also converts substrates lacking an - carboxylic acid moiety. The actual function of amine transaminases (ATAs) in nature so far remains unknown; they probably take part in degrading biogenic amines like histamine or putrescin and enable recycling of the nitrogen, in contrast to monoamine oxidases that release the nitrogen as . Nevertheless, as the product range is not limited to -amino acids, ATAs are very attractive for organic synthesis of optically active amines13, especially due to the possibility of asymmetric syntheses. Although

- 6 -

The Background an asymmetric synthesis strategy is clearly favored over a kinetic resolution with respect to the atom efficiency of the process, it has the major disadvantage that only one specific enantiomer can be accessed with an enzyme having a distinct enantiopreference. As both enantiomers of a building block are often needed for targets with different absolute configuration, an enzyme platform providing either (S)- or (R)-specific enzymes is highly desired. Unfortunately, the majority of the more than a dozen described amine transaminases in literature are (S)-selective. Only one commercially available enzyme and two microorganisms were reported to show (R)-specificity23, but details about the proteins were unknown. Consequently, many research groups – both from university and industry – have been looking eagerly for new (R)-selective amine transaminases.

1.5 ENZYME DISCOVERY – MANY WAYS LEAD TO ROME ARTICLE I

The described lack of amine transaminases with the desired enantiopreference is a good example showing that despite the enormous number of already described enzymes, there is a constant need for novel biocatalysts with altered properties. Besides enantiopreference or –selectivity and substrate scope, there are other properties, which often need to be altered regarding process conditions like thermostability, pH profile and solvent tolerance.

Figure 8| Strategies for protein discovery and engineering. In addition to the classical way of screening strain collections or other environmental samples, structural information of known enzymes may be used to alter the properties by means of protein engineering. Hot spot positions for semi-random mutagenesis may be predicted or mutations suggested and modelled in silico. Instead of introducing mutations identified by rational design experimentally and further optimizing the enzyme by directed evolution, enzymes carrying these mutations may be identified in a database search. The bars represent the estimated amount of required information (green) and the necessary screening effort (blue) for the particular strategy.

- 7 -

The Background

To meet these demands, there are basically three strategies: (i) modification of already characterized enzymes, (ii) search for new enzymes, or (iii) de novo design of completely new biocatalysts as a potential, but not yet established route (Figure 8). For the modification of known enzymes there are two main strategies in order to adapt their properties to the desired application: rational design and directed (molecular) evolution24. Rational design uses structural and mechanistic information and/or molecular modeling for the prediction of changes in the protein structure in order to alter or induce the desired properties. In directed evolution on the other hand, mutant libraries are created by random changes and screened/selected for the desired property. The variants showing promising results are subjected to further rounds of evolution. As outlined in Figure 8, the choice of the proper method to solve a given problem is limited on one end by the amount of information available and on the other end by the disposable analytical methods. Without any structural information and deep comprehension of the catalytic mechanism, rational design is a rather risky approach – especially if applied to alter global protein properties like thermostability or pH dependency. Another limiting factor is the restricted quality of computer simulations, which always is a compromise between informative value and time saving. On the other hand, by creating large mutant libraries by methods of directed evolution, hundreds and thousands of variants need to be screened. Without a fast and reliable assay system, the throughput is so small that either the desired variant cannot be identified or the screening process takes very long time. Nowadays, more and more researchers use combined methods of these two strategies, dubbed focused directed evolution or semi-rational design25, 26. Here, information available from related protein structures, families and mutants already identified is combined and then used for targeted randomization of certain areas of the protein.

Access to new enzymes was traditionally gained through enrichment selection of microorganisms by growth under limiting conditions or enzymes from plant or animal tissues were applied as crude extracts. However, only a small number of microbes can successfully be cultivated in the laboratory (approximately <1%)27. The metagenome approach circumvents this problem: the complete genomic DNA is extracted from an environmental sample, fragmented and cloned yielding the corresponding metagenome libraries. These libraries can then be screened either with a function-based assay for the desired activity, or in a sequence-based approach in which genes are identified based on homology to already described enzymes, or the entire DNA is sequenced. Besides the classical enrichment cultivation and the metagenome approach, a third way to access new enzymes is called ‘in silico screening’. Protein sequences obtained from sequencing of single enzymes, entire genomes or microbial consortia (e.g. from the Sargasso Sea28), are deposited in public databases and currently >12 million protein sequences are available (http://www.ncbi.nlm.nih.gov/RefSeq/). However, only a tiny fraction of all these sequences corresponds to enzymes, which have been produced in the laboratory and their biochemical properties and function have been experimentally confirmed. For the vast majority, a possible function is postulated by sequence comparison, but this annotation can often be wrong or at least misleading, as highlighted in a recent commentary29. Discovering novel biocatalysts in public databases is easier the - 8 -

The Background more information about the desired enzyme class exists. If several enzymes with a distinct activity are already described in the literature, an alignment of their protein sequences together with experimentally confirmed activities or specificities can reveal conserved regions within the sequences and a simple BLAST-search can yield new and useful biocatalysts. There are numerous examples for the discovery of new enzymes in silico including nitrilases30, a ligase31, an epoxide hydrolase32 and a reductase33. By curtailing the sequences to be searched through, one is even able to increase the chances that the new catalysts fulfil additional criteria. Scanning the genomes of (hyper-)thermophilic organisms for example will increase the chances to find a thermostable enzyme. Thus, Machielsen et al. identified a thermostable ADH in the genome of Pyrococcus furiosus with a half-life of 130 min at 100°C34 and Fraaije et al. discovered a thermostable Baeyer-Villiger monooxygenase with a promiscuous sulphur oxidizing activity35.

However, all these successful examples have one very important aspect in common: substantial structural and especially sequence information about similar enzymes having the desired properties. Unfortunately for the purpose of this project, there was no information about the type of enzyme we were looking for, except that one commercially enzyme36 (Codexis ATA-117) and two microorganisms with (R)-amine transaminase activity were known37, 38.

2 THE ANALYTICS While working with large enzyme libraries, the fast, effective and reliable identification of enzymes with the desired properties is often the bottleneck, since this is the most time-consuming step. Therefore a large variety of fast and robust methods for determining enzyme activity have been developed for many biocatalytic reactions, which often are capable of being performed in a high-throughput screening format. But screening enzyme libraries for new or improved biocatalysts is not the only purpose of such assays. Having found some promising candidates, an appropriate assay system will accelerate the necessary biochemical characterization including substrate specificity, enantioselectivity and both temperature and pH optima.

However, for the screening of amine transaminase (ATA) activity and said biochemical characterization only a limited number of methods have been reported, due to the special type of reaction catalyzed: an amine and a carbonyl substrate bearing a ketone or aldehyde group are converted into products by exchanging their amino and carbonyl functional groups, without the stoichiometric consumption of a cofactor. In case of other enzymatic reactions like e.g. hydrolysis of esters, the substrates (ester) and products (alcohol and acid) can be easily discriminated39, or during reactions involving a cofactor like 40, 41 NAD(P)H or side products like H2O2, these compounds can be detected specifically . For ATAs, these approaches are not possible, and there are only a few methods to discriminate the similar substrates and products without affecting enzyme activity42. Thus, only two high-throughput assays had been published until recently. - 9 -

The Analytics

Hwang et al. developed an assay based on the detection of alanine formed during ATA-catalyzed reactions: after addition of a CuSO4/MeOH staining solution, the blue colored Cu-alanine complex could be determined at 595 nm42. With this assay various aspects of the reaction can be monitored, like investigating different amines and β-amino acids as substrates in combination with pyruvate as co- substrate. As the assay depends solely on the conversion of the amino acceptor pyruvate, the amino donor can be applied enantiomerically pure and thus the enantiopreference of a given enzyme can be determined and an estimation of its enantioselectivity is possible (Figure 9a).

As an alternative, Truppo et al. published a multi enzyme cascade pH-indicator assay to monitor the conversion in an asymmetric amine synthesis with ATAs43. The ketone is converted with alanine as co- substrate, and NADH-dependent lactate or alanine dehydrogenases are used in order to remove the co- product pyruvate. NADH is subsequently recycled with glucose dehydrogenase, and the arising gluconic acid δ-lactone induces a decrease of the pH, which is visualized by the application of a pH-indicator (Figure 9b).

Figure 9| Assays described in literature for measuring amine transaminase activity. ATA, amine transaminase; LDH, lactate dehydrogenase; GDH, glucose dehydrogenase; AAO, D- or L- amono acid oxidase; HRP, horse radish peroxidase; PGR, pyrogallol red.

Obviously, both methods show several drawbacks: since the copper staining solution in the first assay inhibits the enzyme, the assay can only be used as an endpoint measurement and thus the determination of kinetic parameters is rather difficult. Furthermore, the sensitivity is very low (ε ≈ 10 M- 1cm-1) and most commonly used buffers – e.g. phosphate and the Good’s buffer – form insoluble blue Cu-complexes, which need to be centrifuged before the measurement. Cell extracts also give a blue - 10 -

The Analytics colored product, which, however, is soluble and thus cannot be removed. This is probably due to the amino acids present in the cells. Although the sensitivity of the second assay is higher compared to the

CuSO4/MeOH assay, it works in the asymmetric synthesis mode and thus no information about enantiopreference or -selectivity can be obtained. Furthermore, the second assay uses two additional enzymes, which may be influenced by important parameters like temperature, pH and cosolvents. Meanwhile, another method had been published using a combination of amino acid oxidase and 44 peroxidase for quantification of alanine via its oxidation and detection of the arising H2O2 (Figure 9c). This assay has the disadvantage that it also depends on two additional enzyme activities, which must be precisely synchronized.

Thus, our main target was to develop a reliable and easy-to-use assay system, which does not depend on other enzyme activities and is resistant to external influences. In the end, we established two novel assay systems for a fast and kinetic screening of ATA activity as well as for the characterization of the amino acceptor and amino donor specificity of any given ATA.

2.1 A FAST AND EASY ASSAY FOR SCREENING ARTICLE II

In search of a better spectrophotometric assay we studied the absorption spectra of various compounds and noticed that acetophenone, which is formed during the transamination of the standard substrate - methylbenzylamine (-MBA), showed a high absorbance in the ultraviolet range of the spectrum with a local maximum at 245 nm due to its carbonyl group conjugated to the aromatic system (Figure 10). Compounds like pyruvate, alanine, alkylamines, -methylbenzylamine and non-aromatic ketones on the other hand showed an absorption, which was less than 5% of acetophenone at this wavelength. Thus, the reaction can be followed simply by monitoring the increasing absorption in the model reaction of ATAs (Figure 10) neglecting absorption changes caused by other reaction partners. Since -MBA is converted by most ATAs and all low absorbing ketones, aldehydes and keto acids may be used as cosubstrates, the assay allows a fast characterization of the amino acceptor spectrum of any ATA, determination of its enantiopreference as well as estimation of the enantioselectivity towards -MBA. Furthermore, the pH and temperature optimum and stability can be characterized very easily. The accuracy of the system was validated by measuring the absorption at 245 nm and detection of the acetophenone formed by capillary electrophoresis in parallel. The results were comparable for both purified enzyme and crude extract and the calculated activities differed only by 3%. The absorption coefficient of acetophenone at 245 nm was determined to be ~12 mM-1cm-1, which results in a thousand -1 -1 fold higher sensitivity compared to the CuSO4/MeOH based assay (10 M cm ).

- 11 -

The Analytics

Figure 10| Principle of the developed acetophenone assay. Acetophenone (1 mM) shows a high absorbance (right) compared to other reactants such as -methylbenzylamine, pyruvate or alanine (all 10 mM).

The only limitation of the assay comes along with the rather low wavelength of 245 nm, because in this range the protein contributes to the initial absorbance. This limits the amount of protein that can be applied, which eventually results in a decreased sensitivity at higher enzyme loads due to an increased initial absorbance.

For verification, the assay was used to characterize an ATA identified in the genome of Rhodobacter sphaeroides. After recombinant expression and affinity tag purification, the amino acceptor spectrum of the biocatalyst was investigated. Furthermore, standard enzymatic properties like pH profile, temperature profile and stability as well as the effect of additives or buffer compositions could be investigated quickly and simply using this assay. Especially these last-mentioned properties could not have been as easily determined with the previously described assay systems.

2.2 A SLIGHTLY AMBITIOUS ASSAY FOR CHARACTERIZATION ARTICLE III

As the spectophotometric assay is limited to α-MBA as amino donor, we developed a method for a fast and easy characterization of the amino donor specificity, too. With this conductivity based approach every amino donor and acceptor can be applied as substrate, as long as one of the substrates is an amino or keto acid, respectively. In the course of an ATA-catalyzed reaction the conductivity of the reaction medium changes since charged reactants – amine and keto acid – are converted to non-charged species – ketone and a zwitterionic amino acid (Figure 11). For all experiments, the transamination was carried out in the kinetic resolution mode. The rationale behind this approach is as follows: (i) the reaction equilibrium favors product formation, (ii) both enantiomers of a given amine substrate can be used separately in the assay, so that also information about enantioselectivity can be obtained, (iii) the amine substrate usually is much better soluble than the corresponding ketone. Thus, even with higher substrate

- 12 -

The Analytics concentrations a homogeneous assay solution is obtained rather than a biphasic system, which otherwise might interfere with the measurements. The system allows a simple measurement of the reaction progress, if some crucial requirements are fulfilled: first, to maximize sensitivity, the buffer system should have low background conductivity. The pH of the buffer has to be kept in a pH range from 4 to 8, where the net charge of alanine is ≈0. Furthermore, for practical reasons or for high-throughput screening purposes, the assay should not be sensitive towards variations in crude extract concentrations.

Figure 11| Principle and validation of the conductometric assay. A positively charged amine and a negatively charged keto acid are converted to zwitterionic amino acid and a non-charged ketone (left), thus the conductivity of the solution decreases. The assay was validated by following the reaction of 10 mM -MBA and pyruvate (right) using both conductivity measurements (black symbols/bars) and detection of the acetophenone formed by gas chromatography (white bars).

Standard buffer systems like phosphate buffer or Tris-HCl could not be applied due to their very high conductivity, whereby the relative change in conductivity caused by the enzymatic reaction is rather small. Thus, five different Good’s buffers – BES, CHES, EPPS, HEPES and Tricine – were investigated regarding the conductivity of 20 mM solutions at pH 7.5 and their effects on transaminase activity. Due to their zwitterionic nature, solutions of these compounds show a much smaller conductivity if the pH is kept within 2 units near the pI of the respective compound (isoelectric buffer). For CHES, EPPS and HEPES we found significant inhibitory effects on the ATA from Rhodobacter sphaeroides (24% to 76% relative activity compared to phosphate buffer) and though the BES buffer showed a very good performance with purified enzyme, the activities of crude extract measured in this buffer unfortunately did not match the results obtained by gas chromatography. Finally, the Tricine buffer system showed both a good buffer capacity at pH 7.5 and a very good agreement of determined activities of purified enzyme and crude extract by measuring the conductivity compared to GC analysis (Figure 11). For this reason the Tricine buffer was used for all subsequent experiments. The inhibitory effect of the EPPS and CHES buffer

- 13 -

The Analytics may vary for other transaminases, so these buffers should be considered for different enzymes due to their really good performance in other respects.

To answer the second important question, whether the assay is sensible towards variations in crude extract concentrations, we performed calibration experiments where different conversions were simulated by varying the concentrations of all reactants according to a real reaction. As expected, the background conductivity of the reaction medium increased proportional to the concentration of cell extract used but the addition of different amounts of crude extract did not seem to affect the change of conductivity significantly. This was surprising, because in theory the molar conductivity of a given compound depends on the concentration of all electrolytes present in the reaction system for mainly two reasons: on the one hand the protonation state and thus the netto charge of an analyte may be affected by other compounds present in the solution. On the other hand and more importantly, ions or even neutral molecules from the background electrolyte may affect the mobility and thus the molar conductivity of an analyte significantly by intermolecular electrostatic or hydrophobic interactions. This explains why the conductivity of a mixture comprised of different ions usually differs to some extent from the sum of the contributions of the single ions to overall conductivity. For this reason, all participating reactants and even the neutral species had to be included in the calibration of the assay. If otherwise the change of the conductivity per millimolar reaction conversion [µS/mM] is calculated from molar conductivities of the single compounds, the obtained value would be higher (49.4 µS/mM conversion) compared to the measurement of the complete mixtures (44.1 µS/mM conversion).

In conclusion, the conductivity assay is an excellent complementary method to the spectrophotometric assay and allows a fast screening of ATA variants for the reaction of any desired amine with a keto acid. One limitation of the assay is that an appropriate buffer with low conductivity in the pH-range of 4–8 is indispensable. For determining enzymatic properties like the pH- or temperature optimum, the photometric assay is more flexible since every low absorbing buffer at any pH may be used and absorbance is virtually not dependent on temperature, in contrast to conductivity. With both assays, the complete characterization of the substrate specificity and estimation of enantioselectivity can be performed quickly.

3 EXPLORING NEW PATHS IN ENZYME DISCOVERY ARTICLE IV With two convenient assays at hand, we felt comfortable to start searching for the desired (R)-selective amine transaminases (R-ATAs). We first considered protein engineering to create an R-ATA by inverting the enantiopreference of an S-ATA. Inverting an enzyme’s enantiopreference is generally possible by changing the binding pockets of the in a way that two substituents of the substrate’s chiral carbon atom are bound inversely. This approach has already been applied successfully to many different enzymes including e.g. a lyase45, a fluorophosphatase46 and an esterase47. Unfortunately, there were several limitations, as no crystal structure of any ATA was available and thus rational protein design - 14 -

The Search would be very difficult to perform. Meanwhile, Svedendahl et al. succeeded in inverting the enantiopreference of the S-ATA from Arthrobacter citreus at least substrate dependently: the mutant V328A converted (R)-1-(4-fluorophenyl)propane-2-amine instead of the (S)-enantiomer, while it retained its high (S)-selectivity for 1-(4-nitrophenyl)ethanamine48. Keeping in mind that the authors actually aimed at increasing the enzymes enantioselectivity towards 1-(4-fluorophenyl)propane-2-amine – and were successful with mutant Y331C – this example illustrates the tight boundaries of rational design, especially without a proper 3D structure. However, there are crystal structures of (S)-selective amino acid TA available. So we changed our plan and thought of reengineering the substrate-recognition site of an -TA to create a variant that accepts substrates lacking the carboxylic function (Figure 12). This altered substrate specificity would be accompanied by a formal switch in enantiopreference owing to the changed priority according the Cahn-Ingold-Prelog (CIP) rule49, resulting in an (R)-selective amine TA.

Figure 12| Strategies for protein engineering. Possible ancestors of (R)-amine transaminases can be used to engineer the very same. This can be achieved by modification of the amino acids in the carboxyl group binding pocket of an -TA such as an L-branched chain TA of the PLP fold type IV (left) or an aromatic amino acid TA from PLP fold type I (lower right) or by engineering of the binding pockets of an (S)-selective amine-TA (upper right) from PLP fold type I. It was assumed that according to the CIP-rules the large substituent (RL) has a higher priority than the small substituent (RS).

We explored this strategy with a well-investigated aromatic amino acid TA from Paracoccus dentitrificans50 from PLP-fold class I. Unfortunately again, this enzyme performs a typical induced fit mechanism and the coordination of the -carboxyl group of the substrate plays an important role for the domain closure. The guanidinium group of Arg386 coordinates the carboxyl group in an end-on geometry, while it is locked into position by several surrounding residues resulting in a complex network of hydrogen bonds. To solve this challenge, many additional mutations might have been required, which

- 15 -

The Search could not be predicted because of the complexity of the problem. After a close look at other (S)-selective amino acid TAs from fold class I we realized that most of the enzymes showed an induced fit, too.

Therefore, we focused on fold class IV of PLP-dependent enzymes. In this fold class there are known two types of transaminases with opposite enantiopreference, namely branched chain amino acid transaminase51 (BCAT) and D-amino acid transaminase52 (DATA). This fact alone is very interesting since it indicates a certain flexibility within this fold type, which apparently is not given in fold class I where all enzymes are (S)-selective. The opposite enantiopreference of DATA and BCAT could be explained on the basis of crystal structures by the differences in substrate coordination in the active site51, 52, which consists of two binding pockets. If the -carboxyl group of the substrate is accommodated in binding pocket A, this TA is (R)-selective and thus converts D-amino acids (Figure 12). If on the other hand the - carboxyl group is positioned in binding pocket B, the enzyme converts L-amino acids and thus exhibits (S)-preference (Figure 12). In other words, if (R)-selective amine TAs have evolved in this fold type, the plausible ancestor must be an L-specific BCAT, since substitution of the carboxyl group of a L-amino acid by a methyl group yields an (R)-amine – according to the CIP rule.

Working from the known crystal structures, we studied the coordination of the -carboxyl group in BCATs and DATAs. This allowed us to predict the necessary differences in protein sequences within PLP- fold class IV proteins that determine the desired switch in substrate specificity from -amino acids to amines, in-line with the formal switch in enantiopreference. In contrast to the very rigid coordination in aromatic amino acids TAs and DATAs, both BCATs from E. coli (PDB: 2IYD) and Thermus thermophilus (PDB: 2EJ0) realize the coordination of the carboxyl group in a more subtle manner, without direct contact of the carboxyl group atoms to any basic amino acid side chain such as an arginine (Figure 13). Instead, the negative charge of one oxygen is compensated by a hydrogen bond to Tyr95, which is activated by an adjacent Arg97. This pair of Tyr95 and Arg97 is highly conserved in all BCATs. The other oxygen is bonded by two backbone amide nitrogen atoms of Thr262 and Ala263, which also are activated by the coordination of their adjacent carbonyl groups by Arg40. Consequently, there are two key mutations required to generate an (R)-selective amine TA from a BCAT. First, Tyr95 should be exchanged by a hydrophobic residue, which is incapable of forming a hydrogen bond with the carboxyl group. Secondly, Arg40 should be changed to a residue, which cannot activate the amide backbone nitrogens of Thr262 and Ala263 by coordination of the neighbouring backbone carbonyl . However, the situation regarding position 40 in PLP-fold class IV proteins is complex. DATA also has a basic amino acid, Lys40, but in contrast to the Arg40 in BCAT, Lys40 in DATA adopts an altered conformation so that its ε-amino group forms a hydrogen bond to the backbone of an adjacent loop. That is why we included the absence of a basic residue at position 40 as a clear hint for R-ATA activity, but we did not declare its presence as strict exclusion criterion.

- 16 -

The Search

Figure 13| Key amino acid motifs that allow prediction of function of PLP-dependent fold class IV proteins. Top: Schematic drawing of the external aldimines of the respective substrates in the active site of DATA, BCAT and ADCL. Blue colored amino acids are part of binding pocket A (Figure 12) and amino acids of binding pocket B (Figure 12) are shown in green. The grey shaded circle represents Gly or Val, respectively, which is located behind the carboxyl group. The red colored Thr36 is important during the catalytic mechanism for shuttling of a proton during the reaction transition state. Bottom: sequence motifs derived from multiple sequence alignments are given using the same color code, showing that amino acids important for substrate binding in the active site are rather conserved and can be used for a prediction of the substrate specificity of PLP-dependent fold class IV enzymes as well as for the enantiopreference of the transaminases within this fold class.

Instead of inserting these key mutations and performing several rounds of directed evolution to create and identify an enzyme with the desired selectivity from a random mutant library, we developed a strategy to search protein databases for proteins already carrying these mutations. These naturally evolved “mutants” could have undergone selection over millions of years, which would have resulted in highly optimized catalysts. Aside from the key residues considered important for amine TA activity, we compared residues involved in substrate coordination in the different enzymes in order to exclude undesirable specificities or activities – apart from BCATs and DATAs, the third enzyme class described in fold class IV is 4-amino-4-deoxychorismate lyase53. Fortunately, the amino acids that are in direct contact to the substrate in the active site are arranged in two relatively short sequence blocks. The first block is located at positions 36–40 and the second block comprises six amino acids at positions 95-97 and 107- 109 (positions according to a multiple sequence alignment). Most of these amino acids fold into a - sheet, only residues 107-109 are part of a loop. This information is important since during evolution insertions or deletions take place preferentially in loop sequences without destroying enzyme activity but not usually in α-helices or -sheets54. Alignments, which include all known BCAT, DATA and ADCL proteins with experimentally verified enzyme activity, showed that the amino acids involved in substrate recognition in the active site seem to be quite conserved in these three groups. This allowed us to formulate different sequence motifs characteristic for DATA, BCAT and ADCL activity (Figure 13) Based - 17 -

The Search on these comparative considerations an annotation algorithm was developed using the amino acid sequence motifs identified. This enabled an easy exclusion of all enzyme candidates, which could clearly be assigned as BCAT, ADCL or DATA. The analysis of the remaining sequences aimed at identifying transaminase sequences that fulfill the requirements for the desired (R)-selective amine-TA activity. Using this algorithm, we analyzed about 5,700 sequences annotated as BCAT and 280 PLP-class IV- annotated protein sequences from the NCBI database. This search identified 21 sequences, seven from eukaryotes and 14 from prokaryotes. We ordered the synthetic genes, subcloned and expressed them in E. coli BL21 and isolated the recombinant proteins via affinity tag purification. Seventeen of the 21 putative (R)-selective amine-TA genes that had been identified were found to be (R)-selective amine transaminases. Three proteins could not be expressed in E. coli in sufficient amounts and one protein was found to have very low activity on the substrates studied. Specific activity and substrate range differed greatly between all of the enzymes investigated, which we find not surprising keeping in mind that the protein sequences originate from various microorganisms, recombinant expression was never reported before and specific activities within their natural function as well as the natural substrates are unknown.

Figure 14| Characterization of the discovered (R)-selective amine transaminases. Specific activities toward (R) and (S) enantiomers of amines 1-4 and amino acids 5 (D-alanine) and 6 (L-glutamate) are indicated by a color gradient. -KG, -ketoglutarate.

- 18 -

The Search

Nevertheless, 10 out of the 21 proteins exhibited a specific activity >0.5 U/mg towards at least one of the investigated amines (Figure 14), which is in the same activity range of known (S)-selective amine-TA55, 56. Consequently, these newly identified (R)-selective transaminases were applied in the asymmetric synthesis to yield optically pure (R)-amines (chapter 4).

The strength of this new in silico approach is that novel enzymes can be discovered very fast. On the contrary, directed evolution requires several rounds of random mutagenesis or iterative saturation mutagenesis in order to alter the enantioselectivity or substrate specificity, even if structural information is available to identify hot spots for mutagenesis. Usually more than 103–104 variants have to be screened for the identification of a mutant with desired properties. In contrast to this very low 'hit-rate', about 50 % of the putative proteins identified in our study turned out to be useful biocatalysts, which are an ideal starting point for further fine tuning and optimization by protein engineering. Hence, before creating mutants or libraries based on rational predictions in the laboratory, it is worthwhile investigating if nature has already designed the mutants and additionally optimized these catalysts over million years of natural evolution.

4 APPLICATION OF THE NEW BIOCATALYSTS ARTICLE V Before we applied the seven most promising candidates – with the highest specific activities towards -methylbenzylamine (-MBA) and sufficient expression levels – in asymmetric syntheses of various amines, we optimized the production of stable biocatalysts. During the initial biochemical characterization we noticed that several enzymes were rather unstable after affinity tag purification in the Tricine buffer used for the conductometric assay. Therefore, the crude extracts containing the seven R-ATAs (AspTer from Aspergillus terreus, AspFum from Aspergillus fumigatus, AspOry from Aspergillus oryzea, PenChry from Penicillium chrysogenum, NeoFis from Neosartorya fischeri, GibZea from Gibberella zeae and MycVan from Mycobacterium vanbaalenii) were simply subjected to ammonium sulfate precipitation and could be recovered fully active. Next, various additives – commonly used osmolytes such as glycerol, ethylenglycol, PEG 1550, trehalose, saccharose and sarcosine as well as the surfactant Tween 80 – were investigated for their stabilizing effect on the proteins during freeze-drying and storage as lyophilizate and in solution. In most cases saccharose or glycerol had the best stabilizing effect on the R-ATAs. All enzymes were stable for at least three months in lyophilized form and for at least two weeks in solution at 4°C.

With stable and active biocatalysts at hand, the next important challenge for a successful asymmetric synthesis of chiral amines was the identification of a good method to shift the equilibrium towards product formation. This, unfortunately, is necessary because the equilibrium constant of e.g. the standard reaction of -MBA and pyruvate is 8810057. Even a ten-fold excess of alanine in the reversed reaction – the asymmetric syntheses of -MBA – would yield a theoretical yield of only <1%58. Luckily, there have been some very convenient methods for shifting the equilibrium established. Most methods - 19 -

The Application remove the by-product pyruvate e.g. with a lactate dehydrogenase36 (LDH) or pyruvate decarboxylase59 (PDC) in order to pull the equilibrium to the product site. Another method uses alanine amino acid dehydrogenase to recycle alanine from the arising pyruvate, thus pulling and pushing the equlibrium towards product formation at the same time60. Maybe the most elegant method uses an excess of isopropylamine as amino donor and the arising acetone is evaporated at low pressure61. The by-product acetone can also be removed with an alcohol dehydrogenase62, but this requires – as all methods using dehydrogenases – recycling of the cofactor NAD(P)H. Based on our good experiences in previous projects, we investigated the use of PDC and the combination of LDH with glucose dehydrogenase (GDH); the latter gave the most satisfying results in our preliminary experiments.

Next, several buffers were investigated to identify the best reaction system for biocatalysis. Two major concerns led to these investigations: recent studies showed that the most frequently used phosphate buffer is not always the best choice for transaminases because of inhibitory effects of the phosphate ions – assumingly caused by blocking the of the phosphate group of the PLP-cofactor. On the other hand, we observed in previous studies that some Good’s buffers had an inhibitory effect on the S- ATA from Rhodobacter sphaeroides. Therefore, the initial activity towards -MBA in four different buffer systems (TRIS, HEPES, MOPS and BES) was compared to the activity in phosphate buffer. Altogether the influence of the different buffer systems was only moderate and hence phosphate buffer was chosen for subsequent experiments, as this buffer is known to be suitable for the LDH/GDH recycling system. Eventually, we determined the pH profiles of the seven enzymes using the spectrophotometric assay. The majority of the enzymes show a pH optimum between 8–9, which is also typical for S-ATAs such as the enzyme from Vibrio fluvialis55; AspTer and PenChr show a sharp optimum at pH 8.5–9 while AspOry, AspFum and NeoFis have a broader optimal range between 8–9. GibZea has a rather untypical pH optimum of 7.5 and loses already 30 to 70% activity at pH 8.5 and 9 respectively, whereas MycVan has a very broad optimum between pH 7.5–8.5 and still shows ~80% activity at pH 9. Nevertheless, in combination with the LDH/GDH system preliminary experiments gave the highest yields at pH 7.5.

Finally, we applied the seven stabilized biocatalysts to asymmetric syntheses of various amines (Figure 15) in the optimal reaction systems described above. For all substrates at least one R-ATA – and for various compounds several enzymes – could be identified, which allowed complete conversion of the ketone into the chiral amine with excellent enantioselectivity (>99% ee). This exceeded our expectations as the ketones represent a rather diverse set of compounds. In only a few cases, lower conversions were observed, but further optimization to increase the conversion was considered unnecessary in light of the other very useful R-ATAs.

- 20 -

The Application

100% ee 99% ee 95% ee 90% ee

80% ee

70% ee 50% ee 30% ee

15% ee

Figure 15| Conversion and enantiomeric excess of the amines obtained by asymmetric synthesis. Conversions are given in the order of the displayed products. The color gradient indicates the excess of the obtained (R) enantiomer, in case of VibFlu and RhoSph of the (S) enantiomer.

Although PenChr and AspTer show a rather high similarity score in a multiple sequence alignment (Figure 16), they showed very different performances. PenChr gave the lowest conversions, not exceeding 20% . AspTer on the other hand is a pretty good candidate, especially for aliphatic amines. The two other enzymes from Aspergillus species (AspOry and AspFum) turned out to be even better catalysts, as well as NeoFis. According to their sequence identity these enzymes are much alike, especially AspFum and NeoFis with a sequence identity of 96%. AspOry shares a similarity of 72% and 73%, respectively, and gave very good to excellent conversions for the aliphatic substrates, considerably less conversion for 68 (probably due to the sterical hindrance in -position) and again very good conversions for the arylaliphatic substrates. MycVan is the outsider of the investigated proteins, with similarity scores not exceeding 40%. The phylogenetic tree (Figure 16) shows that MycVan has the highest degree of relationship to the assumed ancestor, branched chain amino acid transaminase (BCAT). MycVan enabled good yields for the aliphatic and arylaliphatic compounds, moderate yields for 6a and 7a and did not like 8a at all.

- 21 -

The Application

Figure 16| Phylogenetic tree of the investigated R-ATAs. The figure shows all 21 hits from the database search and the biocatalysts applied in asymmetric synthesis are named. DATA (D-amino acid transaminase), ADCL (4-amino-4- deoxychorismate ) and BCAT (branched chain amino acid transaminase) are the other three members of the PLP-fold type IV.

Bringing together the observed performances during biocatalysis and the degree of relationship as shown in the phylogenetic tree (Figure 16), several correlations attract attention. MycVan, for example, is a good biocatalyst with a comparable performance as AspOry, AspFum and NeoFis, which show a high degree of relationship. Although GibZea shows an even higher similarity towards these excellent catalysts than MycVan, it showed only a rather poor performance. Interestingly, AspTer and PenChr are also strongly related, but showed very different performances.

For comparative purposes we also applied two S-ATAs from Vibrio fluvialis and Rhodobacter sphaeroides to asymmetric syntheses of the (S)-amines. We used the same reaction system as before and the yields were good to moderate. But more importantly, the (S)-selective enzymes were significantly less enantioselective as our (R)-selective enzymes (Figure 15). Just two amines (7 and 8) could be obtained with >99% ee. In literature, only two of our substrates (7 and 9) were used for asymmetric syntheses with the enzyme from V. fluvialis and the reported enantioselectivities match our results58, 59. In case of aliphatic amines (15), the Trp57Gly mutant yielded excellent enantiomeric excesses in kinetic resolutions57, but – on the contrary – did not show any activity in asymmetric syntheses63. The enzyme from R. sphaeroides has not been used for asymmetric syntheses in literature. For some substrates, our enzymes showed higher selectivities in comparison to the commercially available ATA-117, too. Although this enzyme shows high enantioselectivities towards 9 and short aliphatic amines, the synthesis of 7 yielded only 96% ee and of 2-aminooctane just 90% ee36.

- 22 -

The Conclusion

5 CONCLUSION In this thesis, two novel assay systems had been developed, which allow a fast and easy screening for amine transaminase activity as well as the characterization of the amino donor and acceptor specificity of a given amine transaminase. The assays overcome some limitations of previously described assays but of course have some limitations themselves. The relatively low wavelength of 245 nm, at which the production of acetophenone is detected with the spectrophotometric assay, limits the amount of protein/crude extract that can be applied, which eventually results in a decreased sensitivity at higher enzyme loads due to an increased initial absorbance. Otherwise, this assay can be used very easily for the investigation of the amino acceptor specificity and both pH and temperature dependencies of amine transaminases. The conductometric assay is – by its very nature – limited to low-conducting buffers, a neutral pH and constant temperatures. In summary, the assays complement one another very well and the complete characterization of the most important enzyme properties can be accomplished quickly.

Furthermore, we developed and applied a novel in silico search strategy for the identification of (R)- selective amine transaminases in sequence databases. Structural information of probably related proteins was used for rational protein design to predict key amino acid substitutions that indicate the desired activity. We subsequently searched protein databases for proteins already carrying these mutations instead of constructing the corresponding mutants in the laboratory. This methodology exploits the fact that naturally evolved proteins have undergone selection over millions of years, which has resulted in highly optimized catalysts. Using this in silico approach, we have discovered 17 (R)- selective amine transaminases. In theory, this strategy can be applied to other enzyme classes and fold types as well and for this reason constitutes a new concept for the identification of desired enzymes.

Finally, we applied the seven most promising candidates of the identified proteins to asymmetric synthesis of various optical pure amines with (R)-configuration starting from the corresponding ketones. We used a lactate dehydrogenase/glucose dehydrogenase system for the necessary shift of the thermodynamic equilibrium. For all ketones at least one enzyme was found that allowed complete conversion to the corresponding chiral amine with excellent optical purities >99% ee. Bearing in mind that until last year there was only one (R)-selective amine transaminase commercially available and two microorganisms with the corresponding activity described, the identification of numerous enzymes is a breakthrough in asymmetric synthesis of chiral amines.

- 23 -

The References

6 REFERENCES 1. Schmid, A., Dordick, J.S., Hauer, B., Kiener, A., Wubbolts, M. & Witholt, B. Industrial biocatalysis today and tomorrow. Nature 409, 258-268 (2001). 2. Teuscher, E., Melzig, M.F. & Lindequist, U. Biogene Arzneimittel. (Wissenschaftliche Verlagsgesellschaft, Stuttgart; 2004). 3. Mack, D.J., Weinrich, M.L., Vitaku, E. & Njardarson, J.T. (University of Arizona, Arizona; 2011). 4. Voet, D. & Voet, J.G. Biochemie. (VCH, Weinheim; 1994). 5. Breuer, M., Ditrich, K., Habicher, T., Hauer, B., Keßeler, M., Stürmer, R. & Zelinski, T. Industrial methods for the production of optically active intermediates. Angew. Chem. Int. Ed. 43, 788-824 (2004). 6. Nugent, T.C. & El-Shazly, M. Chiral amine synthesis – recent developments and trends for enamide reduction, reductive amination, and imine reduction. Adv. Synth. Catal. 352, 753-819 (2010). 7. List, B. Direct catalytic asymmetric α-amination of aldehydes. J. Am. Chem. Soc. 124, 5656-5657 (2002). 8. Li, C. & Xiao, J. Asymmetric hydrogenation of cyclic imines with an ionic Cp*Rh(III) catalyst. J. Am. Chem. Soc. 130, 13208-13209 (2008). 9. Cogan, D.A., Liu, G. & Ellman, J. Asymmetric synthesis of chiral amines by highly diastereoselective 1,2-additions of organometallic reagents to N-tert-butanesulfinyl imines. Tetrahedron 55, 8883-8904 (1999). 10. Hou, C.T. (ed.) Handbook of industrial biocatalysis. (CRC Press, Boca Raton; 2005). 11. Savile, C.K., Janey, J.M., Mundorff, E.C., Moore, J.C., Tam, S., Jarvis, W.R., Colbeck, J.C., Krebber, A., Fleitz, F.J., Brands, J., Devine, P.N., Huisman, G.W. & Hughes, G.J. Biocatalytic asymmetric synthesis of chiral amines from ketones applied to Sitagliptin manufacture. Science 329, 305-309 (2010). 12. Desai, A.A. Sitagliptin manufacture: a compelling tale of green chemistry, process intensification, and industrial asymmetric catalysis. Angew. Chem. Int. Ed. 50, 1974-1976 (2011). 13. Höhne, M. & Bornscheuer, U. Biocatalytic routes to optically active amines. ChemCatChem 1, 42- 51 (2009). 14. Alexeeva, M., Enright, A., Dawson, M.J., Mahmoudian, M. & Turner, N.J. Deracemization of α- methylbenzylamine using an enzyme obtained by in vitro evolution. Angew. Chem. Int. Ed. 114, 3309-3312 (2002). 15. Itoh, N., Yachi, C. & Kudome, T. Determining a novel NAD+-dependent amine dehydrogenase with a broad substrate range from Streptomyces virginiae IFO 12827: purification and characterization. J. Mol. Catal. B-Enzym. 10, 281-290 (2000). 16. Iwaki, M., Yagi, T., Horiike, K., Saeki, Y., Ushijima, T. & Nozaki, M. Crystallization and properties of aromatic amine dehydrogenase from Pseudomonas sp. Arch. Biochem. Biophys. 220, 253-262 (1983). 17. Govindaraj, S., Eisenstein, E., Jones, L.H., Sanders-Loehr, J., Chistoserdov, A.Y., Davidson, V.L. & Edwards, S.L. Aromatic amine dehydrogenase, a second tryptophan tryptophylquinone enzyme. J. Bacteriol. 176, 2922-2929 (1994). 18. Frey, P.A. & Hegemann, P. Enzymatic reaction mechanism. (Oxford University Press, New York; 2007). 19. Jansonius, J.N. Structure, evolution and action of vitamin B6-dependent enzymes. Curr. Opin. Struc. Biol. 8, 759-769 (1998).

- 24 -

The References

20. Percudani, R. & Peracchi, A. The B6 database: a tool for the description and classification of vitamin B6-dependent enzymatic activities and of the corresponding protein families. BMC Bioinformatics 10, 273 (2009). 21. Christen, P. & Metzler, D.E. (eds.) Transaminases. (Wiley, 1985). 22. Eliot, A.C. & Kirsch, J.F. Pyridoxal phosphate enzymes: mechanistic, structural, and evolutionary considerations. Annu. Rev. Biochem. 73, 383-415 (2004). 23. Koszelewski, D., Tauber, K., Faber, K. & Kroutil, W. ω-Transaminases for the synthesis of non- racemic α-chiral primary amines. Trends Biotechnol. 28, 324-332 (2010). 24. Kazlauskas, R.J. & Bornscheuer, U.T. Finding better protein engineering strategies. Nat. Chem. Biol. 5, 526-529 (2009). 25. Reetz, M.T., Wang, L.-W. & Bocola, M. Directed evolution of enantioselective enzymes: iterative cycles of CASTing for probing protein-sequence space. Angew. Chem. Int. Ed. 118, 1258-1263 (2006). 26. Fox, R.J., Davis, S.C., Mundorff, E.C., Newman, L.M., Gavrilovic, V., Ma, S.K., Chung, L.M., Ching, C., Tam, S., Muley, S., Grate, J., Gruber, J., Whitman, J.C., Sheldon, R.A. & Huisman, G.W. Improving catalytic function by ProSAR-driven enzyme evolution. Nat. Biotech. 25, 338-344 (2007). 27. Pace, N.R. A molecular view of microbial diversity and the biosphere. Science 276, 734-740 (1997). 28. Venter, J.C., Remington, K., Heidelberg, J.F., Halpern, A.L., Rusch, D., Eisen, J.A., Wu, D., Paulsen, I., Nelson, K.E., Nelson, W., Fouts, D.E., Levy, S., Knap, A.H., Lomas, M.W., Nealson, K., White, O., Peterson, J., Hoffman, J., Parsons, R., Baden-Tillson, H., Pfannkoch, C., Rogers, Y.-H. & Smith, H.O. Environmental genome shotgun sequencing of the Sargasso Sea. Science 304, 66-74 (2004). 29. Furnham, N., Garavelli, J.S., Apweiler, R. & Thornton, J.M. Missing in action: enzyme functional annotations in biological databases. Nat. Chem. Biol. 5, 521-525 (2009). 30. Kaplan, O., Bezouška, K., Malandra, A., Veselá, A., Petříčková, A., Felsberg, J., Rinágelová, A., Křen, V. & Martínková, L. Genome mining for the discovery of new nitrilases in filamentous fungi. Biotechnol. Lett. 33, 309-312 (2011). 31. Kino, K., Noguchi, A., Arai, T. & Yagasaki, M. Identification and characterization of a novel L- amino acid from Photorhabdus luminescens subsp. laumondii TT01. J.Biosci. Bioeng. 110, 39-41 (2010). 32. van Loo, B., Kingma, J., Arand, M., Wubbolts, M.G. & Janssen, D.B. Diversity and biocatalytic potential of epoxide hydrolases identified by genome analysis. Appl. Environ. Microbiol. 72, 2905-2917 (2006). 33. Wang, L.-J., Li, C.-X., Ni, Y., Zhang, J., Liu, X. & Xu, J.-H. Highly efficient synthesis of chiral alcohols with a novel NADH-dependent reductase from Streptomyces coelicolor. Bioresource Technol. 102, 7023-7028 (2011). 34. Machielsen, R., Uria, A.R., Kengen, S.W.M. & van der Oost, J. Production and characterization of a thermostable alcohol dehydrogenase that belongs to the aldo-keto reductase superfamily. Appl. Environ. Microbiol. 72, 233-238 (2006). 35. Fraaije, M.W., Wu, J., Heuts, D.P.H.M., van Hellemond, E.W., Spelberg, J.H.L. & Janssen, D.B. Discovery of a thermostable Baeyer–Villiger monooxygenase by genome mining. Appl. Microbiol. Biot. 66, 393-400 (2005). 36. Koszelewski, D., Lavandera, I., Clay, D., Rozzell, D. & Kroutil, W. Asymmetric synthesis of optically pure pharmacologically relevant amines employing ω-transaminases. Appl. Environ. Microbiol. 350, 2761-2766 (2008). 37. Iwasaki, A., Yamada, Y., Ikenaka, Y. & Hasegawa, J. Microbial synthesis of (R)- and (S)-3,4- dimethoxyamphetamines through stereoselective transamination. Biotechnology Letters 25, 1843-1846 (2003). - 25 -

The References

38. Matcham, G.W. & Bowen, A.R.S. Biocatalysis for chiral intermediates: meeting commercial and technical challenges. Chim. Oggi 14, 20-24 (1996). 39. Krebsfänger, N., Schierholz, K. & Bornscheuer, U.T. Enantioselectivity of a recombinant esterase from Pseudomonas fluorescens towards alcohols and carboxylic acids. J. Biotechnol. 60, 105-111 (1998). 40. Hinman, L.M. & Blass, J.P. An NADH-linked spectrophotometric assay for complex in crude tissue homogenates. J. Biol. Chem. 256, 6583-6586 (1981). 41. Holt, A. & Palcic, M.M. A peroxidase-coupled continuous absorbance plate-reader assay for flavin monoamine oxidases, copper-containing amine oxidases and related enzymes. Nat. Protocols 1, 2498-2505 (2006). 42. Hwang, B.-Y. & Kim, B.-G. High-throughput screening method for the identification of active and enantioselective ω-transaminases. Enzyme Micro. Tech. 34, 429-436 (2004). 43. Truppo, M.D., Rozzell, J.D., Moore, J.C. & Turner, N.J. Rapid screening and scale-up of transaminase catalysed reactions. Org. Biomol. Chem. 7, 395-398 (2009). 44. Hopwood, J., Truppo, M.D., Turner, N.J. & Lloyd, R.C. A fast and sensitive assay for measuring the activity and enantioselectivity of transaminases. Chem. Commun. 47, 773-775 (2011). 45. Williams, G.J., Woodhall, T., Farnsworth, L.M., Nelson, A. & Berry, A. Creation of a pair of stereochemically complementary biocatalysts. J. Am. Chem. Soc. 128, 16238-16247 (2006). 46. Melzer, M., Chen, J.C.H., Heidenreich, A., Gab, J., Koller, M., Kehe, K. & Blum, M.M. Reversed enantioselectivity of diisopropyl fluorophosphatase against organophosphorus nerve agents by rational design. J. Am. Chem. Soc. 131, 17226-17232 (2009). 47. Bartsch, S., Kourist, R. & Bornscheuer, U.T. Complete inversion of enantioselectivity towards acetylated tertiary alcohols by a double mutant of a bacillus subtilis esterase. Angew. Chem. Int. Ed. 47, 1508-1511 (2008). 48. Svedendahl, M., Branneby, C., Lindberg, L. & Berglund, P. Reversed enantiopreference of an ω- transaminase by a single-point mutation. ChemCatChem 2, 976-980 (2010). 49. Cahn, R.S., Ingold, C. & Prelog, V. Specification of molecular chirality. Angew. Chem. Int. Ed. 5, 385-415 (1966). 50. Okamoto, A., Ishii, S., Hirotsu, K. & Kagamiyama, H. The active site of Paracoccus denitrificans aromatic amino acid aminotransferase has contrary properties: flexibility and rigidity. Biochemistry 38, 1176-1184 (1999). 51. Goto, M., Miyahara, I., Hayashi, H., Kagamiyama, H. & Hirotsu, K. Crystal structures of branched- chain amino acid aminotransferase complexed with glutamate and glutarate: true reaction intermediate and double substrate recognition of the enzyme. Biochemistry 42, 3725-3733 (2003). 52. Sugio, S., Petsko, G.A., Manning, J.M., Soda, K. & Ringe, D. Crystal structure of a D-amino acid aminotransferase: how the protein controls stereoselectivity. Biochemistry 34, 9661-9669 (1995). 53. O'Rourke, P.E.F., Eadsforth, T.C., Fyfe, P.K., Shepherd, S.M. & Hunter, W.N. Pseudomonas aeruginosa 4-amino-4-deoxychorismate lyase: spatial conservation of an active site tyrosine and classification of two types of enzyme. PLoS One 6, e24158 (2011). 54. Wolf, Y., Madej, T., Babenko, V., Shoemaker, B. & Panchenko, A. Long-term trends in evolution of indels in protein sequences. BMC Evol. Biol. 7, 19 (2007). 55. Shin, J.S., Yun, H., Jang, J.W., Park, I. & Kim, B.G. Purification, characterization, and molecular cloning of a novel amine:pyruvate transaminase from Vibrio fluvialis JS17. Appl. Microbiol. Biot. 61, 463-471 (2003). 56. Hanson, R.L., Davis, B.L., Chen, Y., Goldberg, S.L., Parker, W.L., Tully, T.P., Montana, M.A. & Patel, R.N. Preparation of (R)-amines from racemic amines with an (S)-amine transaminase from Bacillus megaterium. Appl. Environ. Microbiol. 350, 1367-1375 (2008). - 26 -

The References

57. Yun, H., Hwang, B.-Y., Lee, J.-H. & Kim, B.-G. Use of enrichment culture for directed evolution of the Vibrio fluvialis JS17 ω-transaminase, which is resistant to product inhibition by aliphatic ketones. Appl. Environ. Microbiol. 71, 4220-4224 (2005). 58. Shin, J.-S. & Kim, B.-G. Asymmetric synthesis of chiral amines with ω-transaminase. Biotechnol. Bioeng. 65, 206-211 (1999). 59. Höhne, M., Kühl, S., Robins, K. & Bornscheuer, U.T. Efficient asymmetric synthesis of chiral amines by combining transaminase and pyruvate decarboxylase. ChemBioChem 9, 363-365 (2008). 60. Koszelewski, D., Lavandera, I., Clay, D., Guebitz, G.M., Rozzell, D. & Kroutil, W. Formal asymmetric biocatalytic reductive amination. Angew. Chem. Int. Ed. 120, 9477-9480 (2008). 61. Matcham, G., Bhatia, M., Lang, W., Lewis, C., Nelson, R., Wang, A. & Wu, W. Enzyme and reaction engineering in biocatalysis: synthesis of (S)-methoxyisopropylamine (= (S)-1- methoxypropan-2-amine). Chimia 53, 584-589 (1999). 62. Cassimjee, K.E., Branneby, C., Abedi, V., Wells, A. & Berglund, P. Transaminations with isopropyl amine: equilibrium displacement with alcohol dehydrogenase coupled to in situ cofactor regeneration. Chem. Commun. 46, 5569-5571 (2010). 63. Koszelewski, D., Göritzer, M., Clay, D., Seisser, B. & Kroutil, W. Synthesis of optically active amines employing recombinant ω-Transaminases in E. coli cells. ChemCatChem 2, 73-77 (2010).

- 27 -

The Articles

AUTHOR CONTRIBUTION

Article I Discovery and Protein Engineering of Biocatalysts for Organic Synthesis G. A. Behrens*, A. Hummel*, S. K. Padhi*, S. Schätzle*, U. T. Bornscheuer*, Adv. Synth. Catal. 2011, 353, 2191-2215

* All authors contributed equally.

Article II Rapid and Sensitive Kinetic Assay for Characterization of ω-Transaminases S. Schätzle, M. Höhne, E. Redestad, K. Robins, U. T. Bornscheuer, Anal. Chem. 2009, 81, 8244-8248

S.S. designed and performed the experiments with the help of M.H. and E.R. U.T.B, M.H. and S.S. wrote the manuscript.

Article III Conductometric Method for the Rapid Characterization of the Substrate Specificity of Amine-Transaminases S. Schätzle, M. Höhne, K. Robins, U. T. Bornscheuer, Anal. Chem. 2010, 82, 2082-2086

S.S. designed and performed the experiments. M.H. conceived the original idea. U.T.B. and S.S. wrote the manuscript.

Articel IV Rational Assignment of Key Motifs for Function guides in silico Enzyme Identification M. Höhne*, S. Schätzle*, H. Jochens, K. Robins, U. T. Bornscheuer, Nature Chem. Biol. 2010, 6, 807–813

* These authors contributed equally. S.S. coordinated the comparative characterization of all proteins and performed cloning, expression, purification, data collection and data analysis. M.H. designed the in silico strategy, devised the annotation algorithm and performed the database search and identification of the putative amine transaminases. M.H. expressed and confirmed amine transaminase activity and (R)-selectivity for the first three proteins. H.J. contributed to gene cloning, protein expression and activity measurements. K.R. and U.T.B. initiated the project. U.T.B. and M.H. cowrote the paper, and all authors read and edited the manuscript.

Article V Enzymatic Asymmetric Synthesis of Enantiomerically Pure Aliphatic, Aromatic and Arylaliphatic Amines with (R)-Selective Amine Transaminases S. Schätzle, F. Steffen-Munsberg, A. Thontowi, M. Höhne, K. Robins, U. T. Bornscheuer, Adv. Synth. Catal. 2011, 353, 2439-2445

S.S designed the experiments. S.S. and F.S.M performed the experiments with the help of A.T. U.T.B and S.S. wrote the manuscript.

Confirmed, Greifswald, Nov. 22nd 2011 Uwe Bornscheuer

- 28 -

The Articles

ARTICLE I

- 29 -

REVIEWS

DOI: 10.1002/adsc.201100446 Discovery and Protein Engineering of Biocatalysts for Organic Synthesis

Geoffrey A. Behrens,a,b Anke Hummel,a,b Santosh K. Padhi,a,b Sebastian Schtzle,a,b and Uwe T. Bornscheuera,b,* a Department of Biotechnology & Enzyme Catalysis, Institute of Biochemistry, Greifswald University, Felix-Hausdorff-Str. 4, 17487 Greifswald, Germany Fax : (+49)-3834-86-794367; e-mail: [email protected] b All authors contributed equally

Received: May 30, 2011; Revised: July 13, 2011; Published online: September 5, 2011

Abstract: Modern tools for enzyme discovery and 3.2 Methods for Rational Protein Design protein engineering substantially broadened the 3.3 Semi-Rational Design/Focused Directed Evolu- number of enzymes applicable for biocatalysis and tion helped to alter their properties such as substrate 4 Selected Examples range, enantioselectivity, and stability under process 4.1 Transaminases conditions. In addition, these methods also enabled 4.2 Enoate Reductases one to explore reactions for organic synthesis for 4.3 Esterases for Tertiary Alcohol Synthesis which no suitable enzymes were available until re- 4.4 Monoamine Oxidases cently. This review provides a summary of the differ- 4.5 Dehalogenases ent concepts and technologies, which are exemplified 4.6 Aldolases for various enzymes. 4.7 P450-Monooxygenases 4.8 Baeyer–Villiger Monooxygenases 1 Introduction 5 Biocatalysts for New Reactions 2 Tools for Enzyme Discovery 6 Conclusions 2.1 Metagenome Approach 2.2 Database Mining Keywords: biocatalysis; directed evolution; enantio- 3 Tools for Protein Engineering selectivity; enzyme catalysis; protein engineering; ra- 3.1 Methods for Directed Evolution tional protein design

1 Introduction biocatalysis challenge. Similarly, for reduction of ke- tones to yield optically active chiral alcohols either The earliest examples for the application of enzymes baker’s yeast [notably containing various ketoreduc- as biocatalysts for organic syntheses date back more tases/alcohol dehydrogenases (ADH) with different than a century. For instance, Rosenthaler used a plant selectivities] or a few animal-derived (such as horse extract for the synthesis of (R)-mandelonitrile from liver ADH) or microbial (such as Thermoanaerobium benzaldehyde and hydrogen cyanide in 1908.[1] Still, brockii ADH, Lactobacillus species ADH) enzymes until about 25 years ago, researchers who wished to could be investigated. For other enzyme classes, avail- use biocatalysts for organic synthesis had access to ability was a major hindrance and only minute only a handful of commercially available enzymes for amounts could be ordered at high price, as was the a given reaction. For instance, about 5–10 lipases case for, e.g., an aldolase from rabbit muscle from a few suppliers were usually at hand to find a (RAMA) before the enzyme could be cloned and re- biocatalyst with desired activity towards a certain sub- combinantly expressed.[2] Although researchers could strate and if none of them showed sufficient activity demonstrate the synthetic utility of RAMA, gram- and especially enantioselectivity, then variations such scale synthesis was almost unaffordable, larger scale as use of different solvents (medium engineering) or synthesis impossible. alternative acyl donors in transesterifications (or In order to find a new biocatalyst even for a similar esters of different chain-length in hydrolysis; substrate reaction one had to screen many microbial sources in engineering) were the means to find a solution for the extensive screening programs. Altering important bio-

Adv. Synth. Catal. 2011, 353, 2191 – 2215 2011 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim 2191 REVIEWS Geoffrey A. Behrens et al.

postdoctoral positions at the University of Florida, Gainesville, USA with Jon D. Stewart, and Universi- ty of Minnesota, Minneapolis, USA with Romas J. Kazlauskas. Currently he is a postdoctoral associate at the University of Greifswald in the group of Uwe Bornscheuer as an Alexander von Humboldt re- search fellow. His research interests include enzyme catalysis and enzyme discovery by protein engineer- ing for organic synthesis, especially for asymmetric synthesis.

Sebastian Schtzle (born 1983) studied Biochemistry in Greifswald. During his diploma thesis in 2008 he characterised several nitrile hydratases. Currently he is working on his PhD thesis on the identification and application of novel (R)-selective amine transa- Geoffrey A. Behrens (born 1985) studied biochem- minases under supervision of Uwe Bornscheuer. istry in Greifswald, Germany. In 2010 after finishing During his studies, he stayed in Hat Yai, Thailand his diploma thesis, he started his PhD thesis on for a short research internship. amino acid and amino acid derivative-converting en- zymes under supervision of Uwe Bornscheuer. His Uwe T. Bornscheuer (born 1964) studied chemistry main focus is on protein engineering, by rational and completed his doctorate in 1993 at the Universi- design and by evolutive methods, of these enzymes. ty of Hannover. He then was a postdoc at the Uni- versity of Nagoya, Japan. In 1998, he completed his Anke Hummel (born 1981) studied biochemistry in Habilitation at the University of Stuttgart at the In- Greifswald, Germany. During her diploma thesis in stitute of Technical Biochemistry. He has been Pro- 2007 she identified and characterised novel isoen- fessor at the Institute of Biochemistry at the Univer- zymes of pig liver esterase. Currently she is working sity of Greifswald since 1999. Bornscheuer has on her PhD thesis on the isolation of novel enzymes edited and written several books, is Editor-in-Chief from metagenome sources under supervision of of the Eur. J. Lipid Sci. Technol. and Co-Chairman Uwe Bornscheuer. During her studies she stayed of the recently launched journal ChemCatChem.In both in Reykjavk, Iceland and in Lund, Sweden for 2008, he received the BioCat2008 Award for his in- a half-year research internship. novative work on tailored biocatalysts for industrial applications. His current research interest is focusing Santosh Kumar Padhi (born 1977) studied chemistry on protein engineering of enzymes from various and obtained his Ph.D. (2005) from the Indian Insti- classes with special emphasis on the synthesis of tute of Technology Madras, India with supervision chiral compounds and in lipid modification. by Anju Chadha. Between 2005 and 2009, he held catalytic properties such as substrate range or selec- structures of proteins was then already possible, but tivity, which are essential for using a biocatalyst in an modelling for the prediction of positions for mutagen- organic reaction, remained a challenge for a long esis was mostly limited to bioinformatic experts able time. The target to find a biocatalyst with high robust- to program UNIX, and computers and software were ness, thermostability, resistance to organic solvents very expensive. Gene identification, cloning and re- and further process-related conditions often remained combinant expression of proteins in microorganisms unfulfilled, which prevented the area of enzyme catal- were also not so easy and mostly performed by expert ysis from being competitive with chemicals routes for microbiology groups. decades. Although the discovery of recombinant This situation has changed dramatically in the past DNA technology in the late 1970s was a major break- two decades, fostered mostly by progress in molecular through, it still did not solve the issue of generating biology and bioinformatics, but the reasons are many. an enzyme in the laboratory with desired biocatalytic For instance, sequencing of genomes and entire mi- properties. Consequently, many attempts to establish crobial consortia now takes months not years; cheap, biocatalytic reactions suffered from the lack of avail- easy and fast access to synthetic genes circumvents able enzymes and easy methods to alter their proper- many time-consuming steps traditionally starting from ties. Of course, rational protein design based on X-ray growth of a microorganism, protein purification, N-

2192 asc.wiley-vch.de 2011 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim Adv. Synth. Catal. 2011, 353, 2191 – 2215 Discovery and Protein Engineering of Biocatalysts for Organic Synthesis

Figure 1. Overview of approaches for protein discovery and engineering by rational, evolutionary or combined methods. terminal sequencing, genomic DNA (or mRNA) iso- dious identification of best conditions for recombi- lation, eventually succeeding in the identification of nant enzyme production. Furthermore, protein engi- the protein encoding gene and its recombinant ex- neering is not restricted any longer to rational design pression. Consequently, the time scale to have the re- as the invention of directed (molecular) evolution in combinant protein at hand is now reduced from many the early 1990s based on simple random mutagenesis months and often years to just a few weeks. Impres- by methods such as error-prone PCR in combination sive examples are the discovery and characterization with high-throughput screening makes creation of li- of > 150 nitrilases in metagenome libraries by DeSan- braries and identification of desired enzymes relative- tis et al.[3] , while only a few nitrilases were found ly easy. This now allows researchers to identify better before, during several decades of microbial screening. enzymes even if the knowledge about their function is We have recently discovered 20 sequences encoding rather small. (R)-selective transaminases by an in silico screening Fostered by these important achievements on the approach[4] and the cloning, production and prelimi- technology side, modern biocatalysis using protein nary characterization of these novel enzymes took discovery and engineering now enables many academ- just a few months. ic scientists as well as industrial researchers to estab- Models of protein structure can nowadays be auto- lish enzymatic routes for the synthesis of chemicals matically generated (e.g., by Phyre[5] , SwissModell[6] that are highly competitive to organic synthesis. or Robetta[7]), software for molecular modelling now Hence, numerous novel biocatalytic processes have runs on standard personal computers and is rather been implemented on the multi-kilogram to ton scales easy to use (e.g., Yasara[8]), tools for sequence com- since then, often replacing existing chemical routes. parison (e.g., BLAST: http://blast.ncbi.nlm.nih.gov/ The most recent and impressive example is the devel- Blast.cgi[9]) and alignment (e.g., ClustalW: http:// opment of an engineered (R)-selective transaminase www.ebi.ac.uk/Tools/msa/clustalw2/[10]) enable quick and its implementation for the highly efficient and identification of novel sequences and conserved enantioselective synthesis[11] of the drug Sitagliptin, motifs. A plethora of expression systems and chaper- which replaced the transition metal-catalyzed hydro- one kits are available to remove the bottleneck of te- genation reaction. Apart from the industrial exam-

Adv. Synth. Catal. 2011, 353, 2191 – 2215 2011 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim asc.wiley-vch.de 2193 REVIEWS Geoffrey A. Behrens et al. ples, a large number of commercial enzymes (listed in step. For this, the complete genomic DNA is extracted the key references cited below) are available today, from an environmental sample, fragmented and which allows many organic chemistry researchers to cloned yielding the corresponding metagenome libra- explore their synthetic potential. ries.[17] These libraries can then be screened either for It would be impossible to give a full coverage of the desired activity (with a function-based assay), or the achievements made in the past decades in the in a sequence-based approach in which genes are field of protein engineering of enzymes for organic identified based on homology to already described synthesis in this article. Many examples are covered enzymes, or the entire DNA is sequenced. Both ap- together with applications in books[12] or (most proaches have their advantages and disadvantages. recent) reviews.[13] This review rather aims to provide The hit rate in the activity-based approach is usually an overview about principle approaches for protein lower because the gene needs to be functionally ex- discovery (Figure 1), their engineering by rational and pressed in the host microorganism, which requires evolutionary methods and hopes to provide insights correct positioning of the metagenome sequence; pro- to facilitate success when facing future protein engi- moter and codon usage must fit to the host and cor- neering problems. In addition, selected examples for rect post-translational modification of the enzyme different enzyme classes and industrial processes are must be ensured. If enzyme activity depends on sever- given to demonstrate the usefulness of biocatalysts al subunits or requires additional proteins, e.g., for for organic synthesis. electron transfer, the functional expression becomes even more challenging. On the other hand, enzymes discovered by this activity-based approach are already 2 Tools for Enzyme Discovery cloned and accessible in a functional way. As the costs and time required for DNA-sequencing substan- Despite the high number of already described en- tially dropped by orders of magnitude in the past few zymes, there is a constant need for novel biocatalysts years, entire metagenome libraries can now be se- with altered properties such as substrate scope, enan- quenced at reasonable budgets and timelines. This has tioselectivity or -preference, thermostability, pH pro- the advantage that novel enzymes and even complete file, solvent tolerance, etc. To meet these demands, pathways can be discovered quickly using bioinfor- there are basically three strategies: (i) search for new matic tools. Together with sequence alignments from enzymes, (ii) modification of already characterized en- databases, this approach facilitates the identification zymes, or (iii) de novo design of completely new bio- of novel enzymes and the choice of an appropriate ex- catalysts as a potential, but not yet established route pression system. Current reviews nicely summarize re- (Figure 1). cently discovered enzymes using both approaches.[18] In the activity-based approach the major limitation is the availability of reliable high-throughput assays. 2.1 Metagenome Approach This is the reason why still most of the enzymes iso- lated from metagenome libraries are hydrolases.[19] Traditionally, access to new enzymes was gained Although esterases, lipases and proteases form the through enrichment selection of microorganisms by largest group, an increasing number of other enzyme growth under limiting conditions or enzymes from classes has been accessed from metagenome libraries, plant or animal tissues were applied as crude extracts. amongst them several industrially interesting polysac- This has been overcome by the development of re- charide-degrading enzymes like xylanases[20] , gluca- combinant DNA techniques, which have substantially nases[21] , or cellulases[22] which are of interest for the simplified the access to enzymes and now allow the production of bulk and fine chemicals from renewable production of biocatalysts at stable quality as was resources or b-galactosidases for the food industry.[23] shown, for instance, for the hydroxynitrile lyase from Hevea brasiliensis[14] or pure isoenzymes of pig liver esterase.[15] 2.2 Database Mining However, only a small number of microbes can suc- cessfully be cultivated in the laboratory. Depending Besides the classical enrichment cultivation and the on the habitat, it was estimated that <1% of the mi- metagenome approach, a third way to access new en- crobes present in an environmental sample can be zymes is in silico screening. Protein sequences ob- grown under standard laboratory conditions.[16] Rea- tained from sequencing of single enzymes, entire ge- sons include the facts that proper cultivation condi- nomes or microbial consortia (i.e., from the Sargasso tions are unknown, growth rates are too slow or the Sea[24]), are deposited in public databases and current- growth of a microbe is linked to other species provid- ly >12 million protein sequences are available (http:// ing nutrients. The metagenome approach overcomes www.ncbi.nlm.nih.gov/RefSeq/). However, only a tiny these limitations by circumventing the cultivation fraction of all these sequences correspond to enzymes,

2194 asc.wiley-vch.de 2011 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim Adv. Synth. Catal. 2011, 353, 2191 – 2215 Discovery and Protein Engineering of Biocatalysts for Organic Synthesis which have been produced in the laboratory and their 3 Tools for Protein Engineering biochemical properties and function have been exper- imentally confirmed. For the vast majority, a possible Two main strategies for protein engineering are used function is postulated by sequence comparison, but to adjust enzyme properties towards the desired appli- this annotation can often be wrong or at least mis- cation: rational design and directed (molecular) evo- leading, as shown in two recent examples from our lution.[13d,37] group for a monooxygenase[25] and transaminases.[4] Rational design uses structural and mechanistic in- Discovering novel biocatalysts in public databases is formation and molecular modelling for the prediction easier the more information about the desired of changes in the protein structure in order to alter or enzyme (class) exists. If several enzymes with a dis- induce the desired properties. The advances in com- tinct activity are already described in the literature, puter technology have helped in creating better pro- an alignment of their protein sequences together with tein models to improve predictions for rational experimentally confirmed activities or specificities can design, but structure-activity relationships are still not reveal conserved regions within the sequences and a trivial. In directed evolution, mutant libraries are cre- simple BLAST-search can yield new and useful bio- ated by random changes, screened for the desired catalysts. There are numerous examples for the dis- property and the variants showing promising results covery of new enzymes in silico including nitrilases[26] , are subjected to further rounds of evolution. Both a ligase[27] , an epoxide [28] and a reductase strategies have their advantages and limitations. for the highly selective reduction yielding ethyl (S)-4- Nowadays, more researchers use combined methods chloro-3-hydroxybutanoate with excellent optical of these two strategies, dubbed focused-directed evo- purity (99% ee) at high substrate concentration lution or semi-rational design. Here, information (600 g/L).[29] By curtailing the sequences to be available from related protein structures, families and searched through, one is even able to increase the mutants already identified is combined and then used chances that the new catalysts fulfil additional crite- for targeted randomisation of certain areas of the pro- ria. Scanning the genomes of (hyper-)thermophilic or- tein (Figure 1). ganisms, for example, will increase the chances to find a thermostable enzyme. Thus, Machielsen et al. identi- fied a thermostable ADH in the genome of Pyrococ- 3.1 Methods for Directed Evolution cus furiosus with a half-life of 130 min at 1008C[30] and Fraaije et al. discovered a thermostable Baeyer– All methods used in directed evolution share a Villiger monooxygenase with a promiscuous sulphur common feature: they create random changes in the oxidizing activity.[31] protein sequence. Two different approaches are used Many more examples can be found in recent re- in directed evolution, recombining and non-recombin- views. Stewart, for example, illustrated the potential ing. The first uses a set of related sequences, which of genome mining strategies exemplified for yeast de- are randomly combined (e.g., gene shuffling[38]) hydrogenases[32] while Furuya et al. focused on aspects whereas the latter focuses on changing one single pro- of the genome mining approach that are relevant for tein sequence [e.g., error-prone polymerase chain re- the discovery of novel P450 biocatalysts.[33] The iden- action (epPCR[39]) or site saturation mutagenesis[40]]. tification of biosynthetic gene clusters of indole alka- Common methods for directed evolution of enzymes loids by this approach was summarized recently by are covered in depth elsewhere.[41] Li.[34] Directed evolution is a powerful tool in enzyme With increasing computing power and better soft- design, but still with limitations. For example, the ware, scientists are now able to screen whole virtual random exchange of three amino acids in a 200 amino mutant libraries in silico. Juhl et al. applied rational acid protein leads to over 9 billion possible combina- design to the lipase CAL-B to expand the range of tions. The screening of these huge libraries can hardly carboxylic acids converted, which is mainly limited to be performed in microtiter plates, but there are meth- non-branched fatty acids. After screening a total of ods to access libraries that large, for example, fluores- 2,400 mutants in silico with multiple docking experi- cence-activated cell sorting (FACS) allows the screen- ments and expressing 11 variants in E. coli, they ing of up to 108 variants per day.[42] Furthermore, the found one single mutant T138S with a 5-fold in- simultaneous measurement of pooled mutants and creased activity in the conversion of isononanoic later deconvolution of active variants can help to acid.[35] A comparison between the power of in vivo screen more mutants per time frame. An interesting vs. in silico screening of mutants was recently been method using Monte-Carlo simulations has been made by Barakat et al. as exemplified for the rational proven to successfully find active variants among redesign of an instable variant of Streptococcal pro- many inactive clones,[43] but still a very sensitive activ- tein G.[36] ity assay is needed.

Adv. Synth. Catal. 2011, 353, 2191 – 2215 2011 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim asc.wiley-vch.de 2195 REVIEWS Geoffrey A. Behrens et al.

Another way to reduce library size is to eliminate Bearing in mind that mutations can have additive or unfolded or unstable proteins. One strategy for this cooperative effects, the order of positions to be mu- elimination is to mimic the second major mechanism tated and screened might be crucial, leading to differ- of biological evolution: genetic drift. Genetic drift is ent pathways. Following all pathways is work inten- the accumulation of random changes, while maintain- sive, as each step in a pathway includes a whole li- ing normal function. The assumption in this approach brary that needs to be screened.[49] Weinreich and co- is that most variants with multiple substitutions are workers tested all 120 paths to an improved variant of unstable. By screening for those that maintain the a b-lactamase and found that only 18 paths improved original function, one ensures that the protein is still resistance to b-lactams at each stage.[50] Similarly, only properly folded.[44] 8 paths increased the enantioselectivity of an epoxide hydrolase at each step (55 paths if one includes neu- tral mutations).[51] 3.2 Methods for Rational Protein Design Introduction of statistical methods can also greatly improve mutagenesis strategies. A computer program Rational design relies on protein structures to acquire analyses sets of sequence-activity data, preferentially information about possible and beneficial mutation gained from different mutagenesis experiments and sites. X-ray structures are not always at hand, thus a extracts information about the influence of individual need for more accurate models has arisen. Molecular mutations. The ProSAR-strategy (protein sequence modelling uses force fields,[45] which are based on ex- activity relationships) developed by Fox et al.[52] was perimental data to mimic protein behaviour as exactly used to guide the development of a halohydrin deha- as possible,[6,8] while on the other hand using not too logenase by the data obtained from different protein much computational power for large proteins. Hybrid engineering approaches (rational design, random mu- models are used to enhance modelling capabilities tagenesis, gene shuffling, saturation mutagenesis), while maintaining similar calculation speeds.[47] The sorting the single mutations into beneficial, neutral structures are then evaluated for proposed mutations and deleterious ones and keeping only the beneficial and the promising changes identified in silico are then mutations for further rounds of mutagenesis. The au- actually inserted into the target protein by site-direct- thors highlight this feature, as the problem with ed mutagenesis. random directed evolution might be that certain mu- tations present in positive mutants will undoubtedly be taken along into further mutation rounds, although 3.3 Semi-Rational Design/Focused Directed the mutation itself might not be beneficial. ProSAR Evolution detects and eliminates these. The final variant in the halohydrin dehalogenase example contained 35 amino Both strategies mentioned above have drawbacks that acid substitutions leading to a 4,000-fold increased limit their practical use. Rational design can be re- volumetric productivity in the synthesis of a Lipitor stricted because quite often the information needed is side chain.[52] lacking or incomplete and extensive computation To increase enzyme stability against high tempera- might be necessary. On the other hand, directed evo- ture or organic solvents, the “consensus approach” lution is limited by the size of the library to be can be the key to success.[53] It is assumed that the screened versus the availability of fast and reliable most abundant amino acids at each position in a set high-throughput screening methods (Figure 1, top). of homologous enzymes contribute more than average A combination of both is often an alternative to to protein stability. Comparison of sequences within overcome the limitations of each method. Focusing, large enzyme families can identify conserved and dif- for example, on certain enzyme areas that are known fering amino acids, which then guides the planning of to be linked to the desired property, can drastically mutations to be introduced into the starting protein. reduce the screening effort. The combinatorial active- This strategy was used to increase thermostability and site saturation test (CAST) is a method pursuing this organic solvent tolerance of a glucose-1-dehydrogen- strategy. For this, a small set of amino acids in the vi- ase, raising the apparent melting point in the best var- cinity of the active site is chosen and mutated ran- iant to > 808C and the half life in organic solvents by domly followed by screening for best hits. One single >2500-fold.[54] In another approach, a database CAST approach can result in an impressive enhance- (3DM) with 1,751 structurally related proteins from ment of enzyme features as demonstrated for an ep- the a/b-hydrolase enzyme superfamily served as basis oxide hydrolase,[48] but this procedure is by definition to create “small, but smart” libraries.[55] This 3DM not an evolutionary one. ISM, iterative saturation mu- analysis-based library was designed to cover only tagenesis, introduces this evolutionary factor into the amino acids frequently occurring at a given position CASTing strategy. The best variants from a first in this superfamily. This concept enabled one to im-

CASTing round serve as template for the next round. prove the enantioselectivity (Emutant =80, EWT =3.2) of

2196 asc.wiley-vch.de 2011 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim Adv. Synth. Catal. 2011, 353, 2191 – 2215 Discovery and Protein Engineering of Biocatalysts for Organic Synthesis an esterase from Pseudomonas fluorescens (PFE) by very difficult. However, directed evolution enabled mutating four residues near the active site and also the identification of ATAs with strongly increased improved the activity up to 240-fold.[56] This method specific activity or reduced substrate and product in- was also used to increase the thermoactivity of PFE hibition. In a very interesting directed evolution by 88C.[57] In another example, identification and mu- study, Matcham and co-workers focused directly on tagenesis of flexible residues using the B-FIT ap- avoiding product inhibition in the preparation of 2- proach in combination with ISM gave drastically im- amino-1-methoxypropane.[120] The wild-type transami- proved thermoactivity (from 45 to 93 8C)[58] as well as nase underwent significant product inhibition so that enhanced stability in organic solvents for a lipase with 1M methoxyacetone and 1.5 equivalents of iso- from Bacillus subtilis.[59] propylamine only 12% conversion could be achieved, although this was not caused by an unfavourable equilibrium (the equilibrium constant was 8). Thus, 4 Selected Examples the transaminase was improved by three rounds of random mutagenesis and screening, and an enzyme

Enzymes such as lipases, esterases and oxidoreductas- was obtained from which, due to a higher Ki value for es dominated the last two decades with many applica- substrates and products, 65% conversion could be ob- tions in organic synthesis, especially targeting enantio- tained without any need to shift the equilibrium. An pure compounds. In the past decade several enzymes explanation for the lowered inhibition constants may from other classes were discovered, characterised, be the increased KM values for both substrates. As a cloned and engineered, to meet a range of challenges next step, five rounds of directed evolution were car- in organic synthesis and many are used already in in- ried out yielding a variant with enhanced thermal and dustry. The examples given below mostly refer to chemical stability so that the removal of acetone (gen- enzyme classes, which came into focus in the past five erated from the isopropylamine) at 50 8C under re- years although they had been known and studied duced pressure was possible. The most effective since several decades. We believe that the modern mutant afforded > 99% optically pure (S)-2-amino-1- tools for protein discovery and engineering covered methoxypropane from 2M methoxyacetone and 2.5M above were crucial to make them attractive biocata- isopropylamine at 93% yield after 7 h. Since the cost lysts and to meet synthetic chemistry demands. The of the biocatalyst was rather low (5% of the overall examples given below in more detail and surveyed in cost), it could be discarded after each production Table 1 provide an overview of enzymes improved by batch. protein engineering being important for organic syn- In a more recent study, a mesophilic (S)-selective thesis. amine transaminase from Arthrobacter species was used to produce optically pure substituted aminotetra- line.[121] Although a high excess of isopropylamine as 4.1 Transaminases amino donor was used, the equilibrium still disfav- oured product formation. To allow for an efficient Transaminases (TA, EC 2.6.1.x) catalyse the transfer asymmetric synthesis, first, the catalyst had to be opti- of an amino group from an amino donor (amine or a- mised to withstand high amounts of isopropylamine amino acid) to an amino acceptor (ketone or a-keto and reaction temperatures of 55 8C, so that acetone acid) utilising the cofactor pyridoxal 5’-phosphate distillation could shift the equilibrium, and, second, (PLP). This reaction can be performed either as ki- the specific activity for tetraline had to be increased netic resolution of a racemic amine (Scheme 1, left) to reduce the reaction time. Five rounds of random or as an asymmetric synthesis starting from a proster- mutagenesis resulted in a biocatalyst with a tempera- eogenic ketone (Scheme 1, right). By means of a com- ture optimum at 558C and a 280-fold increased turn- bination of both strategies, deracemisation is also pos- over frequency while the Michaelis constant KM sible: the ketone generated in a kinetic resolution can stayed constant at about 10 mM. For a specific large- subsequently be used as substrate in an asymmetric scale process the authors were able to reduce the re- synthesis employing a TA of opposite enantioprefer- action time from more than 48 h to less than 12 h and ence.[119] to reduce the enzyme loading 3-fold. Whereas a-transaminases require the presence of a Apart from these successful examples for applying carboxylic acid group in the a-position to the keto or directed evolution to improve the performance of amine functionality and hence only allow the forma- transaminases, there are several examples for success- tion of a-amino acids, amine transaminases (ATA) ful rational redesign of ATAs, too, although no crystal are much more versatile as they in principle accept structure of any amine transaminase has been solved any ketone or amine. (the first crystals could be obtained only recently).[122] Unfortunately, there is still no crystal structure of Cho et al. redesigned the substrate specificity of the an ATA available and hence rational protein design is extensively studied ATA from Vibrio fluvialis by a

Adv. Synth. Catal. 2011, 353, 2191 – 2215 2011 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim asc.wiley-vch.de 2197 REVIEWS Geoffrey A. Behrens et al.

Table 1. Survey of recent examples of enzymes optimized by protein engineering.[a]

Enzyme Methods Result and Comments Ref Thermostability [60] Papain RD Tmax for enzymatic activity increased by 158C for best triple mutant Xylanase RD Half-life at 50 8C and catalytic efficiency increased 15- and 1.3-times, respec- [61] tively [62] Pyranose 2-oxidase SRD Melting temperature (TM) increased by 148C and improved catalytic proper- ties Dioxygenase RD Optimum temperature and stability at basic pH increased by introduction of [63] bond Phytase epPCR 20% improvement of thermostability and 6–78C increased melting tempera- [64] ture Lipase RD Residual activity at 608C after 6 h increased from 20% to >70% by introduc- [65] tion of disulfide bond b-Keto ester reductase epPCR Residual activity at 458C after 2 h increased by single point mutation from 0 [66] to 64% accompanied by increased enantioselectivity [67] Parathion hydrolase RD TM increased by 38C and catalytic efficiency doubled by exchange of glycine to proline b-Amylase RD Half-denaturation time at 608C increased from 6 to 35 min by the so called [68] “ancestral mutation method“ Endoglucanase epPCR Triple mutant had 92% longer half-life at 608C on carboxymethyl cellulose [69] Cellulase Hybrid. Half-life at 63 8C, pH 4.8 increased from 1.5 h to ~50 h [70] [71] Asparaginase StEP, SSM TM increased by 108C and half-life at 508C increased from 3 to 160 h [72] Lipase B-FIT Increased thermolability with T50 value reduced from 728Cto368C General stability Endoxylanase RD Improved stability under acidic conditions (pH 3-4) [73] Pancreatic lipase RM & RD Activity at acidic pH enhanced to 50% [74] Mannanase RD & shuffling Activity and stability increased 3- and 7-fold, respectively [75] Lipase ISSM Increased stability towards acetonitrile, dimethyl sulfoxide, dimethylforma- [58] mide Substrate specificity Monooxygenase RD 190-fold increased initial oxidation rate for 2-phenylethanol [76] Peptide synthetase RD Specificity ratio Phe/Leu changed from 0.004 to 9.3 while increasing catalytic [77] efficiency DNA hemi-methylase RM & RD Specificity changed from GCGC to CGCG by drastically increasing the activ- [78] ity towards (A)CGCG(T) RD & SSM 15,000-fold increased specificity constant (glyphosate/glycine) and 210-fold [79] increased catalytic efficiency for glyphosate ACHTUNGRE [80] Choline oxidase RD & DE Activity towards MTEA increased 5-fold and kcat/KM(MTEA)/kcat/KM- (choline)ACHTUNGRE changed from 0.01 to 0.2 Styrene monooxyge- RD Exchange of one amino acid led to acceptance of bulkier a-ethylstyrene [81] nase Carboxymethylproline RD Production of substituted N-heterocycles with stereocontrolled enoate forma- [82] synthase tion Transaldolase SSM Improved acceptance of non-phosphorylated substrates dihydroxyacetone [83] and glyceraldehyde Xylose reductase SSM & epPCR Narrowing substrate acceptance to reduce unwanted formation of arabitol [84] from arabinose while retaining xylose reductase activity Toluene monooxyge- Enantioselective oxidation of aromatic sulfides by point mutation yielding [85] nase methyl phenyl sulfoxide with >98% ee (pro-S) Galactose oxidase epPCR Catalyses oxidation of secondary alcohols [86] Limonene epoxide hy- ISSM Catalyses the asymmetrisation of cyclopentene-oxide stereoselectively to [87] drolase form the (R,R)- or the (S,S)-diol Lipase CAST Hydrolysis of a-substituted p-nitrophenyl esters (95–99% ee) [88] Aniline dioxygenase SSM & epPCR Increased bioremediation activity up to 98-fold for aromatic amines [89] Cofactor specificity Alcohol dehydrogen- RD Improved preference for NADP(H) over NAD(H) [90] ase Alcohol dehydrogen- RD Single amino acid exchange broadened coenzyme acceptance from pure [91] ase NAD(H) to both NAD(H) and NADP(H)

2198 asc.wiley-vch.de 2011 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim Adv. Synth. Catal. 2011, 353, 2191 – 2215 Discovery and Protein Engineering of Biocatalysts for Organic Synthesis

Table 1. (Continued) Enzyme Methods Result and Comments Ref Lactate dehydrogenase RD & SSM Broadening of cofactor acceptance from NAD(H) to NADP(H) [92] RD Mutation leads to more stable binding of cofactor FAD over ADP [93] Enantioselectivity/-preference Lipase RD E value for bulky substrate increased from 5 (WT) to > 200 [94] Lipase CAST E=111 for kinetic resolution of an axially chiral allene, p-nitrophenyl 4-cy- [95] clohexyl-2-methylbuta-2,3-dienoate Diisopropyl fluoro- RD Reversed enantioselectivity and enhanced activity towards nerve agents [96] phosphatase N-Acetylneuraminic epPCR & SSM Two mutants with opposite stereoselectivity identified [97] acid lyase P450 monooxygenase ISSM Changed preference from 43% ee (S) for wt to 83% ee (R) [98] Alcohol dehydrogen- RD Reduction of benzylic and heteroarylic ketones with anti-prelog configuration [99] ase by a single mutant Ammonia lyase RD Stereoselectivity of 3-methylaspartate ammonia lyase enhanced by active site [100] mutation Increased activity Glucose 6-oxidase RD 400-fold increased activity towards glucose [101] Dehalogenase RD & SSM 32-fold higher activity towards 1,2,3-trichloropropane [102] Cocaine hydrolase RD Computational design of mutant with 2,000-fold improved catalytic efficiency [103] 5 [104] Serum paraoxonase RM Increased activity toward the coumarin analog of Sp-cyclosarin ~10 -fold d [105] Aldolase epPCR, SSM 60-fold higher kcat/KM for catalyzing the addition of pyruvate to -erythrose & shuffling 4-phosphate to form DAHP Phenylalanine ammo- DE 15-fold increased reaction rate [106] nia lyase Citramalate synthase epPCR Up to 22-fold higher production of 1-PrOH and 1-BuOH compared to wild- [107] type d [108] b-Glucuronidase epPCR & SSM 60-fold more active (kcat/KM) at pH 7.0 towards b- -glucuronide Glycerol dehygroge- RD & shuffling 26-fold improved activity for the oxidation of 1,3-butanediol to 4-hydroxy-2- [109] nase butanone Galactose oxidase SSM Mutations to optimize codon usage enable drastic enhancement of expression [110] (> 100,000 fold) and SSM in active site increased activity against galactose by 60% [111] Nitrogenase RD Variants show increased production of H2 as side product during N2 fixation Promiscuous activity O-Glycosyltransferase RD O-glycosylating enzyme switched to C-glycosyltransferase [112] Pyruvate decarboxy- RD Aldehyde release vs. carboligation inverted (100-fold preference for the [113] lase latter) Benzoylformate decar- RD Single mutation converts benzaldehyde lyase into benzoylformate decarboxy- [114] boxylase lase Other properties b-Lactamase DE Introduction of by metal ions [115] P450 monooxygenase RD Construction of a hybrid metalloenzyme combining CYP450 and CytC with [116] good stability and activity Histidine kinase Fusion proteins Combining histidine kinase with a light-oxygen-voltage photosensor domain [117] to a light-regulated histidine kinase variant b-Galactosidase SSM Removal of product inhibition (~10-times more galactose tolerated), though [118] activity and stability are reduced [a] Abbreviations: (S)RD: (semi-)rational design; (I)SSM: (iterative) site-directed saturation mutagenesis; epPCR: error- prone PCR; DE: directed evolution; CAST: combinatorial active-site saturation test; MTEA: tris-(2-hydroxyethyl)-meth- ylammonium methyl sulfate; RM: random mutagenesis; wt: wild-type; Hybrid.: hybridization; B-FIT: B-factor iterative

test; TM =melting temperature. single point mutation.[123] Several residues for poten- matic amines, but had a more narrow substrate specif- tial mutations were identified in a homology model icity towards aliphatic amines. The mutant W57G constructed using the structure of a 2,2-dialkylglycine with the large binding pocket even widened showed decarboxylase (pdb-code: 1DGE). The wild-type significant changes in its substrate specificity without enzyme showed high specific activities towards aro- compromising the original activities toward aromatic

Adv. Synth. Catal. 2011, 353, 2191 – 2215 2011 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim asc.wiley-vch.de 2199 REVIEWS Geoffrey A. Behrens et al.

Scheme 1. Strategies for the synthesis of optically active amines using amine transaminases (ATA). In a kinetic resolution (left), the ATA converts one of the amine enantiomers to the corresponding ketone ( 50% yield). In an asymmetric synthe- sis (right), a prostereogenic ketone is aminated enantioselectively, yielding directly the optically active amine at a theoretical yield of 100%. amines and the enantioselectivity. The specific activity alyzed process. Both routes have recently been com- towards 1-(1,2,3,4-tetrahydronaphthalen-2-yl)ethana- pared.[126] Apart from the actual success in finding the mine, for example, was increased 20-fold, towards 4- desired catalyst, the whole project brought in a huge phenylbutan-1-amine 41-fold and towards isopropyla- variety of mutants with a broad substrate range and mine 19-fold. very good characteristics for application in organic The ATA from Arthrobacter citreus has first been synthesis also useful for a-phenylethylamines with subjected to directed evolution to increase thermosta- electron-rich substituents and pyrrolidines. bility and isopropylamine acceptance resulting in the We applied a different approach for the identifica- mutant CNB05-01.[124] This mutant was later on ra- tion of so far unknown (R)-selective ATAs (R- tionally redesigned by Svedendahl et al. for a further ATAs).[127] During our project, the sequence of the improvement of the already high enantioselectivity ATA-117 from Codexis as the only known isolated R- towards 4-fluorophenylacetone.[125] Mutant CNB05-01 ATA was not available. Thus, we first had to hypothe- gave 22% yield with 98% ee in the asymmetric syn- sise about possible relatives or ancestors of the target thesis of (S)-1-(4-fluorophenyl)propan-2-amine. With enzyme in order to identify important residues re- the help of a homology model constructed using a sponsible for substrate coordination in aforesaid rela- class III transaminase from Silicibacter pomeroyi tives. In the next step, comparable to rational design, (pdb-code: 3HMU) that showed only 27% sequence we predicted key mutations for generating the desired identity, they identified an active-site loop with sever- substrate specificity. Assuming that these mutations al residues close to the phenyl group of the substrate would have generated at least a little of the desired as well as to the phosphate group of the cofactor PLP. activity, we then skipped the idea to go via directed The single mutant Y331C actually showed a higher evolution, but explored public databases for proteins enantioselectivity with >99.5% ee and interestingly, with these mutations, amongst other criteria. Our as- mutant V328 A showed a reversed enantioselectivity sumption was to circumvent the usually required sev- with 58% ee for the opposite enantiomer (R)-1-(4-flu- eral rounds of directed evolution for improving the orophenyl)propan-2-amine while it retained its high implanted activity as this should have occurred al- (S)-selectivity for 1-(4-nitrophenyl)ethanamine. ready in nature during millions of years of natural The most extensive as well as successful redesign of evolution. In the end, we identified 20 protein se- an ATA was recently reported by Savile et al. as men- quences after filtering 5,000 sequences in public data- tioned in the introduction.[11] They applied a combina- bases believed to be so far unknown (R)-selective tion of different techniques starting with homology ATA, ordered the synthetic genes and expressed model-based rational design followed by ten rounds them recombinantly in E. coli. The following charac- of directed evolution including random and site-satu- terisation resulted in ten enzymes with high R-ATA ration mutagenesis as well as DNA-shuffling. The activity (hit rate of 48%), and all 17 characterised mu- final 27 mutations of the best variant accumulated tants that could be expressed in E. coli showed the mainly in or close to the active site and also at the desired activity and especially (R)-enantiopreference. subunit interface of the homodimer. Starting from Seven of these enzymes were recently applied suc- ATA-117 wild-type with hard to detect activity (0.2% cessfully in the asymmetric synthesis of various (R)- conversion!) towards the target substrate, the final amines having >99% ee at > 99% yield.[128] variant converted 200 g/L prositagliptin ketone to Si- tagliptin with 99.95% ee at 92% yield. The biocata- lytic process not only reduced the total waste and 4.2 Enoate Reductases eliminated all need for heavy metals, but even in- creased the overall yield by 10% and the productivity Enoate reductases (EC 1.3.1.31) catalyse the asym- (kg/L per day) by 53% compared to the rhodium-cat- metric reduction of activated C=C bonds. The activat-

2200 asc.wiley-vch.de 2011 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim Adv. Synth. Catal. 2011, 353, 2191 – 2215 Discovery and Protein Engineering of Biocatalysts for Organic Synthesis

Scheme 2. OYE1-wild-type and W116I form products with opposite stereochemistry from (S)-carvone. With (R)-carvone both enzymes produced the same diastereomer. ing group is usually an electron-withdrawing group stereopreference for certain substrates. OYE1-W116I (EWG), which includes aldehyde, ketone, carboxylic reduced (R)- and (S)-carvone to enantiomers com- acid, carboxylic ester, anhydride, lactone, imide or pared to diastereomers formed by the wild-type nitro groups. This biocatalytic asymmetric reduction is enzyme (Scheme 2). The inverted stereoselectivity useful as it can produce two adjacent sp3 chiral cen- was due to the substrate binding in a flipped orienta- tres from a prostereogenic activated alkene in a single tion in the active site followed by net trans H2 addi- step. Furthermore, the reduction occurs by the net ad- tion. OYE1-W116I also showed opposite stereoprefer- dition of hydrogen (H2)inatrans-stereospecific ence for the C=C-reduction of (R)-perillaldehyde to manner, which provides a synthetic route for asym- produce trans-dihydroperillaldehyde compared to the metric enzymatic trans-hydrogenation of activated cis-form made by wild-type OYE1. However, another C=C bonds. variant OYE1-W116F reduced neral to (R)-citronellal The synthetic application of this reaction has ex- compared to (S)-citronellal formed by the wild-type. plored a number of enoate reductases from various In the first report of directed evolution on the and microbial sources, with the first enzyme enoate-reductase YqjM from Bacillus subtilis, a ho- purified from Saccharomyces pastorianus (OYE1). mologue of the old yellow enzyme,[132] a library of var- Toogood et al.[129] and Strmer et al.[130] described var- iants was generated, which showed increased activity, ious other sources of enoate reductase in their recent enhanced and inverted enantioselectivity in the reduc- reviews. Many of the enoate reductases were charac- tion of a number of cyclic enones. Compared to the terised and cloned to express the heterologous pro- low activity of wild-type YqjM towards the reduction teins, which were subsequently used for biocatalysis. of 3-methylcyclohexenone [4 mU/mg, (R)-specific], Examples include the reduction of substituted and the saturation mutagenesis library produced variants non-substituted a,b-unsaturated cyclic and acyclic al- that showed higher activity and opposite enantioselec- dehydes, ketones, imides, nitroalkenes, carboxylic tivity. Also broader selectivity for the C=C-reduction acids, and esters, maleimides, cyclic and acyclic of other cyclic enones was found (Scheme 3, Table 2). enones, terpenoids, cyanides, nitrate esters and also The site saturation mutagenesis on PETN reductase the nitro group reduction of nitroaromatics (TNT), at eight single positions with degenerate oligonucleo- glycerol trinitrate (GTN), and pentaerythritol tetrani- tides (NNK) produced mutant W102F, which gave op- trate (PETN). posite enantiopreference in the reduction of a- In spite of the broad substrate scope of OYE1 (Sac- methyl-trans-cinnamaldehyde,[133] with 62% ee for the charomyces pastorianus), it lacked higher activity to- (R)-2-methyl-3-phenylpropionaldehyde compared to wards the reduction of bulky 3-alkyl-substituted 2-cy- 13% ee (S)-enantiomer formed by the wild-type. clohexenones. In order to increase substrate selectivi- Mutant T26S also gave opposite enantiopreference ty, Padhi et al.[131] carried out site-saturation mutagen- for (R)-2-phenyl-1-nitropropane, 44% ee for the re- esis at position 116 of OYE1 and discovered the var- duction of (E)-2-phenyl-1-nitropropene, compared to iants W116I and W116F, which showed opposite (S)-product, 37% ee (Scheme 4). Although reversible

Adv. Synth. Catal. 2011, 353, 2191 – 2215 2011 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim asc.wiley-vch.de 2201 REVIEWS Geoffrey A. Behrens et al.

Table 2. Reduction of cyclic enones catalyzed by YqjM and variants.[a]

Substrate n, R Enzyme Conversion/% eeP (Conf.) Substrate n, R Enzyme Conversion/% eeP (Conf.) 1, ethyl wild-type n.c.[b] 1, COOMe wild-type 85/99 (R) 1, ethyl C26D/I69T 100/99 (R) 1, COOMe C26D/I69T 92/99 (R) 1, ethyl C26G/A60V 9/96 (S) 1, COOMe C26G 100/98 (S) 1, isopropyl wild-type n.c.[b] 0, methyl wild-type 1/n.d.[b] 1, isopropyl C26D/I69T 33/99 (R) 0, methyl C26D/I69T 5/50 (R) 1, isopropyl C26G/A60V 2/91 (S) 0, methyl C26D 36/93 (S) 1, n-butyl wild-type n.c.[b] 0, COOMe wild-type 100/96 (R) 1, n-butyl C26D/I69T 92/99 (R) 0, COOMe I69T 100/99(R) 0, COOMe C26G 100/93 (S) [a] n and R: see Scheme 3. [b] n.c.: no conversion, n.d.: not determined.

4.3 Esterases for Tertiary Alcohol Synthesis

Lipases and esterases are known to catalyse the hy- drolysis of tertiary alcohol esters to produce chiral tertiary alcohols, but with little selectivity.[135] In order Scheme 3. YqjM-catalyzed reduction of cyclic enones. to improve the enantioselectivity, Henke et al. engi- neered an esterase from Bacillus subtilis (BS2) and isomerisation of (E)-2-phenyl-1-nitropropene to the the first single mutant G105A gave higher enantiose- (Z)-form could produce the (S)-product, the T26S lectivity, that is, E=19 compared to E=3 for wild- variant increased the relative population of the (E)- type in the hydrolysis of 3-phenylbut-1-yn-3-yl ace- substrate by removing the possible clash between the tate.[136] In a further medium engineering experiment, substrate phenyl and the threonine and hence autohydrolysis of the substrate was suppressed by ad- switched the stereochemistry of the product. dition of the water-miscible solvent DMSO, which in- Site-saturation mutagenesis on PETN reductase at creased E to 56 for variant BS2-G105A. Substrate H181 and H184 identified several single mutants and protein engineering guided by molecular model- H184N/A/K/R. All four of them increased the enan- lingACHTUNGRE showed that BS2-G105A could catalyse the hy- tioselectivity towards the reduction of (E)-2-phenyl-1- drolysis of 4,4,4-trifluoro-3-phenylbut-1-yn-3-yl ace- nitropropene (substrate of the second reaction of tate with E> 100 compared to E=42 for wild- Scheme 4).[134] While the wild-type PETN reductase type.[137] The high substrate selectivity and E value of gave 42% ee, the variants produced 85–94% ee of the the BS2-G105A variant were consistent with the pre- (S)-2-phenyl-1-nitropropane. In the library also var- viously established concept that a GGG(A)X motif in iants were found, which switched the reactivity to pre- hydrolases is a key determinant for enzymatic activity dominantly catalyze nitro-reduction. Variants H181A/ on tertiary alcohol substrates. Another variant (BS2- C/N showed dominating oxime products by the reduc- E188D) also showed high selectivity in the hydrolysis tion of the nitro group in 52–75% yield as compared of 3-phenylbut-1-yn-3-yl acetate (E=46) and of 4,4,4- to only 10% product by wild-type PETN reductase. trifluoro-3-phenylbut-1-yn-3-yl acetate with E> 100. Both of the above mentioned BS2 variants accepted a

Scheme 4. Opposite enantiopreference of PETN variants towards C=C-bond reduction.[134]

2202 asc.wiley-vch.de 2011 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim Adv. Synth. Catal. 2011, 353, 2191 – 2215 Discovery and Protein Engineering of Biocatalysts for Organic Synthesis

Scheme 5. Kinetic resolution of several arylaliphatic tertiary alcohol acetates by engineered BS2 enzymes showing opposite enantiopreference. broad range of trifluoromethyl arylaliphatic tertiary 4.4 Monoamine Oxidases alcohol acetates with E> 100.[138] Next, Bartsch et al. performed a focused directed Monoamine oxidases (EC 1.4.3.4, MAO) catalyse the evolution based on simultaneous saturation mutagen- of monoamines leading to the esis at positions E188, A190 and M193 using the de- formation of the corresponding aldehydes. Mechanis- generate NNK codons with the aim to invert the tic investigation of this reaction, however, revealed enantiopreference of BS2.[48] After screening 2,800 that the reaction proceeds through the formation of colonies with a high-throughput assay[139], they identi- the aldimine, which could also be isolated.[140] Turner fied a key variant (BS2-E188D-M193C), which et al. exploited this enzymatic transformation in the showed inverted enantioselectivity in the hydrolysis deracemisation of racemic amines to synthesise the of 4,4,4-trifluoro-3-phenylbut-1-yn-3-yl acetate and corresponding enantiopure products. The deracemisa- other substrates with different aryl substituents. This tion process involves stereoinversion of one enantio- variant showed high (S)-selectivity, ES =64 compared mer to the other by repeated cycles of MAO-cata- to ER =42 by wild-type BS2. Moreover, the engi- lysed stereoselective oxidation to the imine combined neered enzyme also catalysed the kinetic resolution of with a non-selective chemical reduction of the imine a range of other trifluoromethyl derivatives with high to the amine, which notably works in a one-pot reac- enantiomeric ratio (ES up to > 100) and hence its tion (Scheme 7). broad substrate scope was proven (Scheme 5). Using directed evolution Alexeeva et al.[141] im- Analysis of the role of E188 on enantioselectivity proved the selectivity of the enzyme from Aspergillus revealed that increasing bulkiness of the residue dras- tically changed the enantiopreference from (R)to(S) (Scheme 6). High opposite enantiopreference was only found for the double mutant, as the single mutant E188W showed ES =26 and M193C gave ER = 16, which emphasises the importance of synergistic ef- fects of the residues.

Scheme 7. Deracemisation of a-methylbenzylamine (a- MBA) using an (S)-selective coupled with ammonia-borane reduction of the intermediate imine Scheme 6. Effect of the size of amino acid substitutions at to form racemic amine. In the first round, 50% (S)-amine is position 188 in BS2 on enantioselectivity in the hydrolysis of oxidised to the imine yielding 25% (R)-amine and 25% (S)- arylaliphatic tertiary alcohol acetates. Small residues lead to amine after chemical reduction, the latter being oxidised preferred conversion of the (R)-enantiomer while bulky res- again. After 7–8 rounds, the racemic starting material is idues gave (S)-selectivity. completely converted into optically pure (R)-a-MBA.

Adv. Synth. Catal. 2011, 353, 2191 – 2215 2011 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim asc.wiley-vch.de 2203 REVIEWS Geoffrey A. Behrens et al.

ple of a Ugi-type 3-component reaction (Scheme 9).[147]

4.5 Dehalogenases

Dehalogenases are enzymes which catalyse the hy-

drolysis of a carbon-halogen bond by an SN2 mecha- nism. Haloalkane dehalogenases (HLD) and halo acid dehalogenases (HAD) act on haloalkane and halo acid, respectively.[148] However, halohydrin deha- logenases (HHDH) act only on vicinal halo alcohols to produce an epoxide (Scheme 10). Apart from Scheme 8. Chemo-enzymatic synthesis of (+)-crispine using simple chlorinated, brominated and iodinated alkanes, MAO-N-5-catalysed deracemisation. the HLDs also catalyse the hydrolysis of halogenated alkenes, cycloalkanes, alcohols, carboxylic acids, esters, ethers, epoxides, amides and nitriles to produce niger (MAO-N) towards the oxidation of a-MBA. a range of corresponding halogen hydrolysed products The MAO-N mutant N336 showed 47-fold higher ac- and hence provide access to various chiral intermedi- tivity towards (S)-a-MBA and 5.8-fold higher selectiv- ates.[149] The halohydrin dehalogenases not only cata- ity towards (S)-a-MBA over (R)-a-MBA. Combining lyse the formation of epoxides, but also the reversible this engineered protein with ammonia-borane as a re- reaction to open the epoxide ring with different nu- ducing agent, (S)-a-MBA could be synthesised in cleophiles such as cyanide, nitro and azide[150] to pro- 77% yield and 93% ee from its racemic substrate. The duce the corresponding multifunctional products. mutant also showed broad substrate scope and high A haloalkane dehalogenase from Rhodococcus rho- enantioselectivity towards a range of primary dochrous (DhaA) catalysed the hydrolysis of 1,2,3-tri- amines.[142] Directed evolution of MAO-N produced chloropropane (TCP) into 2,3-dichloropropanol. another variant, which catalysed the deracemisation However, the half-life (t1/2) of the wild-type enzyme of a structurally diverse range of secondary was only 11 min at 558C. Gray et al.[151] carried out amines.[143] Similarly, a further mutant enabled the site saturation mutagenesis on DhaA and found a li- (S)-selective oxidation of tertiary amines and hence could be used in the deracemisation process for the synthesis of enantiopure cyclic tertiary amines.[144] One variant (MAO-N-5) was used in the chemo-enzy- matic synthesis of the alkaloid (+)-crispine and its de- rivatives (Scheme 8).[145] MAO-N-5 also catalysed the desymmetrisation of 3,4-substituted meso-pyrolidines to produce the corre- sponding D1-pyrolines.[146] The optically pure pyrolines from the above process underwent one-pot condensa- tion in the presence of a carboxylic acid and an isocy- Scheme 10. Typical examples of three different types of re- anide to afford chiral substituted prolyl peptides. This actions catalysed by dehalogenases. DhlA and DhlB are hal- provided a chemo-enzymatic route for the stereose- oalkane and haloacid dehalogenases from Xanthobacter au- lective synthesis of highly functionalised, optically totrophicus. HheC is a halohydrin dehalogenase from Agro- pure 3,4-substituted prolyl peptides, a striking exam- bacterium AD1.

Scheme 9. MAO-N catalysed oxidative desymmetrisation of meso-pyrolidines to optically pure D1-pyrolines and their subse- quent use in the Ugi-3-component reaction (Ugi-3-CR).

2204 asc.wiley-vch.de 2011 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim Adv. Synth. Catal. 2011, 353, 2191 – 2215 Discovery and Protein Engineering of Biocatalysts for Organic Synthesis

brary of variants with increased t1/2. The single site DbjAD+ H189A variants showed E=33 and 100 in variants had half-lives from 5–45 h. A combined var- halide hydrolysis of 2-bromopentane and E=290 and iant with five mutations had a half-life ~ 3000-fold 245 for methyl 2-bromobutyrate, respectively. higher than wild-type DhaA. Combination of eight HheC was found to be unstable under oxidising single-site mutations on DhaA increased the half-life conditions, which could be prevented by addition of by another factor of ten. b-mercaptoethanol. Tang et al.[154] carried out site di- Random mutagenesis of the haloalkane dehaloge- rected mutagenesis on HheC and showed that the var- nase from Rhodococcus sp. m15-3 (DhaA) followed iant C153S had improved stability and also optical by screening on eosin-methylene blue agar plates purity of the product was increased from 72% ee identified a new variant that catalysed the dehaloge- (wild-type) to 93% ee in the kinetic resolution of 2- nation of TCP into 2,3-dichloro-1-propanol nearly chloro-1-phenylethanol. eight times more efficient than the wild-type dehalo- In the case of HheC-catalysed dehalogenation of vi- genase (Scheme 11).[152] The resulting mutant C176Y/ cinal halo alcohols to epoxides, halide release was À1 À1 Y273F showed kcat/KM = 280M s compared to found to be the rate-limiting step during the catalytic 36MÀ1 sÀ1 by wild-type DhaA. cycle. Tang et al.[155] performed site-directed mutagen- esis and identified variants Y187F and W249F with higher activity in the epoxidation of 1,3-dichloro-2- propanol and p-nitro-2-bromo-1-phenylethanol (Scheme 12). Mutant Y187F had a two-fold increased À1 kcat (18.7Æ 1.1 s ) and kcat/KM was raised to 1.6 Scheme 11. Conversion of TCP into 2,3-dichloropropane-1- 106 MÀ1 sÀ1 (wild-type 9.3105 MÀ1 sÀ1) for 1,3-di- ol by a DhaA variant. chloro-2-propanol. Variant Y249F had increased enantioselectivity with E=900 compared to 150 for

the wild-type and an improved kcat/KM. Further protein engineering on this enzyme pro- duced variants which showed up to 32-fold higher ac- tivity than the wild-type toward the degradation of the anthropogenic compound TCP.[102] Identification of key residues in access tunnels connecting the buried active site with bulk solvent by rational design followed by random mutagenesis and directed evolu- Scheme 12. HheC-catalysed conversion of a halohydrin into tion produced improved variants. Three of them an epoxide. showed up to 32-fold higher activity in the hydrolysis À1 of TCP compared to wild-type DhaA (kcat =0.08 s ). Table 3 shows the steady-state kinetic parameters of In an impressive example, researchers at Codexis the variants and wild-type DhaA for this halide hy- (USA) developed a method using engineered HheC drolysis. to synthesise the side-chain of the cholesterol-lower- DbjA, a haloalkane dehalogenase from Bradyrhi- ing drug, atorvastatin (Lipitor).[52] Starting from zobium japonicum USDA110, showed E=132 and ethyl (S)-4-chloro-3-hydroxybutyrate (ECHB), the 209 for the halide hydrolysis of 2-bromopentane and (S)-epoxide is formed first followed by formation of methyl 2-bromobutyrate, respectively, with the (R)- ethyl (R)-4-cyano-3-hydroxybutyrate (HN), the key enantiomer as predominant product. In comparison building block for the atorvastatin side chain to other HLDs, DbjA features an additional segment (Scheme 13). (S)-ECHB and its corresponding epox- (140-His-Thr-Glu-Val-Ala-Glu-Glu-146) located on ide are poor substrates for wild-type HheC. Engineer- the protein surface between the core a/b-hydrolase ing based on the ProSAR-driven directed evolution and cap domains.[153] Deletion of this segment (see Section 3.3) led to variants which allowed a one- (DbjAD) altered the enantioselectivity for the two pot conversion yielding the product with 99.5% purity substrates mentioned above. The DbjAD and and >99.9% ee.

Table 3. Steady state kinetic parameters of wild-type DhaA and its variants.

À1 À1 À1 Variant kcat [s ] KM [mM] kcat/KM [M s ] Wild-type 0.04Æ0.01 1.0Æ0.2 40Æ13 I135L-W141F-C176Y-V245F-L246I-Y273F 0.55Æ0.04 1.2Æ0.2 458Æ83 I135V-C176Y-V245F-L246I-Y273F 1.02Æ0.06 1.1Æ0.1 927Æ100 I135F-C176Y-V245F-L246I-Y273F 1.26Æ0.07 1.2Æ0.1 1050Æ105

Adv. Synth. Catal. 2011, 353, 2191 – 2215 2011 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim asc.wiley-vch.de 2205 REVIEWS Geoffrey A. Behrens et al.

Scheme 13. A HheC variant catalysed the highly selective formation of ethyl (R)-4-cyano-3-hydroxybutyrate from the (S)- chloro derivative, which can be subsequently used in the preparation of the atorvastatin side chain.

4.6 Aldolases donor.[158] Among the pyruvate-dependent aldolases, N-acetylneuraminic acid aldolase (NeuAc) is interest- Aldolases (E.C. 4.1.2.x) constitute a group of CÀC ing for pharmaceutical application as it converts N- bond forming lyases. The nucleophilic donor is usually acetylmannosamine and pyruvate to d-sialic acid, a invariable and aldolases are grouped by their depend- building block for the synthesis of neuraminidase in- ency on the donors dihydroxyacetone phosphate hibitors for the treatment of flu.[159] The stereospeci- (DHAP), dihydroxyacetone (DHA), pyruvate/phos- ficity of NeuAc could be inverted by directed evolu- phoenol pyruvate, or glycine/alanine. tion using epPCR to form l-sialic acid in contrast to On the other hand, they accept a broad range of ac- the naturally occurring d-enantiomer.[160] This is im- ceptor aldehydes enabling the formation of versatile portant because the unnatural l-form is more stable products such as rare or non-natural sugars and their against enzymatic degradation (Scheme 14). derivatives. They also allow the introduction of differ- The group of Berry managed to modify the stereo- ent heteroatoms (amino, nitro, azido, phospho, sulfo, specificity of a tagatose 1,6-bisphosphate aldolase by chloro, fluoro or deoxy groups). Especially attractive gene shuffling providing interesting mechanistic in- is the fact that two stereocentres are formed during sights.[161] The most prominent example for an indus- the aldolase reaction and that the absolute configura- trially applied aldolase is the acetaldehyde-dependent tion of the product can be influenced by the choice of 2-deoxy-d-ribose-5-phosphate aldolase (DERA), the enzyme. For instance, four different DHAP-de- which is used for the preparation of (3R,5S)-6-chloro- pendent aldolases enable selective formation of all 2,4,6-trideoxyhexose, a precursor in the synthesis of four possible diastereomers from DHAP and glyceral- the side-chain of drugs like atorvastatin (Scheme 15). dehyde 3-phosphates[156] as shown in a review.[157] To The major obstacles for this process were the narrow avoid the costly and rather unstable DHAP, variants substrate range and the low stability of the aldolase were created which accept DHA as alternative under process conditions. This could be overcome for

Scheme 14. Inversion of the stereospecificity of N-acetylneuraminic acid aldolase (NeuAc) towards sialic acid by directed evolution.[160]

2206 asc.wiley-vch.de 2011 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim Adv. Synth. Catal. 2011, 353, 2191 – 2215 Discovery and Protein Engineering of Biocatalysts for Organic Synthesis

Scheme 15. Improvement of 2-deoxy-d-ribose-5-phosphate aldolase (DERA) towards higher stability and activity against chloroacetaldehyde to form intermediates for the production of statin drugs like atorvastatin (Lipitor).[162–163] the E. coli DERA by directed evolution yielding a activity, regioselectivity, thermal stability or to change variant with much higher resistance and productivity cofactor-dependency or substrate preference. Position towards the chloroacetaldehyde substrate.[162] In an al- F87 was identified in several studies to be a hotspot ternative approach, metagenome libraries were mediating substrate specificity and regioselectivity, for screened for novel aldolases exhibiting higher activity example, in the rational design study by the Pleiss and stability against chloroacetaldehyde.[163] The iso- group, where mutations at positions 87 and 328 ena- lated enzyme had <30% sequence identity towards bled the conversion of the four terpene substrates ger- the E. coli DERA and would thus hardly have been anylacetone, nerylacetone, (4R)-(+)-limonene and found by sequence-based screening approaches. In (+)-valencene (Scheme 16).[165] The triple mutant combination with process improvements, a highly effi- A74G/F87V/L188Q accepts polycyclic aromatic hy- cient, scalable and enantioselective process could be drocarbons as substrates.[166] In another computer developed. modelling-guided approach P450-BM3 was engi- neered to catalyse the oxidation of amorphadiene to artemisinic-11S,12-epoxide, a precursor of the anti- 4.7 Cytochrome P450-Monooxygenases malaria drug artemisinin, which was achieved by the introduction of four mutations into P450-BM3.[167] P450-monooxygenases display a versatile superfamily In a molecular modelling-based study, the specifici- of -dependent oxygenases. Sequence databases ty against alkoxyresorufin substrates could be altered currently contain >10,000 sequences of P450-like en- by exchange of five amino acids in the human P450 zymes from all domains of life, although just a small 1A1.[168] The study demonstrated that in the case of fraction of them has been functionally expressed and P450 enzymes the prediction of the influence of cer- characterised so far.[164] P450s catalyse a broad range tain amino acid positions often works very well due of reactions including carbon hydroxylation, epoxida- to the great structural information available for this tion of double bonds, CÀC bond cleavage, ring forma- enzyme class. The thermal stability of P450-BM3 tion and oxidation of heteroatoms. The substrates could be significantly increased by a chimera-forming range from short-chain to long-chain alkanes/alkenes approach by the Schmid group through combining the over fatty acids and alkaloids to complex molecules CYP102A1 with the reductase subunit of a sulfite re- like steroids, amongst them various pharmaceutically ductase from the thermophilic strain Geobacillus interesting compounds. In spite of their great poten- stearothermophilus.[169] In several other studies the tial, there are only few examples for the industrial ap- natural fusion protein P450-BM3 was taken as inspi- plication of isolated P450 enzymes. Many P450s are difficult to handle as they are often membrane bound, have low stability and activity or may consist of sever- al subunits, required for an efficient electron transfer. To circumvent these problems, industrial applications are often based on whole-cell systems, which also enable multi-step reactions. For the application of iso- lated enzymes, a great potential lies in the subfamily CYP102, a group of soluble P450-monooxygenases. Its most prominent member is P450-BM3 (CYP102A1) from Bacillus megaterium, in which oxy- genase and reductase subunits are fused to one single Scheme 16. P450-BM3 was modified by rational design to polypeptide chain. In numerous studies this enzyme accept the four terpenes, (4R)-(+)-limonene, (+)-valencene, has been subjected to engineering aiming to increase geranylacetone and nerylacetone as substrates.[165]

Adv. Synth. Catal. 2011, 353, 2191 – 2215 2011 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim asc.wiley-vch.de 2207 REVIEWS Geoffrey A. Behrens et al. ration to create unnatural fusion proteins combining genase (PAMO) from Thermobifida fusca, active on different subunits, for example, in the study by Sher- aromatic substrates, was isolated in this way.[31] The man et al., in which the reductase domain of Rhodo- group of Grogan cloned 29 new BVMOs by the se- coccus sp. NCIMB 9784 P450 was combined with the quence-based approach from the genomes of the acti- oxygenase subunit of PikC cytochrome P450 from nomycetes M. tuberculosis[179] and Rhodococcus Streptomyces venezuelae. The resulting fusion protein jostii.[180] Remarkably, the conserved region used to showed enhanced electron transfer efficiency and cat- identify the genes turned out not to be in the active alytic activity in the synthesis of pikromycin and me- site, but within a surface loop in the connection of the thymycin.[170] The same reductase subunit was used to FAD and the NADPH domain. construct a vector, which allows the fusion of different The discovery of the first BVMO structure of the oxygenase subunits with this reductase and the system thermostable PAMO[181] in 2004 was a major break- was validated with 24 different heme domains.[171] Be- through to enable rational protein engineering. Five sides P450-BM3, the camphor-hydroxylating P450cam years later, a second BVMO structure of a novel from Pseudomonas putida (CYP101) is another well- CHMO in complex with NADP was resolved.[182] studied monooxygenase. For example, mutation of po- Based on the PAMO structure, the Reetz group sition Y96 lead to increased oxidation of naphthalene evolved this BVMO in several studies yielding a var- and pyrene by 1–2 orders of magnitude.[172] In a iant, which now converts 2-phenylcyclohexanone recent semi-rational approach, the substrate specifici- enantioselectively.[183] The exchange of two proline ty of P450cam was changed towards diphenylme- residues turned PAMO into a variant which accepts thane.[173] In another study the oxidation products of substrates commonly converted by CHMO-type (+)-a-pinene were changed.[174] BVMOs.[184] Besides these rational design studies on Monooxygenases can also be applied for epoxida- PAMO, there are further examples in which the prop- tion of substrates. In P450-BM3 an epoxidising activi- erties of BVMOs were changed by engineering. ty could be created by saturation mutagenesis.[175] Be- Kirschner et al. were able to enhance the enantiose- sides the P450 type monooxygenases, there are also lectivity of a BVMO from Pseudomonas fluorescens flavine-dependent styrene monooxygenases catalysing by error-prone PCR for the kinetic resolution of b-hy- epoxidations.[176] The activity of the enzyme from droxy ketones (Scheme 17).[25b,185] The CHMO from Pseudomonas putida CA-3 could be improved by A. calcoaceticus was also redesigned in several studies. error-prone PCR yielding an improved epoxidation In a combined approach of epPCR and saturation activity towards styrene and indene.[177] mutagenesis, it was possible to enhance and invert the enantioselectivity in the conversion of 4-hydroxycy- clohexanone (Scheme 18).[186] The same group modi- 4.8 Baeyer–Villiger Monooxygenases fied CHMO to oxidise prostereogenic thioethers yielding chiral sulfoxides (Scheme 19).[187] By ex- Baeyer–Villiger monooxygenases (BVMO) catalyse change of methionine and cysteine residues on the the enzymatic counterpart to the chemical Baeyer– protein surface, it was possible to improve the stabili- Villiger reaction introducing an oxygen atom next to ty of CHMO.[188] a carbonyl function yielding esters or lactones. The One bottleneck for the application of BVMO on a substrate spectrum ranges from linear and bulky ke- large scale is the dependency on the cofactor. Most tones to complex molecules like steroids and sesqui- BVMOs depend on NADPH as cofactor, which is terpenoids. The most intensely studied BVMO is the more expensive than NADH. Therefore the identifi- cyclohexanone monooxygenase (CHMO) from Acine- cation of the residues essential for the discrimination tobacter calcoaceticus NCIMB 9871.[178] While only between both cofactors was useful for further studies few BMVOs were available for a long time, the and the acceptance of NADH could be increased for number of enzymes increased immensely by the iden- 4-hydroxyacetophenone monooxygenase and tification of a conserved sequence motif to find genes CHMO.[189] A very interesting concept to address the in sequenced genomes. The phenylacetone monooxy- cofactor issue was developed by Hollmann et al.

Scheme 17. Directed evolution of a BVMO increased activity and enantioselectivity in the kinetic resolution of b-hydroxy ketones yielding the monoacetate of a 1,2-diol as major product and the b-hydroxy acid methyl ester as minor by-prod- uct.[185]

2208 asc.wiley-vch.de 2011 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim Adv. Synth. Catal. 2011, 353, 2191 – 2215 Discovery and Protein Engineering of Biocatalysts for Organic Synthesis

The principle strategy in creating a new enzyme is based on the fact that an enzyme can catalyse a chem- ical transformation because, amongst other reasons, it stabilises the transition state of the reaction. First of all the basic mechanism of the reaction must be known in detail, then, focusing on the transition state, amino acid residues that can stabilise or even interact with the substrate are positioned in the relevant order, thus forming the putative active site. This step already requires sophisticated computing using quan- Scheme 18. Enantioselective conversion of 4-hydroxycyclo- tum mechanical (QM) calculations to find the right hexanone to the corresponding lactones using engineered amino acids and position them as exactly as possible. CHMO mutants.[186a] The next step consists of finding a natural scaffold that can be used to accommodate these residues in their proper positioning. The group of Baker was able to develop and enhance algorithms to perform this task.[193] Using these methods, it was possible to design a retro-aldolase, a Diels–Alderase and a Kemp-elimi- nase. The retro-aldolase reaction is catalysed by a lysine residue via a Schiff base or an imine intermedi- ate. Since the proposed mechanism consisted of sever- al steps, the above-mentioned general method needed to be extended for the design of active sites that are compatible with multiple transition states. In total, 32 designs belonging to four different motifs were found to show retro-aldolase activity with rate enhance- ments of up to 104 over the non-catalysed reaction, but still with low turnover rates of about 9 Scheme 19. CHMO variants created that were able to con- À3 À1 [194] vert prostereogenic thioethers into chiral sulfoxides.[187] 10 min . The Kemp eliminase was proposed to use a general base, being either aspartate or glutamate or a histi- dine-aspartate/glutamate dyad. Of the 59 selected var- based on reduction of the flavin cofactor by (sun)light iants 39 were aspartate or glutamate dependent and in the presence of a mediator.[190] A completely differ- twenty had a dyad in the active site and eight designs ent way to increase the catalytic efficiency of the showed activity. A promising variant relied on a His- À2 À1 BVMO was taken by the group of Fraaije fusing the Asp dyad and showed a kcat of 210 s . By random BVMO with an NADPH regenerating phosphite de- mutagenesis this could be further enhanced to hydrogenase yielding a self-sufficient enzyme.[191] Fur- 1.3 sÀ1.[195] Another variant was further optimised by ther examples for the application and engineering of various techniques ending up with mutants showing a BVMOs can be found in comprehensive recent re- turnover number of 5–8 sÀ1 and catalytic efficiencies views.[192] of 5104 MÀ1 sÀ1.[196] Designing an enzyme capable of catalysing the Diels–Alder reaction was supposed to be quite chal- 5 Biocatalysts for New Reactions lenging, since bimolecular bond formation is involved. A carbonyl oxygen of a glutamine or asparagine resi- Although numerous enzymes have been identified, due was used to form a hydrogen bond to the diene some important organic reactions seem to have no intermediate, whereas the dienophile was hydrogen- natural enzymatic counterpart. In recent years signifi- bonded by the hydroxy group of a serine, threonine cant advances have been made towards the design of or tyrosine. With these assumptions 1019 possible tailor-made enzymes from scratch. Although the rela- active site variants were computed and about 106 tionship between an enzymes structure and its activi- could be matched into protein scaffolds. These var- ty is still poorly understood, examples have been pub- iants were further evaluated and in the end fifty var- lished using extensive computational power to design iants were obtained as soluble proteins. Only two de- enzymes that catalyse reactions not observed in signs had Diels–Alderase activity and after further nature. However, these proteins still do not reach the optimisation by mutagenesis, a variant gave a rate of high activity of natural biocatalysts. kcat/kuncat =89 which is an effective turnover rate of

Adv. Synth. Catal. 2011, 353, 2191 – 2215 2011 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim asc.wiley-vch.de 2209 REVIEWS Geoffrey A. Behrens et al. about 2 sÀ1. Interestingly, the most active variant was covery and protein engineering described in this strictly stereospecific, as intended by the design, and review. formed > 97% of the desired stereoisomer.[197] Discovering new enzymatic activity is for sure the most difficult challenge in protein design and engi- Acknowledgements neering and hence it is much more common and rela- tively easy to alter the biocatalytic properties of an We are grateful to the Alexander von Humboldt Foundation [198] existing enzyme. for a stipendium to SKP, the Deutsche Bundesstiftung Umwelt for a stipendium to AH and the State Mecklenburg- Vorpommern for a stipendium to GAB. 6 Conclusions

The exponential growth of the application of biocata- References lysts since the last two decades was mostly due to sig- nificant achievements by several interdisciplinary [1] L. Rosenthaler, Biochem. Z. 1908, 14, 238–253. branches which all became very important for bioca- [2] D. R. Tolan, A. B. Amsden, S. D. Putney, M. S. Urdea, talysis, such as molecular biology, bioinformatics and E. E. Penhoet, J. Biol. Chem. 1984, 259, 1127–1131. mostly protein engineering with all its various ap- [3] a) G. DeSantis, Z. Zhu, W. A. Greenberg, M. J. Burk, proaches and tools. This has substantially changed the J. Am. Chem. Soc. 2002, 124, 9024–9025; b) D. E. way to discover or design biocatalysts compared to Robertson, J. A. Chaplin, G. DeSantis, M. Podar, M. traditional methods. The many successful examples Madden, E. Chi, T. Richardson, A. Milan, M. Miller, covered in this review clearly underline that computa- D. P. Weiner, K. Wong, J. McQuaid, B. Farwell, L. A. Preston, X. Tan, M. A. Snead, M. Keller, E. Mathur, tional methods to alter enzymes to catalyse new reac- P. L. Kretz, M. J. Burk, J. M. Short, Appl. Environ. Mi- tions or to change their selectivities has already crobiol. 2004, 70, 2429–2436. reached a mature level and it is to be expected that [4] M. Hçhne, S. Schtzle, H. Jochens, K. Robins, U. T. they will become even more important and easier to Bornscheuer, Nat. Chem. Biol. 2010, 6, 807–813. use in the future. De novo in silico design of novel en- [5] L. A. Kelley, M. J. E. Sternberg, Nat. Protoc. 2009, 4, zymatic activities is still far away from creating bio- 363–371. catalysts with sufficiently high activities and proper- [6] a) K. Arnold, L. Bordoli, J. Kopp, T. Schwede, Bioin- ties required for application in processes. However, formatics 2006, 22, 195–201; b) F. Kiefer, K. Arnold, within just a few years this technique was developed M. Knzli, L. Bordoli, T. Schwede, Nucleic Acids Res. 37 Bio/Technolo- and thus it is likely that in the next decade our under- 2009, , D387-D392; c) M. C. Peitsch, gy 1995, 13, 658–660. standing of the underlying principles to design more [7] D. E. Kim, D. Chivian, D. Baker, Nucleic Acids Res. efficient biocatalysts by computational means will be 2004, 32, W526-W531. better and further so far undiscovered reactions will [8] E. Krieger, K. Joo, J. Lee, J. Lee, S. Raman, J. Thomp- be catalysed by designed enzymes. Protein engineer- son, M. Tyka, D. Baker, K. Karplus, Proteins Struct. ing also can contribute to the development of inte- Funct. Bioinf. 2009, 77, 114–122. grated processes where chemo- and enzyme catalysis [9] S. F. Altschul, W. Gish, W. Miller, E. W. Myers, D. J. are combined. A recent special issue of ChemCat- Lipman, J. Mol. Biol. 1990, 215, 403–410. ChemACHTUNGRE [199] already demonstrates that this is feasible [10] a) M. Goujon, H. McWilliam, W. Li, F. Valentin, S. and has the advantage to create cleaner and greener Squizzato, J. Paern, R. Lopez, Nucleic Acids Res. 38 processes. Further, the tools for protein engineering 2010, (Suppl.), W695–699; b) M. A. Larkin, G. Blackshields, N. P. Brown, R. Chenna, P. A. McGetti- of biocatalysts covered in this review must not be con- gan, H. McWilliam, F. Valentin, I. M. Wallace, A. sidered solely in a context of the use of isolated en- Wilm, R. Lopez, J. D. Thompson, T. J. Gibson, D. G. zymes as more and more processes are now being es- Higgins, Bioinformatics 2007, 23, 2947–2948. tablished using metabolic engineering; that is, the in- [11] C. K. Savile, J. M. Janey, E. C. Mundorff, J. C. Moore, corporation of multi-step enzyme catalysis in an engi- S. Tam, W. R. Jarvis, J. C. Colbeck, A. Krebber, F. J. neered microorganisms. An example is the synthesis Fleitz, J. Brands, P. N. Devine, G. W. Huisman, G. J. of optically pure amino acids in the hydantoinase pro- Hughes, Science 2010, 329, 305–309. cess, which was initially based on the separate use of [12] a) A. Liese, K. Seelbach, C. Wandrey (Eds.), Industrial immobilised hydantoinase, carbamoylase and race- Biotransformations, 2nd edn., Wiley-VCH, Weinheim, Biocatalysis in the Pharma- mase in packed-bed column reactors and nowadays 2006; b) R. N. Patel (Ed.), [200] ceutical and Biotechnology Industries, CRC Press, can be performed using engineered E. coli cells. Boca Raton, 2006; c) K. Buchholz, V. Kasche, U. T. Most recently, the production of fatty acid alkyl Bornscheuer, Biocatalysts and Enzyme Technology, [201] [202] esters or 1,4-butanediol with whole cells was re- Wiley-VCH, Weinheim, 2005; d) K. Faber, Biotrans- ported, both processes rely beside sophisticated meta- formations in Organic Chemistry, 5th edn., Springer, bolic engineering efforts on the tools for enzyme dis- Berlin, 2004; e) K. Drauz, H. Waldmann (Eds.),

2210 asc.wiley-vch.de 2011 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim Adv. Synth. Catal. 2011, 353, 2191 – 2215 Discovery and Protein Engineering of Biocatalysts for Organic Synthesis

Enzyme Catalysis in Organic Synthesis, Vols. 1–3, 2nd Pfannkoch, Y.-H. Rogers, H. O. Smith, Science 2004, edn., Wiley-VCH, Weinheim, 2002. 304, 66–74. [13] a) M. T. Reetz, Angew. Chem. 2011, 123, 144–182; [25] a) A. Kirschner, J. Altenbuchner, U. T. Bornscheuer, Angew. Chem. Int. Ed. 2011, 50, 138–174; b) A. S. Appl. Microbiol. Biotechnol. 2007, 73, 1065–1072; Bommarius, J. K. Blum, M. J. Abrahamson, Curr. b) A. Kirschner, U. T. Bornscheuer, Angew. Chem. Opin. Chem. Biol. 2011, 15, 194–200; c) G. A. Stroh- 2006, 118, 7161–7163; Angew. Chem. Int. Ed. 2006, 45, meier, H. Pichler, O. May, M. Gruber-Khadjawi, 7004–7006. Chem. Rev. 2011, DOI: 10.1021/cr100386u; d) R. J. [26] O. Kaplan, K. Bezousˇka, A. Malandra, A. Vesel, A. Kazlauskas, U. T. Bornscheuer, Nat. Chem. Biol. 2009, Petrˇcˇkov, J. Felsberg, A. Ringelov, V. Krˇen, L. 5, 526–529. Martnkov, Biotechnol. Lett. 2011, 33, 309–312. [14] a) H. Griengl, H. Schwab, M. Fechter, Trends Biotech- [27] K. Kino, A. Noguchi, T. Arai, M. Yagasaki, J. Biosci. nol. 2000, 18, 252–256; b) D. V. Johnson, A. A. Zabe- Bioeng. 2010, 110, 39–41. linskaja-Mackova, H. Griengl, Curr. Opin. Chem. [28] B. van Loo, J. Kingma, M. Arand, M. G. Wubbolts, Biol. 2000, 4, 103- 109. D. B. Janssen, Appl. Environ. Microbiol. 2006, 72, [15] a) A. Musidlowska, S. Lange, U. T. Bornscheuer, 2905–2917. Angew. Chem. 2001, 113, 2934–2936; Angew. Chem. [29] L.-J. Wang, C.-X. Li, Y. Ni, J. Zhang, X. Liu, J.-H. Xu, Int. Ed. 2001, 40, 2851–2853; b) A. Hummel, E. Brse- Bioresour. Technol. 2011, 102, 7023–7028. haber, D. Bçttcher, H. Trauthwein, K. Doderer, U. T. [30] R. Machielsen, A. R. Uria, S. W. M. Kengen, J. van Bornscheuer, Angew. Chem. 2007, 119, 8644–8646; der Oost, Appl. Environ. Microbiol. 2006, 72, 233–238. Angew. Chem. Int. Ed. 2007, 46, 8492–8494. [31] M. W. Fraaije, J. Wu, D. P. H. M. Heuts, E. W. van [16] a) R. I. Amann, W. Ludwig, K.-H. Schleifer, Micro- Hellemond, J. H. L. Spelberg, D. B. Janssen, Appl. Mi- biol. Rev. 1995, 59, 143–169; b) J. T. Staley, A. Konop- crobiol. Biotechnol. 2005, 66, 393–400. ka, Annu. Rev. Microbiol. 1985, 39, 321–346; c) N. R. [32] J. D. Stewart, in: Advances in Microbiology, Vol. 59, Pace, Science 1997, 276, 734–740; d) P. Hugenholtz, (Eds.: S. S. Allen I. Laskin, M. G. Geoffrey), Academ- B. M. Goebel, N. R. Pace, J. Bacteriol. 1998, 180, ic Press, San Diego, 2006, pp 31–52. 4765–4774. [33] T. Furuya, K. Kino, Appl. Microbiol. Biotechnol. 2010, [17] W. R. Streit, R. Daniel, (Eds.), Metagenomics, Spring- 86, 991–1002. er Protocols, Humana Press, New York, 2010. [34] S. M. Li, J. Antibiot. 2011, 64, 45–49. [18] a) T. Uchiyama, K. Miyazaki, Curr. Opin. Biotechnol. [35] P. B. Juhl, K. Doderer, F. Hollmann, O. Thum, J. 2009, 20, 616–622; b) L. Fernndez-Arrojo, M.-E. Pleiss, J. Biotechnol. 2010, 150, 474–480. Guazzaroni, N. Lpez-Corts, A. Beloqui, M. Ferrer, [36] N. H. Barakat, N. H. Barakat, J. J. Love, Protein Eng. Curr. Opin. Biotechnol. 2010, 21, 725–733. Des. Sel. 2010, 23, 799–807. [19] a) E. Y. Yu, M. A. Kwon, M. Lee, J. Y. Oh, J. E. Choi, [37] S. Lutz, U. T. Bornscheuer, (Eds.), Protein Engineering J. Y. Lee, B. K. Song, D. H. Hahm, J. K. Song, Appl. Handbook, Wiley-VCH, Weinheim, 2009. Microbiol. Biotechnol. 2011, 90, 573–581; b) S. Y. [38] W. P. C. Stemmer, Nature 1994, 370, 389–391. Park, H. J. Shin, G. J. Kim, J. Microbiol. 2011, 49,7– [39] R. C. Cadwell, G. F. Joyce, PCR Meth. Appl. 1992, 2, 14; c) Y. Okamura, T. Kimura, H. Yokouchi, M. Men- 28–33. eses-Osorio, M. Katoh, T. Matsunaga, H. Takeyama, [40] L. W. Chiang, in: In Vitro Mutagenesis Protocols,(Ed.: Mar. Biotechnol. 2010, 12, 395–402; d) R. P. Chen, M. K. Trower), Humana Press, Totowa, NJ, 1996, L. Z. Guo, H. Y. Dang, World J. Microbiol. Biotech- pp 311–321. nol. 2011, 27, 431–441; e) S. Bayer, A. Kunert, M. [41] a) F. H. Arnold, G. Georgiou, (Eds.), Directed Evolu- Ballschmiter, T. Greiner-Stoeffele, J. Mol. Microbiol. tion Library Creation: Methods and Protocols, Vol. Biotechnol. 2010, 18, 181–187. 231, Humana Press, Totawa, 2003; b) F. H. Arnold, G. [20] a) G. Z. Wang, H. Y. Luo, Y. R. Wang, H. Q. Huang, Georgiou, (Eds.), Directed Enzyme Evolution: Screen- P. J. Shi, P. L. Yang, K. Meng, Y. G. Bai, B. Yao, Biore- ing and Selection Methods, Vol. 230, Humana Press, sour. Technol. 2011, 102, 3330–3336; b) X. C. Mo, Totawa, 2003; c) S. Brakmann, A. Schwienhorst, C. L. Chen, H. Pang, Y. Feng, J. X. Feng, Appl. Micro- (Eds.), Evolutionary Methods in Biotechnology: biol. Biotechnol. 2010, 87, 2137–2146. Clever Tricks for Directed Evolution, Wiley-VCH, [21] J. A. Liu, W. D. Liu, X. L. Zhao, W. J. Shen, H. Cao, Weinheim, 2004. Z. L. Cui, Appl. Microbiol. Biotechnol. 2011, 89, [42] G. Y. Yang, S. G. Withers, ChemBioChem 2009, 10, 1083–1092. 2704–2715. [22] E. J. Kwon, Y. S. Jeong, Y. H. Kim, S. K. Kim, H. B. [43] a) K. M. Polizzi, M. Parikh, C. U. Spencer, I. Matsu- Na, J. Kim, H. D. Yun, H. Kim, J. Korean Soc. Appl. mura, J. H. Lee, M. J. Realff, A. S. Bommarius, Bio- Biol. 2010, 53, 702–708. technol. Prog. 2006, 22, 961–967; b) K. M. Polizzi, [23] K. Wang, G. Li, S. Q. Yu, C. T. Zhang, Y. H. A. Liu, C. U. Spencer, A. Dubey, I. Matsumura, J. H. Lee, Appl. Microbiol. Biotechnol. 2010, 88, 155–165. M. J. Realff, A. S. Bommarius, J. Biomol. Screening [24] J. C. Venter, K. Remington, J. F. Heidelberg, A. L. 2005, 10, 856–864. Halpern, D. Rusch, J. A. Eisen, D. Wu, I. Paulsen, [44] a) J. D. Bloom, F. H. Arnold, Proc. Natl. Acad. Sci. K. E. Nelson, W. Nelson, D. E. Fouts, S. Levy, A. H. USA 2009, 106, 9995–10000; b) R. D. Gupta, D. S. Knap, M. W. Lomas, K. Nealson, O. White, J. Peter- Tawfik, Nat. Methods 2008, 5, 939–942; c) S. G. Peisa- son, J. Hoffman, R. Parsons, H. Baden-Tillson, C. jovich, D. S. Tawfik, Nat. Methods 2007, 4, 991–994.

Adv. Synth. Catal. 2011, 353, 2191 – 2215 2011 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim asc.wiley-vch.de 2211 REVIEWS Geoffrey A. Behrens et al.

[45] E. Krieger, G. Koraimann, G. Vriend, Proteins Struct. [66] H. Asako, M. Shimizu, N. Itoh, Appl. Microbiol. Bio- Funct. Gen. 2002, 47, 393–402. technol. 2009, 84, 397–405. [46] E. Krieger, T. Darden, S. B. Nabuurs, A. Finkelstein, [67] J. A. Tian, P. Wang, S. Gao, X. Y. Chu, N. F. Wu, Y. L. G. Vriend, Proteins Struct. Funct. Bioinf. 2004, 57, Fan, FEBS J. 2010, 277, 4901–4908. 678–683. [68] K. Yamashiro, S. Yokobori, S. Koikeda, A. Yamagishi, [47] a) M. Mladenovic, K. Junold, R. F. Fink, W. Thiel, T. Protein Eng. Des. Sel. 2010, 23, 519–528. Schirmeister, B. Engels, J. Phys. Chem. B 2008, 112, [69] W. J. Liu, X. Z. Zhang, Z. M. Zhang, Y. H. P. Zhang, 5458–5469; b) H. M. Senn, W. Thiel, Atomistic Ap- Appl. Environ. Microbiol. 2010, 76, 4914–4917. proaches in Modern Biology: From Quantum Chemis- [70] P. Heinzelman, C. D. Snow, I. Wu, C. Nguyen, A. Vil- try to Molecular Simulations, 2007, 268, 173–290. lalobos, S. Govindarajan, J. Minshull, F. H. Arnold, [48] S. Bartsch, R. Kourist, U. T. Bornscheuer, Angew. Proc. Natl. Acad. Sci. USA 2009, 106, 5610–5615. Chem. 2008, 120, 1531–1534; Angew. Chem. Int. Ed. [71] G. A. Kotzia, N. E. Labrou, FEBS J. 2009, 276, 1750– 2008, 47, 1508–1511. 1761. [49] a) M. T. Reetz, L. W. Wang, M. Bocola, Angew. Chem. [72] M. T. Reetz, P. Soni, L. Fernandez, Biotechnol. 2006, 118, 1258–1263; Angew. Chem. Int. Ed. 2006, 45, Bioeng. 2009, 102, 1712–1717. 1236–1241; b) M. T. Reetz, M. Bocola, J. D. Carbal- [73] T. Belien, I. J. Joye, J. A. Delcour, C. M. Courtin, Pro- leira, D. X. Zha, A. Vogel, Angew. Chem. 2005, 117, tein Eng. Des. Sel. 2009, 22, 587–596. 4264–4268; Angew. Chem. Int. Ed. 2005, 44, 4192– [74] D. Y. Colin, P. Deprez-Beauclair, N. Silva, L. Infantes, 4196. B. Kerfelec, Protein Eng. Des. Sel. 2010, 23, 365–373. [50] D. M. Weinreich, N. F. Delaney, M. A. Depristo, D. L. [75] H. Y. Zhou, H. Y. Pan, L. Q. Rao, Y. Y. Wu, Appl. Hartl, Science 2006, 312, 111–114. Biochem. Biotechnol. 2011, 163, 186–194. [51] M. T. Reetz, J. Sanchis, ChemBioChem 2008, 9, 2260– [76] M. Brouk, Y. Nov, A. Fishman, Appl. Environ. Micro- 2267. biol. 2010, 76, 6397–6403. [52] R. J. Fox, S. C. Davis, E. C. Mundorff, L. M. Newman, [77] C. Y. Chen, I. Georgiev, A. C. Anderson, B. R. V. Gavrilovic, S. K. Ma, L. M. Chung, C. Ching, S. Donald, Proc. Natl. Acad. Sci. USA 2009, 106, 3764– Tam, S. Muley, J. Grate, J. Gruber, J. C. Whitman, 3769. R. A. Sheldon, G. W. Huisman, Nat. Biotechnol. 2007, [78] R. Gerasimaite, G. Vilkaitis, S. Klimasauskas, Nucleic 25, 338–344. Acids Res. 2009, 37, 7332–7341. [53] M. Lehmann, L. Pasamontes, S. F. Lassen, M. Wyss, [79] M. Pedotti, E. Rosini, G. Molla, T. Moschetti, C. Biochim. Biophys. Acta 2000, 1543, 408–415. Savino, B. Vallone, L. Pollegioni, J. Biol. Chem. 2009, [54] E. Vazquez-Figueroa, V. Yeh, J. M. Broering, J. F. 284, 36415–36423. Chaparro-Riggers, A. S. Bommarius, Protein Eng. Des. [80] D. Ribitsch, S. Winkler, K. Gruber, W. Karl, E. Wehr- Sel. 2008, 21, 673–680. schutz-Sigl, I. Eiteljorg, P. Schratl, P. Remler, R. Stehr, [55] a) R. K. Kuipers, H. J. Joosten, W. J. van Berkel, N. G. C. Bessler, N. Mussmann, K. Sauter, K. H. Maurer, H. Leferink, E. Rooijen, E. Ittmann, F. van Zimmeren, Schwab, Appl. Microbiol. Biotechnol. 2010, 87, 1743– H. Jochens, U. Bornscheuer, G. Vriend, V. A. dos 1752. Santos, P. J. Schaap, Proteins 2010, 78, 2101–2113; [81] A. A. Qaed, H. Lin, D. F. Tang, Z. L. Wu, Biotechnol. b) R. Kourist, H. Jochens, S. Bartsch, R. Kuipers, S. K. Lett. 2011, 33, 611–616. Padhi, M. Gall, D. Bçttcher, H. J. Joosten, U. T. Born- [82] R. B. Hamed, J. R. Gomez-Castellanos, A. Thalham- scheuer, ChemBioChem 2010, 11, 1635–1643. mer, D. Harding, C. Ducho, T. D. W. Claridge, C. J. [56] H. Jochens, U. T. Bornscheuer, ChemBioChem 2010, Schofield, Nat. Chem. 2011, 3, 365–371. 11, 1861–1866. [83] S. Schneider, M. Gutierrez, T. Sandalova, G. Schneid- [57] H. Jochens, D. Aerts, U. T. Bornscheuer, Protein Eng. er, P. Clapes, G. A. Sprenger, A. K. Samland, Chem- Des. Sel. 2010, 23, 903–909. BioChem 2010, 11, 681–690. [58] M. T. Reetz, P. Soni, L. Fernandez, Y. Gumulya, J. D. [84] N. U. Nair, H. M. Zhao, ChemBioChem 2008, 9, 1213– Carballeira, Chem. Commun. 2010, 46, 8657–8658. 1215. [59] M. T. Reetz, J. D Carballeira, A. Vogel, Angew. [85] R. Feingersch, J. Shainsky, T. K. Wood, A. Fishman, Chem. 2006, 118, 7909–7915; Angew. Chem. Int. Ed. Appl. Environ. Microbiol. 2008, 74, 1555–1566. 2006, 45, 7745–7751. [86] F. Escalettes, N. J. Turner, ChemBioChem 2008, 9, [60] D. Choudhury, S. Biswas, S. Roy, J. K. Dattagupta, 857–860. Protein Eng. Des. Sel. 2010, 23, 457–467. [87] H. B. Zheng, M. T. Reetz, J. Am. Chem. Soc. 2010, [61] J. C. Joo, S. Pohkrel, S. P. Pack, Y. J. Yoo, J. Biotech- 132, 15744–15751. nol. 2010, 146, 31–39. [88] K. Engstroem, J. Nyhlen, A. G. Sandstroem, J.-E. [62] O. Spadiut, C. Leitner, C. Salaheddin, B. Varga, B. G. Baeckvall, J. Am. Chem. Soc. 2010, 132, 7038–7042. Vertessy, T. C. Tan, C. Divne, D. Haltrich, FEBS J. [89] E. L. Ang, J. P. Obbard, H. Zhao, Appl. Microbiol. 2009, 276, 776–792. Biotechnol. 2009, 81, 1063–1070. [63] J. Wei, Y. Zhou, T. Xu, B. Lu, Appl. Biochem. Bio- [90] E. Campbell, I. R. Wheeldon, S. Banta, Biotechnol. technol. 2010, 162, 116–126. Bioeng. 2010, 107, 763–774. [64] M.-S. Kim, X. G. Lei, Appl. Microbiol. Biotechnol. [91] N. Stiti, I. O. Adewale, J. Petersen, D. Bartels, H. H. 2008, 79, 69–75. Kirch, Biochem. J. 2011, 434, 459–471. [65] Z. L. Han, S. Y. Han, S. P. Zheng, Y. Lin, Appl. Micro- [92] N. Richter, A. Zienert, W. Hummel, Eng. Life Sci. biol. Biotechnol. 2009, 85, 117–126. 2011, 11, 26–36.

2212 asc.wiley-vch.de 2011 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim Adv. Synth. Catal. 2011, 353, 2191 – 2215 Discovery and Protein Engineering of Biocatalysts for Organic Synthesis

[93] M. M. Kopacz, S. Rovida, E. van Duijn, M. W. Fraaije, [117] A. Mçglich, R. A. Ayers, K. Moffat, J. Mol. Biol. A. Mattevi, Biochemistry 2011, 50, 4209–4217. 2009, 385, 1433–1444. [94] T. Ema, S. Kamata, M. Takeda, Y. Nakano, T. Sakai, [118] X. Hu, S. Robin, S. OConnell, G. Walsh, J. G. Wall, Chem. Commun. 2010, 46, 5440–5442. Appl. Microbiol. Biotechnol. 2010, 87, 1773–1782. [95] J. D. Carballeira, P. Krumlinde, M. Bocola, A. Vogel, [119] D. Koszelewski, D. Clay, D. Rozzell, W. Kroutil, Eur. M. T. Reetz, J.-E. Backvall, Chem. Commun. 2007, J. Org. Chem. 2009, 2289–2292. 1913–1915. [120] G. B. Matcham, L. Mohit; W. Lang, C. Lewis, Nelson, [96] M. Melzer, J. C. H. Chen, A. Heidenreich, J. Gab, M. A. Wang, W. Wu, Chimia 1999, 53, 584–589. Koller, K. Kehe, M. M. Blum, J. Am. Chem. Soc. [121] A. R. Martin, R. DiSanto, I. Plotnikov, S. Kamat, D. 2009, 131, 17226–17232. Shonnard, S. Pannuri, Biochem. Eng. J. 2007, 37, 246– [97] G. J. Williams, T. Woodhall, L. M. Farnsworth, A. 255. Nelson, A. Berry, J. Am. Chem. Soc. 2006, 128, 16238– [122] T.-H. Jang, B. Kim, O. K. Park, J. Y. Bae, B.-G. Kim, 16247. H. Yun, H. H. Park, Acta Crystallogr. F 2010, 66, 923– [98] W. L. Tang, Z. Li, H. Zhao, Chem. Commun. 2010, 46, 925. 5461–5463. [123] B.-K. Cho, H.-Y. Park, J.-H. Seo, J. Kim, T.-J. Kang, [99] M. M. Musa, N. Lott, M. Laivenieks, L. Watanabe, C. B.-S. Lee, B.-G. Kim, Biotechnol. Bioeng. 2008, 99, Vieille, R. S. Phillips, ChemCatChem 2009, 1, 89–93. 275–284. [100] H. Raj, B. Weiner, V. P. Veetil, C. R. Reis, W. J. Quax, [124] S. Pannuri, S. V. Kamat, A. R. M. Garcia, Patent WO D. B. Janssen, B. L. Feringa, G. J. Poelarends, Chem- 2006/063336 A2, 2006. BioChem 2009, 10, 2236–2245. [125] M. Svedendahl, C. Branneby, L. Lindberg, P. Ber- [101] S. M. Lippow, T. S. Moon, S. Basu, S. H. Yoon, X. Li, glund, ChemCatChem 2010, 2, 976–980. B. A. Chapman, K. Robison, D. Lipovsek, K. L. J. [126] A. A. Desai, Angew. Chem. 2011, 123, 2018–2020; Prather, Chem. Biol. 2010, 17, 1306–1315. Angew. Chem. Int. Ed. 2011, 50, 1974–1976. [102] M. Pavlova, M. Klvana, Z. Prokop, R. Chaloupkova, [127] M. Hçhne, S. Schtzle, H. Jochens, K. Robins, U. T. P. Banas, M. Otyepka, R. C. Wade, M. Tsuda, Y. Bornscheuer, Nat. Chem. Biol. 2010, 6, 807–813. Nagata, J. Damborsky, Nat. Chem. Biol. 2009, 5, 727– [128] S. Schtzle, F. Steffen-Munsberg, M. Hçhne, K. 733. Robins, U. T. Bornscheuer, Adv. Synth. Catal. 2011, [103] F. Zheng, W. C. Yang, M. C. Ko, J. J. Liu, H. Cho, 353, 2439–2445. D. Q. Gao, M. Tong, H. H. Tai, J. H. Woods, C. G. [129] H. S. Toogood, J. M. Gardiner, N. S. Scrutton, Chem- Zhan, J. Am. Chem. Soc. 2008, 130, 12148–12155. CatChem 2010, 2, 892–914. [104] R. D. Gupta, M. Goldsmith, Y. Ashani, Y. Simo, G. [130] R. Stuermer, B. Hauer, M. Hall, K. Faber, Curr. Opin. Mullokandov, H. Bar, M. Ben-David, H. Leader, R. Chem. Biol. 2007, 11, 203–213. Margalit, I. Silman, J. L. Sussman, D. S. Tawfik, Nat. [131] S. K. Padhi, D. J. Bougioukou, J. D. Stewart, J. Am. Chem. Biol. 2011, 7, 120–125. Chem. Soc. 2009, 131, 3271–3280. [105] N. Ran, J. W. Frost, J. Am. Chem. Soc. 2007, 129, [132] D. J. Bougioukou, S. Kille, A. Taglieber, M. T. Reetz, 6130–6139. Adv. Synth. Catal. 2009, 351, 3287–3305. [106] S. Bartsch, U. T. Bornscheuer, Protein Eng. Des. Sel. [133] M. E. Hulley, H. S. Toogood, A. Fryszkowska, D. 2010, 23, 929–933. Mansell, G. M. Stephens, J. M. Gardiner, N. S. Scrut- [107] S. Atsumi, J. C. Liao, Appl. Environ. Microbiol. 2008, ChemBioChem 11 74, 7802–7808. ton, 2010, , 2433–2447. [108] K.-C. Chen, C.-H. Wu, C.-Y. Chang, W.-C. Lu, Q. [134] H. S. Toogood, A. Fryszkowska, M. Hulley, M. Tseng, Z. M. Prijovich, W. Schechinger, Y.-C. Liaw, Sakuma, D. Mansell, G. M. Stephens, J. M. Gardiner, Y.-L. Leu, S. R. Roffler, Chem. Biol. 2008, 15, 1277– N. S. Scrutton, ChemBioChem 2011, 12, 738–749. 1286. [135] a) R. Kourist, U. T. Bornscheuer, Appl. Microbiol. [109] H. Zhang, G. T. Lountos, C. B. Ching, R. Jiang, Appl. Biotechnol. 2011, 91, 505–517; b) R. Kourist, P. Dom- Microbiol. Biotechnol. 2010, 88, 117–124. nguez de Mara, U. T. Bornscheuer, ChemBioChem [110] S. E. Deacon, M. J. McPherson, ChemBioChem 2011, 2008, 9, 491–498. 12, 593–601. [136] E. Henke, U. T. Bornscheuer, R. D. Schmid, J. Pleiss, [111] H. Masukawa, K. Inoue, H. Sakurai, C. P. Wolk, R. P. ChemBioChem 2003, 4, 485–493. Hausinger, Appl. Environ. Microbiol. 2011, 76, 6741– [137] B. Heinze, R. Kourist, L. Fransson, K. Hult, U. T. 6750. Bornscheuer, Protein Eng. Des. Sel. 2007, 20, 125–131. [112] J. Harle, S. Gunther, B. Lauinger, M. Weber, B. Kam- [138] R. Kourist, S. Bartsch, U. T. Bornscheuer, Adv. Synth. merer, D. L. Zechel, A. Luzhetskyy, A. Bechthold, Catal. 2007, 349, 1393–1398. Chem. Biol. 2011, 18, 520–530. [139] M. Baumann, R. Strmer, U. T. Bornscheuer, Angew. [113] D. Meyer, L. Walter, G. Kolter, M. Pohl, M. Muller, Chem. 2001, 113, 4329–4333; Angew. Chem. Int. Ed. K. Tittmann, J. Am. Chem. Soc. 2011, 133, 3609–3616. 2001, 40, 4201–4204. [114] G. S. Brandt, M. M. Kneen, G. A. Petsko, D. Ringe, [140] J. C. G. Woo, R. B. Silverman, J. Am. Chem. Soc. M. J. McLeish, J. Am. Chem. Soc. 2001, 123, 438–439. 1995, 117, 1663–1664. [115] V. Mathieu, J. Fastrez, P. Soumillion, Protein Eng. [141] M. Alexeeva, A. Enright, M. J. Dawson, M. Mahmou- Des. Sel. 2010, 23, 699–709. dian, N. J. Turner, Angew. Chem. 2002, 114, 3309– [116] T. Ying, F. Zhong, Z.-H. Wang, W. Li, X. Tan, Z.-X. 3312; Angew. Chem. Int. Ed. 2002, 41, 3177–3180. Huang, ChemBioChem 2011, 12, 707–710.

Adv. Synth. Catal. 2011, 353, 2191 – 2215 2011 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim asc.wiley-vch.de 2213 REVIEWS Geoffrey A. Behrens et al.

[142] R. Carr, M. Alexeeva, A. Enright, T. S. C. Eve, M. J. [161] G. J. Williams, S. Domann, A. Nelson, A. Berry, Proc. Dawson, N. J. Turner, Angew. Chem. 2003, 115, 4955– Natl. Acad. Sci. USA 2003, 100, 3143–3148. 4958; Angew. Chem. Int. Ed. 2003, 42, 4807–4810. [162] S. Jennewein, M. Schrmann, M. Wolberg, I. Hilker, [143] R. Carr, M. Alexeeva, M. J. Dawson, V. Gotor-Fer- R. Luiten, M. Wubbolts, D. Mink, Biotechnol. J. 2006, nandez, C. E. Humphrey, N. J. Turner, ChemBioChem 1, 537–548. 2005, 6, 637–639. [163] W. A. Greenberg, A. Varvak, S. R. Hanson, K. Wong, [144] C. J. Dunsmore, R. Carr, T. Fleming, N. J. Turner, J. H. Huang, P. Chen, M. J. Burk, Proc. Natl. Acad. Sci. Am. Chem. Soc. 2006, 128, 2224–2225. USA 2004, 101, 5788–5793. [145] K. R. Bailey, A. J. Ellis, R. Reiss, T. J. Snape, N. J. [164] a) D. Sirim, F. Wagner, A. Lisitsa, J. Pleiss, BMC Bio- Turner, Chem. Commun. 2007, 44, 3640–3642. chem. 2009, 10, 27; b) D. R. Nelson, Human Genomics [146] V. Kçhler, K. R. Bailey, A. Znabet, J. Raftery, M. Hel- 2009, 4, 1473–1479; c) M. Fischer, Q. K. Thai, M. liwell, N. J. Turner, Angew. Chem. 2010, 122, 2228– Grieb, J. Pleiss, BMC Bioinformatics 2006, 7, 495; 2230; Angew. Chem. Int. Ed. 2010, 49, 2182–2184. d) M. Fischer, M. Knoll, D. Sirim, F. Wagner, S. [147] A. Znabet, E. Ruijter, F. J. J. de Kanter, V. Kçhler, M. Funke, J. Pleiss, Bioinformatics 2007, 23, 2015–2017. Helliwell, N. J. Turner, R. V. A. Orru, Angew. Chem. [165] A. Seifert, S. Vomund, K. Grohmann, S. Kriening, 2010, 122, 5417–5420; Angew. Chem. Int. Ed. 2010, 49, V. B. Urlacher, S. Laschat, J. Pleiss, ChemBioChem 5289–5292. 2009, 10, 853–861. [148] D. B. Janssen, Adv. Appl. Microbiol. 2007, 61, 233– [166] Q.-S. Li, J. Ogawa, R. D. Schmid, S. Shimizu, Appl. 252. Environ. Microbiol. 2001, 67, 5735–5739. [149] T. Koudelakova, E. Chovancova, J. Brezovsky, M. [167] J. A. Dietrich, Y. Yoshikuni, K. J. Fisher, F. X. Woo- Monincova, A. Fortova, J. Jarkovsky, J. Damborsky, lard, D. Ockey, D. J. McPhee, N. S. Renninger, Biochem. J. 2011, 354, 345–354. M. C. Y. Chang, D. Baker, J. D. Keasling, ACS Chem. [150] L. S. Campbell-Verduyn, W. Szymnski, C. P. Postema, Biol. 2009, 4, 261–267. R. A. Dierckx, P. H. Elsinga, D. B. Janssen, B. L. Fer- [168] Y. Tu, R. Deshmukh, M. Sivaneri, G. D. Szklarz, Drug inga, Chem. Commun. 2010, 46, 898–900. Metab. Dispos. 2008, 36, 2371–2380. [151] K. A. Gray, T. H. Richardson, K. Kretz, J. M. Short, F. [169] S. Eiben, L. Kaysser, S. Maurer, K. Khnel, V. B. Ur- Bartnek, R. Knowles, L. Kan, P. E. Swanson, Dan E. lacher, R. D. Schmid, J. Biotechnol. 2006, 124, 662– Adv. Synth. Catal. 343 Robertson, 2001, , 607–617. 669. [152] T. Bosma, G. Stucki, D. B. Janssen, Appl. Environ. Mi- [170] S. Li, L. M. Podust, D. H. Sherman, J. Am. Chem. Soc. crobiol. 2002, 68, 3582–3587. 2007, 129, 12940–12941. [153] Z. Prokop, Y. Sato, J. Brezovsky, T. Mozga, R. Cha- [171] F. Sabbadin, R. Hyde, A. Robin, E.-M. Hilgarth, M. loupkova, T. Koudelakova, P. Jerabek, V. Stepankova, Delenne, S. Flitsch, N. Turner, G. Grogan, N. C. R. Natsume, J. G. E. van Leeuwen, D. B. Janssen, J. Bruce, ChemBioChem 2010, 11, 987–994. Florian, Y. Nagata, T. Senda, J. Damborsky, Angew. [172] P. A. England, C. F. Harford-Cross, J.-A. Stevenson, Chem. 2010, 122, 6247–6251; Angew. Chem. Int. Ed. D. A. Rouch, L.-L. Wong, FEBS Lett. 1998, 424, 271– 2010, 6111–6115. 274. [154] L. Tang, J. E. T. v. H. Vlieg, J. H. L. Spelberg, M. W. [173] G. Hoffmann, K. Bçnsch, T. Greiner-Stçffele, M. Fraaije, D. B. Janssen, Enzyme Microb. Technol. 2002, Ballschmiter, Protein Eng. Des. Sel. 2011, 24, 439–446. 30, 251–258. [155] L. Tang, D. E. Torres PazmiÇo, M. W. Fraaije, R. M. [174] S. G. Bell, X. Chen, R. J. Sowden, F. Xu, J. N. Wil- de Jong, B. W. Dijkstra, D. B. Janssen, Biochemistry liams, L.-L. Wong, Z. Rao, J. Am. Chem. Soc. 2003, 2005, 44, 6609–6618. 125, 705–714. [156] R. Schoevaart, F. van Rantwijk, R. A. Sheldon, Bio- [175] T. Kubo, M. W. Peters, P. Meinhold, F. H. Arnold, technol. Bioeng. 2000, 70, 349–352. Chem. Eur. J. 2006, 12, 1216–1220. [157] J.-Q. Liu, T. Dairi, N. Itoh, M. Kataoka, S. Shimizu, [176] a) S. Panke, B. Witholt, A. Schmid, M. G. Wubbolts, H. Yamada, J. Mol. Catal. B: Enzym. 2000, 10, 107– Appl. Environ. Microbiol. 1998, 64, 2032–2043; b) D. 115. Tischler, D. Eulberg, S. Lakner, S. R. Kaschabek, [158] a) X. Garrabou, J. Joglar, T. Parella, R. Crehuet, J. W. J. H. van Berkel, M. Schlomann, J. Bacteriol. 2009, Bujons, P. Claps, Adv. Synth. Catal. 2011, 353, 89–99; 191, 4996–5009. b) J. A. Castillo, C. Gurard-Hlaine, M. Gutirrez, X. [177] L. Gursky, J. Nikodinovic-Runic, K. Feenstra, K. Garrabou, M. Sancelme, M. Schrmann, T. Inoue, V. OConnor, Appl. Microbiol. Biotechnol. 2010, 85, 995– Hlaine, F. Charmantray, T. Gefflaut, L. Hecquet, J. 1004. Joglar, P. Claps, G. A. Sprenger, M. Lemaire, Adv. [178] R. Gagnon, G. Grogan, M. S. Levitt, S. M. Robets, Synth. Catal. 2010, 352, 1039–1046. P. W. H. Wan, A. J. Willetts, J. Chem. Soc. Perkin [159] M. von Itzstein, W.-Y. Wu, G. B. Kok, M. S. Pegg, J. C. Trans. 1 1994, 2537–2543. Dyason, B. Jin, T. V. Phan, M. L. Smythe, H. F. White, [179] D. Bonsor, S. F. Butz, J. Solomons, S. Grant, I. J. S. S. W. Oliver, P. M. Colman, J. N. Varghese, D. M. Fairlamb, M. J. Fogg, G. Grogan, Org. Biomol. Chem. Ryan, J. M. Woods, R. C. Bethell, V. J. Hotham, J. M. 2006, 4, 1252–1260. Cameron, C. R. Penn, Nature 1993, 363, 418–423. [180] C. Szolkowy, L. D. Eltis, N. C. Bruce, G. Grogan, [160] M. Wada, C.-C. Hsu, D. Franke, M. Mitchell, A. ChemBioChem 2009, 10, 1208–1217. Heine, I. Wilson, C.-H. Wong, Bioorg. Med. Chem. [181] E. Malito, A. Alfieri, M. W. Fraaije, A. Mattevi, Proc. 2003, 11, 2091–2098. Natl. Acad. Sci. USA 2004, 101, 13157–13162.

2214 asc.wiley-vch.de 2011 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim Adv. Synth. Catal. 2011, 353, 2191 – 2215 Discovery and Protein Engineering of Biocatalysts for Organic Synthesis

[182] I. A. Mirza, B. J. Yachnin, S. Wang, S. Grosse, H. Ber- C. A. Smith, W. Sheffler, I. W. Davis, S. Cooper, A. geron, A. Imura, H. Iwaki, Y. Hasegawa, P. C. K. Lau, Treuille, D. J. Mandell, F. Richter, Y. E. A. Ban, S. J. A. M. Berghuis, J. Am. Chem. Soc. 2009, 131, 8848– Fleishman, J. E. Corn, D. E. Kim, S. Lyskov, M. Ber- 8854. rondo, S. Mentzer, Z. Popovic, J. J. Havranek, J. Kara- [183] a) M. Bocola, F. Schulz, F. Leca, A. Vogel, M. W. nicolas, R. Das, J. Meiler, T. Kortemme, J. J. Gray, B. Fraaije, M. T. Reetz, Adv. Synth. Catal. 2005, 347, Kuhlman, D. Baker, P. Bradley, Methods Enzymol. 979–986; b) M. T. Reetz, S. Wu, Chem. Commun. 2011, 545–574. 2008, 5499–5501. [194] L. Jiang, E. A. Althoff, F. R. Clemente, L. Doyle, D. [184] M. T. Reetz, S. Wu, J. Am. Chem. Soc. 2009, 131, Rothlisberger, A. Zanghellini, J. L. Gallaher, J. L. 15424–15432. Betker, F. Tanaka, C. F. Barbas 3rd, D. Hilvert, K. N. [185] A. Kirschner, U. T. Bornscheuer, Appl. Microbiol. Houk, B. L. Stoddard, D. Baker, Science 2008, 319, Biotechnol. 2008, 81, 465–472. 1387–1391. [186] a) M. T. Reetz, B. Brunner, T. Schneider, F. Schulz, [195] D. Rçthlisberger, O. Khersonsky, A. M. Wollacott, L. C. M. Clouthier, M. M. Kayser, Angew. Chem. 2004, Jiang, J. DeChancie, J. Betker, J. L. Gallaher, E. A. 116, 4167–4170; Angew. Chem. Int. Ed. 2004, 43, Althoff, A. Zanghellini, O. Dym, S. Albeck, K. N. 4075–4078; b) M. D. Mihovilovic, F. Rudroff, A. Win- Houk, D. S. Tawfik, D. Baker, Nature 2008, 453, 190– Org. Lett. 8 ninger, T. Schneider, F. Schulz, 2006, , 195. 1221–1224. [196] O. Khersonsky, D. Rothlisberger, A. M. Wollacott, P. [187] M. T. Reetz, F. Daligault, B. Brunner, H. Hinrichs, A. Murphy, O. Dym, S. Albeck, G. Kiss, K. N. Houk, D. Deege, Angew. Chem. 2004, 116, 4170–4173; Angew. Baker, D. S. Tawfik, J. Mol. Biol. 2011, 407, 391–412. Chem. Int. Ed. 2004, 43, 4078–4081. [197] J. B. Siegel, A. Zanghellini, H. M. Lovick, G. Kiss, [188] D. J. Opperman, M. T. Reetz, ChemBioChem 2010, 11, A. R. Lambert, J. L. St Clair, J. L. Gallaher, D. Hil- 2589–2596. [189] N. M. Kamerbeek, M. W. Fraaije, D. B. Janssen, Eur. J. vert, M. H. Gelb, B. L. Stoddard, K. N. Houk, F. E. Biochem. 2004, 271, 2107–2116. Michael, D. Baker, Science 2010, 329, 309–313. [190] F. Hollmann, A. Taglieber, F. Schulz, M. T. Reetz [198] N. U. Nair, C. A. Denard, H. Zhao, Curr. Org. Chem. Angew. Chem. 2007, 119, 2961–2964; Angew. Chem. 2010, 14, 1870–1882. Int. Ed. 2007, 46, 2903–2906. [199] Special issue: “Chemoenzymatic Synthesis” Chem- [191] D. E. Torres PazmiÇo, R. Snajdrova, B.-J. Baas, M. CatChem 2011, 3, 237–419. Ghobrial, M. D. Mihovilovic, M. W. Fraaije, Angew. [200] B. Wilms, A. Wiese, C. Syldatk, R. Mattes, J. Alten- Chem. 2008, 120, 2307–2310; Angew. Chem. Int. Ed. buchner, J. Biotechnol. 2001, 86, 19–30. 2008, 47, 2275–2278. [201] E. J. Steen, Y. Kang, G. Bokinsky, Z. Hu, A. Schirmer, [192] a) G. de Gonzalo, M. D. Mihovilovic, M. W. Fraaije, A. McClure, S. B. Del Cardayre, J. D. Keasling, Nature ChemBioChem 2010, 11, 2208–2231; b) J. Rehdorf, 2010, 463, 559–562. U. T. Bornscheuer, Monooxygenases, Baeyer–Villiger [202] H. Yim, R. Haselbeck, W. Niu, C. Pujol-Baxley, A. Applications in Organic Synthesis, John Wiley & Sons, Burgard, J. Boldt, J. Khandurina, J. D. Trawick, R. E. Hoboken, 2009; c) H. Leisch, K. Morley, P. C. K. Lau, Osterhout, R. Stephen, J. Estadilla, S. Teisan, H. B. Chem. Rev. 2011, 111, 4165–4222. Schreyer, S. Andrae, T. H. Yang, S. Y. Lee, M. J. Burk, [193] A. Leaver-Fay, M. Tyka, S. M. Lewis, O. F. Lange, J. S. van Dien, Nat. Chem. Biol. 2011, 7, 445–452. Thompson, R. Jacak, K. Kaufman, P. D. Renfrew,

Adv. Synth. Catal. 2011, 353, 2191 – 2215 2011 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim asc.wiley-vch.de 2215 The Articles

ARTICLE II

- 55 -

Anal. Chem. 2009, 81, 8244–8248

Rapid and Sensitive Kinetic Assay for Characterization of ω-Transaminases

Sebastian Scha¨ tzle,† Matthias Ho¨ hne,† Erik Redestad,† Karen Robins,‡ and Uwe T. Bornscheuer*,†

Institute of Biochemistry, Department of Biotechnology and Enzyme Catalysis, Greifswald University, Felix Hausdorff-Str. 4, 17487 Greifswald, Germany, and Lonza AG, Valais Works, Visp, Switzerland

For the biocatalytic preparation of optically active amines, Table 1. Purification Summary for the ω-TA from ω-transaminases (ω-TA) are of special interest since they Rhodobacter sphaeroides allow the asymmetric synthesis starting from prostereo- specific genic ketones with 100% yield. To facilitate the purifica- volume protein activity yield activity purification tion and characterization of novel ω-TA, a fast kinetic assay step (mL) (mg) (U/mL) (%) (U/mg) factor was developed based on the conversion of the widely used crude cell extract 20 670 5.4 100 0.161 1 model substrate r-methylbenzylamine, which is com- IMAC/ 1 27 27.5 25.5 1.019 6.3 ultrafiltration monly accepted by most of the known ω-TAs. The product from this reaction, acetophenone, can be detected spec- trophotometrically at 245 nm with high sensitivity (ε ) is one of the key prerequisites for efficient processes enabling 12 mM-1 cm-1), since the other reactants show only a the use of transaminases on industrial scale. During process low absorbance. Besides the standard substrate pyru- development or for the identification of novel ω-TA, purification vate, all low-absorbing ketones, aldehydes, or keto of the enzyme and characterization of enzymatic properties is of acids can be used as cosubstrates, and thus the amino great interest, but as ω-TA activity is usually determined with low acceptor specificity of a given ω-TA can be obtained throughput methods like HPLC or capillary electrophoresis (CE), quickly. Furthermore, the assay allows the fast inves- the assay can become the limiting step. tigation of enzymatic properties like pH and tempera- This motivated us to develop a simple, fast and sensitive assay ture optimum and stability. This method was used for comparable to, for example, the p-nitrophenyl ester assays for 6 the characterization of a novel ω-TA cloned from esterases and lipases, which can be used in a similar manner as Rhodobacter sphaeroides, and the data obtained were a standard method for the determination of ω-TA activity. Kim in excellent accordance with a standard capillary and co-workers developed an assay based on the detection of electrophoresis assay. alanine formed during ω-TA-catalyzed reactions: after addition of a CuSO4/MeOH staining solution, the blue colored Cu-alanine complex could be determined at 595 nm.7 With this assay ω-Transaminases (ω-TAs) are PLP-dependent enzymes that various aspects of the reaction can be monitored, like testing catalyze amino group transfer reactions. They have gained different amines and -amino acids as substrates in combination increased attention because of their potential for the asymmetric with pyruvate as the cosubstrate, the enantiopreference of the synthesis of optically active amines, which are frequently used as enzyme, and an estimation of enantioselectivity. There are, building blocks for the preparation of numerous pharmaceuticals.1 however, several drawbacks: Since the staining solution inhibits Although kinetic resolutions of racemic amines have been the enzyme, the assay can only be used as an end point investigated as well, the limitation to a maximum yield of 50% measurement and determination of kinetic parameters is not considerably hampers their application. On the other hand, possible. Furthermore, the sensitivity is very low (ε ≈ 10 M-1 asymmetric synthesis requires methods to shift the unfavorable cm-1), and the most commonly used buffers (e.g., phosphate equilibrium toward synthesis of single enantiomers of optically and the Goods buffer) form insoluble blue Cu-complexes, which pure amines for which several methods were developed.1-5 This need to be centrifuged before the determination of the absor- * To whom correspondence should be addressed. Fax: (+49) 3834-86-80066. bance value. Cell extracts also give a blue colored product, E-mail: [email protected]. which is soluble and thus cannot be removed. This is probably † Greifswald University. due to the amino acids present in the cells. ‡ Valais Works. (1) Ho¨hne, M.; Bornscheuer, U. T. ChemCatChem 2009, 1, 42–51. As an alternative, Truppo and co-workers recently published (2) Matcham, G.; Bhatia, M.; Lang, W.; Lewis, C.; Nelson, R.; Wang, A.; Wu, W. a multi-enzyme cascade pH-indicator assay to monitor the conver- Chimia 1999, 53, 584–589. sion in an asymmetric amine synthesis with ω-TAs.8 The ketone (3) Koszelewski, D.; Lavandera, I.; Clay, D.; Rozzell, D.; Kroutil, W. Adv. Synth. Catal. 2008, 350, 2761–2766. is converted with alanine as cosubstrate, and NADH-dependent (4) Koszelewski, D.; Lavandera, I.; Clay, D.; Guebitz, G. M.; Rozzell, D.; Kroutil, W. Angew. Chem., Int. Ed. 2008, 47, 9337–9340. (6) Krebsfa¨nger, N.; Schierholz, K.; Bornscheuer, U. T. J. Biotechnol. 1998, (5) Ho¨hne, M.; Ku¨hl, S.; Robins, K.; Bornscheuer, U. T. ChemBioChem 2008, 60, 105–111. 9, 363–365. (7) Hwang, B.-Y.; Kim, B.-G. Enzyme Microb. Technol. 2004, 34, 429–436.

8244 Analytical Chemistry, Vol. 81, No. 19, October 1, 2009 10.1021/ac901640q CCC: $40.75  2009 American Chemical Society Published on Web 09/09/2009 Figure 1. Expression of Rhodobacter sphaeroides-ωTA in E. coli BL21. The formation of an overexpression band with a size of around 43 kDa indicates the expression of the ω-TA in the soluble fraction. Also a minor amount of enzyme was found in the insoluble fraction. M: marker (Roti-Mark Standard, Roth: trypsin-inhibitor, 20 kDa; carboanhydrase, 29 kDa; ovalbumine, 43 kDa; serum albumine, 66 kDa; -galactosidase, 119 kDa; myosine, 200 kDa). Purified: enzyme after affinity chromatography (HisTag). Cell: crude extract before purification. lactate or alanine dehydrogenases are used to remove the aggagatatacatatgcgtgacgatgcaccgaattcctg; reverse-primer: gatgat- coproduct pyruvate. NADH is then recycled with glucose dehy- gatgggatcctgaggcgacttcggcgaagaccttc; underlined: NdeI, BamHI drogenase, and the arising gluconic acid δ-lactone induces a restriction sites). After 30 cycles consisting of 30 s denaturation decrease of the pH, which is visualized by the application of a pH at 95 °C, 30 s annealing at 65 °C and elongation at 74 °C for 1.5 indicator. Although the sensitivity is higher compared to the min, the fragment was purified, digested with BamH1 and NdeI, 9 CuSO4/MeOH assay, no information about enantiopreference and ligated in pGASTON. or enantioselectivity can be obtained. Expression and Purification of Recombinant ω-TA in E. coli BL21 (DE3). Transformed E. coli BL21 (DE3) strains were EXPERIMENTAL SECTION grown in 500 mL of LB medium supplemented with ampicillin Cloning, Expression, Purification of Rhodobacter sphaeroi- (100 µg/mL). Cells were incubated initially at 37 °C on a gyratory des ω-TA. Genomic DNA of Rhodobacter sphaeroides 2.4.1 (DSM shaker until the OD600 reached 0.7. The cells were then induced 158) was amplified using the GenomePhi kit (GE healthcare) and by the addition of 0.2% rhamnose, and at the same time the used in a subsequent PCR to amplify the gene (forward-primer: incubation temperature was decreased to 20 °C. After induction the incubation was continued for a further 18 h. Aliquots were withdrawn at several points of time after induction to follow the expression. Crude cell extracts were prepared by sonication and centrifugation (16 060g for 5 min) to separate the soluble and insoluble fractions. The cell pellet (∼3 g wet weight) was washed twice with 50 mM phosphate buffer (pH 8), containing 0.1 mM PLP at 4 °C. After disruption (french press) the cell suspension was centrifuged (4000 × g, 30 min), and the resulting supernatant was passed through a 0.5 µm filter prior to chromatography. Chromatography

Table 2. Amino Acceptor Profile of Rhodobacter sphaeroides ω-TA As Determined by the Acetophenone Assay

amino acceptor relative activity [%] pyruvate 100 R-ketoglutarate n.d. glyoxylate 49 succinic semialdehyde 78 ethylpyruvate 52 Figure 2. Comparison of the acetophenone assay at different ethyl-3-oxobutanoate 0.1 wavelengths. The same reaction was measured at 245, 270, 275, acetaldehyde 14 280, and 290 nm, whereas the calculated activities at 245 and 270 butyraldehyde 38 nm differed only by 1-7%. At wavelengths >270 nm, however, acetone n.d. deviations of calculated activities in the range of 10-20% were 2-heptanone 0.2 observed. The considerably lower overall absorption at higher acetoin 0.07 wavelengths allows measurements starting with higher substrate N-Boc-3-aminopyrrolidone 0.07 concentrations up to higher product concentrations.

Analytical Chemistry, Vol. 81, No. 19, October 1, 2009 8245 Figure 3. Principle of the acetophenone assay (left). Acetophenone (1 mM) shows a high absorbance (right) compared to other reactants such as R-methylbenzylamine, pyruvate or alanine (all 10 mM). was performed using an A¨KTA Purifier (GE Healthcare). The filtered cellular extract was applied toa5mLcolumn of IMAC Sepharose 6 Fast Flow (GE Healthcare). The column was washed at a flow rate of 5 mL min-1 with a 10 column volume of 50 mM phosphate buffer, pH 8, containing 100 mM NaCl, 0.1 mM PLP, and 30 mM imidazole (to avoid unspecific binding), and the ω-TA activity was eluted with 10 column volumes of 50 mM phosphate buffer, pH 8, containing 100 mM NaCl, 0.1 mM PLP, and 300 mM imidazole (flow rate of 5 mL min-1). The activity containing fractions were pooled, concentrated and washed with 0.1 mM PLP in phosphate buffer (pH 8) several times by ultrafiltration and the purified enzyme was stored at 4 °C (Table 1). Sodium Dodecyl Sulfate-Polyacrylamide Gel Electro- phoresis (SDS-PAGE). Samples from cultivation after sonication were divided into soluble and insoluble fractions (10 µL) and analyzed by SDS-PAGE on polyacrylamide gels (12.5 or 8%) with Figure 4. Validation of the formation of acetophenone by the a stacking gel (4%). The low molecular weight protein standard spectrophotometric assay (b) and by micellar electrokinetic chroma- mixture obtained from Roth (Karlsruhe, Germany) was used as tography (O). The calculated activities differed only by 3% for both reference. Gels were stained for protein detection with Coomassie purified enzyme (data not shown) and crude extract (data shown). Brilliant Blue (Figure 1). Establishment and Validation of the Spectrophotometric One unit of activity was defined as the amount of enzyme that Assay. All UV-photometric measurements were performed using produced 1 µmol acetophenone from (S)-R-MBA per minute. a Varioskan spectrophotometer (Thermo Electron Corporation, Standard curves were measured with substrate ((S)-R-MBA/ Langenselbold, Germany). Spectra of compounds in adequate pyruvate) and product (acetophenone/alanine) concentrations of concentrations were taken in the range from 220 to 300 nm in 2 to 2.5 mM and 0 to 0.5 mM, respectively. Brand UV 384-well MTP (sample volume 114 µL) in five dupli- For validation of the spectrophotometric assay the reaction cates. Kinetic spectrophotometric measurements were conducted solution was divided into different wells of a microtiter plate. at 245 nm in Greiner UV 96-well plates with a reaction mixture During the spectrophotometric measurement, the reaction was containing 2.5 mM R-MBA and 2.5 mM pyruvate in 200 mL stopped successively in different wells by adding 1/10 of the phosphate buffer (50 mM, pH 8), 0.25% DMSO, and an appropriate volume of 1 M HCl and centrifuged. The concentration of amount of enzyme (0.04 to 4 µg purified enzyme or 0.2 to 20 µL acetophenone was analyzed by micellar electrokinetic chroma- of crude extract with an OD600 of 10, depending on the respective activity). The initial absorbance was below 0.5. If higher tography (MEKC) in a capillary electrophoresis system using the concentrations of R-MBA or pyruvate are desired, for example, CElixir SDS kit (MicroSolv Technology Corporation, Eatontown, for determination of kinetic constants, measurements can be U.S.A.). The separations were done on a Beckman PACE-MDQ performed at higher wavelengths, for example, 270-290 nm, system equipped with a fused capillary and a photodiode array or in a lower reaction volume to decrease the initial absorbance detector. The capillary length was 30 cm, with a distance of 10 (Figure 2). Each measurement was repeated at least three times. cm to the detector and an inner diameter of 50 µm. After injection The specific activity was expressed as units/mg purified protein. (0.5 psi for 5 s) and injection of a water plug (0.1 psi for 10 s) the samples were separated by applying 25 kV for 2.5 min. The (8) Truppo, M. D.; Rozzell, J. D.; Moore, J. C.; Turner, N. J. Org. Biomol. Chem. R 2009, 7, 395–398. migration times were 1.4 min (acetophenone) and 1.8 min ( - (9) Henke, E. PhD thesis, University of Greifswald, 2001. MBA).

8246 Analytical Chemistry, Vol. 81, No. 19, October 1, 2009 Figure 5. Temperature profile (a) and pH-profile (b) of Rhodobacter sphaeroides ω-TA. local maximum at 245 nm due to its carbonyl group conjugated to the aromatic ring system. On the other hand, compounds like pyruvate, alanine, alkylamines, R-methylbenzylamine, and ketones showed an absorption which was of less than 5% of acetophenone at this wavelength. Thus, the reaction can be followed simply by monitoring the increasing absorption in the model reaction of the ω-TA (Figure 3) neglecting absorption changes caused by other reaction partners. Since most ω-TA converts R-methylbenzylamine (R-MBA) well, all low absorbing ketones, aldehydes, and keto acids may be used in the assay as cosubstrate. This allows the characterization of the amino acceptor spectrum of an ω-TA, as well as the determination of the enantiopreference and an estimation of enantioselectivity for the substrate R-MBA. Addition- ally, characterization of pH and temperature optimum and tem- perature stability can be performed easily. We also investigated other R-arylalkylketones like p-bromo- or p-nitroacetophenone, Figure 6. Temperature dependency of the stability of the Rhodo- 2-indanone, and benzaldehyde as amino donors and their respec- bacter sphaeroides ω-TA. tive amine products, but the differences in absorbance of the ketone and amine was not as high as found for the R-methylben- Characterization of the ω-TA from Rhodobacter sphaeroi- zylamine/acetophenone pair. Furthermore, the amines also showed des. Protein concentrations were determined using the BCA assay a significant absorption itself, so that the reaction mixture would kit (Uptima, Montlucon, France). For determination of the have to be diluted prior to the measurement, since already at small temperature and pH optimum of the ω-TA, the activity was assayed amine concentrations a high initial absorbance would prevent a in the reaction mixture described above at various temperatures (4, 10, 20, 30, 40, 50, 60, 70 °C) and various pH values (5.5-7 kinetic measurement of the reaction. phosphate; 7.5-9 Tris; 9.5-11 glycine), respectively. For deter- This assay was validated by following the reaction using both mination of the stability, the purified enzyme was incubated at measurement of the absorption at 245 nm and by detection of different temperatures (see above), and aliquots were taken at the acetophenone formed by capillary electrophoresis with mi- certain points. After dilution to 1:10 with phosphate buffer (pH 8) cellar electrokinetic chromatography (MEKC), yielding compa- and incubation on ice for 5 min the activity was assayed at room rable results when using either purified enzyme or crude cell temperature. extract as enzyme source (Figure 4). The absorption coefficient ≈ -1 -1 To determine the substrate specificity, the amino donor of acetophenone at 245 nm was determined to be 12 mM cm . pyruvate was replaced by the same concentration of other ketones, One limitation of the assay is that the protein content of the aldehydes, and keto acids (Table 2), and activities were assayed sample contributes to the initial absorbance. This limits the as described above. amount of protein to be used for the assay: the addition of a higher amount of enzyme preparation does not lead to an increased RESULTS AND DISCUSSION sensitivity, because at a higher initial absorption the sensitivity While studying the absorption spectra of various compounds for small changes in the absorption decreases. we found that acetophenone, which is formed during the tran- Application of the Assay in the Characterization of an ω-TA samination of the standard substrate R-methylbenzylamine, showed from Rhodobacter sphaeroides. To verify the methods de- a high absorbance in the ultraviolet range of the spectrum with a scribed above, an ω-TA identified in the genome of Rhodobacter

Analytical Chemistry, Vol. 81, No. 19, October 1, 2009 8247 sphaeroides 2.4.1 was characterized using this assay. The gene enzyme is most active at pH 8.5 and 40 °C; however, the half- gi3718887 was cloned from genomic DNA, and the enzyme was life of the enzyme decreases rapidly if it is incubated at expressed in E. coli BL21. A C-terminal His-tag was introduced temperatures above 20 °C. and allowed a simple purification of the enzyme to near homogeneity (Table 1). Investigation of the amino acceptor CONCLUSIONS In this paper we described a simple and sensitive kinetic assay profile revealed that beside pyruvate also glyoxylate, succinic that can be used as a high throughput method for screening semialdehyde, and butyraldehyde are well accepted substrates. protein libraries and for the fast characterization of amino acceptor The ketones 2-heptanone and 1-N-Boc-3-aminopyrrolidine were specificity of ω-transaminases. This was achieved without the need converted only very slowly, and no conversion of 2-oxoglutarate for any additional enzymes or staining solutions, which is the or acetone was detected (Table 2). major advantage compared to previously described procedures. Standard enzymatic properties like pH-profile (Figure 5b), temperature profile (Figure 5a), stability (Figure 6), and the Received for review July 23, 2009. Accepted August 22, effect of additives or buffer composition on activity could be 2009. investigated very quickly and simply using this assay. The AC901640Q

8248 Analytical Chemistry, Vol. 81, No. 19, October 1, 2009 The Articles

ARTICLE III

- 61 -

Anal. Chem. 2010, 82, 2082–2086

Conductometric Method for the Rapid Characterization of the Substrate Specificity of Amine-Transaminases

Sebastian Scha¨ tzle,† Matthias Ho¨ hne,† Karen Robins,‡ and Uwe T. Bornscheuer*,†

Institute of Biochemistry, Department of Biotechnology and Enzyme Catalysis, Greifswald University, Felix Hausdorff-Strasse 4, 17487 Greifswald, Germany, and Lonza AG, Valais Works, Visp, Switzerland

Amine-transaminases (ATAs, ω-transaminases, ω-TA) are PLP-dependent enzymes that catalyze amino group trans- fer reactions. In contrast to the widespread and well- known amino acid-transaminases, ATAs are able to con- vert substrates lacking an r-carboxylic functional group. They have gained increased attention because of their potential for the asymmetric synthesis of optically active amines, which are frequently used as building blocks for the preparation of numerous pharmaceuticals. Having already introduced a fast kinetic assay based on the r conversion of the model substrate -methylbenzylamine Figure 1. Transaminase reaction and principle of the conductivity for the characterization of the amino acceptor specificity, assay. A positively charged amine and a negatively charged keto acid we now report on a kinetic conductivity assay for inves- are converted to a zwitterionic amino acid and a noncharged ketone, tigating the amino donor specificity of a given ATA. The thus the conductivity of the reaction solution decreases. course of an ATA-catalyzed reaction can be followed are capable of being performed in a high-throughput screening conductometrically since the conducting substrates, a positively charged amine and a negatively charged keto (HTS) format. Examples includes solid-phase bound assays related 1 2 acid, are converted to nonconducting products, a non- to immunoassay, IR thermography, circular dichroism (CD) 3,4 5 - charged ketone and a zwitterionic amino acid. The de- spectroscopy, fluorescence image analysis, and UV vis-based 6,7 crease of conductivity for the investigated reaction systems methods. were determined to be 33-52 µSmM-1. In contrast to Amine-transaminases (ATA, ω-transaminases, ω-TA) gained other ATA-assays previously described, with this ap- increased attention in research in the last years because of their proach all transamination reactions between any amine potential for the synthesis of optically active amines, which are and any keto acid can be monitored without the need frequently used as building blocks for the preparation of numerous for an additional enzyme or staining solutions. The physiologically active compounds. However, for the screening of assay was used for the characterization of a ATA from the substrate specificities and enantioselectivities of ATA only a Rhodobacter sphaeroides, and the data obtained were limited number of methods have been reported, due to the special in excellent agreement with gas chromatography type of reaction catalyzed (Figure 1): An amine and a carbonyl analysis. substrate bearing a ketone or aldehyde group are converted into products by exchanging their amino and carbonyl functional The identification of novel biocatalysts and the optimization groups, without the stochiometric consumption of a cofactor. In of existing enzymes are key steps for developing highly efficient case of other enzymatic reactions like e.g. hydrolysis of esters or processes. Thus, protein engineering and the metagenome ap- amides, the substrates (ester/amide) and products (alcohol/amine proach have emerged as powerful tools that are used frequently to achieve these aims. While working with large enzyme libraries (1) Taran, F.; Gauchet, C.; Mohar, B.; Meunier, S.; Valleix, A.; Renard, P. Y.; which are generated during these strategies, the fast, effective, et al. Angew. Chem., Int. Ed. 2002, 41, 124–127. (2) Reetz, M. T.; Becker, M. H.; Ku¨hling, K. M.; Holzwarth, A. Angew. Chem., and reliable identification of enzymes with the desired properties Int. Ed. 1998, 37, 2647–2650. is often the bottleneck since this is the most time-consuming step. (3) Ding, K.; Ishii, A.; Mikami, K. Angew. Chem., Int. Ed. 1999, 38, 497–501. Therefore a large variety of fast methods for determining enzyme (4) Reetz, M. T.; Ku¨hling, K. M.; Hinrichs, H.; Deege, A. Chirality 2000, 12, 479–482. activity have been developed for many biocatalysts, which often (5) Soleihac, J.-M.; Cornille, F.; Martin, L.; Lenoir, C. Anal. Biochem. 1996, 241, 120–127. * To whom correspondence should be addressed: Fax: (+49) 3834-86-80066. (6) Morı´s-Varas, F.; Shah, A.; Aikens, J.; Nadkarni, N. P.; Rozzell, J. D.; E-mail: [email protected]. Demirjian, D. C. Bioorg. Med. Chem. 1999, 7, 2183–2188. † Greifswald University. (7) Baumann, M.; Hauer, B. H.; Bornscheuer, U. T. Tetrahedron: Asymmetry ‡ Lonza AG. 2000, 11, 4781–4790.

2082 Analytical Chemistry, Vol. 82, No. 5, March 1, 2010 10.1021/ac9028483  2010 American Chemical Society Published on Web 02/11/2010 Table 1. Calibration of the Conductivity Assaya

substrate/product 1 & 62& 63& 64& 65& 63& 73& 83& 93& 10 conductivity [µsmM-1 cm-1] 36.6 39.0 44.0 34.6 32.9 37.1 40.0 49.5 52.1

a The values reflect the change in conductivity due to the conversion of 1 mM substrates to products (see Figure 4 for structures of substrates and products 1-10). The resolution of the applied conductometer was 0.1 µS. and acid) can be discriminated very easily,8 or during reactions Buffer Preparation. All buffers had a concentration of 20 mM involving a cofactor like NAD(P)H or side product like H2O2, and were adjusted to pH 7.5. Tris was adjusted with HCl, the three these compounds can be detected specifically.9,10 For ATA, this buffers containing BES, CHES, and HEPES were adjusted with is not the case, and there are only a few special methods to Bis-Tris (bis(2-hydroxyethyl)iminotris(hydroxymethyl)methane), discriminate the similar substrates and products without affecting and EPPS and tricine were adjusted with 1,8-diazabicyclo[5.4.0]- 11 enzyme activity. undec-7-en. In addition to ketones and aldehydes, ATA also convert the Calibration. For calibrating the assay, different standard R -keto carboxylates pyruvate and glyoxylate. The methods for solutions with decreasing substrate and increasing product assaying aminotransferase activity published so far comprise (i) concentrations of 0-10 mM were prepared, and the conductivity measurement of the generated amino acid (alanine, glycine) via and pH were measured (Table 1). No change of the pH could be detection of the corresponding copper complex11 and (ii) a pH detected. It is noteworthy that the accuracy of the assay was indicator based multienzyme cascade assay.12 Obviously, there significantly higher starting from 10 mM substrates than starting are several drawbacks: the copper staining solution in method from 5 mM substrates. Furthermore, it was important to use all (i) inhibits the enzyme, and nonspecific color formation with reactants simultaneously, including the nonconducting ketone and several buffers and crude cell extracts are observed. Thus, only alanine, for the calibration (see results). For analyzing the end point measurements are possible, and appropriate measure- influence of crude extract on the conductivity measurements, ments of the blank are necessary. Method (ii) works in the - asymmetric synthesis mode (Figure 1, reaction from right to left standard curves with different amounts (0 18% (v/v)) of crude ≈ for bottom partners), and thus no information about enantiopref- extract (OD600 10) of E. coli BL21 without ATA were erence or -selectivity can be obtained. measured in parallel. For the preparation of the crude extract, Alternatively, we recently published an UV-spectrophotometric cells were washed twice and disrupted by sonication in the assay based on the conversion of the widely used model substrate buffer used for conductivity measurements. R-methylbenzylamine.13 The product from this reaction, acetophe- Kinetic Measurements. The reactions were carried out at none, can be detected spectrophotometrically at 245 nm with high room temperature (RT). Kinetic measurements were performed sensitivity (ε ) 12 mM-1 cm-1), since the other reactants in reaction mixtures containing 10 mM substrates in buffer (20 showed only a low absorbance. Besides the standard substrate mM, pH 7.5), 10% dimethyl sulfoxide, and an appropriate amount pyruvate, all low-absorbing ketones, aldehydes, or R-keto of enzyme (purified enzyme or crude extract). Using the platinum carboxylates could be used as cosubstrates. Thus, the assay (L 5 mm) electrode, the reaction can be carried out in a well of is a fast and easy method for determining transaminase activity. a microtiter plate, and a reaction volume of 200 µL is sufficient. Additionally, the amino acceptor specificity of a given ATA can The course of the reaction was followed by measuring the be characterized very quickly. conductivity of the solution continuously. Each measurement was As the acetophenone assay is limited to R-methylbenzylamine repeated at least three times. For blank measurement, the reaction as amino donor, we now developed a method for a fast and easy was carried out with cell extract lacking an amine-transaminase characterization of the amine donor specificity of a given ATA. or with reaction mixtures containing only one of the substrates, With this conductivity based approach every amine donor and but no significant change in conductivity could be detected. The acceptor can be applied as substrate, as long as one of the specific activity was expressed as units per milligram protein. One substrates is an amino or keto acid, respectively. unit of activity was defined as the amount of enzyme that produced 1 µmol ketone product per minute. EXPERIMENTAL SECTION Validation of the Assay with Gas Chromatography. During All conductivity mesasurements were performed using either kinetic measurement, aliquots were taken at certain points for a Qcond 2200 (VWR, Germany) or a CX-401 (Elmetron, Poland) validation of the conductivity assay by gas chromatography. The L L multimeter with a graphite ( 12 mm) or a platinum ( 5 mm) reaction was stopped by adding 1/10 of the volume of 1 M HCl. electrode, respectively. The ketones formed during the reaction were extracted with one

(8) Krebsfa¨nger, N.; Schierholz, K.; Bornscheuer, U. T. J. Biotechnol. 1998, volume of ethyl acetate and determined by gas chromatography 60, 105–111. (Hewlett-Packard 5890 Series II Plus) using a forte BP-21 column (9) Hinman, L. M.; Blass, J. P. J. Biol. Chem. 1981, 256, 6583–6586. (SGE, Griesheim, Germany) with benzaldehyde as internal (10) Holt, A.; Palcic, M. M. Nature Prot. 2006, 1, 2499–2505. (11) Hwang, B.-Y.; Kim, B.-G. Enzyme Microb. Technol. 2004, 34, 429–436. standard. (12) Truppo, M. D.; Rozzell, J. D.; Moore, J. C.; Turner, N. J. Org. Biomol. Chem. Characterization of the ATA from Rhodobacter sphaeroi- 2009, 7, 395–398. (13) Scha¨tzle, S.; Ho¨hne, M.; Redestad, E.; Robins, K.; Bornscheuer, U. T. Anal. des. The ATA from Rhodobacter sphaeroides 2.4.1 (DSM 158) was Chem. 2009, 81, 8244–8248. recombinantly expressed in E. coli BL21 (DE3) and purified via

Analytical Chemistry, Vol. 82, No. 5, March 1, 2010 2083 given amine substrate can be used separately in the assay, so that Table 2. Summary of the Buffer Systems Investigatedc also information about enantioselectivity can be obtained, and (iii) conductivity rel a -1 b the amine substrate usually is much better soluble than the buffer pKa [µScm ] activity [%] corresponding ketone. Thus, even with higher substrate concen- PB 7.2 2760 100 BES 7.3 445 120 trations a homogeneous assay solution is obtained rather than a HEPES 7.7 141 24 biphasic system, which otherwise might interfere with the EPPS 8.0 246 76 Tris 8.1 1600 71 measurements. tricine 8.3 240 98 This allows a simple measurement of the reaction progress, if CHES 9.4 46 67 some crucial requirements are fulfilled: First, to maximize A. dest - 30 sensitivity the buffer system should have low background con- a PB: phosphate buffer; BES: N,N-bis(2-hydroxyethyl)taurine; HEPES: ductivity. The pH of the buffer has to be kept in a pH range from 4-(2-hydroxyethyl)piperazine-1-ethanesulfonic acid; EPPS: 4-(2-hydroxy- ethyl)-1-piperazinepropanesulfonic acid; Tris: 2-amino-2-(hydroxym- 4.0-8.0, where the net charge of alanine is ≈0. Furthermore, for ethyl)-1,3-propanediol; tricine: N-[tris(hydroxymethyl)methyl]glycine; practical reasons or for high-throughput screening purposes, the CHES: 2-(cyclohexylamino)ethanesulfonic acid. b Activities in the different buffer systems were determined using the acetophenone assay should not be sensitive toward variations in crude extract assay.13 c All buffers had a concentration of 20 mM and were adjusted concentrations. to pH 7.5 (see the Experimental Section). Standard buffer systems like phosphate buffer or Tris-HCl could not be applied due to their very high conductivity, whereby affinity chromatography (IMAC sepharose) as reported previ- the relative change in conductivity caused by the enzymatic ously.13 reaction is rather small. Thus, Good’s buffers were investigated Protein concentrations were determined using the BCA assay regarding the conductivity of 20 mM solutions at pH 7.5 and effects kit (Uptima, Montlucon, France). on aminotransferase activity (Table 2). Due to their zwitterionic To determine the substrate specificity of the enzyme, 10 mM nature, solutions of these compounds show a much smaller of different substrates and an appropriate amount of purified conductivity if the pH is kept within 2 units near the pI of the -1 enzyme (0.1-2mgmL )weremixedwith1mLof20mM respective compound (isoelectric buffers). tricine buffer pH 7.5. Dimethyl sulfoxide (10% (v/v)) was added The first system investigated was BES-Bis-Tris. It showed a to the mixture to dissolve hydrophobic substrates. Solutions very good performance with purified enzyme, but unfortunately R of oxaloacetate and -ketoglutarate were prepared freshly prior activities of crude extract measured in this buffer did not match to measurements. The reaction was carried out at RT and the results obtained by gas chromatography. Although the CHES- followed by measuring the conductivity continuously. For Bis-Tris system showed excellent performance with both purified validation by GC, aliquots were taken at certain points, and enzyme and crude extract, due to its inhibitory effect (67% enzyme the concentration of the ketone products 1b-5b was analyzed activity compared to phosphate buffer) and its rather low buffer as described above. capacity at pH 7.5, we explored an alternative buffer system. The RESULTS AND DISCUSSION inhibitory effect was the same for the EPPS buffer (76% rel activity) In the course of an ATA-catalyzed reaction the conductivity of and even greater for the HEPES-Bis-tris system (24% rel activity). the reaction medium changes since charged reactants (amine and Finally, the tricine buffer showed both a good buffer capacity at keto acid) are converted to noncharged species (the ketone and pH 7.5 and a very good agreement of determined activities of a zwitterionic amino acid) (Figure 1). For all experiments, the purified enzyme and crude extract by measuring the conductivity transamination was carried out in kinetic resolution mode. The compared to GC analysis (Figure 2). For this reason the tricine rationale behind this approach is as follows: (i) the reaction buffer was used for all subsequent experiments. The inhibitory equilibrium favors product formation, (ii) both enantiomers of a effect of the EPPS and CHES buffer may vary for other transami-

Figure 2. Validation of the assay by following the reaction of 10 mM (S)-R-methylbenzylamine and pyruvate using both conductivity measurements (black symbols/bar) and detection of the acetophenone formed by gas chromatography (white symbols/bar). The calculated activities differed only by 5% for purified enzyme (left, 0 0.25 mg/mL, 3 0.5 mg/mL, O 1 mg/mL) and crude extract (10% (v/v)) (right, the curve displays the conductivity measurement, the bars show the calculated acetophenone concentrations).

2084 Analytical Chemistry, Vol. 82, No. 5, March 1, 2010 and thus the net charge of an analyte may be affected by other compounds present in the solution. On the other hand, more importantly, ions or even neutral molecules from the background electrolyte may affect the mobility and thus the molar conductivity of an analyte significantly by intermolecular electrostatic or hydrophobic interactions. This explains why the conductivity of a mixture comprised of different ions usually differs to some extent from the sum of the contributions of the single ions to overall conductivity. For this reason, all participating reactants and even the neutral species had to be included in the calibration of the assay. If otherwise the change of the conductivity per milimolar reaction conversion [µS/mM] is calculated from molar conductivi- ties of the single compounds, the obtained value would be higher (49.4 µS/mM conversion, O in Figure 3) compared with the measurement of the complete mixtures (44.1 µS/mM conversion, b in Figure 3). Figure 3. Influence of different concentrations of crude extract (b Application of the Assay in the Characterization of an ATA 0%, 1 5%, 9 10%, [ 14%, 2 18% (v/v)) on the change of conductivity from Rhodobacter sphaeroides. To verify the method described during the transaminase reaction. While measuring the standard above, an ATA identified in the genome of Rhodobacter sphaeroides curves all reactants have to be present, as if measured separately, the calculated rate ∆µSmM-1 (O) is significantly higher. Contrary to 2.4.1 was characterized using this assay. The enzyme was other buffer systems investigated (and at substrate concentrations expressed recombinantly in E. coli BL21 and purified as de- of 5 mM), no significant influence of different crude extract concentra- scribed.13 Investigation of the amino donor profile revealed that tions were observed. besides the model substrate R-methylbenzylamine also benzyl- nases, so these buffers should be considered for further amine is a very well accepted substrate (Figure 4). Aliphatic and experiments. arylaliphatic compounds were converted only slowly, especially In practice and especially for screening of many samples, crude the cyclic aliphatic amine 1-N-Boc-3-aminopiperidine. The results cell extract is used as enzyme preparation instead of purified obtained for the amino acceptor profile are in good agreement 13 proteins. Thus an important question was whether the presence with previously measured data using the acetophenone assay. of different amounts of crude extract has an impact on the Pyruvate is the best converted acceptor, glyoxylate and succinic conductivity measurement. As expected, the background conduc- semialdehyde are converted moderately, and the keto-dicarboxylic tivity of the reaction medium increased proportional to the acids oxalacetate and R-ketoglutarate are converted only very concentration of cell extract used. Next, calibration experiments slowly. were performed where different conversions were simulated by The conductivity assay is an excellent complementary method varying the concentrations of all reactants according to a real to the acetophenone assay, since the usually more interesting reaction. Fortunately, the addition of different amounts of crude amino donor specificity of an ATA can be investigated. With both extract did not seem to affect the change of conductivity assays in hand, the complete characterization of the substrate significantly (Figure 3). This was surprising, because in theory specificity can be performed very rapidly. Furthermore, the the molar conductivity of a given compound depends on the conductivity assay allows a fast screening of ATA variants for the concentration of all electrolytes present in the reaction system reaction of any desired amine with a keto acid. One limitation of for mainly two reasons: on the one hand, the protonation state the conductivity assay is that an appropriate low conductivity buffer

Figure 4. Substrate specificity of Rhodobacter sphaeroides ATA was determined using the conductivity assay (black) as well as gas chromatography (gray). All measurements were repeated three times.

Analytical Chemistry, Vol. 82, No. 5, March 1, 2010 2085 in the pH-range of 4-8 has to be used. For determining other of the amino donor substrate profile of ATA. This was achieved enzymatic properties like the pH- or temperature optimum of an without the need for any additional enzymes or staining solutions. ATA, the acetophenone assay is more flexible since every low With appropriate equipment, we envision an application of this absorbing buffer at any pH may be used and absorbance is assay as a high-throughput method for screening enzyme libraries. virtually not dependent on temperature in contrast to conductivity, so that multiple calibrations can be avoided. Received for review December 15, 2009. Accepted CONCLUSIONS January 30, 2010. In this contribution we described a simple and sensitive conductivity assay that can be used for the fast characterization AC9028483

2086 Analytical Chemistry, Vol. 82, No. 5, March 1, 2010 The Articles

ARTICLE IV

- 67 -

article published online: 26 september 2010 | doi: 10.1038/nchembio.447

Rational assignment of key motifs for function guides in silico enzyme identification

Matthias Höhne1,3, Sebastian Schätzle1,3, Helge Jochens1, Karen Robins2 & Uwe T Bornscheuer1*

Biocatalysis has emerged as a powerful alternative to traditional chemistry, especially for asymmetric synthesis. One key requirement during process development is the discovery of a biocatalyst with an appropriate enantiopreference and enantio­ selectivity, which can be achieved, for instance, by protein engineering or screening of metagenome libraries. We have devel­ oped an in silico strategy for a sequence-based prediction of substrate specificity and enantiopreference. First, we used rational protein design to predict key amino acid substitutions that indicate the desired activity. Then, we searched protein databases for proteins already carrying these mutations instead of constructing the corresponding mutants in the laboratory. This meth­ odology exploits the fact that naturally evolved proteins have undergone selection over millions of years, which has resulted in highly optimized catalysts. Using this in silico approach, we have discovered 17 (R)-selective amine transaminases, which catalyzed the synthesis of several (R)-amines with excellent optical purity up to >99% enantiomeric excess.

nzyme catalysis represents one cornerstone in the area of of the enantiopreference of a Bacillus subtilis esterase by a double white (industrial) biotechnology1–5. The usually excellent mutation identified in a library of only 2,800 variants30. Echemo-, regio- and enantioselectivity of biocatalysts6,7 facili- In this contribution, we have explored a third approach, in which tates and simplifies many chemical processes for the production of a we take advantage of the numerous protein sequences already broad range of products. The production of optically pure building deposited in databases. One practical advantage when compared blocks for the pharmaceutical and fine chemicals industries are the with classical screening is that the complete nucleotide—and hence, most valuable areas8–12 of use for enzyme catalysis. These processes protein—sequence is already known, and after identification of are environmentally friendlier than other options because addi- promising candidates, the gene can easily be cloned from the ­original tional reaction steps can be avoided and the processes are usually source organism or obtained as a synthetic gene. The major chal- performed under milder reaction conditions with fewer organic lenge is, however, the retrieval of desired sequences if no enzyme solvents and/or in the absence of hazardous materials13. The bio- with desired enantiopreference has been described in the literature catalytic synthesis of optically pure compounds can either be per- and the easy and ideal starting point for this in silico approach is thus formed as a kinetic resolution or as an asymmetric synthesis14, as missing. This is indeed the case for amine-­pyruvate transaminases shown in Scheme 1 for transaminase-catalyzed reactions. (referred to here as amine transaminases, EC 2.6.1.18): in contrast The kinetic resolution of the racemic amine with an (S)-selective to various (S)-selective amine transaminases, only two enzymes transaminase produces the (R) enantiomer. The disadvantage of with (R)-selectivity have been reported31–33, and at the submission of this approach is that a maximum yield of only 50% can be achieved this study no information about nucleotide or amino acid sequences © 2010 Nature America, Inc. All rights reserved. All rights Inc. America, Nature © 2010 (Scheme 1, left). The atom efficiency of such a process is low, as the was available. In the meantime, the amino acid sequence of a previ- ketone generated has little value and the recycling of this compound ously commercially available (R)-selective amine transaminase has requires chemical reductive amination to produce the racemic amine been published34. for subsequent resolutions. Synthetically, an asymmetric synthesis is Transaminases belong to fold classes I and IV of pyridoxal-5′- much more economical because yields of up to 100% are possible. phosphate (PLP)–dependent enzymes35,36. In contrast to α-amino However, in this case the (S)-amine would be produced (Scheme 1, acid aminotransferases (referred to throughout as α-transaminases, right). An asymmetric synthesis strategy is thus clearly favored with EC 2.6.1), which are ubiquitous enzymes found in all organisms, the respect to the atom efficiency of the process but has the major dis- small group of amine transaminases also converts substrates lack- advantage that only one specific enantiomer can be accessed with an ing an α-carboxylic acid moiety (Scheme 1). This makes them very enzyme having a distinct enantiopreference. As both enantiomers attractive for organic synthesis of optically active amines37, especially of a building block are often needed for targets with different abso- as the product range is not limited to α-amino acids. Hence, there lute configuration, an enzyme platform providing either (R)- or (S)- is an urgent need for the discovery of (R)-selective amine transami- specific enzymes is highly desired. Several strategies can be used to nases to access a broad range of (R)-amines by efficient asymmetric provide access to enzymes with complementary enantiopreference. synthesis (Scheme 1, right). This includes classical screening of strain collections for the identi- We first considered protein engineering to create an (R)-selective fication of complementary enzymes15. Indeed, (R)- and (S)-selective amine transaminase by inverting the enantiopreference of an (S)- hydroxy nitrile lyases16, hydantoinases17 and keto reductases18 have selective amine transaminase (Fig. 1a), but there were several limi- been described. A much more promising option is the use of protein tations, as no crystal structure of any amine transaminase is available engineering tools19–22, with which changes in substrate specificity21, and thus rational protein design is very difficult to perform. stability improvements23 and also switches in enantiopreference24–29 One option would be to start from the crystal structure of an (S)- can be achieved. A recent example of the combination of rational selective α-transaminase and reengineer the substrate-recognition protein design and directed evolution concepts is the inversion site to create a variant that accepts substrates lacking the carboxylic

1Department of Biotechnology and Enzyme Catalysis, Institute of Biochemistry, Greifswald University, Greifswald, Germany. 2Lonza AG, Valais Works, Visp, Switzerland. 3These authors contributed equally to this work. *e-mail: [email protected]

nature CHEMICAL BIOLOGY | vol 6 | NOVEMBER 2010 | www.nature.com/naturechemicalbiology 807 article Nature chemical biology doi: 10.1038/nchembio.447

Kinetic resolution Asymmetric synthesis (S)-selective amine (S)-selective amine transaminase NH NH2 transaminase O NH2 O 2 + RS RL RS RL RS RL RS RL RS RL (R)-amine (S)-amine Amino acceptor, Coproduct, Amino donor, Coproduct, ≤50% yield ≤100% yield such as pyruvate such as alanine such as alanine such as pyruvate

RS - small sized alkyl group Coproduct removal RL - medium/large sized alkyl or aryl group Scheme 1 | Strategies for the synthesis of optically active amines using amine transaminase. In a kinetic resolution (left), the amine transaminase converts in the ideal case only one of the amine enantiomers to the corresponding ketone. The remaining enantiomer can be isolated in high optical purity and at a (theoretical) maximum yield of 50%. In an asymmetric synthesis (right), a prostereogenic ketone is aminated enantioselectively, yielding directly the optically active amine. The most common cosubstrates for amine transaminases are pyruvate and alanine. As the equilibrium favors ketone formation, high yields in asymmetric synthesis can only be achieved by shifting the equilibrium, for example by enzymatic removal of the formed coproduct pyruvate49.

function (Fig. 1a), resulting in an (R)-selective amine transaminase Instead of performing directed evolution to create and iden- (owing to the switch in priority according the Cahn-Ingold-Prelog tify an enzyme with the desired selectivity from a random mutant (CIP) rule38). Unfortunately, the substrate’s α-carboxyl group plays library, we developed a strategy to find enzymes with complemen- an important role for the domain closure of the α-transaminase tary enantiopreference by searching in silico in protein databases during substrate binding39,40. Thus, to solve this challenge, many (Fig. 1b). Essentially, the method is based on two steps: (i) identifi- additional mutations might be required, which could not be pre- cation and prediction of important amino acid residues on the basis dicted because of the complexity of the problem. of structural information from related enzymes and (ii) data mining

a A A A B B B

– COO RS RL RS RL NH NH 2 2 NH2

L-Branched chain transaminase Desired (R)-amine transaminase (S)-amine transaminase PLP-dependent fold class IV PLP-dependent fold class I

b Structural information What would be a plausible ancestor for the evolution of an (R)-specific amine transaminase?

A Hypothesis of evolution © 2010 Nature America, Inc. All rights reserved. All rights Inc. America, Nature © 2010 Which amino acid residues are involved in substrate recognition in the active site? Which amino acid exchanges are expected? B Prediction of key mutations

Can enzyme activities other than the desired be excluded?

C Annotation algorithm based on sequence motifs

PLP-dependent fold class IV proteins

D Database search Which sequences match the expected criteria?

Putative (R)-selective amine transaminase

E Cloning and expression of identified sequences

Does the activity of the expressed protein match the predicted function? Desired enzyme

Figure 1 | Strategies for protein engineering. (a) Possible ancestors of amine transaminase with (R)-enantiopreference can be used to engineer an (R)-selective amine transaminase (center). This can be achieved by modification of the amino acids in the carboxyl group–binding pocket of an α-transaminase (left), such as an L-branched chain transaminase of the PLP-dependent fold class IV, or by engineering of the binding pockets of an 38 (S)-selective amine transaminase (right) from PLP-dependent fold class I. It was assumed that according to the CIP rule , the large substituent (RL)

has a higher priority than the small substituent (RS). (b) Flow scheme of the in silico approach for the identification of transaminases with inverted enantiopreference, with steps A–E.

808 nature chemical biology | vol 6 | NOVEMBER 2010 | www.nature.com/naturechemicalbiology Nature chemical biology doi: 10.1038/nchembio.447 article

(R)-DATA (S)-BCAT ADCL Tyr97 Tyr95 X: Ile, Met, Arg97 Gln99 Ala, Ser, Thr, His Lys40 NH Lys97 Asn256 H2N O O H2N – H X95 Arg107 O O Asn127 HN + NH O Arg40 O H Tyr36 NH – O O O Arg107 O Phe36 O O H Thr262 H R Met107 R H – HN – Thr262 O Val38 O HN O Val109 O His109 + + Gly38 + N N + N H H H H NH2 – – Thr38 O O O O – O O – – – PO H PO3H PO H 3 Lys163 3 + + N N N H H H 31 95 107 31 95 107 31 95 107 FGDGIYEVIK XIYLQ…RxH YGxxVFEGLK YIRxx…LGV FGDGCFxTAx VxKVx…RGY S V VR L I A MR L M L YS L I G L Ax V M S I V V V L I F I V

Figure 2 | Identification of key amino acid motifs that allow prediction of function of PLP-dependent fold class IV proteins. Top: schematic drawing of amino acids contributing to substrate binding in DATAs, BCATs and ADCLs. The external aldimines, which are formed after binding and transimination of the respective substrates with the PLP cofactor of the enzymes, are shown. Blue amino acids are part of binding pocket A (Fig. 1a), and amino acids of binding pocket B (Fig. 1a) are shown in green. The gray shaded circle represents the glycine or valine, depending on the transaminase, that is located behind the carboxyl group. The red threonine residue is important in the catalytic mechanism for shuttling of a proton during the reaction transition state. Bottom: sequence motifs derived from multiple-sequence alignments are given using the same color code, showing that amino acids important for substrate binding in the active site are rather conserved and can be used for a prediction of the substrate specificity of PLP-dependent fold class IV enzymes as well as for the enantiopreference of the transaminases within this fold class.

in protein sequences. This concept takes advantage of the currently pockets. If the α-carboxyl group is positioned in binding pocket B available knowledge about the function of proteins combined with (Fig. 1a), the enzyme converts L-amino acids and thus shows the experimentally uncharacterized huge diversity stored in protein (S)-preference. If, however, the α-carboxyl group of the sub- sequence databases. strate is accommodated in binding pocket A, this transaminase is (R)-selective and thus converts D-amino acids. RESULTS This indicates that the desired diversity is perhaps already In the first step, we carefully analyzed structural information of related available in nature and that it only needs to be identified and enzymes to evaluate a possible hypothesis for the evolution of an exploited. Following this line of reasoning, an (R)-selective amine (R)-selective amine transaminase (step A; see Fig. 1b for a schematic transaminase could already have been evolved from an (S)-selective representation of the steps). Then, we predicted important positions α-transaminase if modifications of the α-carboxyl group–binding © 2010 Nature America, Inc. All rights reserved. All rights Inc. America, Nature © 2010 and suggestions for amino acid exchanges (step B). Next, we per- pocket had occurred, allowing small hydrophobic side chains to bind formed identification of putative—and so far unknown—sequence instead of the α-carboxyl functionality. The plausible ancestor of an motifs to exclude unwanted enzyme activities, and, as the key (R)-specific amine transaminase must therefore be an L-(S)-selective step, we developed an annotation algorithm on the basis of these BCAT, as, according to the CIP rule38, the substitution of the carboxyl sequence motifs (step C). Protein-coding sequences thus identi- group of an L-amino acid by a methyl group yields an (R)-amine. fied in a database search (step D) as matching the predicted criteria were then cloned from synthetic genes. The resulting enzymes were Prediction of key features of the desired enzyme investigated biochemically for transaminase activity and desired Working from the known crystal structures, we studied the (R)-selectivity (step E). ­coordination of the α-carboxyl group in BCATs and DATAs. This allowed us to predict the necessary differences in protein sequences Analysis of structural information within PLP-dependent fold class IV proteins that determine the Two fold classes have been described for transaminases35. All desired switch in substrate specificity from α-amino acids to amines, α-transaminases belonging to fold class I of PLP-dependent enzymes in line with the formal switch in enantiopreference. In the case of known so far are L-selective, and no D-selective amino acid transami- BCATs, the substrate binding in pocket B is realized in a more subtle nases have been described within this fold class. However, D-amino manner than it is in DATAs and other α-transaminases (for more acid aminotransferases41 (DATAs) are members of fold class IV, and details see Supplementary Results and Supplementary Fig. 1). There notably, L-branched chain amino acid aminotransferases (BCATs)42 is no direct contact of the carboxyl group oxygen atoms to any basic belong to the same fold class43. This indicates that within this protein amino acid side chain such as that of an arginine (Fig. 2). Instead, one fold class there is a certain flexibility in the architecture of the active carboxyl oxygen atom is coordinated by the Tyr95 hydroxyl group, site with regard to substrate recognition, and therefore we focused which is polarized by the coordination of an adjacent Arg97, and our search on fold class IV aminotransferases. Apart from DATAs the other oxygen is bonded by two backbone amide nitrogen atoms and BCATs, only 4-amino-4-deoxychorismate lyase (ADCL) is cur- of Thr262 and Ala263, which are activated by the coordination of rently known as a further member of fold class IV. their adjacent carbonyl groups by Arg40 (Fig. 2 and Supplementary The opposite enantiopreference of DATAs and BCATs can be Fig. 1; note that in related sequences a lysine is at position 40). explained—via their crystal structures41,42—by the difference in sub- Thus, an amine transaminase should differ from BCATs in strate coordination in the active site, which consists of two ­binding the following residues. First, Tyr95 should be exchanged with

nature CHEMICAL BIOLOGY | vol 6 | NOVEMBER 2010 | www.nature.com/naturechemicalbiology 809 article Nature chemical biology doi: 10.1038/nchembio.447

Table 1 | Section of a multiple-sequence alignment of putative (R)-selective amine transaminases. Sequence motif 1 Sequence motif 2 Entry, GeneID Organism 31 36 40 95 99 107 1 ecDATA Bacillus sp. 26 FGDGVYEVVKVYN 77 HIYFQVTRGTSPRAHQFP 2 ecBCAT Eschericha coli 26 YGTSVFEGIRCYD 86 YIRPLIFVGDVGMGVNPP 3 ecADCL Eschericha coli 21 FGDGCFTTARVID 78 VLKVVISRGSGGRGYSTL 4 115385557 Aspergillus terreus 55 HSDLTYDVPSVWD 106 FVELIVTRGLKGVRGTRP 5 211591081 Penicillium chrysogenum 53 HSDLTYDVPSVWD 104 FVEIIVTRGLKGVRGSRP 6 145258936 Aspergillus niger 53 RSDLTYDVISVWD 104 YVALIVTRGLQSVRGAKP 7 169768191 Aspergillus oryzae 53 HSDLTYDVPSVWD 104 FVELIVTRGLKGVRGNKP 8 70986662 Aspergillus fumigatus 53 HSDLTYDVISVWD 104 FVEVIVTRGLTGVRGSKP 9 119483224 Neosartorya fischeri 53 HGDLTYDVTTVWD 104 FVEVIVTRGLTGVRGSKP 10 46109768 Gibberella zeae 53 HGDLTYDVPAVWD 104 FVELIVTRGLKPVREAKP 11 114797240 Hyphomonas neptunium 53 HSDLTYDVPAVWN 104 YVEIIVTRGLKFLRGAQA 12 120405468 Mycobacterium vanbaalenii 69 HSDLTYTVAHVWH 120 FVNLTITRGYGKRKGEKD 13 13471580 Mesorhizobium loti 54 HSDATYDTVHVWN 105 YVEMLCTRGASPTFSRDP 14 20804076 Mesorhizobium loti 53 HSDATYDTVHVWE 104 YVEMICTRGGSPTFSRDP 15 86137542 Roseobacter sp. 45 HSDATYDVAHVWK 96 YVEFICTRGTSPTFSRDP 16 87122653 Marinomonas sp. 44 HSDATYDVVHVWQ 95 YVEMICTRGNSPDFSRDP 17 190895112 Rhizobium etli 38 RSDACQDTVSVWD 90 YVQIIMTRGRPPIGSRDL 18 89899273 Rhodoferax ferrireducens 34 RSDATYDVVTVWD 85 YVEMICTRGQPPWGSRDP 19 89053613 Jannaschia sp. 32 HSDIAYDVVPVWR 83 YVAMVAARGRNPVPGSRD 20 EEE43073 Labrenzia alexandrii 42 HSDITYDVVPVLD 93 YVAMVTSRGVNQVPGSRD 21 78059900 Burkholderia sp. 53 HADAAYDVVTVSR 104 YVWWCVTRGPLSVDRRDR 22 ABK12047 Burkholderia cenocepacia 48 HSDVTYDTVHVWN 99 YVEMLCTRGVSPTFSRDP 23 ZP_01448442 Alpha proteobacterium 22 HSDATYDVAHVWG 73 YVEFICTRGTSPNFSRDP 24 219677744 Gamma proteobacterium 30 LGDGVFDVVSAWK 81 SIRFIVTRGEPKGVVADP The first three entries are the DATA, BCAT and ADCL sequences used for the identification of sequence motifs. Entries 4–24 refer to the newly discovered enzymes.

a ­hydrophobic residue that is incapable of forming a hydrogen multiple-sequence alignment (Supplementary Fig. 2). Fortunately, bond with the carboxyl group. Second, Arg40 should be changed the amino acids that are in direct contact with the substrate in the to a residue that cannot activate the amide backbone nitrogens of active site are arranged in two relatively short sequence blocks. Thr262 and Ala263 by coordination of the neighboring backbone The first block is located at positions 36–40, and the second block © 2010 Nature America, Inc. All rights reserved. All rights Inc. America, Nature © 2010 carbonyl oxygens. However, the situation regarding position 40 in comprises six amino acids at positions 95–97 and 107–109. Most PLP-dependent fold class IV proteins is complex. DATAs also have of these amino acids fold into a β-sheet; only residues 107–109 a basic amino acid, Lys40, but in contrast to the Arg40 in BCATs, are part of a loop. This information is important because during Lys40 in DATAs adopts an altered conformation so that its ε-amino evolution, insertions or deletions take place preferentially in loop group forms a hydrogen bond to the backbone of an adjacent loop sequences without destroying enzyme activity but do not usually (Fig. 2 and Supplementary Fig. 1). Thus, in spite of the presence of take place in α-helices or β-sheets44. Thus, it was observed that resi- Lys40 in DATAs, the activation of the amide backbone nitrogens, dues contributing to substrate recognition of a PLP-dependent fold critical for carboxyl-group binding in binding pocket B, is pre- class IV enzyme aligned well within the different enzymes, except vented. From this observation, we concluded that if an Arg40 or in the motif at positions 107–109, where we sometimes observed Lys40 is found in one of the sequences in a fold class IV protein, insertions or deletions of one to two amino acids. Alignments, whether this residue facilitates the binding of a substrate carrying an which include all known BCAT, DATA and ADCL proteins with α-carboxyl group or not cannot be predicted. In summary, the pres- experimentally verified enzyme activity, showed that the amino ence of a hydrophobic amino acid in position 95 and the absence of acids involved in substrate recognition in the active site seem to be an arginine or lysine residue at position 40 would be a clear hint of quite conserved in these three groups. This allowed us to formu- altered substrate specificity toward amines. late different sequence motifs characteristic of DATA, BCAT and ADCL activity (Fig. 2; for a summary of the structure-­function Design and application of a sequence-based algorithm relationship of individual amino acids of the sequence motifs see Next, we developed a sequence-based prediction of the substrate Supplementary Tables 1 and 2; for the multiple-sequence align- specificity—conversion of amines versus α-D- or α-L-amino ments see Supplementary Figs. 3–5). On the basis of these com- acids—to identify putative (R)-amine transaminases within PLP- parative considerations, an annotation algorithm was developed dependent fold class IV proteins. Aside from the key residues (Supplementary Fig. 6) using the amino acid sequence motifs considered important for amine transaminase activity, we com- identified. This enabled an easy exclusion of all enzyme candidates pared residues involved in substrate coordination in the ­different that could clearly be designated as BCATs, ADCLs or DATAs. The enzymes. To simplify the structural description, we introduced analysis of the remaining sequences aimed at identifying trans­ a general numbering scheme for the amino acid residues of the aminase sequences that fulfilled the requirements for the desired different PLP-dependent fold class IV proteins. This was based on a (R)-selective amine transaminase activity.

810 nature chemical biology | vol 6 | NOVEMBER 2010 | www.nature.com/naturechemicalbiology Nature chemical biology doi: 10.1038/nchembio.447 article

NCBI database as DATAs, 7 could clearly be identified as BCATs Boc and one as an (R)-amine transaminase (Supplementary Fig. 7). N H N O 2 Confirmation of predicted activity and enantiopreference

H2N H2N H2N H2N HO O HO O In the next step we ordered all the putative (R)-selective amine 1 23456 transaminase genes as codon-optimized sequences for expression in ++++++ Pyruvate Pyruvate Pyruvate Pyruvate α-KG L-Glutamate Escherichia coli. They were subcloned and expressed in E. coli BL21 RS RSS RSR and underwent His-tag purification. Then we investigated the puri- 4 Aspergillus terreus fied enzymes with respect to activity, enantioselectivity and enantio- 5 Penicillium chrysogenum preference toward a range of amines (1–4) (Fig. 3). Additionally, 7 Aspergillus oryzae we also examined whether they possessed BCAT or DATA activity 8 Aspergillus fumigatus (conversion of d-alanine 5 or l-glutamate 6 with concomitant for- 9 Neosartorya fischeri mation of valine). Seventeen of the 21 putative (R)-selective amine 10 Gibberella zeae transaminase genes that had been identified were found to be (R)- 11 Hyphomonas neptunium selective amine transaminases (Fig. 3 and Supplementary Tables 3 12 Mycobacterium vanbaalenii and 4). Three proteins could not be expressed in E. coli in sufficient 13 Mesorhizobium loti amounts, and one protein was found to have very low activity on 14 Mesorhizobium loti the substrates studied. Specific activity and substrate range differed 15 Roseobacter sp. greatly among all of the enzymes investigated, which we find unsur- 16 Marimonas sp. 17 Rhizobium etli prising considering that the protein sequences originate from various 18 Rhodoferax ferrireducens microorganisms, that recombinant expression was never reported 19 Jannaschia sp. before and that specific activities within their natural functions as 20 Labrenzia alexandrii well as their natural substrates are unknown. Nevertheless, 10 out of −1 21 Burkholderia sp. the 21 proteins showed a specific activity >0.5 U mg toward at least 24 Gamma proteobacterium one of the investigated amines (Fig. 3), which is in the same activ- ity range of known (S)-selective amine transaminases45–47. (One unit (U) of activity was defined as the amount of enzyme that produced Specific activity [U mg–1]: 510.5 0.20.1 0.05 0.01 0.002 0 1 μmol ketone product per minute.) Consequently, these newly identified (R)-selective transami- Figure 3 | Characterization of the discovered (R)-selective amine nases can be used in asymmetric synthesis to yield optically pure transaminases. Specific activities toward (R) and (S) enantiomers of (R)-amines. This was confirmed in preliminary experiments amines 1–4 and α-amino acids 5 (d-alanine) and 6 (l-glutamate) are for the asymmetric synthesis of 2-aminohexane (2), 2-amino-4- indicated by a color gradient. The numbering of the proteins corresponds ­phenylbutane (3), 1-N-Boc-3-aminopyrrolidine (4a) and 1-N- to that given in Table 1 (details on specific activity and expression levels are Boc-3-aminopiperidine (4b) from the corresponding ketones with given in Supplementary Tables 2 and 3). α-KG, α-ketoglutarate. three of the amine transaminases and resulted in low to moder- ate yields with excellent enantiomeric excess (ee) up to 99.6% (see Supplementary Table 5). Using this algorithm, we analyzed about 5,700 sequences anno- tated as BCATs and 280 protein sequences annotated as PLP- DISCUSSION dependent fold class IV proteins from the US National Center for The strength of this new in silico approach is that new enzymes can © 2010 Nature America, Inc. All rights reserved. All rights Inc. America, Nature © 2010 Biotechnology Information (NCBI) Protein Database. This search be discovered very quickly. In contrast, directed evolution requires identified 21 sequences, 7 from eukaryotes and 14 from prokaryotes several rounds of random mutagenesis or iterative saturation muta- (Table 1). These sequences matched our criteria for the selection genesis to alter the enantioselectivity or substrate specificity, even of putative (R)-selective amine transaminases with the predicted if structural information is available to identify hot spots for muta- enantiopreference: the absence of arginine or lysine at position genesis. Usually more than 103–104 variants have to be screened for 40 in all sequences and a Phe95 instead of a Tyr95 in 8 out of the the identification of a mutant with desired properties. In contrast to 21 sequences indicated that substrates having a small alkyl group this very low ‘hit rate’, at least 50% of the putative proteins identified instead of a carboxyl group in the α-position might be bound to in our study turned out to be useful biocatalysts. the active site. In the other 13 sequences, a putative BCAT activity Hence, before creating mutants or libraries stemming from as suggested by a Tyr95 residue is unlikely because of the presence rational predictions in the laboratory, it is worthwhile to investigate of alanine, glutamine, aspartic acid or glutamic acid at residue 97 whether nature has already designed variants and possibly opti- instead of the required polarizing arginine. Thus, the coordination mized these catalysts over millions of years of natural evolution. of a small alkyl group in binding pocket B seemed much more likely The newly identified (R)-selective amine transaminases are an than binding of an α-carboxyl group, which we considered strong ideal starting point for further fine-tuning and optimization by pro- evidence that the respective proteins are amine transaminases rather tein engineering, as the requirements for industrial processes are than BCATs. To convey (R)-selectivity, binding pocket B should often different from the enzyme’s original function in nature. This only allow the binding of a small alkyl group. The fact that in posi- was demonstrated in a recent study in which the substrate specific- tion 95 a tyrosine or phenylalanine is found suggests that the size of ity of an (R)-amine transaminase was broadened so that bulky sub- the binding pocket is not large compared to BCATs, as this could be strates could also be converted34. Also, other important properties, obtained by mutation of Tyr95 to a small hydrophobic amino acid, such as stability at higher temperatures and tolerance of high cosol- such as alanine or valine. vent, substrate and product concentration, can be improved signifi- Thus, we considered the criteria for the desired switch in sub- cantly and can facilitate the application of amine transaminases for strate specificity toward (R)-amines to be fulfilled. It is noteworthy industrial-scale biotransformations. that sometimes the annotated function of a protein included in a In the example shown here, we took advantage of the enormous database did not match the prediction made by our sequence motif diversity of structures, elucidated mechanisms and substrate spec- approach. For example, out of 26 proteins annotated in the curated ificities already reported for PLP-dependent proteins. Although

nature CHEMICAL BIOLOGY | vol 6 | NOVEMBER 2010 | www.nature.com/naturechemicalbiology 811 article Nature chemical biology doi: 10.1038/nchembio.447

such diversity is not yet explored in depth for all enzyme classes, the Asymmetric synthesis of amines 1–4. Preliminary asymmetric syntheses were per- possibilities for the discovery of new catalysts using this concept formed at 30 °C for 24 h in sodium phosphate buffer (100 mM, pH 7) containing + will steadily increase on a daily basis as the number of protein PLP (1 mM) and NAD (1 mM) in 1.5-ml Eppendorf tubes. The reaction mixture contained 50 mM ketone, L-alanine (5 equiv., 250 mM), lactate dehydrogenase sequences and solved structures available in the databases grows. from bovine heart (90 U), glucose (150 mM) and glucose dehydrogenase (15 U). A second important requirement for the successful application Amine transaminases from Aspergillus terreus, Mycobacterium vanbaalenii and of in silico enzyme discovery is a reliable prediction of the nece­ssary Mesorhizobium loti (entries 4, 12 and 14 in Table 1) were expressed in E. coli BL21 key amino acid differences that determine the desired substrate as described above, frozen in aliquots and applied directly as whole-cell biocata- specificity. In this study, the molecular mechanism for the change in lysts (~0.05 g wet cell weight per ml) without further purification. The conversion was measured by detection of the formed amines (1, gas chromatography; 2–4, substrate specificity from binding a carboxyl group to a small alkyl capillary electrophoresis). Chiral analysis of 2–4 was performed using capillary group—and the related predicted, desired switch in enantioprefer- electrophoresis as reported previously49. The percent enantiomeric excess values ence—could be easily rationalized. Method develop­ment for under- for 1 were analyzed by gas chromatography. After extraction of the amine with standing the molecular basis and prediction of substrate specificity ethyl acetate, derivatization to the trifluoroacetamide was performed by adding involving homology modeling and molecular docking are important a 20-fold excess of trifluoroacetic acid anhydride. After purging with nitrogen to remove excess anhydride and residual trifluoroacetic acid, the derivatized research fields. Additionally, the application of more complex search compound was dissolved in ethyl acetate (50 μl) and baseline separated using algorithms and bioinformatics tools will help to refine such in silico a Shimadzu GC14A that was equipped with a heptakis-(2,3-di-O-acetyl-6-O-tert- enzyme discovery and facilitate its application in cases in which butyldimethylsilyl)-β-cyclodextrin column (25 m by 0.25 mm). The retention more complex solutions might be required. times were 16.0 min ((S)-1) and 16.2 min ((S)-2) using the following oven temperature program: 80 °C for 10 min, heating with 20 °C per min to 180 °C, METHODS maintained at 180 °C for a further 10 min. Alignments. All pairwise and multiple amino acid sequence alignments were done with the computer program STRAP using the ClustalW3D algorithm with stand- Received 23 November 2009; accepted 23 August 2010; ard parameters. The protein sequences annotated as BCATs or PLP-dependent published online 26 September 2010 fold class IV enzymes from the NCBI protein database were used for the database search and aligned to E. coli BCAT. References Cloning and expression of amine transaminase. The codon-optimized open read- 1. Bornscheuer, U.T. & Kazlauskas, R.J. Hydrolases in Organic Synthesis ing frames (ORFs) encoding proteins 4, 5, 6, 7, 11, 12, 14, 16, 17, 18, 20, 21 and (Wiley-VCH, Weinheim, Germany, 2005). 24 (Table 1) were inserted into pGASTON between the NdeI and BamHI 2. Breuer, M. et al. Industrial methods for the production of optically active restriction sites. The codon-optimized ORFs encoding all other proteins were intermediates. Angew. Chem. Int. Edn Engl. 43, 788–824 (2004). ordered subcloned in pET-22b. Transformed E. coli BL21 (DE3) strains were grown 3. Buchholz, K., Kasche, V. & Bornscheuer, U.T. Biocatalysis and Enzyme in 400 ml LB medium supplemented with ampicillin (100 μg ml−1). Cells were Technology (Wiley-VCH, Weinheim, Germany, 2005). incubated initially at 37 °C on a gyratory shaker until the OD600 reached 0.7. 4. Schmid, A. et al. Industrial biocatalysis today and tomorrow. Nature 409, The cells were then induced by addition of 0.2% (w/v) rhamnose (pGASTON) 258–268 (2001). or 0.1 mM IPTG (pET-22b), respectively. At the same time the incubation 5. Schoemaker, H.E., Mink, D. & Wubbolts, M.G. Dispelling the myths—Biocatalysis temperature was decreased to 20 °C, and cultivation continued for 20 h. in industrial synthesis. Science 299, 1694–1697 (2003). 6. Bommarius, A.S. & Riebel, B.R. Biocatalysis: Fundamentals and Applications Purification of proteins. The cell pellet (~3 g wet weight) was washed twice with (Wiley-VCH, Weinheim, Germany, 2004). phosphate buffer (pH 7.5, 50 mM) containing 0.1 mM PLP at 4 °C. After disrup- 7. Grunwald, P. Biocatalysis: Biochemical Fundamentals and Applications tion (French press), the cell suspension was centrifuged (10,000g, 30 min), and the (Imperial College Press, 2009). resulting supernatant was passed through a 0.5-μm filter before chromatography. 8. Liese, A., Seelbach, K. & Wandrey, C. (eds.) Industrial Biotransformations Chromatography was performed using an ÄKTA Purifier (GE Healthcare). The (Wiley-VCH, Weinheim, London, UK, 2006). filtered cellular extract was applied to a 5-ml column of IMAC Sepharose 6 Fast 9. Patel, R.N. (ed.) Biocatalysis in the Pharmaceutical and Biotechnology Flow (GE Healthcare). The column was washed at a flow rate of 5 ml min−1 with Industries (CRC Press, Boca Raton, Florida, USA, 2006). 10 column-volumes of phosphate buffer (pH 7.5, 50 mM, containing 300 mM 10. Pollard, D.J. & Woodley, J.M. Biocatalysis for pharmaceutical intermediates: © 2010 Nature America, Inc. All rights reserved. All rights Inc. America, Nature © 2010 NaCl, 0.1 mM PLP and 30 mM imidazole to avoid nonspecific binding), and the the future is now. Trends Biotechnol. 25, 66–73 (2007). active protein was eluted with 10 column-volumes of phosphate buffer (pH 7.5, 11. Straathof, A.J.J., Panke, S. & Schmid, A. The production of fine chemicals by 50 mM, containing 300 mM NaCl, 0.1 mM PLP and 300 mM imidazole at a biotransformations. Curr. Opin. Biotechnol. 13, 548–556 (2002). flow rate of 5 ml min−1). The fractions with the desired protein were pooled and 12. Tao, J., Lin, G.-Q. & Liese, A. (eds.) Biocatalysis for the Pharmaceutical desalted via gel chromatography with a 20 mM tricine buffer, pH 7.5, containing Industry: Discovery, Development, and Manufacturing (Wiley-VCH, 0.01 mM PLP. The purified enzymes were stored at 4 °C. Protein concentrations Weinheim, Germany, 2009). were determined using the BCA assay kit (Uptima) after gel chromatography. 13. Hou, C.T. (ed.). Handbook of Industrial Biocatalysis (CRC Press, Boca Raton, Florida, USA, 2005). Characterization of substrate specificity. For an initial confirmation of activity, 14. Faber, K. Biotransformations in Organic Chemistry (Springer-Verlag, Berlin, 2004). α-methylbenzyl amine (1, α-MBA) served as substrate in an acetophenone-based 15. Mugford, P.F., Wagner, U.G., Jiang, Y., Faber, K. & Kazlauskas, R.J. microplate assay47: a solution of 2.5 mM (R)- or (S)-α-MBA and pyruvate was Enantiocomplementary enzymes: classification, molecular basis for their reacted in the presence of the purified enzyme, and the increase in absorbance enantiopreference, and prospects for mirror-image biotransformations. at 245 nm was correlated to the formation of acetophenone. The conversions of Angew. Chem. Int. Edn Engl. 47, 8782–8793 (2008). the other amines (2–4) were monitored using a conductivity assay48: a solution 16. Griengl, H., Schwab, H. & Fechter, M. The synthesis of chiral cyanohydrins containing 10 mM amine and pyruvate was reacted in the presence of the purified by oxynitrilases. Trends Biotechnol. 18, 252–256 (2000). amine transaminase, and the decrease in conductivity was related to the conver- 17. Cheon, Y.-H. et al. Crystal structure of d-hydantoinase from Bacillus sion of substrate. For all measurements, either an appropriate amount of purified stearothermophilus: insight into the stereochemistry of enantioselectivity. enzyme was applied, dependent on the specific activity, or the highest concentra- Biochemistry 41, 9410–9417 (2002). tion possible was applied (0.3–2.2 mg ml−1 purified enzyme, dependent on expres- 18. Gröger, H. et al. Enantioselective reduction of ketones with designer cells at sion and purification) while measuring low activities. high substrate concentrations: highly efficient access to functionalized To verify DATA or BCAT activity, the decrease of NADH was measured optically active alcohols. Angew. Chem. Int. Edn Engl. 45, 5677–5681 (2006). spectrophotometrically at 340 nm using dehydrogenase-coupled microplate 19. Fox, R.J. et al. Improving catalytic function by ProSAR-driven enzyme assays: a solution of 5 mM α-ketoglutaric acid and D-alanine 5 was reacted in the evolution. Nat. Biotechnol. 25, 338–344 (2007). presence of the purified transaminase, and 1 U ml−1 lactate dehydrogenase and 20. Kazlauskas, R.J. & Bornscheuer, U.T. Finding better protein engineering 0.5 mM NADH were used for measuring DATA activity. For measuring BCAT strategies. Nat. Chem. Biol. 5, 526–529 (2009). activity, a solution containing 5 mM 3-methyl-2-oxobutyric acid and L-glutamate 21. Lutz, S. & Bornscheuer, U.T. (eds.) Protein Engineering Handbook (Wiley 6, 10 mM ammonium chloride, 1 U ml−1 and VCH, Weinheim, Germany, 2009). 0.5 mM NADH was used. All reactions took place in tricine buffer (pH 7.5, 22. Turner, N.J. Directed evolution drives the next generation of biocatalysts. 20 mM) containing 0.01 mM PLP. The pH of the buffer was adjusted with Nat. Chem. Biol. 5, 567–573 (2009). 1,8-diazabicyclo[5.4.0]undec-7-ene. The specific activity was expressed as units 23. Reetz, M.T., Carballeira, J.D. & Vogel, A. Iterative saturation mutagenesis on per milligram protein, and one unit of activity was defined as the amount of the basis of B factors as a strategy for increasing protein thermostability. enzyme that produced 1 μmol ketone product per minute. Angew. Chem. Int. Edn Engl. 45, 7745–7751 (2006).

812 nature chemical biology | vol 6 | NOVEMBER 2010 | www.nature.com/naturechemicalbiology Nature chemical biology doi: 10.1038/nchembio.447 article

24. Ivancic, M., Valinger, G., Gruber, K. & Schwab, H. Inverting enantioselectivity 41. Sugio, S., Petsko, G.A., Manning, J.M., Soda, K. & Ringe, D. Crystal structure of Burkholderia gladioli esterase EstB by directed and designed evolution. of a d-amino acid aminotransferase: how the protein controls J. Biotechnol. 129, 109–122 (2007). stereoselectivity. Biochemistry 34, 9661–9669 (1995). 25. Koga, Y., Kato, K., Nakano, H. & Yamane, T. Inverting enantioselectivity of 42. Goto, M., Miyahara, I., Hayashi, H., Kagamiyama, H. & Hirotsu, K. Crystal Burkholderia cepacia KWI-56 lipase by combinatorial mutation and structures of branched-chain amino acid aminotransferase complexed with high-throughput screening using single-molecule PCR and in vitro glutamate and glutarate: true reaction intermediate and double substrate expression. J. Mol. Biol. 331, 585–592 (2003). recognition of the enzyme. Biochemistry 42, 3725–3733 (2003). 26. Magnusson, A.O., Takwa, M., Hamberg, A. & Hult, K. An S-selective lipase 43. Mehta, P.K., Hale, T.I. & Christen, P. Aminotransferases: demonstration of was created by rational redesign and the enantioselectivity increased with homology and division into evolutionary subgroups. Eur. J. Biochem. 214, temperature. Angew. Chem. Int. Edn Engl. 44, 4582–4585 (2005). 549–561 (1993). 27. May, O., Nguyen, P.T. & Arnold, F.H. Inverting enantioselectivity by directed 44. Wolf, Y., Madej, T., Babenko, V., Shoemaker, B. & Panchenko, A.R. evolution of hydantoinase for improved production of l-methionine. Nat. Long-term trends in evolution of indels in protein sequences. BMC Evol. Biol. Biotechnol. 18, 317–320 (2000). 7, 19 (2007). 28. Williams, G.J., Woodhall, T., Farnsworth, L.M., Nelson, A. & Berry, A. 45. Hanson, R.L. et al. Preparation of (R)-amines from racemic amines with an Creation of a pair of stereochemically complementary biocatalysts. J. Am. (S)-amine transaminase from Bacillus megaterium. Adv. Synth. Catal. 350, Chem. Soc. 128, 16238–16247 (2006). 1367–1375 (2008). 29. Zha, D.X., Wilensek, S., Hermes, M., Jaeger, K.E. & Reetz, M.T. Complete 46. Shin, J.-S., Yun, H., Jang, J.-W., Park, I. & Kim, B.-G. Purification, reversal of enantioselectivity of an enzyme-catalyzed reaction by directed characterization, and molecular cloning of a novel amine:pyruvate evolution. Chem. Commun. (Camb.) 2664–2665 (2001). transaminase from Vibrio fluvialis JS17. Appl. Microbiol. Biotechnol. 61, 30. Bartsch, S., Kourist, R. & Bornscheuer, U.T. Complete inversion of 463–471 (2003). enantioselectivity towards acetylated tertiary alcohols by a double mutant of a 47. Schätzle, S., Höhne, M., Redestad, E., Robins, K. & Bornscheuer, U.T. Rapid Bacillus subtilis esterase. Angew. Chem. Int. Edn Engl. 47, 1508–1511 and sensitive kinetic assay for characterization of ω-transaminases. Anal. (2008). Chem. 81, 8244–8248 (2009). 31. Iwasaki, A., Yamada, Y., Ikenaka, Y. & Hasegawa, J. Microbial synthesis of 48. Schätzle, S., Höhne, M., Robins, K. & Bornscheuer, U.T. A conductometric (R)- and (S)-3,4-dimethoxyamphetamines through stereoselective method for the rapid characterization of the substrate specificity of transamination. Biotechnol. Lett. 25, 1843–1846 (2003). amine-transaminases. Anal. Chem. 82, 2082–2086 (2010). 32. Matcham, G.W. & Bowen, A.R.S. Biocatalysis for chiral intermediates: 49. Höhne, M., Kühl, S., Robins, K. & Bornscheuer, U.T. Efficient asymmetric Meeting commercial and technical challenges. Chim. Oggi 14, 20–24 synthesis of chiral amines by combining transaminase and pyruvate (1996). decarboxylase. ChemBioChem 9, 363–365 (2008). 33. Koszelewski, D., Lavandera, I., Clay, D., Rozzell, D. & Kroutil, W. Asymmetric synthesis of optically pure pharmacologically relevant amines employing ω-transaminases. Adv. Synth. Catal. 350, 2761–2766 (2008). Author contributions 34. Savile, C.K. et al. Biocatalytic asymmetric synthesis of chiral amines from K.R. and U.T.B. initiated the project. M.H. designed the in silico strategy, devised the an- ketones applied to sitagliptin manufacture. Science 329, 305–309 (2010). notation algorithm and performed the database search and identification of the putative 35. Jansonius, J.N. Structure, evolution and action of vitamin B6-dependent amine transaminases. M.H. expressed and confirmed amine transaminase activity and enzymes. Curr. Opin. Struct. Biol. 8, 759–769 (1998). (R)-selectivity for the first three proteins. S.S. coordinated the comparative characteriza- 36. Percudani, R. & Peracchi, A. The B6 database: a tool for the description and tion of all proteins and performed cloning, expression, purification, data collection and classification of vitamin B6-dependent enzymatic activities and of the data analysis. H.J. contributed to gene cloning, protein expression and activity measure- corresponding protein families. BMC Bioinformatics 10, 273 (2009). ments. U.T.B. and M.H. cowrote the paper, and all authors read and edited the manuscript. 37. Höhne, M. & Bornscheuer, U.T. Biocatalytic routes to optically active amines. ChemCatChem 1, 42–51 (2009). Competing financial interests 38. Cahn, R.S., Ingold, C. & Prelog, V. Specification of molecular chirality. Angew. The authors declare competing financial interests: details accompany the full-text HTML Chem. Int. Edn Engl. 5, 385–415 (1966). version of the paper at http://www.nature.com/naturechemicalbiology/. 39. Mizuguchi, H. et al. Strain is more important than electrostatic interaction in controlling the pK(a) of the catalytic group in aspartate aminotransferase. Biochemistry 40, 353–360 (2001). Additional information 40. Okamoto, A., Nakai, Y., Hayashi, H., Hirotsu, K. & Kagamiyama, H. Crystal Supplementary information is available online at http://www.nature.com/ structures of Paracoccus denitrificans aromatic amino acid aminotransferase: naturechemicalbiology/. Reprints and permissions information is available online at © 2010 Nature America, Inc. All rights reserved. All rights Inc. America, Nature © 2010 A substrate recognition site constructed by rearrangement of hydrogen bond http://npg.nature.com/reprintsandpermissions/. Correspondence and requests for network. J. Mol. Biol. 280, 443–461 (1998). materials should be addressed to U.T.B.

nature CHEMICAL BIOLOGY | vol 6 | NOVEMBER 2010 | www.nature.com/naturechemicalbiology 813 The Articles

ARTICLE V

- 75 -

FULL PAPERS

DOI: 10.1002/adsc.201100435 Enzymatic Asymmetric Synthesis of Enantiomerically Pure Aliphatic, Aromatic and Arylaliphatic Amines with (R)-Selective Amine Transaminases

Sebastian Schtzle,a Fabian Steffen-Munsberg,a Ahmad Thontowi,a Matthias Hçhne,a Karen Robins,b and Uwe T. Bornscheuera,* a Institute of Biochemistry, Department of Biotechnology and Enzyme Catalysis, Greifswald University, Felix Hausdorff-Str. 4, 17487 Greifswald, Germany Fax : (+49)-3834-86-794367; e-mail: [email protected] b Lonza AG, Valais Works, 3930 Visp, Switzerland

Received: May 29, 2011; Published online: August 25, 2011

Supporting information for this article is available on the WWW under http://dx.doi.org/10.1002/adcs.201100435.

Abstract: Seven (R)-selective amine transaminases cose dehydrogenase system to shift the equilibrium. (R-ATAs) recently discovered by an in silico-based For all ketones, at least one enzyme was found that approach in sequence databases were produced re- allows complete conversion to the corresponding combinantly in Escherichia coli and subjected to par- chiral amine having excellent optical purities >99% tial purification by ammonium sulfate precipitation. ee. Variations in substrate profiles are also discussed A range of additives and various buffers were inves- based on the phylogenetic relationships between the tigated to identify best conditions to ensure good seven R-ATAs. Thus, we have identified a versatile storage stability and stable activity during biocataly- toolbox of (R)-amine transaminases showing remark- sis. All enzymes show pH optima between pH 7.5–9. able properties for application in biocatalysis. These R-ATAs were then applied in the asymmetric synthesis of twelve aliphatic, aromatic and arylali- phatic (R)-amines starting from the corresponding Keywords: chiral amines; enantioselectivity; enzyme prochiral ketones using a lactate dehydrogenase/glu- catalysis; transaminases

Introduction prostereogenic ketones. Furthermore, these enzymes usually show excellent enantioselectivity and thus Optically active amines play an important role in the they are in great demand. In principle, both enantio- pharmaceutical, agrochemical and chemical industry. mers of a given amine can be produced with a single They are frequently used as building blocks for the enantioselective amine transaminase having, for in- synthesis of various biological active compounds used stance, (S)-enantiopreference. A kinetic resolution as drugs or agrochemicals including, for example, with this enzyme would leave the (R)-enantiomer drugs for the treatment of Alzheimers disease[1] and behind, which can be obtained at maximum in 50% malaria[2] as well as prostate drugs[3] and antitumor yield as the (S)-enantiomer is converted into the antibiotics.[4] As both enantiomers of a building block ketone. In turn, asymmetric synthesis starting from are often needed for targets with different absolute the corresponding prochiral ketone would yield the configuration, an enzyme platform for the production (S)-amine as product in theoretically 100% yield if of both (R)- and (S)-enantiomers is highly desired. the equilibrium is efficiently shifted towards product In principle, various enzymatic routes can be em- formation for which a range of methods are avail- ployed for the synthesis of optically active primary able.[6] Whereas many (S)-selective amine transami- amines using hydrolases, oxidoreductases or transfer- nase (S-ATAs) had been discovered and extensively ases (for further reading see review[5]). All routes investitaged, only a few (R)-amine transaminases (R- have their advantages and disadvantages, but only ATAs) were known until recently and only one amine transaminases offer the unique possibility to enzyme (ATA-117, Codexis) was commercially avail- synthesize enantiomerically pure amines directly from able. During an extensive project aiming for an R-

Adv. Synth. Catal. 2011, 353, 2439 – 2445 2011 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim 2439 FULL PAPERS Sebastian Schtzle et al.

ATA for the industrial synthesis of the drug Sitaglip- MycVan from Mycobacterium vanbaalenii) for the tin, Codexis and Merck & Co created a variant of asymmetric synthesis of a range of aliphatic, aromatic ATA-117, which now allows synthesis of the drug on and arylaliphatic amines to elucidate the substrate industrial scale.[7] Thus, an efficient asymmetric syn- scope of these enzymes originating from various mi- thesis of (R)-amines was mostly hampered by the lim- croorganisms and for which no natural substrate is ited access to R-ATAs. Recently, we identified 20 known (Scheme 1). These enzymes were primarily (R)-amine transaminases by an in silico search strat- chosen from the 17 recently discovered R-ATAs[8] be- egy in which careful analysis of typical motifs for (S)- cause of their high specific activities towards 2b, 7b or (R)-selective amino acid transaminases (BCAT, and 9b and sufficient expression levels in E. coli. branched chain amino acid transaminase; DATA, d- amino acid transaminase) and certain lyases from the same fold class led to a complex algorithm.[8] This al- Enzyme Production and Formulation gorithm was then used in an alignment of 5,000 se- quences deposited in public databases to filter out all Before the newly discovered R-ATAs could be bio- known enzymes with BCAT, DATA or lyase activity chemically characterized in more detail and used in and certain additional criteria to ensure that the re- asymmetric synthesis, methods for their expression in maining sequences are true amine transaminases and E. coli on sufficient scale and stability studies needed that they exhibit (R)-enantiopreference. The proteins to be performed, especially as we have noticed before encoded by the remaining 20 sequences were all that several enzymes were rather unstable after His- cloned, functionally expressed in E. coli and in pre- tag purification in tricine buffer used for the conduc- liminary analyses, we could demonstrate that 17 en- tivity assay.[9] For application in biocatalysis purified zymes were indeed amine transaminases and all had enzymes are not needed as long as it is ensured that (R)-selectivity (three enzymes could not be expressed the crude cell lysate does not contain any interfering at sufficiently high levels) as confirmed by determin- background activity, which was never observed for all ing specific activities towards different amine enantio- R-ATAs expressed in E. coli. Thus, the crude extracts mers. containing the seven R-ATAs were only subjected to In this contribution, we now report further analysis ammonium sulfate precipitation at 60% saturation of seven of these R-ATAs, which includes expression and could be recovered fully active. Some parts of the optimization to produce the enzyme in sufficient total protein could be discarded in the cases of amounts, formulation to achieve stable biocatalysts AspFum and GibZea by fractionated precipitation at and the determination of pH profiles. Special focus is an ammonium sulfate saturation of 40% beforehand. given on the application of the enzymes in asymmet- This step was not carried out for the other enzymes as ric synthesis to identify their potential in industrial it would have caused a reasonable loss of activity. biocatalysis and hence suitable methods to shift the Next, various additives – commonly used osmolytes equilibrium needed to be identified too. such as glycerol, ethylene glycol, PEG 1550, trehalose, sucrose and sarcosine as well as the surfactant Tween 80 – were investigated for their stabilizing effect on Results and Discussion the proteins during freeze-drying and storage as lyo- philizate and in solution. The best conditions for each In the present study we utilized a toolbox consisting R-ATA along with the activity and yield of lyophili- of seven (R)-selective amine transaminases (AspTer zate from 400-mL shake flask cultivations are sum- from Aspergillus terreus, AspFum from Aspergillus fu- marized in Table 1. In most cases sucrose or glycerol migatus, AspOry from Aspergillus oryzae, PenChry had the best stabilizing effect on the R-ATAs. All en- from Penicillium chrysogenum, NeoFis from Neosar- zymes were stable for at least three months in lyophi- torya fischeri, GibZea from Gibberella zeae and

Scheme 1. Chiral (R)-amines obtained from the corresponding prochiral ketones using the amine transaminases.

2440 asc.wiley-vch.de 2011 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim Adv. Synth. Catal. 2011, 353, 2439 – 2445 Enzymatic Asymmetric Synthesis of Enantiomerically Pure Aliphatic, Aromatic and Arylaliphatic Amines

Table 1. Summary of parameters for the preparation of the lyophilized biocatalysts. Activities were measured with the ace- tophenone assay photometrically[10] and protein concentrations were determined with the BC assay according to the supplier’s manual.

Enzyme Additive[a] Activity[b] Yield[c] Total ac- Activity [U/mg] [mg] tivity[c] [U/mg [U] protein] AspTer sucrose 0.74 956 711 3.94 PenChr sucrose 0.15 789 122 1.01 AspOry – 1.10 757 829 3.65 Scheme 2. Reaction scheme for the asymmetric synthesis of AspFum sucrose 0.62 687 426 4.83 chiral (R)-amines with the R-ATAs in combination with the NeoFis sarcosine 1.22 622 759 8.14 LDH/GDH system to shift the equilibrium towards product GibZea sucrose 2.47 574 1415 18.63 formation. MycVan glycerol 0.44 580 258 1.95 [a] 2% (w/v) added after ammonium sulfate precipitation. [b] Activity of lyophilisate. of the PLP-cofactor. On the other hand, we showed [c] For lyophilisate obtained from 400 mL cultivation. in a previous study that TRIS, EPPS and especially HEPES had an inhibitory effect on the S-ATA from Rhodobacter sphaeroides compared to phosphate lized form and for at least two weeks in solution at buffer (7b was converted with 71%, 76% and 24%, 48C. respectively[9]). Therefore, the initial activity towards 7b in four different buffer systems (TRIS, HEPES, MOPS and BES) was compared to the activity in Finding the Right Reaction System 50 mM and 100 mM phosphate buffer (Table 2). Overall, the influence of the different buffer sys- With stable and active biocatalysts available, the next tems was only moderate (Table 2). Significantly lower important challenge for a successful asymmetric syn- activity was only observed for PenChr, in 100 mM thesis of chiral amines was the identification of the phosphate or 50 mM TRIS buffer, whereas higher ac- best method to shift the equilibrium towards product tivity was found in tricine or HEPES buffer. Alto- formation. Thus, the use of pyruvate decarboxylase[6b] gether the differences were rather small and hence and the combination of lactate dehydrogenase (LDH) phosphate buffer was chosen for subsequent experi- with glucose dehydrogenase (GDH) for cofactor recy- ments as this buffer is suitable for the LDH/GDH re- cling[6a] were preliminary investigated and the LDH/ cycling system. GDH system (Scheme 2) gave most satisfying results. The pH profiles of the R-ATAs were determined Next, several buffers were investigated to identify using the photometric acetophenone assay.[10] The ma- the best system for the biocatalysis. Recent studies jority of the enzymes show a pH optimum between 8– showed that the most often used phosphate buffer is 9, which is also typical for S-ATAs such as the not always the best choice for transaminases because enzyme from Vibrio fluvialis;[11] AspTer and PenChr of inhibitory effects of the phosphate ions – assuming- show a sharp optimum at pH 8.5–9 while AspOry, ly by blocking the binding site of the phosphate group AspFum and NeoFis have a broader optimal range

Table 2. Relative activity of the R-ATAs in different buffer systems. The activity towards 7b was measured spectrophotomet- rically[10] in the different buffers at pH 8 and was normalized to the activity in 50 mM phosphate buffer (set to 100%). All values are means from three individual measurements and standard deviations did not exceed 10%.

Relative activity [%] Phosphate Phosphate TRIS HEPES MOPS BES Tricine (50 mM) (100 mM) (100 mM) (100 mM) (100 mM) (100 mM) (100 mM) AspTer 100 99 93 100 98 80 104 PenChr 100 76 52 123 114 99 136 AspOry 100 107 106 103 106 104 105 AspFum 100 103 98 104 103 101 102 NeoFis 100 101 103 102 101 97 101 GibZea 100 112 107 122 123 122 119 MycVan 100 95 103 104 104 109 108

Adv. Synth. Catal. 2011, 353, 2439 – 2445 2011 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim asc.wiley-vch.de 2441 FULL PAPERS Sebastian Schtzle et al. between 8–9. GibZea has a rather untypical pH opti- mum of 7.5 and loses already 30 to 70% activity at pH 8.5 and 9, respectively, whereas MycVan has a very broad optimum between pH 7.5–8.5 and still shows ~80% activity at pH 9 (Figure 1). Nevertheless, in combination with the LDH/GDH system, prelimi- nary experiments gave the highest yields at pH 7.5.

Asymmetric Synthesis

Having found suitable ways for stabilizing the en- zymes and having elucidated the optimal reaction conditions, the R-ATAs were applied to asymmetric synthesis to obtain the aliphatic, aromatic and arylali- phatic amines (Scheme 1) and the results are summar- ized in Table 3. These data clearly show that all seven enzymes ex- hibited excellent selectivity (>99% ee) and in all cases the (R)-enantiopreference could be confirmed. Most importantly, for all substrates at least one R- ATA – and for various compounds several enzymes – could be identified, which allowed complete conver- sion of the ketone into the chiral amine. This exceed- ed our expectations as the ketones represent a rather diverse set of compounds. In only a few cases, lower conversions were observed (i.e., 2b, 7b and 9b with PenChr or GibZea) but further optimization to ach- ieve higher conversion was considered unnecessary in light of the other very useful R-ATAs. Although PenChr and AspTer show a rather high similarity score in a multiple sequence alignment (Figure 2), the conversions achieved in biocatalysis differed considerably between these two R-ATA. AspTer enabled 100% conversion for 2-aminohexane 2b, -heptane 3b and -nonane 4b, but not for the short- er 2-aminopentane 1b (~40%) and 2-amino-4-methyl- pentane 5b (~60%). Substrates with sterically more demanding groups than alkyl (cyclohexane or phenyl) next to the ketone function were converted even worse. These differences will be analyzed in more detail once a 3-D structure or homology model of these R-ATAs is available. Another interesting result is the fact that AspTer did not give high yields for methylphenylpropiona- mine 9b and the 4-methoxy derivative 10b (32% and 20%), but good yields of around 80% could be ach- ieved with the hydroxy derivatives 11b and 12b. This phenomenon was only observed with AspTer. The two other enzymes from Aspergillus species (AspOry and AspFum) turned out to be even better catalysts, as well as NeoFis. According to their sequence identi- ty these enzymes are much alike, especially AspFum and NeoFis with a sequence identity of 96%. AspOry Figure 1. pH profiles of the seven R-ATAs. Activities were measured with the acetophenone assay spectrophotometri- shares a similarity of 72% and 73%, respectively, and cally[10] in Davies buffer[12], which is commonly used for pH gave very good to excellent conversions for the ali- profile determinations. phatic substrates, considerably less conversion for 6a,

2442 asc.wiley-vch.de 2011 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim Adv. Synth. Catal. 2011, 353, 2439 – 2445 Enzymatic Asymmetric Synthesis of Enantiomerically Pure Aliphatic, Aromatic and Arylaliphatic Amines

Table 3. Conversion and enantiomeric excess of the amines obtained by asymmetric synthesis from the corresponding ke- tones. In general, reactions were repeated in triplicates and standard deviation did not exceed 10%.

R-ATA AspTer PenChr AspOry AspFum NeoFis GibZea MycVan [a] [b] [a] [b] [a] [b] [a] [b] [a] [b] [a] [b] [a] [b] Substrate c eep c eep c eep c eep c eep c eep c eep [%] [%ACHTUNGRE ee] [%] [%ACHTUNGRE ee] [%] [%ACHTUNGRE ee] [%] [%ACHTUNGRE ee] [%] [%ACHTUNGRE ee] [%] [%ACHTUNGRE ee] [%] [%ACHTUNGRE ee] 1b 47 >99 1 – 92 >99 >99 >99 >99 >99 30 >99 66 >99 2b >99 >99 1 – >99 >99 >99 >99 >99 >99 14 >99 88 >99 3b >99 >99 15 >99 >99 >99 >99 >99 >99 >99 43 >99 69 >99 4b >99 >99 12 >99 77 >99 >99 >99 >99 >99 23 >99 65 >99 5b 64 >99 9 >99 79 >99 >99 >99 >99 >99 32 >99 68 >99 6b 30 >99 1 – 53 >99 93 >99 59 >99 16 >99 56 >99 7b 3 >99 0 – 39 >99 >99 >99 87 >99 4 >99 42 >99 8b 17 >992–89968>99 74 >99 3 99 6 99 9b 32 >99 2 – 92 >99 >99 >99 >99 >99 9 >99 89 >99 10b 20 >99 1 – >99 >99 >99 >99 92 >99 7 >99 72 >99 11b 83 >99 4 – >99 >99 >99 >99 >99 >99 5 >99 71 >99 12b 77 >99 2 – 98 >99 94 >99 >99 >99 6 >99 85 >99 [a] Conversion as determined from the product analysis by HPLC. [b] % enantiomeric excess of product.

compounds, moderate yields for 6a and 7a and did not like 8a at all. Bringing together the observed performances during biocatalysis and the degree of relationship as shown in the phylogenetic tree (Figure 2), several cor- relations attract attention. MycVan, for example, is a good biocatalysis not far from the performance of AspOry, AspFum and NeoFis, which show a high degree of relationship. Although GibZea shows a higher similarity towards these good biocatalysts such as MycVan, it showed only a rather poor perfor- mance. Interestingly, AspTer and PenChr are also strongly related, but showed very different perform- ances regarding the conversions achieved. However, they both prefer aliphatic substrates with a chain length of at least six carbon atoms over shorter sub- strates, yielded only very small amounts of 7b, and both achieved higher conversions with at least one hy- droxy derivative of 10b as with 10b itself

Figure 2. Phylogenetic tree of the investigated R-ATAs. The figure shows all 21 hits from the database search and the Conclusions biocatalysts used in this study are named. DATA, ADCL (4- amino-4-deoxychorismate lyase) and BCAT are the other We have successfully applied the recently discovered three members of the PLP-fold type IV. R-ATAs to asymmetric synthesis of various amines with >99% conversion and excellent enantiomeric purities of >99% ee. Looking back to our first pre- 7a and 8a and again very good conversions for the ar- liminary experiments with conversions of < 40%,[8] ylaliphatic substrates 9–12a. MycVan is the outsider one may conclude that producing biocatalysts in the of the investigated proteins, with similarity scores not right formulation, optimization of the reaction system exceeding 40%. The phylogenetic tree (Figure 2) as well as using a broad set of substrates is as impor- shows that MycVan has the highest degree of relation- tant as finding new enzymes. ship to the assumed ancestor BCAT. MycVan ach- ieved good yields for the aliphatic and arylaliphatic

Adv. Synth. Catal. 2011, 353, 2439 – 2445 2011 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim asc.wiley-vch.de 2443 FULL PAPERS Sebastian Schtzle et al.

Experimental Section ganic phase was dried over Na2SO4 and the extracted amines were derivatized to the trifluoroacetamides by All chemicals were of analytical grade purity and obtained adding a 20-fold excess of trifluoroacetic anhydride. After from Sigma Aldrich (Munich, Germany), ABCR (Karlsruhe, 5 min at room temperature and purging with nitrogen to Germany) or Acros (Geel, Belgium). remove excess anhydride, residual trifluoracetic acid and solvent, the derivatized compounds were dissolved in 50 mL Functional Expression and Preparation of ethyl acetate and 0.6 mL of this solution were injected into a Lyophilized Transaminases Hewlett Packard 5890 Series II gas chromatograph equipped Expression of the transaminases was carried out as de- with an octakis-(2,3-di-O-acetyl-6-O-tert-butyldimethylsilyl)- scribed elsewhere.[8] The crude extract was stirred on ice and g-cyclodextrin column (50 m0.25 mm) for analyzing 8b or supplemented with 60% saturation of (NH4)2SO4 for precip- with a heptakis-(2,3-di-O-acetyl-6-O-tert-butyldimethyl- itation of R-ATAs and stirred for 30 min. In the cases of silyl)-b-cyclodextrin column (25 m 0.25 mm) for 1–7b and AspFum and GibZea a fraction of the proteins from the 9–10b. crude extract was precipitated by adding 40% saturation of Chiral analysis of 11–12b (Table 4) was performed by ca- (NH ) SO from the supernatant, stirred and centrifuged be- 4 2 4 pillary electrophoresis on a PACE-MDQ system equipped forehand. The transaminase containing pellet was dissolved with a fused silica capillary (50 mm inner diameter) as re- in 10 mL sodium phosphate buffer (50 mM, pH 7.5) contain- [6b] ing 0.1 mM PLP and 2% additive (w/v) (Table 1) prior to ported previously. lyophilization. Protein concentrations were determined using the BC assay kit (Uptima, Montlucon, France). Table 4. Details of chiral analysis.

[a] Transaminase Activity Assay Amine Temperature Retention time Retention time [8C] (S)[min] (R) [min] Activity measurements were performed in triplicate as de- scribed.[10] The pH profiles were determined using 100 mM 1b 125 3.2 3.5 Davies buffer and 10 mL (from a 10 mgmLÀ1 solution of lyo- 2b 130 3.3 3.8 philisate in deionized water) protein solution were added. 3b 140 2.7 3.0 The acetophenone assay was also used to evaluate the 4b 130 6.3 8.5 buffer acceptance. For this, seven different buffers (sodium 5b 130 2.4 2.9 phosphate 50 mM pH 8, phosphate 100 mM pH 8, Tris 6b 150 2.3 2.4 100 mM pH 8, BES 100 mM pH 8, HEPES 100 mM pH 8, 7b 170 7.6 8.3 MOPS 100 mM pH 8 and tricine 100 mM pH 8) were chosen 8b 220 7.0 6.3 and enzyme solution was added before measuring activity. 9b 185 2.3 2.5 10b 185 5.6 6.1 Asymmetric Synthesis 11b CE 3.7 3.9 12b CE 3.5 3.7

All reactions were carried out in 0.5 mL scale in 1.5 mL Ep- [a] pendorf tubes containing 30 mgmLÀ1 transaminase lyophili- For 1b–10b the retention times refer to the trifluoracetic sate, 90 U lactate dehydrogenase, 15 U glucose dehydrogen- acid derivative. ase, 50 mM 1a–13a, 250 mM d-alanine, 150 mM d-glucose, 1 mM NADH and 0.1 mM PLP in sodium phosphate buffer (100 mM, pH 7.5). While incubating at 308C and 1000 rpm Acknowledgements in an Eppendorf shaker, samples (150 mL) were taken after distinct time periods and analyzed by HPLC for conversion The authors thank the Enzymaticals AG (Greifswald, Ger- and gas chromatography or capillary electrophoresis for de- many) for the larger scale production of the amine transami- termination of optical purity (% eeP). Promising reactions nases. were repeated in triplicate with 30 mgmLÀ1 TA lyophilisate and 150 mL were analyzed by HPLC whereas another 150 mL were used for chiral analysis.

Analytical Methods References For quantitative analysis 150 mL samples of the reaction [1] M. Rçsler, R. Anand, A. Cicin-Sain, S. Gauthier, Y. mixture were supplemented with 15 mL TFA, centrifuged Agid, P. Dal-Bianco, H. B. Sthelin, R. Hartman, M. (13000 g, 5 min) and 15 mL of the supernatant were directly Gharabawi, T. Bayer, Br. Med. J. 1999, 318, 633–640. analyzed by HPLC (Hitachi LaChrom) using a reversed- [2] a) G. Bringmann, R. Weirich, H. Reuscher, J. R. phase C18 column (LiChrospher ). Detection took place Jansen, L. Kinzinger, T. Ortmann, Liebigs Ann. Chem. with a UV detector set at 245 nm and an evaporating light 1993, 1993, 877–888; b) T. R. Hoye, M. Chen, Tetrahe- scattering detector (see Supporting Information). dron Lett. 1996, 37, 3099–3100; c) Y. F. Hallock, K. P. The chiral analysis of 1–10b was performed by gas chro- Manfredi, J. W. Blunt, J. H. Cardellina, M. Schaeffer, matography. A 150 mL sample was supplemented with 15 mL K.-P. Gulden, G. Bringmann, A. Y. Lee, J. Clardy, J. 10M NaOH and extracted with 400 mL ethyl acetate. The or- Org. Chem. 1994, 59, 6349–6355.

2444 asc.wiley-vch.de 2011 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim Adv. Synth. Catal. 2011, 353, 2439 – 2445 Enzymatic Asymmetric Synthesis of Enantiomerically Pure Aliphatic, Aromatic and Arylaliphatic Amines

[3] P. Abrams, M. Speakman, M. Stott, D. Arkell, R. 9340; f) K. E. Cassimjee, C. Branneby, V. Abedi, A. Pocock, Br. J. Urol. 1997, 80, 587–596. Wells, P. Berglund, Chem. Commun. 2010, 46, 5569– [4] X. Chen, J. Chen, M. De Paolis, J. Zhu, J. Org. Chem. 5571. 2005, 70, 4397–4408. [7] C. K. Savile, J. M. Janey, E. C. Mundorff, J. C. Moore, [5] M. Hçhne, U. Bornscheuer, ChemCatChem 2009, 1, 42– S. Tam, W. R. Jarvis, J. C. Colbeck, A. Krebber, F. J. 51. Fleitz, J. Brands, P. N. Devine, G. W. Huisman, G. J. [6] a) D. Koszelewski, I. Lavandera, D. Clay, D. Rozzell, Hughes, Science 2010, 329, 305–309. W. Kroutil, Adv. Synth. Catal. 2008, 350, 2761–2766; [8] M. Hçhne, S. Schtzle, H. Jochens, K. Robins, U. T. b) M. Hçhne, S. Khl, K. Robins, U. T. Bornscheuer, Bornscheuer, Nat. Chem. Biol. 2010, 6, 807–813. ChemBioChem 2008, 9, 363–365; c) G. B. Matcham, L. [9] S. Schtzle, M. Hçhne, K. Robins, U. T. Bornscheuer, Mohit, L. Wei, N. Craig, R. Nelson, A. Wang, W. Wu, Anal. Chem. 2010, 82, 2082–2086. Chimia 1999, 53, 584–589; d) M. D. Truppo, J. D. Roz- [10] S. Schtzle, M. Hçhne, E. Redestad, K. Robins, U. T. zell, N. J. Turner, Org. Process Res. Dev. 2010, 14, 234– Bornscheuer, Anal. Chem. 2009, 81, 8244–8248. 237; e) D. Koszelewski, I. Lavandera, D. Clay, G. M. [11] J. S. Shin, H. Yun, J. W. Jang, I. Park, B. G. Kim, Appl. Guebitz, D. Rozzell, W. Kroutil, Angew. Chem. 2008, Microbiol. Biotechnol. 2003, 61, 463–471. 120, 9477–9480; Angew. Chem. Int. Ed. 2008, 47, 9337– [12] M. T. Davies, Analyst 1959, 84, 248–251.

Adv. Synth. Catal. 2011, 353, 2439 – 2445 2011 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim asc.wiley-vch.de 2445 The Affirmation

Hiermit erkläre ich, dass diese Arbeit bisher von mir weder an der Mathematisch- Naturwissenschaftlichen Fakultät der Ernst-Moritz-Arndt-Universität Greifswald noch einer anderen wissenschaftlichen Einrichtung zum Zwecke der Promotion eingereicht wurde.

Ferner erkläre ich, dass ich diese Arbeit selbständig verfasst und keine anderen als die darin angegebenen Hilfsmittel und Hilfen benutzt und keine Textabschnitte eines Dritten ohne Kennzeichnung übernommen habe.

- 83 -

My Vita

CURRICULUM VITA

PHD IN BIOCHEMISTRY since 10/2008 PhD thesis in the group of Prof. Bornscheuer (Dept. of Biotechnology & Enzyme Catalysis) at the Institute of Biochemistry, University of Greifswald, Germany

Identification, Characterization and Application of novel (R)-selective Amine Transaminases

10/2010-12/2010 Research Internship in the group of Prof. Kittikun, Prince of Songkla University, Hat Yai, Thailand

DIPLOMA IN BIOCHEMISTRY

10/2007-07/2008 Diploma thesis in the group of Prof. Bornscheuer at the Institute of Biochemistry, University of Greifswald, Germany

Investigation of Nitrile Hydratases

04/2006-06/2007 Student assistant for Dr. Rainer Wardenga at the Dept. of Biotechnology & Enzyme Catalysis” – Enzymatic Synthesis of amino acid surfactants

10/2006-11/2006 Industrial placement with Dr. Frank Hollmann, Goldschmidt GmbH (Evonik), Essen, Germany – Organic Synthesis with esterases & lipases

10/2005-03/2006 Student assistant in the group of Inorganic Chemistry as supervisor of the practical course Qualitative Inorganic Analytics

CIVILLIAN SERVICE

06/2002-03/2003 Caritasverband, City of Oberhausen, Franziskus-Haus Caregiver for People with Disabilities

04/2003-08/2003 Temporary employee as Caregiver for People with Disabilities (see above)

SCHOOL

09/1993-05/2002 Heinrich-Heine-Gymnasium, Bottrop, Germany

- 84 -

The Acknowledgements

ACKNOWLEDGEMENTS

Mein erstes und größtes Dankeschön gilt Uwe. Vielen Dank für Dein Vertrauen in unsere Ideen und unsere Arbeit sowie den damit verbundenen Freiheiten. Ich habe in den letzten Jahren sehr, sehr viel von Dir gelernt und hoffe, dass mir möglichst viel davon noch lange in Erinnerung bleibt. Besonderen Dank auch für meine tolle Zeit im Ausland und vor allem für die mit großem Fingerspitzengefühl gesetzten dead-lines und all Deine Hilfe.

Das zweite und ähnlich große Dankeschön geht an Matthias. Vielen, lieben Dank für die hervorragende Zusammenarbeit, Deine Ideen und Anregungen. Ich glaube das mit uns hat so prima funktioniert, weil sich unsere doch recht unterschiedlichen Charaktere so vorzüglich ergänzt haben.

Das letzte Mitglied der “einflussreichen Drei” ist Onkel Rainer. Du hast mich ja als Hiwi nicht nur gequält und schikaniert, sondern mir vor allem unglaublich viel beigebracht – sowohl praktisch als auch theoretisch. Somit hast Du mich wahrscheinlich “wissenschaftlich“ gesehen am meisten geprägt. Noch viel mehr möchte ich mich aber für die Zeit danach bedanken, nämlich dafür, dass Du mir wirklich immer geholfen hast. Dankeschön.

Weiterhin bedanke ich mich bei allen Transaminase-Jungs, also bei Fabian, Hannes, Ahmad und Erik. Vielen Dank für die tolle Zusammenarbeit und die schöne Zeit im Labor. Und natürlich für Eure Mühen und vor allem die guten Ergebnisse. Erik, thank you so much for the great teamwork and – much more important – for the occasional beer after work.

Bei der Lonza AG und insbesondere bei Karen bedanke ich mich für die sowohl finanzielle als auch inhaltliche Unterstützung dieses Projekts. Vielen Dank für Dein großes Interesse an unserer Arbeit und vor allem für Dein Vertrauen und all Deine Mühen. Und natürlich für die ganzen Gene.

Selbstverständlich möchte ich mich auch bei all den anderen netten Menschen aus dem Arbeitskreis bedanken. Ganz besonders bei Howie, Marlen, Helge und dem Pückerer. Jedes Käffchen mit Euch war mir eine Freude. Stefan, Maria und Hendrik danke ich für die schöne Atmosphäre im Labor. Ina, Angelika, Dominique, Anke und Anita danke ich dafür, dass Ihr Euch immer um alles kümmert und deshalb alles so wunderbar funktioniert.

Und zu guter Letzt: meine liebe Familie. Euch danke ich von ganzem Herzen für Euer Vertrauen und Eure Geduld. Für die Geduld natürlich ganz besonders Dir, lieber Opa. Tut mir Leid, dass ich Dich so lange hab‘ warten lassen. Vielen Dank, dass Ihr immer für mich da seid und für Euer Verständnis, wenn ich mal nicht da sein konnte.

- 85 -