(19) United States
(12) Patent Application Publication
     Fasan et al.
(10) Pub. No.: US 2009/0061471 A1
(43) Pub. Date: Mar. 5, 2009

(54) METHODS AND SYSTEMS FOR SELECTIVE FLUORINATION OF ORGANIC MOLECULES
(76) Inventors: Rudi Fasan, Brea, CA (US); Frances H. Arnold, La Canada, CA (US)
     Correspondence Address: Joseph R. Baker, APC, Gavrilovich, Dodd & Lindsey LLP, 4660 La Jolla Village Drive, Suite 750, San Diego, CA 92122 (US)
(21) Appl. No.: 11/890,218
(22) Filed: Aug. 4, 2007

Related U.S. Application Data
(60) Provisional application No. 60/835,613, filed on Aug. 4, 2006.

Publication Classification
(51) Int. Cl.: C12Q 1/26 (2006.01); C12P 7/62 (2006.01); C12P 7/38 (2006.01); C12P 17/04 (2006.01); C12P 17/14 (2006.01); C12P 19/44 (2006.01)
(52) U.S. Cl.: 435/25; 435/135; 435/149; 435/126; 435/120; 435/74

(57) ABSTRACT
A method and system for selectively fluorinating organic molecules on a target site, wherein the target site is activated and then fluorinated, are shown together with a method and system for identifying a molecule having a biological activity.
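As a rough illustration of the two-step sequence summarized in the abstract (activation of the target site, then fluorination), the following Python sketch models a target site as a carbon atom carrying a single substituent. All names (`TargetSite`, `activate`, `fluorinate`, `selective_fluorination`) are hypothetical and introduced only for illustration. The sketch is a bookkeeping model under these assumptions and does not describe any actual enzyme or reagent.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TargetSite:
    """Toy model (hypothetical): one carbon atom and the substituent it bears."""
    label: str
    substituent: str  # "H", "OH" or "F" in this sketch


def activate(site: TargetSite) -> TargetSite:
    """Step 1 (assumed model): oxidation turns a C-H target site into C-OH."""
    if site.substituent != "H":
        raise ValueError(f"{site.label}: only a C-H site is activated in this sketch")
    return replace(site, substituent="OH")


def fluorinate(site: TargetSite) -> TargetSite:
    """Step 2 (assumed model): the oxygen-containing group is replaced by fluorine."""
    if site.substituent != "OH":
        raise ValueError(f"{site.label}: site must be activated before fluorination")
    return replace(site, substituent="F")


def selective_fluorination(site: TargetSite) -> TargetSite:
    """Run activation and then fluorination on one predetermined site."""
    return fluorinate(activate(site))


if __name__ == "__main__":
    # "C4" is an arbitrary label chosen for the example.
    print(selective_fluorination(TargetSite(label="C4", substituent="H")))
```

The ordering check in `fluorinate` mirrors the point that, in this model, a site is fluorinated only after an oxygen-containing group has been introduced on it.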

[Representative drawing: reaction scheme showing oxygenase-catalyzed oxygenation (Oxygenase 1 to 3) followed by deoxo-fluorination]

Patent Application Publication Mar. 5, 2009 Sheet 1 of 8 US 2009/0061471 A1

[Reaction scheme: oxygenase-catalyzed oxygenation of the substrate, followed by deoxo-fluorination of the resulting hydroxyl groups]

FIG. 1

Chemo-enzymatic strategy: oxygenase (OH), then deoxo-fluorination (F). High regio- and stereoselectivity; highly enantiopure fluoro derivative in good yields.

Chemical strategy: fluorination, then chiral resolution. Poor stereoselectivity with current methods; enantiopure fluoro-derivative in poor yields.

FIG. 2

[Structure drawing; legible labels: D helix, L helix]

FIG. 3


[Panel B: "Regioselectivity in MMPO activity (GC analysis)"; bar chart, percentage axis; series MMPOH; categories include var3-19, var3-18, var3-8, var2, var3-5, var3-16, var3-6, var3-11]

[Panel C: bar chart, percentage axis; categories var3-19, var3-18, var3-8, var2, var3-5, var3-16, var3-6, var3-11]

FIG. 7 (Cont'd)

[Plot: "Whole-cell activation of dihydrojasmone (DHJ)"; x-axis: time (hrs), 0 to 40; trace labeled oxDHJ]

FIG. 8
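The GC-derived quantities reported for the activation screens (conversion ratio, and the share of the desired activated product such as MMPOH in FIG. 7) can be sketched numerically as follows. The formulas are assumptions made for illustration, since the source does not state how the values were computed: conversion is taken as the product share of the total peak area, and regioselectivity as the desired product's share of all product peak area, with equal detector response assumed for every compound. The function names and the peak areas are hypothetical.

```python
from typing import Mapping


def conversion_ratio(substrate_area: float, product_areas: Mapping[str, float]) -> float:
    """Assumed definition: product peak area / (substrate + product peak area)."""
    total_products = sum(product_areas.values())
    total = substrate_area + total_products
    if total <= 0:
        raise ValueError("no peak area recorded")
    return total_products / total


def regioselectivity(product_areas: Mapping[str, float], desired: str) -> float:
    """Assumed definition: desired product peak area / total product peak area."""
    total_products = sum(product_areas.values())
    if total_products <= 0:
        raise ValueError("no product peaks recorded")
    return product_areas.get(desired, 0.0) / total_products


if __name__ == "__main__":
    # Hypothetical peak areas (arbitrary units), not data from the source.
    areas = {"MMPOH": 45.0, "other products": 15.0}
    print(f"conversion: {conversion_ratio(40.0, areas):.0%}")
    print(f"regioselectivity: {regioselectivity(areas, 'MMPOH'):.0%}")
```

Keeping the two quantities separate reflects the way the screens are described: a variant can convert most of the substrate and still distribute the product over several activated compounds.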

METHODS AND SYSTEMS FOR SELECTIVE FLUORINATION OF ORGANIC MOLECULES

CROSS REFERENCE TO RELATED APPLICATIONS

[0001] This application claims priority to U.S. Provisional Application Ser. No. 60/835,613 filed on Aug. 4, 2006, the disclosure of which is incorporated herein by reference in its entirety.

TECHNICAL FIELD

[0002] The present disclosure generally relates to the fields of synthetic organic chemistry and pharmaceutical chemistry. In particular, the present disclosure relates to methods and systems for the selective fluorination of organic molecules.

BACKGROUND

[0003] The importance of fluorine in altering the physicochemical properties of organic molecules and its exploitation in medicinal chemistry has been highlighted in recent reviews (Bohm, Banner et al. 2004). Although fluorine is similar in size to hydrogen, H→F substitutions can cause dramatic effects on several properties of organic molecules, including the lipophilicity, dipole moment, and pKa thereof. In addition, fluorine substitutions can dramatically alter the reactivity of the fluorinated site as well as that of neighboring functional groups.

[0004] In particular, in medicinal chemistry, there is a growing interest in incorporating fluorine atoms in building blocks, lead compounds and drugs, because this may increase many-fold the chances of turning these molecules into marketable drugs. Several studies have shown that potent drugs can be obtained through fluorination of much less active precursors. Some representative examples include anticholesterolemic Ezetimibe (Clader 2004), anticancer CF3 taxanes (Ojima 2004), fluoro-steroids, and antibacterial fluoroquinolones.

[0005] The improved pharmacological properties of fluoro-containing drugs are often due to their improved pharmacokinetic properties (biodistribution, clearance) and enhanced metabolic stability (Park, Kitteringham et al. 2001). Primary metabolism of drugs in humans generally occurs through P450-dependent systems, and the introduction of fluorine atoms at or near the sites of metabolic attack has often proven successful in increasing the half-life of a compound (Bohm, Banner et al. 2004). A comprehensive review covering the influence of fluorination on drug metabolism (especially P450-dependent) is presented in (Park, Kitteringham et al. 2001).

[0006] In other cases, the introduction of fluorine substituents leads to improvements in the pharmacological properties as a result of enhanced binding affinity of the molecule to biological receptors. Examples of the effect of fluorine on binding affinity are provided by recent results in the preparation of NK1 antagonists (Swain and Rupniak 1999), 5HT1D agonists (van Niel, Collins et al. 1999), and PTB1B antagonists (Burke, Ye et al. 1996).

[0007] Over the past years, fluorination has played an increasingly important role in drug discovery, as exemplified by the development of fluorinated derivatives of the anticancer drugs paclitaxel and docetaxel (Ojima 2004).

[0008] However, only a handful of organofluorine compounds occur in nature, and even those only in very small amounts (Harper and O'Hagan 1994). Consequently, any fluorine-containing substance selected for research, pharmaceutical, or agrochemical application has to be man-made.

[0009] Although there are a few reports on the application of molecular fluorine (F2) for direct fluorination of organic compounds (Chambers, Skinner et al. 1996; Chambers, Hutchinson et al. 2000), this method typically suffers from poor selectivity and requires handling of a highly toxic and gaseous reagent. Several chemical strategies have been developed over the past decades to afford selective fluorination of organic compounds under friendlier conditions. These have been recently reviewed by Togni (Togni, Mezzetti et al. 2001), Cahard (Ma and Cahard 2004), Sodeoka (Hamashima and Sodeoka 2006), and Gouverneur (Bobbio and Gouverneur 2006). These strategies involve catalytic as well as non-catalytic methods. The latter comprise substrate-controlled fluorination methods, which generally make use of a chiral auxiliary, and reagent-controlled fluorination methods, which generally make use of chiral electrophilic N-F or nucleophilic fluorinating reagents.

[0010] These fluorination methods, however, need several chemical steps to prepare the chiral substrates (Davis and Han 1992; Enders, Potthoff et al. 1997) or the chiral reagents (Davis, Zhou et al. 1998; Taylor, Kotoris et al. 1999; Nyffeler, Duron et al. 2005) and have an applicability restricted to reactive C-H bonds (Cahard, Audouard et al. 2000; Shibata, Suzuki et al. 2000; Kim and Park 2002; Beeson and MacMillan 2005; Marigo, Fielenbach et al. 2005) in specific classes of compounds such as aldehydes (Beeson and MacMillan 2005; Marigo, Fielenbach et al. 2005) or di-carbonyls (Hintermann and Togni 2000; Ma and Cahard 2004; Shibata, Ishimaru et al. 2004; Hamashima and Sodeoka 2006).

[0011] Despite much progress in the field of organofluorine chemistry, the number of available methods for direct or indirect asymmetric synthesis of organofluorine compounds remains limited and additional tools are desirable. In particular, a general method to afford mono- or poly-fluorination of organic compounds at reactive and unreactive sites of their molecular scaffold is desirable.

SUMMARY

[0012] Provided herein are methods and systems for the selective fluorination of a target site of an organic molecule, which include the activation and subsequent fluorination of the target site. In the methods and systems herein disclosed, the target site is an oxidizable carbon atom of the organic molecule, the activation is performed by introducing an oxygen-containing functional group on the target site, and the fluorination of the activated site is performed by replacing the functional group introduced on the target site with fluorine. The introduction of the oxygen-containing functional group and the replacement of the functional group with a fluorine can be performed by suitable agents.

[0013] According to a first aspect, a method for fluorinating an organic molecule is disclosed, the method comprising: providing an organic molecule comprising a target site; providing an oxidizing agent that oxidizes the organic molecule by introducing an oxygen-containing functional group on the target site; contacting the oxidizing agent with the organic molecule for a time and under conditions to allow introduction of the oxygen-containing functional group on the target site, thus providing an oxygenated organic molecule; providing a fluorinating agent; and contacting the fluorinating agent with the oxygenated organic molecule for a time and under conditions to allow replacement of the oxygen-containing functional group with fluorine.

[0014] According to a second aspect, a system for the fluorination of an organic molecule is disclosed, the system comprising an oxidizing agent for introducing an oxygen-containing functional group in an organic molecule and a fluorinating agent for replacing the oxygen-containing functional group in the organic molecule with fluorine or a fluorine group. An oxygen-providing compound and/or fluorine-providing compound can also be included in the system.

[0015] A first advantage of the methods and systems herein disclosed is to allow the fluorination of organic molecules in one or more specific and predetermined target sites, including one or more target sites of interest, thus allowing regioselective mono- and poly-fluorination.

[0016] A second advantage of the methods and systems herein disclosed is to allow introduction of fluorine in a fluorine-unreactive site of a molecule, i.e. a site that, in the absence of the oxygen-containing functional group, is unlikely to undergo a chemical transformation such as a fluorination, as long as said site is oxidizable.

[0017] A third advantage of the methods and systems herein disclosed is that by using a suitable agent, in particular a suitable oxidizing agent, it is possible to control the chirality of the final product and therefore produce a product molecule having a desired chirality (stereoselective fluorination).

[0018] A fourth advantage of the methods and systems herein disclosed is that the methods and systems can provide fluorinated compounds wherein the fluorine is introduced in a predetermined site expected to be associated with a biological activity, which are therefore candidate compounds to be screened for the activity.

[0019] According to a third aspect, a method for the identification of a molecule having a biological activity is disclosed, the method comprising: providing an organic molecule comprising a target site; providing an oxidizing agent; contacting the oxidizing agent with the organic molecule for a time and under conditions to allow introduction of an oxygen-containing functional group on the target site, thus providing an oxygenated organic molecule; providing a fluorinating agent; contacting the fluorinating agent with the oxygenated organic molecule for a time and under conditions to allow replacement of the oxygen-containing functional group with fluorine; and testing the fluorinated organic molecule for the biological activity.

[0020] According to a fourth aspect, a system for identifying a molecule having a biological activity is disclosed. The system comprises an oxidizing agent capable of introducing an oxygen-containing functional group in a target site of an organic molecule, a fluorinating agent capable of replacing the oxygen-containing functional group in the organic molecule with fluorine, and an agent for testing the biological activity. An oxygen-providing agent and/or fluorine-providing agent can also be included in the system.

[0021] A further advantage of the methods and systems for the identification of a molecule having a biological activity is the possibility to produce a broad spectrum of molecules that, in view of the selected insertion of fluorine, already constitute promising candidates, thus shortening and improving the selection process.

[0022] An additional advantage of the methods and systems for the identification of a molecule having a biological activity is the possibility to confer new activities to a molecule that is already biologically active and/or to improve the biological activity of the original molecule by selective insertion of fluorine.

[0023] A still further advantage of the methods and systems for the identification of a molecule having a biological activity is the possibility to derive molecules that have a biological activity that is pharmacologically relevant, or to improve the pharmacological activity of a molecule that is already pharmacologically active, in view of the known ability of fluorine to improve the pharmacological profile of drugs.

[0024] The details of one or more embodiments of the disclosure are set forth in the accompanying drawings and the description below. Other features, objects, and advantages will be apparent from the description and drawings, and from the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

[0025] The accompanying drawings, which are incorporated into and constitute a part of this specification, illustrate one or more embodiments of the present disclosure and, together with the detailed description, serve to explain the principles and implementations of the disclosure.

[0026] In the drawings:

[0027] FIG. 1 is a schematic representation of the methods and systems for the selective fluorination of an organic molecule A according to an embodiment of what is herein disclosed;

[0028] FIG. 2 is a schematic representation of methods and systems for stereoselective fluorination of an organic molecule A according to an embodiment of what is herein disclosed (chemo-enzymatic strategy), illustrated in comparison with methods and systems of the art (chemical strategy);

[0029] FIG. 3 is a graphic representation of the crystal structure of a P450 heme domain; helices D, L, I and E in the domain are also indicated; the heme prosthetic group in the domain is indicated as "heme"; the cysteine in the heme-ligand loop is displayed in spheres (black);

[0030] FIG. 4 is a schematic representation of methods and systems for identifying a molecule having a biological activity according to an embodiment disclosed in the present specification;

[0031] FIG. 5 illustrates exemplary results from the screening of a subset of pre-selected oxygenases for the identification of a suitable oxidizing agent for the selective activation of the organic molecule dihydrojasmone. Panel A) is a diagram showing the conversion ratios for the reaction of activation of dihydrojasmone with wild-type P450 and variants thereof, as determined by GC analysis; Panel B) is a diagram showing the product distribution obtained with wild-type P450 and variants thereof in the reaction of activation of dihydrojasmone, as determined by GC analysis. Cpd 1 to cpd 9 indicate activated products 1 to 9;

[0032] FIG. 6 illustrates exemplary results from the screening of a subset of pre-selected oxygenases for the identification of a suitable oxidizing agent for the selective activation of the organic molecule menthofuran. Panel A) is a diagram showing the conversion ratios for the reaction of activation of menthofuran with wild-type P450 and variants thereof, as determined by GC analysis; Panel B) is a diagram showing the product distribution obtained with wild-type P450 and variants thereof in the reaction of activation of menthofuran, as determined by GC analysis. Cpd 1 to cpd 10 indicate activated products 1 to 10;

[0033] FIG. 7 illustrates exemplary results from the screening of a subset of pre-selected oxygenases for the identification of a suitable oxidizing agent for the selective activation of the organic molecule dihydro-4-methoxymethyl-2-methyl-5-phenyl-2-oxazoline (MMPO). In particular, Panel (A) is a diagram showing the results from HTS screening of the oxidizing agent pool using the colorimetric reagent Purpald; Panel (B) is a diagram showing the results from the re-screen of the positive hits identified with colorimetric HTS, where the regioselectivity of each oxygenase is determined by GC analysis (MMPOH is dihydro-4-hydroxymethyl-2-methyl-5-phenyl-2-oxazoline, that is, the desired activated product); Panel (C) is a diagram showing the conversion ratios for the activation reactions of MMPO with each of the tested oxidizing agents, as determined by GC analysis; and

[0034] FIG. 8 shows a diagram illustrating the time course for whole-cell activation of the organic molecule dihydrojasmone (DHJ) using batch culture of var3-expressing E. coli DH5α cells (0.5 L). The consumption of substrate (DHJ) and the accumulation of the desired activated product (oxDHJ) were monitored over time by GC analysis of aliquots of the cell culture.

DETAILED DESCRIPTION

[0035] Methods and systems for the selective fluorination of a predetermined target site of an organic molecule are herein disclosed. In these methods and systems, the predetermined target site is first activated by an oxidizing agent that introduces an oxygen-containing functional group in the target site, and then fluorinated by a fluorinating agent that replaces the oxygen-containing functional group with fluorine or a fluorine group. In particular, activation and fluorination of an organic molecule can be performed as schematically illustrated in FIGS. 1 and 2, FIG. 2 also showing activation and fluorination of an organic molecule performed according to some embodiments herein disclosed, in comparison with chemical methods and systems of the art.

[0036] The term "target site" as used herein refers to an oxidizable C atom, i.e. a C atom in the organic molecule that bears an oxidizable bond. Examples of oxidizable bonds include but are not limited to a C-H bond, a C=C double bond, and a C-X bond, single or double, where X is a heteroatom independently selected from the group consisting of B (boron), O (oxygen), P (phosphorus), N (nitrogen), S (sulphur), Si (silicon), Se (selenium), F (fluorine), Cl (chlorine), Br (bromine), and I (iodine).

[0037] The terms "activate" and "activation" as used herein with reference to a target site indicate a chemical reaction resulting in an enhanced reactivity of the C atom that forms the site, so that said C atom acquires or improves its ability to undergo a chemical transformation, more specifically a fluorination reaction. For example, the insertion of an oxygen atom in a target site bearing a C-H bond, resulting in the formation of a hydroxyl group (C-OH) on the site, activates the target site for a deoxofluorination reaction. As a further example, the insertion of an oxygen atom in a target site bearing a C=C double bond, resulting in the formation of an epoxy group, activates the site for a ring-opening fluorination reaction. Accordingly, the wording "activated site" as used herein refers to a C atom of an organic molecule that, following activation, has acquired or improved its ability to undergo a chemical transformation, and in particular a fluorination reaction when contacted with a fluorine.

[0038] The term "contact" as used herein with reference to interactions of chemical units indicates that the chemical units are at a distance that allows short-range non-covalent interactions (such as Van der Waals forces, hydrogen bonding, hydrophobic interactions, electrostatic interactions, dipole-dipole interactions) to dominate the interaction of the chemical units. For example, when an oxygenase is contacted with a target molecule, the enzyme is allowed to interact with and bind to the organic molecule through non-covalent interactions so that a reaction between the enzyme and the target molecule can occur.

[0039] The wording "chemical unit" identifies single atoms as well as groups of atoms connected by a chemical bond. Exemplary chemical units herein described include, but are not limited to, a fluorine atom; chemical groups such as oxygen-containing chemical groups and fluorine-containing groups; organic molecules or portions thereof, including target sites; and chemical agents, including oxidizing agents and fluorinating agents.

[0040] The term "agent" as used herein refers to a chemical unit that is capable of causing the chemical reaction specified in the identifier accompanying the term. Accordingly, an "oxidizing agent" is an agent capable of causing an oxygenation reaction of a suitable substrate and a "fluorinating agent" is an agent capable of causing a fluorination reaction of a suitable substrate. An oxygenation reaction is a chemical reaction in which one or more oxygen atoms are inserted into one or more pre-existing chemical bonds of said substrate. A fluorination reaction is a chemical reaction in which a substituent connected to an atom in said substrate is replaced by fluorine.

[0041] The term "introducing" as used herein with reference to the interaction between two chemical units, such as a functional group and a target site, indicates a reaction resulting in the formation of a bond between the two chemical units, e.g. the functional group and the target site.

[0042] The term "functional group" as used herein refers to a chemical unit within a molecule that is responsible for a characteristic chemical reaction of that molecule. An "oxygen-containing functional group" is a functional group that comprises an oxygen atom. Exemplary oxygen-containing functional groups include but are not limited to a hydroxyl group (-OH), ether group (-OR), carbonyl oxygen (=O), hydroperoxy group (-OOH), and peroxy group (-OOR).

[0043] The terms "replace" and "replacement" as used herein with reference to chemical units indicate formation of a chemical bond between the chemical units in place of a pre-existing bond in at least one of said chemical units. In particular, replacing an oxygen-containing functional group on the target site with a fluorine or fluorine group indicates the formation of a bond between the target site and the fluorine or fluorine group in place of the bond between the target site and the oxygen-containing functional group.

[0044] Any organic molecule that includes at least one target site, i.e. at least one oxidizable C atom, and is a substrate of at least one oxidizing agent, can be used as an organic molecule to be fluorinated according to the methods and systems herein disclosed.

[0045] In some embodiments, the oxidizing agent is an enzyme such as an oxygenase or oxidizing enzyme that is able to introduce an oxygen-containing functional group in the target site of the organic molecule using an oxygen source such as molecular oxygen (O2), hydrogen peroxide (H2O2), a hydroperoxide (R-OOH), or a peroxide (R-O-O-R), including the oxygenases with an Enzyme Classification (EC) number typically corresponding to EC 1.13 or EC 1.14. Suitable oxygenases for the systems and methods herein described include but are not limited to monooxygenases, dioxygenases, peroxygenases, and peroxidases. In particular, monooxygenases and peroxygenases can be used to introduce on the target site an oxygen-containing functional group that comprises one oxygen atom, dioxygenases can be used to introduce on the target site an oxygen-containing functional group that comprises two oxygen atoms, and peroxidases can be used to introduce on the target site an oxygen-containing functional group that comprises one or two oxygen atoms.

[0046] In some embodiments, the oxygenases are wild-type oxygenases and in some embodiments the oxygenase is a mutant or variant. An oxygenase is wild-type if it has the structure and function of an oxygenase as it exists in nature. An oxygenase is a mutant or variant if it has been mutated from the oxygenase as it exists in nature and provides an oxygenase enzymatic activity.

[0047] In some embodiments, the variant oxygenase provides an enhanced oxygenase enzymatic activity compared to the corresponding wild-type oxygenase. In some embodiments, the variant oxygenases maintain the binding specificity of the corresponding wild-type oxygenase; in other embodiments the variant oxygenases disclosed herein are instead bindingly distinguishable from the corresponding wild-type and bindingly distinguishable from one another. The wording "bindingly distinguishable" as used herein with reference to molecules indicates molecules that are distinguishable based on their ability to specifically bind to, and are thereby defined as complementary to, a specific molecule. Accordingly, a first oxygenase is bindingly distinguishable from a second oxygenase if the first oxygenase specifically binds and is thereby defined as complementary to a first substrate and the second oxygenase specifically binds and is thereby defined as complementary to a second substrate, with the first substrate distinct from the second substrate. In some embodiments, the variant oxygenase herein disclosed has an increased enzyme half-life in vivo, a reduced antigenicity, and/or an increased storage stability when compared to the corresponding wild-type oxygenase.

[0048] In some embodiments, the oxygenase is a heme-containing oxygenase or a variant thereof. The wording "heme" or "heme domain" as used herein refers to an amino acid sequence within an oxygenase, which is capable of binding an iron-complexing structure such as a porphyrin. Compounds of iron are typically complexed in a porphyrin (tetrapyrrole) ring that may differ in side chain composition. Heme groups can be the prosthetic groups of cytochromes and are found in most oxygen carrier proteins. Exemplary heme domains include that of P450 as well as truncated or mutated versions of these that retain the capability to bind the iron-complexing structure. A skilled person can identify the heme domain of a specific protein using methods known in the art. Exemplary organic molecules that can be oxidized by heme-containing oxygenases include Cs-C alkanes, fatty acids, steroids, terpenes, aromatic hydrocarbons, polyketides, prostaglandins, terpenes, statins, amino acids, flavonoids, and stilbenes.

[0049] In particular, in some embodiments the "heme-containing oxygenase" is a cytochrome P450 enzyme (herein also indicated as CYPs or P450s) or a variant thereof. The wording "P450 enzymes" indicates a group of heme-containing oxygenases that share a common overall fold and topology despite less than 20% sequence identity across the corresponding gene superfamily (Denisov, Makris et al. 2005). In particular, the P450 enzymes share a conserved P450 structural core, which binds to the heme group and comprises a P450 signature sequence. The conserved P450 structural core is formed by a four-helix bundle composed of three parallel helices (usually labeled D, L, and I) and one antiparallel helix (usually labeled as helix E) (Presnell and Cohen 1989), and by a Cys heme-ligand loop which includes a conserved cysteine that binds to the heme group and the P450 signature. In particular, the conserved cysteine that binds to the heme group is the proximal or "fifth" ligand to the heme iron, and the relevant ligand group (a thiolate) is the origin of the characteristic, name-giving 450-nm Soret absorbance observed for the ferrous-CO complex (Pylypenko and Schlichting 2004). The P450 signature sequence is the sequence indicated in the enclosed sequence listing as SEQ ID NO: 1. FIG. 3 is a representation of the P450 structural core of bacterial P450. In the illustration of FIG. 3, the prosthetic heme group (heme) is located between the distal I helix (helix I) and proximal L helix (helix L) and is bound to the adjacent Cys heme-ligand loop containing the P450 signature sequence SEQ ID NO: 1. Helices D and E are also indicated in FIG. 3.

[0050] P450 enzymes are known to be involved in metabolism of exogenous and endogenous compounds. In particular, P450 enzymes can act as terminal oxidases in multicomponent electron transfer chains, called here P450-containing systems. Reactions catalyzed by cytochrome P450 enzymes include hydroxylation, epoxidation, N-dealkylation, O-dealkylation, S-oxidation and other less common transformations. The most common reaction catalyzed by P450 enzymes is the monooxygenase reaction using molecular oxygen (O2), where one atom of oxygen is inserted into a substrate while the other is reduced to water.

[0051] P450 monooxygenases can catalyze the monooxygenation of a variety of structurally diverse substrates. Exemplary substrates that can be oxidized by naturally-occurring P450s include Cs-C alkanes, cyclic alkanes, cyclic alkenes, alkane derivatives, alkene derivatives, Co-Co fatty acids, steroids, terpenes, aromatic hydrocarbons, natural products and natural product analogues such as polyketides, prostaglandins, thromboxanes, leukotrienes, anthraquinones, tetracyclines, anthracyclines, polyenes, statins, amino acids, flavonoids, stilbenes, alkaloids (e.g. lysine-derived, nicotinic acid-derived, tyrosine-derived, tryptophan-derived, anthranilic acid-derived, histidine-derived, purine-derived alkaloids), beta-lactams, aminoglycosides, polymyxins, quinolones, synthetic derivatives such as aromatic heterocyclic derivatives (e.g. phenyl-, pyrimidine-, pyridine-, piperidine-, pyrrole-, furan-, triazol-, thiophene-, pyrazole-, imidazole-, tetrazole-, oxazole-, isoxazole-, thiazole-, isothiazole-, pyran-, pyridazine-, pyrazine-, piperazine-, thiazine-, and oxazine-derivatives), and the like.

[0052] Naturally-occurring P450 monooxygenases have also been mutated in their primary sequence to favor their activity towards other non-native substrates such as short-chain fatty acids, 8- and 12-pNCA, indole, aniline, p-nitrophenol, polycyclic hydrocarbons (e.g. indole, naphthalene), styrene, medium- and short-chain alkanes, alkenes (e.g. cyclohexene, 1-hexene, styrene, benzene), quinoline, steroid derivatives, and various drugs (e.g. chlorzoxazone, propranolol, amodiaquine, dextromethorphan, acetaminophen, ifosfamide, cyclophosphamide, benzphetamine, buspirone, MDMA).

[0053] P450 monooxygenases suitable in the methods and systems herein disclosed include cytochrome P450 monooxygenases (EC 1.14.14.1) from different sources (bacterial, fungi, yeast, plant, mammalian, and human), and variants thereof. Exemplary P450 monooxygenases suitable in the methods and systems herein disclosed include members of CYP102A subfamily (e.g. CYP102A1, CYP102A2, CYP102A3, CYP102A5), members of CYP101A subfamily (e.g. CYP101A1), members of CYP102E subfamily (e.g. CYP102E1), members of CYP1A subfamily (e.g. CYP1A1, CYP1A2), members of CYP2A subfamily (e.g. CYP2A3, CYP2A4, CYP2A5, CYP2A6, CYP2A12, CYP2A13), members of CYP1B subfamily (e.g. CYP1B1), members of CYP2B subfamily (e.g. CYP2B6), members of CYP2C subfamily (e.g. CYP2C8, CYP2C9, CYP2C10C, CYP2C18, CYP2C19), members of CYP2D subfamily (e.g. CYP2D6), members of CYP3A subfamily (e.g. CYP3A4, CYP3A5, CYP3A7, CYP3A43), members of CYP107A subfamily (e.g. CYP107A1), and members of CYP153 family (e.g. CYP153A1, CYP153A2, CYP153A6, CYP153A7, CYP153A8, CYP153A11, CYP153D3, and CYP153D2 (van Beilen and Funhoff 2007)). Exemplary organic molecules oxidizable by P450 monooxygenases include Cs-C alkanes, cyclic alkanes, cyclic alkenes, alkane derivatives, alkene derivatives, Co-Co fatty acids, steroids, terpenes, aromatic hydrocarbons, natural products and natural product analogues such as polyketides, prostaglandins, thromboxanes, leukotrienes, anthraquinones, tetracyclines, anthracyclines, polyenes, statins, amino acids, flavonoids, stilbenes, alkaloids (e.g. lysine-derived, nicotinic acid-derived, tyrosine-derived, tryptophan-derived, anthranilic acid-derived, histidine-derived, purine-derived alkaloids), beta-lactams, aminoglycosides, polymyxins, quinolones, synthetic derivatives such as aromatic heterocyclic derivatives (e.g. phenyl-, pyrimidine-, pyridine-, piperidine-, pyrrole-, furan-, triazol-, thiophene-, pyrazole-, imidazole-, tetrazole-, oxazole-, isoxazole-, thiazole-, isothiazole-, pyran-, pyridazine-, pyrazine-, piperazine-, thiazine-, and oxazine-derivatives), and the like.

[0054] Other exemplary P450 monooxygenases suitable in the methods and systems herein disclosed include CYP106A2, CYP2F1, CYP2J2, CYP2R1, CYP2S1, CYP2U1, CYP2W1, CYP4A11, CYP4A22, CYP4B1, CYP4F2, CYP4F3, CYP4F8, CYP4F11, CYP4F12, CYP4F22, CYP4V2, CYP4X1, CYP4Z1, CYP5A1, CYP7A1, CYP7B1, CYP8A1, CYP8B1, CYP11A1, CYP11B1, CYP11B2, CYP17A1, CYP19A1, CYP20A1, CYP21A2, CYP24A1, CYP26A1, CYP26B1, CYP26C1, CYP27A1, CYP27C1, CYP39A1, CYP46A1, CYP51A1.

[0055] In particular, in some embodiments P450 monooxygenases suitable in the methods and systems herein disclosed include CYP102A1 (also called P450 BM3) from Bacillus [...] (SEQ ID NO: 15), CYP2C19 (SEQ ID NO: 16), CYP2D6 (SEQ ID NO: 17), CYP2E1 (SEQ ID NO: 18), CYP2F1 (SEQ ID NO: 19), CYP3A4 (SEQ ID NO: 20), CYP153-AlkBurk from Alcanivorax borkumensis (SEQ ID NO: 60), CYP153-EB104 from Acinetobacter sp. EB104 (SEQ ID NO: 61), CYP153-OC4 from Acinetobacter sp. OC4 (SEQ ID NO: 62), and variants thereof. Exemplary organic molecules that can be oxidized by these P450 monooxygenases include branched and linear Co-Co fatty acids, Co-Co alkanes, cyclic alkanes, cyclic alkenes, alkane derivatives, alkene derivatives, steroids, terpenes, aromatic hydrocarbons, natural products and natural product analogues such as polyketides, prostaglandins, thromboxanes, leukotrienes, anthraquinones, tetracyclines, anthracyclines, polyenes, statins, amino acids, flavonoids, stilbenes, alkaloids (e.g. lysine-derived, nicotinic acid-derived, tyrosine-derived, tryptophan-derived, anthranilic acid-derived, histidine-derived, purine-derived alkaloids), beta-lactams, aminoglycosides, polymyxins, quinolones, synthetic derivatives such as aromatic heterocyclic derivatives (e.g. phenyl-, pyrimidine-, pyridine-, piperidine-, pyrrole-, furan-, triazol-, thiophene-, pyrazole-, imidazole-, tetrazole-, oxazole-, isoxazole-, thiazole-, isothiazole-, pyran-, pyridazine-, pyrazine-, piperazine-, thiazine-, and oxazine-derivatives), and the like.

[0056] In particular, in some embodiments P450 monooxygenases suitable for the methods and systems herein disclosed include CYP102A1 (SEQ ID NO: 2) and variants thereof, wherein none, one or more of the amino acids that are located within 50 Å from the heme iron are mutated to any other of the natural amino acids, or mutated to an unnatural amino acid, or modified in some way so as to alter the properties of the enzyme. Examples of amino acid positions that can be modified in CYP102A1 to produce a P450 monooxygenase suitable in the methods and systems herein disclosed include without limitation: 25, 26, 42, 47, 51, 52, 58, 74, 75, 78, 81, 82, 87, 88, 90, 94, 96, 102, 106, 107, 108, 118, 135, 138, 142, 145, 152, 172, 173, 175, 178, 180, 181, 184, 185, 188, 197, 199, 205, 214, 226, 231, 236, 237, 239, 252, 255, 260, 263, 264, 265, 268, 273, 274, 275, 290, 295, 306, 324, 328, 354, 366, 398, 401, 430, 433, 434, 437, 438, 442, 443, 444, and 446.

[0057] In particular, in some embodiments, P450 monooxygenases suitable in the methods and systems herein disclosed are selected from the group consisting of CYP102A1 (SEQ ID NO: 2) and variants thereof including CYP102A1var1 (SEQ ID NO: 21), CYP102A1var2 (SEQ ID NO: 22), CYP102A1var3 (SEQ ID NO.
23), megaterium (SEQID NO: 2), CYP102A2 from Bacillus sub CYP102A1 var3-2 (SEQ ID NO:24), CYP102A1var3-3 tilis (SEQID NO:3), CYP102A3 from Bacillus subtilis (SEQ ID NO:4), CYP102A5 from Bacillus cereus (SEQID NO:5), (SEQ ID NO: 25), CYP102A1 var3-4 (SEQ ID NO: 26), CYP102E1 from Ralsionia metallidurans (SEQ ID NO: 6), CYP102A1var3-5 (SEQ ID NO: 27), CYP102A1var3-6 CYP102A6 from Bradyrhizobium japonicum (SEQ ID NO: (SEQ ID NO: 28), CYP102A1var3-7 (SEQ ID NO: 29), 7), CYP101A1 (also called P450cam) from Pseudomonas CYP102A1var3-8 (SEQ ID NO: 30), CYP102A1var3-9 putida (SEQ ID NO: 8), CYP106A2 (also called P450meg) (SEQ ID NO: 31), CYP102A1 var3-10 (SEQ ID NO:32), from Bacillus megaterium (SEQ ID NO: 9), CYP153A6 CYP102A1 var3-11 (SEQ ID NO: 33), CYP102A1 var3-12 (SEQID NO:54), CYP153A7 (SEQIDNO:55), CYP153A8 (SEQ ID NO. 34), CYP102A1 var3-13 (SEQ ID NO: 35), (SEQ ID NO. 56), CYP153A11 (SEQ ID NO. 57), CYP102A1 var3-14 (SEQ ID NO:36), CYP102A1 var3-15 CYP153D2 (SEQIDNO:58), CYP153D3 (SEQID NO:59), (SEQ ID NO: 37), CYP102A1var3-16 (SEQ ID NO: 38), P450cin from Citrobacter brakii (SEQID NO: 10), P450terp CYP102A1 var3-17 (SEQ ID NO:39), CYP102A1 var3-11 from Pseudomonas sp. (SEQ ID NO: 11), P450eryF from (SEQ ID NO: 40), CYP102A1 var3-19 (SEQ ID NO: 41), Saccharopolyspora erythreae (SEQ ID NO: 12), CYP1A2 CYP102A1 var3-20 (SEQ ID NO: 42) CYP102A1 var3-21 (SEQ ID NO: 13), CYP2C8 (SEQ ID NO: 14), CYP2C9 (SEQ ID NO: 43), CYP102A1 var3-22 (SEQ ID NO: 44), US 2009/0061471 A1 Mar. 5, 2009

CYP102A1 var3-23 (SEQID NO: 45), CYP102A1 var4 (SEQ 0.058 The above variants are illustrated in particular in the ID NO: 46) CYP102A1varS (SEQ ID NO:47), following Table 1 wherein the respective sequences reporting CYP102A1 varó (SEQID NO:48), CYP102A1 var7 (SEQID in the enclosed Sequence Listing and the mutations of each NO:49), CYP102A1var8 (SEQID NO:50), CYP102A1 var9 variant with respect to the wild type (SEQ ID NO: 2) are (SEQ ID NO:51), CYP102A1var9-1 (SEQ ID NO: 52) listed.

TABLE 1

Name               Sequence        Mutation(s) with respect to CYP102A1
CYP102A1           SEQ ID NO: 2
CYP102A1var1       SEQ ID NO: 21   [illegible]
CYP102A1var2       SEQ ID NO: 22   [illegible]
CYP102A1var3       SEQ ID NO: 23   [illegible]
CYP102A1var3-2     SEQ ID NO: 24   [illegible]
CYP102A1var3-3     SEQ ID NO: 25   [illegible]
CYP102A1var3-4     SEQ ID NO: 26   [illegible]
CYP102A1var3-5     SEQ ID NO: 27   [illegible]
CYP102A1var3-6     SEQ ID NO: 28   [illegible]
CYP102A1var3-7     SEQ ID NO: 29   [illegible]
CYP102A1var3-8     SEQ ID NO: 30   [illegible]
CYP102A1var3-9     SEQ ID NO: 31   [illegible]
CYP102A1var3-10    SEQ ID NO: 32   [illegible]
CYP102A1var3-11    SEQ ID NO: 33   [illegible]
CYP102A1var3-12    SEQ ID NO: 34   [illegible]
CYP102A1var3-13    SEQ ID NO: 35   [illegible]
CYP102A1var3-14    SEQ ID NO: 36   [illegible]
CYP102A1var3-15    SEQ ID NO: 37   [illegible]
CYP102A1var3-16    SEQ ID NO: 38   [illegible]
CYP102A1var3-17    SEQ ID NO: 39   [illegible]
CYP102A1var3-18    SEQ ID NO: 40   [illegible]
CYP102A1var3-19    SEQ ID NO: 41   [illegible]
CYP102A1var3-20    SEQ ID NO: 42   A328V, L353V
CYP102A1var3-21    SEQ ID NO: 43   A290V, L353V
CYP102A1var3-22    SEQ ID NO: 44   A290V, A328V, L353V
CYP102A1var3-23    SEQ ID NO: 45   L353V
CYP102A1var4       SEQ ID NO: 46   F87A
CYP102A1var5       SEQ ID NO: 47   F87V
CYP102A1var6       SEQ ID NO: 48   F87V, L188Q
CYP102A1var7       SEQ ID NO: 49   A74G, F87V, L188Q
CYP102A1var8       SEQ ID NO: 50   R47L, F87V, L188Q
CYP102A1var9       SEQ ID NO: 51   F87A, T235A, R471A, E494K, S1024E
CYP102A1var9-1     SEQ ID NO: 52   F87A, A184K, T235A, R471A, E494K, S1024E
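The mutation column of Table 1 uses the customary substitution shorthand, in which an entry such as F87A names the wild-type residue, its position, and the replacement residue. The following is a minimal sketch of how such entries could be parsed and applied to a parent sequence. It is illustrative only: the names `Mutation`, `parse_mutations`, and `apply_mutations` are hypothetical, the parent sequence is a toy string rather than SEQ ID NO: 2, and positions are assumed to be 1-based on whatever sequence is supplied. The numbering convention relative to the Sequence Listing is not stated here and would need to be checked before use.

```python
import re
from typing import NamedTuple


class Mutation(NamedTuple):
    wild_type: str  # one-letter code of the residue in the parent sequence
    position: int   # 1-based residue position (assumed convention)
    mutant: str     # one-letter code of the replacement residue


# Matches entries such as "F87A" or "S1024E".
_MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutations(entry: str) -> list[Mutation]:
    """Parse a comma-separated table entry, e.g. 'A328V, L353V'."""
    mutations = []
    for token in entry.split(","):
        token = token.strip()
        if not token:
            continue
        match = _MUTATION_RE.match(token)
        if match is None:
            raise ValueError(f"unrecognized mutation: {token!r}")
        wild_type, position, mutant = match.groups()
        mutations.append(Mutation(wild_type, int(position), mutant))
    return mutations


def apply_mutations(sequence: str, mutations: list[Mutation]) -> str:
    """Return a copy of `sequence` with each substitution applied.

    Raises ValueError if a position is out of range or the residue found
    there does not match the stated wild-type residue.
    """
    residues = list(sequence)
    for m in mutations:
        index = m.position - 1
        if index < 0 or index >= len(residues):
            raise ValueError(f"position {m.position} is outside the sequence")
        if residues[index] != m.wild_type:
            raise ValueError(
                f"expected {m.wild_type} at position {m.position}, "
                f"found {residues[index]}"
            )
        residues[index] = m.mutant
    return "".join(residues)


# Toy parent sequence (hypothetical, NOT a sequence from the Sequence Listing).
toy_parent = "MTIKFAL"
variant = apply_mutations(toy_parent, parse_mutations("F5A, L7V"))
```

The wild-type check is a deliberate design choice in this sketch: it makes a numbering offset (for instance, counting with or without an initial methionine) fail loudly instead of silently mutating the wrong residue.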

[0059] In some embodiments, the P450 monooxygenases listed in Table 1 are provided as oxygenating agents for the methods and systems herein disclosed, wherein the organic molecules include branched and linear Co-Co fatty acids, C-C alkanes, cyclic alkanes, cyclic alkenes, alkane derivatives, alkene derivatives, steroids, terpenes, aromatic hydrocarbons, prostaglandins, aromatic heterocyclic derivatives such as phenyl-, pyrimidine-, pyridine-, piperidine-, pyrrole-, furan-, triazole-, thiophene-, pyrazole-, imidazole-, tetrazole-, oxazole-, isoxazole-, thiazole-, isothiazole-, pyran-, pyridazine-, pyrazine-, piperazine-, thiazine-, and oxazine-derivatives.

[0060] In some embodiments P450 monooxygenases suitable in the methods and systems herein disclosed include CYP102A2 from Bacillus subtilis (SEQ ID NO: 3), and variants thereof, wherein none, one or more of the amino acids that are located within 50 Å from the heme iron are mutated to any other of the natural amino acids or mutated to an unnatural amino acid or modified in some way so as to alter the properties of the enzyme.

[0061] In particular, in some embodiments, P450 monooxygenases suitable in the methods and system herein disclosed are selected from the group consisting of CYP102A2 (SEQ ID NO: 3) and variants thereof including CYP102A2var1 (SEQ ID NO: 63). The above variants are illustrated in particular in the following Table 2, wherein the mutations of each variant with respect to the wild type (SEQ ID NO: 3) are listed.

TABLE 2

Name               Sequence        Mutation(s) with respect to CYP102A2
CYP102A2           SEQ ID NO: 3
CYP102A2var1       SEQ ID NO: 63   F88A

[0062] In some embodiments P450 monooxygenases suitable for the methods and systems herein disclosed include CYP102A3 from Bacillus subtilis (SEQ ID NO: 4), and variants thereof, wherein none, one or more of the amino acids that are located within 50 Å from the heme iron are mutated to any other of the natural amino acids or mutated to an unnatural amino acid or modified in some way so as to alter the properties of the enzyme.

[0063] In particular, in some embodiments P450 monooxygenases suitable in the methods and systems herein disclosed are selected from the group consisting of CYP102A3 (SEQ ID NO: 4) and variants thereof including CYP102A3var1 (SEQ ID NO: 64). The above variants are illustrated in particular in the following Table 3, wherein the mutations of each variant with respect to the wild type (SEQ ID NO: 4) are listed.

TABLE 3

Name               Sequence        Mutation(s) with respect to CYP102A3
CYP102A3           SEQ ID NO: 4
CYP102A3var1       SEQ ID NO: 64   F88A

[0064] In particular, in some embodiments P450 monooxygenases suitable in the methods and systems herein disclosed include CYP101A1 (also called P450cam) from Pseudomonas putida (SEQ ID NO: 8) and variants thereof, wherein none, one or more of the amino acids that are located within 50 Å from the heme iron are mutated to any other of the natural amino acids or mutated to an unnatural amino acid or modified in some way so as to alter the properties of the enzyme.

[0065] In particular, in some embodiments, P450 monooxygenases suitable in the methods and system herein disclosed are selected from the group consisting of CYP101A1 (SEQ ID NO: 8) and variants thereof including CYP101A1var1 (SEQ ID NO: 65), CYP101A1var2 (SEQ ID NO: 66), CYP101A1var2-1 (SEQ ID NO: 67), CYP101A1var2-2 (SEQ ID NO: 68), and CYP101A1var2-3 (SEQ ID NO: 69).

[0066] The above variants are illustrated in particular in the following Table 4, wherein the mutations of each variant with respect to the wild type (SEQ ID NO: 8) are listed.

TABLE 4

Name               Sequence        Mutation(s) with respect to CYP101A1
CYP101A1           SEQ ID NO: 8
CYP101A1var1       SEQ ID NO: 65   Y96A
CYP101A1var2       SEQ ID NO: 66   Y96F
CYP101A1var2-1     SEQ ID NO: 67   Y96F, F87W
CYP101A1var2-2     SEQ ID NO: 68   Y96F, V247L
CYP101A1var2-3     SEQ ID NO: 69   F87W, Y96F, V247L

[0067] In some embodiments, the P450 enzyme is included in a P450-containing system, a system including a P450 enzyme and one or more proteins that deliver one or more electrons to the heme iron in the P450 enzyme. Natural P450-containing systems occur according to the following general schemes:

[0068] CYP reductase (CPR)/cytochrome b5 (cyb5)/P450 systems: typically employed by eukaryotic microsomal (i.e., not mitochondrial) CYPs, they involve the reduction of cytochrome P450 reductase (variously CPR, POR, or CYPOR) by NADPH, and the transfer of reducing power as electrons to the CYP. Cytochrome b5 (cyb5) can also contribute reducing power to this system after being reduced by cytochrome b5 reductase (CYB5R);

[0069] Ferredoxin Reductase (FdxR) or Putidaredoxin Reductase (PdxR)/Ferredoxin (Fdx) or Putidaredoxin (Pdx)/P450 systems, typically employed by mitochondrial and some bacterial CYPs. Reducing electrons from a soluble cofactor, typically NADPH or NADH, are transferred through the reductase to the electron carrier, Fdx or Pdx, and transferred from the electron carrier to the P450 component;

[0070] P450-CPR fusion systems, where the CYP domain is naturally fused to the electron donating partners. An example of these systems is represented by cytochrome P450BM3 (CYP102A1) from the soil bacterium Bacillus megaterium;

[0071] CYB5R/cyb5/P450 systems, where both electrons required by the CYP derive from cytochrome b5;

[0072] FMN/Fd/P450 systems, where an FMN-domain-containing reductase is fused to the CYP. This type of system was originally found in Rhodococcus sp.;

[0073] P450 only systems, which do not require external reducing power. These include CYP5 (thromboxane synthase), CYP8 (prostacyclin synthase), and CYP74A (allene oxide synthase).

[0074] In some embodiments, the oxidizing agent is a non-heme containing monooxygenase, i.e. a monooxygenase that is able to function without a heme prosthetic group. These monooxygenases include but are not limited to flavin monooxygenases, pterin-dependent non-heme monooxygenases, non-heme di-iron monooxygenases, and di-iron hydroxylases. In these enzymes, oxygen activation occurs at a site in the enzyme's structural fold that is covalently or non-covalently bound to a flavin cofactor, a pterin cofactor, or a di-iron cluster. Examples of non-heme containing monooxygenases include but are not limited to ω-hydroxylases (n-octane ω-hydroxylase, n-decane ω-hydroxylases, 9-α-hydroxylase, and AlkB), styrene monooxygenase, butane monooxygenases, propane monooxygenases, and methane monooxygenases. Non-heme containing monooxygenases catalyze the monooxygenation of a variety of structurally diverse substrates. Exemplary substrates accepted by progesterone 9-α-hydroxylase from Nocardia sp. include steroid derivatives. Exemplary substrates accepted by non-heme monooxygenases such as integral membrane di-iron alkane hydroxylases (e.g. AlkB), soluble di-iron methane monooxygenases (sMMO), di-iron propane monooxygenases, di-iron butane monooxygenases, membrane-bound copper-containing methane monooxygenases, styrene monooxygenase, xylene monooxygenase include C-C linear and branched alkanes, alkenes, and aromatic hydrocarbons.

[0075] In some embodiments, the oxidizing agent is a dioxygenase or a variant thereof and in particular a dioxygenase involved in the catabolism of aromatic hydrocarbons. Dioxygenases are a class of oxygenase enzymes that incorporate both atoms of molecular oxygen (O2) onto the substrate according to the general scheme of reaction:

[Reaction scheme: an R-substituted aromatic ring is converted by a dioxygenase into the corresponding diol bearing two OH groups]

[0076] Dioxygenases are metalloproteins, and activation of molecular oxygen is carried out in a site within the structural fold of the enzyme that is covalently or non-covalently bound to one or more metal atoms. The metal is typically iron, manganese, or copper. Examples of dioxygenases are catechol dioxygenases, toluene dioxygenases, and biphenyl dioxygenases. Catechol dioxygenases catalyze the oxidative cleavage of catechols and have different substrate specificities, including catechol 1,2-dioxygenase (EC 1.13.11.1), catechol 2,3-dioxygenase (EC 1.13.11.2), and protocatechuate 3,4-dioxygenase (EC 1.13.11.3). Toluene dioxygenase and biphenyl dioxygenases are involved in the natural degradation of aromatic compounds and typically introduce two oxygen atoms across a double bond in aromatic or non-aromatic compounds. Dioxygenases, e.g. toluene dioxygenase, can be engineered to accept substrates for which the wild-type enzyme shows only basal or no activity, e.g. 4-picoline (Sakamoto, Joern et al. 2001). Potentially suitable substrates for dioxygenase enzymes include but are not restricted to substituted or non-substituted monocyclic, polycyclic, and heterocyclic aromatic compounds. On these substrates, the dioxygenase can introduce one or more cis dihydrodiol functional groups.

[0077] In some embodiments, the oxidizing agent is a peroxygenase. Natural peroxygenases are heme-dependent oxidases that are distinct from cytochrome P450 enzymes and peroxidases and that accept only peroxides, in particular hydrogen peroxide, as the source of oxidant. Natural peroxygenases are typically membrane-bound and can catalyze hydroxylation reactions of aromatics, sulfoxidations of xenobiotics, or epoxidations of unsaturated fatty acids. In contrast to cytochrome P450 monooxygenases, peroxygenase activity does not require any cofactor such as NAD(P)H and does not use molecular oxygen. Examples are the plant peroxygenase (PXG) (Hanano, Burcklen et al. 2006), soybean peroxygenase (Blee, Wilcox et al. 1993), and oat seed peroxygenase.

[0078] In some embodiments, the peroxygenase is a cytochrome P450. Cytochrome P450s can also use peroxides as oxygen donors. This constitutes the so-called peroxide shunt pathway, and the enzyme does not need a reductase and NAD(P)H to carry out catalysis. Normally, this peroxide-driven reaction in P450s is not significant. However, mutations in the heme domain of

P450 enzymes can enhance their latent peroxygenase activity, as in the case of P450cam (Joo, Lin et al. 1999) and P450BM3 (Cirino and Arnold 2003). Using three engineered P450 enzymes, namely CYP102A1, CYP102A2 and CYP102A3, that are capable of peroxygenase activity, a library of ~6000 peroxygenase chimeras was created by site-directed recombination (Otey, Landwehr et al. 2006).

[0079] Naturally-occurring P450 peroxygenases also exist. P450BSβ (CYP152A1) and P450SPα (CYP152B1), recently isolated from Bacillus subtilis and Sphingomonas paucimobilis (Matsunaga, Sumimoto et al. 2002; Matsunaga, Yamada et al. 2002), efficiently utilize H2O2 to hydroxylate fatty acids, prevalently in the α and β positions.

[0080] Exemplary peroxygenases suitable in the methods and system herein disclosed include but are not limited to natural heme-containing peroxygenases, natural P450 peroxygenases, engineered P450s with peroxygenase activity, and P450 peroxygenase chimeras described in more detail in the work of Arnold and co-workers (Otey, Landwehr et al. 2006). These peroxygenases show activity on a variety of substrates including fatty acids, 8- and 12-pNCA, indole, aniline, p-nitrophenol, heterocyclic derivatives (e.g. chlorzoxazone, buspirone), statins, and naphthyl derivatives.

[0081] Other suitable oxidizing agents for the systems and methods herein disclosed are peroxidases (EC number 1.11.1.x). Sequences of the peroxidase enzymes identified so far can be found in the PeroxiBase database. Peroxidases typically catalyze a reaction of the form: ROOR' + electron donor (2 e-) + 2 H+ → ROH + R'OH. For most peroxidases the optimal oxygen providing compound is hydrogen peroxide, but others are more active with organic hydroperoxides such as lipid peroxides. Peroxidases can contain a heme cofactor in their active sites, or redox-active cysteine or selenocysteine residues. The nature of the electron donor is very dependent on the structure of the enzyme. For example, horseradish peroxidase can use a variety of organic compounds as electron donors and acceptors. Horseradish peroxidase has an accessible active site and many compounds can reach the site of the reaction. In contrast, cytochrome c peroxidase has a much more restricted active site, and the electron-donating compounds are very specific. Glutathione peroxidase is a peroxidase found in humans, which contains selenocysteine. It uses glutathione as an electron donor and is active with both hydrogen peroxide and organic hydroperoxide substrates.

[0082] In some embodiments the organic molecule has the structure of formula (I)

(I)

in which X-C atom is the target site, and R1, R2, and R3 are independently selected from the group consisting of hydrogen, aliphatic, aryl, substituted aliphatic, substituted aryl, heteroatom-containing aliphatic, heteroatom-containing aryl, substituted heteroatom-containing aliphatic, substituted heteroatom-containing aryl, alkoxy, aryloxy, and functional groups (FG) or are taken together to form a ring, such that the carbon atom is a secondary or tertiary carbon atom.

[0083] The term "aliphatic" is used in the conventional sense to refer to an open-chain or cyclic, linear or branched, saturated or unsaturated hydrocarbon group, including but not limited to alkyl groups, alkenyl groups and alkynyl groups. The term "heteroatom-containing aliphatic" as used herein refers to an aliphatic moiety where at least one carbon atom is replaced with a heteroatom.

[0084] The terms "alkyl" and "alkyl group" as used herein refer to a linear, branched, or cyclic saturated hydrocarbon typically containing 1 to 24 carbon atoms, preferably 1 to 12 carbon atoms, such as methyl, ethyl, n-propyl, isopropyl, n-butyl, isobutyl, t-butyl, octyl, decyl and the like. The term "heteroatom-containing alkyl" as used herein refers to an alkyl moiety where at least one carbon atom is replaced with a heteroatom, e.g. oxygen, nitrogen, sulphur, phosphorus, or silicon, and typically oxygen, nitrogen, or sulphur.

[0085] The terms "alkenyl" and "alkenyl group" as used herein refer to a linear, branched, or cyclic hydrocarbon group of 2 to 24 carbon atoms, preferably of 2 to 12 carbon atoms, containing at least one double bond, such as ethenyl, n-propenyl, isopropenyl, n-butenyl, isobutenyl, octenyl, decenyl, and the like. The term "heteroatom-containing alkenyl" as used herein refers to an alkenyl moiety where at least one carbon atom is replaced with a heteroatom.

[0086] The terms "alkynyl" and "alkynyl group" as used herein refer to a linear, branched, or cyclic hydrocarbon group of 2 to 24 carbon atoms, preferably of 2 to 12 carbon atoms, containing at least one triple bond, such as ethynyl, n-propynyl, and the like. The term "heteroatom-containing alkynyl" as used herein refers to an alkynyl moiety where at least one carbon atom is replaced with a heteroatom.

[0087] The terms "aryl" and "aryl group" as used herein refer to an aromatic substituent containing a single aromatic ring or multiple aromatic rings that are fused together, directly linked, or indirectly linked (such as linked through a methylene or an ethylene moiety). Preferred aryl groups contain 5 to 24 carbon atoms, and particularly preferred aryl groups contain 5 to 14 carbon atoms. The term "heteroatom-containing aryl" as used herein refers to an aryl moiety where at least one carbon atom is replaced with a heteroatom.

[0088] The terms "alkoxy" and "alkoxy group" as used herein refer to an aliphatic group or a heteroatom-containing aliphatic group bound through a single, terminal ether linkage. Preferred alkoxy groups contain 1 to 24 carbon atoms, and particularly preferred alkoxy groups contain 1 to 14 carbon atoms.

[0089] The terms "aryloxy" and "aryloxy group" as used herein refer to an aryl group or a heteroatom-containing aryl group bound through a single, terminal ether linkage. Preferred aryloxy groups contain 5 to 24 carbon atoms, and particularly preferred aryloxy groups contain 5 to 14 carbon atoms.

[0090] The terms "halo" and "halogen" are used in the conventional sense to refer to a fluoro, chloro, bromo or iodo substituent.

[0091] By "substituted" it is intended that in the alkyl, alkenyl, alkynyl, aryl, or other moiety, at least one hydrogen atom is replaced with one or more non-hydrogen atoms. Examples of such substituents include, without limitation: functional groups referred to herein as "FG", such as alkoxy, aryloxy, alkyl, heteroatom-containing alkyl, alkenyl, heteroatom-containing alkenyl, alkynyl, heteroatom-containing alkynyl, aryl, heteroatom-containing aryl, alkoxy, heteroatom-containing alkoxy, aryloxy, heteroatom-containing aryloxy, halo, hydroxyl (-OH), sulfhydryl (-SH), substituted sulfhydryl, carbonyl (-CO-), thiocarbonyl (-CS-), carboxy (-COOH), amino (-NH2), substituted amino, nitro (-NO2), nitroso (-NO), sulfo (-SO2-OH), cyano (-C≡N), cyanato (-O-C≡N), thiocyanato (-S-C≡N), formyl (-CO-H), thioformyl (-CS-H), phosphono (-P(O)(OH)2), substituted phosphono, and phospho (-PO2).

[0092] In particular, the substituents R1, R2, and R3 of formula I can be independently selected from hydrogen, C1-C24 alkyl, C1-C24 substituted alkyl, C1-C24 substituted heteroatom-containing alkyl, C1-C24 substituted heteroatom-containing alkyl, C2-C24 alkenyl, C2-C24 substituted alkenyl, C2-C24 substituted heteroatom-containing alkenyl, C2-C24 substituted heteroatom-containing alkenyl, C5-C24 aryl, C5-C24 substituted aryl, C5-C24 substituted heteroatom-containing aryl, C5-C24 substituted heteroatom-containing aryl, C1-C24 alkoxy, C5-C24 aryloxy, carbonyl, thiocarbonyl, and carboxy. More in particular, R1, R2 and R3 of formula I can be independently selected from hydrogen, C1-C12 alkyl, C1-C12 substituted alkyl, C1-C12 substituted heteroatom-containing alkyl, C1-C12 substituted heteroatom-containing alkyl, C2-C12 alkenyl, C2-C12 substituted alkenyl, C2-C12 substituted heteroatom-containing alkenyl, C2-C12 substituted heteroatom-containing alkenyl, C5-C14 aryl, C5-C14 substituted aryl, C5-C14 substituted heteroatom-containing aryl, C5-C14 substituted heteroatom-containing aryl, C1-C14 alkoxy, C5-C14 aryloxy, carbonyl, thiocarbonyl, and carboxy.

[0093] Oxidizing agents known or expected to react with the target site of a compound of Formula (I) include but are not limited to oxygenases or variants thereof.

[0094] In some embodiments, the oxygenase can be a non-heme monooxygenase or a variant thereof, a heme-containing monooxygenase or a variant thereof, or a peroxygenase or a variant thereof, such as any of the heme-containing monooxygenases, non-heme-containing monooxygenases and peroxygenases herein disclosed. In particular, the oxygenase can be any of the P450 monooxygenases and P450 peroxygenases herein disclosed.

[0095] In some embodiments, the oxygenase or variant thereof can be butane monooxygenase, CYP102A1 (SEQ ID NO: 2), CYP102A1var4 (SEQ ID NO: 46), CYP102A1var8 (SEQ ID NO: 50), CYP102A1var1 (SEQ ID NO: 21), CYP102A1var2 (SEQ ID NO: 22), CYP102A1var3 (SEQ ID NO: 23), CYP102A1var3-20 (SEQ ID NO: 42), CYP102A1var3-2 (SEQ ID NO: 44), CYP102A1var3-3 (SEQ ID NO: 25), CYP102A1var3-4 (SEQ ID NO: 26), CYP102A1var3-5 (SEQ ID NO: 27), CYP102A1var3-7 (SEQ ID NO: 29), CYP102A1var3-8 (SEQ ID NO: 30), CYP102A1var3-9 (SEQ ID NO: 31), CYP102A1var3-11 (SEQ ID NO: 33), CYP102A1var3-13 (SEQ ID NO: 35), CYP102A1var3-14 (SEQ ID NO: 36), CYP102A1var3-15 (SEQ ID NO: 37), CYP101A1 (SEQ ID NO: 8), CYP101A1var1 (SEQ ID NO: 65), CYP101A1var2-3 (SEQ ID NO: 69), CYP102A2 (SEQ ID NO: 3), CYP102A2var1 (SEQ ID NO: 63), CYP102A3 (SEQ ID NO: 4), CYP102A3var1 (SEQ ID NO: 64) and CYP153A6 (SEQ ID NO: 54), CYP153A7 (SEQ ID NO: 55), CYP153A8 (SEQ ID NO: 56), CYP153A11 (SEQ ID NO: 57), CYP153D2 (SEQ ID NO: 58), and/or CYP106A2 (SEQ ID NO: 9). In particular, in those embodiments at least one of said oxygenases or variants thereof is expected to activate the target site by introducing an oxygen-containing functional group in the form of a hydroxyl group. In these embodiments, the final products resulting from the application of the systems and methods herein disclosed can be (RRRCF), (RRCF), (RRCF), or (RRCF).

[0096] In some embodiments of the methods and systems herein disclosed wherein the organic molecule is a compound of Formula (I), in which R1 = H, -CH3 or =O, and/or R2 and R3 are connected together through a 4-, 5-, 6-, or 7-methylene moiety to form a ring, the oxidizing agent can be an oxygenase, such as a P450 monooxygenase, and in particular CYP102A1var1 (SEQ ID NO: 21), CYP102A1var2 (SEQ ID NO: 22), CYP102A1var3 (SEQ ID NO: 23), CYP102A1var3-7 (SEQ ID NO: 29), CYP101A1 (SEQ ID NO: 8), CYP101A1var1 (SEQ ID NO: 65), and/or CYP101A1var2-3 (SEQ ID NO: 69), and is expected to activate the target site of the corresponding compound of Formula (I) by introducing an oxygen-containing functional group in the form of a hydroxyl group.

[0097] In some embodiments of the methods and systems herein disclosed wherein the organic molecule is a compound of Formula (I), in which R1 = H, R2 = -Me, -Et, -Pr, or -iPr and/or R3 = -(CH2)nCOOH with n between 9 and 15, the oxidizing agent can be an oxygenase such as a P450 monooxygenase, in particular CYP102A1 (SEQ ID NO: 2), CYP102A1var4 (SEQ ID NO: 46), CYP102A1var5 (SEQ ID NO: 47), CYP102A2 (SEQ ID NO: 3), CYP102A2var1 (SEQ ID NO: 63), CYP102A3 (SEQ ID NO: 4), and/or CYP102A3var1 (SEQ ID NO: 64), which is expected to activate the target site by introducing an oxygen-containing functional group in the form of a hydroxyl group.

[0098] In some embodiments of the methods and systems herein disclosed wherein the organic molecule is a compound of Formula (I), in which R1 = R2 = -Me, R3 = -CH2-O-substituted-Ph, activation can be performed by reacting the organic molecule with an oxygenase, such as a P450 monooxygenase, including CYP102A1 (SEQ ID NO: 2), CYP102A1var3-4 (SEQ ID NO: 26), CYP102A1var3-14 (SEQ ID NO: 36), CYP102A1var3-15 (SEQ ID NO: 37), CYP102A1var3-3 (SEQ ID NO: 25), CYP102A1var3-2 (SEQ ID NO: 24), CYP102A1var3 (SEQ ID NO: 23), CYP102A1var3-9 (SEQ ID NO: 31), CYP102A1var1 (SEQ ID NO: 21), and/or CYP102A1var2 (SEQ ID NO: 22), which introduce a hydroxyl group in the target site, as exemplified in Example 11 and illustrated in corresponding Scheme 11.

[0099] In some embodiments of the methods and systems herein disclosed wherein the organic molecule is a compound of Formula (I), in which R1 = R2 = H, activation can be performed by reacting the organic molecule with an oxygenase such as a P450 monooxygenase, including CYP153A6 (SEQ ID NO: 54), CYP153A7 (SEQ ID NO: 55), CYP153A8 (SEQ ID NO: 56), CYP153A11 (SEQ ID NO: 57), CYP153D2 (SEQ ID NO: 58), and/or CYP153D3 (SEQ ID NO: 59), which are expected to introduce a hydroxyl group on the target site.

[0100] In some embodiments of the methods and systems herein disclosed wherein the organic molecule is a compound of Formula (I), in which R = n-C-C10 alkyl (e.g. linear Co-Co alkanes), activation can be performed by an oxygenase such as a butane monooxygenase, which is expected to introduce a hydroxyl group on the target site.

[0101] In some embodiments of the methods and systems herein disclosed wherein the organic molecule is a compound of Formula (I), in which R = cyclohexenyl (e.g. limonene), the oxidizing agent can be an oxygenase, such as a P450 monooxygenase including CYP153A6 (SEQ ID NO: 54),

CYP153A7 (SEQ ID NO: 55), CYP153A8 (SEQ ID NO: 56), CYP153A11 (SEQ ID NO: 57), CYP153D2 (SEQ ID NO: 58), and/or CYP153D3 (SEQ ID NO: 59), which are expected to introduce a hydroxyl group on the target site.

[0102] In some embodiments of the methods and systems herein disclosed wherein the organic molecule is a compound of Formula (I), in which Ran-Cz, the oxidizing agent can be a monooxygenase such as a P450 monooxygenase including CYP102A1var3-13 (SEQ ID NO: 35), which is expected to introduce a hydroxyl group on the target site.

[0103] In some embodiments of the methods and systems herein disclosed wherein the organic molecule is a compound of Formula (I), in which R1 = H, and R2 and R3 are connected through n methylene moieties, activation can be performed by reacting the substrate with monooxygenases such as P450 monooxygenases including CYP102A1var1 (SEQ ID NO: 21), CYP102A1var2 (SEQ ID NO: 22), CYP102A1var3-20 (SEQ ID NO: 42). In particular, when n=5, as in the case of cyclopentanecarboxylic acid derivatives, the compound of formula (I) can be activated with methods and systems herein disclosed wherein the oxidizing agent is the monooxygenase CYP102A1var8 (SEQ ID NO: 50). When instead n=6, as in the case of camphor, cyclohexane and cyclohexene, the compound of formula (I) can be activated with methods and systems herein disclosed wherein the oxidizing agent is a monooxygenase, such as a P450 monooxygenase including CYP101A1 (SEQ ID NO: 8), CYP153A6 (SEQ ID NO: 54), CYP153A7 (SEQ ID NO: 55), CYP153A8 (SEQ ID NO: 56), CYP153A11 (SEQ ID NO: 57), CYP153D3 (SEQ ID NO: 59) or CYP153D2 (SEQ ID NO: 58). In those embodiments, activation is known or expected to result in the introduction of a hydroxyl group in the target site.

[0104] In some embodiments of the methods and systems herein disclosed wherein the organic molecule is a compound of Formula (I), wherein R1 = H, and R2 and R3 are connected through 5 or 6 methylene moieties, so as to form a polycyclic unsaturated system, such as in steroids, activation can be performed by reacting the substrate with a monooxygenase such as a P450 monooxygenase including CYP106A2 (SEQ ID NO: 9), and the activation is expected to result in the introduction of a hydroxyl group in the target site.

[0105] In the compound of formula I, wherein R1 = H, R2 = -CH2COOH, R3 = n-dodecyl, activation can be performed by reacting the substrate with peroxygenase P450BSβ (CYP152A1) (SEQ ID NO: 70), resulting in the introduction of a hydroxyl group in the target site.

[0106] In some embodiments, the organic molecule has the structure of formula (II)

(II)

in which X is the target site C atom, and R4, R5, and R6 are independently selected from the group consisting of hydrogen, aliphatic, aryl, substituted aliphatic, substituted aryl, heteroatom-containing aliphatic, heteroatom-containing aryl, substituted heteroatom-containing aliphatic, substituted heteroatom-containing aryl, alkoxy, aryloxy, and functional groups (FG) or are taken together to form a ring, such that the carbon atom is a secondary or tertiary carbon atom.

[0107] In particular, the substituents R4, R5 and R6 of Formula (II) can be independently selected from hydrogen, C1-C24 alkyl, C1-C24 substituted alkyl, C1-C24 substituted heteroatom-containing alkyl, C1-C24 substituted heteroatom-containing alkyl, C2-C24 alkenyl, C2-C24 substituted alkenyl, C2-C24 substituted heteroatom-containing alkenyl, C2-C24 substituted heteroatom-containing alkenyl, C5-C24 aryl, C5-C24 substituted aryl, C5-C24 substituted heteroatom-containing aryl, C5-C24 substituted heteroatom-containing aryl, C1-C24 alkoxy, C5-C24 aryloxy, carbonyl, thiocarbonyl, carboxy, sulfhydryl, amino, substituted amino. More in particular, R4 can be independently selected from hydrogen, C1-C14 alkoxy, C5-C14 aryloxy, amino, substituted amino, sulfhydryl, substituted sulfhydryl, C1-C12 alkyl, C1-C12 substituted alkyl, C1-C12 substituted heteroatom-containing alkyl, C1-C12 substituted heteroatom-containing alkyl, C2-C12 alkenyl, C2-C12 substituted alkenyl, C2-C12 substituted heteroatom-containing alkenyl, C2-C12 substituted heteroatom-containing alkenyl, C5-C14 aryl, C5-C14 substituted aryl, C5-C14 substituted heteroatom-containing aryl, and C5-C14 substituted heteroatom-containing aryl, while R5 and R6 are independently selected from hydrogen, C1-C12 alkyl, C1-C12 substituted alkyl, C1-C12 substituted heteroatom-containing alkyl, C1-C12 substituted heteroatom-containing alkyl, C2-C12 alkenyl, C2-C12 substituted alkenyl, C2-C12 substituted heteroatom-containing alkenyl, C2-C12 substituted heteroatom-containing alkenyl, C5-C14 aryl, C5-C14 substituted aryl, C5-C14 substituted heteroatom-containing aryl, C5-C14 substituted heteroatom-containing aryl, C1-C14 alkoxy, C5-C14 aryloxy, carbonyl, thiocarbonyl, and carboxy.

[0108] Oxidizing agents known or expected to react with the target site of a compound of Formula (II) include but are not limited to oxygenases or variants thereof.

[0109] In some embodiments, the oxygenase can be a non-heme monooxygenase or a variant thereof, a heme-containing monooxygenase or a variant thereof, or a peroxygenase or a variant thereof, such as any of the heme-containing monooxygenases, non-heme-containing monooxygenases and peroxygenases herein disclosed. In particular, the oxygenase can be any of the P450 monooxygenases and P450 peroxygenases herein disclosed.

[0110] In some embodiments, the oxygenase or variant thereof can be a P450 monooxygenase or peroxygenase including CYP102A1 (SEQ ID NO: 2), CYP102A1var4 (SEQ ID NO: 46), CYP102A1var8 (SEQ ID NO: 50), CYP102A1var1 (SEQ ID NO: 21), CYP102A1var2 (SEQ ID NO: 22), CYP102A1var3 (SEQ ID NO: 23), CYP102A1var3-7 (SEQ ID NO: 9), CYP102A1var3-5 (SEQ ID NO: 27), CYP102A1var3-9 (SEQ ID NO: 31), CYP102A1var3-14 (SEQ ID NO: 36), CYP102A1var3-15 (SEQ ID NO: 37), CYP102A1var3-17 (SEQ ID NO: 39), CYP101A1 (SEQ ID NO: 8), CYP101A1 (Y96F), CYP101A1var2-1 (SEQ ID NO: 67), CYP101A1var1 (SEQ ID NO: 65), CYP101A1var2-2 (SEQ ID NO: 68), CYP1A2 (SEQ ID NO: 13), CYP2C9 (SEQ ID NO: 15), CYP2C19 (SEQ ID NO: 16), CYP2D6 (SEQ ID NO: 17), CYP2E1 (SEQ ID NO: 18), CYP3A4 (SEQ ID NO: 20), P450BSβ (CYP152A1) (SEQ ID NO: 70) and/or P450SPα (CYP152B1). In particular, in those embodiments at least one of said oxygenases or variants thereof is expected to activate the target site of a compound of Formula (II) by introducing an oxygen-containing functional group in the form of a

hydroxyl group. In these embodiments, the final products resulting from the application of the systems and methods herein disclosed can be (RRCF (CO)-R), (RCF (CO)-R), or (RCF (CO)-R).

[0111] In some embodiments of the methods and systems herein disclosed wherein the organic molecule is a compound of Formula (II) with R4=H, the oxidizing agent can be a peroxygenase, such as P450BSβ (CYP152A1) (SEQ ID NO:70) and/or peroxygenase P450SPα (CYP152B1), which are most expected to activate the target site, in particular by introducing an oxygen-containing functional group in the form of a hydroxyl group.

[0112] In some embodiments of the methods and systems herein disclosed wherein the organic molecule is a compound of Formula (II), with R4=OR, and R=a C-C alkyl, the oxidizing agent can be an oxygenase and in particular a P450 oxygenase such as CYP102A1 (F87A), CYP102A1var3 (SEQ ID NO:23), CYP102A1var3-7 (SEQ ID NO:29), CYP102A1var3-14 (SEQ ID NO:36), CYP102A1var3-15 (SEQ ID NO:37), and/or CYP102A1var3-5 (SEQ ID NO:27), which are most expected to activate the target site, in particular by introducing an oxygen-containing functional group in the form of a hydroxyl group.

[0113] In some embodiments of the methods and systems herein disclosed wherein the organic molecule is a compound of Formula (II), in which R4 is -OMe, -OEt, -OPr, -OBu, or -OtBu, R5 is hydrogen, and R6 is benzyl, o-chloro-phenyl, p-chloro-phenyl, m-chloro-phenyl, o-methyl-phenyl, p-methyl-phenyl, m-methyl-phenyl, o-methoxy-phenyl, p-methoxy-phenyl, or m-methoxy-phenyl, the activation can be performed by reacting the substrate with oxygenase CYP102A1var4 (SEQ ID NO:46), CYP102A1var3 (SEQ ID NO:23), and CYP102A1var3-7 (SEQ ID NO:29), as illustrated in Examples 1, 2, 3 and 4 and corresponding schemes 1, 2, 3, and 4.

[0114] In some embodiments of the methods and systems herein disclosed wherein the organic molecule is a compound of Formula (II), in which R4 is -OH, R5 is hydrogen, and R6 is a linear C12-alkyl chain (for example myristic acid), the activation can be performed by reacting the substrate with peroxygenases P450BSβ (CYP152A1) and P450SPα (CYP152B1), resulting in the introduction of a hydroxyl group in the target site.

[0115] In some embodiments of the methods and systems herein disclosed wherein the organic molecule is a compound of Formula (II), in which Rs is -Me, and Rs and R are connected through a 6-methylene ring (for example α-thujone), the activation can be performed by reacting the substrate with monooxygenases CYP101A1 (SEQ ID NO:8), CYP102A1 (SEQ ID NO:2), CYP1A2 (SEQ ID NO:13), CYP2C9 (SEQ ID NO:14), CYP2C19 (SEQ ID NO:16), CYP2D6 (SEQ ID NO:17), CYP2E1 (SEQ ID NO:18), and CYP3A4 (SEQ ID NO:20), resulting in the introduction of a hydroxyl group in the target site.

[0116] In some embodiments, the organic molecule has the structure of formula (III)

(III)

in which X is the target site C atom, and R7, R8, R9, R10 and R11 are independently selected from the group consisting of hydrogen, aliphatic, aryl, substituted aliphatic, substituted aryl, heteroatom-containing aliphatic, heteroatom-containing aryl, substituted heteroatom-containing aliphatic, substituted heteroatom-containing aryl, alkoxy, aryloxy, and functional groups (FG) or are taken together to form a ring, such that the carbon atom is a secondary or tertiary carbon atom.

[0117] In particular, the substituents R7, R8, R9, R10, and R11 of Formula (III) can be independently selected from hydrogen, C-C alkyl, C-C substituted alkyl, C-C substituted heteroatom-containing alkyl, C-C substituted heteroatom-containing alkyl, C-C alkenyl, C-C substituted alkenyl, C-C substituted heteroatom-containing alkenyl, C-C substituted heteroatom-containing alkenyl, Cs-C aryl, Cs-C substituted aryl, Cs-C substituted heteroatom-containing aryl, Cs-C substituted heteroatom-containing aryl, C-C alkoxy, Cs-C aryloxy, carbonyl, thiocarbonyl, and carboxy. More in particular, R7, R8, R9, R10, and R11 are independently selected from hydrogen, C-C alkyl, C-C2 substituted alkyl, C-C substituted heteroatom-containing alkyl, C-C substituted heteroatom-containing alkyl, C-C alkenyl, C-C substituted alkenyl, C-C substituted heteroatom-containing alkenyl, C-C substituted heteroatom-containing alkenyl, Cs-C aryl, Cs-C substituted aryl, Cs-C substituted heteroatom-containing aryl, Cs-C substituted heteroatom-containing aryl, C-C alkoxy, Cs-C aryloxy, carbonyl, thiocarbonyl, and carboxy.

[0118] Oxidizing agents known or expected to react with the target site of a compound of Formula (III) include but are not limited to oxygenases or variants thereof.

[0119] In some embodiments, the oxygenase can be a non-heme monooxygenase or a variant thereof, a heme-containing monooxygenase or a variant thereof, or a peroxygenase or a variant thereof, such as any of the heme-containing monooxygenases, non-heme-containing monooxygenases and peroxygenases herein disclosed. In particular, the oxygenase can be any of the P450 monooxygenases and P450 peroxygenases herein disclosed.

[0120] In some embodiments, the oxygenase or variant thereof can be a P450 oxygenase including CYP102A1var1 (SEQ ID NO:21), CYP102A1var2 (SEQ ID NO:22), CYP102A1var3 (SEQ ID NO:23), CYP102A1var3-2 (SEQ ID NO:24), CYP102A1var3-6 (SEQ ID NO:28), CYP102A1var3-5 (SEQ ID NO:27), CYP102A1var3-8 (SEQ ID NO:30), CYP102A1var3-9 (SEQ ID NO:31), CYP102A1var3-11 (SEQ ID NO:33), CYP102A1var3-17 (SEQ ID NO:39), CYP102A1var5 (SEQ ID NO:47), CYP102A1var6 (SEQ ID NO:48), CYP102A1var7 (SEQ ID NO:49), CYP102A1var8 (SEQ ID NO:50), CYP101A1var1 (SEQ ID NO:65), CYP101A1var2-1 (SEQ ID NO:67), CYP101A1var2-3 (SEQ ID NO:69), CYP2C19 (SEQ ID NO:16) and/or CYP2D6 (SEQ ID NO:17). In particular, in those embodiments at least one of said oxygenases or variants thereof is expected to activate the target site of a compound of Formula (III) by introducing an oxygen-containing functional group in the form of a hydroxyl group. In these embodiments, the final products resulting from the application of the systems and methods herein disclosed can be (R,RCF C(R)=CRR), (RCF C(R)-CRR) or (RCF C(R)=CRR).

[0121] In some embodiments of the methods and systems herein disclosed wherein the organic molecule is a compound of Formula (III) in which R7=H, R. -CH, R-n-CH

and Rs and R are linked to form a substituted 5-member ring, activation can be performed by reacting the substrate with oxygenases such as CYP102A1var2 (SEQ ID NO:22), CYP102A1var3 (SEQ ID NO:23), CYP102A1var3-2 (SEQ ID NO:24), CYP102A1var3-6 (SEQ ID NO:28), CYP102A1var3-5 (SEQ ID NO:27), CYP102A1var3-8 (SEQ ID NO:30), and CYP102A1var3-9 (SEQ ID NO:31), resulting in the introduction of a hydroxyl group in the target site as illustrated in Example 5 and corresponding scheme 5.

[0122] In some embodiments of the methods and systems herein disclosed wherein the organic molecule is a compound of Formula (III), in which R=H, R=H, R10=CH, Rs and R are linked to form a substituted 6-member ring, activation can be performed by reacting the substrate with oxygenase CYP101A1var2-3 (SEQ ID NO:69), resulting in the introduction of a hydroxyl group in the target site, as in the case of α-pinene.

[0123] In some embodiments of the methods and systems herein disclosed wherein the organic molecule is a compound of Formula (III), in which R=Rs=R10=H, R and R are connected through a substituted 5-member ring, activation can be performed by reacting the organic molecule with an oxygenase such as CYP102A1var3-2 (SEQ ID NO:24), resulting in the introduction of a hydroxyl group in the target site as illustrated by Example 6 and corresponding scheme 6.

[0124] In some embodiments of the methods and systems herein disclosed wherein the organic molecule is a compound of Formula (III), in which Ro R= -CH, R=R, H, and R, substituted Cs alkenyl, activation can be performed by reacting the substrate with oxygenases such as CYP2C19 (SEQ ID NO:16) and CYP2D6 (SEQ ID NO:17), as in the case of linalool.

[0125] In some embodiments of the methods and systems herein disclosed wherein the organic molecule is a compound of Formula (III), in which R7=H, R and R are linked together to form a 6-membered aromatic ring, R and Ro are linked together to form a 5-carbon cyclic alkenyl, activation can be performed by reacting the substrate with oxygenases CYP102A1var5 (SEQ ID NO:47), CYP102A1var6 (SEQ ID NO:48), and/or CYP102A1var7 (SEQ ID NO:49), resulting in the introduction of a hydroxyl group in the target site, as in the case of acenaphthene.

[0126] In some embodiments, the organic molecule has the structure of formula (IV)

(IV)

in which the C is the target site, Ar can be a C-C aryl, Cs-C substituted aryl, Cs-C substituted heteroatom-containing aryl or Cs-C substituted heteroatom-containing aryl, while R12 and R13 are independently selected from the group consisting of hydrogen, aliphatic, aryl, substituted aliphatic, substituted aryl, heteroatom-containing aliphatic, heteroatom-containing aryl, substituted heteroatom-containing aliphatic, substituted heteroatom-containing aryl, alkoxy, aryloxy, and functional groups (FG) or are taken together to form a ring, such that the carbon atom is a secondary or tertiary carbon atom.

[0127] In particular, the substituent Ar of Formula (IV) can be Cs-C aryl, Cs-C substituted aryl, Cs-C substituted heteroatom-containing aryl, or Cs-C substituted heteroatom-containing aryl, while R12 and R13 are independently selected from hydrogen, C-C alkyl, C-C substituted alkyl, C-C substituted heteroatom-containing alkyl, C-C substituted heteroatom-containing alkyl, C-C alkenyl, C-C substituted alkenyl, C-C substituted heteroatom-containing alkenyl, C-C substituted heteroatom-containing alkenyl, Cs-C aryl, Cs-C substituted aryl, Cs-C substituted heteroatom-containing aryl, C-C substituted heteroatom-containing aryl, C2-C alkoxy, Cs-C aryloxy, carbonyl, thiocarbonyl, and carboxy.

[0128] Oxidizing agents known or expected to react with the target site of a compound of Formula (IV) include but are not limited to oxygenases or variants thereof.

[0129] In some embodiments, the oxygenase can be a non-heme monooxygenase or a variant thereof, a heme-containing monooxygenase or a variant thereof, or a peroxygenase or a variant thereof, such as any of the heme-containing monooxygenases, non-heme-containing monooxygenases and peroxygenases herein disclosed. In particular, the oxygenase can be any of the P450 monooxygenases and P450 peroxygenases herein disclosed.

[0130] In some embodiments, the oxygenase or variant thereof can be an oxygenase such as CYP102A1 (SEQ ID NO:2), CYP102A1var4 (SEQ ID NO:46), CYP102A1var5 (SEQ ID NO:47), CYP102A1var6 (SEQ ID NO:48), CYP102A1var7 (SEQ ID NO:49), CYP102A1var1 (SEQ ID NO:21), CYP102A1var2 (SEQ ID NO:22), CYP102A1var3 (SEQ ID NO:23), CYP102A1var3-2 (SEQ ID NO:24), CYP102A1var3-3 (SEQ ID NO:25), CYP102A1var3-4 (SEQ ID NO:26), CYP102A1var3-5 (SEQ ID NO:27), CYP102A1var3-7 (SEQ ID NO:29), CYP102A1var3-8 (SEQ ID NO:30), CYP102A1var3-9 (SEQ ID NO:31), CYP102A1var3-17 (SEQ ID NO:39), CYP102A1var8 (SEQ ID NO:50), CYP101A1var2-1 (SEQ ID NO:67), and/or CYP101A1var2-3 (SEQ ID NO:69). In particular, in those embodiments at least one of said oxygenases or variants thereof is expected to activate the target site of a compound of Formula (IV) by introducing an oxygen-containing functional group in the form of a hydroxyl group. In these embodiments, the final products resulting from the application of the systems and methods herein disclosed can be RRArC-F, RArCF, or R.ArCF.

[0131] In some embodiments of the methods and systems herein disclosed wherein the organic molecule is a compound of Formula (IV), in which Ar=para-substituted phenyl, R=H, R=-iPr, activation can be performed by reacting the organic molecule with an oxygenase such as a P450 monooxygenase, including CYP102A1 (SEQ ID NO:2) and CYP102A1var5 (SEQ ID NO:47), which results in the introduction of a hydroxyl group in the target site as illustrated in Example 10 and corresponding scheme 10.

[0132] In some embodiments of the methods and systems herein disclosed wherein the organic molecule is a compound of Formula (IV), in which Ar=para-, ortho- or meta-substituted phenyl (where the substituent is halo, -CH3, or -OCH), R=H, R=-COOR, where R is C-C n-alkyl, activation can be performed by reacting the substrate with oxygenase CYP102A1var5 (SEQ ID NO:47), CYP102A1var3 (SEQ ID NO:23), and CYP102A1var3-7 (SEQ ID NO:29), as illustrated in Examples 1, 2, 3 and 4 and corresponding schemes 1, 2, 3, and 4.
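By way of illustration only, the substrate-specific pairings named in the preceding paragraphs (myristic acid, α-thujone, α-pinene, linalool and acenaphthene) could be collected in a simple lookup table. The following is a minimal sketch, not part of the disclosed methods; the names `EXAMPLE_SUBSTRATE_OXYGENASES`, `candidate_oxygenases` and `substrates_for` are hypothetical, and only the enzyme names given in the text for those five example substrates are included.

```python
# Hypothetical lookup table (illustration only): example substrates named in
# the text, mapped to the oxygenases the text lists for each of them.
EXAMPLE_SUBSTRATE_OXYGENASES = {
    "myristic acid": ["CYP152A1", "CYP152B1"],
    "alpha-thujone": [
        "CYP101A1", "CYP102A1", "CYP1A2", "CYP2C9",
        "CYP2C19", "CYP2D6", "CYP2E1", "CYP3A4",
    ],
    "alpha-pinene": ["CYP101A1var2-3"],
    "linalool": ["CYP2C19", "CYP2D6"],
    "acenaphthene": ["CYP102A1var5", "CYP102A1var6", "CYP102A1var7"],
}


def candidate_oxygenases(substrate: str) -> list[str]:
    """Return the oxygenases listed for an example substrate (empty if unknown)."""
    return list(EXAMPLE_SUBSTRATE_OXYGENASES.get(substrate.lower(), []))


def substrates_for(oxygenase: str) -> list[str]:
    """Return the example substrates for which a given oxygenase is listed."""
    return [
        substrate
        for substrate, enzymes in EXAMPLE_SUBSTRATE_OXYGENASES.items()
        if oxygenase in enzymes
    ]
```

A table of this kind only records which enzymes are named for which example substrate; it makes no statement about yield or selectivity, which would have to be established experimentally.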

[0133] In some embodiments of the methods and systems herein disclosed wherein the organic molecule is a compound of Formula (IV), in which Ra=H, Ar is ortho-substituted phenyl, and R is linked to Ar through a phenyl moiety, activation can be performed by reacting the substrate with oxygenases CYP102A1var6 (SEQ ID NO:48) and CYP102A1var8 (SEQ ID NO:50), resulting in the introduction of a hydroxyl group in the target site, as in the case of fluorene.

[0134] In some embodiments of the methods and systems herein disclosed wherein the organic molecule is a compound of Formula (IV), in which Ra=H, Ar is ortho-substituted phenyl, and R is linked to Ar through a 2-methylene bridge, activation can be performed by reacting the substrate with oxygenase CYP102A1var5 (SEQ ID NO:47), resulting in the introduction of a hydroxyl group in the target site, as in the case of indan.

[0135] In some embodiments the organic molecule has the structure of formula (V),

(V)

in which X is the target site C atom, and R14, R15, R16, R17, R18 are independently selected from the group consisting of hydrogen, aliphatic, aryl, substituted aliphatic, substituted aryl, heteroatom-containing aliphatic, heteroatom-containing aryl, substituted heteroatom-containing aliphatic, substituted heteroatom-containing aryl, alkoxy, aryloxy, and functional groups (FG) or are taken together to form a ring, such that the carbon atom is a secondary or tertiary carbon atom.

[0136] In particular, the substituents R14, R15, R16, R17, and R18 of Formula (V) can be independently selected from hydrogen, C-C alkyl, C-C substituted alkyl, C-C substituted heteroatom-containing alkyl, C-C substituted heteroatom-containing alkyl, C-C alkenyl, C-C substituted alkenyl, C-C substituted heteroatom-containing alkenyl, C-C substituted heteroatom-containing alkenyl, Cs-C aryl, Cs-C substituted aryl, Cs-C substituted heteroatom-containing aryl, Cs-C substituted heteroatom-containing aryl, carbonyl, thiocarbonyl, and carboxy. More in particular, R14, R15, R16, R17, and R18 are independently selected from hydrogen, C-C alkyl, C-C substituted alkyl, C-C substituted heteroatom-containing alkyl, C-C substituted heteroatom-containing alkyl, C-C alkenyl, C-C substituted alkenyl, C-C substituted heteroatom-containing alkenyl, C-C substituted heteroatom-containing alkenyl, Cs-C aryl, Cs-C substituted aryl, Cs-C substituted heteroatom-containing aryl, Cs-C substituted heteroatom-containing aryl, C-C alkoxy, Cs-C aryloxy, carbonyl, and carboxy.

[0137] Oxidizing agents known or expected to react with the target site of a compound of Formula (V) include but are not limited to oxygenases or variants thereof.

[0138] In some embodiments, the oxygenase can be a non-heme monooxygenase or a variant thereof, a heme-containing monooxygenase or a variant thereof, or a peroxygenase or a variant thereof, such as any of the heme-containing monooxygenases, non-heme-containing monooxygenases and peroxygenases herein disclosed. In particular, the oxygenase can be any of the P450 monooxygenases and P450 peroxygenases herein disclosed.

[0139] In some embodiments, the oxygenase or variant thereof can be CYP102A1var8 (SEQ ID NO:50), CYP102A1var3-2 (SEQ ID NO:24), CYP102A1var3-3 (SEQ ID NO:25), CYP102A1var3-5 (SEQ ID NO:27), CYP102A1var3-6 (SEQ ID NO:28), CYP102A1var3-9 (SEQ ID NO:31), CYP102A1var3-11 (SEQ ID NO:33), CYP102A1var3-16 (SEQ ID NO:38), CYP102A1var3-19 (SEQ ID NO:41), CYP102A1var3-18 (SEQ ID NO:40), CYP102A1var3-2 (SEQ ID NO:24), CYP102A1var3-3 (SEQ ID NO:25), CYP102A1var3-14 (SEQ ID NO:36), CYP102A1var3-15 (SEQ ID NO:37), CYP102A1var3-17 (SEQ ID NO:39), CYP102A1var3-9 (SEQ ID NO:31), CYP101A1var2-3 (SEQ ID NO:69), and/or CYP3A4 (SEQ ID NO:20). In particular, in those embodiments, at least one of said oxygenases or variants thereof is expected to activate a compound of Formula (V) by affording a hydroxyl group at the target site. In these embodiments, the final product resulting from the application of the systems and methods herein disclosed can be R14R15R16C-F.

[0140] In some embodiments of the methods and systems herein disclosed wherein the organic molecule is a compound of Formula (V), in which R, Rs, R7, Rs are hydrogen, and R is 2-methyl-5-phenyl-4,5-dihydrooxazolyl, activation can be performed by reacting the substrate with oxygenases CYP102A1var3-5 (SEQ ID NO:27), CYP102A1var3-6 (SEQ ID NO:28), CYP102A1var3-11 (SEQ ID NO:33), CYP102A1var3-16 (SEQ ID NO:38), CYP102A1var3-19 (SEQ ID NO:41), and CYP102A1var3-18 (SEQ ID NO:40), resulting in a hydroxyl group at the target site as illustrated in Example 12 and corresponding scheme 12.

[0141] In some embodiments of the methods and systems herein disclosed wherein the organic molecule is a compound of Formula (V), in which R, R, R, R are hydrogen, and R is 2,3,4,5-tetramethoxy-tetrahydro-2H-pyranyl, activation can be performed by reacting the substrate with oxygenases such as CYP102A1var3-2 (SEQ ID NO:24), CYP102A1var3-3 (SEQ ID NO:25), CYP102A1var3-14 (SEQ ID NO:36), CYP102A1var3-15 (SEQ ID NO:37), CYP102A1var3-17 (SEQ ID NO:39), and CYP102A1var3-9 (SEQ ID NO:31), resulting in a hydroxyl group at the target site as illustrated in Example 13 and corresponding scheme 13.

[0142] In some embodiments of the methods and systems herein disclosed wherein the organic molecule is a compound of Formula (V), in which Ra=CN, R=6-dimethylamino-naphthyl, R=R7=H, Rs=H, or hydrogen, activation can be performed by reacting the substrate with oxygenases such as CYP102A1var8 (SEQ ID NO:50) and CYP3A4 (SEQ ID NO:20), as in the case of α-cyano-naphthyl ethers.

[0143] In some embodiments the organic molecule has the structure of formula (VI)

(VI)

in which X is the target site C atom, and R19, R20, R21, R22 are independently selected from the group consisting of hydrogen, aliphatic, aryl, substituted aliphatic, substituted aryl, heteroatom-containing aliphatic, heteroatom-containing aryl, substituted heteroatom-containing aliphatic, substituted

heteroatom-containing aryl, and functional groups (FG) or are taken together to form a ring representing in this case a cycloalkenyl, substituted cycloalkenyl, heteroatom-containing cycloalkenyl, or a substituted heteroatom-containing cycloalkenyl derivative.

[0144] In particular, the substituents R19, R20, R21, and R22 of Formula (VI) are independently selected from hydrogen, C-C alkyl, C-C substituted alkyl, C-C substituted heteroatom-containing alkyl, C-C substituted heteroatom-containing alkyl, C-C alkenyl, C-C substituted alkenyl, C-C substituted heteroatom-containing alkenyl, C-C substituted heteroatom-containing alkenyl, Cs-C aryl, Cs-C substituted aryl, Cs-C substituted heteroatom-containing aryl, Cs-C substituted heteroatom-containing aryl, carbonyl, thiocarbonyl, carboxy, and substituted amino. More in particular, R19, R20, R21, and R22 are independently selected from hydrogen, C-C alkyl, C-C substituted alkyl, C-C substituted heteroatom-containing alkyl, C-C substituted heteroatom-containing alkyl, C-C alkenyl, C-C substituted alkenyl, C-C substituted heteroatom-containing alkenyl, C-C substituted heteroatom-containing alkenyl, Cs-C aryl, Cs-C substituted aryl, Cs-C substituted heteroatom-containing aryl, C-C substituted heteroatom-containing aryl, carbonyl, and carboxy.

[0145] Oxidizing agents known or expected to react with the target site of a compound of Formula (VI) include but are not limited to oxygenases or variants thereof.

[0146] In some embodiments, the oxygenase can be a non-heme monooxygenase or a variant thereof, a heme-containing monooxygenase or a variant thereof, or a peroxygenase or a variant thereof, such as any of the heme-containing monooxygenases, non-heme-containing monooxygenases and peroxygenases herein disclosed. In particular, the oxygenase can be any of the P450 monooxygenases and P450 peroxygenases herein disclosed.

[0147] In some embodiments, the oxygenase or variant thereof can be CYP102A1 (SEQ ID NO:2), CYP102A1var1 (SEQ ID NO:21), CYP102A1var2 (SEQ ID NO:22), CYP102A1var3 (SEQ ID NO:23), CYP102A1var3-18 (SEQ ID NO:40), CYP102A1var5 (SEQ ID NO:47), CYP102A1var4 (SEQ ID NO:46), CYP102A1var3-21 (SEQ ID NO:43), CYP102A1var3-22 (SEQ ID NO:44), CYP102A1var3-23 (SEQ ID NO:45), CYP102A1var9 (SEQ ID NO:51), CYP102A1var9-1 (SEQ ID NO:52), and/or toluene dioxygenase. In particular, in those embodiments at least one of said oxygenases or variants thereof is expected to activate a compound of Formula (VI) by introducing an oxygen-containing functional group in the form of an epoxy group. In these embodiments, the final products resulting from the application of the systems and methods herein disclosed can be (RRC(OH)-CFR-R), (RoRoCF C(OH)RR), or (RoRoCF CFR-R).

[0148] Additional oxidizing agents that are expected to react with the target site of a compound of Formula (VI) include but are not limited to dioxygenases such as toluene dioxygenase. More specifically, dioxidizing agents are expected to activate a compound of Formula (VI) by introducing an oxygen-containing functional group in the form of a vicinal diol. In these embodiments, the final products resulting from the application of the systems and methods herein disclosed can be (RRC(OH)-CFR-R), (RRCF C(OH)RR), or (RRCF CFR-R).

[0149] In some embodiments of the methods and systems herein disclosed wherein the organic molecule is a compound of Formula (VI), in which Ro-Ro-R=H, R. n-butyl, activation through epoxidation can be performed by reacting the substrate with oxygenases such as CYP102A1var1 (SEQ ID NO:21), CYP102A1var3-21 (SEQ ID NO:43), CYP102A1var3-22 (SEQ ID NO:44), and CYP102A1var3-23 (SEQ ID NO:45), resulting in the introduction of an epoxide functional group at the target site, as in the case of 1-hexene.

[0150] In some embodiments of the methods and systems herein disclosed wherein the organic molecule is a compound of Formula (VI), in which Ro-Ro-R=H, R. phenyl, activation through epoxidation can be performed by reacting the substrate with oxygenases var1, CYP1A2 (SEQ ID NO:13), CYP102A1var9 (SEQ ID NO:51) or CYP102A1var9-1 (SEQ ID NO:52), resulting in the introduction of an epoxide functional group at the target site, as in the case of styrene.

[0151] In some embodiments of the methods and systems herein disclosed wherein the organic molecule is a compound of Formula (VI), in which Ro-R=H, R. and R are connected together through 4 methylene units so as to form a 6-membered ring, activation through epoxidation can be performed by reacting the substrate with oxygenases of the CYP153 family, such as CYP153A6 (SEQ ID NO:54), CYP153A7 (SEQ ID NO:55), CYP153A8 (SEQ ID NO:56), CYP153A11 (SEQ ID NO:57), and CYP153D2 (SEQ ID NO:58), resulting in the introduction of an epoxide functional group at the target site, as in the case of cyclohexene.

[0152] In some embodiments of the methods and systems herein disclosed wherein the organic molecule is a compound of Formula (VI), in which Ro-R-H, -Ro n-pentyl, R=Co-alkenyl, activation through epoxidation can be performed by reacting the substrate with oxygenase CYP102A1 (SEQ ID NO:2), resulting in the introduction of an epoxide functional group at the target site, as in the case of linolenic acid.

[0153] In some embodiments of the methods and systems herein disclosed wherein the organic molecule is a compound of Formula (VI), in which Ro-R=H, Ro and R are linked together to form a 6-membered substituted or non-substituted aromatic ring, activation can be performed by reacting the substrate with toluene dioxygenase, resulting in the introduction of an oxygen-containing functional group in the form of a vicinal diol. In those embodiments, the oxygen-containing functional group will have the form of an epoxy group (C-(O)-C), that is, an oxygen atom joined by single bonds to two adjacent carbon atoms so as to form a three-membered ring.

[0154] In some embodiments, the oxidizing agent suitable to activate an organic molecule including a target site with the methods and systems herein disclosed can be identified by (a) providing the organic molecule, (b) providing an oxidizing agent, (c) contacting the oxidizing agent with the organic molecule for a time and under conditions to allow the introduction of an oxygen-containing functional group on the target site; (d) detecting the oxygen-containing functional group on the target site of the organic molecule resulting from step (c), and repeating steps (a) to (d) until an oxygen-containing functional group is detected on the target site. In particular, one or more oxidizing agents can be provided under step (b) of the method herein disclosed.

[0155] In particular, in embodiments wherein the organic molecule is a molecule of formula (I), (II), (III), and (IV), detecting the oxygen-containing functional group on the target site can be performed by: (e) isolating the organic molecule resulting from step (c), for example by a separation method or a combination of separation methods, including but not limited to extraction, chromatography, distillation, precipitation, sublimation, and crystallization; and (f) characterizing the isolated organic molecule resulting from step (c) to identify the oxygen-containing functional group, for example by a characterization method or a combination of methods, including but not limited to spectroscopic or spectrometric techniques, preferably a combination of two or more spectroscopic or spectrometric techniques, including UV-VIS spectroscopy, fluorescence spectroscopy, IR spectroscopy, 1H-NMR, 13C-NMR, 2D-NMR, 3D-NMR, GC-MS, LC-MS, and MS-MS.

[0156] In particular, in embodiments wherein the organic molecule is a molecule of formula (V), detecting the oxygen-containing functional group on the target site can be performed by monitoring the removal of the -CHR17R18 moiety associated with the introduction of an oxygen-containing functional group in the target site. In those embodiments, monitoring the removal of the -CHR17R18 moiety can be performed by (g) contacting the organic molecule resulting from step (c) with a reagent that can react with an aldehyde (R-CHO), a ketone (R-C(O)-R), a dicarbonyl (R-C(O)-C(O)-R), or a glyoxal (R-C(O)-CHO) functional group; and (h) detecting the formation of an adduct or a complex between the reagent and an aldehyde, ketone, dicarbonyl, or glyoxal in the organic molecule, the aldehyde, ketone, dicarbonyl, or glyoxal resulting from the removal of the -CHR17R18 moiety.

[0157] Detecting the formation of an adduct or complex can be performed by spectroscopic (colorimetric, fluorimetric) or chromatographic methods and additional methods identifiable by a skilled person upon reading of the present disclosure.

[0158] Reagents that can react with an aldehyde, ketone, dicarbonyl, or glyoxal and are suitable for the methods and systems described herein include but are not limited to 4-amino-3-hydrazino-5-mercapto-1,2,4-triazole (4-amino-5-hydrazino-1,2,4-triazole-3-thiol; Purpald), (pentafluorobenzyl)-hydroxylamine, p-nitrophenyl-hydrazine, 2,4-dinitrophenyl-hydrazine, 3-methylbenzothiazolin-2-one hydrazone, diethyl acetonedicarboxylate and ammonia, cyclohexane-1,3-dione and ammonia, m-phenylenediamine, p-aminophenol, 3,5-diaminobenzoic acid, p-dimethylamino-aniline, m-dinitrobenzene, o-phenylenediamine, and the like.

[0159] In some embodiments, a plurality of oxidizing agents can be provided to identify a suitable oxidizing agent in the methods and systems herein disclosed. In particular, in some of those embodiments wherein the organic molecule is an organic molecule of general formula (I), (II), (III), (IV) and (V), a pool of oxidizing agents, for example a library of engineered P450s, e.g. in a 96-well plate, can be provided. In particular, in embodiments wherein the organic molecule is an organic molecule of general formula (I), (II), (III) and (IV), isolating the organic molecule resulting from step (c) can be performed by extracting the reaction mixture with organic solvent, and characterizing the oxygen-containing functional group in the organic molecule can be performed by GC analysis of the extraction solution. In some of those embodiments, selected mixtures of oxidizing agent and co-reagents (e.g. cofactors, oxygen) which gave rise to the largest amount of activated products for a given organic molecule can be repeated at a larger scale. The activated products can be subsequently isolated by a suitable technique including liquid chromatography and identified by 1H-, 13C-NMR, and MS and additional techniques identifiable by a skilled person. Examples of those embodiments are provided in the Examples section and illustrated in FIGS. 5 and 6.

[0160] In embodiments wherein the organic molecule is an organic molecule of general formula (V) wherein R is 2-methyl-5-phenyl-4,5-dihydrooxazolyl and R=R=R=Rs=H, upon contacting a library of engineered P450 monooxygenases (oxidizing agents) the oxygen-containing functional group can be detected using a colorimetric reagent (e.g. Purpald) and measuring the change in absorbance (e.g. at 550 nm on a microtiter plate reader). In embodiments wherein the organic molecule is an organic molecule of general formula (V) wherein R is 2,3,4,5-tetramethoxytetrahydro-2H-pyranyl and R=R=R=Rs=H, upon contacting a library of engineered P450 monooxygenases, the oxygen-containing functional group can also be detected using a colorimetric reagent (e.g. Purpald) and measuring the change in absorbance (e.g. at 550 nm on a microtiter plate reader).

[0161] In some embodiments, the isolated and characterized organic molecule that includes the oxygen-containing functional group at the target site can be used as an authentic standard for high-throughput screening of other, more suitable oxidizing agents, or for improvement of reaction conditions for the activation reaction. In exemplary embodiments, high-throughput screening can be carried out by performing the activation reaction in a multi-well plate, typically a 96-well or 384-well plate, each well containing the candidate organic molecule, the oxidizing agent, and the co-reagents (e.g. cofactors, oxygen) required for the reaction to proceed, and detecting the activation of the target site using one of the following techniques: UV-VIS spectroscopy, fluorimetry, IR, LC, GC, GC-MS, LC-MS, or a combination thereof, according to the nature and properties of the candidate organic molecule and the activated product.

[0162] In some embodiments, an oxygenase that oxidizes a pre-determined organic molecule in a target site is provided by (i) providing a candidate oxygenase, (j) mutating the candidate oxygenase to generate a mutant or variant oxygenase, (k) contacting the variant oxygenase with the pre-determined organic molecule for a time and under conditions to allow detection of an oxygen-containing functional group on the target site, (l) detecting the introduction of the oxygen-containing functional group on the target site, and repeating steps (i) to (l) until formation of an oxygen-containing functional group is detected.

[0163] In some embodiments, mutating the candidate oxygenase can be performed by laboratory evolutionary methods and/or rational design methods, using one or a combination of techniques such as random mutagenesis, site-saturation mutagenesis, site-directed mutagenesis, DNA shuffling, DNA recombination, and additional techniques identifiable by a skilled person. In particular, mutating a candidate oxygenase can be performed by targeting one or more of the amino acid residues comprised in the oxygenase's nucleotidic or amino acidic primary sequence to provide a mutant or variant polynucleotide or polypeptide.

[0164] In general, the term "mutant" or "variant" as used herein with reference to a molecule such as a polynucleotide or polypeptide indicates a molecule that has been mutated from the molecule as it exists in nature. In particular, the terms "mutate" and "mutation" as used herein indicate any modification of a nucleic acid and/or polypeptide which results in an altered nucleic acid or polypeptide. Mutations include any process or mechanism resulting in a mutant protein, enzyme, polynucleotide, gene, or cell. This includes any mutation in which a polynucleotide or polypeptide sequence is altered, as well as any detectable change in a cell wherein the mutant polynucleotide or polypeptide is expressed arising from such a mutation. Typically, a mutation occurs in a polynucleotide or gene sequence, by point mutations, deletions, or insertions of single or multiple nucleotide residues. A mutation in a polynucleotide includes mutations arising within a protein-encoding region of a gene as well as mutations in regions outside of a protein-encoding sequence, such as, but not limited to, regulatory or promoter sequences. A mutation in a coding polynucleotide such as a gene can be "silent", i.e., not reflected in an amino acid alteration upon expression, leading to a "sequence-conservative" variant of the gene. A mutation in a polypeptide includes but is not limited to mutation in the polypeptide sequence and mutation resulting in a modified amino acid.

Smith 1983; Zoller and Smith 1987; Zoller 1992), phosphorothioate-modified DNA mutagenesis (Taylor, Schmidt et al. 1985; Nakamaye and Eckstein 1986; Sayers, Schmidt et al. 1988), mutagenesis using gapped duplex DNA (Kramer, Drutsa et al. 1984; Kramer and Fritz 1987), point mismatch, mutagenesis using repair-deficient host strains, deletion mutagenesis (Eghtedarzadeh and Henikoff 1986), restriction-selection and restriction-purification (Braxton and Wells 1991), mutagenesis by total gene synthesis (Nambiar, Stackhouse et al. 1984; Grundstrom, Zenke et al. 1985; Wells, Vasser et al. 1985), double-strand break repair (Mandecki 1986), and the like. Additional details on many of the above methods can be found in Methods in Enzymology Volume 154, which also describes useful controls for trouble-shooting problems with various mutagenesis methods.
Non-limiting examples of a modified amino acid 0168 Additional details regarding the methods to gener include a glycosylated amino acid, a Sulfated amino acid, a ate variants of naturally-occurring sequences can be found in prenylated (e.g., farnesylated, geranylgeranylated) amino the following U.S. patents, PCT publications, and EPO pub acid, an acetylated amino acid, an acylated amino acid, a lications: U.S. Pat. No. 5,605,793 to Stemmer (Feb. 25, PEGylated amino acid, a biotinylated amino acid, a carboxy 1997), “Methods for In vitro Recombination.” U.S. Pat. No. lated amino acid, a phosphorylated amino acid, and the like. 5,811.238 to Stemmer et al. (Sep. 22, 1998) “Methods for References adequate to guide one of skill in the modification Generating Polynucleotides having Desired Characteristics of amino acids are replete throughout the literature. Example by Iterative Selection and Recombination: U.S. Pat. No. 5.830,721 to Stemmer et al. (Nov. 3, 1998), “DNA Mutagen protocols are found in Walker (1998) Protein Protocols on esis by Random Fragmentation and Reassembly: U.S. Pat. CD-ROM (Humana Press, Towata, N.J.). No. 5,834.252 to Stemmer, et al. (Nov. 10, 1998) “End 0.165. A mutant or engineered protein or enzyme is usu Complementary Polymerase Reaction.” U.S. Pat. No. 5,837, ally, although not necessarily, expressed from a mutant poly 458 to Minshull, et al. (Nov. 17, 1998), “Methods and Com nucleotide or gene. Engineered cells can be obtained by intro positions for Cellular and Metabolic Engineering:” WO duction of an engineered gene or part of it in the cell. 
The 95/22625, Stemmer and Crameri, “Mutagenesis by Random terms “engineered cell”, “mutant cell' or “recombinant cell Fragmentation and Reassembly:” WO 96/33207 by Stemmer as used herein refer to a cell that has been altered or derived, and Lipschutz "End Complementary Polymerase Chain or is in some way different or changed, from a parent cell, Reaction:” WO97/20078 by Stemmer and Crameri “Meth including a wild-type cell. The term “recombinant as used ods for Generating Polynucleotides having Desired Charac herein with reference to a cell in alternative to “wild-type' or teristics by Iterative Selection and Recombination: WO “native', indicates a cell that has been engineered to modify 97/35966 by Minshull and Stemmer, “Methods and Compo the genotype and/or the phenotype of the cell as found in sitions for Cellular and Metabolic Engineering:” WO nature, e.g., by modifying the polynucleotides and/or 99/41402 by Punnonen et al. “Targeting of Genetic Vaccine polypeptides expressed in the cell as it exists in nature. A Vectors:” WO99/41383 by Punnonen et al. “Antigen Library “wild-type cell refers instead to a cell which has not been Immunization:” WO99/41369 by Punnonen et al. “Genetic engineered and displays the genotype and phenotype of said Vaccine Vector Engineering.” WO99/41368 by Punnonen et cell as found in nature. al. “Optimization of Immunomodulatory Properties of 0166 The term “engineer refers to any manipulation of a Genetic Vaccines:” EP 752008 by Stemmer and Crameri, molecule or cell that result in a detectable change in the “DNA Mutagenesis by Random Fragmentation and Reas molecule or cell, wherein the manipulation includes but is not sembly;” EP 0932670 by Stemmer “Evolving Cellular DNA limited to inserting a polynucleotide and/or polypeptide het Uptake by Recursive Sequence Recombination: WO erologous to the cell and mutating a polynucleotide and/or 99/23107 by Stemmer et al., “Modification of Virus Tropism polypeptide native to the cell. 
Engineered cells can also be and Host Range by Viral Genome Shuffling.” WO99/21979 obtained by modification of the cell genetic material, lipid by Apt et al., “Human Papillomavirus Vectors: WO distribution, or protein content. In addition to recombinant 98/31837 by del Cardayre et al. “Evolution of Whole Cells production, the enzymes may be produced by direct peptide and Organisms by Recursive Sequence Recombination: WO synthesis using Solid-phase techniques, such as Solid-Phase 98/27230 by Patten and Stemmer, “Methods and Composi Peptide Synthesis. Peptide synthesis may be performed using tions for Polypeptide Engineering.” WO 98/13487 by Stem manual techniques or by automation. Automated synthesis mer et al., “Methods for Optimization of Gene Therapy by may be achieved, for example, using Applied Biosystems Recursive Sequence Shuffling and Selection:” WO00/00632, 431A Peptide Synthesizer (PerkinElmer, Foster City, Calif.) “Methods for Generating Highly Diverse Libraries:” WO in accordance with the instructions provided by the manufac 00/09679, “Methods for Obtaining in vitro Recombined turer Polynucleotide Sequence Banks and Resulting Sequences: 0167 Variants of naturally-occurring sequences can be WO 98/42832 by Arnold et al., “Recombination of Poly generated by site-directed mutagenesis (Botstein and Shortle nucleotide Sequences. Using Random or Defined Primers: 1985; Smith 1985; Carter 1986; Dale and Felix 1996; Ling WO99/29902 by Arnold et al., “Method for Creating Poly and Robinson 1997), mutagenesis using containing nucleotide and Polypeptide Sequences:” WO 98/41653 by templates (Kunkel, Roberts et al. 1987: Bass, Sorrells et al. Vind, “An in vitro Method for Construction of a DNA 1988), oligonucleotide-directed mutagenesis (Zoller and Library;” WO98/41622 by Borchertet al., “Method for Con US 2009/0061471 A1 Mar. 
5, 2009 structing a Library Using DNA Shuffling.” WO 98/.42727 by used to identify fragments of proteins that can be recombined Pati and Zarling, “Sequence Alterations using Homologous to minimize disruptive interactions that would prevent the Recombination:” WO00/18906 by Patten et al., “Shuffling of protein from folding into its active form. Codon-Altered Genes:” WO 00/04190 by del Cardayre et al. 0171 In some embodiments, activation of a target site in “Evolution of Whole Cells and Organisms by Recursive an organic molecule can be performed in a whole-cell system. Recombination:” WO 00/42561 by Crameri et al., “Oligo To prepare the whole-cell system, the encoding sequence of nucleotide Mediated Nucleic Acid Recombination: WO the oxydizing agent can be introduced into a host cell using a 00/42559 by Selifonov and Stemmer “Methods of Populating Suitable vector, Such as a plasmid, a cosmid, a phage, a virus, Data Structures for Use in Evolutionary Simulations: WO a bacterial artificial chromosome (BAC), a yeast artificial 00/42560 by Selifonov et al., “Methods for Making Character chromosome (YAC), or the like, into which the said sequence Strings, Polynucleotides & Polypeptides Having Desired of the disclosure has been inserted, in a forward or reverse Characteristics:” WO 01/23401 by Welch et al., “Use of orientation. In some embodiments, the construct further com Codon-Varied Oligonucleotide Synthesis for Synthetic Shuf prises regulatory sequences, including, for example, a pro fling:” and WO 01/64864 “Single-Stranded Nucleic Acid moter linked to the sequence. Large numbers of Suitable Template-Mediated Recombination and Nucleic Acid Frag vectors and promoters are known to those of skill in the art, ment Isolation” by Affholter. and are commercially available. 0169. 
In particular, in some embodiments, site-directed 0172 Accordingly, in other embodiments, vectors that mutagenesis can be performed on predetermined residues of include a nucleic acid molecule of the disclosure are pro the oxygenase. These predetermined sites can be identified vided. In other embodiments, host cells transfected with a using the crystal structure of said oxydizing agentifavailable nucleic acid molecule of the disclosure, or a vector that or a crystal structure of a homologous protein that shares at includes a nucleic acid molecule of the disclosure, are pro least 20% sequence identity with said oxydizing agent and an vided. Host cells include eucaryotic cells Such as yeast cells, alignment of the polynucleotide or amino acid sequences of insect cells, or animal cells. Host cells also include procary the oxydizing agent and its homologous protein. The prede otic cells such as bacterial cells. termined sites are chosen among the amino acid residues that 0173. In other embodiments, methods for producing a cell are found within 50 A, preferably within 35 A from the that converts a target molecule into a pre-determined oxygen oxygen-activating site of said oxydizing agent. For example, ated derivative are provided. Such methods generally include: when a cytochrome P450 monooxygenase is to be used as the (a) transforming a cell with an isolated nucleic acid molecule Oxydizing agent, the predetermined site are chosen among the encoding a polypeptide comprising an amino acid sequence amino acid residues that are found within 50 A, preferably set forthin SEQID NO: 2 to SEQIDNO: 70; (b) transforming within 35 A from the heme iron. Mutagenesis of the prede a cell with an isolated nucleic acid molecule encoding a termined sites can be performed changing one, two or three of polypeptide of the disclosure; or (c) transforming a cell with the nucleotides in the codon that encodes for each of the an isolated nucleic acid molecule of the disclosure. 
predetermined amino acids. Mutagenesis of the predeter (0174. The terms “vector”, “vector construct” and “expres mined sites can be performed in the described way so that sion vector” as used herein refer to a vehicle by which a DNA each of the predetermined amino acid is mutated to any of the or RNA sequence (e.g. a foreign gene) can be introduced into other 19 natural amino acids. Substitution of the predeter a host cell, so as to transform the host and promote expression mined sites with unnatural amino acids can be performed (e.g. transcription and translation) of the introduced using methods established in vivo (Wang, Xie et al. 2006), in sequence. Vectors typically comprise the DNA of a transmis vitro (Shimizu, Kuruma et al. 2006), semisynthetic sible agent, into which foreign DNA encoding a protein is (Schwarzer and Cole 2005) or synthetic methods (Camarero inserted by restriction enzyme technology. A common type of and Mitchell 2005) for incorporation of unnatural amino vector is a "plasmid', which generally is a self-contained acids into polypeptides. molecule of double-stranded DNA that can readily accept 0170 Instill further embodiments, libraries of engineered additional (foreign) DNA and which can readily introduced variants can be obtained by laboratory evolutionary methods into a suitable host cell. A large number of vectors, including and/or rational design methods, using one or a combination of plasmid and fungal vectors, have been described for replica techniques such as random mutagenesis, site-saturation tion and/or expression in a variety of eukaryotic and prokary mutagenesis, site-directed mutagenesis, DNA shuffling, otic hosts. 
Non-limiting examples include pKK plasmids DNA recombination, and the like and targeting one or more of (Clonetech), puC plasmids, pET plasmids (Novagen, Inc., the amino acid residues, one at a time or simultaneously, Madison, Wis.), pRSET or pREP plasmids (Invitrogen, San comprised in the oxydizing agent's amino acid sequence. Diego, Calif.), or pMAL plasmids (New England Biolabs, Said libraries can be arrayed on multi-well plates and Beverly, Mass.), and many appropriate host cells, using meth screened for activity on the target molecule using a calori ods disclosed or cited herein or otherwise known to those metric, fluorimetric, enzymatic, or luminescence assay and skilled in the relevant art. Recombinant cloning vectors will the like. For example a method for making libraries for often include one or more replication systems for cloning or directed evolution to obtain P450s with new or altered prop expression, one or more markers for selection in the host, e.g., erties is recombination, or chimeragenesis, in which portions antibiotic resistance, and one or more expression cassettes. of homologous P450s are swapped to form functional chime (0175. The terms “express” and “expression” refers to ras, can use used. Recombining equivalent segments of allowing or causing the information in a gene or DNA homologous proteins generates variants in which every sequence to become manifest, for example producing a pro amino acid Substitution has already proven to be successful in tein by activating the cellular functions involved in transcrip one of the parents. Therefore, the amino acid mutations made tion and translation of a corresponding gene or DNA in this way are less disruptive, on average, than random muta sequence. A DNA sequence is expressed in or by a cell to form tions. A structure-based algorithm, such as SCHEMA, can be an “expression product” such as a protein. The expression US 2009/0061471 A1 Mar. 5, 2009

product itself, e.g. the resulting protein, may also be said to be “expressed” by the cell. A polynucleotide or polypeptide is expressed recombinantly, for example, when it is expressed or produced in a foreign host cell under the control of a foreign or native promoter, or in a native host cell under the control of a foreign promoter.

[0176] Polynucleotides provided herein can be incorporated into any one of a variety of expression vectors suitable for expressing a polypeptide. Suitable vectors include chromosomal, nonchromosomal and synthetic DNA sequences, e.g., derivatives of SV40; bacterial plasmids; phage DNA; baculovirus; yeast plasmids; vectors derived from combinations of plasmids and phage DNA; viral DNA such as vaccinia, adenovirus, fowlpox virus, pseudorabies, adeno-associated viruses, retroviruses and many others. Any vector that transduces genetic material into a cell, and, if replication is desired, which is replicable and viable in the relevant host can be used.

[0177] Vectors can be employed to transform an appropriate host to permit the host to express a protein or polypeptide. Examples of appropriate expression hosts include: bacterial cells, such as E. coli, B. subtilis, Streptomyces, and Salmonella typhimurium; fungal cells, such as Saccharomyces cerevisiae, Pichia pastoris, and Neurospora crassa; insect cells such as Drosophila and Spodoptera frugiperda; mammalian cells such as CHO, COS, BHK, HEK 293 or Bowes melanoma; or plant cells or explants, etc.

[0178] In bacterial systems, a number of expression vectors may be selected depending upon the use intended for the oxidizing polypeptide. For example, such vectors include, but are not limited to, multifunctional E. coli cloning and expression vectors such as BLUESCRIPT (Stratagene), in which the oxidizing agent-encoding sequence may be ligated into the vector in-frame with sequences for the amino-terminal Met and the subsequent 7 residues of beta-galactosidase so that a hybrid protein is produced; pIN vectors; pET vectors; and the like.

[0179] Similarly, in the yeast Saccharomyces cerevisiae a number of vectors containing constitutive or inducible promoters such as alpha factor, alcohol oxidase and PGH may be used for production of the oxidizing agent.

[0180] In some embodiments, the activation of the target site in an organic molecule by an oxidizing agent can be performed using an immobilized oxidizing agent. Immobilization of the oxidizing agent can be carried out through covalent attachment or physical adsorption to a support, entrapment in a matrix, encapsulation, cross-linking of the oxidizing agent's crystals or aggregates, and the like. Several immobilization techniques are known (Bornscheuer 2003; Cao 2005). The type of immobilization and matrix that preserves activity often depends on the nature and physical-chemical properties of the oxidizing agent.

[0181] In any of the above mentioned embodiments, the oxygen-containing functional group introduced on a target site of any of the above molecules is then replaced by fluorine.

[0182] In some embodiments, the fluorination is performed by deoxofluorination of the oxygenated organic molecule.

[0183] The terms “deoxofluorination” and “deoxofluorination reaction” as used herein refer to a chemical reaction where an oxygen-containing chemical unit is replaced with fluorine. Accordingly, the terms “deoxofluorinating agent” and “deoxofluorination agent” as used herein refer to a chemical agent that is able to carry out a deoxofluorination reaction. The term “reagent” as used herein is equivalent to the term “agent”.

[0184] In some embodiments, the fluorination can be performed by ring-opening fluorination of the oxygenated organic molecule.

[0185] The terms “ring-opening fluorination” and “ring-opening fluorination reaction” as used herein refer to a chemical reaction where an epoxide is reacted with a nucleophile, specifically fluoride (F-), to afford a fluorohydrin (-CFR-C(OH)R-) or a vicinal difluoride (-CRF-CRF-) containing derivative. Accordingly, the terms “ring-opening fluorination agent” and “ring-opening fluorinating agent” as used herein refer to a chemical agent that is able to carry out a ring-opening fluorination reaction.

[0186] In particular, the deoxofluorination reaction can be performed using commercially available deoxofluorinating agents such as sulfur tetrafluoride (SF4), DAST (diethylaminosulfur trifluoride, (Middleton 1975), U.S. Pat. No. 3,914,265; U.S. Pat. No. 3,976,691), Deoxo-Fluor (bis-(2-methoxyethyl)-aminosulfur trifluoride, (Lal, Pez et al. 1999), U.S. Pat. No. 6,222,064), DFI (2,2-difluoro-1,3-dimethylimidazolidine, (Hayashi, Sonoda et al. 2002), U.S. Pat. No. 6,632,949), or analogues and derivatives thereof. Other deoxofluorinating agents include XeF2, SiF4, and SeF4. The deoxofluorination reaction can be performed in the presence or in the absence of additional chemical agents that facilitate or enable the deoxofluorination to occur. These additional agents include but are not limited to hydrogen fluoride (HF), Lewis acids, fluoride salts (e.g. CsF, KF, NaF, LiF, BF), crown-ethers, ionic liquids and the like.

[0187] In particular, the ring-opening fluorination reaction can be performed using nucleophilic fluoride-containing agents including without limitations metal fluorides (e.g. CsF, KF, NaF, LiF, AgF, BF), potassium hydrogen difluoride (KHF2), Bu4NH2F3, R3N.nHF, Bu4NF.nHF, Py.9HF (Olah's reagent), and the like. The ring-opening fluorination reaction can be performed in the presence or in the absence of additional chemical agents that facilitate or enable the deoxofluorination to occur. These additional agents include but are not limited to hydrogen fluoride (HF), Lewis acids, fluoride salts (e.g. CsF, KF, NaF, LiF), crown-ethers, ionic liquids and the like.

[0188] Exemplary fluorinations of an organic molecule containing an oxygen-containing group include but are not limited to conversion of a hydroxyl group to a fluoride, a carboxylic acid group to a carbonyl fluoride, an aldehyde group to a gem-difluoride, a keto group to a gem-difluoride, an epoxide group to a fluorohydrin (also called vic-fluoro-alcohol), and an epoxide group to a vic-difluoride.

[0189] Exemplary products produced by methods and systems herein disclosed comprise fluorinated derivatives of organic molecules which include 2-aryl-acetate esters, dihydrojasmone, menthofuran, guaiol, permethylated mannopyranoside, methyl 2-(4'-(2"-methylpropyl)phenyl)propanoate and a 5-phenyl-2-oxazoline.

[0190] Specifically, the methods and systems herein disclosed have been applied to produce methyl 2-fluoro-2-phenylacetate, ethyl 2-fluoro-2-phenylacetate, propyl 2-(3-chlorophenyl)-2-fluoroacetate, propyl 2-fluoro-o-tolylacetate, and propyl 2-fluoro-p-tolylacetate, starting from the corresponding 2-aryl-acetate esters; 4-fluoro-3-methyl-2-pentylcyclopent-2-enone, 4,4-difluoro-3-methyl-2-pentylcyclopent-2-enone, and 3-(fluoromethyl)-2-pentylcyclopent-2-enone, starting from dihydrojasmone; methyl 2-(4'-(1"-fluoro-2"-methylpropyl)phenyl)propanoate and methyl 2-(4'-(2"-fluoro-2"-methylpropyl)phenyl)propanoate, starting from methyl 2-(4'-(2"-methylpropyl)phenyl)propanoate; 6-fluoro-menthofuran-2-ol from menthofuran; 2-((3S,5S,8S)-4-fluoro-3,8-dimethyl-1,2,3,4,5,6,7,8-octahydroazulen-5-yl)propan-2-ol from (-)-guaiol; 6-fluoro-6-deoxy-1,2,3,4-tetramethyl-mannopyranoside, starting from 1,2,3,4,6-pentamethyl-mannopyranoside; and (4R,5S)-4-(fluoromethyl)-2-methyl-5-phenyl-4,5-dihydrooxazole, starting from (4S,5S)-4-(methoxymethyl)-2-methyl-5-phenyl-4,5-dihydrooxazole.

[0191] More specifically, the methods and systems herein disclosed have been applied to fluorinate a target site, namely a C carbon atom, in a highly regioselective manner despite the presence of other similar moieties in the molecule, as in the case of 1,2,3,4,6-pentamethyl-mannopyranoside.

[0192] Even more specifically, the methods and systems herein disclosed have been applied to fluorinate target organic molecules, namely 2-aryl-acetate esters, in a highly stereoselective manner, leading to the formation of the (R)-fluoro enantiomer in considerable excess over the (S)-fluoro enantiomer.

[0193] The above mentioned fluorinated products are or can be associated with a biological activity or can be used for the synthesis of chemical compounds that are or can be associated with a biological activity.

[0194] 2-Fluoro-2-phenylacetate derivatives find potential applications in the synthesis of prodrugs, in particular in the preparation of ester-type anticancer prodrugs with different susceptibility to hydrolysis, which can be useful in selective targeting of cancer cells (Yamazaki, Yusa et al. 1996). 2-(4-(2"-Methylpropyl)phenyl)propionate, also known as ibuprofen, is a marketed drug of the class of non-steroidal anti-inflammatory drugs (NSAIDs). This drug has ample application in the treatment of arthritis, primary dysmenorrhoea, fever, and as an analgesic, especially in the presence of an inflammation process. Ibuprofen exerts its analgesic, antipyretic, and anti-inflammatory activity through inhibition of cyclooxygenase (COX-2), thus inhibiting prostaglandin synthesis. More recently, ibuprofen was found to be useful in the prophylaxis of Alzheimer's disease (AD) (Townsend and Pratico 2005). The anti-AD activity of ibuprofen is presumably due to its ability to lower the levels of amyloid-beta (Abeta) peptides, in particular the longer, highly amyloidogenic isoform Abeta 42, which are believed to be the central disease-causing agents in Alzheimer's disease (AD). There is therefore a growing interest towards the discovery of Abeta 42-lowering compounds with improved potency and brain permeability (Leuchtenberger, Beher et al. 2006). Unlike other NSAIDs, ibuprofen was also found to be useful in protection against Parkinson's disease, although the underlying mechanism is not yet known (Casper, Yaparpalvi et al. 2000).

[0195] Dihydrojasmone incorporates a cyclopentenone structural unit. The cyclopentanone and cyclopentenone scaffolds are present in a wide range of important natural products such as jasmonoids, cyclopentanoid antibiotics, and prostaglandins. This type of compound has a broad spectrum of biological activities and important applications in medicinal chemistry as well as in the perfume and cosmetic industry, and agriculture. Despite their relatively simple structures, the synthesis of these scaffolds is not trivial (Mikolajczyk, Mikina et al. 1999). Therefore, novel routes for functionalization (and specifically in the context of the disclosure, fluorination) of these scaffolds and compounds incorporating these scaffolds would be highly desirable.

[0196] Guaiol is a sesquiterpene alcohol having the guaiane skeleton, found in many medicinal plants. The essential oils of Salvia lanigera and Helitia longifoliata, which both contain guaiol as a major component, were found to possess pronounced antibacterial activity (De-Moura, Simionatto et al. 2002). Structural modification of naturally-occurring bioactive substances by conventional chemical methods is very difficult and often not feasible. Accessible methods to produce derivatives of these natural products (and specifically in the context of this disclosure, fluorinated derivatives) would be highly desirable.

[0197] Furans and 2-(5H)-furanones are attractive building blocks, being present in a large number of natural products that display a wide range of biological activities, and being present in a number of drugs with biologically relevant properties, such as antifungal, antibacterial and anti-inflammatory activities (Knight 1994; De Souza 2005). Many methods are available for their synthesis. However, strategies for post-synthetic functionalization (and specifically in the context of the disclosure, fluorination) of these scaffolds and compounds incorporating these scaffolds would be highly desirable.

[0198] Embodiments wherein methods for selective fluorination of protected hydroxyl groups in the form of R-O-CHRR are performed, where the resulting product is R-F, are expected to expand our current synthetic capabilities and facilitate the synthesis of fluorinated compounds that bear multiple hydroxyl functional groups as well as the synthesis of compounds that incorporate chemical units or structural features that are incompatible with the currently available methods for protection/deprotection of hydroxyl groups (Green and Wuts 1999). The protection of hydroxyl groups with alkyl groups different from methoxymethyl (MOM), tetrahydropyranyl (THP), allyl, and benzyl (Bn) is rarely used in practice, if ever, due to the requirement of harsh chemical reagents and conditions for their removal (e.g. strong Lewis acids in the case of a methoxy group). These chemical reagents are poorly chemoselective, reacting with any nucleophilic group of the molecule. Chemical methods for regioselective substitution, and more specifically fluorination, of a single protected hydroxyl functional group in the presence of multiple identically protected hydroxyl groups are not available.

[0199] In some embodiments, activation and fluorination of the organic molecules can be performed as follows.

[0200] The activation reaction can be carried out in aqueous solvent containing variable amounts of organic solvents to facilitate dissolution of the organic molecule in the mixture. The co-solvents include but are not limited to alcohols, acetonitrile, dimethylsulfoxide, dimethylformamide, and acetone. The one or more oxidizing agents can be present free in solution or inside a cell where their expression has been achieved using a plasmid vector or other strategies as described earlier. The reaction can be carried out in batch, semicontinuously or continuously, in air or using devices to flow air or oxygen through the solution, at autogenous pressure or higher. The reaction temperature will generally be in the range of 0° C. to 100° C., depending on the nature and stability of the biocatalysts and substrates, preferably in the range of about 4° C. to 30° C. The amount of biocatalyst is generally in the range of about 0.01 mole % to 10 mole %, preferably in the range of about 0.05 mole % to 1 mole %. The cofactor (NADPH) can be added directly, regenerated using an enzyme-coupled system (typically dehydrogenase-based), or provided by the host cell. Reducing equivalents can be provided to the biocatalysts through the use of an electrode or chemical reagents. Superoxide dismutase, catalase or other reactive oxygen species-scavenging agents can be used to prevent biocatalyst inactivation and improve the yields of the activation reaction. Glycerol, bovine serum albumin or other stabilizing agents can be used to prevent biocatalyst aggregation and improve the yields of the activation reaction.

[0201] After the activation reaction, the activated products may or may not be isolated through any of the following methods or combination thereof: extraction, distillation, precipitation, sublimation, chromatography, crystallization with optional seeding and/or co-crystallization aids.

[0202] The activated products are then contacted with the fluorinating agent in the presence or the absence of an organic solvent under inert atmosphere. The activated products can be reacted in the form of isolated compound, purified compound, partially-purified mixtures or crude mixtures. No particular restriction is imposed upon the solvent of the reaction as long as the solvent does not react with the fluorination reagent, enzymatic product, or reaction product.

[0203] Solvents that can be used in the fluorination reaction include, but are not restricted to, dichloromethane, pyridine, acetonitrile, chloroform, ethylene dichloride, 1,2-dimethoxyethane, diethylene glycol dimethyl ether, N-methylpyrrolidone, dimethylformamide, and 1,3-dimethyl-2-imidazolidinone, preferably dichloromethane or pyridine. The reaction temperature will generally be in the range of -80° C. to 150° C., preferably in the range of about -78° C. to 30° C. The amount of the fluorination reagent is preferably 1 equivalent or more per oxygen atom introduced in the molecular scaffold of the organic molecule during the enzymatic reaction. After completion of the reaction, the fluorinated products are isolated through any of the following methods or combination thereof: extraction, distillation, precipitation, sublimation, chromatography, crystallization with optional seeding and/or co-crystallization aids.

[0204] An advantage of the methods and systems is the possibility to perform fluorination of predetermined target sites in a candidate organic molecule. A further advantage is that, by subjecting the activated product or the fluorinated derivative to the action of the same oxidizing agent used for its preparation or another oxidizing agent, polyfluorination of the molecule at the same or another predetermined target site can be achieved. A further advantage is that the mono- and/or polyfluorination of predetermined target sites in a candidate organic molecule can be carried out under mild conditions (room temperature and pressure), with limited use of hazardous chemicals and toxic solvents, in a chemoselective, regioselective, and stereoselective manner.

requiring numerous additional synthetic steps. For example, the described methyl 2-(4'-(1"-fluoro-2"-methylpropyl)phenyl)propanoate, methyl 2-(4'-(2"-fluoro-2"-methylpropyl)phenyl)propanoate, and methyl 2-(4'-(1",1"-difluoro-2"-methylpropyl)phenyl)propanoate prepared according to the methods and systems herein disclosed could be conceivably synthesized using (1-fluoro-2-methylpropyl)benzyl, (2-fluoro-2-methylpropyl)benzyl, (1,1-difluoro-2-methylpropyl)benzyl derivatives, which however are not commercially available and therefore need to be prepared from raw material through several chemical steps.

[0207] A further advantage of the methods and systems herein disclosed is the possibility to produce fluorinated derivatives of a candidate organic molecule at a preparative scale, obtaining from a minimum of 10 up to hundreds of milligrams of the final fluorinated product with overall yields (after isolation) of up to 80%. These quantities and yields enable the evaluation of the biological, pharmacological, and pharmacokinetic properties of said products as well as their use in further synthesis of more complex molecules.

[0208] An additional advantage of the methods and systems herein disclosed is the possibility to replace protected hydroxyl groups in the form of R-O-CHRR with fluorine. A further advantage is that the replacement of a protected hydroxyl group with fluorine can be carried out under mild conditions (room temperature and pressure), with limited use of hazardous chemicals and toxic solvents, in a chemoselective and regioselective manner.

[0209] Classes of molecules that can be potentially obtained using the methods and systems herein disclosed include but are not limited to α-fluoro acid derivatives, fluoro-alkyl derivatives, fluoro-allyl derivatives, fluorohydrins, and vic- and gem-difluoride derivatives.

[0210] Classes of molecules that can be potentially obtained in enantiopure form using the methods and systems herein disclosed include but are not limited to α-fluoro acid derivatives, fluoro-alkyl derivatives, fluoro-allyl derivatives, and fluorohydrins.

[0211] In general, the methods and systems herein disclosed, in contrast to previously known synthetic methods, provide a simple, environmentally benign, two-step procedure for regio- and stereospecific incorporation of fluorine in a wide variety of organic compounds both at reactive and non-reactive sites of their molecular scaffold. Particularly, it will be appreciated that the methods and systems herein disclosed give access to organofluorine derivatives whose preparation through alternative routes would require many more synthetic steps and much higher amounts of toxic reagents and organic solvents.
Accordingly, the methods and systems herein dis 0205 An additional advantage of the methods and sys closed have utility in the field of organic chemistry for prepa tems herein disclosed is the possibility to carry out fluorina ration of fluorinated building blocks and in medicinal chem tion of nonreactive sites of a candidate organic molecule, that istry for the preparation or discovery of fluorinated is sites that would could not be easily functionalized using derivatives of drugs, drug-like molecules, drug precursors, chemical reagents or would react only after or concurrently to and chemical building blocks with altered or improved physi other, more reactive sites of the molecule. cal, chemical, pharmacokinetic, or pharmacological proper 0206. A further advantage of the methods and systems ties. herein disclosed is the possibility to produce fluorinated 0213. In particular, in some embodiment of the methods derivatives of candidate organic molecules with an estab and systems herein disclosed, the organic molecules are pre lished or potentially relevant biological activity in only two selected among molecules of interest, such as drugs, drug steps. This post-synthetic' transformation represents a con precursors, lead compounds, and synthetic building blocks. siderable advantage compared to synthesis of the same The term “drug as used herein refer to a synthetic or non derivative or derivatives starting from fluorine-containing synthetic chemical entity with established biological and/or building blocks which may or may not be available, thus pharmacological activity, which is used to treat a disease, cure US 2009/0061471 A1 Mar. 5, 2009 22 a disfunction, or alter in some way a physiological or non mainly rely on the use of fluorinated building blocks), is that physiological function of a living organism. Lists of drugs can the methods herein disclosed can be carried outpost-syntheti be easily found in online databases such as www.accessdata. cally. 
As a consequence, the method herein disclosed can be faa.gov, www.drugs.com, www.rxlist.com, and the like. The broadly applied to produce oxygenated/fluorinated deriva term “drug precursor as used herein refers to a synthetic or tives starting from marketed drugs, drugs in advanced testing non-synthetic chemical entity which can be converted into a phase, lead compounds, or screening hits. drug through a chemical or biochemical transformation. The 0219. Additionally, a pre-selection of organic molecules conversion of a drug precursor into a drug can also occur after of interest and/or related fluorinated products can be made on administration, in which case the drug precursor is typically the basis of the ability of fluorine atoms to improve dramati referred to as “prodrug’. Accordingly, any synthetic or semi cally the pharmacological profile of drugs. In particular, this synthetic intermediate in the preparation of a drug can be can be done in view of several studies have shown that potent considered a drug precursor. The term “lead compound as drugs can be obtained through fluorination of much less used herein refers to a synthetic or non-synthetic chemical active precursors. Anticholesterolemic EZetimib (Clader entity that has pharmacological or biological activity and 2004), anticancer CF-taxanes (Ojima 2004), fluoro-steroids, whose chemical structure is used as a starting point for chemi and antibacterial fluoroquinolones are only some representa cal modifications in order to improve potency, selectivity, or tive examples. The improved pharmacological properties of pharmacokinetic parameters. Lead compounds are often fluoro-containing drugs are due mainly to enhanced meta found in high-throughput Screenings ("hits’) or are secondary bolic stability (Park, Kitteringham et al. 2001). Primary metabolites from natural sources. 
Reports on the discovery metabolism of drugs in humans generally occurs through and/or identification of lead compounds for various applica P450-dependent systems, and the introduction of fluorine tions are widespread in the Scientific literature and in particu atoms at or near the sites of metabolic attack has often proven lar in specialized journals such as Journal of medicinal chem Successful in increasing the half-life of a compound (Bohm, istry, Bioorganic & medicinal chemistry, Current medicinal Banner et al. 2004). chemistry, Current topics in medicinal chemistry, European 0220. In some cases, the introduction of fluorine substitu Journal of Medicinal Chemistry, Mini reviews in medicinal ents leads to improvements in the pharmacological properties chemistry, and the like. The term “synthetic building blocks' as a result of enhanced binding affinity of the molecule to as used herein refer to any synthetic or non-synthetic chemi biological receptors. Examples of the effect of fluorine on cal entity that is used for the preparation of a structurally more binding affinity are provided by recent results in the prepara complex molecule. tion of NK1 antagonists (Swain and Rupniak 1999), 5HT 0214. Upon fluorination of the target site of the pre-se agonists (van Niel, Collins et al. 1999), and PTB1B antago lected molecule, the fluorinated organic molecules produced nists (Burke, Ye et al. 1996). can be further used in the synthesis of more complex mol 0221. Accordingly, with the methods and system herein ecules, or, in addition, or in alternative, being tested for bio disclosed production of various oxygenated/fluorinated prod logical activities. ucts was expected Starting from a given drug or a drug-like 0215. In particular, in any embodiment, wherein identifi molecule, for example a lead compound identified in a drug cation of a an organic molecule having a predetermined bio discovery program. logical activity is desired, the methods and systems herein 0222. 
In an embodiment of the methods and systems, an disclosed further comprise testing the fluorinated organic array of oxygenases (P450 monooxygenases, non-heme iron molecule for the desired biological activity. Testing can in monooxygenases, dioxygenases and peroxygenases) can be particular be performed by screening the products of the used to produce various mono- and poly-oxygenated com reaction by the methods and systems illustrated in FIG. 4 in pounds. Some of these products can be isolated and Subjected form of mixture or as isolated compound for altered or to fluorination, e.g. deoxo-fluorination, where all or a Subset improved metabolic stability, biological activity, pharmaco of the introduced oxygen-containing functional groups are logical potency, and pharmacokinetic properties. substituted for fluorine. The resulting products can then be 0216. The wording “biological activity” as used herein separated and tested for improved biological properties. refers to any activity that can affect the status of a biological molecule or biological entity. A biological molecule can be a EXAMPLES protein or a polynucleotide. A biological entity can be a cell, 0223) The present disclosure is further illustrated in the an organ, or a living organism. The wording “pharmacologi following examples, which are provided by way of illustra cal activity” as used herein refers to any activity that can affect tion and are not intended to be limiting. and, generally but not necessarily, improve the status of a 0224. The following experiments have been carried out to living organism. perform chemo-enzymatic fluorination approach according 0217. In embodiments where identification of a molecule to embodiments of the methods and systems herein disclosed. having pharmacological activity is desired, use of P450 as 0225. 
First, a set of organic molecules has been selected, oxydizing agents is particularly preferred, since Phase I drug from which potentially useful fluorinated products can be metabolism in humans is mainly dependent on P450s. In this obtained. connection, one clear advantage of the methods and systems 0226. These compounds include: (a) 2-aryl acetic acid herein disclosed is that they allow for protection through derivatives, as demonstrative examples of useful synthetic fluorination of sites in the molecule that are sensitive to P450 blocks, for example in the preparation of prodrugs with dif hydroxylation attack. ferent susceptibility to hydrolysis. With the systems and 0218. A further advantage of the methods and systems methods herein described, stereoselective fluorination of the herein described for the identification of a molecule having alpha position of these target molecules was achieved, afford biological activity compared to corresponding strategies ing 2-fluoro-2-aryl acetic acid derivatives in considerable known in the art for producing fluorinated drugs (which enantiomeric excess; (b) ibuprofen methyl ester, as demon US 2009/0061471 A1 Mar. 5, 2009 strative example of a marketed drug, of which more potent Using the selected oxydizing agents, conditions for the acti and BBB (blood-brain-barrier)-penetrating derivatives are Vation reaction were optimized, testing different co-solvents sought after for treatment of Alzheimer's and Parkinson's (e.g. ethanol, ethylacetate), additives (e.g. BSA, glycerol), diseases. With the systems and methods herein described, ROS(Reactive oxygen species) scavengers (e.g. SOD, cata regioselective fluorination of weakly reactive sites of this lase), temperature, and target molecule:oxydizing agent target molecule was achieved, affording various C F deriva ratios. 
Once optimized, the activation reaction was scaled up tives; (c) dihydrolasmone, menthofuran, and guaiol, as to 100-300 mL reaction scale, where the oxydizing agent demonstrative examples of various molecular scaffolds that concentration typically ranged from 0.5 to 15uM, the target are present in several natural, synthetic, and semisynthetic molecule concentration from 5 to 20 mM, and a cofactor biologically active molecules. With the systems and methods regeneration system was used. The co-solvent was usually herein described, regioselective fluorination of weakly reac ethanol, typically at a final concentration of 0.5% to 2%. tive sites of these target molecules was achieved, affording 0230. Large scale reactions were incubated under stirring various C F derivatives; (d) dihydro-4-methoxymethyl-2- at room temperature for a period of time of up to 56 hours, methyl-5-phenyl-2-oxazoline, as demonstrative example for during which target molecule conversion was monitored by chemoselective substitution of methoxygroup for fluorine. extracting Small aliquots of the reaction mixture and analyZ With the systems and methods herein described, fluorination ing them by gas chromatography. of the methoxy protected group in the target molecules was 0231. As the desired amount of activated product was achieved, affording a demethoxy-fluoro derivative; (e) perm produced, the reaction mixture was extracted with an organic ethylated mannopyranoside as demonstrative example for Solvent, typically chloroform, and the activated product was regioselective Substitution of a specific methoxy group for isolated by silica gel chromatography using hexane:ethyl fluorine in the presence of several otheridentical groups in the acetate gradient. Purified products were identified using GC molecule. With the systems and methods herein described, MS, "H , and 'C-NMR. regioselective fluorination of the methoxy protected group in 0232. 
Once the product with the activated target site was position 6 of the target molecule was achieved, affording a identified, the activated product was subjected to fluorination 6-demethoxy-fluoro derivative. using the deoxo-fluorinating agent DAST in dichlo 0227. A pool of oxydizing agents, comprising wild-type romethane. Different reaction conditions were typically P450(CYP102A1), variants of wild-type P450 carry tested to optimize yield and possibly achieve quantitative ing one or more mutations at the positions 25, 26, 42, 47, 51. conversion. During these tests, the conversion of the activated 52, 58,64,74, 75,78,81, 82,87, 88,90,94, 96,102,106, 107, product to the corresponding fluorinated derivative was typi 108, 118, 135, 138, 142, 143, 145, 152, 172,173, 175, 178, cally monitored by GC-MS. 180, 181, 184, 185, 188, 197, 199, 205, 214, 226, 231, 236, 0233. After the fluorination reaction, the fluorinated prod 237, 239, 252, 255, 260, 263,264, 265, 267,268, 273, 274, uct was isolated by silica gel chromatography using a hexane: 275,290, 295,306, 324, 328,354, 366, 398, 401, 430, 433, ethyl acetate gradient. The identity of the purified product was 434, 437, 438, 442, 443, 444, and 446, and a selection of the confirmed by GC-MS, HR-MS. H. , C , and 'F-NMR. most active P450 chimera peroxygenases and monooxygena 0234. The pool of pre-selected oxydizing agents and other ses from the libraries described in Otey et al. (Otey, Landwehr selected variants from mutagenesis libraries of var3-10 i.e. et al. 2006) and Landwehr et al. (Landwehr, Carbone et al. libraries where positions 74, 82, 87, 88, and 328 position of 2007) were arrayed on 96-well plates. Arrays were prepared var3-10 were subjected to saturation mutagenesis—were by growing recombinant E. 
coli transformed with an expres screened for activity towards activation of the pre-selected sion plasmid encoding for the P450 sequence, inducing pro organic molecules dihydro-4-methoxymethyl-2-methyl-5- tein expression with IPTG, and preparing a cell lysate. phenyl-2-oxazoline (MMPO) and 1,2,3,4,6-pentamethyl 0228. The activation reaction of the pre-selected organic mannopyranoside using a colorimetric assay on a 96-well molecules ibuprofen methyl ester, menthofuran, dihydrolas plate format. In the case of MMPO, for example, different mone, and guaiol with the pool of pre-selected oxydizing oxydizing agents were arrayed on a 96-well plate, each well agents was tested at a 1-mL scale dissolving the organic containing about 150 uL phosphate buffer and about 1 uM molecule in phosphate buffer (1% ethanol) at a final concen oxydizing agent. The target molecule was added to the solu tration of 2 mM. The oxydizing agent was then added to the tion from an ethanol stock to a final concentration of 2 mM solution at a final concentration of about 200-400 nM. The (and 1% ethanol). After addition of 1 mM NADPH, the reac reaction was started by adding NADPH and a glucose-6- tion mixture was incubated for 30 minutes at room tempera phosphate dehydrogenase cofactor regeneration system to the ture. After incubation, MMPO activation activity was deter mixture. After 20 hrs incubation at room temperature, the mined using the calorimetric reagent Purpald (Sigma), which reactions were extracted with chloroformandanalyzed by gas reacts with formaldehyde and serves in this case to detect the chromatography. Total conversion ratios were calculated demethylation of the methoxy group in the target molecule. including in the experiment a sample containing no enzyme Positive hits were re-tested on a 1-mL scale using 1 mM and adding an internal standard to the samples. 
The 20-30% MMPO, 0.5 uM oxydizing agent, 1 mM NADPH, and a most promising oxydizing agents were re-tested at a larger cofactor regeneration system. After incubation at room tem scale (3 mL) to identify false positives and determine regi perature, the reaction mixtures were extracted with chloro oselectivity and product distribution. Exemplary results from form and analyzed by gas chromatography. In this way, the the screening of the pool of P450s on dihydrojasmone and regioselectivity and conversion efficiency of each oxydizing menthofuran are reported on FIGS. 5 and 6. agent was established. The identity of the activated product 0229. A group of about 5 to 10 most interesting oxydizing was also confirmed by GC-MS. Most promising oxydizing agents were then selected based on the results from the re agents, that is those showing highest regioselectivity and/or screen, in particular based on their regioselectivity, conver conversion efficiency, were used for scale-up tests and for sion efficiency, or ability to produce “rare activated product. producing larger quantities of activated product for the fluo US 2009/0061471 A1 Mar. 5, 2009 24 rination reaction as described above. Representative results from the screening of the P450 pool for MMPO activation -continued activity are reported on FIG. 7. 0235. The activation of the target molecule dihydrojas OH mone was also carried out using a whole-cell system (FIG. 8). 1.2 eq DAST Specifically, the whole-cell system consisted of E. coli DH5a CH O N CHC-78°2-12 C. cells transformed with a pCWori vector that contains the 75% sequence for var3. The whole-cell activation reaction was carried outgrowing a 0.5 L culture of the recombinant cells in TB medium, inducing the intracellular expression of var3 during mid-log phase by adding 0.5 mMIPTG, and growing the cells at 30° C. for further 12 hours. After that, 15 mL dodecane were added to the culture. 
Dihydrojasmone was then added to the culture at a final concentration of 30 mM. Formation of the activated product and consumption of the target molecule were monitored by gas chromatography for up to 36 hours. Conversion ratio at the end of the 36 hours amounted to ~10%. Higher conversion ratios (>90-95%) were achieved in vitro with the same variant using a cofactor regeneration system. The lower efficiency of the whole-cell system in the case of dihydrojasmone may be attributed to potential toxicity of this molecule or its activated product to the cells as well as their low membrane permeability. Nevertheless, this experiment demonstrates that the activation of the target molecule for the scope of the systems and methods herein described can also be performed using a whole-cell system, especially in cases where the chemo-physical properties of the candidate molecule may make this option more favorable.

[0236] Chemical reagents, substrates and solvents were purchased from Sigma, Aldrich, and Fluka. Silica gel chromatography purifications were carried out using AMD Silica Gel 60 230-400 mesh. Gas chromatography (GC) analyses were carried out using a Shimadzu GC-17A gas chromatograph, a FID detector, and an Agilent HP5 column (30 m × 0.32 mm × 0.1 µm film). Chiral GC analyses were carried out using a Shimadzu GC-17A gas chromatograph, a FID detector, and an Agilent Cyclosilb column (30 m × 0.52 mm × 0.25 µm film). GC-MS analyses were carried out on a Hewlett-Packard 5970B MSD with 5890 GC and a DB-5 capillary column. 1H, 13C, and 19F NMR spectra were recorded on a Varian Mercury 300 spectrometer (300 MHz, 75 MHz, and 282 MHz, respectively), and are internally referenced to residual protio solvent signal. Data for 1H NMR are reported in the conventional form: chemical shift (δ ppm), multiplicity (s=singlet, d=doublet, t=triplet, q=quartet, m=multiplet, br=broad), coupling constant (Hz), integration, and assignment. Data for 13C NMR are reported in terms of chemical shift (δ ppm). Data for 19F NMR are reported in terms of chemical shift (δ ppm) and multiplicity. High-resolution mass spectra were obtained with a JEOL JMS-600H High Resolution Mass Spectrometer at the California Institute of Technology Mass Spectral facility.

Example 1

Stereoselective Fluorination of methyl 2-phenyl acetate

[0237]

Scheme 1. [Reaction scheme not reproduced: methyl 2-phenyl acetate is hydroxylated with 0.1% mol P450BM3 in KPi, pH = 8.0 (45%); the hydroxylated product is treated with 1.2 eq DAST in CH2Cl2 at -78° C. (75%) to give the fluorinated product in 74% ee.]

[0238] Methyl 2-phenyl acetate was subjected to selective fluorination of the target site C (alpha position) according to the systems and methods herein disclosed and, more specifically, according to the general procedure described above.

[0239] Experimental description: 90 mg methyl 2-phenyl acetate were dissolved in 500 µL ethanol and added to 240 mL potassium phosphate buffer pH 8.0. P450BM3 was added to the mixture at a final concentration of 2 µM. The mixture was split in 4 mL aliquots into 15 mL scintillation vials equipped with a stir bar. 500 µL of a 5 mM NADPH solution were added to each vial and stirred for 2 minutes. 500 µL of a cofactor regeneration solution containing 300 mM glucose-6-phosphate and 10 units/mL glucose-6-phosphate dehydrogenase were then added to each vial. The resulting mixtures were stirred at room temperature. After 4 hours, the reaction mixtures were joined together and extracted with chloroform (3×100 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (5% ethyl acetate: 95% hexane) afforded the activated product ((S)-methyl 2-hydroxy-2-phenylacetate, 40.5 mg). 40 mg (0.24 mmol) of activated product were dissolved in 2 mL dry dichloromethane (CH2Cl2) and a catalytic amount (4 drops) of ethanol was added to the solution. The solution was cooled to -78° C. (dry ice) and 41 µL DAST (0.29 mmol) were added. The reaction was stirred in dry ice for 12 hours. 5 mL saturated sodium bicarbonate (NaHCO3) were then added to the reaction mixture, which was extracted with dichloromethane (3×15 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (5% ethyl acetate: 95% hexane) afforded the fluorinated product ((R)-methyl 2-fluoro-2-phenylacetate) (30 mg, 75% yield, pale yellow oil) in 74% ee, as determined by chiral GC analysis. 1H-NMR (300 MHz, CDCl3): δ 3.75 (s, 3H, OCH3), δ 5.77 (d, J=48 Hz, 1H, CHF), δ 7.37-7.46 (m, 5H); 13C-NMR (75 MHz, CDCl3): 82.8, 89.5 (d, J=184.5 Hz), 126.8, 126.9, 129.0, 129.9, δ 134.4 (d, J=34.5 Hz), δ 169.0. 19F-NMR (282 MHz, CDCl3): δ -180.29 (d, J=48.7 Hz). HRMS (EI+): exact mass calculated for C9H9FO2 requires m/z 168.0587, found 168.0594.
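The stoichiometry quoted in Example 1 can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the disclosed procedure: the helper names are hypothetical, the molar masses are computed from standard atomic weights for the compound names given in the text (C9H10O3 for the hydroxy ester, C9H9FO2 for the fluoro ester), and the 87:13 enantiomer ratio is simply the ratio implied by the reported 74% ee, not a measured value.

```python
# Hypothetical helpers for checking the numbers reported in Example 1.
# Molar masses (g/mol) are assumptions derived from standard atomic weights.
MW_HYDROXY_ESTER = 166.17  # methyl 2-hydroxy-2-phenylacetate, C9H10O3
MW_FLUORO_ESTER = 168.17   # methyl 2-fluoro-2-phenylacetate, C9H9FO2


def mmol(mass_mg, molar_mass):
    """Convert a mass in mg to an amount in mmol."""
    return mass_mg / molar_mass


def equivalents(reagent_mmol, substrate_mmol):
    """Molar equivalents of reagent relative to substrate."""
    return reagent_mmol / substrate_mmol


def percent_yield(product_mg, product_mw, substrate_mg, substrate_mw):
    """Isolated yield (%) assuming 1:1 stoichiometry."""
    return 100.0 * mmol(product_mg, product_mw) / mmol(substrate_mg, substrate_mw)


def enantiomeric_excess(major, minor):
    """ee (%) from the amounts (e.g. chiral GC peak areas) of the two enantiomers."""
    return 100.0 * (major - minor) / (major + minor)


def enantiomer_ratio(ee_percent):
    """Major/minor percentages implied by a given ee (%)."""
    major = (100.0 + ee_percent) / 2.0
    return major, 100.0 - major


# Values taken from the text: 40 mg hydroxy ester, 0.29 mmol DAST, 30 mg product.
substrate_mmol = mmol(40.0, MW_HYDROXY_ESTER)
dast_eq = equivalents(0.29, substrate_mmol)
yield_pct = percent_yield(30.0, MW_FLUORO_ESTER, 40.0, MW_HYDROXY_ESTER)
major_pct, minor_pct = enantiomer_ratio(74.0)

print(f"substrate: {substrate_mmol:.2f} mmol")
print(f"DAST: {dast_eq:.2f} eq")
print(f"fluorination yield: {yield_pct:.0f}%")
print(f"74% ee corresponds to a {major_pct:.0f}:{minor_pct:.0f} enantiomer ratio")
```

Under these assumptions the computed values (about 0.24 mmol of substrate, about 1.2 eq of DAST, and a yield in the mid-70s percent) agree with the figures stated in the example to within rounding.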

Example 2

Stereoselective Fluorination of ethyl 2-phenyl acetate

[0240]

Scheme 2. [Reaction scheme not reproduced: ethyl 2-phenyl acetate is hydroxylated with 0.1% mol WT(F87A) in KPi, pH = 8.0 (66%); the hydroxylated product is treated with 1.2 eq DAST in CH2Cl2 at -78° C. (78%) to give the fluorinated product in 93% ee.]

[0241] Ethyl 2-phenyl acetate was subjected to selective fluorination of the target site C (alpha position) according to the systems and methods herein disclosed and, more specifically, according to the general procedure described above.

[0242] Experimental description: 100 mg ethyl 2-phenyl acetate were dissolved in 500 µL ethanol and added to 250 mL potassium phosphate buffer pH 8.0. WT(F87A) was added to the mixture at a final concentration of 2 µM. The mixture was split in 4 mL aliquots into 15 mL scintillation vials equipped with a stir bar. 500 µL of a 5 mM NADPH solution were added to each vial and stirred for 2 minutes. 500 µL of a cofactor regeneration solution containing 300 mM glucose-6-phosphate and 10 units/mL glucose-6-phosphate dehydrogenase were then added to each vial. The resulting mixtures were stirred at room temperature. After 3 hours, the reaction mixtures were joined together and extracted with chloroform (3×100 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (5% ethyl acetate: 95% hexane) afforded the activated product ((S)-ethyl 2-hydroxy-2-phenylacetate, 66 mg). 66 mg (0.36 mmol) of activated product were dissolved in 2 mL dry dichloromethane (CH2Cl2) and a catalytic amount (4 drops) of ethanol was added to the solution. The solution was cooled to -78° C. (dry ice) and 61 µL DAST (0.43 mmol) were added. The reaction was stirred in dry ice for 12 hours. 5 mL saturated sodium bicarbonate (NaHCO3) were then added to the reaction mixture, which was extracted with dichloromethane (3×15 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (5% ethyl acetate: 95% hexane) afforded the fluorinated product ((R)-ethyl 2-fluoro-2-phenylacetate) (51 mg, 78% yield, pale yellow oil) in 93% ee, as determined by chiral GC analysis. 1H-NMR (300 MHz, CDCl3): δ 1.24 (t, J=7.2 Hz, 3H, CH3), δ 4.16-4.27 (m, 2H, OCH2), δ 5.75 (d, J=48 Hz, 1H, -CHF), δ 7.37-7.46 (m, 5H); 13C-NMR (75 MHz, CDCl3): 14.2, 62.0, 81.2, 89.6 (d, J=184.5 Hz), 126.8, 126.9, 128.9, 129.8, 134.4 (d, J=34.5 Hz), δ 169.0. 19F-NMR (282 MHz, CDCl3): δ -180.27 (d, J=48.7 Hz). HRMS (EI+): exact mass calculated for C10H11FO2 requires m/z 182.0743, found 182.0750.

Example 3

Stereoselective Fluorination of propyl 2-(3-chlorophenyl)acetate

[0243]

Scheme 3. [Reaction scheme not reproduced: propyl 2-(3-chlorophenyl)acetate is hydroxylated with 0.05% mol var3-7 in KPi, pH = 8.0 (75%); the hydroxylated product is treated with 1.5 eq DAST in CH2Cl2 at -78° C. to give the fluorinated product in 89% ee.]

[0244] Propyl 2-(3-chlorophenyl)acetate was subjected to selective fluorination of the target site C (alpha position) according to the systems and methods herein disclosed and, more specifically, according to the general procedure described above.

[0245] Experimental description: 95 mg propyl 2-(3-chlorophenyl)acetate were dissolved in 500 µL ethanol and added to 250 mL potassium phosphate buffer pH 8.0. Var3-7 was added to the mixture at a final concentration of 1 µM. The mixture was split in 4 mL aliquots into 15 mL scintillation vials equipped with a stir bar. 500 µL of a 5 mM NADPH solution were added to each vial and stirred for 2 minutes. 500 µL of a cofactor regeneration solution containing 300 mM glucose-6-phosphate and 10 units/mL glucose-6-phosphate dehydrogenase were then added to each vial. The resulting mixtures were stirred at room temperature. After 4 hours, the reaction mixtures were joined together and extracted with chloroform (3×100 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (5% ethyl acetate: 95% hexane) afforded the activated product ((S)-propyl 2-hydroxy-2-(3-chlorophenyl)acetate, 71 mg). 70 mg (0.3 mmol) of activated product were dissolved in 2 mL dry dichloromethane (CH2Cl2) and a catalytic amount (4 drops) of ethanol was added to the solution. The solution was cooled to -78° C. (dry ice) and 64 µL DAST (0.45 mmol) were added. The reaction was stirred in dry ice for 12 hours. 5 mL saturated sodium bicarbonate (NaHCO3) were then added to the reaction mixture, which was extracted with dichloromethane (3×15 mL).
The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (5% ethyl acetate: 95% hexane) afforded the fluorinated product ((R)-propyl 2-fluoro-2-(3-chlorophenyl)acetate) (57 mg, 82% yield, colorless oil) in 89% ee, as determined by chiral GC analysis. 1H-NMR (300 MHz, CDCl3): δ 0.85 (t, J=7 Hz, 3H, -CH3), δ 1.56-1.68 (m, 2H, CH2), δ 4.12 (t, J=6 Hz, 2H, OCH2), δ 5.72 (d, J=48 Hz, 1H, CHF), δ 7.32 (br, 3H), δ 7.44 (br, 1H); 13C-NMR (75 MHz, CDCl3): 10.3, 21.9, 67.7, δ 88.7 (d, J=186.5 Hz), 124.8, 126.9, 129.9, 130.3, 134.9. 19F-NMR (282 MHz, CDCl3): δ -182.8 (d, J=48.7 Hz). HRMS (EI+): exact mass calculated for C11H12ClFO2 requires m/z 230.0510, found 230.0502.

Example 4

Stereoselective Fluorination of propyl 2-(4-methylphenyl)acetate and propyl 2-(2-methylphenyl)acetate

[0246]

Scheme 4-1. [Reaction scheme not reproduced: enzymatic hydroxylation with 0.05% mol var3-7 in KPi, pH = 8.0 (65%), followed by 1.5 eq DAST in CH2Cl2 at -78° C. (83%), giving the fluorinated product in 87% ee. The continuation of Scheme 4-2 shows its fluorinated product in 85% ee.]

[0247] Propyl 2-(4-methylphenyl)acetate and propyl 2-(2-methylphenyl)acetate were subjected to selective fluorination of the target site C (alpha position) according to the systems and methods herein disclosed and, more specifically, according to the general procedure described above.

[0248] Experimental description: stereoselective activation and fluorination of propyl 2-(4-methylphenyl)acetate and propyl 2-(2-methylphenyl)acetate were carried out starting from 100 mg substrate according to the experimental protocol described in Example 3. The fluorinated product (R)-propyl 2-fluoro-2-(4-methylphenyl)acetate was obtained in 87% ee (54 mg, colorless oil). 1H-NMR (300 MHz, CDCl3): δ 0.84-0.91 (m, 3H, -CH3), δ 1.57-1.68 (m, 2H, CH2), δ 2.37 (s, 3H, -CH3), δ 4.08-4.16 (m, 2H, OCH2), δ 5.75 (d, J=48 Hz, 1H, CHF), δ 7.18-7.27 (m, 2H), δ 7.27-7.44 (m, 2H); 13C-NMR (75 MHz, CDCl3): 10.40, 19.41, 22.08, 67.47, 126.57, 131.10. 19F-NMR (282 MHz, CDCl3): δ -178.5 (d, J=48.7 Hz). HRMS (EI+): exact mass calculated for C12H15FO2 requires m/z 210.1056, found 210.1062. The fluorinated product (R)-propyl 2-fluoro-2-(2-methylphenyl)acetate was obtained in 87% ee (54 mg, colorless oil). 1H-NMR (300 MHz, CDCl3): δ 0.83 (t, J=7.5 Hz, 3H, CH3), δ 1.52-1.68 (m, 2H, CH2), δ 2.43 (s, 3H, -CH3), δ 4.12 (m, 2H, OCH2), δ 5.96 (d, J=48 Hz, 1H, CHF), δ 7.16-7.30 (m, 4H); 13C-NMR (75 MHz, CDCl3): 10.3, 19.3, 22.0, 29.9, 67.4, 87.4 (d, J=183 Hz), δ 126.5, δ 127.5, δ 129.8, δ 131.0. 19F-NMR (282 MHz, CDCl3): δ -180.1 (d, J=48.7 Hz). HRMS (EI+): exact mass calculated for C12H15FO2 requires m/z 210.1056, found 210.1070.

[0249] Examples 1, 2, 3, and 4 illustrate the application of the systems and methods of the disclosure for stereoselective fluorination of a chemical building block, exemplified by 2-aryl acetic acid derivatives (Schemes 1-4).
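The "exact mass calculated" values reported for the fluorinated products of Examples 1 to 4 can be reproduced from monoisotopic atomic masses. The following is a minimal sketch, not part of the disclosure: the function names are hypothetical, the molecular formulas are inferred from the compound names (formula subscripts are not legible in this copy of the text), and the electron mass is not subtracted because the reported calculated values match the neutral monoisotopic masses to four decimals.

```python
# Monoisotopic masses (u) of the most abundant isotopes: 12C, 1H, 16O, 19F, 35Cl.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "O": 15.99491462,
    "F": 18.99840316,
    "Cl": 34.96885268,
}


def exact_mass(formula):
    """Neutral monoisotopic mass for a formula given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def ppm_error(found, calculated):
    """Deviation of a measured m/z from the calculated value, in ppm."""
    return 1e6 * (found - calculated) / calculated


# (assumed formula, calculated m/z from the text, found m/z from the text)
products = {
    "methyl 2-fluoro-2-phenylacetate": ({"C": 9, "H": 9, "F": 1, "O": 2}, 168.0587, 168.0594),
    "ethyl 2-fluoro-2-phenylacetate": ({"C": 10, "H": 11, "F": 1, "O": 2}, 182.0743, 182.0750),
    "propyl 2-fluoro-2-(3-chlorophenyl)acetate": (
        {"C": 11, "H": 12, "Cl": 1, "F": 1, "O": 2}, 230.0510, 230.0502),
    "propyl 2-fluoro-2-(4-methylphenyl)acetate": (
        {"C": 12, "H": 15, "F": 1, "O": 2}, 210.1056, 210.1062),
    "propyl 2-fluoro-2-(2-methylphenyl)acetate": (
        {"C": 12, "H": 15, "F": 1, "O": 2}, 210.1056, 210.1070),
}

for name, (formula, reported_calc, found) in products.items():
    calc = exact_mass(formula)
    print(f"{name}: calculated {calc:.4f} (reported {reported_calc:.4f}), "
          f"found {found:.4f}, error {ppm_error(found, calc):+.1f} ppm")
```

With the assumed formulas, each computed mass rounds to the calculated value given in the text, and every found value lies within a few ppm of it.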

Scheme 4-2. [Scheme: hydroxylation with 0.05 mol % var3-7 in KPi, pH 8.0 (72%), followed by deoxofluorination with 1.5 eq DAST in CH2Cl2 at -78° C. (88%).]

Example 5

Regioselective Fluorination of 3-methyl-2-pentylcyclopent-2-enone (dihydrojasmone) in Position 4

[0250]

Scheme 5. [Scheme: dihydrojasmone is hydroxylated at position 4 (CH-OH), and the activated product is deoxofluorinated with 1.3 eq DAST in CH2Cl2 at -78° C. (92%) to give the 4-fluoro derivative (CH-F).]

[0251] Dihydrojasmone was subjected to selective fluorination of the target site C (position 4) according to the systems and methods herein disclosed and, more specifically, according to the general procedure described above.

[0252] Experimental description: 270 µL dihydrojasmone were dissolved in 1.2 mL ethanol and added to 150 mL potassium phosphate buffer pH 8.0. Var2 was added to the mixture at a final concentration of 2 µM. The mixture was split in 4.8 mL aliquots into 15 mL scintillation vials equipped with a stir bar. 600 µL 10 mM NADPH in KPi buffer were added to each vial and stirred for 2 minutes. 600 µL cofactor regeneration solution containing 500 mM glucose-6-phosphate and 10 units/mL glucose-6-phosphate dehydrogenase were then added to each vial. The resulting mixtures were stirred at room temperature. After 36 hours, the reaction mixtures were joined together and extracted with chloroform (3x50 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (0-30% ethyl acetate/hexane) afforded the activated product (4-hydroxy-3-methyl-2-pentylcyclopent-2-enone, 222 mg). 210 mg (1.15 mmol) of activated product were dissolved in 2 mL dry dichloromethane (CH2Cl2) and a catalytic amount (4 drops) of ethanol was added to the solution. The solution was cooled to -78° C. (dry ice) and added with 215 µL DAST (1.5 mmol). The reaction was stirred in dry ice for 12 hours. The reaction mixture was then added with 5 mL saturated sodium bicarbonate (NaHCO3) and extracted with dichloromethane (3x15 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (0-30% ethyl acetate/hexane) afforded the fluorinated product, 4-fluoro-3-methyl-2-pentylcyclopent-2-enone (193 mg, 92% yield, yellow oil). 1H-NMR (300 MHz, CDCl3): δ 0.88 (t, J=6.6 Hz, 3H, CH3), δ 1.25-1.40 (m, 6H, CH2), δ 2.10 (d, J=2.1 Hz, 2H, CH2), δ 2.20 (t, J=7.1 Hz, 2H), δ 2.44-2.60 (m, 1H), δ 2.70-2.82 (m, 1H), δ 5.47 (dd, J=54.2 Hz, J=5.8, 1H); 13C-NMR (75 MHz, CDCl3): δ 13.7, 14.2, 22.6, 23.1, 27.9, 29.9, 31.9, 41.4 (d, J=19.6 Hz), 91.2 (d, J=174 Hz); 19F-NMR (282 MHz, CDCl3): δ -179.08 (ddd, J=51.88 Hz, J=21.43 Hz, J=9.3 Hz). HRMS (EI+): exact mass calculated for C11H17FO requires m/z 184.1263, found 184.1255.

Example 6

Regioselective Fluorination of 3-methyl-2-pentylcyclopent-2-enone (dihydrojasmone) in Position 11

[0253]

Scheme 6. [Scheme: dihydrojasmone is hydroxylated at position 11 with 0.02 mol % var3-2 in KPi, pH 8.0 (15%), and the activated product is deoxofluorinated with 1.5 eq DAST in CH2Cl2 at -78° C. (89%).]

[0254] Dihydrojasmone was subjected to selective fluorination of the target site C (position 11) according to the systems and methods herein disclosed and, more specifically, according to the general procedure described above.

[0255] Experimental description: 240 µL dihydrojasmone were dissolved in 1.1 mL ethanol and added to 130 mL potassium phosphate buffer pH 8.0. Var2 was added to the mixture at a final concentration of 2 µM. The mixture was split in 4.8 mL aliquots into 15 mL scintillation vials equipped with a stir bar. 600 µL 10 mM NADPH in KPi buffer were added to each vial and stirred for 2 minutes. 600 µL cofactor regeneration solution containing 500 mM glucose-6-phosphate and 10 units/mL glucose-6-phosphate dehydrogenase were then added to each vial. The resulting mixtures were stirred at room temperature. After 36 hours, the reaction mixtures were joined together and extracted with chloroform (3x50 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (0-30% ethyl acetate/hexane) afforded the activated product (11-hydroxy-3-methyl-2-pentylcyclopent-2-enone, 35 mg). 30 mg (0.16 mmol) of activated product were dissolved in 2 mL dry dichloromethane (CH2Cl2) and a catalytic amount (4 drops) of ethanol was added to the solution. The solution was cooled to -78° C. (dry ice) and added with 35 µL DAST (0.25 mmol). The reaction was stirred in dry ice for 12 hours. The reaction mixture was then added with 5 mL saturated sodium bicarbonate (NaHCO3) and extracted with dichloromethane (3x15 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (0-30% ethyl acetate/hexane) afforded the fluorinated product, 11-fluoro-3-methyl-2-pentylcyclopent-2-enone (27 mg, 89% yield, yellow oil). 1H-NMR (300 MHz, CDCl3): δ 0.88 (t, J=6.6 Hz, 3H, CH3), δ 1.25-1.40 (m, 6H, CH2), δ 2.10 (d, J=2.1 Hz, 2H, CH2), δ 2.17 (t, J=7.6 Hz, 2H), δ 2.38-2.44 (m, 1H), δ 2.59-2.64 (m, 1H), δ 5.20 (d, J=48.8 Hz, 1H); 13C-NMR (75 MHz, CDCl3): δ 14.3, 22.8, 23.3, 28.4, 31.8, 29.9, 31.9, 34.8, 60.6, 80.3 (d, J=164 Hz), 87.9; 19F-NMR (282 MHz, CDCl3): δ -48.80 (d, J=48.7 Hz). HRMS (EI+): exact mass calculated for C11H17FO requires m/z 184.1263, found 184.1263.

[0256] Examples 5 and 6 illustrate the application of the systems and methods of the disclosure for regioselective fluorination of an organic molecule at weakly reactive sites, exemplified by dihydrojasmone (Schemes 5 and 6).
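The per-vial composition of the enzymatic reactions follows directly from the volumes stated in the experimental descriptions. The sketch below works through that dilution arithmetic, assuming the volumes are simply additive (4.8 mL reaction mixture + 600 µL NADPH stock + 600 µL regeneration solution, or 4 mL + 500 µL + 500 µL in the examples that use smaller aliquots). The function names are illustrative and not part of the disclosure.

```python
def final_concentration(stock_conc: float, added_ml: float, total_ml: float) -> float:
    """Dilution formula: C_final = C_stock * V_added / V_total."""
    return stock_conc * added_ml / total_ml


def vial_composition(mixture_ml: float, nadph_ml: float, regen_ml: float) -> dict:
    """Final cofactor concentrations in one vial, assuming additive volumes."""
    total_ml = mixture_ml + nadph_ml + regen_ml
    return {
        "total_mL": total_ml,
        # NADPH stock is 10 mM in KPi buffer.
        "NADPH_mM": final_concentration(10.0, nadph_ml, total_ml),
        # Regeneration solution: 500 mM glucose-6-phosphate, 10 units/mL dehydrogenase.
        "G6P_mM": final_concentration(500.0, regen_ml, total_ml),
        "G6PDH_units_per_mL": final_concentration(10.0, regen_ml, total_ml),
    }


aliquots = {
    "4.8 mL aliquot": (4.8, 0.6, 0.6),
    "4 mL aliquot": (4.0, 0.5, 0.5),
}
for label, volumes in aliquots.items():
    composition = vial_composition(*volumes)
    print(label, {key: round(value, 2) for key, value in composition.items()})
```

Both aliquot sizes give the same final composition under this assumption: about 1 mM NADPH, 50 mM glucose-6-phosphate, and 1 unit/mL glucose-6-phosphate dehydrogenase per vial.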

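Example 7

Regioselective Difluorination of 3-methyl-2-pentylcyclopent-2-enone (dihydrojasmone) in Position 4

[0257]

Scheme 7. [Scheme: position 4 of dihydrojasmone is converted stepwise, CH-H to CF-H to CF-OH to CF-F. The 4-fluoro derivative is hydroxylated with 0.02 mol % var2 in KPi, pH 8.0 (75%), and the activated product is deoxofluorinated with 1.3 eq DAST in CH2Cl2 at -78° C. (85%).]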

[0258] Dihydrojasmone was subjected to selective difluorination of the target site C (position 4) according to the systems and methods herein disclosed and, more specifically, according to the general procedure described above.

[0259] Experimental description: 4-fluoro-3-methyl-2-pentylcyclopent-2-enone was obtained according to the experimental protocol described in Example 5. 180 mg 4-fluoro-3-methyl-2-pentylcyclopent-2-enone were dissolved in 900 µL ethanol and added to 120 mL potassium phosphate buffer pH 8.0. Var2 was added to the mixture at a final concentration of 2 µM. The mixture was split in 4.8 mL aliquots into 15 mL scintillation vials equipped with a stir bar. 600 µL 10 mM NADPH in KPi buffer were added to each vial and stirred for 2 minutes. 600 µL cofactor regeneration solution containing 500 mM glucose-6-phosphate and 10 units/mL glucose-6-phosphate dehydrogenase were then added to each vial. The resulting mixtures were stirred at room temperature. After 36 hours, the reaction mixtures were joined together and extracted with chloroform (3x50 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (0-30% ethyl acetate/hexane) afforded the activated product (4-hydroxy-4-fluoro-3-methyl-2-pentylcyclopent-2-enone, 135 mg). 100 mg (0.54 mmol) of activated product were dissolved in 2 mL dry dichloromethane (CH2Cl2) and a catalytic amount (4 drops) of ethanol was added to the solution. The solution was cooled to -78° C. (dry ice) and added with 100 µL DAST (0.7 mmol). The reaction was stirred in dry ice for 12 hours. The reaction mixture was then added with 5 mL saturated sodium bicarbonate (NaHCO3) and extracted with dichloromethane (3x15 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (0-30% ethyl acetate/hexane) afforded the fluorinated product, 4,4-difluoro-3-methyl-2-pentylcyclopent-2-enone (85 mg, 85% yield, yellow oil). MS (EI+): m/z 202. Mw for C11H16F2O: 202.24.

[0260] Example 7 illustrates the application of the systems and methods of the disclosure for regioselective polyfluorination of an organic molecule at a weakly reactive site, exemplified by dihydrojasmone (Scheme 7).

Example 8

Regioselective Fluorination of Menthofuran

[0261]

Scheme 8. [Scheme: menthofuran is hydroxylated with 0.01 mol % var3-11 in KPi, pH 8.0 (47%), and the activated product is deoxofluorinated with 3 eq DAST in CH2Cl2 at -78° C. (22%).]

[0262] Menthofuran was subjected to selective fluorination of the target site C (position 6) according to the systems and methods herein disclosed and, more specifically, according to the general procedure described above.

[0263] Experimental description: 112 mg menthofuran were dissolved in 0.6 mL ethanol and added to 125 mL potassium phosphate buffer pH 8.0. Var3-11 was added to the mixture at a final concentration of 0.7 µM. The mixture was split in 4 mL aliquots into 15 mL scintillation vials equipped with a stir bar. 500 µL 10 mM NADPH in KPi buffer were added to each vial and stirred for 2 minutes. 500 µL cofactor regeneration solution containing 500 mM glucose-6-phosphate and 10 units/mL glucose-6-phosphate dehydrogenase were then added to each vial. The resulting mixtures were stirred at room temperature. After 24 hours, the reaction nearly reached completion (95% substrate conversion). The reaction mixtures were joined together and extracted with chloroform (3x50 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. The resulting oil (53 mg) was subjected directly to deoxofluorination without purification of the activated product. 53 mg of the activation mixture (~0.32 mmol) were dissolved in 2 mL dry dichloromethane (CH2Cl2) and a catalytic amount (4 drops) of ethanol was added to the solution. The solution was cooled to -78° C. (dry ice) and added with 150 µL DAST (1 mmol). The reaction was stirred in dry ice for 16 hours. The reaction mixture was then added with 5 mL saturated sodium bicarbonate (NaHCO3) and extracted with dichloromethane (3x15 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (0-10% ethyl acetate/hexane) afforded the fluorinated product, 6-fluoro-menthofuran-2-ol (12 mg, 22% yield, yellow oil). 1H-NMR (300 MHz, CDCl3): δ 1.13 (d, J=75.6 Hz, 3H, -CH3), δ 1.2-1.3 (m, 2H, -CH2-), δ 1.84 (s, 3H, -CH3), 1.95-2.4 (dm, 2H, -CH2-), 2.4-2.6 (dm, 2H, -CH2-); 13C-NMR (75 MHz, CDCl3): δ 22.7 (d, J=209 Hz), 43.08, 45.36, 91.7 (d, J=215 Hz), 114.9, 127.1. 19F-NMR (282 MHz, CDCl3): δ -114.4 (m). HRMS (EI+): exact mass calculated for C10H13FO2 requires m/z 184.0909, found 184.0899.

[0264] Example 8 illustrates the application of the systems and methods of the disclosure for regioselective fluorination of an organic molecule at a weakly reactive site, exemplified by menthofuran (Scheme 8).

Example 9

Regioselective Fluorination of (-)-guaiol

[0265]

Scheme 9. [Scheme: (-)-guaiol is hydroxylated at position 6 with 0.05 mol % var3-2 in KPi, pH 8.0 (12%), and the activated product is deoxofluorinated with 2.5 eq DAST in CH2Cl2 at -78° C. (45%).]

[0266] (-)-Guaiol was subjected to selective fluorination of the target site C (position 6) according to the systems and methods herein disclosed and, more specifically, according to the general procedure described above.

[0267] Experimental description: 250 mg guaiol were dissolved in 2 mL ethanol and added to 210 mL potassium phosphate buffer pH 8.0. Var3-2 was added to the mixture at a final concentration of 3 µM. The mixture was split in 4.8 mL aliquots into 15 mL scintillation vials equipped with a stir bar. 600 µL 10 mM NADPH in KPi buffer were added to each vial and stirred for 2 minutes. 600 µL cofactor regeneration solution containing 500 mM glucose-6-phosphate and 10 units/mL glucose-6-phosphate dehydrogenase were then added to each vial. The resulting mixtures were stirred at room temperature. After 48 hours, the reaction mixtures were joined together and extracted with chloroform (3x50 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (0-30% ethyl acetate/hexane) afforded the activated product (6-hydroxy-guaiol, 30 mg, colorless oil). 1H-NMR (300 MHz, CDCl3): δ 1.01 (d, J=6.9 Hz, 3H, -CH3), δ 1.22 (s, 3H, CH3), δ 1.28 (s, 3H, CH3), δ 1.25 (d, J=9 Hz, 3H, CH3), δ 1.42-1.45 (m, 2H), δ 1.685 (bs, 2H), δ 1.74-1.83 (m, 2H), δ 1.95-2.03 (m, 2H), δ 2.15-2.24 (m, 2H), δ 2.54-2.72 (m, 3H), δ 2.97-3.06 (m, 1H), δ 3.67 (d, J=9 Hz, 1H); 13C-NMR (75 MHz, CDCl3): δ 11.20, 19.41, 26.08, 28.32, 31.07, 34.13, 35.33, 38.08, 42.42, 48.01, 72.12, 73.15, 178.94. 15 mg of the activated product (0.06 mmol) were dissolved in 1 mL dry dichloromethane (CH2Cl2) and a catalytic amount (3 drops) of ethanol was added to the solution. The solution was cooled to -78° C. (dry ice) and added with 18 µL DAST (0.12 mmol). The reaction was stirred in dry ice for 16 hours. The reaction mixture was then added with 5 mL saturated sodium bicarbonate (NaHCO3) and extracted with dichloromethane (3x15 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (0-30% ethyl acetate/hexane) afforded the fluorinated product, 6-fluoro-guaiol (7 mg, 45% yield, pale yellow oil). MS (EI+): m/z 242. Mw for CHFO: 242.35.

[0268] Example 9 illustrates the application of the systems and methods of the disclosure for regioselective fluorination of an organic molecule at a weakly reactive site, exemplified by (-)-guaiol (Scheme 9).

Example 10

Regioselective Fluorination of Ibuprofen Methyl Ester (methyl 2-(4'-(2"-methylpropyl)phenyl)propanoate) in 1" Position

[0269]

Scheme 10. [Scheme: ibuprofen methyl ester is hydroxylated at position 1" with P450 BM3 (0.1 mol %) in KPi, pH 8.0 (49%), and the activated product is deoxofluorinated with 1.2 eq DAST in CH2Cl2 at -78° C. (65%).]

[0270] Ibuprofen methyl ester was subjected to selective fluorination of the target site C (position 1") according to the systems and methods herein disclosed and, more specifically, according to the general procedure described above.

[0271] Experimental description: 150 mg ibuprofen methyl ester were dissolved in 1.4 mL ethanol and added to 150 mL potassium phosphate buffer pH 8.0. P450 BM3 was added to the mixture at a final concentration of 10 µM. The mixture was split in 4 mL aliquots into 15 mL scintillation vials equipped with a stir bar. 500 µL 10 mM NADPH in KPi buffer were added to each vial and stirred for 2 minutes. 500 µL cofactor regeneration solution containing 500 mM glucose-6-phosphate and 10 units/mL glucose-6-phosphate dehydrogenase were then added to each vial. The resulting mixtures were stirred at room temperature. After 48 hours, the reaction mixtures were joined together and extracted with chloroform (3x50 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (5-40% ethyl acetate/hexane) afforded the activated product (methyl 2-(4'-(1"-hydroxy-2"-methylpropyl)phenyl)propanoate, 73 mg). 15 mg (0.06 mmol) of activated product were dissolved in 2 mL dry dichloromethane (CH2Cl2) and a catalytic amount (4 drops) of ethanol was added to the solution. The solution was cooled to -78° C. (dry ice) and added with 11 µL DAST (0.72 mmol). The reaction was stirred in dry ice for 16 hours. The reaction mixture was then added with 5 mL saturated sodium bicarbonate (NaHCO3) and extracted with dichloromethane (3x15 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (0-30% ethyl acetate/hexane) afforded the fluorinated product, methyl 2-(4'-(1"-fluoro-2"-methylpropyl)phenyl)propanoate (10 mg, 65% yield, colorless oil). 1H-NMR (300 MHz, CDCl3): δ 0.84 (d, J=6.9 Hz, 3H, CH3), δ 1.01 (d, J=6.9 Hz, 3H, -CH3), δ 1.49 (d, J=8.7 Hz, 3H, CH3), δ 2.05-2.08 (m, 1H), δ 3.66 (s, 3H, OCH3), δ 3.73 (q, J=7.5 Hz, 1H), δ 5.07 (dd, J=40.0, J=6.9 Hz, 1H, CHF), δ 7.25 (m, 4H); 13C-NMR (75 MHz, CDCl3): δ 15.5, 17.82 (d), 18.54 (d), 34.48 (d, J=85.7 Hz), 45.37, 52.31, 99.3 (d, J=174 Hz), 175.6; 19F-NMR (282 MHz, CDCl3): δ -179.8 (m). HRMS (EI+): exact mass calculated for C14H19FO2 requires m/z 238.1369, found 238.1367.

Example 11

Regioselective Fluorination of Ibuprofen Methyl Ester (methyl 2-(4'-(2"-methylpropyl)phenyl)propanoate) in 2" Position

[0272]

Scheme 11. [Scheme: ibuprofen methyl ester is hydroxylated at position 2" with 0.05 mol % var3-4 in KPi, pH 8.0 (72%), and the activated product is deoxofluorinated with 1.2 eq DAST in CH2Cl2 at -78° C. (45%).]

[0273] Ibuprofen methyl ester was subjected to selective fluorination of the target site C (position 2") according to the systems and methods herein disclosed and, more specifically, according to the general procedure described above.

[0274] Experimental description: 150 mg ibuprofen methyl ester were dissolved in 1.4 mL ethanol and added to 150 mL potassium phosphate buffer pH 8.0. Var3-4 was added to the mixture at a final concentration of 3 µM. The mixture was split in 4 mL aliquots into 15 mL scintillation vials equipped with a stir bar. 500 µL 10 mM NADPH in KPi buffer were added to each vial and stirred for 2 minutes. 500 µL cofactor regeneration solution containing 500 mM glucose-6-phosphate and 10 units/mL glucose-6-phosphate dehydrogenase were then added to each vial. The resulting mixtures were stirred at room temperature. After 48 hours, the reaction mixtures were joined together and extracted with chloroform (3x50 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (5-40% ethyl acetate/hexane) afforded the activated product (methyl 2-(4'-(2"-hydroxy-2"-methylpropyl)phenyl)propanoate, 81 mg). 15 mg (0.06 mmol) of activated product were dissolved in 2 mL dry dichloromethane (CH2Cl2) and a catalytic amount (4 drops) of ethanol was added to the solution. The solution was cooled to -78° C. (dry ice) and added with 11 µL DAST (0.72 mmol). The reaction was stirred in dry ice for 16 hours. The reaction mixture was then added with 5 mL saturated sodium bicarbonate (NaHCO3) and extracted with dichloromethane (3x15 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (0-30% ethyl acetate/hexane) afforded the fluorinated product, methyl 2-(4'-(2"-fluoro-2"-methylpropyl)phenyl)propanoate (7 mg, 45% yield, colorless oil). 1H-NMR (300 MHz, CDCl3): δ 1.28 (s, 3H, CH3), δ 1.35 (s, 3H, CH3), δ 1.48 (d, J=6.9 Hz, 3H, CH3), δ 2.87 (d, J=20.4 Hz, 2H), δ 3.65 (s, 3H, OCH3), δ 3.70 (q, J=7.15 Hz, 1H), δ 7.17 (m, 4H); 13C-NMR (75 MHz, CDCl3): δ 18.80, 26.83 (d, J=24.2 Hz), 45.24, 47.37 (d, J=22.8 Hz), 52.25, 129.12 (d, J=258.8 Hz), ca. 130, ca. 132, ca. 139, ca. 173; 19F-NMR (282 MHz, CDCl3): δ -137.7 (m). HRMS (EI+): exact mass calculated for C14H19FO2 requires m/z 238.1369, found 238.1370.

[0275] Examples 10 and 11 illustrate the application of the systems and methods of the disclosure for regioselective fluorination of an organic molecule at weakly reactive and non-reactive sites, such as positions 1" and 2" of ibuprofen methyl ester (Schemes 10 and 11).
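The DAST amounts given in the experimental descriptions can be compared with the equivalents shown in the corresponding schemes by dividing mmol of DAST by mmol of activated product. The sketch below does this for Examples 5 to 11, using only the amounts as printed; the data table, the 10% tolerance, and the function name are illustrative assumptions, and the check only reports where the two printed numbers disagree, not which of them is correct.

```python
# (example, mmol activated product, mmol DAST, equivalents shown in the scheme)
# All values are transcribed from the experimental descriptions and schemes.
DAST_DATA = [
    ("Example 5", 1.15, 1.5, 1.3),
    ("Example 6", 0.16, 0.25, 1.5),
    ("Example 7", 0.54, 0.7, 1.3),
    ("Example 8", 0.32, 1.0, 3.0),
    ("Example 9", 0.06, 0.12, 2.5),
    ("Example 10", 0.06, 0.72, 1.2),
    ("Example 11", 0.06, 0.72, 1.2),
]


def check_equivalents(rows, tolerance=0.10):
    """Return the examples whose mmol ratio deviates from the scheme value.

    tolerance is the allowed relative deviation (an arbitrary choice here).
    """
    flagged = []
    for name, substrate_mmol, dast_mmol, scheme_eq in rows:
        ratio = dast_mmol / substrate_mmol
        deviation = abs(ratio - scheme_eq) / scheme_eq
        print(f"{name}: {ratio:.2f} eq from text, {scheme_eq} eq in scheme")
        if deviation > tolerance:
            flagged.append(name)
    return flagged


flagged = check_equivalents(DAST_DATA)
print("Check these entries:", flagged)
```

With the numbers as printed, Examples 10 and 11 give 12 equivalents against 1.2 equivalents in Schemes 10 and 11, and Example 9 gives 2.0 against 2.5 in Scheme 9. A tenfold mismatch of this kind would be consistent with a decimal slip in one of the two figures, though the text alone does not settle which.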

Example 12

Regioselective Fluorination of dihydro-4-methoxymethyl-2-methyl-5-phenyl-2-oxazoline

[0276]

Scheme 12. [Scheme: dihydro-4-methoxymethyl-2-methyl-5-phenyl-2-oxazoline is demethylated to the hydroxymethyl derivative with 0.1 mol % var3-5 in KPi, pH 8.0 (64%), and the activated product is deoxofluorinated with 2 eq DAST in CH2Cl2 at -20° C. (40%).]

[0277] Dihydro-4-methoxymethyl-2-methyl-5-phenyl-2-oxazoline was subjected to selective fluorination of the target site C atom carrying a methoxy group, according to the systems and methods herein disclosed and, more specifically, according to the general procedure described above.

[0278] Experimental description: 100 mg dihydro-4-methoxymethyl-2-methyl-5-phenyl-2-oxazoline were dissolved in 1.2 mL ethanol and added to 160 mL potassium phosphate buffer pH 8.0. Var3-5 was added to the mixture at a final concentration of 3 µM. The mixture was split in 4 mL aliquots into 15 mL scintillation vials equipped with a stir bar. 500 µL 10 mM NADPH in KPi buffer were added to each vial and stirred for 2 minutes. 500 µL cofactor regeneration solution containing 500 mM glucose-6-phosphate and 10 units/mL glucose-6-phosphate dehydrogenase were then added to each vial. The resulting mixtures were stirred at room temperature. After 48 hours, the reaction mixtures were joined together and extracted with chloroform (3x50 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (20% ethyl acetate/hexane) afforded the activated product (dihydro-4-hydroxymethyl-2-methyl-5-phenyl-2-oxazoline, 64 mg). 30 mg (0.16 mmol) of activated product were dissolved in 2 mL dry dichloromethane (CH2Cl2) and a catalytic amount (4 drops) of ethanol was added to the solution. The solution was cooled to -78° C. (dry ice) and added with 22 µL DAST (0.32 mmol). The reaction was stirred in dry ice for 2 hours and then at -20° C. for 16 hours. The reaction mixture was then added with 5 mL saturated sodium bicarbonate (NaHCO3) and extracted with dichloromethane (3x15 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (20% ethyl acetate/hexane) afforded the fluorinated product, dihydro-4-fluoromethyl-2-methyl-5-phenyl-2-oxazoline (12 mg, 40% yield, colorless oil). 1H-NMR (300 MHz, CDCl3): δ 1.59 (s, 3H, CH3), δ 2.09 (dm, 1H, CH), δ 4.15-4.35 (m, 1H, CH), δ 5.66 (tm, 1H, CH), δ 7.37 (m, 5H, Ph); 19F-NMR (282 MHz, CDCl3): δ -114.14 (m). HRMS (EI+): exact mass calculated for C11H12FNO requires m/z 193.0903, found 193.0917.

[0279] In another aspect, Example 12 illustrates the application of the systems and methods of the disclosure for selective fluorination of an organic molecule at a site carrying a protected hydroxyl group, such as in dihydro-4-methoxymethyl-2-methyl-5-phenyl-2-oxazoline (Scheme 12).

Example 13

Regioselective Fluorination of 1,2,3,4,6-pentamethyl-α-D-mannopyranoside

[0280]

Scheme 13. [Scheme: 1,2,3,4,6-pentamethyl-α-D-mannopyranoside is demethylated at position 6 with 0.2 mol % var3-6 in KPi, pH 8.0 (60%), and the activated product is deoxofluorinated with 6 eq DAST in CH2Cl2 at r.t. (30%).]

[0281] 1,2,3,4,6-pentamethyl-α-D-mannopyranoside was subjected to regioselective fluorination of the target site C in position 6, according to the systems and methods herein disclosed and, more specifically, according to the general procedure described above.

[0282] Experimental description: 50 mg of 1,2,3,4,6-pentamethyl-α-D-mannopyranoside were dissolved in 0.5 mL ethanol and added to 100 mL potassium phosphate buffer pH 8.0. Var3-6 was added to the mixture at a final concentration of 4 µM. The mixture was split in 4 mL aliquots into 15 mL scintillation vials equipped with a stir bar. 500 µL 10 mM NADPH in KPi buffer were added to each vial and stirred for 2 minutes. 500 µL cofactor regeneration solution containing 500 mM glucose-6-phosphate and 10 units/mL glucose-6-phosphate dehydrogenase were then added to each vial. The resulting mixtures were stirred at room temperature. After 36 hours, the reaction mixtures were joined together and extracted with chloroform (3x50 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (10% ethyl acetate/hexane) afforded the activated product (1,2,3,4-tetramethyl-α-D-mannopyranoside, 30 mg). 15 mg (0.1 mmol) of activated product were dissolved in 2 mL dry dichloromethane (CH2Cl2) and a catalytic amount (4 drops) of ethanol was added to the solution. The solution was cooled to -78° C. (dry ice) and added with 85 µL DAST (0.6 mmol). The reaction was stirred in dry ice for 2 hours and then at room temperature for 16 hours. The reaction mixture was then added with 5 mL saturated sodium bicarbonate (NaHCO3) and extracted with dichloromethane (3x15 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (10% ethyl acetate/hexane) afforded the fluorinated product, 6-deoxy-6-fluoro-1,2,3,4-tetramethyl-α-D-mannopyranoside (4.5 mg, 30% yield, colorless oil). 1H-NMR (300 MHz, CDCl3): δ 3.35 (s, 3H, OCH3), 3.48 (s, 6H, OCH3), 3.53 (s, 3H, OCH3), 4.0-4.2 (m, 4H), 4.6 (dm, J=47.5 Hz, 2H, CH2F); 13C-NMR (75 MHz, CDCl3): δ 55.24, 57.94, 59.24, 61.02, 71.12 (d, J=18.4 Hz), 73.2, 75.5, 82.49 (d, J=192 Hz), 98.26. 19F-NMR (282 MHz, CDCl3): δ -235.2 (m). ESI-MS: m/z calculated for Mw C10H19FO5: 238.2533, found 238.28.

[0283] In another aspect, Example 13 illustrates the application of the systems and methods of the disclosure for regioselective fluorination of an organic molecule at a defined site carrying a protected hydroxyl group in the presence of other identical functional groups, such as in 1,2,3,4,6-pentamethyl-α-D-mannopyranoside (Scheme 13).

[0284] The examples set forth above are provided to give those of ordinary skill in the art a complete disclosure and description of how to make and use the embodiments of the methods and systems herein disclosed, and are not intended to limit the scope of the disclosure. Modifications of the above described modes for carrying out the disclosure that are obvious to persons of skill in the art are intended to be within the scope of the following claims. All patents and publications mentioned in the specification are indicative of the levels of skill of those skilled in the art to which the disclosure pertains. All references cited in this disclosure are incorporated by reference to the same extent as if each reference had been incorporated by reference in its entirety individually.

[0285] In summary, a method and system, and in particular a chemo-enzymatic method and system for selectively fluorinating organic molecules on a target site wherein the target site is activated and then fluorinated, are presented together with a method and system for identifying a molecule having a biological activity. In particular, a chemo-enzymatic method for preparation of selectively fluorinated derivatives of organic compounds with diverse molecular structures is presented together with a system for fluorination of an organic molecule and a method for identification of a molecule having a biological activity.

[0286] It is to be understood that the embodiments are not limited to particular compositions or biological systems, which can, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting. As used in this specification and the appended claims, the singular forms "a," "an," and "the" include plural referents unless the content clearly dictates otherwise. The term "plurality" includes two or more referents unless the content clearly dictates otherwise. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the disclosure pertains. Although any methods and materials similar or equivalent to those described herein can be used in the practice for testing of the disclosure(s), specific examples of appropriate materials and methods are described herein.

[0287] Unless otherwise indicated, the disclosure is not limited to specific molecular structures, substituents, synthetic methods, reaction conditions, or the like, as such may vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting.

[0288] The entire disclosure of each document cited (including patents, patent applications, journal articles, abstracts, laboratory manuals, books, or other disclosures) in the Background, Detailed Description, and Examples is hereby incorporated herein by reference. Further, the hard copy of the sequence listing submitted herewith and the corresponding computer readable form are both incorporated herein by reference in their entireties.

[0289] A number of embodiments of the disclosure have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the disclosure. Accordingly, other embodiments are within the scope of the following claims.


0338 Middleton, W. J. (1975). “New Fluorinating 0353 Shimizu, Y., Y. Kuruma, et al. (2006). “Cell-free Reagents—Dialkylaminosulfur Fluorides.' Journal of translation systems for protein engineering.” FEBS J. 273 Organic Chemistry 40(5): 574-578. (18): 4133-40. 0339 Mikolajczyk, M., M. Mikina, et al. (1999). “New 0354 Smith, M. (1985). “In vitro mutagenesis.” Annu Rev phosphonate-mediated Synteses of cyclopentanoids and Genet. 19: 423-62. prostaglandins.” Pure Appl. Chem. 71 (3): 473–480. 0355 Swain, C. and N. M.J. Rupniak (1999). “Progress in the development of neurokinin antagonists.” Annual (0340 Nakamaye, K. L. and F. Eckstein (1986). “Inhibition Reports in Medicinal Chemistry Vol 3434: 51-60. of restriction endonuclease Nici I cleavage by phospho 0356. Taylor, J. W., W. Schmidt, et al. (1985). “The use of rothioate groups and its application to oligonucleotide phosphorothioate-modified DNA in restriction enzyme directed mutagenesis. Nucleic Acids Res 14(24):9679–98. reactions to prepare nicked DNA. Nucleic Acids Res (0341 Nambiar, K. P. J. Stackhouse, et al. (1984). “Total 13(24): 8749-64. synthesis and cloning of a gene coding for the ribonuclease 0357 Taylor, S. D., C. C. Kotoris, et al. (1999). “Recent S protein.” Science 223 (4642): 1299-301. advances in electrophilic fluorination.” Tetrahedron (0342 Nyffeler, P.I., S. G. Duron, et al. (2005). “Select 55(43): 12431-12477. fluor: Mechanistic insight and applications.” Angewandte 0358. Togni, A., A. Mezzetti, et al. (2001). “Developing Chemie—International Edition 44(2): 192-212. catalytic enantioselective fluorination.” (0343 Ojima, I. (2004). “Use of fluorine in the medicinal 0359 Chimia 55(10): 801-805. chemistry and chemical biology of bioactive com 0360 Townsend, K. P. and D. Pratico (2005). “Novel pounds—a case study on fluorinated taxane anticancer therapeutic opportunities for Alzheimer's disease: focus on agents.” Chembiochem 5(5): 628-35. 
nonsteroidal anti-inflammatory drugs.” Faseb J 19(12): (0344) Otey, C. R., M. Landwehr, et al. (2006). “Structure 1592-601 Guided Recombination Creates an Artificial Family of 0361 van Beilen, J. B. and E. G. Funhoff (2007). “Alkane Cytochromes P450. PLoS Biol 4(5): el 12. hydroxylases involved in microbial alkane degradation.” (0345 Park, B. K., N. R. Kitteringham, et al. (2001). Appl Microbiol Biotechnol 74(1): 13-21. "Metabolism of fluorine-containing drugs.” Annu. Rev. 0362 van Niel, M. B., 1. Collins, et al. (1999). “Fluorina Pharmacol. Toxicol. 41: 443-70. tion of 3-(3-(piperidin-1-yl)propyl)indoles and 3-(3-(pip (0346 Presnell, S. R. and F. E. Cohen (1989). Proc. Natl. erazin-1-yl)propyl)indoles gives selective human 5-HT1D Acad. Sci. U.S.A. 86: 6592. receptor ligands with improved pharmacokinetic profiles.” Journal of Medicinal Chemistry 42(12): 2087-2104. 0347 Pylypenko, O. and I. Schlichting (2004). "Structural 0363 Wang, L. J. Xie, et al. (2006). “Expanding the aspects of ligand binding to and electron transfer in bacte genetic code.” Annu. Rev. Biophys. Biomol. Struct. 35: rial and fungal P450s.” Annu. Rev. Biochem. 73:991-1018. 225-249. 0348 Sakamoto, T.J. M. Joern, et al. (2001). “Laboratory 0364 Wells, J. A., M. Vasser, et al. (1985). “Cassette evolution of toluene dioxygenase to accept 4-picoline as a mutagenesis: an efficient method for generation of mul substrate. Applied and Environmental Microbiology tiple mutations at defined sites. Gene 34(2-3): 315-23. 67(9): 3882-+. 0365 Yamazaki, Y., S.Yusa, et al. (1996). “Effect of fluo (0349 Sayers, J. R. W. Schmidt, et al. (1988). “5'-3' exo rine Substitution of alpha- and beta-hydrogen atoms in nucleases in phosphorothioate-based oligonucleotide-di ethyl phenylacetate and phenylpropionate on their stereo rected mutagenesis. Nucleic Acids Res 16(3): 791-802. selective hydrolysis by cultured cancer cells' J. Fluorine 0350 Schwarzer, D. and P. A. Cole (2005). “Protein semi Chem. 79(2): 167-171. 
synthesis and expressed protein ligation: chasing a pro 0366 Zoller, M.J. (1992). “New recombinant DNA meth tein's tail.” Curr. Opin. Chem. Biol. 9(6): 561-9. odology for protein engineering. Curr Opin Biotechnol 0351 Shibata, N., T. Ishimaru, et al. (2004). “First enan 3(4): 348-54. tio-flexible fluorination reaction using metal-bis(oxazo 0367 Zoller, M. J. and M. Smith (1983). “Oligonucle line) complexes.” Synlett (10): 1703-1706. otide-directed mutagenesis of DNA fragments cloned into 0352 Shibata, N., E. Suzuki, et al. (2000). “A fundamen M13 vectors.” Methods Enzymol 100: 468-500. tally new approach to enantioselective fluorination based 0368 Zoller, M. J. and M. Smith (1987). “Oligonucle on cinchona alkaloid derivatives/selectfluor combination.” otide-directed mutagenesis: a simple method using two Journal of the American Chemical Society 122(43): 10728 oligonucleotide primers and a single-stranded DNA tem 10729. plate.” Methods Enzymol 154: 329-50.

SEQUENCE LISTING

<160> NUMBER OF SEQ ID NOS: 70

<210> SEQ ID NO 1
<211> LENGTH: 10
<212> TYPE: PRT
<213> ORGANISM: Artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Cytochrome P450 signature sequence
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (2)..(2)
<223> OTHER INFORMATION: Xaa can be any amino acid
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (3)..(3)
<223> OTHER INFORMATION: Xaa can be any amino acid
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (5)..(5)
<223> OTHER INFORMATION: Xaa can be any amino acid
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (6)..(6)
<223> OTHER INFORMATION: Xaa can be Histidine or Arginine
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (7)..(7)
<223> OTHER INFORMATION: Xaa can be any amino acid
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (9)..(9)
<223> OTHER INFORMATION: Xaa can be any amino acid

<400> SEQUENCE: 1

Phe Xaa Xaa Gly Xaa Xaa Xaa Cys Xaa Gly 1 5 10

<210> SEQ ID NO 2
<211> LENGTH: 1048
<212> TYPE: PRT
<213> ORGANISM: Bacillus megaterium
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (1)..(1048)
<223> OTHER INFORMATION: Cytochrome P450 enzyme CYP102A1

<400> SEQUENCE: 2

Thr Ile Lys Glu Met Pro Gln Pro Lys Thr Phe Gly Glu Leu Lys Asn 1 5 10 15

Leu Pro Leu Leu Asn Thr Asp Lys Pro Val Gln Ala Leu Met Lys Ile 20 25 30

Ala Asp Glu Leu Gly Glu Ile Phe Lys Phe Glu Ala Pro Gly Arg Val 35 40 45

Thr Arg Tyr Leu Ser Ser Gln Arg Leu Ile Lys Glu Ala Cys Asp Glu 50 55 60

Ser Arg Phe Asp Lys Asn Leu Ser Gln Ala Leu Lys Phe Val Arg Asp 65 70 75 80

Phe Ala Gly Asp Gly Leu Phe Thr Ser Trp Thr His Glu Lys Asn Trp 85 90 95

Lys Lys Ala His Asn Ile Leu Leu Pro Ser Phe Ser Gln Gln Ala Met 100 105 110

Lys Gly Tyr His Ala Met Met Val Asp Ile Ala Val Gln Leu Val Gln 115 120 125

Lys Trp Glu Arg Leu Asn Ala Asp Glu His Ile Glu Val Pro Glu Asp 130 135 140

Met Thr Arg Leu Thr Leu Asp Thr Ile Gly Leu Cys Gly Phe Asn Tyr 145 150 155 160

Arg Phe Asn Ser Phe Tyr Arg Asp Gln Pro His Pro Phe Ile Thr Ser 165 170 175

Met Val Arg Ala Leu Asp Glu Ala Met Asn Lys Leu Gln Arg Ala Asn 180 185 190

Pro Asp Asp Pro Ala Asp Glu Asn Arg Gln Phe Gln Glu Asp 195

Ile Lys Val Met Asn Asp Leu Val Asp Ile Ile Ala Asp Arg 210 215

Ala Ser Gly Glu Gln Ser Asp Asp Leu Leu Thr His Met Leu Asn Gly 225 230 235 240

Asp Pro Glu Thr Gly Glu Pro Leu Asp Asp Glu Asn Ile Arg Tyr 245 250 255

Gln Ile Ile Thr Phe Leu Ile Ala Gly His Glu Thr Thr Ser Gly Leu 260 265 270

Leu Ser Phe Ala Leu Tyr Phe Leu Val Asn Pro His Val Leu Gln 275 285

Ala Ala Glu Glu Ala Ala Arg Val Leu Val Asp Pro Val Pro Ser 290 295 300

Tyr Gln Val Lys Gln Leu Val Gly Met Val Leu Asn Glu 305 310 315

Ala Leu Arg Leu Trp Pro Thr Ala Pro Ala Phe Ser Leu Ala Lys 325 330 335

Glu Asp Thr Val Leu Gly Gly Glu Tyr Pro Leu Glu Gly Asp Glu 340 345 350

Leu Met Val Leu Ile Pro Gln Leu His Arg Asp Thr Ile Trp Gly 355 360 365

Asp Asp Val Glu Glu Phe Arg Pro Glu Arg Phe Glu Asn Pro Ser Ala 370 375

Ile Pro Gln His Ala Phe Pro Phe Asn Gly Gln Arg Ala Cys 385 390 395 400

Ile Gly Gln Gln Phe Ala Leu His Glu Thr Leu Val Leu Gly Met 405 415

Met Leu His Phe Asp Phe Glu Asp Thr Asn Tyr Glu Leu Asp 425 430

Ile Glu Thr Leu Thr Leu Lys Pro Gly Phe Val Val Ala 435 440 445

Ser Ile Pro Leu Gly Gly Pro Ser Pro Ser Thr Glu 450 455 460

Gln Ser Ala Lys Val Arg Glu Asn Ala His Asn Thr 465 470

Pro Leu Leu Val Leu Tyr Gly Ser Asn Met Gly Thr Ala Glu Gly Thr 485 490 495

Ala Arg Asp Leu Ala Asp Ile Ala Met Ser Gly Phe Ala Pro Gln 500 505

Val Ala Thr Leu Asp Ser His Ala Gly Asn Leu Pro Arg Glu Gly Ala 515 525

Val Leu Ile Val Thr Ala Ser Asn Gly His Pro Pro Asp Asn Ala 530 535 540

Lys Gln Phe Val Asp Trp Leu Asp Gln Ala Ser Ala Asp Glu Val Lys 545 550 555 560

Gly Val Arg Tyr Ser Val Phe Gly Gly Asp Asn Trp Ala Thr 565 570 575

Thr Gln Lys Val Pro Ala Phe Ile Asp Glu Thr Leu Ala Ala 580 585 590

Gly Ala Glu Asn Ile Ala Asp Arg Gly Glu Ala Asp Ala Ser Asp Asp 595 600 605

Phe Glu Gly Thr Tyr Glu Glu Trp Arg Glu His Met Trp Ser Asp Val 610 615 620

Ala Ala Tyr Phe Asn Leu Asp Ile Glu Asn Ser Glu Asp Asn Lys Ser 625 630 635 640

Thr Leu Ser Leu Gln Phe Val Asp Ser Ala Ala Asp Met Pro Leu Ala 645 650 655

Lys Met His Gly Ala Phe Ser Thr Asn Val Val Ala Ser Lys Glu Leu 660 665 670

Gln Gln Pro Gly Ser Ala Arg Ser Thr Arg His Leu Glu Ile Glu Leu 675 680 685

Pro Lys Glu Ala Ser Tyr Gln Glu Gly Asp His Leu Gly Val Ile Pro 690 695 700

Arg Asn Tyr Glu Gly Ile Val Asn Arg Val Thr Ala Arg Phe Gly Leu 705 710 715 720

Asp Ala Ser Gln Gln Ile Arg Leu Glu Ala Glu Glu Glu Lys Leu Ala 725 730 735

His Leu Pro Leu Ala Lys Thr Val Ser Val Glu Glu Leu Leu Gln Tyr 740 745 750

Val Glu Leu Gln Asp Pro Val Thr Arg Thr Gln Leu Arg Ala Met Ala 755 760 765

Ala Lys Thr Val Cys Pro Pro His Lys Val Glu Leu Glu Ala Leu Leu 770 775 780

Glu Lys Gln Ala Tyr Lys Glu Gln Val Leu Ala Lys Arg Leu Thr Met 785 790 795 800

Leu Glu Leu Leu Glu Lys Tyr Pro Ala Cys Glu Met Lys Phe Ser Glu 805 810 815

Phe Ile Ala Leu Leu Pro Ser Ile Arg Pro Arg Tyr Tyr Ser Ile Ser 820 825 830

Ser Ser Pro Arg Val Asp Glu Lys Gln Ala Ser Ile Thr Val Ser Val 835 840 845

Val Ser Gly Glu Ala Trp Ser Gly Tyr Gly Glu Tyr Lys Gly Ile Ala 850 855 860

Ser Asn Tyr Leu Ala Glu Leu Gln Glu Gly Asp Thr Ile Thr Cys Phe 865 870 875 880

Ile Ser Thr Pro Gln Ser Glu Phe Thr Leu Pro Lys Asp Pro Glu Thr 885 890 895

Pro Leu Ile Met Val Gly Pro Gly Thr Gly Val Ala Pro Phe Arg Gly 900 905 910

Phe Val Gln Ala Arg Lys Gln Leu Lys Glu Gln Gly Gln Ser Leu Gly 915 920 925

Glu Ala His Leu Tyr Phe Gly Cys Arg Ser Pro His Glu Asp Tyr Leu 930 935 940

Tyr Gln Glu Glu Leu Glu Asn Ala Gln Ser Glu Gly Ile Ile Thr Leu 945 950 955 960

His Thr Ala Phe Ser Arg Met Pro Asn Gln Pro Lys Thr Tyr Val Gln 965 970 975

His Val Met Glu Gln Asp Gly Lys Lys Leu Ile Glu Leu Leu Asp Gln 980 985 990

Gly Ala His Phe Tyr Ile Cys Gly Asp Gly Ser Gln Met Ala Pro Ala 995 1000 1005

Val Glu Ala Thr Leu Met Lys Ser Tyr Ala Asp Val His Gln Val 1010 1015 1020

Ser Glu Ala Asp Ala Arg Leu Trp Leu Gln Gln Leu Glu Glu Lys 1025 1030 1035

Gly Arg Tyr Ala Lys Asp Val Trp Ala Gly 1040 1045

<210> SEQ ID NO 3
<211> LENGTH: 1059
<212> TYPE: PRT
<213> ORGANISM: Bacillus subtilis
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (1)..(1059)
<223> OTHER INFORMATION: Cytochrome P450 enzyme CYP102A2

<400> SEQUENCE: 3

Lys Glu Thr Ser Pro Ile Pro Gln Pro Lys Thr Phe Gly Pro Leu Gly 1 5 10 15

Asn Leu Pro Leu Ile Asp Lys Asp Lys Pro Thr Leu Ser Leu Ile Lys 20 25 30

Leu Ala Glu Glu Gln Gly Pro Ile Phe Gln Ile His Thr Pro Ala Gly 35 40 45

Thr Thr Ile Val Val Ser Gly His Glu Leu Val Lys Glu Val Cys Asp 50 55 60

Glu Glu Arg Phe Asp Lys Ser Ile Glu Gly Ala Leu Glu Lys Val Arg 65 70 75 80

Ala Phe Ser Gly Asp Gly Leu Phe Thr Ser Trp Thr His Glu Pro Asn 85 90 95

Trp Arg Lys Ala His Asn Ile Leu Met Pro Thr Phe Ser Gln Arg Ala 100 105 110

Met Lys Asp Tyr His Glu Lys Met Val Asp Ile Ala Val Gln Leu Ile 115 120 125

Gln Lys Trp Ala Arg Leu Asn Pro Asn Glu Ala Val Asp Val Pro Gly 130 135 140

Asp Met Thr Arg Leu Thr Leu Asp Thr Ile Gly Leu Cys Gly Phe Asn 145 150 155 160

Tyr Arg Phe Asn Ser Tyr Tyr Arg Glu Thr Pro His Pro Phe Ile Asn 165 170 175

Ser Met Val Arg Ala Leu Asp Glu Ala Met His Gln Met Gln Arg Leu 180 185 190

Asp Val Gln Asp Lys Leu Met Val Arg Thr Lys Arg Gln Phe Arg Tyr 195 200 205

Asp Ile Gln Thr Met Phe Ser Leu Val Asp Ser Ile Ile Ala Glu Arg 210 215 220

Arg Ala Asn Gly Asp Gln Asp Glu Lys Asp Leu Leu Ala Arg Met Leu 225 230 235 240

Asn Val Glu Asp Pro Glu Thr Gly Glu Lys Leu Asp Asp Glu Asn Ile 245 250 255

Arg Phe Gln Ile Ile Thr Phe Leu Ile Ala Gly His Glu Thr Thr Ser 260 265 270

Gly Leu Leu Ser Phe Ala Thr Tyr Phe Leu Leu Lys His Pro Asp Lys 275 280 285

Leu Lys Lys Ala Tyr Glu Glu Val Asp Arg Val Leu Thr Asp Ala Ala 290 295 300

Pro Thr Tyr Lys Gln Val Leu Glu Leu Thr Tyr Ile Arg Met Ile Leu 305 310 315 320

Asn Glu Ser Leu Arg Leu Trp Pro Thr Ala Pro Ala Phe Ser Leu Tyr 325 330 335

Pro Lys Glu Asp Thr Val Ile Gly Gly Lys Phe Pro Ile Thr Thr Asn 340 345 350

Asp Arg Ile Ser Val Leu Ile Pro Gln Leu His Arg Asp Arg Asp Ala 355 360 365

Trp Gly Lys Asp Ala Glu Glu Phe Arg Pro Glu Arg Phe Glu His Gln 370 375 380

Asp Gln Val Pro His His Ala Tyr Lys Pro Phe Gly Asn Gly Gln Arg 385 390 395 400

Ala Cys Ile Gly Met Gln Phe Ala Leu His Glu Ala Thr Leu Val Leu 405 410 415

Gly Met Ile Leu Lys Tyr Phe Thr Leu Ile Asp His Glu Asn Tyr Glu 420 425 430

Leu Asp Ile Lys Gln Thr Leu Thr Leu Lys Pro Gly Asp Phe His Ile 435 440 445

Ser Val Gln Ser Arg His Gln Glu Ala Ile His Ala Asp Val Gln Ala 450 455 460

Ala Glu Lys Ala Ala Pro Asp Glu Gln Lys Glu Lys Thr Glu Ala Lys 465 470 475 480

Gly Ala Ser Val Ile Gly Leu Asn Asn Arg Pro Leu Leu Val Leu Tyr 485 490 495

Gly Ser Asp Thr Gly Thr Ala Glu Gly Val Ala Arg Glu Leu Ala Asp 500 505 510

Thr Ala Ser Leu His Gly Val Arg Thr Lys Thr Ala Pro Leu Asn Asp 515 520 525

Arg Ile Gly Lys Leu Pro Lys Glu Gly Ala Val Val Ile Val Thr Ser 530 535 540

Ser Tyr Asn Gly Lys Pro Pro Ser Asn Ala Gly Gln Phe Val Gln Trp 545 550 555 560

Leu Gln Glu Ile Lys Pro Gly Glu Leu Glu Gly Val His Tyr Ala Val 565 570 575

Phe Gly Cys Gly Asp His Asn Trp Ala Ser Thr Tyr Gln Tyr Val Pro 580 585 590

Arg Phe Ile Asp Glu Gln Leu Ala Glu Lys Gly Ala Thr Arg Phe Ser 595 600 605

Ala Arg Gly Glu Gly Asp Val Ser Gly Asp Phe Glu Gly Gln Leu Asp 610 615 620

Glu Trp Lys Lys Ser Met Trp Ala Asp Ala Ile Lys Ala Phe Gly Leu 625 630 635 640

Glu Leu Asn Glu Asn Ala Asp Lys Glu Arg Ser Thr Leu Ser Leu Gln 645 650 655

Phe Val Arg Gly Leu Gly Glu Ser Pro Leu Ala Arg Ser Tyr Glu Ala 660 665 670

Ser His Ala Ser Ile Ala Glu Asn Arg Glu Leu Gln Ser Ala Asp Ser 675 680 685

Asp Arg Ser Thr Arg His Ile Glu Ile Ala Leu Pro Pro Asp Val Glu 690 695 700

Tyr Gln Glu Gly Asp His Leu Gly Val Leu Pro Lys Asn Ser Gln Thr 705 710 715 720

Asn Val Ser Arg Ile Leu His Arg Phe Gly Leu Lys Gly Thr Asp Gln 725 730 735

Val Thr Leu Ser Ala Ser Gly Arg Ser Ala Gly His Leu Pro Leu Gly 740 745 750

Arg Pro Val Ser Leu His Asp Leu Leu Ser Tyr Ser Val Glu Val Gln 755 760 765

Glu Ala Ala Thr Arg Ala Gln Ile Arg Glu Leu Ala Ser Phe Thr Val 770 775 780

Cys Pro Pro His Arg Arg Glu Leu Glu Glu Leu Ser Ala Glu Gly Val 785 790 795 800

Tyr Gln Glu Gln Ile Leu Lys Lys Arg Ile Ser Met Leu Asp Leu Leu 805 810 815

Glu Lys Tyr Glu Ala Cys Asp Met Pro Phe Glu Arg Phe Leu Glu Leu 820 825 830

Leu Arg Pro Leu Lys Pro Arg Tyr Tyr Ser Ile Ser Ser Ser Pro Arg 835 840 845

Val Asn Pro Arg Gln Ala Ser Ile Thr Val Gly Val Val Arg Gly Pro 850 855 860

Ala Trp Ser Gly Arg Gly Glu Tyr Arg Gly Val Ala Ser Asn Asp Leu 865 870 875 880

Ala Glu Arg Gln Ala Gly Asp Asp Val Val Met Phe Ile Arg Thr Pro 885 890 895

Glu Ser Arg Phe Gln Leu Pro Lys Asp Pro Glu Thr Pro Ile Ile Met 900 905 910

Val Gly Pro Gly Thr Gly Val Ala Pro Phe Arg Gly Phe Leu Gln Ala 915 920 925

Arg Asp Val Leu Lys Arg Glu Gly Lys Thr Leu Gly Glu Ala His Leu 930 935 940

Tyr Phe Gly Cys Arg Asn Asp Arg Asp Phe Ile Tyr Arg Asp Glu Leu 945 950 955 960

Glu Arg Phe Glu Lys Asp Gly Ile Val Thr Val His Thr Ala Phe Ser 965 970 975

Arg Lys Glu Gly Met Pro Lys Thr Tyr Val Gln His Leu Met Ala Asp 980 985 990

Gln Ala Asp Thr Leu Ile Ser Ile Leu Asp Arg Gly Gly Arg Leu Tyr 995 1000 1005

Val Cys Gly Asp Gly Ser Lys Met Ala Pro Asp Val Glu Ala Ala 1010 1015 1020

Leu Gln Lys Ala Tyr Gln Ala Val His Gly Thr Gly Glu Gln Glu 1025 1030 1035

Ala Gln Asn Trp Leu Arg His Leu Gln Asp Thr Gly Met Tyr Ala 1040 1045 1050

Lys Asp Val Trp Ala Gly 1055

<210> SEQ ID NO 4
<211> LENGTH: 1052
<212> TYPE: PRT
<213> ORGANISM: Bacillus subtilis
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (1)..(1052)
<223> OTHER INFORMATION: Cytochrome P450 enzyme CYP102A3

<400> SEQUENCE: 4

Lys Gln Ala Ser Ala Ile Pro Gln Pro Lys Thr Tyr Gly Pro Leu Lys 1 5 10 15

Asn Leu Pro His Leu Glu Lys Glu Gln Leu Ser Gln Ser Leu Trp Arg 20 25 30

Ile Ala Asp Glu Leu Gly Pro Ile Phe Arg Phe Asp Phe Pro Gly Val 35 40 45

Ser Ser Val Phe Val Ser Gly His Asn Leu Val Ala Glu Val Cys Asp 50 55 60

Glu Lys Arg Phe Asp Lys Asn Leu Gly Lys Gly Leu Gln Lys Val Arg 65 70 75 80

Glu Phe Gly Gly Asp Gly Leu Phe Thr Ser Trp Thr His Glu Pro Asn 85 90 95

Trp Gln Lys Ala His Arg Ile Leu Leu Pro Ser Phe Ser Gln Lys Ala 100 105 110

Met Lys Gly Tyr His Ser Met Met Leu Asp Ile Ala Thr Gln Leu Ile 115 120 125

Gln Lys Trp Ser Arg Leu Asn Pro Asn Glu Glu Ile Asp Val Ala Asp 130 135 140

Asp Met Thr Arg Leu Thr Leu Asp Thr Ile Gly Leu Cys Gly Phe Asn 145 150 155 160

Tyr Arg Phe Asn Ser Phe Tyr Arg Asp Ser Gln His Pro Phe Ile Thr 165 170 175

Ser Met Leu Arg Ala Leu Lys Glu Ala Met Asn Gln Ser Lys Arg Leu 180 185 190

Gly Leu Gln Asp Lys Met Met Val Lys Thr Lys Leu Gln Phe Gln Lys 195 200 205

Asp Ile Glu Val Met Asn Ser Leu Val Asp Arg Met Ile Ala Glu Arg 210 215 220

Lys Ala Asn Pro Asp Glu Asn Ile Lys Asp Leu Leu Ser Leu Met Leu 225 230 235 240

Tyr Ala Lys Asp Pro Val Thr Gly Glu Thr Leu Asp Asp Glu Asn Ile 245 250 255

Arg Tyr Gln Ile Ile Thr Phe Leu Ile Ala Gly His Glu Thr Thr Ser 260 265 270

Gly Leu Leu Ser Phe Ala Ile Tyr Cys Leu Leu Thr His Pro Glu Lys 275 280 285

Leu Lys Lys Ala Gln Glu Glu Ala Asp Arg Val Leu Thr Asp Asp Thr 290 295 300

Pro Glu Tyr Lys Gln Ile Gln Gln Leu Lys Tyr Ile Arg Met Val Leu 305 310 315 320

Asn Glu Thr Leu Arg Leu Tyr Pro Thr Ala Pro Ala Phe Ser Leu Tyr 325 330 335

Ala Lys Glu Asp Thr Val Leu Gly Gly Glu Tyr Pro Ile Ser Lys Gly 340 345 350

Gln Pro Val Thr Val Leu Ile Pro Lys Leu His Arg Asp Gln Asn Ala 355 360 365

Trp Gly Pro Asp Ala Glu Asp Phe Arg Pro Glu Arg Phe Glu Asp Pro 370 375 380

Ser Ser Ile Pro His His Ala Tyr Lys Pro Phe Gly Asn Gly Gln Arg 385 390 395 400

Ala Cys Ile Gly Met Gln Phe Ala Leu Gln Glu Ala Thr Met Val Leu 405 410 415

Gly Leu Val Leu Lys His Phe Glu Leu Ile Asn His Thr Gly Tyr Glu 420 425 430

Leu Lys Ile Lys Glu Ala Leu Thr Ile Lys Pro Asp Asp Phe Lys Ile 435 440 445

Thr Val Lys Pro Arg Lys Thr Ala Ala Ile Asn Val Gln Arg Lys Glu 450 455 460

Gln Ala Asp Ile Lys Ala Glu Thr Lys Pro Lys Glu Thr Lys Pro Lys 465 470 475 480

His Gly Thr Pro Leu Leu Val Leu Phe Gly Ser Asn Leu Gly Thr Ala 485 490 495

Glu Gly Ile Ala Gly Glu Leu Ala Ala Gln Gly Arg Gln Met Gly Phe 500 505 510

Thr Ala Glu Thr Ala Pro Leu Asp Asp Tyr Ile Gly Lys Leu Pro Glu 515 520 525

Glu Gly Ala Val Val Ile Val Thr Ala Ser Tyr Asn Gly Ala Pro Pro 530 535 540

Asp Asn Ala Ala Gly Phe Val Glu Trp Leu Lys Glu Leu Glu Glu Gly 545 550 555 560

Gln Leu Lys Gly Val Ser Tyr Ala Val Phe Gly Cys Gly Asn Arg Ser 565 570 575

Trp Ala Ser Thr Tyr Gln Arg Ile Pro Arg Leu Ile Asp Asp Met Met 580 585 590

Lys Ala Lys Gly Ala Ser Arg Leu Thr Ala Ile Gly Glu Gly Asp Ala 595 600 605

Ala Asp Asp Phe Glu Ser His Arg Glu Ser Trp Glu Asn Arg Phe Trp 610 615 620

Lys Glu Thr Met Asp Ala Phe Asp Ile Asn Glu Ile Ala Gln Lys Glu 625 630 635 640

Asp Arg Pro Ser Leu Ser Ile Thr Phe Leu Ser Glu Ala Thr Glu Thr 645 650 655

Pro Val Ala Lys Ala Tyr Gly Ala Phe Glu Gly Ile Val Leu Glu Asn 660 665 670

Arg Glu Leu Gln Thr Ala Ala Ser Thr Arg Ser Thr Arg His Ile Glu 675 680 685

Leu Glu Ile Pro Ala Gly Lys Thr Tyr Lys Glu Gly Asp His Ile Gly 690 695 700

Ile Leu Pro Lys Asn Ser Arg Glu Leu Val Gln Arg Val Leu Ser Arg 705 710 715 720

Phe Gly Leu Gln Ser Asn His Val Ile Lys Val Ser Gly Ser Ala His 725 730 735

Met Ala His Leu Pro Met Asp Arg Pro Ile Lys Val Val Asp Leu Leu 740 745 750

Ser Ser Tyr Val Glu Leu Gln Glu Pro Ala Ser Arg Leu Gln Leu Arg 755 760 765

Glu Leu Ala Ser Tyr Thr Val Cys Pro Pro His Gln Lys Glu Leu Glu 770 775 780

Gln Leu Val Ser Asp Asp Gly Ile Tyr Lys Glu Gln Val Leu Ala Lys 785 790 795 800

Arg Leu Thr Met Leu Asp Phe Leu Glu Asp Tyr Pro Ala Cys Glu Met 805 810 815

Pro Phe Glu Arg Phe Leu Ala Leu Leu Pro Ser Leu Lys Pro Arg Tyr 820 825 830

Tyr Ser Ile Ser Ser Ser Pro Lys Val His Ala Asn Ile Val Ser Met 835 840 845

Thr Val Gly Val Val Lys Ala Ser Ala Trp Ser Gly Arg Gly Glu Tyr 850 855 860

Arg Gly Val Ala Ser Asn Tyr Leu Ala Glu Leu Asn Thr Gly Asp Ala 865 870 875 880

Ala Ala Cys Phe Ile Arg Thr Pro Gln Ser Gly Phe Gln Met Pro Asn 885 890 895

Asp Pro Glu Thr Pro Met Ile Met Val Gly Pro Gly Thr Gly Ile Ala 900 905 910

Pro Phe Arg Gly Phe Ile Gln Ala Arg Ser Val Leu Lys Lys Glu Gly 915 920 925

Ser Thr Leu Gly Glu Ala Leu Leu Tyr Phe Gly Cys Arg Arg Pro Asp 930 935 940

His Asp Asp Leu Tyr Arg Glu Glu Leu Asp Gln Ala Glu Gln Asp Gly 945 950 955 960

Leu Val Thr Ile Arg Arg Cys Tyr Ser Arg Val Glu Asn Glu Pro Lys 965 970 975

Gly Tyr Val Gln His Leu Leu Lys Gln Asp Thr Gln Lys Leu Met Thr 980 985 990

Leu Ile Glu Lys Gly Ala His Ile Tyr Val Cys Gly Asp Gly Ser Gln 995 1000 1005

Met Ala Pro Asp Val Glu Arg Thr Leu Arg Leu Ala Tyr Glu Ala 1010 1015 1020

Glu Lys Ala Ala Ser Gln Glu Glu Ser Ala Val Trp Leu Gln Lys 1025 1030 1035

Leu Gln Asp Gln Arg Arg Tyr Val Lys Asp Val Trp Thr Gly 1040 1045 1050

<210> SEQ ID NO 5
<211> LENGTH: 1065
<212> TYPE: PRT
<213> ORGANISM: Bacillus cereus
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (1)..(1065)
<223> OTHER INFORMATION: Cytochrome P450 enzyme CYP102A5

<400> SEQUENCE: 5

Met Asp Lys Lys Val Ser Ala Ile Pro Gln Pro Lys Thr Tyr Gly Pro 1 5 10 15

Leu Gly Asn Leu Pro Leu Ile Asp Lys Asp Lys Pro Thr Leu Ser Phe 20 25 30

Ile Lys Ile Ala Glu Glu Tyr Gly Pro Ile Phe Gln Ile Gln Thr Leu 35 40 45

Ser Asp Thr Ile Ile Val Ile Ser Gly His Glu Leu Val Ala Glu Val 50 55 60

Cys Asp Glu Thr Arg Phe Asp Ser Ile Glu Gly Ala Leu Ala Lys 65 70

Val Arg Ala Phe Ala Gly Asp Gly Leu Phe Thr Ser Glu Thr Gln Glu 85 90 95

Pro Asn Trp Lys Lys Ala His Asn Ile Leu Met Pro Thr Phe Ser Gln 100 105 110

Arg Ala Met His Ala Met Met Val Asp Ile Ala Val Gln 115 120 125

Leu Val Gln Ala Arg Leu Asn Pro Asn Glu Asn Val Asp Val 130 135 140

Pro Glu Asp Met Thr Arg Leu Thr Leu Asp Thr Ile Gly Leu Gly 145 150 155 160

Phe Asn Arg Phe Asn Ser Phe Arg Glu Thr Pro His Pro Phe 165 175

Ile Thr Ser Met Thr Arg Ala Leu Asp Glu Ala Met His Gln Leu Gln 180 185 190

Arg Leu Asp Ile Glu Asp Leu Met Trp Arg Thr Lys Arg Gln Phe 195

Gln His Asp Ile Gln Ser Met Phe Ser Leu Val Asp Asn Ile Ile Ala 210 215

Glu Arg Ser Ser Gly Asn Gln Glu Glu Asn Asp Leu Leu Ser Arg 225 230 235 240

Met Leu His Val Gln Asp Pro Glu Thr Gly Glu Leu Asp Asp Glu 245 250 255

Asn Ile Arg Phe Gln Ile Ile Thr Phe Leu Ile Ala Gly His Glu Thr 260 265 270

Thr Ser Gly Leu Leu Ser Phe Ala Ile Phe Leu Leu Asn Pro 285

Asp Lys Leu Ala Tyr Glu Glu Val Asp Arg Val Leu Thr Asp 290 295 300

Pro Thr Pro Thr Tyr Gln Gln Val Met Leu Ile Arg Met 305 310 315

Ile Leu Asn Glu Ser Leu Arg Leu Trp Pro Thr Ala Pro Ala Phe Ser 325 330 335

Leu Ala Lys Glu Asp Thr Val Ile Gly Gly Pro Ile 340 345 350

Gly Glu Asp Arg Ile Ser Val Leu Ile Pro Gln Leu His Arg Asp 355 360 365

Asp Ala Trp Gly Asp Asn Val Glu Glu Phe Gln Pro Glu Arg Phe 370 375

Glu Asp Leu Asp Lys Val Pro His His Ala Tyr Pro Phe Gly Asn 385 390 395 400

Gly Gln Arg Ala Cys Ile Gly Met Gln Phe Ala Leu His Glu Ala Thr 405 415

Leu Val Met Gly Met Leu Leu Gln His Phe Glu Phe Ile Asp Glu 420 425 430

Asp Gln Leu Asp Val Gln Thr Leu Thr Leu Lys Pro Gly Asp 435 440 445

Phe Lys Ile Arg Ile Val Pro Arg Asn Gln Asn Ile Ser His Thr Thr 450 455 460

Val Leu Ala Pro Thr Glu Glu Lys Leu Lys Asn His Glu Ile Lys Gln 465 470 475 480

Gln Val Gln Lys Thr Pro Ser Ile Ile Gly Ala Asp Asn Leu Ser Leu 485 490 495

Leu Val Leu Tyr Gly Ser Asp Thr Gly Val Ala Glu Gly Ile Ala Arg 500 505 510

Glu Leu Ala Asp Thr Ala Ser Leu Glu Gly Val Gln Thr Glu Val Ala 515 520 525

Ala Leu Asn Asp Arg Ile Gly Ser Leu Pro Lys Glu Gly Ala Val Leu 530 535 540

Ile Val Thr Ser Ser Tyr Asn Gly Lys Pro Pro Ser Asn Ala Gly Gln 545 550 555 560

Phe Val Gln Trp Leu Glu Glu Leu Lys Pro Asp Glu Leu Lys Gly Val 565 570 575

Gln Tyr Ala Val Phe Gly Cys Gly Asp His Asn Trp Ala Ser Thr Tyr 580 585 590

Gln Arg Ile Pro Arg Tyr Ile Asp Glu Gln Met Ala Gln Lys Gly Ala 595 600 605

Thr Arg Phe Ser Thr Arg Gly Glu Ala Asp Ala Ser Gly Asp Phe Glu 610 615 620

Glu Gln Leu Glu Gln Trp Lys Glu Ser Met Trp Ser Asp Ala Met Lys 625 630 635 640

Ala Phe Gly Leu Glu Leu Asn Lys Asn Met Glu Lys Glu Arg Ser Thr 645 650 655

Leu Ser Leu Gln Phe Val Ser Arg Leu Gly Gly Ser Pro Leu Ala Arg 660 665 670

Thr Tyr Glu Ala Val Tyr Ala Ser Ile Leu Glu Asn Arg Glu Leu Gln 675 680 685

Ser Ser Ser Ser Glu Arg Ser Thr Arg His Ile Glu Ile Ser Leu Pro 690 695 700

Glu Gly Ala Thr Tyr Lys Glu Gly Asp His Leu Gly Val Leu Pro Ile 705 710 715 720

Asn Ser Glu Lys Asn Val Asn Arg Ile Leu Lys Arg Phe Gly Leu Asn 725 730 735

Gly Lys Asp Gln Val Ile Leu Ser Ala Ser Gly Arg Ser Val Asn His 740 745 750

Ile Pro Leu Asp Ser Pro Val Arg Leu Tyr Asp Leu Leu Ser Tyr Ser 755 760 765

Val Glu Val Gln Glu Ala Ala Thr Arg Ala Gln Ile Arg Glu Met Val 770 775 780

Thr Phe Thr Ala Cys Pro Pro His Lys Lys Glu Leu Glu Ser Leu Leu 785 790 795 800

Glu Asp Gly Val Tyr His Glu Gln Ile Leu Lys Lys Arg Ile Ser Met 805 810 815

Leu Asp Leu Leu Glu Lys Tyr Glu Ala Cys Glu Ile Arg Phe Glu Arg 820 825 830

Phe Leu Glu Leu Leu Pro Ala Leu Lys Pro Arg Tyr Tyr Ser Ile Ser 835 840 845

Ser Ser Pro Leu Ile Ala Gln Asp Arg Leu Ser Ile Thr Val Gly Val 850 855 860

Val Asn Ala Pro Ala Trp Ser Gly Glu Gly Thr Tyr Glu Gly Val Ala 865 870 875 880

Ser Asn Tyr Leu Ala Gln Arg His Asn Lys Asp Glu Ile Ile Cys Phe 885 890 895

Ile Arg Thr Pro Gln Ser Asn Phe Gln Leu Pro Glu Asn Pro Glu Thr 900 905 910

Pro Ile Ile Met Val Gly Pro Gly Thr Gly Ile Ala Pro Phe Arg Gly 915 920 925

Phe Leu Gln Ala Arg Arg Val Gln Lys Gln Lys Gly Met Asn Leu Gly 930 935 940

Glu Ala His Leu Tyr Phe Gly Cys Arg His Pro Glu Lys Asp Tyr Leu 945 950 955 960

Tyr Arg Thr Glu Leu Glu Asn Asp Glu Arg Asp Gly Leu Ile Ser Leu 965 970 975

His Thr Ala Phe Ser Arg Leu Glu Gly His Pro Lys Thr Tyr Val Gln 980 985 990

His Val Ile Lys Glu Asp Arg Met Asn Leu Ile Ser Leu Leu Asp Asn 995 1000 1005

Gly Ala His Leu Tyr Ile Cys Gly Asp Gly Ser Lys Met Ala Pro 1010 1015 1020

Asp Val Glu Asp Thr Leu Cys Gln Ala Tyr Gln Glu Ile His Glu 1025 1030 1035

Val Ser Glu Gln Glu Ala Arg Asn Trp Leu Asp Arg Leu Gln Asp 1040 1045 1050

Glu Gly Arg Tyr Gly Lys Asp Val Trp Ala Gly Ile 1055 1060 1065

<210 SEQ ID NO 6 <211 LENGTH: 1063 &212> TYPE: PRT <213> ORGANISM: Ralstonia metallidurans &220s FEATURE: <221 NAME/KEY: MISC FEATURE <222> LOCATION: (1) ... (1063) <223> OTHER INFORMATION: Cytochrome P450 enzyme CYP102E1 <4 OO SEQUENCE: 6 Ser Thr Ala Thr Pro Ala Ala Ala Lieu. Glu Pro Ile Pro Arg Asp Pro 1. 5 1O 15 Gly Trp Pro Ile Phe Gly Asn Lieu Phe Glin Ile Thr Pro Gly Glu Val 2O 25 3O Gly Gln His Lieu. Lieu Ala Arg Ser Arg His His Asp Gly Ile Phe Glu 35 4 O 45 Lieu. Asp Phe Ala Gly Lys Arg Val Pro Phe Val Ser Ser Val Ala Lieu. SO 55 6 O Ala Ser Glu Lieu. Cys Asp Ala Thr Arg Phe Arg Lys Ile Ile Gly Pro 65 70 7s 8O Pro Lieu. Ser Tyr Lieu. Arg Asp Met Ala Gly Asp Gly Lieu. Phe Thr Ala 85 90 95 His Ser Asp Glu Pro Asn Trp Gly Cys Ala His Arg Ile Lieu Met Pro 1OO 105 11 O Ala Phe Ser Glin Arg Ala Met Lys Ala Tyr Phe Asp Wal Met Lieu. Arg 115 12 O 125 Val Ala Asn Arg Lieu Val Asp Llys Trp Asp Arg Glin Gly Pro Asp Ala US 2009/0061471 A1 Mar. 5, 2009 47

- Continued

13 O 135 14 O Asp Ile Ala Val Ala Asp Asp Met Thr Arg Lieu. Thir Lieu. Asp Thir Ile 145 150 155 160 Ala Lieu Ala Gly Phe Gly Tyr Asp Phe Ala Ser Phe Ala Ser Asp Glu 1.65 17O 17s Lieu. Asp Pro Phe Wal Met Ala Met Val Gly Ala Lieu. Gly Glu Ala Met 18O 185 19 O Glin Llys Lieu. Thir Arg Lieu Pro Ile Glin Asp Arg Phe Met Gly Arg Ala 195 2OO 2O5 His Arg Glin Ala Ala Glu Asp Ile Ala Tyr Met Arg Asn Lieu Val Asp 21 O 215 22O Asp Val Ile Arg Glin Arg Arg Val Ser Pro Thir Ser Gly Met Asp Lieu 225 23 O 235 24 O Lieu. Asn Lieu Met Lieu. Glu Ala Arg Asp Pro Glu Thir Asp Arg Arg Lieu. 245 250 255 Asp Asp Ala Asn. Ile Arg Asn Glin Val Ile Thr Phe Lieu. Ile Ala Gly 26 O 265 27 O His Glu Thir Thr Ser Gly Lieu. Lieu. Thr Phe Ala Lieu. Tyr Glu Lieu. Leu 27s 28O 285 Arg ASn Pro Gly Val Lieu Ala Glin Ala Tyr Ala Glu Val Asp Thr Val 29 O 295 3 OO Lieu. Pro Gly Asp Ala Lieu. Pro Val Tyr Ala Asp Lieu. Ala Arg Met Pro 3. OS 310 315 32O Val Lieu. Asp Arg Val Lieu Lys Glu Thir Lieu. Arg Lieu. Trp Pro Thir Ala 3.25 330 335 Pro Ala Phe Ala Val Ala Pro Phe Asp Asp Val Val Lieu. Gly Gly Arg 34 O 345 35. O Tyr Arg Lieu. Arg Lys Asp Arg Arg Ile Ser Val Val Lieu. Thir Ala Lieu. 355 360 365 His Arg Asp Pro Llys Val Trp Ala Asn Pro Glu Arg Phe Asp Ile Asp 37 O 375 38O Arg Phe Lieu Pro Glu Asn. Glu Ala Lys Lieu Pro Ala His Ala Tyr Met 385 390 395 4 OO Pro Phe Gly Glin Gly Glu Arg Ala Cys Ile Gly Arg Glin Phe Ala Lieu. 4 OS 41O 415 Thr Glu Ala Lys Lieu Ala Lieu Ala Lieu Met Lieu. Arg Asn. Phe Ala Phe 42O 425 43 O Glin Asp Pro His Asp Tyr Glin Phe Arg Lieu Lys Glu Thir Lieu. Thir Ile 435 44 O 445 Llys Pro Asp Glin Phe Val Lieu. Arg Val Arg Arg Arg Arg Pro His Glu 450 45.5 460 Arg Phe Val Thr Arg Glin Ala Ser Glin Ala Val Ala Asp Ala Ala Glin 465 470 47s 48O Thir Asp Val Arg Gly His Gly Glin Ala Met Thr Val Lieu. Cys Ala Ser 485 490 495 Ser Lieu. Gly. Thir Ala Arg Glu Lieu Ala Glu Glin Ile His Ala Gly Ala SOO 505 51O Ile Ala Ala Gly Phe Asp Ala Lys Lieu Ala Asp Lieu. 
Asp Asp Ala Val 515 52O 525 Gly Val Lieu Pro Thr Ser Gly Lieu Val Val Val Val Ala Ala Thr Tyr 53 O 535 54 O US 2009/0061471 A1 Mar. 5, 2009 48


Asn Gly Arg Ala Pro Asp Ser Ala Arg Llys Phe Glu Ala Met Lieu. Asp 5.45 550 555 560 Ala Asp Asp Ala Ser Gly Tyr Arg Ala Asn Gly Met Arg Lieu Ala Lieu 565 st O sts Lieu. Gly Cys Gly Asn Ser Gln Trp Ala Thr Tyr Glin Ala Phe Pro Arg 58O 585 59 O Arg Val Phe Asp Phe Phe Ile Thr Ala Gly Ala Val Pro Leu Lleu Pro 595 6OO 605 Arg Gly Glu Ala Asp Gly Asn Gly Asp Phe Asp Glin Ala Ala Glu Arg 610 615 62O Trp Lieu Ala Glin Lieu. Trp Glin Ala Lieu. Glin Ala Asp Gly Ala Gly Thr 625 630 635 64 O Gly Gly Lieu. Gly Val Asp Val Glin Val Arg Ser Met Ala Ala Ile Arg 645 650 655 Ala Glu Thir Lieu Pro Ala Gly Thr Glin Ala Phe Thr Val Lieu. Ser Asn 660 665 67 O Asp Glu Lieu Val Gly Asp Pro Ser Gly Lieu. Trp Asp Phe Ser Ile Glu 675 68O 685 Ala Pro Arg Thir Ser Thr Arg Asp Ile Arg Lieu Gln Lieu Pro Pro Gly 69 O. 695 7 OO Ile Thr Tyr Arg Thr Gly Asp His Ile Ala Val Trp Pro Glin Asn Asp 7 Os 71O 71s 72O Ala Glin Lieu Val Ser Glu Lieu. Cys Glu Arg Lieu. Asp Lieu. Asp Pro Asp 72 73 O 73 Ala Glin Ala Thr Ile Ser Ala Pro His Gly Met Gly Arg Gly Lieu Pro 740 74. 7 O Ile Asp Glin Ala Lieu Pro Val Arg Glin Lieu. Lieu. Thir His Phe Ile Glu 7ss 760 765 Lieu. Glin Asp Val Val Ser Arg Glin Thr Lieu. Arg Ala Lieu Ala Glin Ala 770 775 78O Thr Arg Cys Pro Phe Thr Lys Glin Ser Ile Glu Gln Leu Ala Ser Asp 78s 79 O 79. 8OO Asp Ala Glu. His Gly Tyr Ala Thr Llys Val Val Ala Arg Arg Lieu. Gly 805 810 815 Ile Lieu. Asp Val Lieu Val Glu. His Pro Ala Ile Ala Lieu. Thir Lieu. Glin 82O 825 83 O Glu Lieu. Leu Ala Cys Thr Val Pro Met Arg Pro Arg Lieu. Tyr Ser Ile 835 84 O 845 Ala Ser Ser Pro Lieu Val Ser Pro Asp Wall Ala Thr Lieu. Lieu Val Gly 850 855 860 Thr Val Cys Ala Pro Ala Lieu. Ser Gly Arg Gly Glin Phe Arg Gly Val 865 87O 87s 88O Ala Ser Thir Trp Lieu Gln His Lieu Pro Pro Gly Ala Arg Val Ser Ala 885 890 895 Ser Ile Arg Thr Pro Asn Pro Pro Phe Ala Pro Asp Pro Asp Pro Ala 9 OO 905 91 O Ala Pro Met Leu Lleu. Ile Gly Pro Gly Thr Gly Ile Ala Pro Phe Arg 915 92 O 925 Gly Phe Lieu. Glu Glu Arg Ala Lieu. 
Arg Llys Met Ala Gly Asn Ala Val 93 O 935 94 O US 2009/0061471 A1 Mar. 5, 2009 49

Thr Pro Ala Gln Leu Tyr Phe Gly Cys Arg His Pro Gln His Asp Trp 945 950 955 960

Leu Tyr Arg Glu Asp Ile Glu Arg Trp Ala Gly Gln Gly Val Val Glu 965 970 975

Val His Pro Ala Tyr Ser Val Val Pro Asp Ala Pro Arg Tyr Val Gln 980 985 990

Asp Leu Leu Trp Gln Arg Arg Glu Gln Val Trp Ala Gln Val Arg Asp 995 1000 1005

Gly Ala Thr Ile Tyr Val Cys Gly Asp Gly Arg Arg Met Ala Pro 1010 1015 1020

Ala Val Arg Gln Thr Leu Ile Glu Ile Gly Met Ala Gln Gly Gly 1025 1030 1035

Met Thr Asp Lys Ala Ala Ser Asp Trp Phe Gly Gly Leu Val Ala 1040 1045 1050

Gln Gly Arg Tyr Arg Gln Asp Val Phe Asn 1055 1060

<210> SEQ ID NO 7 <211> LENGTH: 1077 <212> TYPE: PRT <213> ORGANISM: Bradyrhizobium japonicum <220> FEATURE: <221> NAME/KEY: MISC_FEATURE <222> LOCATION: (1)...(1077) <223> OTHER INFORMATION: Cytochrome P450 enzyme CYP102A6 <400> SEQUENCE: 7

Ser Ser Lys Asn Arg Leu Asp Pro Ile Pro Gln Pro Pro Thr Lys Pro 1 5 10 15

Val Val Gly Asn Met Leu Ser Leu Asp Ser Ala Ala Pro Val Gln His 20 25 30

Leu Thr Arg Leu Ala Glu Leu Gly Pro Ile Phe Trp Leu Asp Met 35 40 45

Met Gly Ser Pro Ile Val Val Val Ser Gly His Asp Leu Val Asp Glu 50 55 60

Leu Ser Asp Glu Lys Arg Phe Asp Thr Val Arg Gly Ala Leu Arg 65 70 75 80

Arg Val Arg Ala Val Gly Gly Asp Gly Leu Phe Thr Ala Asp Thr Arg 85 90 95

Glu Pro Asn Trp Ser Ala His Asn Ile Leu Leu Gln Pro Phe Gly 100 105 110

Asn Arg Ala Met Gln Ser His Pro Ser Met Val Asp Ile Ala Glu 115 120 125

Gln Leu Val Gln Lys Trp Glu Arg Leu Asn Ala Asp Asp Glu Ile Asp 130 135 140

Val Val His Asp Met Thr Ala Leu Thr Leu Asp Thr Ile Gly Leu Cys 145 150 155 160

Gly Phe Asp Tyr Arg Phe Asn Ser Phe Tyr Arg Arg Asp His Pro 165 170 175

Phe Val Glu Ser Leu Val Arg Ser Leu Glu Thr Ile Met Met Thr Arg 180 185 190

Gly Leu Pro Phe Glu Gln Ile Trp Met Gln Lys Arg Arg Thr Leu 195 200 205

Ala Glu Asp Val Ala Phe Met Asn Met Val Asp Glu Ile Ile Ala


21 O 215 22O Glu Arg Arg Llys Ser Ala Glu Gly Ile Asp Asp Llys Lys Asp Met Lieu. 225 23 O 235 24 O Ala Ala Met Met Thr Gly Val Asp Arg Ser Thr Gly Glu Glin Lieu. Asp 245 250 255 Asp Wall Asn. Ile Arg Tyr Glin Ile Asn. Thir Phe Lieu. Ile Ala Gly His 26 O 265 27 O Glu Thir Thr Ser Gly Lieu Lleu Ser Tyr Thr Lieu. Tyr Ala Lieu. Leu Lys 27s 28O 285 His Pro Asp Ile Lieu Lys Lys Ala Tyr Asp Glu Val Asp Arg Val Phe 29 O 295 3 OO Gly Pro Asp Val Asn Ala Lys Pro Thr Tyr Glin Glin Val Thr Glin Leu 3. OS 310 315 32O Thr Tyr Ile Thr Glin Ile Leu Lys Glu Ala Lieu. Arg Lieu. Trp Pro Pro 3.25 330 335 Ala Pro Ala Tyr Gly Ile Ser Pro Lieu Ala Asp Glu Thir Ile Gly Gly 34 O 345 35. O Gly Lys Tyr Lys Lieu. Arg Lys Gly Thr Phe Ile Thr Ile Leu Val Thr 355 360 365 Ala Lieu. His Arg Asp Pro Ser Val Trp Gly Pro Asn Pro Asp Ala Phe 37 O 375 38O Asp Pro Glu ASn Phe Ser Arg Glu Ala Glu Ala Lys Arg Pro Ile ASn 385 390 395 4 OO Ala Trp Llys Pro Phe Gly Asn Gly Glin Arg Ala Cys Ile Gly Arg Gly 4 OS 41O 415 Phe Ala Met His Glu Ala Ala Lieu Ala Lieu. Gly Met Ile Lieu. Glin Arg 42O 425 43 O Phe Llys Lieu. Ile Asp His Glin Arg Tyr Gln Met His Lieu Lys Glu Thir 435 44 O 445 Lieu. Thir Met Llys Pro Glu Gly Phe Lys Ile Llys Val Arg Pro Arg Ala 450 45.5 460 Asp Arg Glu Arg Gly Ala Tyr Gly Gly Pro Val Ala Ala Val Ser Ser 465 470 47s 48O Ala Pro Arg Ala Pro Arg Gln Pro Thr Ala Arg Pro Gly His Asn Thr 485 490 495 Pro Met Lieu Val Lieu. Tyr Gly Ser Asn Lieu. Gly Thr Ala Glu Glu Lieu. SOO 505 51O Ala Thr Arg Met Ala Asp Lieu Ala Glu Ile Asn Gly Phe Ala Wal His 515 52O 525 Lieu. Gly Ala Lieu. Asp Glu Tyr Val Gly Lys Lieu Pro Glin Glu Gly Gly 53 O 535 54 O Val Lieu. Ile Ile Cys Ala Ser Tyr Asn Gly Ala Pro Pro Asp Asn Ala 5.45 550 555 560 Thr Glin Phe Wall Lys Trp Lieu. Gly Ser Asp Lieu Pro Lys Asp Ala Phe 565 st O sts Ala Asn Val Arg Tyr Ala Val Phe Gly Cys Gly Asn. 
Ser Asp Trp Ala 58O 585 59 O Ala Thr Tyr Glin Ser Val Pro Arg Phe Ile Asp Glu Gln Leu Ser Gly 595 6OO 605 His Gly Ala Arg Ala Val Tyr Pro Arg Gly Glu Gly Asp Ala Arg Ser 610 615 62O US 2009/0061471 A1 Mar. 5, 2009 51


Asp Lieu. Asp Gly Glin Phe Glin Llys Trp Phe Pro Ala Ala Ala Glin Val 625 630 635 64 O Ala Thr Lys Glu Phe Gly Ile Asp Trp Asin Phe Thr Arg Thr Ala Glu 645 650 655 Asp Asp Pro Lieu. Tyr Ala Ile Glu Pro Val Ala Val Thr Ala Val Asn 660 665 67 O Thir Ile Val Ala Glin Gly Gly Ala Val Ala Met Llys Val Lieu Val Asn 675 68O 685 Asp Glu Lieu. Glin Asn Llys Ser Gly Ser Asn Pro Ser Glu Arg Ser Thr 69 O. 695 7 OO Arg His Ile Glu Val Glin Leu Pro Ser Asn Ile Thr Tyr Arg Val Gly 7 Os 71O 71s 72O Asp His Lieu. Ser Val Val Pro Arg Asn Asp Pro Thr Lieu Val Asp Ser 72 73 O 73 Val Ala Arg Arg Phe Gly Phe Lieu Pro Ala Asp Glin Ile Arg Lieu. Glin 740 74. 7 O Val Ala Glu Gly Arg Arg Ala Glin Lieu Pro Val Gly Glu Ala Val Ser 7ss 760 765 Val Gly Arg Lieu Lleu Ser Glu Phe Val Glu Lieu. Glin Glin Val Ala Thr 770 775 78O Arg Lys Glin Ile Glin Ile Met Ala Glu. His Thr Arg Cys Pro Val Thr 78s 79 O 79. 8OO Llys Pro Llys Lieu. Lieu Ala Phe Val Gly Glu Glu Ala Glu Pro Ala Glu 805 810 815 Arg Tyr Arg Thr Glu Ile Lieu Ala Met Arg Llys Ser Val Tyr Asp Lieu 82O 825 83 O Lieu. Leu Glu Tyr Pro Ala Cys Glu Lieu Pro Phe His Val Tyr Lieu. Glu 835 84 O 845 Met Leu Ser Lieu. Leu Ala Pro Arg Tyr Tyr Ser Ile Ser Ser Ser Pro 850 855 860 Ser Val Asp Pro Ala Arg Cys Ser Ile Thr Val Gly Val Val Glu Gly 865 87O 87s 88O Pro Ala Ala Ser Gly Arg Gly Val Tyr Lys Gly Ile Cys Ser Asn Tyr 885 890 895 Lieu Ala Asn Arg Arg Ala Ser Asp Ala Ile Tyr Ala Thr Val Arg Glu 9 OO 905 91 O Thir Lys Ala Gly Phe Arg Lieu Pro Asp Asp Ser Ser Val Pro Ile Ile 915 92 O 925 Met Ile Gly Pro Gly Thr Gly Lieu Ala Pro Phe Arg Gly Phe Leu Gln 93 O 935 94 O Glu Arg Ala Ala Arg Lys Ala Lys Gly Ala Ser Lieu. Gly Pro Ala Met 945 950 955 96.O Lieu. Phe Phe Gly Cys Arg His Pro Asp Glin Asp Phe Lieu. Tyr Ala Asp 965 97O 97. Glu Lieu Lys Ala Lieu Ala Ala Ser Gly Val Thr Glu Lieu. Phe Thr Ala 98O 985 99 O Phe Ser Arg Ala Asp Gly Pro Llys Thr Tyr Val Glin His Wall Lieu Ala 995 1OOO 1005 Ala Glin Lys Asp Llys Val Trp Pro Lieu. 
Ile Glu Glin Gly Ala Ile 1010 1015 1 O2O US 2009/0061471 A1 Mar. 5, 2009 52

Ile Tyr Val Cys Gly Asp Gly Gly Gln Met Glu Pro Asp Val Lys 1025 1030 1035 Ala Ala Leu Val Ala Ile Arg His Glu Lys Ser Gly Ser Asp Thr 1040 1045 1050 Ala Thr Ala Ala Arg Trp Ile Glu Glu Met Gly Ala Thr Asn Arg 1055 1060 1065 Tyr Val Leu Asp Val Trp Ala Gly Gly 1070 1075

<210> SEQ ID NO 8 <211> LENGTH: 415 <212> TYPE: PRT <213> ORGANISM: Pseudomonas putida <220> FEATURE: <221> NAME/KEY: MISC_FEATURE <222> LOCATION: (1)...(415) <223> OTHER INFORMATION: Cytochrome P450 enzyme CYP101A1

<400> SEQUENCE: 8

Met Thr Thr Glu Thr Ile Gln Ser Asn Ala Asn Leu Ala Pro Leu Pro 1 5 10 15

Pro His Val Pro Glu His Leu Val Phe Asp Phe Asp Met Tyr Asn Pro 20 25 30

Ser Asn Leu Ser Ala Gly Val Gln Glu Ala Trp Ala Val Leu Gln Glu 35 40 45

Ser Asn Val Pro Asp Leu Val Trp Thr Arg Asn Gly Gly His Trp 50 55 60

Ile Ala Thr Arg Gly Gln Leu Ile Arg Glu Ala Glu Asp Tyr Arg 65 70 75 80

His Phe Ser Ser Glu Pro Phe Ile Pro Arg Glu Ala Gly Glu Ala 85 90 95

Asp Phe Ile Pro Thr Ser Met Asp Pro Pro Glu Gln Arg Gln Phe 100 105 110

Arg Ala Leu Ala Asn Gln Val Val Gly Met Pro Val Val Asp Lys Leu 115 120 125

Glu Asn Arg Ile Gln Glu Leu Ala Cys Ser Leu Ile Glu Ser Leu Arg 130 135 140

Pro Gln Gly Gln Cys Asn Phe Thr Glu Asp Tyr Ala Glu Pro Phe Pro 145 150 155 160

Ile Arg Ile Phe Met Leu Leu Ala Gly Leu Pro Glu Glu Asp Ile Pro 165 170 175

His Leu Tyr Leu Thr Asp Gln Met Thr Arg Pro Asp Gly Ser Met 180 185 190

Thr Phe Ala Glu Ala Glu Ala Leu Tyr Asp Leu Ile Pro Ile 195 200 205

Ile Glu Gln Arg Arg Gln Lys Pro Gly Thr Asp Ala Ile Ser Ile Val 210 215 220

Ala Asn Gly Gln Val Asn Gly Arg Pro Ile Thr Ser Asp Glu Ala Lys 225 230 235 240

Arg Met Gly Leu Leu Leu Val Gly Gly Leu Asp Thr Val Val Asn 245 250 255

Phe Leu Ser Phe Ser Met Glu Phe Leu Ala Lys Ser Pro Glu His Arg 260 265 270

Gln Glu Leu Ile Glu Arg Pro Glu Arg Ile Pro Ala Ala Cys Glu Glu


27s 28O 285 Lieu. Lieu. Arg Arg Phe Ser Lieu Val Ala Asp Gly Arg Ile Lieu. Thir Ser 29 O 295 3 OO Asp Tyr Glu Phe His Gly Val Glin Lieu Lys Lys Gly Asp Glin Ile Lieu 3. OS 310 315 32O Lieu Pro Gln Met Lieu. Ser Gly Lieu. Asp Glu Arg Glu Asn Ala Cys Pro 3.25 330 335 Met His Val Asp Phe Ser Arg Gln Llys Val Ser His Thr Thr Phe Gly 34 O 345 35. O His Gly Ser His Lieu. Cys Lieu. Gly Gln His Lieu Ala Arg Arg Glu Ile 355 360 365 Ile Val Thr Lieu Lys Glu Trp Lieu. Thr Arg Ile Pro Asp Phe Ser Ile 37 O 375 38O Ala Pro Gly Ala Glin Ile Glin His Llys Ser Gly Ile Val Ser Gly Val 385 390 395 4 OO Glin Ala Lieu Pro Lieu Val Trp Asp Pro Ala Thir Thr Lys Ala Val 4 OS 41O 415

<210> SEQ ID NO 9 <211> LENGTH: 410 <212> TYPE: PRT <213> ORGANISM: Bacillus megaterium <220> FEATURE: <221> NAME/KEY: MISC_FEATURE <222> LOCATION: (1)...(410) <223> OTHER INFORMATION: Cytochrome P450 enzyme CYP106A2

<4 OO SEQUENCE: 9 Met Lys Glu Val Ile Ala Wall Lys Glu Ile Thr Arg Phe Llys Thr Arg 1. 5 1O 15 Thr Glu Glu Phe Ser Pro Tyr Ala Trp Cys Lys Arg Met Leu Glu Asn 2O 25 3O Asp Pro Val Ser Tyr His Glu Gly Thr Asp Thir Trp Asn Val Phe Lys 35 4 O 45 Tyr Glu Asp Wall Lys Arg Val Lieu. Ser Asp Tyr Llys His Phe Ser Ser SO 55 6 O Val Arg Lys Arg Thr Thr Ile Ser Val Gly Thr Asp Ser Glu Glu Gly 65 70 7s 8O Ser Val Pro Glu Lys Ile Glin Ile Thr Glu Ser Asp Pro Pro Asp His 85 90 95 Arg Lys Arg Arg Ser Lieu. Lieu Ala Ala Ala Phe Thr Pro Arg Ser Lieu 1OO 105 11 O Glin Asn Trp Glu Pro Arg Ile Glin Glu Ile Ala Asp Glu Lieu. Ile Gly 115 12 O 125 Glin Met Asp Gly Gly Thr Glu Ile Asp Ile Val Ala Ser Lieu Ala Ser 13 O 135 14 O Pro Leu Pro Ile Ile Val Met Ala Asp Leu Met Gly Val Pro Ser Lys 145 150 155 160 Asp Arg Lieu. Lieu. Phe Llys Llys Trp Val Asp Thir Lieu. Phe Lieu Pro Phe 1.65 17O 17s Asp Arg Glu Lys Glin Glu Glu Val Asp Llys Lieu Lys Glin Val Ala Ala 18O 185 19 O Lys Glu Tyr Tyr Glin Tyr Lieu. Tyr Pro Ile Val Val Glin Lys Arg Lieu. 195 2OO 2O5 US 2009/0061471 A1 Mar. 5, 2009 54


Asn Pro Ala Asp Asp Ile Ile Ser Asp Lieu. Lieu Lys Ser Glu Val Asp 21 O 215 22O Gly Glu Met Phe Thr Asp Asp Glu Val Val Arg Thr Thr Met Lieu. Ile 225 23 O 235 24 O Lieu. Gly Ala Gly Val Glu Thir Thir Ser His Lieu. Lieu Ala Asn. Ser Phe 245 250 255 Tyr Ser Lieu. Lieu. Tyr Asp Asp Llys Glu Val Tyr Glin Glu Lieu. His Glu 26 O 265 27 O Asn Lieu. Asp Lieu Val Pro Glin Ala Val Glu Glu Met Lieu. Arg Phe Arg 27s 28O 285 Phe Asn Lieu. Ile Llys Lieu. Asp Arg Thr Val Lys Glu Asp Asn Asp Lieu. 29 O 295 3 OO Lieu. Gly Val Glu Lieu Lys Glu Gly Asp Ser Val Val Val Trp Met Ser 3. OS 310 315 32O Ala Ala Asn Met Asp Glu Glu Met Phe Glu Asp Pro Phe Thr Lieu. Asn 3.25 330 335 Ile His Arg Pro Asn. Asn Llys Llys His Lieu. Thir Phe Gly Asn Gly Pro 34 O 345 35. O His Phe Cys Lieu. Gly Ala Pro Lieu Ala Arg Lieu. Glu Ala Lys Ile Ala 355 360 365 Lieu. Thir Ala Phe Lieu Lys Llys Phe Llys His Ile Glu Ala Val Pro Ser 37 O 375 38O Phe Glin Lieu. Glu Glu Asn Lieu. Thir Asp Ser Ala Thr Gly Glin Thr Lieu. 385 390 395 4 OO Thir Ser Lieu Pro Lieu Lys Ala Ser Arg Met 4 OS 41O

<210> SEQ ID NO 10 <211> LENGTH: 404 <212> TYPE: PRT <213> ORGANISM: Citrobacter brakii <220> FEATURE: <221> NAME/KEY: MISC_FEATURE <222> LOCATION: (1)...(404) <223> OTHER INFORMATION: Cytochrome P450 enzyme P450cin

<4 OO SEQUENCE: 10 Met Thr Ala Thr Val Ala Ser Thr Ser Leu Phe Thr Thr Ala Asp His 1. 5 1O 15 Tyr His Thr Pro Leu Gly Pro Asp Gly Thr Pro His Ala Phe Phe Glu 2O 25 3O Ala Lieu. Arg Asp Glu Ala Glu Thir Thr Pro Ile Gly Trp Ser Glu Ala 35 4 O 45 Tyr Gly Gly. His Trp Val Val Ala Gly Tyr Lys Glu Ile Glin Ala Val SO 55 6 O Ile Glin Asn Thr Lys Ala Phe Ser Asn Lys Gly Val Thr Phe Pro Arg 65 70 7s 8O Tyr Glu Thr Gly Glu Phe Glu Lieu Met Met Ala Gly Glin Asp Asp Pro 85 90 95 Val His Llys Llys Tyr Arg Glin Lieu Val Ala Lys Pro Phe Ser Pro Glu 1OO 105 11 O Ala Thr Asp Lieu. Phe Thr Glu Gln Lieu. Arg Glin Ser Thr Asn Asp Lieu 115 12 O 125 US 2009/0061471 A1 Mar. 5, 2009 55

- Continued Ile Asp Ala Arg Ile Glu Lieu. Gly Glu Gly Asp Ala Ala Thir Trp Lieu. 13 O 135 14 O Ala Asn. Glu Ile Pro Ala Arg Lieu. Thir Ala Ile Lieu. Lieu. Gly Lieu Pro 145 150 155 160 Pro Glu Asp Gly Asp Thr Tyr Arg Arg Trp Val Trp Ala Ile Thr His 1.65 17O 17s Val Glu Asn. Pro Glu Glu Gly Ala Glu Ile Phe Ala Glu Lieu Val Ala 18O 185 19 O His Ala Arg Thr Lieu. Ile Ala Glu Arg Arg Thr Asn Pro Gly Asn Asp 195 2OO 2O5 Ile Met Ser Arg Val Ile Met Ser Lys Ile Asp Gly Glu Ser Lieu. Ser 21 O 215 22O Glu Asp Asp Lieu. Ile Gly Phe Phe Thir Ile Lieu Lleu Lieu. Gly Gly Ile 225 23 O 235 24 O Asp Asn. Thir Ala Arg Phe Lieu. Ser Ser Val Phe Trp Arg Lieu Ala Trip 245 250 255 Asp Ile Glu Lieu. Arg Arg Arg Lieu. Ile Ala His Pro Glu Lieu. Ile Pro 26 O 265 27 O Asn Ala Val Asp Glu Lieu. Lieu. Arg Phe Tyr Gly Pro Ala Met Val Gly 27s 28O 285 Arg Lieu Val Thr Glin Glu Val Thr Val Gly Asp Ile Thr Met Llys Pro 29 O 295 3 OO Gly Glin Thir Ala Met Lieu. Trp Phe Pro Ile Ala Ser Arg Asp Arg Ser 3. OS 310 315 32O Ala Phe Asp Ser Pro Asp Asn. Ile Val Ile Glu Arg Thr Pro Asn Arg 3.25 330 335 His Lieu. Ser Lieu. Gly His Gly Ile His Arg Cys Lieu. Gly Ala His Lieu. 34 O 345 35. O Ile Arg Val Glu Ala Arg Val Ala Ile Thr Glu Phe Lieu Lys Arg Ile 355 360 365 Pro Glu Phe Ser Lieu. Asp Pro Asn Lys Glu. Cys Glu Trp Lieu Met Gly 37 O 375 38O Glin Val Ala Gly Met Lieu. His Val Pro Ile Ile Phe Pro Lys Gly Lys 385 390 395 4 OO Arg Lieu. Ser Glu

<210> SEQ ID NO 11 <211> LENGTH: 428 <212> TYPE: PRT <213> ORGANISM: Pseudomonas sp <220> FEATURE: <221> NAME/KEY: MISC_FEATURE <222> LOCATION: (1)...(428) <223> OTHER INFORMATION: Cytochrome P450 enzyme P450terp

<4 OO SEQUENCE: 11 Met Asp Ala Arg Ala Thir Ile Pro Glu. His Ile Ala Arg Thr Val Ile 1. 5 1O 15 Lieu Pro Glin Gly Tyr Ala Asp Asp Glu Val Ile Tyr Pro Ala Phe Lys 2O 25 3O Trp Lieu. Arg Asp Glu Glin Pro Lieu Ala Met Ala His Ile Glu Gly Tyr 35 4 O 45 Asp Pro Met Trp Ile Ala Thr Lys His Ala Asp Val Met Glin Ile Gly SO 55 6 O US 2009/0061471 A1 Mar. 5, 2009 56


Lys Gln Pro Gly Leu Phe Ser Asn Ala Glu Gly Ser Glu Ile Leu 65 70 75 80

Asp Gln Asn Asn Glu Ala Phe Met Arg Ser Ile Ser Gly Cys Pro 85 90 95

His Val Ile Asp Ser Leu Thr Ser Met Asp Pro Pro Thr Thr Ala 100 105 110

Arg Gly Leu Thr Leu Asn Trp Phe Gln Pro Ala Ser Arg 115 120 125

Leu Glu Glu Asn Ile Arg Arg Ile Ala Gln Ala Ser Val Arg Leu 130 135 140

Leu Asp Phe Asp Gly Glu Asp Phe Met Thr Asp Leu Tyr 145 150 155 160

Pro Leu His Val Val Met Thr Ala Leu Gly Val Pro Asp Asp 165 170 175

Glu Pro Leu Met Leu Leu Thr Gln Asp Phe Phe Gly Val His Glu 180 185 190

Pro Asp Glu Gln Ala Val Ala Ala Pro Arg Gln Ser Ala Asp Glu Ala 195 200 205

Ala Arg Arg Phe His Glu Thr Ile Ala Thr Phe Tyr Asp Phe Asn 210 215 220

Gly Phe Thr Val Asp Arg Arg Ser Pro Lys Asp Asp Val Met Ser 225 230 235 240

Leu Leu Ala Asn Ser Leu Asp Gly Asn Tyr Ile Asp Asp Lys 245 250 255

Ile Asn Ala Tyr Tyr Val Ala Ile Ala Thr Ala Gly His Asp Thr Thr 260 265 270

Ser Ser Ser Ser Gly Gly Ala Ile Ile Gly Leu Ser Arg Asn Pro Glu 275 280 285

Gln Leu Ala Leu Ala Ser Asp Pro Ala Leu Ile Pro Arg Leu Val 290 295 300

Asp Glu Ala Val Arg Trp Thr Ala Pro Val Lys Ser Phe Met Arg Thr 305 310 315 320

Ala Leu Ala Asp Thr Glu Val Arg Gly Gln Asn Ile Arg Gly Asp 325 330 335

Arg Ile Met Leu Ser Pro Ser Ala Asn Arg Asp Glu Glu Val Phe 340 345 350

Ser Asn Pro Asp Glu Phe Asp Ile Thr Arg Phe Pro Asn Arg His Leu 355 360 365

Gly Phe Gly Trp Gly Ala His Met Leu Gly Gln His Leu Ala 370 375 380

Leu Glu Met Ile Phe Phe Glu Glu Leu Leu Pro Leu Ser 385 390 395 400

Val Glu Leu Ser Gly Pro Pro Arg Leu Val Ala Thr Asn Phe Val Gly 405 410 415

Gly Pro Asn Val Pro Ile Arg Phe Thr Ala 420 425

<210> SEQ ID NO 12 <211> LENGTH: 404 <212> TYPE: PRT <213> ORGANISM: Saccharopolyspora erythreae <220> FEATURE: <221> NAME/KEY: MISC_FEATURE <222> LOCATION: (1)...(404) <223> OTHER INFORMATION: Cytochrome P450 enzyme P450eryF

<4 OO SEQUENCE: 12 Met Thr Thr Val Pro Asp Lieu. Glu Ser Asp Ser Phe His Val Asp Trp 1. 5 1O 15 Tyr Arg Thr Tyr Ala Glu Lieu. Arg Glu Thir Ala Pro Val Thr Pro Val 2O 25 3O Arg Phe Lieu. Gly Glin Asp Ala Trp Lieu Val Thr Gly Tyr Asp Glu Ala 35 4 O 45 Lys Ala Ala Lieu. Ser Asp Lieu. Arg Lieu. Ser Ser Asp Pro Llys Llys Llys SO 55 6 O Tyr Pro Gly Val Glu Val Glu Phe Pro Ala Tyr Lieu. Gly Phe Pro Glu 65 70 7s 8O Asp Val Arg Asn Tyr Phe Ala Thr Asn Met Gly. Thir Ser Asp Pro Pro 85 90 95 Thr His Thr Arg Lieu. Arg Llys Lieu Val Ser Glin Glu Phe Thr Val Arg 1OO 105 11 O Arg Val Glu Ala Met Arg Pro Arg Val Glu Glin Ile Thr Ala Glu Lieu 115 12 O 125 Lieu. Asp Glu Val Gly Asp Ser Gly Val Val Asp Ile Val Asp Arg Phe 13 O 135 14 O Ala His Pro Lieu Pro Ile Llys Val Ile Cys Glu Lieu. Lieu. Gly Val Asp 145 150 155 160 Glu Lys Tyr Arg Gly Glu Phe Gly Arg Trp Ser Ser Glu Ile Lieu Val 1.65 17O 17s Met Asp Pro Glu Arg Ala Glu Glin Arg Gly Glin Ala Ala Arg Glu Val 18O 185 19 O Val Asn. Phe Ile Lieu. Asp Lieu Val Glu Arg Arg Arg Thr Glu Pro Gly 195 2OO 2O5 Asp Asp Lieu. Lieu. Ser Ala Lieu. Ile Arg Val Glin Asp Asp Asp Asp Gly 21 O 215 22O Arg Lieu. Ser Ala Asp Glu Lieu. Thir Ser Ile Ala Lieu Val Lieu. Lieu. Lieu 225 23 O 235 24 O Ala Gly Phe Glu Ala Ser Val Ser Lieu. Ile Gly Ile Gly Thr Tyr Lieu. 245 250 255 Lieu. Lieu. Thir His Pro Asp Gln Lieu Ala Lieu Val Arg Arg Asp Pro Ser 26 O 265 27 O Ala Lieu Pro Asn Ala Val Glu Glu Ile Lieu. Arg Tyr Ile Ala Pro Pro 27s 28O 285 Glu Thir Thr Thr Arg Phe Ala Ala Glu Glu Val Glu Ile Gly Gly Val 29 O 295 3 OO Ala Ile Pro Glin Tyr Ser Thr Val Lieu Val Ala Asn Gly Ala Ala Asn 3. OS 310 315 32O Arg Asp Pro Lys Glin Phe Pro Asp Pro His Arg Phe Asp Val Thr Arg 3.25 330 335 Asp Thr Arg Gly His Leu Ser Phe Gly Glin Gly Ile His Phe Cys Met 34 O 345 35. O Gly Arg Pro Lieu Ala Lys Lieu. Glu Gly Glu Val Ala Lieu. Arg Ala Lieu. 355 360 365 Phe Gly Arg Phe Pro Ala Lieu. Ser Lieu. Gly Ile Asp Ala Asp Asp Wall US 2009/0061471 A1 Mar. 5, 2009 58


370 375 380

Val Trp Arg Arg Ser Leu Leu Leu Arg Gly Ile Asp His Leu Pro Val 385 390 395 400 Arg Leu Asp Gly

<210> SEQ ID NO 13 <211> LENGTH: 516 <212> TYPE: PRT <213> ORGANISM: Homo sapiens <220> FEATURE: <221> NAME/KEY: MISC_FEATURE <222> LOCATION: (1)...(516) <223> OTHER INFORMATION: Cytochrome P450 enzyme CYP1A2 <400> SEQUENCE: 13

Met Ala Leu Ser Gln Ser Val Pro Phe Ser Ala Thr Glu Leu Leu Leu 1 5 10 15

Ala Ser Ala Ile Phe Cys Leu Val Phe Trp Val Leu Lys Gly Leu Arg 20 25 30

Pro Arg Val Pro Lys Gly Leu Lys Ser Pro Pro Glu Pro Trp Gly Trp 35 40 45

Pro Leu Leu Gly His Val Leu Thr Leu Gly Lys Asn Pro His Leu Ala 50 55 60

Leu Ser Arg Met Ser Gln Arg Tyr Gly Asp Val Leu Gln Ile Arg Ile 65 70 75 80

Gly Ser Thr Pro Val Leu Val Leu Ser Arg Leu Asp Thr Ile Arg Gln 85 90 95

Ala Leu Val Arg Gln Gly Asp Asp Phe Lys Gly Arg Pro Asp Leu Tyr 100 105 110

Thr Ser Thr Leu Ile Thr Asp Gly Gln Ser Leu Thr Phe Ser Thr Asp 115 120 125

Ser Gly Pro Val Trp Ala Ala Arg Arg Arg Leu Ala Gln Asn Ala Leu 130 135 140

Asn Thr Phe Ser Ile Ala Ser Asp Pro Ala Ser Ser Ser Ser Cys Tyr 145 150 155 160

Leu Glu Glu His Val Ser Lys Glu Ala Lys Ala Leu Ile Ser Arg Leu 165 170 175

Gln Glu Leu Met Ala Gly Pro Gly His Phe Asp Pro Tyr Asn Gln Val 180 185 190

Val Val Ser Val Ala Asn Val Ile Gly Ala Met Cys Phe Gly Gln His 195 200 205

Phe Pro Glu Ser Ser Asp Glu Met Leu Ser Leu Val Lys Asn Thr His 210 215 220

Glu Phe Val Glu Thr Ala Ser Ser Gly Asn Pro Leu Asp Phe Phe Pro 225 230 235 240

Ile Leu Arg Tyr Leu Pro Asn Pro Ala Leu Gln Arg Phe Lys Ala Phe 245 250 255

Asn Gln Arg Phe Leu Trp Phe Leu Gln Lys Thr Val Gln Glu His Tyr 260 265 270

Gln Asp Phe Asp Lys Asn Ser Val Arg Asp Ile Thr Gly Ala Leu Phe 275 280 285

His Ser Lys Lys Gly Pro Arg Ala Ser Gly Asn Leu Ile Pro Gln 290 295 300

- Continued Glu Lys Ile Val Asn Lieu Val Asn Asp Ile Phe Gly Ala Gly Phe Asp 3. OS 310 315 32O Thr Val Thir Thr Ala Ile Ser Trp Ser Leu Met Tyr Lieu Val Thir Lys 3.25 330 335 Pro Glu Ile Glin Arg Lys Ile Glin Lys Glu Lieu. Asp Thr Val Ile Gly 34 O 345 35. O Arg Glu Arg Arg Pro Arg Lieu. Ser Asp Arg Pro Gln Lieu Pro Tyr Lieu 355 360 365 Glu Ala Phe Ile Lieu. Glu Thir Phe Arg His Ser Ser Phe Leu Pro Phe 37 O 375 38O Thir Ile Pro His Ser Thr Thr Arg Asp Thir Thr Lieu. Asn Gly Phe Tyr 385 390 395 4 OO Ile Pro Llys Lys Cys Cys Val Phe Val Asn Gln Trp Glin Val Asn His 4 OS 41O 415 Asp Pro Glu Lieu. Trp Glu Asp Pro Ser Glu Phe Arg Pro Glu Arg Phe 42O 425 43 O Lieu. Thir Ala Asp Gly Thr Ala Ile Asn Llys Pro Lieu. Ser Glu Lys Met 435 44 O 445 Met Lieu. Phe Gly Met Gly Lys Arg Arg Cys Ile Gly Glu Val Lieu Ala 450 45.5 460 Llys Trp Glu Ile Phe Lieu. Phe Lieu Ala Ile Lieu. Lieu. Glin Glin Lieu. Glu 465 470 47s 48O Phe Ser Val Pro Pro Gly Val Llys Val Asp Lieu. Thr Pro Ile Tyr Gly 485 490 495 Lieu. Thir Met Lys His Ala Arg Cys Glu. His Val Glin Ala Arg Lieu. Arg SOO 505 51O

Phe Ser Ile Asn 515

<210 SEQ ID NO 14 <211 LENGTH: 489 &212> TYPE: PRT <213> ORGANISM: homo sapiens &220s FEATURE: <221 NAME/KEY: MISC FEATURE <222> LOCATION: (1) ... (489) <223> OTHER INFORMATION: Cytochrome P450 enzyme CYP2C8 <4 OO SEQUENCE: 14 Met Glu Pro Phe Val Val Lieu Val Lieu. Cys Lieu Ser Phe Met Leu Lieu. 1. 5 1O 15 Phe Ser Lieu. Trp Arg Glin Ser Cys Arg Arg Arg Llys Lieu Pro Pro Gly 2O 25 3O Pro Thr Pro Leu Pro Ile Ile Gly Asn Met Leu Glin Ile Asp Val Lys 35 4 O 45 Asp Ile Cys Llys Ser Phe Thr Asn Phe Ser Llys Val Tyr Gly Pro Val SO 55 6 O Phe Thr Val Tyr Phe Gly Asn Pro Ile Val Val Phe His Gly Tyr Glu 65 70 7s 8O Ala Wall Lys Glu Ala Lieu. Ile Asp Asn Gly Glu Glu Phe Ser Gly Arg 85 90 95 Gly Asn. Ser Pro Ile Ser Glin Arg Ile Thir Lys Gly Lieu. Gly Ile Ile 1OO 105 11 O Ser Ser Asn Gly Lys Arg Trp Llys Glu Ile Arg Arg Phe Ser Lieu. Thir US 2009/0061471 A1 Mar. 5, 2009 60

- Continued

115 12 O 125 Thir Lieu. Arg Asn. Phe Gly Met Gly Lys Arg Ser Ile Glu Asp Arg Val 13 O 135 14 O Glin Glu Glu Ala His Cys Lieu Val Glu Glu Lieu. Arg Llys Thir Lys Ala 145 150 155 160 Ser Pro Cys Asp Pro Thr Phe Ile Leu Gly Cys Ala Pro Cys Asn Val 1.65 17O 17s Ile Cys Ser Val Val Phe Gln Lys Arg Phe Asp Tyr Lys Asp Glin Asn 18O 185 19 O Phe Lieu. Thir Lieu Met Lys Arg Phe Asn. Glu Asn. Phe Arg Ile Lieu. Asn 195 2OO 2O5 Ser Pro Trp Ile Glin Val Cys Asn Asn Phe Pro Leu Lleu. Ile Asp Cys 21 O 215 22O Phe Pro Gly. Thir His Asn Llys Val Lieu Lys Asn Val Ala Lieu. Thir Arg 225 23 O 235 24 O Ser Tyr Ile Arg Glu Lys Wall Lys Glu. His Glin Ala Ser Lieu. Asp Wall 245 250 255 Asn Asn Pro Arg Asp Phe Ile Asp Cys Phe Lieu. Ile Llys Met Glu Glin 26 O 265 27 O Glu Lys Asp Asin Glin Llys Ser Glu Phe Asn. Ile Glu Asn Lieu Val Gly 27s 28O 285 Thr Val Ala Asp Leu Phe Val Ala Gly Thr Glu Thir Thr Ser Thr Thr 29 O 295 3 OO Lieu. Arg Tyr Gly Lieu Lleu Lleu Lieu. Lieu Lys His Pro Glu Val Thir Ala 3. OS 310 315 32O Llys Val Glin Glu Glu Ile Asp His Val Ile Gly Arg His Arg Ser Pro 3.25 330 335 Cys Met Glin Asp Arg Ser His Met Pro Tyr Thr Asp Ala Val Val His 34 O 345 35. O Glu Ile Glin Arg Tyr Ser Asp Leu Val Pro Thr Gly Val Pro His Ala 355 360 365 Val Thir Thr Asp Thr Lys Phe Arg Asn Tyr Lieu. Ile Pro Lys Gly Thr 37 O 375 38O Thir Ile Met Ala Lieu. Lieu. Thir Ser Val Lieu. His Asp Asp Llys Glu Phe 385 390 395 4 OO Pro Asn Pro Asn. Ile Phe Asp Pro Gly His Phe Lieu. Asp Lys Asn Gly 4 OS 41O 415 Asn Phe Llys Llys Ser Asp Tyr Phe Met Pro Phe Ser Ala Gly Lys Arg 42O 425 43 O Ile Cys Ala Gly Glu Gly Lieu Ala Arg Met Glu Lieu. Phe Lieu. Phe Lieu. 435 44 O 445 Thir Thir Ile Lieu. Glin Asn. Phe Asn Lieu Lys Ser Val Asp Asp Lieu Lys 450 45.5 460 Asn Lieu. Asn. Thir Thr Ala Val Thr Lys Gly Ile Val Ser Leu Pro Pro 465 470 47s 48O Ser Tyr Glin Ile Cys Phe Ile Pro Val 485

<210> SEQ ID NO 15 <211> LENGTH: 490 <212> TYPE: PRT <213> ORGANISM: Homo sapiens <220> FEATURE: <221> NAME/KEY: MISC_FEATURE <222> LOCATION: (1)...(490) <223> OTHER INFORMATION: Cytochrome P450 enzyme CYP2C9

<4 OO SEQUENCE: 15 Met Asp Ser Lieu Val Val Lieu Val Lieu. Cys Lieu. Ser Cys Lieu. Lieu. Lieu. 1. 5 1O 15 Lieu. Ser Lieu. Trp Arg Glin Ser Ser Gly Arg Gly Llys Lieu Pro Pro Gly 2O 25 3O Pro Thr Pro Leu Pro Val Ile Gly Asn Ile Leu Glin Ile Gly Ile Llys 35 4 O 45 Asp Ile Ser Lys Ser Lieu. Thir Asn Lieu. Ser Llys Val Tyr Gly Pro Val SO 55 6 O Phe Thr Lieu. Tyr Phe Gly Lieu Lys Pro Ile Val Val Lieu. His Gly Tyr 65 70 7s 8O Glu Ala Wall Lys Glu Ala Lieu. Ile Asp Lieu. Gly Glu Glu Phe Ser Gly 85 90 95 Arg Gly Ile Phe Pro Lieu Ala Glu Arg Ala Asn Arg Gly Phe Gly Ile 1OO 105 11 O Val Phe Ser Asn Gly Llys Llys Trp Llys Glu Ile Arg Arg Phe Ser Lieu 115 12 O 125 Met Thr Lieu. Arg Asn. Phe Gly Met Gly Lys Arg Ser Ile Glu Asp Arg 13 O 135 14 O Val Glin Glu Glu Ala Arg Cys Lieu Val Glu Glu Lieu. Arg Llys Thir Lys 145 150 155 160 Ala Ser Pro Cys Asp Pro Thr Phe Ile Leu Gly Cys Ala Pro Cys Asn 1.65 17O 17s Val Ile Cys Ser Ile Ile Phe His Lys Arg Phe Asp Tyr Lys Asp Glin 18O 185 19 O Glin Phe Lieu. Asn Lieu Met Glu Lys Lieu. Asn. Glu Asn. Ile Lys Ile Lieu. 195 2OO 2O5 Ser Ser Pro Trp Ile Glin Ile Cys Asn Asin Phe Ser Pro Ile Ile Asp 21 O 215 22O Tyr Phe Pro Gly Thr His Asn Llys Lieu. Leu Lys Asn Val Ala Phe Met 225 23 O 235 24 O Llys Ser Tyr Ile Lieu. Glu Lys Wall Lys Glu. His Glin Glu Ser Met Asp 245 250 255 Met Asn Asn Pro Glin Asp Phe Ile Asp Cys Phe Leu Met Lys Met Glu 26 O 265 27 O Lys Glu Lys His Asn Gln Pro Ser Glu Phe Thr Ile Glu Ser Lieu. Glu 27s 28O 285 Asn Thr Ala Val Asp Leu Phe Gly Ala Gly Thr Glu Thir Thr Ser Thr 29 O 295 3 OO Thir Lieu. Arg Tyr Ala Lieu. Lieu. Lieu. Lieu. Lieu Lys His Pro Glu Val Thr 3. OS 310 315 32O Ala Lys Val Glin Glu Glu Ile Glu Arg Val Ile Gly Arg Asn Arg Ser 3.25 330 335 Pro Cys Met Glin Asp Arg Ser His Met Pro Tyr Thr Asp Ala Val Val 34 O 345 35. O His Glu Val Glin Arg Tyr Ile Asp Lieu. Leu Pro Thr Ser Leu Pro His 355 360 365 US 2009/0061471 A1 Mar. 5, 2009 62

- Continued Ala Val Thir Cys Asp Ile Llys Phe Arg Asn Tyr Lieu. Ile Pro Lys Gly 37 O 375 38O Thir Thir Ile Lieu. Ile Ser Lieu. Thir Ser Val Lieu. His Asp Asn Lys Glu 385 390 395 4 OO Phe Pro Asn Pro Glu Met Phe Asp Pro His His Phe Lieu. Asp Glu Gly 4 OS 41O 415 Gly Asin Phe Llys Lys Ser Lys Tyr Phe Met Pro Phe Ser Ala Gly Lys 42O 425 43 O Arg Ile Cys Val Gly Glu Ala Lieu Ala Gly Met Glu Lieu. Phe Lieu. Phe 435 44 O 445 Lieu. Thir Ser Ile Lieu. Glin Asn. Phe Asn Lieu Lys Ser Lieu Val Asp Pro 450 45.5 460 Lys Asn Lieu. Asp Thr Thr Pro Val Val Asin Gly Phe Ala Ser Val Pro 465 470 47s 48O Pro Phe Tyr Glin Lieu. Cys Phe Ile Pro Val 485 490

<210 SEQ ID NO 16 <211 LENGTH: 490 &212> TYPE: PRT <213> ORGANISM: homo sapiens &220s FEATURE: <221 NAME/KEY: MISC FEATURE <222> LOCATION: (1) ... (490) <223> OTHER INFORMATION: Cytochrome P450 enzyme CYP2C19 <4 OO SEQUENCE: 16 Met Asp Pro Phe Val Val Lieu Val Lieu. Cys Lieu. Ser Cys Lieu. Lieu. Lieu. 1. 5 1O 15 Lieu. Ser Ile Trp Arg Glin Ser Ser Gly Arg Gly Llys Lieu Pro Pro Gly 2O 25 3O Pro Thr Pro Leu Pro Val Ile Gly Asn Ile Leu Glin Ile Asp Ile Llys 35 4 O 45 Asp Wal Ser Lys Ser Lieu. Thir Asn Lieu. Ser Lys Ile Tyr Gly Pro Val SO 55 6 O Phe Thr Lieu. Tyr Phe Gly Lieu. Glu Arg Met Val Val Lieu. His Gly Tyr 65 70 7s 8O Glu Val Val Lys Glu Ala Lieu. Ile Asp Lieu. Gly Glu Glu Phe Ser Gly 85 90 95 Arg Gly His Phe Pro Lieu Ala Glu Arg Ala Asn Arg Gly Phe Gly Ile 1OO 105 11 O Val Phe Ser Asn Gly Lys Arg Trp Llys Glu Ile Arg Arg Phe Ser Lieu 115 12 O 125 Met Thr Lieu. Arg Asn. Phe Gly Met Gly Lys Arg Ser Ile Glu Asp Arg 13 O 135 14 O Val Glin Glu Glu Ala Arg Cys Lieu Val Glu Glu Lieu. Arg Llys Thir Lys 145 150 155 160 Ala Ser Pro Cys Asp Pro Thr Phe Ile Leu Gly Cys Ala Pro Cys Asn 1.65 17O 17s Val Ile Cys Ser Ile Ile Phe Glin Lys Arg Phe Asp Tyr Lys Asp Glin 18O 185 19 O Glin Phe Lieu. Asn Lieu Met Glu Lys Lieu. Asn. Glu Asn. Ile Arg Ile Val 195 2OO 2O5 Ser Thr Pro Trp Ile Glin Ile Cys Asn Asin Phe Pro Thr Ile Ile Asp US 2009/0061471 A1 Mar. 5, 2009 63


21 O 215 22O Tyr Phe Pro Gly Thr His Asn Llys Lieu. Leu Lys Asn Lieu Ala Phe Met 225 23 O 235 24 O Glu Ser Asp Ile Lieu. Glu Lys Val Lys Glu. His Glin Glu Ser Met Asp 245 250 255 Ile Asin Asn Pro Arg Asp Phe Ile Asp Cys Phe Lieu. Ile Llys Met Glu 26 O 265 27 O Lys Glu Lys Glin Asn Glin Glin Ser Glu Phe Thir Ile Glu Asn Lieu Val 27s 28O 285 Ile Thr Ala Ala Asp Lieu. Leu Gly Ala Gly Thr Glu Thir Thr Ser Thr 29 O 295 3 OO Thir Lieu. Arg Tyr Ala Lieu. Lieu. Lieu. Lieu. Lieu Lys His Pro Glu Val Thr 3. OS 310 315 32O Ala Lys Val Glin Glu Glu Ile Glu Arg Val Ile Gly Arg Asn Arg Ser 3.25 330 335 Pro Cys Met Glin Asp Arg Gly His Met Pro Tyr Thr Asp Ala Val Val 34 O 345 35. O His Glu Val Glin Arg Tyr Ile Asp Lieu. Ile Pro Thr Ser Leu Pro His 355 360 365 Ala Val Thir Cys Asp Wall Lys Phe Arg Asn Tyr Lieu. Ile Pro Lys Gly 37 O 375 38O Thir Thr Ile Lieu. Thir Ser Lieu. Thir Ser Val Lieu. His Asp Asn Lys Glu 385 390 395 4 OO Phe Pro Asn Pro Glu Met Phe Asp Pro Arg His Phe Lieu. Asp Glu Gly 4 OS 41O 415 Gly Asin Phe Llys Lys Ser Asn Tyr Phe Met Pro Phe Ser Ala Gly Lys 42O 425 43 O Arg Ile Cys Val Gly Glu Gly Lieu Ala Arg Met Glu Lieu. Phe Lieu. Phe 435 44 O 445 Lieu. Thir Phe Ile Lieu. Glin Asn. Phe Asn Lieu Lys Ser Lieu. Ile Asp Pro 450 45.5 460 Lys Asp Lieu. Asp Thr Thr Pro Val Val Asin Gly Phe Ala Ser Val Pro 465 470 47s 48O Pro Phe Tyr Glin Lieu. Cys Phe Ile Pro Val 485 490

<210> SEQ ID NO 17 <211> LENGTH: 446 <212> TYPE: PRT <213> ORGANISM: Homo sapiens <220> FEATURE: <221> NAME/KEY: MISC_FEATURE <222> LOCATION: (1)...(446) <223> OTHER INFORMATION: Cytochrome P450 enzyme CYP2D6

<4 OO SEQUENCE: 17 Met Gly Lieu. Glu Ala Lieu Val Pro Lieu Ala Val Ile Val Ala Ile Phe 1. 5 1O 15 Lieu. Lieu. Lieu Val Asp Lieu Met His Arg Arg Glin Arg Trp Ala Ala Arg 2O 25 3O Tyr Pro Pro Gly Pro Leu Pro Leu Pro Gly Lieu. Gly Asn Lieu. Lieu. His 35 4 O 45 Val Asp Phe Glin Asn Thr Pro Tyr Cys Phe Asp Gln Lieu. Arg Arg Arg SO 55 6 O US 2009/0061471 A1 Mar. 5, 2009 64


Phe Gly Asp Val Phe Ser Leu Gln Leu Ala Trp Thr Pro Val Val Val 65 70 75 80 Leu Asn Gly Leu Ala Ala Val Arg Glu Ala Leu Val Thr His Gly Glu 85 90 95 Asp Thr Ala Asp Arg Pro Pro Val Pro Ile Thr Gln Ile Leu Gly Phe 100 105 110 Gly Pro Arg Ser Gln Gly Arg Pro Phe Arg Pro Asn Gly Leu Leu Asp 115 120 125 Lys Ala Val Ser Asn Val Ile Ala Ser Leu Thr Cys Gly Arg Arg Phe 130 135 140 Glu Tyr Asp Asp Pro Arg Phe Leu Arg Leu Leu Asp Leu Ala Gln Glu 145 150 155 160 Gly Leu Lys Glu Glu Ser Gly Phe Leu Arg Glu Val Leu Asn Ala Val 165 170 175 Pro Val Leu Leu His Ile Pro Ala Leu Ala Gly Lys Val Leu Arg Phe 180 185 190 Gln Lys Ala Phe Leu Thr Gln Leu Asp Glu Leu Leu Thr Glu His Arg 195 200 205 Met Thr Trp Asp Pro Ala Gln Pro Pro Arg Asp Leu Thr Glu Ala Phe 210 215 220 Leu Ala Glu Met Glu Lys Ala Lys Gly Asn Pro Glu Ser Ser Phe Asn 225 230 235 240 Asp Glu Asn Leu Cys Ile Val Val Ala Asp Leu Phe Ser Ala Gly Met 245 250 255 Val Thr Thr Ser Thr Thr Leu Ala Trp Gly Leu Leu Leu Met Ile Leu 260 265 270 His Pro Asp Val Gln Arg Arg Val Gln Gln Glu Ile Asp Asp Val Ile 275 280 285 Gly Gln Val Arg Arg Pro Glu Met Gly Asp Gln Ala His Met Pro Tyr 290 295 300 Thr Thr Ala Val Ile His Glu Val Gln Arg Phe Gly Asp Ile Val Pro 305 310 315 320 Leu Gly Val Thr His Met Thr Ser Arg Asp Ile Glu Val Gln Gly Phe 325 330 335 Arg Ile Pro Lys Gly Thr Thr Leu Ile Thr Asn Leu Ser Ser Val Leu 340 345 350 Lys Asp Glu Ala Val Trp Glu Lys Pro Phe Arg Phe His Pro Glu His 355 360 365 Phe Leu Asp Ala Gln Gly His Phe Val Lys Pro Glu Ala Phe Leu Pro 370 375 380 Phe Ser Ala Gly Arg Arg Ala Cys Leu Gly Glu Pro Leu Ala Arg Met 385 390 395 400

Glu Leu Phe Leu Phe Phe Thr Ser Leu Leu Gln His Phe Ser Phe Ser 405 410 415 Val Pro Thr Gly Gln Pro Arg Pro Ser His His Gly Val Phe Ala Phe 420 425 430 Leu Val Thr Pro Ser Pro Tyr Glu Leu Cys Ala Val Pro Arg 435 440 445

<210> SEQ ID NO 18 <211> LENGTH: 493


&212> TYPE: PRT <213> ORGANISM: homo sapiens &220s FEATURE: <221 NAME/KEY: MISC FEATURE <222> LOCATION: (1) ... (493) <223> OTHER INFORMATION: Cytochrome P450 enzyme CYP2E1 <4 OO SEQUENCE: 18 Met Ser Ala Lieu. Gly Val Thr Val Ala Lieu. Lieu Val Trp Ala Ala Phe 1. 5 1O 15 Lieu. Lieu. Lieu Val Ser Met Trp Arg Glin Val His Ser Ser Trp Asn Lieu. 2O 25 3O Pro Pro Gly Pro Phe Pro Leu Pro Ile Ile Gly Asn Lieu Phe Gln Leu 35 4 O 45 Glu Lieu Lys Asn. Ile Pro Llys Ser Phe Thir Arg Lieu Ala Glin Arg Phe SO 55 6 O Gly Pro Val Phe Thr Lieu. Tyr Val Gly Ser Glin Arg Met Val Val Met 65 70 7s 8O His Gly Tyr Lys Ala Wall Lys Glu Ala Lieu. Lieu. Asp Tyr Lys Asp Glu 85 90 95 Phe Ser Gly Arg Gly Asp Lieu Pro Ala Phe His Ala His Arg Asp Arg 1OO 105 11 O Gly Ile Ile Phe Asn. Asn Gly Pro Thir Trp Lys Asp Ile Arg Arg Phe 115 12 O 125 Ser Lieu. Thir Thir Lieu. Arg Asn Tyr Gly Met Gly Lys Glin Gly Asn. Glu 13 O 135 14 O Ser Arg Ile Glin Arg Glu Ala His Phe Lieu. Lieu. Glu Ala Lieu. Arg Llys 145 150 155 160 Thr Glin Gly Glin Pro Phe Asp Pro Thr Phe Lieu. Ile Gly Cys Ala Pro 1.65 17O 17s Cys Asn Val Ile Ala Asp Ile Lieu. Phe Arg Llys His Phe Asp Tyr Asn 18O 185 19 O Asp Glu Lys Phe Lieu. Arg Lieu Met Tyr Lieu. Phe Asn. Glu Asn. Phe His 195 2OO 2O5 Lieu. Leu Ser Thr Pro Trp Leu Gln Leu Tyr Asn Asn Phe Pro Ser Phe 21 O 215 22O Lieu. His Tyr Lieu Pro Gly Ser His Arg Llys Val Ile Lys Asn Val Ala 225 23 O 235 24 O Glu Val Lys Glu Tyr Val Ser Glu Arg Val Lys Glu. His His Glin Ser 245 250 255 Lieu. Asp Pro Asn. Cys Pro Arg Asp Lieu. Thir Asp Cys Lieu. Lieu Val Glu 26 O 265 27 O Met Glu Lys Glu Lys His Ser Ala Glu Arg Lieu. Tyr Thr Met Asp Gly 27s 28O 285 Ile Thr Val Thr Val Ala Asp Leu Phe Phe Ala Gly Thr Glu. Thir Thr 29 O 295 3 OO Ser Thir Thr Lieu. Arg Tyr Gly Lieu. Lieu. Ile Leu Met Lys Tyr Pro Glu 3. OS 310 315 32O Ile Glu Glu Lys Lieu. His Glu Glu Ile Asp Arg Val Ile Gly Pro Ser 3.25 330 335 Arg Ile Pro Ala Ile Lys Asp Arg Glin Glu Met Pro Tyr Met Asp Ala 34 O 345 35. 
O Val Val His Glu Ile Glin Arg Phe Ile Thr Lieu Val Pro Ser Asn Lieu. US 2009/0061471 A1 Mar. 5, 2009 66


355 360 365 Pro His Glu Ala Thr Arg Asp Thr Ile Phe Arg Gly Tyr Leu Ile Pro 370 375 380 Lys Gly Thr Val Val Val Pro Thr Leu Asp Ser Val Leu Tyr Asp Asn 385 390 395 400 Gln Glu Phe Pro Asp Pro Glu Lys Phe Lys Pro Glu His Phe Leu Asn 405 410 415 Glu Asn Gly Lys Phe Lys Tyr Ser Asp Tyr Phe Lys Pro Phe Ser Thr 420 425 430 Gly Lys Arg Val Cys Ala Gly Glu Gly Leu Ala Arg Met Glu Leu Phe 435 440 445 Leu Leu Leu Cys Ala Ile Leu Gln His Phe Asn Leu Lys Pro Leu Val 450 455 460 Asp Pro Lys Asp Ile Asp Leu Ser Pro Ile His Ile Gly Phe Gly Cys 465 470 475 480 Ile Pro Pro Arg Tyr Lys Leu Cys Val Ile Pro Arg Ser 485 490

<210> SEQ ID NO 19 <211> LENGTH: 491 <212> TYPE: PRT <213> ORGANISM: homo sapiens <220> FEATURE: <221> NAME/KEY: MISC_FEATURE <222> LOCATION: (1) ... (491) <223> OTHER INFORMATION: Cytochrome P450 enzyme CYP2F1

<400> SEQUENCE: 19 Met Asp Ser Ile Ser Thr Ala Ile Leu Leu Leu Leu Leu Ala Leu Val 1 5 10 15 Cys Leu Leu Leu Thr Leu Ser Ser Arg Asp Lys Gly Lys Leu Pro Pro 20 25 30 Gly Pro Arg Pro Leu Ser Ile Leu Gly Asn Leu Leu Leu Leu Cys Ser 35 40 45 Gln Asp Met Leu Thr Ser Leu Thr Lys Leu Ser Lys Glu Tyr Gly Ser 50 55 60 Met Tyr Thr Val His Leu Gly Pro Arg Arg Val Val Val Leu Ser Gly 65 70 75 80 Tyr Gln Ala Val Lys Glu Ala Leu Val Asp Gln Gly Glu Glu Phe Ser 85 90 95 Gly Arg Gly Asp Tyr Pro Ala Phe Phe Asn Phe Thr Lys Gly Asn Gly 100 105 110 Ile Ala Phe Ser Ser Gly Asp Arg Trp Lys Val Leu Arg Gln Phe Ser 115 120 125 Ile Gln Ile Leu Arg Asn Phe Gly Met Gly Lys Arg Ser Ile Glu Glu 130 135 140 Arg Ile Leu Glu Glu Gly Ser Phe Leu Leu Ala Glu Leu Arg Lys Thr 145 150 155 160 Glu Gly Glu Pro Phe Asp Pro Thr Phe Val Leu Ser Arg Ser Val Ser 165 170 175 Asn Ile Ile Cys Ser Val Leu Phe Gly Ser Arg Phe Asp Tyr Asp Asp 180 185 190 Glu Arg Leu Leu Thr Ile Ile Arg Leu Ile Asn Asp Asn Phe Gln Ile 195 200 205


Met Ser Ser Pro Trp Gly Glu Leu Tyr Asp Ile Phe Pro Ser Leu Leu 210 215 220 Asp Trp Val Pro Gly Pro His Gln Arg Ile Phe Gln Asn Phe Lys Cys 225 230 235 240 Leu Arg Asp Leu Ile Ala His Ser Val His Asp His Gln Ala Ser Leu 245 250 255 Asp Pro Arg Ser Pro Arg Asp Phe Ile Gln Cys Phe Leu Thr Lys Met 260 265 270 Ala Glu Glu Lys Glu Asp Pro Leu Ser His Phe His Met Asp Thr Leu 275 280 285 Leu Met Thr Thr His Asn Leu Leu Phe Gly Gly Thr Lys Thr Val Ser 290 295 300 Thr Thr Leu His His Ala Phe Leu Ala Leu Met Lys Tyr Pro Lys Val 305 310 315 320 Gln Ala Arg Val Gln Glu Glu Ile Asp Leu Val Val Gly Arg Ala Arg 325 330 335 Leu Pro Ala Leu Lys Asp Arg Ala Ala Met Pro Tyr Thr Asp Ala Val 340 345 350 Ile His Glu Val Gln Arg Phe Ala Asp Ile Ile Pro Met Asn Leu Pro 355 360 365 His Arg Val Thr Arg Asp Thr Ala Phe Arg Gly Phe Leu Ile Pro Lys 370 375 380 Gly Thr Asp Val Ile Thr Leu Leu Asn Thr Val His Tyr Asp Pro Ser 385 390 395 400 Gln Phe Leu Thr Pro Gln Glu Phe Asn Pro Glu His Phe Leu Asp Ala 405 410 415 Asn Gln Ser Phe Lys Lys Ser Pro Ala Phe Met Pro Phe Ser Ala Gly 420 425 430 Arg Arg Leu Cys Leu Gly Glu Ser Leu Ala Arg Met Glu Leu Phe Leu 435 440 445 Tyr Leu Thr Ala Ile Leu Gln Ser Phe Ser Leu Gln Pro Leu Gly Ala 450 455 460 Pro Glu Asp Ile Asp Leu Thr Pro Leu Ser Ser Gly Leu Gly Asn Leu 465 470 475 480 Pro Arg Pro Phe Gln Leu Cys Leu Arg Pro Arg 485 490

<210> SEQ ID NO 20 <211> LENGTH: 503 <212> TYPE: PRT <213> ORGANISM: homo sapiens <220> FEATURE: <221> NAME/KEY: MISC_FEATURE <222> LOCATION: (1) ... (503) <223> OTHER INFORMATION: Cytochrome P450 enzyme CYP3A4

<400> SEQUENCE: 20 Met Ala Leu Ile Pro Asp Leu Ala Met Glu Thr Trp Leu Leu Leu Ala 1 5 10 15 Val Ser Leu Val Leu Leu Tyr Leu Tyr Gly Thr His Ser His Gly Leu 20 25 30 Phe Lys Lys Leu Gly Ile Pro Gly Pro Thr Pro Leu Pro Phe Leu Gly 35 40 45

- Continued Asn Ile Leu Ser Tyr His Lys Gly Phe Cys Met Phe Asp Met Glu. Cys SO 55 6 O His Llys Llys Tyr Gly Lys Val Trp Gly Phe Tyr Asp Gly Glin Glin Pro 65 70 7s 8O Val Lieu Ala Ile Thr Asp Pro Asp Met Ile Llys Thr Val Lieu Val Lys 85 90 95 Glu Cys Tyr Ser Val Phe Thr Asn Arg Arg Pro Phe Gly Pro Val Gly 1OO 105 11 O Phe Met Lys Ser Ala Ile Ser Ile Ala Glu Asp Glu Glu Trp Lys Arg 115 12 O 125 Lieu. Arg Ser Lieu Lleu Ser Pro Thr Phe Thir Ser Gly Llys Lieu Lys Glu 13 O 135 14 O Met Val Pro Ile Ile Ala Glin Tyr Gly Asp Val Lieu Val Arg Asn Lieu. 145 150 155 160 Arg Arg Glu Ala Glu Thr Gly Llys Pro Val Thir Lieu Lys Asp Val Phe 1.65 17O 17s Gly Ala Tyr Ser Met Asp Val Ile Thr Ser Thr Ser Phe Gly Val Asn 18O 185 19 O Ile Asp Ser Lieu. Asn. Asn Pro Glin Asp Pro Phe Val Glu Asn. Thir Lys 195 2OO 2O5 Llys Lieu. Lieu. Arg Phe Asp Phe Lieu. Asp Pro Phe Phe Lieu. Ser Ile Thr 21 O 215 22O Val Phe Pro Phe Lieu. Ile Pro Ile Leu Glu Val Lieu. Asn Ile Cys Val 225 23 O 235 24 O Phe Pro Arg Glu Val Thr Asn. Phe Lieu. Arg Llys Ser Val Lys Arg Met 245 250 255 Lys Glu Ser Arg Lieu. Glu Asp Thr Glin Llys His Arg Val Asp Phe Lieu. 26 O 265 27 O Glin Lieu Met Ile Asp Ser Glin Asn. Ser Lys Glu Thr Glu Ser His Lys 27s 28O 285 Ala Lieu. Ser Asp Lieu. Glu Lieu Val Ala Glin Ser Ile Ile Phe Ile Phe 29 O 295 3 OO Ala Gly Tyr Glu Thir Thr Ser Ser Val Lieu Ser Phe Ile Met Tyr Glu 3. OS 310 315 32O Lieu Ala Thr His Pro Asp Val Glin Glin Llys Lieu. Glin Glu Glu Ile Asp 3.25 330 335 Ala Val Lieu Pro Asn Lys Ala Pro Pro Thr Tyr Asp Thr Val Lieu. Glin 34 O 345 35. O Met Glu Tyr Lieu. Asp Met Val Val Asn Glu Thir Lieu. Arg Lieu. Phe Pro 355 360 365 Ile Ala Met Arg Lieu. Glu Arg Val Cys Llys Lys Asp Val Glu Ile Asn 37 O 375 38O Gly Met Phe Ile Pro Lys Gly Val Val Val Met Ile Pro Ser Tyr Ala 385 390 395 4 OO Lieu. His Arg Asp Pro Llys Tyr Trp Thr Glu Pro Glu Lys Phe Leu Pro 4 OS 41O 415 Glu Arg Phe Ser Llys Lys Asn Lys Asp Asn. 
Ile Asp Pro Tyr Ile Tyr 42O 425 43 O Thr Pro Phe Gly Ser Gly Pro Arg Asn Cys Ile Gly Met Arg Phe Ala 435 44 O 445 Lieu Met Asn Met Lys Lieu Ala Lieu. Ile Arg Val Lieu. Glin Asn. Phe Ser US 2009/0061471 A1 Mar. 5, 2009 69


450 455 460 Phe Lys Pro Cys Lys Glu Thr Gln Ile Pro Leu Lys Leu Ser Leu Gly 465 470 475 480 Gly Leu Leu Gln Pro Glu Lys Pro Val Val Leu Lys Val Glu Ser Arg 485 490 495 Asp Gly Thr Val Ser Gly Ala 500

<210 SEQ ID NO 21 <211 LENGTH: 1048 &212> TYPE: PRT <213> ORGANISM: Artificial sequence &220s FEATURE: <223> OTHER INFORMATION: Cytochrome P450 variant CYP102A1var1 <4 OO SEQUENCE: 21 Thir Ile Lys Glu Met Pro Gln Pro Llys Thr Phe Gly Glu Lieu Lys Asn 1. 5 1O 15 Lieu Pro Lieu. Lieu. Asn. Thir Asp Llys Pro Val Glin Ala Lieu Met Lys Ile 2O 25 3O Ala Asp Glu Lieu. Gly Glu Ile Phe Llys Phe Glu Ala Pro Gly Arg Val 35 4 O 45 Thir Arg Tyr Lieu. Ser Ser Glin Arg Lieu. Ile Lys Glu Ala Cys Asp Glu SO 55 6 O Ser Arg Phe Asp Lys Asn Lieu. Ser Glin Ala Lieu Lys Phe Ala Arg Asp 65 70 7s 8O Phe Ala Gly Asp Gly Lieu Phe Thr Ser Trp Thr His Glu Lys Asn Trp 85 90 95 Llys Lys Ala His Asn. Ile Lieu. Lieu Pro Ser Phe Ser Glin Glin Ala Met 1OO 105 11 O Lys Gly Tyr His Ala Met Met Val Asp Ile Ala Val Glin Lieu Val Glin 115 12 O 125 Llys Trp Glu Arg Lieu. Asn Ala Asp Glu Tyr Ile Glu Val Pro Glu Asp 13 O 135 14 O Met Thr Arg Lieu. Thir Lieu. Asp Thir Ile Gly Lieu. Cys Gly Phe Asn Tyr 145 150 155 160 Arg Phe Asin Ser Phe Tyr Arg Asp Gln Pro His Pro Phe Ile Ile Ser 1.65 17O 17s Met Ile Arg Ala Lieu. Asp Glu Val Met Asn Llys Lieu. Glin Arg Ala Asn 18O 185 19 O Pro Asp Asp Pro Ala Tyr Asp Glu Asn Lys Arg Glin Phe Glin Glu Asp 195 2OO 2O5 Ile Llys Val Met Asn Asp Lieu Val Asp Llys Ile Ile Ala Asp Arg Llys 21 O 215 22O Ala Ser Gly Glu Glin Ser Asp Asp Lieu. Lieu. Thr Gln Met Lieu. Asn Gly 225 23 O 235 24 O Lys Asp Pro Glu Thr Gly Glu Pro Lieu. Asp Asp Gly Asn. Ile Ser Tyr 245 250 255 Glin Ile Ile Thr Phe Lieu. Ile Ala Gly His Glu Thir Thr Ser Gly Lieu. 26 O 265 27 O Lieu. Ser Phe Ala Lieu. Tyr Phe Lieu Val Lys Asn. Pro His Val Lieu. Glin 27s 28O 285 Llys Val Ala Glu Glu Ala Thr Arg Val Lieu Val Asp Pro Val Pro Ser US 2009/0061471 A1 Mar. 5, 2009 70


29 O 295 3 OO Tyr Lys Glin Val Lys Glin Lieu Lys Tyr Val Gly Met Val Lieu. Asn. Glu 3. OS 310 315 32O Ala Lieu. Arg Lieu. Trp Pro Thr Ala Pro Ala Phe Ser Lieu. Tyr Ala Lys 3.25 330 335 Glu Asp Thr Val Lieu. Gly Gly Glu Tyr Pro Lieu. Glu Lys Gly Asp Glu 34 O 345 35. O Val Met Val Lieu. Ile Pro Glin Lieu. His Arg Asp Llys Thir Ile Trp Gly 355 360 365 Asp Asp Val Glu Glu Phe Arg Pro Glu Arg Phe Glu Asn. Pro Ser Ala 37 O 375 38O Ile Pro Gln His Ala Phe Llys Pro Phe Gly Asn Gly Glin Arg Ala Cys 385 390 395 4 OO Ile Gly Glin Glin Phe Ala Lieu. His Glu Ala Thr Lieu Val Lieu. Gly Met 4 OS 41O 415 Met Lieu Lys His Phe Asp Phe Glu Asp His Thr Asn Tyr Glu Lieu. Asp 42O 425 43 O Ile Lys Glu Thir Lieu. Thir Lieu Lys Pro Glu Gly Phe Val Val Lys Ala 435 44 O 445 Lys Ser Lys Lys Ile Pro Leu Gly Gly Ile Pro Ser Pro Ser Thr Glu 450 45.5 460 Gln Ser Ala Lys Llys Val Arg Llys Lys Ala Glu Asn Ala His ASn Thr 465 470 47s 48O Pro Leu Lieu Val Lieu. Tyr Gly Ser Asn Met Gly Thr Ala Glu Gly Thr 485 490 495 Ala Arg Asp Lieu Ala Asp Ile Ala Met Ser Lys Gly Phe Ala Pro Glin SOO 505 51O Val Ala Thir Lieu. Asp Ser His Ala Gly Asn Lieu Pro Arg Glu Gly Ala 515 52O 525 Val Lieu. Ile Val Thr Ala Ser Tyr Asn Gly His Pro Pro Asp Asn Ala 53 O 535 54 O Lys Glin Phe Val Asp Trp Lieu. Asp Glin Ala Ser Ala Asp Glu Val Lys 5.45 550 555 560 Gly Val Arg Tyr Ser Val Phe Gly Cys Gly Asp Lys Asn Trp Ala Thr 565 st O sts Thir Tyr Glin Llys Val Pro Ala Phe Ile Asp Glu Thir Lieu Ala Ala Lys 58O 585 59 O Gly Ala Glu Asn. Ile Ala Asp Arg Gly Glu Ala Asp Ala Ser Asp Asp 595 6OO 605 Phe Glu Gly Thr Tyr Glu Glu Trp Arg Glu. His Met Trp Ser Asp Val 610 615 62O Ala Ala Tyr Phe Asn Lieu. Asp Ile Glu Asn. Ser Glu Asp Asn Llys Ser 625 630 635 64 O Thir Lieu. Ser Lieu. Glin Phe Val Asp Ser Ala Ala Asp Met Pro Lieu Ala 645 650 655 Llys Met His Gly Ala Phe Ser Thr Asn Val Val Ala Ser Lys Glu Lieu. 660 665 67 O Glin Glin Pro Gly Ser Ala Arg Ser Thr Arg His Lieu. Glu Ile Glu Lieu. 
675 68O 685 Pro Lys Glu Ala Ser Tyr Glin Glu Gly Asp His Lieu. Gly Val Ile Pro 69 O. 695 7 OO US 2009/0061471 A1 Mar. 5, 2009 71


Arg Asn Tyr Glu Gly Ile Val Asn Arg Val Thr Ala Arg Phe Gly Lieu 7 Os 71O 71s 72O Asp Ala Ser Glin Glin Ile Arg Lieu. Glu Ala Glu Glu Glu Lys Lieu Ala 72 73 O 73 His Lieu Pro Lieu Ala Lys Thr Val Ser Val Glu Glu Lieu. Lieu. Glin Tyr 740 74. 7 O Val Glu Lieu. Glin Asp Pro Val Thr Arg Thr Glin Lieu. Arg Ala Met Ala 7ss 760 765 Ala Lys Thr Val Cys Pro Pro His Llys Val Glu Lieu. Glu Ala Lieu. Lieu 770 775 78O Glu Lys Glin Ala Tyr Lys Glu Glin Val Lieu Ala Lys Arg Lieu. Thir Met 78s 79 O 79. 8OO Lieu. Glu Lieu. Lieu. Glu Lys Tyr Pro Ala Cys Glu Met Llys Phe Ser Glu 805 810 815 Phe Ile Ala Leu Lleu Pro Ser Ile Arg Pro Arg Tyr Tyr Ser Ile Ser 82O 825 83 O Ser Ser Pro Arg Val Asp Glu Lys Glin Ala Ser Ile Thr Val Ser Val 835 84 O 845 Val Ser Gly Glu Ala Trp Ser Gly Tyr Gly Glu Tyr Lys Gly Ile Ala 850 855 860 Ser Asn Tyr Lieu Ala Glu Lieu. Glin Glu Gly Asp Thr Ile Thr Cys Phe 865 87O 87s 88O Ile Ser Thr Pro Glin Ser Glu Phe Thr Lieu Pro Lys Asp Pro Glu Thr 885 890 895 Pro Leu. Ile Met Val Gly Pro Gly Thr Gly Val Ala Pro Phe Arg Gly 9 OO 905 91 O Phe Val Glin Ala Arg Lys Glin Lieu Lys Glu Glin Gly Glin Ser Lieu. Gly 915 92 O 925 Glu Ala His Lieu. Tyr Phe Gly Cys Arg Ser Pro His Glu Asp Tyr Lieu. 93 O 935 94 O Tyr Glin Glu Glu Lieu. Glu Asn Ala Glin Ser Glu Gly Ile Ile Thr Lieu. 945 950 955 96.O His Thr Ala Phe Ser Arg Met Pro Asn Glin Pro Llys Thr Tyr Val Glin 965 97O 97. His Val Met Glu Glin Asp Gly Lys Llys Lieu. Ile Glu Lieu. Lieu. Asp Glin 98O 985 99 O Gly Ala His Phe Tyr Ile Cys Gly Asp Gly Ser Gln Met Ala Pro Ala 995 1OOO 1005 Val Glu Ala Thr Lieu Met Lys Ser Tyr Ala Asp Val His Glin Val 1010 1015 1 O2O Ser Glu Ala Asp Ala Arg Lieu. Trp Lieu. Glin Gln Lieu. Glu Glu Lys 1025 1O3 O 1035 Gly Arg Tyr Ala Lys Asp Val Trp Ala Gly 104 O 1045

<210 SEQ ID NO 22 <211 LENGTH: 1048 &212> TYPE: PRT <213> ORGANISM: Artificial sequence &220s FEATURE: <223> OTHER INFORMATION: Cytochrome P450 variant CYP102A1var2 <4 OO SEQUENCE: 22 US 2009/0061471 A1 Mar. 5, 2009 72


Thir Ile Lys Glu Met Pro Gln Pro Llys Thr Phe Gly Glu Lieu Lys Asn 1. 5 1O 15 Lieu Pro Lieu. Lieu. Asn. Thir Asp Llys Pro Val Glin Ala Lieu Met Lys Ile 2O 25 3O Ala Asp Glu Lieu. Gly Glu Ile Phe Llys Phe Glu Ala Pro Gly Arg Val 35 4 O 45 Thir Arg Tyr Lieu. Ser Ser Glin Arg Lieu. Ile Lys Glu Ala Cys Asp Glu SO 55 6 O Ser Arg Phe Asp Lys Asn Lieu. Ser Glin Ala Lieu Lys Ala Val Arg Asp 65 70 7s 8O Phe Ala Gly Asp Gly Lieu Phe Thr Ser Trp Thr His Glu Lys Asn Trp 85 90 95 Llys Lys Ala His Asn. Ile Lieu. Lieu Pro Ser Phe Ser Glin Glin Ala Met 1OO 105 11 O Lys Gly Tyr His Ala Met Met Val Asp Ile Ala Val Glin Lieu Val Glin 115 12 O 125 Llys Trp Glu Arg Lieu. Asn Ala Asp Glu. His Ile Glu Val Pro Glu Asp 13 O 135 14 O Met Thr Arg Lieu. Thir Lieu. Asp Thir Ile Gly Lieu. Cys Gly Phe Asn Tyr 145 150 155 160 Arg Phe Asin Ser Phe Tyr Arg Asp Gln Pro His Pro Phe Ile Ile Ser 1.65 17O 17s Met Val Arg Ala Lieu. Asp Glu Val Met Asn Llys Lieu. Glin Arg Ala Asn 18O 185 19 O Pro Asp Asp Pro Ala Tyr Asp Glu Asn Lys Arg Glin Cys Glin Glu Asp 195 2OO 2O5 Ile Llys Val Met Asn Asp Lieu Val Asp Llys Ile Ile Ala Asp Arg Llys 21 O 215 22O Ala Arg Gly Glu Glin Ser Asp Asp Lieu. Lieu. Thr Gln Met Lieu. Asn Gly 225 23 O 235 24 O Lys Asp Pro Glu Thr Gly Glu Pro Lieu. Asp Asp Gly Asn. Ile Ser Tyr 245 250 255 Glin Ile Ile Thr Phe Lieu. Ile Ala Gly His Glu Thir Thr Ser Gly Lieu. 26 O 265 27 O Lieu. Ser Phe Ala Lieu. Tyr Phe Lieu Val Lys Asn. Pro His Val Lieu. Glin 27s 28O 285 Llys Val Ala Glu Glu Ala Ala Arg Val Lieu Val Asp Pro Val Pro Ser 29 O 295 3 OO Tyr Lys Glin Val Lys Glin Lieu Lys Tyr Val Gly Met Val Lieu. Asn. Glu 3. OS 310 315 32O Ala Lieu. Arg Lieu. Trp Pro Thr Ala Pro Ala Phe Ser Lieu. Tyr Ala Lys 3.25 330 335 Glu Asp Thr Val Lieu. Gly Gly Glu Tyr Pro Lieu. Glu Lys Gly Asp Glu 34 O 345 35. O Val Met Val Lieu. Ile Pro Glin Lieu. His Arg Asp Llys Thir Ile Trp Gly 355 360 365 Asp Asp Val Glu Glu Phe Arg Pro Glu Arg Phe Glu Asn. 
Pro Ser Ala 37 O 375 38O Ile Pro Gln His Ala Phe Llys Pro Phe Gly Asn Gly Glin Arg Ala Cys 385 390 395 4 OO US 2009/0061471 A1 Mar. 5, 2009 73

- Continued Ile Gly Glin Glin Phe Ala Lieu. His Glu Ala Thr Lieu Val Lieu. Gly Met 4 OS 41O 415 Met Lieu Lys His Phe Asp Phe Glu Asp His Thr Asn Tyr Glu Lieu. Asp 42O 425 43 O Ile Lys Glu Thir Lieu. Thir Lieu Lys Pro Glu Gly Phe Val Val Lys Ala 435 44 O 445 Lys Ser Lys Lys Ile Pro Leu Gly Gly Ile Pro Ser Pro Ser Thr Glu 450 45.5 460 Glin Ser Ala Lys Llys Val Arg Llys Lys Ala Glu Asn Ala His Asn. Thir 465 470 47s 48O Pro Leu Lieu Val Lieu. Tyr Gly Ser Asn Met Gly Thr Ala Glu Gly Thr 485 490 495 Ala Arg Asp Lieu Ala Asp Ile Ala Met Ser Lys Gly Phe Ala Pro Glin SOO 505 51O Val Ala Thir Lieu. Asp Ser His Ala Gly Asn Lieu Pro Arg Glu Gly Ala 515 52O 525 Val Lieu. Ile Val Thr Ala Ser Tyr Asn Gly His Pro Pro Asp Asn Ala 53 O 535 54 O Lys Glin Phe Val Asp Trp Lieu. Asp Glin Ala Ser Ala Asp Glu Val Lys 5.45 550 555 560 Gly Val Arg Tyr Ser Val Phe Gly Cys Gly Asp Lys Asn Trp Ala Thr 565 st O sts Thir Tyr Glin Llys Val Pro Ala Phe Ile Asp Glu Thir Lieu Ala Ala Lys 58O 585 59 O Gly Ala Glu Asn. Ile Ala Asp Arg Gly Glu Ala Asp Ala Ser Asp Asp 595 6OO 605 Phe Glu Gly Thr Tyr Glu Glu Trp Arg Glu. His Met Trp Ser Asp Val 610 615 62O Ala Ala Tyr Phe Asn Lieu. Asp Ile Glu Asn. Ser Glu Asp Asn Llys Ser 625 630 635 64 O Thir Lieu. Ser Lieu. Glin Phe Val Asp Ser Ala Ala Asp Met Pro Lieu Ala 645 650 655 Llys Met His Gly Ala Phe Ser Thr Asn Val Val Ala Ser Lys Glu Lieu. 660 665 67 O Glin Glin Pro Gly Ser Ala Arg Ser Thr Arg His Lieu. Glu Ile Glu Lieu. 675 68O 685 Pro Lys Glu Ala Ser Tyr Glin Glu Gly Asp His Lieu. Gly Val Ile Pro 69 O. 695 7 OO Arg Asn Tyr Glu Gly Ile Val Asn Arg Val Thr Ala Arg Phe Gly Lieu 7 Os 71O 71s 72O Asp Ala Ser Glin Glin Ile Arg Lieu. Glu Ala Glu Glu Glu Lys Lieu Ala 72 73 O 73 His Lieu Pro Lieu Ala Lys Thr Val Ser Val Glu Glu Lieu. Lieu. Glin Tyr 740 74. 7 O Val Glu Lieu. Glin Asp Pro Val Thr Arg Thr Glin Lieu. Arg Ala Met Ala 7ss 760 765 Ala Lys Thr Val Cys Pro Pro His Llys Val Glu Lieu. Glu Ala Lieu. 
Lieu 770 775 78O Glu Lys Glin Ala Tyr Lys Glu Glin Val Lieu Ala Lys Arg Lieu. Thir Met 78s 79 O 79. 8OO Lieu. Glu Lieu. Lieu. Glu Lys Tyr Pro Ala Cys Glu Met Llys Phe Ser Glu US 2009/0061471 A1 Mar. 5, 2009 74


805 810 815 Phe Ile Ala Leu Lleu Pro Ser Ile Arg Pro Arg Tyr Tyr Ser Ile Ser 82O 825 83 O Ser Ser Pro Arg Val Asp Glu Lys Glin Ala Ser Ile Thr Val Ser Val 835 84 O 845 Val Ser Gly Glu Ala Trp Ser Gly Tyr Gly Glu Tyr Lys Gly Ile Ala 850 855 860 Ser Asn Tyr Lieu Ala Glu Lieu. Glin Glu Gly Asp Thr Ile Thr Cys Phe 865 87O 87s 88O Ile Ser Thr Pro Glin Ser Glu Phe Thr Lieu Pro Lys Asp Pro Glu Thr 885 890 895 Pro Leu. Ile Met Val Gly Pro Gly Thr Gly Val Ala Pro Phe Arg Gly 9 OO 905 91 O Phe Val Glin Ala Arg Lys Glin Lieu Lys Glu Glin Gly Glin Ser Lieu. Gly 915 92 O 925 Glu Ala His Lieu. Tyr Phe Gly Cys Arg Ser Pro His Glu Asp Tyr Lieu. 93 O 935 94 O Tyr Glin Glu Glu Lieu. Glu Asn Ala Glin Ser Glu Gly Ile Ile Thr Lieu. 945 950 955 96.O His Thr Ala Phe Ser Arg Met Pro Asn Glin Pro Llys Thr Tyr Val Glin 965 97O 97. His Val Met Glu Gln Asp Gly Lys Llys Lieu. Ile Glu Lieu. Lieu. Asp Gln 98O 985 99 O Gly Ala His Phe Tyr Ile Cys Gly Asp Gly Ser Gln Met Ala Pro Ala 995 1OOO 1005 Val Glu Ala Thr Lieu Met Lys Ser Tyr Ala Asp Val His Glin Val 1010 1015 1 O2O Ser Glu Ala Asp Ala Arg Lieu. Trp Lieu. Glin Gln Lieu. Glu Glu Lys 1025 1O3 O 1035 Gly Arg Tyr Ala Lys Asp Val Trp Ala Gly 104 O 1045

<210 SEQ ID NO 23 <211 LENGTH: 1048 &212> TYPE: PRT <213> ORGANISM: Artificial sequence &220s FEATURE: <223> OTHER INFORMATION: Cytochrome P450 variant CYP102A1var3 <4 OO SEQUENCE: 23 Thir Ile Lys Glu Met Pro Gln Pro Llys Thr Phe Gly Glu Lieu Lys Asn 1. 5 1O 15 Lieu Pro Lieu. Lieu. Asn. Thir Asp Llys Pro Val Glin Ala Lieu Met Lys Ile 2O 25 3O Ala Asp Glu Lieu. Gly Glu Ile Phe Llys Phe Glu Ala Pro Gly Cys Val 35 4 O 45 Thir Arg Tyr Lieu. Ser Ser Glin Arg Lieu. Ile Lys Glu Ala Cys Asp Glu SO 55 6 O Ser Arg Phe Asp Lys Asn Lieu. Ser Glin Ala Lieu Lys Ala Val Arg Asp 65 70 7s 8O Phe Ala Gly Asp Gly Lieu Phe Thr Ser Trp Thr His Glu Ile Asn Trp 85 90 95 Llys Lys Ala His Asn. Ile Lieu. Lieu Pro Ser Phe Ser Glin Glin Ala Met US 2009/0061471 A1 Mar. 5, 2009 75


1OO 105 11 O Lys Gly Tyr His Ala Met Met Val Asp Ile Ala Val Glin Lieu Val Glin 115 12 O 125 Llys Trp Glu Arg Lieu. Asn Ala Asp Glu. His Ile Glu Val Ser Glu Asp 13 O 135 14 O Met Thr Arg Lieu. Thir Lieu. Asp Thir Ile Gly Lieu. Cys Gly Phe Asn Tyr 145 150 155 160 Arg Phe Asin Ser Phe Tyr Arg Asp Gln Pro His Pro Phe Ile Ile Ser 1.65 17O 17s Met Val Arg Ala Lieu. Asp Glu Val Met Asn Llys Lieu. Glin Arg Ala Asn 18O 185 19 O Pro Asp Asp Pro Ala Tyr Asp Glu Asn Lys Arg Glin Cys Glin Glu Asp 195 2OO 2O5 Ile Llys Val Met Asn Asp Lieu Val Asp Llys Ile Ile Ala Asp Arg Llys 21 O 215 22O Ala Arg Gly Glu Glin Ser Asp Asp Lieu. Lieu. Thr Gln Met Lieu. Asn Gly 225 23 O 235 24 O Lys Asp Pro Glu Thr Gly Glu Pro Lieu. Asp Asp Gly Asn. Ile Ser Tyr 245 250 255 Glin Ile Ile Thr Phe Lieu. Ile Ala Gly His Glu Thir Thr Ser Gly Lieu. 26 O 265 27 O Lieu. Ser Phe Ala Lieu. Tyr Phe Lieu Val Lys ASn Pro His Val Lieu. Glin 27s 28O 285 Llys Val Ala Glu Glu Ala Ala Arg Val Lieu Val Asp Pro Val Pro Ser 29 O 295 3 OO Tyr Lys Glin Val Lys Glin Lieu Lys Tyr Val Gly Met Val Lieu. Asn. Glu 3. OS 310 315 32O Ala Lieu. Arg Lieu. Trp Pro Thr Ala Pro Ala Phe Ser Lieu. Tyr Ala Lys 3.25 330 335 Glu Asp Thr Val Lieu. Gly Gly Glu Tyr Pro Lieu. Glu Lys Gly Asp Glu 34 O 345 35. O Val Met Val Lieu. Ile Pro Glin Lieu. His Arg Asp Llys Thir Ile Trp Gly 355 360 365 Asp Asp Val Glu Glu Phe Arg Pro Glu Arg Phe Glu Asn. Pro Ser Ala 37 O 375 38O Ile Pro Gln His Ala Phe Llys Pro Phe Gly Asn Gly Glin Arg Ala Cys 385 390 395 4 OO Ile Gly Glin Glin Phe Ala Lieu. His Glu Ala Thr Lieu Val Lieu. Gly Met 4 OS 41O 415 Met Lieu Lys His Phe Asp Phe Glu Asp His Thr Asn Tyr Glu Lieu. Asp 42O 425 43 O Ile Lys Glu Thir Lieu. Thir Lieu Lys Pro Glu Gly Phe Val Val Lys Ala 435 44 O 445 Lys Ser Lys Lys Ile Pro Leu Gly Gly Ile Pro Ser Pro Ser Thr Glu 450 45.5 460 Glin Ser Ala Lys Llys Val Arg Llys Lys Ala Glu Asn Ala His Asn. Thir 465 470 47s 48O Pro Leu Lieu Val Lieu. 
Tyr Gly Ser Asn Met Gly Thr Ala Glu Gly Thr 485 490 495 Ala Arg Asp Lieu Ala Asp Ile Ala Met Ser Lys Gly Phe Ala Pro Glin SOO 505 51O US 2009/0061471 A1 Mar. 5, 2009 76


Val Ala Thir Lieu. Asp Ser His Ala Gly Asn Lieu Pro Arg Glu Gly Ala 515 52O 525 Val Lieu. Ile Val Thr Ala Ser Tyr Asn Gly His Pro Pro Asp Asn Ala 53 O 535 54 O Lys Glin Phe Val Asp Trp Lieu. Asp Glin Ala Ser Ala Asp Glu Val Lys 5.45 550 555 560 Gly Val Arg Tyr Ser Val Phe Gly Cys Gly Asp Lys Asn Trp Ala Thr 565 st O sts Thir Tyr Glin Llys Val Pro Ala Phe Ile Asp Glu Thir Lieu Ala Ala Lys 58O 585 59 O Gly Ala Glu Asn. Ile Ala Asp Arg Gly Glu Ala Asp Ala Ser Asp Asp 595 6OO 605 Phe Glu Gly Thr Tyr Glu Glu Trp Arg Glu. His Met Trp Ser Asp Val 610 615 62O Ala Ala Tyr Phe Asn Lieu. Asp Ile Glu Asn. Ser Glu Asp Asn Llys Ser 625 630 635 64 O Thir Lieu. Ser Lieu. Glin Phe Val Asp Ser Ala Ala Asp Met Pro Lieu Ala 645 650 655 Llys Met His Gly Ala Phe Ser Thr Asn Val Val Ala Ser Lys Glu Lieu. 660 665 67 O Glin Glin Pro Gly Ser Ala Arg Ser Thr Arg His Lieu. Glu Ile Glu Lieu. 675 68O 685 Pro Lys Glu Ala Ser Tyr Glin Glu Gly Asp His Lieu. Gly Val Ile Pro 69 O. 695 7 OO Arg Asn Tyr Glu Gly Ile Val Asn Arg Val Thr Ala Arg Phe Gly Lieu 7 Os 71O 71s 72O Asp Ala Ser Glin Glin Ile Arg Lieu. Glu Ala Glu Glu Glu Lys Lieu Ala 72 73 O 73 His Lieu Pro Lieu Ala Lys Thr Val Ser Val Glu Glu Lieu. Lieu. Glin Tyr 740 74. 7 O Val Glu Lieu. Glin Asp Pro Val Thr Arg Thr Glin Lieu. Arg Ala Met Ala 7ss 760 765 Ala Lys Thr Val Cys Pro Pro His Llys Val Glu Lieu. Glu Ala Lieu. Lieu 770 775 78O Glu Lys Glin Ala Tyr Lys Glu Glin Val Lieu Ala Lys Arg Lieu. Thir Met 78s 79 O 79. 8OO Lieu. Glu Lieu. Lieu. Glu Lys Tyr Pro Ala Cys Glu Met Llys Phe Ser Glu 805 810 815 Phe Ile Ala Leu Lleu Pro Ser Ile Arg Pro Arg Tyr Tyr Ser Ile Ser 82O 825 83 O Ser Ser Pro Arg Val Asp Glu Lys Glin Ala Ser Ile Thr Val Ser Val 835 84 O 845 Val Ser Gly Glu Ala Trp Ser Gly Tyr Gly Glu Tyr Lys Gly Ile Ala 850 855 860 Ser Asn Tyr Lieu Ala Glu Lieu. Glin Glu Gly Asp Thr Ile Thr Cys Phe 865 87O 87s 88O Ile Ser Thr Pro Glin Ser Glu Phe Thr Lieu Pro Lys Asp Pro Glu Thr 885 890 895 Pro Leu. 
Ile Met Val Gly Pro Gly Thr Gly Val Ala Pro Phe Arg Gly 9 OO 905 91 O US 2009/0061471 A1 Mar. 5, 2009 77

- Continued Phe Val Glin Ala Arg Lys Glin Lieu Lys Glu Glin Gly Glin Ser Lieu. Gly 915 92 O 925 Glu Ala His Lieu. Tyr Phe Gly Cys Arg Ser Pro His Glu Asp Tyr Lieu. 93 O 935 94 O Tyr Glin Glu Glu Lieu. Glu Asn Ala Glin Ser Glu Gly Ile Ile Thr Lieu. 945 950 955 96.O His Thr Ala Phe Ser Arg Met Pro Asn Glin Pro Llys Thr Tyr Val Glin 965 97O 97. His Val Met Glu Glin Asp Gly Lys Llys Lieu. Ile Glu Lieu. Lieu. Asp Glin 98O 985 99 O Gly Ala His Phe Tyr Ile Cys Gly Asp Gly Ser Gln Met Ala Pro Ala 995 1OOO 1005 Val Glu Ala Thr Lieu Met Lys Ser Tyr Ala Asp Val His Glin Val 1010 1015 1 O2O Ser Glu Ala Asp Ala Arg Lieu. Trp Lieu. Glin Gln Lieu. Glu Glu Lys 1025 1O3 O 1035 Gly Arg Tyr Ala Lys Asp Val Trp Ala Gly 104 O 1045

<210 SEQ ID NO 24 <211 LENGTH: 1048 &212> TYPE: PRT <213> ORGANISM: Artificial sequence &220s FEATURE: <223> OTHER INFORMATION: Cytochrome P450 variant CYP102A1var3-2 <4 OO SEQUENCE: 24 Thir Ile Lys Glu Met Pro Gln Pro Llys Thr Phe Gly Glu Lieu Lys Asn 1. 5 1O 15 Lieu Pro Lieu. Lieu. Asn. Thir Asp Llys Pro Val Glin Ala Lieu Met Lys Ile 2O 25 3O Ala Asp Glu Lieu. Gly Glu Ile Phe Llys Phe Glu Ala Pro Gly Arg Val 35 4 O 45 Thir Arg Tyr Lieu. Ser Ser Glin Arg Lieu. Ile Lys Glu Ala Cys Asp Glu SO 55 6 O Ser Arg Phe Asp Lys Asn Lieu. Ser Glin Ala Lieu Lys Ala Val Arg Asp 65 70 7s 8O Pro Lieu. Gly Asp Gly Lieu. Phe Ala Ser Trp Thr His Glu Lys Asn Trp 85 90 95 Llys Lys Ala His Asn. Ile Lieu. Lieu Pro Ser Phe Ser Glin Glin Ala Met 1OO 105 11 O Lys Gly Tyr His Ala Met Met Val Asp Ile Ala Val Glin Lieu Val Glin 115 12 O 125 Llys Trp Glu Arg Lieu. Asn Ala Asp Glu. His Ile Glu Val Ser Glu Asp 13 O 135 14 O Met Thr Arg Lieu. Thir Lieu. Asp Thir Ile Gly Lieu. Cys Gly Phe Asn Tyr 145 150 155 160 Arg Phe Asin Ser Phe Tyr Arg Asp Gln Pro His Pro Phe Ile Ile Ser 1.65 17O 17s Met Val Arg Thr Lieu. Asp Glu Val Met Asn Llys Lieu. Glin Arg Ala Asn 18O 185 19 O Pro Asp Asp Pro Val Tyr Asp Glu Asn Lys Arg Glin Cys Glin Glu Asp 195 2OO 2O5 US 2009/0061471 A1 Mar. 5, 2009 78

- Continued Ile Llys Val Met Asn Asp Lieu Val Asp Llys Ile Ile Ala Asp Arg Llys 21 O 215 22O Ala Arg Gly Glu Glin Ser Asp Asp Lieu. Lieu. Thr Gln Met Lieu. Asn Gly 225 23 O 235 24 O Lys Asp Pro Glu Thr Gly Glu Pro Lieu. Asp Asp Gly Asn. Ile Ser Tyr 245 250 255 Glin Ile Ile Thr Phe Lieu. Ile Ala Gly His Glu Thir Thr Ser Gly Lieu. 26 O 265 27 O Lieu. Ser Phe Ala Lieu. Tyr Phe Lieu Val Lys Asn. Pro His Val Lieu. Glin 27s 28O 285 Llys Val Ala Glu Glu Ala Ala Arg Val Lieu Val Asp Pro Val Pro Ser 29 O 295 3 OO Tyr Lys Glin Val Lys Glin Lieu Lys Tyr Val Gly Met Val Lieu. Asn. Glu 3. OS 310 315 32O Ala Lieu. Arg Lieu. Trp Pro Thr Ala Pro Ala Phe Ser Lieu. Tyr Ala Lys 3.25 330 335 Glu Asp Thr Val Lieu. Gly Gly Glu Tyr Pro Lieu. Glu Lys Gly Asp Glu 34 O 345 35. O Val Met Val Lieu. Ile Pro Glin Lieu. His Arg Asp Llys Thir Ile Trp Gly 355 360 365 Asp Asp Val Glu Glu Phe Arg Pro Glu Arg Phe Glu Asn. Pro Ser Ala 37 O 375 38O Ile Pro Gln His Ala Phe Llys Pro Phe Gly Asn Gly Glin Arg Ala Cys 385 390 395 4 OO Ile Gly Glin Glin Phe Ala Lieu. His Glu Ala Thr Lieu Val Lieu. Gly Met 4 OS 41O 415 Met Lieu Lys His Phe Asp Phe Glu Asp His Thr Asn Tyr Glu Lieu. Asp 42O 425 43 O Ile Lys Glu Thir Lieu. Thir Lieu Lys Pro Glu Gly Phe Val Val Lys Ala 435 44 O 445 Lys Ser Lys Lys Ile Pro Leu Gly Gly Ile Pro Ser Pro Ser Thr Glu 450 45.5 460 Glin Ser Ala Lys Llys Val Arg Llys Lys Ala Glu Asn Ala His Asn. Thir 465 470 47s 48O Pro Leu Lieu Val Lieu. Tyr Gly Ser Asn Met Gly Thr Ala Glu Gly Thr 485 490 495 Ala Arg Asp Lieu Ala Asp Ile Ala Met Ser Lys Gly Phe Ala Pro Glin SOO 505 51O Val Ala Thir Lieu. Asp Ser His Ala Gly Asn Lieu Pro Arg Glu Gly Ala 515 52O 525 Val Lieu. Ile Val Thr Ala Ser Tyr Asn Gly His Pro Pro Asp Asn Ala 53 O 535 54 O Lys Glin Phe Val Asp Trp Lieu. Asp Glin Ala Ser Ala Asp Glu Val Lys 5.45 550 555 560 Gly Val Arg Tyr Ser Val Phe Gly Cys Gly Asp Lys Asn Trp Ala Thr 565 st O sts Thir Tyr Glin Llys Val Pro Ala Phe Ile Asp Glu Thir Lieu Ala Ala Lys 58O 585 59 O Gly Ala Glu Asn. 
Ile Ala Asp Arg Gly Glu Ala Asp Ala Ser Asp Asp 595 6OO 605 Phe Glu Gly Thr Tyr Glu Glu Trp Arg Glu. His Met Trp Ser Asp Val US 2009/0061471 A1 Mar. 5, 2009 79


610 615 62O Ala Ala Tyr Phe Asn Lieu. Asp Ile Glu Asn. Ser Glu Asp Asn Llys Ser 625 630 635 64 O Thir Lieu. Ser Lieu. Glin Phe Val Asp Ser Ala Ala Asp Met Pro Lieu Ala 645 650 655 Llys Met His Gly Ala Phe Ser Thr Asn Val Val Ala Ser Lys Glu Lieu. 660 665 67 O Glin Glin Pro Gly Ser Ala Arg Ser Thr Arg His Lieu. Glu Ile Glu Lieu. 675 68O 685 Pro Lys Glu Ala Ser Tyr Glin Glu Gly Asp His Lieu. Gly Val Ile Pro 69 O. 695 7 OO Arg Asn Tyr Glu Gly Ile Val Asn Arg Val Thr Ala Arg Phe Gly Lieu 7 Os 71O 71s 72O Asp Ala Ser Glin Glin Ile Arg Lieu. Glu Ala Glu Glu Glu Lys Lieu Ala 72 73 O 73 His Lieu Pro Lieu Ala Lys Thr Val Ser Val Glu Glu Lieu. Lieu. Glin Tyr 740 74. 7 O Val Glu Lieu. Glin Asp Pro Val Thr Arg Thr Glin Lieu. Arg Ala Met Ala 7ss 760 765 Ala Lys Thr Val Cys Pro Pro His Llys Val Glu Lieu. Glu Ala Lieu. Lieu 770 775 78O Glu Lys Glin Ala Tyr Lys Glu Glin Val Lieu. Ala Lys Arg Lieu. Thr Met 78s 79 O 79. 8OO Lieu. Glu Lieu. Lieu. Glu Lys Tyr Pro Ala Cys Glu Met Llys Phe Ser Glu 805 810 815 Phe Ile Ala Leu Lleu Pro Ser Ile Arg Pro Arg Tyr Tyr Ser Ile Ser 82O 825 83 O Ser Ser Pro Arg Val Asp Glu Lys Glin Ala Ser Ile Thr Val Ser Val 835 84 O 845 Val Ser Gly Glu Ala Trp Ser Gly Tyr Gly Glu Tyr Lys Gly Ile Ala 850 855 860 Ser Asn Tyr Lieu Ala Glu Lieu. Glin Glu Gly Asp Thr Ile Thr Cys Phe 865 87O 87s 88O Ile Ser Thr Pro Glin Ser Glu Phe Thr Lieu Pro Lys Asp Pro Glu Thr 885 890 895 Pro Leu. Ile Met Val Gly Pro Gly Thr Gly Val Ala Pro Phe Arg Gly 9 OO 905 91 O Phe Val Glin Ala Arg Lys Glin Lieu Lys Glu Glin Gly Glin Ser Lieu. Gly 915 92 O 925 Glu Ala His Lieu. Tyr Phe Gly Cys Arg Ser Pro His Glu Asp Tyr Lieu. 93 O 935 94 O Tyr Glin Glu Glu Lieu. Glu Asn Ala Glin Ser Glu Gly Ile Ile Thr Lieu. 945 950 955 96.O His Thr Ala Phe Ser Arg Met Pro Asn Glin Pro Llys Thr Tyr Val Glin 965 97O 97. His Val Met Glu Glin Asp Gly Lys Llys Lieu. Ile Glu Lieu. Lieu. 
Asp Gln 980 985 990 Gly Ala His Phe Tyr Ile Cys Gly Asp Gly Ser Gln Met Ala Pro Ala 995 1000 1005 Val Glu Ala Thr Leu Met Lys Ser Tyr Ala Asp Val His Gln Val 1010 1015 1020


Ser Glu Ala Asp Ala Arg Leu Trp Leu Gln Gln Leu Glu Glu Lys 1025 1030 1035 Gly Arg Tyr Ala Lys Asp Val Trp Ala Gly 1040 1045

<210 SEQ ID NO 25 <211 LENGTH: 1048 &212> TYPE: PRT <213> ORGANISM: Artificial sequence &220s FEATURE: <223> OTHER INFORMATION: Cytochrome P450 variant CYP102A1var3-3 <4 OO SEQUENCE: 25 Thir Ile Lys Glu Met Pro Gln Pro Llys Thr Phe Gly Glu Lieu Lys Asn 1. 5 1O 15 Lieu Pro Lieu. Lieu. Asn. Thir Asp Llys Pro Val Glin Ala Lieu Met Lys Ile 2O 25 3O Ala Asp Glu Lieu. Gly Glu Ile Phe Llys Phe Glu Ala Pro Gly Arg Val 35 4 O 45 Thir Arg Tyr Lieu. Ser Ser Glin Arg Lieu. Ile Lys Glu Ala Cys Asp Glu SO 55 6 O Ser Arg Phe Asp Lys Asn Lieu. Ser Glin Ala Lieu Lys Ala Val Arg Asp 65 70 7s 8O Cys Pro Gly Asp Gly Lieu. Ala Thr Ser Trp Thr His Glu Lys ASn Trp 85 90 95 Llys Lys Ala His Asn. Ile Lieu. Lieu Pro Ser Phe Ser Glin Glin Ala Met 1OO 105 11 O Lys Gly Tyr His Ala Met Met Val Asp Ile Ala Val Glin Lieu Val Glin 115 12 O 125 Llys Trp Glu Arg Lieu. Asn Ala Asp Glu. His Ile Glu Val Ser Glu Asp 13 O 135 14 O Met Thr Arg Lieu. Thir Lieu. Asp Thir Ile Gly Lieu. Cys Gly Phe Asn Tyr 145 150 155 160 Arg Phe Asin Ser Phe Tyr Arg Asp Gln Pro His Pro Phe Ile Ile Ser 1.65 17O 17s Met Val Arg Thr Lieu. Asp Glu Val Met Asn Llys Lieu. Glin Arg Ala Asn 18O 185 19 O Pro Asp Asp Pro Val Tyr Asp Glu Asn Lys Arg Glin Cys Glin Glu Asp 195 2OO 2O5 Ile Llys Val Met Asn Asp Lieu Val Asp Llys Ile Ile Ala Asp Arg Llys 21 O 215 22O Ala Arg Gly Glu Glin Ser Asp Asp Lieu. Lieu. Thr Gln Met Lieu. Asn Gly 225 23 O 235 24 O Lys Asp Pro Glu Thr Gly Glu Pro Lieu. Asp Asp Gly Asn. Ile Ser Tyr 245 250 255 Glin Ile Ile Thr Phe Lieu. Ile Ala Gly His Glu Thir Thr Ser Gly Lieu. 26 O 265 27 O Lieu. Ser Phe Ala Lieu. Tyr Phe Lieu Val Lys Asn. Pro His Val Lieu. Glin 27s 28O 285 Llys Val Ala Glu Glu Ala Ala Arg Val Lieu Val Asp Pro Val Pro Ser 29 O 295 3 OO Tyr Lys Glin Val Lys Glin Lieu Lys Tyr Val Gly Met Val Lieu. Asn. Glu 3. OS 310 315 32O US 2009/0061471 A1 Mar. 5, 2009 81


Ala Leu Arg Leu Trp Pro Thr Ala Pro Ala Phe Ser Leu Tyr Ala Lys 325 330 335 Glu Asp Thr Val Leu Gly Gly Glu Tyr Pro Leu Glu Lys Gly Asp Glu 340 345 350 Val Met Val Leu Ile Pro Gln Leu His Arg Asp Lys Thr Ile Trp Gly 355 360 365 Asp Asp Val Glu Glu Phe Arg Pro Glu Arg Phe Glu Asn Pro Ser Ala 370 375 380 Ile Pro Gln His Ala Phe Lys Pro Phe Gly Asn Gly Gln Arg Ala Cys 385 390 395 400

Ile Gly Glin Glin Phe Ala Lieu. His Glu A a. Thr Lieu Val Lieu. Gly Met 4 OS 4. O 415 Met Lieu Lys His Phe Asp Phe Glu Asp His Thr Asn Tyr Glu Lieu. Asp 42O 425 43 O Ile Lys Glu Thir Lieu. Thir Lieu Lys Pro Glu Gly Phe Val Val Lys Ala 435 44 O 445 Lys Ser Lys Lys Ile Pro Leu Gly Gly Ile Pro Ser Pro Ser Thr Glu 450 45.5 460 Glin Ser Ala Lys Llys Val Arg Llys Lys Ala Glu Asn Ala His Asn. Thir 465 470 47s 48O Pro Leu Lieu Val Lieu. Tyr Gly Ser Asn Met Gly Thr Ala Glu Gly Thr 485 490 495 Ala Arg Asp Lieu Ala Asp Ile Ala Met Ser Lys Gly Phe Ala Pro Glin SOO 505 51O Val Ala Thir Lieu. Asp Ser His Ala Gly Asn Lieu Pro Arg Glu Gly Ala 515 52O 525 Val Lieu. Ile Val Thr Ala Ser Tyr Asn Gly His Pro Pro Asp Asn Ala 53 O 535 54 O Lys Glin Phe Val Asp Trp Lieu. Asp Glin Ala Ser Ala Asp Glu Val Lys 5.45 550 555 560 Gly Val Arg Tyr Ser Val Phe Gly Cys Gly Asp Lys Asn Trp Ala Thr 565 st O sts Thir Tyr Glin Llys Val Pro Ala Phe Ile Asp Glu Thir Lieu Ala Ala Lys 58O 585 59 O Gly Ala Glu Asn. Ile Ala Asp Arg Gly Glu Ala Asp Ala Ser Asp Asp 595 6OO 605 Phe Glu Gly Thr Tyr Glu Glu Trp Arg Glu. His Met Trp Ser Asp Val 610 615 62O Ala Ala Tyr Phe Asn Lieu. Asp Ile Glu Asn. Ser Glu Asp Asn Llys Ser 625 630 635 64 O Thir Lieu. Ser Lieu. Glin Phe Val Asp Ser Ala Ala Asp Met Pro Lieu Ala 645 650 655 Llys Met His Gly Ala Phe Ser Thr Asn Val Val Ala Ser Lys Glu Lieu. 660 665 67 O Glin Glin Pro Gly Ser Ala Arg Ser Thr Arg His Lieu. Glu Ile Glu Lieu. 675 68O 685 Pro Lys Glu Ala Ser Tyr Glin Glu Gly Asp His Lieu. Gly Val Ile Pro 69 O. 695 7 OO Arg Asn Tyr Glu Gly Ile Val Asn Arg Val Thr Ala Arg Phe Gly Lieu 7 Os 71O 71s 72O US 2009/0061471 A1 Mar. 5, 2009 82

- Continued Asp Ala Ser Glin Glin Ile Arg Lieu. Glu Ala Glu Glu Glu Lys Lieu Ala 72 73 O 73 His Lieu Pro Lieu Ala Lys Thr Val Ser Val Glu Glu Lieu. Lieu. Glin Tyr 740 74. 7 O Val Glu Lieu. Glin Asp Pro Val Thr Arg Thr Glin Lieu. Arg Ala Met Ala 7ss 760 765 Ala Lys Thr Val Cys Pro Pro His Llys Val Glu Lieu. Glu Ala Lieu. Lieu 770 775 78O Glu Lys Glin Ala Tyr Lys Glu Glin Val Lieu Ala Lys Arg Lieu. Thir Met 78s 79 O 79. 8OO Lieu. Glu Lieu. Lieu. Glu Lys Tyr Pro Ala Cys Glu Met Llys Phe Ser Glu 805 810 815 Phe Ile Ala Leu Lleu Pro Ser Ile Arg Pro Arg Tyr Tyr Ser Ile Ser 82O 825 83 O Ser Ser Pro Arg Val Asp Glu Lys Glin Ala Ser Ile Thr Val Ser Val 835 84 O 845 Val Ser Gly Glu Ala Trp Ser Gly Tyr Gly Glu Tyr Lys Gly Ile Ala 850 855 860 Ser Asn Tyr Lieu Ala Glu Lieu. Glin Glu Gly Asp Thr Ile Thr Cys Phe 865 87O 87s 88O Ile Ser Thr Pro Glin Ser Glu Phe Thr Lieu Pro Lys Asp Pro Glu Thr 885 890 895 Pro Leu. Ile Met Val Gly Pro Gly Thr Gly Val Ala Pro Phe Arg Gly 9 OO 905 91 O Phe Val Glin Ala Arg Lys Glin Lieu Lys Glu Glin Gly Glin Ser Lieu. Gly 915 92 O 925 Glu Ala His Lieu. Tyr Phe Gly Cys Arg Ser Pro His Glu Asp Tyr Lieu. 93 O 935 94 O Tyr Glin Glu Glu Lieu. Glu Asn Ala Glin Ser Glu Gly Ile Ile Thr Lieu. 945 950 955 96.O His Thr Ala Phe Ser Arg Met Pro Asn Glin Pro Llys Thr Tyr Val Glin 965 97O 97. His Val Met Glu Glin Asp Gly Lys Llys Lieu. Ile Glu Lieu. Lieu. Asp Glin 98O 985 99 O Gly Ala His Phe Tyr Ile Cys Gly Asp Gly Ser Gln Met Ala Pro Ala 995 1OOO 1005 Val Glu Ala Thr Lieu Met Lys Ser Tyr Ala Asp Val His Glin Val 1010 1015 1 O2O Ser Glu Ala Asp Ala Arg Lieu. Trp Lieu. Glin Gln Lieu. Glu Glu Lys 1025 1O3 O 1035 Gly Arg Tyr Ala Lys Asp Val Trp Ala Gly 104 O 1045

<210> SEQ ID NO 26 <211> LENGTH: 1048 <212> TYPE: PRT <213> ORGANISM: Artificial sequence <220> FEATURE: <223> OTHER INFORMATION: Cytochrome P450 variant CYP102A1var3-4 <400> SEQUENCE: 26 Thr Ile Lys Glu Met Pro Gln Pro Lys Thr Phe Gly Glu Leu Lys Asn 1 5 10 15

- Continued Lieu Pro Lieu. Lieu. Asn. Thir Asp Llys Pro Val Glin Ala Lieu Met Lys Ile 2O 25 3O Ala Asp Glu Lieu. Gly Glu Ile Phe Llys Phe Glu Ala Pro Gly Arg Val 35 4 O 45 Thir Arg Tyr Lieu. Ser Ser Glin Arg Lieu. Ile Lys Glu Ala Cys Asp Glu SO 55 6 O Ser Arg Phe Asp Lys Asn Lieu. Ser Glin Ala Lieu Lys Ala Val Arg Asp 65 70 7s 8O Trp Ile Gly Asp Gly Lieu Ala Thir Ser Trp Thr His Glu Lys Asn Trp 85 90 95 Llys Lys Ala His Asn. Ile Lieu. Lieu Pro Ser Phe Ser Glin Glin Ala Met 1OO 105 11 O Lys Gly Tyr His Ala Met Met Val Asp Ile Ala Val Glin Lieu Val Glin 115 12 O 125 Llys Trp Glu Arg Lieu. Asn Ala Asp Glu. His Ile Glu Val Ser Glu Asp 13 O 135 14 O Met Thr Arg Lieu. Thir Lieu. Asp Thir Ile Gly Lieu. Cys Gly Phe Asn Tyr 145 150 155 160 Arg Phe Asin Ser Phe Tyr Arg Asp Gln Pro His Pro Phe Ile Ile Ser 1.65 17O 17s Met Val Arg Thr Lieu. Asp Glu Val Met Asn Llys Lieu. Glin Arg Ala Asn 18O 185 19 O Pro Asp Asp Pro Val Tyr Asp Glu Asn Lys Arg Glin Cys Glin Glu Asp 195 2OO 2O5 Ile Llys Val Met Asn Asp Lieu Val Asp Llys Ile Ile Ala Asp Arg Llys 21 O 215 22O Ala Arg Gly Glu Glin Ser Asp Asp Lieu. Lieu. Thr Gln Met Lieu. Asn Gly 225 23 O 235 24 O Lys Asp Pro Glu Thr Gly Glu Pro Lieu. Asp Asp Gly Asn. Ile Ser Tyr 245 250 255 Glin Ile Ile Thr Phe Lieu. Ile Ala Gly His Glu Thir Thr Ser Gly Lieu. 26 O 265 27 O Lieu. Ser Phe Ala Lieu. Tyr Phe Lieu Val Lys Asn. Pro His Val Lieu. Glin 27s 28O 285 Llys Val Ala Glu Glu Ala Ala Arg Val Lieu Val Asp Pro Val Pro Ser 29 O 295 3 OO Tyr Lys Glin Val Lys Glin Lieu Lys Tyr Val Gly Met Val Lieu. Asn. Glu 3. OS 310 315 32O Ala Lieu. Arg Lieu. Trp Pro Thr Ala Pro Ala Phe Ser Lieu. Tyr Ala Lys 3.25 330 335 Glu Asp Thr Val Lieu. Gly Gly Glu Tyr Pro Lieu. Glu Lys Gly Asp Glu 34 O 345 35. O Val Met Val Lieu. Ile Pro Glin Lieu. His Arg Asp Llys Thir Ile Trp Gly 355 360 365 Asp Asp Val Glu Glu Phe Arg Pro Glu Arg Phe Glu Asn. 
Pro Ser Ala 370 375 380 Ile Pro Gln His Ala Phe Lys Pro Phe Gly Asn Gly Gln Arg Ala Cys 385 390 395 400 Ile Gly Gln Gln Phe Ala Leu His Glu Ala Thr Leu Val Leu Gly Met 405 410 415 Met Leu Lys His Phe Asp Phe Glu Asp His Thr Asn Tyr Glu Leu Asp


425 430

Ile Glu Thr Leu Thr Leu Lys Pro Glu Gly Phe Val Val Ala 435 440 445

Ser Ile Pro Leu Gly Gly Ile Pro Ser Pro Ser Thr Glu 450 455 460

Gln Ser Ala Lys Val Arg Ala Glu Asn Ala His Asn Thr 465 470

Pro Leu Leu Val Leu Gly Ser Asn Met Gly Thr Ala Glu Gly Thr 485 490 495

Ala Arg Asp Leu Ala Asp Ile Ala Met Ser Lys Gly Phe Ala Pro Gln 500 505

Val Ala Thr Leu Asp Ser His Ala Gly Asn Leu Pro Arg Glu Gly Ala 515 525

Val Leu Ile Val Thr Ala Ser Asn Gly His Pro Pro Asp Asn Ala 530 535 540

Lys Gln Phe Val Asp Trp Leu Asp Gln Ala Ser Ala Asp Glu Val Lys 545 550 555 560

Gly Val Arg Ser Val Phe Gly Gly Asp Asn Trp Ala Thr 565 570 575

Thr Gln Lys Val Pro Ala Phe Ile Asp Glu Thr Leu Ala Ala 585 590

Gly Ala Glu Asn Ile Ala Asp Arg Gly Glu Ala Asp Ala Ser Asp Asp 595 605

Phe Glu Gly Thr Tyr Glu Glu Trp Arg Glu His Met Trp Ser Asp Val 610 615

Ala Ala Phe Asn Leu Asp Ile Glu Asn Ser Glu Asp Asn Ser 625 630 635 640

Thr Leu Ser Leu Gln Phe Val Asp Ser Ala Ala Asp Met Pro Leu Ala 645 650 655

Met His Gly Ala Phe Ser Thr Asn Val Val Ala Ser Lys Glu Leu 660 665 670

Gln Gln Pro Gly Ser Ala Arg Ser Thr Arg His Leu Glu Ile Glu Leu 675 685

Pro Lys Glu Ala Ser Gln Glu Gly Asp His Leu Gly Val Ile Pro 690 695 700

Arg Asn Glu Gly Ile Val Asn Arg Val Ala Arg Phe Gly Leu 705 710

Asp Ala Ser Gln Gln Ile Arg Leu Glu Ala Glu Glu Leu Ala 725 730 735

His Leu Pro Leu Ala Thr Val Ser Val Glu Leu Leu Gln Tyr 740 745 750

Val Glu Leu Gln Asp Val Thr Arg Thr Leu Arg Ala Met Ala 760 765

Ala Lys Thr Val Cys Pro His Val Leu Glu Ala Leu Leu 770 775

Glu Gln Ala Tyr Lys Glu Gln Val Leu Arg Leu Thr Met 790

Leu Glu Leu Leu Glu Pro Ala Cys Met Phe Ser Glu 805 810 815

Phe Ile Ala Leu Leu Ser Ile Arg Pro Arg Ser Ile Ser 820 825 830


Ser Ser Pro Arg Val Asp Glu Lys Glin Ala Ser Ile Thr Val Ser Val 835 84 O 845 Val Ser Gly Glu Ala Trp Ser Gly Tyr Gly Glu Tyr Lys Gly Ile Ala 850 855 860 Ser Asn Tyr Lieu Ala Glu Lieu. Glin Glu Gly Asp Thr Ile Thr Cys Phe 865 87O 87s 88O Ile Ser Thr Pro Glin Ser Glu Phe Thr Lieu Pro Lys Asp Pro Glu Thr 885 890 895 Pro Leu. Ile Met Val Gly Pro Gly Thr Gly Val Ala Pro Phe Arg Gly 9 OO 905 91 O Phe Val Glin Ala Arg Lys Glin Lieu Lys Glu Glin Gly Glin Ser Lieu. Gly 915 92 O 925 Glu Ala His Lieu. Tyr Phe Gly Cys Arg Ser Pro His Glu Asp Tyr Lieu. 93 O 935 94 O Tyr Glin Glu Glu Lieu. Glu Asn Ala Glin Ser Glu Gly Ile Ile Thr Lieu. 945 950 955 96.O His Thr Ala Phe Ser Arg Met Pro Asn Glin Pro Llys Thr Tyr Val Glin 965 97O 97. His Val Met Glu Glin Asp Gly Lys Llys Lieu. Ile Glu Lieu. Lieu. Asp Glin 98O 985 99 O Gly Ala His Phe Tyr Ile Cys Gly Asp Gly Ser Gln Met Ala Pro Ala 995 1OOO 1005 Val Glu Ala Thr Lieu Met Lys Ser Tyr Ala Asp Val His Glin Val 1010 1015 1 O2O Ser Glu Ala Asp Ala Arg Lieu. Trp Lieu. Glin Gln Lieu. Glu Glu Lys 1025 1O3 O 1035 Gly Arg Tyr Ala Lys Asp Val Trp Ala Gly 104 O 1045

<210 SEQ ID NO 27 <211 LENGTH: 1048 &212> TYPE: PRT <213> ORGANISM: Artificial sequence &220s FEATURE: <223> OTHER INFORMATION: Cytochrome P450 variant CYP102A1var3-5 <4 OO SEQUENCE: 27 Thir Ile Lys Glu Met Pro Gln Pro Llys Thr Phe Gly Glu Lieu Lys Asn 1. 5 1O 15 Lieu Pro Lieu. Lieu. Asn. Thir Asp Llys Pro Val Glin Ala Lieu Met Lys Ile 2O 25 3O Ala Asp Glu Lieu. Gly Glu Ile Phe Llys Phe Glu Ala Pro Gly Arg Val 35 4 O 45 Thir Arg Tyr Lieu. Ser Ser Glin Arg Lieu. Ile Lys Glu Ala Cys Asp Glu SO 55 6 O Ser Arg Phe Asp Lys Asn Lieu. Ser Glin Ala Lieu Lys Ala Val Arg Asp 65 70 7s 8O Phe Gly Gly Asp Gly Lieu Val Thr Ser Trp Thr His Glu Lys Asn Trp 85 90 95 Llys Lys Ala His Asn. Ile Lieu. Lieu Pro Ser Phe Ser Glin Glin Ala Met 1OO 105 11 O Lys Gly Tyr His Ala Met Met Val Asp Ile Ala Val Glin Lieu Val Glin 115 12 O 125 US 2009/0061471 A1 Mar. 5, 2009 86


Llys Trp Glu Arg Lieu. Asn Ala Asp Glu. His Ile Glu Val Ser Glu Asp 13 O 135 14 O Met Thr Arg Lieu. Thir Lieu. Asp Thir Ile Gly Lieu. Cys Gly Phe Asn Tyr 145 150 155 160 Arg Phe Asin Ser Phe Tyr Arg Asp Gln Pro His Pro Phe Ile Ile Ser 1.65 17O 17s Met Val Arg Ala Lieu. Asp Glu Val Met Asn Llys Lieu. Glin Arg Ala Asn 18O 185 19 O Pro Asp Asp Pro Ala Tyr Asp Glu Asn Lys Arg Glin Cys Glin Glu Asp 195 2OO 2O5 Ile Llys Val Met Asn Asp Lieu Val Asp Llys Ile Ile Ala Asp Arg Llys 21 O 215 22O Ala Arg Gly Glu Glin Ser Asp Asp Lieu. Lieu. Thr Gln Met Lieu. Asn Gly 225 23 O 235 24 O Lys Asp Pro Glu Thr Gly Glu Pro Lieu. Asp Asp Gly Asn. Ile Ser Tyr 245 250 255 Glin Ile Ile Thr Phe Lieu. Ile Ala Gly His Glu Thir Thr Ser Gly Lieu. 26 O 265 27 O Lieu. Ser Phe Ala Lieu. Tyr Phe Lieu Val Lys Asn. Pro His Val Lieu. Glin 27s 28O 285 Llys Val Ala Glu Glu Ala Ala Arg Val Lieu Val Asp Pro Val Pro Ser 29 O 295 3 OO Tyr Lys Glin Val Lys Glin Lieu Lys Tyr Val Gly Met Val Lieu. Asn. Glu 3. OS 310 315 32O Ala Lieu. Arg Lieu. Trp Pro Thr Val Pro Ala Phe Ser Lieu. Tyr Ala Lys 3.25 330 335 Glu Asp Thr Val Lieu. Gly Gly Glu Tyr Pro Lieu. Glu Lys Gly Asp Glu 34 O 345 35. O Val Met Val Lieu. Ile Pro Glin Lieu. His Arg Asp Llys Thir Ile Trp Gly 355 360 365 Asp Asp Val Glu Glu Phe Arg Pro Glu Arg Phe Glu Asn. Pro Ser Ala 37 O 375 38O Ile Pro Gln His Ala Phe Llys Pro Phe Gly Asn Gly Glin Arg Ala Cys 385 390 395 4 OO Ile Gly Glin Glin Phe Ala Lieu. His Glu Ala Thr Lieu Val Lieu. Gly Met 4 OS 41O 415 Met Lieu Lys His Phe Asp Phe Glu Asp His Thr Asn Tyr Glu Lieu. Asp 42O 425 43 O Ile Lys Glu Thir Lieu. Thir Lieu Lys Pro Glu Gly Phe Val Val Lys Ala 435 44 O 445 Lys Ser Lys Lys Ile Pro Leu Gly Gly Ile Pro Ser Pro Ser Thr Glu 450 45.5 460 Glin Ser Ala Lys Llys Val Arg Llys Lys Ala Glu Asn Ala His Asn. Thir 465 470 47s 48O Pro Leu Lieu Val Lieu. Tyr Gly Ser Asn Met Gly Thr Ala Glu Gly Thr 485 490 495 Ala Arg Asp Lieu Ala Asp Ile Ala Met Ser Lys Gly Phe Ala Pro Glin SOO 505 51O Val Ala Thir Lieu. 
Asp Ser His Ala Gly Asn Leu Pro Arg Glu Gly Ala 515 520 525

- Continued Val Lieu. Ile Val Thr Ala Ser Tyr Asn Gly His Pro Pro Asp Asn Ala 53 O 535 54 O Lys Glin Phe Val Asp Trp Lieu. Asp Glin Ala Ser Ala Asp Glu Val Lys 5.45 550 555 560 Gly Val Arg Tyr Ser Val Phe Gly Cys Gly Asp Lys Asn Trp Ala Thr 565 st O sts Thir Tyr Glin Llys Val Pro Ala Phe Ile Asp Glu Thir Lieu Ala Ala Lys 58O 585 59 O Gly Ala Glu Asn. Ile Ala Asp Arg Gly Glu Ala Asp Ala Ser Asp Asp 595 6OO 605 Phe Glu Gly Thr Tyr Glu Glu Trp Arg Glu. His Met Trp Ser Asp Val 610 615 62O Ala Ala Tyr Phe Asn Lieu. Asp Ile Glu Asn. Ser Glu Asp Asn Llys Ser 625 630 635 64 O Thir Lieu. Ser Lieu. Glin Phe Val Asp Ser Ala Ala Asp Met Pro Lieu Ala 645 650 655 Llys Met His Gly Ala Phe Ser Thr Asn Val Val Ala Ser Lys Glu Lieu. 660 665 67 O Glin Glin Pro Gly Ser Ala Arg Ser Thr Arg His Lieu. Glu Ile Glu Lieu. 675 68O 685 Pro Lys Glu Ala Ser Tyr Glin Glu Gly Asp His Lieu. Gly Val Ile Pro 69 O. 695 7 OO Arg Asn Tyr Glu Gly Ile Val Asn Arg Val Thr Ala Arg Phe Gly Lieu 7 Os 71O 71s 72O Asp Ala Ser Glin Glin Ile Arg Lieu. Glu Ala Glu Glu Glu Lys Lieu Ala 72 73 O 73 His Lieu Pro Lieu Ala Lys Thr Val Ser Val Glu Glu Lieu. Lieu. Glin Tyr 740 74. 7 O Val Glu Lieu. Glin Asp Pro Val Thr Arg Thr Glin Lieu. Arg Ala Met Ala 7ss 760 765 Ala Lys Thr Val Cys Pro Pro His Llys Val Glu Lieu. Glu Ala Lieu. Lieu 770 775 78O Glu Lys Glin Ala Tyr Lys Glu Glin Val Lieu Ala Lys Arg Lieu. Thir Met 78s 79 O 79. 8OO Lieu. Glu Lieu. Lieu. Glu Lys Tyr Pro Ala Cys Glu Met Llys Phe Ser Glu 805 810 815 Phe Ile Ala Leu Lleu Pro Ser Ile Arg Pro Arg Tyr Tyr Ser Ile Ser 82O 825 83 O Ser Ser Pro Arg Val Asp Glu Lys Glin Ala Ser Ile Thr Val Ser Val 835 84 O 845 Val Ser Gly Glu Ala Trp Ser Gly Tyr Gly Glu Tyr Lys Gly Ile Ala 850 855 860 Ser Asn Tyr Lieu Ala Glu Lieu. Glin Glu Gly Asp Thr Ile Thr Cys Phe 865 87O 87s 88O Ile Ser Thr Pro Glin Ser Glu Phe Thr Lieu Pro Lys Asp Pro Glu Thr 885 890 895 Pro Leu. 
Ile Met Val Gly Pro Gly Thr Gly Val Ala Pro Phe Arg Gly 900 905 910 Phe Val Gln Ala Arg Lys Gln Leu Lys Glu Gln Gly Gln Ser Leu Gly 915 920 925 Glu Ala His Leu Tyr Phe Gly Cys Arg Ser Pro His Glu Asp Tyr Leu


930 935 940

Tyr Gln Glu Glu Leu Glu Asn Ala Gln Ser Glu Gly Ile Ile Thr Leu 945 950 955 960

His Thr Ala Phe Ser Arg Met Pro Asn Gln Pro Lys Thr Tyr Val Gln 965 970 975

His Val Met Glu Gln Asp Gly Lys Lys Leu Ile Glu Leu Leu Asp Gln 980 985 990

Gly Ala His Phe Tyr Ile Cys Gly Asp Gly Ser Gln Met Ala Pro Ala 995 1000 1005

Val Glu Ala Thr Leu Met Lys Ser Tyr Ala Asp Val His Gln Val 1010 1015 Ser Glu Ala Asp Ala Arg Leu Trp Leu Gln Gln Leu Glu Glu Lys 1025 1030 1035 Gly Arg Tyr Ala Lys Asp Val Trp Ala Gly 1040 1045

<210> SEQ ID NO 28 <211> LENGTH: 1048 <212> TYPE: PRT <213> ORGANISM: Artificial sequence <220> FEATURE: <223> OTHER INFORMATION: Cytochrome P450 variant CYP102A1var3-8

<400> SEQUENCE: 28

Thr Ile Lys Glu Met Pro Gln Pro Thr Phe Gly Glu Leu Lys Asn 1 5 15

Leu Pro Leu Leu Asn Thr Asp Val Gln Ala Leu Met Ile 20 30

Ala Asp Glu Leu Gly Glu Ile Phe Phe Glu Ala Pro Gly Val 35 40 45

Thr Arg Leu Ser Ser Gln Arg Leu Ile Lys Glu Ala Asp Glu 50 55 60

Ser Arg Phe Asp Lys Asn Leu Ser Gln Ala Leu Ala Val Arg Asp 65 70

Phe Ala Gly Asp Gly Leu Ile Thr Ser Trp Thr His Glu Ile Asn Trp 85 90 95

Ala His Asn Ile Leu Leu Pro Ser Phe Ser Gln Gln Ala Met 100 105 110

Gly Tyr His Ala Met Met Val Asp Ile Ala Val Gln Leu Val Gln 115 120 125

Trp Glu Arg Leu Asn Ala Asp Glu His Ile Glu Val Ser Glu Asp 130 135 140

Met Thr Arg Leu Thr Leu Asp Thr Ile Gly Leu Gly Phe Asn Tyr 145 150 155 160

Arg Phe Asn Ser Phe Arg Asp Gln Pro His Pro Phe Ile Ile Ser 165 170 175

Met Val Arg Ala Leu Glu Val Met Asn Lys Leu Gln Arg Ala Asn 180 185 190

Pro Asp Asp Pro Ala Asp Glu Asn Arg Gln Cys Gln Glu Asp 195

Ile Lys Val Met Asn Asp Leu Val Asp Ile Ile Ala Asp Arg 210 215

Ala Arg Gly Glu Gln Ser Asp Asp Leu Leu Thr Gln Met Leu Asn Gly


225 23 O 235 24 O Lys Asp Pro Glu Thr Gly Glu Pro Lieu. Asp Asp Gly Asn. Ile Ser Tyr 245 250 255 Glin Ile Ile Thr Phe Lieu. Ile Ala Gly His Glu Thir Thr Ser Gly Lieu. 26 O 265 27 O Lieu. Ser Phe Ala Lieu. Tyr Phe Lieu Val Lys Asn. Pro His Val Lieu. Glin 27s 28O 285 Llys Val Ala Glu Glu Ala Ala Arg Val Lieu Val Asp Pro Val Pro Ser 29 O 295 3 OO Tyr Lys Glin Val Lys Glin Lieu Lys Tyr Val Gly Met Val Lieu. Asn. Glu 3. OS 310 315 32O Ala Lieu. Arg Lieu. Trp Pro Thr Ala Pro Ala Phe Ser Lieu. Tyr Ala Lys 3.25 330 335 Glu Asp Thr Val Lieu. Gly Gly Glu Tyr Pro Lieu. Glu Lys Gly Asp Glu 34 O 345 35. O Val Met Val Lieu. Ile Pro Glin Lieu. His Arg Asp Llys Thir Ile Trp Gly 355 360 365 Asp Asp Val Glu Glu Phe Arg Pro Glu Arg Phe Glu Asn. Pro Ser Ala 37 O 375 38O Ile Pro Gln His Ala Phe Llys Pro Phe Gly Asn Gly Glin Arg Ala Cys 385 390 395 4 OO Ile Gly Glin Glin Phe Ala Lieu. His Glu Ala Thr Lieu Val Lieu. Gly Met 4 OS 41O 415 Met Lieu Lys His Phe Asp Phe Glu Asp His Thr Asn Tyr Glu Lieu. Asp 42O 425 43 O Ile Lys Glu Thir Lieu. Thir Lieu Lys Pro Glu Gly Phe Val Val Lys Ala 435 44 O 445 Lys Ser Lys Lys Ile Pro Leu Gly Gly Ile Pro Ser Pro Ser Thr Glu 450 45.5 460 Glin Ser Ala Lys Llys Val Arg Llys Lys Ala Glu Asn Ala His Asn. Thir 465 470 47s 48O Pro Leu Lieu Val Lieu. Tyr Gly Ser Asn Met Gly Thr Ala Glu Gly Thr 485 490 495 Ala Arg Asp Lieu Ala Asp Ile Ala Met Ser Lys Gly Phe Ala Pro Glin SOO 505 51O Val Ala Thir Lieu. Asp Ser His Ala Gly Asn Lieu Pro Arg Glu Gly Ala 515 52O 525 Val Lieu. Ile Val Thr Ala Ser Tyr Asn Gly His Pro Pro Asp Asn Ala 53 O 535 54 O Lys Glin Phe Val Asp Trp Lieu. Asp Glin Ala Ser Ala Asp Glu Val Lys 5.45 550 555 560 Gly Val Arg Tyr Ser Val Phe Gly Cys Gly Asp Lys Asn Trp Ala Thr 565 st O sts Thir Tyr Glin Llys Val Pro Ala Phe Ile Asp Glu Thir Lieu Ala Ala Lys 58O 585 59 O Gly Ala Glu Asn. Ile Ala Asp Arg Gly Glu Ala Asp Ala Ser Asp Asp 595 6OO 605 Phe Glu Gly Thr Tyr Glu Glu Trp Arg Glu. His Met Trp Ser Asp Val 610 615 62O Ala Ala Tyr Phe Asn Lieu. 
Asp Ile Glu Asn Ser Glu Asp Asn Lys Ser 625 630 635 640


Thir Lieu. Ser Lieu. Glin Phe Val Asp Ser Ala Ala Asp Met Pro Lieu Ala 645 650 655 Llys Met His Gly Ala Phe Ser Thr Asn Val Val Ala Ser Lys Glu Lieu. 660 665 67 O Glin Glin Pro Gly Ser Ala Arg Ser Thr Arg His Lieu. Glu Ile Glu Lieu. 675 68O 685 Pro Lys Glu Ala Ser Tyr Glin Glu Gly Asp His Lieu. Gly Val Ile Pro 69 O. 695 7 OO Arg Asn Tyr Glu Gly Ile Val Asn Arg Val Thr Ala Arg Phe Gly Lieu 7 Os 71O 71s 72O Asp Ala Ser Glin Glin Ile Arg Lieu. Glu Ala Glu Glu Glu Lys Lieu Ala 72 73 O 73 His Lieu Pro Lieu Ala Lys Thr Val Ser Val Glu Glu Lieu. Lieu. Glin Tyr 740 74. 7 O Val Glu Lieu. Glin Asp Pro Val Thr Arg Thr Glin Lieu. Arg Ala Met Ala 7ss 760 765 Ala Lys Thr Val Cys Pro Pro His Llys Val Glu Lieu. Glu Ala Lieu. Lieu 770 775 78O Glu Lys Glin Ala Tyr Lys Glu Glin Val Lieu Ala Lys Arg Lieu. Thir Met 78s 79 O 79. 8OO Lieu. Glu Lieu. Lieu. Glu Lys Tyr Pro Ala Cys Glu Met Llys Phe Ser Glu 805 810 815 Phe Ile Ala Leu Lleu Pro Ser Ile Arg Pro Arg Tyr Tyr Ser Ile Ser 82O 825 83 O Ser Ser Pro Arg Val Asp Glu Lys Glin Ala Ser Ile Thr Val Ser Val 835 84 O 845 Val Ser Gly Glu Ala Trp Ser Gly Tyr Gly Glu Tyr Lys Gly Ile Ala 850 855 860 Ser Asn Tyr Lieu Ala Glu Lieu. Glin Glu Gly Asp Thr Ile Thr Cys Phe 865 87O 87s 88O Ile Ser Thr Pro Glin Ser Glu Phe Thr Lieu Pro Lys Asp Pro Glu Thr 885 890 895 Pro Leu. Ile Met Val Gly Pro Gly Thr Gly Val Ala Pro Phe Arg Gly 9 OO 905 91 O Phe Val Glin Ala Arg Lys Glin Lieu Lys Glu Glin Gly Glin Ser Lieu. Gly 915 92 O 925 Glu Ala His Lieu. Tyr Phe Gly Cys Arg Ser Pro His Glu Asp Tyr Lieu. 93 O 935 94 O Tyr Glin Glu Glu Lieu. Glu Asn Ala Glin Ser Glu Gly Ile Ile Thr Lieu. 945 950 955 96.O His Thr Ala Phe Ser Arg Met Pro Asn Glin Pro Llys Thr Tyr Val Glin 965 97O 97. His Val Met Glu Glin Asp Gly Lys Llys Lieu. Ile Glu Lieu. Lieu. Asp Glin 98O 985 99 O Gly Ala His Phe Tyr Ile Cys Gly Asp Gly Ser Gln Met Ala Pro Ala 995 1OOO 1005 Val Glu Ala Thr Lieu Met Lys Ser Tyr Ala Asp Val His Glin Val 1010 1015 1 O2O Ser Glu Ala Asp Ala Arg Lieu. Trp Lieu. 
Gln Gln Leu Glu Glu Lys 1025 1030 1035

Gly Arg Tyr Ala Lys Asp Val Trp Ala Gly 1040 1045

<210 SEQ ID NO 29 <211 LENGTH: 1048 &212> TYPE: PRT <213> ORGANISM: Artificial sequence &220s FEATURE: <223> OTHER INFORMATION: Cytochrome P450 variant CYP102A1var3-7 <4 OO SEQUENCE: 29 Thir Ile Lys Glu Met Pro Gln Pro Llys Thr Phe Gly Glu Lieu Lys Asn 1. 5 1O 15 Lieu Pro Lieu. Lieu. Asn. Thir Asp Llys Pro Val Glin Ala Lieu Met Lys Ile 2O 25 3O Ala Asp Glu Lieu. Gly Glu Ile Phe Llys Phe Glu Ala Pro Gly Cys Val 35 4 O 45 Thir Arg Tyr Lieu. Ser Ser Glin Arg Lieu. Ile Lys Glu Ala Cys Asp Glu SO 55 6 O Ser Arg Phe Asp Lys Asn Lieu. Ser Glin Ala Lieu Lys Ala Val Arg Asp 65 70 7s 8O Phe Ala Gly Asp Gly Lieu Ala Thr Ser Trp Thr His Glu Ile Asn Trp 85 90 95 Llys Lys Ala His Asn. Ile Lieu. Lieu Pro Ser Phe Ser Glin Glin Ala Met 1OO 105 11 O Lys Gly Tyr His Ala Met Met Val Asp Ile Ala Val Glin Lieu Val Glin 115 12 O 125 Llys Trp Glu Arg Lieu. Asn Ala Asp Glu. His Ile Glu Val Ser Glu Asp 13 O 135 14 O Met Thr Arg Lieu. Thir Lieu. Asp Thir Ile Gly Lieu. Cys Gly Phe Asn Tyr 145 150 155 160 Arg Phe Asin Ser Phe Tyr Arg Asp Gln Pro His Pro Phe Ile Ile Ser 1.65 17O 17s Met Val Arg Ala Lieu. Asp Glu Val Met Asn Llys Lieu. Glin Arg Ala Asn 18O 185 19 O Pro Asp Asp Pro Ala Tyr Asp Glu Asn Lys Arg Glin Cys Glin Glu Asp 195 2OO 2O5 Ile Llys Val Met Asn Asp Lieu Val Asp Llys Ile Ile Ala Asp Arg Llys 21 O 215 22O Ala Arg Gly Glu Glin Ser Asp Asp Lieu. Lieu. Thr Gln Met Lieu. Asn Gly 225 23 O 235 24 O Lys Asp Pro Glu Thr Gly Glu Pro Lieu. Asp Asp Gly Asn. Ile Ser Tyr 245 250 255 Glin Ile Ile Thr Phe Lieu. Ile Ala Gly His Glu Thir Thr Ser Gly Lieu. 26 O 265 27 O Lieu. Ser Phe Ala Lieu. Tyr Phe Lieu Val Lys Asn. Pro His Val Lieu. Glin 27s 28O 285 Llys Val Ala Glu Glu Ala Ala Arg Val Lieu Val Asp Pro Val Pro Ser 29 O 295 3 OO Tyr Lys Glin Val Lys Glin Lieu Lys Tyr Val Gly Met Val Lieu. Asn. Glu 3. OS 310 315 32O Ala Lieu. Arg Lieu. Trp Pro Thr Ala Pro Ala Phe Ser Lieu. Tyr Ala Lys 3.25 330 335