US008026085B2

(12) United States Patent
Fasan et al.
(10) Patent No.: US 8,026,085 B2
(45) Date of Patent: Sep. 27, 2011

(54) METHODS AND SYSTEMS FOR SELECTIVE FLUORINATION OF ORGANIC MOLECULES

(75) Inventors: Rudi Fasan, Brea, CA (US); Frances H. Arnold, La Canada, CA (US)

(73) Assignee: California Institute of Technology, Pasadena, CA (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 670 days.

(21) Appl. No.: 11/890,218

(22) Filed: Aug. 4, 2007

(65) Prior Publication Data
US 2009/0061471 A1 Mar. 5, 2009

Related U.S. Application Data
(60) Provisional application No. 60/835,613, filed on Aug. 4, 2006.

(51) Int. Cl.
C12P 7/00 (2006.01)
C12P 21/04 (2006.01)
C07C 17/00 (2006.01)
C07C 21/00 (2006.01)
C12N 9/02 (2006.01)
C12N 15/00 (2006.01)
C12Q 1/26 (2006.01)
C07H 21/04 (2006.01)

(52) U.S. Cl.: 435/132; 570/123; 570/246; 435/189; 435/69.1; 435/71.1; 435/440; 435/25; 536/23.2

(58) Field of Classification Search: None
See application file for complete search history.

Primary Examiner — Yong Pak

(74) Attorney, Agent, or Firm — Joseph R. Baker, Jr.; Gavrilovich Dodd & Lindsey LLP

(57) ABSTRACT
A method and system for selectively fluorinating organic molecules on a target site wherein the target site is activated and then fluorinated are shown together with a method and system for identifying a molecule having a biological activity.

14 Claims, 8 Drawing Sheets

436-437. Sun, L., et al., “Expression and Stabilization of Galactose Oxidase in Yeom, H., et al., "Oxygen Activation by Cytochrome P450BM-3: Escherichia coli by Directed Evolution.” Protein Engineering, Sep. Effects of Mutating an Active Site Acidic Residue.” Archieves of 2001, pp. 699-704, vol. 14, No. 9, Oxford University Press. Biochemistry and Biophysics, Jan. 15, 1997, pp. 209-216, vol. 337. Sun, L., et al., “Modification of Galactose Oxidase to Introduce No. 2, Academic Press. Glucose 6-Oxidase Activity.” ChemBioChem: A European Journal of Yeom, Sligar H., et al., “The role of Thr268 in oxygen activation of Chemical Biology, Aug. 2, 2002, pp. 781-783, vol. 3, No. 8, Wiley cytochrome P450BM-3” Biochemistry, vol. 34, No. 45. Abstract VCH-Vertag GmbH. Weinheim, Germany. 1995. Taly et al., “A combinatorial approach to Substrate discrimination in Young, Lee W., International Search Report and Written Opinion, the P450 CYP1A subfamily.” Biochimica et Biophysica Acta, 2007, Date of Mailing of Search: Feb. 11, 2009, International Application vol. 1770, pp. 446-457. No. PCT/USO8.52795. Thomas, J. M., et al., “Molecular Sieve Catalysts for the Regioselec Young, Lee W., International Search Report and Written Opinion, tive and shape-Selective Oxyfunctionalization of Alkanes in Air”, Date of Mailing of Search: Apr. 17, 2009, International Application Acc Chem Res 2001; 34:191-200. No. PCT/USO8,53344 Tonge, G., al., “Purification and Properties of the Methane Mono Zhang, J., et al., “Directed Evolution of a Fucosidase from a oxygenase enzyme System from Methylosinus trichosporium Galactosidase by DNA Shuffling and Screening.” Proc. Natl. Acad. OB3b. Biochem. J., 1977, pp. 333-344, vol. 161. Sci. USA, Apr. 1997, pp. 4504-4509, vol. 94. Tressel, P. et al., “A Simplified Purification Procedure for Galactose Zhao, H. et al., “Methods for Optimizing Industrial Enzymes by Oxidase.” Analytical Biochemistry, Jun. 1980, pp. 150-153, vol. 105. Directed Evolution'. 
Manual of Industrial Microbiology and No. 1, Academic Press, Inc. Biotechnology, 2nd Edition, 1999, pp. 597-604. Tressel, P. et al., “Galactose Oxidase from Dactylium dendroides.” Zimmer, T., et al., “The CYP52 Multigene Family of Candida Methods in Enzymology, 1982, pp. 163-171, vol. 89, Academic maltosa Encodes Functionally Diverse n-Alkane-Inducible Press. Cytochromes P450.” Biochemical and Biophysical Research Com Truan, G., et al., “Thr268 in Substrate Binding and Catalysis in munications, 1996, pp. 784-789, vol. 224, No. 3, Academic Press, P450BM-3.” Archives of iochemistry and Biophysics, Jan. 1, 1998, Inc. pp. 53-64, vol. 349, No. 1, Academic Press. Kosman, D., “Chapter 1 Galactose Oxidase,” in Lontie, R., Eds. Tsotsou et al., “High throughput assay for chytochroms P450BM3 Copper Proteins and Copper Enzymes vol. II, pp. 1-26, CRC Press, for Screening libraries of Substrates and combinatorial mutants.” Inc., Boca Raton, FL, USA, 1985. Biosensors and Bioelectronics, 2002, vol. 17, No. 1-2, pp. 119-131. Mazur, A., “Chapter 8, Galactose Oxidase.” ACS Symposium Series Tuyman, A. International Search Report and Written Opinion, Date 466—Enzymes in Carbohydrate Synthesis, 1991, pp. 99-1 10, Ameri of Mailing of Search: Feb. 26, 2002, International Application No. can Chemical Society, Washington, DC, USA, Jun. 24, 1991. PCT/US99,11460. McPherson, M. et al., “Galactose oxidase of Dactylium dendroides. Urlacher et al., “Biotransformations using prokaryotic P450 Gene cloning and sequence analysis.” Chemical Abstract Service, . Current Opinion in Biotechnology, 2002, vol. 13, XP-002298.547, Database accession No. M86819, Apr. 25, 1991. pp. 557-564. Sequence 54, U.S. Appl. No. 10/869,825, Mar 17, 2005. Urlacher et al., “Protein Engineering of cytochrome P450 Sequence 4, U.S. Appl. No. 10/018,730A, Sep. 21, 2004. monooxygenase from Bacillus megaterium.” Methods in Enzymol Sequence 9, U.S. Appl. No. 10/869,813, Mar. 17, 2005. ogy, pp. 208-224, vol. 388, 2004. 
Sequence 10, U.S. Appl. No. 10/869,813, Mar 17, 2005. Van Deurzen M. P. J., et al., “Selective Oxidations Catalyzed by Sequence 11, U.S. Appl. No. 10/398,178, Mar. 3, 2005. Peroxidases”, Tetrahedron Report No. 427, vol.53, No. 39, 1997; pp. 13183-13220. * cited by examiner U.S. Patent Sep. 27, 2011 Sheet 1 of 8 US 8,026,085 B2

[Drawing sheets 1-8 of US 8,026,085 B2; only the following labels are recoverable.]
FIG. 1: reaction scheme showing oxygenase-mediated hydroxylation followed by deoxofluorination.
FIG. 2: chemo-enzymatic strategy (oxygenase, then deoxofluorination; high regio- and stereoselectivity; highly enantiopure fluoro derivative in good yields) compared with chemical strategy (fluorination, then chiral resolution; poor stereoselectivity with current methods; enantiopure fluoro derivative in poor yields).
FIG. 3: P450 heme domain structure with labeled helices.
FIG. 7 (Cont'd): panel B, "Regioselectivity in MMPO activity (GC analysis)", MMPOH versus other products for variants Var3-19, Var3-18, Var3-8, Var2, Var3-5, Var3-16, Var3-6 and Var3-11; panel C, same variants.
FIG. 8: "Whole-cell activation of dihydrojasmone (DHJ)", DHJ and oxDHJ plotted against time (hrs).

METHODS AND SYSTEMS FOR SELECTIVE FLUORINATION OF ORGANIC MOLECULES

CROSS REFERENCE TO RELATED APPLICATIONS

This application claims priority to U.S. Provisional Application Ser. No. 60/835,613 filed on Aug. 4, 2006, the disclosure of which is incorporated herein by reference in its entirety.

TECHNICAL FIELD

The present disclosure generally relates to the fields of synthetic organic chemistry and pharmaceutical chemistry. In particular, the present disclosure relates to methods and systems for the selective fluorination of organic molecules.

BACKGROUND

The importance of fluorine in altering the physicochemical properties of organic molecules and its exploitation in medicinal chemistry has been highlighted in recent reviews (Bohm, Banner et al. 2004). Although fluorine is similar in size to hydrogen, H→F substitutions can cause dramatic effects on several properties of organic molecules, including their lipophilicity, dipole moment, and pKa. In addition, fluorine substitutions can dramatically alter the reactivity of the fluorinated site as well as that of neighboring functional groups.

In particular, in medicinal chemistry, there is a growing interest in incorporating fluorine atoms in building blocks, lead compounds and drugs, in that this may increase by many-fold the chances of turning these molecules into marketable drugs. Several studies have shown that potent drugs can be obtained through fluorination of much less active precursors. Some representative examples include the anticholesterolemic ezetimibe (Clader 2004), anticancer CF3-taxanes (Ojima 2004), fluoro-steroids, and antibacterial fluoroquinolones.

The improved pharmacological properties of fluoro-containing drugs are often due to their improved pharmacokinetic properties (biodistribution, clearance) and enhanced metabolic stability (Park, Kitteringham et al. 2001). Primary metabolism of drugs in humans generally occurs through P450-dependent systems, and the introduction of fluorine atoms at or near the sites of metabolic attack has often proven successful in increasing the half-life of a compound (Bohm, Banner et al. 2004). A comprehensive review covering the influence of fluorination on drug metabolism (especially P450-dependent) is presented in (Park, Kitteringham et al. 2001).

In other cases, the introduction of fluorine substituents leads to improvements in the pharmacological properties as a result of enhanced binding affinity of the molecule to biological receptors. Examples of the effect of fluorine on binding affinity are provided by recent results in the preparation of NK1 antagonists (Swain and Rupniak 1999), 5HT1D agonists (van Niel, Collins et al. 1999), and PTP1B antagonists (Burke, Ye et al. 1996).

Over the past years, fluorination has been playing an increasingly important role in drug discovery, as exemplified by the development of fluorinated derivatives of the anticancer drugs paclitaxel and docetaxel (Ojima 2004).

However, only a handful of organofluorine compounds occur in nature and even those are found in very small amounts (Harper and O'Hagan 1994). Consequently, any fluorine-containing substance selected for research, pharmaceutical, or agrochemical application has to be man-made.

Despite a few reports on the application of molecular fluorine (F2) for direct fluorination of organic compounds (Chambers, Skinner et al. 1996; Chambers, Hutchinson et al. 2000), this method typically suffers from poor selectivity and requires handling of a highly toxic and gaseous reagent. Several chemical strategies have been developed over the past decades to afford selective fluorination of organic compounds under friendlier conditions. These have been recently reviewed by Togni (Togni, Mezzetti et al. 2001), Cahard (Ma and Cahard 2004), Sodeoka (Hamashima and Sodeoka 2006), and Gouverneur (Bobbio and Gouverneur 2006). These strategies involve catalytic as well as non-catalytic methods. The latter comprise substrate-controlled fluorination methods, which generally make use of a chiral auxiliary, and reagent-controlled fluorination methods, which generally make use of chiral electrophilic N-F or nucleophilic fluorinating reagents.

These fluorination methods, however, need several chemical steps to prepare the chiral substrates (Davis and Han 1992; Enders, Potthoff et al. 1997) or the chiral reagents (Davis, Zhou et al. 1998; Taylor, Kotoris et al. 1999; Nyffeler, Duron et al. 2005) and have an applicability restricted to reactive C-H bonds (Cahard, Audouard et al. 2000; Shibata, Suzuki et al. 2000; Kim and Park 2002; Beeson and MacMillan 2005; Marigo, Fielenbach et al. 2005) in specific classes of compounds such as aldehydes (Beeson and MacMillan 2005; Marigo, Fielenbach et al. 2005) or di-carbonyls (Hintermann and Togni 2000; Ma and Cahard 2004; Shibata, Ishimaru et al. 2004; Hamashima and Sodeoka 2006).

Despite much progress in the field of organofluorine chemistry, the number of available methods for direct or indirect asymmetric synthesis of organofluorine compounds remains limited and additional tools are desirable. In particular, a general method to afford mono- or poly-fluorination of organic compounds at reactive and unreactive sites of their molecular scaffold is desirable.

SUMMARY

Provided herein are methods and systems for the selective fluorination of a target site of an organic molecule, which include the activation and subsequent fluorination of the target site. In the methods and systems disclosed herein, the target site is an oxidizable carbon atom of the organic molecule, the activation is performed by introducing an oxygen-containing functional group on the target site, and the fluorination of the activated site is performed by replacing the functional group introduced on the target site with fluorine. The introduction of the oxygen-containing functional group and the replacement of the functional group with a fluorine can be performed by suitable agents.

According to a first aspect, a method for fluorinating an organic molecule is disclosed, the method comprising providing an organic molecule comprising a target site; providing an oxidizing agent that oxidizes the organic molecule by introducing an oxygen-containing functional group on the target site; contacting the oxidizing agent with the organic molecule for a time and under conditions to allow introduction of the oxygen-containing functional group on the target site, thus providing an oxygenated organic molecule; providing a fluorinating agent; and contacting the fluorinating agent with the oxygenated organic molecule, for a time and under conditions to allow for the replacement of the oxygen-containing functional group with fluorine.

According to a second aspect, a system for the fluorination of an organic molecule is disclosed, the system comprising an oxidizing agent for introducing an oxygen-containing functional group in an organic molecule and a fluorinating agent for replacing the oxygen-containing functional group in the organic molecule with fluorine or a fluorine group. An oxygen-providing compound and/or fluorine-providing compound can also be included in the system.

A first advantage of the methods and systems disclosed herein is to allow for the fluorination of organic molecules in one or more specific and predetermined target sites, including one or more target sites of interest, thus allowing a regioselective mono- and poly-fluorination.

A second advantage of the methods and systems disclosed herein is to allow for the introduction of fluorine at a fluorine-unreactive site of a molecule, i.e. a site that, in absence of the oxygen-containing functional group, is unlikely to undergo a chemical transformation such as a fluorination, as long as said site is oxidizable.

A third advantage of the methods and systems disclosed herein is that by using a suitable agent, in particular a suitable oxidizing agent, it is possible to control the chirality of the final product and therefore produce a product molecule having a desired chirality (stereoselective fluorination).

A fourth advantage of the methods and systems disclosed herein is that the methods and systems provide fluorinated compounds wherein the fluorine is introduced in a predetermined site expected to be associated with a biological activity, which can therefore generate candidate compounds.

According to a third aspect, a method for the identification of a molecule having a biological activity is disclosed, the method comprising providing an organic molecule comprising a target site; providing an oxidizing agent; contacting the oxidizing agent with the organic molecule for a time and under conditions to allow introduction of an oxygen-containing functional group on the target site, thus providing an oxygenated organic molecule; providing a fluorinating agent; contacting the fluorinating agent with the oxygenated organic molecule, for a time and under conditions to allow for the replacement of the oxygen-containing functional group with fluorine; and testing the fluorinated organic molecule for the biological activity.

According to a fourth aspect, a system for identifying a molecule having a biological activity is disclosed. The system comprises an oxidizing agent capable of introducing an oxygen-containing functional group in a target site of an organic molecule, a fluorinating agent capable of replacing the oxygen-containing functional group in the organic molecule with fluorine, and an agent for testing the biological activity. An oxygen-providing agent and/or fluorine-providing agent can also be included in the system.

A further advantage of the methods and systems for the identification of a molecule having a biological activity is the possibility to produce a broad spectrum of molecules that, in view of the selected insertion of fluorine, already constitute promising candidates, thus shortening and improving the selection process.

An additional advantage of the methods and systems for the identification of a molecule having a biological activity is the possibility to confer new activities to a molecule that is already biologically active and/or to improve the biological activity of the original molecule by selective insertion of fluorine.

A still further advantage of the methods and systems for the identification of a molecule having a biological activity is the possibility to derive molecules that have a biological activity that is pharmacologically relevant, or to improve the pharmacological activity of a molecule that is already pharmacologically active. This is in view of the known ability of fluorine to improve the pharmacological profile of drugs.

The details of one or more embodiments of the disclosure are set forth in the accompanying drawings and the description below. Other features, objects, and advantages will be apparent from the description and drawings, and from the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which are incorporated into and constitute a part of this specification, illustrate one or more embodiments of the present disclosure and, together with the detailed description, serve to explain the principles and implementations of the disclosure.

FIG. 1 is a schematic representation of the methods and systems for the selective fluorination of an organic molecule A according to an embodiment disclosed herein.

FIG. 2 is a schematic representation of methods and systems for stereoselective fluorination of an organic molecule A according to an embodiment disclosed herein (chemo-enzymatic strategy), illustrated in comparison with methods and systems of the art (chemical strategy).

FIG. 3 is a graphic representation of the crystal structure of a P450 heme domain; helices D, L, I and E in the domain are also indicated; the heme prosthetic group in the domain is indicated as "heme"; the cysteine in the heme-ligand loop is displayed in spheres (black).

FIG. 4 is a schematic representation of methods and systems for identifying a molecule having biological activity according to an embodiment disclosed in the present specification.

FIG. 5 illustrates exemplary results from the screening of a subset of pre-selected oxygenases for the identification of a suitable oxidizing agent for the selective activation of the organic molecule dihydrojasmone. Panel A) is a diagram showing the conversion ratios for the reaction of activating dihydrojasmone with wild-type P450 and variants thereof, as determined by GC analysis. Panel B) is a diagram showing the product distribution obtained with wild-type P450 and variants thereof in the reaction of activating dihydrojasmone, as determined by GC analysis. Cpd 1 to cpd 9 indicate activated products 1 to 9.

FIG. 6 illustrates exemplary results from the screening of a subset of pre-selected oxygenases for the identification of a suitable oxidizing agent for the selective activation of the organic molecule menthofuran. Panel A) is a diagram showing the conversion ratios for the reaction of activating menthofuran with wild-type P450 and variants thereof, as determined by GC analysis. Panel B) is a diagram showing the product distribution obtained with wild-type P450 and variants thereof in the reaction of activating menthofuran, as determined by GC analysis. Cpd 1 to cpd 10 indicate activated products 1 to 10.

FIG. 7 illustrates exemplary results from the screening of a subset of pre-selected oxygenases for the identification of a suitable oxidizing agent for the selective activation of the organic molecule dihydro-4-methoxymethyl-2-methyl-5-phenyl-2-oxazoline (MMPO). In particular, Panel (A) is a diagram showing the results from HTS screening of a pool of oxidizing agents using the colorimetric reagent Purpald. Panel (B) is a diagram showing the results from the re-screen of the positive hits identified with colorimetric HTS, where the regioselectivity of each oxygenase is determined by GC analysis (MMPOH, i.e. dihydro-4-hydroxymethyl-2-methyl-5-phenyl-2-oxazoline, is the desired activated product). Panel (C) is a diagram showing the conversion ratios for the activation reactions of MMPO with each of the tested oxidizing agents, as determined by GC analysis.

FIG. 8 shows a diagram illustrating the time course for whole-cell activation of the organic molecule dihydrojasmone (DHJ) using a batch culture of Vara-expressing E. coli DH5α cells (0.5 L). The consumption of substrate (DHJ) and the accumulation of the desired activated product (oxDHJ) were monitored over time by GC analysis of aliquots of the cell culture.

DETAILED DESCRIPTION

Methods and systems for the selective fluorination of a predetermined target site of an organic molecule are disclosed herein. In these methods and systems, the predetermined target site is first activated by an oxidizing agent that introduces an oxygen-containing functional group in the target site, and then fluorinated by a fluorinating agent that replaces the oxygen-containing functional group with fluorine or a fluorine group. In particular, activation and fluorination of an organic molecule can be performed as schematically illustrated in FIGS. 1 and 2. FIG. 2 also shows the activation and fluorination of an organic molecule performed according to some embodiments disclosed herein, in comparison with chemical methods and systems of the art.

The term "target site" as used herein refers to an oxidizable C atom, i.e. a C atom in the organic molecule that bears an oxidizable bond. Examples of oxidizable bonds include but are not limited to a C-H bond, a C-C double bond, and a C-X bond, single or double, where X is a heteroatom independently selected from the group consisting of B (boron), O (oxygen), P (phosphorus), N (nitrogen), S (sulfur), Si (silicon), Se (selenium), F (fluorine), Cl (chlorine), Br (bromine), and I (iodine).

The terms "activate" and "activation" as used herein with reference to a target site indicate a chemical reaction resulting in an enhanced reactivity of the C atom that forms the site, so that said C atom acquires or improves its ability to undergo a chemical transformation, more specifically a fluorination reaction. For example, the insertion of an oxygen atom in a target site bearing a C-H bond, resulting in the formation of a hydroxyl group (C-OH) on the site, activates the target site for a deoxofluorination reaction. As a further example, the insertion of an oxygen atom in a target site bearing a C=C double bond, resulting in the formation of an epoxy group, activates the site for a ring-opening fluorination reaction. Accordingly, the wording "activated site" as used herein refers to a C atom of an organic molecule that, following activation, has acquired or improved its ability to undergo a chemical transformation and in particular a fluorination reaction when contacted with a fluorine.

The term "contact" as used herein with reference to interactions of chemical units indicates that the chemical units are at a distance that allows short-range non-covalent interactions (such as Van der Waals forces, hydrogen bonding, hydrophobic interactions, electrostatic interactions, dipole-dipole interactions) to dominate the interaction of the chemical units. For example, when an oxygenase enzyme is contacted with a target molecule, the enzyme is allowed to interact with and bind to the organic molecule through non-covalent interactions so that a reaction between the enzyme and the target molecule can occur.

The wording "chemical unit" identifies single atoms as well as groups of atoms connected by a chemical bond. Exemplary chemical units herein described include, but are not limited to, fluorine atoms; chemical groups such as oxygen-containing chemical groups and fluorine-containing groups; organic molecules or portions thereof, including target sites; and chemical agents, including oxidizing agents and fluorinating agents.

The term "agent" as used herein refers to a chemical unit that is capable of causing a reaction specified in the identifier accompanying the term. Accordingly, an "oxidizing agent" is an agent capable of causing an oxygenation reaction of a suitable substrate and a "fluorinating agent" is an agent capable of causing a fluorination reaction of a suitable substrate. An oxygenation reaction is a chemical reaction in which one or more oxygen atoms are inserted into one or more pre-existing chemical bonds of said substrate. A fluorination reaction is a chemical reaction in which a substituent connected to an atom in said substrate is replaced by fluorine.

The term "introducing" as used herein with reference to the interaction between two chemical units, such as a functional group and a target site, indicates a reaction resulting in the formation of a bond between the two chemical units, e.g. the functional group and the target site.

The term "functional group" as used herein refers to a chemical unit within a molecule that is responsible for a characteristic chemical reaction of that molecule. An "oxygen-containing functional group" is a functional group that comprises an oxygen atom. Exemplary oxygen-containing functional groups include but are not limited to a hydroxyl group (-OH), ether group (-OR), carbonyl oxygen (=O), hydroperoxy group (-OOH), and peroxy group (-OOR).

The terms "replace" and "replacement" as used herein with reference to chemical units indicate formation of a chemical bond between the chemical units in place of a pre-existing bond in at least one of said chemical units. In particular, replacing an oxygen-containing functional group on the target site with a fluorine or fluorine group indicates the formation of a bond between the target site and the fluorine or fluorine group in place of the bond between the target site and the oxygen-containing functional group.

Any organic molecule that includes at least one target site, i.e. at least one oxidizable C atom, and is a substrate of at least one oxidizing agent, can be used as an organic molecule to be fluorinated according to the methods and systems disclosed herein.

In some embodiments, the oxidizing agent is an enzyme, such as an oxygenase, that is able to introduce an oxygen-containing functional group in the target site of the organic molecule using an oxygen source such as molecular oxygen (O2), hydrogen peroxide (H2O2), a hydroperoxide (R-OOH), or a peroxide (R-O-O-R'), including the enzymes with an Enzyme Classification (EC) number typically corresponding to EC 1.13 or EC 1.14. Suitable oxygenases for the systems and methods herein described include but are not limited to monooxygenases, dioxygenases, peroxygenases, and peroxidases. In particular, monooxygenases and peroxygenases can be used to introduce on the target site an oxygen-containing functional group that comprises one oxygen atom, dioxygenases can be used to introduce on the target site an oxygen-containing functional group that comprises two oxygen atoms, and peroxidases can be used to introduce on the target site an oxygen-containing functional group that comprises one or two oxygen atoms.

In some embodiments, the oxygenases are wild-type oxygenases and in some embodiments the oxygenase is a mutant or variant. An oxygenase is wild-type if it has the structure and function of an oxygenase as it exists in nature. An oxygenase is a mutant or variant if it has been mutated from the oxygenase as it exists in nature and provides an oxygenase enzymatic activity.

In some embodiments, the variant oxygenase provides an enhanced oxygenase enzymatic activity compared to the corresponding wild-type oxygenase. In some embodiments, the variant oxygenases maintain the binding specificity of the corresponding wild-type oxygenase; in other embodiments the variant oxygenases disclosed herein are instead bindingly distinguishable from the corresponding wild-type and bindingly distinguishable from one another. The wording "bindingly distinguishable" as used herein with reference to molecules indicates molecules that are distinguishable based on their ability to specifically bind to, and are thereby defined as complementary to, a specific molecule. Accordingly, a first oxygenase is bindingly distinguishable from a second oxygenase if the first oxygenase specifically binds and is thereby defined as complementary to a first substrate and the second oxygenase specifically binds and is thereby defined as complementary to a second substrate, with the first substrate distinct from the second substrate. In some embodiments, the variant oxygenase disclosed herein has an increased enzyme half-life in vivo, a reduced antigenicity, and/or an increased storage stability when compared to the corresponding wild-type oxygenase.

In some embodiments, the oxygenase is a heme-containing oxygenase or a variant thereof. The wording "heme" or "heme domain" as used herein refers to an amino acid sequence within an oxygenase, which is capable of binding an iron-complexing structure such as a porphyrin. Compounds of iron are typically complexed in a porphyrin (tetrapyrrole) ring that may differ in side chain composition. Heme groups can be the prosthetic groups of cytochromes and are found in most oxygen carrier proteins. Exemplary heme domains include that of P450, as well as truncated or mutated versions of these that retain the capability to bind the iron-complexing structure. A skilled person can identify the heme domain of a specific protein using methods known in the art. Exemplary organic molecules that can be oxidized by heme-containing oxygenases include C5-C22 alkanes, fatty acids, steroids, terpenes, aromatic hydrocarbons, polyketides, prostaglandins, statins, amino acids, flavonoids, and stilbenes.

In particular, in some embodiments the "heme-containing [...] heme-ligand loop containing the P450 signature sequence SEQ ID NO: 1. Helices D and E are also indicated in FIG. 3.

P450 enzymes are known to be involved in metabolism of exogenous and endogenous compounds. In particular, P450 enzymes can act as terminal oxidases in multicomponent electron transfer chains, called here P450-containing systems. Reactions catalyzed by cytochrome P450 enzymes include hydroxylation, epoxidation, N-dealkylation, O-dealkylation, S-oxidation and other less common transformations. The most common reaction catalyzed by P450 enzymes is the monooxygenase reaction using molecular oxygen (O2), where one atom of oxygen is inserted into a substrate while the other is reduced to water.

P450 monooxygenases can catalyze the monooxygenation of a variety of structurally diverse substrates. Exemplary substrates that can be oxidized by naturally-occurring P450s include Cs-C alkanes, cyclic alkanes, cyclic alkenes, alkane derivatives, alkene derivatives, Co-Co fatty acids, steroids, terpenes, aromatic hydrocarbons, natural products and natural product analogues such as polyketides, prostaglandins, thromboxanes, leukotrienes, anthraquinones, tetracyclines, anthracyclines, polyenes, statins, amino acids, flavonoids, stilbenes, alkaloids (e.g. lysine-derived, nicotinic acid-derived, tyrosine-derived, tryptophan-derived, anthranilic acid-derived, histidine-derived, -derived alkaloids), beta-lactams, aminoglycosides, polymyxins, quinolones, synthetic derivatives such as aromatic heterocyclic derivatives (e.g. phenyl-, -, pyridine-, piperidine-, pyrrole-, furan-, triazole-, thiophene-, pyrazole-, imidazole-, tetrazole-, oxazole-, isoxazole-, thiazole-, isothiazole-, pyran-, pyridazine-, pyrazine-, piperazine-, thiazine-, and oxazine-derivatives), and the like.

Naturally-occurring P450 monooxygenases have also been mutated in their primary sequence to favor their activity towards other non-native substrates such as short-chain fatty acids, 8- and 12-pNCA, indole, aniline, p-nitrophenol, polycyclic hydrocarbons (e.g. indole, naphthalene), styrene, medium- and short-chain alkanes, alkenes (e.g. cyclohexene, 1-hexene, styrene, benzene), quinoline, steroid derivatives, and various drugs (e.g. chlorzoxazone, propranolol, amodiaquine, dextromethorphan, acetaminophen, ifosfamide, cyclophosphamide, benzphetamine, buspirone, MDMA).
oxygenase' is a cytochrome P450 enzyme (herein also indi P450 monooxygenases Suitable in the methods and sys cates as CYPs or P450s) or a variant thereof. The wording tems disclosed herein include cytochrome P450 monooxyge “P450 enzymes' indicates a group of heme-containing oxy 45 nases (EC 1.14.14.1) from different sources (bacterial, fungi, genases that share a common overall fold and topology yeast, plant, mammalian, and human), and variants thereof. despite less than 20% sequence identity across the corre Exemplary P450 monooxygenases suitable in the methods sponding gene superfamily (Denisov, Makris et al. 2005). In and systems disclosed herein include members of CYP102A particular, the P450 enzymes share a conserved P450 struc subfamily (e.g. CYP102A1, CYP102A2, CYP102A3, tural core, which binds to the heme group and comprises a 50 CYP102A5), members of CYP101A subfamily (e.g. P450 signature sequence. The conserved P450 structural core CYP101A1), members of CYP102e subfamily (e.g. is formed by a four-helix bundle composed of three parallel CYP102E1), members of CYP1A subfamily (e.g. CYP1A1, helices (usually labeled D. L., and I), and one antiparallel helix CYP1A2), members of CYP2A subfamily (e.g. CYP2A3, (usually labeled as helix E) (Presnell and Cohen 1989) and by CYP2A4, CYP2A5, CYP2A6, CYP2A12, CYP2A13), mem a Cys heme-ligand loop which includes a conserved cysteine 55 bers of CYP1B subfamily (e.g. CYP1B1), members of that binds to the heme group and the P450 signature. In CYP2B subfamily (e.g. CYP2B6), members of CYP2C sub particular, the conserved cysteine that binds to the heme family (e.g. CYP2C8, CYP2C9, CYP2C10, CYP2C18, group is the proximal or “fifth ligand to the heme iron and the CYP2C19) members of CYP2D subfamily (e.g. CYP2D6), relevant ligand group (a thiolate) is the origin of the charac members of CYP3A subfamily (e.g. 
CYP3A4, CYP3A5, teristic name giving 450-nm Soret absorbance observed for 60 CYP3A7, CYP3A43), members of CYP107A subfamily (e.g. the ferrous-CO complex (Pylypenko and Schlichting 2004). CYP107A1), and members of CYP153 family (e.g. The P450 signature sequence is the sequence indicated in the CYP153A1, CYP153A2, CYP153A6, CYP153A7, enclosed sequence listing as SEQ ID NO: 1. FIG. 3 is a rep CYP153A8, CYP153A11, CYP153D3, and CYP153D2, resentation of the P450 structural core of bacterial P450. (van Beilen and Funhoff 2007)). Exemplary organic mol In the illustration of FIG. 3, the prosthetic heme group 65 ecules oxidizable by P450 monooxygenases include Cs-C (heme') is located between the distal I helix (helix I) and alkanes, cyclic alkanes, cyclic alkenes, alkane derivatives, proximalL helix (helix L) and is bound to the adjacent Cys alkene derivatives, Co-Co fatty acids, steroids, terpenes, US 8,026,085 B2 10 aromatic hydrocarbons, natural products and natural product vonoids, Stilbenes, alkaloids (e.g. lysine-derived, nicotinic analogues such as polyketides, prostaglandines, thrombox acid-derived, tyrosine-derived, tryptophan-derived, anthra anes, leukotrienes, anthraquinones, tetracyclines, anthracy nilic acid-derived, histidine-derived, purine-derived alka clines, polyenes, statins, amino acids, flavonoids, stilbenes, loids), beta-lactams, aminoglycosides, polymyxins, quinolo alkaloids (e.g. lysine-derived, nicotinic acid-derived, 5 nes, synthetic derivatives such as aromatic heterocyclic tyrosine-derived, tryptophan-derived, anthranilic acid-de derivatives (e.g. phenyl-, pyrimidine-, pyridine-, piperidine-, rived, histidine-derived, purine-derived alkaloids), beta-lac pyrrole-, furan-, triazol-, thiophene-, pyrazole-, imidazole-, tams, aminoglycosides, polymyxins, quinolones, synthetic tetrazole-, oxazole-, isoxazole-, thiazole-, isothiazole-, derivatives such as aromatic heterocyclic derivatives (e.g. 
pyran-, pyridazine-, pyrazine-, piperazine-, thiazine-, and phenyl-, pyrimidine-, pyridine-, piperidine-, pyrrole-, furan-, 10 oxazine-derivatives), and the like. triazol-, thiophene-, pyrazole-, imidazole-, tetrazole-, In particular, in Some embodiments P450 monooxygenases oxazole-, isoxazole-, thiazole-, isothiazole-, pyran-, suitable for the methods and systems disclosed herein include pyridazine-, pyrazine-, piperazine-, thiazine-, and oxazine CYP102A1 (SEQ ID NO: 2) and variants thereof, wherein derivatives), and the like. none, one or more of the amino acids that are located within Other exemplary P450 monooxygenases suitable in the 15 50 A from the heme iron are mutated to any other of the methods and systems disclosed herein include CYP106A2, natural aminoacids or mutated to an unnatural amino acid or CYP2F1, CYP2J2, CYP2R1, CYP2S1, CYP2U1, CYP2W1, modified in Some way so to alter the properties of the enzyme. CYP4A11, CYP4A22, CYP4B1, CYP4F2, CYP4F3, Examples of amino acid positions that can be modified in CYP4F8, CYP4F11, CYP4F12, CYP4F22, CYP4V2, CYP102A1 to produce a P450 monooxygenase suitable in the CYP4X1, CYP4Z1, CYP5A1, CYP7A1, CYP7B1, methods and systems disclosed herein include without limi CYP8A1, CYP8B1, CYP11A1, CYP11B1, CYP11B2, tations: 25, 26, 42,47,51,52, 58,74, 75, 78,81, 82,87, 88,90, CYP17A1, CYP19A1, CYP20A1, CYP21A2, CYP24A1, 94, 96,102, 106, 107, 108, 118, 135,138, 142,145, 152, 172, CYP26A1, CYP26B1, CYP26C1, CYP27A1, CYP27C1, 173, 175, 178, 180, 181, 184, 185, 188, 197, 199, 205, 214, CYP39A1, CYP46A1, and CYP51A1. 226, 231, 236, 237, 239, 252, 255, 260, 263,264, 265, 268, In particular, in some embodiments P450 monooxygenases 25 273, 274, 275,290, 295,306, 324, 328, 354, 366, 398, 401, suitable in the methods and systems disclosed herein include 430, 433, 434, 437, 438, 442, 443, 444, and 446. 
CYP102A1 (also called P450) from Bacillus megaterium In particular, in Some embodiments, P450 monooxygena (SEQID NO: 2), CYP102A2 from Bacillus subtilis (SEQID ses Suitable in the methods and system disclosed herein are NO:3), CYP102A3 from Bacillus subtilis (SEQID NO: 4), selected from the group consisting of CYP102A1 (SEQ ID CYP102A5 from Bacillus cereus (SEQ ID NO. 5), 30 NO:2) and variants thereof including CYP102Alvarl (SEQ CYP102E1 from Ralstonia metallidurans (SEQ ID NO: 6), ID NO: 21), CYP102A1var2 (SEQ ID NO: 22), CYP102A6 from Bradyrhizobium japonicum (SEQID NO: CYP102Alvar3 (SEQ ID NO. 23), CYP102Alvar3-2 (SEQ 7), CYP101A1 (also called P450cam) from Pseudomonas ID NO:24), CYP102A1 var3-3(SEQ ID NO: 25), putida (SEQ ID NO: 8), CYP106A2 (also called P450meg) CYP102A1 var3-4(SEQID NO: 26), CYP102Alvar3-5(SEQ from Bacillus megaterium (SEQIDNO:9), CYP153A6(SEQ 35 ID NO: 27), CYP102A1var3-6(SEQ ID NO: 28), IDNO:54), CYP153A7 (SEQIDNO:55), CYP153A8(SEQ CYP102A1var3-7(SEQID NO: 29), CYP102A1var3-8(SEQ ID NO. 56), CYP153A11(SEQ ID NO. 57), CYP153D2 ID NO: 30), CYP102A1var3-9(SEQ ID NO: 31), (SEQ ID NO. 58), CYP153D3(SEQ ID NO. 59), P450cin CYP102A1 var3-10(SEQ ID NO: 32), CYP102A1 var3-11 from Citrobacter brakii (SEQ ID NO: 10), P450terp from (SEQ ID NO: 33), CYP102A1 var3-12(SEQ ID NO. 34), Pseudomonas sp. (SEQID NO: 11), P450eryF from Saccha 40 CYP102A1var3-13(SEQ ID NO: 35), CYP102A1var3-14 ropolyspora erythreae (SEQID NO: 12), CYP1A2 (SEQ ID (SEQ ID NO:36), CYP102A1var3-15(SEQ ID NO: 37), NO: 13), CYP2C8 (SEQID NO:14), CYP2C9 (SEQID NO: CYP102A1var3-16(SEQ ID NO: 38), CYP102A1var3-17 15), CYP2C19 (SEQ ID NO: 16), CYP2D6 (SEQ ID NO: (SEQ ID NO. 39), CYP102A1 var3-18(SEQ ID NO: 40), 17), CYP2E1 (SEQID NO: 18), CYP2F1 (SEQID NO:19), CYP102A1 var3-19(SEQ ID NO: 41), CYP102A1 var3-20 CYP3A4 (SEQ ID NO: 20), CYP153-AlkBurk from Alca 45 (SEQ ID NO: 42) CYP102A1 var3-21(SEQ ID NO: 43), nivorax borkumensis (SEQ ID NO: 60), CYP153-EB104 CYP102A1 var3-22 (SEQ ID NO. 
44), CYP102A1 var3-23 from Acinetobacter sp. EB104 (SEQ ID NO: 61), CYP153 (SEQ ID NO: 45), CYP102A1 var4 (SEQ ID NO: 46) OC4 from Acinetobacter sp. OC4(SEQ ID NO: 62), and CYP102A1varS (SEQID NO:47), CYP102A1 var6 (SEQ ID variants thereof. Exemplary organic molecules that can be NO:48), CYP102A1var7 (SEQID NO:49), CYP102Alvar8 oxidized by these P450 monooxygenases include branched 50 (SEQ ID NO:50), CYP102A1 var9 (SEQ ID NO. 51), and and linear Co-Co fatty acids, Co-Coalkanes, cyclic alkanes, CYP102A1var9-1 (SEQID NO:52) cyclic alkenes, alkane derivatives, alkene derivatives, Ste The above variants are illustrated in particular in the fol roids, terpenes, aromatic hydrocarbons, natural products and lowing Table 1 wherein the respective sequences are reported natural product analogues such as polyketides, prostagland in the enclosed Sequence Listing and the mutations of each ines, thromboxanes, leukotrienes, anthraquinones, tetracy 55 variant with respect to the wild type (SEQ ID NO: 2) are clines, anthracyclines, polyenes, statins, amino acids, fla listed. TABLE 1

Name            Sequence        Mutation(s) with respect to CYP102A1
CYP102A1        SEQ ID NO: 2
CYP102A1var1    SEQ ID NO: 21   V78A, H138Y, T175I, V178I, A184V, H236Q, E252G, R255S, A290V, A295T, L353V
CYP102A1var2    SEQ ID NO: 22   V78A, T175I, A184V, F205C, S226R, H236Q, E252G, R255S, A290V, L353V
CYP102A1var3    SEQ ID NO: 23   R47C, V78A, K94I, P142S, T175I, A184V, F205C,
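
The mutation entries in Table 1 follow the usual notation of wild-type residue, position, and replacement residue (for example, V78A). As an illustration only, the sketch below shows one way such entries could be parsed and applied to a sequence. It is not part of the disclosed methods. The names `Mutation`, `parse_mutations`, and `apply_mutations` are hypothetical, the toy sequence is not any sequence from the Sequence Listing, and positions are assumed to be 1-based indices into whatever sequence string is supplied.

```python
import re
from typing import List, NamedTuple

# One-letter codes of the 20 natural amino acids.
_AA = "ACDEFGHIKLMNPQRSTVWY"
_PATTERN = re.compile(r"^([%s])(\d+)([%s])$" % (_AA, _AA))


class Mutation(NamedTuple):
    """Hypothetical container for one point mutation, e.g. V78A."""
    wild_type: str
    position: int  # assumed 1-based position in the supplied sequence
    mutant: str


def parse_mutations(text: str) -> List[Mutation]:
    """Parse a comma-separated list such as 'V78A, H138Y' into Mutation records."""
    mutations = []
    for token in text.split(","):
        token = token.strip()
        if not token:  # tolerate a trailing comma (e.g. a truncated table row)
            continue
        match = _PATTERN.match(token)
        if match is None:
            raise ValueError("unrecognized mutation entry: %r" % token)
        mutations.append(Mutation(match.group(1), int(match.group(2)), match.group(3)))
    return mutations


def apply_mutations(sequence: str, mutations: List[Mutation]) -> str:
    """Return a copy of `sequence` with each point mutation applied.

    Raises ValueError if a position is out of range or the residue found at
    that position does not match the stated wild-type residue.
    """
    residues = list(sequence)
    for mut in mutations:
        index = mut.position - 1
        if index < 0 or index >= len(residues):
            raise ValueError("position %d is outside the sequence" % mut.position)
        if residues[index] != mut.wild_type:
            raise ValueError(
                "expected %s at position %d, found %s"
                % (mut.wild_type, mut.position, residues[index])
            )
        residues[index] = mut.mutant
    return "".join(residues)


if __name__ == "__main__":
    # Toy four-residue sequence used purely for illustration.
    print(apply_mutations("MKVL", parse_mutations("V3A")))  # prints "MKAL"
```

The wild-type check in `apply_mutations` is a deliberate design choice: it catches numbering mismatches, such as an offset caused by counting or not counting an initial methionine, before a mutation is silently applied at the wrong position.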

In some embodiments, the P450 monooxygenases listed in Table 1 are provided as oxygenating agents for the methods and systems disclosed herein, wherein the organic molecules include branched and linear Co-Co fatty acids, C-Co alkanes, cyclic alkanes, cyclic alkenes, alkane derivatives, alkene derivatives, steroids, terpenes, aromatic hydrocarbons, prostaglandines, aromatic heterocyclic derivatives such as phenyl-, pyrimidine-, pyridine-, piperidine-, pyrrole-, furan-, triazol-, thiophene-, pyrazole-, imidazole-, tetrazole-, oxazole-, isoxazole-, thiazole-, isothiazole-, pyran-, pyridazine-, pyrazine-, piperazine-, thiazine-, and oxazine-derivatives.

In some embodiments P450 monooxygenases suitable in the methods and systems disclosed herein include CYP102A2 from Bacillus subtilis (SEQ ID NO: 3), and variants thereof, wherein none, one or more of the amino acids that are located within 50 Å from the heme iron are mutated to any other of the natural amino acids or mutated to an unnatural amino acid or modified in some way so as to alter the properties of the enzyme.

In particular, in some embodiments, P450 monooxygenases suitable in the methods and system disclosed herein are selected from the group consisting of CYP102A2 (SEQ ID NO: 3) and variants thereof including CYP102A2var1 (SEQ ID NO: 63). The above variants are illustrated in particular in the following Table 2 wherein the mutations of each variant with respect to the wild type (SEQ ID NO: 3) are listed.

TABLE 2

Name            Sequence        Mutation(s) with respect to CYP102A2
CYP102A2        SEQ ID NO: 3
CYP102A2var1    SEQ ID NO: 63   F88A

In some embodiments P450 monooxygenases suitable for the methods and systems disclosed herein include CYP102A3 from Bacillus subtilis (SEQ ID NO: 4), and variants thereof, wherein none, one or more of the amino acids that are located within 50 Å from the heme iron are mutated to any other of the natural amino acids or mutated to an unnatural amino acid or modified in some way so as to alter the properties of the enzyme.

In particular, in some embodiments P450 monooxygenases suitable in the methods and systems disclosed herein are selected from the group consisting of CYP102A3 (SEQ ID NO: 4) and variants thereof including CYP102A3var1 (SEQ ID NO: 64). The above variants are illustrated in particular in the following Table 3 wherein the mutations of each variant with respect to the wild type (SEQ ID NO: 4) are listed.

TABLE 3

Name            Sequence        Mutation(s) with respect to CYP102A3
CYP102A3        SEQ ID NO: 4
CYP102A3var1    SEQ ID NO: 64   F88A

In particular, in some embodiments P450 monooxygenases suitable in the methods and systems disclosed herein include CYP101A1 (also called P450cam) from Pseudomonas putida (SEQ ID NO: 8) and variants thereof, wherein none, one or more of the amino acids that are located within 50 Å from the heme iron are mutated to any other of the natural amino acids or mutated to an unnatural amino acid or modified in some way so as to alter the properties of the enzyme.

In particular, in some embodiments, P450 monooxygenases suitable in the methods and system disclosed herein are selected from the group consisting of CYP101A1 (SEQ ID NO: 8) and variants thereof including CYP101A1var1 (SEQ ID NO: 65), CYP101A1var2 (SEQ ID NO: 66), CYP101A1var2-1 (SEQ ID NO: 67), CYP101A1var2-2 (SEQ ID NO: 68), and CYP101A1var2-3 (SEQ ID NO: 69).

The above variants are illustrated in particular in the following Table 4 wherein the mutations of each variant with respect to the wild type (SEQ ID NO: 8) are listed.

TABLE 4

Name              Sequence        Mutation(s) with respect to CYP101A1
CYP101A1          SEQ ID NO: 8
CYP101A1var1      SEQ ID NO: 65   Y96A
CYP101A1var2      SEQ ID NO: 66   Y96F
CYP101A1var2-1    SEQ ID NO: 67   Y96F, F87W
CYP101A1var2-2    SEQ ID NO: 68   Y96F, V247L
CYP101A1var2-3    SEQ ID NO: 69   F87W, Y96F, V247L

In some embodiments, the P450 enzyme is included in a P450-containing system, a system including a P450 enzyme and one or more proteins that deliver one or more electrons to the heme iron in the P450 enzyme. Natural P450-containing systems occur according to the following general schemes:

CYP reductase (CPR)/cytochrome b5 (cyb5)/P450 systems, typically employed by eukaryotic microsomal (i.e., not mitochondrial) CYPs; they involve the reduction of cytochrome P450 reductase (variously CPR, POR, or CYPOR) by NADPH, and the transfer of reducing power as electrons to the CYP. Cytochrome b5 (cyb5) can also contribute reducing power to this system after being reduced by cytochrome b5 reductase (CYB5R);

Ferredoxin Reductase (FdxR) or Putidaredoxin Reductase (PdxR)/Ferredoxin (Fdx) or Putidaredoxin (Pdx)/P450 systems, typically employed by mitochondrial and some bacterial CYPs. Reducing electrons from a soluble cofactor, typically NADPH or NADH, are transferred through the reductase to the electron carrier, Fdx or Pdx, and transferred from the electron carrier to the P450 component;

P450-CPR fusion systems, where the CYP domain is naturally fused to the electron donating partners. An example of these systems is represented by cytochrome P450BM3 (CYP102A1) from the soil bacterium Bacillus megaterium;

CYB5R/cyb5/P450 systems, where both electrons required by the CYP derive from cytochrome b5;

FMN/Fd/P450 systems, where a FMN-domain-containing reductase is fused to the CYP. This type of system was originally found in Rhodococcus sp.; and

P450 only systems, which do not require external reducing power. These include CYP5 (thromboxane synthase), CYP8, prostacyclin synthase, and CYP74A (allene oxide synthase).

In some embodiments, the oxidizing agent is a non-heme containing monooxygenase, i.e. a monooxygenase that is able to function without a heme prosthetic group. These monooxygenases include but are not limited to flavin monooxygenases, pterin-dependent non-heme monooxygenases, non-heme diiron monooxygenases, and diiron hydroxylases. In these enzymes, oxygen activation occurs at a site in the enzyme's structural fold that is covalently or non-covalently bound to a flavin cofactor, a pterin cofactor, or a diiron cluster. Examples of non-heme containing monooxygenases include but are not limited to ω-hydroxylases (n-octane ω-hydroxylase, n-decane ω-hydroxylases, 9-α-hydroxylase, and AlkB), styrene monooxygenase, butane monooxygenases, propane monooxygenases, and methane monooxygenases. Non-heme containing monooxygenases catalyze the monooxygenation of a variety of structurally diverse substrates. Exemplary substrates accepted by progesterone 9-α-hydroxylase from Nocardia sp. include steroid derivatives. Exemplary substrates accepted by non-heme monooxygenases such as integral membrane di-iron alkane hydroxylases (e.g. AlkB), soluble di-iron methane monooxygenases (sMMO), di-iron propane monooxygenases, di-iron butane monooxygenases, membrane-bound copper-containing methane monooxygenases, styrene monooxygenase, xylene monooxygenase include C-C linear and branched alkanes, alkenes, and aromatic hydrocarbons.

In some embodiments, the oxidizing agent is a dioxygenase or a variant thereof and in particular a dioxygenase involved in the catabolism of aromatic hydrocarbons. Dioxygenases are a class of oxygenase enzymes that incorporate both atoms of molecular oxygen (O2) onto the substrate according to the general scheme of reaction:

[Scheme: an R-substituted aromatic substrate is converted by a dioxygenase to a product bearing two OH groups]

Dioxygenases are metalloproteins, and activation of molecular oxygen is carried out in a site within the structural fold of the enzyme that is covalently or non-covalently bound to one or more metal atoms. The metal is typically iron, manganese, or copper. Examples of dioxygenases include catechol dioxygenases, toluene dioxygenases, biphenyl dioxygenases. Catechol dioxygenases catalyze the oxidative cleavage of catechols and have different substrate specificities, including catechol 1,2-dioxygenase (EC 1.13.11.1), catechol 2,3-dioxygenase (EC 1.13.11.2), and protocatechuate 3,4-dioxygenase (EC 1.13.11.3). Toluene dioxygenase and biphenyl dioxygenases are involved in the natural degradation of aromatic compounds and typically introduce two oxygen atoms across a double bond in aromatic or non-aromatic compounds. Dioxygenases, e.g. toluene dioxygenase, can be engineered to accept substrates for which the wild-type enzyme shows only basal or no activity, e.g. 4-picoline (Sakamoto, Joern et al. 2001). Potentially suitable substrates for dioxygenase enzymes include but are not restricted to substituted or non-substituted monocyclic, polycyclic, and heterocyclic aromatic compounds. On these substrates, the dioxygenase can introduce one or more cis dihydrodiol functional groups.

In some embodiments, the oxidizing agent is a peroxygenase. Natural peroxygenases are heme-dependent oxidases that are distinct from cytochrome P450 enzymes and peroxidases in that they only accept peroxides, in particular hydrogen peroxide, as the source of oxidant. Natural peroxygenases are typically membrane-bound and can catalyze hydroxylation reactions of aromatics, sulfoxidations of xenobiotics, or epoxidations of unsaturated fatty acids. In contrast to cytochrome P450 monooxygenases, peroxygenase activity does not require any cofactor such as NAD(P)H and does not use molecular oxygen. Examples are the plant peroxygenase (PXG) (Hanano, Burcklen et al. 2006), soybean peroxygenase (Blee, Wilcox et al. 1993), and oat seed peroxygenase.

In some embodiments, the peroxygenase is a cytochrome P450. P450s can also use peroxides as oxygen donors. This constitutes the so-called peroxide shunt pathway and the enzyme does not need a reductase and NAD(P)H to carry out catalysis. Normally, this peroxide-driven reaction in P450s is not significant. However, mutations in the heme domain of P450 enzymes can enhance their latent peroxygenase activity, as in the case of P450cam (Joo, Lin et al. 1999) and P450BM3 (Cirino and Arnold 2003). Using three engineered P450 enzymes, namely CYP102A1, CYP102A2 and CYP102A3, that are capable of peroxygenase activity, a library of ~6000 peroxygenase chimeras was created by site-directed recombination (Otey, Landwehr et al. 2006).

Naturally-occurring P450 peroxygenases also exist. P450BSβ (CYP152A1) and P450SPα (CYP152B1), recently isolated from Bacillus subtilis and Sphingomonas paucimobilis (Matsunaga, Sumimoto et al. 2002; Matsunaga, Yamada et al. 2002), efficiently utilize H2O2 to hydroxylate fatty acids, prevalently in α and β positions.

Exemplary peroxygenases suitable in the methods and system disclosed herein include but are not limited to natural heme-containing peroxygenases, natural P450 peroxygenases, engineered P450s with peroxygenase activity, and P450 peroxygenase chimeras described in more detail in the work of Arnold and co-workers (Otey, Landwehr et al. 2006). These peroxygenases show activity on a variety of substrates including fatty acids, 8- and 12-pNCA, indole, aniline, p-nitrophenol, heterocyclic derivatives (e.g. chlorzoxazone, buspirone), statins, and naphthyl derivatives.

Other suitable oxidizing agents for the systems and methods disclosed herein are peroxidases (EC number 1.11.1.x). Sequences of the peroxidase enzymes identified so far can be found in the PeroxiBase database. Peroxidases typically catalyze a reaction of the form: ROOR' + electron donor (2 e-) + 2H+ -> ROH + R'OH. For most peroxidases the optimal oxygen-providing compound is hydrogen peroxide, but others are more active with organic hydroperoxides such as lipid peroxides. Peroxidases can contain a heme cofactor in their active sites, or redox-active cysteine or selenocysteine residues. The nature of the electron donor is very dependent on the structure of the enzyme. For example, horseradish peroxidase can use a variety of organic compounds as electron donors and acceptors. Horseradish peroxidase has an accessible active site and many compounds can reach the site of the reaction. In contrast, cytochrome c peroxidase has a much more restricted active site, and the electron-donating compounds are very specific. Glutathione peroxidase is a peroxidase found in humans, which contains selenocysteine. It uses glutathione as an electron donor and is active with both hydrogen peroxide and organic hydroperoxide substrates.

In some embodiments the organic molecule has the structure of formula (I)

(I)

in which X=C atom is the target site, and R1, R2, and R3 are independently selected from the group consisting of hydrogen, aliphatic, aryl, substituted aliphatic, substituted aryl, heteroatom-containing aliphatic, heteroatom-containing aryl, substituted heteroatom-containing aliphatic, substituted heteroatom-containing aryl, alkoxy, aryloxy, and functional groups (FG) or are taken together to form a ring, such that the carbon atom is a secondary or tertiary carbon atom.

The term "aliphatic" is used in the conventional sense to refer to an open-chain or cyclic, linear or branched, saturated or unsaturated hydrocarbon group, including but not limited to alkyl group, alkenyl group and alkynyl groups. The term "heteroatom-containing aliphatic" as used herein refers to an aliphatic moiety where at least one carbon atom is replaced with a heteroatom.

The terms "alkyl" and "alkyl group" as used herein refer to a linear, branched, or cyclic saturated hydrocarbon typically containing 1 to 24 carbon atoms, preferably 1 to 12 carbon atoms, such as methyl, ethyl, n-propyl, isopropyl, n-butyl, isobutyl, t-butyl, octyl, decyl and the like. The term "heteroatom-containing alkyl" as used herein refers to an alkyl moiety where at least one carbon atom is replaced with a heteroatom, e.g. oxygen, nitrogen, sulphur, phosphorus, or silicon, and typically oxygen, nitrogen, or sulphur.

The terms "alkenyl" and "alkenyl group" as used herein refer to a linear, branched, or cyclic hydrocarbon group of 2 to 24 carbon atoms, preferably of 2 to 12 carbon atoms, containing at least one double bond, such as ethenyl, n-propenyl, isopropenyl, n-butenyl, isobutenyl, octenyl, decenyl, and the like. The term "heteroatom-containing alkenyl" as used herein refers to an alkenyl moiety where at least one carbon atom is replaced with a heteroatom.

The terms "alkynyl" and "alkynyl group" as used herein refer to a linear, branched, or cyclic hydrocarbon group of 2 to 24 carbon atoms, preferably of 2 to 12 carbon atoms, containing at least one triple bond, such as ethynyl, n-propynyl, and the like. The term "heteroatom-containing alkynyl" as used herein refers to an alkynyl moiety where at least one carbon atom is replaced with a heteroatom.

The terms "aryl" and "aryl group" as used herein refer to an aromatic substituent containing a single aromatic ring or multiple aromatic rings that are fused together, directly linked, or indirectly linked (such as linked through a methylene or an ethylene moiety). Preferred aryl groups contain 5 to 24 carbon atoms, and particularly preferred aryl groups contain 5 to 14 carbon atoms. The term "heteroatom-containing aryl" as used herein refers to an aryl moiety where at least one carbon atom is replaced with a heteroatom.

The terms "alkoxy" and "alkoxy group" as used herein refer to an aliphatic group or a heteroatom-containing aliphatic group bound through a single, terminal ether linkage. Preferred alkoxy groups contain 1 to 24 carbon atoms, and particularly preferred alkoxy groups contain 1 to 14 carbon atoms.

The terms "aryloxy" and "aryloxy group" as used herein refer to an aryl group or a heteroatom-containing aryl group bound through a single, terminal ether linkage. Preferred aryloxy groups contain 5 to 24 carbon atoms, and particularly preferred aryloxy groups contain 5 to 14 carbon atoms.

The terms "halo" and "halogen" are used in the conventional sense to refer to a fluoro, chloro, bromo or iodo substituent.

By "substituted" it is intended that in the alkyl, alkenyl, alkynyl, aryl, or other moiety, at least one hydrogen atom is replaced with one or more non-hydrogen atoms. Examples of such substituents include, without limitation: functional groups referred to herein as "FG", such as alkoxy, aryloxy, alkyl, heteroatom-containing alkyl, alkenyl, heteroatom-containing alkenyl, alkynyl, heteroatom-containing alkynyl, aryl, heteroatom-containing aryl, alkoxy, heteroatom-containing alkoxy, aryloxy, heteroatom-containing aryloxy, halo, hydroxyl (-OH), sulfhydryl (-SH), substituted sulfhydryl, carbonyl (-CO-), thiocarbonyl (-CS-), carboxy (-COOH), amino (-NH2), substituted amino, nitro (-NO2), nitroso (-NO), sulfo (-SO2-OH), cyano (-C≡N), cyanato (-O-C≡N), thiocyanato (-S-C≡N), formyl (-CO-H), thioformyl (-CS-H), phosphono (-P(O)(OH)2), substituted phosphono, and phospho (-PO2).

In particular, the substituents R1, R2 and R3 of formula I can be independently selected from hydrogen, C-C alkyl, C-C substituted alkyl, C-C heteroatom-containing alkyl, C-C substituted heteroatom-containing alkyl, C-C alkenyl, C-C substituted alkenyl, C-C heteroatom-containing alkenyl, C-C substituted heteroatom-containing alkenyl, Cs-C aryl, Cs-C2 substituted aryl, C5-C24 heteroatom-containing aryl, Cs-C substituted heteroatom-containing aryl, C1-C4 alkoxy, Cs-C2 aryloxy, carbonyl, thiocarbonyl, and carboxy. More in particular, R1, R2 and R3 of formula I can be independently selected from hydrogen, C-C alkyl, C-C substituted alkyl, C-C heteroatom-containing alkyl, C-C substituted heteroatom-containing alkyl, C-C alkenyl, C-C substituted alkenyl, C-C heteroatom-containing alkenyl, C-C substituted heteroatom-containing alkenyl, Cs-C aryl, Cs-C substituted aryl, Cs-C heteroatom-containing aryl, Cs-C substituted heteroatom-containing aryl, C-C alkoxy, Cs-C aryloxy, carbonyl, thiocarbonyl, and carboxy.

Oxidizing agents known or expected to react with the target site of a compound of Formula (I) include but are not limited to oxygenases or variants thereof.

In some embodiments, the oxygenase can be a non-heme monooxygenase or a variant thereof, a heme-containing monooxygenase or a variant thereof, a peroxygenase or a variant thereof, such as any of the heme-containing monooxygenases, non-heme-containing monooxygenases and peroxygenases disclosed herein. In particular, the oxygenase can be any of the P450 monooxygenases and P450 peroxygenases disclosed herein.

In some embodiments, the oxygenase or variant thereof can be butane monooxygenase, CYP102A1 (SEQ ID NO: 2), CYP102A1var4 (SEQ ID NO: 46), CYP102A1var8 (SEQ ID NO: 50), CYP102A1var1 (SEQ ID NO: 21), CYP102A1var2 (SEQ ID NO: 22), CYP102A1var3 (SEQ ID NO: 23), CYP102A1var3-20 (SEQ ID NO: 42), CYP102A1var3-2 (SEQ ID NO: 44), CYP102A1var3-3 (SEQ ID NO: 25), CYP102A1var3-4 (SEQ ID NO: 26), CYP102A1var3-5 (SEQ ID NO: 27), CYP102A1var3-7 (SEQ ID NO: 29), CYP102A1var3-8 (SEQ ID NO: 30), CYP102A1var3-9 (SEQ ID NO: 31), CYP102A1var3-11 (SEQ ID NO: 33), CYP102A1var3-13 (SEQ ID NO: 35), CYP102A1var3-14 (SEQ ID NO: 36), CYP102A1var3-15 (SEQ ID NO: 37), CYP101A1 (SEQ ID NO: 8), CYP101A1var1 (SEQ ID NO: 65), CYP101A1var2-3 (SEQ ID NO: 69), CYP102A2 (SEQ ID NO: 3), CYP102A2var1 (SEQ ID NO: 63), CYP102A3 (SEQ ID NO: 4), CYP102A3var1 (SEQ ID NO: 64) and CYP153A6 (SEQ ID NO: 54), CYP153A7 (SEQ ID NO: 55), CYP153A8 (SEQ ID NO: 56), CYP153A11 (SEQ ID NO: 57), CYP153D2 (SEQ ID NO: 58), and/or CYP106A2 (SEQ ID NO: 9). In particular, in these embodiments at least one of said oxygenases or variants thereof is expected to activate the target site by introducing an oxygen-containing functional group in the form of a hydroxyl group. In these embodiments, the final products resulting from the application of the systems and methods disclosed herein can be (RRRCF), (RRCF), (RRCF), or (RRCF).

In some embodiments of the methods and systems disclosed herein, the organic molecule is a compound of Formula (I), in which R=H, -CH or =O, and/or R and R are connected together through 4, 5, 6, or 7-methylene moiety to form a ring, the oxidizing agent can be an oxygenase, such as a P450 monooxygenase, and in particular CYP102A1var1 (SEQ ID NO: 21), CYP102A1var2 (SEQ ID NO: 22), CYP102A1var3 (SEQ ID NO: 23), CYP102A1var3-7 (SEQ ID NO: 29), CYP101A1 (SEQ ID NO: 8), CYP101A1var1 (SEQ ID NO: 65), and/or CYP101A1var2-3 (SEQ ID NO: 69), and is expected to activate the target site of the corresponding compound of Formula (I) by introducing an oxygen-containing functional group in the form of a hydroxyl group.

In some embodiments of the methods and systems disclosed herein, the organic molecule is a compound of Formula (I), in which R=H, R=-Me, -Et, -Pr, or -iPr, and/or R=-(CH2)nCOOH with n between 9 and 15, the oxidizing agent can be an oxygenase such as a P450 monooxygenase, in particular CYP102A1 (SEQ ID NO: 2), CYP102A1var4 (SEQ ID NO: 46), CYP102A1var5 (SEQ ID NO: 47), CYP102A2 (SEQ ID NO: 3), CYP102A2var1 (SEQ ID NO: 63), CYP102A3 (SEQ ID NO: 4), and/or CYP102A3var1 (SEQ ID NO: 64), which is expected to activate the target site by introducing an oxygen-containing functional group in the form of a hydroxyl group.

In some embodiments of the methods and systems disclosed herein, the organic molecule is a compound of Formula (I), in which R=R=Me, R=CH-O-substituted Ph, activation can be performed by reacting the organic molecule with an oxygenase, such as a P450 monooxygenase, including CYP102A1 (SEQ ID NO: 2), CYP102A1var3-4 (SEQ ID NO: 26), CYP102A1var3-14 (SEQ ID NO: 36), CYP102A1var3-15 (SEQ ID NO: 37), CYP102A1var3-3

genases including CYP102A1var1 (SEQ ID NO: 21), CYP102A1var2 (SEQ ID NO: 22), CYP102A1var3-20 (SEQ ID NO: 42). In particular, when n=5, as in the case of cyclopentanecarboxylic acid derivatives, the compound of formula (I) can be activated with methods and systems disclosed herein wherein the oxidizing agent is a monooxygenase CYP102A1var8 (SEQ ID NO: 50). When instead n=6, as in the case of camphor, cyclohexane and cyclohexene, the compound of formula (I) can be activated with methods and systems disclosed herein wherein the oxidizing agent is a monooxygenase, such as a P450 monooxygenase including CYP101A1 (SEQ ID NO: 8), CYP153A6 (SEQ ID NO: 54), CYP153A7 (SEQ ID NO: 55), CYP153A8 (SEQ ID NO: 56), CYP153A11 (SEQ ID NO: 57), CYP153D3 (SEQ ID NO: 59) or CYP153D2 (SEQ ID NO: 58). In those embodiments, activation is known or expected to result in the introduction of a hydroxyl group in the target site.

In some embodiments of the methods and systems disclosed herein, the organic molecule is a compound of Formula (I), wherein R=H, and R and R are connected through 5 or 6 methylene moieties, so as to form a polycyclic unsaturated system, such as in steroids, activation can be performed by reacting the substrate with a monooxygenase such as a P450 monooxygenase including CYP106A2 (SEQ ID NO: 9), and the activation is expected to result in the introduction of a hydroxyl group in the target site.
(SEQ ID NO:25), CYP102A1 var3-2 (SEQ ID NO:24), In the compound of formula I, wherein R=H, R= - CYP102A1var3 (SEQ ID NO:23), CYP102A1var3-9 (SEQ CHCOOH, R-n-dodecyl, activation can be performed by ID NO:31), CYP102A1 warl (SEQ ID NO:21), and/or reacting the substrate with peroxygenase P450s CYP102Alvar2(SEQ ID NO:22), which introduce an 30 (CYP152A1) (SEQID NO:70), resulting in the introduction hydroxyl group in the target site, as exemplified in Examples of a hydroxyl group in the target site. 11 and illustrated in corresponding scheme 11. In some embodiments, the organic molecule has the struc In some embodiments of the methods and systems dis ture of formula (II) closed herein, the organic molecule is a compound of For mula (I), in which R-R-H, activation can be performed by 35 reacting the organic molecule with an oxygenase Such as a (II) P450 monooxygenase, including CYP153A6 (SEQ ID O NO:54), CYP153A7 (SEQID NO:55), CYP153A8(SEQID R6 NO:56), CYP153A11 (SEQ ID NO:57), CYP153D2 (SEQ R.S.-ly R4 ID NO:58), and/or CYP153D3 (SEQ ID NO:59), which are 40 expected to introduce a hydroxyl group on the target site H In some embodiments of the methods and systems dis closed herein, the organic molecule is a compound of For in which X is the target site C atom, and R. Rs, and R are mula (I), in which Ran-C-Cio alkyl (e.g. linear Co-Co independently selected from the group consisting of hydro alkanes), activation can be performed by an oxygenase Such 45 gen, aliphatic, aryl, Substituted aliphatic, Substituted aryl, as abutane monooxygenase, which is expected to introduce a heteroatom-containing aliphatic, heteroatom-containing hydroxyl group on the target site. 
aryl, Substituted heteroatom-containing aliphatic, Substituted In some embodiments of the methods and systems dis heteroatom-containing aryl, alkoxy, aryloxy, and functional closed herein, the organic molecule is a compound of For groups (FG) or are taken together to form a ring, Such that the mula (I), in which R cyclohexenyl (e.g. limonene), the oxi 50 carbon atom is a secondary or tertiary carbon atom. dating agent can be an oxygenase, Such as a P450 In particular, the substituents R. Rs and R of Formula (II) monooxygenase including CYP153A6 (SEQ ID NO:54), can be independently selected from hydrogen, C-C alkyl, CYP153A7 (SEQID NO:55), CYP153A8 (SEQID NO:56), C-C substituted alkyl, C-C heteroatom-containing CYP153A11 (SEQ ID NO:57), CYP153D2 (SEQ ID alkyl, C-C substituted heteroatom-containing alkyl, NO:58), and/or CYP153D3 (SEQ ID NO:59) which are 55 C-C alkenyl, C-C Substituted alkenyl, C-C heteroa expected to introduce a hydroxyl group on the target site. tom-containing alkenyl, C-C substituted heteroatom-con In some embodiments of the methods and systems dis taining alkenyl, Cs-Caryl, Cs-C2 substituted aryl, Cs-C24 closed herein, the organic molecule is a compound of For heteroatom-containing aryl, Cs-C. Substituted heteroatom mula (I), in which Ran-Cz, the oxidating agent can be a containing aryl, C1-C4 alkoxy, Cs-C2 aryloxy, carbonyl, monooxygenase Such as a P450 monooxygenases including 60 thiocarbonyl, carboxy, Sulfhydryl, amino, Substituted amino. CYP102Alvar3-13 (SEQID NO:35), which is expected to More in particular, R can be independently selected from introduce a hydroxyl group on the target site. hydrogen, C-C alkoxy, Cs-Caryloxy, amino, Substituted In some embodiments of the methods and systems dis amino, sulfhydryl, Substituted sulfhydryl, C-C alkyl, closed herein, the organic molecule is a compound of For C-C2 substituted alkyl, C-C heteroatom-containing mula (I), in which RH. R. 
and R are connected through in 65 alkyl, C-C Substituted heteroatom-containing alkyl, methylene moieties, activation can be performed by reacting C-C alkenyl, C-C Substituted alkenyl, C-C heteroa the Substrate with monooxygenases such as P450 monooxy tom-containing alkenyl, C-C Substituted heteroatom-con US 8,026,085 B2 21 22 taining alkenyl, Cs-Caryl, Cs-Ca. Substituted aryl, Cs-Ca mula (II), in which R is —OMe, —OEt, —OPr. —OBu, heteroatom-containing aryl, and C-C substituted heteroa —OtBu, Rs is hydrogen, and R is benzyl, o-chloro-phenyl, tom-containing aryl, while Rs and R are independently p-chloro-phenyl, or m-chloro-phenyl, o-methyl-phenyl, selected from hydrogen, C-C alkyl, C-C Substituted p-methyl-phenyl, or m-methyl-phenyl, o-methoxy-phenyl, alkyl, C-C heteroatom-containing alkyl, C-C Substi p-methoxy-phenyl, or m-methoxy-phenyl, the activation can tuted heteroatom-containing alkyl, C-C alkenyl, C-C2 be performed by reacting the Substrate with oxygenase Substituted alkenyl, C-C heteroatom-containing alkenyl, CYP102A1 var4(SEQID NO:46), CYP102A1 var3 (SEQID C-C Substituted heteroatom-containing alkenyl, Cs-Ca NO. 23), and CYP102A1 var3-7(SEQ ID NO: 29), as illus aryl, Cs-C. Substituted aryl, Cs-C heteroatom-containing trated in Examples 1,2,3 and 4 and corresponding schemes 1. aryl, Cs-C. Substituted heteroatom-containing aryl, C-C, 10 alkoxy, Cs-Caryloxy, carbonyl, thiocarbonyl, and carboxy. 2, 3, and 4. Oxidizing agents known or expected to react with the target In some embodiments of the methods and systems dis site of a compound of Formula (II) include but are not limited closed herein, the organic molecule is a compound of For to oxygenases or variants thereof mula (II), in which R is —OH. 
Rs is hydrogen, and R is a In some embodiments, the oxygenase can be a non-heme 15 linear C- alkyl chain, (for example a myristic acid), the monooxygenase or a variant thereof, a heme-containing activation can be performed by reacting the substrate with monooxygenase or a variant thereof, a peroxygenase or a peroxygenases P450s (CYP152A1) and P450s. variant thereof. Such as any of the heme-containing (CYP152B1), resulting in the introduction of a hydroxy monooxygenase, nonheme-containing monooxygenases and group in the target site. peroxygenases disclosed herein. In particular, the oxygenase In some embodiments of the methods and systems dis can be any of the P450 monooxygenases and P450 peroxy closed herein, the organic molecule is a compound of For genases disclosed herein. mula (II), in which Rs is —Me, and Ra and R are connected In some embodiments, the oxygenase or variant thereof through a 6-methylene ring, (for example a C-thujone), the can be a P450 monooxygenase or peroxygenase including activation can be performed by reacting the substrate with CYP102A1 (SEQ ID NO:2), CYP102A1var4 (SEQ ID 25 monooxygenases CYP101A1 (SEQ ID NO: 8), CYP102A1 NO:46), CYP102A1var8 (SEQID NO:50), CYP102A1 var1 (SEQID NO:2), CYP1A2(SEQID NO: 13), CYP2C9 (SEQ (SEQ ID NO:21), CYP102Alvar2(SEQ ID NO:22), ID NO: 14), CYP2C19(SEQ ID NO: 16), CYP2D6(SEQ ID CYP102A1var3(SEQ ID NO:23), CYP102A1var3-7(SEQ NO:17), CYP2E1(SEQ ID NO: 18), and CYP3A4(SEQ ID ID NO:9), CYP102A1var3-5 (SEQ ID NO:27), NO: 20), resulting in the introduction of a hydroxyl group in CYP102A1 var3-9(SEQ ID NO:31), CYP102A1 var3-14 30 the target site. 
(SEQ ID NO:36), CYP102A1var3-15 (SEQ ID NO:37), CYP102Alvar3-17(SEQ ID NO:39), CYP101A1(SEQ ID In some embodiments, the organic molecule has the struc NO:8), CYP101A1(Y96F), CYP101A1var2-1(SEQ ID ture of formula (III) NO:67), CYP101Alvar1(SEQ ID NO:65), CYP101Alvar2 2(SEQ ID NO:68), CYP1A2 (SEQ ID NO:13), CYP2C9 35 (III) (SEQ ID NO:15), CYP2C19(SEQ ID NO:16), CYP2D6 (SEQ ID NO:17), CYP2E1(SEQID NO:18), CYP3A4(SEQ ID NO:20), P450s (CYP152A1) (SEQID NO:70) and/or R11 \ P450s (CYP152B1). In particular, in these embodiments at least one of said oxygenases or variants thereof is expected to 40 Ro -R activate the target site of a compound of Formula (II) by R introducing an oxygen-containing functional group in the form of a hydroxyl group. In these embodiments, the final in which X is the target site C atom, and R-7, Rs. Ro, Ro and products resulting from the application of the systems and R are independently selected from the group consisting of methods disclosed herein can be (RRCF (CO)—R), 45 hydrogen, aliphatic, aryl, Substituted aliphatic, Substituted (RCF (CO)—R), or (RCF (CO)—R). aryl, heteroatom-containing aliphatic, heteroatom-contain In some embodiments of the methods and systems dis ing aryl, Substituted heteroatom-containing aliphatic, Substi closed herein, the organic molecule is a compound of For tuted heteroatom-containing aryl, alkoxy, aryloxy, and func mula (II) with R —OH, the oxidizing agent can be a per tional groups (FG) or are taken together to form a ring, Such oxygenase, such as a P450s (CYP152A1)(SEQID NO:70) 50 that the carbon atom is a secondary or tertiary carbon atom. and/or or a peroxygenase P450s (CYP152B1), which are In particular, the Substituents R-7, Rs. Ro Ro, and R of most expected activate the target site, in particular by intro Formula (III) can be independently selected from hydrogen, ducing an oxygen-containing functional group in the form of C-C alkyl, C-C Substituted alkyl, C-C heteroatom a hydroxyl group. 
containing alkyl, C-C substituted heteroatom-containing In some embodiments of the methods and systems dis 55 alkyl, C-C alkenyl, C-C substituted alkenyl, C-C het closed herein, the organic molecule is a compound of For eroatom-containing alkenyl, C-C substituted heteroatom mula (II), with Ra—OR, and R=a C-C alkyl, the oxidizing containing alkenyl, Cs-C aryl, Cs-C. Substituted aryl, agent can be an oxygenase and in particulara P450 oxygenase Cs-C heteroatom-containing aryl, Cs-C. Substituted het such as CYP102A1 (F87A), CYP102A1 var3 (SEQ ID NO: eroatom-containing aryl, C-C2-alkoxy, Cs-Caryloxy, car 23), CYP102A1var3-7 (SEQ ID NO: 29), CYP102A1var3 60 bonyl, thiocarbonyl, and carboxy. More in particular, R-7, Rs. 14 (SEQID NO:36), CYP102A1var3-15 (SEQID NO:37), R. Rio, and R are independently selected from hydrogen, and/or CYP102Alvar3-5 (SEQID NO: 27), which are most C-C alkyl, C-C Substituted alkyl, C-C heteroatom expected to activate the target site, in particular by introduc containing alkyl, C-C substituted heteroatom-containing ing an oxygen-containing functional group in the form of a alkyl, C-C alkenyl, C-C substituted alkenyl, C-C het hydroxyl group. 65 eroatom-containing alkenyl, C-C Substituted heteroatom In some embodiments of the methods and systems dis containing alkenyl, Cs-C aryl, Cs-C. Substituted aryl, closed herein, the organic molecule is a compound of For Cs-C heteroatom-containing aryl, Cs-C. Substituted het US 8,026,085 B2 23 24 eroatom-containing aryl, C-C alkoxy, Cs-Caryloxy, car reacting the substrate with oxygenases such as CYP2C19 bonyl, thiocarbonyl, and carboxy. (SEQ ID NO:16) and CYP2D6 (SEQ ID NO:17), as in the Oxidizing agents known or expected to react with the target case of linalool. site of a compound of Formula (III) includebut are not limited In some embodiments of the methods and systems dis to oxygenases or variants thereof. 
closed herein, the organic molecule is a compound of For In some embodiments, the oxygenase can be a non-heme mula (III), in which R, H, R. and R are linked together to monooxygenase or a variant thereof, a heme-containing form a 6-membered aromatic ring, Rs and Ro are linked monooxygenase or a variant thereof, a peroxygenase or a together to form a 5-carbon cyclic alkenyl, activation can be variant thereof. Such as any of the heme-containing performed by reacting the Substrate with oxygenases monooxygenase, nonheme-containing monooxygenases and 10 peroxygenases disclosed herein. In particular, the oxygenase CYP102A1varS (SEQID NO:47), CYP102A1 var6 (SEQ ID can be any of the P450 monooxygenases and P450 peroxy NO:48), and/or CYP102Alvar7(SEQ ID NO:49), resulting genases disclosed herein. in the introduction of a hydroxyl group in the target site as in In some embodiments, the oxygenase or variant thereof the case of acenaphthene. can be a P450 oxygenase including CYP102Alvar1 (SEQID 15 In some embodiments, the organic molecule has the struc NO:21), CYP102A1var2 (SEQID NO:22), CYP102A1 var3 ture of formula (IV) (SEQ ID NO:23), CYP102A1 var3-2 (SEQ ID NO:24), CYP102A1 var3-6(SEQID NO:28), CYP102Alvar3-5(SEQ ID NO:27), CYP102A1var3-8 (SEQ ID NO:30), (IV) CYP102A1var3-9(SEQ ID NO:31), CYP102A1var3-11 (SEQ ID NO:33), CYP102A1var3-17(SEQ ID NO:39), CYP102A1vars(SEQID NO:47), CYP102A1var6 (SEQ ID NO:48), CYP102A1var7(SEQ ID NO:49), CYP102A1 var8 (SEQ ID NO:50), CYP101A1var1(SEQ ID NO:65), CYP101A1 var2-1(SEQID NO:67), CYP101Alvar2-3(SEQ 25 in which the C is the target site, Ar can be a C-C aryl, ID NO:69), CYP2C19(SEQ ID NO:16) and/or CYP2D6 Cs-C. Substituted aryl, Cs-C heteroatom-containing aryl (SEQID NO:17). In particular, in these embodiments at least or Cs-C. 
Substituted heteroatom-containing aryl, while R2 one of said oxygenases or variants thereof is expected to and R are independently selected from the group consisting activate the target site of a compound of Formula III by of hydrogen, aliphatic, aryl, Substituted aliphatic, Substituted introducing an oxygen-containing functional group in the 30 aryl, heteroatom-containing aliphatic, heteroatom-contain form of a hydroxyl group. In these embodiments, the final ing aryl, Substituted heteroatom-containing aliphatic, Substi products resulting from the application of the systems and tuted heteroatom-containing aryl, alkoxy, aryloxy, and func methods disclosed herein can be (RRCF C(R)= tional groups (FG) or are taken together to form a ring, Such CRoR), (RCF C(R)=CRR) or (RCF C that the carbon atom is a secondary or tertiary carbon atom. (R)= CRR). 35 In particular, the substituent Ar of Formula (IV) can be In some embodiments of the methods and systems dis Cs-Caryl, Cs-C. Substituted aryl, Cs-C heteroatom-con closed herein, the organic molecule is a compound of For taining aryl, or Cs-C. Substituted heteroatom-containing mula (III) in which R, H, Ro—CH, Rio -n-CH, and Rs aryl, while R2 and R are independently selected from and R are linked to form a Substituted 5-member ring, hydrogen, C-C alkyl, C-C substituted alkyl, C-C het activation can be performed by reacting the substrate with 40 eroatom-containing alkyl, C-C substituted heteroatom oxygenases such as CYP102Alvar2 (SEQ ID NO:22), containing alkyl, C-C alkenyl, C-C Substituted alkenyl, CYP102A1 var3 (SEQ ID NO:23), CYP102Alvar3-2(SEQ C-C heteroatom-containing alkenyl, C-C substituted ID NO:24), CYP102A1 var3-6 (SEQ ID NO:28), heteroatom-containing alkenyl, Cs-Caryl, Cs-C. Substi CYP102A1var3-5(SEQID NO:27), CYP102A1var3-8(SEQ tuted aryl, Cs-C heteroatom-containing aryl, Cs-C. 
Substi ID NO:30), CYP102Alvar3-9 (SEQID NO:31), resulting in 45 tuted heteroatom-containing aryl, C-C alkoxy, Cs-Cary the introduction of a hydroxyl group in the target site as loxy, carbonyl, thiocarbonyl, and carboxy. illustrated in Examples 5 and corresponding scheme 5. Oxidizing agents known or expected to react with the target In some embodiments of the methods and systems dis site of a compound of Formula (IV) include but are not limited closed herein, the organic molecule is a compound of For to oxygenases or variants thereof. mula (III), in which R, H, RH. Rio CHRs and Rare 50 In some embodiments, the oxygenase can be a non-heme linked to form a substituted 6-member ring, activation can be monooxygenase or a variant thereof, a heme-containing performed by reacting the Substrate with oxygenase monooxygenase or a variant thereof, a peroxygenase or a CYP101 Alvar2-3(SEQID NO:69), resulting in the introduc variant thereof. Such as any of the heme-containing tion of a hydroxyl group in the target site as in the case of monooxygenase, non heme-containing monooxygenases and a-pinene. 55 peroxygenases disclosed herein. In particular, the oxygenase In some embodiments of the methods and systems dis can be any of the P450 monooxygenases and P450 peroxy closed herein, the organic molecule is a compound of For genases disclosed herein. mula (III), in which R, R-R-H. RandR are connected In some embodiments, the oxygenase or variant thereof through a Substituted 5-member ring, activation can be per can be such as CYP102A1 (SEQID NO:2), CYP102Alvar4 formed by reacting the organic molecule with an oxygenase 60 (SEQ ID NO:46), CYP102A1 vars(SEQ ID NO:47), such as CYP102Alvar3-2 (SEQ ID NO:24), resulting in the CYP102A1 varó(SEQID NO:48), CYP102A1var7 (SEQ ID introduction of a hydroxyl group in the target site as illus NO:49), CYP102A1 var1 (SEQID NO:21), CYP102A1 var2 trated by Example 6 and corresponding scheme 6. 
(SEQ ID NO:22), CYP102A1 var3(SEQ ID NO:23), In some embodiments of the methods and systems dis CYP102A1 var3-2(SEQID NO:24), CYP102Alvar3-3(SEQ closed herein, the organic molecule is a compound of For 65 ID NO:25), CYP102A1var3-4(SEQ ID NO:26), mula (III), in which RoR = CH, R-R-H, and CYP102A1var3-5(SEQ ID NO:27), CYP102A1var3-7(SEQ R, substituted Cs alkenyl, activation can be performed by ID NO:29), CYP102A1var3-8(SEQ ID NO:30), US 8,026,085 B2 25 26 CYP102A1var3-9 (SEQ ID NO:31), CYP102A1var3-17 containing alkyl, C-C Substituted heteroatom-containing (SEQ ID NO:39), CYP102A1var8(SEQ ID NO:50), alkyl, C-C alkenyl, C-C substituted alkenyl, C-C het CYP101A1 var2-1(SEQID NO:67), and/or CYP101Alvar2 eroatom-containing alkenyl, C-C Substituted heteroatom 3(SEQ ID NO:69). In particular, in these embodiments at containing alkenyl, Cs-C aryl, Cs-C. Substituted aryl, least one of said oxygenases or variants thereof is expected to Cs-C heteroatom-containing aryl, Cs-C. Substituted het activate the target site of a compound of Formula IV by eroatom-containing aryl, carbonyl, thiocarbonyl, and car introducing an oxygen-containing functional group in the boxy. More in particular, R. Rs. R. R.17, and R1s are form of a hydroxyl group. In these embodiments, the final independently selected from hydrogen, C-C alkyl, C-C, products resulting from the application of the systems and Substituted alkyl, C-C heteroatom-containing alkyl, methods disclosed herein can be RRArC-F, RArCF, 10 C-C Substituted heteroatom-containing alkyl, C-C alk or RArCF. enyl, C-C substituted alkenyl, C-C heteroatom-contain In some embodiments of the methods and systems dis ing alkenyl, C-C Substituted heteroatom-containing alk closed herein, the organic molecule is a compound of For enyl, Cs-C4 aryl, Cs-Ca. Substituted aryl, Cs-Ca mula (IV), in which Ar para-substituted phenyl R=H. 
heteroatom-containing aryl, C-C substituted heteroatom R -iPr, activation can be performed by reacting the organic 15 containing aryl, C2-C alkoxy, Cs-Caryloxy, carbonyl, and molecule with an oxygenase such as a P450 monooxygenase carboxy. including CYP102A1 (SEQ ID NO:2) and CYP102A1 vars Oxidizing agents known or expected to react with the target (SEQ ID NO:47), which results in the introduction of a hydroxyl group in the target site as illustrated in Examples 10 site of a compound of Formula (V) include but are not limited and corresponding scheme 10. to oxygenases or variants thereof. In some embodiments of the methods and systems dis In some embodiments, the oxygenase can be a non-heme closed herein, the organic molecule is a compound of For monooxygenase or a variant thereof, a heme-containing mula (IV), in which Ar para- or ortho or meta substituted monooxygenase or a variant thereof, a peroxygenase or a phenyl (where substituent is halo. —CH, or —OCH), variant thereof. Such as any of the heme-containing R. H. R. COOR, where R is C-C n-alkyl, activation 25 monooxygenase, non heme-containing monooxygenases and can be performed by reacting the Substrate with oxygenase peroxygenases disclosed herein. In particular, the oxygenase CYP102A1vars(SEQ ID NO:47), CYP102A1 var3(SEQ ID can be any of the P450 monooxygenases and P450 peroxy NO:23), and CYP102A1var3-7 (SEQ ID NO:29), as illus genases disclosed herein. trated in Examples 1,2,3 and 4 and corresponding schemes 1. In some embodiments, the oxygenase or variant thereof 2, 3, and 4. 
30 can be CYP102A1 var8 (SEQ ID NO:50), CYP102A1var3-2 In some embodiments of the methods and systems dis (SEQ ID NO:24), CYP102A1 var3-3 (SEQ ID NO:25), closed herein, the organic molecule is a compound of For CYP102Alvar3-5 (SEQ ID NO:27), CYP102Alvar3-6 mula (IV), in which R-H, Ar is ortho substituted phenyl, (SEQ ID NO:28), CYP102Alvar3-9(SEQ ID NO:31), R is linked to Arthrough a phenyl moiety, activation can be CYP102A1 var3-11 (SEQ ID NO:33), CYP102A1 var3-16 performed by reacting the Substrate with oxygenases 35 (SEQ ID NO:38), CYP102A1 var3-19 (SEQ ID NO:41, CYP102A1 varó (SEQID NO:48) and CYP102A1 var8(SEQ CYP102A1 var3-18(SEQ ID NO:40), CYP102A1var3-2 ID NO:50), resulting in the introduction of a hydroxyl group (SEQ ID NO:24), CYP102A1 var3-3 (SEQ ID NO:25), in the target site as in the case of fluorene. CYP102A1 var3-14 (SEQ ID NO:36), CYP102A1 var3-15 In some embodiments of the methods and systems dis (SEQ ID NO:37), CYP102A1var3-17 (SEQ ID NO:39), closed herein, the organic molecule is a compound of For 40 CYP102A1 var3-9 (SEQID NO:31), CYP101Alvar2-3(SEQ mula (IV), in which R-H, Ar is ortho substituted phenyl, ID NO:69), and/or CYP3A4 (SEQID NO:20). In particular, R is linked to Arthrough a 2-methylene bridge, activation in those embodiments, at least one of said oxygenases or can be performed by reacting the Substrate with oxygenase variants thereof is expected to activate a compound of For CYP102Alvarš(SEQ ID NO:47), resulting in the introduc mula V by affording a hydroxyl group at the target site. In tion of a hydroxyl group in the target site as in the case of 45 these embodiments, the final product resulting from the appli indan. cation of the systems and methods disclosed herein can be R.R.R.C. F. In some embodiments the organic molecule has the struc In some embodiments of the methods and systems dis ture of formula (V), closed herein, the organic molecule is a compound of For 50 mula (V), in which Ra Rs. 
R17, R1s are hydrogen, and Rio is 2-methyl-5-phenyl-4,5-dihydrooxazolyl, activation can be (V) performed by reacting the Substrate with oxygenases CYP102A1var3-5 (SEQ ID NO:27), CYP102A1var3-6 (SEQ ID NO:28), CYP102A1 var3-11 (SEQ ID NO:33), 55 CYP102A1var3-16 (SEQ ID NO:38), CYP102A1var3-19 (SEQ ID NO:41), CYP102A1 var3-18 (SEQ ID NO:40), in which X is the target site Catom, R. Ris, Re, R7Rs are resulting in a hydroxyl group at the target site as illustrated in independently selected from the group consisting of hydro Examples 12 and corresponding scheme 12. gen, aliphatic, aryl, Substituted aliphatic, Substituted aryl, In some embodiments of the methods and systems dis heteroatom-containing aliphatic, heteroatom-containing 60 closed herein, the organic molecule is a compound of For aryl, Substituted heteroatom-containing aliphatic, Substituted mula (V), in which Ra Rs. R7, Rs are hydrogen, and R. heteroatom-containing aryl, alkoxy, aryloxy, and functional is 2,3,4,5-tetramethoxy-tetrahydro-2H-pyranyl, activation groups (FG) or are taken together to form a ring, Such that the can be performed by reacting the Substrate with oxygenases carbon atom is a secondary or tertiary carbon atom. such as CYP102A1 var3-2(SEQIDNO:24), CYP102A1 var3 In particular, the Substituents R. R. R. R7, and Rs of 65 3(SEQ ID NO:25), CYP102A1 var3-14 (SEQ ID NO:36), Formula (V) can be independently selected from hydrogen, CYP102A1var3-15 (SEQ ID NO:37), CYP102A1var3-17 C-C alkyl, C-C Substituted alkyl, C-C heteroatom (SEQ ID NO:39), CYP102A1var3-9 (SEQ ID NO:31), US 8,026,085 B2 27 28 resulting in a hydroxyl group at the target site as illustrated in nase. In particular, in these embodiments at least one of said Examples 13 and corresponding scheme 13. 
oxygenases or variants thereof is expected to activate a com In some embodiments of the methods and systems dis pound of Formula VI by introducing an oxygen-containing closed herein, the organic molecule is a compound of For functional group in the form of an epoxy group. In these mula (V), in which RCN, R=6-dimethylamino-naphtyl, embodiments, the final products resulting from the applica RR, H, RH, or hydrogen, activation can be per tion of the systems and methods disclosed herein can be formed by reacting the Substrate with oxygenases such as (RRC(OH)—CFR-R), (RRCF C(OH)RR). CYP102A1 var8 (SEQ ID NO:50) and CYP3A4 (SEQ ID or (RoRoCF CFR-R). NO:20), as in the case of a cyano-naphtyl ethers. Additional oxidizing agents that are expected to react with In some embodiments the organic molecule has the struc 10 the target site of a compound of Formula (VI) include but are ture of formula (VI) not limited to dioxygenases Such as toluene dioxygenase. More specifically, dioxidizing agents are expected to activate a compound of Formula (VI) by introducing an oxygen (VI) containing functional group in the form of a vicinal diol. In 15 these embodiments, the final products resulting from the application of the systems and methods disclosed herein can be (RRC(OH)—CFR-R), (RRCF C(OH) R2R-22), or (RoR2oCF–CFR-R-22). In some embodiments of the methods and systems dis in which X is the target site Catom, and Ro, Ro R, R2 are closed herein, the organic molecule is a compound of For independently selected from the group consisting of hydro mula (VI), in which RoRo R-H. R. 
gen, aliphatic, aryl, substituted aliphatic, substituted aryl, heteroatom-containing aliphatic, heteroatom-containing aryl, substituted heteroatom-containing aliphatic, substituted heteroatom-containing aryl, and functional groups (FG) or are taken together to form a ring representing in this case a cycloalkenyl, substituted cycloalkenyl, heteroatom-containing cycloalkenyl, or a substituted heteroatom-containing cycloalkenyl derivative.

In particular, the substituents R, R, R and R of formula VI are independently selected from hydrogen, C-C alkyl, C-C substituted alkyl, C-C heteroatom-containing alkyl, C-C substituted heteroatom-containing alkyl, C-C alkenyl, C-C substituted alkenyl, C-C heteroatom-containing alkenyl, C-C substituted heteroatom-containing alkenyl, Cs-C aryl, Cs-C substituted aryl, Cs-C heteroatom-containing aryl, Cs-C substituted heteroatom-containing aryl, carbonyl, thiocarbonyl, carboxy, and substituted amino. More in particular, R, R, R and R are independently selected from hydrogen, C-C alkyl, C-C substituted alkyl, C-C heteroatom-containing alkyl, C-C substituted heteroatom-containing alkyl, C-C alkenyl, C-C substituted alkenyl, C-C heteroatom-containing alkenyl, C-C substituted heteroatom-containing alkenyl, Cs-C aryl, Cs-C substituted aryl, Cs-C heteroatom-containing aryl, Cs-C substituted heteroatom-containing aryl, carbonyl, and carboxy.

Oxidizing agents known or expected to react with the target site of a compound of Formula (VI) include but are not limited to oxygenases or variants thereof.

In some embodiments, the oxygenase can be a non-heme monooxygenase or a variant thereof, a heme-containing monooxygenase or a variant thereof, or a peroxygenase or a variant thereof, such as any of the heme-containing monooxygenases, non-heme-containing monooxygenases and peroxygenases disclosed herein. In particular, the oxygenase can be any of the P450 monooxygenases and P450 peroxygenases disclosed herein.

In some embodiments, the oxygenase or variant thereof can be CYP102A1 (SEQ ID NO: 2), CYP102A1var1 (SEQ ID NO: 21), CYP102A1var2 (SEQ ID NO: 22), CYP102A1var3 (SEQ ID NO: 23), CYP102A1var3-18 (SEQ ID NO: 40), CYP102A1var5 (SEQ ID NO: 47), CYP102A1var4 (SEQ ID NO: 46), CYP102A1var3-21 (SEQ ID NO: 43), CYP102A1var3-22 (SEQ ID NO: 44), CYP102A1var3-23 (SEQ ID NO: 45), CYP102A1var9 (SEQ ID NO: 51), CYP102A1var9-1 (SEQ ID NO: 52), and/or toluene dioxyge

n-butyl, activation through epoxidation can be performed by reacting the substrate with oxygenases such as CYP102A1var1 (SEQ ID NO: 21), CYP102A1var3-21 (SEQ ID NO: 43), CYP102A1var3-22 (SEQ ID NO: 44), CYP102A1var3-23 (SEQ ID NO: 45), resulting in the introduction of an epoxide functional group at the target site as in the case of 1-hexene.

In some embodiments of the methods and systems disclosed herein, the organic molecule is a compound of Formula (VI), in which R = R = R = H, R = phenyl, activation through epoxidation can be performed by reacting the substrate with oxygenases var1, CYP1A2 (SEQ ID NO: 13), CYP102A1var9 (SEQ ID NO: 51) or CYP102A1var9-1 (SEQ ID NO: 52), resulting in the introduction of an epoxide functional group at the target site as in the case of styrene.

In some embodiments of the methods and systems disclosed herein, the organic molecule is a compound of Formula (VI), in which R = R = H, R and R are connected together through 4 methylene units so as to form a 6-membered ring, activation through epoxidation can be performed by reacting the substrate with oxygenases of the CYP153 family, such as CYP153A6 (SEQ ID NO: 54), CYP153A7 (SEQ ID NO: 55), CYP153A8 (SEQ ID NO: 56), CYP153A11 (SEQ ID NO: 57), CYP153D2 (SEQ ID NO: 58), resulting in the introduction of an epoxide functional group at the target site as in the case of cyclohexene.

In some embodiments of the methods and systems disclosed herein, the organic molecule is a compound of Formula (VI), in which R = R = H, R = n-pentyl, R = Co alkenyl, activation through epoxidation can be performed by reacting the substrate with the oxygenase CYP102A1 (SEQ ID NO: 2), resulting in the introduction of an epoxide functional group at the target site as in the case of linolenic acid.

In some embodiments of the methods and systems disclosed herein, the organic molecule is a compound of Formula (VI), in which R = R = H, R and R are linked together to form a 6-membered substituted or non-substituted aromatic ring, activation can be performed by reacting the substrate with toluene dioxygenase, resulting in the introduction of an oxygen-containing functional group in the form of a vicinal diol. In those embodiments, the oxygen-containing functional group will have the form of an epoxy group (C-(O)-C), that is an oxygen atom joined by single bonds to two adjacent carbon atoms so as to form a three-membered ring.

In some embodiments, the oxidizing agent suitable to activate an organic molecule including a target site with the methods and systems disclosed herein can be identified by (a) providing the organic molecule, (b) providing an oxidizing agent, (c) contacting the oxidizing agent with the organic molecule for a time and under conditions to allow the introduction of an oxygen-containing functional group on the target site; (d) detecting the oxygen-containing functional group on the target site of the organic molecule resulting from step c), and repeating steps (a) to (d) until an oxygen-containing functional group is detected on the target site. In particular, one or more oxidizing agents can be provided under step b) of the method disclosed herein.

In particular, in embodiments wherein the organic molecule is a molecule of formula (I), (II), (III), and (IV), detecting the oxygen-containing functional group on the target site can be performed by: e) isolating the organic molecule resulting from step c), for example by a separation method or a combination of separation methods, including but not limited to extraction, chromatography, distillation, precipitation, sublimation, and crystallization; and f) characterizing the isolated organic molecule resulting from step c) to identify the oxygen-containing functional group, for example by a characterization method or a combination of methods, including but not limited to spectroscopic or spectrometric

step c) can be performed by extracting the reaction mixture with organic solvent and characterizing the oxygen-containing functional group in the organic molecule can be performed by GC analysis of the extraction solution. In some of those embodiments, selected mixtures of oxidizing agent and co-reagents (e.g. cofactors, oxygen) which gave rise to the largest amount of activated products for a given organic molecule can be repeated at a larger scale. The activated products can be subsequently isolated by a suitable technique including liquid chromatography and identified by 1H-NMR, 13C-NMR, and MS and additional techniques identifiable by a skilled person. Examples of these embodiments are provided in the Examples section and illustrated in FIGS. 5 and 6.

In embodiments wherein the organic molecule is an organic molecule of general formula (V) wherein R is 2-methyl-5-phenyl-4,5-dihydrooxazolyl and R = R = R = R = H, upon contacting a library of engineered P450 monooxygenases (oxidizing agents) the oxygen-containing functional group can be detected using a colorimetric reagent (e.g. Purpald) and measuring the change in absorbance (e.g. at 550 nm on a microtiter plate reader).
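The colorimetric screen described above (a library of engineered P450 monooxygenases arrayed on a microtiter plate, Purpald as the reagent, absorbance read at about 550 nm) lends itself to a simple hit-calling step. The following is a minimal sketch in Python and is not part of the disclosed methods: the function name call_hits, the well labels, the absorbance values, and the threshold rule (mean of the no-enzyme control wells plus three standard deviations) are all assumptions introduced here for illustration.

```python
from statistics import mean, stdev


def call_hits(a550, control_wells, n_sigma=3.0):
    """Flag wells whose absorbance at 550 nm exceeds the control background.

    a550          -- dict mapping well label (e.g. "B2") to absorbance reading
    control_wells -- set of labels for no-enzyme control wells (at least two)
    n_sigma       -- assumed cutoff: control mean + n_sigma * control std. dev.

    Returns (threshold, hits), where hits maps well label to absorbance,
    ordered from the strongest signal to the weakest.
    """
    controls = [a550[well] for well in control_wells]
    threshold = mean(controls) + n_sigma * stdev(controls)
    hits = {
        well: value
        for well, value in a550.items()
        if well not in control_wells and value > threshold
    }
    ranked = dict(sorted(hits.items(), key=lambda item: item[1], reverse=True))
    return threshold, ranked


if __name__ == "__main__":
    # Hypothetical readings: A1-A3 are controls, B1-B3 hold library variants.
    readings = {"A1": 0.10, "A2": 0.12, "A3": 0.11,
                "B1": 0.13, "B2": 0.45, "B3": 0.90}
    cutoff, hits = call_hits(readings, control_wells={"A1", "A2", "A3"})
    print(f"cutoff = {cutoff:.3f}")
    for well, value in hits.items():
        print(well, value)
```

Wells flagged in this way would correspond to the variants carried forward to the larger-scale reaction and product identification steps; the cutoff itself is an arbitrary choice here and would in practice depend on the background and reproducibility of the assay.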
technique, preferably a combination of two or more spectroscopic or spectrometric techniques, including UV-VIS spectroscopy, fluorescence spectroscopy, IR spectroscopy, 1H-NMR, 13C-NMR, 2D-NMR, 3D-NMR, GC-MS, LC-MS, and MS-MS.

In particular, in embodiments wherein the organic molecule is a molecule of formula (V), detecting the oxygen-containing functional group on the target site can be performed by monitoring the removal of the -CHRR moiety associated with the introduction of an oxygen-containing functional group in the target site. In those embodiments, monitoring the removal of the -CHRR moiety can be performed by g) contacting the organic molecule resulting from step c) with a reagent that can react with an aldehyde (R-CHO), a ketone (R-C(O)-R), a dicarbonyl (R-C(O)-C(O)-R), or a glyoxal (R-C(O)-CHO) functional group; and h) detecting the formation of an adduct or a complex between an aldehyde, ketone, dicarbonyl, or glyoxal in the organic molecule, the aldehyde, ketone, dicarbonyl, or glyoxal resulting from the removal of the -CHRR moiety. Detecting the formation of an adduct or complex can be performed by spectroscopic (colorimetric, fluorimetric) or chromatographic methods and additional methods identifiable by a skilled person upon reading of the present disclosure.

Reagents that can react with an aldehyde, ketone, dicarbonyl, or glyoxal and suitable for the methods and systems described herein include but are not limited to 4-amino-3-hydrazino-5-mercapto-1,2,4-triazole (4-amino-5-hydrazino-1,2,4-triazole-3-thiol, Purpald), (pentafluorobenzyl)-hydroxylamine, p-nitrophenyl-hydrazine, 2,4-dinitrophenylhydrazine, 3-methylbenzothiazolin-2-one hydrazone, diethyl acetonedicarboxylate and ammonia, cyclohexane-1,3-dione and ammonia, m-phenylenediamine, p-aminophenol, 3,5-diaminobenzoic acid, p-dimethylamino-aniline, m-dinitrobenzene, o-phenylenediamine, and the like.

In some embodiments, a plurality of oxidizing agents can be provided to identify a suitable oxidizing agent in the methods and systems disclosed herein. In particular, in some of these embodiments wherein the organic molecule has the general formula (I), (II), (III), (IV) and (V), a pool of oxidizing agents, for example a library of engineered P450s, e.g. in a 96-well plate, can be provided. In particular, in embodiments wherein the organic molecule has the formula (I), (II), (III) and (IV), isolating the organic molecule resulting from

In embodiments wherein the organic molecule has the general formula (V) wherein R is 2,3,4,5-tetramethoxytetrahydro-2H-pyranyl and R = R = R = R = H, upon contacting a library of engineered P450 monooxygenases, the oxygen-containing functional group can also be detected using a colorimetric reagent (e.g. Purpald) and measuring the change in absorbance (e.g. at 550 nm on a microtiter plate reader).

In some embodiments, the isolated and characterized organic molecule that includes the oxygen-containing functional group at the target site can be used as an authentic standard for high-throughput screening of other, more suitable oxidizing agents, or for improvement of reaction conditions for the activation reaction. In exemplary embodiments, high-throughput screening can be carried out by performing the activation reaction in a multi-well plate, typically a 96-well or 384-well plate, each well containing the candidate organic molecule, the oxidizing agent, and the co-reagents (e.g. cofactors, oxygen) required for the reaction to proceed, and detecting the activation of the target site using one of the following techniques: UV-VIS spectroscopy, fluorimetry, IR, LC, GC, GC-MS, LC-MS, or a combination thereof, according to the nature and properties of the candidate organic molecule and the activated product.

In some embodiments, an oxygenase that oxidizes a pre-determined organic molecule in a target site is provided by (i) providing a candidate oxygenase, (j) mutating the candidate oxygenase to generate a mutant or variant oxygenase, (k) contacting the variant oxygenase with the pre-determined organic molecule for a time and under conditions to allow detection of an oxygen-containing functional group on the target site, (l) detecting the introduction of the oxygen-containing functional group on the target site, and repeating steps (i) to (l) until formation of an oxygen-containing functional group is detected.

In some embodiments, mutating the candidate oxygenase can be performed by laboratory evolutionary methods and/or rational design methods, using one or a combination of techniques such as random mutagenesis, site-saturation mutagenesis, site-directed mutagenesis, DNA shuffling, DNA recombination, and additional techniques identifiable by a skilled person. In particular, mutating a candidate oxygenase can be performed by targeting one or more of the amino acid residues comprised in the oxygenase's nucleotidic or amino acidic primary sequence to provide a mutant or variant polynucleotide or polypeptide.

In general, the term "mutant" or "variant" as used herein with reference to a molecule such as a polynucleotide or polypeptide indicates that it has been mutated from the molecule as it exists in nature. In particular, the terms "mutate" and "mutation" as used herein indicate any modification of a nucleic acid and/or polypeptide which results in an altered nucleic acid or polypeptide. Mutations include any process or mechanism resulting in a mutant protein, enzyme, polynucleotide, gene, or cell. This includes any mutation in which a polynucleotide or polypeptide sequence is altered, as well as any detectable change in a cell wherein the mutant polynucleotide or polypeptide is expressed arising from such a mutation. Typically, a mutation occurs in a polynucleotide or gene sequence, by point mutations, deletions, or insertions of single or multiple nucleotide residues. A mutation in a polynucleotide includes mutations arising within a protein-encoding region of a gene as well as mutations in regions outside of a protein-encoding sequence, such as, but not limited to, regulatory or promoter sequences. A mutation in a coding polynucleotide such as a gene can be "silent", i.e., not reflected in an amino acid alteration upon expression, leading to a "sequence-conservative" variant of the gene. A mutation in a polypeptide includes but is not limited to mutation in the polypeptide sequence and mutation resulting in a modified amino acid. Non-limiting examples of a modified amino acid include a glycosylated amino acid, a sulfated amino acid, a prenylated (e.g., farnesylated, geranylgeranylated) amino acid, an acetylated amino acid, an acylated amino acid, a PEGylated amino acid, a biotinylated amino acid, a carboxylated amino acid, a phosphorylated amino acid, and the like. References adequate to guide one of skill in the modification of amino acids are replete throughout the literature. Example protocols are found in Walker (1998) Protein Protocols on CD-ROM (Humana Press, Totowa, N.J.).

A mutant or engineered protein or enzyme is usually, although not necessarily, expressed from a mutant polynucleotide or gene. Engineered cells can be obtained by introduction of an engineered gene or part of it in the cell. The terms "engineered cell", "mutant cell" or "recombinant cell" as used herein refer to a cell that has been altered or derived, or is in some way different or changed, from a parent cell, including a wild-type cell. The term "recombinant" as used herein with reference to a cell, in alternative to "wild-type" or "native", indicates a cell that has been engineered to modify the genotype and/or the phenotype of the cell as found in nature, e.g., by modifying the polynucleotides and/or polypeptides expressed in the cell as it exists in nature. A "wild-type cell" refers instead to a cell which has not been engineered and displays the genotype and phenotype of said cell as found in nature.

The term "engineer" refers to any manipulation of a molecule or cell that results in a detectable change in the molecule or cell, wherein the manipulation includes but is not limited to inserting a polynucleotide and/or polypeptide heterologous to the cell and mutating a polynucleotide and/or polypeptide native to the cell. Engineered cells can also be obtained by modification of the cell's genetic material, lipid distribution, or protein content. In addition to recombinant production, the enzymes may be produced by direct peptide synthesis using solid-phase techniques, such as Solid-Phase Peptide Synthesis. Peptide synthesis may be performed using manual techniques or by automation. Automated synthesis may be achieved, for example, using an Applied Biosystems 431A Peptide Synthesizer (Perkin Elmer, Foster City, Calif.) in accordance with the instructions provided by the manufacturer.

Variants of naturally-occurring sequences can be generated by site-directed mutagenesis (Botstein and Shortle 1985; Smith 1985; Carter 1986; Dale and Felix 1996; Ling and Robinson 1997), mutagenesis using uracil-containing templates (Kunkel, Roberts et al. 1987; Bass, Sorrells et al. 1988), oligonucleotide-directed mutagenesis (Zoller and Smith 1983; Zoller and Smith 1987; Zoller 1992), phosphorothioate-modified DNA mutagenesis (Taylor, Schmidt et al. 1985; Nakamaye and Eckstein 1986; Sayers, Schmidt et al. 1988), mutagenesis using gapped duplex DNA (Kramer, Drutsa et al. 1984; Kramer and Fritz 1987), point mismatch, mutagenesis using repair-deficient host strains, deletion mutagenesis (Eghtedarzadeh and Henikoff 1986), restriction-selection and restriction-purification (Braxton and Wells 1991), mutagenesis by total gene synthesis (Nambiar, Stackhouse et al. 1984; Grundstrom, Zenke et al. 1985; Wells, Vasser et al. 1985), double-strand break repair (Mandecki 1986), and the like. Additional details on many of the above methods can be found in Methods in Enzymology Volume 154, which also describes useful controls for trouble-shooting problems with various mutagenesis methods.

Additional details regarding the methods to generate variants of naturally-occurring sequences can be found in the following U.S. patents, PCT publications, and EPO publications: U.S. Pat. No. 5,605,793 to Stemmer (Feb. 25, 1997), "Methods for In vitro Recombination"; U.S. Pat. No. 5,811,238 to Stemmer et al. (Sep. 22, 1998), "Methods for Generating Polynucleotides having Desired Characteristics by Iterative Selection and Recombination"; U.S. Pat. No. 5,830,721 to Stemmer et al. (Nov. 3, 1998), "DNA Mutagenesis by Random Fragmentation and Reassembly"; U.S. Pat. No. 5,834,252 to Stemmer et al. (Nov. 10, 1998), "End-Complementary Polymerase Reaction"; U.S. Pat. No. 5,837,458 to Minshull et al. (Nov. 17, 1998), "Methods and Compositions for Cellular and Metabolic Engineering"; WO 95/22625, Stemmer and Crameri, "Mutagenesis by Random Fragmentation and Reassembly"; WO 96/33207 by Stemmer and Lipschutz, "End Complementary Polymerase Chain Reaction"; WO 97/20078 by Stemmer and Crameri, "Methods for Generating Polynucleotides having Desired Characteristics by Iterative Selection and Recombination"; WO 97/35966 by Minshull and Stemmer, "Methods and Compositions for Cellular and Metabolic Engineering"; WO 99/41402 by Punnonen et al., "Targeting of Genetic Vaccine Vectors"; WO 99/41383 by Punnonen et al., "Antigen Library Immunization"; WO 99/41369 by Punnonen et al., "Genetic Vaccine Vector Engineering"; WO 99/41368 by Punnonen et al., "Optimization of Immunomodulatory Properties of Genetic Vaccines"; EP 752008 by Stemmer and Crameri, "DNA Mutagenesis by Random Fragmentation and Reassembly"; EP 0932670 by Stemmer, "Evolving Cellular DNA Uptake by Recursive Sequence Recombination"; WO 99/23107 by Stemmer et al., "Modification of Virus Tropism and Host Range by Viral Genome Shuffling"; WO 99/21979 by Apt et al., "Human Papillomavirus Vectors"; WO 98/31837 by del Cardayre et al., "Evolution of Whole Cells and Organisms by Recursive Sequence Recombination"; WO 98/27230 by Patten and Stemmer, "Methods and Compositions for Polypeptide Engineering"; WO 98/13487 by Stemmer et al., "Methods for Optimization of Gene Therapy by Recursive Sequence Shuffling and Selection"; WO 00/00632, "Methods for Generating Highly Diverse Libraries"; WO 00/09679, "Methods for Obtaining in vitro Recombined Polynucleotide Sequence Banks and Resulting Sequences"; WO 98/42832 by Arnold et al., "Recombination of Polynucleotide Sequences Using Random or Defined Primers"; WO 99/29902 by Arnold et al., "Method for Creating Polynucleotide and Polypeptide Sequences"; WO 98/41653 by Vind, "An in vitro Method for Construction of a DNA Library"; WO 98/41622 by Borchert et al., "Method for Constructing a Library Using DNA Shuffling"; WO 98/42727 by Pati and Zarling, "Sequence Alterations using Homologous Recombination"; WO 00/18906 by Patten et al., "Shuffling of Codon-Altered Genes"; WO 00/04190 by del Cardayre et al., "Evolution of Whole Cells and Organisms by Recursive Recombination"; WO 00/42561 by Crameri et al., "Oligonucleotide Mediated Nucleic Acid Recombination"; WO 00/42559 by Selifonov and Stemmer, "Methods of Populating Data Structures for Use in Evolutionary Simulations"; WO 00/42560 by Selifonov et al., "Methods for Making Character Strings, Polynucleotides & Polypeptides Having Desired Characteristics"; WO 01/23401 by Welch et al., "Use of Codon-Varied Oligonucleotide Synthesis for Synthetic Shuffling"; and WO 01/64864, "Single-Stranded Nucleic Acid Template-Mediated Recombination and Nucleic Acid Fragment Isolation" by Affholter.

In particular, in some embodiments, site-directed mutagenesis can be performed on predetermined residues of the oxygenase. These predetermined sites can be identified using the crystal structure of said oxidizing agent if available, or a crystal structure of a homologous protein that shares at least 20% sequence identity with said oxidizing agent and an alignment of the polynucleotide or amino acid sequences of the oxidizing agent and its homologous protein. The predetermined sites are chosen among the amino acid residues that are found within 50 Å, preferably within 35 Å, from the oxygen-activating site of said oxidizing agent. For example, when a cytochrome P450 monooxygenase is to be used as the oxidizing agent, the predetermined sites are chosen among the amino acid residues that are found within 50 Å, preferably within 35 Å, from the heme iron. Mutagenesis of the predetermined sites can be performed by changing one, two or three of the nucleotides in the codon that encodes each of the predetermined amino acids. Mutagenesis of the predetermined sites can be performed in the described way so that each of the predetermined amino acids is mutated to any of the other 19 natural amino acids. Substitution of the predetermined sites with unnatural amino acids can be performed using methods established in vivo (Wang, Xie et al. 2006), in vitro (Shimizu, Kuruma et al. 2006), semisynthetic (Schwarzer and Cole 2005) or synthetic methods (Camarero and Mitchell 2005) for incorporation of unnatural amino acids into polypeptides.

In still further embodiments, libraries of engineered variants can be obtained by laboratory evolutionary methods and/or rational design methods, using one or a combination of techniques such as random mutagenesis, site-saturation mutagenesis, site-directed mutagenesis, DNA shuffling, DNA recombination, and the like, and targeting one or more of the amino acid residues, one at a time or simultaneously, comprised in the oxidizing agent's amino acid sequence. Said libraries can be arrayed on multi-well plates and screened for activity on the target molecule using a colorimetric, fluorimetric, enzymatic, or luminescence assay and the like. For example, a method for making libraries for directed evolution to obtain P450s with new or altered properties, namely recombination, or chimeragenesis, in which portions of homologous P450s are swapped to form functional chimeras, can be used. Recombining equivalent segments of homologous proteins generates variants in which every amino acid substitution has already proven to be successful in one of the parents. Therefore, the amino acid mutations made in this way are less disruptive, on average, than random mutations. A structure-based algorithm, such as SCHEMA, can be used to identify fragments of proteins that can be recombined to minimize disruptive interactions that would prevent the protein from folding into its active form.

In some embodiments, activation of a target site in an organic molecule can be performed in a whole-cell system. To prepare the whole-cell system, the encoding sequence of the oxidizing agent can be introduced into a host cell using a suitable vector, such as a plasmid, a cosmid, a phage, a virus, a bacterial artificial chromosome (BAC), a yeast artificial chromosome (YAC), or the like, into which the said sequence of the disclosure has been inserted, in a forward or reverse orientation. In some embodiments, the construct further comprises regulatory sequences, including, for example, a promoter linked to the sequence. Large numbers of suitable vectors and promoters are known to those of skill in the art, and are commercially available.

Accordingly, in other embodiments, vectors that include a nucleic acid molecule of the disclosure are provided. In other embodiments, host cells transfected with a nucleic acid molecule of the disclosure, or a vector that includes a nucleic acid molecule of the disclosure, are provided. Host cells include eukaryotic cells such as yeast cells, insect cells, or animal cells. Host cells also include prokaryotic cells such as bacterial cells.

In other embodiments, methods for producing a cell that converts a target molecule into a pre-determined oxygenated derivative are provided. Such methods generally include: (a) transforming a cell with an isolated nucleic acid molecule encoding a polypeptide comprising an amino acid sequence set forth in SEQ ID NO: 2 to SEQ ID NO: 70; (b) transforming a cell with an isolated nucleic acid molecule encoding a polypeptide of the disclosure; or (c) transforming a cell with an isolated nucleic acid molecule of the disclosure.

The terms "vector", "vector construct" and "expression vector" as used herein refer to a vehicle by which a DNA or RNA sequence (e.g. a foreign gene) can be introduced into a host cell, so as to transform the host and promote expression (e.g. transcription and translation) of the introduced sequence. Vectors typically comprise the DNA of a transmissible agent, into which foreign DNA encoding a protein is inserted by restriction enzyme technology. A common type of vector is a "plasmid", which generally is a self-contained molecule of double-stranded DNA that can readily accept additional (foreign) DNA and which can readily be introduced into a suitable host cell. A large number of vectors, including plasmid and fungal vectors, have been described for replication and/or expression in a variety of eukaryotic and prokaryotic hosts. Non-limiting examples include pKK plasmids (Clonetech), pUC plasmids, pET plasmids (Novagen, Inc., Madison, Wis.), pRSET or pREP plasmids (Invitrogen, San Diego, Calif.), or pMAL plasmids (New England Biolabs, Beverly, Mass.), and many appropriate host cells, using methods disclosed or cited herein or otherwise known to those skilled in the relevant art. Recombinant cloning vectors will often include one or more replication systems for cloning or expression, one or more markers for selection in the host, e.g., antibiotic resistance, and one or more expression cassettes.

The terms "express" and "expression" refer to allowing or causing the information in a gene or DNA sequence to become manifest, for example producing a protein by activating the cellular functions involved in transcription and translation of a corresponding gene or DNA sequence. A DNA sequence is expressed in or by a cell to form an "expression product" such as a protein. The expression product itself, e.g. the resulting protein, may also be said to be "expressed" by the cell. A polynucleotide or polypeptide is expressed recombinantly, for example, when it is expressed or produced in a foreign host cell under the control of a foreign or native promoter, or in a native host cell under the control of a foreign promoter.

Polynucleotides provided herein can be incorporated into any one of a variety of expression vectors suitable for expressing a polypeptide. Suitable vectors include chromosomal, nonchromosomal and synthetic DNA sequences, e.g., derivatives of SV40; bacterial plasmids; phage DNA; baculovirus; yeast plasmids; vectors derived from combinations of plasmids and phage DNA, viral DNA such as vaccinia, adenovirus, fowlpox virus, pseudorabies, adenovirus, adeno-associated viruses, retroviruses and many others. Any vector that transduces genetic material into a cell, and, if replication is desired, which is replicable and viable in the relevant host can be used.

Vectors can be employed to transform an appropriate host to permit the host to express a protein or polypeptide. Examples of appropriate expression hosts include: bacterial cells, such as E. coli, B. subtilis, Streptomyces, and Salmonella typhimurium; fungal cells, such as Saccharomyces cerevisiae, Pichia pastoris, and Neurospora crassa; insect cells such as Drosophila and Spodoptera frugiperda; mammalian cells such as CHO, COS, BHK, HEK 293 or Bowes melanoma; or plant cells or explants, etc.

In bacterial systems, a number of expression vectors may be selected, depending upon the use intended for the oxidizing polypeptide. For example, such vectors include, but are not limited to, multifunctional E. coli cloning and expression vectors such as BLUESCRIPT (Stratagene), in which the oxidizing agent-encoding sequence may be ligated into the vector in-frame with sequences for the amino-terminal Met and the subsequent 7 residues of beta-galactosidase so that a hybrid protein is produced; pIN vectors; pET vectors; and the like.

Similarly, in the yeast Saccharomyces cerevisiae a number of vectors containing constitutive or inducible promoters such as alpha factor, alcohol oxidase and PGH may be used for production of the oxidizing agent.

In some embodiments, the activation of the target site in an organic molecule by an oxidizing agent can be performed using an immobilized oxidizing agent. Immobilization of the oxidizing agent can be carried out through covalent attachment or physical adsorption to a support, entrapment in a matrix, encapsulation, cross-linking of oxidizing agent crystals or aggregates, and the like. Several immobilization techniques are known (Bornscheuer 2003; Cao 2005). The type of immobilization and matrix that preserves activity often depends on the nature and physical-chemical properties of the oxidizing agent.

In any of the above mentioned embodiments, the oxygen-containing functional group introduced on a target site of any of the above molecules is then replaced by fluorine.

In some embodiments, the fluorination is performed by deoxofluorination of the oxygenated organic molecule.

The terms "deoxofluorination" and "deoxofluorination reaction" as used herein refer to a chemical reaction where an oxygen-containing chemical unit is replaced with fluorine. Accordingly, the terms "deoxofluorinating agent" and "deoxofluorination agent" as used herein refer to a chemical agent that is able to carry out a deoxofluorination reaction. The term "reagent" as used herein is equivalent to the term "agent".

In some embodiments, the fluorination can be performed by ring-opening fluorination of the oxygenated organic molecule.

The terms "ring-opening fluorination" and "ring-opening fluorination reaction" as used herein refer to a chemical reaction where an epoxide is reacted with a nucleophile, specifically fluoride (F⁻), to afford a fluorohydrin (-CFR-C(OH)R-) or a vicinal difluoride (-CRF-CRF-) containing derivative. Accordingly, the terms "ring-opening fluorination agent" and "ring-opening fluorinating agent" as used herein refer to a chemical agent that is able to carry out a ring-opening fluorination reaction.

In particular, the deoxofluorination reaction can be performed using commercially available deoxofluorinating agents such as sulfur tetrafluoride (SF4), DAST (diethylaminosulfur trifluoride, (Middleton 1975), U.S. Pat. No. 3,914,265; U.S. Pat. No. 3,976,691), Deoxo-Fluor (bis-(2-methoxyethyl)-aminosulfur trifluoride, (Lal, Pez et al. 1999), U.S. Pat. No. 6,222,064), DFI (2,2-difluoro-1,3-dimethylimidazolidine, (Hayashi, Sonoda et al. 2002), U.S. Pat. No. 6,632,949), or analogues and derivatives thereof. Other deoxofluorinating agents include XeF, SiF, and SeF. The deoxofluorination reaction can be performed in the presence or in the absence of additional chemical agents that facilitate or enable the deoxofluorination to occur. These additional agents include but are not limited to hydrogen fluoride (HF), Lewis acids, fluoride salts (e.g. CsF, KF, NaF, LiF, BF), crown-ethers, ionic liquids and the like.

In particular, the ring-opening fluorination reaction can be performed using nucleophilic fluoride-containing agents including without limitation metal fluorides (e.g. CsF, KF, NaF, LiF, AgF, BF), potassium hydrogen difluoride (KHF2), BuNHF, R.N.nHF, BuNF.nHF, Py.9HF (Olah's reagent), and the like. The ring-opening fluorination reaction can be performed in the presence or in the absence of additional chemical agents that facilitate or enable the ring-opening fluorination to occur. These additional agents include but are not limited to hydrogen fluoride (HF), Lewis acids, fluoride salts (e.g. CsF, KF, NaF, LiF), crown-ethers, ionic liquids and the like.

Exemplary fluorinations of an organic molecule containing an oxygen-containing group include but are not limited to conversion of a hydroxyl group to a fluoride, a carboxylic acid group to a carbonyl fluoride, an aldehyde group to a gem-difluoride, a keto group to a gem-difluoride, an epoxide group to a fluorohydrin (also called vic-fluoro-alcohol), and an epoxide group to a vic-difluoride.

Exemplary products produced by methods and systems disclosed herein comprise fluorinated derivatives of organic molecules which include 2-aryl-acetate esters, dihydrojasmone, menthofuran, guaiol, permethylated mannopyranoside, methyl 2-(4'-(2"-methylpropyl)phenyl)propanoate and a 5-phenyl-2-oxazoline.

Specifically, the methods and systems disclosed herein have been applied to produce methyl 2-fluoro-2-phenylacetate, ethyl 2-fluoro-2-phenylacetate, propyl 2-(3-chlorophenyl)-2-fluoroacetate, propyl 2-fluoro-o-tolylacetate, and propyl 2-fluoro-p-tolylacetate starting from corresponding 2-aryl-acetate esters; 4-fluoro-3-methyl-2-pentylcyclopent-2-enone, 4,4-difluoro-3-methyl-2-pentylcyclopent-2-enone, and 3-(fluoromethyl)-2-pentylcyclopent-2-enone, starting from dihydrojasmone; methyl 2-(4'-(1"-fluoro-2"-methylpropyl)phenyl)propanoate and methyl 2-(4'-(2"-fluoro-2"-methylpropyl)phenyl)propanoate, starting from methyl 2-(4'-(2"-methylpropyl)phenyl)propanoate; 6-fluoro-menthofuran-2-ol from menthofuran; 2-((3S,5S,8S)-4-fluoro-3,8-dimethyl-1,2,3,4,5,6,7,8-octahydroazulen-5-yl)propan-2-ol from (-)-guaiol; 6-fluoro-6-deoxy-1,2,3,4-tetramethyl-mannopyranoside starting from 1,2,3,4,6-pentamethyl-mannopyranoside; (4R,5S)-4-(fluoromethyl)-2-methyl-5-phenyl-4,5-dihydrooxazole, starting from (4S,5S)-4-(methoxymethyl)-2-methyl-5-phenyl-4,5-dihydrooxazole.

More specifically, the methods and systems disclosed herein have been applied to fluorinate a target site, namely a C carbon atom, in a highly regioselective manner despite the presence of other similar moieties in the molecule, as in the case of 1,2,3,4,6-pentamethyl-mannopyranoside.

Even more specifically, the methods and systems disclosed herein have been applied to fluorinate target organic molecules, namely 2-aryl-acetate esters, in a highly stereoselective manner, leading to the formation of the (R)-fluoro enantiomer in considerable excess over the (S)-fluoro enantiomer.

The above mentioned fluorinated products are or can be associated with a biological activity or can be used for the synthesis of chemical compounds that are or can be associated with a biological activity.

2-fluoro-2-phenylacetate derivatives find potential applications in the synthesis of prodrugs, in particular in the preparation of ester-type anticancer prodrugs with different susceptibility to hydrolysis, which can be useful in selective targeting of cancer cells (Yamazaki, Yusa et al. 1996). 2-(4-(2"-Methylpropyl)phenyl)propionate, also known as ibuprofen, is a marketed drug of the class of non-steroidal anti-inflammatory drugs (NSAIDs). This drug has ample application in the treatment of arthritis, primary dysmenorrhoea, and fever, and as an analgesic, especially in the presence of an inflammation process. Ibuprofen exerts its analgesic, antipyretic, and anti-inflammatory activity through inhibition of cyclooxygenase-2 (COX-2), thus inhibiting prostaglandin synthesis. More recently, ibuprofen was found to be useful in the prophylaxis of Alzheimer's disease (AD) (Townsend and Pratico 2005). The anti-AD activity of ibuprofen is presumably due to its ability to lower the levels of amyloid-beta (Abeta) peptides, in particular the longer, highly amyloidogenic isoform Abeta 42, which are believed to be the central disease-causing agents in Alzheimer's disease (AD). There is therefore a growing interest towards the discovery of Abeta 42-lowering compounds with improved potency and brain permeability (Leuchtenberger, Beher et al. 2006). Unlike other NSAIDs, ibuprofen was also found to be useful in protection against Parkinson's disease, although the underlying mechanism is not yet known (Casper, Yaparpalvi et al. 2000).

Dihydrojasmone incorporates a cyclopentenone structural unit. The cyclopentanone and cyclopentenone scaffolds are present in a wide range of important natural products such as jasmonoids, cyclopentanoid antibiotics, and prostaglandins. This type of compound has a broad spectrum of biological activities and important applications in medicinal chemistry as well as in the perfume and cosmetic industry, and agriculture. Despite their relatively simple structures, the synthesis of these scaffolds is not trivial (Mikolajczyk, Mikina et al. 1999). Therefore, novel routes for functionalization (and specifically in the context of the disclosure, fluorination) of these scaffolds and compounds incorporating these scaffolds would be highly desirable.

a number of drugs with biologically relevant properties, such as antifungal, antibacterial and anti-inflammatory activities (Knight 1994; De Souza 2005). Many methods are available for their synthesis. However, strategies for post-synthetic functionalization (and specifically in the context of the disclosure, fluorination) of these scaffolds and compounds incorporating these scaffolds would be highly desirable.

Embodiments wherein methods for selective fluorination of protected hydroxyl groups in the form of R-O-CHRR are performed, where the resulting product is R-F, are expected to expand our current synthetic capabilities and facilitate the synthesis of fluorinated compounds that bear multiple hydroxyl functional groups as well as the synthesis of compounds that incorporate chemical units or structural features that are incompatible with the currently available methods for protection/deprotection of hydroxyl groups (Green and Wuts 1999). The protection of hydroxyl groups with alkyl groups different from methoxymethyl (MOM), tetrahydropyranyl (THP), allyl, and benzyl (Bn) is rarely used in practice, if ever, due to the requirement of harsh chemical reagents and conditions for their removal (e.g. strong Lewis acids in the case of a methoxy group). These chemical reagents are poorly chemoselective, reacting with any nucleophilic group of the molecule. Chemical methods for regioselective substitution, and more specifically fluorination, of a single protected hydroxyl functional group in the presence of multiple identically protected hydroxyl groups are not available.

In some embodiments, activation and fluorination of the organic molecules can be performed as follows.

The activation reaction can be carried out in aqueous solvent containing variable amounts of organic solvents to facilitate dissolution of the organic molecule in the mixture. The co-solvents include but are not limited to alcohols, acetonitrile, dimethyl sulfoxide, dimethylformamide, and acetone. The one or more oxidizing agents can be present free in solution or inside a cell where their expression has been achieved using a plasmid vector or other strategies as described earlier. The reaction can be carried out in batch, semi-continuously or continuously, in air or using devices to flow air or oxygen through the solution, at autogenous pressure or higher. The reaction temperature will generally be in the range of 0° C. to 100° C., depending on the nature and stability of the biocatalysts and substrates, preferably in the range of about 4° C. to 30° C. The amount of biocatalyst is generally in the range of about 0.01 mole % to 10 mole %, preferably in the range of about 0.05 mole % to 1 mole %. The cofactor (NADPH) can be added directly, regenerated using an enzyme-coupled system (typically dehydrogenase-based), or provided by the host cell. Reducing equivalents to the biocatalysts can be provided through the use of an electrode or chemical reagents. Superoxide dismutase, catalase or other
reactive oxygen species-scavenging agents, can be used to Guaiol is a sesquiterpene alcohol having the guaiane skel prevent biocatalyst inactivation and improve the yields of the eton, found in many medicinal plants. The essential oils of 55 activation reaction. Glycerol, bovine serum albumine or other Salvia lanigera and Helitta longifoliata, which both contain stabilizing agents can be used to prevent biocatalyst aggrega guaiol as a major component, were found to possess pro tion and improve the yields of the activation reaction. nounced antibacterial activity (De-Moura, Simionatto et al. After the activation reaction, the activated products may or 2002). Structural modification of naturally-occurring bioac may not be isolated through any of the following methods or tive Substances by conventional chemical methods is very 60 combination thereof extraction, distillation, precipitation, difficult and often not feasible. Accessible methods to pro Sublimation, chromatography, crystallization with optional duce derivatives of these natural products (and specifically in seeding and/or co-crystallization aids. the context of this disclosure, fluorinated derivatives) would The activated products are then contacted with the fluori be highly desirable. nating agent in the presence or the absence of an organic Furans and 2-(5H)-furanones are attractive building blocks 65 solvent underinert atmosphere. The activated products can be being present in a large number of natural products that dis reacted in the form of isolated compound, purified com play a wide range of biological activities, and being present in pound, partially-purified mixtures or crude mixtures. No par US 8,026,085 B2 39 40 ticular restriction is imposed upon the solvent of the reaction An additional advantage of the methods and systems dis as long as the solvent does not react with the fluorination closed herein is the possibility to substitute protected reagent, enzymatic product, or reaction product. 
hydroxyl groups in the form of R O CHRR for fluo Solvents that can be used in the fluorination reaction rine. A further advantage is that the substitution of protected include, but are not restricted to, dichloromethane, pyridine, 5 hydroxyl group for fluorine can be carried out under mild acetonitrile, chloroform, ethylene dichloride, 1,2-dimethoxy conditions (room temperature and pressure), with limited use ethane, diethylene glycol dimethyl ether, N-methylpyrroli of hazardous chemical and toxic solvents, in a chemoselective done, dimethylformamide, and 1,3-dimethyl-2-imidazolidi and regioselective manner. none, preferably dichloromethane or pyridine. The reaction Classes of molecules that can be potentially obtained using 10 the methods and systems disclosed herein include but are not temperature will generally be in the range of -80°C. to 150° limited to C-fluoro acid derivatives, fluoro-alkyl derivatives, C., preferably in the range of about -78° C. and 30°C. The fluoro-allyl derivatives, fluorohydrins, vic- and gem-difluo amount of the fluorination reagent is preferably 1 equivalent ride derivatives. or more for oxygenatom introduced in the molecular scaffold Classes of molecules that can be potentially obtained in of the organic molecule during the enzymatic reaction. After 15 enantiopure form using the methods and systems disclosed completion of the reaction, the fluorinated products are iso herein include but are not limited to C-fluoro acid derivatives, lated through any of the following methods or combination fluoro-alkyl derivatives, fluoro-allyl derivatives, and fluoro thereof: extraction, distillation, precipitation, Sublimation, hydrins. chromatography, crystallization with optional seeding and/or In general, the methods and systems disclosed herein, in co-crystallization aids. 
contrast to previously known synthetic methods, provide a An advantage of the methods and systems is the possibility simple, environmentally benign, two-step procedure for to perform fluorination of predetermined target sites in a regio- and stereospecific incorporation of fluorine in a wide candidate organic molecule. A further advantage is that Sub variety of organic compounds both at reactive and non-reac jecting the activated product or the fluorinated derivative to tive sites of their molecular scaffold. Particularly, it will be the action of the same oxidizing agent used for its preparation 25 appreciated that methods and systems disclosed herein pro or another oxidizing agent, polyfluorination of the molecule cedure gives access to organofluorine derivatives, whose at the same or another predetermined target site can be preparation through alternative routes would require many achieved. A further advantage is that the mono- and/or poly more synthetic steps and much higher amounts of toxic fluorination of predetermined target sites in a candidate reagents and organic solvents. organic molecule can be carried out under mild conditions 30 Accordingly, the methods and systems disclosed herein (room temperature and pressure), with limited use of hazard have utility in the field of organic chemistry for preparation of ous chemical and toxic solvents, in a chemoselective, regi fluorinated building blocks and in medicinal chemistry for the oselective, and stereoselective manner. preparation or discovery of fluorinated derivatives of drugs, An additional advantage of the methods and systems dis drug-like molecules, drug precursors, and chemical building closed herein is the possibility to carry out fluorination of 35 blocks with altered or improved physical, chemical, pharma nonreactive sites of a candidate organic molecule, that is sites cokinetic, or pharmacological properties. 
that would could not be easily functionalized using chemical In particular, in some embodiments of the methods and reagents or would react only after or concurrently to other, systems disclosed herein, the organic molecules are pre-se more reactive sites of the molecule. lected among molecules of interest, such as drugs, drug pre A further advantage of the methods and systems disclosed 40 cursors, lead compounds, and synthetic building blocks. The herein is the possibility to produce fluorinated derivatives of term "drug” as used herein refer to a synthetic or non-syn candidate organic molecules with an established or poten thetic chemical entity with established biological and/or tially relevant biological activity in only two steps. This pharmacological activity, which is used to treat a disease, cure “post-synthetic' transformation represents a considerable a dysfunction, or alter in some way a physiological or non advantage compared to synthesis of the same derivative or 45 physiological function of a living organism. Lists of drugs can derivatives starting from fluorine-containing building blocks be easily found in online databases Such as www.access which may or may not be available, thus requiring numerous data.fda.gov, www.drugs.com, www.rxlist.com, and the like. additional synthetic steps. For example, the described methyl The term “drug precursor as used herein refers to a synthetic 2-(4'-(1"-fluoro-2'-methylpropyl)phenyl)propanoate, or non-synthetic chemical entity which can be converted into methyl 2-(4'-(2"-fluoro-2'-methylpropyl)phenyl)pro- 50 a drug through a chemical or biochemical transformation. panoate, and methyl 2-(4'-(1", 1"-difluoro-2'-methylpropyl) The conversion of a drug precursor into a drug can also occur phenyl)propanoate prepared according to the methods and after administration, in which case the drug precursor is typi systems disclosed herein could be conceivably synthesized cally referred to as “prodrug’. 
Accordingly, any synthetic or using (1-fluoro-2-methylpropyl)benzyl. (2-fluoro-2-methyl semi-synthetic intermediate in the preparation of a drug can propyl)benzyl, (1,1-difluoro-2-methylpropyl)benzyl deriva- 55 be considered a drug precursor. The term “lead compound as tives, which however are not commercially available and used herein refers to a synthetic or non-synthetic chemical therefore need to be prepared from raw material through entity that has pharmacological or biological activity and several chemical steps. whose chemical structure is used as a starting point for chemi A further advantage of the methods and systems disclosed cal modifications in order to improve potency, selectivity, or herein is the possibility to produce fluorinated derivatives of 60 pharmacokinetic parameters. Lead compounds are often a candidate organic molecule at a preparative scale, obtaining found in high-throughput Screenings ("hits’) or are secondary from a minimum of 10 up to hundreds milligrams of the final metabolites from natural sources. Reports on the discovery fluorinated product with overall yields (after isolation) of up and/or identification of lead compounds for various applica to 80%. These quantities and yields enable the evaluation of tions are widespread in the Scientific literature and in particu the biological, pharmacological, and pharmacokinetic prop- 65 lar in specialized journals such as Journal of medicinal chem erties of said products as well as their use in further synthesis istry, Bioorganic & medicinal chemistry, Current medicinal of more complex molecules. chemistry, Current topics in medicinal chemistry, European US 8,026,085 B2 41 42 Journal of Medicinal Chemistry, Mini reviews in medicinal NK1 antagonists (Swain and Rupniak 1999), 5HT agonists chemistry, and the like. The term “synthetic building blocks' (van Niel, Collins et al. 1999), and PTB1B antagonists as used herein refer to any synthetic or non-synthetic chemi (Burke, Ye et al. 1996). 
cal entity that is used for the preparation of a structurally more Accordingly, using the methods and system disclosed complex molecule. herein, production of various oxygenated/fluorinated prod Upon fluorination of the target site of the pre-selected ucts can be expected starting from a given drug or a drug-like molecule, the fluorinated organic molecules produced can be molecule, for example a lead compound identified in a drug further used in the synthesis of more complex molecules, or, discovery program. in addition, or in alternative, being tested for biological activi In an embodiment of the methods and systems, an array of ties. 10 oxygenases (P450 monooxygenases, non-heme iron In particular, in any embodiment, wherein identification of monooxygenases, dioxygenases and peroxygenases) can be an organic molecule having a predetermined biological activ used to produce various mono- and poly-oxygenated com ity is desired, the methods and systems disclosed herein fur pounds. Some of these products can be isolated and Subjected ther comprise testing the fluorinated organic molecule for the to fluorination, e.g. deoxo-fluorination, where all or a Subset desired biological activity. Testing can in particular be per 15 of the introduced oxygen-containing functional groups are formed by screening the products of the reaction by the meth substituted for fluorine. The resulting products can then be ods and systems illustrated in FIG. 4 in form of mixture or as separated and tested for improved biological properties. isolated compound for altered or improved metabolic stabil ity, biological activity, pharmacological potency, and phar EXAMPLES macokinetic properties. The wording “biological activity” as used herein refers to The present disclosure is further illustrated in the following any activity that can affect the status of a biological molecule examples, which are provided by way of illustration and are or biological entity. 
A biological molecule can be a protein or not intended to be limiting. a polynucleotide. Abiological entity can be a cell, an organ, or The following experiments have been carried out to per a living organism. The wording “pharmacological activity” as 25 form chemo-enzymatic fluorination approach according to used herein refers to any activity that can affect and, generally embodiments of the methods and systems disclosed herein. but not necessarily, improve the status of a living organism. First, a set of organic molecules has been selected, from In embodiments where identification of a molecule having which potentially useful fluorinated products can be pharmacological activity is desired, use of P450 as oxidizing obtained. agents is particularly preferred, since Phase I drug metabo 30 These compounds include: (a) 2-aryl acetic acid deriva lism in humans is mainly dependent on P450s. In this con tives, as demonstrative examples of useful synthetic blocks, nection, one clear advantage of the methods and systems for example in the preparation of prodrugs with different disclosed herein is that they allow for protection through susceptibility to hydrolysis. With the systems and methods fluorination of sites in the molecule that are sensitive to P450 disclosed herein, stereoselective fluorination of the alpha hydroxylation attack. 
35 position of these target molecules was achieved, affording A further advantage of the methods and systems disclosed 2-fluoro-2-aryl acetic acid derivatives in considerable enan herein for the identification of a molecule having biological tiomeric excess; (b) ibuprofen methyl ester, as demonstrative activity compared to corresponding strategies known in the example of a marketed drug, of which more potent and BBB art for producing fluorinated drugs (which mainly rely on the (blood-brain-barrier)-penetrating derivatives are sought after use of fluorinated building blocks), is that the methods dis 40 for treatment of Alzheimer's and Parkinson's diseases. With closed herein can be carried out post-synthetically. As a con the systems and methods disclosed herein, regioselective sequence, the method disclosed herein can be broadly applied fluorination of weakly reactive sites of this target molecule to produce oxygenated/fluorinated derivatives starting from was achieved, affording various C F derivatives; (c) dihy marketed drugs, drugs in advanced testing phase, lead com drojasmone, menthofuran, and guaiol, as demonstrative pounds, or screening hits. 45 examples of various molecular scaffolds that are present in Additionally, a pre-selection of organic molecules of inter several natural, synthetic, and semisynthetic biologically est and/or related fluorinated products can be made on the active molecules. With the systems and methods disclosed basis of the ability of fluorine atoms to improve dramatically herein, regioselective fluorination of weakly reactive sites of the pharmacological profile of drugs. In particular, this can be these target molecules was achieved, affording various C—F done in view of several studies have shown that potent drugs 50 derivatives; (d) dihydro-4-methoxymethyl-2-methyl-5-phe can be obtained through fluorination of much less active nyl-2-oxazoline, as demonstrative example for chemoselec precursors. 
Anticholesterolemic EZetimib (Clader 2004), tive substitution of methoxygroup for fluorine. With the sys anticancer CF-taxanes (Ojima 2004), fluoro-steroids, and tems and methods herein described, fluorination of the antibacterial fluoroquinolones are only some representative methoxy protected group in the target molecules was examples. The improved pharmacological properties of 55 achieved, affording a demethoxy-fluoro derivative; (e) perm fluoro-containing drugs are due mainly to enhanced meta ethylated mannopyranoside as demonstrative example for bolic stability (Park, Kitteringham et al. 2001). Primary regioselective Substitution of a specific methoxy group for metabolism of drugs in humans generally occurs through fluorine in the presence of several otheridentical groups in the P450-dependent systems, and the introduction of fluorine molecule. With the systems and methods disclosed herein, atoms at or near the sites of metabolic attack has often proven 60 regioselective fluorination of the methoxy protected group in Successful in increasing the half-life of a compound (Bohm, position 6 of the target molecule was achieved, affording a Banner et al. 2004). 6-demethoxy-fluoro derivative. In some cases, the introduction of fluorine Substituents A pool of oxidizing agents, comprising wild-type P450 leads to improvements in the pharmacological properties as a (CYP102A1), variants of wild-type P450 carrying one or result of enhanced binding affinity of the molecule to biologi 65 more mutations at the positions 25, 26, 42, 47, 51, 52, 58, 64. cal receptors. Examples of the effect of fluorine on binding 74, 75, 78,81, 82, 87, 88,90,94, 96,102, 106, 107, 108,118, affinity are provided by recent results in the preparation of 135, 138, 142,143, 145, 152, 172, 173, 175, 178, 180, 181, US 8,026,085 B2 43 44 184, 185, 188, 197, 199, 205, 214, 226, 231, 236, 237, 239, acetate gradient. 
The identity of the purified product was 252, 255, 260, 263,264, 265,267, 268, 273, 274, 275,290, confirmed by GC-MS, HR-MS, H-, 'C-, and 'F-NMR. 295, 306, 324, 328, 354, 366,398, 401, 430, 433, 434, 437, The pool of pre-selected oxidizing agents and other 438,442, 443,444, and 446, and a selection of the most active selected variants from mutagenesis libraries of var3-10 i.e. P450 chimera peroxygenases and monooxygenases from the libraries where positions 74, 82, 87, 88, and 328 position of libraries described in Otey et al. (Otey, Landwehr et al. 2006) var3-10 were subjected to saturation mutagenesis—were and Landwehr et al. (Landwehr, Carbone et al. 2007) were screened for activity towards activation of the pre-selected arrayed on 96-well plates. Arrays were prepared by growing organic molecules dihydro-4-methoxymethyl-2-methyl-5- recombinant E. coli transformed with an expression plasmid phenyl-2-oxazoline (MMPO) and 1,2,3,4,6-pentamethyl encoding for the P450 sequence, inducing protein expression 10 mannopyranoside using a colorimetric assay on a 96-well plate format. In the case of MMPO, for example, different with IPTG, and preparing a cell lysate. oxidizing agents were arrayed on a 96-well plate, each well The activation reaction of the pre-selected organic mol containing about 150 uL phosphate buffer and about 1 uM ecules ibuprofen methyl ester, menthofuran, dihydrolas oxidizing agent. The target molecule was added to the solu mone, and guaiol with the pool of pre-selected oxidizing 15 tion from an ethanol stock to a final concentration of 2 mM agents was tested at a 1-mL scale dissolving the organic (and 1% ethanol). After addition of 1 mM NADPH, the reac molecule in phosphate buffer (1% ethanol) at a final concen tion mixture was incubated for 30 minutes at room tempera tration of 2 mM. The oxidizing agent was then added to the ture. After incubation, MMPO activation activity was deter solution at a final concentration of about 200-400 nM. 
The mined using the colorimetric reagent Purpald (Sigma), which reaction was started by adding NADPH and a glucose-6- reacts with formaldehyde and serves in this case to detect the phosphate dehydrogenase cofactor regeneration system to the demethylation of the methoxy group in the target molecule. mixture. After 20 hrs incubation at room temperature, the Positive hits were re-tested on a 1-mL scale using 1 mM reactions were extracted with chloroformandanalyzed by gas MMPO, 0.5uMoxidizing agent, 1 mM NADPH, and a cofac chromatography. Total conversion ratios were calculated tor regeneration system. After incubation at room tempera including in the experiment a sample containing no enzyme 25 ture, the reaction mixtures were extracted with chloroform and adding an internal standard to the samples. The 20-30% and analyzed by gas chromatography. In this way, the regi most promising oxidizing agents were re-tested at a larger oselectivity and conversion efficiency of each oxidizing agent scale (3 mL) to identify false positives and determine regi was established. The identity of the activated product was oselectivity and product distribution. Exemplary results from also confirmed by GC-MS. For the most promising oxidizing the screening of the pool of P450s on dihydrojasmone and 30 agents, that is those agents which showed the highest regi menthofuran are reported on FIGS. 5 and 6. oselectivity and/or conversion efficiency, were used for scale A group of about 5 to 10 most interesting oxidizing agents up tests and for producing larger quantities of activated prod were then selected based on the results from the re-screen, in uct for the fluorination reaction as described above. particular based on their regioselectivity, conversion effi Representative results from the screening of the P450 pool for ciency, or ability to produce “rare activated product. Using 35 MMPO activation activity are reported on FIG. 7. 
the selected oxidizing agents, conditions for the activation The activation of the target molecule dihydrojasmone was reaction were optimized, testing different co-solvents (e.g. also carried out using a whole-cell system (FIG. 8). Specifi ethanol, ethylacetate), additives (e.g. BSA, glycerol), ROS cally, the whole-cell system consisted of E. coli DHSa cells (Reactive oxygen species) scavengers (e.g. SOD, catalase), transformed with a pCWori vector that contains the sequence temperature, and target molecule: oxidizing agent ratios. 40 for var3. The whole-cell activation reaction was carried out Once optimized, the activation reaction was scaled up to growing a 0.5 L culture of the recombinant cells in TB 100-300 mL reaction scale, where the oxidizing agent con medium, inducing the intracellular expression of var3 during centration typically ranged from 0.5 to 15 uM, the target mid-log phase by adding 0.5 mMIPTG, and growing the cells molecule concentration from 5 to 20 mM, and a cofactor at 30° C. for further 12 hours. After that, 15 mL dodecane regeneration system was used. The co-solvent was usually 45 were added to the culture. Dihydrojasmone was then added to ethanol, typically at a final concentration of 0.5% to 2%. the culture at a final concentration of 30 mM. Formation of the Large scale reactions were incubated under stirring at room activated product and consumption of the target molecule temperature for a period of time of up to 56 hours, during were monitored by gas chromatography for up to 36 hours. which target molecule conversion was monitored by extract Conversion ratio at the end of the 36 hours amounted to ing Small aliquots of the reaction mixture and analyzing them 50 ~10%. Higher conversion ratios (>90-95%) were achieved in by gas chromatography. vitro with the same variant using a cofactor regeneration As the desired amount of activated product was produced, system. 
The lower efficiency of the whole-cell system in the the reaction mixture was extracted with an organic solvent, case of dihydrojasmone may be attributed to potential toxicity typically chloroform, and the activated product was isolated of this molecule or its activated product to the cells as well as by silica gel chromatography using hexane:ethyl acetate gra 55 their low membrane permeability. Nevertheless, this experi dient. Purified products were identified using GC-MS, "H-, ment demonstrates that the activation of the target molecule and 'C-NMR. for the scope of the systems and methods herein described can Once the product with the activated target site was identi also be performed using a whole-cell, especially in cases fied, the activated product was subjected to fluorination using where the chemo-physical properties of the candidate mol the deoxo-fluorinating agent DAST in dichloromethane. Dif 60 ecule may make this option more favorable. ferent reaction conditions were typically tested to optimize Chemical reagents, Substrates and solvents were purchased yield and possibly achieve quantitative conversion. During from Sigma, Aldrich, and Fluka. Silica gel chromatography these tests, the conversion of the activated product to the purifications were carried out using AMD Silica Gel 60 230 corresponding fluorinated derivative was typically monitored 400 mesh. Gas chromatography (GC) analyses were carried by GC-MS. 65 out using a Shimadzu GC-17A gas chromatograph, a FID After the fluorination reaction, the fluorinated product was detector, and an Agilent HP5 column (30 mx0.32 mmx0.1 um isolated by silica gel chromatography using a hexane:ethyl film). Chiral GC analyses were carried out using a Shimadzu US 8,026,085 B2 45 46 GC-17A gas chromatograph, a FID detector, and an Agilent mmol) of activated product was dissolved in 2 mL dry dichlo Cyclosilb column (30 mx0.52 mmx0.25 um film). 
GC-MS romethane (CHCl) and a catalytic amount (4 drops) of analyses were carried out on a Hewlett-Packard 5970B MSD ethanol was added to the solution. The solution was cooled to with 5890 GC and a DB-5 capillary column. 'H, C, and 'F -78° C. (dry ice) and then 41 uL DAST (0.29 mmol) was NMR spectra were recorded on a Varian Mercury 300 spec added. The reaction was stirred in dry ice for 12 hours. The trometer (300 MHz, 75 MHz, and 282 MHz, respectively), reaction mixture was then added with 5 mL saturated sodium and are internally referenced to residual protio Solvent signal. bicarbonate (NaHCO) and extracted with dichloromethane Data for "H NMR are reported in the conventional form: (3x15 mL). The organic phase was then dried over magne chemical shift (8 ppm), multiplicity (s=singlet, d=doublet, sium sulfate (MgSO) and evaporated in vacuo. Purification t-triplet, q quartet, m multiplet, br broad), coupling con 10 of the resulting oil by silica gel chromatography (5% ethyl stant (Hz), integration, and assignment). Data for C NMR acetate: 95% hexane) afforded the fluorinated product ((R)- are reported in the terms of chemical shift (8 ppm). Data for methyl 2-fluoro-2-phenylacetate) (30 mg, 75% yield, pale 'F NMR are reported in the terms of chemical shift (6 ppm) yellow oil) in 74% ee, as determined by chiral GC analysis. and multiplicity. High-resolution mass spectra were obtained 15 H-NMR (300 MHz, CDC1): 83.75 (s.3H, OCH), & 5.77 with a JEOL JMS-600H High Resolution Mass Spectrometer (d. J=48 Hz, 1H, CHF), & 737-7.46 (m, 5H); 'C-NMR (75 at the California Institute of Technology Mass Spectral facil MHz, CDC1): 82.8, 89.5 (d. J=184.5 Hz), 126.8, 126.9, 129.0, 8 129.9, 8 134.4 (d. J=34.5 Hz), 8 169.0. F-NMR (282 MHz, CDC1): 8-180.29 (d. J=48.7 Hz). HRMS (EI+): Example 1 exact mass calculated for CHFO, requires m/z 168.0587. found 168.0594. Stereoselective fluorination of methyl 2-phenyl acetate Example 2

Scheme 1. [Reaction scheme; recoverable conditions: (i) 0.1% mol P450BM3, KPi, pH = 8.0, 45%; (ii) 1.2 eq DAST, CH2Cl2, -78° C., 75%; product in 74% ee.]

Methyl 2-phenyl acetate was subjected to selective fluorination of the target site C (alpha position) according to the systems and methods disclosed herein and, more specifically, according to the general procedure described above.

Experimental description: 90 mg methyl 2-phenyl acetate was dissolved in 500 µL ethanol and added to 240 mL potassium phosphate buffer pH 8.0. P450BM3 was added to the mixture at a final concentration of 2 µM. The mixture was split in 4 mL aliquots into 15 mL scintillation vials equipped with a stir bar. 500 µL of a 5 mM NADPH solution was added to each vial and stirred for 2 minutes. 500 µL of a cofactor regeneration solution containing 300 mM glucose-6-phosphate and 10 units/mL glucose-6-phosphate dehydrogenase were then added to each vial. The resulting mixtures were stirred at room temperature. After 4 hours, the reaction mixtures were joined together and extracted with chloroform (3×100 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (5% ethyl acetate: 95% hexane) afforded the activated product ((S)-methyl 2-hydroxy-2-phenylacetate, 40.5 mg). 40 mg (0.24

Stereoselective fluorination of ethyl 2-phenyl acetate

Scheme 2. [Reaction scheme; recoverable conditions: (i) 0.1% mol WT(F87A), KPi, pH = 8.0; (ii) 1.2 eq DAST, CH2Cl2, -78° C., 78%; product in 93% ee.]

Ethyl 2-phenyl acetate was subjected to selective fluorination of the target site C (alpha position) according to the systems and methods disclosed herein and, more specifically, according to the general procedure described above.

Experimental description: 100 mg ethyl 2-phenyl acetate was dissolved in 500 µL ethanol and added to 250 mL potassium phosphate buffer pH 8.0. WT(F87A) was added to the mixture at a final concentration of 2 µM. The mixture was split in 4 mL aliquots into 15 mL scintillation vials equipped with a stir bar. 500 µL of a 5 mM NADPH solution was added to each vial and stirred for 2 minutes. 500 µL of a cofactor regeneration solution containing 300 mM glucose-6-phosphate and 10 units/mL glucose-6-phosphate dehydrogenase were then added to each vial. The resulting mixtures were stirred at room temperature. After 3 hours, the reaction mixtures were joined together and extracted with chloroform (3×100 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (5% ethyl acetate: 95% hexane) afforded the activated product ((S)-ethyl 2-hydroxy-2-phenylacetate, 66 mg). 66 mg (0.36 mmol) of activated product was dissolved in 2 mL dry dichloromethane (CH2Cl2) and a catalytic amount (4 drops) of ethanol was added to the solution. The solution was cooled to -78° C. (dry ice) and then 61 µL DAST (0.43 mmol) was added. The reaction was stirred in dry ice for 12 hours. The reaction mixture was then added with 5 mL saturated sodium bicarbonate (NaHCO3) and extracted with dichloromethane (3×15 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (5% ethyl acetate: 95% hexane) afforded the fluorinated product ((R)-ethyl 2-fluoro-2-phenylacetate) (51 mg, 78% yield, pale yellow oil) in 93% ee, as determined by chiral GC analysis. 1H-NMR (300 MHz, CDCl3): δ 1.24 (t, J=7.2 Hz, 3H, -CH3), δ 4.16-4.27 (m, 2H, -OCH2), δ 5.75 (d, J=48 Hz, 1H, -CHF), δ 7.37-7.46 (m, 5H); 13C-NMR (75 MHz, CDCl3): δ 14.2, 62.0, 81.2, 89.6 (d, J=184.5 Hz), 126.8, 126.9, 128.9, 129.8, 134.4 (d, J=34.5 Hz), 169.0. 19F-NMR (282 MHz, CDCl3): δ -180.27 (d, J=48.7 Hz). HRMS (EI+): exact mass calculated for C10H11FO2 requires m/z 182.0743, found 182.0750.

Example 3

Stereoselective fluorination of propyl 2-(3-chlorophenyl)acetate

Scheme 3. [Reaction scheme; recoverable conditions: (i) 0.05% mol var3-7, KPi, pH = 8.0, 75%; (ii) 1.5 eq DAST, CH2Cl2, -78° C., 82%; product in 89% ee.]

Propyl 2-(3-chlorophenyl)acetate was subjected to selective fluorination of the target site C (alpha position) according to the systems and methods disclosed herein and, more specifically, according to the general procedure described above.

Experimental description: 95 mg propyl 2-(3-chlorophenyl)acetate was dissolved in 500 µL ethanol and added to 250 mL potassium phosphate buffer pH 8.0. Var3-7 was added to the mixture at a final concentration of 1 µM. The mixture was split in 4 mL aliquots into 15 mL scintillation vials equipped with a stir bar. 500 µL of a 5 mM NADPH solution was added to each vial and stirred for 2 minutes. 500 µL of a cofactor regeneration solution containing 300 mM glucose-6-phosphate and 10 units/mL glucose-6-phosphate dehydrogenase were then added to each vial. The resulting mixtures were stirred at room temperature. After 4 hours, the reaction mixtures were joined together and extracted with chloroform (3×100 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (5% ethyl acetate: 95% hexane) afforded the activated product ((S)-propyl 2-hydroxy-2-(3-chlorophenyl)acetate, 71 mg). 70 mg (0.3 mmol) of activated product was dissolved in 2 mL dry dichloromethane (CH2Cl2) and a catalytic amount (4 drops) of ethanol was added to the solution. The solution was cooled to -78° C. (dry ice) and then 64 µL DAST (0.45 mmol) was added. The reaction was stirred in dry ice for 12 hours. The reaction mixture was then added with 5 mL saturated sodium bicarbonate (NaHCO3) and extracted with dichloromethane (3×15 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (5% ethyl acetate: 95% hexane) afforded the fluorinated product ((R)-propyl 2-fluoro-2-(3-chlorophenyl)acetate) (57 mg, 82% yield, colorless oil) in 89% ee, as determined by chiral GC analysis. 1H-NMR (300 MHz, CDCl3): δ 0.85 (t, J=7 Hz, 3H, -CH3), δ 1.56-1.68 (m, 2H, -CH2), δ 4.12 (t, J=6 Hz, 2H, -OCH2), δ 5.72 (d, J=48 Hz, 1H, -CHF), δ 7.32 (br, 3H), δ 7.44 (br, 1H); 13C-NMR (75 MHz, CDCl3): δ 10.3, 21.9, 67.7, 88.7 (d, J=186.5 Hz), 124.8, 126.9, 129.9, 130.3, 134.9. 19F-NMR (282 MHz, CDCl3): δ -182.8 (d, J=48.7 Hz). HRMS (EI+): exact mass calculated for C11H12ClFO2 requires m/z 230.0510, found 230.0502.
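The quantities quoted in Example 3 can be cross-checked with a few lines of arithmetic. The Python sketch below recomputes the DAST equivalents, the isolated yield of the fluorinated product, and the enantiomer composition implied by the reported ee. The rounded atomic weights, the helper names (molar_mass, equivalents, percent_yield, enantiomer_fractions) and the elemental composition C11H12ClFO2 assigned to the product are assumptions introduced for illustration only; they are not part of the disclosed procedure.

```python
# Illustrative arithmetic only; helper names and atomic weights are assumptions,
# not part of the disclosed procedure.

# Standard atomic weights in g/mol, rounded to three decimals.
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "Cl": 35.45, "F": 18.998, "O": 15.999}


def molar_mass(composition):
    """Return the molar mass (g/mol) for an {element: count} mapping."""
    return sum(ATOMIC_WEIGHT[element] * count for element, count in composition.items())


def equivalents(reagent_mmol, substrate_mmol):
    """Molar equivalents of reagent relative to the limiting substrate."""
    return reagent_mmol / substrate_mmol


def percent_yield(product_mg, product_composition, substrate_mmol):
    """Isolated yield in percent, from product mass and substrate amount."""
    product_mmol = product_mg / molar_mass(product_composition)
    return 100.0 * product_mmol / substrate_mmol


def enantiomer_fractions(ee_percent):
    """Split an enantiomeric excess (%) into (major, minor) mole fractions."""
    major = (1.0 + ee_percent / 100.0) / 2.0
    return major, 1.0 - major


# Quantities quoted in Example 3; the formula is assumed from the product name.
fluoro_product = {"C": 11, "H": 12, "Cl": 1, "F": 1, "O": 2}
dast_eq = equivalents(reagent_mmol=0.45, substrate_mmol=0.30)
yield_pct = percent_yield(product_mg=57.0, product_composition=fluoro_product, substrate_mmol=0.30)
major, minor = enantiomer_fractions(89.0)

print(f"DAST: {dast_eq:.2f} eq")
print(f"Yield: {yield_pct:.0f}%")
print(f"Enantiomer ratio: {major:.3f} : {minor:.3f}")
```

With the Example 3 figures this gives 1.5 equivalents of DAST and a yield of about 82%, in agreement with Scheme 3 and the reported yield; an 89% ee corresponds to roughly a 94.5 : 5.5 mixture of enantiomers.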

Example 4

Stereoselective fluorination of propyl 2-(4-methylphenyl)acetate and propyl 2-(2-methylphenyl)acetate

Scheme 4-1. [Reaction scheme; recoverable conditions: (i) 0.05% mol var3-7, KPi, pH = 8.0, 65.9%; (ii) 1.5 eq DAST, CH2Cl2, -78° C., 83%; product in 87% ee.]

Scheme 4-2. [Reaction scheme; recoverable conditions: (i) 0.05% mol var3-7, KPi, pH = 8.0, 72%; (ii) 1.5 eq DAST, CH2Cl2, -78° C., 88%; product in 85% ee.]

Propyl 2-(4-methylphenyl)acetate and propyl 2-(2-methylphenyl)acetate were subjected to selective fluorination of the target site C (alpha position) according to the systems and methods disclosed herein and, more specifically, according to the general procedure described above.

Experimental description: Stereoselective activation and fluorination of propyl 2-(4-methylphenyl)acetate and propyl 2-(2-methylphenyl)acetate were carried out starting from 100 mg substrate according to the experimental protocol described in Example 3. The fluorinated product (R)-propyl 2-fluoro-2-(4-methylphenyl)acetate was obtained in 87% ee (54 mg, colorless oil). 1H-NMR (300 MHz, CDCl3): δ 0.84-0.91 (m, 3H, -CH3), δ 1.57-1.68 (m, 2H, -CH2), δ 2.37 (s, 3H, -CH3), δ 4.08-4.16 (m, 2H, -OCH2), δ 5.75 (d, J=48 Hz, 1H, -CHF), δ 7.18-7.27 (m, 2H), δ 7.27-7.44 (m, 2H); 13C-NMR (75 MHz, CDCl3): δ 10.40, 19.41, 22.08, 67.47, 126.57, 131.10. 19F-NMR (282 MHz, CDCl3): δ -178.5 (d, J=48.7 Hz). HRMS (EI+): exact mass calculated for C12H15FO2 requires m/z 210.1056, found 210.1062. The fluorinated product (R)-propyl 2-fluoro-2-(2-methylphenyl)acetate was obtained in 87% ee (54 mg, colorless oil). 1H-NMR (300 MHz, CDCl3): δ 0.83 (t, J=7.5 Hz, 3H, -CH3), δ 1.52-1.68 (m, 2H, -CH2), δ 2.43 (s, 3H, -CH3), δ 4.12 (m, 2H, -OCH2), δ 5.96 (d, J=48 Hz, 1H, -CHF), δ 7.16-7.30 (m, 4H); 13C-NMR (75 MHz, CDCl3): δ 10.3, 19.3, 22.0, 29.9, 67.4, 87.4 (d, J=183 Hz), 126.5, 127.5, 129.8, 131.0. 19F-NMR (282 MHz, CDCl3): δ -180.1 (d, J=48.7 Hz). HRMS (EI+): exact mass calculated for C12H15FO2 requires m/z 210.1056, found 210.1070.

Examples 1, 2, 3, and 4 illustrate the application of the systems and methods of the disclosure for stereoselective fluorination of a chemical building block, exemplified by 2-aryl acetic acid derivatives (Schemes 1-4).

Example 5

Regioselective fluorination of 3-methyl-2-pentylcyclopent-2-enone (dihydrojasmone) in position 4

Scheme 5. [Reaction scheme; recoverable conditions: (i) 0.02% mol var2, KPi, pH = 8.0, 85%; (ii) 1.3 eq DAST, CH2Cl2, -78° C., 92%.]

Dihydrojasmone was subjected to selective fluorination of the target site C (position 4) according to the systems and methods disclosed herein and, more specifically, according to the general procedure described above.

Experimental description: 270 µL dihydrojasmone was dissolved in 1.2 mL ethanol and added to 150 mL potassium phosphate buffer pH 8.0. Var2 was added to the mixture at a final concentration of 2 µM. The mixture was split in 4.8 mL aliquots into 15 mL scintillation vials equipped with a stir bar. 600 µL 10 mM NADPH in KPi buffer was added to each vial and stirred for 2 minutes. 600 µL cofactor regeneration solution containing 500 mM glucose-6-phosphate and 10 units/mL glucose-6-phosphate dehydrogenase were then added to each vial. The resulting mixtures were stirred at room temperature. After 36 hours, the reaction mixtures were joined together and extracted with chloroform (3×50 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (0-30% ethyl acetate/hexane) afforded the activated product (4-hydroxy-3-methyl-2-pentylcyclopent-2-enone, 222 mg). 210 mg (1.15 mmol) of activated product was dissolved in 2 mL dry dichloromethane (CH2Cl2) and a catalytic amount (4 drops) of ethanol was added to the solution. The solution was cooled to -78° C. (dry ice) and then 215 µL DAST (1.5 mmol) was added. The reaction was stirred in dry ice for 12 hours. The reaction mixture was then added with 5 mL saturated sodium bicarbonate (NaHCO3) and extracted with dichloromethane (3×15 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (0-30% ethyl acetate/hexane) afforded the fluorinated product, 4-fluoro-3-methyl-2-pentylcyclopent-2-enone (193 mg, 92% yield, yellow oil). 1H-NMR (300 MHz, CDCl3): δ 0.88 (t, J=6.6 Hz, 3H, CH3), δ 1.25-1.40 (m, 6H, CH2), δ 2.10 (d, J=2.1 Hz, 2H, CH2), δ 2.20 (t, J=7.1 Hz, 2H), δ 2.44-2.60 (m, 1H), δ 2.70-2.82 (m, 1H), δ 5.47 (dd, J=54.2 Hz, J=5.8, 1H); 13C-NMR (75 MHz, CDCl3): δ 13.7, 14.2, 22.6, 23.1, 27.9, 29.9, 31.9, 41.4 (d, J=19.6 Hz), 91.2 (d, J=174 Hz); 19F-NMR (282 MHz, CDCl3): δ -179.08 (ddd, J=51.88 Hz, J=21.43 Hz, J=9.3 Hz). HRMS (EI+): exact mass calculated for C11H17FO requires m/z 184.1263, found 184.1255.

Example 6

Regioselective fluorination of 3-methyl-2-pentylcyclopent-2-enone (dihydrojasmone) in position 11

Scheme 6. [Reaction scheme; recoverable conditions: (i) 0.02% mol var3-2, KPi, pH = 8.0; (ii) DAST fluorination of the hydroxylated intermediate.]

Dihydrojasmone was subjected to selective fluorination of the target site C (position 11) according to the systems and methods disclosed herein and, more specifically, according to the general procedure described above.

Experimental description: 240 µL dihydrojasmone was dissolved in 1.1 mL ethanol and added to 130 mL potassium phosphate buffer pH 8.0. Var2 was added to the mixture at a final concentration of 2 µM. The mixture was split in 4.8 mL aliquots into 15 mL scintillation vials equipped with a stir bar. 600 µL 10 mM NADPH in KPi buffer was added to each vial and stirred for 2 minutes. 600 µL cofactor regeneration solution containing 500 mM glucose-6-phosphate and 10 units/mL glucose-6-phosphate dehydrogenase were then added to each vial. The resulting mixtures were stirred at room temperature. After 36 hours, the reaction mixtures were joined together and extracted with chloroform (3×50 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (0-30% ethyl acetate/hexane) afforded the activated product (11-hydroxy-3-methyl-2-pentylcyclopent-2-enone, 35 mg). 30 mg (0.16 mmol) of activated product was dissolved in 2 mL dry dichloromethane (CH2Cl2) and a catalytic amount (4 drops) of ethanol was added to the solution. The solution was cooled to -78° C. (dry ice) and then 35 µL DAST (0.25 mmol) was added. The reaction was stirred in dry ice for 12 hours. The reaction mixture was then added with 5 mL saturated sodium bicarbonate (NaHCO3) and extracted with dichloromethane (3×15 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (0-30% ethyl acetate/hexane) afforded the fluorinated product, 11-fluoro-3-methyl-2-pentylcyclopent-2-enone (27 mg, 89% yield, yellow oil). 1H-NMR (300 MHz, CDCl3): δ 0.88 (t, J=6.6 Hz, 3H, CH3), δ 1.25-1.40 (m, 6H, CH2), δ 2.10 (d, J=2.1 Hz, 2H, CH2), δ 2.17 (t, J=7.6 Hz, 2H), δ 2.38-2.44 (m, 1H), δ 2.59-2.64 (m, 1H), δ 5.20 (d, J=48.8 Hz, 1H); 13C-NMR (75 MHz, CDCl3): δ 14.3, 22.8, 23.3, 28.4, 31.8, 29.9, 31.9, 34.8, 60.6, 80.3 (d, J=164 Hz), 87.9; 19F-NMR (282 MHz, CDCl3): δ -48.80 (d, J=48.7 Hz). HRMS (EI+): exact mass calculated for C11H17FO requires m/z 184.1263, found 184.1263.

Examples 5 and 6 illustrate the application of the systems and methods of the disclosure for regioselective fluorination of an organic molecule at weakly reactive sites, exemplified by dihydrojasmone (Schemes 5 and 6).

Example 7

Regioselective difluorination of 3-methyl-2-pentylcyclopent-2-enone (dihydrojasmone) in position 4

Scheme 7. [Reaction scheme; recoverable conditions: (i) 0.02% mol var2, KPi, pH = 8.0, 75%; (ii) 1.3 eq DAST, CH2Cl2, -78° C., 85%.]

Dihydrojasmone was subjected to selective difluorination of the target site C (position 4) according to the systems and methods disclosed herein and, more specifically, according to the general procedure described above.

Experimental description: 4-fluoro-3-methyl-2-pentylcyclopent-2-enone was obtained according to the experimental protocol described in Example 5. 180 mg 4-fluoro-3-methyl-2-pentylcyclopent-2-enone was dissolved in 900 µL ethanol and added to 120 mL potassium phosphate buffer pH 8.0. Var2 was added to the mixture at a final concentration of 2 µM. The mixture was split in 4.8 mL aliquots into 15 mL scintillation vials equipped with a stir bar. 600 µL 10 mM NADPH in KPi buffer was added to each vial and stirred for 2 minutes. 600 µL cofactor regeneration solution containing 500 mM glucose-6-phosphate and 10 units/mL glucose-6-phosphate dehydrogenase were then added to each vial. The resulting mixtures were stirred at room temperature. After 36 hours, the reaction mixtures were joined together and extracted with chloroform (3×50 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (0-30% ethyl acetate/hexane) afforded the activated product (4-hydroxy-4-fluoro-3-methyl-2-pentylcyclopent-2-enone, 135 mg). 100 mg (0.54 mmol) of activated product was dissolved in 2 mL dry dichloromethane (CH2Cl2) and a catalytic amount (4 drops) of ethanol was added to the solution. The solution was cooled to -78° C. (dry ice) and then 100 µL DAST (0.7 mmol) was added. The reaction was stirred in dry ice for 12 hours. The reaction mixture was then added with 5 mL saturated sodium bicarbonate (NaHCO3) and extracted with dichloromethane (3×15 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (0-30% ethyl acetate/hexane) afforded the fluorinated product, 4,4-difluoro-3-methyl-2-pentylcyclopent-2-enone (85 mg, 85% yield, yellow oil). MS (EI+): m/z 202. Mw for C11H16F2O: 202.24.

Example 7 illustrates the application of the systems and methods of the disclosure for regioselective polyfluorination of an organic molecule at a weakly reactive site, exemplified by dihydrojasmone (Scheme 7).

Example 8

Regioselective fluorination of menthofuran

Scheme 8. [Reaction scheme; recoverable conditions: (i) 0.01% mol var3-11, KPi, pH = 8.0, 47%; (ii) 3 eq DAST, CH2Cl2, -78° C., 22%.]

Menthofuran was subjected to selective fluorination of the target site C (position 6) according to the systems and methods disclosed herein and, more specifically, according to the general procedure described above.

Experimental description: 112 mg menthofuran was dissolved in 0.6 mL ethanol and added to 125 mL potassium phosphate buffer pH 8.0. Var3-11 was added to the mixture at a final concentration of 0.7 µM. The mixture was split in 4 mL aliquots into 15 mL scintillation vials equipped with a stir bar. 500 µL 10 mM NADPH in KPi buffer was added to each vial and stirred for 2 minutes. 500 µL cofactor regeneration solution containing 500 mM glucose-6-phosphate and 10 units/mL glucose-6-phosphate dehydrogenase were then added to each vial. The resulting mixtures were stirred at room temperature. After 24 hours, the reaction nearly reached completion (95% substrate conversion). The reaction mixtures were joined together and extracted with chloroform (3×50 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. The resulting oil (53 mg) was subjected directly to deoxo-fluorination without purification of the activated product. 53 mg of the activation mixture (~0.32 mmol) were dissolved in 2 mL dry dichloromethane (CH2Cl2) and a catalytic amount (4 drops) of ethanol was added to the solution. The solution was cooled to -78° C. (dry ice) and then 150 µL DAST (1 mmol) was added. The reaction was stirred in dry ice for 16 hours. The reaction mixture was then added with 5 mL saturated sodium bicarbonate (NaHCO3) and extracted with dichloromethane (3×15 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (0-10% ethyl acetate/hexane) afforded the fluorinated product, 6-fluoro-menthofuran-2-ol (12 mg, 22% yield, yellow oil). 1H-NMR (300 MHz, CDCl3): δ 1.13 (d, J=75.6 Hz, 3H, CH3), δ 1.2-1.3 (m, 2H, -CH2-), δ 1.84 (s, 3H, -CH3), δ 1.95-2.4 (dm, 2H, -CH2-), δ 2.4-2.6 (dm, 2H, -CH2-); 13C-NMR (75 MHz, CDCl3): δ 22.7 (d, J=209 Hz), 43.08, 45.36, 91.7 (d, J=215 Hz), 114.9, 127.1. 19F-NMR (282 MHz, CDCl3): δ -114.4 (m). HRMS (EI+): exact mass calculated for C10H13FO2 requires m/z 184.0909, found 184.0899.

Example 8 illustrates the application of the systems and methods of the disclosure for regioselective fluorination of an organic molecule at a weakly reactive site, exemplified by menthofuran (Scheme 8).

Example 9

Regioselective fluorination of (-)-guaiol

Scheme 9. [Reaction scheme; recoverable conditions: (i) 0.05% mol var3-2, KPi, pH = 8.0, 12%; (ii) 2.5 eq DAST, CH2Cl2, -78° C., 45%.]

(-)-Guaiol was subjected to selective fluorination of the target site C (position 6) according to the systems and methods disclosed herein and, more specifically, according to the general procedure described above.

Experimental description: 250 mg guaiol was dissolved in 2 mL ethanol and added to 210 mL potassium phosphate buffer pH 8.0. Var3-2 was added to the mixture at a final concentration of 3 µM. The mixture was split in 4.8 mL aliquots into 15 mL scintillation vials equipped with a stir bar. 600 µL 10 mM NADPH in KPi buffer was added to each vial and stirred for 2 minutes. 600 µL cofactor regeneration solution containing 500 mM glucose-6-phosphate and 10 units/mL glucose-6-phosphate dehydrogenase were then added to each vial. The resulting mixtures were stirred at room temperature. After 48 hours, the reaction mixtures were joined together and extracted with chloroform (3×50 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (0-30% ethyl acetate/hexane) afforded the activated product (6-hydroxy-guaiol, 30 mg, colorless oil). 1H-NMR (300 MHz, CDCl3): δ 1.01 (d, J=6.9 Hz, 3H, -CH3), δ 1.22 (s, 3H, CH3), δ 1.28 (s, 3H, CH3), δ 1.25 (d, J=9 Hz, 3H, CH3), δ 1.42-1.45 (m, 2H), δ 1.685 (bs, 2H), δ 1.74-1.83 (m, 2H), δ 1.95-2.03 (m, 2H), δ 2.15-2.24 (m, 2H), δ 2.54-2.72 (m, 3H), δ 2.97-3.06 (m, 1H), δ 3.67 (d, J=9 Hz, 1H); 13C-NMR (75 MHz, CDCl3): δ 11.20, 19.41, 26.08, 28.32, 31.07, 34.13, 35.33, 38.08, 42.42, 48.01, 72.12, 73.15, 178.94. 15 mg of the activated product (~0.06 mmol) was dissolved in 1 mL dry dichloromethane (CH2Cl2) and a catalytic amount (3 drops) of ethanol was added to the solution. The solution was cooled to -78° C. (dry ice) and then 18 µL DAST (0.12 mmol) was added. The reaction was stirred in dry ice for 16 hours. The reaction mixture was then added with 5 mL saturated sodium bicarbonate (NaHCO3) and extracted with dichloromethane (3×15 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (0-30% ethyl acetate/hexane) afforded the fluorinated product, 6-fluoro-guaiol (7 mg, 45% yield, pale yellow oil). MS (EI+): m/z 242. Mw: 242.35.

Example 9 illustrates the application of the systems and methods of the disclosure for regioselective fluorination of an organic molecule at a weakly reactive site, exemplified by (-)-guaiol (Scheme 9).

Example 10

Regioselective fluorination of ibuprofen methyl ester (methyl 2-(4'-(2"-methylpropyl)phenyl)propanoate) in 1" position

Scheme 10. [Reaction scheme; recoverable conditions: (i) 0.1% mol P450BM3, KPi, pH = 8.0, 49%; (ii) 1.2 eq DAST, CH2Cl2, -78° C., 65%.]

Ibuprofen methyl ester was subjected to selective fluorination of the target site C (position 1") according to the systems and methods disclosed herein and, more specifically, according to the general procedure described above.

Experimental description: 150 mg ibuprofen methyl ester was dissolved in 1.4 mL ethanol and added to 150 mL potassium phosphate buffer pH 8.0. P450BM3 was added to the mixture at a final concentration of 10 µM. The mixture was split in 4 mL aliquots into 15 mL scintillation vials equipped with a stir bar. 500 µL 10 mM NADPH in KPi buffer was added to each vial and stirred for 2 minutes. 500 µL cofactor regeneration solution containing 500 mM glucose-6-phosphate and 10 units/mL glucose-6-phosphate dehydrogenase were then added to each vial. The resulting mixtures were stirred at room temperature. After 48 hours, the reaction mixtures were joined together and extracted with chloroform (3×50 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (5-40% ethyl acetate/hexane) afforded the activated product (methyl 2-(4'-(1"-hydroxy-2"-methylpropyl)phenyl)propanoate, 73 mg). 15 mg (0.06 mmol) of activated product was dissolved in 2 mL dry dichloromethane (CH2Cl2) and a catalytic amount (4 drops) of ethanol was added to the solution. The solution was cooled to -78° C. (dry ice) and then 11 µL DAST (0.072 mmol) was added. The reaction was stirred in dry ice for 16 hours. The reaction mixture was then added with 5 mL saturated sodium bicarbonate (NaHCO3) and extracted with dichloromethane (3×15 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (0-30% ethyl acetate/hexane) afforded the fluorinated product, methyl 2-(4'-(1"-fluoro-2"-methylpropyl)phenyl)propanoate (10 mg, 65% yield, colorless oil). 1H-NMR (300 MHz, CDCl3): δ 0.84 (d, J=6.9 Hz, 3H, CH3), δ 1.01 (d, J=6.9 Hz, 3H, CH3), δ 1.49 (d, J=8.7 Hz, 3H, CH3), δ 2.05-2.08 (m, 1H), δ 3.66 (s, 3H, OCH3), δ 3.73 (q, J=7.5 Hz, 1H), δ 5.07 (dd, J=40.0, J=6.9 Hz, 1H, CHF), δ 7.25 (m, 4H); 13C-NMR (75 MHz, CDCl3): δ 15.5, 17.82 (d), 18.54 (d), 34.48 (d, J=85.7 Hz), 45.37, 52.31, 99.3 (d, J=174 Hz), 175.6; 19F-NMR (282 MHz, CDCl3): δ -179.8 (m). HRMS (EI+): exact mass calculated for C14H19FO2 requires m/z 238.1369, found 238.1367.

Example 11

Regioselective fluorination of ibuprofen methyl ester (methyl 2-(4'-(2"-methylpropyl)phenyl)propanoate) in 2" position

Scheme 11. [Reaction scheme; recoverable conditions: (i) 0.05% mol var3-4, KPi, pH = 8.0, 72%; (ii) 1.2 eq DAST, CH2Cl2, -78° C., 45%.]

Ibuprofen methyl ester was subjected to selective fluorination of the target site C (position 2") according to the systems and methods disclosed herein and, more specifically, according to the general procedure described above.

Experimental description: 150 mg ibuprofen methyl ester was dissolved in 1.4 mL ethanol and added to 150 mL potassium phosphate buffer pH 8.0. Var3-4 was added to the mixture at a final concentration of 3 µM. The mixture was split in 4 mL aliquots into 15 mL scintillation vials equipped with a stir bar. 500 µL 10 mM NADPH in KPi buffer was added to each vial and stirred for 2 minutes. 500 µL cofactor regeneration solution containing 500 mM glucose-6-phosphate and 10 units/mL glucose-6-phosphate dehydrogenase were then added to each vial. The resulting mixtures were stirred at room temperature. After 48 hours, the reaction mixtures were joined together and extracted with chloroform (3×50 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (5-40% ethyl acetate/hexane) afforded the activated product (methyl 2-(4'-(2"-hydroxy-2"-methylpropyl)phenyl)propanoate, 81 mg). 15 mg (0.06 mmol) of activated product were dissolved in 2 mL dry dichloromethane (CH2Cl2) and a catalytic amount (4 drops) of ethanol was added to the solution. The solution was cooled to -78° C. (dry ice) and then 11 µL DAST (0.072 mmol) was added. The reaction was stirred in dry ice for 16 hours. The reaction mixture was then added with 5 mL saturated sodium bicarbonate (NaHCO3) and extracted with dichloromethane (3×15 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (0-30% ethyl acetate/hexane) afforded the fluorinated product, methyl 2-(4'-(2"-fluoro-2"-methylpropyl)phenyl)propanoate (7 mg, 45% yield, colorless oil). 1H-NMR (300 MHz, CDCl3): δ 1.28 (s, 3H, CH3), δ 1.35 (s, 3H, CH3), δ 1.48 (d, J=6.9 Hz, 3H, CH3), δ 2.87 (d, J=20.4 Hz, 2H), δ 3.65 (s, 3H, OCH3), δ 3.70 (q, J=7.15 Hz, 1H), δ 7.17 (m, 4H); 13C-NMR (75 MHz, CDCl3): δ 18.80, 26.83 (d, J=24.2 Hz), 45.24, 47.37 (d, J=22.8 Hz), 52.25, 129.12 (d, J=258.8 Hz), ca. 130, ca. 132, ca. 139, ca. 173; 19F-NMR (282 MHz, CDCl3): δ -137.7 (m). HRMS (EI+): exact mass calculated for C14H19FO2 requires m/z 238.1369, found 238.1370.

Examples 10 and 11 illustrate the application of the systems and methods of the disclosure for regioselective fluorination of an organic molecule at weakly reactive and non-reactive sites, such as positions 1" and 2" of ibuprofen methyl ester (Schemes 10 and 11).

Example 12

Regioselective fluorination of dihydro-4-methoxymethyl-2-methyl-5-phenyl-2-oxazoline

Scheme 12. [Reaction scheme; recoverable conditions: (i) 0.1% mol var3-5, KPi, pH = 8.0, 64%; (ii) 2 eq DAST, CH2Cl2, -20° C., 40%.]

Dihydro-4-methoxymethyl-2-methyl-5-phenyl-2-oxazoline was subjected to selective fluorination of the target site C atom carrying a methoxy group, according to the systems and methods disclosed herein and, more specifically, according to the general procedure described above.

Experimental description: 100 mg dihydro-4-methoxymethyl-2-methyl-5-phenyl-2-oxazoline was dissolved in 1.2 mL ethanol and added to 160 mL potassium phosphate buffer pH 8.0. Var3-5 was added to the mixture at a final concentration of 3 µM. The mixture was split in 4 mL aliquots into 15 mL scintillation vials equipped with a stir bar. 500 µL 10 mM NADPH in KPi buffer was added to each vial and stirred for 2 minutes. 500 µL cofactor regeneration solution containing 500 mM glucose-6-phosphate and 10 units/mL glucose-6-phosphate dehydrogenase were then added to each vial. The resulting mixtures were stirred at room temperature. After 48 hours, the reaction mixtures were joined together and extracted with chloroform (3×50 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (20% ethyl acetate/hexane) afforded the activated product (dihydro-4-hydroxymethyl-2-methyl-5-phenyl-2-oxazoline, 64 mg). 30 mg (0.16 mmol) of activated product were dissolved in 2 mL dry dichloromethane (CH2Cl2) and a catalytic amount (4 drops) of ethanol was added to the solution. The solution was cooled to -78° C. (dry ice) and then 22 µL DAST (0.32 mmol) was added. The reaction was stirred in dry ice for 2 hours and then at -20° C. for 16 hours. The reaction mixture was then added with 5 mL saturated sodium bicarbonate (NaHCO3) and extracted with dichloromethane (3×15 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (20% ethyl acetate/hexane) afforded the fluorinated product, dihydro-4-fluoromethyl-2-methyl-5-phenyl-2-oxazoline (12 mg, 40% yield, colorless oil). 1H-NMR (300 MHz, CDCl3): δ 1.59 (s, 3H, CH3), δ 2.09 (dm, 1H, CH), δ 4.15-4.35 (m, 1H, CH), δ 5.66 (tm, 1H, CH), δ 7.37 (m, 5H, Ph); 19F-NMR (282 MHz, CDCl3): δ -114.14 (m). HRMS (EI+): exact mass calculated for C11H12FNO requires m/z 193.0903, found 193.0917.

In another aspect, Example 12 illustrates the application of the systems and methods of the disclosure for selective fluorination of an organic molecule at a site carrying a protected hydroxyl group, such as in dihydro-4-methoxymethyl-2-methyl-5-phenyl-2-oxazoline (Scheme 12).

Example 13

Regioselective fluorination of 1,2,3,4,6-pentamethyl-α-D-mannopyranoside

Scheme 13. [Reaction scheme; recoverable conditions: (i) 0.2% mol var3-6, KPi, pH = 8.0, 60%; (ii) 6 eq DAST, CH2Cl2, r.t., 30%.]

1,2,3,4,6-Pentamethyl-α-D-mannopyranoside was subjected to regioselective fluorination of the target site C in position 6, according to the systems and methods disclosed herein and, more specifically, according to the general procedure described above.

Experimental description: 50 mg of 1,2,3,4,6-pentamethyl-α-D-mannopyranoside was dissolved in 0.5 mL ethanol and added to 100 mL potassium phosphate buffer pH 8.0. Var3-6 was added to the mixture at a final concentration of 4 µM. The mixture was split in 4 mL aliquots into 15 mL scintillation vials equipped with a stir bar. 500 µL 10 mM NADPH in KPi buffer was added to each vial and stirred for 2 minutes. 500 µL cofactor regeneration solution containing 500 mM glucose-6-phosphate and 10 units/mL glucose-6-phosphate dehydrogenase were then added to each vial. The resulting mixtures were stirred at room temperature. After 36 hours, the reaction mixtures were joined together and extracted with chloroform (3×50 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (10% ethyl acetate/hexane) afforded the activated product (1,2,3,4-tetramethyl-α-D-mannopyranoside, 30 mg). 15 mg (0.1 mmol) of activated product were dissolved in 2 mL dry dichloromethane (CH2Cl2) and a catalytic amount (4 drops) of ethanol was added to the solution. The solution was cooled to -78° C. (dry ice) and then 85 µL DAST (0.6 mmol) was added. The reaction was stirred in dry ice for 2 hours and then at room temperature for 16 hours. The reaction mixture was then added with 5 mL saturated sodium bicarbonate (NaHCO3) and extracted with dichloromethane (3×15 mL). The organic phase was then dried over magnesium sulfate (MgSO4) and evaporated in vacuo. Purification of the resulting oil by silica gel chromatography (10% ethyl acetate/hexane) afforded the fluorinated product, 6-deoxy-6-fluoro-1,2,3,4-tetramethyl-α-D-mannopyranoside (4.5 mg, 30% yield, colorless oil). 1H-NMR (300 MHz, CDCl3): δ 3.35 (s, 3H, OCH3), 3.48 (s, 6H, OCH3), 3.53 (s, 3H, OCH3), 4.0-4.2 (m, 4H), 4.6 (dm, J=47.5 Hz, 2H, CH2F); 13C-NMR (75 MHz, CDCl3): δ 55.24, 57.94, 59.24, 61.02, 71.12 (d, J=18.4 Hz), 73.2, 75.5, 82.49 (d, J=192 Hz), 98.26. 19F-NMR (282 MHz, CDCl3): δ -235.2 (m). ESI-MS: m/z calculated for Mw C10H19FO5: 238.2533, found 238.28.

In another aspect, Example 13 illustrates the application of the systems and methods of the disclosure for regioselective fluorination of an organic molecule at a defined site carrying a protected hydroxyl group in the presence of other identical functional groups, such as in 1,2,3,4,6-pentamethyl-α-D-mannopyranoside (Scheme 13).

The examples set forth above are provided to give those of ordinary skill in the art a complete disclosure and description of how to make and use the embodiments of the methods and systems disclosed herein, and are not intended to limit the scope of the disclosure. Modifications of the above-described modes for carrying out the disclosure that are obvious to persons of skill in the art are intended to be within the scope of the following claims. All patents and publications mentioned in the specification are indicative of the levels of skill of those skilled in the art to which the disclosure pertains. All references cited in this disclosure are incorporated by reference to the same extent as if each reference had been incorporated by reference in its entirety individually.

In summary, a method and system, and in particular a chemo-enzymatic method and system for selectively fluorinating organic molecules on a target site wherein the target site is activated and then fluorinated, are presented together with a method and system for identifying a molecule having a biological activity. In particular, a chemo-enzymatic method for preparation of selectively fluorinated derivatives of organic compounds with diverse molecular structures is presented together with a system for fluorination of an organic molecule and a method for identification of a molecule having a biological activity.

It is to be understood that the embodiments are not limited to particular compositions or biological systems, which can, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting. As used in this specification and the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the content clearly dictates otherwise. The term “plurality” includes two or more referents unless the content clearly dictates otherwise. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the disclosure pertains. Although any methods and materials similar or equivalent to those described herein can be used in the practice for testing of the disclosure(s), specific examples of appropriate materials and methods are described herein.

Unless otherwise indicated, the disclosure is not limited to specific molecular structures, substituents, synthetic methods, reaction conditions, or the like, as such may vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting.

The entire disclosure of each document cited (including patents, patent applications, journal articles, abstracts, laboratory manuals, books, or other disclosures) in the Background, Detailed Description, and Examples is hereby incorporated herein by reference. Further, the hard copy of the sequence listing submitted herewith and the corresponding computer readable form are both incorporated herein by reference in their entireties.

A number of embodiments of the disclosure have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the disclosure. Accordingly, other embodiments are within the scope of the following claims.

REFERENCES

Bass, S., V. Sorrells, et al. (1988). “Mutant Trp repressors with new DNA-binding specificities.” Science 242(4876): 240-5.
Beeson, T. D. and D. W. C. MacMillan (2005). “Enantioselective organocatalytic alpha-fluorination of aldehydes.” Journal of the American Chemical Society 127(24): 8826-8828.
Camarero, J. A. and A. R. Mitchell (2005). “Synthesis of proteins by native chemical ligation using Fmoc-based chemistry.” Protein Pept. Lett. 12(8): 723-8.
Cao, L. (2005). “Immobilised enzymes: science or art?” Curr Opin Chem Biol 9(2): 217-26.
Carter, P. (1986). “Site-directed mutagenesis.” Biochem J 237(1): 1-7.
Casper, D., U. Yaparpalvi, et al. (2000). “Ibuprofen protects dopaminergic neurons against glutamate toxicity in vitro.” Neurosci Lett 289(3): 201-4.
Chambers, R. D., J. Hutchinson, et al. (2000). J. Fluorine Chem. 102: 169.
Chambers, R. D., C. J. Skinner, et al. (1996). J. Chem. Soc., Perkin Trans. 1: 605.
Cirino, P. C. and F. H. Arnold (2003). “A self-sufficient peroxide-driven hydroxylation biocatalyst.” Angewandte Chemie-International Edition 42(28): 3299-3301.
Clader, J. W. (2004). “The discovery of ezetimibe: a view from outside the receptor.” J. Med. Chem. 47(1): 1-9.
Dale, S. J. and I. R. Felix (1996). “Oligonucleotide-directed mutagenesis using an improved phosphorothioate approach.” Methods Mol Biol 57: 55-64.
Davis, F. A. and W. Han (1992). “Diastereoselective Fluorination of Chiral Imide Enolates Using N-Fluoro-O-Benzenedisulfonimide (Nfobs).” Tetrahedron Letters 33(9): 1153-1156.
Davis, F. A., P. Zhou, et al. (1998). “Asymmetric fluorination of enolates with nonracemic N-fluoro-2,10-camphorsultams.” Journal of Organic Chemistry 63(7): 2273-2280.
De-Moura, N. F., E. Simionatto, et al. (2002). “Quinoline Alkaloids, Coumarins and Volatile Constituents of Helietta longifoliata.” Planta Med. 68: 631-634.
De Souza, M. V. N. (2005). “The furan-2(5H)-ones: Recent synthetic methodologies and its application in total synthesis of natural products.” Mini-rev. Org. Chem. 2(2): 139-145.
Denisov, I. G., T. M. Makris, et al. (2005). “Structure and chemistry of cytochrome P450.” Chem. Rev. 105(6): 2253-77.
Eghtedarzadeh, M. K. and S. Henikoff (1986). “Use of oligonucleotides to generate large deletions.” Nucleic Acids Res 14(12): 5115.
Enders, D., M. Potthoff, et al. (1997). “Regio- and enantiose
Blee, E., A. L. Wilcox, et al. (1993).
“Mechanism of Reaction lective synthesis of alpha-fluoroketones by electrophilic of Fatty-Acid Hydroperoxides with Soybean Peroxyge 45 fluorination of alpha-silylketone enolates with N-fluo nase.” Journal of Biological Chemistry 268(3): 1708-1715. robenzosulfonimide' Angewandte Chemie-International Bobbio, C. and V. Gouverneur (2006). “Catalytic asymmetric Edition in English 36(21): 2362-2364. fluorinations.” Org Biomol Chem 4(11): 2065-75. Green, T. W. and P. G. M. Wuts (1999). Protective Groups in Bohm, H. J. D. Banner, et al. (2004). “Fluorine in medicinal Organic Synthesis. New York, Wiley-Interscience. chemistry.” Chembiochem 5(5): 637-43. 50 Grundstrom, T., W. M. Zenke, et al. (1985). “Oligonucle Bornscheuer, U.T. (2003). “Immobilizing enzymes: how to otide-directed mutagenesis by microscale shot-gun gene create more suitable biocatalysts.” Angew. Chem. Int. Ed. synthesis.” Nucleic Acids Res 13(9): 3305-16. Engl. 42: 3336-3337. Hamashima, Y. and M. Sodeoka (2006). “Enantioselective Botstein, D. and D. Shortle (1985). "Strategies and applica fluorination reactions catalyzed by chiral palladium com tions of in vitro mutagenesis.” Science 229(4719): 1193– 55 plexes.” Synlett(10): 1467-1478. 2O1. Hanano, A., M. Burcklen, et al. (2006). “Plant seed peroxy Braxton, S, and J. A. Wells (1991). “The importance of a genase is an original heme-oxygenase with an EF-hand distal hydrogen bonding group in stabilizing the transition calcium binding motif Journal of Biological Chemistry state in subtilisin BPN J Biol Chem 266(18): 11797-800. 281 (44): 33140-33151. Burke, T.R., B. Ye, etal. (1996). “Small molecule interactions 60 Harper, D. B. and D. O'Hagan (1994). “The fluorinated natu with protein-tyrosine phosphatase PTP1B and their use in ral products.” Nat Prod Rep 11(2): 123-33. inhibitor design.” Biochemistry 35(50): 15989-15996. Hayashi, H., H. Sonoda, et al. (2002). “2,2-difluoro-1,3-dim Cahard, D., C. Audouard, et al. (2000). 
“Design, synthesis, ethylimidazolidine (DFI). A new fluorinating agent.” and evaluation of a novel class of enantioselective electro Chemical Communications (15): 1618-1619. philic fluorinating agents: N-fluoro ammonium salts of 65 Hintermann, L. and A. Togni (2000). “Catalytic enantioselec cinchona alkaloids (F-CA-BF).” Organic Letters 2(23): tive fluorination of beta-ketoesters.” Angewandte Chemie 3699-3701. International Edition 39(23): 4359-+. US 8,026,085 B2 63 64 Joo, H., Z. L. Lin, et al. (1999). “Laboratory evolution of Nambiar, K. P. J. Stackhouse, et al. (1984). “Total synthesis peroxide-mediated cytochrome P450 hydroxylation.” and cloning of a gene coding for the ribonuclease S pro Nature 399(6737): 670-673. tein.” Science 223 (4642): 1299-301. Kim, D.Y. and E. J. Park (2002). “Catalytic enantioselective Nyffeler, P.T. S. G Duron, et al. (2005). “Selectfluor: Mecha fluorination of beta-keto esters by phase-transfer catalysis 5 nistic insight and applications.” Angewandte Chemie-In using chiral quaternary ammonium salts.” Organic Letters ternational Edition 44(2): 192-212. 4(4): 545-547. Ojima, I. (2004). “Use of fluorine in the medicinal chemistry Knight, D. W. (1994). Contemporary Organic Synthesis 1: and chemical biology of bioactive compounds—a case 287. study on fluorinated taxane anticancer agents. Chembio Kramer, W. V. Drutsa, et al. (1984). “The gapped duplex 10 chem 5(5): 628-35. DNA approach to oligonucleotide-directed mutation con Otey, C. R., M. Landwehr, et al. (2006). “Structure-Guided struction.” Nucleic Acids Res 12(24): 9441-56. Recombination Creates an Artificial Family of Cyto Kramer, W. and H. J. Fritz (1987). “Oligonucleotide-directed chromes P450. PLoS Biol 4(5): e112. construction of mutations via gapped duplex DNA. Meth 15 Park, B.K., N. R. Kitteringham,etal. (2001). “Metabolism of ods Enzymol 154: 350-67. fluorine-containing drugs.” Annu. Rev. Pharmacol. Toxi Kunkel, T. A.J. D. Roberts, et al. (1987). “Rapid and efficient col. 
41: 443-70. site-specific mutagenesis without phenotypic selection.” Presnell, S. R. and F. E. Cohen (1989). Proc. Natl. Acad. Sci. Methods Enzymol 154: 367-82. U.S.A. 86: 6592. Lal, G. S., G. P. Pez, et al. (1999). “Bis(2-methoxyethyl) Pylypenko, O. and I. Schlichting (2004). “Structural aspects aminosulfur tricloride: a new broad-spectrum deoxofluori of ligand binding to and electron transfer in bacterial and nating agent with enhanced thermal stability.” J. Org. fungal P450s.” Annu. Rev. Biochem. 73:991-1018. Chen. 64: 7048-54. Sakamoto, T. J. M. Joern, et al. (2001). “Laboratory evolu Landwehr, M., M. Carbone, et al. (2007). “Diversification of tion of toluene dioxygenase to accept 4-picoline as a Sub catalytic function in a synthetic family of chimeric cyto 25 strate. Applied and Environmental Microbiology 67(9): chrome p450s. Chem Biol 14(3): 269-78. 3882-+. Leuchtenberger, S., D. Beher, et al. (2006). “Selective modu Sayers, J. R. W. Schmidt, et al. (1988). “5'-3' exonucleases in lation of Abeta42 production in Alzheimer's disease: non phosphorothioate-based oligonucleotide-directed steroidal anti-inflammatory drugs and beyond. Curr mutagenesis.” Nucleic Acids Res 16(3): 791-802. Pharm Des 12(33): 4337-55. 30 Schwarzer, D. and P. A. Cole (2005). “Protein semisynthesis Ling, M. M. and B. H. Robinson (1997). “Approaches to and expressed protein ligation: chasing a protein's tail.” DNA mutagenesis: an overview.” Anal Biochem 254(2): Curr. Opin. Chem. Biol. 9(6): 561-9. 157-78. Shibata, N. T. Ishimaru, et al. (2004). “First enantio-flexible Ma, J. A. and D. Cahard (2004). “Asymmetric fluorination, fluorination reaction using metal-bis(oxazoline) com trifluoromethylation, and perfluoroalkylation reactions.” 35 plexes.” Synlett(10): 1703-1706. Chem Rev 104(12): 6119-46. Shibata, N., E. Suzuki, et al. (2000). A fundamentally new Ma, J. A. and D. Cahard (2004). 
“Copper(II) triflate-bis(ox approach to enantioselective fluorination based on cin aZoline)-catalysed enantioselective electrophilic fluorina chona alkaloid derivatives/selectfluor combination.” Jour tion of beta-ketoesters.’ Tetrahedron-Asymmetry 15(6): nal of the American Chemical Society 122(43): 10728 1007-1011. 40 10729. Mandecki, W. (1986). “Oligonucleotide-directed double Shimizu, Y. Y. Kuruma, et al. (2006). “Cell-free translation strand break repair in plasmids of Escherichia coli: a systems for protein engineering.” FEBS J. 273 (18): 4133 method for site-specific mutagenesis.” Proc Natl Acad Sci 40. USA 83(19): 7177-81. Smith, M. (1985). “In vitro mutagenesis.” Annu Rev Genet. Marigo, M., D. I. Fielenbach, et al. (2005). "Enantioselective 45 19: 423-62. formation of stereogenic carbon-fluorine centers by a Swain, C. and N. M. J. Rupniak (1999). “Progress in the simple catalytic method.' Angewandte Chemie-Interna development of neurokinin antagonists.” Annual Reports tional Edition 44(24): 3703-3706. in Medicinal Chemistry, Vol 3434: 51-60. Matsunaga, I.T. Sumimoto, et al. (2002). “Functional modu Taylor, J. W., W. Schmidt, et al. (1985). “The use of phospho lation of a peroxygenase cytochrome P450: novel insight 50 rothioate-modified DNA in restriction enzyme reactions to into the mechanisms of peroxygenase and peroxidase prepare nicked DNA. Nucleic Acids Res 13(24): 8749-64. enzymes.” Febs Letters 528(1-3):90-94. Taylor, S.D., C. C. Kotoris, et al. (1999). “Recent advances in Matsunaga, I., A. Yamada, et al. (2002). “Enzymatic reaction electrophilic fluorination.” Tetrahedron 55 (43): 12431 of hydrogen peroxide-dependent peroxygenase cyto 12477. chrome P450s: kinetic deuterium isotope effects and 55 Togni, A., A. Mezzetti, et al. (2001). “Developing catalytic analyses by resonance Raman spectroscopy. Biochemistry enantioselective fluorination.” Chimia 55(10): 801-805. 41(6): 1886-92. Townsend, K. P. and D. Pratico (2005). 
“Novel therapeutic Middleton, W.J. (1975). “New Fluorinating Reagents Di opportunities for Alzheimer's disease: focus on nonsteroi alkylaminosulfur Fluorides.” Journal of Organic Chemis dal anti-inflammatory drugs.” FasebJ 19(12): 1592-601. try 40(5): 574-578. 60 van Beilen, J. B. and E.G Funhoff (2007). “Alkane hydroxy Mikolajczyk, M., M. Mikina, et al. (1999). “New phospho lases involved in microbial alkane degradation. Appl nate-mediated Synteses of cyclopentanoids and prostag Microbiol Biotechnol 74(1): 13-21. landins.” Pure Appl. Chem. 71(3): 473-480. van Niel, M. B., I. Collins, et al. (1999). “Fluorination of Nakamaye, K. L. and F. Eckstein (1986) “Inhibition of 3-(3-(piperidin-1-yl)propyl)indoles and 3-(3-(piperazin-1- restriction endonuclease Nici I cleavage by phosphorothio 65 yl)propyl)indoles gives selective human 5-HT1D receptor ate groups and its application to oligonucleotide-directed ligands with improved pharmacokinetic profiles. Journal mutagenesis.” Nucleic Acids Res 14(24): 9679–98. of Medicinal Chemistry 42(12): 2087-2104. US 8,026,085 B2 65 66 Wang, L., J. Xie, et al. (2006). “Expanding the genetic code.” Zoller, M.J. (1992). “New recombinant DNA methodology Annu. Rev. Biophys. Biomol. Struct. 35: 225-249. for protein engineering. Curr Opin Biotechnol 3(4): 348 Wells, J. A., M. Vasser, et al. (1985). “Cassette mutagenesis: 54 an efficient method for generation of multiple mutations at Zoller, M.J. and M. Smith (1983). “Oligonucleotide-directed defined sites.” Gene 34(2-3): 315-23. 5 mutagenesis of DNA fragments cloned into M13 vectors.” Yamazaki, Y. S. Yusa, et al. (1996). “Effect of fluorine sub Methods Enzymol 100: 468-500. stitution of alpha- and beta-hydrogen atoms in ethyl phe Zoller, M.J. and M. Smith (1987). “Oligonucleotide-directed nylacetate and phenylpropionate on their stereoselective mutagenesis: a simple method using two oligonucleotide hydrolysis by cultured cancer cells' J. Fluorine Chem. 
primers and a single-stranded DNA template.” Methods 79(2): 167-171. Enzymol 154: 329-50.

SEQUENCE LISTING

<160> NUMBER OF SEQ ID NOS: 70

<210> SEQ ID NO 1
<211> LENGTH: 10
<212> TYPE: PRT
<213> ORGANISM: Artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Cytochrome P450 signature sequence
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (2)..(2)
<223> OTHER INFORMATION: Xaa can be any amino acid
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (3)..(3)
<223> OTHER INFORMATION: Xaa can be any amino acid
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (5)..(5)
<223> OTHER INFORMATION: Xaa can be any amino acid
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (6)..(6)
<223> OTHER INFORMATION: Xaa can be Histidine or Arginine
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (7)..(7)
<223> OTHER INFORMATION: Xaa can be any amino acid
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (9)..(9)
<223> OTHER INFORMATION: Xaa can be any amino acid

<400> SEQUENCE: 1

Phe Xaa Xaa Gly Xaa Xaa Xaa Cys Xaa Gly
1 5 10
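The signature sequence of SEQ ID NO 1 (Phe, two free positions, Gly, one free position, His or Arg, one free position, Cys, one free position, Gly) can be read as a short pattern over one-letter amino acid codes. The following is a minimal sketch in Python, not part of the disclosure: the names `P450_SIGNATURE` and `find_signature` and the test fragment are hypothetical and chosen only for illustration.

```python
import re

# SEQ ID NO 1 as a regular expression over one-letter codes:
# Phe-Xaa-Xaa-Gly-Xaa-(His|Arg)-Xaa-Cys-Xaa-Gly
# "." stands for "any amino acid"; [HR] is the His-or-Arg position 6.
P450_SIGNATURE = re.compile(r"F..G.[HR].C.G")


def find_signature(seq: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched text) for each match."""
    return [(m.start() + 1, m.group())
            for m in P450_SIGNATURE.finditer(seq.upper())]


# Hypothetical fragment used only to exercise the pattern.
fragment = "AAAFGNGQRACIGQQ"
print(find_signature(fragment))
```

The search is non-overlapping and case-insensitive on input; a sequence with any residue other than His or Arg at position 6 of the motif is not reported.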

<210> SEQ ID NO 2
<211> LENGTH: 1048
<212> TYPE: PRT
<213> ORGANISM: Bacillus megaterium
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (1)..(1048)
<223> OTHER INFORMATION: Cytochrome P450 enzyme

<400> SEQUENCE: 2

Thr Ile Lys Glu Met Pro Gln Pro Lys Thr Phe Gly Glu Leu Lys Asn
1 5 10 15

Leu Pro Leu Leu Asn Thr Asp Lys Pro Val Gln Ala Leu Met Lys Ile
20 25 30

Ala Asp Glu Leu Gly Glu Ile Phe Lys Phe Glu Ala Pro Gly Arg Val
35 40 45

Thr Arg Tyr Leu Ser Ser Gln Arg Leu Ile Lys Glu Ala Asp Glu
50 55 60

Ser Arg Phe Asp Lys Asn Leu Ser Gln Ala Leu Lys Phe Val Arg Asp
65 70 75 80

Phe Ala Gly Asp Gly Leu Phe Thr Ser Trp Thr His Glu Lys Asn Trp
85 90 95

Ala His Asn Ile Leu Leu Pro Ser Phe Ser Gln Gln Ala Met
105 110

Gly Tyr His Ala Met Met Val Asp Ile Ala Val Gln Leu Val Gln
115 120 125

Trp Glu Arg Leu Asn Ala Asp Glu His Ile Glu Val Pro Glu Asp
130 135 140

Met Thr Arg Leu Thr Leu Asp Thr Ile Gly Leu Gly Phe Asn Tyr
145 150 155 160

Arg Phe Asn Ser Phe Arg Asp Gln Pro His Pro Phe Ile Thr Ser
165 170 175

Met Val Arg Ala Leu Glu Ala Met Asn Leu Gln Arg Ala Asn
180 185 190

Pro Asp Asp Pro Ala Asp Glu Asn Arg Gln Phe Gln Glu Asp
195

Ile Lys Val Met Asn Asp Leu Val Asp Ile Ile Ala Asp Arg
210 215

Ala Ser Gly Glu Gln Ser Asp Asp Leu Leu Thr His Met Leu Asn Gly
225 230 235 240

Asp Pro Glu Thr Gly Glu Pro Leu Asp Asp Glu Asn Ile Arg
245 250 255

Gln Ile Ile Thr Phe Leu Ile Ala Gly His Glu Thr Thr Ser Gly Leu
260 265 270

Leu Ser Phe Ala Leu Phe Leu Val Asn Pro His Val Leu Gln
275 285

Ala Ala Glu Glu Ala Ala Arg Val Leu Val Asp Pro Val Pro Ser
290 295 300

Tyr Gln Val Lys Gln Leu Val Gly Met Val Leu Asn Glu
305 310 315

Ala Leu Arg Leu Trp Pro Thr Ala Pro Ala Phe Ser Leu Ala
325 330 335

Glu Asp Thr Val Leu Gly Gly Glu Tyr Pro Leu Glu Gly Asp Glu
340 345 350

Leu Met Val Leu Ile Pro Gln Leu His Arg Asp Thr Ile Trp Gly
355 360 365

Asp Asp Val Glu Glu Phe Arg Pro Glu Arg Phe Glu Asn Pro Ser Ala
370 375

Ile Pro Gln His Ala Phe Pro Phe Gly Asn Gly Gln Arg Ala Cys
385 390 395 400

Ile Gly Gln Gln Phe Ala Leu His Glu Ala Thr Leu Val Leu Gly Met
405 415

Met Leu His Phe Asp Phe Glu Asp His Thr Asn Glu Leu Asp
425 430

Ile Glu Thr Leu Thr Leu Lys Pro Glu Gly Phe Val Val Ala
435 440 445

Ser Ile Pro Leu Gly Gly Ile Pro Ser Pro Ser Thr Glu
450 455 460

Gln Ser Ala Lys Val Arg Ala Glu Asn Ala His Asn Thr
465 470

Pro Leu Leu Val Leu Gly Ser Asn Met Gly Thr Ala Glu Gly Thr
485 490 495

Ala Arg Asp Leu Ala Asp Ile Ala Met Ser Lys Gly Phe Ala Pro Gln
500 505

Val Ala Thr Leu Asp Ser His Ala Gly Asn Leu Pro Arg Glu Gly Ala
515 525

Val Leu Ile Val Thr Ala Ser Asn Gly His Pro Pro Asp Asn Ala
530 535 540

Lys Gln Phe Val Asp Trp Leu Asp Gln Ala Ser Ala Asp Glu Val Lys
545 550 555 560

Gly Val Arg Ser Val Phe Gly Gly Asp Asn Trp Ala Thr
565 570 575

Thr Gln Lys Val Pro Ala Phe Ile Asp Glu Thr Leu Ala Ala
585 590

Gly Ala Glu Asn Ile Ala Asp Arg Gly Glu Ala Asp Ala Ser Asp Asp
595 605

Phe Glu Gly Thr Tyr Glu Glu Trp Arg Glu His Met Trp Ser Asp Val
610 615

Ala Ala Phe Asn Leu Asp Ile Glu Asn Ser Glu Asp Asn Ser
625 630 635 640

Thr Leu Ser Leu Gln Phe Val Asp Ser Ala Ala Asp Met Pro Leu Ala
645 650 655

Met His Gly Ala Phe Ser Thr Asn Val Val Ala Ser Lys Glu Leu
660 665 670

Gln Gln Pro Gly Ser Ala Arg Ser Thr Arg His Leu Glu Ile Glu Leu
675 685

Pro Lys Glu Ala Ser Gln Glu Gly Asp His Leu Gly Val Ile Pro
690 695 700

Arg Asn Glu Gly Ile Val Asn Arg Val Ala Arg Phe Gly Leu
705 710

Asp Ala Ser Gln Gln Ile Arg Leu Glu Ala Glu Glu Leu Ala
725 730 735

His Leu Pro Leu Ala Thr Val Ser Val Glu Leu Leu Gln
740 745 750

Val Glu Leu Gln Asp Val Thr Arg Thr Leu Arg Ala Met Ala
760 765

Ala Lys Thr Val Cys Pro His Val Leu Glu Ala Leu Leu
770 775

Glu Gln Ala Tyr Lys Glu Gln Val Leu Arg Leu Thr Met
790

Leu Glu Leu Leu Glu Pro Ala Cys Met Phe Ser Glu
805 810 815

Phe Ile Ala Leu Leu Ser Ile Arg Pro Arg Ser Ile Ser
825 830

Ser Ser Pro Arg Val Asp Glu Lys Gln Ala Ser Ile Thr Val Ser Val
835 840 845

Val Ser Gly Glu Ala Trp Ser Gly Gly Glu Tyr Gly Ile Ala
850 855 860

Ser Asn Leu Ala Glu Leu Gln Glu Gly Asp Thr Ile Thr Phe
865

Ile Ser Thr Pro Gln Ser Glu Phe Thr Leu Pro Asp Pro Glu Thr
885 890 895

Pro Leu Ile Met Val Gly Pro Gly Thr Gly Val Ala Pro Phe Arg Gly
900 905 910

Phe Val Gln Ala Arg Gln Leu Glu Gln Gly Gln Ser Leu Gly
915 920 925

Glu Ala His Leu Tyr Phe Gly Cys Arg Ser Pro His Glu Asp Tyr Leu
930 935 940

Tyr Gln Glu Glu Leu Glu Asn Ala Gln Ser Glu Gly Ile Ile Thr Leu
945 950 955 960

His Thr Ala Phe Ser Arg Met Pro Asn Gln Pro Lys Thr Tyr Val Gln
965 970 975

His Val Met Glu Gln Asp Gly Lys Lys Leu Ile Glu Leu Leu Asp Gln
980 985 990

Gly Ala His Phe Tyr Ile Cys Gly Asp Gly Ser Gln Met Ala Pro Ala
995 1000 1005

Val Glu Ala Thr Leu Met Lys Ser Tyr Ala Asp Val His Gln Val
1010 1015 1020

Ser Glu Ala Asp Ala Arg Leu Trp Leu Gln Gln Leu Glu Glu Lys
1025 1030 1035

Gly Arg Tyr Ala Lys Asp Val Trp Ala Gly
1040 1045
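For comparison against databases that store one-letter codes, a row of the listing can be converted mechanically. The following is a minimal sketch in Python, not part of the disclosure: the names `THREE_TO_ONE` and `row_to_one_letter` are hypothetical, and the example row is the first row of SEQ ID NO 2 written with standard three-letter abbreviations. Unrecognized tokens raise an error rather than being skipped, on the assumption that a silent skip would hide transcription damage.

```python
# Standard three-letter to one-letter amino acid codes, plus Xaa.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}


def row_to_one_letter(row: str) -> str:
    """Convert one listing row to one-letter codes, skipping position numbers."""
    letters = []
    for token in row.split():
        if token.isdigit():
            continue  # position markers such as 1, 5, 10, 15
        if token not in THREE_TO_ONE:
            raise ValueError(f"unrecognized residue token: {token!r}")
        letters.append(THREE_TO_ONE[token])
    return "".join(letters)


first_row = ("Thr Ile Lys Glu Met Pro Gln Pro Lys Thr Phe Gly Glu Leu Lys Asn "
             "1 5 10 15")
print(row_to_one_letter(first_row))  # TIKEMPQPKTFGELKN
```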

<210> SEQ ID NO 3
<211> LENGTH: 1059
<212> TYPE: PRT
<213> ORGANISM: Bacillus subtilis
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (1)..(1059)
<223> OTHER INFORMATION: Cytochrome P450 enzyme CYP102A2

<400> SEQUENCE: 3

Lys Glu Thr Ser Pro Ile Pro Gln Pro Lys Thr Phe Gly Pro Leu Gly
1 5 10 15

Asn Leu Pro Leu Ile Asp Lys Asp Lys Pro Thr Leu Ser Leu Ile Lys
20 25 30

Leu Ala Glu Glu Gln Gly Pro Ile Phe Gln Ile His Thr Pro Ala Gly
35 40 45

Thr Thr Ile Val Val Ser Gly His Glu Leu Val Lys Glu Val Cys Asp
50 55 60

Glu Glu Arg Phe Asp Lys Ser Ile Glu Gly Ala Leu Glu Lys Val Arg
65 70 75 80

Ala Phe Ser Gly Asp Gly Leu Phe Thr Ser Trp Thr His Glu Pro Asn
85 90 95

Trp Arg Lys Ala His Asn Ile Leu Met Pro Thr Phe Ser Gln Arg Ala
100 105 110

Met Lys Asp Tyr His Glu Lys Met Val Asp Ile Ala Val Gln Leu Ile
115 120 125

Gln Lys Trp Ala Arg Leu Asn Pro Asn Glu Ala Val Asp Val Pro Gly
130 135 140

Asp Met Thr Arg Leu Thr Leu Asp Thr Ile Gly Leu Cys Gly Phe Asn
145 150 155 160

Tyr Arg Phe Asn Ser Tyr Tyr Arg Glu Thr Pro His Pro Phe Ile Asn
165 170 175

Ser Met Val Arg Ala Leu Asp Glu Ala Met His Gln Met Gln Arg Leu
180 185 190

Asp Val Gln Asp Lys Leu Met Val Arg Thr Lys Arg Gln Phe Arg Tyr
195 200 205

Asp Ile Gln Thr Met Phe Ser Leu Val Asp Ser Ile Ile Ala Glu Arg
210 215 220

Arg Ala Asn Gly Asp Gln Asp Glu Lys Asp Leu Leu Ala Arg Met Leu
225 230 235 240

Asn Val Glu Asp Pro Glu Thr Gly Glu Lys Leu Asp Asp Glu Asn Ile
245 250 255

Arg Phe Gln Ile Ile Thr Phe Leu Ile Ala Gly His Glu Thr Thr Ser
260 265 270

Gly Leu Leu Ser Phe Ala Thr Tyr Phe Leu Leu His Pro Asp
275 285

Leu Lys Ala Tyr Glu Glu Val Asp Arg Val Leu Thr Asp Ala Ala
290 295 300

Pro Thr Gln Val Leu Glu Leu Thr Tyr Ile Arg Met Ile Leu
305 310 315

Asn Glu Ser Leu Arg Leu Trp Pro Thr Ala Pro Ala Phe Ser Leu Tyr
325 330 335

Pro Glu Asp Thr Val Ile Gly Gly Lys Phe Pro Ile Thr Thr Asn
340 345 350

Asp Arg Ile Ser Val Leu Ile Pro Gln Leu His Arg Asp Arg Asp Ala
355 360 365

Trp Gly Lys Asp Ala Glu Glu Phe Arg Pro Glu Arg Phe Glu His Gln
370 375

Asp Gln Val Pro His His Ala Pro Phe Gly Asn Gly Gln Arg
385 390 395 400

Ala Ile Gly Met Gln Phe Ala Leu His Glu Ala Thr Leu Val Leu
405 415

Gly Met Ile Leu Phe Thr Leu Ile Asp His Glu Asn Tyr Glu
425 430

Leu Asp Ile Gln Thr Leu Thr Leu Pro Gly Asp Phe His Ile
435 440 445

Ser Val Gln Ser Arg His Gln Glu Ala Ile His Ala Asp Val Gln Ala
450 455 460

Ala Glu Ala Ala Pro Asp Glu Gln Glu Thr Glu Ala Lys
465 470

Gly Ala Ser Val Ile Gly Leu Asn Asn Arg Pro Leu Leu Val Leu Tyr
485 490 495

Gly Ser Asp Thr Gly Thr Ala Glu Gly Val Ala Arg Glu Leu Ala Asp
500 505

Thr Ala Ser Leu His Gly Val Arg Thr Thr Ala Pro Leu Asn Asp
515 525

Arg Ile Gly Leu Pro Lys Glu Gly Ala Val Val Ile Val Thr Ser
530 535 540

Ser Asn Gly Pro Pro Ser Asn Ala Gly Gln Phe Val Gln Trp
545 550 555 560

Leu Gln Glu Ile Lys Pro Gly Glu Leu Glu Gly Val His Ala Val
565 570 575

Phe Gly Gly Asp His Asn Trp Ala Ser Thr Gln Tyr Val Pro
585 590

Arg Phe Ile Asp Glu Gln Leu Ala Glu Gly Ala Thr Arg Phe Ser
595 605

Ala Arg Gly Glu Gly Asp Val Ser Gly Asp Phe Glu Gly Gln Leu Asp
610 615 620

Glu Trp Ser Met Trp Ala Asp Ala Ile Ala Phe Gly Leu
625 630 635 640

Glu Leu Asn Glu Asn Ala Asp Glu Arg Ser Thr Leu Ser Leu Gln
645 650 655

Phe Val Arg Gly Leu Gly Glu Ser Pro Leu Ala Arg Ser Glu Ala
660 665 670

Ser His Ala Ser Ile Ala Glu Asn Arg Glu Leu Gln Ser Ala Asp Ser
675 680 685

Asp Arg Ser Thr Arg His Ile Glu Ile Ala Leu Pro Pro Asp Val Glu
690 695 700

Tyr Gln Glu Gly Asp His Leu Gly Val Leu Pro Lys Asn Ser Gln Thr
705 710 715 720

Asn Val Ser Arg Ile Leu His Arg Phe Gly Leu Lys Gly Thr Asp Gln
725 730 735

Val Thr Leu Ser Ala Ser Gly Arg Ser Ala Gly His Leu Pro Leu Gly
740 745 750

Arg Pro Val Ser Leu His Asp Leu Leu Ser Tyr Ser Val Glu Val Gln
755 760 765

Glu Ala Ala Thr Arg Ala Gln Ile Arg Glu Leu Ala Ser Phe Thr Val
770 775 780

Cys Pro Pro His Arg Arg Glu Leu Glu Glu Leu Ser Ala Glu Gly Val
785 790 795 800

Tyr Gln Glu Gln Ile Leu Lys Lys Arg Ile Ser Met Leu Asp Leu Leu
805 810 815

Glu Lys Tyr Glu Ala Cys Asp Met Pro Phe Glu Arg Phe Leu Glu Leu
820 825 830

Leu Arg Pro Leu Lys Pro Arg Tyr Tyr Ser Ile Ser Ser Ser Pro Arg
835 840 845

Val Asn Pro Arg Gln Ala Ser Ile Thr Val Gly Val Val Arg Gly Pro
850 855 860

Ala Trp Ser Gly Arg Gly Glu Tyr Arg Gly Val Ala Ser Asn Asp Leu
865 870 875 880

Ala Glu Arg Gln Ala Gly Asp Asp Val Val Met Phe Ile Arg Thr Pro
885 890 895

Glu Ser Arg Phe Gln Leu Pro Lys Asp Pro Glu Thr Pro Ile Ile Met
900 905 910

Val Gly Pro Gly Thr Gly Val Ala Pro Phe Arg Gly Phe Leu Gln Ala
915 920 925

Arg Asp Val Leu Lys Arg Glu Gly Lys Thr Leu Gly Glu Ala His Leu
930 935 940

Tyr Phe Gly Cys Arg Asn Asp Arg Asp Phe Ile Tyr Arg Asp Glu Leu
945 950 955 960

Glu Arg Phe Glu Lys Asp Gly Ile Val Thr Val His Thr Ala Phe Ser
965 970 975

Arg Lys Glu Gly Met Pro Lys Thr Tyr Val Gln His Leu Met Ala Asp
980 985 990

Gln Ala Asp Thr Leu Ile Ser Ile Leu Asp Arg Gly Gly Arg Leu Tyr
995 1000 1005

Val Cys Gly Asp Gly Ser Lys Met Ala Pro Asp Val Glu Ala Ala
1010 1015 1020

Leu Gln Lys Ala Tyr Gln Ala Val His Gly Thr Gly Glu Gln Glu
1025 1030 1035

Ala Gln Asn Trp Leu Arg His Leu Gln Asp Thr Gly Met Tyr Ala
1040 1045 1050

Lys Asp Val Trp Ala Gly
1055

<210> SEQ ID NO 4
<211> LENGTH: 1052
<212> TYPE: PRT
<213> ORGANISM: Bacillus subtilis
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (1)..(1052)
<223> OTHER INFORMATION: Cytochrome P450 enzyme CYP102A3

<400> SEQUENCE: 4

Lys Gln Ala Ser Ala Ile Pro Gln Pro Lys Thr Tyr Gly Pro Leu Lys
1 5 10 15

Asn Leu Pro His Leu Glu Lys Glu Gln Leu Ser Gln Ser Leu Trp Arg
20 25 30

Ile Ala Asp Glu Leu Gly Pro Ile Phe Arg Phe Asp Phe Pro Gly Val
35 40 45

Ser Ser Val Phe Val Ser Gly His Asn Leu Val Ala Glu Val Cys Asp
50 55 60

Glu Lys Arg Phe Asp Lys Asn Leu Gly Lys Gly Leu Gln Lys Val Arg
65 70 75 80

Glu Phe Gly Gly Asp Gly Leu Phe Thr Ser Trp Thr His Glu Pro Asn
85 90 95

Trp Gln Lys Ala His Arg Ile Leu Leu Pro Ser Phe Ser Gln Lys Ala
100 105 110

Met Lys Gly Tyr His Ser Met Met Leu Asp Ile Ala Thr Gln Leu Ile
115 120 125

Gln Lys Trp Ser Arg Leu Asn Pro Asn Glu Glu Ile Asp Val Ala Asp
130 135 140

Asp Met Thr Arg Leu Thr Leu Asp Thr Ile Gly Leu Cys Gly Phe Asn
145 150 155 160

Tyr Arg Phe Asn Ser Phe Tyr Arg Asp Ser Gln His Pro Phe Ile Thr
165 170 175

Ser Met Leu Arg Ala Leu Lys Glu Ala Met Asn Gln Ser Lys Arg Leu
180 185 190

Gly Leu Gln Asp Lys Met Met Val Lys Thr Lys Leu Gln Phe Gln Lys
195 200 205

Asp Ile Glu Val Met Asn Ser Leu Val Asp Arg Met Ile Ala Glu Arg
210 215 220

Lys Ala Asn Pro Asp Glu Asn Ile Lys Asp Leu Leu Ser Leu Met Leu
225 230 235 240

Tyr Ala Lys Asp Pro Val Thr Gly Glu Thr Leu Asp Asp Glu Asn Ile
245 250 255

Arg Tyr Gln Ile Ile Thr Phe Leu Ile Ala Gly His Glu Thr Thr Ser
260 265 270

Gly Leu Leu Ser Phe Ala Ile Tyr Cys Leu Leu Thr His Pro Glu Lys
275 280 285

Leu Lys Lys Ala Gln Glu Glu Ala Asp Arg Val Leu Thr Asp Asp Thr
290 295 300

Pro Glu Tyr Lys Gln Ile Gln Gln Leu Lys Tyr Ile Arg Met Val Leu
305 310 315 320

Asn Glu Thr Leu Arg Leu Tyr Pro Thr Ala Pro Ala Phe Ser Leu Tyr
325 330 335

Ala Lys Glu Asp Thr Val Leu Gly Gly Glu Tyr Pro Ile Ser Lys Gly
340 345 350

Gln Pro Val Thr Val Leu Ile Pro Lys Leu His Arg Asp Gln Asn Ala
355 360 365

Trp Gly Pro Asp Ala Glu Asp Phe Arg Pro Glu Arg Phe Glu Asp Pro
370 375 380

Ser Ser Ile Pro His His Ala Pro Phe Gly Asn Gly Gln Arg
385 390 395 400

Ala Ile Gly Met Gln Phe Ala Leu Gln Glu Ala Thr Met Val Leu
405 410 415

Gly Leu Val Leu His Phe Glu Leu Ile Asn His Thr Gly Tyr Glu
425 430

Leu Ile Glu Ala Leu Thr Ile Pro Asp Asp Phe Ile
435 440 445

Thr Val Pro Arg Thr Ala Ala Ile Asn Val Gln Arg Glu
450 455 460

Gln Ala Asp Ile Ala Glu Thr Pro Lys Glu Thr Pro Lys
465 470

His Gly Thr Pro Leu Leu Val Leu Phe Gly Ser Asn Leu Gly Thr Ala
485 490 495

Glu Gly Ile Ala Gly Glu Leu Ala Ala Gln Gly Arg Gln Met Gly Phe
500 505

Thr Ala Glu Thr Ala Pro Leu Asp Asp Ile Gly Lys Leu Pro Glu
515 525

Glu Gly Ala Val Val Ile Val Thr Ala Ser Asn Gly Ala Pro Pro
530 535 540

Asp Asn Ala Ala Gly Phe Val Glu Trp Leu Lys Glu Leu Glu Glu Gly
545 550 555 560

Gln Leu Gly Val Ser Ala Val Phe Gly Gly Asn Arg Ser
565 570 575

Trp Ala Ser Thr Tyr Gln Arg Ile Pro Arg Leu Asp Asp Met Met
585 590

Ala Lys Gly Ala Ser Arg Leu Thr Ala Ile Glu Gly Asp Ala
595 605

Ala Asp Asp Phe Glu Ser His Arg Glu Ser Trp Asn Arg Phe Trp
610 615

Lys Glu Thr Met Asp Ala Phe Asp Ile Asn Glu Ala Gln Glu
625 630 635 640

Asp Arg Pro Ser Leu Ser Ile Thr Phe Leu Ser Ala Thr Glu Thr
645 650 655

Pro Val Ala Lys Ala Gly Ala Phe Glu Gly Val Leu Glu Asn
660 665 670

Arg Glu Leu Gln Thr Ala Ala Ser Thr Arg Ser Thr Arg His Ile Glu
675 685

Leu Glu Ile Pro Ala Gly Lys Thr Glu Gly Asp His Ile Gly
690 695 700

Ile Leu Pro Asn Ser Arg Glu Leu Val Gln Arg Val Leu Ser Arg
705

Phe Gly Leu Gln Ser Asn His Val Ile Lys Val Ser Gly Ser Ala His
725 730 735

Met Ala His Leu Pro Met Asp Arg Pro Ile Val Val Asp Leu Leu
740 745 750

Ser Ser Tyr Val Glu Leu Gln Glu Pro Ala Ser Arg Leu Gln Leu Arg
760 765

Glu Leu Ala Ser Tyr Thr Val Pro Pro His Gln Glu Leu Glu
770 775

Gln Leu Val Ser Asp Asp Gly Ile Glu Gln Val Leu Ala Lys
785 790 795

Arg Leu Thr Met Leu Asp Phe Leu Glu Asp Pro Ala Glu Met
805 810 815

Pro Phe Glu Arg Phe Leu Ala Leu Leu Pro Ser Leu Lys Pro Arg Tyr
820 825 830

Tyr Ser Ile Ser Ser Ser Pro Lys Val His Ala Asn Ile Val Ser Met
835 840 845

Thr Val Gly Val Val Lys Ala Ser Ala Trp Ser Gly Arg Gly Glu Tyr
850 855 860

Arg Gly Val Ala Ser Asn Tyr Leu Ala Glu Leu Asn Thr Gly Asp Ala
865 870 875 880

Ala Ala Cys Phe Ile Arg Thr Pro Gln Ser Gly Phe Gln Met Pro Asn
885 890 895

Asp Pro Glu Thr Pro Met Ile Met Val Gly Pro Gly Thr Gly Ile Ala
900 905 910

Pro Phe Arg Gly Phe Ile Gln Ala Arg Ser Val Leu Lys Lys Glu Gly
915 920 925

Ser Thr Leu Gly Glu Ala Leu Leu Tyr Phe Gly Cys Arg Arg Pro Asp
930 935 940

His Asp Asp Leu Tyr Arg Glu Glu Leu Asp Gln Ala Glu Gln Asp Gly
945 950 955 960

Leu Val Thr Ile Arg Arg Cys Tyr Ser Arg Val Glu Asn Glu Pro Lys
965 970 975

Gly Tyr Val Gln His Leu Leu Lys Gln Asp Thr Gln Lys Leu Met Thr
980 985 990

Leu Ile Glu Lys Gly Ala His Ile Tyr Val Cys Gly Asp Gly Ser Gln
995 1000 1005

Met Ala Pro Asp Val Glu Arg Thr Leu Arg Leu Ala Tyr Glu Ala
1010 1015 1020

Glu Lys Ala Ala Ser Gln Glu Glu Ser Ala Val Trp Leu Gln Lys
1025 1030 1035

Leu Gln Asp Gln Arg Arg Tyr Val Lys Asp Val Trp Thr Gly
1040 1045 1050

<210s, SEQ ID NO 5 &211s LENGTH: 1065 212. TYPE: PRT <213> ORGANISM; Bacillus cereus 22 Os. FEATURE: <221s NAME/KEY: MISC FEATURE <222s. LOCATION: (1) . . (1065) <223> OTHER INFORMATION: Cytochrome P450 enzyme CYP102A5 <4 OOs, SEQUENCE: 5 Met Asp Llys Llys Val Ser Ala Ile Pro Glin Pro Llys Thr Tyr Gly Pro 1. 5 1O 15 Lieu. Gly Asn Lieu Pro Lieu. Ile Asp Lys Asp Llys Pro Thr Lieu. Ser Phe 2O 25 3O Ile Lys Ile Ala Glu Glu Tyr Gly Pro Ile Phe Glin Ile Glin Thr Lieu. 35 4 O 45 Ser Asp Thir Ile Ile Val Ile Ser Gly His Glu Lieu Val Ala Glu Val SO 55 6 O Cys Asp Glu Thir Arg Phe Asp Llys Ser Ile Glu Gly Ala Lieu Ala Lys 65 70 7s 8O Val Arg Ala Phe Ala Gly Asp Gly Lieu. Phe Thir Ser Glu Thr Glin Glu 85 90 95 Pro Asn Trp Llys Lys Ala His Asn Ile Leu Met Pro Thr Phe Ser Glin 1OO 105 11 O US 8,026,085 B2 83 84 - Continued

Arg Ala Met Lys Asp His Ala Met Met Val Asp Ile Ala Val Gln 115 120 125

Leu Val Gln Trp Ala Arg Leu Asn Pro Asn Glu Asn Val Asp Val 130 135 140

Pro Glu Asp Met Thr Arg Leu Thr Leu Asp Thr Ile Gly Leu Gly 145 150 155 160

Phe Asn Arg Phe Asn Ser Phe Arg Glu Thr Pro His Pro Phe 165 175

Ile Thr Ser Met Thr Arg Ala Leu Asp Glu Ala Met His Gln Leu Gln 180 185 190

Arg Leu Asp Ile Glu Asp Leu Met Trp Arg Thr Lys Arg Gln Phe 195

Gln His Asp Ile Gln Ser Met Phe Ser Leu Val Asp Asn Ile Ile Ala 210 215

Glu Arg Ser Ser Gly Asn Gln Glu Glu Asn Asp Leu Leu Ser Arg 225 230 235 240

Met Leu His Val Gln Asp Pro Glu Thr Gly Glu Leu Asp Asp Glu 245 250 255

Asn Ile Arg Phe Gln Ile Ile Thr Phe Leu Ile Ala Gly His Glu Thr 260 265 270

Thr Ser Gly Leu Leu Ser Phe Ala Ile Phe Leu Leu Asn Pro 285

Asp Lys Leu Lys Ala Tyr Glu Glu Val Asp Arg Val Leu Thr Asp 290 295 300

Pro Thr Pro Thr Tyr Gln Gln Val Met Leu Ile Arg Met 305 310 315

Ile Leu Asn Glu Ser Leu Arg Leu Trp Pro Thr Ala Pro Ala Phe Ser 325 330 335

Leu Ala Lys Glu Asp Thr Val Ile Gly Gly Pro Ile 340 345 350

Gly Glu Asp Arg Ile Ser Val Leu Ile Pro Gln Leu His Arg Asp 355 360 365

Asp Ala Trp Gly Asp Asn Val Glu Glu Phe Gln Pro Glu Arg Phe 370 375

Glu Asp Leu Asp Lys Val Pro His His Ala Tyr Pro Phe Gly Asn 385 390 395 400

Gly Gln Arg Ala Cys Ile Gly Met Gln Phe Ala Leu His Glu Ala Thr 405 415

Leu Val Met Gly Met Leu Leu Gln His Phe Glu Phe Ile Asp Glu 425 430

Asp Gln Leu Asp Val Gln Thr Leu Thr Leu Lys Pro Gly Asp 435 440 445

Phe Lys Ile Arg Ile Val Pro Arg Asn Gln Asn Ile Ser His Thr Thr 450 455 460

Val Leu Ala Pro Thr Glu Glu Leu Asn His Glu Ile Gln 465 470

Gln Val Gln Thr Pro Ser Ile Ile Gly Ala Asp Asn Leu Ser Leu 485 490 495

Leu Val Leu Tyr Gly Ser Asp Thr Gly Val Ala Glu Gly Ile Ala Arg 500 505

Glu Leu Ala Asp Thr Ala Ser Leu Glu Gly Val Gln Thr Glu Val Ala 515 525

Ala Leu Asn Asp Arg Ile Gly Ser Leu Pro Lys Glu Gly Ala Val Leu 530 535 540

Ile Val Thr Ser Ser Tyr Asn Gly Pro Pro Ser Asn Ala Gly Gln 545 550 555 560

Phe Val Gln Trp Leu Glu Glu Leu Pro Asp Glu Leu Gly Val 565 570 575

Gln Ala Val Phe Gly Gly Asp His Asn Trp Ala Ser Thr 585 590

Gln Arg Ile Pro Arg Ile Asp Glu Gln Met Ala Gln Gly Ala 595 605

Thr Arg Phe Ser Thr Arg Gly Glu Ala Asp Ala Ser Gly Asp Phe Glu 610 615

Glu Gln Leu Glu Gln Trp Glu Ser Met Trp Ser Asp Ala Met Lys 625 630 635 640

Ala Phe Gly Leu Glu Leu Asn Asn Met Glu Glu Arg Ser Thr 645 650 655

Leu Ser Leu Gln Phe Val Ser Arg Leu Gly Gly Ser Pro Leu Ala Arg 660 665 670

Thr Glu Ala Val Ala Ser Ile Leu Glu Asn Arg Glu Leu Gln 675 685

Ser Ser Ser Ser Glu Arg Ser Thr Arg His Ile Glu Ile Ser Leu Pro 690 695 700

Glu Gly Ala Thr Tyr Lys Glu Gly Asp His Leu Gly Val Leu Pro Ile 705

Asn Ser Glu Asn Val Asn Arg Ile Leu Arg Phe Gly Leu Asn 725 730 735

Gly Asp Gln Val Ile Leu Ser Ala Ser Gly Arg Ser Val Asn His 740 745 750

Ile Pro Leu Asp Ser Pro Val Arg Leu Asp Leu Leu Ser Ser 760 765

Val Glu Val Gln Glu Ala Ala Thr Arg Ala Gln Ile Arg Glu Met Val 770 775

Thr Phe Thr Ala Cys Pro Pro His Glu Leu Glu Ser Leu Leu 790 795

Glu Asp Gly Val Tyr His Glu Gln Ile Leu Arg Ile Ser Met 805 810 815

Leu Asp Leu Leu Glu Glu Ala Glu Ile Arg Phe Glu Arg 825 830

Phe Leu Glu Leu Leu Pro Ala Leu Pro Arg Tyr Ser Ile Ser 835 840 845

Ser Ser Pro Leu Ile Ala Gln Asp Arg Leu Ser Ile Thr Val Gly Val 850 855 860

Val Asn Ala Ala Trp Ser Gly Glu Gly Thr Glu Gly Val Ala 865

Ser Asn Leu Ala Gln Arg His Asn Lys Asp Ile Ile Cys Phe 885 890 895

Ile Arg Thr Gln Ser Asn Phe Gln Leu Pro Asn Pro Glu Thr 900 905 910

Pro Ile Ile Met Val Gly Pro Gly Thr Gly Ile Pro Phe Arg Gly 915 920 925

Phe Leu Gln Ala Arg Arg Val Gln Gln Met Asn Leu Gly 930 935

Glu Ala His Leu Tyr Phe Gly Arg His Pro Asp Leu 945 950 955 960

Arg Thr Glu Leu Glu Asn Asp Glu Arg Asp Leu Ile Ser Leu 965 970 975

His Thr Ala Phe Ser Arg Leu Glu Gly His Pro Lys Thr Tyr Val Gln 980 985 990

His Val Ile Lys Glu Asp Arg Met Asn Leu Ile Ser Leu Leu Asp Asn 995 1000 1005

Gly Ala His Leu Tyr Ile Cys Gly Asp Gly Ser Lys Met Ala Pro 1010 1015 1020

Asp Val Glu Asp Thr Leu Cys Gln Ala Tyr Gln Glu Ile His Glu 1025 1030 1035

Val Ser Glu Gln Glu Ala Arg Asn Trp Leu Asp Arg Leu Gln Asp 1040 1045 1050

Glu Gly Arg Tyr Gly Lys Asp Val Trp Ala Gly Ile 1055 1060 1065

<210> SEQ ID NO 6
<211> LENGTH: 1063
<212> TYPE: PRT
<213> ORGANISM: Ralstonia metallidurans
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (1)..(1063)
<223> OTHER INFORMATION: Cytochrome P450 enzyme CYP102E1
<400> SEQUENCE: 6

Ser Thr Ala Thr Pro Ala Ala Ala Leu Glu Pro Ile Pro Arg Asp Pro 1 5 10 15

Gly Trp Pro Ile Phe Gly Asn Leu Phe Gln Ile Thr Pro Gly Glu Val 20 25 30

Gly Gln His Leu Leu Ala Arg Ser Arg His His Asp Gly Ile Phe Glu 35 40 45

Leu Asp Phe Ala Gly Lys Arg Val Pro Phe Val Ser Ser Val Ala Leu 50 55 60

Ala Ser Glu Leu Cys Asp Ala Thr Arg Phe Arg Lys Ile Ile Gly Pro 65 70 75 80

Pro Leu Ser Tyr Leu Arg Asp Met Ala Gly Asp Gly Leu Phe Thr Ala 85 90 95

His Ser Asp Glu Pro Asn Trp Gly Cys Ala His Arg Ile Leu Met Pro 100 105 110

Ala Phe Ser Gln Arg Ala Met Lys Ala Tyr Phe Asp Val Met Leu Arg 115 120 125

Val Ala Asn Arg Leu Val Asp Lys Trp Asp Arg Gln Gly Pro Asp Ala 130 135 140

Asp Ile Ala Val Ala Asp Asp Met Thr Arg Leu Thr Leu Asp Thr Ile 145 150 155 160

Ala Leu Ala Gly Phe Gly Tyr Asp Phe Ala Ser Phe Ala Ser Asp Glu 165 170 175

Leu Asp Pro Phe Val Met Ala Met Val Gly Ala Leu Gly Glu Ala Met 180 185 190

Gln Lys Leu Thr Arg Leu Pro Ile Gln Asp Arg Phe Met Gly Arg Ala 195 200 205

His Arg Gln Ala Ala Glu Asp Ile Ala Tyr Met Arg Asn Leu Val Asp 210 215 220

Asp Val Ile Arg Gln Arg Arg Val Ser Pro Thr Ser Gly Met Asp Leu 225 230 235 240

Leu Asn Leu Met Leu Glu Ala Arg Asp Pro Glu Thr Asp Arg Arg Leu 245 250 255


Ala Pro Arg Thr Ser Thr Arg Asp Ile Arg Leu Gln Leu Pro Pro Gly 690 695 700

Ile Thr Tyr Arg Thr Gly Asp His Ile Ala Val Trp Pro Gln Asn Asp 705 710 715 720

Ala Gln Leu Val Ser Glu Leu Cys Glu Arg Leu Asp Leu Asp Pro Asp 725 730 735

Ala Gln Ala Thr Ile Ser Ala Pro His Gly Met Gly Arg Gly Leu Pro 740 745 750

Ile Asp Gln Ala Leu Pro Val Arg Gln Leu Leu Thr His Phe Ile Glu 755 760 765

Leu Gln Asp Val Val Ser Arg Gln Thr Leu Arg Ala Leu Ala Gln Ala 770 775 780

Thr Arg Cys Pro Phe Thr Lys Gln Ser Ile Glu Gln Leu Ala Ser Asp 785 790 795 800

Asp Ala Glu His Gly Tyr Ala Thr Lys Val Val Ala Arg Arg Leu Gly 805 810 815

Ile Leu Asp Val Leu Val Glu His Pro Ala Ile Ala Leu Thr Leu Gln 820 825 830

Glu Leu Leu Ala Cys Thr Val Pro Met Arg Pro Arg Leu Tyr Ser Ile 835 840 845

Ala Ser Ser Pro Leu Val Ser Pro Asp Val Ala Thr Leu Leu Val Gly 850 855 860

Thr Val Cys Ala Pro Ala Leu Ser Gly Arg Gly Gln Phe Arg Gly Val 865 870 875 880

Ala Ser Thr Trp Leu Gln His Leu Pro Pro Gly Ala Arg Val Ser Ala 885 890 895

Ser Ile Arg Thr Pro Asn Pro Pro Phe Ala Pro Asp Pro Asp Pro Ala 900 905 910

Ala Pro Met Leu Leu Ile Gly Pro Gly Thr Gly Ile Ala Pro Phe Arg 915 920 925

Gly Phe Leu Glu Glu Arg Ala Leu Arg Lys Met Ala Gly Asn Ala Val 930 935 940

Thr Pro Ala Gln Leu Tyr Phe Gly Cys Arg His Pro Gln His Asp Trp 945 950 955 960

Leu Tyr Arg Glu Asp Ile Glu Arg Trp Ala Gly Gln Gly Val Val Glu 965 970 975

Val His Pro Ala Tyr Ser Val Val Pro Asp Ala Pro Arg Tyr Val Gln 980 985 990

Asp Leu Leu Trp Gln Arg Arg Glu Gln Val Trp Ala Gln Val Arg Asp 995 1000 1005

Gly Ala Thr Ile Tyr Val Cys Gly Asp Gly Arg Arg Met Ala Pro 1010 1015 1020

Ala Val Arg Gln Thr Leu Ile Glu Ile Gly Met Ala Gln Gly Gly 1025 1030 1035

Met Thr Asp Lys Ala Ala Ser Asp Trp Phe Gly Gly Leu Val Ala 1040 1045 1050

Gln Gly Arg Tyr Arg Gln Asp Val Phe Asn 1055 1060

<210> SEQ ID NO 7
<211> LENGTH: 1077
<212> TYPE: PRT
<213> ORGANISM: Bradyrhizobium japonicum
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (1)..(1077)
<223> OTHER INFORMATION: Cytochrome P450 enzyme CYP102A6
<400> SEQUENCE: 7

Ser Ser Lys Asn Arg Leu Asp Pro Ile Pro Gln Pro Pro Thr Lys Pro 1 5 10 15

Val Val Gly Asn Met Leu Ser Leu Asp Ser Ala Ala Pro Val Gln His 20 25 30

Leu Thr Arg Leu Ala Lys Glu Leu Gly Pro Ile Phe Trp Leu Asp Met 35 40 45

Met Gly Ser Pro Ile Val Val Val Ser Gly His Asp Leu Val Asp Glu 50 55 60

Leu Ser Asp Glu Lys Arg Phe Asp Lys Thr Val Arg Gly Ala Leu Arg 65 70 75 80

Arg Val Arg Ala Val Gly Gly Asp Gly Leu Phe Thr Ala Asp Thr Arg 85 90 95

Glu Pro Asn Trp Ser Lys Ala His Asn Ile Leu Leu Gln Pro Phe Gly 100 105 110

Asn Arg Ala Met Gln Ser Tyr His Pro Ser Met Val Asp Ile Ala Glu 115 120 125

Gln Leu Val Gln Lys Trp Glu Arg Leu Asn Ala Asp Asp Glu Ile Asp 130 135 140

Val Val His Asp Met Thr Ala Leu Thr Leu Asp Thr Ile Gly Leu Cys 145 150 155 160

Gly Phe Asp Tyr Arg Phe Asn Ser Phe Tyr Arg Arg Asp Tyr His Pro 165 170 175

Phe Val Glu Ser Leu Val Arg Ser Leu Glu Thr Ile Met Met Thr Arg 180 185 190

Gly Leu Pro Phe Glu Gln Ile Trp Met Gln Lys Arg Arg Lys Thr Leu 195 200 205

Ala Glu Asp Val Ala Phe Met Asn Lys Met Val Asp Glu Ile Ile Ala 210 215 220

Glu Arg Arg Lys Ser Ala Glu Gly Ile Asp Asp Lys Lys Asp Met Leu 225 230 235 240

Ala Ala Met Met Thr Gly Val Asp Arg Ser Thr Gly Glu Gln Leu Asp 245 250 255

Asp Val Asn Ile Arg Tyr Gln Ile Asn Thr Phe Leu Ile Ala Gly His 260 265 270

Glu Thr Thr Ser Gly Leu Leu Ser Tyr Thr Leu Tyr Ala Leu Leu Lys 275 280 285

His Pro Asp Ile Leu Lys Lys Ala Tyr Asp Glu Val Asp Arg Val Phe 290 295 300

Gly Pro Asp Val Asn Ala Lys Pro Thr Tyr Gln Gln Val Thr Gln Leu 305 310 315 320

Thr Tyr Ile Thr Gln Ile Leu Lys Glu Ala Leu Arg Leu Trp Pro Pro 325 330 335

Ala Pro Ala Tyr Gly Ile Ser Pro Leu Ala Asp Glu Thr Ile Gly Gly 340 345 350

Gly Lys Tyr Lys Leu Arg Lys Gly Thr Phe Ile Thr Ile Leu Val Thr 355 360 365

Ala Leu His Arg Asp Pro Ser Val Trp Gly Pro Asn Pro Asp Ala Phe 370 375 380

Asp Pro Glu Asn Phe Ser Arg Glu Ala Glu Ala Lys Arg Pro Ile Asn 385 390 395 400

Ala Trp Pro Phe Gly Asn Gly Gln Arg Ala Cys Ile Gly Arg Gly 405 415

Phe Ala Met His Glu Ala Ala Leu Ala Leu Gly Met Ile Leu Gln Arg 420 425 430

Phe Leu Ile Asp His Gln Arg Gln Met His Leu Glu Thr 435 440 445

Leu Thr Met Pro Glu Gly Phe Ile Val Arg Pro Arg Ala 450 455 460

Asp Arg Glu Arg Gly Ala Gly Gly Pro Val Ala Ala Val Ser Ser 465 470

Ala Pro Arg Ala Pro Arg Gln Pro Thr Ala Arg Pro Gly His Asn Thr 485 490 495

Pro Met Leu Val Leu Gly Ser Asn Leu Gly Thr Ala Glu Glu Leu 500 505

Ala Thr Arg Met Ala Asp Leu Ala Glu Ile Asn Gly Phe Ala Val His 515 525

Leu Gly Ala Leu Asp Glu Tyr Val Gly Leu Pro Gln Glu Gly Gly 530 535 540

Val Leu Ile Ile Ala Ser Asn Gly Ala Pro Pro Asp Asn Ala 545 550 555 560

Thr Gln Phe Val Lys Trp Leu Gly Ser Asp Leu Pro Asp Ala Phe 565 570 575

Ala Asn Val Arg Tyr Ala Val Phe Gly Cys Gly Asn Ser Asp Trp Ala 585 590

Ala Thr Tyr Gln Ser Val Pro Arg Phe Ile Asp Glu Gln Leu Ser Gly 595 605

His Gly Ala Arg Ala Val Tyr Pro Arg Gly Glu Gly Asp Ala Arg Ser 610 615

Asp Leu Asp Gly Gln Phe Gln Trp Phe Pro Ala Ala Ala Gln Val 625 630 635 640

Ala Thr Glu Phe Gly Ile Asp Trp Asn Phe Thr Arg Thr Ala Glu 645 650 655

Asp Asp Pro Leu Tyr Ala Ile Glu Pro Val Ala Val Thr Ala Val Asn 660 665 670

Thr Ile Val Ala Gln Gly Gly Ala Val Ala Met Val Leu Val Asn 675 685

Asp Glu Leu Gln Asn Ser Gly Ser Asn Pro Ser Glu Arg Ser Thr 690 695 700

Arg His Ile Glu Val Gln Leu Pro Ser Asn Ile Thr Arg Val Gly 705

Asp His Leu Ser Val Val Pro Arg Asn Asp Pro Thr Leu Val Asp Ser 725 730 735

Val Ala Arg Arg Phe Gly Phe Leu Pro Ala Asp Gln Ile Arg Leu Gln 740 745 750

Val Ala Glu Gly Arg Arg Ala Gln Leu Pro Val Gly Glu Ala Val Ser 760 765

Val Gly Arg Leu Leu Ser Glu Phe Val Glu Leu Gln Gln Val Ala Thr 770 775

Arg Gln Ile Gln Ile Met Ala Glu His Thr Arg Pro Val Thr 790 795

Pro Leu Leu Ala Phe Val Gly Glu Glu Ala Glu Pro Ala Glu 805 810 815

Arg Arg Thr Glu Ile Leu Ala Met Arg Ser Val Tyr Asp Leu 820 825 830

Leu Leu Glu Tyr Pro Ala Cys Glu Leu Pro Phe His Val Tyr Leu Glu 835 840 845

Met Leu Ser Leu Leu Ala Pro Arg Tyr Tyr Ser Ile Ser Ser Ser Pro 850 855 860

Ser Val Asp Pro Ala Arg Cys Ser Ile Thr Val Gly Val Val Glu Gly 865 870 875 880

Pro Ala Ala Ser Gly Arg Gly Val Tyr Lys Gly Ile Cys Ser Asn Tyr 885 890 895

Leu Ala Asn Arg Arg Ala Ser Asp Ala Ile Tyr Ala Thr Val Arg Glu 900 905 910

Thr Lys Ala Gly Phe Arg Leu Pro Asp Asp Ser Ser Val Pro Ile Ile 915 920 925

Met Ile Gly Pro Gly Thr Gly Leu Ala Pro Phe Arg Gly Phe Leu Gln 930 935 940

Glu Arg Ala Ala Arg Lys Ala Lys Gly Ala Ser Leu Gly Pro Ala Met 945 950 955 960

Leu Phe Phe Gly Cys Arg His Pro Asp Gln Asp Phe Leu Tyr Ala Asp 965 970 975

Glu Leu Lys Ala Leu Ala Ala Ser Gly Val Thr Glu Leu Phe Thr Ala 980 985 990

Phe Ser Arg Ala Asp Gly Pro Lys Thr Tyr Val Gln His Val Leu Ala 995 1000 1005

Ala Gln Lys Asp Lys Val Trp Pro Leu Ile Glu Gln Gly Ala Ile 1010 1015 1020

Ile Tyr Val Cys Gly Asp Gly Gly Gln Met Glu Pro Asp Val Lys 1025 1030 1035

Ala Ala Leu Val Ala Ile Arg His Glu Lys Ser Gly Ser Asp Thr 1040 1045 1050

Ala Thr Ala Ala Arg Trp Ile Glu Glu Met Gly Ala Thr Asn Arg 1055 1060 1065

Tyr Val Leu Asp Val Trp Ala Gly Gly 1070 1075

<210> SEQ ID NO 8
<211> LENGTH: 415
<212> TYPE: PRT
<213> ORGANISM: Pseudomonas putida
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (1)..(415)
<223> OTHER INFORMATION: Cytochrome P450 enzyme CYP101A1
<400> SEQUENCE: 8

Met Thr Thr Glu Thr Ile Gln Ser Asn Ala Asn Leu Ala Pro Leu Pro 1 5 10 15

Pro His Val Pro Glu His Leu Val Phe Asp Phe Asp Met Tyr Asn Pro 20 25 30

Ser Asn Leu Ser Ala Gly Val Gln Glu Ala Trp Ala Val Leu Gln Glu 35 40 45

Ser Asn Val Pro Asp Leu Val Trp Thr Arg Cys Asn Gly Gly His Trp 50 55 60

Ile Ala Thr Arg Gly Gln Leu Ile Arg Glu Ala Tyr Glu Asp Tyr Arg 65 70 75 80

His Phe Ser Ser Glu Cys Pro Phe Ile Pro Arg Glu Ala Gly Glu Ala 85 90 95

Tyr Asp Phe Ile Pro Thr Ser Met Asp Pro Pro Glu Gln Arg Gln Phe 105 110

Arg Ala Leu Ala Asn Gln Val Val Gly Met Pro Val Val Asp Lys Leu 115 120 125

Glu Asn Arg Ile Gln Glu Leu Ala Ser Leu Ile Glu Ser Leu Arg 130 135 140

Pro Gln Gly Gln Cys Asn Phe Thr Glu Asp Tyr Ala Glu Pro Phe Pro 145 150 155 160

Ile Arg Ile Phe Met Leu Leu Ala Gly Leu Pro Glu Glu Asp Ile Pro 165 170 175

His Leu Tyr Leu Thr Asp Gln Met Thr Arg Pro Asp Gly Ser Met 180 185 190

Thr Phe Ala Glu Ala Glu Ala Leu Asp Leu Ile Pro Ile 195

Ile Glu Gln Arg Arg Gln Lys Pro Gly Thr Asp Ala Ile Ser Ile Val 210 215

Ala Asn Gly Gln Val Asn Gly Arg Pro Ile Thr Ser Asp Glu Ala Lys 225 230 235 240

Arg Met Gly Leu Leu Leu Val Gly Gly Leu Asp Thr Val Val Asn 245 250 255

Phe Leu Ser Phe Ser Met Glu Phe Leu Ala Ser Pro Glu His Arg 260 265 270

Gln Glu Leu Ile Glu Arg Pro Glu Arg Ile Pro Ala Ala Glu Glu 280 285

Leu Leu Arg Arg Phe Ser Leu Val Ala Asp Gly Arg Ile Leu Thr Ser 290 295 300

Asp Glu Phe His Gly Val Gln Leu Lys Gly Asp Gln Ile Leu 305 310 315 320

Leu Pro Gln Met Leu Ser Gly Leu Asp Glu Arg Glu Asn Ala Cys Pro 325 330 335

Met His Val Asp Phe Ser Arg Gln Lys Val Ser His Thr Thr Phe Gly 340 345 350

His Gly Ser His Leu Leu Gly Gln His Leu Ala Arg Arg Glu Ile 355 360 365

Ile Val Thr Leu Glu Trp Leu Thr Arg Ile Pro Asp Phe Ser Ile 370 375

Ala Pro Gly Ala Gln Ile Gln His Ser Gly Ile Val Ser Gly Val 385 390 395 400

Gln Ala Leu Pro Leu Val Trp Asp Pro Ala Thr Thr Ala Val 405 410 415

<210> SEQ ID NO 9
<211> LENGTH: 410
<212> TYPE: PRT
<213> ORGANISM: Bacillus megaterium
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (1)..(410)
<223> OTHER INFORMATION: Cytochrome P450 enzyme CYP106A2
<400> SEQUENCE: 9

Met Lys Glu Val Ile Ala Val Lys Glu Ile Thr Arg Phe Lys Thr Arg 1 5 15

Thr Glu Glu Phe Ser Pro Tyr Ala Trp Cys Lys Arg Met Leu Glu Asn 25

Asp Pro Val Ser Tyr His Glu Gly Thr Asp Thr Trp Asn Val Phe Lys 35 40 45

Glu Asp Val Lys Arg Val Leu Ser Asp Lys His Phe Ser Ser 50 55 60

Val Arg Arg Thr Thr Ile Ser Val Gly Thr Asp Ser Glu Glu Gly 65 70

Ser Val Pro Glu Lys Ile Gln Ile Thr Glu Ser Asp Pro Pro Asp His 85 90 95

Arg Arg Arg Ser Leu Leu Ala Ala Ala Phe Thr Pro Arg Ser Leu 105 110

Gln Asn Trp Glu Pro Arg Ile Gln Glu Ile Ala Asp Glu Leu Ile Gly 115 120 125

Gln Met Asp Gly Gly Thr Glu Ile Asp Ile Val Ala Ser Leu Ala Ser 130 135 140

Pro Leu Pro Ile Ile Val Met Ala Asp Leu Met Gly Val Pro Ser Lys 145 150 155 160

Asp Arg Leu Leu Phe Trp Val Asp Thr Leu Phe Leu Pro Phe 165 170 175

Asp Arg Glu Lys Gln Glu Glu Val Asp Leu Gln Val Ala Ala 180 185 190

Glu Tyr Gln Tyr Leu Tyr Pro Ile Val Val Gln Arg Leu 195

Asn Pro Ala Asp Asp Ile Ile Ser Asp Leu Leu Lys Ser Glu Val Asp 210 215 220

Gly Glu Met Phe Thr Asp Asp Glu Val Val Arg Thr Thr Met Leu Ile 225 230 235 240

Leu Gly Ala Gly Val Glu Thr Thr Ser His Leu Leu Ala Asn Ser Phe 245 250 255

Ser Leu Leu Tyr Asp Asp Glu Val Gln Glu Leu His Glu 260 265 270

Asn Leu Asp Leu Val Pro Gln Ala Val Glu Glu Met Leu Arg Phe Arg 280 285

Phe Asn Leu Ile Leu Asp Arg Thr Val Glu Asp Asn Asp Leu 290 295 300

Leu Gly Val Glu Leu Lys Glu Gly Asp Ser Val Val Val Trp Met Ser 305 310 315

Ala Ala Asn Met Asp Glu Glu Met Phe Glu Asp Pro Phe Thr Leu Asn 325 330 335

Ile His Arg Pro Asn Asn His Leu Thr Phe Gly Asn Gly Pro 340 345 350

His Phe Cys Leu Gly Ala Pro Leu Ala Arg Leu Glu Ala Ile Ala 355 360 365

Leu Thr Ala Phe Leu Lys Phe His Ile Glu Ala Val Pro Ser 370 375

Phe Gln Leu Glu Glu Asn Leu Thr Asp Ser Ala Thr Gly Gln Thr Leu 385 390 395 400

Thr Ser Leu Pro Leu Ala Ser Arg Met 405 410

<210> SEQ ID NO 10
<211> LENGTH: 404
<212> TYPE: PRT
<213> ORGANISM: Citrobacter brakii
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (1)..(404)
<223> OTHER INFORMATION: Cytochrome P450 enzyme P450cin
<400> SEQUENCE: 10

Thr Ala Thr Val Ala Ser Thr Ser Leu Phe Thr Thr Ala Asp His 10 15

His Thr Pro Leu Gly Pro Asp Gly Thr Pro His Ala Phe Phe Glu 25

Ala Leu Arg Asp Glu Ala Glu Thr Thr Pro Ile Gly Trp Ser Glu Ala 35 40 45

Gly Gly His Trp Val Val Ala Gly Tyr Glu Ile Gln Ala Val 50 55 60

Gln Asn Thr Ala Phe Ser Asn Gly Val Thr Phe Pro Arg 70 75

Glu Thr Gly Glu Phe Glu Leu Met Met Ala Gly Gln Asp Asp Pro 85 90 95

Val His Lys Tyr Arg Gln Leu Val Ala Pro Phe Ser Pro Glu 105 110

Ala Thr Asp Leu Phe Thr Glu Gln Leu Arg Gln Ser Thr Asn Asp Leu 115 120 125

Ile Asp Ala Arg Ile Glu Leu Gly Glu Gly Asp Ala Ala Thr Trp Leu 130 135 140

Ala Asn Glu Ile Pro Ala Arg Leu Thr Ala Ile Leu Leu Gly Leu Pro 145 150 155 160

Pro Glu Asp Gly Asp Thr Arg Arg Trp Val Trp Ala Ile Thr His 165 170 175

Val Glu Asn Pro Glu Glu Gly Ala Glu Ile Phe Ala Glu Leu Val Ala 180 185 190

His Ala Arg Thr Leu Ile Ala Glu Arg Arg Thr Asn Pro Gly Asn Asp 195

Ile Met Ser Arg Val Ile Met Ser Ile Asp Gly Glu Ser Leu Ser 210 215 220

Glu Asp Asp Leu Ile Gly Phe Phe Thr Ile Leu Leu Leu Gly Gly Ile 225 230 235 240

Asp Asn Thr Ala Arg Phe Leu Ser Ser Val Phe Trp Arg Leu Ala Trp 245 250 255

Asp Ile Glu Leu Arg Arg Arg Leu Ile Ala His Pro Glu Leu Ile Pro 260 265 270

Asn Ala Val Asp Glu Leu Leu Arg Phe Gly Pro Ala Met Val Gly 285

Arg Leu Val Thr Gln Glu Val Thr Val Gly Asp Ile Thr Met Pro 290 295 300

Gly Gln Thr Ala Met Leu Trp Phe Pro Ile Ala Ser Arg Asp Arg Ser 305 310 315

Ala Phe Asp Ser Pro Asp Asn Ile Val Ile Glu Arg Thr Pro Asn Arg 325 330 335

His Leu Ser Leu Gly His Gly Ile His Arg Leu Gly Ala His Leu 340 345 350

Ile Arg Val Glu Ala Arg Val Ala Ile Thr Glu Phe Leu Arg Ile 355 360 365

Pro Glu Phe Ser Leu Asp Pro Asn Glu Cys Glu Trp Leu Met Gly 370 375

Gln Val Ala Gly Met Leu His Val Pro Ile Ile Phe Pro Gly Lys 385 390 395 400

Arg Leu Ser Glu

<210> SEQ ID NO 11
<211> LENGTH: 428
<212> TYPE: PRT
<213> ORGANISM: Pseudomonas sp
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (1)..(428)
<223> OTHER INFORMATION: Cytochrome P450 enzyme P450terp
<400> SEQUENCE: 11

Met Asp Ala Arg Ala Thr Ile Pro Glu His Ile Ala Arg Thr Val Ile 1 5 10 15

Leu Pro Gln Gly Tyr Ala Asp Asp Glu Val Ile Tyr Pro Ala Phe Lys 20 25 30

Trp Leu Arg Asp Glu Gln Pro Leu Ala Met Ala His Ile Glu Gly Tyr 35 40 45

Asp Pro Met Trp Ile Ala Thr Lys His Ala Asp Val Met Gln Ile Gly 50 55 60

Lys Gln Pro Gly Leu Phe Ser Asn Ala Glu Gly Ser Glu Ile Leu Tyr 65 70 75 80

Asp Gln Asn Asn Glu Ala Phe Met Arg Ser Ile Ser Gly Gly Cys Pro 85 90 95

His Val Ile Asp Ser Leu Thr Ser Met Asp Pro Pro Thr His Thr Ala 100 105 110

Tyr Arg Gly Leu Thr Leu Asn Trp Phe Gln Pro Ala Ser Ile Arg Lys 115 120 125

Leu Glu Glu Asn Ile Arg Arg Ile Ala Gln Ala Ser Val Gln Arg Leu 130 135 140

Leu Asp Phe Asp Gly Glu Cys Asp Phe Met Thr Asp Cys Ala Leu Tyr 145 150 155 160

Tyr Pro Leu His Val Val Met Thr Ala Leu Gly Val Pro Glu Asp Asp 165 170 175

Glu Pro Leu Met Leu Lys Leu Thr Gln Asp Phe Phe Gly Val His Glu 180 185 190

Pro Asp Glu Gln Ala Val Ala Ala Pro Arg Gln Ser Ala Asp Glu Ala 195 200 205

Ala Arg Arg Phe His Glu Thr Ile Ala Thr Phe Tyr Asp Tyr Phe Asn 210 215 220

Gly Phe Thr Val Asp Arg Arg Ser Cys Pro Lys Asp Asp Val Met Ser 225 230 235 240

Leu Leu Ala Asn Ser Lys Leu Asp Gly Asn Tyr Ile Asp Asp Lys Tyr 245 250 255

Ile Asn Ala Tyr Tyr Val Ala Ile Ala Thr Ala Gly His Asp Thr Thr 260 265 270

Ser Ser Ser Ser Gly Gly Ala Ile Ile Gly Leu Ser Arg Asn Pro Glu 275 280 285

Gln Leu Ala Leu Ala Lys Ser Asp Pro Ala Leu Ile Pro Arg Leu Val 290 295 300

Asp Glu Ala Val Arg Trp Thr Ala Pro Val Lys Ser Phe Met Arg Thr 305 310 315 320

Ala Leu Ala Asp Thr Glu Val Arg Gly Gln Asn Ile Lys Arg Gly Asp 325 330 335

Arg Ile Met Leu Ser Tyr Pro Ser Ala Asn Arg Asp Glu Glu Val Phe 340 345 350

Ser Asn Pro Asp Glu Phe Asp Ile Thr Arg Phe Pro Asn Arg His Leu 355 360 365

Gly Phe Gly Trp Gly Ala His Met Cys Leu Gly Gln His Leu Ala Lys 370 375

Leu Glu Met Lys Ile Phe Phe Glu Glu Leu Leu Pro Lys Leu Lys Ser 385 390 395 400

Val Glu Leu Ser Gly Pro Pro Arg Leu Val Ala Thr Asn Phe Val Gly 405 415

Gly Pro Lys Asn Val Pro Ile Arg Phe Thr Lys Ala 420 425

<210> SEQ ID NO 12
<211> LENGTH: 404
<212> TYPE: PRT
<213> ORGANISM: Saccharopolyspora erythreae
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (1)..(404)
<223> OTHER INFORMATION: Cytochrome P450 enzyme P450eryF
<400> SEQUENCE: 12

Met Thr Thr Val Pro Asp Leu Glu Ser Asp Ser Phe His Val Asp Trp 1 5 10 15

Tyr Arg Thr Tyr Ala Glu Leu Arg Glu Thr Ala Pro Val Thr Pro Val 25 30

Arg Phe Leu Gly Gln Asp Ala Trp Leu Val Thr Gly Tyr Asp Glu Ala 35 40 45

Ala Ala Leu Ser Asp Leu Arg Leu Ser Ser Asp Pro 50 55 60

Tyr Pro Gly Val Glu Val Glu Phe Pro Ala Tyr Leu Gly Phe Pro Glu 65 70 75

Asp Val Arg Asn Tyr Phe Ala Thr Asn Met Gly Thr Ser Asp Pro Pro 85 90 95

Thr His Thr Arg Leu Arg Leu Val Ser Gln Glu Phe Thr Val Arg 105 110

Arg Val Glu Ala Met Arg Pro Arg Val Glu Gln Ile Thr Ala Glu Leu 115 120 125

Leu Asp Glu Val Gly Asp Ser Gly Val Val Asp Ile Val Asp Arg Phe 130 135 140

Ala His Pro Leu Pro Ile Val Ile Cys Glu Leu Leu Gly Val Asp 145 150 155 160

Glu Arg Gly Glu Phe Gly Arg Trp Ser Ser Glu Ile Leu Val 165 170 175

Met Asp Pro Glu Arg Ala Glu Gln Arg Gly Gln Ala Ala Arg Glu Val 180 185 190

Val Asn Phe Ile Leu Asp Leu Val Glu Arg Arg Arg Thr Glu Pro Gly 195

Asp Asp Leu Leu Ser Ala Leu Ile Arg Val Gln Asp Asp Asp Asp Gly 210 215 220

Arg Leu Ser Ala Asp Glu Leu Thr Ser Ile Ala Leu Val Leu Leu Leu 225 230 235 240

Ala Gly Phe Glu Ala Ser Val Ser Leu Ile Gly Ile Gly Thr Tyr Leu 245 250 255

Leu Leu Thr His Pro Asp Gln Leu Ala Leu Val Arg Arg Asp Pro Ser 260 265 270

Ala Leu Pro Asn Ala Val Glu Glu Ile Leu Arg Ile Ala Pro Pro 280 285

Glu Thr Thr Thr Phe Ala Ala Glu Glu Val Glu Ile Gly Gly Val 290 295 300

Ala Ile Pro Gln Tyr Ser Thr Val Leu Val Ala Asn Gly Ala Ala Asn 305 310 315

Arg Asp Pro Lys Gln Phe Pro Asp Pro His Arg Phe Asp Val Thr Arg 325 330 335

Asp Thr Arg Gly His Leu Ser Phe Gly Gln Gly Ile His Phe Met 340 345 350

Gly Arg Pro Leu Ala Lys Leu Glu Gly Glu Val Ala Leu Arg Ala Leu 355 360 365

Phe Gly Arg Phe Pro Ala Leu Ser Leu Gly Ile Asp Ala Asp Asp Val 370 375 380

Val Trp Arg Arg Ser Leu Leu Leu Arg Gly Ile Asp His Leu Pro Val 385 390 395 400

Arg Leu Asp Gly

<210> SEQ ID NO 13
<211> LENGTH: 516
<212> TYPE: PRT
<213> ORGANISM: Homo sapiens
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (1)..(516)
<223> OTHER INFORMATION: Cytochrome P450 enzyme CYP1A2
<400> SEQUENCE: 13

Met Ala Leu Ser Gln Ser Val Pro Phe Ser Ala Thr Glu Leu Leu Leu 1 5 10 15

Ala Ser Ala Ile Phe Cys Leu Val Phe Trp Val Leu Gly Leu Arg 20 25

Pro Arg Val Pro Lys Gly Leu Lys Ser Pro Pro Glu Pro Trp Gly Trp 35 40 45

Pro Leu Leu Gly His Val Leu Thr Leu Gly Lys Asn Pro His Leu Ala 50 55 60

Leu Ser Arg Met Ser Gln Arg Tyr Gly Asp Val Leu Gln Ile Arg Ile 65 70 75

Gly Ser Thr Pro Val Leu Val Leu Ser Arg Leu Asp Thr Ile Arg Gln 85 90 95

Ala Leu Val Arg Gln Gly Asp Asp Phe Pro Asp Leu Tyr 100 105 110

Thr Ser Thr Leu Ile Thr Asp Gly Gln Ser Leu Thr Phe Ser Thr Asp 115 120 125

Ser Gly Pro Val Trp Ala Ala Arg Arg Arg Leu Ala Gln Asn Ala Leu 130 135 140

Asn Thr Phe Ser Ile Ala Ser Asp Pro Ala Ser Ser Ser Ser Tyr 145 150 155 160

Leu Glu Glu His Val Ser Lys Glu Ala Lys Ala Leu Ile Ser Arg Leu 165 170

Gln Glu Leu Met Ala Gly Pro Gly His Phe Asp Pro Asn Gln Val 180 185 190

Val Val Ser Val Ala Asn Val Ile Gly Ala Met Cys Phe Gly Gln His 195 200 205

Phe Pro Glu Ser Ser Asp Glu Met Leu Ser Leu Val Asn Thr His 210 215 220

Glu Phe Val Glu Thr Ala Ser Ser Gly Asn Pro Leu Asp Phe Phe Pro 225 230 235 240

Ile Leu Arg Tyr Leu Pro Asn Pro Ala Leu Gln Arg Phe Ala Phe 245 250 255

Asn Gln Arg Phe Leu Trp Phe Leu Gln Lys Thr Val Gln Glu His Tyr 260 265 270

Gln Asp Phe Asp Asn Ser Val Arg Asp Ile Thr Gly Ala Leu Phe 275 285

His Ser Lys Gly Pro Arg Ala Ser Gly Asn Leu Ile Pro Gln 290 295 300

Glu Ile Val Asn Leu Val Asn Asp Ile Phe Gly Ala Gly Phe Asp 305 310 315

Thr Val Thr Thr Ala Ile Ser Trp Ser Leu Met Leu Val Thr 325 330 335

Pro Glu Ile Gln Arg Ile Gln Lys Glu Leu Asp Thr Val Ile Gly 340 345 350

Arg Glu Arg Arg Pro Arg Leu Ser Asp Arg Pro Gln Leu Pro Leu 355 360 365

Glu Ala Phe Ile Leu Glu Thr Phe Arg His Ser Ser Phe Leu Pro Phe 370 375

Thr Ile Pro His Ser Thr Thr Arg Asp Thr Thr Leu Asn Gly Phe Tyr 385 390 395 400

Ile Pro Cys Val Phe Val Asn Gln Trp Gln Val Asn His 405 410 415

Asp Pro Glu Leu Trp Glu Asp Pro Ser Glu Phe Arg Pro Glu Arg Phe 425 430

Leu Thr Ala Asp Gly Thr Ala Ile Asn Lys Pro Leu Ser Glu Met 435 440 445

Met Leu Phe Gly Met Gly Lys Arg Arg Cys Ile Gly Glu Val Leu Ala 450 455 460

Lys Trp Glu Ile Phe Leu Phe Leu Ala Ile Leu Leu Gln Gln Leu Glu 465 470 475

Phe Ser Val Pro Pro Gly Val Val Asp Leu Thr Pro Ile Tyr Gly 485 490 495

Leu Thr Met Lys His Ala Arg Glu His Val Gln Ala Arg Leu Arg 500 505 510

Phe Ser Ile Asn 515

<210> SEQ ID NO 14
<211> LENGTH: 489
<212> TYPE: PRT
<213> ORGANISM: Homo sapiens
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (1)..(489)
<223> OTHER INFORMATION: Cytochrome P450 enzyme CYP2C8
<400> SEQUENCE: 14

Met Glu Pro Phe Val Val Leu Val Leu Cys Leu Ser Phe Met Leu Leu 1 5 10 15

Phe Ser Leu Trp Arg Gln Ser Arg Arg Arg Leu Pro Pro Gly 20 25

Pro Thr Pro Leu Pro Ile Ile Gly Asn Met Leu Gln Ile Asp Val Lys 35 40 45

Asp Ile Cys Ser Phe Thr Asn Phe Ser Lys Val Gly Pro Val 50 55 60

Phe Thr Val Phe Gly Asn Pro Ile Val Val Phe His Gly Tyr Glu 65 70 75 80

Ala Val Glu Ala Leu Ile Asp Asn Gly Glu Glu Phe Ser Gly Arg 85 90 95

Gly Asn Ser Pro Ile Ser Gln Arg Ile Thr Lys Gly Leu Gly Ile Ile 105 110

Ser Ser Asn Gly Arg Trp Lys Glu Ile Arg Arg Phe Ser Leu Thr 115 120 125

Thr Leu Arg Asn Phe Gly Met Gly Arg Ser Ile Glu Asp Arg Val 130 135 140

Gln Glu Glu Ala His Cys Leu Val Glu Glu Leu Arg Thr Ala 145 150 155 160

Ser Pro Asp Pro Thr Phe Ile Leu Gly Cys Ala Pro Asn Val 165 170 175

Ile Ser Val Val Phe Gln Arg Phe Asp Asp Gln Asn 180 185 190

Phe Leu Thr Leu Met Arg Phe Asn Glu Asn Phe Arg Ile Leu Asn 195 200 205

Ser Pro Trp Ile Gln Val Cys Asn Asn Phe Pro Leu Leu Ile Asp 210 215

Phe Pro Gly Thr His Asn Val Leu Asn Val Ala Leu Thr Arg 225 230 235 240

Ser Ile Arg Glu Val Glu His Gln Ala Ser Leu Asp Val 245 250 255

Asn Asn Pro Arg Asp Phe Ile Asp Cys Phe Leu Ile Met Glu Gln 260 265 270

Glu Asp Asn Gln Ser Glu Phe Asn Ile Glu Asn Leu Val Gly 280 285

Thr Val Ala Asp Leu Phe Val Ala Gly Thr Glu Thr Thr Ser Thr Thr 290 295 300

Leu Arg Gly Leu Leu Leu Leu Leu His Pro Glu Val Thr Ala 305 310 315

Val Gln Glu Glu Ile Asp His Val Ile Gly Arg His Arg Ser Pro 325 330 335

Met Gln Asp Arg Ser His Met Pro Thr Asp Ala Val Val His 340 345 350

Glu Ile Gln Arg Tyr Ser Asp Leu Val Pro Thr Gly Val Pro His Ala 355 360 365

Val Thr Thr Asp Thr Phe Arg Asn Tyr Leu Ile Pro Gly Thr 370 375

Thr Ile Met Ala Leu Leu Thr Ser Val Leu His Asp Asp Glu Phe 385 390 395 400

Pro Asn Pro Asn Ile Phe Asp Pro Gly His Phe Leu Asp Asn Gly 405 415

Asn Phe Lys Ser Asp Phe Met Pro Phe Ser Ala Gly 425 430

Ile Ala Gly Glu Gly Leu Ala Arg Met Glu Leu Phe Leu Phe Leu 435 440 445

Thr Thr Ile Leu Gln Asn Phe Asn Leu Ser Val Asp Asp Leu 450 455 460

Asn Leu Asn Thr Thr Ala Val Thr Gly Ile Val Ser Leu Pro Pro 465 470 475 480

Ser Gln Ile Cys Phe Ile Pro Val 485

<210> SEQ ID NO 15
<211> LENGTH: 490
<212> TYPE: PRT
<213> ORGANISM: Homo sapiens
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (1)..(490)
<223> OTHER INFORMATION: Cytochrome P450 enzyme CYP2C9
<400> SEQUENCE: 15

Met Asp Ser Leu Val Val Leu Val Leu Cys Leu Ser Cys Leu Leu Leu 1 5 10 15

Leu Ser Leu Trp Arg Gln Ser Ser Gly Arg Gly Lys Leu Pro Pro Gly 20 25 30

Pro Thr Pro Leu Pro Val Ile Gly Asn Ile Leu Gln Ile Gly Ile Lys 35 40 45

Asp Ile Ser Lys Ser Leu Thr Asn Leu Ser Lys Val Tyr Gly Pro Val 50 55 60

Phe Thr Leu Tyr Phe Gly Leu Lys Pro Ile Val Val Leu His Gly Tyr 65 70 75 80

Glu Ala Val Lys Glu Ala Leu Ile Asp Leu Gly Glu Glu Phe Ser Gly 85 90 95

Arg Gly Ile Phe Pro Leu Ala Glu Arg Ala Asn Arg Gly Phe Gly Ile 100 105 110

Val Phe Ser Asn Gly Lys Lys Trp Lys Glu Ile Arg Arg Phe Ser Leu 115 120 125

Met Thr Leu Arg Asn Phe Gly Met Gly Lys Arg Ser Ile Glu Asp Arg 130 135 140

Val Gln Glu Glu Ala Arg Cys Leu Val Glu Glu Leu Arg Lys Thr Lys 145 150 155 160

Ala Ser Pro Cys Asp Pro Thr Phe Ile Leu Gly Cys Ala Pro Cys Asn 165 170 175

Val Ile Cys Ser Ile Ile Phe His Lys Arg Phe Asp Tyr Lys Asp Gln 180 185 190

Gln Phe Leu Asn Leu Met Glu Lys Leu Asn Glu Asn Ile Lys Ile Leu 195 200 205

Ser Ser Pro Trp Ile Gln Ile Cys Asn Asn Phe Ser Pro Ile Ile Asp 210 215 220

Tyr Phe Pro Gly Thr His Asn Lys Leu Leu Lys Asn Val Ala Phe Met 225 230 235 240

Lys Ser Tyr Ile Leu Glu Lys Val Lys Glu His Gln Glu Ser Met Asp 245 250 255

Met Asn Asn Pro Gln Asp Phe Ile Asp Cys Phe Leu Met Lys Met Glu 260 265 270

Lys Glu Lys His Asn Gln Pro Ser Glu Phe Thr Ile Glu Ser Leu Glu 275 280 285

Asn Thr Ala Val Asp Leu Phe Gly Ala Gly Thr Glu Thr Thr Ser Thr 290 295 300

Thr Leu Arg Tyr Ala Leu Leu Leu Leu Leu Lys His Pro Glu Val Thr 305 310 315 320

Ala Lys Val Gln Glu Glu Ile Glu Arg Val Ile Gly Arg Asn Arg Ser 325 330 335

Pro Cys Met Gln Asp Arg Ser His Met Pro Tyr Thr Asp Ala Val Val 340 345 350

His Glu Val Gln Arg Tyr Ile Asp Leu Leu Pro Thr Ser Leu Pro His 355 360 365

Ala Val Thr Cys Asp Ile Lys Phe Arg Asn Tyr Leu Ile Pro Lys Gly 370 375 380

Thr Thr Ile Leu Ile Ser Leu Thr Ser Val Leu His Asp Asn Lys Glu 385 390 395 400

Phe Pro Asn Pro Glu Met Phe Asp Pro His His Phe Leu Asp Glu Gly 405 415

Gly Asn Phe Lys Lys Ser Phe Met Pro Phe Ser Ala Gly Lys 425 430

Arg Ile Cys Val Gly Glu Ala Leu Ala Gly Met Glu Leu Phe Leu Phe 435 440 445

Leu Thr Ser Ile Leu Gln Asn Phe Asn Leu Lys Ser Leu Val Asp Pro 450 455 460

Lys Asn Leu Asp Thr Thr Pro Val Val Asn Gly Phe Ala Ser Val Pro 465 470 475 480

Pro Phe Gln Leu Cys Phe Ile Pro Val 485 490

<210> SEQ ID NO 16
<211> LENGTH: 490
<212> TYPE: PRT
<213> ORGANISM: homo sapiens
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (1)...(490)
<223> OTHER INFORMATION: Cytochrome P450 enzyme CYP2C19

<400> SEQUENCE: 16

Met Asp Pro Phe Val Val Leu Val Leu Cys Leu Ser Leu Leu Leu 1 5 15

Leu Ser Ile Trp Arg Gln Ser Ser Gly Arg Gly Leu Pro Pro Gly 20 25

Pro Thr Pro Leu Pro Val Ile Gly Asn Ile Leu Gln Ile Asp Ile 35 40 45

Asp Val Ser Ser Leu Thr Asn Leu Ser Lys Ile Gly Pro Val 50 55 60

Phe Thr Leu Phe Gly Leu Glu Arg Met Val Val Leu His Gly Tyr 65 70

Glu Val Val Glu Ala Leu Ile Asp Leu Gly Glu Glu Phe Ser Gly 85 90 95

Arg Gly His Phe Pro Leu Ala Glu Arg Ala Asn Arg Gly Phe Gly Ile 100 105 110

Val Phe Ser Asn Gly Arg Trp Glu Ile Arg Arg Phe Ser Leu 115 120 125

Met Thr Leu Arg Asn Phe Gly Met Gly Arg Ser Ile Glu Asp Arg 130 135 140

Val Gln Glu Glu Ala Arg Leu Val Glu Glu Leu Arg Thr Lys 145 150 155 160

Ala Ser Pro Asp Pro Thr Phe Ile Leu Gly Ala Pro Cys Asn 165 170 175

Val Ile Ser Ile Ile Phe Gln Lys Arg Phe Asp Lys Asp Gln 180 185 190

Gln Phe Leu Asn Leu Met Glu Lys Leu Asn Glu Asn Ile Arg Ile Val 195 200 205

Ser Thr Pro Trp Ile Gln Ile Asn Asn Phe Pro Thr Ile Ile Asp 210 215 220

Tyr Phe Pro Gly Thr His Asn Leu Leu Lys Asn Leu Ala Phe Met 225 230 235 240

Glu Ser Asp Ile Leu Glu Val Glu His Gln Glu Ser Met Asp 245 250 255

Ile Asn Asn Pro Arg Asp Phe Ile Asp Cys Phe Leu Ile Lys Met Glu 260 265 270

Glu Lys Gln Asn Gln Gln Ser Glu Phe Thr Ile Glu Asn Leu Val 275 280 285

Ile Thr Ala Ala Asp Leu Leu Gly Ala Gly Thr Glu Thr Thr Ser Thr 290 295 300

Thr Leu Arg Tyr Ala Leu Leu Leu Leu Leu Lys His Pro Glu Val Thr 305 310 315

Ala Lys Val Gln Glu Glu Ile Glu Arg Val Ile Gly Arg Asn Arg Ser 325 330 335

Pro Cys Met Gln Asp Arg Gly His Met Pro Tyr Thr Asp Ala Val Val 340 345 350

His Glu Val Gln Arg Tyr Ile Asp Leu Ile Pro Thr Ser Leu Pro His 355 360 365

Ala Val Thr Cys Asp Val Lys Phe Arg Asn Tyr Leu Ile Pro Gly 370 375 380

Thr Thr Ile Leu Thr Ser Leu Thr Ser Val Leu His Asp Asn Glu 385 390 395 400

Phe Pro Asn Pro Glu Met Phe Asp Pro Arg His Phe Leu Asp Glu Gly 405 410 415

Gly Asn Phe Lys Lys Ser Asn Tyr Phe Met Pro Phe Ser Ala Gly Lys 420 425 430

Arg Ile Cys Val Gly Glu Gly Leu Ala Arg Met Glu Leu Phe Leu Phe 435 440 445

Leu Thr Phe Ile Leu Gln Asn Phe Asn Leu Lys Ser Leu Ile Asp Pro 450 455 460

Lys Asp Leu Asp Thr Thr Pro Val Val Asn Gly Phe Ala Ser Val Pro 465 470 475 480

Pro Phe Tyr Gln Leu Cys Phe Ile Pro Val 485 490

<210> SEQ ID NO 17
<211> LENGTH: 446
<212> TYPE: PRT
<213> ORGANISM: homo sapiens
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (1)...(446)
<223> OTHER INFORMATION: Cytochrome P450 enzyme CYP2D6

<400> SEQUENCE: 17

Met Gly Leu Glu Ala Leu Val Pro Leu Ala Val Ile Val Ala Ile Phe 1 5 10 15

Leu Leu Leu Val Asp Leu Met His Arg Arg Gln Arg Trp Ala Ala Arg 20 25

Pro Pro Gly Pro Leu Pro Leu Pro Gly Leu Gly Asn Leu Leu His 35 40 45

Val Asp Phe Gln Asn Thr Pro Tyr Cys Phe Asp Gln Leu Arg Arg Arg 50 55 60

Phe Gly Asp Val Phe Ser Leu Gln Leu Ala Trp Thr Pro Val Val Val 65 70 75

Leu Asn Gly Leu Ala Ala Val Arg Glu Ala Leu Val Thr His Gly Glu 85 90 95

Asp Thr Ala Asp Arg Pro Pro Val Pro Ile Thr Gln Ile Leu Gly Phe 100 105 110

Gly Pro Arg Ser Gln Gly Arg Pro Phe Arg Pro Asn Gly Leu Leu Asp 115 120 125

Ala Val Ser Asn Val Ile Ala Ser Leu Thr Cys Gly Arg Arg Phe 130 135 140

Glu Asp Asp Pro Arg Phe Leu Arg Leu Leu Asp Leu Ala Gln Glu 145 150 155 160

Gly Leu Glu Glu Ser Gly Phe Leu Arg Glu Val Leu Asn Ala Val 165 175

Pro Val Leu Leu His Ile Pro Ala Leu Ala Gly Lys Val Leu Arg Phe 180 185 190

Gln Ala Phe Leu Thr Gln Leu Asp Glu Leu Leu Thr Glu His Arg 195 205

Met Thr Trp Asp Pro Ala Gln Pro Pro Arg Asp Leu Thr Glu Ala Phe 210 215 220

Leu Ala Glu Met Glu Lys Ala Gly Asn Pro Glu Ser Ser Phe Asn 225 230 235 240

Asp Glu Asn Leu Cys Ile Val Val Ala Asp Leu Phe Ser Ala Gly Met 245 250 255

Val Thr Thr Ser Thr Thr Leu Ala Trp Gly Leu Leu Leu Met Ile Leu 260 265 270

His Pro Asp Val Gln Arg Arg Val Gln Gln Glu Ile Asp Asp Val Ile 285

Gly Gln Val Arg Arg Pro Glu Met Gly Asp Gln Ala His Met Pro Tyr 290 295 300

Thr Thr Ala Val Ile His Glu Val Gln Arg Phe Gly Asp Ile Val Pro 305 310 315

Leu Gly Val Thr His Met Thr Ser Arg Asp Ile Glu Val Gln Gly Phe 325 330 335

Arg Ile Pro Lys Gly Thr Thr Leu Ile Thr Asn Leu Ser Ser Val Leu 340 345 350

Asp Glu Ala Val Trp Glu Lys Pro Phe Arg Phe His Pro Glu His 355 360 365

Phe Leu Asp Ala Gln Gly His Phe Val Pro Glu Ala Phe Leu Pro 370 375

Phe Ser Ala Gly Arg Arg Ala Leu Gly Glu Pro Leu Ala Arg Met 385 390 395 400

Glu Leu Phe Leu Phe Phe Thr Ser Leu Leu Gln His Phe Ser Phe Ser 405 410 415

Val Pro Thr Gly Gln Pro Arg Pro Ser His His Gly Val Phe Ala Phe 425 430

Leu Val Thr Pro Ser Pro Glu Leu Cys Ala Val Pro Arg 435 440 445

<210> SEQ ID NO 18
<211> LENGTH: 493
<212> TYPE: PRT
<213> ORGANISM: homo sapiens
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (1)...(493)
<223> OTHER INFORMATION: Cytochrome P450 enzyme CYP2E1

<400> SEQUENCE: 18

Met Ser Ala Leu Gly Val Thr Val Ala Leu Leu Val Trp Ala Ala Phe 1 5 15

Leu Leu Leu Val Ser Met Trp Arg Gln Val His Ser Ser Trp Asn Leu 25

Pro Pro Gly Pro Phe Pro Leu Pro Ile Ile Gly Asn Leu Phe Gln Leu 35 40 45

Glu Leu Asn Ile Pro Lys Ser Phe Thr Arg Leu Ala Gln Arg Phe 50 55 60

Gly Pro Val Phe Thr Leu Tyr Val Gly Ser Gln Arg Met Val Val Met 65 70

His Gly Tyr Ala Val Glu Ala Leu Leu Asp Asp Glu 85 90 95

Phe Ser Gly Arg Gly Asp Leu Pro Ala Phe His Ala His Arg Asp Arg 105 110

Gly Ile Ile Phe Asn Asn Gly Pro Thr Trp Asp Ile Arg Arg Phe 115 120 125

Ser Leu Thr Thr Leu Arg Asn Tyr Gly Met Gly Lys Gln Gly Asn Glu 130 135 140

Ser Arg Ile Gln Arg Glu Ala His Phe Leu Leu Glu Ala Leu Arg Lys 145 150 155 160

Thr Gln Gly Gln Pro Phe Asp Pro Thr Phe Leu Ile Gly Ala Pro 165 170 175

Asn Val Ile Ala Asp Ile Leu Phe Arg His Phe Asp Asn 180 185 190

Asp Glu Lys Phe Leu Arg Leu Met Leu Phe Asn Glu Asn Phe His 195

Leu Leu Ser Thr Pro Trp Leu Gln Leu Tyr Asn Asn Phe Pro Ser Phe 210 215 220

Leu His Leu Pro Gly Ser His Arg Val Ile Asn Val Ala 225 230 235 240

Glu Val Glu Tyr Val Ser Glu Arg Val Glu His His Gln Ser 245 250 255

Leu Asp Pro Asn Pro Arg Asp Leu Thr Asp Leu Leu Val Glu 260 265 270

Met Glu Lys Glu His Ser Ala Glu Arg Leu Thr Met Asp Gly 275 285

Ile Thr Val Thr Val Ala Asp Leu Phe Phe Ala Gly Thr Glu Thr Thr 290 295 300

Ser Thr Thr Leu Arg Tyr Gly Leu Leu Ile Leu Met Pro Glu 305 310 315

Ile Glu Glu Leu His Glu Glu Ile Asp Arg Val Ile Gly Pro Ser 325 330 335

Arg Ile Pro Ala Ile Asp Arg Gln Glu Met Pro Met Asp Ala 340 345 350

Val Val His Glu Ile Gln Arg Phe Ile Thr Leu Val Pro Ser Asn Leu 355 360 365

Pro His Glu Ala Thr Arg Asp Thr Ile Phe Arg Gly Tyr Leu Ile Pro 370 375

Lys Gly Thr Val Val Val Pro Thr Leu Asp Ser Val Leu Tyr Asp Asn 385 390 395 400

Gln Glu Phe Pro Asp Pro Glu Phe Lys Pro Glu His Phe Leu Asn 405 415

Glu Asn Gly Lys Phe Ser Asp Phe Pro Phe Ser Thr 425 430

Gly Arg Val Cys Ala Gly Glu Gly Leu Ala Arg Met Glu Leu Phe 435 440 445

Leu Leu Leu Ala Ile Leu Gln His Phe Asn Leu Pro Leu Val 450 455 460

Asp Pro Lys Asp Ile Asp Leu Ser Pro Ile His Ile Gly Phe Gly Cys 465 470 475 480

Ile Pro Pro Arg Tyr Lys Leu Cys Val Ile Pro Arg Ser 485 490

<210> SEQ ID NO 19
<211> LENGTH: 491
<212> TYPE: PRT
<213> ORGANISM: homo sapiens
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (1)...(491)
<223> OTHER INFORMATION: Cytochrome P450 enzyme CYP2F1

<400> SEQUENCE: 19

Met Asp Ser Ile Ser Thr Ala Ile Leu Leu Leu Leu Leu Ala Leu Val 1 5 10 15

Cys Leu Leu Leu Thr Leu Ser Ser Arg Asp Lys Gly Lys Leu Pro Pro 20 25 30

Gly Pro Arg Pro Leu Ser Ile Leu Gly Asn Leu Leu Leu Leu Cys Ser 35 40 45

Gln Asp Met Leu Thr Ser Leu Thr Lys Leu Ser Lys Glu Tyr Gly Ser 50 55 60

Met Tyr Thr Val His Leu Gly Pro Arg Arg Val Val Val Leu Ser Gly 65 70 75 80

Tyr Gln Ala Val Lys Glu Ala Leu Val Asp Gln Gly Glu Glu Phe Ser 85 90 95

Gly Arg Gly Asp Tyr Pro Ala Phe Phe Asn Phe Thr Lys Gly Asn Gly 100 105 110

Ile Ala Phe Ser Ser Gly Asp Arg Trp Lys Val Leu Arg Gln Phe Ser 115 120 125

Ile Gln Ile Leu Arg Asn Phe Gly Met Gly Lys Arg Ser Ile Glu Glu 130 135 140

Arg Ile Leu Glu Glu Gly Ser Phe Leu Leu Ala Glu Leu Arg Lys Thr 145 150 155 160

Glu Gly Glu Pro Phe Asp Pro Thr Phe Val Leu Ser Arg Ser Val Ser 165 170 175

Asn Ile Ile Cys Ser Val Leu Phe Gly Ser Arg Phe Asp Tyr Asp Asp 180 185 190

Glu Arg Leu Leu Thr Ile Ile Arg Leu Ile Asn Asp Asn Phe Gln Ile 195 200 205

Met Ser Ser Pro Trp Gly Glu Leu Tyr Asp Ile Phe Pro Ser Leu Leu 210 215 220

Asp Trp Val Pro Gly Pro His Gln Arg Ile Phe Gln Asn Phe Lys Cys 225 230 235 240

Leu Arg Asp Leu Ile Ala His Ser Val His Asp His Gln Ala Ser Leu 245 250 255

Asp Pro Arg Ser Pro Arg Asp Phe Ile Gln Cys Phe Leu Thr Lys Met 260 265 270

Ala Glu Glu Lys Glu Asp Pro Leu Ser His Phe His Met Asp Thr Leu 275 280 285

Leu Met Thr Thr His Asn Leu Leu Phe Gly Gly Thr Lys Thr Val Ser 290 295 300

Thr Thr Leu His His Ala Phe Leu Ala Leu Met Lys Tyr Pro Lys Val 305 310 315 320

Gln Ala Arg Val Gln Glu Glu Ile Asp Leu Val Val Gly Arg Ala Arg 325 330 335

Leu Pro Ala Leu Lys Asp Arg Ala Ala Met Pro Thr Asp Ala Val 340 345 350

Ile His Glu Val Gln Arg Phe Ala Asp Ile Ile Pro Met Asn Leu Pro 355 360 365

His Arg Val Thr Asp Thr Ala Phe Arg Gly Phe Leu Ile Pro Lys 370 375

Gly Thr Asp Val Ile Thr Leu Leu Asn Thr Val His Asp Pro Ser 385 390 395 400

Gln Phe Leu Thr Pro Gln Glu Phe Asn Pro Glu His Phe Leu Asp Ala 405 415

Asn Gln Ser Phe Lys Ser Pro Ala Phe Met Pro Phe Ser Ala Gly 425 430

Arg Arg Leu Leu Gly Glu Ser Leu Ala Arg Met Glu Leu Phe Leu 435 440 445

Leu Thr Ala Ile Leu Gln Ser Phe Ser Leu Gln Pro Leu Gly Ala 450 455 460

Pro Glu Asp Ile Asp Leu Thr Pro Leu Ser Ser Gly Leu Gly Asn Leu 465 470 480

Pro Arg Pro Phe Gln Leu Leu Arg Pro Arg 485 490

<210> SEQ ID NO 20
<211> LENGTH: 503
<212> TYPE: PRT
<213> ORGANISM: homo sapiens
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (1)..(503)
<223> OTHER INFORMATION: Cytochrome P450 enzyme CYP3A4

<400> SEQUENCE: 20

Met Ala Leu Ile Pro Asp Leu Ala Met Glu Thr Trp Leu Leu Leu Ala 1 5 10 15

Val Ser Leu Val Leu Leu Leu Tyr Gly Thr His Ser His Gly Leu 25

Phe Lys Leu Gly Ile Pro Gly Pro Thr Pro Leu Pro Phe Leu Gly 35 40 45

Asn Leu Ser Tyr His Lys Gly Phe Met Phe Asp Met Glu Cys 55 60

His Gly Lys Val Trp Gly Phe Tyr Asp Gly Gln Gln Pro 65 70

Val Leu Ala Ile Thr Asp Pro Asp Met Ile Thr Val Leu Val Lys 85 90 95

Glu Ser Val Phe Thr Asn Arg Arg Pro Phe Gly Pro Val Gly 105 110

Phe Met Lys Ser Ala Ile Ser Ile Ala Glu Asp Glu Glu Trp 115 120 125

Leu Arg Ser Leu Leu Ser Pro Thr Phe Thr Ser Gly Leu Glu 130 135 140

Met Val Pro Ile Ile Ala Gln Gly Asp Val Leu Val Arg Asn Leu 145 150 155 160

Arg Arg Glu Ala Glu Thr Gly Pro Val Thr Leu Asp Val Phe 165 170 175

Gly Ala Ser Met Asp Val Ile Thr Ser Thr Ser Phe Gly Val Asn 180 185 190

Ile Asp Ser Leu Asn Asn Pro Gln Asp Pro Phe Val Glu Asn Thr Lys 195

Leu Leu Arg Phe Asp Phe Leu Asp Pro Phe Phe Leu Ser Ile Thr 210 215 220

Val Phe Pro Phe Leu Ile Pro Ile Leu Glu Val Leu Asn Ile Val 225 230 235 240

Phe Pro Arg Glu Val Thr Asn Phe Leu Arg Ser Val Arg Met 245 250 255

Glu Ser Arg Leu Glu Asp Thr Gln His Arg Val Asp Phe Leu 260 265 270

Gln Leu Met Ile Asp Ser Gln Asn Ser Glu Thr Glu Ser His 285

Ala Leu Ser Asp Leu Glu Leu Val Ala Gln Ser Ile Ile Phe Ile Phe 290 295 300

Ala Gly Glu Thr Thr Ser Ser Val Leu Ser Phe Ile Met Glu 305 310 315 320

Leu Ala Thr His Pro Asp Val Gln Gln Lys Leu Gln Glu Glu Ile Asp 325 330 335

Ala Val Leu Pro Asn Ala Pro Pro Thr Asp Thr Val Leu Gln 340 345 350

Met Glu Tyr Leu Asp Met Val Val Asn Glu Thr Leu Arg Leu Phe Pro 355 360 365

Ile Ala Met Arg Leu Glu Arg Val Asp Val Glu Ile Asn 370 375

Gly Met Phe Ile Pro Lys Gly Val Val Val Met Ile Pro Ser Ala 385 390 395 400

Leu His Arg Asp Pro Trp Thr Glu Pro Glu Phe Leu Pro 405 410 415

Glu Arg Phe Ser Asn Asp Asn Ile Asp Pro Tyr Ile Tyr 425 430

Thr Pro Phe Gly Ser Gly Pro Arg Asn Ile Gly Met Arg Phe Ala 435 440 445

Leu Met Asn Met Lys Leu Ala Leu Ile Arg Val Leu Gln Asn Phe Ser 450 455 460

Phe Pro Lys Glu Thr Gln Ile Pro Leu Leu Ser Leu Gly 465 470

Gly Leu Leu Gln Pro Glu Pro Val Val Leu Val Glu Ser Arg 485 490 495

Asp Gly Thr Val Ser Gly Ala 500

<210> SEQ ID NO 21
<211> LENGTH: 1048
<212> TYPE: PRT
<213> ORGANISM: Artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Cytochrome P450 variant CYP102A1var1

<400> SEQUENCE: 21

Thr Ile Lys Glu Met Pro Gln Pro Thr Phe Gly Glu Leu Lys Asn 1 5 10 15

Leu Pro Leu Leu Asn Thr Asp Val Gln Ala Leu Met Lys Ile

Ala Asp Glu Leu Gly Glu Ile Phe Phe Glu Ala Pro Gly Arg Val 35 40 45

Thr Arg Tyr Leu Ser Ser Gln Arg Leu Ile Lys Glu Ala Cys Asp Glu 50 55 60

Ser Arg Phe Asp Lys Asn Leu Ser Gln Ala Leu Phe Ala Arg Asp 65 70

Phe Ala Gly Asp Gly Leu Phe Thr Ser Trp Thr His Glu Asn Trp 85 90 95

Ala His Asn Ile Leu Leu Pro Ser Phe Ser Gln Gln Ala Met 105 110

Gly Tyr His Ala Met Met Val Asp Ile Ala Val Gln Leu Val Gln 115 120 125

Trp Glu Arg Leu Asn Ala Asp Glu Ile Glu Val Pro Glu Asp 130 135 140

Met Thr Arg Leu Thr Leu Asp Thr Ile Gly Leu Gly Phe Asn Tyr 145 150 155 160

Arg Phe Asn Ser Phe Arg Asp Gln Pro His Pro Phe Ile Ile Ser 165 170 175

Met Ile Arg Ala Leu Glu Val Met Asn Leu Gln Arg Ala Asn 180 185 190

Pro Asp Asp Pro Ala Asp Glu Asn Arg Gln Phe Gln Glu Asp 195

Ile Lys Val Met Asn Asp Leu Val Asp Ile Ile Ala Asp Arg 210 215

Ala Ser Gly Glu Gln Ser Asp Asp Leu Leu Thr Gln Met Leu Asn Gly 225 230 235 240

Asp Pro Glu Thr Gly Glu Pro Leu Asp Asp Gly Asn Ile Ser 245 250 255

Gln Ile Ile Thr Phe Leu Ile Ala Gly His Glu Thr Thr Ser Gly Leu 260 265 270

Leu Ser Phe Ala Leu Phe Leu Val Asn Pro His Val Leu Gln 275 285

Val Ala Glu Glu Ala Thr Arg Val Leu Val Asp Pro Val Pro Ser 290 295 300

Tyr Gln Val Lys Gln Leu Val Gly Met Val Leu Asn Glu 305 310 315

Ala Leu Arg Leu Trp Pro Thr Ala Pro Ala Phe Ser Leu Ala 325 330 335

Glu Asp Thr Val Leu Gly Gly Glu Tyr Pro Leu Glu Gly Asp Glu 340 345 350

Val Met Val Leu Ile Pro Gln Leu His Arg Asp Thr Ile Trp Gly 355 360 365

Asp Asp Val Glu Glu Phe Arg Pro Glu Arg Phe Glu Asn Pro Ser Ala 370 375 380

Ile Pro Gln His Ala Phe Pro Phe Gly Asn Gly Gln Arg Ala Cys 385 390 395 400

Ile Gly Gln Gln Phe Ala Leu His Glu Ala Thr Leu Val Leu Gly Met 405 415

Met Leu His Phe Asp Phe Glu Asp His Thr Asn Glu Leu Asp 425 430

Ile Glu Thr Leu Thr Leu Lys Pro Glu Gly Phe Val Val Ala 435 440 445

Ser Ile Pro Leu Gly Gly Ile Pro Ser Pro Ser Thr Glu 450 455 460

Gln Ser Ala Lys Lys Val Arg Ala Glu Asn Ala His Asn Thr 465 470 475 480

Pro Leu Leu Val Leu Gly Ser Asn Met Gly Thr Ala Glu Gly Thr 485 490 495

Ala Arg Asp Leu Ala Asp Ile Ala Met Ser Lys Gly Phe Ala Pro Gln 500 505

Val Ala Thr Leu Asp Ser His Ala Gly Asn Leu Pro Arg Glu Gly Ala 515 525

Val Leu Ile Val Thr Ala Ser Asn Gly His Pro Pro Asp Asn Ala 530 535 540

Lys Gln Phe Val Asp Trp Leu Asp Gln Ala Ser Ala Asp Glu Val Lys 545 550 555 560

Gly Val Arg Ser Val Phe Gly Gly Asp Asn Trp Ala Thr 565 570 575

Thr Gln Lys Val Pro Ala Phe Ile Asp Glu Thr Leu Ala Ala 585 590

Gly Ala Glu Asn Ile Ala Asp Arg Gly Glu Ala Asp Ala Ser Asp Asp 595 605

Phe Glu Gly Thr Tyr Glu Glu Trp Arg Glu His Met Trp Ser Asp Val 610 615

Ala Ala Phe Asn Leu Asp Ile Glu Asn Ser Glu Asp Asn Ser 625 630 635 640

Thr Leu Ser Leu Gln Phe Val Asp Ser Ala Ala Asp Met Pro Leu Ala 645 650 655

Met His Gly Ala Phe Ser Thr Asn Val Val Ala Ser Lys Glu Leu 660 665 670

Gln Gln Pro Gly Ser Ala Arg Ser Thr Arg His Leu Glu Ile Glu Leu 675 685

Pro Lys Glu Ala Ser Gln Glu Gly Asp His Leu Gly Val Ile Pro 690 695 700

Arg Asn Glu Gly Ile Val Asn Arg Val Ala Arg Phe Gly Leu 705 710

Asp Ala Ser Gln Gln Ile Arg Leu Glu Ala Glu Glu Leu Ala 725 730 735

His Leu Pro Leu Ala Thr Val Ser Val Glu Leu Leu Gln 740 745 750

Val Glu Leu Gln Asp Val Thr Arg Thr Leu Arg Ala Met Ala 760 765

Ala Lys Thr Val Cys Pro His Val Leu Glu Ala Leu Leu 770 775

Glu Gln Ala Tyr Lys Glu Gln Val Leu Arg Leu Thr Met 790

Leu Glu Leu Leu Glu Pro Ala Cys Met Phe Ser Glu 805 810 815

Phe Ile Ala Leu Leu Ser Ile Arg Pro Arg Ser Ile Ser 825 830

Ser Ser Pro Arg Val Asp Glu Lys Gln Ala Ser Ile Thr Val Ser Val 835 840 845

Val Ser Gly Glu Ala Trp Ser Gly Gly Glu Tyr Gly Ile Ala 850 855 860

Ser Asn Leu Ala Glu Leu Gln Glu Gly Asp Thr Ile Thr Phe 865

Ile Ser Thr Pro Gln Ser Glu Phe Thr Leu Pro Asp Pro Glu Thr 885 890 895

Pro Leu Ile Met Val Gly Pro Gly Thr Gly Val Ala Pro Phe Arg Gly 900 905 910

Phe Val Gln Ala Arg Gln Leu Glu Gln Gly Gln Ser Leu Gly 915 920 925

Glu Ala His Leu Tyr Phe Gly Cys Arg Ser Pro His Glu Asp Leu 930 935 940

Tyr Gln Glu Glu Leu Glu Asn Ala Gln Ser Glu Gly Ile Ile Thr Leu 945 950 955 960

His Thr Ala Phe Ser Arg Met Pro Asn Gln Pro Thr Tyr Val Gln 965 97.

His Val Met Glu Gln Asp Gly Lys Lys Leu Ile Glu Leu Leu Asp Gln 985 990

Gly Ala His Phe Tyr Ile Cys Gly Asp Gly Ser Gln Met Ala Pro Ala 995 1000

Val Glu Ala Thr Leu Met Lys Ser Tyr Ala Asp Val His Gln Val 1010 1015 1020

Ser Glu Ala Asp Ala Arg Leu Trp Leu Gln Gln Leu Glu Glu Lys 1025 1030 1035

Gly Arg Tyr Ala Lys Asp Val Trp Ala Gly 1040 1045

<210> SEQ ID NO 22
<211> LENGTH: 1048
<212> TYPE: PRT
<213> ORGANISM: Artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Cytochrome P450 variant CYP102A1var2

<400> SEQUENCE: 22

Thr Ile Lys Glu Met Pro Gln Pro Thr Phe Gly Glu Leu Lys Asn 1 5 15

Leu Pro Leu Leu Asn Thr Asp Val Gln Ala Leu Met Ile 30

Ala Asp Glu Leu Gly Glu Ile Phe Phe Glu Ala Pro Gly Arg Val 35 40 45

Thr Arg Tyr Leu Ser Ser Gln Arg Leu Ile Glu Ala Asp Glu 50 55 60

Ser Arg Phe Asp Asn Leu Ser Gln Ala Leu Ala Val Arg Asp 65 70

Phe Ala Gly Asp Gly Leu Phe Thr Ser Trp Thr His Glu Asn Trp 85 90 95

Lys Ala His Asn Ile Leu Leu Pro Ser Phe Ser Gln Gln Ala Met 105 110

Gly Tyr His Ala Met Met Val Asp Ile Ala Val Gln Leu Val Gln 115 120 125

Trp Glu Arg Leu Asn Ala Asp Glu His Ile Glu Val Pro Glu Asp 130 135 140

Met Thr Arg Leu Thr Leu Asp Thr Ile Gly Leu Gly Phe Asn Tyr 145 150 155 160

Arg Phe Asn Ser Phe Arg Asp Gln Pro His Pro Phe Ile Ile Ser 165 170 175

Met Val Arg Ala Leu Glu Val Met Asn Lys Leu Gln Arg Ala Asn 180 185 190

Pro Asp Asp Pro Ala Asp Glu Asn Arg Gln Cys Gln Glu Asp 195

Ile Lys Val Met Asn Asp Leu Val Asp Ile Ile Ala Asp Arg 210 215 220

Ala Arg Gly Glu Gln Ser Asp Asp Leu Leu Thr Gln Met Leu Asn Gly 225 230 235 240

Asp Pro Glu Thr Gly Glu Pro Leu Asp Asp Gly Asn Ile Ser 245 250 255

Gln Ile Ile Thr Phe Leu Ile Ala Gly His Glu Thr Thr Ser Gly Leu 260 265 270

Leu Ser Phe Ala Leu Tyr Phe Leu Val Asn Pro His Val Leu Gln 275 285

Val Ala Glu Glu Ala Ala Arg Val Leu Val Asp Pro Val Pro Ser 290 295 300

Tyr Gln Val Lys Gln Leu Val Gly Met Val Leu Asn Glu 305 310 315

Ala Leu Arg Leu Trp Pro Thr Ala Pro Ala Phe Ser Leu Tyr Ala 325 330 335

Glu Asp Thr Val Leu Gly Gly Glu Tyr Pro Leu Glu Gly Asp Glu 340 345 350

Val Met Val Leu Ile Pro Gln Leu His Arg Asp Thr Ile Trp Gly 355 360 365

Asp Asp Val Glu Glu Phe Arg Pro Glu Arg Phe Glu Asn Pro Ser Ala 370 375

Ile Pro Gln His Ala Phe Pro Phe Gly Asn Gly Gln Arg Ala Cys 385 390 395 400

Ile Gly Gln Gln Phe Ala Leu His Glu Ala Thr Leu Val Leu Gly Met 405 415

Met Leu His Phe Asp Phe Glu Asp His Thr Asn Tyr Glu Leu Asp 425 430

Ile Glu Thr Leu Thr Leu Lys Pro Glu Gly Phe Val Val Ala 435 440 445

Ser Ile Pro Leu Gly Gly Ile Pro Ser Pro Ser Thr Glu 450 455 460

Gln Ser Ala Lys Val Arg Ala Glu Asn Ala His Asn Thr 465 470

Pro Leu Leu Val Leu Tyr Gly Ser Asn Met Gly Thr Ala Glu Gly Thr 485 490 495

Ala Arg Asp Leu Ala Asp Ile Ala Met Ser Gly Phe Ala Pro Gln 500 505

Val Ala Thr Leu Asp Ser His Ala Gly Asn Leu Pro Arg Glu Gly Ala 515 525

Val Leu Ile Val Thr Ala Ser Asn Gly His Pro Pro Asp Asn Ala 530 535 540

Lys Gln Phe Val Asp Trp Leu Asp Gln Ala Ser Ala Asp Glu Val Lys 545 550 555 560

Gly Val Arg Ser Val Phe Gly Gly Asp Asn Trp Ala Thr 565 570 575

Thr Gln Lys Val Pro Ala Phe Ile Asp Glu Thr Leu Ala Ala 585 590

Gly Ala Glu Asn Ile Ala Asp Arg Gly Glu Ala Asp Ala Ser Asp Asp 595 605

Phe Glu Gly Thr Tyr Glu Glu Trp Arg Glu His Met Trp Ser Asp Val 610 615

Ala Ala Phe Asn Leu Asp Ile Glu Asn Ser Glu Asp Asn Ser 625 630 635 640

Thr Leu Ser Leu Gln Phe Val Asp Ser Ala Ala Asp Met Pro Leu Ala 645 650 655

Met His Gly Ala Phe Ser Thr Asn Val Val Ala Ser Lys Glu Leu 660 665 670

Gln Gln Pro Gly Ser Ala Arg Ser Thr Arg His Leu Glu Ile Glu Leu 675 685

Pro Lys Glu Ala Ser Gln Glu Gly Asp His Leu Gly Val Ile Pro 690 695 700

Arg Asn Glu Gly Ile Val Asn Arg Val Ala Arg Phe Gly Leu 705 710

Asp Ala Ser Gln Gln Ile Arg Leu Glu Ala Glu Glu Leu Ala 725 730 735

His Leu Pro Leu Ala Thr Val Ser Val Glu Leu Leu Gln Tyr 740 745 750

Val Glu Leu Gln Asp Val Thr Arg Thr Leu Arg Ala Met Ala 760 765

Ala Lys Thr Val Cys Pro His Val Leu Glu Ala Leu Leu 770 775

Glu Gln Ala Tyr Lys Glu Gln Val Leu Arg Leu Thr Met 790

Leu Glu Leu Leu Glu Pro Ala Cys Met Phe Ser Glu 805 810 815

Phe Ile Ala Leu Leu Ser Ile Arg Pro Arg Ser Ile Ser 825 830

Ser Ser Pro Arg Val Asp Glu Lys Gln Ala Ser Ile Thr Val Ser Val 835 840 845

Val Ser Gly Glu Ala Trp Ser Gly Gly Glu Tyr Gly Ile Ala 850 855 860

Ser Asn Leu Ala Glu Leu Gln Glu Gly Asp Thr Ile Thr Phe 865

Ile Ser Thr Pro Gln Ser Glu Phe Thr Leu Pro Asp Pro Glu Thr 885 890 895

Pro Leu Ile Met Val Gly Pro Gly Thr Gly Val Ala Pro Phe Arg Gly 900 905 910

Phe Val Gln Ala Arg Gln Leu Glu Gln Gly Gln Ser Leu Gly 915 920 925

Glu Ala His Leu Tyr Phe Gly Arg Ser Pro His Glu Asp Leu 930 935 940

Tyr Gln Glu Glu Leu Glu Asn Ala Gln Ser Glu Gly Ile Ile Thr Leu 945 950 955 960

His Thr Ala Phe Ser Arg Met Pro Asn Gln Pro Thr Tyr Val Gln 965 97.

His Val Met Glu Gln Asp Gly Lys Lys Leu Ile Glu Leu Leu Asp Gln 985 990

Gly Ala His Phe Tyr Ile Cys Gly Asp Gly Ser Gln Met Ala Pro Ala 995 1005

Val Glu Ala Thr Leu Met Lys Ser Tyr Ala Asp Val His Gln Val 1010 1015 1020

Ser Glu Ala Asp Ala Arg Leu Trp Leu Gln Gln Leu Glu Glu Lys 1025 1030 1035

Gly Arg Tyr Ala Lys Asp Val Trp Ala Gly 1040 1045

<210> SEQ ID NO 23
<211> LENGTH: 1048
<212> TYPE: PRT
<213> ORGANISM: Artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Cytochrome P450 variant CYP102A1var3

<400> SEQUENCE: 23

Thr Ile Lys Glu Met Pro Gln Pro Lys Thr Phe Gly Glu Leu Lys Asn 1 5 10 15

Leu Pro Leu Leu Asn Thr Asp Lys Pro Val Gln Ala Leu Met Lys Ile 20 25 30

Ala Asp Glu Leu Gly Glu Ile Phe Lys Phe Glu Ala Pro Gly Cys Val 35 40 45

Thr Arg Tyr Leu Ser Ser Gln Arg Leu Ile Lys Glu Ala Cys Asp Glu 50 55 60

Ser Arg Phe Asp Lys Asn Leu Ser Gln Ala Leu Lys Ala Val Arg Asp 65 70 75 80

Phe Ala Gly Asp Gly Leu Phe Thr Ser Trp Thr His Glu Ile Asn Trp 85 90 95

Lys Lys Ala His Asn Ile Leu Leu Pro Ser Phe Ser Gln Gln Ala Met 100 105 110

Lys Gly Tyr His Ala Met Met Val Asp Ile Ala Val Gln Leu Val Gln 115 120 125

Lys Trp Glu Arg Leu Asn Ala Asp Glu His Ile Glu Val Ser Glu Asp 130 135 140

Met Thr Arg Leu Thr Leu Asp Thr Ile Gly Leu Cys Gly Phe Asn Tyr 145 150 155 160

Arg Phe Asn Ser Phe Tyr Arg Asp Gln Pro His Pro Phe Ile Ile Ser 165 170 175

Met Val Arg Ala Leu Asp Glu Val Met Asn Lys Leu Gln Arg Ala Asn 180 185 190

Pro Asp Asp Pro Ala Tyr Asp Glu Asn Lys Arg Gln Cys Gln Glu Asp 195 200 205

Ile Lys Val Met Asn Asp Leu Val Asp Lys Ile Ile Ala Asp Arg Lys 210 215 220

Ala Arg Gly Glu Gln Ser Asp Asp Leu Leu Thr Gln Met Leu Asn Gly 225 230 235 240

Lys Asp Pro Glu Thr Gly Glu Pro Leu Asp Asp Gly Asn Ile Ser Tyr 245 250 255

Gln Ile Ile Thr Phe Leu Ile Ala Gly His Glu Thr Thr Ser Gly Leu 260 265 270

Leu Ser Phe Ala Leu Tyr Phe Leu Val Lys Asn Pro His Val Leu Gln 275 280 285

Lys Val Ala Glu Glu Ala Ala Arg Val Leu Val Asp Pro Val Pro Ser 290 295 300

Tyr Lys Gln Val Lys Gln Leu Lys Tyr Val Gly Met Val Leu Asn Glu 305 310 315 320

Ala Leu Arg Leu Trp Pro Thr Ala Pro Ala Phe Ser Leu Tyr Ala Lys 325 330 335

Glu Asp Thr Val Leu Gly Gly Glu Tyr Pro Leu Glu Lys Gly Asp Glu 340 345 350

Val Met Val Leu Ile Pro Gln Leu His Arg Asp Lys Thr Ile Trp Gly 355 360 365

Asp Asp Val Glu Glu Phe Arg Pro Glu Arg Phe Glu Asn Pro Ser Ala 370 375 380

Ile Pro Gln His Ala Phe Lys Pro Phe Gly Asn Gly Gln Arg Ala Cys 385 390 395 400

Ile Gly Gln Gln Phe Ala Leu His Glu Ala Thr Leu Val Leu Gly Met 405 415

Met Leu His Phe Asp Phe Glu Asp His Thr Asn Glu Leu Asp 420 425 430

Ile Glu Thr Leu Thr Leu Lys Pro Glu Gly Phe Val Val Lys Ala 435 440 445

Ser Ile Pro Leu Gly Gly Ile Pro Ser Pro Ser Thr Glu 450 455 460

Gln Ser Ala Lys Val Arg Ala Glu Asn Ala His Asn Thr 465 470

Pro Leu Leu Val Leu Gly Ser Asn Met Gly Thr Ala Glu Gly Thr 485 490 495

Ala Arg Asp Leu Ala Asp Ile Ala Met Ser Lys Gly Phe Ala Pro Gln 500 505

Val Ala Thr Leu Asp Ser His Ala Gly Asn Leu Pro Arg Glu Gly Ala 515 525

Val Leu Ile Val Thr Ala Ser Asn Gly His Pro Pro Asp Asn Ala 530 535 540

Lys Gln Phe Val Asp Trp Leu Asp Gln Ala Ser Ala Asp Glu Val Lys 545 550 555 560

Gly Val Arg Ser Val Phe Gly Gly Asp Asn Trp Ala Thr 565 570 575

Thr Gln Lys Val Pro Ala Phe Ile Asp Glu Thr Leu Ala Ala 585 590

Gly Ala Glu Asn Ile Ala Asp Arg Gly Glu Ala Asp Ala Ser Asp Asp 595 605

Phe Glu Gly Thr Tyr Glu Glu Trp Arg Glu His Met Trp Ser Asp Val 610 615

Ala Ala Phe Asn Leu Asp Ile Glu Asn Ser Glu Asp Asn Ser 625 630 635 640

Thr Leu Ser Leu Gln Phe Val Asp Ser Ala Ala Asp Met Pro Leu Ala 645 650 655

Met His Gly Ala Phe Ser Thr Asn Val Val Ala Ser Lys Glu Leu 660 665 670

Gln Gln Pro Gly Ser Ala Arg Ser Thr Arg His Leu Glu Ile Glu Leu 675 685

Pro Lys Glu Ala Ser Gln Glu Gly Asp His Leu Gly Val Ile Pro 690 695 700

Arg Asn Glu Gly Ile Val Asn Arg Val Ala Arg Phe Gly Leu 705 710

Asp Ala Ser Gln Gln Ile Arg Leu Glu Ala Glu Glu Leu Ala 725 730 735

His Leu Pro Leu Ala Thr Val Ser Val Glu Leu Leu Gln Tyr 740 745 750

Val Glu Leu Gln Asp Val Thr Arg Thr Leu Arg Ala Met Ala 760 765

Ala Lys Thr Val Cys Pro His Val Leu Glu Ala Leu Leu 770 775

Glu Gln Ala Tyr Lys Glu Gln Val Leu Arg Leu Thr Met 790

Leu Glu Leu Leu Glu Pro Ala Cys Met Phe Ser Glu 805 810 815

Phe Ile Ala Leu Leu Ser Ile Arg Pro Arg Ser Ile Ser 825 830

Ser Ser Pro Arg Val Asp Glu Lys Gln Ala Ser Ile Thr Val Ser Val 835 840 845

Val Ser Gly Glu Ala Trp Ser Gly Tyr Gly Glu Tyr Gly Ile Ala 850 855 860

Ser Asn Tyr Leu Ala Glu Leu Gln Glu Gly Asp Thr Ile Thr Phe 865 875

Ile Ser Thr Pro Gln Ser Glu Phe Thr Leu Pro Asp Pro Glu Thr 885 890 895

Pro Leu Ile Met Val Gly Pro Gly Thr Gly Val Ala Pro Phe Arg Gly 900 905 910

Phe Val Gln Ala Arg Gln Leu Glu Gln Gly Gln Ser Leu Gly 915 920 925

Glu Ala His Leu Tyr Phe Gly Cys Arg Ser Pro His Glu Asp Leu 930 935 940

Tyr Gln Glu Glu Leu Glu Asn Ala Gln Ser Glu Gly Ile Ile Thr Leu 945 950 955 960

His Thr Ala Phe Ser Arg Met Pro Asn Gln Pro Thr Tyr Val Gln 965 97.

His Val Met Glu Gln Asp Gly Lys Lys Leu Ile Glu Leu Leu Asp Gln 985 990

Gly Ala His Phe Tyr Ile Cys Gly Asp Gly Ser Gln Met Ala Pro Ala 995 1005

Val Glu Ala Thr Leu Met Lys Ser Tyr Ala Asp Val His Gln Val 1010 1015 1020

Ser Glu Ala Asp Ala Arg Leu Trp Leu Gln Gln Leu Glu Glu Lys 1025 1030 1035

Gly Arg Tyr Ala Lys Asp Val Trp Ala Gly 1040 1045

<210> SEQ ID NO 24
<211> LENGTH: 1048
<212> TYPE: PRT
<213> ORGANISM: Artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Cytochrome P450 variant CYP102A1var3-2

<400> SEQUENCE: 24

Thr Ile Lys Glu Met Pro Gln Pro Thr Phe Gly Glu Leu Lys Asn 1 5 15

Leu Pro Leu Leu Asn Thr Asp Val Gln Ala Leu Met Ile 30

Ala Asp Glu Leu Gly Glu Ile Phe Phe Glu Ala Pro Gly Arg Val 35 40 45

Thr Arg Leu Ser Ser Gln Arg Leu Ile Lys Glu Ala Asp Glu 50 55 60

Ser Arg Phe Asp Asn Leu Ser Gln Ala Leu Ala Val Arg Asp 65 70

Pro Leu Gly Asp Gly Leu Phe Ala Ser Trp Thr His Glu Asn Trp 85 90 95

Ala His Asn Ile Leu Leu Pro Ser Phe Ser Gln Gln Ala Met 105 110

Gly Tyr His Ala Met Met Val Asp Ile Ala Val Gln Leu Val Gln 115 120 125

Trp Glu Arg Leu Asn Ala Asp Glu His Ile Glu Val Ser Glu Asp 130 135 140

Met Thr Arg Leu Thr Leu Asp Thr Ile Gly Leu Cys Gly Phe Asn Tyr 145 150 155 160

Arg Phe Asn Ser Phe Arg Asp Gln Pro His Pro Phe Ile Ile Ser 165 170 175

Met Val Arg Thr Leu Glu Val Met Asn Lys Leu Gln Arg Ala Asn 180 185 190

Pro Asp Asp Pro Val Asp Glu Asn Arg Gln Cys Gln Glu Asp 195

Ile Lys Val Met Asn Asp Leu Val Asp Ile Ile Ala Asp Arg 210 215

Ala Arg Gly Glu Gln Ser Asp Asp Leu Leu Thr Gln Met Leu Asn Gly 225 230 235 240

Asp Pro Glu Thr Gly Glu Pro Leu Asp Asp Gly Asn Ile Ser Tyr 245 250 255

Gln Ile Ile Thr Phe Leu Ile Ala Gly His Glu Thr Thr Ser Gly Leu 260 265 270

Leu Ser Phe Ala Leu Tyr Phe Leu Val Lys Asn Pro His Val Leu Gln 275 285

Val Ala Glu Glu Ala Ala Arg Val Leu Val Asp Pro Val Pro Ser 290 295 300

Tyr Gln Val Lys Gln Leu Val Gly Met Val Leu Asn Glu 305 310 315

Ala Leu Arg Leu Trp Pro Thr Ala Pro Ala Phe Ser Leu Ala Lys 325 330 335

Glu Asp Thr Val Leu Gly Gly Glu Tyr Pro Leu Glu Gly Asp Glu 340 345 350

Val Met Val Leu Ile Pro Gln Leu His Arg Asp Thr Ile Trp Gly 355 360 365

Asp Asp Val Glu Glu Phe Arg Pro Glu Arg Phe Glu Asn Pro Ser Ala 370 375

Ile Pro Gln His Ala Phe Pro Phe Gly Asn Gly Gln Arg Ala Cys 385 390 395 400

Ile Gly Gln Gln Phe Ala Leu His Glu Ala Thr Leu Val Leu Gly Met 405 415

Met Leu His Phe Asp Phe Glu Asp His Thr Asn Tyr Glu Leu Asp 425 430

Ile Glu Thr Leu Thr Leu Lys Pro Glu Gly Phe Val Val Ala 435 440 445

Ser Ile Pro Leu Gly Gly Ile Pro Ser Pro Ser Thr Glu 450 455 460

Gln Ser Ala Lys Val Arg Ala Glu Asn Ala His Asn Thr 465 470

Pro Leu Leu Val Leu Tyr Gly Ser Asn Met Gly Thr Ala Glu Gly Thr 485 490 495

Ala Arg Asp Leu Ala Asp Ile Ala Met Ser Lys Gly Phe Ala Pro Gln 500 505

Val Ala Thr Leu Asp Ser His Ala Gly Asn Leu Pro Arg Glu Gly Ala 515 525

Val Leu Ile Val Thr Ala Ser Asn Gly His Pro Pro Asp Asn Ala 530 535 540

Lys Gln Phe Val Asp Trp Leu Asp Gln Ala Ser Ala Asp Glu Val Lys 545 550 555 560

Gly Val Arg Ser Val Phe Gly Gly Asp Asn Trp Ala Thr 565 570 575

Thr Gln Lys Val Pro Ala Phe Ile Asp Glu Thr Leu Ala Ala 585 590

Gly Ala Glu Asn Ile Ala Asp Arg Gly Glu Ala Asp Ala Ser Asp Asp 595 605

Phe Glu Gly Thr Tyr Glu Glu Trp Arg Glu His Met Trp Ser Asp Val 610 615

Ala Ala Phe Asn Leu Asp Ile Glu Asn Ser Glu Asp Asn Ser 625 630 635 640

Thr Leu Ser Leu Gln Phe Val Asp Ser Ala Ala Asp Met Pro Leu Ala 645 650 655

Met His Gly Ala Phe Ser Thr Asn Val Val Ala Ser Lys Glu Leu 660 665 670

Gln Gln Pro Gly Ser Ala Arg Ser Thr Arg His Leu Glu Ile Glu Leu 675 685

Pro Lys Glu Ala Ser Gln Glu Gly Asp His Leu Gly Val Ile Pro 690 695 700

Arg Asn Glu Gly Ile Val Asn Arg Val Ala Arg Phe Gly Leu 705 710

Asp Ala Ser Gln Gln Ile Arg Leu Glu Ala Glu Glu Leu Ala 725 730 735

His Leu Pro Leu Ala Thr Val Ser Val Glu Leu Leu Gln 740 745 750

Val Glu Leu Gln Asp Val Thr Arg Thr Leu Arg Ala Met Ala 760 765

Ala Lys Thr Val Cys Pro His Val Leu Glu Ala Leu Leu 770 775

Glu Gln Ala Tyr Lys Glu Gln Val Leu Arg Leu Thr Met 790

Leu Glu Leu Leu Glu Pro Ala Cys Met Phe Ser Glu 805 810 815

Phe Ile Ala Leu Leu Ser Ile Arg Pro Arg Ser Ile Ser 825 830

Ser Ser Pro Arg Val Asp Glu Lys Gln Ala Ser Ile Thr Val Ser Val 835 840 845

Val Ser Gly Glu Ala Trp Ser Gly Gly Glu Tyr Gly Ile Ala 850 855 860

Ser Asn Leu Ala Glu Leu Gln Glu Gly Asp Thr Ile Thr Phe 865

Ile Ser Thr Pro Gln Ser Glu Phe Thr Leu Pro Asp Pro Glu Thr 885 890 895

Pro Leu Ile Met Val Gly Pro Gly Thr Gly Val Ala Pro Phe Arg Gly 900 905 910

Phe Val Gln Ala Arg Gln Leu Glu Gln Gly Gln Ser Leu Gly 915 920 925

Glu Ala His Leu Tyr Phe Gly Arg Ser Pro His Glu Asp Leu 930 935 940

Tyr Gln Glu Glu Leu Glu Asn Ala Gln Ser Glu Gly Ile Ile Thr Leu 945 950 955 960

His Thr Ala Phe Ser Arg Met Pro Asn Gln Pro Thr Val Gln 965 975

His Val Met Glu Gln Asp Gly Lys Leu Ile Glu Leu Leu Asp Gln 980 985 990

Gly Ala His Phe Tyr Ile Cys Gly Asp Gly Ser Gln Met Ala Pro Ala 995 1000 1005

Val Glu Ala Thr Leu Met Lys Ser Tyr Ala Asp Val His Gln Val 1010 1015 1020

Ser Glu Ala Asp Ala Arg Leu Trp Leu Gln Gln Leu Glu Glu Lys 1025 1030 1035

Gly Arg Tyr Ala Lys Asp Val Trp Ala Gly 1040 1045

<210> SEQ ID NO 25 <211> LENGTH: 1048 <212> TYPE: PRT <213> ORGANISM: Artificial sequence <220> FEATURE: <223> OTHER INFORMATION: Cytochrome P450 variant CYP102A1var3-3

<400> SEQUENCE: 25

Thr Ile Lys Glu Met Pro Gln Pro Lys Thr Phe Gly Glu Leu Lys Asn 1 5 10 15

Leu Pro Leu Leu Asn Thr Asp Lys Pro Val Gln Ala Leu Met Lys Ile 20 25 30

Ala Asp Glu Leu Gly Glu Ile Phe Lys Phe Glu Ala Pro Gly Arg Val 35 40 45

Thr Arg Tyr Leu Ser Ser Gln Arg Leu Ile Lys Glu Ala Cys Asp Glu 50 55 60

Ser Arg Phe Asp Lys Asn Leu Ser Gln Ala Leu Lys Ala Val Arg Asp 65 70 75 80

Cys Pro Gly Asp Gly Leu Ala Thr Ser Trp Thr His Glu Lys Asn Trp 85 90 95

Lys Lys Ala His Asn Ile Leu Leu Pro Ser Phe Ser Gln Gln Ala Met 100 105 110

Lys Gly Tyr His Ala Met Met Val Asp Ile Ala Val Gln Leu Val Gln 115 120 125

Lys Trp Glu Arg Leu Asn Ala Asp Glu His Ile Glu Val Ser Glu Asp 130 135 140

Met Thr Arg Leu Thr Leu Asp Thr Ile Gly Leu Cys Gly Phe Asn Tyr 145 150 155 160

Arg Phe Asn Ser Phe Tyr Arg Asp Gln Pro His Pro Phe Ile Ile Ser 165 170 175

Met Val Arg Thr Leu Asp Glu Val Met Asn Lys Leu Gln Arg Ala Asn 180 185 190

Pro Asp Asp Pro Val Tyr Asp Glu Asn Lys Arg Gln Cys Gln Glu Asp 195 200 205

Ile Lys Val Met Asn Asp Leu Val Asp Lys Ile Ile Ala Asp Arg Lys 210 215 220

Ala Arg Gly Glu Gln Ser Asp Asp Leu Leu Thr Gln Met Leu Asn Gly 225 230 235 240

Lys Asp Pro Glu Thr Gly Glu Pro Leu Asp Asp Gly Asn Ile Ser Tyr 245 250 255

Gln Ile Ile Thr Phe Leu Ile Ala Gly His Glu Thr Thr Ser Gly Leu 260 265 270

Leu Ser Phe Ala Leu Tyr Phe Leu Val Lys Asn Pro His Val Leu Gln 275 280 285

Lys Val Ala Glu Glu Ala Ala Arg Val Leu Val Asp Pro Val Pro Ser 290 295 300

Tyr Lys Gln Val Lys Gln Leu Lys Tyr Val Gly Met Val Leu Asn Glu 305 310 315

Ala Leu Arg Leu Trp Pro Thr Ala Pro Ala Phe Ser Leu Ala 325 330 335

Glu Asp Thr Val Leu Gly Gly Glu Tyr Pro Leu Glu Gly Asp Glu 340 345 350

Val Met Val Leu Ile Pro Gln Leu His Arg Asp Thr Ile Trp Gly 355 360 365

Asp Asp Val Glu Glu Phe Arg Pro Glu Arg Phe Glu Asn Pro Ser Ala 370 375

Ile Pro Gln His Ala Phe Pro Phe Gly Asn Gly Gln Arg Ala Cys 385 390 395 400

Ile Gly Gln Gln Phe Ala Leu His Glu Ala Thr Leu Val Leu Gly Met 405 415

Met Leu His Phe Asp Phe Glu Asp His Thr Asn Glu Leu Asp 425 430

Ile Glu Thr Leu Thr Leu Lys Pro Glu Gly Phe Val Val Ala 435 440 445

Ser Ile Pro Leu Gly Gly Ile Pro Ser Pro Ser Thr Glu 450 455 460

Gln Ser Ala Lys Val Arg Ala Glu Asn Ala His Asn Thr 465 470

Pro Leu Leu Val Leu Gly Ser Asn Met Gly Thr Ala Glu Gly Thr 485 490 495

Ala Arg Asp Leu Ala Asp Ile Ala Met Ser Gly Phe Ala Pro Gln 500 505

Val Ala Thr Leu Asp Ser His Ala Gly Asn Leu Pro Arg Glu Gly Ala 515 520 525

Val Leu Ile Val Thr Ala Ser Asn Gly His Pro Pro Asp Asn Ala 530 535 540

Lys Gln Phe Val Asp Trp Leu Asp Gln Ala Ser Ala Asp Glu Val Lys 545 550 555 560

Gly Val Arg Ser Val Phe Gly Gly Asp Asn Trp Ala Thr 565 570 575

Thr Gln Lys Val Pro Ala Phe Ile Asp Glu Thr Leu Ala Ala 585 590

Gly Ala Glu Asn Ile Ala Asp Arg Gly Glu Ala Asp Ala Ser Asp Asp 595 605

Phe Glu Gly Thr Tyr Glu Glu Trp Arg Glu His Met Trp Ser Asp Val 610 615

Ala Ala Phe Asn Leu Asp Ile Glu Asn Ser Glu Asp Asn Ser 625 630 635 640

Thr Leu Ser Leu Gln Phe Val Asp Ser Ala Ala Asp Met Pro Leu Ala 645 650 655

Met His Gly Ala Phe Ser Thr Asn Val Val Ala Ser Lys Glu Leu 660 665 670

Gln Gln Pro Gly Ser Ala Arg Ser Thr Arg His Leu Glu Ile Glu Leu 675 685

Pro Lys Glu Ala Ser Gln Glu Gly Asp His Leu Gly Val Ile Pro 690 695 700

Arg Asn Glu Gly Ile Val Asn Arg Val Thr Ala Arg Phe Gly Leu 705 710

Asp Ala Ser Gln Gln Ile Arg Leu Glu Ala Glu Glu Glu Leu Ala 725 730 735

His Leu Pro Leu Ala Thr Val Ser Val Glu Glu Leu Leu Gln Tyr 740 745 750

Val Glu Leu Gln Asp Val Thr Arg Thr Gln Leu Arg Ala Met Ala 760 765

Ala Lys Thr Val Cys Pro His Val Glu Leu Glu Ala Leu Leu 770 775

Glu Gln Ala Tyr Lys Glu Gln Val Leu Ala Arg Leu Thr Met 790 795

Leu Glu Leu Leu Glu Pro Ala Cys Glu Met Phe Ser Glu 805 810 815

Phe Ile Ala Leu Leu Pro Ser Ile Arg Pro Arg Ser Ile Ser 825 830

Ser Ser Pro Arg Val Asp Glu Lys Gln Ala Ser Ile Thr Val Ser Val 835 840 845

Val Ser Gly Glu Ala Trp Ser Gly Tyr Gly Glu Tyr Gly Ile Ala 850 855 860

Ser Asn Tyr Leu Ala Glu Leu Gln Glu Gly Asp Thr Ile Thr Phe 865 875

Ile Ser Thr Pro Gln Ser Glu Phe Thr Leu Pro Asp Pro Glu Thr 885 890 895

Pro Leu Ile Met Val Gly Pro Gly Thr Gly Val Ala Pro Phe Arg Gly 900 905 910

Phe Val Gln Ala Arg Gln Leu Glu Gln Gly Gln Ser Leu Gly 915 920 925

Glu Ala His Leu Tyr Phe Gly Cys Arg Ser Pro His Glu Asp Leu 930 935 940

Tyr Gln Glu Glu Leu Glu Asn Ala Gln Ser Glu Gly Ile Ile Thr Leu 945 950 955 960

His Thr Ala Phe Ser Arg Met Pro Asn Gln Pro Thr Tyr Val Gln 965 970 975

His Val Met Glu Gln Asp Gly Lys Lys Leu Ile Glu Leu Leu Asp Gln 985 990

Gly Ala His Phe Tyr Ile Cys Gly Asp Gly Ser Gln Met Ala Pro Ala 995 1005

Val Glu Ala Thr Leu Met Lys Ser Tyr Ala Asp Val His Gln Val 1010 1015 1020

Ser Glu Ala Asp Ala Arg Leu Trp Leu Gln Gln Leu Glu Glu Lys 1025 1030 1035

Gly Arg Tyr Ala Lys Asp Val Trp Ala Gly 1040 1045

<210> SEQ ID NO 26 <211> LENGTH: 1048 <212> TYPE: PRT <213> ORGANISM: Artificial sequence <220> FEATURE: <223> OTHER INFORMATION: Cytochrome P450 variant CYP102A1var3-4

<400> SEQUENCE: 26

Thr Ile Lys Glu Met Pro Gln Pro Thr Phe Gly Glu Leu Lys Asn 1 5 10 15

Leu Pro Leu Leu Asn Thr Asp Val Gln Ala Leu Met Lys Ile

Ala Asp Glu Leu Gly Glu Ile Phe Phe Glu Ala Pro Gly Arg Val 35 40 45

Thr Arg Tyr Leu Ser Ser Gln Arg Leu Ile Lys Glu Ala Cys Asp Glu 50 55 60

Ser Arg Phe Asp Asn Leu Ser Gln Ala Leu Ala Val Arg Asp 65 70

Trp Ile Gly Asp Gly Leu Ala Thr Ser Trp Thr His Glu Asn Trp 85 90 95

Ala His Asn Ile Leu Leu Pro Ser Phe Ser Gln Gln Ala Met 105 110

Gly Tyr His Ala Met Met Val Asp Ile Ala Val Gln Leu Val Gln 115 120 125

Trp Glu Arg Leu Asn Ala Asp Glu His Ile Glu Val Ser Glu Asp 130 135 140

Met Thr Arg Leu Thr Leu Asp Thr Ile Gly Leu Gly Phe Asn Tyr 145 150 155 160

Arg Phe Asn Ser Phe Arg Asp Gln Pro His Pro Phe Ile Ile Ser 165 170 175

Met Val Arg Thr Leu Glu Val Met Asn Lys Leu Gln Arg Ala Asn 180 185 190

Pro Asp Asp Pro Val Asp Glu Asn Arg Gln Cys Gln Glu Asp 195

Ile Lys Val Met Asn Asp Leu Val Asp Ile Ile Ala Asp Arg 210 215

Ala Arg Gly Glu Gln Ser Asp Asp Leu Leu Thr Gln Met Leu Asn Gly 225 230 235 240

Asp Pro Glu Thr Gly Glu Pro Leu Asp Asp Gly Asn Ile Ser Tyr 245 250 255

Gln Ile Ile Thr Phe Leu Ile Ala Gly His Glu Thr Thr Ser Gly Leu 260 265 270

Leu Ser Phe Ala Leu Tyr Phe Leu Val Asn Pro His Val Leu Gln 275 285

Val Ala Glu Glu Ala Ala Arg Val Leu Val Asp Pro Val Pro Ser 290 295 300

Tyr Gln Val Lys Gln Leu Val Gly Met Val Leu Asn Glu 305 310 315

Ala Leu Arg Leu Trp Pro Thr Ala Pro Ala Phe Ser Leu Tyr Ala Lys 325 330 335

Glu Asp Thr Val Leu Gly Gly Glu Tyr Pro Leu Glu Gly Asp Glu 340 345 350

Val Met Val Leu Ile Pro Gln Leu His Arg Asp Thr Ile Trp Gly 355 360 365

Asp Asp Val Glu Glu Phe Arg Pro Glu Arg Phe Glu Asn Pro Ser Ala 370 375 380

Ile Pro Gln His Ala Phe Pro Phe Gly Asn Gly Gln Arg Ala Cys 385 390 395 400

Ile Gly Gln Gln Phe Ala Leu His Glu Ala Thr Leu Val Leu Gly Met 405 415

Met Leu His Phe Asp Phe Glu Asp His Thr Asn Glu Leu Asp 425 430

Ile Glu Thr Leu Thr Leu Lys Pro Glu Gly Phe Val Val Ala 435 440 445

Ser Ile Pro Leu Gly Gly Ile Pro Ser Pro Ser Thr Glu 450 455 460

Gln Ser Ala Lys Val Arg Ala Glu Asn Ala His Asn Thr 465 470 475 480

Pro Leu Leu Val Leu Gly Ser Asn Met Gly Thr Ala Glu Gly Thr 485 490 495

Ala Arg Asp Leu Ala Asp Ile Ala Met Ser Lys Gly Phe Ala Pro Gln 500 505

Val Ala Thr Leu Asp Ser His Ala Gly Asn Leu Pro Arg Glu Gly Ala 515 525

Val Leu Ile Val Thr Ala Ser Asn Gly His Pro Pro Asp Asn Ala 530 535 540

Lys Gln Phe Val Asp Trp Leu Asp Gln Ala Ser Ala Asp Glu Val Lys 545 550 555 560

Gly Val Arg Ser Val Phe Gly Gly Asp Asn Trp Ala Thr 565 570 575

Thr Gln Lys Val Pro Ala Phe Ile Asp Glu Thr Leu Ala Ala 585 590

Gly Ala Glu Asn Ile Ala Asp Arg Gly Glu Ala Asp Ala Ser Asp Asp 595 605

Phe Glu Gly Thr Tyr Glu Glu Trp Arg Glu His Met Trp Ser Asp Val 610 615

Ala Ala Phe Asn Leu Asp Ile Glu Asn Ser Glu Asp Asn Ser 625 630 635 640

Thr Leu Ser Leu Gln Phe Val Asp Ser Ala Ala Asp Met Pro Leu Ala 645 650 655

Met His Gly Ala Phe Ser Thr Asn Val Val Ala Ser Lys Glu Leu 660 665 670

Gln Gln Pro Gly Ser Ala Arg Ser Thr Arg His Leu Glu Ile Glu Leu 675 685

Pro Lys Glu Ala Ser Gln Glu Gly Asp His Leu Gly Val Ile Pro 690 695 700

Arg Asn Glu Gly Ile Val Asn Arg Val Ala Arg Phe Gly Leu 705 710

Asp Ala Ser Gln Gln Ile Arg Leu Glu Ala Glu Glu Leu Ala 725 730 735

His Leu Pro Leu Ala Thr Val Ser Val Glu Leu Leu Gln 740 745 750

Val Glu Leu Gln Asp Val Thr Arg Thr Leu Arg Ala Met Ala 760 765

Ala Lys Thr Val Cys Pro His Val Leu Glu Ala Leu Leu 770 775

Glu Gln Ala Tyr Lys Glu Gln Val Leu Arg Leu Thr Met 790

Leu Glu Leu Leu Glu Pro Ala Cys Met Phe Ser Glu 805 810 815

Phe Ile Ala Leu Leu Ser Ile Arg Pro Arg Ser Ile Ser 825 830

Ser Ser Pro Arg Val Asp Glu Lys Gln Ala Ser Ile Thr Val Ser Val 835 840 845

Val Ser Gly Glu Ala Trp Ser Gly Gly Glu Tyr Gly Ile Ala 850 855 860

Ser Asn Leu Ala Glu Leu Gln Glu Gly Asp Thr Ile Thr Phe 865

Ile Ser Thr Pro Gln Ser Glu Phe Thr Leu Pro Asp Pro Glu Thr 885 890 895

Pro Leu Ile Met Val Gly Pro Gly Thr Gly Val Ala Pro Phe Arg Gly 900 905 910

Phe Val Gln Ala Arg Gln Leu Glu Gln Gly Gln Ser Leu Gly 915 920 925

Glu Ala His Leu Tyr Phe Gly Cys Arg Ser Pro His Glu Asp Leu 930 935 940

Tyr Gln Glu Glu Leu Glu Asn Ala Gln Ser Glu Gly Ile Ile Thr Leu 945 950 955 960

His Thr Ala Phe Ser Arg Met Pro Asn Gln Pro Thr Tyr Val Gln 965 975

His Val Met Glu Gln Asp Gly Lys Lys Leu Ile Glu Leu Leu Asp Gln 985 990

Gly Ala His Phe Tyr Ile Cys Gly Asp Gly Ser Gln Met Ala Pro Ala 995 1000

Val Glu Ala Thr Leu Met Lys Ser Tyr Ala Asp Val His Gln Val 1010 1015 1020

Ser Glu Ala Asp Ala Arg Leu Trp Leu Gln Gln Leu Glu Glu Lys 1025 1030 1035

Gly Arg Tyr Ala Lys Asp Val Trp Ala Gly 1040 1045

<210> SEQ ID NO 27 <211> LENGTH: 1048 <212> TYPE: PRT <213> ORGANISM: Artificial sequence <220> FEATURE: <223> OTHER INFORMATION: Cytochrome P450 variant CYP102A1var3-5

<400> SEQUENCE: 27

Thr Ile Lys Glu Met Pro Gln Pro Thr Phe Gly Glu Leu Lys Asn 1 5 15

Leu Pro Leu Leu Asn Thr Asp Val Gln Ala Leu Met Ile 30

Ala Asp Glu Leu Gly Glu Ile Phe Phe Glu Ala Pro Gly Arg Val 35 40 45

Thr Arg Tyr Leu Ser Ser Gln Arg Leu Ile Glu Ala Asp Glu 50 55 60

Ser Arg Phe Asp Asn Leu Ser Gln Ala Leu Ala Val Arg Asp 65 70

Phe Gly Gly Asp Gly Leu Val Thr Ser Trp Thr His Glu Asn Trp 85 90 95

Lys Ala His Asn Ile Leu Leu Pro Ser Phe Ser Gln Gln Ala Met 105 110

Gly Tyr His Ala Met Met Val Asp Ile Ala Val Gln Leu Val Gln 115 120 125

Trp Glu Arg Leu Asn Ala Asp Glu His Ile Glu Val Ser Glu Asp 130 135 140

Met Thr Arg Leu Thr Leu Asp Thr Ile Gly Leu Gly Phe Asn Tyr 145 150 155 160

Arg Phe Asn Ser Phe Arg Asp Gln Pro His Pro Phe Ile Ile Ser 165 170 175

Met Val Arg Ala Leu Glu Val Met Asn Lys Leu Gln Arg Ala Asn 180 185 190

Pro Asp Asp Pro Ala Asp Glu Asn Arg Gln Cys Gln Glu Asp 195

Ile Lys Val Met Asn Asp Leu Val Asp Ile Ile Ala Asp Arg 210 215 220

Ala Arg Gly Glu Gln Ser Asp Asp Leu Leu Thr Gln Met Leu Asn Gly 225 230 235 240

Asp Pro Glu Thr Gly Glu Pro Leu Asp Asp Gly Asn Ile Ser 245 250 255

Gln Ile Ile Thr Phe Leu Ile Ala Gly His Glu Thr Thr Ser Gly Leu 260 265 270

Leu Ser Phe Ala Leu Tyr Phe Leu Val Asn Pro His Val Leu Gln 275 285

Val Ala Glu Glu Ala Ala Arg Val Leu Val Asp Pro Val Pro Ser 290 295 300

Tyr Gln Val Lys Gln Leu Val Gly Met Val Leu Asn Glu 305 310 315

Ala Leu Arg Leu Trp Pro Thr Val Pro Ala Phe Ser Leu Tyr Ala 325 330 335

Glu Asp Thr Val Leu Gly Gly Glu Tyr Pro Leu Glu Gly Asp Glu 340 345 350

Val Met Val Leu Ile Pro Gln Leu His Arg Asp Thr Ile Trp Gly 355 360 365

Asp Asp Val Glu Glu Phe Arg Pro Glu Arg Phe Glu Asn Pro Ser Ala 370 375

Ile Pro Gln His Ala Phe Pro Phe Gly Asn Gly Gln Arg Ala Cys 385 390 395 400

Ile Gly Gln Gln Phe Ala Leu His Glu Ala Thr Leu Val Leu Gly Met 405 415

Met Leu His Phe Asp Phe Glu Asp His Thr Asn Tyr Glu Leu Asp 425 430

Ile Glu Thr Leu Thr Leu Lys Pro Glu Gly Phe Val Val Ala 435 440 445

Ser Ile Pro Leu Gly Gly Ile Pro Ser Pro Ser Thr Glu 450 455 460

Gln Ser Ala Lys Val Arg Ala Glu Asn Ala His Asn Thr 465 470

Pro Leu Leu Val Leu Tyr Gly Ser Asn Met Gly Thr Ala Glu Gly Thr 485 490 495

Ala Arg Asp Leu Ala Asp Ile Ala Met Ser Gly Phe Ala Pro Gln 500 505

Val Ala Thr Leu Asp Ser His Ala Gly Asn Leu Pro Arg Glu Gly Ala 515 525

Val Leu Ile Val Thr Ala Ser Asn Gly His Pro Pro Asp Asn Ala 530 535 540

Lys Gln Phe Val Asp Trp Leu Asp Gln Ala Ser Ala Asp Glu Val Lys 545 550 555 560

Gly Val Arg Ser Val Phe Gly Gly Asp Asn Trp Ala Thr 565 570 575

Thr Gln Lys Val Pro Ala Phe Ile Asp Glu Thr Leu Ala Ala 585 590

Gly Ala Glu Asn Ile Ala Asp Arg Gly Glu Ala Asp Ala Ser Asp Asp 595 605

Phe Glu Gly Thr Tyr Glu Glu Trp Arg Glu His Met Trp Ser Asp Val 610 615

Ala Ala Phe Asn Leu Asp Ile Glu Asn Ser Glu Asp Asn Ser 625 630 635 640

Thr Leu Ser Leu Gln Phe Val Asp Ser Ala Ala Asp Met Pro Leu Ala 645 650 655

Met His Gly Ala Phe Ser Thr Asn Val Val Ala Ser Lys Glu Leu 660 665 670

Gln Gln Pro Gly Ser Ala Arg Ser Thr Arg His Leu Glu Ile Glu Leu 675 685

Pro Lys Glu Ala Ser Gln Glu Gly Asp His Leu Gly Val Ile Pro 690 695 700

Arg Asn Glu Gly Ile Val Asn Arg Val Ala Arg Phe Gly Leu 705 710

Asp Ala Ser Gln Gln Ile Arg Leu Glu Ala Glu Glu Leu Ala 725 730 735

His Leu Pro Leu Ala Thr Val Ser Val Glu Leu Leu Gln Tyr 740 745 750

Val Glu Leu Gln Asp Val Thr Arg Thr Leu Arg Ala Met Ala 760 765

Ala Lys Thr Val Cys Pro His Val Leu Glu Ala Leu Leu 770 775

Glu Gln Ala Tyr Lys Glu Gln Val Leu Arg Leu Thr Met 790

Leu Glu Leu Leu Glu Pro Ala Cys Met Phe Ser Glu 805 810 815

Phe Ile Ala Leu Leu Ser Ile Arg Pro Arg Ser Ile Ser 825 830

Ser Ser Pro Arg Val Asp Glu Lys Gln Ala Ser Ile Thr Val Ser Val 835 840 845

Val Ser Gly Glu Ala Trp Ser Gly Gly Glu Tyr Gly Ile Ala 850 855 860

Ser Asn Leu Ala Glu Leu Gln Glu Gly Asp Thr Ile Thr Phe 865

Ile Ser Thr Pro Gln Ser Glu Phe Thr Leu Pro Asp Pro Glu Thr 885 890 895

Pro Leu Ile Met Val Gly Pro Gly Thr Gly Val Ala Pro Phe Arg Gly 900 905 910

Phe Val Gln Ala Arg Gln Leu Glu Gln Gly Gln Ser Leu Gly 915 920 925

Glu Ala His Leu Tyr Phe Gly Arg Ser Pro His Glu Asp Leu 930 935 940

Tyr Gln Glu Glu Leu Glu Asn Ala Gln Ser Glu Gly Ile Ile Thr Leu 945 950 955 960

His Thr Ala Phe Ser Arg Met Pro Asn Gln Pro Thr Tyr Val Gln 965 975

His Val Met Glu Gln Asp Gly Lys Lys Leu Ile Glu Leu Leu Asp Gln 985 990

Gly Ala His Phe Tyr Ile Cys Gly Asp Gly Ser Gln Met Ala Pro Ala 995 1005

Val Glu Ala Thr Leu Met Lys Ser Tyr Ala Asp Val His Gln Val 1010 1015 1020

Ser Glu Ala Asp Ala Arg Leu Trp Leu Gln Gln Leu Glu Glu Lys 1025 1030 1035

Gly Arg Tyr Ala Lys Asp Val Trp Ala Gly 1040 1045

<210> SEQ ID NO 28 <211> LENGTH: 1048 <212> TYPE: PRT <213> ORGANISM: Artificial sequence <220> FEATURE: <223> OTHER INFORMATION: Cytochrome P450 variant CYP102A1var3-8

<400> SEQUENCE: 28

Thr Ile Lys Glu Met Pro Gln Pro Lys Thr Phe Gly Glu Leu Lys Asn 1 5 10 15

Leu Pro Leu Leu Asn Thr Asp Lys Pro Val Gln Ala Leu Met Lys Ile 20 25 30

Ala Asp Glu Leu Gly Glu Ile Phe Lys Phe Glu Ala Pro Gly Cys Val 35 40 45

Thr Arg Tyr Leu Ser Ser Gln Arg Leu Ile Lys Glu Ala Cys Asp Glu 50 55 60

Ser Arg Phe Asp Lys Asn Leu Ser Gln Ala Leu Lys Ala Val Arg Asp 65 70 75 80

Phe Ala Gly Asp Gly Leu Ile Thr Ser Trp Thr His Glu Ile Asn Trp 85 90 95

Lys Lys Ala His Asn Ile Leu Leu Pro Ser Phe Ser Gln Gln Ala Met 100 105 110

Lys Gly Tyr His Ala Met Met Val Asp Ile Ala Val Gln Leu Val Gln 115 120 125

Lys Trp Glu Arg Leu Asn Ala Asp Glu His Ile Glu Val Ser Glu Asp 130 135 140

Met Thr Arg Leu Thr Leu Asp Thr Ile Gly Leu Cys Gly Phe Asn Tyr 145 150 155 160

Arg Phe Asn Ser Phe Tyr Arg Asp Gln Pro His Pro Phe Ile Ile Ser 165 170 175

Met Val Arg Ala Leu Asp Glu Val Met Asn Lys Leu Gln Arg Ala Asn 180 185 190

Pro Asp Asp Pro Ala Tyr Asp Glu Asn Lys Arg Gln Cys Gln Glu Asp 195 200 205

Ile Lys Val Met Asn Asp Leu Val Asp Lys Ile Ile Ala Asp Arg Lys 210 215 220

Ala Arg Gly Glu Gln Ser Asp Asp Leu Leu Thr Gln Met Leu Asn Gly 225 230 235 240

Lys Asp Pro Glu Thr Gly Glu Pro Leu Asp Asp Gly Asn Ile Ser Tyr 245 250 255

Gln Ile Ile Thr Phe Leu Ile Ala Gly His Glu Thr Thr Ser Gly Leu 260 265 270

Leu Ser Phe Ala Leu Tyr Phe Leu Val Lys Asn Pro His Val Leu Gln 275 280 285

Lys Val Ala Glu Glu Ala Ala Arg Val Leu Val Asp Pro Val Pro Ser 290 295 300

Tyr Lys Gln Val Lys Gln Leu Lys Tyr Val Gly Met Val Leu Asn Glu 305 310 315 320

Ala Leu Arg Leu Trp Pro Thr Ala Pro Ala Phe Ser Leu Tyr Ala Lys 325 330 335

Glu Asp Thr Val Leu Gly Gly Glu Tyr Pro Leu Glu Lys Gly Asp Glu 340 345 350

Val Met Val Leu Ile Pro Gln Leu His Arg Asp Lys Thr Ile Trp Gly 355 360 365

Asp Asp Val Glu Glu Phe Arg Pro Glu Arg Phe Glu Asn Pro Ser Ala 370 375 380

Ile Pro Gln His Ala Phe Lys Pro Phe Gly Asn Gly Gln Arg Ala Cys 385 390 395 400

Ile Gly Gln Gln Phe Ala Leu His Glu Ala Thr Leu Val Leu Gly Met 405 415

Met Leu His Phe Asp Phe Glu Asp His Thr Asn Glu Leu Asp 420 425 430

Ile Glu Thr Leu Thr Leu Lys Pro Glu Gly Phe Val Val Lys Ala 435 440 445

Ser Ile Pro Leu Gly Gly Ile Pro Ser Pro Ser Thr Glu 450 455 460

Gln Ser Ala Lys Val Arg Ala Glu Asn Ala His Asn Thr 465 470

Pro Leu Leu Val Leu Gly Ser Asn Met Gly Thr Ala Glu Gly Thr 485 490 495

Ala Arg Asp Leu Ala Asp Ile Ala Met Ser Lys Gly Phe Ala Pro Gln 500 505

Val Ala Thr Leu Asp Ser His Ala Gly Asn Leu Pro Arg Glu Gly Ala 515 525

Val Leu Ile Val Thr Ala Ser Asn Gly His Pro Pro Asp Asn Ala 530 535 540

Lys Gln Phe Val Asp Trp Leu Asp Gln Ala Ser Ala Asp Glu Val Lys 545 550 555 560

Gly Val Arg Ser Val Phe Gly Gly Asp Asn Trp Ala Thr 565 570 575

Thr Gln Lys Val Pro Ala Phe Ile Asp Glu Thr Leu Ala Ala 585 590

Gly Ala Glu Asn Ile Ala Asp Arg Gly Glu Ala Asp Ala Ser Asp Asp 595 605

Phe Glu Gly Thr Tyr Glu Glu Trp Arg Glu His Met Trp Ser Asp Val 610 615

Ala Ala Phe Asn Leu Asp Ile Glu Asn Ser Glu Asp Asn Ser 625 630 635 640

Thr Leu Ser Leu Gln Phe Val Asp Ser Ala Ala Asp Met Pro Leu Ala 645 650 655

Met His Gly Ala Phe Ser Thr Asn Val Val Ala Ser Lys Glu Leu 660 665 670

Gln Gln Pro Gly Ser Ala Arg Ser Thr Arg His Leu Glu Ile Glu Leu 675 685

Pro Lys Glu Ala Ser Gln Glu Gly Asp His Leu Gly Val Ile Pro 690 695 700

Arg Asn Glu Gly Ile Val Asn Arg Val Ala Arg Phe Gly Leu 705 710

Asp Ala Ser Gln Gln Ile Arg Leu Glu Ala Glu Glu Leu Ala 725 730 735

His Leu Pro Leu Ala Thr Val Ser Val Glu Leu Leu Gln Tyr 740 745 750

Val Glu Leu Gln Asp Val Thr Arg Thr Leu Arg Ala Met Ala 760 765

Ala Lys Thr Val Cys Pro His Val Leu Glu Ala Leu Leu 770 775

Glu Gln Ala Tyr Lys Glu Gln Val Leu Arg Leu Thr Met 790

Leu Glu Leu Leu Glu Pro Ala Cys Met Phe Ser Glu 805 810 815

Phe Ile Ala Leu Leu Ser Ile Arg Pro Arg Ser Ile Ser 825 830

Ser Ser Pro Arg Val Asp Glu Lys Gln Ala Ser Ile Thr Val Ser Val 835 840 845

Val Ser Gly Glu Ala Trp Ser Gly Tyr Gly Glu Tyr Gly Ile Ala 850 855 860

Ser Asn Tyr Leu Ala Glu Leu Gln Glu Gly Asp Thr Ile Thr Phe 865 875

Ile Ser Thr Pro Gln Ser Glu Phe Thr Leu Pro Asp Pro Glu Thr 885 890 895

Pro Leu Ile Met Val Gly Pro Gly Thr Gly Val Ala Pro Phe Arg Gly 900 905 910

Phe Val Gln Ala Arg Gln Leu Glu Gln Gly Gln Ser Leu Gly 915 920 925

Glu Ala His Leu Tyr Phe Gly Cys Arg Ser Pro His Glu Asp Leu 930 935 940

Tyr Gln Glu Glu Leu Glu Asn Ala Gln Ser Glu Gly Ile Ile Thr Leu 945 950 955 960

His Thr Ala Phe Ser Arg Met Pro Asn Gln Pro Thr Tyr Val Gln 965 975

His Val Met Glu Gln Asp Gly Lys Lys Leu Ile Glu Leu Leu Asp Gln 985 990

Gly Ala His Phe Tyr Ile Cys Gly Asp Gly Ser Gln Met Ala Pro Ala 995 1005

Val Glu Ala Thr Leu Met Lys Ser Tyr Ala Asp Val His Gln Val 1010 1015

Ser Glu Ala Asp Ala Arg Leu Trp Leu Gln Gln Leu Glu Glu Lys 1025 1035

Gly Arg Tyr Ala Lys Asp Val Trp Ala Gly 1040 1045

<210> SEQ ID NO 29 <211> LENGTH: 1048 <212> TYPE: PRT <213> ORGANISM: Artificial sequence <220> FEATURE: <223> OTHER INFORMATION: Cytochrome P450 variant CYP102A1var3-7

<400> SEQUENCE: 29

Thr Ile Lys Glu Met Pro Gln Pro Thr Phe Gly Glu Leu Lys Asn 1 5 15

Leu Pro Leu Leu Asn Thr Asp Val Gln Ala Leu Met Ile 30

Ala Asp Glu Leu Gly Glu Ile Phe Phe Glu Ala Pro Gly Val 35 40 45

Thr Arg Leu Ser Ser Gln Arg Leu Ile Lys Glu Ala Asp Glu 50 55 60

Ser Arg Phe Asp Asn Leu Ser Gln Ala Leu Ala Val Arg Asp 65 70

Phe Ala Gly Asp Gly Leu Ala Thr Ser Trp Thr His Glu Ile Asn Trp 85 90 95

Ala His Asn Ile Leu Leu Pro Ser Phe Ser Gln Gln Ala Met 105 110

Gly Tyr His Ala Met Met Val Asp Ile Ala Val Gln Leu Val Gln 115 120 125

Trp Glu Arg Leu Asn Ala Asp Glu His Ile Glu Val Ser Glu Asp 130 135 140