US 20030003528A1 (19) United States (12) Patent Application Publication (10) Pub. No.: US 2003/0003528A1 Brzostowicz et al. (43) Pub. Date: Jan. 2, 2003

(54) PRODUCTION FROM A (21) Appl. No.: 09/941,947 SINGLE CARBON SUBSTRATE (22) Filed: Aug. 29, 2001 (76) Inventors: Patricia C. Brzostowicz, West Chester, PA (US); Qiong Cheng, Wilmington, Related U.S. Application Data DE (US); Deana Dicosimo, Rockland, DE (US); Mattheos Kofas, (60) Provisional application No. 60/229,907, filed on Sep. Wilmington, DE (US); Edward S. 1, 2000. Provisional application No. 60/229,858, filed Miller, Wilmington, DE (US); James on Sep. 1, 2000. M. Odom, Kennett Square, PA (US); Stephen K. Picataggio, Landenberg, PA Publication Classification (US); Pierre E. Rouviere, Wilmington, (51) Int. Cl...... C12P 23/00; C12N 1/21 DE (US) (52) U.S. Cl...... 435/67; 435/252.3 Correspondence Address: E DUPONT DE NEMOURS AND (57) ABSTRACT COMPANY LEGAL PATENT RECORDS CENTER A method for the production of carotenoid compounds is BARLEY MILL PLAZA 25/1128 disclosed. The method relies on the use of microorganisms 4417 LANCASTER PIKE which metabolize Single carbon Substrates for the production WILMINGTON, DE 19805 (US) of carotenoid compounds in high yields. Patent Application Publication Jan. 2, 2003 Sheet 1 of 7 US 2003/0003528A1

'cool -- o FIGURE 1 Pyruvate v dys Glyceraldehyde-3P H O OP 1-Deoxyxylulose-5P H discr' 9 H3 : Or 2-C-Methylerythritol-4-phosphate H2 OH CH yghP a 9 Hs GP

OH, OH oisorsO O of 4-diphosphocytidyl-methylerythriyol ychB, ygbB, lytB,gcpE, ipk OHOH isopentenyl-plus --> Dimethylallyl-PP O-PP ispá OPP ---y ispa Geranyl-PP s s 1s '' pyrophosphate p N cri co GGPP GGPP N crifB

Phytoere | cert

lycopene criY

al-Yerra8-Carotene 2 40 N cra o -D e Zeaxanthir CrO to O Astaxanthin

Zeaxanthin-3-o-diglucoside Patent Application Publication Jan. 2, 2003 Sheet 2 of 7 US 2003/0003528A1

Zenôl! Patent Application Publication Jan. 2, 2003 Sheet 3 of 7 US 2003/0003528A1

****dqGGGII go

þjó

Eko Patent Application Publication Jan. 2, 2003 Sheet 4 of 7 US 2003/0003528A1 :

---

T ca Y. 5 .9 c 3. NY al

\ -

9. sy - s M. Ry a Y as use 7s su :

E;

> S c Patent Application Publication US 2003/0003528A1

G?un61-I Patent Application Publication Jan. 2, 2003 Sheet 6 of 7 US 2003/0003528A1

eunfil9 Patent Application Publication US 2003/0003528A1

C co na Aug 6tu Jo w Aid fruo fin

(sXp

US 2003/0003528A1 Jan. 2, 2003

CAROTENOID PRODUCTION FROM ASINGLE No. 5,656,472; U.S. Pat. No. 5,5545,816; U.S. Pat. No. CARBON SUBSTRATE 5.530,189, U.S. Pat. No. 5,530, 188; U.S. Pat. No. 5,429, 0001) This application claims the benefit of U.S. Provi 939). Despite the similarity in operon structure, the DNA sional Application No. 60/229,907, filed Sep. 1, 2000 and sequences of E. uredovora and E. herbicola show no homol the benefit of U.S. Provisional Application No. 60/229,858 ogy by DNA-DNA hybridization (U.S. Pat. No. 5,429,939). filed Sep. 1, 2000. 0007 Although more than 600 different have been identified in nature, only a few are used industrially for FIELD OF THE INVENTION food colors, animal feeding, pharmaceuticals and cosmetics. Presently, most of the carotenoids used for industrial pur 0002 The invention relates to the field of molecular poses are produced by chemical Synthesis, however, these biology and microbiology. More Specifically, the invention compounds are very difficult to make chemically (Nelis and describes the production of carotenoid compounds from Leenheer, Appl. Bacteriol. 70:181-191 (1991)). Natural microorganisms which metabolize Single carbon Substrates carotenoids can either be obtained by extraction of plant as a Sole carbon Source. material or by microbial Synthesis. At the present time, only a few plants are widely used for commercial carotenoid BACKGROUND OF THE INVENTION production. However, the productivity of carotenoid Synthe sis in these plants is relatively low and the resulting caro 0.003 Carotenoids represent one of the most widely dis tenoids are very expensive. tributed and structurally diverse classes of natural pigments, producing pigment colors of light yellow to orange to deep 0008. A number of carotenoids have been produced from red. Eye-catching examples of carotenogenic tissues include microbial Sources. For example, Lycopene has been pro carrots, tomatoes, red peppers, and the petals of daffodils duced from genetically engineered E. coli and Candia utilis and marigolds. Carotenoids are Synthesized by all photo (Farmer W. R. and J. C. Liao. (2001) Biotechnol. Prog. 17: Synthetic organisms, as well as Some bacteria and fungi. 57-61; Wang C. et al., (2000) Biotechnol Prog. 16:922–926; These pigments have important functions in photosynthesis, Misawa, N. and H. Shimada. (1998). J. Biotechnol. 59:169 nutrition, and protection against photooxidative damage. For 181; Shimada, H. et al. 1998. Appl. Environm. Microbiol. example, animals do not have the ability to Synthesize 64:2676-2680). B-carotene has been produced from E. coli, carotenoids but must instead obtain these nutritionally Candia utilis and Pfafia rhodozyma (Albrecht, M. et al., important compounds through their dietary Sources. Struc (1999). Biotechnol. Lett. 21: 791-795; Miura, Y. et al., 1998. turally, carotenoids are 40-carbon (C) terpenoids derived Appl. Environm. Microbiol. 64:1226-1229; U.S. Pat. No. from the isoprene biosynthetic pathway and its five-carbon 5,691,190). Zeaxanthin has been produced from recombi universal isoprene building block, isopentenyl pyrophoS nant from E. coli and Candia utilis (Albrecht, M. et al., phate (IPP). This biosynthetic pathway can be divided into (1999). Biotechnol. Lett. 21: 791-795; Miura, Y. et al., 1998. two portions: the upper isoprene pathway, which leads to the Appl. Environm. Microbiol. 64:1226-1229). Astaxanthin has formation of IPP, and the lower carotenoid biosynthetic been produced from E. coli and Pfafia rhodozyma (U.S. Pat. pathway, which converts IPP into long Co and Co carote Nos. 5,466.599; 6,015,684; 5,182,208; 5,972,642). nogenic compounds. Both portions of this pathway are 0009 Additionally genes encoding various elements of shown in FIG. 1. the carotenoid biosynthetic pathway have been cloned and 0004 Various other crt genes are known, which enable expressed in various microbes. For example genes encoding the intramolecular conversion of long Co and Co com lycopene cyclase, geranylgeranyl pyrophosphate Synthase, pounds to produce numerous other carotenoid compounds. It and dehydrogenase isolated from Erwinia herbi is the degree of the carbon backbone's unsaturation, conju cola have been expressed recombinantly in E. coli (U.S. Pat. gation and isomerization which determines the Specific Nos. 5656472; 5545816; 5530189; 5530188). Similarly carotenoids unique absorption characteristics and colors. genes encoding the carotenoid products geranylgeranyl Several reviews discuss the genetics of carotenoid pigment pyrophosphate, phytoene, lycopene, B-caroteine, and ZeaX biosynthesis, such as those of Armstrong (J. Bact. 176: anthin-diglucoside, isolated from Erwinia uredov'Ora have 4795-4802 (1994); Annu. Rev. Microbiol. 51:629-659 been expressed in E. coli, Zymomonas mobilis, and Saccha (1997)). romyces cerevisiae (U.S. Pat. No. 5,429,939). Similarly, the 0005. In reference to the availability of carotenoid genes, Carotenoid biosynthetic genes crtE (1), crtB (3), crt1 (5), public domain databases Such as GenBank contain crtY (7), and crtz isolated from Flavobacterium have been Sequences isolated from numerous organisms. For example, recombinantly expressed (U.S. Pat. No. 6,124,113). there are currently 26 GenBank Accession numbers relating 0010 Although the above methods of propducing caro to various crtE genes isolated from 19 different organisms. tenoids are useful, these methods suffer from low yields and The leSS frequently encountered crtz gene boasts 6 GenBank reliance on expensive feedstocks. A method that produces Accession numbers with each gene isolated from a different higher yields of carotenoids from an inexpensive feedstock organism. A similarly wide Selection of carotenoid genes is is needed. available for each of the genes discussed above. 0011. There are a number of microorganisms that utilize 0006 The genetics of carotenoid pigment biosynthesis Single carbon Substrates as Sole energy Sources. These has been extremely well Studied in the Gram-negative, Substrates include, methane, methanol, formate, methylated pigmented bacteria of the genera Pantoea, formerly known amines and thiols, and various other reduced carbon com as Erwinia. In both E. herbicola EHO-10 (ATCC 39368) and pounds which lack any carbon-carbon bonds and are gen E. uredovora 20D3 (ATCC 19321), the crt genes are clus erally quite inexpensive. These organisms are referred to as tered in two genetic units, crt Z and crt EXYIB (U.S. Pat. methylotrophs and herein as “C1 metabolizers”. These US 2003/0003528A1 Jan. 2, 2003

organisms are characterized by the ability to use carbon abundance of methane that makes the biotransformation of Substrates lacking carbon to carbon bonds as a Sole Source of methane a potentially unique and valuable process. energy and biomass. A Subset of methylotrophs are the methanotrophs which have the unique ability to utilize 0014. Many methanotrophs contain an inherent iso methane as a Sole energy Source. Although a large number prenoid pathway which enables these organisms to Synthe of these organisms are known, few of these microbes have Size other non-endogenous isoprenoid compounds. Since been Successfully harnessed to industrial processes for the methanotrophs can use one carbon Substrate (methane or Synthesis of materials. Although Single carbon Substrates are methanol) as an energy Source, it is possible to produce cost effective energy Sources, difficulty in genetic manipu carotenoids at low cost. lation of these microorganisms as well as a dearth of 0015 Current knowledge in the field concerning methy information about their genetic machinery has limited their lotrophic organisms and carotenoids leads to the following use primarily to the Synthesis of native products. For conclusions. First, there is tremendous commercial incentive example the commercial applications of biotransformation arising from abundantly available C1 Sources, which could of methane have historically fallen broadly into three cat be used as a feedstock for C1 organisms and which should egories: 1) Production of single cell protein, (Sharpe D. H. provide both economic and quality advantages over other BioProtein Manufacture 1989. Ellis Horwood Series in more traditional carbohydrate raw materials. Secondly, there applied Science and industrial technology. New York: Hal is abundant knowledge available concerning organisms that stead Press.) (Villadsen, John, Recent Trends Chem. React. possess carotenogenic biosynthetic genes, the function of Eng., Proc. Int. Chem. React. Eng. Conf.), 2nd (1987), those genes, and the upper isoprene pathway which pro Volume 2,320-33. Editor(s): Kulkarni, B. D.; Mashelkar, R. duces carotenogenic precursor molecules. Finally, numerous A.; Sharma, M. M. Publisher: Wiley East., New Delhi, India; methylotrophic organisms exist in the art which are them Naguib, M., Proc. OAPEC Symp. Petroprotein, Pap. Selves pigmented, and thereby possess portions of the nec (1980), Meeting Date 1979, 253-77 Publisher: Organ. Arab essary carotenoid biosynthetic pathway. Pet. Exporting Countries, Kuwait, Kuwait.); 2) epoxidation 0016 Despite these available tools, the art does not of alkenes for production of chemicals (U.S. Pat. No. reveal any C1 metabolizers which have been genetically 4,348,476); and 3) biodegradation of chlorinated pollutants engineered to make Specific carotenoids of choice, for large (Tsien et al., Gas, Oil, Coal, Environ. Biotechnol. 2, Pap. Scale commercial value. It is hypothesized that the useful Int. IGT Symp. Gas, Oil, Coal, Environ. Biotechnol., 2nd neSS of these organisms for production of a larger range of (1990), 83-104. Editor(s): Akin, Cavit; Smith, Jared. Pub chemicals is constrained by limitations including, relatively lisher: Inst. Gas Technol, Chicago, Ill., WO 9633821; Slow growth rates of methanotrophs, limited ability to tol Merkley et al., Biorem. Recalcitrant Org., Pap. Int. In Situ erate methanol as an alternative Substrate to methane, dif On-Site Bioreclam. Symp. 3rd (1995), 165-74. Editor(s): ficulty in genetic engineering, poor understanding of the Hinchee, Robert E.; Anderson, Daniel B.; Hoeppel, Ronald roles of multiple carbon assimilation pathways present in E. Publisher: Battelle Press, Columbus, Ohio.: Meyer et al., methanotrophs, and potentially high costs due to the oxygen Microb. Releases (1993), 201), 11-22). Even here, the com demand of fully Saturated Substrates Such as methane. The mercial Success of the methane bio-transformation has been problem to be solved, therefore is to provide a cost effective limited to epoxidation of alkenes due to low product yields, method for the microbial production of carotenoid com toxicity of products and the large amount of cell mass pounds, using organisms which utilize C1 compounds as required to generate product associated with the process. their carbon and energy Source. 0012. The commercial utility of methylotrophic organ isms is reviewed in Lidstrom and Stirling (Annu. Rev. 0017 Applicants have solved the stated problem by engi Microbiol. 44:27-58 (1990)). Little commercial success has neering microorganisms which are able to use Single carbon been documented, despite numerous efforts involving the Substrates as Sole carbon Sources for the production of application of methylotrophic organisms and their enzymes carotenoid compounds. (Lidstrom and Stirling, Supra, Table 3). In most cases, it has been discovered that the organisms have little advantage SUMMARY OF THE INVENTION over other well-developed host systems. Methanol is fre 0018. The invention provides a method for the production quently cited as a feedstock which should provide both of a carotenoid compound comprising: economic and quality advantages over other more traditional carbohydrate raw materials, but thus far this expectation has 0019 (a) providing a transformed C1 metabolizing not been Significantly validated in published WorkS. host cell comprising: 0013. One of the most common classes of single carbon 0020 (i) Suitable levels of isopentenyl pyrophos metabolizers are the methanotrophs. Methanotrophic bacte phate, and ria are defined by their ability to use methane as a Sole Source 0021 (ii) at least one isolated nucleic acid mol of carbon and energy. Methane monooxygenase is the ecule encoding an enzyme in the carotenoid bio enzyme required for the primary Step in methane activation Synthetic pathway under the control of Suitable and the product of this reaction is methanol (Murrell et al., regulatory Sequences; Arch. Microbiol. (2000), 173(5-6), 325-332). This reaction occurs at ambient temperature and pressures whereas chemi 0022 (b) contacting the host cell of step (a) under cal transformation of methane to methanol requires tem Suitable growth conditions with an effective amount peratures of hundreds of degrees and high pressure (Grigo of a C1 carbon substrate whereby an carotenoid ryan, E. A., Kinet. Catal. (1999), 40(3), 350-363; WO compound is produced. 2000007718; U.S. Pat. No. 5,750.821). It is this ability to 0023 Preferred C1 carbon Substrates of the invention are transform methane under ambient conditions along with the Selected from the group consisting of methane, methanol, US 2003/0003528A1 Jan. 2, 2003 formaldehyde, formic acid, methylated amines, methylated 0034 FIG.2 provides microarray expression data for key thiols, and carbon dioxide. Preferred C1 metabolizers are carbon pathway genes, as expressed in Methylomonas 16a. methylotrophs and methanotrophs. Particularly preferred C1 metabolizers are those that comprise a functional Embden 0035 FIG. 3 shows plasmid pcrt1 and HPLC spectra Meyerhof carbon pathway, Said pathway comprising a gene Verifying Synthesis of B-caroteine in those Methylomonas encoding a pyrophosphate dependent phosphofructokinase containing plasmid pcrt1. enzyme. Optionally the preferred host may comprise at least 0036 FIG. 4 shows plasmid pcrt3 and HPLC spectra one gene encoding a fructose bisphosphate aldolase enzyme. Verifying Synthesis of Zeaxanthin and its mono- and di 0024 Suitable levels of isopentenyl pyrophosphate may glucosides in those Methylomonas containing plasmid pcrt3. be endogenous to the host, or may be provided by hetero 0037 FIG. 5 shows plasmid pcrt4 and HPLC spectra logusly introduced upper pathway isoprenoid genes Such as Verifying Synthesis of Zeaxanthin and its mono- and di D-1-deoxyxylulose-5-phosphate synthase (DXs), D-1-deox glucosides in those Methylomonas containing plasmid pcrt4. yxylulose-5-phosphate reductoisomerase (DXr), 2C-methyl d-erythritol cytidylyltransferase (IspD), 4-diphosphocyti 0038 FIG. 6 shows plasmid pcrt4.1 and HPLC spectra dyl-2-C-methylerythritol kinase (IspE), 2C-methyl-d- Verifying Synthesis of canthaxanthin and astaxanthin in erythritol 2,4-cyclodiphosphate synthase (IspF), CTP those Methylomonas containing plasmid pcrt4.1. synthase (PyrC) and IytB. 0039 FIG. 7 shows plasmid 0.025 In an alternate embodiment the invention provides pTJS75::dxs:dxr:lacZ:Tn5Kn and production of the native a method for the over-production of carotenoid production carotenoid in those Methylomonas containing plasmid in a transformed C1 metabolizing host comprising: pTJS75::dxs:dxr:lacZ:Tn5Kn. Additionally, the construct 0026 (a) providing a transformed C1 metabolizing pcrt4.1 is shown. host cell comprising: 0040. The invention can be more fully understood from 0027 (i) Suitable levels of isopentenyl pyrophos the following detailed description and the accompanying phate, and Sequence descriptions which form a part of this application. 0028 (ii) at least one isolated nucleic acid mol 0041. The following sequences conform with 37 C.F.R. ecule encoding an enzyme in the carotenoid bio 1.821-1.825 (“Requirements for Patent Applications Con Synthetic pathway under the control of Suitable taining Nucleotide Sequences and/or Amino Acid Sequence regulatory Sequences, and Disclosures-the Sequence Rules”) and consistent with World Intellectual Property Organization (WIPO) Standard 0029 (iii) either: ST25 (1998) and the sequence listing requirements of the 0030) 1) multiple copies of at least one gene EPO and PCT (Rules 5.2 and 49.5(a-bis), and Section 208 encoding an enzyme Selected from the group and Annex C of the Administrative Instructions). The sym consisting of D-1-deoxyxylulose-5-phosphate bols and format used for nucleotide and amino acid Synthase (DXS), D-1-deoxyxylulose-5-phos sequence data comply with the rules set forth in 37 C.F.R.S phate reductoisomerase (DXr), 2C-methyl-d- 1822. erythritol cytidylyltransferase (IspD), 4-diphos 0042 SEQID NOS:1-38 are full length genes or proteins phocytidyl-2-C-methylerythritol kinase (IspE), as identified in Table 1. 2C-methyl-d-erythritol 2,4-cyclodiphosphate synthase (IspF), CTP synthase (PyrC) and IytB; TABLE 1. O Summary of Gene and Protein SEQ ID Numbers 0031) 2) at least one gene encoding an enzyme Selected from the group consisting of D-1- SEO ID SEO ID deoxyxylulose-5-phosphate Synthase (DXS), Description Nucleic acid Peptide D-1-deoxyxylulose-5-phosphate reductoi Phosphofructokinase pyrophosphate 1. 2 Somerase (DXr), 2C-methyl-d-erythritol cytidy dependent KHG/KDPG Aldolase 3 4 lyltransferase (IspD), 4-diphosphocytidyl-2-C- dxs 5 6 methylerythritol kinase (IspE), 2C-methyl-d- dXr 7 8 erythritol 2,4-cyclodiphosphate Synthase ispD (ygbP 9 1O (IspF), CTP synthase (PyrC) and IytB operable ispE (ychB) 11 12 ispF (ygbB) 13 14 linked to a strong promoter. pyrCi 15 16 lytB 17 18 0032 (b) contacting the host cell of step (a) under ispA 19 2O Suitable growth conditions with an effective amount CrtN1 21 22 of a C1 carbon Substrate whereby a carotenoid CrtN2 23 24 compound is over-produced. crtE 25 26 crtX 27 28 crtY 29 3O BRIEF DESCRIPTION OF THE DRAWINGS, crt 31 32 SEQUENCE DESCRIPTIONS AND crtB 33 34 BIOLOGICAL DEPOSITS crtz, 35 36 crtO 37 38 0.033 FIG. 1 illustrates the upper isoprene pathway and lower carotenoid biosynthetic pathway. US 2003/0003528A1 Jan. 2, 2003

0043 SEQ ID Nos:39-40 are amplification primers for microorganism for transformation and the production of the HMPS promoter SEQ ID Nos:41-42 are amplification various carotenoid compounds in high yield. primers for the crtO gene from RhodococcuS. 0055. In this disclosure, a number of terms and abbre 0044 SEQ ID NOs:43 and 44 are the primer sequences viations are used. The following definitions are provided. used to amplify the crt cluster of Pantoea Stewartii. 0056. The term “Embden-Meyerhof pathway” refers to 0045 SEQ ID NOs:45-47 are the primer sequences used the Series of biochemical reactions for conversion of heXOSes to amplify the 16S rRNA of Rhodococcus erythropolis AN Such as glucose and fructose to important cellular 3-carbon 12. intermediates Such as glyceraldehyde-3-phosphate, dihy droxyacetone phosphate, phosphophenolpyruvate and pyru 0046) SEQ ID NOs:48 and 49 are the primer sequences Vate. These reactions typically proceed with net yield of used to amplify the crtO gene. biochemically useful energy in the form of ATP. The key 0047 SEQID NOs: 50-54 are promoter sequences for the enzymes unique to the Embden-Meyerhof pathway are the HMPS gene and primers used to amplify that promoter. phosphofructokinase and fructose-1,6 bisphosphate aldo 0048 SEQ ID NOS:55 and 56 are the primer sequences lase. used to amplify the dxS gene. 0057 The term “Entner-Douderoff pathway” refers to a 0049 SEQ ID NOS:57 and 58 are the primer sequences Series of biochemical reactions for conversion of heXOSes used to amplify the dxr gene. Such as glucose or fructose to the important 3-carbon cellular intermediates pyruvate and glyceraldehyde-3-phos 0050 SEQ ID NOS:59 and 60 are the primer sequences phate without any net production of biochemically useful used to amplify the lytB gene. energy. The key enzymes unique to the Entner-Douderoff 0051) Applicants made the following biological deposits pathway are the 6-phosphogluconate dehydratase and a under the terms of the Budapest Treaty on the International ketodeoxyphosphogluconate aldolase. Recognition of the Deposit of Micro-organisms for the 0058. The term “diagnostic” as it relates to the presence Purposes of Patent Procedure: of a gene in a pathway refers to evidence of the presence of that pathway, where a gene having that activity is identified. Within the context of the present invention the presence of a gene encoding a pyrophosphate dependant phosphofruc Depositor Identification International Depository Reference Designation Date of Deposit tokinase is "diagnostic' for the presence of the Embden Meyerhof carbon pathway and the presence of gene encod Methylomonas 16a ATCC PTA 2402 August 22, 2000 ing a ketodeoxyphosphogluconate aldolase is "diagnostic' for the presence of the Entner-Douderoff carbon pathway. DETAILED DESCRIPTION OF THE 0059) The term "yield” is defined herein as the amount of INVENTION cell mass produced per gram of carbon Substrate metabo lized. 0.052 The present method is useful for the creation of recombinant organisms that have the ability to produce 0060. The term “carbon conversion efficiency” is a mea various carotenoid compounds. Nucleic acid fragments Sure of how much carbon is assimilated into cell mass and encoding a variety of enzymes implicated in the carotenoid is calculated assuming a biomass composition of biosynthetic pathway have been cloned into microorganisms CH2Oos No.2s. which use Single carbon Substrates as a Sole carbon Source 0061 The term “C, carbon substrate” refers to any car for the production of carotenoid compounds. bon-containing molecule that lacks a carbon-carbon bond. 0053. There is a general practical utility for microbial Examples are methane, methanol, formaldehyde, formic production of carotenoid compounds as these compounds acid, formate, methylated amines (e.g., mono-, di-, and are very difficult to make chemically (Nelis and Leenheer, tri-methyl amine), methylated thiols, and carbon dioxide. Appl. Bacteriol. 70:181-191 (1991)). Most carotenoids have 0062) The term “C1 metabolizer” refers to a microorgan Strong color and can be viewed as natural pigments or ism that has the ability to use an Single carbon Substrate as colorants. Furthermore, many carotenoids have potent anti a Sole Source of energy and biomass. C1 metabolizers will oxidant properties and thus inclusion of these compounds in typically be methylotrophs and/or methanotrophs. the diet is thought to be healthful. Well-known examples are B-caroteine and astaxanthin. Additionally, carotenoids are 0063. The term “methylotroph” means an organism required elements of aquaculture. Salmon and shrimp aquac capable of oxidizing organic compounds which do not ulture are particularly useful applications for this invention contain carbon-carbon bonds. Where the methylotroph is as carotenoid pigmentation is critically important for the able to oxidize CH4, the methylotroph is also a methan value of these organisms. (F. Shahidi, J. A. Brown, Caro otroph. tenoid pigments in Seafood and aquaculture: Critical reviews in food Science 38(1): 1-67 (1998)). Finally, carotenoids 0064. The term “methanotroph” means a prokaryote have utility as intermediates in the Synthesis of , capable of utilizing methane as a Substrate. Complete oxi flavors and fragrances and compounds with potential elec dation of methane to carbon dioxide occurs by aerobic tro-optic applications. degradation pathways. Typical examples of methanotrophs useful in the present invention include but are not limited to 0.054 The disclosure below provides a detailed descrip the genera Methylomonas, Methylobacter, Methylococcus, tion of the Selection of the appropriate C1 metabolizing and Methylosinus. US 2003/0003528A1 Jan. 2, 2003

0065. The term “high growth methanotrophic bacterial 0075) The term “IspA" refers to Geranyltransferase or Strain” refers to a bacterium capable of growth with methane farnesyl diphosphate Synthase enzyme as one of prenyl or methanol as Sole carbon and energy Source which possess transferase family encoded by ispA gene. a functional Embden-Meyerhof carbon flux pathway result ing in a yield of cell mass per gram of C1 Substrate 0076) The term “LytB” refers to protein having a role in metabolized. The Specific “high growth methanotrophic the formation of dimethylallyl-pyrophosphate in the iso bacterial strain” described herein is referred to as “Methy prenoid pathway and which is encoded by IytB gene. lomonas 16a' or “16a', which terms are used interchange 0077. The term “gcpE” refers to a protein having a role ably. in the formation of 2-C-methyl-D-erythritol 4-phosphate in 0066. The term “Methylomonas 16a” and “Methylomo the isoprenoid pathway (Altincicek et al., J. Bacteriol. nas 16a sp.” Are used interchangeably and refer to the (2001), 183(8), 2411-2416; Campos et al., FEBS Lett. Methylomonas Strain used in the present invention. (2001), 488(3), 170-173) 0067. The term "isoprenoid compound” refers to any 0078. The term “lower carotenoid biosynthetic pathway” compound which is derived via the pathway beginning with refers to any of the following genes and gene products isopentenyl pyrophosphate (IPP) and formed by the head asSociated with the isoprenoid biosynthetic pathway, which to-tail condensation of isoprene units which may be of 5, 10, are involved in the immediate Synthesis of phytoene (whose 15, 20, 30 or 40 carbons in length. There term "isoprenoid Synthesis represents the first Step unique to biosynthesis of pigment” refers to a class of isoprenoid compounds which carotenoids) or Subsequent reactions. These genes and gene typically have Strong light absorbing properties. products include the “ispA gene (encoding geranyltrans ferase or farnesyl diphosphate synthase), the “ctrN” and 0068 The term “upper isoprene pathway” refers to any of “ctrN 1” genes (encoding diapophytoene dehydrogenases), the following genes and gene products associated with the the “crtE’ gene (encoding geranylgeranyl pyrophosphate isoprenoid biosynthetic pathway including the dxS gene Synthase), the “crtX' gene (encoding zeaxanthin glucosyl (encoding 1-deoxyxylulose-5-phosphate Synthase), the dxr transferase), the “crtY' gene (encoding lycopene cyclase), gene (encoding 1-deoxyxylulose-5-phosphate reductoi the “crtI” gene (encoding phytoene desaturase), the “crtB” Somerase), the “ispID gene (encoding the 2C-methyl-D- gene (encoding phytoene Synthase), the “crtz’ gene (encod erythritol cytidyltransferase enzyme; also known as ygbP), ing B-caroteine hydroxylase), and the “crtO' gene (encoding the “ispE” gene (encoding the 4-diphosphocytidyl-2-C-me a f-caroteine ketolase). Additionally, the term "carotenoid thylerythritol kinase; also known as yehB), the “ispF” gene biosynthetic enzyme’ is an inclusive term referring to any (encoding a 2C-methyl-d-erythritol 2,4-cyclodiphosphate and all of the enzymes in the present pathway including Synthase; also known as ygbB), the "pyrG” gene (encoding CrtE, CrtX, CrtY, CrtI, CrtB, CrtZ, and CrtO. a CTP synthase); the “lytB” gene involved in the formation of dimethylallyl diphosphate; and the gcpE gene involved in 007.9 The term “IspA" refers to the protein encoded by the synthesis of 2-C-methyl-D-erythritol 4-phosphate in the the ispA gene, and whose activity catalyzes a Sequence of 3 isoprenoid pathway. prenyltransferase reactions in which (GPP), (FPP), and geranylgeranyl 0069. The term “DXs” refers to the 1-deoxyxylulose-5- pyrophosphate (GGPP) are formed. phosphate Synthase enzyme encoded by the dxS gene. 0080. The term “CrtN1” or “CrtN, copy1” refers to copy 0070 The term “DXr” refers to the 1-deoxyxylulose-5- 1 of the diapophytoene dehydrogenase enzyme encoded by phosphate reductoisomerase enzyme encoded by the dxr crtN1 gene. gene. 0081. The term “CrtN2” or “CrtN copy2" refers to copy 0071. The term “YgbP” or “IspD” refers to the 2C-me 2 of the diapophytoene dehydrogenase enzyme(Crt) thyl-D-erythritol cytidyltransferase enzyme encoded by the encoded by crtN2 gene. ygbPorispD gene. The names of the gene, ygbPorispD, are used interchangeably in this application. The names of gene 0082 The term “CrtE” refers to geranylgeranyl pyro product, YgbP or IspD are used interchangeably in this phosphate Synthase enzyme encoded by crtE gene which application. converts trans-trans-farnesyl diphosphate and isopentenyl diphosphate into pyrophosphate and geranylgeranyl diphos 0072 The term “YchB” or “IspE” refers to the 4-diphos phate. phocytidyl-2-C-methylerythritol kinase enzyme encoded by the yeh BorispE gene. The names of the gene, yeh BorispE, 0083) The term “CrtX” refers to the zeaxanthin glucosyl are used interchangeably in this application. The names of transferase enzyme encoded by the crtX gene, and which gene product, YchB or IspE are used interchangeably in this glycosolateS Zeaxanthin to produce zeaxanthin-?3-digluco application. Side. 0073. The term “YgbB” or “IspF” refers to the 2C-me 0084. The term “CrtY” refers to the lycopene cyclase thyl-d-erythritol 2,4-cyclodiphosphate Synthase enzyme enzyme encoded by the crtY gene and which catalyzes encoded by the ygbB or ispE gene. The names of the gene, conversion of lycopene to B-caroteine. ygbB or ispF, are used interchangeably in this application. The names of gene product, YgbB or IspF are used inter 0085. The term “CrtI” refers to the phytoene desaturase changeably in this application. enzyme encoded by the crt1 gene and which converts phy toene into lycopene via the intermediaries of phytofluene, 0074 The term “PyrG” refers to a CTP synthase enzyme Zeta-caroteine, and neurosporene by the introduction of 4 encoded by the pyrC gene. double bonds. US 2003/0003528A1 Jan. 2, 2003

0086) The term “CrtB” refers to the phytoene synthase 0094. “Gene” refers to a nucleic acid fragment that is enzyme encoded by the crtB gene which catalyses the capable of being expressed as a specific protein, including reaction from prephytoene diphosphate to phytoene. regulatory Sequences preceding (5' non-coding Sequences) 0087. The term “CrtZ” refers to the B-carotene hydroxy and following (3' non-coding sequences) the coding lase enzyme encoded by critz gene which catalyses the Sequence. “Native gene' refers to a gene as found in nature hydroxylation reaction from B-caroteine to Zeaxanthin. with its own regulatory Sequences. "Chimeric gene' refers to any gene that is not a native gene, comprising regulatory and 0088. The term “CrtO” refers to the B-carotene ketolase coding Sequences that are not found together in nature. enzyme encoded by crtO gene which catalyses conversion of Accordingly, a chimeric gene may comprise regulatory f-caroteine into canthaxanthin (two ketone groups) via Sequences and coding Sequences that are derived from echinenone (one ketone group) as the intermediate. different Sources, or regulatory Sequences and coding 0089. The term “Carotenoid compound” is defined as a Sequences derived from the same Source, but arranged in a class of hydrocarbons (carotenes) and their oxygenated manner different than that found in nature. "Endogenous derivatives (Xanthophylls) consisting of eight isoprenoid gene' refers to a native gene in its natural location in the units joined in Such a manner that the arrangement of genome of an organism. A “foreign' gene refers to a gene isoprenoid units is reversed at the center of the molecule So not normally found in the host organism, but that is intro that the two central methyl groups are in a 1,6-positional duced into the host organism by gene transfer. Foreign genes relationship and the remaining nonterminal methyl groups can comprise native genes inserted into a non-native organ are in a 1,5-positional relationship. All carotenoids may be ism, or chimeric genes. A “transgene' is a gene that has been formally derived from the acyclic C40H56 structure (For introduced into the genome by a transformation procedure. mula I below), having a long central chain of conjugated double bonds, by (i) hydrogenation. (ii) dehydrogenation, 0095 “Coding sequence” refers to a DNA sequence that (iii) cyclization, or (iv) oxidation. or any combination of codes for a specific amino acid Sequence. "Suitable regula these processes. tory Sequences” refer to nucleotide Sequences located

Formula I 8 : ' ' ' ' ' ch c.1 n1 n1 n1 n1Sa n1 n1 n1 3 H H H H2 CH CH CH CH

0090 This class also includes certain compounds that upstream (5' non-coding sequences), within, or downstream arise from certain rearrangements of the carbon skeleton (I), (3' non-coding sequences) of a coding sequence, and which or by the (formal) removal of part of this structure. influence the transcription, RNA processing or Stability, or 0.091 For convenience carotenoid formulae are often translation of the associated coding Sequence. Regulatory written in a shorthand form as Sequences may include promoters, translation leader

(IA) --> -N-

0092 where the broken lines indicate formal division into Sequences, introns, polyadenylation recognition Sequences, isoprenoid units. RNA processing site, effector binding site and Stem-loop Structure. 0093. As used herein, an “isolated nucleic acid fragment” is a polymer of RNA or DNA that is single- or double 0096 “Promoter” refers to a DNA sequence capable of controlling the expression of a coding Sequence or func Stranded, optionally containing Synthetic, non-natural or tional RNA. In general, a coding Sequence is located 3' to a altered nucleotide bases. An isolated nucleic acid fragment promoter Sequence. Promoters may be derived in their in the form of a polymer of DNA may be comprised of one entirety from a native gene, or be composed of different or more segments of cDNA, genomic DNA or synthetic elements derived from different promoters found in nature, DNA or even comprise Synthetic DNA segments. It is understood US 2003/0003528A1 Jan. 2, 2003

by those skilled in the art that different promoters may direct (1994); Sequence Analysis in Molecular Biology (von Hei the expression of a gene in different tissueS or cell types, or nje, G., ed.) Academic Press (1987); and Sequence Analysis at different Stages of development, or in response to different Primer (Gribskov, M. and Devereux, J., eds.) Stockton environmental or physiological conditions. Promoters which Press, N.Y. (1991). Preferred methods to determine identity cause a gene to be expressed in most cell types at most times are designed to give the best match between the Sequences are commonly referred to as “constitutive promoters'. It is tested. Methods to determine identity and similarity are further recognized that Since in most cases the exact bound codified in publicly available computer programs. Sequence aries of regulatory Sequences have not been completely alignments and percent identity calculations may be per defined, DNA fragments of different lengths may have formed using the Megalign program of the LASERGENE identical promoter activity. bioinformatics computing Suite (DNASTAR Inc., Madison, 0097. The term “operably linked” refers to the associa Wis.). Multiple alignment of the Sequences was performed tion of nucleic acid Sequences on a Single nucleic acid using the Clustal method of alignment (Higgins and Sharp fragment so that the function of one is affected by the other. (1989) CABIOS. 5:151-153) with the default parameters For example, a promoter is operably linked with a coding (GAP PENALTY-10, GAP LENGTH PENALTY-10). Sequence when it is capable of affecting the expression of Default parameters for pairwise alignments using the Clustal that coding Sequence (i.e., that the coding sequence is under method were KTUPLE 1, GAPPENALTY-3, WINDOW-5 the transcriptional control of the promoter). Coding and DIAGONALS SAVED=5. Sequences can be operably linked to regulatory Sequences in 0102) Suitable nucleic acid fragments (isolated poly Sense or antisense orientation. nucleotides of the present invention) encode polypeptides 0098. The term “expression”, as used herein, refers to the that are at least about 70% identical, preferably at least about transcription and Stable accumulation of Sense (mRNA) or 80% identical to the amino acid Sequences reported herein. antisense RNA derived from the nucleic acid fragment of the Preferred nucleic acid fragments encode amino acid invention. Expression may also refer to translation of mRNA Sequences that are about 85% identical to the amino acid into a polypeptide. Sequences reported herein. More preferred nucleic acid fragments encode amino acid Sequences that are at least 0099. “Transformation” refers to the transfer of a nucleic about 90% identical to the amino acid Sequences reported acid fragment into the genome of a host organism, resulting herein. Most preferred are nucleic acid fragments that in genetically stable inheritance. Host organisms containing encode amino acid Sequences that are at least about 95% the transformed nucleic acid fragments are referred to as identical to the amino acid Sequences reported herein. Suit “transgenic' or “recombinant’ or “transformed’ organisms. able nucleic acid fragments not only have the above homolo 0100. The terms “plasmid”, “vector” and “cassette” refer gies but typically encode a polypeptide having at least 50 to an extra chromosomal element often carrying genes amino acids, preferably at least 100 amino acids, more which are not part of the central metabolism of the cell, and preferably at least 150 amino acids, still more preferably at usually in the form of circular double-stranded DNA frag least 200 amino acids, and most preferably at least 250 ments. Such elements may be autonomously replicating amino acids. Sequences, genome integrating Sequences, phage or nucle otide Sequences, linear or circular, of a single- or double 0103) A nucleic acid molecule is “hybridizable” to stranded DNA or RNA, derived from any source, in which another nucleic acid molecule, Such as a cDNA, genomic a number of nucleotide Sequences have been joined or DNA, or RNA, when a single stranded form of the nucleic recombined into a unique construction which is capable of acid molecule can anneal to the other nucleic acid molecule introducing a promoter fragment and DNA sequence for a under the appropriate conditions of temperature and Solution Selected gene product along with appropriate 3' untranslated ionic strength. Hybridization and washing conditions are Sequence into a cell. “Transformation cassette” refers to a well known and exemplified in Sambrook, J., Fritsch, E. F. Specific vector containing a foreign gene and having ele and Maniatis, T. Molecular Cloning. A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold ments in addition to the foreign gene that facilitates trans Spring Harbor (1989), particularly Chapter 11 and Table formation of a particular host cell. “Expression cassette' 11.1 therein (entirely incorporated herein by reference). The refers to a specific vector containing a foreign gene and conditions of temperature and ionic Strength determine the having elements in addition to the foreign gene that allow for “Stringency of the hybridization. Stringency conditions can enhanced expression of that gene in a foreign host. be adjusted to Screen for moderately similar fragments, Such 0101 The term “percent identity”, as known in the art, is as homologous Sequences from distantly related organisms, a relationship between two or more polypeptide Sequences to highly similar fragments, Such as genes that duplicate or two or more polynucleotide Sequences, as determined by functional enzymes from closely related organisms. Post comparing the Sequences. In the art, “identity also means hybridization washes determine Stringency conditions. One the degree of Sequence relatedness between polypeptide or Set of preferred conditions uses a Series of washes Starting polynucleotide Sequences, as the case may be, as determined with 6x SSC, 0.5% SDS at room temperature for 15 min, by the match between Strings of Such Sequences. “Identity” then repeated with 2x SSC, 0.5% SDS at 45° C. for 30 min, and “similarity” can be readily calculated by known meth and then repeated twice with 0.2x SSC, 0.5% SDS at 50° C. ods, including but not limited to those described in: Com for 30 min. A more preferred Set of Stringent conditions uses putational Molecular Biology (Lesk, A. M., ed.) Oxford higher temperatures in which the washes are identical to University Press, NY (1988); Biocomputing: Informatics those above except for the temperature of the final two 30 and Genome Projects (Smith, D. W., ed.) Academic Press, min washes in 0.2x SSC, 0.5% SDS was increased to 60° C. NY (1993); Computer Analysis of Sequence Data Part I Another preferred Set of highly Stringent conditions uses two (Griffin, A. M., and Griffin, H. G., eds.) Humana Press, N.J. final washes in 0.1x SSC, 0.1% SDS at 65° C. An additional US 2003/0003528A1 Jan. 2, 2003 preferred set of stringent conditions include 0.1x SSC, 0.1% 0107 Identification and Isolation of C1 Metabolizing SDS.-65 C. and washed with 2x SSC, 0.1% SDS followed Microorganisms by 0.1x SSC, 0.1% SDS). 0108. The present invention provides for the expression 0.104) Hybridization requires that the two nucleic acids of genes involved in the biosynthesis of carotenoid com contain complementary Sequences, although depending on pounds in microorganisms which are able to use Single carbon Substrates as a Sole energy Source. Such microorgan the Stringency of the hybridization, mismatches between isms are referred to herein as C1 metabolizers. The host bases are possible. The appropriate Stringency for hybridiz microorganism may be any C1 metabolizer which has the ing nucleic acids depends on the length of the nucleic acids ability to synthesize isopentenyl pyrophosphate (IPP) the and the degree of complementation, variables well known in precursor for many of the carotenoids. the art. The greater the degree of Similarity or homology between two nucleotide Sequences, the greater the value of 0109 Many C1 metabolizing microorganisms are known Tm for hybrids of nucleic acids having those Sequences. The in the art which are able to use a variety of Single carbon relative Stability (corresponding to higher Tm) of nucleic Substrates. Single carbon Substrates useful in the present acid hybridizations decreases in the following order: invention include but are not limited to methane, methanol, RNA:RNA, DNA:RNA, DNA:DNA. For hybrids of greater formaldehyde, formic acid, methylated amines (e.g. mono-, than 100 nucleotides in length, equations for calculating Tm di- and tri-methyl amine), methylated thiols, and carbon have been derived (see Sambrook et al., Supra, 9.50-9.51). dioxide. For hybridizations with Shorter nucleic acids, i.e., oligo 0110 All C1 metabolizing microorganisms are generally nucleotides, the position of mismatches becomes more classified as methylotrophs. Methylotrophs may be defined important, and the length of the oligonucleotide determines as any organism capable of oxidizing organic compounds its specificity (see Sambrook et al., Supra, 11.7-11.8). In one which do not contain carbon-carbon bonds. A Subset of embodiment the length for a hybridizable nucleic acid is at methylotrophs are the methanotrophs, which have the dis least about 10 nucleotides. Preferable a minimum length for tinctive ability to oxidize methane. Facultative methylotro a hybridizable nucleic acid is at least about 15 nucleotides, phs have the ability to oxidize organic compounds which do more preferably at least about 20 nucleotides, and most not contain carbon-carbon bonds, but may also use other preferably the length is at least 30 nucleotides. Furthermore, carbon Substrates Such as Sugars and complex carbohydrates the Skilled artisan will recognize that the temperature and for energy and biomass. Obligate methylotrophs are those wash Solution Salt concentration may be adjusted as neces organisms which are limited to the use of organic com Sary according to factorS Such as length of the probe. pounds which do not contain carbon-carbon bonds for the 0105 The term “sequence analysis software” refers to generation of energy and obligate methanotrophs are those any computer algorithm or Software program that is useful obligate methylotrophs that have the ability to oxidize for the analysis of nucleotide or amino acid Sequences. methane. “Sequence analysis Software' may be commercially avail 0111 Facultative methylotrophic bacteria are found in able or independently developed. Typical Sequence analysis many environments, but are isolated most commonly from Software will include but is not limited to the GCGSuite of Soil, landfill and waste treatment Sites. Many facultative programs (Wisconsin Package Version 9.0, Genetics Com methylotrophs are members of the B, and Y Subgroups of the puter Group (GCG), Madison, Wis.), BLASTP, BLASTN, Proteobacteria (Hanson et al., Microb. Growth C1 Com BLASTX (Altschulet al., J. Mol. Biol. 215:403-410 (1990), pounds, Int. Symp.), 7th (1993), 285-302. Editor(s): Mur and DNASTAR (DNASTAR, Inc. 1228 S. Park St. Madison, rell, J. Collin; Kelly, Don P. Publisher: Intercept, Andover, Wis. 53715 USA), and the FASTA program incorporating UK; Madigan et al., Brock Biology of Microorganisms, 8th the Smith-Waterman algorithm (W. R. Pearson, Comput. edition, Prentice Hall, UpperSaddle River, N.J. (1997)). Methods Genome Res., Proc. Int. Symp. (1994), Meeting Facultative methylotrophic bacteria Suitable in the present Date 1992, 111-20. Editor(s): Suhai, Sandor. Publisher: invention include but are not limited to, Methylophilus, Plenum, New York, N.Y.). Within the context of this appli Methylobacillus, Methylobacterium, Hyphomicrobium, cation it will be understood that where Sequence analysis Xanthobacter, Bacillus, Paracoccus, Nocardia, Arthrobacter, Software is used for analysis, that the results of the analysis Rhodopseudomonas, and Pseudomonas. will be based on the “default values” of the program 0112 The ability to utilize single carbon substrates is not referenced, unless otherwise Specified. AS used herein limited to bacteria but extends also to yeasts and fungi. A “default values' will mean any set of values or parameters number of yeast genera are able to use Single carbon which originally load with the Software when first initial Substrates in addition to more complex materials as energy ized. Sources. Specific methylotrophic yeasts useful in the present 0106 Standard recombinant DNA and molecular cloning invention include but are not limited to Candida, Hansenula, techniques used here are well known in the art and are Pichia, Torulopsis, and Rhodotorula. described by Sambrook, J., Fritsch, E. F. and Maniatis, T., 0113 Those methylotrophs having the additional ability Molecular Cloning. A Laboratory Manual, Second Edition, to utilize methane are referred to as methanotrophs. Of Cold Spring Harbor Laboratory Press, Cold Spring Harbor, particular interest in the present invention are those obligate N.Y. (1989) (hereinafter “Maniatis”); and by Silhavy, T. J., methanotrophs which are methane utilizers but which are Bennan, M. L. and Enquist, L. W., Experiments with Gene obliged to use organic compounds lacking carbon-carbon Fusions, Cold Spring Harbor Laboratory Cold Press Spring bonds. Exemplary of these organisms are included in, but Harbor, N.Y. (1984); and by Ausubel, F. M. et al., Current not limited to, the genera Methylomonas, Methylobacter, Protocols in Molecular Biology, published by Greene Pub Mehtylococcus, Methylosinus, Methylocyctis, Methylomi lishing Assoc. and Wiley-Interscience (1987). crobium, and Methanomonas. US 2003/0003528A1 Jan. 2, 2003

0114. Of particular interest in the present invention are are at least 80% identical to the nucleic acid Sequences of high growth obligate methanotrophs having an energetically reported herein. More preferred pyrophosphate dependent favorable carbon flux pathway. For example, Applicants phosphofructokinase nucleic acid fragments are at least 90% have discovered a specific Strain of methanotroph having identical to the Sequences herein. Most preferred are pyro Several pathway features which make it particularly useful phosphate dependent phosphofructokinase nucleic acid frag for carbon flux manipulation. This type of Strain has served ments that are at least 95% identical to the nucleic acid as the host in the present application and is known as fragments reported herein. Methylomonas 16a (ATCC PTA 2402). 0115 The present strain contains several anomalies in the 0118. A further distinguishing characteristic of the carbon utilization pathway. For example, based on genome present Strain is revealed when examining the “cleavage' Sequence data, the Strain is shown to contain genes for two step which occurs in the Ribulose Monophosphate Pathway, pathways of hexose metabolism. The Entner-Douderoff or RuMP cycle. This cyclic set of reactions converts meth Pathway, which utilizes the keto-deoxy phosphogluconate ane to biomolecules in methanotrophic bacteria. The path aldolase enzyme, is present in the Strain. It is generally well way is comprised of three phases, each phase being a Series accepted that this is the operative pathway in obligate of enzymatic steps (FIG. 2). The first step is “fixation” or methanotrophs. Also present however is the Embden-Mey incorporation of C-1 (formaldehyde) into a pentose to form erhof Pathway, which utilizes the fructose bisphosphate a hexose or six-carbon Sugar. This occurs via a condensation aldolase enzyme. It is well known that this pathway is either reaction between a 5-carbon Sugar (pentose) and formalde not present or not operative in obligate methanotrophs. hyde and is catalyzed by the heXulose monophosphate Energetically, the latter pathway is most favorable and Synthase enzyme. The Second phase is termed “cleavage' allows greater yield of biologically useful energy, which and results in Splitting of that hexose into two 3-carbon ultimately results in greater yield production of cell mass molecules. One of those three-carbon molecules is recycled and other cell mass-dependent products in Methylomonas back through the RuMP pathway, while the other 3-carbon 16a. The activity of this pathway in the present 16a strain has fragment is utilized for cell growth. In methanotrophs and been confirmed through microarray data and biochemical methylotrophs, the RuMP pathway may occur as one of evidence measuring the reduction of ATP. Although the 16a three variants. However, only two of these variants are strain has been shown to possess both the Embden-Meyer commonly found, identified as the FBP/TA (fructose bis hof and the Entner-Douderoff pathway enzymes, the data phosphotase/transaldolase) pathway or the KDPG/TA (keto Suggests that the Embden-Meyerhof pathway enzymes are deoxy phosphogluconate/transaldolase) pathway (Dijkhui more strongly expressed than the Entner-Douderoff pathway Zen L., G. E. Devries. The Physiology and biochemistry of enzymes. This result is Surprising and counter to existing aerobic methanol-utilizing gram negative and gram positive beliefs on the glycolytic metabolism of methanotrophic bacteria. In: Methane and Methanol Utilizers (1992), eds. bacteria. Applicants have discovered other methanotrophic Colin Murrell and Howard Dalton; Plenum Press: N.Y.). bacteria having this characteristic, including for example, 0119) The present strain is unique in the way it handles Methylomonas clara and Methylosinus Sporium. It is likely the “cleavage' steps as genes were found that carry out this that this activity has remained undiscovered in methanotro conversion via fructose bisphosphate as a key intermediate. phs due to the lack of activity of the enzyme with ATP, the The genes for fructose bisphosphate aldolase and transaldo typical phosphoryl donor for the enzyme in most bacterial lase were found clustered together on one piece of DNA. Systems. Secondly, the genes for the other variant involving the keto 0116. A particularly novel and useful feature of the deoxy phosphogluconate intermediate were also found clus Embden-Meyerhof pathway in strain 16a is that the key tered together. Available literature teaches that these organ phosphofructokinase Step is pyrophosphate dependent isms (methylotrophs and methanotrophs) rely solely on the instead of ATP dependent. This feature adds to the energy KDPG pathway and that the FBP-dependent fixation path yield of the pathway by using pyrophosphate instead of ATP. way is utilized by facultative methylotrophs (Dijkhuizen et Because of its significance in providing an energetic advan al., Supra). Therefore the latter observation is expected tage to the Strain, this gene in the carbon flux pathway is whereas the former is not. The finding of the FBP genes in considered diagnostic for the present Strain. an obligate methane utilizing bacterium is both Surprising and suggestive of utility. The FBP pathway is energetically 0117 Comparison of the pyrophosphate dependent phos favorable to the host microorganism due to the fact that leSS phofructokinase gene sequence (SEQID NO:1) and deduced energy (ATP) is utilized than is utilized in the KDPG amino acid sequence (SEQ ID NO:2) to public databases pathway. Thus organisms that utilize the FBP pathway may reveals that the most similar known sequence is about 63% have an energetic advantage and growth advantage over identical to the amino acid Sequence of reported herein over those that utilize the KDPG pathway. This advantage may length of 437 amino acids using a Smith-Waterman align also be useful for energy-requiring production pathways in ment algorithm (W. R. Pearson, Comput. Methods Genome the Strain. By using this pathway a methane-utilizing bac Res., Proc. Int. Symp. (1994), Meeting Date 1992, 111-20. terium may have an advantage over other methane utilizing Editor(s): Suhai, Sandor. Publisher: Plenum, New York, organisms as production platforms for either Single cell N.Y.). More preferred amino acid fragments are at least protein or for any other product derived from the flow of about 80%-90% identical to the sequences herein. Most carbon through the RuMP pathway. preferred are nucleic acid fragments that are at least 95% identical to the amino acid fragments reported herein. Simi 0120 Accordingly the present invention provides a larly, preferred pyrophosphate dependent phosphofructoki method for the production of a carotenoid compound com nase encoding nucleic acid Sequences corresponding to the prising providing a transformed C1 metabolizing host cell instant gene are those encoding active proteins and which which US 2003/0003528A1 Jan. 2, 2003 10

0121 (a) grows on a C1 carbon substrate selected (IPP). Where IPP is naturally present only elements of the from the group consisting of methane and methanol; lower carotenoid pathway will be needed. However, it will and be appreciated that for the lower pathway carotenoid genes to be effective in the production of carotenoids, it will be 0122 (b) comprises a functional Embden-Meyerhof necessary for the host cell to have suitable levels of IPP carbon pathway, Said pathway comprising a gene within the cell. Where IPP synthesis is not provided by the encoding a pyrophosphate dependent phosphofruc host cell, it will be necessary to introduce the genes neces tokinase enzyme. sary for the production of IPP. Each of these pathways will 0123) Isolation of C1 Metabolizing Microorganisms be discussed below in detail. 0.124. The C1 metabolizing microorganisms of the 0128. The Upper Isoprenoid Pathway present invention are ubiquitous and many have been iso 0129. IPP biosynthesis occurs through either of two path lated and characterized. A general Scheme for isolation of ways. First, IPP may be synthesized through the well-known these Strains includes addition of an inoculum into a Sealed acetate/. However, recent Studies have liquid mineral Salts media, containing either methane or demonstrated that the mevalonate-dependent pathway does methanol. Care must be made of the Volume:gas ratio and not operate in all living organisms. An alternate mevalonate cultures are typically incubated between 25-55 C. Typi independent pathway for IPP biosynthesis has been charac cally, a variety of different methylotrophic bacteria can be terized in bacteria and in green algae and higher plants isolated from a first enrichment, if it is plated or Streaked (Horbach et al., FEMS Microbiol. Lett. 111:135-140(1993); onto solid media when growth is first visible. Methods for Rohmer et al, Biochem. 295:517-524 (1993); Schwender et the isolation of methanotrophs are common and well known al., Biochem. 316: 73-80 (1996); Eisenreich et al., Proc. in the art (See for example Thomas D. Brock in Biotech Natl. Acad. Sci. USA 93: 6431-6436 (1996)). Many steps in nology: A Textbook of Industrial Microbiology, Second both isoprenoid pathways are known (FIG. 1). For example, Edition (1989) Sinauer Associates, Inc., Sunderland, Mass.; the initial Steps of the alternate pathway leading to the Deshpande, Mukund V., Appl. Biochem. Biotechnol, 36: production of IPP have been studied in Mycobacterium 227 (1992); or Hanson, R. S. et al. The Prokaryotes: a tuberculosis by Cole et al. (Nature 393:537-544 (1998)). The handbook on habitats, isolation, and identification of bac first step of the pathway involves the condensation of two teria; Springer-Verlag: Berlin, N.Y., 1981; Volume 2, Chap 3-carbon molecules (pyruvate and D-glyceraldehyde ter 118). 3-phosphate) to yield a 5-carbon compound known as D-1- 0125. As noted above, preferred C1 metabolizer is one deoxyxylulose-5-phosphate. This reaction occurs by the that incorporates an active Embden-Meyerhof pathway as DXS enzyme, encoded by the dxS gene. Next, the isomer indicated by the presence of a pyrophosphate dependent ization and reduction of D-1-deoxyxylulose-5-phosphate phosphofructokinase. It is contemplated that the present yields 2-C-methyl-D-erythritol-4-phosphate. One of the teaching will enable the general identification and isolation enzymes involved in the isomerization and reduction pro of Similar Strains. For example, the key characteristics of the ceSS is D-1-deoxyxylulose-5-phosphate reductoisomerase present high growth Strain are that it is an obligate metha (DXR), encoded by the gene dxr. 2-C-methyl-D-erythritol notroph, using only either methane of methanol as a Sole 4-phosphate is Subsequently converted into 4-diphospho carbon Source and possesses a functional Embden-Meyer cytidyl-2C-methyl-D-erythritol in a CTP-dependent reac hof, and particularly a gene encoding a pyrophosphate tion by the enzyme encoded by the non-annotated geneygbP dependent phosphofructokinase. Methods for the isolation (Cole et al., Supra). Recently, however, the ygbP gene was of methanotrophs are common and well known in the art renamed as ispID as a part of the isp gene cluster (SwissPro (See for example Thomas D. Brock supra or Deshpande, tein Accession #Q46893). Supra). Similarly, pyrophosphate dependent phosphofruc 0130. Next, the 2" position hydroxy group of 4-diphos tokinase has been well characterized in mammalian Systems phocytidyl-2C-methyl-D-erythritol can be phosphorylated and assay methods have been well developed (see for in an ATP-dependent reaction by the enzyme encoded by the example Schliselfeld et al. Clin. Biochem. (1996), 29(1), ychB gene. This product phosphorylates 4-diphosphocyti 79-83; Clark et al., J. Mol. Cell. Cardiol. (1980), 12(10), dyl-2C-methyl-D-erythritol, resulting in 4-diphosphocyti 1053-64. The contemporary microbiologist will be able to dyl-2C-methyl-D-erythritol 2-phosphate. The yehB gene use these techniques to identify the present high growth was renamed as ispE, also as a part of the isp gene cluster Strain. (SwissProtein Accession #P24209). Finally, the product of 0.126 Genes Involved in Carotenoid Production. ygbB gene converts 4-diphosphocytidyl-2C-methyl-D- erythritol 2-phosphate to 2C-methyl-D-erythritol 2,4-cyclo 0127. The enzyme pathway involved in the biosynthesis diphosphate in a CTP-dependent manner. This gene has also of carotenoids can be conveniently viewed in two parts, the upper isoprenoid pathway providing for the conversion of been recently renamed, and belongs to the isp gene cluster. pyruvate and glyceraldehyde-3-phosphate to isopentenyl Specifically, the new name for the ygbB gene is ispF pyrophosphate and the lower carotenoid biosynthetic path (SwissProtein Accession #P36663). way, which provides for the Synthesis of phytoene and all 0131. It is known that 2C-methyl-D-erythritol 2,4-cyclo Subsequently produced carotenoids. The upper pathway is diphosphate can be further converted into IPP to ultimately ubiquitous in many C1 metabolizing microorganisms and in produce carotenoids in the carotenoid biosynthesis pathway. these cases it will only be necessary to introduce genes that However, the reactions leading to the production of isopen comprise the lower pathway for the biosynthesis of the tenyl monophosphate from 2C-methyl-D-erythritol 2,4-cy desired carotenoid. The key division between the two path clodiphosphate are not yet well-characterized. The enzymes ways concerns the Synthesis of isopentenyl pyrophosphate encoded by the IytB and gcpE genes (and perhaps others) are US 2003/0003528A1 Jan. 2, 2003 thought to participate in the reactions leading to formation of isopentenyl pyrophosphate (IPP) and dimethylallyl pyro TABLE 2-continued phosphate (DMAPP). Sources of Genes Encoding the Upper Isoprene Pathway 0132) IPP may be isomerized to DMAPP via IPP isomerase, encoded by the idi gene, however this enzyme is Gene Genbank Accession Number and Source Organism not essential for Survival and may be absent in Some bacteria synthase (Ctps), mRNA using 2-C-methyl-D-erythritol 4-phosphate (MEP) pathway. NM O19857 Recent evidence suggests that the MEP pathway branches Homo sapiens CTP synthase II (CTPS2), X68196 before IPP and separately produces IPP and DMAPP via the mRNAS.cerevisiae ura8 gene for CTP synthetase IytB gene product. A IytB knockout mutation is lethal in E. XM 013134 coli except in media Supplemented with both IPP and BCOO9408, Homo Sapiens, CTP synthase, clone DMAPP. MGC10396 IMAGE 3355881 Homo sapiens CTP synthase II (CTPS2), mRNA 0.133 Genes encoding elements of the upper pathway are XM 0468O1 known from a variety of plant, animal, and bacterial Sources, Homo sapiens CTP synthase II (CTPS2), mRNA XM 046802 as shown in Table 2. Homo sapiens CTP synthase II (CTPS2), mRNA XM 046803 TABLE 2 Homo sapiens CTP synthase II (CTPS2), mRNA XM 046804 Sources of Genes Encoding the Upper Isoprene Pathway Homo sapiens CTP synthase II (CTPS2), mRNA Z47198, A. parasiticus pkSA gene for polyketide Gene Genbank Accession Number and Source Organism synthase lytB AF027189, Acinetobacter sp. BD4I 3 dxs AFO35440, Escherichia coli AFO98521, Burkholderia pseudomallei Y18874, Synechococcus PCC6301 AF291696, Streptococcus pneumoniae AB026631, Streptomyces sp. CL190 AF323927, Plasmodium falciparum gene AB042821, Streptomyces griseolosporeus M87645, Bacilius subtillis AF111814, Plasmodium falciparum U38915, Synechocystis sp. AF143812, Lycopersicon esculentum X89371, C. jejuni AJ279019, Narcissus pseudonarcissus AJ291721, Nicotiana tabacum dXr ABO13300, Escherichia coli AB049187, Streptomyces griseolosporeus AF111813, Plasmodium falciparum AF116825, Mentha x piperita AF148852, Arabidopsis thaliana AF182287, Artemisia annua AF250235, Catharanthus roseus AF282879, Pseudomonas aeruginosa AJ242588, Arabidopsis thaliana AJ250714, Zymomonas mobilis strain ZM4 AJ292312, Klebsiella pneumoniae, AJ297566, Zea mays ispD AB037876, Arabidopsis thaliana AF109075, Clostridium difficile AF230736, Escherichia coli AF230737, Arabidopsis thaliana ispE AF216300, Escherichia coli AF263101, Lycopersicon esculentum AF288615, Arabidopsis thaliana 0134) The most preferred source of genes for the upper ispF AB038256, Escherichia coli mecs gene isoprene pathway in the present invention is from Methy AF230738, Escherichia coli lomonas 16a. Methylomonas 16a is particularly well Suited AF250236, Catharanthus roseus (MECS) AF279661, Plasmodium falciparum for the present invention, as the methanotroph is naturally AF321531, Arabidopsis thaliana pink-pigmented, producing a 30-carbon carotenoid. Thus, pyrCi AB017705, Aspergillus Oryzae the organism is well-endowed with the genes of the upper AB064659, Aspergillus kawachii isoprene pathway. Sequences of these preferred genes are AFO61753, NitroSOmonas europaea AF206163, Solorina crocea presented as the following SEQ ID numbers: the dxS gene L22971, Spiroplasma citri (SEQ ID NO:5), the dxr gene (SEQ ID NO:7), the “ispD” M12843, E. coli gene (SEQ ID NO:9), the “ispE” gene (SEQ ID NO:11), the M19132, Emericelia nidulans “ispF” gene (SEQ ID NO:13), the “pyrG” gene (SEQ ID M69112, Mucor circinelioides U15192, Chlamydia trachomatis NO:15), and the “lytB” gene (SEQ ID NO:17). U59237, Synechococcus PCC7942 U88301, Mycobacterium bovis 0135) The Lower Carotenoid Biosynthetic Pathway X06626, Aspergillus niger X08037, Penicillium chrysogenium 0.136 The formation of phytoene is the first “true” step X53601, P. biakesleeanus unique in the biosynthesis of carotenoids and produced via X67216, A. brasilense the lower carotenoid biosynthetic pathway, despite the com Y11303, A. fumigatus pounds being colorless. The Synthesis of phytoene occurs Y13811, Aspergillus Oryzae NM OO1905 via isomerization of IPP to dimethylallyl pyrophosphate Homo Sapiens CTP synthase (CTPS), mRNA (DMAPP). This reaction is followed by a sequence of 3 NM 016748, Mus musculus cytidine 5'-triphosphate prenyltransferase reactions. Two of these reactions are cata lyzed by isp A, leading to the creation of geranyl pyrophos US 2003/0003528A1 Jan. 2, 2003 12 phate (GPP; a 10-carbon molecule) and farnesyl pyrophos phate (FPP, 15-carbon molecule). TABLE 3 0.137 The gene crtN1 and N2 convert farnesyl pyrophos Sources of Genes Encoding the Lower Carotenoid phate to naturally occurring 16 A 30-carbon pigment. Biosynthetic Pathway 0.138. The gene crtE, encoding GGPP synthetase is Gene Genbank Accession Number and Source Organism ispA ABOO3187, Micrococcus luteus responsible for the 3' prenyltransferase reaction which may AB016094, Synechococcus elongatus occur, leading to the Synthesis of phytoene. This reaction AB021747, Oryza sativa FPPSI gene for farnesyl adds IPP to FPP to produce a 20-carbon molecule, gera diphosphate synthase AB028044, Rhodobacter sphaeroides nylgeranyl pyrophosphate (GGPP). AB028046, Rhodobacter capsulatus AB028047, Rhodovulum sulfidophilum 0139 Finally, a condensation reaction of two molecules AF112881 and AF136602, Artemisia annua of GGPP occur to form phytoene (PPPP), the first 40-carbon AF384040, Mentha x piperita DOO694, Escherichia coli molecule of the lower carotenoid biosynthesis pathway. This D13293, B. Stearothermophilus enzymatic reaction is catalyzed by crtB, encoding phytoene D85317, Oryza sativa Synthase. X75789, A. thaliana Y12072, G. arboreum Z49786, H. brasiliensis 0140) Lycopene, which imparts a “red'-colored spectra, U80605, Arabidopsis thaliana farnesyl diphosphate is produced from phytoene through four Sequential dehy synthase precursor (FPS1) mRNA, complete cds drogenation reactions by the removal of eight atoms of X76026, K. Iactis FPS gene for farnesyl diphosphate synthetase, QCR8 gene for bc1 complex, subunit VIII hydrogen, catalyzed by the gene crtl (encoding phytoene X82542, P. argentatum mRNA for farnesyl desaturase). Intermediaries in this reaction are phtyofluene, diphosphate synthase (FPS1) X82543, P argentatum mRNA for farnesyl Zeta-carotene, and neurosporene. diphosphate synthase (FPS2) BCO10004, Homo Sapiens, farnesyl diphosphate 0141 Lycopene cyclase (crtY) converts lycopene to synthase (farnesyl pyrophosphate synthetase, B-caroteine. dimethylallyltranstransferase, geranyltranstransferase), clone MGC 15352 IMAGE, 4132071, mRNA, complete cds 0142 B-caroteine is converted to zeaxanthin via a AF234168, Dictyostelium discoideum farnesyl hydroxylation reaction resulting from the activity of B-caro diphosphate synthase (Dfps) tene hydroxylase (encoded by the crtz gene). B-cryptoxan L46349, Arabidopsis thaliana farnesyl diphosphate thin is an intermediate in this reaction. synthase (FPS2) mRNA, complete cds L46350, Arabidopsis thaliana farnesyl diphosphate synthase (FPS2) gene, complete cds 0143 B-caroteine is converted to canthaxanthin by L46367, Arabidopsis thaliana farnesyl diphosphate B-caroteine ketolase encoded by the crtW gene. Echinenone synthase (FPSI) gene, alternative products, complete in an intermediate in this reaction. Canthaxanthin can then cds M89945, Rat farnesyl diphosphate synthase gene, be converted to astaxanthin by B-caroteine hydroxylase exons 1-8 NM 002004, Homo Sapiens farnesyl diphosphate encoded by the crtz gene. Adonbirubrin is an intermediate synthase (farnesyl pyrophosphate synthetase, in this reaction. dimethylallyltranstransferase, geranyltranstransferase) (FDPS), mRNA 0144 Zeaxanthin can be converted to zeaxanthin-f-di U36376 glucoside. This reaction is catalyzed by ZeaXanthin glucosyl Artemisia annua farnesyl diphosphate synthase (fps1) mRNA, complete cds transferase (crtX). XM 001352, Homo Sapiens farnesyl diphosphate synthase (farnesyl pyrophosphate synthetase, 0145 Zeaxanthin can be converted to astaxanthin by dimethylallyltranstransferase, B-caroteine ketolase encoded by crtW, crtO or bkt. Adonix geranyltranstransferase) (FDPS), mRNA XM 034497 anthin is an intermediate in this reaction. Homo Sapiens farnesyl diphosphate synthase (farnesyl pyrophosphate synthetase, 0146) Spheroidene can be converted to spheroidenone by dimethylallyltranstransferase, Spheroidene monooxygenase encoded by crtA. geranyltranstransferase) (FDPS), mRNA XM 034498 0147 Nerosporene can be converted spheroidene and Homo Sapiens farnesyl diphosphate synthase (farnesyl pyrophosphate synthetase, lycopene can be converted to Spirilloxanthin by the Sequen dimethylallyltranstransferase, tial actions of hydroxyneurosporene Synthase, methoxy geranyltranstransferase) (FDPS), mRNA neuroSporene desaturase and hydroxyneurosporene-O-me XM 03.4499 thyltransferase encoded by the crtC, crtD and crtF genes, Homo Sapiens farnesyl diphosphate synthase respectively. (farnesyl pyrophosphate synthetase, dimethylallyltranstransferase, geranyltranstransferase) (FDPS), mRNA 0148 B-caroteine can be converted to isorenieratene by XM 034500 b-caroteine desaturase encoded by crtU. Homo Sapiens farnesyl diphosphate synthase (farnesyl pyrophosphate synthetase, 0149 Genes encoding elements of the lower carotenoid dimethylallyltranstransferase, biosynthetic pathway are known from a variety of plant, geranyltranstransferase) (FDPS), mRNA animal, and bacterial Sources, as shown in Table 3. US 2003/0003528A1 Jan. 2, 2003 13

TABLE 3-continued TABLE 3-continued Sources of Genes Encoding the Lower Carotenoid Sources of Genes Encoding the Lower Carotenoid Biosynthetic Pathway Biosynthetic Pathway Gene Genbank Accession Number and Source Organism Gene Genbank Accession Number and Source Organism crtN X73889, S. aureus X74599 crtE (GGPP AB000835, Arabidopsis thaliana Synechococcus sp. Icy gene for lycopene cyclase Synthase) AB016043 and ABO19036, Homo Sapiens X81787 ABO16044, Mus musculus N. tabacum CrtL-1 gene encoding lycopene cyclase AB027705 and ABO27706, Daucus carota X86221, C. annuum AB034249, Croton Sublyratus X86452, L. esculentum mRNA for lycopene f-cyclase AB034250, Scoparia dulcis X95596, S. griseus AFO20041, Helianthus annuus X98796, N. pseudonarcissus AFO49658, Drosophila melanogaster signal crtl ABO46992, Citrus unshiu CitPDS1 mRNA for recognition particle 19kDa protein (Srp19) gene, partial phytoene desaturase, complete cds sequence; and geranylgeranyl pyrophosphate AFO39.585 synthase (quemao) gene, complete cds Zea mays phytoene desaturase (pds1) gene promoter AFO49659, Drosophila melanogaster geranylgeranyl region and exon 1 pyrophosphate synthase mRNA, complete cds AFO49356 AF139916, Brevibacterium linens Oryza sativa phytoene desaturase precursor (Pds) AF279807, Penicillium paxilligeranylgeranyl mRNA, complete cds pyrophosphate synthase (ggs1) gene, complete AF139916, Brevibacterium linens AF2798O8 AF218415, Bradyrhizobium sp. ORS278 Penicillium paxilli dimethylallyl tryptophan synthase AF251014, Tagetes erecta (paxD) gene, partial cds; and cytochrome P450 AF364515, Citrus x paradisi monooxygenase (paxQ), cytochrome P450 D58420, Agrobacterium aurantiacum monooxygenase (paxP), PaxC (paxC), D83514, Erythrobacter longus monooxygenase (paxM), geranylgeranyl L16237, Arabidopsis thaliana pyrophosphate synthase (paxG), PaxU (paxU), and L37405, Streptomyces griseus geranylgeranyl metabolite transporter (paxT) genes, complete cds pyrophosphate synthase (crtB), phytoene desaturase AJO10302, Rhodobactersphaeroides (cftE) and phytoene synthase (cftl) genes, complete AJ133724, Mycobacterium aurum cds AJ276129, Mucor circinelloides flusitanicus carG L39266, Zea mays phytoene desaturase (Pds) gene for geranylgeranyl pyrophosphate synthase, mRNA, complete cds exons 1-6 M64704, Soybean phytoene desaturase D85O29 M88683, Lycopersicon esculentum phytoene Arabidopsis thaliana mRNA for geranylgeranyl desaturase (pds) mRNA, complete cds pyrophosphate synthase, partial cds S71770, carotenoid gene cluster L25813, Arabidopsis thaliana U37285, Zea mays L37405, Streptomyces griseus geranylgeranyl U46919, Solanum lycopersicum phytoene desaturase pyrophosphate synthase (crtB), phytoene desaturase (Pds) gene, partial cds (cftE) and phytoene synthase (cftl) genes, complete U62808, Flavobacterium ATCC21588 cds X55289, Synechococcus pds gene for phytoene U15778, Lupinus albus geranylgeranyl desaturase pyrophosphate synthase (ggps1) mRNA, complete X59948, L. esculentum cds X62574, Synechocystis sp. pds gene for phytoene U44876, Arabidopsis thaliana pregeranyigeranyl desaturase pyrophosphate synthase (GGPS2) mRNA, complete X68058 cds C. annuum pds1 mRNA for phytoene desaturase X92893, C. roseus X71O23 X95596, S. griseus Lycopersicon esculentum pds gene for phytoene X98795, S. alba desaturase Y15112, Paracoccus marcusii X78271, L. esculentum (Ailsa Craig) PDS gene crtX D90087, E. uredovora X78434, P. blakesleeanus (NRRL1555) carB gene M87280 and M90698, Pantoea agglomerans X78815, N. pseudonarcissus crtY AF139916, Brevibacterium linens X86783, H. pluvialis AF152246, Citrus x paradisi Y14807, Dunaiella bardawill AF218415, Bradyrhizobium sp. ORS278 YI 5007, Xanthophyllomyces dendrorhous AF272737, Streptomyces griseus strain IFO13350 Y15112, Paracoccus marcusii AJ133724, Mycobacterium aurum Y15114, Anabaena PCC7210 crtP gene AJ250827, Rhizomucor circinelloides flusitanicus Z11165, R. capsulatus carRP gene for lycopene cyclase/phytoene synthase, crtB AB001284, Spirulina platensis exons 1-2 AB032797, Daucus carota PSY mRNA for phytoene AJ276965, Phycomyces blakesleeanus carRA gene synthase, complete cds for phytoene synthase/lycopene cyclase, exons 1-2 AB034704, Rubrivivax gelatinosus D58420, Agrobacterium aurantiacum ABO37975, Citrus unshiu D83513, Erythrobacter longus AFOO9954, Arabidopsis thaliana phytoene synthase L40176, Arabidopsis thaliana lycopene cyclase (PSY) gene, complete cds (LYC) mRNA, complete cds AF139916, Brevibacterium linens M87280, Pantoea agglomerans AF152892, Citrus x paradisi U50738, Arabodopsis thaliana lycopene epsilon AF218415, Bradyrhizobium sp. ORS278 cyclase mRNA, complete cds AF220218, Citrus unshiu phytoene synthase (Psy1) U50739 mRNA, complete cds Arabidosis thaliana lycopene B cyclase mRNA, AJO10302, Rhodobacter complete cds AJ133724, Mycobacterium aurum U62808, Flavobacterium ATCC21588 AJ278287, Phycomyces blakesleeanus carRA gene US 2003/0003528A1 Jan. 2, 2003

TABLE 3-continued TABLE 3-continued Sources of Genes Encoding the Lower Carotenoid Sources of Genes Encoding the Lower Carotenoid Biosynthetic Pathway Biosynthetic Pathway Gene Genbank Accession Number and Source Organism Gene Genbank Accession Number and Source Organism for lycopene cyclase/phytoene synthase, U73944, Rubrivivax gelatinosus A3O4825 X52291 and Z11165, Rhodobacter capsulatus Helianthus annuus mRNA for phytoene synthase (psy Z21955, M. xanthus gene) crtD AJO10302 and X63204, Rhodobacter sphaeroides A3O8385 (carotenoid 3, U73944, Rubrivivax gelatinosus Helianthus annuus mRNA for phytoene synthase (psy 4-desaturase X52291 and Z11165, Rhodobacter capsulatus gene) crtF AB034704, Rubrivivax gelatinosus D58420, Agrobacterium aurantiacum (1-OH-carotenoid AF288602, Chloroflexus aurantiacus L23424 methylase) AJO10302, Rhodobacter sphaeroides Lycopersicon esculentum phytoene synthase (PSY2) X52291 and Z11165, Rhodobacter capsulatus mRNA, complete cds L25812, Arabidopsis L37405, Streptomyces griseus geranylgeranyl pyrophosphate synthase (crtB), phytoene desaturase 0150. The most preferred source of genes for the lower (cftE) and phytoene synthase (cftI) genes, complete carotenoid biosynthetic pathway in the present invention are cds M38424 from a variety of sources. The “ispA” gene (SEQID NO:19) Pantoea agglomerans phytoene synthase (crtE) is native to Methylomonas 16a, as the organism produces gene, complete cds respiratory quinones and a 30-carbon carotenoid via the M87280, Pantoea agglomerans 2-C-methyl-D-erythritol 4-phosphate (MEP) pathway. How S71770, carotenoid gene cluster ever, Methylomonas does not synthesize the desired 40-car U32636 Zea mays phytoene synthase (Y1) gene, complete bon carotenoids. FPP is the end-product of the MEP pathway cds in Methylomonas 16A and is subsequently converted to its U62808, Flavobacterium ATCC21588 natural 30-carbon carotenoid by the action of the Sqs, crtN1 U87626, Rubrivivax gelatinosus and crtN2 gene products. AS a native gene to the preferred U91900, Dunaiella bardawill X52291, Rhodobacter capsulatus host organism, the ispA gene (SEQ ID NO:19) is the most X60441, L. esculentum GTom5 gene for phytoene preferred Source of the gene for the present invention. synthase X63873 0151. The majority of the most preferred source of crt Synechococcus PCC7942 pys gene for phytoene genes are primarily from Panteoa Stewartii. Sequences of synthase these preferred genes are presented as the following SEQID X68O17 C. annuum psy1 mRNA for phytoene synthase numbers: the crtE gene (SEQ ID NO:25), the crtX gene X69172 (SEQ ID NO:27), crtY (SEQID NO:29), the crt1 gene (SEQ Synechocystis sp. pys gene for phytoene synthase ID NO:31), the crtB gene (SEQ ID NO:33) and the crtz gene X78814, N. pseudonarcissus (SEQ ID NO:35). Additionally, the crt0 gene isolated from crtz, D58420, Agrobacterium aurantiacum D58422, Alcaligenes sp. Rhodococcus erythropolis AN12 and presented as SEQ ID D90087, E. uredovora NO:37 is preferred in combination with other genes for the M87280, Pantoea agglomerans present invention. U62808, Flavobacterium ATCC21588 Y15112, Paracoccus marcusii 0152 By using various combinations of the genes pre crtW AF218415, Bradyrhizobium sp. ORS278 Sented in Table 3 and the preferred genes of the present D45881, Haematococcus pluvialis invention, innumerable different carotenoids and carotenoid D58420, Agrobacterium aurantiacum derivatives could be made using the methods of the present D58422, Alcaligenes sp. X86782, H. pluvialis invention, provided sufficient sources of IPP are available in Y15112, Paracoccus marcusii the host organism. For example, the gene cluster crtEXYIB crtO X86782, H. pluvialis enables the production of B-carotene. Addition of the crt Z. Y15112, Paracoccus marcusii to crtEXYIB enables the production of zeaxanthin, while the crtU AF047490, Zea mays crt EXYIBZO cluster leads to production of astaxanthin and AF121947, Arabidopsis thaliana canthaxanthin. AF139916, Brevibacterium linens AF195507, Lycopersicon esculentum 0153. It is envisioned that useful products of the present AF272737, Streptomyces griseus strain IFO13350 invention will include any carotenoid compound as defined AF372617, Citrus x paradisi AJ133724, Mycobacterium aurum herein including but not limited to antheraxanthin, adonix AJ224683, Narcissus pseudonarcissus anthin, astaXanthin, canthaxanthin, capSorubrin, B-cryptox D26095 and U38550, Anabaena sp. anthin alpha-carotene, beta-carotene, epsilon-caroteine, X89897, C. annuum echinenone, gamma-caroteine, Zeta-carotene, alpha-cryptoX Y15115, Anabaena PCC7210 crtO gene anthin, diatoxanthin, 7,8-didehydroastaxanthin, fucoxan crtA AJO10302, Rhodobactersphaeroides thin, fucoxanthinol, isorenieratene, lactucaXanthin, lutein, (spheroidene Z11165 and X52291, Rhodobacter capsulatus monooxygenase) lycopene, neoxanthin, neuroSporene, hydroxyneuroSporene, crtC AB034704, Rubrivivax gelatinosus peridinin, phytoene, rhodopin, rhodopin glucoside, Sipho AF195122 and AJO10302, Rhodobacter sphaeroides naxanthin, Spheroidene, Spheroidenone, Spirilloxanthin, uri AF287480, Chlorobium tepidum olide, uriolide acetate, violaxanthin, ZeaXanthin-3-digluco Side, and ZeaXanthin. Additionally the invention US 2003/0003528A1 Jan. 2, 2003

encompasses derivitization of these molecules to create energy or carbon. Alternatively, it may be useful to over hydroxy-methoxy-, OXO-, epoxy-, carboxy-, or aldehydic express various genes upstream of desired carotenoid inter functional groups, or glycoside esters, or Sulfates. mediates to enhance production. 0154 Construction of Recombinant C1 Metabolizing 0160 Methods of up-regulating and down-regulating Microorganisms genes for this purpose have been explored. Where Sequence of the gene to be disrupted is known, one of the most O155 Methods for introduction of genes encoding the effective methods gene down regulation is targeted gene appropriate upper isoprene pathway genes or lower caro disruption where foreign DNA is inserted into a structural tenoid biosynthetic pathway genes into a Suitable C1 gene So as to disrupt transcription. This can be effected by metabolizing host are common. Microbial expression SyS the creation of genetic cassettes comprising the DNA to be tems and expression vectors containing regulatory inserted (often a genetic marker) flanked by Sequence having Sequences Suitable for expression of heterologus genes in C1 a high degree of homology to a portion of the gene to be metabolizing hosts are known. Any of these could be used disrupted. Introduction of the cassette into the host cell to construct chimeric genes for expression of any of the results in insertion of the foreign DNA into the structural above mentioned carotenoid biosynthetic genes. These chi gene via the native DNA replication mechanisms of the cell. meric genes could then be introduced into appropriate hosts (See for example Hamilton et al. (1989) J. Bacteriol. via transformation to provide high level expression of the 171:4617-4622, Balbas et al. (1993) Gene 136:211-213, enzymes. Gueldener et al. (1996) Nucleic Acids Res. 24:2519-2524, 0156 Vectors or cassettes useful for the transformation of and Smith et al. (1996) Methods Mol. Cell. Biol. 5:270-277.) Suitable host cells are available. For example Several classes of promoters may be used for the expression of genes 0.161 Antisense technology is another method of down encoding the present carotenoid biosynthetic genes in C1 regulating genes where the Sequence of the target gene is metabolizers including, but not limited to endogenous pro known. To accomplish this, a nucleic acid Segment from the moterS Such as the deoxy-xylulose phosphate Synthase or desired gene is cloned and operably linked to a promoter methanol dehydrogenase operon promoter (Springer et al. Such that the anti-sense Strand of RNA will be transcribed. (1998) FEMS Microbiol Lett 160: 119-124), the promoter for This construct is then introduced into the host cell and the polyhydroxyalkanoic acid Synthesis (Foellner et al. Appl. antisense strand of RNA is produced. Antisense RNA inhib Microbiol. Biotechnol. (1993) 40:284-291), or promoters its gene expression by preventing the accumulation of identified from native plasmids in methylotrophs (EP mRNA which encodes the protein of interest. The person 296484). In addition to these native promoters, non-native skilled in the art will know that Special considerations are promoters may also be used, as for example the promoter for asSociated with the use of antisense technologies in order to the lactose operon Plac (Toyama et al. Microbiology (1997) reduce expression of particular genes. For example, the 143:595-602; EP 62971) or a hybrid promoter such as Ptrc proper level of expression of antisense genes may require (Brosius et al. (1984) Gene 27:161-172). Similarly, promot the use of different chimeric genes utilizing different regu erS associated with antibiotic resistance, e.g. kanamycin latory elements known to the skilled artisan. (Springer et al. (1998). FEMS Microbiol Lett 160:119-124; 0162 Although targeted gene disruption and antisense Ueda et al. Appl. Environ. Microbiol. (1991) 57.924-926) or technology offer effective means of down regulating genes tetracycline (U.S. Pat. No. 4,824.786), are also suitable. where the Sequence is known, other leSS Specific method ologies have been developed that are not Sequence based. O157. Once the specific regulatory element is selected, For example, cells may be exposed to a UV radiation and the promoter-gene cassette can be introduced into a C1 then Screened for the desired phenotype. Mutagenesis with metabolizer on a plasmid containing either a replicon for chemical agents is also effective for generating mutants and episomal expression (Brenner et al. Antonie Van Leeuwen commonly used Substances include chemicals that affect hoek (1991) 60:43-48; Ueda et al. Appl. Environ. Microbiol. non-replicating DNA such as HNO and NH-OH, as well as (1991) 57.924-926) or homologous regions for chromo agents that affect replicating DNA Such as acridine dyes, Somal integration (Naumov et al. Mol. Genet. Mikrobiol. notable for causing frameshift mutations. Specific methods Virusol. (1986) 11:44-48). for creating mutants using radiation or chemical agents are 0158 Typically, the vector or cassette contains sequences well documented in the art. See for example Thomas D. directing transcription and translation of the relevant gene, Brock in Biotechnology: A Textbook of Industrial Microbi a Selectable marker, and Sequences allowing autonomous ology, Second Edition (1989) Sinauer Associates, Inc., Sun replication or chromosomal integration. Suitable vectors derland, Mass., or Deshpande, Mukund V., Appl. Biochem. comprise a region 5' of the gene which harbors transcrip Biotechnol, 36, 227, (1992). tional initiation controls and a region 3' of the DNA fragment which controls transcriptional termination. It is most pre 0163 Another non-specific method of gene disruption is ferred when both control regions are derived from genes the use of transposoable elements or transposons. Trans homologous to the transformed host cell, although it is to be poSons are genetic elements that insert randomly in DNAbut understood that Such control regions need not be derived can be latter retrieved on the basis of Sequence to determine from the genes native to the Specific Species chosen as a where the insertion has occurred. Both in vivo and in vitro production host. transposition methods are known. Both methods involve the use of a transposable element in combination with a trans 0159. Where accumulation of a specific carotenoid is posase enzyme. When the transposable element or transpo desired it may be necessary to reduce or eliminate the Son, is contacted with a nucleic acid fragment in the pres expression of certain genes in the target pathway or in ence of the transposase, the transposable element will competing pathways that may serve as competing Sinks for randomly insert into the nucleic acid fragment. The tech US 2003/0003528A1 Jan. 2, 2003 nique is useful for random mutagenesis and for gene isola production of end product or intermediate in Some Systems. tion, Since the disrupted gene may be identified on the basis Stationary or post-exponential phase production can be of the Sequence of the transposable element. Kits for in Vitro obtained in other Systems. transposition are commercially available (see for example The Primer Island Transposition Kit, available from Perkin 0169. A variation on the standard batch system is the Elmer Applied BioSystems, Branchburg, N.J., based upon Fed-Batch system. Fed-Batch culture processes are also the yeast Ty1 element; The Genome Priming System, avail Suitable in the present invention and comprise a typical able from New England Biolabs, Beverly, Mass.; based upon batch system with the exception that the substrate is added the bacterial transposon TnT.; and the EZ::TN Transposon in increments as the culture progresses. Fed-Batch Systems Insertion Systems, available from Epicentre Technologies, are useful when catabolite repression is apt to inhibit the Madison, Wis., based upon the Tn5 bacterial transposable metabolism of the cells and where it is desirable to have element. limited amounts of Substrate in the media. Measurement of the actual Substrate concentration in Fed-Batch Systems is 0164. In the context of the present invention the disrup difficult and is therefore estimated on the basis of the tion of certain genes in the terpenoid pathway may enhance changes of measurable factorS Such as pH, dissolved oxygen the accumulation of Specific carotenoids however, the deci and the partial pressure of waste gases Such as CO2. Batch Sion of which genes to disrupt would need to be determined and Fed-Batch culturing methods are common and well on an empirical basis. Candidate genes may include one or known in the art and examples may be found in Thomas D. more of the prenyltransferase genes which, as described Brock in Biotechnology: A Textbook of Industrial Microbi earlier, which catalyze the Successive condensation of iso ology, Second Edition (1989) Sinauer Associates, Inc., Sun pentenyl diphosphate resulting in the formation of prenyl derland, Mass., or Deshpande, Mukund V., Appl. Biochem. diphosphates of various chain lengths (multiples of C-5 Biotechnol, 36, 227, (1992), herein incorporated by refer isoprene units). Other candidate genes for disruption would CCC. include any of those which encode proteins acting upon the terpenoid backbone prenyl diphosphates. 0170 Commercial production of carotenoids using C1 metabolizers may also be accomplished with a continuous 0.165 Similarly, over-expression of certain genes culture. A continuous culture is an open System where a upstream of the desired product will be expected to have the defined culture media is added continuously to a bioreactor effect of increasing the production of that product. For and an equal amount of conditioned media is removed example, may of the genes in the upper isoprenoid pathway Simultaneously for processing. Continuous cultures gener (D-1-deoxyxylulose-5-phosphate synthase (DxS), D-1- ally maintain the cells at a constant high liquid phase density deoxyxylulose-5-phosphate reductoisomerase (DXr), where cells are primarily in log phase growth. Alternatively 2C-methyl-d-erythritol cytidylyltransferase (IspD), continuous culture may be practiced with immobilized cells 4-diphosphocytidyl-2-C-methylerythritol kinase (IspE), where carbon and nutrients are continuously added, and 2C-methyl-d-erythritol 2,4-cyclodiphosphate Synthase valuable products, by-products or waste products are con (IspF), CTP synthase (PyrC) and lytB) could be expressed tinuously removed from the cell mass. Cell immobilization on multicopy plasmids, or under the influence of Strong may be performed using a wide range of Solid Supports non-native promoters. In this fashion the levels of desired composed of natural and/or Synthetic materials. carotenoids may be enhanced. 0171 Continuous or semi-continuous culture allows for 0166) Industrial Production of Carotenoids the modulation of one factor or any number of factors that 0167. Where commercial production of carotenoid com affect cell growth or end product concentration. For pounds is desired according to the present invention, a example, one method will maintain a limiting nutrient Such variety of culture methodologies may be applied. For as the carbon Source or nitrogen level at a fixed rate and example, large-scale production of a specific gene product, allow all other parameters to moderate. In other Systems a over-expressed from a recombinant microbial host may be number of factors affecting growth can be altered continu produced by both batch or continuous culture methodolo ously while the cell concentration, measured by media gIeS. turbidity, is kept constant. Continuous Systems Strive to maintain Steady State growth conditions and thus the cell 0168 A classical batch culturing method is a closed loSS due to media being drawn off must be balanced against System where the composition of the media is Set at the the cell growth rate in the culture. Methods of modulating beginning of the culture and not Subject to artificial alter nutrients and growth factors for continuous culture pro ations during the culturing process. Thus, at the beginning of ceSSes as well as techniques for maximizing the rate of the culturing process the media is inoculated with the product formation are well known in the art of industrial desired organism or organisms and growth or metabolic microbiology and a variety of methods are detailed by activity is permitted to occur adding nothing to the System. Brock, Supra. Typically, however, a “batch” culture is batch with respect to the addition of carbon Source and attempts are often made at 0172 Fermentation media in the present invention must controlling factorS Such as pH and oxygen concentration. In contain Suitable carbon Substrates for C1 metabolizing batch Systems the metabolite and biomass compositions of organisms. Suitable Substrates may include but are not the System change constantly up to the time the culture is limited to one-carbon Substrates Such as carbon dioxide, terminated. Within batch cultures cells moderate through a methane or methanol for which metabolic conversion into Static lag phase to a high growth log phase and finally to a key biochemical intermediates has been demonstrated. In Stationary phase where growth rate is diminished or halted. addition to one and two carbon Substrates, methylotrophic Ifuntreated, cells in the Stationary phase will eventually die. organisms are also known to utilize a number of other Cells in log phase are often responsible for the bulk of carbon containing compounds Such as methylamine, glu US 2003/0003528A1 Jan. 2, 2003 cosamine and a variety of amino acids for metabolic activity. 0.178 The meaning of abbreviations is as follows: “h” For example, methylotrophic yeast are known to utilize the means hour(s), “min' means minute(s), “Sec’ means Sec carbon from methylamine to form trehalose or glycerol ond(s), “d” means day(s), “mL” means milliliters, “L” (Bellion et al., Microb. Growth C1 Compa, Int. Symp.), 7th means liters. (1993), 415-32. Editor(s): Murrell, J. Collin; Kelly, Don P. Publisher: Intercept, Andover, UK). Similarly, various spe 0179 Microbial Cultivation, Preparation of Cell Suspen cies of Candida will metabolize alanine or oleic acid (Sulter Sions, and ASSociated Analyses for Methylomonas 16a et al., Arch. Microbiol. 153:485-489 (1990)). Hence it is contemplated that the Source of carbon utilized in the present 0180. The following conditions were used throughout the invention may encompass a wide variety of carbon contain experimental Examples for treatment of Methylomonas 16a, ing substrates and will only be limited by the choice of unless conditions were Specifically specified otherwise. organism. 0181 Methylomonas 16a is typically grown in serum EXAMPLES stoppered Wheaton bottles (Wheaton Scientific, Wheaton 0173 The present invention is further defined in the Ill.) using a gas/liquid ratio of at least 8:1 (i.e. 20 mL of following Examples. It should be understood that these Nitrate liquid “BTZ-3” media of 160 mL total volume). The Examples, while indicating preferred embodiments of the Standard gas phase for cultivation contained 25% methane in invention, are given by way of illustration only. From the air. These conditions comprise growth conditions and the above discussion and these Examples, one skilled in the art cells are referred to as growing cells. In all cases, the can ascertain the essential characteristics of this invention, cultures were grown at 30° C. with constant Shaking in a and without departing from the Spirit and Scope thereof, can Lab-Line rotary Shaker unless otherwise Specified. make various changes and modifications of the invention to adapt it to various usages and conditions. 0182 Nitrate medium for Methylomonas 16A 0174 General Methods 0183 Nitrate liquid medium, also referred to herein as “defined medium” or “BTZ-3’ medium was comprised of 0175 Standard recombinant DNA and molecular cloning various salts mixed with Solution 1 as indicated below techniques used in the Examples are well known in the art (Tables 4 and 5) or where specified the nitrate was replaced and are described by Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning. A Laboratory Manual; with 15 mM ammonium chloride. Solution 1 provides the Cold Spring Harbor Laboratory Press: Cold Spring Harbor, comosition for 100 fold concentrated Stock Solution of trace (1989) (Maniatis) and by T. J. Silhavy, M. L. Bennan, and L. minerals. W. Enquist, Experiments with Gene Fusions, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1984) and by TABLE 4 Ausubel, F. M. et al., Current Protocols in Molecular Biol Solution 1* ogy, pub. by Greene Publishing Assoc. and Wiley-Inter science (1987). MW Conc. (mM) g per L. Nitriloacetic acid 1911 66.9 12.8 0176) Materials and methods suitable for the mainte CuCl2 x 2HO 170.48 O.15 O.O254 nance and growth of bacterial cultures are well known in the FeCl x 4HO 198.81 1.5 O.3 art. Techniques Suitable for use in the following examples MnCl2 x 4HO 197.91 0.5 O.1 may be found as set out in Manual of Methods for General CoCl2 x 6HO 237.9 1.31 O.312 Bacteriology (Phillipp Gerhardt, R. G. E. Murray, Ralph N. ZnCl2 136.29 O.73 O.1 HBO 61.83 O.16 O.O1 Costilow, Eugene W. Nester, Willis A. Wood, Noel R. Krieg NaMoC) x 2HO 241.95 O.04 O.O1 and G. Briggs Phillips, eds), American Society for Micro NiCl, x 6HO 237.7 O.77 O.184 biology, Washington, DC. (1994)) or by Thomas D. Brock in Biotechnology: A Textbook of Industrial Microbiology, Sec *Mix the gram amounts designated above in 900 mL of HO, adjust to pH ond Edition, Sinauer ASSociates, Inc., Sunderland, Mass. = 7, and add H2O to an end volume of 1 L. Keep refrigerated. (1989). All reagents, restriction enzymes and materials used for the growth and maintenance of bacterial cells were 0184 obtained from Aldrich Chemicals (Milwaukee, Wis.), DIFCO Laboratories (Detroit, Mich.), GIBCO/BRL (Gaith TABLE 5 ersburg, Md.), or Sigma Chemical Company (St. Louis, Mo.) unless otherwise specified. Nitrate liquid medium (BTZ-3)" 0177. Manipulations of genetic sequences were accom MW Conc. (mM) g per L. plished using the Suite of programs available from the NaNO 84.99 1O O.85 Genetics Computer Group Inc. (Wisconsin Package Version KHPO 136.09 3.67 0.5 NaSO 142.04 3.52 0.5 9.0, Genetics Computer Group (GCG), Madison, Wis.). MgCl2 x 6HO 2O3.3 O.98 O.2 Where the GCG program “Pileup” was used the gap creation CaCl2 x 2HO 147.02 O.68 O.1 default value of 12, and the gap extension default value of 1 M HEPES (pH 7) 238.3 50 mL. 4 were used. Where the CGC “Gap' or “Bestfit' programs Solution 1 10 mL. were used the default gap creation penalty of 50 and the **Dissolve in 900 mL H.O. Adjust to pH = 7, and add HO to give 1 L. default gap extension penalty of 3 were used. In any case For agar plates: Add 15 g of agarose in 1 L of medium, autoclave, let cool where GCG program parameters were not prompted for, in down to 50 C., mix, and pour plates. these or any other GCG program, default values were used. US 2003/0003528A1 Jan. 2, 2003

0185 Assessment of Microbial Growth and Conditions was followed until the optical density at 660 nm was stable, for Harvesting Cells whereupon the culture was transferred to fresh medium Such 0186 Cells obtained for experimental purposes were that a 1:100 dilution was achieved. After 3 Successive allowed to grow to maximum optical density (O.D. transferS with methane as Sole carbon and energy Source, the 660-1.0). Harvested cells were obtained by centrifugation in culture was plated onto growth agar with ammonium as a Sorval RC-5B centrifuge using a SS-34 rotor at 6000 rpm nitrogen Source and incubated under 25% methane in air. for 20 min. These cell pellets were resuspended in 50 mM Many methanotrophic bacterial Species were isolated in this HEPES buffer pH 7. These cell suspensions are referred to manner. However, Methylomonas 16a was selected as the as washed, resting cells. organism to study due to its rapid growth of colonies, large 0187 Microbial growth was assessed by measuring the colony size, ability to grow on minimal media, and pink optical density of the culture at 660 nm in an Ultrospec 2000 pigmentation indicative of an active biosynthetic pathway UV/Vis spectrophotometer (Pharmacia Biotech, Cambridge for carotenoids. England) using a 1 cm light path cuvet. Alternatively micro 0195 Genomic DNA was isolated from Methylomonas bial growth was assessed by harvesting cells from the 16a according to Standard protocols. Genomic DNA and culture medium by centrifugation as described above and, library construction were prepared according to published resuspending the cells in distilled water with a Second protocols (Fraser et al., The Minimal Gene Complement of centrifugation to remove medium Salts. The washed cells Mycoplasma genitalium, Science 270 (5235):397-403 were then dried at 105 C. overnight in a drying oven for dry (1995)). A cell pellet was resuspended in a solution con weight determination. taining 100 mM Na-EDTA pH 8.0, 10 mM Tris-HCl pH 8.0, 0188 Methane concentration was determined as 400 mM NaCl, and 50 mM MgCl. described by Emptage et al. (1997 Env Sci. Technol. 31:732 734), hereby incorporated by reference. 0196) Genomic DNA preparation 0189 Nitrate and Nitrite Assays 0197). After resuspension, the cells were gently lysed in 0190. 1 mL samples of cell culture were taken and filtered 10% SDS, and incubated for 30 min at 55° C. After through a 0.2 micron Acrodisc filter to remove cells. The incubation at room temperature, proteinase K was added to filtrate from this step contains the nitrite or nitrate to be 100 lug/mL and incubated at 37 C. until the suspension was analyzed. The analysis was performed on a DioneX ion clear. DNA was extracted twice with Tris-equilibrated phe chromatograph 500 system (Dionex, Sunnyvale Calif.) with nol and twice with chloroform. DNA was precipitated in an AS3500 autosampler. The column used was a 4 mm 70% ethanol and resuspended in a solution containing 10 Ion-Pac AS11-HC separation column with an AG-AC guard mM Tris-HCl and 1 mM Na-EDTA (TE), pH 7.5. The DNA column and an ATC trap column. All columns are provided Solution was treated with a mix of RNAases, then extracted by Dionex. twice with Tris-equilibrated phenol and twice with chloro 0191 The mobile phase was a potassium hydroxide gra form. This was followed by precipitation in ethanol and dient from 0 to 50 mM potassium hydroxide over a 12 min resuspension in TE. time interval. Cell temperature was 35 C. with a flow rate of 1 mL/min. 0198 Library Construction 0192 HPLC Analysis of Carotenoid Content 0199 200 to 500 lug of chromosomal DNA was resus pended in a solution of 300 mM sodium acetate, 10 mM 0193 Cell pellets were extracted with 1 ml by Tris-HCl, 1 mM Na-EDTA, and 30% glycerol, and sheared Vortexing for 1 min and intermittent Vortexing over the next at 12 psi for 60 sec in an Aeromist Downdraft Nebulizer 30 min. Cell debris was removed by centrifugation at chamber (IBI Medical products, Chicago, Ill.). The DNA 14,000x g for 10 min and the Supernatants was collected and was precipitated, resuspended and treated with Bal31 passed through a 0.45 pM filter. A Beckman System Gold(R) nuclease. After size fractionation, a fraction (2.0 kb, or 5.0 HPLC with Beckman Gold Nouveau Software (Columbia, kb) was excised and cleaned, and a two-step ligation pro Md.) was used for the study. The crude extraction (0.1 mL) cedure was used to produce a high titer library with greater was loaded onto a 125x4 mm RP8 (5um particles) column than 99% single inserts. with corresponding guard column (Hewlett-Packard, San Fernando, Calif.). The flow rate was 1 mL/min, while the 0200 Sequencing solvent program used was: 0-11.5 min 40% water/60% 0201 A shotgun sequencing Strategy approach was methanol; 11.5-20 min 100% methanol; 20-30 min 40% adopted for the Sequencing of the whole microbial genome water/60% methanol. The spectral data was collected by a (Fleischmann, R. et al., Whole-Genome Random sequenc Beckman photodiode array detector (model 168). ing and assembly of Haemophilus influenzae Rd Science Example 1 269(5223):496-512 (1995)). Sequence was generated on an ABI Automatic Sequencer using dye terminator technology (U.S. Pat. No. 5,366,860; EP 272,007) using a combination Isolation And Sequencing Of Methylomonas 16a of Vector and insert-specific primerS. Sequence editing was 0194 The original environmental sample containing the performed in either DNAStar (DNA Star Inc.) or the Wis isolate was obtained from pond Sediment. The pond Sedi consin GCG program (Wisconsin Package Version 9.0, ment was inoculated directly into defined medium with Genetics Computer Group (GCG), Madison, Wis.) and the ammonium as nitrogen Source under 25% methane in air. CONSED package (version 7.0). All sequences represent Methane was the Sole Source of carbon and energy. Growth coverage at least two times in both directions. US 2003/0003528A1 Jan. 2, 2003 19

Example 2 were translated in all reading frames and compared for Similarity to all publicly available protein Sequences con Identification and Characterization of Bacterial tained in the “nr” database using the BLASTX algorithm Genes from Methylomonas (Gish, W. and States, D. J. (1993) Nature Genetics 3:266 0202 All sequences from Example 1 were identified by 272) provided by the NCBI. All comparisons were done conducting BLAST (Basic Local Alignment Search Tool; using either the BLASTNnr or BLASTXnr algorithm. Altschul, S. F., et al., (1993).J. Mol. Biol. 215:403-410; see also www.ncbi.nim.nih.gov/BLAST/) searches for similar 0203 The results of these BLAST comparisons are given ity to sequences contained in the BLAST “nr” database below in Table 6 for many critical genes of the present (comprising all non-redundant GenBank CDS translations, invention. Table 6 Summarizes the Sequence to which each Sequences derived from the 3-dimensional Structure Methylomonas gene has the most similarity (presented as % Brookhaven Protein Data Bank, the SWISS-PROT protein Similarities, % identities, and expectation values). The table sequence database, EMBL, and DDBJ databases). The displays databased on the BLASTXnr algorithm with values Sequences were analyzed for Similarity to all publicly avail reported in expect values. The Expect value estimates the able DNA sequences contained in the “nr” database using Statistical significance of the match, Specifying the number the BLASTN algorithm provided by the National Center for of matches, with a given Score, that are expected in a Search Biotechnology Information (NCBI). The DNA sequences of a database of this size absolutely by chance.

TABLE 6 Identification of Critical Methylomonas Genes Based on Sequence Homology % Gene SEO ID % Similarity Name Similarity Identified SEQ ID peptide Identity b E-value Citation Phosphofructokinase Phosphofructokinase 1. 2 63% 83%, 1.7e-97 Ladror et al., J. Biol. Chem. 266, pyrophosphate pyrophosphate 16550-16555 (1991) dependent dependent 675.1(M67447) (AL352972) 3 4 59% 72%. 1e-64 Redenbach et al., Mol. Microbiol. 21 KHG/KDPG aldolase (1), 77-96 (1996) Streptomyces coelicolor dxs 1-deoxyxylulose-5- 5 6 60% 86% 5.7e-149 Lois et al., Proc. Natl. Acad. Sci. phosphate synthase USA. 95 (5), 2105-2110 (1998) (E. coli) dXr 1-deoxy-d-xylulose 7 8 55% 78% 3.3e-74 Takahashi et al., Proc. Natl. Acad. 5-phosphate USA95:9879–9884 (1998) reductoisomerase (E. coli) ygbPfispD 2C-methyl-d- 9 1O 52% 74%, 7.7e-36 Rohdich et al., Proc Natl Acad Sci erythrito USA 1999 Oct 12:96(21):11758-63 cytidylyltransferase (E. coli) ychB/IspE 4-diphosphocytidyl-2- 11 12 50% 73% 8.8e-49 Luttgen et al., Proc Natl Acad Sci C-methylerythritol USA 2000 Feb 1:97(3):1062–7. kinase (E. coli) ygbB/ispF 2C-methyl-d- 13 14 69% 84%, 1.6e-36 Herz et al., Proc Natl Acad Sci US erythritol 2,4- A 2000 Mar 14:97(6):2486-90 cyclodiphosphate synthase (E. coli) pyrCi CTP synthase 15 16 67% 89% 2.4e-141 Weng. et al., J. Biol. Chem. (E. coli) 261:5568-5574 (1986) lytB Acinetobacter sp 17 18 65 87 3.4e–75 Genbanki G.I. 5915671 BD413 Putative penicillin binding protein lspA Geranyltranstransfer 19 2O 57% 78%, 7.8e-56 Ohto,et al., Plant Mol. Biol. 40 (2), ase (also farnesyl 307-321 (1999) diphosphate synthase) (Synechococcus elongatus) crtN1 diapophytoene 21 22 34% 72% 4e-66 Xiong, et al., Proc. Natl. Acad. Sci. dehydrogenase U.S.A. 95 (25), 14851–14856 (1998) CrtN-copy 1 (Heliobacillus mobilis) US 2003/0003528A1 Jan. 2, 2003 20

TABLE 6-continued Identification of Critical Methylomonas Genes Based on Sequence Homology % Gene SEO ID % Similarity Name Similarity Identified SEQ ID peptide Identity b E-value Citation crtN2 Diapophytoene 23 24 49% 78%. 13e-76 Genbank #: X97985 dehydrogenase CrtN-copy 2 (Staphylococcus aureus)

Example 3 ug; Gibco BRL, Gaithersburg, Md.) were diluted with RNAase-free water to a volume of 25 till. The sample was Microarray For Gene Expression. In Methylomonas denatured at 70° C. for 10 min and then chilled on ice for 30 16a Sec. After adding 14 till of labeling mixture, the annealing was accomplished by incubation at room temperature for 10 0204 All bacterial ORFs of Methylomonas were pre min. The labeling mixture contained 8 till of 5x enzyme pared for DNA microarray. The following Example presents buffer, 4 uL DTT (0.1M), and 2 ul of 20x dye mixture. The the Specific protocols utilized for microarray analysis. dye mixture consisted of 2 mM of each dATP, dGTP, and 0205 Amplification of DNA Regions for the Construc dTTP, 1 mM dCTP, and 1 mM of Cy3-dCTP or Cy5-dCTP. tion of DNA Microarray. After adding 1 to 1.5 till of SuperScript II reverse tran Scriptase (200 units/mL, Life Technologies Inc., Gaithers 0206 Specific primer pairs were used to amplify each berg, Md.), cDNA synthesis was allowed to proceed at 42 protein specifying ORF of Methylomonas sp. strain 16a. C. for 2 hr. The RNA was removed by adding 2 ul, NaOH Genomic DNA (10-30 ng) was used as the template. The (2.5 N) to the reaction. After 10 min of incubation at 37 C., PCR reactions were performed in the presence of HotStart the pH was adjusted with 10 uL of HEPES (2 M). The Taq'TM DNA polymerase (Qiagen, Valencia, Calif.) and labeled cDNA was then purified with a PCR purification kit dNTPs (Gibco BRL Life Science Technologies, Gaithers (Qiagen, Valencia, Calif.). Labeling efficiency was moni berg, Md.). Thirty-five cycles of denaturation at 95 C. for tored using either Asso for Cy3 incorporation, or Aso for 30 sec, annealing at 55 C. for 30 sec, and polymerization at Cy5. 72° C. for 2 min were conducted. The quality of PCR reactions was checked with electrophresis in a 1% argarose 0212 Fluorescent Labeling of Genomic DNA. gel. The DNA samples were purified by the high-throughput 0213 Genomic DNA was nebulized to approximately 2 PCR purification kit from Qiagen. kb pair fragments. Genomic DNA (0.5 to 1 lug) was mixed with 6 ug of random hexamers primers (Gibco BRL Life 0207 Arraying Amplified ORFs. Science Technologies, Gaithersburg, Md.) in 15uL of water. 0208 Before arraying, an equal volume of DMSO (10 The mix was denatured by placement in boiling water for 5 uL) and DNA (10 uI) sample was mixed in 384-well min, followed by annealing on ice for 30 sec before transfer microtiter plates. A generation 11 DNA spotter (Molecular to room temperature. Then, 2 ul. 5x Buffer 2 (Gibco BRL) Dynamics, Sunnyvale, Calif.) was used to array the samples and 2 ul dye mixture were added. The components of the dye onto coated glass slides (Telechem, Sunnyvale, Calif.). Each mixture and the labeling procedure are the same as described PCR product was arrayed in duplicate on each slide. After above for RNA labeling, except that the Klenow fragment of cross-linking by UV light, the slides were stored under DNA polymerase 1 (5ug?uL, Gibco BRL) was used as the Vacuum in a desiccator at room temperature. enzyme. After incubation at 37° C for 2 hr, the labeled DNA probe was purified using a PCR purification kit (Qiagen, 0209 RNA Isolation. Valencia, Calif.). 0210 Methylomonas 16a was cultured in a defined 0214) Hybridization and Washing. medium with ammonium or nitrate (10 mM) as a nitrogen Source under 25% methane in air. Samples of the minimal 0215 Slides were first incubated with prehybridization medium culture were harvested when the O.D. reached 0.3 solution containing 3.5xSSC (Gibco BRL, Gaithersberg, at Asoo (exponential phase). Cell cultures were harvested Md.), 0.1% SDS (Gibco BRL), 1% bovine serum albumin quickly and ruptured in RLT buffer (Qiagen RNeasy Mini (BSA, Fraction V, Sigma, St. Louis, Mo.). After prehybrid Kit, Valencia, Calif.) with a beads-beater (Biol(01, Vista, ization, hybridization Solutions (Molecular Dynamics, Calif.). Debris was pelleted by centrifugation for 3 min at Sunnyvale, Calif.) containing labeled probes were added to 14,000x g at 4 C. RNA isolation was completed using the Slides and covered with cover Slips. Slides were placed in a protocol supplied with this kit. After on-column DNAase humidified chamber in a 42 C. incubator. After overnight treatment, the RNA product was eluted with 50-100 till hybridization, slides were initially washed for 5 min at room RNAase-free water. RNA preparations were stored frozen at temperature with a washing Solution containing 1XSSC, either -20 or -80 C. 0.1% SDS and 0.1XSSC, 0.1% SDS. Slides were then washed at 65 C. for 10 min with the same solution for three 0211 Synthesis of Fluorescent cDNA from Total RNA. times. After washing, the Slides were dried with a stream of RNA samples (7 to 15ug) and random hexamer primers (6 nitrogen gas. US 2003/0003528A1 Jan. 2, 2003

0216) Data Collection and Analysis. central metabolism of methanotrophic bacteria (Dijkhuizen, 0217. The signal generated from each slide was quanti L., et al. The physiology and biochemistry of aerobic fied with a laser Scanner (Molecular Dynamics, Sunnyvale, methanol-utilizing gram-negative and gram-positive bacte Calif.). The images were analyzed with Array Vision 4.0 ria. In: Methane and Methanol Utilizers, Biotechnology Software (Imaging Research, Inc., Ontario, Canada). The Handbooks 5, 1992. Eds: Colin Murrell, Howard Dalton; pp raw fluorescent intensity for each spot was adjusted by 149-157). Subtracting the background. These readings were exported to a spreadsheet for further analysis. Example 5 Example 4 Direct Enzymatic Evidence For A Pyrophosphate-Linked Phosphofructokinase Comparison Of Gene Expression Levels In The 0222. This example shows the evidence for the presence Entner Douderoff Pathway. As Compared With The of a pyrophosphate-linked phosphofructokinase enzyme in Embeden Meverof Pathway the current Strain, thereby confirming the functionality of the 0218. This Example presents microarray evidence dem Embden-Meyerhoff pathway in the present Methylomonas onstrating the use of the Embden-Meyerhoff pathway for Strain. carbon metabolism in the 16a Strain. 0223 Phosphofructokinase activity was shown to be 0219 FIG. 2 shows the relative levels of expression of present in Methylomonas 16a by using the coupled enzyme genes for the Entner-Douderoff pathway and the Embden assay described below. ASSay conditions are given in Table Meyerhoff pathway. The relative transcriptional activity of 7 below. each gene was estimated with DNA microarray as described previously (Example 3; Wei, et al., J. Bact. 183:545-556 0224 Coupled Assay Reactions (2001)). 0225. Phosphofructokinase reaction is measured by a 0220 Specifically, a single DNA microarray containing coupled enzyme assay. Phosphofructokinase reaction is 4000 ORFs (open reading frames) of Methylomonas 16a coupled with fructose 1,6, biphosphate aldolase followed by was hybridized with probes generated from genomic DNA triosephosphate isomerase. The enzyme activity is measured and total RNA. The genomic DNA of 16a was labeled with by the disappearance of NADH. the Klenow fragment of DNA polymerase and fluorescent dye Cy-5, while the total RNA was labeled with reverse 0226 Specifically, the enzyme phosphofructokinase cata transcriptase and Cy-3. After hybridization, the Signal inten lyzes the key reaction converting fructose 6 phosphate and sities of both Cy-3 and Cy-5 for each spot in the array were pyrophosphate to fructose 1.6 bisphosphate and orthophos quantified. The intensity ratio of Cy-3 and Cy-5 was then phate. Fructose-1,6-bisphosphate is cleaved to 3-phospho used to calculate the fraction of each transcript (as a per glyceraldehyde and dihydroxyacetonephosphate by fructose centage), according to the following formula: (gene ratio/ 1,6-bisphosphate aldolase. Dihydroxyacetonephosphate is sum of all ratio)x 100. The value obtained reflects the isomerized to 3-phosphoglyceraldehyde by triosephosphate relative abundance of mRNA of an individual gene. Accord isomerase. Glycerol phosphate dehydrogenase plus NADH ingly, transcriptional activity of all the genes represented by and 3-phosphoglyceraldehyde yields the alcohol glycerol the array can be ranked based on its relative mRNA abun 3-phosphate and NAD. Disappearance of NADH is moni dance in a descending order. The numbers in FIG. 2 next to tored at 340 nm using spectrophotometer (UltraSpec 4000, each Step indicate the relative expression level of that Pharmacia Biotech). enzyme. For example, mRNA abundance for the methane monooxygenase was the most highly expressed enzyme in TABLE 7 the cell (ranked #1) because its genes had the highest Assay Protocol transcriptional activity when the organism was grown with methane as the carbon source (FIG.2). The next most highly Volume (ul) Final assay expressed enzyme is methanol dehydrogenase (ranked #2). Stock solution per 1 mL total concentration The heXulose-monophosphate Synthase gene is one of the Reagent (mM) reaction volume (mM) ten most highly expressed genes in cells grown on methane. Tris-HCl pH 7.5 1OOO 1OO 1OO MgCl2.2H2O 1OO 35 3.5 0221) The genes considered “diagnostic' for Entner NaPO7.1OHO 1OO 2O 2 Douderoff pathway are the 6-phosphogluconate dehydratase or ATP Fructose-6- 1OO 2O 2 and the 2 keto-3-deoxy-6-phosphogluconate aldolase. In phophate contrast, the phosphofructokinase and fructose bisphosphate NADH 50 6 O.3 aldolase are “diagnostic' of the Embden-Meyerhoff Fructose 100 (units/mL) 2O 2 (units) Sequence. Messenger RNA transcripts of phosphofructoki bisphosphate aldolase nase (ranked #232) and fructose bisphosphate aldolase Triose phosphate (7.2 unitsful) 3.69 27 units (ranked #65) were in higher abundance than those for isomerase/glycerol (0.5 unitsful) 1.8 units glucose 6 phosphate dehydrogenase (ranked #717), 6 phos phosphate dehydrogenase phogluconate dehydratase (ranked #763) or the 2-keto-3- KC 1OOO 50 50 deoxy-6-gluconate aldolase. The data Suggests that the Emb H2O adjust to 1 mL den-Meyerhoff pathway enzymes are more Strongly Crude extract 0-50 expressed than the Entner-Douderoff pathway enzymes. This result is Surprising and counter to existing beliefs on the US 2003/0003528A1 Jan. 2, 2003 22

0227. This coupled enzyme assay was further used to assay the activity in a number of other methanotrophic bacteria as shown below in Table 8. The data in Table 8 ATGACGGTCTGCGCAAAAAAACACG SEQ ID NO : 43 shows known ATCC strains tested for phosphofructokinase GAGAAATTATGTTGTGGATTTGGAATGC SEQ ID NO: 44 activity with ATP or pyrophosphate as the phosphoryl donor. These organisms were classified as either a Type I or Type 0230 Chromosomal DNA was purified from Pantoea X ribulose monophosphate-utilizing Strains or a Type II Stewati (ATCC no. 8.199) and PfuTurbo polymerase (Strat Serine utilizer. Established literature makes these types of agene, La Jolla, Calif.) was used in a PCR amplification classifications based on the mode of carbon incorporation, reaction under the following conditions: 94 C, 5 min; 94 morphology, 7% GC content and the presence or absence of C (1 min)-60° C (1 min)-72°C (10 min) for 25 cycles, and key specific enzymes in the organism. 72°C for 10 min. A single product of approximately 6.5 kb was observed following gel electrophoresis. Taq polymerase TABLE 8 (PerkinElmer) was used in a ten min 72 C. reaction to add additional 3' adenoside nucleotides to the fragment for Comparison Of Pyrophosphate Linked And ATP Linked TOPO cloning into pCR4-TOPO (Invitrogen, Carlsbad, Phosphofructokinase Activity. In Calif.). Following transformation to E. coli DH5C. (Life Different Methanotrophic Bacteria Technologies, Rockville, Md.) by electroporation, several Assimi- ATP-PFK Ppi-PFK colonies appeared to be bright yellow in color, indicating ation umol NADHI umol NADH/ that they were producing a carotenoid compound. Following Strain Type Pathway min/mg min/mg plasmid isolation as instructed by the manufacturer using the Methylomonas 16a I Ribulose O 2.8 Qiagen (Valencia, Calif.) miniprep kit, the plasmid contain ATCC PTA 2402 OO ing the 6.5 kb amplified fragment was transposed with OSlate Methylomonas I Ribulose O.O1 3.5 pGPS1.1 using the GPS-1 Genome Priming System kit agile OO (New England Biolabs, Inc., Beverly, Mass.). A number of ATCC 35068 OSlate these transposed plasmids were Sequenced from each end of Methylobacter I Ribulose O.O1 O.O25 the transposon. Sequence was generated on an ABI Auto Whittenbury OO ATCC 51738 OSlate matic sequencer using dye terminator technology (U.S. Pat. Methylomonas I Ribulose O O.3 No. 5,366,860; EP 272007) using transposon specific prim clara OO erS. Sequence assembly was performed with the Sequencher ATCC 31226 OSlate Methylomicrobium I Ribulose O.O2 3.6 program (Gene Codes Corp., Ann Arbor Mich.). albuS OO ATCC 33003 OSlate Example 7 Methylococcus X Ribulose O.O1 O.O4 capsulatus OO ATCC 19069 OSlate Cloning of Rhodococcus erythropolis crtO Methylosinus II Serine O.O7 0.4 Sporium 0231. The present example describes the isolation, ATCC 35069 Sequencing, and identification of a carotenoid biosynthetic pathway gene from Rhodococcus erythropolis AN12. 0232) Isolation and Characterization of Strain AN12 0228 Several conclusions may be drawn from the data presented above. First, it is clear that ATP (which is the 0233 Strain AN12 of Rhodococcus erythropolis was typical phosphoryl donor for phosphofructokinase) is essen isolatd on the basis of being able to grow on aniline as the tially ineffective in the phosphofructokinase reaction in Sole Source of carbon and energy. Bacteria that grew on methanotrophic bacteria. Only inorganic pyrophosphate was aniline were isolated from an enrichment culture. The found to Support the reaction in all methanotrophs tested. enrichment culture was established by inoculating 1 ml of Secondly, not all methanotrophs contain this activity. The activated sludge into 10 ml of S12 medium (10 mM ammo activity was essentially absent in Methylobacter whittenbury nium sulfate, 50 mM potassium phosphate buffer (pH 7.0), 2 mM MgCl, 0.7 mM CaCl, 50 uM MnCl, 1 uM FeCls, and in MethylococcuS capsulatus. Intermediate levels of 1 uM ZnCls, 1.72 uM CuSO, 2.53 uM CoCl2, 2.42 uM activity were found in Methylomonas clara and Methylosi NaMoO, and 0.0001 % FeSO) in a 125 ml screw cap nuS Sporium. These data Show that many methanotrophic Erlenmeyer flask. The activated sludge was obtained from a bacteria may contain a hitherto unreported phosphofructoki wastewater treatment facility. The enrichment culture was nase activity. It may be inferred from this that methanotro supplemented with 100 ppm aniline added directly to the phs containing this activity have an active Embden-Meyer culture medium and was incubated at 25 C. with reciprocal hoff pathway. Shaking. The enrichment culture was maintained by adding 100 ppm of aniline every 2-3 days. The culture was diluted Example 6 every 14 days by replacing 9.9 ml of the culture with the same volume of S12 medium. Bacteria that utilized aniline Cloning of Carotenoid Genes from Pantoea as a Sole Source of carbon and energy were isolated by Stewartii Spreading Samples of the enrichment culture onto S12 agar. Aniline (5 pl.) was placed on the interior of each petri dish 0229. Primers were designed using the sequence from lid. The petridishes were sealed with parafilm and incubated PantOea ananatis to amplify a fragment by PCR containing upside down at room temperature (approximately 25 C.). a crt cluster of genes. These Sequences included 5'-3': Representative bacterial colonies were then tested for the US 2003/0003528A1 Jan. 2, 2003

ability to use aniline as a Sole Source of carbon and energy. 0239 Library Construction. Colonies were transferred from the original S12 agar plates 0240 200 to 500 lug of chromosomal DNA was resus used for initial isolation to new S12 agar plates and Supplied pended in a solution of 300 mM sodium acetate, 10 mM with aniline on the interior of each petri dish lid. The petri Tris-HCl, 1 mM Na-EDTA, and 30% glycerol, and sheared dishes were Sealed with parafilm and incubated upside down at 12 psi for 60 sec in an Aeromist Downdraft Nebulizer at room temperature (approximately 25 C.). chamber (IBI Medical products, Chicago, Ill.). The DNA 0234. The 16S rRNA genes of each isolate were ampli was precipitated, resuspended and treated with Bal31 fied by PCR and analyzed as follows. Each isolate was nuclease (New England Biolabs, Beverly, Mass.). After size grown on R2A agar (Difco Laboratories, Bedford, Mass.). fractionation by 0.8% agarose gel electrophoresis, a fraction Several colonies from a culture plate were suspended in 100 (2.0 kb, or 5.0 kb) was excised, cleaned and a two-step All of water. The mixture was frozen and then thawed once. ligation procedure was used to produce a high titer library The 16S rRNA gene sequences were amplified by PCR with greater than 99% single inserts. using a commercial kit according to the manufacturer's instructions (Perkin Elmer) with primers HK12 (5'- 0241 Sequencing. GAGTTTGATCCTGGCTCAG-3) (SEQ ID NO:45) and 0242 A shotgun sequencing Strategy approach was HK13 (5'-TACCTTGTTACGACTT-3) (SEQ ID NO:46). adopted for the Sequencing of the whole microbial genome PCR was performed in a Perkin Elmer Gene Amp 9600 (Fleischmann, Robert et al., Whole-Genome Random (Norwalk, Conn.). The samples were incubated for 5 min at Sequencing and assembly of Haemophilus influenzae Rd 94° C. and then cycled 35 times at 94° C. for 30 sec, 55° C. Science, 269:1995). for 1 min, and 72° C. for 1 min. The amplified 16S rRNA genes were purified using a commercial kit according to the 0243 Sequence was generated on an ABI Automatic manufacturer's instructions (QIAquick PCR Purification Sequencer using dye terminator technology (U.S. Pat. No. Kit, Qiagen, Valencia, Calif.) and Sequenced on an auto 5,366,860; EP 272007) using a combination of vector and mated ABI sequencer. The Sequencing reactions were initi insert-specific primerS. Sequence editing was performed in ated with primers HK12, HK13, and HK14 (5'-GTGCCAG either DNAStar (DNA Star Inc., Madison, Wis.) or the CAGYMGCGGT-3) (SEQ ID NO:47, where Y=C or T, Wisconsin GCG program (Wisconsin Package Version 9.0, M=A or C). The 16S rRNA gene sequence of each isolate Genetics Computer Group (GCG), Madison, Wis.) and the was used as the query Sequence for a BLAST Search CONSED package (version 7.0). All sequences represent (Altschul, et al., Nucleic Acids Res. 25:3389-3402(1997)) of coverage at least two times in both directions. GenBank for Similar Sequences. 0244 Sequence Analysis of CrtO 0235 A 16S rRNA gene of strain AN12 was sequenced 0245. Two ORFs were identified in the genomic sequence and compared to other 16S rRNA sequences in the GenBank of Rhodococcus erythropolis AN12 which shared homology Sequence database. The 16S rRNA gene Sequence from to two different phytoene dehydrogenases. One ORF was strain AN12 was at least 98% similar to the 16S rRNA gene designated Crtland had the highest homology (45% identity, Sequences of high G+C Gram positive bacteria belonging to 56% similarity) to a putative phytoene dehydrogenase from the genus RhodococcuS. Streptomyces coelicolor A3(2). The other ORF (originally designated as Crtl2, now as CrtO) had the highest homology 0236 Preparation of Genomic DNA for Sequencing and (35% identity, 50% similarity; White O. et al Science 286 Sequence Generation (5444),1571-1577 (1999)) to a probable phytoene dehydro 0237 Genomic DNA Preparation. genase DRO093 from Deinococcus radiodurans. Subse quent examination of the protein by motif analysis indicated 0238 Rhodococcus erythropolis AN12 was grown in 25 that the crtO might function as a ketolase. mLNBYE medium (0.8% nutrient broth, 0.5% yeast extract, 0.05% Tween 80) till mid-log phase at 37° C. with aeration. 0246 In Vitro Assay for Ketolase Activity of Rhodococ Bacterial cells were centrifuged at 4,000 g for 30 min at 4 cuS CrtO C. The cell pellet was washed once with 20 ml 50 mM 0247 To confirm if crtO encoded a ketolase, the Rhodo NaCO, containing 1 M KCl (pH 10) and then with 20 ml 50 coccuS crtO gene in E. coli was expressed was assayed for mM NaOAc (pH 5). The cell pellet was gently resuspended the presence of ketolase activity in Vitro. The crtO gene was in 5 ml of 50 mM Tris-10 mM EDTA (pH 8) and lysozyme amplified from AN12 using the primers crt12-N: was added to a final concentration of 2 mg/mL. The SuS ATGAGCGCATTTCTCGACGCC (SEQ ID NO:48) and pension was incubated at 37 C. for 2 h. Sodium dodecyl crt12-C. TCACGACCTGCTCGAACGAC (SEQ ID Sulfate was then added to a final concentration of 1% and NO:49). The amplified 1599 bp full-length crtO gene was proteinase K was added to 100 ug/ml final concentration. cloned into pTrcHis2-TOPO cloning vector (Invitrogen, The suspension was incubated at 55 C. for 5 h. The Carlsbad, Calif.) and transformed into TOP10 cells follow Suspension became clear and the clear lysate was extracted ing manufacture's instructions. The construct (designated with equal volume of phenol:chloroform:isoamyl alcohol pDCQ117) containing the crt0 gene cloned in the forward (25:24:1). After centrifuging at 17,000 g for 20 min, the orientation respective to the trc promoter on the Vector was aqueous phase was carefully removed and transferred to a confirmed by restriction analysis and Sequencing. new tube. Two volumes of ethanol were added and the DNA 0248. The in vitro enzyme assay was performed using was gently Spooled with a Sealed glass pasteur pipet. The crude cell extract from E. coli TOP10 (pDCQ117) cells DNA was dipped into a tube containing 70% ethanol, then expressing crtO. 100 ml of LB medium containing 100 air dried. After air drying, DNA was resuspended in 400 ul tug/ml amplicillin was inoculated with 1 ml fresh overnight of TE (10 mM Tris-1 mM EDTA, pH 8) with RNaseA (100 culture of TOP10 (pDCQ117) cells. Cells were grown at 37 Aug/mL) and Stored at 4 C. C. with shaking at 300 rpm until ODoo reached 0.6. Cells US 2003/0003528A1 Jan. 2, 2003 24 were then induced with 0.1 mM IPTG and continued grow extracts of control cells that would not use B-caroteine as the ing for additional 3 hrs. Cell pellets harvested from 50 ml Substrate. No product peaks were detected in the control culture by centrifugation (4000 g, 15 min) were frozen and reaction mixture. thawed once, and resuspended in 2 ml ice cold 50 mM Tris-HCl (pH7.5) containing 0.25% TritonX-100. 10 ug of 0250) In Summary, the in vitro assay data confirmed that f-caroteine Substrate (Spectrum Laboratory Products, Inc.) crtO encodes a ketolase, which converted B-caroteine into in 50 ul of acetone was added to the Suspension and mixed canthaxanthin (two ketone groups) via echinenone (one by pipetting. The mixture was divided into two tubes and ketone group) as the intermediate. This symmetric ketolase 250 mg of zirconia/silica beads (0.1 mm, BioSpec Products, activity of Rhodococcus CrtO is different from what was Inc, Bartlesville, Okla.) was added to each tube. Cells were reported for the asymmetric function of Synechocystis CrtO. broken by bead beating for 2 min, and cell debris was removed by spinning at 10000 rpm for 2 min in an Eppen dorf microcentrifuge 5414C. The combined supernatant (2 TABLE 9 ml) was diluted with 3 ml of 50 mM Tris pH 7.5 buffer in a 50 ml flask, and the reaction mixture was incubated at 30 HPLC Analysis Of The In Vitro Reaction C. with shaking at 150 rpm for different lengths of time. The Mixtures With Rhodococcus CrtO reaction was stopped by addition of 5 ml methanol and extraction with 5 ml diethyl ether. 500 mg of NaCl was Canthaxanthin Echinenone 3-caroteine added to Separate the two phases for extraction. Carotenoids 474 mm 459 mm 449 mm 474 mm in the upper diethyl ether phase was collected and dried 13.8 min 14.8 min 15.8 min under nitrogen. The carotenoids were re-dissolved in 0.5 ml of methanol, for HLPC analysis, using a Beckman System Ohr O% O% 100% Gold(E) HPLC with Beckman Gold Nouveau Software 2 hr O% 14% 86% (Columbia, Md.). 0.1 ml of the crude acetone extraction was loaded onto a 125x4 mm RP8 (5 um particles) column with 16 hr 16% 28% 56% corresponding guard column (Hewlett-Packard, San 20hr 30% 35% 35% Fernando, Calif.). The flow rate was 1 ml/min and the Solvent program was 0-11.5 min 40% water/60% methanol, 11.5-20 min 100% methanol, 20-30 min 40% water/60% Example 8 methanol. Spectral data was collected using a Beckman photodiode array detector (model 168). 0251 All sequences from Examples 6 and 7 were iden 0249 Three peaks were identified at 470 nm in the 16 hr tified by conducting BLAST (Basic Local Alignment Search reaction mixture. When compared to Standards, it was deter Tool; Altschul, S. F., et al., (1993).J. Mol. Biol. 215:403-410; mined that the peak with a retention time of 15.8 min was See also www.ncbi.nlm.nih.gov/BLAST) searches for simi B-caroteine and the peak with retention time of 13.8 min was larity to sequences contained in the BLAST “nr” database, canthaxanthin. The peak at 14.8 min was most likely echinenone, the intermediate with only one ketone group according to the methodology of Example 2. addition. In the 2 hr reaction mixture, the echinenone intermediate was the only reaction product and no canthax 0252) The results of these BLAST comparisons are given anthin was produced. Longer incubation times resulted in below in Table 10. The table displays data based on the higher levels of echinenone and the appearance of a peak BLASTXnr algorithm with values reported in expect values. corresponding to canthaxanthin. Canthaxanthin is the final The Expect value estimates the Statistical significance of the product in this step representing the addition of two ketone match, Specifying the number of matches, with a given groups (Table 9). To confirm that the ketolase activity was Score, that are expected in a Search of a database of this size Specific for crtO gene, the assay was also performed with absolutely by chance.

TABLE 10 Identification of Carotenoid Genes Based on Sequence Homology % Gene SEO ID % Similarity Name Similarity Identified SEQ ID Peptide Identity b E-value Citation crtE Geranylgeranyl pryophosphate 25 26 83 88 e-137 Misawa et al., J. Bacteriol. 172 synthetase (or GGPP synthetase, (12), 6704-6712 (1990) or farnesyltranstransferase) EC 2.5.1.29 gi117509sp|P21684|CRTE PAN ANGERANYLGERANYL PYROPHOSPHATE SYNTHETASE (GGPP SYNTHETASE) (FARNESYLTRANSTRANSFERA SE) US 2003/0003528A1 Jan. 2, 2003 25

TABLE 10-continued Identification of Carotenoid Genes Based on Sequence Homology % Gene SEO ID % Similarity Name Similarity Identified SEQ ID Peptide Identity b E-value Citation crtX Zeaxanthin glucosyl transferase 27 28 75 79 O.O Lin et al., Mol. Gen. Genet. EC 2.4.1.- 245 (4), 417–423 (1994) gi1073294pir|S52583 crtX protein - Erwinia herbicola C Lycopene cyclase 29 3O 83 91 O.O Lin et al., Mol. Gen. Genet. gi1073295pir|S52585 dycopene 245 (4), 417–423 (1994) cyclase - Erwinia herbicola crt Phytoene desaturaseEC 1.3.-- 31 32 89 91 O.O Lin et al., Mol. Gen. Genet. gi1073299pir|S52586 phytoene 245 (4), 417–423 (1994) dehydrogenase (EC 1.3.--)- Erwinia herbicola C Phytoene synthaseEC2.5.1.- 33 34 88 92 e-150 Lin et al., Mol. Gen. Genet. gi1073300pir|S52587 245 (4), 417–423 (1994) prephytoene pyrophosphate synthase - Erwinia herbicola C -carotene hydroxylase 35 36 88 91 3e-88 Misawa et al., J. Bacteriol. 172 gi117526sp|P21688CRTZ PAN (12, 6704-6712 (1990) AN-CAROTENE HYDROXYLASE C sIrO088 - Synechocystis 37 38 35 64% White O. et al Science 286 hypothetical protein (5444), 1571–1577 (1999) Fernández-González, et al., J. Biol. Chem., 1997, 272:9728 9733

Example 9 gene of the broad host range vector, pBHR1 (MoBiTec, LLC, Marco Island, Fla.). pBHR1 (500ng) was linearized by Expression of B-carotene in Methylomonas 16A digestion with EcoRI (New England Biolabs, Beverly, Growing on Methane Mass.) and then dephosphorylated with calf intestinal alka line phosphatase (Gibco/BRL, Rockville, Md.). pCR4-crt 0253) The crt gene cluster comprising the crtEXYIBZ was digested with EcoRI and the 6.3 kb EcoRI fragment genes from Pantoea Stewartii (Example 6) was introduced containing the crt gene cluster (crtEXYIB) was purified into Methylomonas 16a to enable the synthesis of desirable following gel electrophoresis in 0.8% agarose (TAE). This 40-carbon carotenoids. DNA fragment was ligated to EcoRI-digested pBHR1 and 0254 Primers were designed using the sequence from the ligated DNA was used to transform E. coli DH5C. by Erwinia uredovora to amplify a fragment by PCR containing electroporation. Transformants were Selected on LB medium the crt genes. These Sequences included 5'-3': containing 50 ug/ml kanamycin. 0257 Several isolates were found to be sensitive to chloramphenicol (25 ug/ml) and demonstrated a yellow ATGACGGTCTGCGCAAAAAAACACG SEQ ID 43 colony phenotype after overnight incubation at 37 C. Analysis of the plasmid DNA from these transformants GAGAAATTATGTTGTGGATTTGGAATGC SEQ ID 44 confirmed the presence of the crt gene cluster cloned in the Same orientation as the pBHR1 chloramphenicol-resistance 0255 Chromosomal DNA was purified from Pantoea gene and this plasmid was designated pCrt1 (FIG. 3). In Stewadtii (ATCC no. 8.199) and Pfu Turbo polymerase contrast, analysis of the plasmid DNA from transformants (Stratagene, La Jolla, Calif.) was used in a PCR amplification demonstrating a white colony phenotype confirmed the reaction under the following conditions: 94 C, 5 min; 94 presence of the crt gene cluster cloned in the opposite C (1 min)-60° C (1 min)-72°C (10 min) for 25 cycles, and orientation as the pBHR1 chloramphenicol-resistance gene 72°C for 10 min. A single product of approximately 6.5 kb and this plasmid was designated pCrt2. These results Sug was observed following gel electrophoresis. Taq polymerase gested that functional expression of the crt gene cluster was (Perkin Elmer) was used in a ten minute 72 C. reaction to directed from the pBHR1 cat promoter. add additional 3' adenoside nucleotides to the fragment for TOPO cloning into pCR4-TOPO (Invitrogen, Carlsbad, 0258 Plasmid pcrt1 was transferred into Methylomonas Calif.). Following transformation to E. coli DH5C. (Life 16a by tri-parental conjugal mating. The E. Coli helper Strain Technologies, Rockville, Md.) by electroproation, several containing pRK2013 and the E. coli DH5a donor strain colonies appeared to be bright yellow in color indicating that containing pCrt1 were grown overnight in LB medium they were producing a carotenoid compound containing kanamycin (50 ug/mL), washed three times in LB, and resuspended in a volume of LB representing 0256 For introduction into Methylomonas 16a, the crt approximately a 60-fold concentration of the original culture gene cluster from pCR4-crt was first subcloned into the volume. The Methylomonas 16a recipient was grown for 48 unique EcoRI site within the chloramphenicol-resistance hours in Nitrate liquid “BTZ-3” medium (General Methods) US 2003/0003528A1 Jan. 2, 2003 26

in an atmosphere containing 25% (v/v) methane, washed control of the E. coli trc promoter and were designated as three times in BTZ-3, and resuspended in a volume of pCrt3. The plasmid map for this pCrt3 construct is shown in BTZ-3 representing a 150-fold concentration of the original culture Volume. The donor, helper, and recipient cell pastes FIG. 4. The P promoter is illustrated with a small bold were combined on the Surface of BTZ-3 agar plates con black arrow, in contrast to the large wide arrows, represent taining 0.5% (w/v) yeast extract in ratios of 1:1:2 respec ing Specific genes as labeled. tively. Plates were maintained at 30° C. in 25% methane for 16-72 hours to allow conjugation to occur, after which the 0262 Plasmid pCrt3 was transferred into Methylomonas cell pastes were collected and resuspended in BTZ-3. Dilu 16a by tri-parental conjugal mating, as described above for tions were plated on BTZ-3 agar containing kanamycin (50 pCrt1 (Example 9). Transconjugants containing this plasmid ug/mL) and incubated at 30° C. in 25% methane for up to 1 demonstrated yellow colony color following growth on week. Transconjugants were Streaked onto BTZ-3 agar with kanamycin (50 ug/mL) for isolation. Analysis of plasmid BTZ-3 agar with kanamycin (50 ug/mL) and methane as the DNA isolated from these transconjugants confirmed the Sole carbon Source. presence of pCrt1 (FIG. 3). 0263 HPLC analysis of extracts from Methylomonas 0259 For analysis of carotenoid composition, transcon 16A containing pCrt3 revealed the presence of Zeaxanthin jugants were cultured in 25 ml BTZ-3 containing kanamycin and its mono- and diglucosides. These results are shown in (50 ug/mL) and incubated at 30° C. in 25% methane as the FIG. 4. The left panel shows the HPLC profile of extracts Sole carbon Source for up to 1 week. The cells were harvested by centrifugation and frozen at -20° C. After from Methylomonas 16A or Methylomonas 16A containing thawing, the pellets were extracted and carotenoid content the pcrt3. The right panel shows the UV spectra of the was analyzed by HPLC according to the methodology of the individual peaks displayed in the HPLC profile and dem General Methods. onstrate the Synthesis of Zeaxanthin and its mono- and di-glucosides in Methylomonas 16A containing pcrt3. These 0260 HPLC analysis of extracts from Methylomonas 16a containing pCrt1 confirmed the Synthesis of B-carotene. The results suggested that the crtEXYIB genes were functionally left panel of FIG. 3 shows the HPLC results obtained using expressed from the trc promoter while the crtz gene was the B-caroteine Standard and a single peak is present at transcribed in the opposite orientation from the p3HR1 cat 15.867 min. Similarly, the right panel of FIG. 3 shows the promoter in Methylomonas 16A. HPLC profile obtained for analysis of Methylomonas 16a transconjugant cultures containing the pCrt1 plasmid. A 0264. One skilled in the art would expect that deletion of similar peak at 15.750 min is indicative of B-carotene in the crtX from this and subsequent plasmids should enable the cultures. production of Zeaxanthin without formation of the mono and di-glucosides. Furthermore, a plasmid in which the Example 10 crtEYIBZ genes are expressed in the same orientation from one or more promoters may be expected to alleviate poten Expression of Zeaxanthin in Methylomonas 16A tial transcriptional interference and enhance the Synthesis of Growing on Methane Zeaxanthin. This would readily be possible using Standard 0261) To enable the synthesis of zeaxanthin in Methy cloning techniques know to those skilled in the art. lomonas 16a, the crt gene cluster from pTrcHis-crt2 (as described above) was subcloned into the chloramphenicol Example 11 resistance gene of the broad host range vector, pFBHR1 (MoBiTec, LLC, Marco Island, Fla.). pBHR1 (500 ng) was Expression of Zeaxanthin in Methylomonas 16A digested sequentially with EcoRI and Scal and the 4876 bp Growing on Methane, With an Optimized HMPS EcoRI-Scal DNA fragment was purified following gel elec Promoter trophoresis in 0.8% agarose (TAE). Plasmid pTrcHis-crt2 0265 Analysis of gene array data following growth of was digested simultaneously with Sspl and EcoRI and the 6491 bp Sspl-EcoRI DNA fragment containing the crt gene Methylomonas 16a on methane Suggested the heXoulose cluster (crtEXYIB) under the transcriptional control of the monophosphate synthase (HMPS) to be one of the ten most E. coli trc promoter was purified following gel electrophore highly expressed genes. Thus, one may use the DNA sis in 0.8% agarose (TAE). The 6491 bp Sspl-EcoRI frag Sequences comprising the HMPS promoter to direct high ment was ligated to the 4876 bp EcoRI-Scal fragment and level expression of heterologous genes, including those in the ligated DNA was used to transform E. coli DH5C. by the P. Stewartii crt gene cluster, in Methylomonas 16A. electroporation. Transformants were Selected on LB medium Analysis of the 5'-DNA sequences upstream from the HMPS containing 50 ug/ml kanamycin. Several kanamycin-resis gene identified potential transcription initiation sites in both tant isolates were also sensitive to chloramphenicol (25 DNA strands using the NNPP/neural network prokaryotic ug/ml) and demonstrated yellow colony color after over promoter prediction program from Baylor College of Medi night incubation at 37 C. Analysis of the plasmid DNA cine Predictions concerning the forward strand of the H6P from these transformants confirmed the presence of the crt synthase are shown below in Table 11; similar results are gene cluster cloned into pBHR1 under the transcriptional shown below in Table 12 for the reverse strand. US 2003/0003528A1 Jan. 2, 2003 27

TABLE 11 Promoter Predictions for H6P synthase-Forward Strand Start End Score Promoter Seguences 63 108 0.93 GAGAATTGGCTGAAAAACCAAATAAATAACAAAATTTAG (SEQ ID NO: 50) CGAGTAAATGG

11.9 164 O. 91 TTCAATTGACAGGGGGGCTCGTTCTGATTTAGAGTTGC (SEQ ID NO:51) TGCCAGCTTTTT

211 256 O. 85 GGGTTGTCCAGATGTTGGTGAGCGGTCCTTATAACTAT (SEQ ID NO: 52) AACTGTAACAAT The transcription start sites are indicated in bold text.

0266

TABLE 12 Promoter Predictions for H6P synthase-Reverse Strand Start End Score Promoter Sequences 284 239 O. 89 TTAATGGTCTTGCCATGAGATGTGCTCCGATTGTTACAG (SEQ ID NO:53) TATAGTTATA 129 84 0.95 CCCCCTGTCAATTGAAAGCCCGCCATTTACTCGCTAAAT (SEQ ID NO:54) TTTGTTATTTA The transcription start sites are indicated in bold text.

0267 Based on these sequences, the following primers crtEXYIB genes under the transcriptional control of the trc were used in a polymerase chain reaction (PCR) to amplify promoter and the crtz gene under the transcriptional control a 240 bp DNA sequence comprising the HMPS promoter of the himps promoter (FIG. 5). from Methylomonas 16a genomic DNA: 0269 Plasmid pCrt4 was transferred into Methylomonas 16a by tri-parental conjugal mating. Transconjugants con taining this plasmid demonstrated yellow colony color fol 5' CCGAGTACTGAAGCGGGTTTTTGCAGGGAG 3' (SEQ ID NO:39) lowing growth on BTZ-3 agar with kanamycin (50 ug/mL) 5' GGGCTAGCTGCTCCGATTGTTACAG 3' (SEQ ID NO: 40) and methane as the sole carbon source. HPLC analysis of extracts from Methylomonas 16a containing pCrt4 revealed 0268. The PCR conditions were: 94° C. for 2 min, the presence of Zeaxanthin, and its mono- and di-glucosides, followed by 35 cycles of 94° C. for 1 min, 50° C. for 1 min thereby confirming expression of the crtz gene. This data is and 72 C. for 2 min, and final extension at 72 C. for 5 min. shown in FIG. 5. Peaks with retention times of 13.38 min, After purification, the 240 bp PCR product was ligated to 12.60 min and 11.58 min correspond to Zeaxanthin, a pCR2.1 (Invitrogen, Carlsbad, Calif.) and transformed into mixture of Zeaxanthin mono-glucosides and Zeaxanthin E. coli DH5C. by electroporation. Analysis of the plasmid diglucoside, respectively, DNA from transformants that demonstrated white colony color on LB agar containing kanamycin (50 ug/ml) and Example 12 X-gal identified the expected plasmid, which was designated pHMPS. PHMPS was digested with EcoRI and the 256 bp Expression of Canthaxanthin and AStaxanthin in DNA fragment containing the HMPS promoter was purified Methylomonas 16A Growing on Methane following gel electrophoresis in 1.5% agarose (TEA). This 0270. To enable the synthesis of canthaxanthin and astax DNA fragment was ligated to pCrt3 previously digested with anthin in Methylomonas 16a, the Rhodococcus erythropolis EcoRI and dephosphorylated with calf intestinal alkaline AN 12 crtO gene encoding f-caroteine ketolase (Example 7) phosphatase. The ligated DNA was used to transform E. coli was cloned into pcrt4. The crtO gene was amplified by PCR DH5C. by electroporation. Analysis of plasmid DNA from from plDCQ117 (Example 7) using the following primers to transformants that demonstrated yellow colony color on LB introduce convenient Spel and Nhe restriction sites as well agar containing kanamycin (50 ug/ml) identified the as the ribosome binding site found upstream of crtE which expected plasmid, designated pCrt4, containing the was presumably recognized in Methylomonas 16a.

5'-AGCAGCTAGCGGAGGAATAAACCATGAGCGCATTTCTC-3' (SEQ ID NO: 41

5'-GACTAGTCACGACCTGCTCGAACGAC-3' (SEQ ID NO: 42) US 2003/0003528A1 Jan. 2, 2003 28

0271 The PCR conditions were: 95° C. for 5 min, 35 0280 Reverse reaction: gctictagatgcaaccagaatcg (con cycles of 95°C. for 30 sec, 45-60° C. gradient with 0.15° C. tains a Xba I site, SEQ ID NO:58). decrease/cycle for 30 sec and 72 C. for 90 sec, and a final extension at 72° C. for 7 min. The 1653 bp PCR product was 0281. The expected PCR products of dxS and dxr genes purified following gel electrophoresis in 1.0% agarose included Sequences of 323 bp and 420 bp, respectively, (TAE), digested simultaneously with Spel and NheI restric upstream of the Start codon of each gene in order to ensure tion endonucleases and then ligated to pCrt4 previously that the natural promoters of the genes were present. The digested with Nhe and dephosphorylated with calf intestinal PCR program (in Perkin-Elmer, Norwalk, Conn.) was as alkaline phosphatase. The ligated DNA was used to trans follows: denaturing 95°C. (900 sec); 35 cycles of 94° C. (45 form E. coli DH5C. by electroporation. sec), 58° C. (45 sec), 72°C. (60 sec); final elongation 72° C. (600 sec). The reaction mixture (50 ul total volume) con 0272 Analysis of plasmid DNA from transformants that tained: 25 ul Hot Star master mix (Qiagen, Valencia, Calif.), demonstrated yellow colony color on LB agar containing 0.75 ul genomic DNA (approx. 0.1 ng), 1.2 ul Sense primer kanamycin (50 ug/ml) identified the expected plasmid, des (=10 pmol), 1.2 pi antisense primer (=10 pmol), 21.85 ul ignated pCrt4.1, in which the crtEXYIB genes were cloned deionized water. under the transcriptional control of the trc promoter and the crtO and crtz genes were cloned under the transcriptional 0282 Standard procedures (Sambrook, J., Fritsch, E. F. control of the himps promoter This plasmid construct is and Maniatis, T. Molecular Cloning. A Laboratory Manual, shown in FIG. 6. Upon prolonged incubation, transformants Second Edition, Cold Spring Harbor Laboratory Press, Cold containing pcrt4.1 demonstrated a Salmon pink colony color. Spring Harbor (1989)), were used in order to clone dxs and dxr into pTJS75::lacZ:Tn5Kn, a low-copy, broad-host plas 0273 Plasmid pCrt4.1 was transferred into Methylomo mid (Schmidhauser and Helinski J. Bacteriology. Vol. nas 16a by tri-parental conjugal mating. Transconjugants 164:446-455 (1985)). containing this plasmid demonstrated orange colony color following growth on BTZ-3 agar with kanamycin (50 0283 For isolation, concentration, and purification of Aug/mL) and methane as the Sole carbon Source. HPLC DNA, Qiagen kits (Valencia, Calif.) were used. Enzymes for analysis of extracts of Methylomonas 16a containing the cloning were purchased from Gibco/BRL (Rockville, pCrt4.1 are shown in FIG. 6. These results revealed the Md.) or NEB (Beverly, Mass.). To transfer plasmids into E. presence of the endogenous Methylomonas 16a 30-carbon coli, One Shot Top10 competent cells (Invitrogen, Carlsbad, carotenoid (retention time of 12.717 min) as well as can Calif.), cuvettes (0.2 cm; Invitrogen), and Bio-Rad Gene thaxanthin (retention time of 13.767 min). The retention Pulser III (Hercules, Calif.) with standard settings were used time of the wild-type pigment is very close to that expected for electroporation. for astaxanthin. Analysis of a shoulder on this peak con 0284. First, dxS was cloned into the Bam HI site, which firmed the presence of astaxanthin The predominant forma was located between the lac7 gene and the Tn5Kn cassette tion of the wild-type 16A pigment in this Strain Suggested of pTJS75::lacZ:Tn5Kn. The resulting plasmids were iso transcriptional interference of the crtEXYIB operon by lated from E. coli transformants growing on LB+ kanamycin high-level expression of the crtOZ operon from the Strong (Kn, 50 ug/mL). The plasmid containing the insert in direc himps promoter. In addition, it is hypothesized that the cat tion of the Kn-resistance gene (as confirmed by restriction promoter on the pBHR1 vector may be directing expression analysis) was chosen for further cloning. The dxr gene was of crtOZ in concert with the himps promoter. Plasmids in cloned in between dxs and the Tn5Kn cassette by using the which the crtEYIBZO genes are expressed in the same XhoI and Xba I sites. The anticipated plasmid was isolated orientation from one or more promoters may be expected to from E. coli transformants. The presence of dxS and dxr in alleviate potential transcriptional interference and thereby the plasmid was confirmed by restriction analysis and enhance the Synthesis of canthaxanthin and astaxanthin. Sequencing. The resulting plasmid, pTJS75::dxs:dxr:lacZ:Tn5Kn is shown in FIG. 7. Example 13 0285) The plasmid pTJS75::dxs:dxr:lacZ:Tn5Kn was Enhanced Synthesis of the Native Carotenoid of transferred from E. coli into Methylomonas 16a by tripa Methylomonas 16A by Amplification of Upper rental conjugation. A spontaneous rifampin (Rif)-resistant Isoprenoid Pathway Genes isolate of Strain Methylomonas 16a was used as the recipient to Speed the isolation of the methanotroph from contami 0274 Native isoprene pathway genes dxS and dxr were nating E. coli following the mating. Six Separately isolated amplified from the Methylomonas 16a genome by using kanamycin-resistant Methylomonas 16a transconjugants PCR with the following primers. were used for carotenoid content determination. 0275 DXs primers: 0286 For carotenoid determination, six 100 mL cultures 0276 Forward reaction: aaggatccg.cgtattegtactic (contains of transconjugants (in BTZ+50 ug/mL Kn) were grown under methane (25%) over the weekend to stationary growth a Bam HI site, SEQ ID NO:55). phase. Two cultures of each, the wild-type Strain and its 0277 Reverse reaction: ctggatccgatctagaaataggctic Rif-resistant derivative without the plasmid, served as a gagttgtcgttcagg (contains a Bam HI and a Xho I site, SEQ control to see whether there are different carotenoid contents ID NO:56). in those Strains and to get a Standard deviation of the carotenoid determination. Cells were Spun down, washed 0278 DXr primers: with distilled water, and freeze-dried (lyophilizer: Virtis, 0279 Forward reaction: aaggat.cctacticgagctgacatcagtgct Gardiner, N.Y.) for 24h in order to determine dry-weights. (contains a Bam HI and a Xho I site, SEQ ID NO:57). After the dry-weight of each culture, was determined, cells US 2003/0003528A1 Jan. 2, 2003 29 were extracted. First, cells were welled with 0.4 mL of water and let stand for 15 min. After 15 min, four mL of acetone TABLE 13 was added and thoroughly vortexed to homogenize the sample. The samples were then shaken at 30° C. for 1 hr. Native Carotenoid contents in Methylomonas 16a cells After 1 hr, the cells were centrifuged. Pink coloration was carotenoid content observed in the Supernatant. The Supernatant was collected Cultures dry weight (mg) carotenoid (g) (ugg) and pellets were extracted again with 0.3 mL of water and 16a-1 30.8 3.OO194E-06 97.5 16a-2 30.8 3.0865E-06 1OO.2 3 mL of acetone. The Supernatants from the Second extrac 16a Rif-1 29.2 3.12937E-06 107.2 tion were lighter pink in color. The Supernatants of both 16a Rif-2 30.1 3.02O14E-06 100.3 extractions were combined, their volumes were measured, dxp 1 28.2 3.48817E-06 123.7 dxp 2 23.8 3.17224E-06 133.3 and analyzed spectrophotometrically. No qualitative differ dxp 3 31.6 4.01962E-06 127.2 ences were seen in the Spectra between negative control and dxp 4 31.8 4.38899E-06 138.O transconjugant Samples. In acetone extract, a following dxp 5° 28.4 3.4547E-06 121.6 observation was typical measured by Spectrophotometer dxp 6 30.3 4.OO817E-06 132.3 (shoulder at 460 nm, maxima at 491 and 522 nm) (Amer Methylomonas 16a native strain Rif resistant derivative of Methylomonas 16a without plasmid sham Pharmacia Biotech, Piscataway, N.J.). For calculation transconjugants containing pTJS75::dxs:dxr:lacZ:Tn5Kn plasmid of the carotenoid content, the absorption at 491 nm was read, the molar extinction coefficient of bacterioruberin (188,000) 0297. There were no significant differences between four and a MW of 552 were used. The MW of the carotenoid (552 negative controls. Likewise, there were no significant dif g/mol) was determined by MALDI-MS of a purified sample ferences between six transconjugants. However, approxi (Silica/Mg adsorption followed by Silica column chroma mately 28% increase in average carotenoid production was tography, reference: Britton, G., Liaaen-Jensen, S., Pfander, observed in the transconjugants in comparison to the aver age carotenoid production in negative controls (Table 13; H., Carotenoids Vol. 1a; Isolation and analysis, Birkhäuser FIG. 7 In order to confirm the structure, Methylobacterium Verlag, Basel, Boston, Berlin (1995)). rhodinum (formerly Pseudomonas rhodos: ATCC No. 14821) of which C30-carotenoid was identified was used as 0287. A crude acetone extract from Methylomonas 16a a reference strain (Kleinig et al., Z. Naturforsch 34c, 181 cells has a typical absorption spectrum (inflexion at 460 nm, 185 (1979); Kleinig and Schmitt, Z. Naturforsch 37c, 758 maxima at 491 nm and 522 nm). HPLC analysis (as 760 (1982)). A saponified extract of Methylobacterium described in the General Methods, except Solvent program: rhodinum and of Methylomonas 16a were compared by HPLC analysis under the same conditions as mentioned 0-10 min 15% water/85% methanol, then 100% methanol) above. The results are shown as follows: of acetone extracts confirmed that one major carotenoid (net retention volume at about 6 mL) with the above mentioned 0298 Saponified M. rhodinum: inflexion at 460 nm, absorption spectrum is responsible for the pink coloration of maxima at 487 nm, 517 nm. wild-type and transconjugant Methylomonas 16a cells. 0299 Net retention volume=1.9 mL. Because nothing else in the extract absorbs at 491 nm, carotenoid content was directly measured in the acetone 0300 Saponified Methylomonas 16a: inflexion at 460 extract with a spectrophotometer (Amersham Pharmacia nm, maxima at 488 nm, 518 nm. Biotech, Piscataway, N.J.). 0301 Net retention volume=2.0 mL. 0288 The molar extinction coefficient of bacterioruberin Example 14 (188,000), was used for the calculation of the quantity. Enhanced Synthesis of Genetically Engineered 0289. The following formula was used (Lambert-Beer's Carotenoids in Methylomonas 16A by law) to determine the quantity of carotenoid: Amplification of Upper Isoprenoid Pathway Genes 0302) The previous example (Example 13) demonstrated that amplification of the dxS and dxr genes in Methylomonas 0290 Ca: Carotenoid amount (g) 16a increased the endogenous 30-carbon carotenoid content 0291 A491 nm: Absorption of acetone extract at by about 30%. Amplification of dxs, dxr and other iso prenoid pathway genes, Such as lytB, may be used to 491 nm (-) increase the metabolic flux into an engineered carotenoid 0292 d: Light path in cuvette (1 cm) pathway and thereby enhance production of 40-carbon caro tenoids, Such as B-caroteine, Zeaxanthin, canthaxanthin and 0293) e: Molar extinction coefficient (L/(molxcm)) astaxanthin. The lytB gene was amplified by PCR from Methylomonas 16a using the following primers that also 0294 MW. Molecular weight (g/mol) introduced convenient XhoI restriction sites for Subcloning: 0295 v. Volume of extract (L) 0296 To get the carotenoid content, the calculated caro 5'-TGGCTCGAGAGTAAAACACTCAAG-3' (SEQ ID NO:59) tenoid amount has to be divided by the corresponding cell 5'-TAGCTCGAGTCACGCTTGC-3' (SEQ ID NO: 60) dry weight. US 2003/0003528A1 Jan. 2, 2003 30

0303) The PCR conditions were: 95° C. for 5 min, 35 cycles of 95°C. for 30 sec, 47-62 C. gradient with 0.25° C. decrease/cycle for 30 sec and 72 C. for 1 min, and a final (weight of filter + cells) - extension at 72 C. for 7 min. (weight of filter) - (weight of filter + media) - 0304 Following purification, the 993 bp PCR product DCW =l (g mL = (weight of filter) was digested with XhoI and ligated to = g 20 mL culture volume pTJS75::dxs:dxr:lacZ:Tn5Kn, previously digested with XhoI and dephosphorylated with calf intestinal alkaline phosphatase. The ligated DNA was used to transform E. coli 0309 Ammonia Concentration Determination DH10B by electroporation. Analysis of the plasmid DNA from transformants Selected on LB agar containing kana 0310 3 mL culture samples for ammonia analyses were mycin (50 ug/ml) identified a plasmid in which the IytB gene taken from the fermenter and centrifuged at 10,000xg and 4 was Subcloned between the dxS and dxr genes in an operon C. for 10 min. The Supernatant was then filtered through a under the control of the native dxS promoter. This operon 0.2 um syringe filter (Gelman Lab., Ann Arbor, Mich.) and was excised as a 4891 bp DNA fragment following sequen placed at -20°C. until analyzed. Ammonia concentration in tial digestion with HindIII and BamHI restriction endonu the fermentation broth was determined by ion chromatog cleases, made blunt-ended by treatment with T4DNA poly raphy using a Dionex System 500 Ion Chromatograph merase and purified following gel electrophoresis in 1.0% (Dionex, Sunnyvale, Calif.) equipped with a GP40 Gradient agarose (TAE). The purified DNA fragment was ligated to Pump, AS3500 Autosampler, and ED40 Electrochemical crt3 (Example 10) previously linearized within the crtz gene Detector operating in conductivity mode with an SRS cur by digestion with BstXI, made blunt-ended by treatment rent of 100 mA. Separation of ammonia was accomplished with T4 DNA polymerase and dephosphorylated with calf using a Dionex CS12A column fitted with a Dionex CG12A intestinal alkaline phosphatase. The ligated DNA was used Guard column. The columns and the chemical detection cell to transform E. coli DH10B by electroporation and trans were maintained at 35° C. Isocratic elution conditions were formants were Selected on LB agar containing kanamycin employed using 22 mM HSO as the mobile phase at a (50 ug/ml). Analysis of the plasmid DNA from transfor flowrate of 1 mL min. The presence of ammonia in the mants which demonstrated more intense yellow colony color fermentation broth was verified by retention time compari than those containing crt3 identified a plasmid, designated son with an NHCl standard. The concentration of ammonia pcrt3.2, containing both the crtEXYIB and dxS-lytB-dxr in the fermentation broth was determined by comparison of operons (FIG. 7). area counts with a previously determined NHCl standard calibration curve. When necessary, Samples were diluted 0305 HPLC analysis of extracts from E. coli containing with de-ionized water so as to be within the bounds of the pcrt3.2 confirmed the synthesis of B-carotene. Transfer of calibration curve. this plasmid into Methylomonas 16a by tri-parental conjugal mating will enhance production of B-caroteine compared to 0311 Carbon Dioxide Evolution Rate (CER) Determina transconjugants containing pcrt3. tion Example 15 0312 The carbon dioxide concentration in the exit gas Stream from the fermenter was determined by gas chroma Industrial Production of B-Carotene in tography (GC) using a Hewlett Packard 5890 Gas Chro Methylomonas 16a Optical Density Measurements matograph (Hewlett Packard, Avondale, Pa.) equipped with a TCD detector and HP19091 P-Q04, 32 mx32 umx20 um 0306 Growth of the Methylomonas culture was moni divinylbenzene/Styrene porous polymer capillary column. tored at 600 nm using a Shimadzu 160U UV/Vis dual beam, Gas Samples were withdrawn from the outlet gas Stream recording spectrophotometer. Water was used as the blank in through a Sample port consisting of a polypropylene “T” to the reference cell. Culture Samples were appropriately which the side arm was covered with a butyl rubber stopper. diluted with de-ionized water to maintain the absorbance 200 till samples were collected by piercing the rubber values less than 1.0. Stopper with a Hamilton (Reno, Nev.) gas-tight GCSyringe. Samples were collected after purging the barrel of the 0307 Dry Cell Weight Determination Syringe a minimum of 4 times with the outlet gas. Imme 0308) 20 mL of Methylomonas cell culture was filtered diately following Sample collection, the Volume in the through a pre-weighed 0.2 um filter (Type GTTP, Millipore, Syringe was adjusted to 100 till and injected through a Bedford, Mass.) by vacuum filtration. Following filtration of SplitleSS injection port onto the column. Chromatographic biomass samples, filters were washed with 10 mL of de conditions used for CO determination were as follows: ionized water and filtered under vacuum to dryneSS. Filters Injector Temperature (100 C.); Oven Temperature (35 C.); were then placed in a drying oven at 95 C. for 24 to 48 hr. Detector Temperature (140 C.); Carrier Gas (Helium); Elu After 24 hr, filters were cooled to room temperature and tion Profile (Isothermal); Column Head Pressure (15 psig). re-weighed. After recording the filter weight, the filters were The presence of CO2 in the exit gas Stream was verified by returned to the drying oven and the proceSS repeated at retention time comparison with a pure component CO various time intervals until no further change in weight loSS Standard. The concentration of CO2 in the exit gas Stream was recorded. Media contribution to the dry cell weight was determined by comparison of area counts with a pre (DCW) measurement was obtained by filtering 20 mL of viously determined CO standard calibration curve. Stan fermentation media prior to inoculation by the above pro dard gas cylinders (Robert's Oxygen, Kenneft Square, Pa.) cedure. Dry cell weight is calculated by the following containing CO in the concentration range of 0.1% (v/v) to formula: 10% (v/v) were used to generate the calibration curve. US 2003/0003528A1 Jan. 2, 2003

0313 The carbon dioxide evolution rate was calculated purity, Spectrum Chemical Inc., New Brunswick, N.J.) in from the following formula: 100 mL of acetone. Appropriate dilutions of this stock Solution were made to span the B-caroteine concentrations encountered in the acetone extracts. Calibration curves con Structed in this manner were linear over the concentration CER=) mmol hr' = range examined. Exit Pressure: CO2 concentration: inlet gas flowrate R: Absolute temperature of the exit gas stream 0320 Fermentation of Methylomonas 16a 0321 Fermentation was performed as a fed-batch fer mentation under nitrogen limitation using a 3 liter, Vertical, 0314. In the above equation the exit pressure from the stirred tank fermenter (B. Braun Biotech Inc., Allentown, fermenter was assumed to be equal to the atmospheric Pa.) with a working volume of 2 liters. The fermenter was preSSure. The inlet gas flowrate was calculated from the Sum equipped with 2 Six-bladed Rushton turbines and Stainless of the individual methane and air flowrates. R is the ideal gas Steel headplate with fittings for pH, temperature, and dis constant=82.06 cm atm mol' K'. The absolute tempera Solved oxygen probes, inlets for pH regulating agents, ture of the exit gas Stream was calculated by the following Sampling tube for withdrawing liquid Samples, and con formula: T(K)=t( C.)+273.15, where T is the absolute denser. The exit gas line from the fermenter contained a temperature in K, and t is the exit gas temperature in C. and Separate port for Sampling the exit gas Stream for GC was assumed to be equal to the ambient temperature. analysis of methane, O2, and CO concentrations. The fer menter was jacketed for temperature control with the tem 0315 B-Caroteine Extraction and Determination by High perature maintained constant at 30° C. through the use of an Performance Liquid Chromatography (HPLC) external heat eXchanger. Agitation was maintained in the 0316) 15-30 mL of the Methylomonas culture was cen range of 870-885 rpm. The pH of the fermentation was trifuged at 10,000xg and 4 C. for 10 min. The Supernatant maintained constant at 6.95 through the use of 2.5 M NaOH was decanted and the cell pellet frozen at -20°C. The frozen and 2 M HSO. cell pellet was thawed at room temperature to which 2.5 mL 0322 Methane was used as the sole carbon and energy of acetone was added. The Sample was vortexed for 1 min Source during the fermentation. The flow of methane to the and allowed to Stand at room temperature for an additional fermenter was metered using a Brooks (Brooks Instrument, 30 min before being centrifuged at 10,000xg and 4 C. for Hatfield, Pa.) mass flow controller. A separate mass flow 10 min. The acetone layer was decanted and saved. The controller was used to regulate the flow of air. Prior to pellet was then re-extracted with an additional 2.5 mL of entering the fermenter, the individual methane and air flows acetone, centrifuged, and the two acetone pools combined. were mixed and filtered through a 0.2 um in-line filter (Millipore, Bedford, MA) giving a total gas flowrate of 260 Visual observation of the cell pellet revealed that all the mL min (0.13 V/v/min) and methane concentration of 23% B-caroteine had been removed from the cells following the (v/v) in the inlet gas stream. The gas was delivered to the Second extraction. The acetone pool was then concentrated medium 3 cm below the lower Rushton turbine through a to 1 mL under a stream of N, filtered through a 0.45 um perforated pipe. 2 liters of a minimal Salts medium of the filter, and analyzed by HPLC. composition given in Table 14 was used for the fermenta 0317 Acetone samples containing f-caroteine were ana tion. Silicone antifoam (Sigma Chemical Co., St. Louis, lyzed using a Beckman System Gold HPLC (Beckman Mo.) was added to a final concentration of 800 ppm prior to Coulter, Fullerton, Calif.) equipped with a model 125 ter Sterilization to SuppreSS foaming. Before inoculating, the nary pump system, model 168 diode array detector, and fermenter and it contents were Sterilized by autoclaving for model 508 autosampler. 100 L of concentrated acetone 1 hr at 121 C and 15 psia. Once the medium had cooled, 4 extracts were injected onto a HP LichroCART 125-4, C. mL of a 25 mg mL. kanamycin stock Solution was added reversed phase HPLC column (Hewlett Packard, Avondale, to the fermentation medium to maintain plasmid Selection Pa.). Peaks were integrated using Beckman Gold Software. preSSure during the fermentation. Retention time and spectral comparison confirmed peak identity with B-caroteine pure component Standards in the TABLE 1.4 wavelength range from 220 to 600 nm. The retention time and Spectral profiles of the B-caroteine in the acetone extracts Fermentation Media Composition were an exact match to those obtained from the pure Amount component B-caroteine Standards. The B-caroteine concentra Component (g L') tions in the acetone extracts were quantified by comparison of area counts with a previously determined calibration NHCI 1.07 KHPO, 1. curve as described below. A wavelength of 450 nm, corre MgCl, *6HO 0.4 sponding to the maximum absorbance wavelength of B-caro CaCl2.H2O O.2 tene in acetone, was used for quantitation. 1M HEPES Solution (pH 7) 50 mL. L. 0318. A mobile consisting of methanol and water was Solution 1* 30 mL. L. used for reversed phase Separation of B-carotene. The Sepa NaSO 1. ration of 3-caroteine was accomplished using a linear gra *Note: The composition of Solution 1 is provided in the General Methods. dient of 60% methanol and 40% water changing linearly over 11.5 minutes to 100% methanol. Under the chromato 0323 1 ml of frozen Methylomonas 16a containing plas graphic conditions employed, resolution of C-caroteine from mid pCRT1 was used to inoculate a 100 mL culture of sterile B-caroteine could not be attained. 0.5x minimal salts media containing 50 ug mL of kana 0319 B-caroteine calibration curves were prepared from mycin in a 500 mL. Wheaton bottle sealed with a butyl stock solutions by dissolving 25 mg of B-carotene (96% rubber Stopper and aluminum crimp cap. Methane was US 2003/0003528A1 Jan. 2, 2003 32 added to the culture by piercing the rubber stopper with a 60 mL Syringe fitted with a 21 gauge needle to give a final TABLE 15-continued methane concentration in the headspace of 25% (v/v). The inoculated medium was Shaken for approximately 48 hr at Fed-Batch Fermentation Results of Methylomonas Sp. 16alpCRT1 30° C. in a controlled environmental rotary shaker. When B cell growth reached Saturation, 5 mL of this culture was used B- Caro to inoculate 2 100-mL cultures as described above. When the caroteine tene Am optical density of the cultures reached 0.8, 60 mL of each Titer Titer monia CER po culture was used to inoculate the fermenter. Time OD DCWa (ug (mg Conc. (mimol (% (hr) 600 (g L') gDCW) L') (mM) hr') Satn) 0324 Samples were taken at 4-5 hr intervals during the 45.9 4.27 155 7710 11.94 8.7 22.1 1.OO course of the fermentation to monitor carotenoid production 49.3 7.99 2.36 5050 12.07 O.12 19.4 O.O as a function of the growth phase of the organism. The 53.5 11.68 3.44 451O 15.51 O 10.4 45.50 specific growth rate of the culture was 0.13 hr'. No 58.9 13.63 4.07 3960 15.85 O 4.2 65.85 adjustment of air or methane flows was employed to prevent 63.8 13.8O 3.87 41SO 15.96 O 4.2 72.70 the culture from becoming oxygen limited during the course 69.6 13.45 3.93 4890 19.01. O 2.O 75.30 of the fermentation. Furthermore, the aeration and methane DCW = Dry Cell Weight addition continued once the culture had stopped growing to PCER = Carbon Dioxide Evolution Rate explore f3-caroteine production in the absence of cell growth. pO2 = Dissolved Oxygen Concentration in Fermenter Cessation of growth was indicated when no changes in 'ND = Not Determined optical density were observed, by the disappearance of ammonia from the fermentation media, and by an observed decrease in the CER. The B-caroteine content of the cells, dry 0325 At 46 hr into the fermentation f3-caroteine titers cell weight, ammonia levels, and carbon dioxide evolution reached a maximum titer of 7,710 ppm on a dry weight rate were determined as described Supra. The results are basis. Shortly after this time the B-caroteine titer dropped stated in Table 15 below. Substantially as the fermenter became oxygen limited as noted by the dissolved oxygen concentration. Thus, it is TABLE 1.5 apparent that maintenance of high B-caroteine titers is depen Fed-Batch Fermentation Results of Methylomonas Sp. 16alpCRT1 dent on high oxygen tensions present in the fermentation media. Presumably higher B-caroteine titers could be reached B f- Caro than reported here through better control of the dissolved caroteine tene Am oxygen concentration during the course of the fermentation. Titer Titer monia CER po, Maximum B-caroteine productivities were calculated as 620 Time OD DCWa (ug (mg Conc. (mmol (% (hr) 600 (g L') gDCW) L') (mM) hr') Satn) JuggDCW hr' and 886 ug L' hr'. In addition, B-caro tene concentrations were found to Stabilize at roughly 4,400 O.O. O.351 NDd NDd NDd 23.7 NDd NDd 37.7 1.59 O.54 2640 1.42 17.8 8.1 53.65 ppm as the cells transitioned into Stationary phase. It is 41.6 2.50 0.87 63OO 5.51 13.9 13.2 33.50 apparent that B-caroteine titers are growth asSociated as well as dependent on Oxygen tension.

SEQUENCE LISTING

<160> NUMBER OF SEQ ID NOS: 60 <210> SEQ ID NO 1 <211& LENGTH: 1311 &212> TYPE DNA <213> ORGANISM: Methylomonas 16a <400 SEQUENCE: 1 gatgttgg to a catgg cccta to acttaacg gCtgatatto gattttgtca ttggittttitt 60 cittaactitta acttctacac gotcatgaac aaacctaaaa aagttgcaat actgacagoa 120 ggcggcttgg cqccittgttt gaatticcgca atcgg tagtt to atcgaacg titat accgaa 18O atc gatccta gcatagaaat catttgctat cqcggcggitt ataaaggcct gttgctgggc 240 gattctitatic cagtaacggc cqaagtgcgt aaaaaggcgg gtgttctgca acgttittggc 3OO ggttctgttga toggcaa.cag cogcqtcaaa ttgaccalatg to aaag act g cqtgaaacgc 360 ggtttggtoa aa gagggtga agatcc.gcaa aaagtc.gcgg citigatcaatt ggittaaggat 420 ggtgtcgata ttctgcacac catcggcggc gatgatacca atacgg cago agcggatttg 480

US 2003/0003528A1 Jan. 2, 2003 34

-continued

210 215 220 Gly Arg Asn. Cys Gly Trp Lieu. Thir Ala Ala Thr Ala Glin Glu Tyr Arg 225 230 235 240 Lys Lieu Lieu. Asp Arg Ala Glu Trp Lieu Pro Glu Lieu Gly Lieu. Thir Arg 245 250 255 Glu Ser Tyr Glu Val His Ala Val Phe Val Pro Glu Met Ala Ile Asp 260 265 27 O Leu Glu Ala Glu Ala Lys Arg Lieu Arg Glu Val Met Asp Llys Val Asp 275 280 285 Cys Val Asin Ile Phe Val Ser Glu Gly Ala Gly Val Glu Ala Ile Val 29 O 295 3OO Ala Glu Met Glin Ala Lys Gly Glin Glu Val Pro Arg Asp Ala Phe Gly 305 310 315 320 His Ile Lys Lieu. Asp Ala Wall Asn Pro Gly Lys Trp Phe Gly Glu Glin 325 330 335 Phe Ala Gln Met Ile Gly Ala Glu Lys Thr Lieu Val Glin Lys Ser Gly 340 345 350 Tyr Phe Ala Arg Ala Ser Ala Ser Asn. Wall Asp Asp Met Arg Lieu. Ile 355 360 365 Lys Ser Cys Ala Asp Leu Ala Val Glu Cys Ala Phe Arg Arg Glu Ser 370 375 38O Gly Val Ile Gly His Asp Glu Asp Asn Gly Asn Val Lieu Arg Ala Ile 385 390 395 400 Glu Phe Pro Arg Ile Lys Gly Gly Lys Pro Phe Asn. Ile Asp Thr Asp 405 410 415 Trp Phe Asn. Ser Met Leu Ser Glu Ile Gly Glin Pro Lys Gly Gly Lys 420 425 430

Wall Glu Wal Ser His 435

<210> SEQ ID NO 3 &2 11s LENGTH 636 &212> TYPE DNA <213> ORGANISM: Methylomonas 16a <400 SEQUENCE: 3 gaaaatacta totcc.gtcac catcaaagaa gtcatgacca cct cqc.ccgt tatgc.cggto 60 atgg to atca atcatctgga acatgcc.gto cotctggcto go.gc.gctagt c gacggtggc 120 ttgaaagttt to gagatcac attgcgcacg ccggtgg cac toggaatgitat cogac gitatic 18O aaag.ccgaag taccggacgc catcgtoggc gcggg cacca to atcaa.ccc toataccttg 240 tatcaag.cga ttgacgc.cgg togcggaattic atcgt.ca.gcc ccggcatcac cqaaaatcta 3OO citcaac galag cqctago atc cqgcgtgcct atcct gcc.cg gcgtoatcac accoagc gag 360 gtocatgcgtt tattggaaaa agg catcaat gcgatgaaat tottt coggc tigaagcc.gc.c 420 ggcggcatac cqatgctgaa atc.ccittggc gg.ccc cittgc cqcaagttcac cittctgtc.cg 480 accggcgg.cg tdaatc.ccaa aaacgc.gc.cc gaatatotgg cattgaaaaa totc.gc.ctoc 540 gtoggcggct cotggatggc gcc gg.ccgat citggtagatg cc galagact g g gciggaaatc 600 acgcgg.cggg cqagcgaggc cqcgg cattg aaaaaa 636

<210> SEQ ID NO 4 US 2003/0003528A1 Jan. 2, 2003 35

-continued

<211& LENGTH 212 &212> TYPE PRT <213> ORGANISM: Methylomonas 16a <400 SEQUENCE: 4 Glu Asn Thr Met Ser Val Thir Ile Lys Glu Val Met Thr Thr Ser Pro 1 5 10 15

Wal Met Pro Wal Met Wall Ile Asn His Leu Glu. His Ala Wall Pro Leu 2O 25 30 Ala Arg Ala Lieu Val Asp Gly Gly Lieu Lys Val Lieu Glu Ile Thr Lieu 35 40 45 Arg Thr Pro Val Ala Lieu Glu Cys Ile Arg Arg Ile Lys Ala Glu Val 50 55 60 Pro Asp Ala Ile Val Gly Ala Gly Thr Ile Ile Asn Pro His Thr Leu 65 70 75 8O Tyr Glin Ala Ile Asp Ala Gly Ala Glu Phe Ile Val Ser Pro Gly Ile 85 90 95 Thr Glu Asn Lieu Lleu. Asn. Glu Ala Lieu Ala Ser Gly Val Pro Ile Leu 100 105 110 Pro Gly Val Ile Thr Pro Ser Glu Val Met Arg Leu Leu Glu Lys Gly 115 120 125 Ile Asn Ala Met Lys Phe Phe Pro Ala Glu Ala Ala Gly Gly Ile Pro 130 135 1 4 0 Met Leu Lys Ser Leu Gly Gly Pro Leu Pro Glin Val Thr Phe Cys Pro 145 150 155 160 Thr Gly Gly Val Asn Pro Lys Asn Ala Pro Glu Tyr Lieu Ala Lieu Lys 1.65 170 175 Asn Val Ala Cys Val Gly Gly Ser Trp Met Ala Pro Ala Asp Lieu Val 18O 185 190 Asp Ala Glu Asp Trp Ala Glu Ile Thr Arg Arg Ala Ser Glu Ala Ala 195 200 2O5 Ala Lieu Lys Lys 210

<210 SEQ ID NO 5 &2 11s LENGTH 1860 &212> TYPE DNA <213> ORGANISM: Methylomonas 16a <400 SEQUENCE: 5 atgaaact ga ccaccgacta toccittgctt aaaaa.catcc acacgc.cggc ggacatacgc 60 gcqctgtc.ca aggaccagot coagcaactg gctgacgagg togcgcggcta totgacccac 120 acgg to agca titt.ccggcgg ccattttgcg gcc.ggcc to g g caccgtgga act gaccgtg 18O gccttgcatt atgtgttcaa taccc.ccg to gatcagttgg totgggacgt gggccatcag 240 gccitat cogc acaagattct gaccggtogc aaggagcgca tocc gaccat togcaccctg 3OO ggcggggtgt CagcCttt CC gg.cgcgggiac gaga.gcgaat acgatgccitt C9g.cgtcggc 360 cattccagoa cctogatcag cqcgg cactg. g.gcatggcca ttgcgtogca gctg.cgcggc 420 galagacaaga agatggtagc catcatcggc gacggttcca to accgg.cgg catggccitat 480 gaggcgatga atcatgcc.gg cqatgtgaat gcc aacctgc tiggtgatctt galacgacaac 540 gatatgtcga totc.gc.cgcc ggtogggg.cg atgaacaatt atctgaccala ggtgttgtc.g 600

US 2003/0003528A1 Jan. 2, 2003 37

-continued

Met Val Ala Ile Ile Gly Asp Gly Ser Ile Thr Gly Gly Met Ala Tyr 145 15 O 155 160 Glu Ala Met Asn His Ala Gly Asp Wall Asn Ala Asn Lieu Lieu Val Ile 1.65 170 175 Lieu. Asn Asp Asn Asp Met Ser Ile Ser Pro Pro Val Gly Ala Met Asn 18O 185 190 Asn Tyr Leu Thr Lys Val Leu Ser Ser Lys Phe Tyr Ser Ser Val Arg 195 200 2O5 Glu Glu Ser Lys Lys Ala Leu Ala Lys Met Pro Ser Val Trp Glu Lieu 210 215 220 Ala Arg Lys Thr Glu Glu His Val Lys Gly Met Ile Val Pro Gly Thr 225 230 235 240 Leu Phe Glu Glu Lieu Gly Phe Asn Tyr Phe Gly Pro Ile Asp Gly. His 245 250 255 Asp Val Glu Met Leu Val Ser Thr Lieu Glu Asn Lieu Lys Asp Lieu. Thr 260 265 27 O Gly Pro Val Phe Leu. His Val Val Thr Lys Lys Gly Lys Gly Tyr Ala 275 280 285 Pro Ala Glu Lys Asp Pro Leu Ala Tyr His Gly Val Pro Ala Phe Asp 29 O 295 3OO Pro Thr Lys Asp Phe Leu Pro Lys Ala Ala Pro Ser Pro His Pro Thr 305 310 315 320 Tyr Thr Glu Val Phe Gly Arg Trp Leu Cys Asp Met Ala Ala Glin Asp 325 330 335 Glu Arg Lieu Lieu Gly Ile Thr Pro Ala Met Arg Glu Gly Ser Gly Lieu 340 345 350 Val Glu Phe Ser Gln Lys Phe Pro Asn Arg Tyr Phe Asp Val Ala Ile 355 360 365 Ala Glu Gln His Ala Val Thr Lieu Ala Ala Gly Glin Ala Cys Glin Gly 370 375 38O Ala Lys Pro Val Val Ala Ile Tyr Ser Thr Phe Leu Glin Arg Gly Tyr 385 390 395 400 Asp Gln Lieu. Ile His Asp Val Ala Lieu Glin Asn Lieu. Asp Met Leu Phe 405 410 415 Ala Lieu. Asp Arg Ala Gly Lieu Val Gly Pro Asp Gly Pro Thr His Ala 420 425 430 Gly Ala Phe Asp Tyr Ser Tyr Met Arg Cys Ile Pro Asn Met Leu Ile 435 4 40 4 45 Met Ala Pro Ala Asp Glu Asn. Glu Cys Arg Glin Met Lieu. Thir Thr Gly 450 455 460 Phe Glin His His Gly Pro Ala Ser Val Arg Tyr Pro Arg Gly Lys Gly 465 470 475 480 Pro Gly Ala Ala Ile Asp Pro Thr Lieu. Thir Ala Leu Glu Ile Gly Lys 485 490 495 Ala Glu Val Arg His His Gly Ser Arg Ile Ala Ile Leu Ala Trp Gly 5 OO 505 510 Ser Met Val Thr Pro Ala Val Glu Ala Gly Lys Gln Leu Gly Ala Thr 515 52O 525 Val Val Asn Met Arg Phe Val Lys Pro Phe Asp Glin Ala Leu Val Leu 530 535 540

US 2003/0003528A1 Jan. 2, 2003 39

-continued

Lieu. Thir Ala Asn Gly Asn. Ile Asp Ala Leu Tyr Glu Glin Cys Lieu Ala 35 40 45 His His Pro Glu Tyr Ala Val Val Val Met Glu Ser Lys Val Ala Glu 50 55 60 Phe Lys Glin Arg Ile Ala Ala Ser Pro Wall Ala Asp Ile Llys Val Lieu 65 70 75 8O Ser Gly Ser Glu Ala Leu Glin Glin Val Ala Thr Lieu Glu Asn. Wall Asp 85 90 95 Thr Val Met Ala Ala Ile Val Gly Ala Ala Gly Leu Leu Pro Thr Leu 100 105 110 Ala Ala Ala Lys Ala Gly Lys Thr Val Lieu Lieu Ala Asn Lys Glu Ala 115 120 125 Leu Val Met Ser Gly Glin Ile Phe Met Glin Ala Val Ser Asp Ser Gly 130 135 1 4 0 Ala Val Lieu Lleu Pro Ile Asp Ser Glu His Asn Ala Ile Phe Glin Cys 145 15 O 155 160 Met Pro Ala Gly Tyr Thr Pro Gly His Thr Ala Lys Glin Ala Arg Arg 1.65 170 175 Ile Leu Leu Thr Ala Ser Gly Gly Pro Phe Arg Arg Thr Pro Ile Glu 18O 185 190 Thr Leu Ser Ser Val Thr Pro Asp Glin Ala Val Ala His Pro Llys Trp 195 200 2O5 Asp Met Gly Arg Lys Ile Ser Val Asp Ser Ala Thr Met Met Asn Lys 210 215 220 Gly Lieu Glu Lieu. Ile Glu Ala Cys Lieu Lleu Phe Asn Met Glu Pro Asp 225 230 235 240

Glin Ile Glu Wal Wall Ile His Pro Glin Ser Ile Ile His Ser Met Wall 245 250 255 Asp Tyr Val Asp Gly Ser Val Lieu Ala Glin Met Gly Asn Pro Asp Met 260 265 27 O Arg Thr Pro Ile Ala His Ala Met Ala Trp Pro Glu Arg Phe Asp Ser 275 280 285 Gly Val Ala Pro Leu Asp Ile Phe Glu Val Gly His Met Asp Phe Glu 29 O 295 3OO Lys Pro Asp Leu Lys Arg Phe Pro Cys Lieu Arg Lieu Ala Tyr Glu Ala 305 310 315 320 Ile Lys Ser Gly Gly Ile Met Pro Thr Val Lieu. Asn Ala Ala Asn. Glu 325 330 335 Ile Ala Val Glu Ala Phe Lieu. Asn. Glu Glu Wall Lys Phe Thr Asp Ile 340 345 350 Ala Val Ile Ile Glu Arg Ser Met Ala Glin Phe Lys Pro Asp Asp Ala 355 360 365 Gly Ser Lieu Glu Lieu Val Lieu Glin Ala Asp Glin Asp Ala Arg Glu Val 370 375 38O Ala Arg Asp Ile Ile Lys Thr Lieu Val Ala 385 390

<210 SEQ ID NO 9 &2 11s LENGTH 693 &212> TYPE DNA <213> ORGANISM: Methylomonas 16a US 2003/0003528A1 Jan. 2, 2003 40

-continued <400 SEQUENCE: 9 atgaac coaa ccatccaatig citggg.ccg to gtgccc.gcag ccggcgt.cgg caaacgcatg 60 caagcc gatc gcc.ccaaaca atatttaccg cittgcc.ggta aaacggtoat cqaacacaca 120 citgacticgac tacttgagtc. c gacgc.ctitc caaaaagttg cqgtggc gat titc.cgtogaa 18O gacccittatt gocctgaact gtccatagcc aaa.caccc.cg acatcatcac cqc.gc.ctggc 240 ggcaaggaac go.gc.cgacitc ggtgctgtct gcactgaagg ctittagaaga tatagcc agc 3OO gaaaatgatt go.gtgctggit acacgacgcc gcc.cgc.ccct gcttgacggg cagogacatc 360 caccittcaaa to gatacctt aaaaaatgac ccggtoggcg gcatcctggc cittgagttcg 420 cacgacacat tdaaac acgt ggatggtgac acgatcaccg caaccataga cagaaag cac 480 gtotgg.cgcg ccttgacgcc gcaaatgttcaaatacggca tottgcgcga cqc gttgcaa. 540 c galacc galag gCaatc.cggc cqtcaccgac gaa.gc.ca.gtg cqctdgaact tittgggc cat 600 aaac coaaaa togtggaagg cc.gc.ccggac alacatcaaaa to accogccc ggaagatttg 660 gcc.ctggcac aattittatat ggagcaacaa goa 693

<210> SEQ ID NO 10 <211& LENGTH: 231 &212> TYPE PRT <213> ORGANISM: Methylomonas 16a <400 SEQUENCE: 10 Met Asn Pro Thr Ile Gln Cys Trp Ala Val Val Pro Ala Ala Gly Val 1 5 10 15 Gly Lys Arg Met Glin Ala Asp Arg Pro Lys Glin Tyr Lieu Pro Leu Ala 2O 25 30 Gly Lys Thr Val Ile Glu His Thr Lieu. Thir Arg Lieu Lleu Glu Ser Asp 35 40 45 Ala Phe Gln Lys Val Ala Val Ala Ile Ser Val Glu Asp Pro Tyr Trp 50 55 60 Pro Glu Lieu Ser Ile Ala Lys His Pro Asp Ile Ile Thr Ala Pro Gly 65 70 75 8O Gly Lys Glu Arg Ala Asp Ser Val Lieu Ser Ala Leu Lys Ala Leu Glu 85 90 95 Asp Ile Ala Ser Glu Asn Asp Trp Val Lieu Val His Asp Ala Ala Arg 100 105 110 Pro Cys Lieu. Thr Gly Ser Asp Ile His Leu Glin Ile Asp Thr Lieu Lys 115 120 125 Asn Asp Pro Val Gly Gly Ile Leu Ala Leu Ser Ser His Asp Thr Lieu 130 135 1 4 0 Lys His Val Asp Gly Asp Thir Ile Thr Ala Thr Ile Asp Arg Lys His 145 15 O 155 160 Val Trp Arg Ala Leu Thr Pro Gln Met Phe Lys Tyr Gly Met Leu Arg 1.65 170 175 Asp Ala Leu Glin Arg Thr Glu Gly Asn Pro Ala Val Thr Asp Glu Ala 18O 185 190 Ser Ala Leu Glu Lieu Lleu Gly His Lys Pro Lys Ile Val Glu Gly Arg 195 200 2O5 Pro Asp Asn. Ile Lys Ile Thr Arg Pro Glu Asp Leu Ala Lieu Ala Glin 210 215 220

US 2003/0003528A1 Jan. 2, 2003 42

-continued Phe Val Phe Gly Cys Ser Ala Trp Gly Glu Gly Val Ser Glu Asp Leu 145 15 O 155 160 Glin Ala Ile Thr Leu Pro Glu Gln Trp Phe Val Ile Ile Llys Pro Asp 1.65 170 175 Cys His Val Asn Thr Gly Glu Ile Phe Ser Ala Glu Asn Leu Thr Arg 18O 185 190 Asn Ser Ala Val Val Thr Met Ser Asp Phe Leu Ala Gly Asp Asn Arg 195 200 2O5 Asn Asp Cys Ser Glu Val Val Cys Lys Lieu. Tyr Arg Pro Wall Lys Asp 210 215 220 Ala Ile Asp Ala Leu Lieu. Cys Tyr Ala Glu Ala Arg Lieu. Thr Gly Thr 225 230 235 240 Gly Ala Cys Val Phe Ala Glin Phe Cys Asn Lys Glu Asp Ala Glu Ser 245 250 255 Ala Leu Glu Gly Lieu Lys Asp Arg Trp Lieu Val Phe Leu Ala Lys Gly 260 265 27 O Lieu. Asn Glin Ser Ala Leu Tyr Lys Lys Lieu Glu Glin Gly 275 280 285

<210> SEQ ID NO 13 &2 11s LENGTH 471 &212> TYPE DNA <213> ORGANISM: Methylomonas 16a <400 SEQUENCE: 13 atgatacgcg tagg catggg ttacgacgtg caccgtttca acgacgg.cga cca catcatt 60 ttggg.cgg.cg tdaaaatc.cc titatgaaaaa gqcctggaag cccattcc.ga cqgcgacgtg 120 gtgctgcacg cattggcc.ga cqc catcttg g gagcc.gc.cg citttggg.cga catcggcaaa 18O catttc.ccgg acaccg acco caatttcaag ggcgc.cgaca gcagggtgct actg.cgc.cac 240 gtgtacgg catcgtoaagga aaaaggctat aaactggtoa acgc.cgacgt gaccatcatc 3OO gctcaggcgc cqaagatgct gcc acacgtg ccc.gg catgc gcgc.caa.cat td.ccgcc gat 360 citggaalaccg atgtcgattt cattaatgta aaa.gc.cacga cq accgagaa actgggctitt 420 gaggg.ccgta aggaaggcat C gcc.gtgcag gctgtggtgt togatagaacg C 471

<210> SEQ ID NO 14 &2 11s LENGTH 157 &212> TYPE PRT <213> ORGANISM: Methylomonas 16a <400 SEQUENCE: 14 Met Ile Arg Val Gly Met Gly Tyr Asp Wal His Arg Phe Asn Asp Gly 1 5 10 15 Asp His Ile Ile Leu Gly Gly Wall Lys Ile Pro Tyr Glu Lys Gly Lieu 2O 25 30 Glu Ala His Ser Asp Gly Asp Val Val Lieu. His Ala Lieu Ala Asp Ala 35 40 45 Ile Leu Gly Ala Ala Ala Lieu Gly Asp Ile Gly Lys His Phe Pro Asp 50 55 60 Thr Asp Pro Asn. Phe Lys Gly Ala Asp Ser Arg Val Lieu Lieu Arg His 65 70 75 8O Val Tyr Gly Ile Wall Lys Glu Lys Gly Tyr Lys Lieu Val Asn Ala Asp 85 90 95

US 2003/0003528A1 Jan. 2, 2003 44

-continued

<210> SEQ ID NO 16 <211& LENGTH 54 4 &212> TYPE PRT <213> ORGANISM: Methylomonas 16a <400 SEQUENCE: 16 Met Thr Lys Phe Ile Phe Ile Thr Gly Gly Val Val Ser Ser Leu Gly 1 5 10 15 Lys Gly Ile Ala Ala Ser Ser Lieu Ala Ala Ile Leu Glu Asp Arg Gly 2O 25 30 Leu Lys Val Thir Ile Thr Lys Lieu. Asp Pro Tyr Ile Asn. Wall Asp Pro 35 40 45 Gly Thr Met Ser Pro Phe Gln His Gly Glu Val Phe Val Thr Glu Asp 50 55 60 Gly Ala Glu Thir Asp Lieu. Asp Leu Gly. His Tyr Glu Arg Phe Lieu Lys 65 70 75 8O Thir Thr Met Thr Lys Lys Asn Asn Phe Thr Thr Gly Glin Val Tyr Glu 85 90 95 Glin Val Lieu Arg Asn. Glu Arg Lys Gly Asp Tyr Lieu Gly Ala Thr Val 100 105 110 Glin Val Ile Pro His Ile Thr Asp Glu Ile Lys Arg Arg Val Tyr Glu 115 120 125 Ser Ala Glu Gly Lys Asp Wall Ala Lieu. Ile Glu Val Gly Gly Thr Val 130 135 140 Gly Asp Ile Glu Ser Leu Pro Phe Leu Glu Thir Ile Arg Gln Met Gly 145 15 O 155 160 Val Glu Lieu Gly Arg Asp Arg Ala Lieu Phe Ile His Lieu. Thir Lieu Val 1.65 170 175 Pro Tyr Ile Lys Ser Ala Gly Glu Leu Lys Thr Lys Pro Thr Glin His 18O 185 190 Ser Val Lys Glu Lieu Arg Thr Ile Gly Ile Glin Pro Asp Ile Lieu. Ile 195 200 2O5 Cys Arg Ser Glu Glin Pro Ile Pro Ala Ser Glu Arg Arg Lys Ile Ala 210 215 220 Leu Phe Thr Asn. Wall Ala Glu Lys Ala Val Ile Ser Ala Ile Asp Ala 225 230 235 240 Asp Thir Ile Tyr Arg Ile Pro Leu Lleu Lieu Arg Glu Glin Gly Lieu. Asp 245 250 255 Asp Leu Val Val Asp Gln Leu Arg Lieu. Asp Val Pro Ala Ala Asp Lieu 260 265 27 O Ser Ala Trp Glu Lys Val Val Asp Gly Leu Thr His Pro Thr Asp Glu 275 280 285 Val Ser Ile Ala Ile Val Gly Lys Tyr Val Asp His Thr Asp Ala Tyr 29 O 295 3OO Lys Ser Lieu. Asn. Glu Ala Lieu. Ile His Ala Gly Ile His Thr Arg His 305 310 315 320 Lys Val Glin Ile Ser Tyr Ile Asp Ser Glu Thir Ile Glu Ala Glu Gly 325 330 335 Thr Ala Lys Lieu Lys Asn. Wall Asp Ala Ile Leu Val Pro Gly Gly Phe 340 345 350 Gly Glu Arg Gly Val Glu Gly Lys Ile Ser Thr Val Arg Phe Ala Arg

US 2003/0003528A1 Jan. 2, 2003 46

-continued <210> SEQ ID NO 18 &2 11s LENGTH 318 &212> TYPE PRT <213> ORGANISM: Methylomonas 16a <400 SEQUENCE: 18 Met Glin Ile Val Lieu Ala Asn. Pro Arg Gly Phe Cys Ala Gly Val Asp 1 5 10 15 Arg Ala Ile Glu Ile Val Asp Glin Ala Ile Glu Ala Phe Gly Ala Pro 2O 25 30 Ile Tyr Val Arg His Glu Val Val His Asn Arg Thr Val Val Asp Gly 35 40 45 Leu Lys Gln Lys Gly Ala Val Phe Ile Glu Glu Lieu Ser Asp Val Pro 50 55 60 Val Gly Ser Tyr Leu Ile Phe Ser Ala His Gly Val Ser Lys Glu Val 65 70 75 8O Glin Glin Glu Ala Glu Glu Arg Glin Lieu. Thr Val Phe Asp Ala Thr Cys 85 90 95 Pro Leu Val Thr Lys Val His Met Glin Val Ala Lys His Ala Lys Glin 100 105 110 Gly Arg Glu Val Ile Lieu. Ile Gly His Ala Gly His Pro Glu Val Glu 115 120 125 Gly Thr Met Gly Glin Tyr Glu Lys Cys Thr Glu Gly Gly Gly Ile Tyr 130 135 1 4 0 Leu Val Glu Thr Pro Glu Asp Val Arg Asn Leu Lys Val Asn Asn Pro 145 15 O 155 160 Asn Asp Leu Ala Tyr Val Thr Glin Thr Thr Leu Ser Met Thr Asp Thr 1.65 170 175 Lys Wal Met Val Asp Ala Leu Arg Glu Glin Phe Pro Ser Ile Lys Glu 18O 185 190 Glin Lys Lys Asp Asp Ile Cys Tyr Ala Thr Glin Asn Arg Glin Asp Ala 195 200 2O5 Val His Asp Leu Ala Lys Ile Ser Asp Lieu. Ile Leu Val Val Gly Ser 210 215 220 Pro Asn. Ser Ser Asn. Ser Asn Arg Lieu Arg Glu Ile Ala Val Glin Lieu 225 230 235 240 Gly Lys Pro Ala Tyr Lieu. Ile Asp Thr Tyr Glin Asp Lieu Lys Glin Asp 245 250 255 Trp Leu Glu Gly Ile Glu Val Val Gly Val Thr Ala Gly Ala Ser Ala 260 265 27 O Pro Glu Val Lieu Val Glin Glu Val Ile Asp Glin Leu Lys Ala Trp Gly 275 280 285 Gly Glu Thir Thr Ser Val Arg Glu Asn Ser Gly Ile Glu Glu Lys Val 29 O 295 3OO Val Phe Ser Ile Pro Lys Glu Lieu Lys Lys His Met Glin Ala 305 310 315

<210 SEQ ID NO 19 &2 11s LENGTH 891 &212> TYPE DNA <213> ORGANISM: Methylomonas 16a <400 SEQUENCE: 19 atgagtaaat togaaag ccta cct gaccg to tdccaagaac gog to gagc g c gcgctggac 60 US 2003/0003528A1 Jan. 2, 2003 47

-continued gcc.cgtctgc ctd.ccgaaaa catactgcca caaac cittgc atcaggc.cat gcgctattoc 120 gtattgaacg gcggcaaacg. cacco gg.ccc ttgttgacitt atgcg accq g to aggctittg 18O ggcttgcc.gg aaaacgtgct g gatgc.gc.cg gcttgc.gcgg tagaattcat coatgtgitat 240 togctdattc acgacgatct gcc.ggccatg gacaacgatg atctg.cgcc g c ggcaaaccg 3OO acctgtcaca aggcttacga cq aggccacc gcc attittgg ccggcgacgc actgcaggcg 360 citggcc tittg aagttctggc caacg acccc ggcatcaccg to gatgcc.cc ggctogcctg 420 aaaatgatca cqgctttgac cc.gc.gc.ca.gc ggctdtcaag goatggtggg cqgtoaa.gc.c 480 atcgatctog gctcc.gtcgg cc.gcaaattg acgctg.ccgg aacto gaaaa catgcatat c 540 cacaag acto go.gc.cctgat cog cqccago gtcaatctgg cqg cattatc caaacco gat 600 citggatacitt gcgtogccaa gaaactggat cactatocca aatgcatagg cittgtcgttc 660 caggtoaaag acgacattct c gacatcgaa goc gacaccg cqacacticgg caag acticag 720 ggcaaggaca to gataacga caaac cqacc taccctd.cgc tattggg cat ggctggc gcc 78O aaacaaaaag cccaggaatt gcacgaacaa goagtcgaaa gottaacggg atttggcago 840 gaagcc gacc togctg.cgcga act atc.gctt tacatcatcg agcgcacgca c 891

<210> SEQ ID NO 20 &2 11s LENGTH 297 &212> TYPE PRT <213> ORGANISM: Methylomonas 16a <400 SEQUENCE: 20 Met Ser Lys Lieu Lys Ala Tyr Lieu. Thr Val Cys Glin Glu Arg Val Glu 1 5 10 15 Arg Ala Lieu. Asp Ala Arg Lieu Pro Ala Glu Asn. Ile Leu Pro Glin Thr 2O 25 30 Lieu. His Glin Ala Met Arg Tyr Ser Val Lieu. Asn Gly Gly Lys Arg Thr 35 40 45 Arg Pro Leu Lieu. Thr Tyr Ala Thr Gly Glin Ala Lieu Gly Lieu Pro Glu 50 55 60 Asn Val Leu Asp Ala Pro Ala Cys Ala Val Glu Phe Ile His Val Tyr 65 70 75 8O Ser Lieu. Ile His Asp Asp Lieu Pro Ala Met Asp Asn Asp Asp Leu Arg 85 90 95 Arg Gly Lys Pro Thr Cys His Lys Ala Tyr Asp Glu Ala Thr Ala Ile 100 105 110 Leu Ala Gly Asp Ala Lieu Glin Ala Lieu Ala Phe Glu Val Lieu Ala Asn 115 120 125 Asp Pro Gly Ile Thr Val Asp Ala Pro Ala Arg Lieu Lys Met Ile Thr 130 135 1 4 0 Ala Lieu. Thir Arg Ala Ser Gly Ser Glin Gly Met Val Gly Gly Glin Ala 145 15 O 155 160 Ile Asp Leu Gly Ser Val Gly Arg Lys Lieu. Thir Lieu Pro Glu Lieu Glu 1.65 170 175 Asn Met His Ile His Lys Thr Gly Ala Lieu. Ile Arg Ala Ser Val Asn 18O 185 190 Leu Ala Ala Leu Ser Lys Pro Asp Lieu. Asp Thr Cys Val Ala Lys Lys 195 200 2O5

US 2003/0003528A1 Jan. 2, 2003 49

-continued gcacacagog cct ggctgaa aaaag.ccaaa goc 1533

<210> SEQ ID NO 22 &2 11s LENGTH 511 &212> TYPE PRT <213> ORGANISM: Methylomonas 16a <400 SEQUENCE: 22 Met Ala Asn Thr Lys His Ile Ile Ile Val Gly Ala Gly Pro Gly Gly 1 5 10 15 Lieu. Cys Ala Gly Met Leu Lleu Ser Glin Arg Gly Phe Lys Wal Ser Ile 2O 25 30 Phe Asp Llys His Ala Glu Ile Gly Gly Arg Asn Arg Pro Ile Asn Met 35 40 45 Asn Gly Phe Thr Phe Asp Thr Gly Pro Thr Phe Leu Leu Met Lys Gly 50 55 60 Val Lieu. Asp Glu Met Phe Glu Lieu. Cys Glu Arg Arg Ser Glu Asp Tyr 65 70 75 8O Leu Glu Phe Leu Pro Leu Ser Pro Met Tyr Arg Lieu Lleu Tyr Asp Asp 85 90 95 Arg Asp Ile Phe Val Tyr Ser Asp Arg Glu Asn Met Arg Ala Glu Lieu 100 105 110 Glin Arg Val Phe Asp Glu Gly Thr Asp Gly Tyr Glu Gln Phe Met Glu 115 120 125 Gln Glu Arg Lys Arg Phe Asn Ala Leu Tyr Pro Cys Ile Thr Arg Asp 130 135 1 4 0 Tyr Ser Ser Lieu Lys Ser Phe Leu Ser Lieu. Asp Lieu. Ile Lys Ala Lieu 145 15 O 155 160 Pro Trp Leu Ala Phe Pro Lys Ser Val Phe Asn Asn Leu Gly Glin Tyr 1.65 170 175 Phe Asn Glin Glu Lys Met Arg Lieu Ala Phe Cys Phe Glin Ser Lys Tyr 18O 185 190 Leu Gly Met Ser Pro Trp Glu Cys Pro Ala Leu Phe Thr Met Leu Pro 195 200 2O5 Tyr Lieu Glu His Glu Tyr Gly Ile Tyr His Val Lys Gly Gly Lieu. Asn 210 215 220 Arg Ile Ala Ala Ala Met Ala Glin Val Ile Ala Glu Asn Gly Gly Glu 225 230 235 240 Ile His Lieu. Asn. Ser Glu Ile Glu Ser Lieu. Ile Ile Glu Asn Gly Ala 245 250 255 Ala Lys Gly Val Lys Lieu Glin His Gly Ala Glu Lieu Arg Gly Asp Glu 260 265 27 O Val Ile Ile Asn Ala Asp Phe Ala His Ala Met Thr His Leu Val Lys 275 280 285 Pro Gly Val Lieu Lys Lys Tyr Thr Pro Glu Asn Lieu Lys Glin Arg Glu 29 O 295 3OO Tyr Ser Cys Ser Thr Phe Met Leu Tyr Leu Gly Leu Asp Lys Ile Tyr 305 310 315 320 Asp Leu Pro His His Thr Ile Val Phe Ala Lys Asp Tyr Thr Thr Asn 325 330 335 Ile Arg Asn. Ile Phe Asp Asn Lys Thr Lieu. Thir Asp Asp Phe Ser Phe 340 345 350

US 2003/0003528A1 Jan. 2, 2003 51

-continued atcc.cgcacc to gaccc.cga caaactgctg accgc.cgagg attatto agc cittgcgc gag 1200 cgggtgctgg toaaactcga acgcatgggc ctdacggatt tacgc.caa.ca catcgtgacc 1260 gaagaatact ggacgcc.gct ggatattoag gCCaaatatt attcaaacca gggcticgatt 1320 tacggcgtgg togcc.gaccg cittcaaaaac citgggtttca agg caccitca acgcago agc 1380 gaattatcca atctgt attt cqtcggcggc agcgtcaatc cc.ggcgg.cgg catgc.cgatg 1440 gtgacgctgt cogggcaatt gotgagggiac aagattgttgg cqgatttgca a 1491

<210> SEQ ID NO 24 &2 11s LENGTH 497 &212> TYPE PRT <213> ORGANISM: Methylomonas 16a <400 SEQUENCE: 24 Met Asn. Ser Asn Asp Asn Glin Arg Val Ile Val Ile Gly Ala Gly Lieu 1 5 10 15 Gly Gly Lieu Ser Ala Ala Ile Ser Lieu Ala Thr Ala Gly Phe Ser Val 2O 25 30 Glin Lieu. Ile Glu Lys Asn Asp Llys Val Gly Gly Lys Lieu. Asn. Ile Met 35 40 45 Thr Lys Asp Gly Phe Thr Phe Asp Leu Gly Pro Ser Ile Leu Thr Met 50 55 60 Pro His Ile Phe Glu Ala Leu Phe Thr Gly Ala Gly Lys Asn Met Ala 65 70 75 8O Asp Tyr Val Glin Ile Gln Lys Val Glu Pro His Trp Arg Asin Phe Phe 85 90 95 Glu Asp Gly Ser Val Ile Asp Lieu. Cys Glu Asp Ala Glu Thr Glin Arg 100 105 110 Arg Glu Lieu. Asp Lys Lieu Gly Pro Gly Thr Tyr Ala Glin Phe Glin Arg 115 120 125 Phe Lieu. Asp Tyr Ser Lys Asn Lieu. Cys Thr Glu Thr Glu Ala Gly Tyr 130 135 1 4 0 Phe Ala Lys Gly Lieu. Asp Gly Phe Trp Asp Leu Lleu Lys Phe Tyr Gly 145 15 O 155 160 Pro Leu Arg Ser Lieu Lleu Ser Phe Asp Val Phe Arg Ser Met Asp Glin 1.65 170 175 Gly Val Arg Arg Phe Ile Ser Asp Pro Lys Lieu Val Glu Ile Lieu. Asn 18O 185 190 Tyr Phe Ile Lys Tyr Val Gly Ser Ser Pro Tyr Asp Ala Pro Ala Leu 195 200 2O5 Met Asn Leu Leu Pro Tyr Ile Glin Tyr His Tyr Gly Leu Trp Tyr Val 210 215 220 Lys Gly Gly Met Tyr Gly Met Ala Glin Ala Met Glu Lys Lieu Ala Wal 225 230 235 240 Glu Lieu Gly Val Glu Ile Arg Lieu. Asp Ala Glu Val Ser Glu Ile Glin 245 250 255 Lys Glin Asp Gly Arg Ala Cys Ala Wall Lys Lieu Ala Asn Gly Asp Wal 260 265 27 O Leu Pro Ala Asp Ile Val Val Ser Asn Met Glu Val Ile Pro Ala Met 275 280 285 Glu Lys Lieu Lieu Arg Ser Pro Ala Ser Glu Lieu Lys Lys Met Glin Arg 29 O 295 3OO US 2003/0003528A1 Jan. 2, 2003 52

-continued

Phe Glu Pro Ser Cys Ser Gly Lieu Val Lieu. His Leu Gly Wall Asp Arg 305 310 315 320 Leu Tyr Pro Gln Leu Ala His His Asn Phe Phe Tyr Ser Asp His Pro 325 330 335 Arg Glu His Phe Asp Ala Val Phe Lys Ser His Arg Lieu Ser Asp Asp 340 345 350 Pro Thr Ile Tyr Leu Val Ala Pro Cys Lys Thr Asp Pro Ala Glin Ala 355 360 365 Pro Ala Gly Cys Glu Ile Ile Lys Ile Leu Pro His Ile Pro His Lieu 370 375 38O Asp Pro Asp Llys Lieu Lieu. Thr Ala Glu Asp Tyr Ser Ala Lieu Arg Glu 385 390 395 400 Arg Val Lieu Val Lys Lieu Glu Arg Met Gly Lieu. Thr Asp Leu Arg Glin 405 410 415 His Ile Val Thr Glu Glu Tyr Trp Thr Pro Leu Asp Ile Glin Ala Lys 420 425 430 Tyr Tyr Ser Asn Gln Gly Ser Ile Tyr Gly Val Val Ala Asp Arg Phe 435 4 40 4 45 Lys Asn Lieu Gly Phe Lys Ala Pro Glin Arg Ser Ser Glu Lieu Ser Asn 450 455 460 Leu Tyr Phe Val Gly Gly Ser Val Asn Pro Gly Gly Gly Met Pro Met 465 470 475 480 Val Thr Leu Ser Gly Gln Leu Val Arg Asp Lys Ile Val Ala Asp Leu 485 490 495

Glin

<210> SEQ ID NO 25 &2 11s LENGTH 912 &212> TYPE DNA <213> ORGANISM: Pantoea stewartii

<400 SEQUENCE: 25 ttgacggtot gcgcaaaaaa acacgttcac cittactggca titt.cggctga gcagttgctg 60 gctgat atcg atago.cgcct to atcagtta citgcc.ggttc agggtgagc g g gattgttgtg 120 ggtgcc.gc.ga tigcgtgaagg cacgctggca ccgggcaaac gitattogtoc gatgctgctg 18O ttatta acag cqc.gc.gatct toggctgtgcg atcagtcacg ggggattact ggatttagcc 240 tgcgcggttgaaatggtgca togctoccitcg citgattctgg atgatatgcc citgcatggac 3OO gatgcgcaga tigcgtogggg gcgtoccacc attcacacgc agtacggtga acatgtggcg 360 attctggcgg cqgtogctitt acticagoa aa gogtttgggg to attgc.cga ggctgaaggt 420 citgacgc.cga tagccaaaac togcgcggtg toggagctgt ccactg.cgat togg catgcag 480 ggtotggttc agg gcc agitt taagg accitc. tcggaaggcg ataalacc cc g cagogcc gat 540 gccatact gc talaccalatca gtttaaaacc agcacgctgt tittgcgc.gtc. aacgcaaatg 600 gcqtcc attg cqgccaacgc gtc.ctg.cgaa gogcgtgaga acct gcatcg tittcticgctic 660 gatctoggcc aggcctttca gttgcttgac gatcttaccg atggcatcac cqataccggc 720 aaag acatca atcaggatgc aggtaaatca acgctggtoa atttattagg citcaggcgc.g 78O gtogaagaac goctogc gaca gcatttgcgc ctdgc.ca.gtgaac acct titc cqcgg catgc 840 caaaacggcc attccaccac coaactttitt attcaggcct g gtttgacaa aaaactc.gct 9 OO US 2003/0003528A1 Jan. 2, 2003 53

-continued gcc.gtoagtt aa 912

<210> SEQ ID NO 26 &2 11s LENGTH 303 &212> TYPE PRT <213> ORGANISM: Pantoea stewartii

<400 SEQUENCE: 26 Lieu. Thr Val Cys Ala Lys Lys His Val His Lieu. Thr Gly Ile Ser Ala 1 5 10 15 Glu Gln Leu Lieu Ala Asp Ile Asp Ser Arg Lieu. Asp Gln Leu Lleu Pro 2O 25 30 Val Glin Gly Glu Arg Asp Cys Val Gly Ala Ala Met Arg Glu Gly Thr 35 40 45 Leu Ala Pro Gly Lys Arg Ile Arg Pro Met Leu Lleu Lleu Lieu. Thir Ala 50 55 60 Arg Asp Leu Gly Cys Ala Ile Ser His Gly Gly Lieu Lleu. Asp Leu Ala 65 70 75 8O Cys Ala Val Glu Met Val His Ala Ala Ser Lieu. Ile Lieu. Asp Asp Met 85 90 95 Pro Cys Met Asp Asp Ala Gln Met Arg Arg Gly Arg Pro Thir Ile His 100 105 110 Thr Glin Tyr Gly Glu His Val Ala Ile Leu Ala Ala Val Ala Lieu Lieu 115 120 125 Ser Lys Ala Phe Gly Val Ile Ala Glu Ala Glu Gly Leu Thr Pro Ile 130 135 1 4 0 Ala Lys Thr Arg Ala Val Ser Glu Leu Ser Thr Ala Ile Gly Met Glin 145 15 O 155 160 Gly Lieu Val Glin Gly Glin Phe Lys Asp Leu Ser Glu Gly Asp Llys Pro 1.65 170 175 Arg Ser Ala Asp Ala Ile Leu Lieu. Thir Asn. Glin Phe Lys Thr Ser Thr 18O 185 190 Leu Phe Cys Ala Ser Thr Gln Met Ala Ser Ile Ala Ala Asn Ala Ser 195 200 2O5 Cys Glu Ala Arg Glu Asn Lieu. His Arg Phe Ser Lieu. Asp Leu Gly Glin 210 215 220 Ala Phe Glin Leu Lleu. Asp Asp Lieu. Thir Asp Gly Met Thr Asp Thr Gly 225 230 235 240 Lys Asp Ile Asin Glin Asp Ala Gly Lys Ser Thr Lieu Val Asn Lieu Lieu 245 250 255 Gly Ser Gly Ala Val Glu Glu Arg Lieu Arg Glin His Lieu Arg Lieu Ala 260 265 27 O Ser Glu His Leu Ser Ala Ala Cys Glin Asin Gly His Ser Thr Thr Glin 275 280 285 Leu Phe Ile Glin Ala Trp Phe Asp Llys Lys Lieu Ala Ala Wal Ser 29 O 295 3OO

<210 SEQ ID NO 27 &2 11s LENGTH 1296 &212> TYPE DNA <213> ORGANISM: Pantoea stewartii

<400 SEQUENCE: 27

US 2003/0003528A1 Jan. 2, 2003 55

-continued Phe Val Ser Val Ala Cys Ala Lieu Pro Leu Asn Arg Glu Pro Gly Lieu 130 135 1 4 0 Pro Leu Ala Val Met Pro Phe Glu Tyr Gly. Thir Ser Asp Ala Ala Arg 145 15 O 155 160 Glu Arg Tyr Thr Thr Ser Glu Lys Ile Tyr Asp Trp Lieu Met Arg Arg 1.65 170 175 His Asp Arg Val Ile Ala His His Ala Cys Arg Met Gly Lieu Ala Pro 18O 185 190 Arg Glu Lys Lieu. His His Cys Phe Ser Pro Leu Ala Glin Ile Ser Glin 195 200 2O5 Lieu. Ile Pro Glu Lieu. Asp Phe Pro Arg Lys Ala Leu Pro Asp Cys Phe 210 215 220 His Ala Val Gly Pro Leu Arg Gln Pro Glin Gly Thr Pro Gly Ser Ser 225 230 235 240 Thr Ser Tyr Phe Pro Ser Pro Asp Llys Pro Arg Ile Phe Ala Ser Leu 245 250 255 Gly Thr Lieu Glin Gly His Arg Tyr Gly Lieu Phe Arg Thr Ile Ala Lys 260 265 27 O Ala Cys Glu Glu Val Asp Ala Glin Lieu Lleu Lieu Ala His Cys Gly Gly 275 280 285 Leu Ser Ala Thr Glin Ala Gly Glu Lieu Ala Arg Gly Gly Asp Ile Glin 29 O 295 3OO Val Val Asp Phe Ala Asp Glin Ser Ala Ala Leu Ser Glin Ala Glin Lieu 305 310 315 32O Thir Ile Thr His Gly Gly Met Asn Thr Val Leu Asp Ala Ile Ala Ser 325 330 335 Arg Thr Pro Leu Lleu Ala Leu Pro Leu Ala Phe Asp Gln Pro Gly Val 340 345 350 Ala Ser Arg Ile Val Tyr His Gly Ile Gly Lys Arg Ala Ser Arg Phe 355 360 365 Thir Thr Ser His Ala Lieu Ala Arg Glin Ile Arg Ser Lieu Lieu. Thir Asn 370 375 38O Thr Asp Tyr Pro Glin Arg Met Thr Lys Ile Glin Ala Ala Lieu Arg Lieu 385 390 395 400 Ala Gly Gly Thr Pro Ala Ala Ala Asp Ile Val Glu Glin Ala Met Arg 405 410 415 Thr Cys Gln Pro Val Leu Ser Gly Glin Asp Tyr Ala Thr Ala Leu 420 425 430

<210 SEQ ID NO 29 &2 11s LENGTH 1149 &212> TYPE DNA <213> ORGANISM: Pantoea stewartii

<400 SEQUENCE: 29 atgcaa.ccgc actato atct cattctgg to ggtgcc.gg to tdgctaatgg ccttatcgc.g 60 citcc.ggctitc agcaa.cagoa toc ggatatgc ggat cittgc titatt gaggc gggtoctdag 120 gcqg gaggga accatacct g g to ctittcac gaa gaggatt taacgctgaa totago atc.gc 18O tggatagogc cqcttgtggit coatcactgg ccc.gact acc aggttcgttt cocccaacgc 240 cgtogc catg togalacagtgg citact actgc gtgacct coc ggcatttcgc cqggatactic 3OO cggcaa.cagt ttggacaa.ca tittatggctg cataccg.cgg titt cago.cgt to atgctgaa 360 US 2003/0003528A1 Jan. 2, 2003 56

-continued togg to cagt tag.cggatgg ccggattatt catgc.cagta cagtgatcga cqgacggggit 420 tacacgc.ctg attctgcact acgc.gtagga titcCagg cat ttatcgg to a ggagtggcaa. 480 citgagc.gc.gc cqcatggttt atcgtcaccg attat catgg atgcgacggit cqatcagdaa 540 aatggctacc gctttgttta taccctg.ccg cittitcc.gcaa cc.gcact got gatcgaagac 600 acacactaca ttgacaaggc taatcttcag gCC gaacggg cqcgtcagaa cattcgc gat 660 tatgctg.cgc gacagg gttg gcc gttacag acgttgctgc gggaagaac a gggtgcattg 720 cc cattacgt taacggg.cga taatcgtoag titttggcaac agcaa.ccgca agcct gtagc 78O ggattacgcg ccgggctgtt to atc.cgaca accggctact coctaccgct c gciggtggcg 840 citggcc gatc gtc.tcagogc gctggatgtg tttacct citt cotctgttca ccagacgatt 9 OO gctcactittg cccagoaacg ttggcago aa caggggttitt toc goatgct gaatcgcatg 96.O ttgtttittag ccggaccggc cqagt cacgc tiggcgtgttga tigcagogttt citatggctta 1020 ccc.gaggatt to attgcc.cg cittittatgcg ggaaaactca cc.gtgaccga toggctacgc 1080 attctgagcg gcaa.gc.cgcc cqttc.ccgtt titcgcgg cat tdcaggcaat tatgacg act 1140 catcgttga 1149

<210 SEQ ID NO 30 &2 11s LENGTH 382 &212> TYPE PRT <213> ORGANISM: Pantoea stewartii

<400 SEQUENCE: 30 Met Glin Pro His Tyr Asp Lieu. Ile Leu Val Gly Ala Gly Lieu Ala Asn 1 5 10 15 Gly Lieu. Ile Ala Leu Arg Lieu Glin Glin Gln His Pro Asp Met Arg Ile 2O 25 30 Leu Lieu. Ile Glu Ala Gly Pro Glu Ala Gly Gly Asn His Thir Trp Ser 35 40 45 Phe His Glu Glu Asp Lieu. Thir Lieu. Asn Gln His Arg Trp Ile Ala Pro 50 55 60 Leu Val Val His His Trp Pro Asp Tyr Glin Val Arg Phe Pro Glin Arg 65 70 75 8O Arg Arg His Val Asn Ser Gly Tyr Tyr Cys Val Thr Ser Arg His Phe 85 90 95 Ala Gly Ile Leu Arg Glin Glin Phe Gly Glin His Leu Trp Lieu. His Thr 100 105 110 Ala Val Ser Ala Wal His Ala Glu Ser Val Glin Leu Ala Asp Gly Arg 115 120 125 Ile Ile His Ala Ser Thr Val Ile Asp Gly Arg Gly Tyr Thr Pro Asp 130 135 1 4 0 Ser Ala Leu Arg Val Gly Phe Glin Ala Phe Ile Gly Glin Glu Trp Gln 145 15 O 155 160 Leu Ser Ala Pro His Gly Leu Ser Ser Pro Ile Ile Met Asp Ala Thr 1.65 170 175 Val Asp Glin Glin Asn Gly Tyr Arg Phe Val Tyr Thr Leu Pro Leu Ser 18O 185 190 Ala Thr Ala Lieu Lleu. Ile Glu Asp Thr His Tyr Ile Asp Lys Ala Asn 195 200 2O5 US 2003/0003528A1 Jan. 2, 2003 57

-continued Leu Glin Ala Glu Arg Ala Arg Glin Asn. Ile Arg Asp Tyr Ala Ala Arg 210 215 220 Glin Gly Trp Pro Leu Glin Thr Lieu Lieu Arg Glu Glu Glin Gly Ala Lieu 225 230 235 240 Pro Ile Thr Leu Thr Gly Asp Asn Arg Glin Phe Trp Gln Glin Glin Pro 245 250 255 Glin Ala Cys Ser Gly Leu Arg Ala Gly Leu Phe His Pro Thr Thr Gly 260 265 27 O Tyr Ser Lieu Pro Leu Ala Wall Ala Lieu Ala Asp Arg Lieu Ser Ala Lieu 275 280 285 Asp Val Phe Thr Ser Ser Ser Val His Glin Thir Ile Ala His Phe Ala 29 O 295 3OO Gln Glin Arg Trp Gln Glin Gln Gly Phe Phe Arg Met Leu Asn Arg Met 305 310 315 320 Leu Phe Lieu Ala Gly Pro Ala Glu Ser Arg Trp Arg Val Met Glin Arg 325 330 335 Phe Tyr Gly Lieu Pro Glu Asp Lieu. Ile Ala Arg Phe Tyr Ala Gly Lys 340 345 350 Lieu. Thr Val Thr Asp Arg Lieu Arg Ile Leu Ser Gly Lys Pro Pro Wal 355 360 365 Pro Val Phe Ala Ala Leu Glin Ala Ile Met Thr Thr His Arg 370 375 38O

<210> SEQ ID NO 31 &2 11s LENGTH 14.79 &212> TYPE DNA <213> ORGANISM: Pantoea stewartii

<400 SEQUENCE: 31 atgaaaccala citacggtaat tdgtgcgggc tittggtogcc togg cactggc aattic gttta 60 caggcc.gcag gtattoctot tittgctgctt gag cagogog acaa.gc.cggg togcc.gggct 120 tatgtttatc aggagcaggg citt tacttitt gatgcaggcc citaccgittat caccgatcc c 18O agcgcgattg aagaactgtt to citctggcc ggtaaa.cago ttaaggatta cqtcgagctg 240 ttgcc.ggtoa cqc.cgttitta togcctgtgc tigggagtccg gcaaggtott caattacgat 3OO aacgaccagg cccagttaga agcgcagata cagcagttta atcc.gc.gcga tigttgcgggit 360 tatcgagcgt to cittgacta titcgc.gtgcc gtattocaatgagggctatot gaagctoggc 420 actgtgccitt ttittatcgtt caaagacatg citt.cggg.ccg cqcco cagtt gocaaagctg 480 cagg catggc gcagogttta cagtaaagtt gcc.ggctaca ttgaggatga gcatctt.cgg 540 caggcgttitt cittittcactc gct cittagtg ggggggaatc cqtttgcaac citcgtcc att 600 tatacgctga titcacgc.gtt agaacgggaa togggg.cgtot gottt coacg cqgtggalacc 660 ggtgcgctgg toaatgg cat gatcaagctg titt caggatc toggcgg.cga agtcgtgctt 720 aacgc.ccggg toagtcatat ggaaaccqtt goggacaaga titcaggcc.gt gcagttggaa 78O gacggcagac ggtttgaaac citgcgcggtg gcgtogaacg citgatgttgt acatacctat 840 cgcgatctgc tigtc.to agca toccgcagcc gctaagcagg cqaaaaaact gcaatccaag 9 OO cgitatgagta acticactgtt totactictat tittggtotca accatcatca cqatcaactic 96.O gcc catcata cc.gtotgttt toggccacgc taccgtgaac to attcacga aatttittaac 1020 catgatgg to tdgctgagga tttitt.cgctt tatttacacg caccittgtgt cacggat.ccg. 1080 US 2003/0003528A1 Jan. 2, 2003 58

-continued to actogcac cqgaaggg to cqg cagot at tatgtgctgg cqcct gttcc acacttaggc 1140 acgg.cgaacc togactgggc ggtagaagga ccc.cg actogc gcgatcgitat ttittgacitac 1200 cittgagcaac attacatgcc togcttgcga agc.cagttgg to acgcaccg tatgtttacg 1260 cc.gttcgatt toc gogacga gct caatgcc td.gcaaggitt cqg ccttct c gttgaacct 1320 attctgaccc agagc.gc.ctg. gttcc gacca catalacc.gcg ataag cacat tdataatctt 1380 tatctggttg gcgcaggcac ccatcctggc gcggg cattc cc.ggcgtaat cqgctcqgcg 1440 aaggcgacgg caggcttaat gctggaggac citgatttga 1479

<210> SEQ ID NO 32 &2 11s LENGTH 492 &212> TYPE PRT <213> ORGANISM: Pantoea stewartii

<400 SEQUENCE: 32 Met Lys Pro Thr Thr Val Ile Gly Ala Gly Phe Gly Gly Leu Ala Leu 1 5 10 15 Ala Ile Arg Lieu Glin Ala Ala Gly Ile Pro Val Lieu Lleu Lieu Glu Glin 2O 25 30 Arg Asp Llys Pro Gly Gly Arg Ala Tyr Val Tyr Glin Glu Glin Gly Phe 35 40 45 Thr Phe Asp Ala Gly Pro Thr Val Ile Thr Asp Pro Ser Ala Ile Glu 50 55 60 Glu Lieu Phe Ala Leu Ala Gly Lys Glin Leu Lys Asp Tyr Val Glu Lieu 65 70 75 8O Leu Pro Val Thr Pro Phe Tyr Arg Leu Cys Trp Glu Ser Gly Lys Val 85 90 95 Phe Asn Tyr Asp Asn Asp Glin Ala Glin Leu Glu Ala Glin Ile Glin Glin 100 105 110 Phe Asn Pro Arg Asp Wall Ala Gly Tyr Arg Ala Phe Lieu. Asp Tyr Ser 115 120 125 Arg Ala Val Phe Asn Glu Gly Tyr Leu Lys Leu Gly Thr Val Pro Phe 130 135 1 4 0 Leu Ser Phe Lys Asp Met Leu Arg Ala Ala Pro Glin Lieu Ala Lys Lieu 145 15 O 155 160 Glin Ala Trp Arg Ser Val Tyr Ser Lys Wall Ala Gly Tyr Ile Glu Asp 1.65 170 175 Glu His Lieu Arg Glin Ala Phe Ser Phe His Ser Lieu Lieu Val Gly Gly 18O 185 190 Asn Pro Phe Ala Thr Ser Ser Ile Tyr Thr Leu Ile His Ala Leu Glu 195 200 2O5 Arg Glu Trp Gly Val Trp Phe Pro Arg Gly Gly. Thr Gly Ala Leu Val 210 215 220 Asn Gly Met Ile Lys Lieu Phe Glin Asp Leu Gly Gly Glu Val Val Lieu 225 230 235 240 Asn Ala Arg Val Ser His Met Glu Thr Val Gly Asp Lys Ile Glin Ala 245 250 255 Val Glin Leu Glu Asp Gly Arg Arg Phe Glu Thr Cys Ala Wall Ala Ser 260 265 27 O Asn Ala Asp Val Val His Thr Tyr Arg Asp Leu Lleu Ser Gln His Pro 275 280 285

US 2003/0003528A1 Jan. 2, 2003 60

-continued atgaagacgt atccaccc.cg toctoctoat citctggcago go.ccgatcta g 891

<210> SEQ ID NO 34 &2 11s LENGTH 296 &212> TYPE PRT <213> ORGANISM: Pantoea stewartii

<400 SEQUENCE: 34 Met Ala Val Gly Ser Lys Ser Phe Ala Thr Ala Ser Thr Leu Phe Asp 1 5 10 15 Ala Lys Thr Arg Arg Ser Val Lieu Met Leu Tyr Ala Trp Cys Arg His 2O 25 30 Cys Asp Asp Val Ile Asp Asp Glin Thr Lieu Gly Phe His Ala Asp Glin 35 40 45 Pro Ser Ser Gln Met Pro Glu Glin Arg Leu Gln Gln Leu Glu Met Lys 50 55 60 Thr Arg Glin Ala Tyr Ala Gly Ser Gln Met His Glu Pro Ala Phe Ala 65 70 75 8O Ala Phe Glin Glu Val Ala Met Ala His Asp Ile Ala Pro Ala Tyr Ala 85 90 95 Phe Asp His Leu Glu Gly Phe Ala Met Asp Val Arg Glu Thir Arg Tyr 100 105 110 Lieu. Thir Lieu. Asp Asp Thr Lieu Arg Tyr Cys Tyr His Val Ala Gly Val 115 120 125 Val Gly Leu Met Met Ala Glin Ile Met Gly Val Arg Asp Asn Ala Thr 130 135 1 4 0 Leu Asp Arg Ala Cys Asp Leu Gly Lieu Ala Phe Glin Lieu. Thir Asn. Ile 145 15 O 155 160 Ala Arg Asp Ile Val Asp Asp Ala Glin Val Gly Arg Cys Tyr Lieu Pro 1.65 170 175 Glu Ser Trp Leu Glu Glu Glu Gly Lieu. Thir Lys Ala Asn Tyr Ala Ala 18O 185 190 Pro Glu Asn Arg Glin Ala Leu Ser Arg Ile Ala Gly Arg Lieu Val Arg 195 200 2O5 Glu Ala Glu Pro Tyr Tyr Val Ser Ser Met Ala Gly Leu Ala Glin Leu 210 215 220 Pro Leu Arg Ser Ala Trp Ala Ile Ala Thr Ala Lys Glin Val Tyr Arg 225 230 235 240 Lys Ile Gly Val Lys Val Glu Glin Ala Gly Lys Glin Ala Trp Asp His 245 250 255 Arg Glin Ser Thr Ser Thr Ala Glu Lys Lieu. Thir Lieu Lleu Lieu. Thir Ala 260 265 27 O Ser Gly Glin Ala Val Thir Ser Arg Met Lys Thr Tyr Pro Pro Arg Pro 275 280 285 Ala His Leu Trp Glin Arg Pro Ile 29 O 295

<210 SEQ ID NO 35 &2 11s LENGTH 528 &212> TYPE DNA <213> ORGANISM: Pantoea stewartii

<400 SEQUENCE: 35 atgttgttgga tittggaatgc cct gatcgtg tttgtcaccg togtogg cat ggaagtggitt 60 US 2003/0003528A1 Jan. 2, 2003 61

-continued gctgcactgg cacataaata catcatgcac ggctggggitt goggctgg catctitt cacat 120 catgaaccgc gtaaaggcgc atttgaagtt aac gatctot atgcc.gtggt attc.gc.catt 18O gtgtcgattg ccctgattta citt.cggcagt acaggaatct ggcc.gct coa gtggattggit 240 gcaggcatca cc.gctitatgg titt actdt at tittatggtoc acgacgg act g g tacaccag 3OO cgctggcc.gt toc got acat accgc.gcaaa gqctacctga aacggittata catggcc cac 360 cgitatgcatc atgctgtaag g g gaaaagag ggctg.cgtgt cctittggttt totgtacgc.g 420 ccaccgittat citaaacttica gg.cgacgctg agagaaaggc atgcggctag atcggg.cgct 480 gccagagatg agcaggacgg g g toggatacg tottcatcc g g galagtaa 528

<210 SEQ ID NO 36 &2 11s LENGTH 175 &212> TYPE PRT <213> ORGANISM: Pantoea stewartii

<400 SEQUENCE: 36 Met Leu Trp Ile Trp Asn Ala Leu Ile Val Phe Val Thr Val Val Gly 1 5 10 15 Met Glu Val Val Ala Ala Leu Ala His Lys Tyr Ile Met His Gly Trp 2O 25 30 Gly Trp Gly Trp His Leu Ser His His Glu Pro Arg Lys Gly Ala Phe 35 40 45 Glu Val Asn Asp Leu Tyr Ala Val Val Phe Ala Ile Val Ser Ile Ala 50 55 60 Leu Ile Tyr Phe Gly Ser Thr Gly Ile Trp Pro Leu Gln Trp Ile Gly 65 70 75 8O Ala Gly Met Thr Ala Tyr Gly Leu Leu Tyr Phe Met Val His Asp Gly 85 90 95 Leu Val His Glin Arg Trp Pro Phe Arg Tyr Ile Pro Arg Lys Gly Tyr 100 105 110 Leu Lys Arg Lieu. Tyr Met Ala His Arg Met His His Ala Val Arg Gly 115 120 125 Lys Glu Gly Cys Val Ser Phe Gly Phe Leu Tyr Ala Pro Pro Leu Ser 130 135 1 4 0 Lys Lieu Glin Ala Thr Lieu Arg Glu Arg His Ala Ala Arg Ser Gly Ala 145 15 O 155 160 Ala Arg Asp Glu Glin Asp Gly Val Asp Thir Ser Ser Ser Gly Lys 1.65 170 175

<210 SEQ ID NO 37 &2 11s LENGTH 1599 &212> TYPE DNA <213> ORGANISM: Rhodococcus erythropolis AN12 <400 SEQUENCE: 37 gtgagcgcat ttctogacgc cqtcgtogto ggttc.cggac acaacgc.gct c gtttcggcc 60 gcqtat citcg cacgtgaggg ttggtogg to gaggttctog agaaggacac ggttctoggc 120 ggtgcc.gtct c gaccgtcga gcgattitc.cc ggataca agg togg accgggg gttcgtctg.cg 18O caccitcatga toc gacacag togcatcatc gaggaactcg gacitcgg.cgc gcacggc citt 240 cgctacatcg actgttgacco gtggg.cgttc gctcc.gc.ccg ccc.ctgg cac cqacggg.ccg. 3OO

US 2003/0003528A1 Jan. 2, 2003 63

-continued Arg Phe Val Ala Val Trp Ser Glu Arg Ser Arg His Val Met Lys Ala 130 135 1 4 0 Phe Ser Thr Pro Pro Thr Gly Ser Asn Leu Ile Gly Ala Phe Gly Gly 145 15 O 155 160 Leu Ala Thr Ala Arg Gly Asn. Ser Glu Lieu Ser Arg Glin Phe Leu Ala 1.65 170 175 Pro Gly Asp Ala Leu Lieu. Asp Glu Tyr Phe Asp Ser Glu Ala Lieu Lys 18O 185 190 Ala Ala Leu Ala Trp Phe Gly Ala Glin Ser Gly Pro Pro Met Ser Glu 195 200 2O5 Pro Gly. Thr Ala Pro Met Val Gly Phe Ala Ala Leu Met His Val Leu 210 215 220 Pro Pro Gly Arg Ala Val Gly Gly Ser Gly Ala Leu Ser Ala Ala Lieu 225 230 235 240 Ala Ser Arg Met Ala Wall Asp Gly Ala Thr Val Ala Leu Gly Asp Gly 245 250 255 Val Thr Ser Ile Arg Arg Asn Ser Asn His Trp Thr Val Thr Thr Glu 260 265 27 O Ser Gly Arg Glu Val His Ala Arg Lys Val Ile Ala Gly Cys His Ile 275 280 285 Lieu. Thir Thr Lieu. Asp Leu Lleu Gly Asn Gly Gly Phe Asp Arg Thr Thr 29 O 295 3OO Leu Asp His Trp Arg Arg Lys Ile Arg Val Gly Pro Gly Ile Gly Ala 305 310 315 32O Val Lieu Arg Lieu Ala Thr Ser Ala Lieu Pro Ser Tyr Arg Gly Asp Ala 325 330 335 Thir Thr Arg Glu Ser Thr Ser Gly Lieu Gln Leu Lleu Val Ser Asp Arg 340 345 350 Ala His Leu Arg Thr Ala His Gly Ala Ala Lieu Ala Gly Glu Lieu Pro 355 360 365 Pro Arg Pro Ala Val Leu Gly Met Ser Phe Ser Gly Ile Asp Pro Thr 370 375 38O Ile Ala Pro Ala Gly Arg His Glin Val Thr Leu Trp Ser Glin Trp Gln 385 390 395 400 Pro Tyr Arg Lieu Ser Gly. His Arg Asp Trp Ala Ser Val Ala Glu Ala 405 410 415 Glu Ala Asp Arg Ile Val Gly Glu Met Glu Ala Phe Ala Pro Gly Phe 420 425 430 Thr Asp Ser Val Lieu. Asp Arg Phe Ile Glin Thr Pro Arg Asp Ile Glu 435 4 40 4 45 Ser Glu Leu Gly Met Ile Gly Gly Asn Val Met His Val Glu Met Ser 450 455 460 Leu Asp Gln Met Met Leu Trp Arg Pro Leu Pro Glu Lieu Ser Gly. His 465 470 475 480 Arg Val Pro Gly Ala Asp Gly Lieu. Tyr Lieu. Thr Gly Ala Ser Thr His 485 490 495 Pro Gly Gly Gly Val Ser Gly Ala Ser Gly Arg Ser Ala Ala Arg Ile 5 OO 505 510 Ala Leu Ser Asp Ser Arg Arg Gly Lys Ala Ser Glin Trp Met Arg Arg 515 52O 525

Ser Ser Arg Ser US 2003/0003528A1 Jan. 2, 2003 64

-continued

530

SEQ ID NO 39 LENGTH 30 TYPE DNA ORGANISM: Methylomonas 16a

<400 SEQUENCE: 39 cc.gagtactgaag.cgg gttt ttgcagggag 30

SEQ ID NO 40 LENGTH 25 TYPE DNA ORGANISM: Methylomonas 16a <400 SEQUENCE: 40 gggctagotg citc.cgattgt tacag 25

SEQ ID NO 41 LENGTH 38 TYPE DNA ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: primer, derived from Rhodococcus erythropolis AN12

<400 SEQUENCE: 41 agcagotago ggaggaataa accatgagcg catttct c 38

SEQ ID NO 42 LENGTH 26 TYPE DNA ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: primer, derived from Rhodococcus erythropolis AN12

<400 SEQUENCE: 42 gacitagtcac gacctgctic g aacgac 26

SEQ ID NO 43 LENGTH 25 TYPE DNA ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: primer <400 SEQUENCE: 43 atgacggtot gcgcaaaaaa acacg 25

SEQ ID NO 44 LENGTH 2.8 TYPE DNA ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: primer <400 SEQUENCE: 44 gagaaattat gttgttggatt toggaatgc 28

SEQ ID NO 45 LENGTH 19 TYPE DNA US 2003/0003528A1 Jan. 2, 2003 65

-continued ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: primer <400 SEQUENCE: 45 gagtttgatc ctdgcticag 19

SEQ ID NO 46 LENGTH 16 TYPE DNA ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: primer <400 SEQUENCE: 46 taccttgtta cq actt 16

SEQ ID NO 47 LENGTH 17 TYPE DNA ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: primer NAME/KEY: misc feature LOCATION: (11) . . (11) OTHER INFORMATION: Y = C or T NAME/KEY: misc feature LOCATION: (12) ... (12) OTHER INFORMATION: M = A or C

SEQUENCE: 47 gtgc.ca.gcag ymg.cggit 17

SEQ ID NO 48 LENGTH 21 TYPE DNA ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: primer <400 SEQUENCE: 48 atgagcgcat ttctogacgc c 21

SEQ ID NO 49 LENGTH 2.0 TYPE DNA ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: primer <400 SEQUENCE: 49 toac gacctg. citcgaacgac 20

SEQ ID NO 50 LENGTH 50 TYPE DNA ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: primer <400 SEQUENCE: 50 gaga attggc tigaaaaacca aataaataac aaaatttagc gagtaaatgg 5 O

SEQ ID NO 51 US 2003/0003528A1 Jan. 2, 2003 66

-continued

&2 11s LENGTH 50 &212> TYPE DNA <213> ORGANISM: Artificial Sequence &220s FEATURE <223> OTHER INFORMATION: primer <400 SEQUENCE: 51 ttcaattgac aggggggcto gttctgattt agagttgctg. ccagotttitt 5 O

<210> SEQ ID NO 52 &2 11s LENGTH 50 &212> TYPE DNA <213> ORGANISM: Artificial Sequence &220s FEATURE <223> OTHER INFORMATION: primer <400 SEQUENCE: 52 gggttgtcca gatgttggtg agcggtoctit atalactataa citgtaacaat 5 O

<210 SEQ ID NO 53 &2 11s LENGTH 50 &212> TYPE DNA <213> ORGANISM: Artificial Sequence &220s FEATURE <223> OTHER INFORMATION: primer <400 SEQUENCE: 53 ttaatggtot toccatgaga tigtgctoc ga ttgttacagt tatagittata 5 O

<210> SEQ ID NO 54 &2 11s LENGTH 50 &212> TYPE DNA <213> ORGANISM: Artificial Sequence &220s FEATURE <223> OTHER INFORMATION: primer <400 SEQUENCE: 54 ccccctgtca attgaaag.cc cgc.catttac togctaaatt ttgttattta 5 O

<210 SEQ ID NO 55 <211& LENGTH 22 &212> TYPE DNA <213> ORGANISM: Artificial Sequence &220s FEATURE <223> OTHER INFORMATION: primer <400 SEQUENCE: 55 aaggat.ccgc gtattogtac to 22

<210 SEQ ID NO 56 <211& LENGTH: 40 &212> TYPE DNA <213> ORGANISM: Artificial Sequence &220s FEATURE <223> OTHER INFORMATION: primer <400 SEQUENCE: 56 citggat.ccga totagaaata ggcto gagtt gtc gttcagg 40

<210 SEQ ID NO 57 &2 11s LENGTH 30 &212> TYPE DNA <213> ORGANISM: Artificial Sequence &220s FEATURE US 2003/0003528A1 Jan. 2, 2003 67

-continued <223> OTHER INFORMATION: primer <400 SEQUENCE: 57 aaggat.ccita citcgagctga catcagtgct 30

SEQ ID NO 58 LENGTH 22 TYPE DNA ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: primer <400 SEQUENCE: 58 gctctagatg caaccagaat cq 22

SEQ ID NO 59 LENGTH 24 TYPE DNA ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: primer <400 SEQUENCE: 59 tggcto gaga gtaaaacact caag 24

SEQ ID NO 60 LENGTH 19 TYPE DNA ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: primer <400 SEQUENCE: 60 tagcto gagt cacgcttgc 19

What is claimed is: coccus, Nocardia, Arthrobacter, Rhodopseudomonas, 1. A method for the production of a carotenoid compound Pseudomonas, Candida, Hansenula, Pichia, Torulopsis, and comprising: Rhodotorula 4. A method according to claim 3 wherein C1 metaboliz (a) providing a transformed C1 metabolizing host cell ing host is a methanotroph. comprising: 5. A method according to claim 4 wherein the methan otroph is a methanotroph Selected from the group consisting (i) Suitable levels of isopentenyl pyrophosphate; and of Methylomonas, Methylobacter, Mehtylococcus, Methy losinus, Methylocyctis, Methylomicrobium, and Methano (ii) at least one isolated nucleic acid molecule encoding OS. an enzyme in the carotenoid biosynthetic pathway 6. A method according to claim 2 wherein the C1 carbon under the control of Suitable regulatory Sequences, Substrate is Selected from the group consisting of methane (b) contacting the host cell of step (a) under Suitable and methanol and the C1 metabolizing host cell is a metha growth conditions with an effective amount of a C1 notroph Selected from the group consisting of Methylomo carbon Substrate whereby an carotenoid compound is nas, Methylobacter, Mehtylococcus, Methylosinus, Methy produced. locyctis, Methylomicrobium, and Methanomonas. 7. A method according to claim 6 wherein the methan 2. A method according to claim 1 wherein the C1 carbon otroph is a high growth methanotrophic Strain which com Substrate is Selected from the group consisting of methane, prises a functional Embden-Meyerhof carbon pathway, Said methanol, formaldehyde, formic acid, methylated amines, pathway comprising a gene encoding a pyrophosphate methylated thiols, and carbon dioxide. dependent phosphofructokinase enzyme. 3. A method according to claim 1 wherein the C1 metabo lizing host cell is a methylotroph Selected from the group 8. A method according to claim 7 wherein the gene consisting of Methylomonas, Methylobacter, Mehtylococ encoding a pyrophosphate dependent phosphofructokinase cus, Methylosinus, Methylocyctis, Methylomicrobium, enzyme is Selected from the group consisting of: Methanomonas, Methylophilus, Methylobacillus, Methylo (a) an isolated nucleic acid molecule encoding the amino bacterium, Hyphomicrobium, Xanthobacter, Bacillus, Para acid sequence as set forth in SEQ ID NO:2; US 2003/0003528A1 Jan. 2, 2003 68

(b) an isolated nucleic acid molecule that hybridizes with enzyme encodes a phytoene desaturase Selected from the (a) under the following hybridization conditions: 0.1x group consisting of Genbank Acc if AB046992, AFO39585, SSC, 0.1% SDS, 65° C. and washed with 2x SSC, 0.1% AF049356, AF139916, AF218415AF251014, AF364515, SDS followed by 0.1x SSC, 0.1% SDS; D58420, D83514, L16237, L37405, M64704, M88683, (c) an isolated nucleic acid molecule comprising a first S71770, U37285, U46919, U62808, X55289, X59948, nucleotide Sequence encoding a polypeptide of at least X62574, X68058, X71023, X78271, X78434, X78815, 437 amino acids that has at least 63% identity based on X86783, Y14807, Y15007, Y15112, Y15114, and Z11165 the Smith-Waterman method of alignment when com 19. A method according to claim 13 wherein the phytoene pared to a polypeptide having the Sequence as Set forth desaturase as the amino acid Sequence as Set forth in SEQID in SEQ ID NO:2; and NO:32 20. A method according to claim 13 wherein the isolated (d) an isolated nucleic acid molecule that is complemen nucleic acid molecule encoding a carotenoid biosynthetic tary to (a), (b) or (c). enzyme encodes a lycopene cyclase Selected from the group 9. A method according to claim 7 wherein the high growth consisting of Genbank Acc # AF139916, AF152246, methanotrophic bacterial Strain optionally contains at least AF218415, AF272737, AJ133724, AJ250827, AJ276965, one gene encoding a fructose bisphosphate aldolase enzyme. D58420, D83513, L40176, M87280, U50738, U50739 10. A method according to claim 7 wherein the high U62808, X74599, X81787, X86221, X86452, X95596, and growth methanotrophic bacterial Strain optionally contains a X987.96. functional Entner-Douderoff carbon pathway. 21. A method according to claim 13 wherein the lycopene 11. A method according to claim 8 wherein the high cyclase as the amino acid Sequence as Set forth in SEQ ID growth methanotrophic bacterial Strain optionally contains NO:30 at least one gene encoding a keto-deoxy phosphogluconate 22. A method according to claim 13 wherein the isolated aldolase. nucleic acid molecule encoding a carotenoid biosynthetic 12. A method according to claim 9 wherein the high enzyme encodes a B-caroteine hydroxylase Selected from the growth methanotrophic bacterial Strain is methylomonas 16a group consisting of Genbank Acc # D58420, D58422, having the ATCC designation ATCC PTA 2402. D90087, M87280, U62808, Y15112, 13. A method according to claim 1 wherein the isolated 23. A method according to claim 13 wherein 3-caroteine nucleic acid molecule encodes a carotenoid biosynthetic hydroxylase as the amino acid Sequence as Set forth in SEQ enzyme Selected from the group consisting of geranylgera ID NO:36 nyl pyrophosphate (GGPP) synthase, phytoene Synthase, 24. A method according to claim 13 wherein the isolated phytoene desaturase, lycopene cyclase, 3-caroteine hydroxy nucleic acid molecule encoding a carotenoid biosynthetic lase, Zeaxanthin glucosyl transferase, B-caroteine ketolase, enzyme encodes Zeaxanthin glucosyl transferase Selected B-caroteine C-4 oxygenase, B-carotene desaturase, Spheroi from the group consisting of Genbank Acc #. D90087, dene monooxygenase, caroteine hydratase, carotenoid 3,4- M87280, and M90698. desaturase, 1-OH-carotenoid methylase, farnesyl diphos 25. A method according to claim 13 wherein zeaxanthin phate Synthetase, and diapophytoene dehydrogenase. glucosyltransferase as the amino acid Sequence as Set forth 14. A method according to claim 13 wherein the isolated nucleic acid molecule encoding a carotenoid biosynthetic in SEO ID NO:28 enzyme encodes a geranylgeranyl pyrophosphate (GGPP) 26. A method according to claim 13 wherein the isolated Synthase Selected from the group consisting of Genbank Acc nucleic acid molecule encoding a carotenoid biosynthetic #. AB000835, AB 016043 AB019036, AB027705, enzyme encodes a B-caroteine ketolase Selected from the AB027706, AB016044, ABO34249, AB034250, AF020041, group consisting of Genbank Acc #. AF218415, D45881, AFO49658, AF049659, AF139916, AF2798.07, AF279808, D58420, D58422, X86782, and Y15112. AJO10302, AJ133724, AJ276129, D85029, L25813, 27. A method according to claim 13 wherein the isolated L37405, U15778, U44876, X92893, X95596, X98795, and nucleic acid molecule encoding a carotenoid biosynthetic Y15112 enzyme encodes a B-caroteine ketolase having the amino 15. A method according to claim 13 wherein the gera acid sequence as set for the in SEQ ID NO:38. nylgeranyl pyrophosphate (GGPP) synthase as the amino 28. A method according to claim 13 wherein the isolated acid sequence as set forth in SEQ ID NO:.26. nucleic acid molecule encoding a carotenoid biosynthetic 16. A method according to claim 13 wherein the isolated enzyme encodes a B-carotene C-4 oxygenase Selected from nucleic acid molecule encoding a carotenoid biosynthetic the group consisting of Genbank Acc #. X86782, and enzyme encodes a phytoene Synthase Selected from the Y15112. group consisting of Genbank Acc # AB001284, AB032797, 29. A method according to claim 13 wherein the isolated AB034704, AB037975, AF009954, AF139916, AF152892, nucleic acid molecule encoding a carotenoid biosynthetic AF218415, AF220218, AJO10302, AJ133724, AJ278287, enzyme encodes a B-caroteine desaturase Selected from the AJ304825, AJ308385, D58420, L23424, L25812, L37405, group consisting of Genbank Acc #. AF047490, AF121947, M38424, M87280, S71770, U32636, U62808, U87626, AF139916, AF195507, AF272737, IFO13350, AF372617, U91900, X52291, X60441, X63873, X68017, X69172, and AJ133724, AJ224683, D26095 U38550, X89897, Y15115, X78814. and PCC7210, 17. A method according to claim 13 wherein the phytoene 30. A method according to claim 13 wherein the isolated Synthase as the amino acid Sequence as Set forth in SEQ ID nucleic acid molecule encoding a carotenoid biosynthetic NO:34. enzyme encodes a spheroidene monooxygenase Selected 18. A method according to claim 13 wherein the isolated from the group consisting of Genbank Acc #. AJO10302, nucleic acid molecule encoding a carotenoid biosynthetic Z11165, and X52291. US 2003/0003528A1 Jan. 2, 2003

31. A method according to claim 13 wherein the isolated 45. A method according to claim 40 wherein Said gene nucleic acid molecule encoding a carotenoid biosynthetic encoding a 2C-methyl-d-erythritol 2,4-cyclodiphosphate enzyme encodes a caroteine hydratase Selected from the Synthase (IspF), encodes a polypeptide as set forth in SEQ group consisting of Genbank Acc #. AB034704, AF195122, ID NO:14. AJO10302, AF287480, U73944, X52291, Z 1165, and 46. A method according to claim 40 wherein Said gene Z21955. encoding a CTP synthase (PyrC) encodes a polypeptide as 32. A method according to claim 13 wherein the isolated Set forth in SEO ID NO:16. nucleic acid molecule encoding a carotenoid biosynthetic 47. A method according to claim 40 wherein Said gene enzyme encodes a carotenoid 3,4-desaturase Selected from encoding an activity for the production of dimethylallyl the group consisting of Genbank Acc #. AJO10302, X63204 diphosphate (lytB) encodes a polypeptide as set forth in SEQ U73944, X52291, and Z11165, ID NO:18. 33. A method according to claim 13 wherein the isolated 48. A method according to claim 1 wherein the carotenoid nucleic acid molecule encoding a carotenoid biosynthetic compound is Selected form the group consisting of Anther enzyme encodes a 1-OH-carotenoid methylase Selected aXanthin, adonixanthin, AStaxanthin, Canthaxanthin, cap from the group consisting of Genbank Acc #. AB034704, Sorubrin, 3-cryptoXanthin alpha-caroteine, beta-caroteine, AF288602, AJO10302, X52291 and Z11165. epsilon-caroteine, echinenone, gamma-carotene, Zeta-caro 34. A method according to claim 13 wherein the isolated tene, alpha-cryptoxanthin, diatoxanthin, 7,8-didehydroast nucleic acid molecule encoding a carotenoid biosynthetic aXanthin, fucoxanthin, fucoxanthinol, isorenieratene, lactu enzyme encodes a farnesyl diphosphate Synthetase Selected caXanthin, lutein, lycopene, neoxanthin, neuroSporene, from the group consisting of Genbank Acc #. AB003 187, hydroxyneurosporene, peridinin, phytoene, rhodopin, ABO16094, AB021747, AB028044, AB028046, AB028047, rhodopin glucoside, SiphonaXanthin, Spheroidene, Spheroi AF112881, AF136602, AF384040, D00694, D13293, denone, Spirilloxanthin, uriolide, uriolide acetate, violaxan D85317, X75789, Y12072, Z49786, U80605, X76026, thin, ZeaXanthin-3-diglucoside, and Zeaxanthin. X82542 X82543, AF234168, L46349, L46350, L46367, 49. A method for the over-production of carotenoid pro M89945, NM 002004, U36376, XM 034497, duction in a transformed C1 metabolizing host comprising: XM 034498, XM 034499, and XM 034500. (a) providing a transformed Cl metabolizing host cell 35. A method according to claim 13 wherein the isolated comprising: nucleic acid molecule encoding a carotenoid biosynthetic enzyme encodes a farnesyl diphosphate Synthetase having (i) Suitable levels of isopentenyl pyrophosphate; and the amino acid sequence as set forth in SEQ ID NO:20. 36. A method according to claim 13 wherein the isolated (ii) at least one isolated nucleic acid molecule encoding nucleic acid molecule encoding a carotenoid biosynthetic an enzyme in the carotenoid biosynthetic pathway enzyme encodes a diapophytoene dehydrogenase enzyme as under the control of Suitable regulatory Sequences, described by Genbank Acc #. X73889. and 37. A method according to claim 13 wherein the isolated (iii) either: nucleic acid molecule encoding a carotenoid biosynthetic enzyme encodes a diapophytoene dehydrogenase enzyme 1) multiple copies of at least one gene encoding an having the amino acid Sequence Selected from the group enzyme Selected from the group consisting of consisting of SEQ ID NO:22 and SEQ ID NO:24. D-1-deoxyxylulose-5-phosphate Synthase (DXS), 38. A method according to claim 1 wherein said metha D-1-deoxyxylulose-5-phosphate reductoi notrophic bacteria is methylomonas 16a ATCC PTA 2402. Somerase (DXr), 2C-methyl-d-erythritol cytidylyl 39. A method according to claim 1 wherein the Suitable transferase (IspD), 4-diphosphocytidyl-2-C-meth levels of isopentenyl pyrophosphate are provided by the ylerythritol kinase (IspE), 2C-methyl-d-erythritol expression heterologus upper pathway isoprenoid pathway 2,4-cyclodiphosphate synthase (IspF), CTP syn geneS. thase (PyrC) lytB and gcpE; or 40. A method according to claim 39 wherein said upper pathway isoprenoid genes are Selected from the group 2) at least one gene encoding an enzyme Selected consisting of D-1-deoxyxylulose-5-phosphate Synthase from the group consisting of D-1-deoxyxylulose (DXS), D-1-deoxyxylulose-5-phosphate reductoisomerase 5-phosphate Synthase (DXS), D-1-deoxyxylulose (DXr), 2C-methyl-d-erythritol cytidylyltransferase (IspD), 5-phosphate reductoisomerase (DXr), 2C-methyl 4-diphosphocytidyl-2-C-methylerythritol kinase (IspE), d-erythritol cytidylyltransferase (IsplD), 2C-methyl-d-erythritol 2,4-cyclodiphosphate Synthase 4-diphosphocytidyl-2-C-methylerythritol kinase (IspF), CTP synthase (PyrC), lytB, and GcpE. (IspE), 2C-methyl-d-erythritol 2,4-cyclodiphos 41. A method according to claim 40 wherein Said gene phate synthase (IspF), CTP synthase (PyrC), lytB encoding a D-1 -deoxyxylulose-5-phosphate Synthase and gcpE operably linked to a strong promoter. (DxS), encodes a polypeptide as set forth in SEQ ID NO:6 42. A method according to claim 40 wherein Said gene (b) contacting the host cell of Step (a) under Suitable encoding a D-1-deoxyxylulose-5-phosphate reductoi growth conditions with an effective amount of a C1 Somerase (DXr), encodes a polypeptide as set forth in SEQ carbon Substrate whereby a carotenoid compound is ID NO:8. Over-produced. 43. A method according to claim 40 wherein Said gene 50. A method according to claim 49 wherein the at least encoding a 2C-methyl-d-erythritol cytidylyltransferase one gene encoding an enzyme of either part (a)(iii)(1) or (IsplD), encodes a polypeptide as set forth in SEQID NO:10. (a)(iii)(2) encodes an enzyme Selected from the group con 44. A method according to claim 40 wherein Said gene sisting of SEQ ID NO:6, 8, 10, 12, 14, 16, and 18. encoding a 4-diphosphocytidyl-2-C-methylerythritol kinase (IspE), encodes a polypeptide as set forth in SEQID NO:12. k k k k k