US007252985B2

(12) United States Patent
Cheng et al.

(10) Patent No.: US 7,252,985 B2
(45) Date of Patent: Aug. 7, 2007

(54) CAROTENOID KETOLASES

(75) Inventors: Qiong Cheng, Wilmington, DE (US); Luan Tao, Claymont, DE (US); Henry Yao, Boothwyn, PA (US)

(73) Assignee: E. I. du Pont de Nemours and Company, Wilmington, DE (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 326 days.

(21) Appl. No.: 11/015,433

(22) Filed: Dec. 17, 2004

(65) Prior Publication Data
US 2005/0227311 A1, Oct. 13, 2005

Related U.S. Application Data
(60) Provisional application No. 60/531,310, filed on Dec. 19, 2003.

(51) Int. Cl.
C12N 1/20 (2006.01)
C12N 9/02 (2006.01)
C07H 21/04 (2006.01)

(52) U.S. Cl.: 435/252.3; 435/189; 435/320.1; 435/148; 435/410; 435/254.3; 435/255.2; 435/255.5; 536/23.2

(58) Field of Classification Search: 435/252.3, 435/189, 148, 410, 254.3, 255.2, 255.5; 536/23.2
See application file for complete search history.

(56) References Cited

U.S. PATENT DOCUMENTS
6,150,130 A, 11/2000, Misawa et al.
6,551,807 B1, 4/2003, Cunningham
2003/0087337 A1, 5/2003, Giraud et al.

FOREIGN PATENT DOCUMENTS
WO 02/079395 A2, 10/2002

OTHER PUBLICATIONS
Takakazu Kaneko et al., Complete Genomic Sequence of the Filamentous Nitrogen-fixing Cyanobacterium Anabaena sp. Strain PCC 7120, DNA Research, vol. 8:205-213, 2001.
Blanca Fernandez-Gonzalez et al., A New Type of Asymmetrically Acting beta-Carotene Ketolase Is Required for the Synthesis of Echinenone in the Cyanobacterium Synechocystis sp. PCC 6803, J. Biol. Chem., vol. 272(15):9728-9733, 1997.
Norihiko Misawa et al., Metabolic engineering for the production of carotenoids in non-carotenogenic bacteria and yeasts, J. of Biotech., vol. 59:169-181, 1998.
National Center for Biotechnology Information General Identifier No. 5912291, Accession No. Y15112, Sep. 15, 1999, M. Harker et al., Carotenoid biosynthesis genes in the bacterium Paracoccus marcusii MH1.
National Center for Biotechnology Information General Identifier No. 2654317, Accession No. X86782, Sep. 9, 2004, M. Harker et al., Biosynthesis of ketocarotenoids in transgenic cyanobacteria expressing the algal gene for beta-C-4-oxygenase, crtO.
National Center for Biotechnology Information General Identifier No. 903298, Accession No. D58422, N. Misawa et al., Canthaxanthin biosynthesis by the conversion of methylene to keto groups in a hydrocarbon beta-carotene by a single gene, Biochem. Biophys. Res. Commun. 209(3), 867-876 (1995).
National Center for Biotechnology Information General Identifier No. 61629280, Accession No. D58420, N. Misawa et al., Canthaxanthin biosynthesis by the conversion of methylene to keto groups in a hydrocarbon beta-carotene by a single gene, Biochem. Biophys. Res. Commun. 209(3), 867-876 (1995).
National Center for Biotechnology Information General Identifier No. 1136838, Accession No. D45881, Feb. 10, 1999, S. Kajiwara et al., Isolation and functional identification of a novel cDNA for astaxanthin biosynthesis from Haematococcus pluvialis, and astaxanthin synthesis in Escherichia coli.
National Center for Biotechnology Information General Identifier No. 8650414, Accession No. AF218415, L. Hannibal et al., Isolation and characterization of canthaxanthin biosynthesis genes from the photosynthetic bacterium Bradyrhizobium sp. strain ORS278, J. Bacteriol. 182(13), 3850-3853 (2000).
H. J. Nelis et al., Principal carotenoids permitted as additives in foods and feeds, J. of Applied Bacteriology, vol. 70:181-191, 1991.
Laure Hannibal et al., Isolation and Characterization of Canthaxanthin Biosynthesis Genes from the Photosynthetic Bacterium Bradyrhizobium sp. Strain ORS278, J. of Bacteriology, vol. 182(13):3850-3853, Jul. 2000.

Primary Examiner: Tekchand Saidha

(57) ABSTRACT

Novel CrtW carotenoid ketolases are provided that are useful for the production of ketocarotenoids. The ketolase genes of the present invention exhibit low homology in comparison to other CrtW ketolases previously reported. Expression of the carotenoid ketolases in heterologous hosts enabled production of canthaxanthin and astaxanthin. Coexpression experiments using divergent crtW genes resulted in increased production of the desired ketocarotenoids.

10 Claims, 7 Drawing Sheets
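The "low homology" between divergent crtW genes that the abstract relies on is quantified in the specification as less than 65% nucleotide sequence identity between coexpressed genes. The Python sketch below is only an illustration of how such a threshold might be checked for two sequences that have already been aligned. The function names, the choice to skip gapped columns, and the toy sequences are assumptions made for this example; the patent does not prescribe this calculation, and a real comparison would use an established alignment program.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption for this sketch: both inputs are pre-aligned strings of equal
    length and '-' marks a gap. Gapped columns are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def is_divergent_pair(aligned_a: str, aligned_b: str, threshold: float = 65.0) -> bool:
    """True when identity falls below the threshold (65% per the specification)."""
    return percent_identity(aligned_a, aligned_b) < threshold


if __name__ == "__main__":
    # Toy sequences, not real crtW data.
    gene_1 = "ATGACCGTTA"
    gene_2 = "ATGTCAGATA"
    print(percent_identity(gene_1, gene_2))
    print(is_divergent_pair(gene_1, gene_2))
```

Ignoring gapped columns is one of several conventions for computing identity; counting gaps as mismatches would give lower values, so the convention should be stated whenever a threshold such as 65% is applied.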

[Drawing Sheets 2 through 7 of 7: HPLC chromatograms, not reproduced.]
[Figure 3 panels: DC18 CrtW; DC263 CrtW; K1-202C CrtW.]
[Figure 4 panels: 10G(pDCQ335); 10G(pDCQ335) + DC263 crtW; 10G(pDCQ335) + K1-202C crtW.]
[Figure 5 panels: MWM1200(pDCQ340); MWM1200(pDCQ341); vertical axis in mAU.]

CAROTENOID KETOLASES

This application claims the benefit of U.S. Provisional Application No. 60/531,310, filed Dec. 19, 2003.

FIELD OF THE INVENTION

This invention is in the field of microbiology and molecular biology. More specifically, this invention pertains to nucleic acid fragments encoding enzymes useful for microbial production of cyclic ketocarotenoid compounds.

BACKGROUND OF THE INVENTION

Carotenoids are pigments that are ubiquitous throughout nature and synthesized by all photosynthetic organisms, and in some heterotrophic growing bacteria and fungi. Carotenoids provide color for flowers, vegetables, insects, fish and birds. Colors of carotenoids range from yellow to red with variations of brown and purple. As precursors of vitamin A, carotenoids are fundamental components in our diet and they play an additional important role in human health. Because animals are unable to synthesize carotenoids de novo, they must obtain them by dietary means. Thus, manipulation of carotenoid production and composition in plants or bacteria can provide new or improved sources of carotenoids. Industrial uses of carotenoids include pharmaceuticals, food supplements, animal feed additives, and colorants in cosmetics, to mention a few.

Industrially, only a few carotenoids are used for food colors, animal feeds, pharmaceuticals, and cosmetics, despite the existence of more than 600 different carotenoids identified in nature. This is largely due to difficulties in production. Presently, most of the carotenoids used for industrial purposes are produced by chemical synthesis; however, these compounds are very difficult to make chemically (Nelis and Leenheer, Appl. Bacteriol. 70:181-191 (1991)). Natural carotenoids can either be obtained by extraction of plant material or by microbial synthesis; but only a few plants are widely used for commercial carotenoid production and the productivity of carotenoid synthesis in these plants is relatively low. As a result, carotenoids produced from these plants are very expensive. One way to increase the productive capacity of biosynthesis would be to apply recombinant DNA technology (reviewed in Misawa and Shimada, J. Biotech., 59:169-181 (1998)). Thus, it would be desirable to produce carotenoids in non-carotenogenic bacteria and yeasts, thereby permitting control over quality, quantity, and selection of the most suitable and efficient producer organisms. The latter is especially important for commercial production economics (and therefore availability) to consumers.

Carotenoid ketolases are a class of enzymes that introduce keto groups to the ionone ring of the cyclic carotenoids, such as β-carotene, to produce ketocarotenoids. Examples of ketocarotenoids include astaxanthin, canthaxanthin, adonixanthin, adonirubin, echinenone, 3-hydroxyechinenone, 3'-hydroxyechinenone, 4-keto-gamma-carotene, 4-keto-rubixanthin, 4-keto-torulene, 3-hydroxy-4-keto-torulene, deoxyflexixanthin, and myxobactone. Two classes of ketolase, CrtW and CrtO, have been reported. The two classes have similar functionality yet appear to have arisen independently as they share very little sequence similarity. The CrtW is a symmetrically acting enzyme that adds keto groups to both rings of β-carotene (Hannibal et al., J. Bacteriol., 182: 3850-3853 (2000)). Fernández-González et al. (J. of Biol. Chem., 272: 9728-9733 (1997)) reported that the CrtO ketolase enzyme from Synechocystis sp. PCC6803 adds a keto-group asymmetrically to only one of the two β-ionone rings of β-carotene.

Several examples of CrtW ketolases have been reported in a variety of bacteria including Agrobacterium aurantiacum (U.S. Pat. No. 6,150,130), Bradyrhizobium sp. (U.S. Patent Publication No. 20030087337), and Brevundimonas aurantiacum (WO 02/079395). However, there is a need to identify additional novel CrtW ketolase genes useful for genetically engineering industrially suitable microorganisms for the production of valuable ketocarotenoids, such as canthaxanthin and astaxanthin. Additionally, there is a particularly important need to identify CrtW type ketolases having relatively low to moderate sequence homology (i.e. <65% nucleotide sequence identity), as coexpression of highly homologous genes tends to result in genetic instability (i.e. undesirable homologous recombination). Expressing crtW genes having relatively low to moderate sequence homology should decrease the probability of genetic instability normally associated with expression of highly homologous genes. This is particularly important when developing genetically-stable commercial strains for optimal production of the desired product (i.e. ketocarotenoids).

CrtW genes having divergent nucleotide sequences are most suitable for expressing multiple ketolases in a single recombinant host cell. This is especially important when ketolase activity becomes the rate-limiting step in the ketocarotenoid biosynthesis pathway. Increasing the number of crtW genes that can be simultaneously expressed in the production host is expected to increase ketocarotenoid production, assuming that the pool of available substrates is not limiting.

Additionally, CrtW ketolases tend to exhibit substrate flexibility. However, it can be envisioned that different CrtW ketolases may exhibit preferential activity towards one or more possible substrates (i.e. β-carotene versus zeaxanthin). Simultaneous expression of multiple CrtW ketolases, each selected based on their preferred substrate, may be used for optimal production of a desired ketocarotenoid. One of skill in the art may optimize production of the desired ketocarotenoid end product by analyzing the available substrate pool within the desired host cell, selectively expressing an appropriate combination of ketolases for optimal production of the desired ketocarotenoid.

The problem to be solved therefore is to identify and isolate novel crtW ketolase genes useful for engineering production of ketocarotenoids (i.e. canthaxanthin and astaxanthin). The present invention has solved the stated problem by providing three novel crtW genes useful for the production of ketocarotenoids in recombinant host cells. Methods for producing ketocarotenoids using the present CrtW ketolases are also provided.

SUMMARY OF THE INVENTION

The invention relates to new carotenoid ketolase enzymes capable of the conversion of cyclic carotenoids to cyclic ketocarotenoids. Accordingly the invention provides an isolated nucleic acid molecule encoding a carotenoid ketolase enzyme, selected from the group consisting of:

(a) an isolated nucleic acid molecule encoding an amino acid as set forth in SEQ ID NOS: 2, 4, and 6;

(b) an isolated nucleic acid molecule that hybridizes with (a) under the following wash conditions: 0.1×SSC, 0.1% SDS, 65° C.; or

(c) an isolated nucleic acid molecule that is complementary to (a), or (b).

Similarly the invention provides genetic chimera comprising the isolated nucleic acid molecules operably linked to suitable regulatory sequences, polypeptides encoded by the isolated nucleic acid molecules of the invention and transformed production host cells comprising the same.

The invention additionally provides methods of obtaining the nucleic acid molecules of the invention either by methods of primer directed amplification or by hybridization.

In another embodiment the invention provides a method for the production of cyclic ketocarotenoid compounds comprising:

(a) providing a host cell which produces cyclic carotenoids;

(b) transforming the host cell of (a) with the genes of the invention encoding a carotenoid ketolase enzyme; and

(c) growing the transformed host cell of (b) under conditions whereby a cyclic ketocarotenoid is produced.

Similarly the invention provides a method of regulating cyclic ketocarotenoid biosynthesis in an organism comprising:

(a) introducing into a host cell a carotenoid ketolase gene of the invention, said gene under the control of suitable regulatory sequences; and

(b) growing the host cell of (a) under conditions whereby the carotenoid ketolase gene is expressed and cyclic ketocarotenoid biosynthesis is regulated.

In an alternate embodiment the invention provides a method for increasing production of cyclic ketocarotenoid compounds comprising:

(a) providing a host cell which produces cyclic carotenoids;

(b) transforming the host cell of (a) with a first gene of the invention encoding a CrtW carotenoid ketolase enzyme;

(c) transforming the host cell of (a) with a second gene encoding a CrtW carotenoid ketolase enzyme, said second gene having less than 65% nucleic acid sequence identity when compared to said first gene; and

(d) growing the transformed host cell comprising said first gene of (a) and said second gene of (b) under conditions whereby the production of cyclic ketocarotenoid is increased relative to a transformed host cell only expressing either said first gene or said second gene.

Mutated genes of the invention are also provided, produced by a method comprising the steps of:

(a) digesting a mixture of nucleotide sequences with restriction endonucleases wherein said mixture comprises:

i) a native carotenoid ketolase gene;

ii) a first population of nucleotide fragments which will hybridize to said native carotenoid ketolase gene;

iii) a second population of nucleotide fragments that will not hybridize to said native carotenoid ketolase gene;

wherein a mixture of restriction fragments are produced;

(b) denaturing said mixture of restriction fragments;

(c) incubating the denatured said mixture of restriction fragments of step (ii) with a polymerase;

(d) repeating steps (ii) and (iii) wherein a mutated carotenoid ketolase gene is produced encoding a protein having an altered biological activity.

BRIEF DESCRIPTION OF THE DRAWINGS AND SEQUENCE DESCRIPTIONS

FIG. 1. Illustration of possible pathway intermediates in the synthesis of astaxanthin via ketolase and hydroxylase reactions from β-carotene.

FIG. 2. HPLC analysis of carotenoids produced by the bacterial strains. FIG. 2a shows HPLC data from the analysis of S. melonis DC18; FIG. 2b shows HPLC data from the analysis of B. vesicularis DC263; and FIG. 2c shows HPLC data from the analysis of Flavobacterium sp. K1-202C.

FIG. 3. HPLC data from the analysis of carotenoids produced by β-carotene accumulating E. coli strain expressing the divergent crtW genes.

FIG. 4. HPLC data from the analysis of carotenoids produced by astaxanthin-producing E. coli strain expressing the divergent crtW genes.

FIG. 5. HPLC analysis of Methylomonas sp. 16a cells expressing the divergent crtW genes with β-carotene synthesis genes.

The invention can be more fully understood from the following detailed description and the accompanying sequence descriptions, which form a part of this application.

The following sequences comply with 37 C.F.R. 1.821-1.825 ("Requirements for Patent Applications Containing Nucleotide Sequences and/or Amino Acid Sequence Disclosures - the Sequence Rules") and are consistent with World Intellectual Property Organization (WIPO) Standard ST.25 (1998) and the sequence listing requirements of the EPO and PCT (Rules 5.2 and 49.5(a-bis), and Section 208 and Annex C of the Administrative Instructions). The symbols and format used for nucleotide and amino acid sequence data comply with the rules set forth in 37 C.F.R. §1.822.

SEQ ID NO:1 is the nucleotide sequence of the Sphingomonas melonis DC18 crtW ORF.
SEQ ID NO:2 is the deduced amino acid sequence of the Sphingomonas melonis DC18 CrtW ketolase.
SEQ ID NO:3 is the nucleotide sequence of Brevundimonas vesicularis DC263 crtW ORF.
SEQ ID NO:4 is the deduced amino acid sequence of the Brevundimonas vesicularis DC263 CrtW ketolase.
SEQ ID NO:5 is the nucleotide sequence of Flavobacterium sp. K1-202C crtW ORF.
SEQ ID NO:6 is the deduced amino acid sequence of the Flavobacterium sp. K1-202C CrtW ketolase.
SEQ ID NO:7 is the nucleotide sequence of a primer ("HK12") used for 16S rRNA gene sequencing.
SEQ ID NO:8 is the nucleotide sequence of a primer ("JCR14") used for 16S rRNA gene sequencing.
SEQ ID NO:9 is the nucleotide sequence of a primer ("JCR15") used for 16S rRNA gene sequencing.
SEQ ID NO:10 is the nucleotide sequence of the Sphingomonas melonis DC18 16S rRNA gene.
SEQ ID NO:11 is the nucleotide sequence of the Brevundimonas vesicularis DC263 16S rRNA gene.
SEQ ID NO:12 is the nucleotide sequence of the crtEidiYIBZ carotenoid synthesis gene cluster from Pantoea agglomerans DC404 (U.S. Ser. No. 60/477,874).
SEQ ID NO:13 is the nucleotide sequence of primer pWEB404F.
SEQ ID NO:14 is the nucleotide sequence of primer pWEB404R.
SEQ ID NO:15 is the nucleotide sequence of the crtEidiYIB gene cluster from P. agglomerans DC404.
SEQ ID NO:16 is the nucleotide sequence of primer crtW-18 F.
SEQ ID NO:17 is the nucleotide sequence of primer crtW-18 R.
SEQ ID NO:18 is the nucleotide sequence of primer crtW-263 F.
SEQ ID NO:19 is the nucleotide sequence of primer crtW-263 R.
SEQ ID NO:20 is the nucleotide sequence of primer crtWK1-202CF.
SEQ ID NO:21 is the nucleotide sequence of primer crtWK1-202CR.
SEQ ID NO:22 is the nucleotide sequence of the Agrobacterium aurantiacum crtZ hydroxylase gene.
SEQ ID NO:23 is the nucleotide sequence of the Agrobacterium aurantiacum crtW ketolase gene.
SEQ ID NO:24 is the nucleotide sequence of primer crtZW F.
SEQ ID NO:25 is the nucleotide sequence of primer crtZW soe R.
SEQ ID NO:26 is the nucleotide sequence of primer crtZW soe F.
SEQ ID NO:27 is the nucleotide sequence of primer crtZW R.
SEQ ID NO:28 is the nucleotide sequence of primer crt-260 F.
SEQ ID NO:29 is the nucleotide sequence of primer crt-260SOE R.
SEQ ID NO:30 is the nucleotide sequence of primer crt-260SOE F.
SEQ ID NO:31 is the nucleotide sequence of primer crt-260R1 R.
SEQ ID NO:32 is the nucleotide sequence of primer crt-260R1 F.
SEQ ID NO:33 is the nucleotide sequence of primer crt-260 R.

The following biological deposit was made under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure:

Depositor Identification Reference    International Depository Designation    Date of Deposit
Methylomonas 16a                      ATCC PTA 2402                           Aug. 22, 2000

As used herein, "ATCC" refers to the American Type Culture Collection International Depository Authority located at ATCC, 10801 University Blvd., Manassas, Va. 20110-2209, U.S.A. The "International Depository Designation" is the accession number to the culture on deposit with ATCC.

The listed deposit will be maintained in the indicated international depository for at least thirty (30) years and will be made available to the public upon the grant of a patent disclosing it. The availability of a deposit does not constitute a license to practice the subject invention in derogation of patent rights granted by government action.

DETAILED DESCRIPTION OF THE INVENTION

The present crtW genes and their expression product, a carotenoid ketolase, are useful for the creation of recombinant organisms that have the ability to produce cyclic ketocarotenoid compounds. Nucleic acid fragments encoding CrtW ketolases have been isolated from several bacterial strains including Sphingomonas melonis DC18, Brevundimonas vesicularis DC263, and Flavobacterium sp. K1-202C. The isolated nucleic acid fragments were identified and characterized by comparison to public databases containing nucleotide and protein sequences using the BLAST and FASTA algorithms, well-known to those skilled in the art.

The present crtW ketolase genes were expressed in transgenic microbial hosts engineered to produce suitable substrates (i.e. β-carotene). Functional expression of the genes was measured by the production of ketocarotenoids (for example, canthaxanthin and astaxanthin) in the heterologous hosts. Additionally, the effects of divergent ketolase coexpression on ketocarotenoid production within the transgenic hosts were characterized by measuring relative changes in ketocarotenoid production.

The genes and gene products of the present invention may be used in a variety of ways for the production or regulation of cyclic ketocarotenoid compounds. The present crtW ketolase genes can be used for ketocarotenoid production in heterologous hosts having the ability to produce suitable substrates. Additionally, two or more of the present crtW ketolase genes may be simultaneously expressed in the heterologous host for optimized production of ketocarotenoids. Simultaneous expression of the present crtW genes is possible due to their relatively low to moderate nucleotide sequence homology to other known CrtW ketolases. The relatively low/moderate homology permits stable expression of multiple CrtW ketolases in the recombinant host cell for optimal ketocarotenoid production.

The gene and gene sequences described herein enable one to incorporate the production of ketocarotenoids directly into an industrially suitable host cell. This aspect makes any recombinant host into which these genes are incorporated a more desirable production host. The ketocarotenoids produced can be isolated from the production host for use in a variety of applications, including animal feed. Optionally, the recombinant host cells (whole, homogenized, or autolysed) can be directly incorporated into animal feed (no carotenoid isolation step) due to the presence of carotenoids that are known to add desirable pigmentation and health benefits. Salmon and shrimp aquacultures are particularly useful applications for this invention as carotenoid pigmentation is critically important for the value of these organisms (F. Shahidi, J. A. Brown, Carotenoid pigments in seafood and aquaculture, Critical Reviews in Food Science, 38(1): 1-67 (1998)). Additionally, the ketocarotenoid astaxanthin is known to be a powerful antioxidant and has been reported to boost immune functions in humans and reduce carcinogenesis (Jyonouchi et al., Nutr. Cancer, 23:171-183 (1995); Tanaka et al., Cancer Res., 55:4059-4064 (1995)).

In this disclosure, a number of terms and abbreviations are used. The following definitions are provided.

As used herein, the term "comprising" means the presence of the stated features, integers, steps, or components as referred to in the claims, but that it does not preclude the presence or addition of one or more other features, integers, steps, components or groups thereof.

"Open reading frame" is abbreviated ORF.

"Polymerase chain reaction" is abbreviated PCR.

As used herein, the terms "isolated nucleic acid fragment" or "isolated nucleic acid molecule" will be used interchangeably and will refer to a polymer of RNA or DNA that is single- or double-stranded, optionally containing synthetic, non-natural or altered nucleotide bases. An isolated nucleic acid fragment in the form of a polymer of DNA may be comprised of one or more segments of cDNA, genomic DNA or synthetic DNA.

The term "pBHR-crt1" refers to a β-carotene producing plasmid. The plasmid was constructed by cloning the crtEXYIB carotenoid gene cluster from Pantoea stewartii (ATCC 8199) into pBHR1 (MoBioTech, Goettingen, Germany; and U.S. Ser. No. 09/941,947, hereby incorporated by reference). The resulting plasmid contained the P. stewartii gene cluster expressed under the control of the chloramphenicol-resistance gene promoter.

The term "pDCQ329" refers to a β-carotene producing plasmid. The plasmid was constructed by cloning the crtEXYIB carotenoid gene cluster from Enterobacteriaceae DC260 into pBHR1 (U.S. Ser. No. 10/808,979).

The term "pDCQ330" refers to a β-carotene producing plasmid. The plasmid was constructed by cloning the crtEidiYIB carotenoid gene cluster from Pantoea agglomerans DC404 into broad host range vector pBHR1.

The term "pDCQ335" refers to a plasmid comprising the β-carotene synthesis gene cluster from pDCQ330 and the Agrobacterium aurantiacum crtZW genes. Plasmid pDCQ335 contains the crtZWEidiYIB genes in an operon under the control of the chloramphenicol resistance gene promoter. The resulting plasmid, when transformed into an appropriate heterologous host, enables the production of astaxanthin (FIG. 1).

The term "pDCQ335TA" refers to a plasmid comprising the Agrobacterium aurantiacum crtWZ genes cloned into a pTrcHis2-TOPO expression vector (Invitrogen, Carlsbad, Calif.).

The term "pDCQ340" refers to a β-carotene producing plasmid. The plasmid contains the crtEYIB genes from Enterobacteriaceae DC260 cloned into the broad host range vector pBHR1.

The term "pDCQ341TA" refers to a plasmid expressing the crtW gene from Sphingomonas melonis DC18 cloned into a pTrcHis2-TOPO vector (Invitrogen).

The term "pDCQ342TA" refers to a plasmid expressing the crtW gene from Brevundimonas vesicularis DC263 cloned into a pTrcHis2-TOPO vector (Invitrogen).

The term "pDCQ339TA" refers to a plasmid expressing the crtW gene from Flavobacterium sp. K1-202C cloned into a pTrcHis2-TOPO vector (Invitrogen).

The term "isoprenoid" or "terpenoid" refers to any molecule derived from the isoprenoid pathway, including 10 carbon terpenoids and their derivatives, such as carotenoids and xanthophylls.

The term "carotenoid" refers to a compound composed of a polyene backbone which is condensed from five-carbon isoprene units. Carotenoids can be acyclic or terminated with one (monocyclic) or two (bicyclic) cyclic end groups. The term "carotenoid" may include both carotenes and xanthophylls. A "carotene" refers to a hydrocarbon carotenoid. Carotene derivatives that contain one or more oxygen atoms, in the form of hydroxy-, methoxy-, oxo-, epoxy-, carboxy-, or aldehydic functional groups, or within glycosides, glycoside esters, or sulfates, are collectively known as "xanthophylls". Carotenoids that are particularly suitable in the present invention are monocyclic and bicyclic carotenoids.

The term "carotenoid biosynthetic pathway" refers to those genes comprising members of the "upper isoprenoid pathway" and/or the "lower carotenoid biosynthetic pathway".

The terms "upper isoprenoid pathway" and "upper pathway" are used interchangeably and refer to enzymes involved in converting pyruvate and glyceraldehyde-3-phosphate to farnesyl pyrophosphate (FPP). Genes encoding these enzymes include, but are not limited to: the "dxs" gene (encoding 1-deoxyxylulose-5-phosphate synthase); the "dxr" gene (encoding 1-deoxyxylulose-5-phosphate reductoisomerase; also known as ispC); the "ispD" gene (encoding a 2C-methyl-D-erythritol cytidyltransferase enzyme; also known as ygbP); the "ispE" gene (encoding 4-diphosphocytidyl-2-C-methylerythritol kinase; also known as ychB); the "ispF" gene (encoding a 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase; also known as ygbB); the "pyrG" gene (encoding a CTP synthase; also known as ispF); the "lytB" gene (also known as ispH) involved in the formation of dimethylallyl diphosphate; the "gcpE" gene (also known as ispG) involved in the synthesis of 2-C-methyl-D-erythritol 4-phosphate; the "idi" gene (responsible for the intramolecular conversion of IPP to dimethylallyl pyrophosphate); and the "ispA" gene (encoding geranyltransferase or farnesyl diphosphate synthase) in the isoprenoid pathway.

The terms "lower carotenoid biosynthetic pathway" and "lower pathway" will be used interchangeably and refer to those enzymes which convert FPP to a suite of carotenoids. These include those genes and gene products that are involved in the synthesis of either diapophytoene (whose synthesis represents the first step unique to biosynthesis of C30 carotenoids) or phytoene (whose synthesis represents the first step unique to biosynthesis of C40 carotenoids). All subsequent reactions leading to the production of various C30-C40 carotenoids are included within the lower carotenoid biosynthetic pathway. These genes and gene products comprise all of the "crt" genes including, but not limited to: crtM, crtN, crtN2, crtE, crtX, crtY, crtI, crtB, crtZ, crtW, crtO, crtR, crtA, crtC, crtD, crtF, and crtU. Finally, the term "lower carotenoid biosynthetic enzyme" is an inclusive term referring to any and all of the enzymes in the lower pathway including, but not limited to: CrtM, CrtN, CrtN2, CrtE, CrtX, CrtY, CrtI, CrtB, CrtZ, CrtW, CrtO, CrtR, CrtA, CrtC, CrtD, CrtF, and CrtU.

"C30 diapocarotenoids" consist of six isoprenoid units joined in such a manner that the arrangement of isoprenoid units is reversed at the center of the molecule so that the two central methyl groups are in a 1,6-positional relationship and the remaining nonterminal methyl groups are in a 1,5-positional relationship. All C30 carotenoids may be formally derived from the acyclic C30H42 structure, having a long central chain of conjugated double bonds, by: (i) hydrogenation, (ii) dehydrogenation, (iii) cyclization, (iv) oxidation, (v) esterification/glycosylation, or any combination of these processes.

"Tetraterpenes" or "C40 carotenoids" consist of eight isoprenoid units joined in such a manner that the arrangement of isoprenoid units is reversed at the center of the molecule so that the two central methyl groups are in a 1,6-positional relationship and the remaining nonterminal methyl groups are in a 1,5-positional relationship. All C40 carotenoids may be formally derived from the acyclic C40H56 structure. Non-limiting examples of C40 carotenoids include: phytoene, lycopene, β-carotene, zeaxanthin, astaxanthin, and canthaxanthin.

The term "CrtE" refers to a geranylgeranyl pyrophosphate synthase enzyme encoded by the crtE gene and which converts trans-trans-farnesyl diphosphate and isopentenyl diphosphate to pyrophosphate and geranylgeranyl diphosphate.

The term "Idi" refers to an isopentenyl diphosphate isomerase enzyme (E.C. 5.3.3.2) encoded by the idi gene.

The term "CrtY" refers to a lycopene cyclase enzyme encoded by the crtY gene, which converts lycopene to β-carotene.

The term "CrtI" refers to a phytoene desaturase enzyme encoded by the crtI gene. CrtI converts phytoene into lycopene via the intermediaries of phytofluene, ζ-carotene, and neurosporene by the introduction of 4 double bonds.

The term "CrtB" refers to a phytoene synthase enzyme encoded by the crtB gene, which catalyzes the reaction from prephytoene diphosphate to phytoene.

The term "CrtZ" refers to a β-carotene hydroxylase enzyme encoded by the crtZ gene, which catalyzes a hydroxylation reaction from β-carotene to zeaxanthin.

The term "CrtW" refers to a β-carotene ketolase enzyme encoded by the crtW gene, which catalyzes an oxidation reaction where a keto group is introduced on the ionone ring of cyclic carotenoids. It is known that CrtW ketolases typically exhibit substrate flexibility. The term "carotenoid ketolase" or "ketolase" refers to the group of enzymes that can add keto groups to the ionone ring of cyclic carotenoids.

The term "CrtX" refers to a zeaxanthin glucosyl transferase enzyme encoded by the crtX gene and which converts zeaxanthin to zeaxanthin-β-diglucoside.

The term "keto group" or "ketone group" will be used interchangeably and refers to a group in which a carbonyl group is bonded to two carbon atoms: R2C=O (neither R may be H).

The term "ketocarotenoid" refers to carotenoids possessing at least one keto group on the ionone ring of a cyclic carotenoid. Examples of ketocarotenoids include, but are not limited to canthaxanthin and astaxanthin.

As used herein, "substantially similar" refers to nucleic acid fragments wherein changes in one or more nucleotide bases result in substitution of one or more amino acids, but do not affect the functional properties of the protein encoded by the DNA sequence. "Substantially similar" also refers to nucleic acid fragments wherein changes in one or more nucleotide bases do not affect the ability of the nucleic acid fragment to mediate alteration of gene expression by antisense or co-suppression technology. "Substantially similar" also refers to modifications of the nucleic acid fragments of the instant invention such as deletion or insertion of one or more nucleotide bases that do not substantially affect the functional properties of the resulting transcript. It is therefore understood that the invention encompasses more than the specific exemplary sequences.

For example, it is well known in the art that alterations in a gene which result in the production of a chemically equivalent amino acid at a given site, but do not affect the functional properties of the encoded protein, are common. For the purposes of the present invention substitutions are defined as exchanges within one of the following five groups:

1. Small aliphatic, nonpolar or slightly polar residues: Ala, Ser, Thr (Pro, Gly);

2. Polar, negatively charged residues and their amides: Asp, Asn, Glu, Gln;

3.

glutamic acid) or one positively charged residue for another (such as lysine for arginine) can also be expected to produce a functionally equivalent product.

In many cases, nucleotide changes which result in alteration of the N-terminal and C-terminal portions of the protein molecule would also not be expected to alter the activity of the protein.

Each of the proposed modifications is well within the routine skill in the art, as is determination of retention of biological activity of the encoded products. Moreover, the skilled artisan recognizes that substantially similar sequences encompassed by this invention are also defined by their ability to hybridize, under stringent conditions (0.1×SSC, 0.1% SDS, 65° C. and washed with 2×SSC, 0.1% SDS followed by 0.1×SSC, 0.1% SDS), with the sequences exemplified herein. In one embodiment, substantially similar nucleic acid fragments of the instant invention are those nucleic acid fragments whose DNA sequences are at least about 80% identical to the DNA sequence of the nucleic acid fragments reported herein. In another embodiment, substantially similar nucleic acid fragments are at least about 90% identical to the DNA sequence of the nucleic acid fragments reported herein. In yet a further embodiment, substantially similar nucleic acid fragments are at least about 95% identical to the DNA sequence of the nucleic acid fragments reported herein.

A nucleic acid molecule is "hybridizable" to another nucleic acid molecule, such as a cDNA, genomic DNA, or RNA, when a single stranded form of the nucleic acid molecule can anneal to the other nucleic acid molecule under the appropriate conditions of temperature and solution ionic strength. Hybridization and washing conditions are well known and exemplified in Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor (1989) (hereinafter "Maniatis"), particularly Chapter 11 and Table 11.1 therein. The conditions of temperature and ionic strength determine the "stringency" of the hybridization. Stringency conditions can be adjusted to screen for moderately similar fragments, such as homologous sequences from distantly related organisms, to highly similar fragments, such as genes that duplicate functional enzymes from closely related organisms. Post-hybridization washes determine stringency conditions. In one embodiment, the stringency conditions use a series of washes starting with 6×SSC, 0.5% SDS at room temperature for about 15 min, then repeated with 2×SSC, 0.5% SDS at about 45° C. for about 30 min, and then repeated twice with 0.2×SSC, 0.5% SDS at about 50° C. for about 30 min. In another embodiment, the stringency conditions use higher temperatures in which the washes are identical to those above except that the temperature of the final two 30 min washes in 0.2×SSC, 0.5% SDS is increased to about 60° C. In yet another embodiment, highly stringent conditions use two final washes in 0.1×SSC, 0.1% SDS at about 65° C. Hybridization requires that the two nucleic acids contain complementary sequences, although depending on the strin
Polar, positively charged residues: His, Arg, Lys; gency of the hybridization, mismatches between bases are 4. Large aliphatic, nonpolar residues: Met, Leu, lie, Val possible. The appropriate stringency for hybridizing nucleic (CyS); and 60 acids depends on the length of the nucleic acids and the 5. Large aromatic residues: Phe, Tyr, Trp. degree of complementation, variables well-known in the art. Thus, a codon for the amino acid alanine, a hydrophobic The greater the degree of similarity or homology between amino acid, may be substituted by a codon encoding another two nucleotide sequences, the greater the value of Tm for less hydrophobic residue (such as glycine) or a more hydro hybrids of nucleic acids having those sequences. The rela phobic residue (such as Valine, leucine, or isoleucine). 65 tive stability (corresponding to higher Tm) of nucleic acid Similarly, changes which result in Substitution of one nega hybridizations decreases in the following order: RNA:RNA, tively charged residue for another (such as aspartic acid for DNA:RNA, DNA:DNA. For hybrids of greater than 100 US 7,252,985 B2 11 12 nucleotides in length, equations for calculating Tm have University Press, N.Y. (1988); Biocomputing: Informatics been derived (see Sambrook et al., supra, 9.50-9.51). For and Genome Projects (Smith, D. W., ed.) Academic Press, hybridizations with shorter nucleic acids, i.e., oligonucle N.Y. (1993); Computer Analysis of Sequence Data, Part I otides, the position of mismatches becomes more important, (Griffin, A. M., and Griffin, H. G., eds.) Humana Press, NJ and the length of the oligonucleotide determines its speci (1994); Sequence Analysis in Molecular Biology (von Hei ficity (see Sambrook et al., Supra, 11.7-11.8). In one embodi nje, G., ed.) Academic Press (1987); and Sequence Analysis ment, the length for a hybridizable nucleic acid is at least Primer (Gribskov, M. and Devereux, J., eds.) Stockton about 10 nucleotides. 
In another embodiment, the minimum Press, N.Y. (1991). In one embodiment, the methods used to length for a hybridizable nucleic acid is at least about 15 determine identity are designed to give the best match nucleotides; in yet another embodiment at least about 20 10 nucleotides; and in yet a further embodiment, the length is between the sequences tested. Methods to determine identity at least about 30 nucleotides. Furthermore, the skilled artisan and similarity are codified in publicly available computer will recognize that the temperature and wash solution salt programs. Sequence alignments and percent identity calcu concentration may be adjusted as necessary according to lations may be performed using the Megalign program of the factors such as length of the probe. 15 LASERGENE bioinformatics computing suite (DNASTAR A “substantial portion of an amino acid or nucleotide Inc., Madison, Wis.). Multiple alignment of the sequences sequence comprising enough of the amino acid sequence of can be performed using the Clustal method of alignment a polypeptide or the nucleotide sequence of a gene to (Higgins and Sharp, CABIOS., 5:151-153 (1989)) with the putatively identify that polypeptide or gene, either by default parameters (GAP PENALTY=10, GAP LENGTH manual evaluation of the sequence by one skilled in the art, PENALTY=10). Default parameters for pairwise alignments or by computer-automated sequence comparison and iden using the Clustal method were KTUPLE 1, GAP PEN tification using algorithms such as BLAST (Basic Local ALTY=3, WINDOW=5 and DIAGONALS SAVED=5. Alignment Search Tool; Altschul, S. F., et al., (1993).J. Mol. Suitable nucleic acid fragments (isolated polynucleotides Biol. 215:403-410). In general, a sequence of ten or more of the present invention) encode polypeptides that are at contiguous amino acids or thirty or more nucleotides is 25 least about 75% identical. 
In one embodiment, suitable necessary in order to putatively identify a polypeptide or nucleic acid fragments are at least about 85% identical to the nucleic acid sequence as homologous to a known protein or amino acid sequences reported herein. In another embodi gene. Moreover, with respect to nucleotide sequences, gene ment, the nucleic acid fragments encode amino acid specific oligonucleotide probes comprising about 20-30 con sequences that are at least about 90% identical to the amino tiguous nucleotides may be used in sequence-dependent 30 acid sequences reported herein. In a further embodiment, methods of gene identification (e.g., Southern hybridization) nucleic acid fragments encode amino acid sequences that are and isolation (e.g., in situ hybridization of bacterial colonies at least about 95% identical to the amino acid sequences or bacteriophage plaques). In addition, short oligonucle reported herein. In yet a further embodiment, the suitable otides of about 12-15 bases may be used as amplification nucleic acid fragments encode amino acid sequences that are primers in PCR in order to obtain a particular nucleic acid 35 at least about 99% identical to the amino acid sequences fragment comprising the primers. Accordingly, a “substan reported herein. Suitable nucleic acid fragments of the tial portion of a nucleotide sequence comprises enough of present invention not only have the above homologies, but the sequence to specifically identify and/or isolate a nucleic typically encode a polypeptide having at least about 240 acid fragment comprising the sequence. The instant speci amino acids. fication teaches partial or complete amino acid and nucle 40 In the present invention, the terms “divergent gene'. otide sequences encoding one or more particular microbial "divergent ketolase', and "divergent sequence' are used proteins. 
The skilled artisan, having the benefit of the interchangeably and refer to the lack of nucleic acid frag sequences as reported herein, may now use all or a Substan ment sequence identity among CrtW ketolases. Nucleotide tial portion of the disclosed sequences for purposes known sequence comparisons between 2 or more crtW genes allows to those skilled in this art. Accordingly, the instant invention 45 classification of the relationship(s) as to the relative degree comprises the complete sequences as reported in the accom of sequence identity. Simultaneous expression of highly panying Sequence Listing, as well as Substantial portions of homologous genes tends to result in genetic instability (i.e. those sequences as defined above. increased rate of homologous recombination). Expression of The term “complementary' is used to describe the rela moderately or highly divergent genes is likely to result in tionship between nucleotide bases that are capable to hybrid 50 genetic stability. As used herein, “genetic stability” or izing to one another. For example, with respect to DNA, “genetically stable' will be used to described the expression adenosine is complementary to thymine and cytosine is of multiple carotenoid ketolase genes having coding complementary to guanine. Accordingly, the instant inven sequence with less than 75% nucleic acid sequence identity tion also includes isolated nucleic acid fragments that are to the present carotenoid ketolase genes, preferably less than complementary to the complete sequences as reported in the 55 65% nucleic acid sequence identity. This is particularly accompanying Sequence Listing as well as those Substan important when chromosomally integrating more than one tially similar nucleic acid sequences. carotenoid ketolase gene for increasing ketocarotenoid pro The term “percent identity”, as known in the art, is a duction in a genetically stable transformant. 
In one embodi relationship between two or more polypeptide sequences or ment, the crtW ketolase genes useful for coexpression are two or more polynucleotide sequences, as determined by 60 those that share less than 75% identify when compared by comparing the sequences. In the art, “identity also means sequence alignment. In another embodiment, the crtW keto the degree of sequence relatedness between polypeptide or lase genes used for coexpression are those that share less polynucleotide sequences, as the case may be, as determined than about 65% identify when compared by sequence align by the match between strings of such sequences. “Identity” ment. In a further embodiment, the crtW genes used for and “similarity” can be readily calculated by known meth 65 coexpression are those that share less than about 55% ods, including but not limited to those described in: Com identify when compared by sequence alignment. In yet a putational Molecular Biology (Lesk, A. M., ed.) Oxford further embodiment, the crtW genes used for coexpression US 7,252,985 B2 13 14 are those that share less than about 45% identify when In general, a coding sequence is located 3' to a promoter compared by sequence alignment. sequence. Promoters may be derived in their entirety from a “Codon degeneracy” refers to the nature in the genetic native gene, or be composed of different elements derived code permitting variation of the nucleotide sequence without from different promoters found in nature, or even comprise effecting the amino acid sequence of an encoded polypep synthetic DNA segments. It is understood by those skilled in tide. 
Accordingly, the instant invention relates to any nucleic the art that different promoters may direct the expression of acid fragment that encodes all or a substantial portion of the a gene in different tissues or cell types, or at different stages amino acid sequence encoding the instant microbial of development, or in response to different environmental or polypeptides as set forth in SEQ ID NOS: 2, 4, and 6. The physiological conditions. Promoters which cause a gene to skilled artisan is well aware of the “codon-bias” exhibited by 10 be expressed in most cell types at most times are commonly a specific host cell in usage of nucleotide codons to specify referred to as “constitutive promoters”. It is further recog a given amino acid. Therefore, when synthesizing a gene for nized that since in most cases the exact boundaries of improved expression in a host cell, it is desirable to design regulatory sequences have not been completely defined, the gene Such that its frequency of codon usage approaches DNA fragments of different lengths may have identical the frequency of preferred codon usage of the host cell. 15 promoter activity. “Synthetic genes' can be assembled from oligonucleotide The '3' non-coding sequences” refer to DNA sequences building blocks that are chemically synthesized using pro located downstream of a coding sequence and include cedures known to those skilled in the art. These building polyadenylation recognition sequences (normally limited to blocks are ligated and annealed to form gene segments eukaryotes) and other sequences encoding regulatory signals which are then enzymatically assembled to construct the capable of affecting mRNA processing or gene expression. entire gene. “Chemically synthesized, as related to a The polyadenylation signal is usually characterized by sequence of DNA, means that the component nucleotides affecting the addition of polyadenylic acid tracts (normally were assembled in vitro. 
Manual chemical synthesis of DNA limited to eukaryotes) to the 3' end of the mRNA precursor. may be accomplished using well-established procedures, or “RNA transcript refers to the product resulting from automated chemical synthesis can be performed using one of 25 RNA polymerase-catalyzed transcription of a DNA a number of commercially available machines. Accordingly, sequence. When the RNA transcript is a perfect comple the genes can be tailored for optimal gene expression based mentary copy of the DNA sequence, it is referred to as the on optimization of nucleotide sequence to reflect the codon primary transcript or it may be a RNA sequence derived bias of the host cell. The skilled artisan appreciates the from post-transcriptional processing of the primary tran likelihood of Successful gene expression if codon usage is 30 script and is referred to as the mature RNA. "Messenger biased towards those codons favored by the host. Determi RNA (mRNA) refers to the RNA that is without introns and nation of preferred codons can be based on a survey of genes that can be translated into protein by the cell. “cDNA” refers derived from the host cell where sequence information is to a double-stranded DNA that is complementary to and available. derived from mRNA. “Sense” RNA refers to RNA transcript "Gene' refers to a nucleic acid fragment that expresses a 35 that includes the mRNA and so can be translated into protein specific protein, including regulatory sequences preceding by the cell. “Antisense RNA refers to a RNA transcript that (5' non-coding sequences) and following (3' non-coding is complementary to all or part of a target primary transcript sequences) the coding sequence. “Native gene' refers to a or mRNA and that blocks the expression of a target gene gene as found in nature with its own regulatory sequences. (U.S. Pat. No. 5,107,065; WO99/28508). 
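By way of a non-limiting illustration of the “percent identity” concept defined earlier in this section, the following Python sketch counts matching positions between two sequences that are assumed to be already aligned, with gaps written as “-”. The function names percent_identity and is_divergent, the choice of ungapped columns as the denominator, and the toy sequences are assumptions made for illustration only. The sketch does not reproduce the Clustal, Megalign, or Smith-Waterman programs named in the text, whose scoring schemes and default values differ.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: both inputs come from the same pairwise alignment, so they
    have equal length and use '-' as the gap character.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # gapped columns are skipped in this simplified definition
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def is_divergent(identity_pct: float, cutoff: float = 75.0) -> bool:
    """True when two genes share less than `cutoff` percent identity.

    The 75% default mirrors one of the coexpression embodiments in the text;
    65, 55 or 45 can be passed for the other embodiments.
    """
    return identity_pct < cutoff


if __name__ == "__main__":
    # Toy aligned fragments (hypothetical, not taken from the Sequence Listing).
    pid = percent_identity("ATGC-A", "ATGGTA")
    print(pid, is_divergent(pid))
```

Other conventions for the denominator exist (for example, the full alignment length or the length of the shorter sequence), so a figure computed this way is only comparable to figures computed under the same convention.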
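The codon-bias passage above, in which preferred codons are determined from a survey of host genes and a gene is then designed toward that usage, can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the invention: the helper names codon_counts, preferred_codons, and back_translate are hypothetical, the host genes are made-up toy sequences, and the sketch simply picks the single most frequent codon per amino acid, which is only one of several ways to bring codon usage closer to that of a host. The codon table is the standard genetic code.

```python
from collections import Counter
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_counts(coding_sequences):
    """Count in-frame codons across a survey of host coding sequences."""
    counts = Counter()
    for seq in coding_sequences:
        seq = seq.upper()
        usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
        for i in range(0, usable, 3):
            counts[seq[i:i + 3]] += 1
    return counts


def preferred_codons(coding_sequences):
    """Map each amino acid (and '*' for stop) to its most frequent codon.

    Ties, including codons never observed in the survey, fall back to the
    first codon in table order.
    """
    counts = codon_counts(coding_sequences)
    preferred = {}
    for codon, aa in CODON_TABLE.items():
        if aa not in preferred or counts[codon] > counts[preferred[aa]]:
            preferred[aa] = codon
    return preferred


def back_translate(protein, preferred):
    """Write a coding sequence for `protein` using only the preferred codons."""
    return "".join(preferred[aa] for aa in protein.upper())


if __name__ == "__main__":
    # Hypothetical host genes used only to build the usage table.
    host_genes = ["ATGGCCGCCAAGTAA", "ATGGCGGCCAAGTAA"]
    table = preferred_codons(host_genes)
    print(back_translate("MAK*", table))
```

A real design would normally weight codons by their observed frequencies rather than always choosing the top codon, and would be based on a much larger survey of host genes.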
“Chimeric gene” refers to any gene that is not a native gene, comprising regulatory and coding sequences that are not found together in nature. Accordingly, a chimeric gene may comprise regulatory sequences and coding sequences that are derived from different sources, or regulatory sequences and coding sequences derived from the same source, but arranged in a manner different than that found in nature. “Endogenous gene” refers to a native gene in its natural location in the genome of an organism. A “foreign” gene refers to a gene not normally found in the host organism, but that is introduced into the host organism by gene transfer. Foreign genes can comprise native genes inserted into a non-native organism, or chimeric genes. A “transgene” is a gene that has been introduced into the genome by a transformation procedure.

“Coding sequence” refers to a DNA sequence that codes for a specific amino acid sequence. “Suitable regulatory sequences” refer to nucleotide sequences located upstream (5' non-coding sequences), within, or downstream (3' non-coding sequences) of a coding sequence, and which influence the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include promoters, translation leader sequences, introns, polyadenylation recognition sequences, RNA processing site, effector binding site and stem-loop structure.

“Promoter” refers to a DNA sequence capable of controlling the expression of a coding sequence or functional RNA.

The complementarity of an antisense RNA may be with any part of the specific gene transcript, i.e., at the 5' non-coding sequence, 3' non-coding sequence, or the coding sequence. “Functional RNA” refers to antisense RNA, ribozyme RNA, or other RNA that is not translated yet has an effect on cellular processes.

The term “operably linked” refers to the association of nucleic acid sequences on a single nucleic acid fragment so that the function of one is affected by the other. For example, a promoter is operably linked with a coding sequence when it is capable of affecting the expression of that coding sequence (i.e., that the coding sequence is under the transcriptional control of the promoter). Coding sequences can be operably linked to regulatory sequences in sense or antisense orientation.

The term “expression”, as used herein, refers to the transcription and stable accumulation of sense (mRNA) or antisense RNA derived from the nucleic acid fragment of the invention. Expression may also refer to translation of mRNA into a polypeptide.

“Transformation” refers to the transfer of a nucleic acid fragment into the genome of a host organism, resulting in genetically stable inheritance. As used herein, the host cell genome includes both chromosomal and extrachromosomal (i.e., a vector) genes within the host cell. Host organisms containing the transformed nucleic acid fragments are referred to as “transgenic” or “recombinant” or “transformed” organisms.

“Conjugation” refers to a particular type of transformation in which a unidirectional transfer of DNA (e.g., from a bacterial plasmid) occurs from one bacterium cell (i.e., the “donor”) to another (i.e., the “recipient”). The process involves direct cell-to-cell contact.

The term “carbon substrate” refers to a carbon source capable of being metabolized by host organisms of the present invention and particularly carbon sources selected from the group consisting of monosaccharides, oligosaccharides, polysaccharides, and one-carbon substrates or mixtures thereof. The term “C1 carbon substrate” refers to any carbon-containing molecule that lacks a carbon-carbon bond. Non-limiting examples are methane, methanol, formaldehyde, formic acid, formate, methylated amines (e.g., mono-, di-, and tri-methyl amine), methylated thiols, and carbon dioxide. In one embodiment, the C1 carbon substrate is methanol and/or methane.

The term “C1 metabolizer” refers to a microorganism that has the ability to use a single carbon substrate as its sole source of energy and biomass. C1 metabolizers will typically be methylotrophs and/or methanotrophs. The term “C1 metabolizing bacteria” refers to bacteria that have the ability to use a single carbon substrate as their sole source of energy and biomass. C1 metabolizing bacteria, a subset of C1 metabolizers, will typically be methylotrophs and/or methanotrophs. In one embodiment, the C1 metabolizer is a methylotroph and the single carbon substrate is selected from the group consisting of methane and/or methanol. In another embodiment, the C1 metabolizer is a methanotroph and the single carbon substrate is selected from the group consisting of methane and/or methanol.

The term “methylotroph” means an organism capable of oxidizing organic compounds that do not contain carbon-carbon bonds. Where the methylotroph is able to oxidize CH4, the methylotroph is also a methanotroph.

The term “methanotroph” or “methanotrophic bacteria” means a prokaryote capable of utilizing methane as its primary source of carbon and energy. Complete oxidation of methane to carbon dioxide occurs by aerobic degradation pathways. Typical examples of methanotrophs useful in the present invention include (but are not limited to) the genera Methylomonas, Methylobacter, Methylococcus, and Methylosinus.

The term “high growth methanotrophic bacterial strain” refers to a bacterium capable of growth with methane and/or methanol as the sole carbon and energy source and which possesses a functional Embden-Meyerhof carbon flux pathway, resulting in a high rate of growth and yield of cell mass per gram of C1 substrate metabolized (U.S. Pat. No. 6,689,601). The specific “high growth methanotrophic bacterial strain” described herein is referred to as “Methylomonas 16a”, “16a” or “Methylomonas sp. 16a”, which terms are used interchangeably and which refer to the Methylomonas strain used in the present invention.

The term “CrtN1” refers to an enzyme encoded by the crtN1 gene, active in the native carotenoid biosynthetic pathway of Methylomonas sp. 16a. This gene is located within an operon comprising crtN2 and ald.

The term “ALD” refers to an enzyme encoded by the ald gene, active in the native carotenoid biosynthetic pathway of Methylomonas sp. 16a. This gene is located within an operon comprising crtN1 and crtN2.

The term “CrtN2” refers to an enzyme encoded by the crtN2 gene, active in the native carotenoid biosynthetic pathway of Methylomonas sp. 16a. This gene is located within an operon comprising crtN1 and ald.

The term “CrtN3” refers to an enzyme encoded by the crtN3 gene, active in the native carotenoid biosynthetic pathway of Methylomonas sp. 16a. This gene is not located within the crt gene cluster; instead this gene is present in a different location within the Methylomonas genome.

The terms “crtN1 gene cluster”, “C30 crt gene cluster”, “crt gene cluster”, and “endogenous Methylomonas crt gene cluster” refer to an operon comprising crtN1, ald, and crtN2 genes that is active in the native carotenoid biosynthetic pathway of Methylomonas sp. 16a.

The term “MWM1200 (Δcrt cluster promoter + ΔcrtN3)” refers to a mutant of Methylomonas sp. 16a in which the C30 crt cluster promoter and the crtN3 gene have been disrupted. Disruption of the native C30 carotenoid biosynthetic pathway results in a suitable background for engineering C40 carotenoid production. The Methylomonas MWM1200 strain was previously created and is a suitable carotenoid production host (U.S. Ser. No. 60/527,083; hereby incorporated by reference). The term “pigmentless” or “white mutant” refers to a Methylomonas sp. 16a bacterium wherein the native pink pigment (e.g., a C30 carotenoid) is not produced. Thus, the bacterial cells appear white in color, as opposed to pink.

The terms “plasmid”, “vector” and “cassette” refer to an extra chromosomal element often carrying genes which are not part of the central metabolism of the cell, and usually in the form of circular double-stranded DNA fragments. Such elements may be autonomously replicating sequences, genome integrating sequences, phage or nucleotide sequences, linear or circular, of a single- or double-stranded DNA or RNA, derived from any source, in which a number of nucleotide sequences have been joined or recombined into a unique construction. “Transformation cassette” refers to a specific vector containing a foreign gene and having elements in addition to the foreign gene that facilitate transformation of a particular host cell. “Expression cassette” refers to a specific vector containing a foreign gene and having elements in addition to the foreign gene that allow for enhanced expression of that gene in a foreign host.

The term “sequence analysis software” refers to any computer algorithm or software program that is useful for the analysis of nucleotide or amino acid sequences. “Sequence analysis software” may be commercially available or independently developed. Typical sequence analysis software will include but is not limited to the GCG suite of programs (Wisconsin Package Version 9.0, Genetics Computer Group (GCG), Madison, Wis.), BLASTP, BLASTN, BLASTX (Altschul et al., J. Mol. Biol. 215:403-410 (1990)), and DNASTAR (DNASTAR, Inc., 1228 S. Park St., Madison, Wis. 53715 USA), and the FASTA program incorporating the Smith-Waterman algorithm (W. R. Pearson, Comput. Methods Genome Res., [Proc. Int. Symp.] (1994), Meeting Date 1992, 111-20. Editor(s): Suhai, Sandor. Publisher: Plenum, New York, N.Y.). Within the context of this application it will be understood that where sequence analysis software is used for analysis, the results of the analysis will be based on the “default values” of the program referenced, unless otherwise specified. As used herein “default values” will mean any set of values or parameters (as set by the software manufacturer) which originally load with the software when first initialized.

The present invention provides newly discovered crtW genes encoding carotenoid ketolases. The present CrtW ketolases may be used in vitro and/or in vivo for the production of ketocarotenoids from cyclic carotenoid compounds.

Comparison of the Sphingomonas melonis DC18 crtW nucleotide base and deduced amino acid sequences to public databases reveals that the most similar known sequences were about 57% identical to the amino acid sequence reported herein over a length of 249 amino acids using a Smith-Waterman alignment algorithm (W. R. Pearson, Comput. Methods Genome Res., [Proc. Int. Symp.] (1994), Meeting Date 1992, 111-20. Editor(s): Suhai, Sandor. Publisher: Plenum, New York, N.Y.).

Comparison of the Brevundimonas vesicularis DC263 crtW nucleotide base and deduced amino acid sequences to public databases reveals that the most similar known sequences were about 63% identical to the amino acid sequence reported herein over a length of 259 amino acids using a Smith-Waterman alignment algorithm.

Comparison of the Flavobacterium sp. K1-202C crtW nucleotide base and deduced amino acid sequences to public databases reveals that the most similar known sequences were about 47% identical to the amino acid sequence reported herein over a length of 256 amino acids using a Smith-Waterman alignment algorithm.

In one embodiment, the present invention is comprised of nucleic acid fragments encoding amino acid sequences that are at least about 75%-85% identical to the sequences herein. In another embodiment, the present invention is comprised of nucleic acid fragments encoding amino acid sequences that are at least about 85% to about 95% identical to the amino acid sequences reported herein. In a further embodiment, the present invention is comprised of nucleic acid fragments encoding amino acid sequences that are at least about 95% identical to the amino acid sequences reported herein. In yet a further embodiment, the present invention is comprised of nucleic acid fragments encoding amino acid sequences that are at least 99% identical to the amino acid sequences reported herein.

Similarly, suitable nucleic acid fragments are those comprised of nucleic acid sequences encoding the corresponding active CrtW ketolases which are at least about 80% identical to the nucleic acid sequences reported herein. In one embodiment, suitable crtW nucleic acid fragments are those having nucleic acid sequences that are at least about 90% identical to the nucleic acid sequences herein. In another embodiment, suitable crtW nucleic acid fragments are those having nucleic acid sequences that are at least about 95% identical to the nucleic acid sequences herein. In yet another embodiment, suitable crtW nucleic acid fragments are those having nucleic acid sequences that are at least about 99% identical to the nucleic acid sequences reported herein.

Isolation of Homologs

The nucleic acid fragments of the instant invention may be used to isolate genes encoding homologous proteins from the same or other microbial species. Isolation of homologous genes using sequence-dependent protocols is well known in the art. Examples of sequence-dependent protocols include, but are not limited to, methods of nucleic acid hybridization, and methods of DNA and RNA amplification as exemplified by various uses of nucleic acid amplification technologies (e.g., polymerase chain reaction (PCR), Mullis et al., U.S. Pat. No. 4,683,202; ligase chain reaction (LCR), Tabor, S. et al., Proc. Natl. Acad. Sci. USA, 82:1074 (1985); or strand displacement amplification (SDA), Walker, et al., Proc. Natl. Acad. Sci. U.S.A., 89:392 (1992)).

For example, genes encoding similar proteins or polypeptides to those of the instant invention could be isolated directly by using all or a portion of the instant nucleic acid fragments as DNA hybridization probes to screen libraries from any desired bacteria using methodology well known to those skilled in the art. Specific oligonucleotide probes based upon the instant nucleic acid sequences can be designed and synthesized by methods known in the art (Maniatis). Moreover, the entire sequences can be used directly to synthesize DNA probes by methods known to the skilled artisan such as random primers DNA labeling, nick translation, end-labeling techniques, or RNA probes using available in vitro transcription systems. In addition, specific primers can be designed and used to amplify a part of or the full length of the instant sequences. The resulting amplification products can be labeled directly during amplification reactions or labeled after amplification reactions, and used as probes to isolate full length DNA fragments under conditions of appropriate stringency.

Typically in PCR-type amplification techniques, the primers have different sequences and are not complementary to each other. Depending on the desired test conditions, the sequences of the primers should be designed to provide for both efficient and faithful replication of the target nucleic acid. Methods of PCR primer design are common and well known in the art. (Thein and Wallace, “The use of oligonucleotide as specific hybridization probes in the Diagnosis of Genetic Disorders”, in Human Genetic Diseases: A Practical Approach, K. E. Davis Ed., (1986) pp. 33-50 IRL Press, Herndon, Va.); Rychlik, W., in Methods in Molecular Biology, PCR Protocols: Current Methods and Applications, Vol. 15, pages 31-39, White, B. A. (ed.), (1993) Humana Press, Inc., Totowa, N.J.)

Generally two short segments of the instant sequences may be used in polymerase chain reaction protocols to amplify longer nucleic acid fragments encoding homologous genes from DNA or RNA. The polymerase chain reaction may also be performed on a library of cloned nucleic acid fragments wherein the sequence of one primer is derived from the instant nucleic acid fragments, and the sequence of the other primer takes advantage of the presence of the polyadenylic acid tracts to the 3' end of the mRNA precursor of a eukaryotic gene. In the case of microbial genes which lack polyadenylated mRNA, random primers may be used. Random primers may also be useful for amplification from DNA.

Alternatively, the second primer sequence may be based upon sequences derived from the cloning vector. For example, the skilled artisan can follow the RACE protocol (Frohman et al., Proc. Natl. Acad. Sci. USA, 85:8998 (1988)) to generate cDNAs by using PCR to amplify copies of the region between a single point in the transcript and the 3' or 5' end. Primers oriented in the 3' and 5' directions can be designed from the instant sequences. Using commercially available 3' RACE or 5' RACE systems (BRL), specific 3' or 5' cDNA fragments can be isolated (Ohara et al., Proc. Natl. Acad. Sci. USA, 86:5673 (1989); Loh et al., Science, 243:217 (1989)).

Alternatively, the instant sequences may be employed as hybridization reagents for the identification of homologs. The basic components of a nucleic acid hybridization test include a probe, a sample suspected of containing the gene or gene fragment of interest, and a specific hybridization method. Probes of the present invention are typically single stranded nucleic acid sequences which are complementary to the nucleic acid sequences to be detected. Probes are “hybridizable” to the nucleic acid sequence to be detected. The probe length can vary from 5 bases to tens of thousands of bases, and will depend upon the specific test to be done. Typically, a probe length of about 15 bases to about 30 bases is suitable. Only part of the probe molecule need be complementary to the nucleic acid sequence to be detected. In addition, the complementarity between the probe and the target sequence need not be perfect. Hybridization does occur between imperfectly complementary molecules with the result that a certain fraction of the bases in the hybridized region are not paired with the proper complementary base.

Hybridization methods are well defined. Typically the probe and sample must be mixed under conditions which will permit nucleic acid hybridization. This involves contacting the probe and sample in the presence of an inorganic or organic salt under the proper concentration and temperature conditions. The probe and sample nucleic acids must be

DNA expression libraries to isolate full-length DNA clones of interest (Lerner, R. A., Adv. Immunol. 36:1 (1984); Maniatis, supra).

Genes Involved in Carotenoid Production

The enzymatic pathway involved in the biosynthesis of carotenoids can be conveniently viewed in two parts, the upper isoprenoid pathway providing for the conversion of pyruvate and glyceraldehyde-3-phosphate to farnesyl pyrophosphate (FPP) and the lower carotenoid biosynthetic pathway, which provides for the synthesis of phytoene and all subsequently produced carotenoids.
The upper pathway in contact for a long enough time that any possible hybrid is ubiquitous in many non-carotogenic microorganisms and ization between the probe and sample nucleic acid may in these cases it will only be necessary to introduce genes occur. The concentration of probe or target in the mixture 15 that comprise the lower pathway for the biosynthesis of the will determine the time necessary for hybridization to occur. desired carotenoid. The key division between the two path The higher the probe or target concentration the shorter the ways concerns the synthesis of farnesyl pyrophosphate. hybridization incubation time needed. Optionally, a chao Where FPP is naturally present, only elements of the lower tropic agent may be added. The chaotropic agent stabilizes carotenoid pathway will be needed. However, it will be nucleic acids by inhibiting nuclease activity. Furthermore, appreciated that for the lower pathway carotenoid genes to the chaotropic agent allows sensitive and stringent hybrid be effective in the production of carotenoids, it will be ization of short oligonucleotide probes at room temperature necessary for the host cell to have suitable levels of FPP (Van Ness and Chen, Nucl. Acids Res., 19:5143-5151 within the cell. In another embodiment, isoprenoid biosyn (1991)). Suitable chaotropic agents include guanidinium thesis genes may be optionally upregulated to increase the chloride, guanidinium thiocyanate, sodium thiocyanate, 25 levels of FPP available for cartenoid biosynthesis. Where lithium tetrachloroacetate, sodium perchlorate, rubidium tet FPP synthesis is not provided by the host cell, it will be rachloroacetate, potassium iodide, and cesium trifluoroac necessary to introduce the genes necessary for the produc etate, among others. Typically, the chaotropic agent will be tion of FPP. Each of these pathways will be discussed below present at a final concentration of about 3M. If desired, one in detail. 
can add formamide to the hybridization mixture, typically 30 The Upper Isoprenoid Pathway 30-50% (v/v). Isoprenoid biosynthesis occurs through either of two Various hybridization solutions can be employed. Typi pathways, generating the common C5 isoprene sub-unit, cally, these comprise from about 20 to 60% volume, pref isopentenyl pyrophosphate (IPP). First, IPP may be synthe erably 30%, of a polar organic solvent. A common hybrid sized through the well-known acetate/mevalonate pathway. ization solution employs about 30-50% V/v formamide, 35 However, recent studies have demonstrated that the meva about 0.15 to 1M sodium chloride, about 0.05 to 0.1M lonate-dependent pathway does not operate in all living buffers, such as sodium citrate, Tris-HCl, PIPES or HEPES organisms. An alternate mevalonate-independent pathway (pH range about 6-9), about 0.05 to 0.2% detergent, such as for IPP biosynthesis has been characterized in bacteria and sodium dodecylsulfate, or between 0.5-20 mM EDTA, in green algae and higher plants (Horbach et al., FEMS FICOLL (Pharmacia Inc.) (about 300-500 kilodaltons), 40 Microbiol. Lett., 111:135-140 (1993); Rohmer et al., Bio polyvinylpyrrolidone (about 250-500 kD), and serum albu chem. 295: 517-524 (1993); Schwender et al., Biochem. min. Also included in the typical hybridization solution will 316: 73-80 (1996); and Eisenreich et al., Proc. Natl. Acad. be unlabeled carrier nucleic acids from about 0.1 to 5 Sci. USA, 93: 6431-6436 (1996)). mg/mL, fragmented nucleic DNA, e.g., calf thymus or Many steps in the mevalonate-independent isoprenoid salmon sperm DNA, or yeast RNA, and optionally from 45 pathway are known. For example, the initial steps of the about 0.5 to 2% wt.?vol. glycine. Other additives may also alternate pathway leading to the production of IPP have been be included. Such as Volume exclusion agents which include studied in Mycobacterium tuberculosis by Cole et al. 
(Na a variety of polar water-soluble or Swellable agents, such as ture, 393:537-544 (1998)). The first step of the pathway polyethylene glycol, anionic polymers such as polyacrylate involves the condensation of two 3-carbon molecules (pyru or polymethylacrylate, and anionic saccharidic polymers, 50 vate and D-glyceraldehyde 3-phosphate) to yield a 5-carbon Such as dextran Sulfate. compound known as D-1-deoxyxylulose-5-phosphate. This Nucleic acid hybridization is adaptable to a variety of reaction occurs by the DXS enzyme, encoded by the dxs assay formats. One of the most suitable is the sandwich gene. Next, the isomerization and reduction of D-1-deox assay format. The sandwich assay is particularly adaptable yxylulose-5-phosphate yields 2-C-methyl-D-erythritol-4- to hybridization under non-denaturing conditions. A primary 55 phosphate. One of the enzymes involved in the isomeriza component of a sandwich-type assay is a solid Support. The tion and reduction process is D-1-deoxyxylulose-5- solid support has adsorbed to it or covalently coupled to it phosphate reductoisomerase (DXR), encoded by the gene immobilized nucleic acid probe that is unlabeled and dxr (also known as ispC). 2-C-methyl-D-erythritol-4-phos complementary to one portion of the sequence. phate is Subsequently converted into 4-diphosphocytidyl Availability of the instant nucleotide and deduced amino 60 2C-methyl-D-erythritol in a CTP-dependent reaction by the acid sequences facilitates immunological screening DNA enzyme encoded by the non-annotated geneygbP. Recently, expression libraries. Synthetic peptides representing por however, the ygbP gene was renamed as isp) as a part of the tions of the instant amino acid sequences may be synthe isp gene cluster (SwissProtein Accession iQ46893). sized. 
These peptides can be used to immunize animals to Next, the 2" position hydroxy group of 4-diphosphocyti produce polyclonal or monoclonal antibodies with specific 65 dyl-2C-methyl-D-erythritol can be phosphorylated in an ity for peptides or proteins comprising the amino acid ATP-dependent reaction by the enzyme encoded by the sequences. These antibodies can be then be used to Screen ychB gene. YchB phosphorylates 4-diphosphocytidyl-2C US 7,252,985 B2 21 22 methyl-D-erythritol, resulting in 4-diphosphocytidyl-2C methyl-D-erythritol 2-phosphate. The yehB gene was TABLE 1-continued renamed as ispE, also as a part of the isp gene cluster (SwissProtein Accession #P24209). YgbB converts Sources of Genes Encoding the Upper Isoprene Pathway 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate to GenBank (R). Accession Number 2C-methyl-D-erythritol 2.4-cyclodiphosphate in a CTP-de Gene and Source Organism pendent manner. This gene has also been recently renamed (1-hydroxy-2- P54482, Bacilius subtilis as ispF (SwissProtein Accession iP36663). methyl-2-(E)- Q9pky3, Chlamydia muridarum The enzymes encoded by the gcpE (also known as ispG) butenyl 4 Q978HO, Chlamydophila pneumoniae and lytB (also known as ispH) genes (and perhaps others) 10 diphosphate O84060, Chlamydia trachomatis synthase) P27433, Escherichia coli are thought to participate in the reactions leading to forma P44667, Haemophilus influenzae tion of isopentenyl pyrophosphate (IPP) and dimethylallyl Q97LL0, Helicobacter pylori J99 pyrophosphate (DMAPP). IPP may be isomerized to O33350, Mycobacterium tuberculosis DMAPP via IPP isomerase, encoded by the idi gene. How S77159, Synechocystis sp. 15 Q9WZZ3, Thermotoga maritima ever, this enzyme is not essential for Survival and may be O83460, Treponema pallidum absent in some bacteria using 2-C-methyl-D-erythritol Q9JZ40, Neisseria meningitidis 4-phosphate (MEP) pathway. 
Recent evidence suggests that Q9PPM1, Campylobacter jejuni the MEP pathway branches before IPP and separately pro Q9RXC9, Deinococcus radiodurans AAG07190, Pseudomonas aeruginosa duces IPP and DMAPP via the lytB gene product. A lytB Q9KTX1, Vibrio cholerae knockout mutation is lethal in E. coli except in media lytB (isph) AF027189, Acinetobacter sp. BD413 supplemented with both IPP and DMAPP. AF098521, Burkholderia pseudomaiei AF291696, Streptococcus pneumoniae The synthesis of FPP occurs via the isomerization of IPP AF323927, Plasmodium falciparum gene to dimethylallyl pyrophosphate. This reaction is followed by M87645, Bacillus subtilis a sequence of two prenyltransferase reactions catalyzed by U38915, Synechocystis sp. isp A, leading to the creation of geranyl pyrophosphate 25 X89371, C. jejuni sp. O67496 ispA (FPP ABOO3187, Micrococcus luteus (GPP; a 10-carbon molecule) and farnesyl pyrophosphate synthase) AB016094, Synechococcus elongatus (FPP; a 15-carbon molecule). AB021747, Oryza sativa FPPS1 gene for Genes encoding elements of the upper pathway are known farnesyl diphosphate synthase from a variety of plant, animal, and bacterial sources, as B028044, Rhodobacter sphaeroides 30 B028046, Rhodobacter capsulatus shown in Table 1. B028047, Rhodovulum sulfidophilum F112881 and AF136602, Artemisia annua TABLE 1. F384040, Mentha X piperita OO694, Escherichia coi Sources of Genes Encoding the Upper Isoprene Pathway 3293, B. Stearothermophilus 35 85317, Oryza sativa GenBank (R). Accession Number 75789, A. thaliana Gene and Source Organism 2072, G. arboreum Z49786, H. brasiliensis dxs (D-1- FO35440, Escherichia coli U80605, Arabidopsis thaliana farnesyl deoxyxylulose 5- 8874, Synechococcus PCC6301 iphosphate synthase precursor (FPS1) phosphate B026631, Streptomyces sp. CL190 mRNA, complete cols synthase) B042821, Streptomyces griseolosporetis 40 X76026, K. 
lactis FPS gene for farnesyl F111814, Plasmodium falciparum iphosphate synthetase, QCR8 gene for F143812, Lycopersicon esculentum b c1 complex, subunit VIII 279019, Narcissus pseudonarcissus 82542, P argentatum mRNA for farnesyl 291721, Nicotiana tabacum phosphate synthase ( FPS1), dxr (ispC) (1- BO13300, Escherichia coli 82543, P argentatum mRNA for farnesyl deoxy-D- B049187, Streptomyces griseolosporeus 45 phosphate synthase ( FPS2). xylulose 5- F111813, Plasmodium falciparum BCO10004, Homo sapiens, farnesyl diphosphate phosphate F116825, Mentha X piperita synthase (farnesyl pyrophosphate synthetase, reductoisomerase) F148852, Arabidopsis thaliana imethylallyltranstrans erase, geranyl F182287, Artemisia annua transtransferase), clone MGC 15352 IMAGE, F250235, Catharanthus roseus 4132071, mRNA, complete cols F282879, Pseudomonas aeruginosa 50 AF234168, Dictyostelium discoideum farnesyl 242588, Arabidopsis thaliana iphosphate synthase ( Dfps) 250714, Zymomonas mobilis strain ZM4 L46349, Arabidopsis thaliana farnesyl 292312, Klebsiella pneumoniae, iphosphate synthase ( FPS2) mRNA, 297566, Zea mays complete cols B037876, Arabidopsis thaliana L46350, Arabidopsis thaliana farnesyl (2-C-methyl-D- F109075, Clostridium difficile 55 iphosphate synthase ( FPS2) gene, erythritol 4- F230736, Escherichia coli complete cols phosphate F230737, Arabidopsis thaliana L46367, Arabidopsis thaliana farnesyl cytidylyltrans iphosphate synthase ( FPS1) gene, ferase) alternative products, complete cols ychB (ispE) (4- A. F216300, Escherichia coli M89945, Rat farnesyl diphosphate diphosphocytidyl- A. F263101, Lycopersicon esculentum synthasegene, exons 1 8 2-C-methyl-D- A. F288615, Arabidopsis thaliana 60 NM 002004, Homo sapiens farnesyl erythritol kinase) diphosphate synthase (farnesyl ygbB (ispF) (2- AB038256, Escherichia coli mecs gene pyrophosphate synthetase, dimethyl C-methyl-D- AF230738, Escherichia coli allyltranstransferase, geranyltrans erythritol 2,4- A. 
F250236, Catharanthus roseus (MECS) transferase) (FDPS), m RNA cyclodiphosphate A. F279661, Plasmodium falciparum U36376, Artemisia annua farnesyl synthase) A. F321531, Arabidopsis thaliana 65 diphosphate synthase (fps1) mRNA, O 67496, Aquifex aeolicus complete cols US 7,252,985 B2 23 24 Genes encoding elements of the lower carotenoid biosyn TABLE 1-continued thetic pathway are known from a variety of plant, animal, Sources of Genes Encoding the Upper Isoprene Pathway and bacterial Sources, as shown in Table 2. GenBank (R). Accession Number TABLE 2 Gene and Source Organism Sources of Genes Encoding the Lower XM 001352, Homo sapiens farnesyl Carotenoid Biosynthetic Pathway diphosphate synthase (farnesy pyrophosphate synthetase, dimethyl GenBank Accession Number allyltranstransferase, geranyltrans 10 Gene and Source Organism transferase) (FDPS), mRNA 034497, Homo sapiens airnesyl crtE (GGPP A. B000835, Arabidopsis thaliana diphosphate synthase (farnesy Synthase) A. B016043 and ABO19036, Homo sapiens pyrophosphate synthetase, dimethyl ABO16044, Mus musculius allyltranstransferase, geranyltrans AB027705 and AB027706, Daucus carota transferase) (FDPS), mRNA 15 AB034249, Croton subly ratus XM 034498, Homo sapiens airnesyl AB034250, Scoparia dulcis diphosphate synthase (farnesy AFO20041, Heianthus annuus pyrophosphate synthetase, dimethyl A. 
F049658, Drosophila melanogaster signal recognition particle 19 kDa protein (Srp19) gene, partial sequence; and geranylgeranyl pyrophosphate synthase (quemao) gene, complete cds
    AF049659, Drosophila melanogaster geranylgeranyl pyrophosphate synthase mRNA, complete cds
    AF139916, Brevibacterium linens
    AF279807, Penicillium paxilli geranylgeranyl pyrophosphate synthase (ggs1) gene, complete
    AF279808, Penicillium paxilli dimethylallyl tryptophan synthase (paxD) gene, partial cds; and cytochrome P450 monooxygenase (paxQ), cytochrome P450 monooxygenase (paxP), PaxC (paxC), monooxygenase (paxM), geranylgeranyl pyrophosphate synthase (paxG), PaxU (paxU), and metabolite transporter (paxT) genes, complete cds
    AJ010302, Rhodobacter sphaeroides
    AJ133724, Mycobacterium aurum
    AJ276129, Mucor circinelloides f. lusitanicus carG gene for geranylgeranyl pyrophosphate synthase, exons 1-6
    D85029, Arabidopsis thaliana mRNA for geranylgeranyl pyrophosphate synthase, partial cds
    L25813, Arabidopsis thaliana
    L37405, Streptomyces griseus geranylgeranyl pyrophosphate synthase (crtB), phytoene desaturase (crtE) and phytoene synthase (crtI) genes, complete cds
    U15778, Lupinus albus geranylgeranyl pyrophosphate synthase (ggps1) mRNA, complete cds
    U44876, Arabidopsis thaliana pregeranylgeranyl pyrophosphate synthase (GGPS2) mRNA, complete cds
    X92893, C. roseus
    X95596, S. griseus
    X98795, S. alba
    Y15112, Paracoccus marcusii
crtX (Zeaxanthin glucosylase):
    D90087, E. uredovora
    M87280 and M90698, Pantoea agglomerans
crtY (Lycopene-β-cyclase):
    AF139916, Brevibacterium linens
    AF152246, Citrus x paradisi
    AF218415, Bradyrhizobium sp. ORS278
    AF272737, Streptomyces griseus strain IFO13350
    AJ133724, Mycobacterium aurum
    AJ250827, Rhizomucor circinelloides f. lusitanicus carRP gene for lycopene cyclase/phytoene synthase, exons 1-2
    AJ276965, Phycomyces blakesleeanus carRA gene for phytoene synthase/lycopene cyclase, exons 1-2
    D58420, Agrobacterium aurantiacum
    D83513, Erythrobacter longus
    L40176, Arabidopsis thaliana lycopene cyclase (LYC) mRNA, complete cds
    M87280, Pantoea agglomerans
    U50738, Arabidopsis thaliana lycopene epsilon

TABLE 1-continued (remaining entries)
    allyltranstransferase, geranyltranstransferase) (FDPS), mRNA
    XM_034499, Homo sapiens farnesyl diphosphate synthase (farnesyl pyrophosphate synthetase, dimethylallyltranstransferase, geranyltranstransferase) (FDPS), mRNA
    0345002, Homo sapiens farnesyl diphosphate synthase (farnesyl pyrophosphate synthetase, dimethylallyltranstransferase, geranyltranstransferase) (FDPS), mRNA

The Lower Carotenoid Biosynthetic Pathway

The division between the upper isoprenoid pathway and the lower carotenoid pathway is somewhat subjective. Because FPP synthesis is common in both carotenogenic and non-carotenogenic bacteria, the first step in the lower carotenoid biosynthetic pathway is considered to begin with the prenyltransferase reaction converting farnesyl pyrophosphate (FPP) to geranylgeranyl pyrophosphate (GGPP). The gene crtE, encoding GGPP synthetase, is responsible for this prenyltransferase reaction which adds IPP to FPP to produce the 20-carbon molecule GGPP. A condensation reaction of two molecules of GGPP occurs to form phytoene (PPPP), the first 40-carbon molecule of the lower carotenoid biosynthesis pathway. This enzymatic reaction is catalyzed by crtB, encoding phytoene synthase.

Lycopene, which imparts a "red" colored spectra, is produced from phytoene through four sequential dehydrogenation reactions by the removal of eight atoms of hydrogen, catalyzed by the gene crtI (encoding phytoene desaturase). Intermediaries in this reaction are phytofluene, zeta-carotene, and neurosporene.

Lycopene cyclase (crtY) converts lycopene to β-carotene. In the present invention, a reporter plasmid is used which produces β-carotene as the genetic end product. However, additional genes may be used to create a variety of other carotenoids. For example, β-carotene is converted to zeaxanthin via a hydroxylation reaction resulting from the activity of β-carotene hydroxylase (encoded by the crtZ gene). β-cryptoxanthin is an intermediate in this reaction.

β-carotene is converted to canthaxanthin by β-carotene ketolase encoded by either the crtW or crtO gene. Echinenone is an intermediate in this reaction. Canthaxanthin can then be converted to astaxanthin by β-carotene hydroxylase encoded by the crtZ or crtR gene. Adonirubin is an intermediate in this reaction.

Zeaxanthin can be converted to zeaxanthin-β-diglucoside. This reaction is catalyzed by zeaxanthin glucosyltransferase (crtX).
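The gene-by-gene conversions described for the lower carotenoid pathway can be restated as a small lookup table. The following is only an illustrative sketch, not part of the disclosed method: the names `LOWER_PATHWAY` and `reachable_products` are hypothetical, the intermediates named in the text (phytofluene, echinenone, adonirubin and so on) are left out for brevity, and the host is assumed to supply FPP. It answers one question: given a set of introduced genes, which compounds become reachable.

```python
# Gene -> (substrate, product) conversions, as described in the text.
# Hypothetical illustration: intermediates are omitted and enzyme
# specificity, expression levels and kinetics are ignored.
LOWER_PATHWAY = {
    "crtE": [("FPP", "GGPP")],
    "crtB": [("GGPP", "phytoene")],
    "crtI": [("phytoene", "lycopene")],
    "crtY": [("lycopene", "beta-carotene")],
    "crtZ": [("beta-carotene", "zeaxanthin"),
             ("canthaxanthin", "astaxanthin")],
    "crtR": [("canthaxanthin", "astaxanthin")],
    "crtW": [("beta-carotene", "canthaxanthin")],
    "crtO": [("beta-carotene", "canthaxanthin")],
    "crtX": [("zeaxanthin", "zeaxanthin-beta-diglucoside")],
}


def reachable_products(genes, start="FPP"):
    """Return every compound reachable from `start` using only `genes`."""
    found = {start}
    changed = True
    while changed:  # keep sweeping until no gene adds a new compound
        changed = False
        for gene in genes:
            for substrate, product in LOWER_PATHWAY.get(gene, []):
                if substrate in found and product not in found:
                    found.add(product)
                    changed = True
    return found


if __name__ == "__main__":
    base = {"crtE", "crtB", "crtI", "crtY"}
    print(sorted(reachable_products(base)))
    print(sorted(reachable_products(base | {"crtW"})))
```

In this sketch a ketolase gene on its own reaches nothing beyond FPP, because its substrate, β-carotene, is only made once crtE, crtB, crtI and crtY are all present.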

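The carbon counts quoted for IPP, FPP, GGPP and phytoene, and the statement that four dehydrogenations remove eight hydrogen atoms, can be checked with a few lines of arithmetic. This is a hedged sketch only: the molecular formula C40H64 for phytoene is standard chemistry assumed here rather than stated in the text, and the name `desaturation_formulas` is hypothetical.

```python
# Carbon counts stated in the text: IPP is a C5 unit, FPP is a C15 molecule.
IPP_CARBONS = 5
FPP_CARBONS = 15

ggpp_carbons = FPP_CARBONS + IPP_CARBONS  # crtE adds one IPP to FPP
phytoene_carbons = 2 * ggpp_carbons       # crtB condenses two GGPP molecules

# Assumption (not from the text): phytoene has the formula C40H64.
PHYTOENE_HYDROGENS = 64

# Phytoene, the three intermediates named in the text, then lycopene.
DESATURATION_SERIES = [
    "phytoene", "phytofluene", "zeta-carotene", "neurosporene", "lycopene",
]


def desaturation_formulas(total_h_removed=8):
    """Spread the hydrogen loss evenly over the sequential desaturation steps."""
    steps = len(DESATURATION_SERIES) - 1        # four reactions
    h_per_step = total_h_removed // steps       # two hydrogens per reaction
    return [
        (name, f"C{phytoene_carbons}H{PHYTOENE_HYDROGENS - i * h_per_step}")
        for i, name in enumerate(DESATURATION_SERIES)
    ]


if __name__ == "__main__":
    for name, formula in desaturation_formulas():
        print(name, formula)
```

Under these assumptions the series ends at a C40H56 formula for lycopene, and the carbon count stays at 40 throughout because desaturation removes only hydrogen.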
TABLE 2-continued TABLE 2-continued Sources of Genes Encoding the Lower Sources of Genes Encoding the Lower Carotenoid Biosynthetic Pathway Carotenoid Biosynthetic Pathway GenBank Accession Number GenBank Accession Number Gene and Source Organism Gene and Source Organism cyclase mRNA, complete cols gene for lycopene cyclase/phytoene synthase, U50739, Arabidosis thaliana lycopene B cyclase AJ304825, Heianthus annuus mRNA for mRNA, complete cols 10 phytoene synthase (psy gene) U62808, Flavobacterium ATCC21588 AJ3O8385, Heianthus annuus mRNA for X74599, Synechococcus sp. Icy gene for lycopene phytoene synthase (psy gene) cyclase D58420, Agrobacterium aurantiacum X81787, N. tabacum CrtL-1 gene encoding L23424, Lycopersicon escientain phytoene lycopene cyclase synthase (PSY2) mRNA, complete cols X86221, C. annuum 15 L25812, Arabidopsis thaliana X86452, L. esculentum mRNA for lycopene L37405, Streptomyces grisetts geranylgeranyl 3-cyclase pyrophosphate synthase (crtB), phytoene X95596, S. griseus desaturase (crtE) and phytoene synthase X98796, N. pseudonarcissus (crtI) genes, complete cols crtI (Phytoene AB046992, Citrus unshiu CitPDS1 mRNA for M38424, Pantoea agglomerans phytoene desaturase) phytoene desaturase, complete cols synthase (crtE) gene, complete cols AF0395.85, Zea mays phytoene desaturase M87280, Pantoea agglomerans (pds 1) gene promoter region and exon 1 S71770, Carotenoid gene cluster AF049356, Oryza sativa phytoene desaturase U32636, Zea mays phytoene synthase (Y1) precursor (Pds) mRNA, complete cols gene, complete cols AF139916, Brevibacterium inens U62808, Flavobacterium ATCC21588 AF218415, Bradyrhizobium sp. ORS278 U87626, Rubrivivax gelatinosus AF251014, Tagetes erecta 25 U91900, Dunaieila bardawii AF364515, Citrus X paradisi X52291, Rhodobacter capsulatus D58420, Agrobacterium aurantiacum X60441, L. 
esculentum GTom5 gene for D83514, Erythrobacter longus phytoene synthase L16237, Arabidopsis thaliana X63873, Synechococcus PCC7942 pys gene L37405, Streptomyces grisetis geranylgeranyl or phytoene synthase pyrophosphate synthase (crtB), phytoene 30 X68017, C. annuum psy1 mRNA for desaturase (crtE) and phytoene synthase phytoene synthase (crtI) genes, complete cols X69172, Synechocystis sp. pys gene L39266, Zea mays phytoene desaturase (Pds) or phytoene synthase mRNA, complete cols X78814, N. pseudonarcissus M64704: Soybean phytoene desaturase crtz (B- D58420, Agrobacterium aurantiacum M88683, Lycopersicon esculentum phytoene 35 caroteine D58422, Alcaligenes sp. desaturase (pds) mRNA, complete cols hydroxylase) D90087, E. uredovora S71770, carotenoid gene cluster M87280, Pantoea agglomerans U37285, Zea mays U62808, Flavobacterium ATCC21588 U46919, Solanum lycopersicum phytoene Y15112, Paracoccus marcusii desaturase (Pds) gene, partial cols crtW (p-carotene AF218415, Bradyrhizobium sp. ORS278 U62808, Flavobacterium ATCC21588 ketolase) D45881, Haematococcus pluvialis X55289, Synechococcus pds gene for 40 D58420, Agrobacterium aurantiacum phytoene desaturase D58422, Alcaligenes sp. X59948, L. escuienium X86782, H. pluvialis X62574, Synechocystis sp. pds gene for Y15112, Paracoccus marcusii phytoene desaturase X68058, C. annuum pds1 mRNA for phytoene desaturase 45 X71023, Lycopersicon escientain pds gene Preferred sources of the non-crtW carotenoid genes are for phytoene desaturase from Pantoea stewartii (ATCC 8199: WO 02/079395), X78271, L. esculentum (Ailsa Craig) PDS gene Enterobactericeae DC260 (U.S. Ser. No. 10/808,979), and X78434, P. blakesleeanus (NRRL1555) carB gene X78815, N. pseudonarcissus Pantoea agglomerans DC404 (U.S. Ser. No. 10/808,807). X86783, H. 
pluvialis
    4807, Dunaliella bardawil
    5007, Xanthophyllomyces dendrorhous
    5112, Paracoccus marcusii
    5114, Anabaena PCC7210 crtP gene
    1165, R. capsulatus
crtB (Phytoene synthase):
    B001284, Spirulina platensis
    B032797, Daucus carota PSY mRNA for phytoene synthase, complete cds
    B034704, Rubrivivax gelatinosus
    B037975, Citrus unshiu
    F009954, Arabidopsis thaliana phytoene synthase (PSY) gene, complete cds
    F139916, Brevibacterium linens
    F152892, Citrus x paradisi
    F218415, Bradyrhizobium sp. ORS278
    F220218, Citrus unshiu phytoene synthase sy1) mRNA, complete cds
    010302, Rhodobacter
    133724, Mycobacterium aurum
    278287, Phycomyces blakesleeanus carRA

Preferred sources of crtW genes are from Sphingomonas melonis DC18 (SEQ ID NO:1), Brevundimonas vesicularis DC263 (SEQ ID NO:3), and Flavobacterium sp. K1-202C (SEQ ID NO:5).

By using various combinations of the genes presented in Table 2 and the preferred crtW genes of the present invention, numerous different carotenoids and carotenoid derivatives could be made using the methods of the present invention, provided that sufficient sources of FPP are available in the host organism. For example, the gene cluster crtEXYIB enables the production of β-carotene. The addition of the crtW gene to crtEXYIB enables the production of canthaxanthin.

It is envisioned that useful products of the present invention will include any ketocarotenoid compound as defined herein including, but not limited to , adonixanthin, astaxanthin, canthaxanthin, β-cryptoxanthin, keto-γ-carotene, echinenone, 3-hydroxyechinenone, 3'-hydroxyechinenone, and C30-ketocarotenoids.

Recombinant Expression—Microbial

The gene and gene product of the instant sequences may be produced in heterologous host cells, particularly in the cells of microbial hosts. Expression in recombinant microbial hosts may be useful for the expression of various pathway intermediates, for the modulation of pathways already existing in the host, or for the synthesis of new products heretofore not possible using the host.

Preferred heterologous host cells for expression of the instant genes and nucleic acid fragments are microbial hosts that can be found broadly within the fungal or bacterial families and which grow over a wide range of temperature, pH values, and solvent tolerances. For example, it is contemplated that any of bacteria, yeast, and filamentous fungi will be suitable hosts for expression of the present nucleic acid fragments. Because transcription, translation and the protein biosynthetic apparatus are the same irrespective of the cellular feedstock, functional genes are expressed irrespective of the carbon feedstock used to generate cellular biomass. Large-scale microbial growth and functional gene expression may utilize a wide range of simple or complex carbohydrates, organic acids and alcohols, saturated hydrocarbons such as methane or carbon dioxide in the case of photosynthetic or chemoautotrophic hosts. However, the functional genes may be regulated, repressed or depressed by specific growth conditions, which may include the form and amount of nitrogen, phosphorous, sulfur, oxygen, carbon or any trace micronutrient including small inorganic ions. In addition, the regulation of functional genes may be achieved by the presence or absence of specific regulatory molecules that are added to the culture and are not typically considered nutrient or energy sources. Growth rate may also be an important regulatory factor in gene expression. Examples of host strains include, but are not limited to bacterial, fungal or yeast species such as Aspergillus, Trichoderma, Saccharomyces, Pichia, Candida, Hansenula, or bacterial species such as Salmonella, Bacillus, Acinetobacter, Zymomonas, Agrobacterium, Erythrobacter, Chlorobium, Chromatium, Flavobacterium, Cytophaga, Rhodobacter, Rhodococcus, Streptomyces, Brevibacterium, Corynebacteria, Mycobacterium, Deinococcus, Escherichia, Erwinia, Pantoea, Pseudomonas, Sphingomonas, Methylomonas, Methylobacter, Methylococcus, Methylosinus, Methylomicrobium, Methylocystis, Alcaligenes, Synechocystis, Synechococcus, Anabaena, Thiobacillus, Methanobacterium, Klebsiella, and Myxococcus. In one embodiment, suitable bacterial host strains include Escherichia, Bacillus, and Methylomonas.

Microbial expression systems and expression vectors containing regulatory sequences that direct high level expression of foreign proteins are well known to those skilled in the art. Any of these could be used to construct chimeric genes for expression of present ketolases. These chimeric genes could then be introduced into appropriate microorganisms via transformation to provide high level expression of the enzymes.

Accordingly, it is expected that introduction of chimeric genes encoding the instant bacterial enzymes under the control of the appropriate promoters will demonstrate increased or altered cyclic ketocarotenoid production. It is contemplated that it will be useful to express the instant genes both in natural host cells as well as heterologous host. Introduction of the present crtW genes into native host will result in altered levels of existing ketocarotenoid production. Additionally, the instant genes may also be introduced into non-native host bacteria where the existing carotenoid pathway may be manipulated.

Specific ketocarotenoids that will be produced by the present invention include, but are not limited to canthaxanthin, astaxanthin, adonixanthin, adonirubin, echinenone, 3-hydroxyechinenone, 3'-hydroxyechinenone, 4-keto-gamma-carotene, 4-keto-, 4-keto-torulene, 3-hydroxy-4-keto-torulene, deoxyflexixanthin, and myxobactone. Of particular interest is the production of astaxanthin and canthaxanthin, the synthesis of which is shown in FIG. 1. The specific substrate for the present CrtW enzymes is a cyclic carotenoid. Cyclic carotenoids are well known in the art and available commercially. Preferred in the present invention are CrtW ketolase substrates that include, but are not limited to β-carotene, γ-carotene, zeaxanthin, β-cryptoxanthin, 3'-hydroxyechinenone, rubixanthin, echinenone, and torulene.

Vectors or cassettes useful for the transformation of suitable host cells are well known in the art. Typically the vector or cassette contains sequences directing transcription and translation of the relevant gene, a selectable marker, and sequences allowing autonomous replication or chromosomal integration. Suitable vectors comprise a region 5' of the gene which harbors transcriptional initiation controls and a region 3' of the DNA fragment which controls transcriptional termination. It is most preferred when both control regions are derived from genes homologous to the transformed host cell, although it is to be understood that such control regions need not be derived from the genes native to the specific species chosen as a production host.

Initiation control regions or promoters, which are useful to drive expression of the instant ORFs in the desired host cell are numerous and familiar to those skilled in the art. Virtually any promoter capable of driving these genes is suitable for the present invention including but not limited to CYC1, HIS3, GAL1, GAL10, ADH1, PGK, PHO5, GAPDH, ADC1, TRP1, URA3, LEU2, ENO, TPI (useful for expression in Saccharomyces); AOX1 (useful for expression in Pichia); and lac, ara, tet, trp, lPL, lPR, T7, tac, and trc (useful for expression in Escherichia coli) as well as the amy, apr, npr promoters and various phage promoters useful for expression in Bacillus, and promoters isolated from the nrtA, glnB, moxF, glyoxII, htpG, and hps genes useful for expression in Methylomonas (U.S. Ser. No. 10/689,200). Additionally, promoters such as the chloramphenicol resistance gene promoter may also be useful for expression in Methylomonas.

Termination control regions may also be derived from various genes native to the preferred hosts. Optionally, a termination site may be unnecessary, however, it is most preferred if included.

Knowledge of the sequence of the present gene will be useful in manipulating the carotenoid biosynthetic pathways in any organism having such a pathway and particularly in Methylomonas sp. 16a and Escherichia coli. Methods of manipulating genetic pathways are common and well known in the art. Selected genes in a particular pathway may be upregulated or down regulated by a variety of methods. Additionally, competing pathways in the organism may be eliminated or sublimated by gene disruption and similar techniques.

Once a key genetic pathway has been identified and sequenced, specific genes may be upregulated to increase the output of the pathway. For example, additional copies of the targeted genes may be introduced into the host cell on multicopy plasmids such as pBR322. Alternatively the target genes may be modified so as to be under the control of non-native promoters. Where it is desired that a pathway operate at a particular point in a cell cycle or during a fermentation run, regulated or inducible promoters may be used to replace the native promoter of the target gene. Similarly, in some cases the native or endogenous promoter may be modified to increase gene expression. For example, endogenous promoters can be altered in vivo by mutation, deletion, and/or substitution (see, Kmiec, U.S. Pat. No. 5,565,350; Zarling et al., PCT/US93/03868).

Alternatively it may be necessary to reduce or eliminate the expression of certain genes in the target pathway or in competing pathways that may serve as competing sinks for energy or carbon. Methods of down-regulating genes for this purpose have been explored. Where sequence of the gene to be disrupted is known, one of the most effective methods of gene down regulation is targeted gene disruption where foreign DNA is inserted into a structural gene so as to disrupt transcription. This can be effected by the creation of genetic cassettes comprising the DNA to be inserted (often a genetic marker) flanked by sequence having a high degree of homology to a portion of the gene to be disrupted. Introduction of the cassette into the host cell results in insertion of the foreign DNA into the structural gene via the native

enzyme. When the transposable element or transposon is contacted with a nucleic acid fragment in the presence of the transposase, the transposable element will randomly insert into the nucleic acid fragment. The technique is useful for random mutagenesis and for gene isolation, since the disrupted gene may be identified on the basis of the sequence of the transposable element. Kits for in vitro transposition are commercially available (see for example The Primer Island Transposition Kit, available from Perkin Elmer Applied Biosystems, Branchburg, N.J., based upon the yeast Ty1 element; The Genome Priming System, available from New England Biolabs, Beverly, Mass., based upon the bacterial transposon Tn7; and the EZ::TN Transposon Insertion Systems, available from Epicentre Technologies, Madison, Wis., based upon the Tn5 bacterial transposable element.

Methylotrophs and Methylomonas sp. 16a as Microbial Hosts

Although a number of carotenoids have been produced from recombinant microbial sources e.g., E. coli and Candida utilis for production of lycopene (Farmer W. R. and J. C.
Liao, Biotechnol. Prog., 17: 57-61 (2001); Wang C. et al., DNA replication mechanisms of the cell. (See for example Biotechnol Prog., 16:922-926 (2000); Misawa, N. and H. Hamilton et al., J. Bacteriol., 171:4617-4622 (1989), Balbas 25 Shimada, J. Biotechnol. 59: 169-181 (1998); Shimada, H., et al., Gene, 136:211-213 (1993), Gueldener et al., Nucleic et al., Appl. Environm. Microbiol. 64:2676-2680 (1998)); E. Acids Res., 24:2519-2524 (1996), and Smith et al., Methods coli, Candida utilis and Pfaffia rhodozyma for production of Mol. Cell. Biol. 5:270-277 (1996)). B-carotene (Albrecht, M. et al., Biotechnol. Lett. 21: 791 Antisense technology is another method of down regu 795 (1999); Miura, Y. et al., Appl. Environm. Microbiol., lating genes where the sequence of the target gene is known. 30 64: 1226-1229 (1998): U.S. Pat. No. 5,691,190); E. coli and To accomplish this, a nucleic acid segment from the desired Candida utilis for production of zeaxanthin (Albrecht, M. et gene is cloned and operably linked to a promoter such that al., Supra; Miura, Y. et al., Supra); E. coli and Pfafia the anti-sense strand of RNA will be transcribed. This rhodozyma for production of astaxanthin (U.S. Pat. No. construct is then introduced into the host cell and the 5,466,599; U.S. Pat. No. 6,015,684; U.S. Pat. No. 5,182,208: antisense strand of RNA is produced. Antisense RNA inhib 35 U.S. Pat. No. 5,972,642); see also: U.S. Pat. No. 5,656,472, its gene expression by preventing the accumulation of U.S. Pat. No. 5,545,816, U.S. Pat. No. 5,530,189, U.S. Pat. mRNA that encodes the protein of interest. The person No. 5,530,188, U.S. Pat. No. 5,429,939, and U.S. Pat. No. skilled in the art will know that special considerations are 6,124,113), these methods of producing carotenoids using associated with the use of antisense technologies in order to various combinations of different crt genes suffer from low reduce expression of particular genes. 
For example, the 40 yields and reliance on relatively expensive feedstocks. Thus, proper level of expression of antisense genes may require it would be desirable to identify a method that produces the use of different chimeric genes utilizing different regu higher yields of carotenoids in a microbial host from an latory elements known to the skilled artisan. inexpensive feedstock. There are a number of microorgan Although targeted gene disruption and antisense technol isms that utilize single carbon Substrates as their sole energy ogy offer effective means of down regulating genes where 45 source. Such microorganisms are referred to herein as “C1 the sequence is known, other less specific methodologies metabolizers”. These organisms are characterized by the have been developed that are not sequence based. For ability to use carbon Substrates lacking carbon to carbon example, cells may be exposed to UV radiation and then bonds as a sole source of energy and biomass. These carbon screened for the desired phenotype. Mutagenesis with Substrates include, but are not limited to: methane, metha chemical agents is also effective for generating mutants and 50 nol, formate, formaldehyde, formic acid, methylated amines commonly used Substances include chemicals that affect (e.g., mono-, di- and tri-methyl amine), methylated thiols, nonreplicating DNA such as HNO and NH-OH, as well as carbon dioxide, and various other reduced carbon com agents that affect replicating DNA Such as acridine dyes, pounds which lack any carbon-carbon bonds. In one notable for causing frameshift mutations. Specific methods embodiment, the single carbon substrate is selected from the for creating mutants using radiation or chemical agents are 55 group consisting of methane and methanol. All C1 metabo well documented in the art. See for example Thomas D. lizing microorganisms are generally classified as methy Brock in Biotechnology: A Textbook of Industrial Microbi lotrophs. 
Methylotrophs may be defined as any organism ology, Second Edition (1989) Sinauer Associates, Inc., Sun capable of oxidizing organic compounds that do not contain derland, Mass., or Deshpande, Mukund V., Appl. Biochem. carbon-carbon bonds. However, facultative methylotrophs, Biotechnol. 36, 227, (1992). 60 obligate methylotrophs, and obligate methanotrophs are all Another non-specific method of gene disruption is the use various subsets of methylotrophs. Specifically: of transposable elements or transposons. Transposons are Facultative methylotrophs have the ability to oxidize genetic elements that insert randomly in DNA but can be organic compounds which do not contain carbon-car latter retrieved on the basis of sequence to determine where bon bonds, but may also use other carbon Substrates the insertion has occurred. Both in vivo and in vitro trans 65 Such as Sugars and complex carbohydrates for energy position methods are known. Both methods involve the use and biomass. Facultative methylotrophic bacteria are of a transposable element in combination with a transposase found in many environments, but are isolated most US 7,252,985 B2 31 32 commonly from soil, landfill and waste treatment sites. MWM1200 (U.S. Ser. No. 60/527,083). The endogenous Many facultative methylotrophs are members of the f C, carotenoid pathway has been knocked-out (Acrt cluster and Y subgroups of the Proteobacteria (Hanson et al., promoter+AcrtN3), creating an optimized platform for Cao Microb. Growth C1 Compounds. Int. Symp.), 7" carotenoid production. The deletion of the promoter respon (1993), pp. 285-302 Murrell, J. Collin and Don P. Kelly, 5 sible for expression of the endogenous crt cluster (crtN1 Eds. Intercept: Andover, UK; Madigan et al., Brock ald-crtN2 cluster) resulted in a non-pigmented Strain (the Biology of Microorganisms, 8" ed., Prentice Hall: wild type strain in normally pink in color due to its naturally UpperSaddle River, N.J. (1997)). production of Cso carotenoids). 
Expression of Cao caro Obligate methylotrophs are those organisms that are lim tenoid biosynthesis genes within this optimized host enables ited to the use of organic compounds that do not contain 10 increased production of the desired Cao carotenoids. carbon-carbon bonds for the generation of energy. Obligate methanotrophs are those obligate methylotrophs Transformation of C1 Metabolizing Bacteria that have the distinct ability to oxidize methane. Techniques for the transformation of C1 metabolizing Additionally, the ability to utilize single carbon substrates bacteria are not well developed, although general method is not limited to bacteria but extends also to yeasts and fungi. 15 ology that is utilized for other bacteria, which is well known A number of yeast genera are able to use single carbon to those of skill in the art, may be applied. Electroporation Substrates as energy sources in addition to more complex has been used successfully for the transformation of Methy materials (i.e., the methylotrophic yeasts). lobacterium extorquens AM1 (Toyama, H., et al., FEMS Although a large number of these methylotrophic organ Microbiol. Lett., 166:1-7 (1998)), Methylophilus methylotro isms are known, few of these microbes have been Success phus AS1 (Kim, C. S., and T. K. Wood, Appl. Microbiol. fully harnessed in industrial processes for the synthesis of Biotechnol., 48: 105-108 (1997)), and Methylobacillus sp. materials. And, although single carbon Substrates are cost strain 12S (Yoshida, T., et al., Biotechnol. Lett. 23: 787-791 effective energy sources, difficulty in genetic manipulation (2001)). Extrapolation of specific electroporation param of these microorganisms as well as a dearth of information eters from one specific C1 metabolizing utilizing organism about their genetic machinery has limited their use primarily 25 to another may be difficult, however, as is well to known to to the synthesis of native products. those of skill in the art. 
Despite these hardships, many methanotrophs contain an Bacterial conjugation, relying on the direct contact of inherent isoprenoid pathway that enables these organisms to donor and recipient cells, is frequently more readily ame synthesize pigments and provides the potential for one to nable for the transfer of genes into C1 metabolizing bacteria. envision engineering these microorganisms for production 30 Simplistically, this bacterial conjugation process involves of various non-endogenous isoprenoid compounds. Since mixing together “donor” and “recipient cells in close methanotrophs can use single carbon substrates (i.e., meth contact with one another. Conjugation occurs by formation ane and/or methanol) as an energy source, it could be of cytoplasmic connections between donor and recipient possible to produce carotenoids at low cost in these organ bacteria, with direct transfer of newly synthesized donor isms. One Such example wherein a methanotroph is engi 35 DNA into the recipient cells. As is well known in the art, the neered for production of B-carotene is described in U.S. Ser. recipient in a conjugation is defined as any cell that can No. 09/941,947, hereby incorporated by reference. accept DNA through horizontal transfer from a donor bac Methods are provided for the expression of genes terium. The donor in conjugative transfer is a bacterium that involved in the biosynthesis of carotenoid compounds in contains a conjugative plasmid, conjugative transposon, or microorganisms that are able to use single carbon Substrates 40 mobilizable plasmid. The physical transfer of the donor as a sole energy source. The host microorganism may be any plasmid can occur in one of two fashions, as described C1 metabolizer that has the ability to synthesize farnesyl below: pyrophosphate (FPP) as a metabolic precursor for caro 1. In some cases, only a donor and recipient are required tenoids. More specifically, facultative methylotrophic bac for conjugation. 
This occurs when the plasmid to be teria suitable in the present invention include, but are not 45 transferred is a self-transmissible plasmid that is both limited to Methylophilus, Methylobacillus, Methylobacte conjugative and mobilizable (i.e., carrying both tra rium, Hyphomicrobium, Xanthobacter, Bacillus, Paracoc genes and genes encoding the Mob proteins). In gen cus, Nocardia, Arthrobacter. Rhodopseudomonas, and eral, the process involves the following steps: 1.) Pseudomonas. Specific methylotrophic yeasts useful in the Double-strand plasmid DNA is nicked at a specific site present invention include, but are not limited to: Candida, 50 in oriT; 2.) A single-strand DNA is released to the Hansenula, Pichia, Torulopsis, and Rhodotorula. Exem recipient through a pore or pilus structure; 3.) A DNA plary methanotrophs include, but are not limited to the relaxase enzyme cleaves the double-strand DNA at oriT genera Methylomonas, Methylobacter; Methylococcus, and binds to a release 5' end (forming a relaxoSome as Methylosinus, Methylocyctis, Methylomicrobium, and the intermediate structure); and 4.) Subsequently, a Methanomonas. 55 complex of auxiliary proteins assemble at oriT to Of particular interest in the present invention are high facilitate the process of DNA transfer. growth obligate methanotrophs having an energetically 2. Alternatively, a “triparental conjugation is required for favorable carbon flux pathway. For example, a specific strain transfer of the donor plasmid to the recipient. In this of methanotroph having several pathway features that makes type of conjugation, donor cells, recipient cells, and a it particularly useful for carbon flux manipulation. This 60 “helper plasmid participate. The donor cells carry a strain is known as Methylomonas 16a (ATCC PTA 2402) mobilizable plasmid or conjugative transposon. Mobi (U.S. Pat. No. 
6,689.601); and, this particular strain and lizable vectors contain an oriT, a gene encoding a other related methylotrophs are preferred microbial hosts for nickase, and have genes encoding the Mob proteins; expression of the gene products of this invention, useful for however, the Mob proteins alone are not sufficient to the production of Cao carotenoids. 65 achieve the transfer of the genome. Thus, mobilizable An optimized version of Methylomonas sp. 16a has been plasmids are not able to promote their own transfer created and designated as Methylomonas sp. 16a unless an appropriate conjugation system is provided US 7,252,985 B2 33 34 by a helper plasmid (located within the donor or within by-products or waste products are continuously removed a “helper cell). The conjugative plasmid is needed for from the cell mass. Cell immobilization may be performed the formation of the mating pair and DNA transfer, using a wide range of Solid Supports composed of natural since the plasmid encodes proteins for transfer (Tra) and/or synthetic materials. that are involved in the formation of the pore or pilus. Continuous or semi-continuous culture allows for the Examples of Successful conjugations involving C1 modulation of one factor or any number of factors that affect metabolizing bacteria include the work of Stolyar et al. cell growth or end product concentration. For example, one (Mikrobiologiya, 64(5): 686-691 (1995)); Motoyama, H. et method will maintain a limiting nutrient such as the carbon al. (Appl. Micro. Biotech., 42(1): 67-72 (1994)); Lloyd, J. S. source or nitrogen level at a fixed rate and allow all other et al. (Archives of Microbiology, 171 (6): 364-370 (1999)); 10 parameters to moderate. In other systems a number of and Odom, J. M. et al. (U.S. Ser. No. 09/941,947). factors affecting growth can be altered continuously while the cell concentration, measured by media turbidity, is kept Industrial Production constant. 
Continuous systems strive to maintain steady state Where commercial production of cyclic ketocarotenoid growth conditions and thus the cell loss due to media being compounds is desired using the present crtW genes, a variety 15 drawn off must be balanced against the cell growth rate in of culture methodologies may be applied. For example, the culture. Methods of modulating nutrients and growth large-scale production of a specific gene product overex factors for continuous culture processes as well as tech pressed from a recombinant microbial host may be produced niques for maximizing the rate of product formation are well by both batch and continuous culture methodologies. known in the art of industrial microbiology and a variety of A classical batch culturing method is a closed system methods are detailed by Brock, supra. where the composition of the media is set at the beginning Fermentation media in the present invention must contain of the culture and not subject to artificial alterations during suitable carbon substrates. Suitable substrates may include the culturing process. Thus, at the beginning of the culturing but are not limited to monosaccharides such as glucose and process the media is inoculated with the desired organism or fructose, disaccharides such as lactose or Sucrose, polysac organisms and growth or metabolic activity is permitted to 25 charides Such as starch or cellulose or mixtures thereof and occur adding nothing to the system. Typically, however, a unpurified mixtures from renewable feedstocks such as “batch' culture is batch with respect to the addition of cheese whey permeate, cornsteep liquor, Sugar beet molas carbon Source and attempts are often made at controlling ses, and barley malt. Additionally the carbon Substrate may factors such as pH and oxygen concentration. 
In batch also be one-carbon Substrates Such as carbon dioxide, meth systems the metabolite and biomass compositions of the 30 ane, and/or methanol for which metabolic conversion into system change constantly up to the time the culture is key biochemical intermediates has been demonstrated. In terminated. Within batch cultures cells moderate through a addition to one and two carbon substrates methylotrophic static lag phase to a high growth log phase and finally to a organisms are also known to utilize a number of other stationary phase where growth rate is diminished or halted. carbon containing compounds such as methylamine, glu Ifuntreated, cells in the stationary phase will eventually die. 35 cosamine and a variety of amino acids for metabolic activity. Cells in log phase are often responsible for the bulk of For example, methylotrophic yeast are known to utilize the production of end product or intermediate in Some systems. carbon from methylamine to form trehalose or glycerol Stationary or post-exponential phase production can be (Bellion et al., Microb. Growth C1 Compa. Int. Symp., 7th obtained in other systems. (1993), 415-32. Editor(s): Murrell, J. Collin; Kelly, Don P. A variation on the standard batch system is the fed-batch 40 Publisher: Intercept, Andover, UK). Similarly, various spe system. Fed-batch culture processes are also suitable in the cies of Candida will metabolizealanine or oleic acid (Sulter present invention and comprise a typical batch system with et al., Arch. Microbiol. 153:485-489 (1990)). Hence it is the exception that the Substrate is added in increments as the contemplated that the source of carbon utilized in the present culture progresses. Fed-batch systems are useful when invention may encompass a wide variety of carbon contain catabolite repression is apt to inhibit the metabolism of the 45 ing substrates and will only be limited by the choice of cells and where it is desirable to have limited amounts of organism. 
substrate in the media. Measurement of the actual substrate concentration in fed-batch systems is difficult and is there Recombinant Expression Plants fore estimated on the basis of the changes of measurable Plants and algae are also known to produce carotenoid factors such as pH, dissolved oxygen and the partial pressure 50 compounds. The nucleic acid fragments of the instant inven of waste gases such as CO. Batch and fed-batch culturing tion may be used to create transgenic plants having the methods are common and well known in the art and ability to express the microbial protein. Preferred plant hosts examples may be found in Thomas D. Brock in Biotech will be any variety that will support a high production level nology: A Textbook of Industrial Microbiology, Second of the instant proteins. Suitable green plants will include, but Edition (1989) Sinauer Associates, Inc., Sunderland, Mass. 55 are not limited to Soybean, rapeseed (Brassica napus, B. or Deshpande, Mukund V., Appl. Biochem. Biotechnol. campestris), pepper, Sunflower (Helianthus annus), cotton 36:227, (1992). (Gossypium hirsutum), corn, tobacco (Nicotiana tabacum), Commercial production of cyclic ketocarotenoids may alfalfa (Medicago sativa), wheat (Triticum sp), barley (Hor also be accomplished with a continuous culture. Continuous deum vulgare), oats (Avena sativa, L), Sorghum (Sorghum cultures are an open system where a defined culture media 60 bicolor), rice (Oryza sativa), Arabidopsis, cruciferous Veg is added continuously to a bioreactor and an equal amount etables (broccoli, cauliflower, cabbage, parsnips, etc.), mel of conditioned media is removed simultaneously for pro ons, carrots, celery, parsley, tomatoes, potatoes, strawber cessing. Continuous cultures generally maintain the cells at ries, peanuts, grapes, grass seed crops, Sugar beets, Sugar a constant high liquid phase density where cells are prima cane, beans, peas, rye, flax, hardwood trees, softwood trees, rily in log phase growth. 
Alternatively continuous culture 65 and forage grasses. Algal species include, but not limited to may be practiced with immobilized cells where carbon and commercially significant hosts Such as Spirulina, Haemot nutrients are continuously added, and valuable products, acoccus, and Dunalliela. Production of the carotenoid com US 7,252,985 B2 35 36 pounds may be accomplished by first constructing chimeric Protein Engineering genes of present invention in which the coding region are It is contemplated that the present nucleotides may be operably linked to promoters capable of directing expression used to produce gene products having enhanced or altered of a gene in the desired tissues at the desired stage of activity. Various methods are known for mutating a native development. For reasons of convenience, the chimeric 5 gene sequence to produce a gene product with altered or genes may comprise promoter sequences and translation enhanced activity including but not limited to error-prone leader sequences derived from the same genes. 3' Non PCR (Melnikov et al., Nucleic Acids Research, 27(4): 1056 coding sequences encoding transcription termination signals 1062 (1999); site-directed mutagenesis (Coombs et al., must also be provided. The instant chimeric genes may also Proteins (1998), 259-311, 1 plate. Editor(s): Angeletti, Ruth comprise one or more introns in order to facilitate gene 10 Hogue. Publisher: Academic, San Diego, Calif); "gene expression. shuffling” (U.S. Pat. No. 5,605,793; U.S. Pat. No. 5,811,238; U.S. Pat. No. 5,830,721; U.S. Pat. No. 5,837,458; and U.S. Any combination of any promoter and any terminator Ser. No. 10/374,366, hereby incorporated by reference). capable of inducing expression of a coding region may be The method of gene shuffling is particularly attractive due used in the chimeric genetic sequence. 
Some Suitable 15 to its facile implementation, and high rate of mutagenesis examples of promoters and terminators include those from and ease of Screening. The process of gene shuffling nopaline synthase (nos), octopine synthase (ocs) and cauli involves the restriction endonuclease cleavage of a gene of flower mosaic virus (CaMV) genes. One type of efficient interest into fragments of specific size in the presence of plant promoter that may be used is a high-level plant additional populations of DNA regions of both similarity to promoter. Such promoters, in operable linkage with the or difference to the gene of interest. This pool of fragments genetic sequences or the present invention should be capable will then be denatured and reannealed to create a mutated of promoting expression of the present gene product. High gene. The mutated gene is then screened for altered activity. level plant promoters that may be used in this invention The instant microbial sequences of the present invention include the promoter of the small subunit (ss) of the ribu may be mutated and Screened for altered or enhanced lose-1,5-bisphosphate carboxylase from example from Soy 25 activity by this method. The sequences should be double bean (Berry-Lowe et al., J. Molecular and App. Gen., Stranded and can be of various lengths ranging form 50 bp 1:483-498 1982)), and the promoter of the chlorophyll a?b to 10 kb. The sequences may be randomly digested into binding protein. These two promoters are known to be fragments ranging from about 10 bp to 1000 bp, using light-induced in plant cells (see, for example, Genetic Engi restriction endonucleases well known in the art (Maniatis, 30 Supra). In addition to the instant microbial sequences, popu neering of Plants, an Agricultural Perspective, A. Cash lations of fragments that are hybridizable to all or portions more, Plenum, N.Y. (1983), pages 29-38; Coruzzi, G. et al., of the microbial sequence may be added. Similarly, a popu J. Biol. Chem. 
258:1399 (1983), and Dunsmuir, P. et al., J. lation of fragments that are not hybridizable to the instant Mol. Appl. Gen., 2:285 (1983)). sequence may also be added. Typically these additional Plasmid vectors comprising the instant chimeric genes 35 fragment populations are added in about 10 to 20 fold excess can then constructed. The choice of plasmid vector depends by weight as compared to the total nucleic acid. Generally upon the method that will be used to transform host plants. if this process is followed the number of different specific The skilled artisan is well aware of the genetic elements that nucleic acid fragments in the mixture will be about 100 to must be present on the plasmid vector in order to Success about 1000. The mixed population of random nucleic acid fully transform, select and propagate host cells containing 40 fragments are denatured to form single-stranded nucleic acid the chimeric gene. The skilled artisan will also recognize fragments and then reannealed. Only those single-stranded that different independent transformation events will result nucleic acid fragments having regions of homology with in different levels and patterns of expression (Jones et al., other single-stranded nucleic acid fragments will reanneal. EMBO.J., 4:2411-2418 (1985): De Almeida et al., Mol. Gen. The random nucleic acid fragments may be denatured by Genetics, 218:78-86 (1989)), and thus that multiple events 45 heating. One skilled in the art could determine the conditions must be screened in order to obtain lines displaying the necessary to completely denature the double stranded desired expression level and pattern. Such screening may be nucleic acid. Preferably the temperature is from 80° C. to accomplished by Southern analysis of DNA blots (Southern, 100° C. The nucleic acid fragments may be reannealed by J. Mol. Biol. 98:503 (1975)). Northern analysis of mRNA cooling. Preferably the temperature is from 20° C. to 75° C. expression (Kroczek, J. 
Chromatogr: Biomed. Appl. 618 50 Renaturation can be accelerated by the addition of polyeth (1-2):133-145 (1993)), Western analysis of protein expres ylene glycol (“PEG') or salt. A suitable salt concentration Sion, or phenotypic analysis. may range from 0 mM to 200 mM. The annealed nucleic For some applications it will be useful to direct the instant acid fragments are then incubated in the presence of a proteins to different cellular compartments. It is thus envi nucleic acid polymerase and dNTPs (i.e., dATP, dCTP, sioned that the chimeric genes described above may be 55 dGTP and dTTP). The nucleic acid polymerase may be the further supplemented by altering the coding sequences to Klenow fragment, the Taq polymerase or any other DNA encode enzymes with appropriate intracellular targeting polymerase known in the art. The polymerase may be added sequences such as transit sequences (Keegstra, K., Cell. to the random nucleic acid fragments prior to annealing, 56:247-253 (1989)), signal sequences or sequences encod simultaneously with annealing or after annealing. The cycle ing endoplasmic reticulum localization (Chrispeels, J. J., 60 of denaturation, renaturation and incubation in the presence Ann. Rev. Plant Phys. Plant Mol. Biol. 42:21-53 (1991)), or of polymerase is repeated for a desired number of times. nuclear localization signals (Raikhel, N., Plant Phys., 100: Preferably the cycle is repeated from 2 to 50 times, more 1627-1632 (1992)) added and/or with targeting sequences preferably the sequence is repeated from 10 to 40 times. The that are already present removed. While the references cited resulting nucleic acid is a larger double-stranded polynucle give examples of each of these, the list is not exhaustive and 65 otide ranging from about 50 bp to about 100 kb and may be more targeting signals of utility may be discovered in the screened for expression and altered activity by standard future that are useful in the invention. cloning and expression protocol. 
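By way of illustration only, the preferred gene-shuffling conditions recited above (denaturation at 80° C. to 100° C., reannealing at 20° C. to 75° C., salt from 0 mM to 200 mM, and 2 to 50 cycles, more preferably 10 to 40) can be collected into a small check on a planned protocol. The following Python sketch is not part of the disclosed method; the names ShufflingProtocol, out_of_range and in_more_preferred_cycles are hypothetical and are introduced here only for the example.

```python
from dataclasses import dataclass

# Preferred ranges as recited in the description (inclusive bounds).
# The dictionary keys are illustrative names, not terms from the patent.
PREFERRED_RANGES = {
    "denature_temp_c": (80.0, 100.0),  # denaturation by heating
    "anneal_temp_c": (20.0, 75.0),     # reannealing by cooling
    "salt_mm": (0.0, 200.0),           # salt used to accelerate renaturation
    "cycles": (2, 50),                 # denature / reanneal / polymerase cycles
}
MORE_PREFERRED_CYCLES = (10, 40)


@dataclass
class ShufflingProtocol:
    """Hypothetical container for one planned shuffling run."""
    denature_temp_c: float
    anneal_temp_c: float
    salt_mm: float
    cycles: int


def out_of_range(protocol):
    """Return the names of parameters that fall outside the preferred ranges."""
    problems = []
    for name, (low, high) in PREFERRED_RANGES.items():
        value = getattr(protocol, name)
        if not low <= value <= high:
            problems.append(name)
    return problems


def in_more_preferred_cycles(protocol):
    """True when the cycle count lies in the more preferred 10 to 40 window."""
    low, high = MORE_PREFERRED_CYCLES
    return low <= protocol.cycles <= high
```

For instance, ShufflingProtocol(94.0, 55.0, 50.0, 25) would pass both checks, whereas a run planned with a 70° C. denaturation step would be flagged on denature_temp_c. The ranges are stated preferences rather than hard limits, so a flagged value indicates a departure from the preferred conditions, not an inoperable protocol.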
(See Maniatis, supra, for standard cloning and expression protocols.)

Furthermore, a hybrid protein can be assembled by fusion of functional domains using the gene shuffling (exon shuffling) method (Nixon et al., PNAS, 94:1069-1073 (1997)). The functional domain of the instant gene can be combined with the functional domain of other genes to create novel enzymes with desired catalytic function. A hybrid enzyme may be constructed using PCR overlap extension method and cloned into the various expression vectors using the techniques well known to those skilled in art.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

Pigmented microbes were isolated from environmental samples and cultured using standard microbiological techniques (Example 1). Two pigmented colonies (DC18 and DC263) were selected and 16S rRNA gene sequencing was performed. The 16S rRNA gene sequence from strain DC18 (SEQ ID NO. 4) was used as a query search using BLASTN against GenBank®. The closest match to the public database was 98% identical to Sphingomonas melonis. The strain was designated as Sphingomonas melonis DC18. The 16S rRNA gene sequence from strain DC263 (SEQ ID NO. 5) exhibited homology (99% identical) to Brevundimonas vesicularis. The isolated strain was designated as Brevundimonas vesicularis DC263. A third pigmented microbial strain (Flavobacterium sp. K1-202C) was obtained from Dr. Gerhard Sandmann (J.W. Goethe University, Germany). This strain is also known as Cytophaga sp. KK1020C and is available from the Marine Biotechnology Institute (MBI, Japan).

Carotenoid samples from each strain were analyzed by HPLC/LC-MS. The major carotenoid in Sphingomonas melonis DC18 was determined to be tetrahydroxy-β,β'-caroten-4-one. The major carotenoid in Brevundimonas vesicularis DC263 was determined to be tetrahydroxy-β,β'-caroten-4,4'-dione. The major carotenoid in Flavobacterium sp. K1-202C was flexixanthin. The major carotenoids in all three strains were ketocarotenoids, indicating that they all possessed a carotenoid ketolase.

Genomic DNA was prepared from each strain for the creation of small insert libraries (4-6 kb fragments) in pEZseq vector (Example 2). The respective plasmids were electroporated into E. coli cells harboring a β-carotene producing plasmid. Orange pigmented transformants were isolated and the respective carotenoid content of each was analyzed. Ketocarotenoids were produced by each orange transformant.

The inserts on the pEZ-based plasmid were sequenced by random transposon insertion and/or by primer walking. Sequences of the inserts were assembled and BLAST analyzed (BLASTNnr and BLASTXnr) against GenBank®. The genes encoding the CrtW ketolases were identified (Example 3, Table 3). Pairwise comparison analysis was conducted using the present crtW sequences and several previously reported crtWs (Table 4). The present crtW sequences show only moderate homology to previously reported carotenoid ketolases.

expression plasmids were created by cloning the carotenoid gene clusters from either Enterobacteriaceae DC260 (U.S. Ser. No. 10/808,979) or Pantoea agglomerans DC404 (U.S. Ser. No. 10/808,807). The present CrtW ketolases exhibited the ability to convert β-carotene into canthaxanthin.

In another embodiment, coexpression of divergent ketolase was conducted (Example 6). The plasmid expressing the β-carotene synthesis genes (pDCQ330) used in Example 5 was engineered to additionally express the crtWZ genes from Agrobacterium aurantiacum. The resulting plasmid (pDCQ335) was used to create an astaxanthin/adonixanthin producing E. coli strain. The plasmids expressing either the crtW from DC263 (pDCQ342TA) or the crtW from K1-202C (pDCQ339TA) were transformed into the astaxanthin/adonixanthin producing strain. Comparisons between the strain harboring pDCQ335 alone and the strains containing the additional plasmid pDCQ342TA or pDCQ339TA were conducted. Strains expressing one or more divergent ketolase genes improved the efficiency of keto group addition, increasing the production of astaxanthin.

EXAMPLES

The present invention is further defined in the following Examples. It should be understood that these Examples, while indicating preferred embodiments of the invention, are given by way of illustration only. From the above discussion and these Examples, one skilled in the art can ascertain the essential characteristics of this invention, and without departing from the spirit and scope thereof, can make various changes and modifications of the invention to adapt it to various usages and conditions.

General Methods

Standard recombinant DNA and molecular cloning techniques used in the Examples are well known in the art and are described by Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory Press: Cold Spring Harbor, (1989) (Maniatis) and by T. J. Silhavy, M. L. Berman, and L. W. Enquist, Experiments with Gene Fusions, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1984) and by Ausubel, F. M. et al., Current Protocols in Molecular Biology, pub. by Greene Publishing Assoc. and Wiley-Interscience (1987).

Materials and methods suitable for the maintenance and growth of bacterial cultures are well known in the art. Techniques suitable for use in the following examples may be found as set out in Manual of Methods for General Bacteriology (Phillipp Gerhardt, R. G. E. Murray, Ralph N. Costilow, Eugene W. Nester, Willis A. Wood, Noel R. Krieg and G. Briggs Phillips, eds), American Society for Microbiology, Washington, D.C. (1994)) or by Thomas D. Brock in Biotechnology: A Textbook of Industrial Microbiology, Second Edition, Sinauer Associates, Inc., Sunderland, Mass. (1989). All reagents, restriction enzymes and materials used for the growth and maintenance of bacterial cells were
obtained from Aldrich Chemicals (Milwaukee, Wis.), The present carotenoid ketolase genes were cloned indi DIFCO Laboratories/BD Diagnostics (Sparks, Md.). vidually into a pTrcHis2-TOPO expression vector (Example Promega (Madison, Wis.), New England Biolabs (Beverly, 5). Each crtW expression vector was transformed into a 60 Mass.), GIBCO/BRL Life Technologies (Carlsbad, Calif.), B-carotene accumulating E. coli strain. The carotene content or Sigma Chemical Company (St. Louis, Mo.) unless oth of the respective orange transformants was analyzed by erwise specified. HPLC. Canthaxanthin was exclusively produced in each of Manipulations of genetic sequences were accomplished the respective transformants. using the Suite of programs available from the Genetics Several B-carotene expression plasmids (pDCQ340, 65 Computer Group Inc. (Wisconsin Package Version 9.0, pDCQ330) were created to measure the effects of expressing Genetics Computer Group (GCG), Madison, Wis.). Where the present crtW ketolase genes (Examples 4 and 7). The the GCG program "Pileup' was used the gap creation US 7,252,985 B2 39 40 default value of 12, and the gap extension default value of 16S rRNA gene sequence (SEQ ID NO:11) of DC263 were 4 were used. Where the CGC “Gap' or “Bestfit' programs used as the query sequence for a BLASTN search (Altschul were used the default gap creation penalty of 50 and the et al., Nucleic Acids Res., 25:3389-3402(1997)) against default gap extension penalty of 3 were used. Multiple GenBank R. The 16S rDNA sequence of DC18 showed alignments were created using the FASTA program incor homology to those of Sphingomonas strains, with the top hit porating the Smith-Waterman algorithm (W. R. Pearson, as 98% identical to Sphingomonas melonis. This strain was Comput. Methods Genome Res., Proc. Int. Symp. (1994), thus designated as Sphingomonas melonis DC18. The 16S Meeting Date 1992, 111-20. Editor(s): Suhai, Sandor. 
Pub rDNA sequence of DC263 showed homology to those of lisher: Plenum, New York, N.Y.). In any case where program Brevundimonas strains, with the top hit as 99% identical to parameters were not prompted for, in these or any other 10 Brevundimonas vesicularis. This strain was thus designated programs, default values were used. as Brevundimonas vesicularis DC263. The meaning of abbreviations is as follows: “h” means Flavobacterium sp. K1-202C was a marine isolate that we hour(s), 'min' means minute(s), “seq' means second(s), 'd obtained from Dr. Gerhard Sandmann at J. W. Goethe means day(s), “mL means milliliters, "Lull' mean microli University in Germany. Flavobacterium sp. K1-202C is also ters, “L” means liters, 'g' means grams, “mg means 15 known as Cytophaga sp. KK10202C (MBICO139), available milligrams, “ug' means micrograms, and “ppm’ means from the Marine Biotechnology Institute (MBI) (Iwate, parts per million. Japan). Carotenoid Analysis Example 1 Sphingomonas melonis DC18 was grown in 100 mL of the same medium as described for the Strain isolation. Brevun Bacterial Strains Producing Ketocarotenoids dimonas vesicularis DC263 was grown in 100 mL LB. Flavobacterium sp. K1-202C was grown in 100-mL marine This example describes isolation of three bacterial strains broth (Difico, Detroit, Mich.). All three strains were grown at that produce ketocarotenoids and preliminary analysis of 30° C. shaking overnight. Cells were pelleted by centrifu their carotenoids. 25 gation at 4000 g for 15 min, and the cell pellets were Strain Isolation and Typing extracted with 10 mL acetone. The extraction was dried To isolate novel carotenoid producing bacterial Strains under nitrogen and redissolved in 1-2 mL of acetone. The pigmented microbes were isolated from a collection of extraction was filtered with an Acrodisc(R) CR25 mm syringe environmental samples. Approximately 1 g of Surface soil 30 filter (Pall Corporation, Ann Arbor, Mich.). 
It was then from a yard in Wilmington, Del. was resuspended in 10 mL concentrated in 0.1 mL 10% acetone--90% acetonitrile for of tap water. A 10-uL loopful of the water was streaked onto HPLC analysis using an Agilent Series 1100 LC/MSD SI Luria-Broth (LB) plates and the plates were incubated at 30° (Agilent, Foster City, Calif.). C. Pigmented bacteria with diverse colony appearances were Samples (20 uL) were loaded onto a 150 mmx4.6 mm picked and streaked twice to homogeneity on LB plates and 35 ZORBAX C18 (3.5 um particles) column (Agilent Tech incubated at 30° C. From these colonies, one which formed nologies, Inc.). The column temperature was kept at 40° C. orange-pink colonies was designated as strain DC263. Strain The flow rate was 1 mL/min, while the solvent running DC18 was isolated from a Pennsylvania stream. Serial program used was dilutions (10°, 10 and 10) of the aqueous sample were 0-2 min: 95% Buffer A and 5% Buffer B; plated onto large 245x245 mm 15% agar plates with basal 40 2-10 min: linear gradient from 95% Buffer A and 5% medium enriched with tryptone and yeast. The components Buffer B to 60% Buffer A and 40% Buffer B; of the basal medium (per liter) were: NHCl 0.8 g. KHPO, 10-12 min: linear gradient from 60% Buffer A and 40% 0.5 g. MgCl, 6H2O 0.2g, CaCl2HO 0.1 g, NaNO 1.3 g, Buffer B to 50% Buffer A and 50% Buffer B: and NaSO, 0.5 g. The components of the stock solution 1 12-18 min: 50% Buffer A and 50% Buffer B; and, were (per liter): nitrilotriacetic acid 12.8 g. FeC1.4H2O 0.3 45 18-20 min: 95% Buffer A and 5% Buffer B. g, CuCl2.H2O 0.0254 g, MnCl2.4H2O 0.1 g, CoCl2.6H2O Buffer A was 95% acetonitrile and 5% dHO; Buffer B 0.312 g., ZnCl2 0.1 g, HBO 0.01 g, Na-MoC).2H2O 0.01 was 100% tetrahydrofuran. g, and NiCl2.6H2O 0.184 g. Ten milliliters of stock solution FIGS. 2a, 2b, and 2c show the HPLC profiles of the 1 was added per 1 liter of the basal medium. The medium carotenoids produced in DC18, DC263 and K1-202C. 
The was Supplemented with tryptone at concentration 10 g/L and 50 absorption spectra of the major carotenoid were also shown yeast extract 5 g/L. Media pH was adjusted to 7. The plates for each strain. The molecular weight of the major caro were incubated at room temperature and single colonies tenoid was determined by LC-MS. Each sample of 50 uL were streaked twice onto the same plates. One strain was was run on a Zorbax 2.1x150 mm SB-C18 LC column selected which formed orange colonies and was designated (Agilent Technologies, CA) with solvent program of: as strain DC18. 55 0-30 min: linear gradient from 70% acetonitrile and 30% 16S rRNA gene sequencing was performed with DC18 water to 100% acetonitrile; and DC263. Specifically, the 16S rRNA gene of the strain 30–45 min: 100% acetonitrile. was amplified by PCR using primers HK12: The mass spectrometer (Micromass Quattro LC triple 5'-GAGTTTGATCCTGGCTCAG-3' (SEQ ID NO:7) and quadrapole, Micromass Limited, UK) was scanned from 100 JCR14: 5'-ACGGGCGGTGTGTAC-3 (SEQID NO:8). The 60 to 1000 AMU's in 0.9 sec with an 0.1 sec interscan delay in amplified 16S rRNA genes were purified using a QIAquick APCI (Atmospheric Pressure Chemical Ionization) mode PCR Purification Kit according to the manufacturers with the corona discharge needle at 3KV and the APCI probe instructions (Qiagen) and sequenced on an automated ABI at 450° C. LC-MS analyses determined the molecular sequencer. The sequencing reactions were initiated with weight of the major carotenoid in DC18 to be 614, the primers HK12, JCR14, and JCR15: 5'-GCCAGCAGC 65 molecular weight of the major carotenoid in DC263 to be CGCGGTA-3' (SEQID NO:9). The assembled 1291 bp 16S 628, and the molecular weight of the major carotenoid in rRNA gene sequence (SEQID NO:10) of DC18 and 1268 bp K1-202C to be 582. 
Based on the HPLC elution time, the US 7,252,985 B2 41 42 absorption spectra, and the molecular weight, the major possibly containing a ketolase gene that converted B-caro carotenoid in DC18 was predicted to be tetrahydroxy-f.f3'- tene to ketocarotenoids. Each of the positive strains was caroten-4-one. The major carotenoid in DC263 was pre grown in 100 mL LB with antibiotics at 30° C. shaking for dicted to be tetrahydroxy-B.B'-caroten-4,4'-dione. The prop 3 days. Carotenoids from the cells were extracted and erties we determined for the major carotenoid in DC18 and analyzed by HPLC as described in Example 1. Ketocaro DC263 were consistent with those reported in the literature for these carotenoids (Yokoyama et al., Biosci. Biotech. tenoids (canthaxanthin and echinenone) were produced in Biochem. 60:200-203, (1996); Kleinig et al, Helvetica the positive E. coli clones isolated from the library of DC18, Chimica Acta, 60:254-258 (1977)). The major carotenoid in DC263, and K1-202C. K1-202C was determined to be flexixanthin by Sandmann's 10 group. The properties determined for the major carotenoid in Example 3 K1-202C was consistent with those reported for flexixanthin (Aasen et al., Acta Chemica Scandinavica, 20:1970-1988 Isolation of Novel Carotenoid Ketolase Genes (1966); Andrewes et al., Acta Chemica Scandinavica, B38: 337-339 (1984)). These three strains are potential sources 15 for carotenoid ketolase genes, since the major carotenoids in This example describes sequencing of the insert on the all three strains are ketocarotenoids. positive E. coli clones and identification of novel carotenoid ketolase genes encoded on the inserts. Example 2 Carotenoid analysis indicated that the positive clones probably contained ketolase genes that are responsible for Construction and Screening of Small Insert Libraries conversion of B-carotene to canthaxanthin and echinenone. 
The pEZ-based plasmid was separated from the f-carotene This example describes construction of the small insert reporter plasmid by selecting for amplicillin resistant and library from the bacterial strains and identification of posi 25 kanamycin sensitive clones. The insert on the pEZ-based tive clones that potentially contain the ketolase gene. plasmid was sequenced by random transposon insertion Library Construction using the EZ-TN-TET-12 kit (Epicentre, Madison, Wis.) Cells of DC18, DC263 and K1-202C were grown as and/or primer walking. The sequences were assembled with described in Example 1. Genomic DNA was prepared from 30 the Sequencher program (Gene Codes Corp., Ann Arbor, the cells using the Qiagen genomic DNA preparation kits. Mich.). The small insert library of strain K1-202C was prepared by partial restriction digest method. Genomic DNA of K1-202C Genes encoding CrtW ketolases were identified by con was partially digested with Hincll (Promega, Madison, Wis.) ducting BLAST (Basic Local Alignment Search Tool; Alts and separated on a 0.8% agarose gel. The 4-6 kb fraction was 35 chul, S. F., et al., J. Mol. Biol., 215:403-410 (1993)) searches excised from the gel and extracted using Qiagen MinElute for similarity to sequences contained in the BLAST “nr” Gel-Extraction kit. The extracted DNA was ligated to database (comprising all non-redundant GenBank R. CDS pEZseq vector using pEZSeq Blunt Cloning kit (Lucigen, translations, sequences derived from the 3-dimensional Middletown, Wis.). The ligation mixture was electroporated structure Brookhaven Protein Data Bank, the SWISS-PROT into freshly prepared competent cells of E. coli 10G con 40 protein sequence database, EMBL, and DDBJ databases). taining a B-carotene producing plasmid pBHR-crt1 (U.S. The sequences were analyzed for similarity to all publicly Ser. No. 09/941,947). 
Transformants were plated on LB available DNA sequences contained in the “nr” database plates with 100 ug/mL amplicillin and 50 ug/mL, kanamycin. using the BLASTN algorithm provided by the National The small insert library of strain DC18 and DC263 was 45 Center for Biotechnology Information (NCBI). The DNA prepared by random shearing method. Genomic DNA of sequence was translated in all reading frames and compared DC18 and DC263 was sheared by passing through a 291/2 for similarity to all publicly available protein sequences G insulin syringe (Becton Dickinson, Franklin Lakes, N.J.) contained in the “nr database using the BLASTX algorithm about 300 times and separated on a 0.8% agarose gel. The (Gish, W. and States, D. J., Nature Genetics, 3:266-272 4-6 kb fraction was excised from the gel and extracted using (1993)) provided by the NCBI. Qiagen MinElute Gel-Extraction kit (Qiagen). The ends of the extracted DNA were repaired using Lucigen DNA Ter All comparisons were done using either the BLASTNnr minator Repair kit. The repaired DNA inserts were ligated to or BLASTXnr algorithm. The results of the BLAST com parisons are given in Table 3, which Summarizes the pEZseq vector using pEZSeq Blunt Cloning kit (Lucigen). 55 The ligation mixture was electroporated into freshly pre sequences to which each gene has the most similarity. Table pared competent cells of E. coli 10G containing a 3-carotene 3 displays data based on the BLASTXnr algorithm with producing plasmid p)CO329 (U.S. Ser. No. 10/808,979; values reported in expect values. The nucleotide and amino hereby incorporated by reference). Transformants were acid sequences were also compared with several known plated on LB plates with 100 ug/mL amplicillin and 50 60 ketolase genes using a multiple sequence alignment algo g/mL, kanamycin. rithm in Vector NTI. 
Table 4 displays the percentage of nucleotide sequence identity and amino acid sequence iden Identification and Analysis of Positive Clones Approximately 20,000 to 100,000 transformants were tity for the pairwise comparisons. The three crtW genes obtained for each library. Several orange colonies were 65 isolated share only moderate homology with the known identified among the tens of thousands of yellow colonies crtW genes. Furthermore, they are very divergent from each for each library. These positive clones were identified as other as shown from the pairwise comparison in Table 4. US 7,252,985 B2 43 44
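As an illustration of the kind of pairwise figure reported in Table 4, the following minimal Python sketch computes percent identity between two sequences that have already been aligned. It is not the Vector NTI algorithm, whose parameters are not given here. The function name `percent_identity` is hypothetical, and the choice to skip gap columns is an assumption, since alignment tools differ on how gaps enter the denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of compared alignment columns in which both residues match.

    Assumption: columns containing a gap ('-') in either sequence are
    skipped. Other tools count gaps in the denominator instead.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # gap column: not counted
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment, not taken from any SEQ ID NO in this document.
    print(percent_identity("ATGC-A", "ATGGTA"))
```

The same function works for nucleotide or amino acid alignments, because it compares characters only and applies no substitution matrix.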

TABLE 3
Top BLAST hits for the carotenoid ketolase genes isolated from different bacterial species

ORF    Gene name                              SEQ ID   SEQ ID    %             %               E-value(c)
name                                          base     peptide   Identity(a)   Similarity(b)
1      crtW, Sphingomonas melonis DC18        1        2         57            70              e-68
       Similarity identified: beta-carotene C4 oxygenase, gi|33439708|gb|AAN86030.1| crtW, Brevundimonas aurantiaca
       Citation: WO 02/079395
2      crtW, Brevundimonas vesicularis DC263  3        4         63            68              9e-88
       Similarity identified: beta-carotene C4 oxygenase, gi|33439708|gb|AAN86030.1| crtW, Brevundimonas aurantiaca
       Citation: WO 02/079395
3      crtW, Flavobacterium sp. K1-202C       5        6         47            62              8e-49
       Similarity identified: beta-carotene C4 oxygenase, gi|17230681|ref|NP_487229.1| crtW, Nostoc sp. PCC7120
       Citation: Kaneko et al., DNA Res., 8(5):205-213 (2001)

(a) % Identity is defined as percentage of amino acids that are identical between the two proteins.
(b) % Similarity is defined as percentage of amino acids that are identical or conserved between the two proteins.
(c) Expect value. The Expect value estimates the statistical significance of the match, specifying the number of matches, with a given score, that are expected in a search of a database of this size absolutely by chance.

TABLE 4
Pairwise comparison of the nucleotide and amino acid sequences of the three newly isolated crtW sequences with several known crtW sequences

DNA/AA identity(a)            Sphingomonas  Brevundimonas  Flavobacterium  Agrobacterium    Bradyrhizobium  Brevundimonas  Nostoc
                              melonis       vesicularis    sp.             aurantiacum(b)   sp.(b)          aurantiaca(b)  sp.(b)
Sphingomonas melonis          100/100       52/42          35/29           57/48            56/48           61/53          38/33
Brevundimonas vesicularis                   100/100        40/32           55/45            57/48           72/70          43/34
Flavobacterium sp.                                         100/100         39/31            42/32           39/32          56/41
Agrobacterium aurantiacum                                                  100/100          60/48           59/50          40/36
Bradyrhizobium sp.                                                                          100/100         62/53          44/36
Brevundimonas aurantiaca                                                                                    100/100        43/35
Nostoc sp.                                                                                                                 100/100

(a) Percentage of nucleotide sequence identity and amino acid sequence identity.
(b) Sources of the known sequences: Agrobacterium aurantiacum, SwissProt Accession Number P54972; Bradyrhizobium sp., GenBank® Accession Number AF218415; Brevundimonas aurantiaca, GenBank® Accession Number AY166610; Nostoc sp. PCC7120, PIR Accession Number AF2204.
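The Table 4 values can be queried directly. The sketch below is a minimal Python illustration: it transcribes the DNA/AA identities of each newly isolated crtW against the four previously known crtW sequences and reports the closest known sequence by amino acid identity. The dictionary layout and the helper name `closest_known` are hypothetical conveniences; only the numbers come from Table 4.

```python
# (DNA % identity, amino acid % identity) transcribed from Table 4.
TABLE4 = {
    "Sphingomonas melonis DC18": {
        "Agrobacterium aurantiacum": (57, 48),
        "Bradyrhizobium sp.": (56, 48),
        "Brevundimonas aurantiaca": (61, 53),
        "Nostoc sp. PCC7120": (38, 33),
    },
    "Brevundimonas vesicularis DC263": {
        "Agrobacterium aurantiacum": (55, 45),
        "Bradyrhizobium sp.": (57, 48),
        "Brevundimonas aurantiaca": (72, 70),
        "Nostoc sp. PCC7120": (43, 34),
    },
    "Flavobacterium sp. K1-202C": {
        "Agrobacterium aurantiacum": (39, 31),
        "Bradyrhizobium sp.": (42, 32),
        "Brevundimonas aurantiaca": (39, 32),
        "Nostoc sp. PCC7120": (56, 41),
    },
}


def closest_known(strain: str) -> tuple[str, tuple[int, int]]:
    """Return the known crtW with the highest amino acid identity."""
    hits = TABLE4[strain]
    best = max(hits, key=lambda name: hits[name][1])
    return best, hits[best]


if __name__ == "__main__":
    for strain in TABLE4:
        name, (dna, aa) = closest_known(strain)
        print(f"{strain}: closest is {name} ({dna}% DNA, {aa}% AA)")
```

Ranking by amino acid identity gives the same organisms as the top BLASTX hits in Table 3: Brevundimonas aurantiaca for DC18 and DC263, and Nostoc sp. PCC7120 for K1-202C.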

Example 4 became clear and the clear lysate was extracted twice with an equal volume of phenol:chloroform:isoamyl alcohol (25: Construction of B-Carotene Synthesis Plasmid 45 24: 1) and once with chloroform:isoamyl alcohol (24:1). pDCQ330 After centrifuging at 4,000 rpm for 20 min, the aqueous phase was carefully removed and transferred to a new tube. P agglomerans DC404 was an environmental isolate that Two volumes of ethanol were added and the DNA was contained the carotenoid synthesis gene cluster crtEidiYIBZ gently spooled with a sealed glass Pasteur pipette. The DNA (SEQ ID NO:12) (see U.S. Ser. No. 10/808,807). 50 was dipped into a tube containing 70% ethanol. After air The soil from a residential vegetable garden in Wilming drying, the DNA was resuspended in 400 uL of TE (10 mM ton, Del. was collected and resuspended in LB medium. A Tris-1 mM EDTA, pH 8.0) with RNaseA (100 g/mL) and 10-uL loopful of resuspension was streaked onto LB plates stored at 4°C. The concentration and purity of DNA was and the plates were incubated at 30° C. Pigmented bacteria determined spectrophotometrically by OD/ODs. with diverse colony appearances were picked and streaked 55 A cosmid library of DC404 was constructed using the twice to homogeneity on LB plates and incubated at 30° C. pWEB cosmid cloning kit from Epicentre (Madison, Wis.) From these colonies, one which formed pale yellow smooth following the manufacturer's instructions. Genomic DNA translucent colonies was designated as “strain DC404”. was sheared by passing it through a syringe needle. The P. agglomerans strain DC404 was grown in 25 mL of LB sheared DNA was end-repaired and size-selected on low medium at 30° C. overnight with aeration. Bacterial cells 60 melting-point agarose by comparison with a 40 kB standard. were centrifuged at 4,000x g for 10 min. 
The cell pellet was DNA fragments approximately 40 kB in size were purified gently resuspended in 5 mL of 50 mM Tris-10 mM EDTA and ligated into the blunt-ended cloning-ready pWEB (pH 8.0) and lysozyme was added to a final concentration of cosmid vector. The library was packaged using ultra-high 2 mg/mL. The suspension was incubated at 37° C. for 1 hr. efficiency MaxPlax Lambda Packaging Extracts, and plated Sodium dodecyl sulfate was then added to a final concen 65 on EPI100 E. coli cells. Two yellow colonies were identified tration of 1% and proteinase K was added at 100 g/mL. The from the cosmid library clones. The cosmid DNA from the suspension was incubated at 55° C. for 2 h. The suspension two clones had similar restriction digestion patterns. This US 7,252,985 B2 45 46 cosmid DNA, referred to herein as pWEB-404, contained canthaxanthin standard was purchased from CaroteNature the crtWEidiYIBZ gene cluster, given as SEQ ID NO:12. (Lupsingen, Switzerland). This clearly demonstrated the ketolase function of the three new crtW genes. Primers pWEB404F: 5'-GAATTCACTAGTCGAGACGC CGGGTACCAACCAT-3 (SEQ ID NO:13) and Example 6 pWEB404R: 5'-GAATTCTAGCGCGGGCGCTGCCAGA 3' (SEQ ID NO:14) were used to amplify a fragment from Co-Expression of Divergent Ketolase Genes in E. DC404 containing the crtEidiYIB genes (SEQID NO:15) by coli PCR. Cosmid DNA pWEB-404 was used as the template with Pfu TurboTM polymerase (Stratagene, La Jolla, Calif.), 10 This example describes co-expression of divergent keto and the following thermocycler conditions: 92°C. (5 min); lase genes in an E. coli strain producing astaxanthin and 94°C. (1 min), 60° C. (1 min), 72° C. (9 min) for 25 cycles: intermediates. Expression of the additional ketolase genes and 72° C. (10 min). A single product of approximately 5.6 increased astaxanthin production. kB was observed following gel electrophoresis. Taq poly The crtW and the crtz genes from Agrobacterium auran merase (Roche Applied Science, Indianapolis, Ind.) 
was 15 tiacum were used to produce astaxanthin in a heterologous used in a ten minute 72° C. reaction to add additional 3' host such as E. coli. We evaluated whether co-expression of adenosine nucleotides to the fragment for TOPOR cloning a divergent crtW would improve astaxanthin conversion. into pTrcHis2-TOPO (Invitrogen). Following transforma The three newly isolated carotenoid ketolase genes from tion to E. coli TOP10 cells, several colonies appeared bright DC18, DC263, and K1-202C share only moderate homol yellow in color, indicating that they were producing a ogy with several known crtW ketolase genes as shown in carotenoid compound. The gene cluster was then Subcloned Table 4. Specifically, the crtW gene from DC18 has 57% into the broad host range vector pBHR1 (MoBiTec, LLC, DNA sequence identity and 48% amino acid sequence Marco Island, Fla.), and electroporated into E. coli 10G cells identity with the crtW gene (SEQ ID NO:23) from Agro (Lucigen, Middletown, Wis.). The transformants containing bacterium aurantiacum. The crtW from DC263 has 55% the resulting plasmid pCO330 were selected on LB 25 DNA sequence identity and 45% amino acid sequence medium containing 50 g/mL, kanamycin. In pCO330, a identity with the crtW gene from Agrobacterium auranti unique Spel site was engineered upstream of crtE. acum. The crtW from K1-202C has 39% DNA sequence identity and 31% amino acid sequence identity with the crtW Example 5 gene from Agrobacterium aurantiacum. It is unlikely that 30 the presence of multiple copies of the crtW genes in a single Expression of a Novel CrtW Carotenoid Ketolase host would cause instability problem due to their moderate Gene in E. coli to low homologies to each other. Plasmid plCQ335 was constructed by cloning the syn This example describes expression of the novel caro thetic Agrobacterium crtZW genes into the B-carotene Syn tenoid ketolase genes in an E. coli strain producing B-caro 35 thesis gene cluster in plDCQ330. 
The crtz (SEQID NO:22) tene. Function of the ketolase genes is demonstrated by and crtW (SEQ ID NO:23) genes were joined together by conversion of B-carotene to canthaxanthin. SOEing PCR. The crtz gene was amplified using forward The 3-carotene producing strain used in this study was the primer crtzW F: 5'-ACTAGTAAGGAGGAATAAACCAT E. coli strain containing plasmid ploCQ330, which carried GACCAAC-3' (SEQID NO:24) and reverse primer crtzW the B-carotene synthesis gene cluster from Pantoea agglo 40 Soe R: 5'-AGGGCATGGGCGCTCATGGTATATTC merans DC404 (U.S. Ser. No. 10/808,807). The putative CTCCTTTCTAGATTAGGTGCGTTCTTGGGCTTC-3' ketolase genes from the three bacterial strains were ampli (SEQ ID NO:25). The crtW gene was amplified using fied by PCR. The crtW from DC18 was amplified using forward primer crtzW soe F: 5'-GAAGCCCAAGAACG primers crtW-18 F: 5'-ACTAGTAAGGAGGAATAAAC CACCTAATCTAGAAAGGAGGAATATAC CATGACCGTCGATCACGACGCAC-3 (SEQ ID NO:16) 45 CATGAGCGCCCATGCCCT-3' (SEQ ID NO:26) and and crtW-18 R: 5'-TCTAGACTACCGGTCTTTGCT reverse primer crtzW R: 5'-GCTAGCTGTACAT TAACGAC-3' (SEQID NO:17). The crtW from DC263 was CACGCGGTGTCGCCTTTGG-3 (SEQ ID NO:27). The amplified using primers crtW-263 F: 5'-ACTAGTAAG two PCR products were gel purified and joined together by GAGGGAATAAACCATGCGGCAAGCGAACAGGATG PCR using primers crtzW F and crtzW. R. The 1272 bp 3' (SEQ ID NO:18) and crtW-263 R: 5'-TCTAGAC 50 PCR product was cloned into pTrcHis2-Topo vector (Invit TAGCTGAACAAACTCCACCAG-3 (SEQ ID NO:19). rogen) resulting in plasmid pl)CO335TA. The -1.2 kb Nhe The crtW from K1-202C was amplified using primers crtW/ I/Spe I fragment from pl)CO335TA containing the crtzW K1-2O2CF: 5'-ACTAGTAAGGAGGAATAAACCATG genes was ligated to the unique Spe I site in pl)CO330. In GCTGATGGAGGAAGTGMGG-3 (SEQ ID NO:20) and the resulting construct pl)CO335, the crtzWEidiYIB genes critWK1-2O2CR 5-TCTAGATTAGTTTGATTGAGAT 55 are organized in an operon and under the control of the TCTT-3' (SEQ ID NO:21). 
The PCR products were cloned chloramphenicol resistant gene promoter of the vector. into pTrchis2-TOPO (Invitrogen) vector and screened for Plasmid pl)CO342TA expressing a crtW gene from clones containing the insert in the forward orientation. These DC263 and plasmid plCQ339TA expressing a crtW gene resulted in pl)CO341TA expressing the crtW gene from from K1-202C were transformed into E. coli cells contain DC18, pIDCO342TA expressing the crtW gene from DC263, 60 ing plCQ335. Plasmid plCQ335 containing a crtW gene and pL)CO339TA expressing the crtW gene from K1-202C. from Agrobacterium aurantiacum is compatible with plas These constructs were transformed into the B-carotene accu mids plCQ342TA or poCQ339TA. E. coli strains contain mulating E. coli strain containing pCO330. Orange trans ing p)CO335 alone and strains containing the additional formants were obtained and their carotenoids were analyzed plasmid plDCQ342TA or poCQ339TA were grown in LB at by HPLC as described in Example 1. The HPLC results are 65 30° C. for 3 days and HPLC analysis was performed as shown in FIG. 3. Canthaxanthin eluted at 7.29 min was the described in Example 1. Results are shown in FIG. 4. carotenoid exclusively produced in each of the strain. The Astaxanthin was identified by comparing its elution time, US 7,252,985 B2 47 48 absorption spectra and molecular weight with those of the into the B-carotene synthesis plasmid pl)CO340 (Example authentic standard (Sigma, St. Louis, Mo.). Presence of 7), creating plasmids pl)CO341 and ploCQ342, respectively. adonixanthin was predicted based on the absorption spectra The plasmids plCQ341 and pL)CO342 were transferred and its molecular weight (582 Dalton). In the E. coli strain into Methylomonas 16a by tri-parental conjugal mating containing plCQ335 alone, approximately 24% of the total 5 (U.S. Ser. No. 60/527,083). An E. coli helper strain con carotenoids produced was astaxanthin (5.0 min) and the taining pRK2013 (ATCC No. 37159) and an E. 
majority (46%) of the carotenoids produced was adonixanthin (5.6 min). In strains containing pDCQ335 co-expressed with pDCQ342TA or pDCQ339TA, approximately 50% of the carotenoids produced was astaxanthin (4.8-4.9 min) and approximately 10% was adonixanthin (5.5 min). This result demonstrated that co-expression of more than one divergent ketolase gene improved the efficiency of keto group addition, increasing production of ketocarotenoids such as astaxanthin.

Example 7

Construction of β-Carotene Synthesis Plasmid pDCQ340

The purpose of this Example was to prepare a β-carotene expression plasmid, referred to herein as pDCQ340. Enterobacteriaceae DC260 (U.S. Ser. No. 10/808,979; hereby incorporated by reference) contains the natural gene cluster crtEXYIBZ. The genes required for β-carotene synthesis (i.e., crtEYIB) were joined together by PCR. The crtE gene was amplified using primers crt-260 F: 5'-GAATTCACTAGTACCAACCATGGATAGCCATTATG-3' (SEQ ID NO: 28) and crt-260SOE R: 5'-ATCAGGTCGCCTCCGCCAGCACGACTTTCAGTTGAATATCGCTAGCTGTTG-3' (SEQ ID NO: 29). The crtY gene was amplified using primers crt-260SOE F: 5'-CAACAGCTAGCGATATTCAACTGAAAGTCGTGCTGGCGGAGGCGACCTGAT-3' (SEQ ID NO: 30) and crt-260R1 R: 5'-CATTTTTTCTTCCCTGGTTCGACAGAGTTCAACAGCGCGCGCAGCGCTT-3' (SEQ ID NO: 31). The crtIB genes were amplified using primers crt-260R1 F: 5'-AAGCGCTGCGCGCGCTGTTGAACTCTGTCGAACCAGGGAAGAAAAAATG-3' (SEQ ID NO: 32) and crt-260 R: 5'-GAATTCAACGAGGACGCTGCCACAGA-3' (SEQ ID NO: 33). An EcoRI site at the 3' end of the crtY gene was removed by a silent change introduced in the primers spanning the 3' end of the crtY gene. The crtE and crtY genes were first joined together by SOEing PCR using primers crt-260 F (SEQ ID NO: 28) and crt-260R1 R (SEQ ID NO: 31). The crtEY genes were then joined together by PCR with the crtIB genes using primer crt-260 F (SEQ ID NO: 28) and primer crt-260 R (SEQ ID NO: 33). The final 4.5 kb crtEYIB fragment was cloned into pTrcHis2-TOPO vector and then subcloned into pBHR1, resulting in pDCQ340. E. coli cells containing pDCQ340 were shown to produce β-carotene.

Example 8

Expression of the Novel Carotenoid Ketolase Genes in Methylomonas

This example describes how one of skill in the art can express the novel carotenoid ketolase genes for production of ketocarotenoids, such as canthaxanthin, in Methylomonas sp. 16a (ATCC PTA-2402) based on previously reported methods (U.S. Ser. No. 09/941,947 and U.S. Ser. No. 60/527,083).

The crtW genes from Sphingomonas melonis DC18 and Brevundimonas vesicularis DC263 were individually cloned [...] coli 10G. [...] donor strain containing the plasmid pDCQ341 or pDCQ342 were grown overnight in LB medium containing kanamycin (50 µg/mL), washed three times in LB, and resuspended in a volume of LB representing approximately a 60-fold concentration of the original culture volume.

The Methylomonas sp. 16a MWM1200 strain contains a double crossover knockout of the promoter for the native crtN1 ald crtN2 gene cluster and a knockout of the native crtN3 gene, disrupting the synthesis of the native C30 carotenoids (U.S. Ser. No. 60/527,083). This MWM1200 strain can be grown as the recipient using the general conditions described in U.S. Ser. No. 09/941,947. Briefly, the Methylomonas 16a MWM1200 strain was grown in serum stoppered Wheaton bottles (Wheaton Scientific, Wheaton, Ill.) using a gas/liquid ratio of at least 8:1 (i.e., 20 mL of Nitrate liquid "BTZ-3" medium in 160 mL total volume) at 30° C. with constant shaking.

Nitrate Medium for Methylomonas 16a

Nitrate liquid medium, also referred to herein as "defined medium" or "BTZ-3" medium, was comprised of various salts mixed with Solution 1 as indicated below (Tables 5 and 6), or where specified the nitrate was replaced with 15 mM ammonium chloride. Solution 1 provides the composition for a 100-fold concentrated stock solution of trace minerals.

TABLE 5
Solution 1*

Compound                 MW       Conc. (mM)   g per L
Nitrilotriacetic acid    191.1    66.9         12.8
CuCl2 x 2H2O             170.48   0.15         0.0254
FeCl2 x 4H2O             198.81   1.5          0.3
MnCl2 x 4H2O             197.91   0.5          0.1
CoCl2 x 6H2O             237.9    1.31         0.312
ZnCl2                    136.29   0.73         0.1
H3BO3                    61.83    0.16         0.01
Na2MoO4 x 2H2O           241.95   0.04         0.01
NiCl2 x 6H2O             237.7    0.77         0.184

*Mix the gram amounts designated above in 900 mL of H2O, adjust to pH = 7, and add H2O to an end volume of 1 L. Keep refrigerated.

TABLE 6
Nitrate liquid medium (BTZ-3)*

Compound                 MW       Conc. (mM)   g per L
NaNO3                    84.99    10           0.85
KH2PO4                   136.09   3.67         0.5
Na2SO4                   142.04   3.52         0.5
MgCl2 x 6H2O             203.3    0.98         0.2
CaCl2 x 2H2O             147.02   0.68         0.1
1 M HEPES (pH 7)         238.3                 50 mL
Solution 1                                     10 mL

*Dissolve in 900 mL H2O. Adjust to pH = 7, and add H2O to give 1 L. For agar plates: Add 15 g of agarose in 1 L of medium, autoclave, let cool down to 50° C., mix, and pour plates.

The standard gas phase for cultivation contains 25% methane in air. The Methylomonas sp. 16a MWM1200 recipient strain was cultured under these conditions for 48 h in BTZ-3 medium, washed three times in BTZ-3, and resuspended in a volume of BTZ-3 representing a 150-fold concentration of the original culture volume.

The donor, helper, and recipient cell pastes were combined in ratios of 1:1:2, respectively, on the surface of BTZ-3 agar plates containing 0.5% (w/v) yeast extract. Plates were maintained at 30° C. in 25% methane for 16-72 hours to allow conjugation to occur, after which the cell pastes were collected and resuspended in BTZ-3. Dilutions were plated on BTZ-3 agar containing kanamycin (50 µg/mL) and incubated at 30° C. in 25% methane for up to 1 week. Orange-red transconjugants were streaked onto BTZ-3 agar with kanamycin (50 µg/mL).

For analysis of carotenoid composition, transconjugants were cultured in 25 mL BTZ-3 containing kanamycin (50 µg/mL) and incubated at 30° C. in 25% methane as the sole carbon source for up to 1 week. The cells were harvested by centrifugation and frozen at -20° C. After thawing, the pellets were extracted and carotenoid content was analyzed by HPLC, as described in Example 1.

HPLC analysis (FIG. 5) of extracts from Methylomonas 16a MWM1200 containing pDCQ340 showed synthesis of β-carotene. Methylomonas 16a MWM1200 containing either pDCQ341 or pDCQ342 synthesized canthaxanthin, which confirmed the ketolase activity of the novel ketolases in this methanotrophic host.
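The "g per L" column of Table 6 can be cross-checked against the listed molecular weights and millimolar concentrations, since grams per liter equals MW (g/mol) times concentration (mM) divided by 1000. The following is a minimal sketch of that check, not part of the disclosed method; the function names, the table literal, and the 5% rounding tolerance are illustrative assumptions.

```python
# Cross-check of Table 6 (BTZ-3 salts): g/L = MW [g/mol] * conc [mmol/L] / 1000.
# The 5% tolerance is an assumption that allows for rounding in the listed values.

# (compound, MW in g/mol, concentration in mM, listed g per L)
TABLE_6 = [
    ("NaNO3", 84.99, 10.0, 0.85),
    ("KH2PO4", 136.09, 3.67, 0.5),
    ("Na2SO4", 142.04, 3.52, 0.5),
    ("MgCl2 x 6H2O", 203.3, 0.98, 0.2),
    ("CaCl2 x 2H2O", 147.02, 0.68, 0.1),
]


def grams_per_liter(mw, mmol_per_l):
    """Convert a millimolar concentration to grams per liter."""
    return mw * mmol_per_l / 1000.0


def check_table(rows, rel_tol=0.05):
    """Return (name, computed g/L, listed g/L, within_tolerance) for each row."""
    results = []
    for name, mw, mmol_per_l, listed in rows:
        computed = grams_per_liter(mw, mmol_per_l)
        ok = abs(computed - listed) / listed <= rel_tol
        results.append((name, round(computed, 4), listed, ok))
    return results


if __name__ == "__main__":
    for row in check_table(TABLE_6):
        print(row)
```

The same function can be applied to the trace-mineral rows of Table 5 by supplying their MW and mM values.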

SEQUENCE LISTING

<160> NUMBER OF SEQ ID NOS: 33

<210> SEQ ID NO 1
<211> LENGTH: 747
<212> TYPE: DNA
<213> ORGANISM: Sphingomonas melonis DC18
<220> FEATURE:
<221> NAME/KEY: CDS
<222> LOCATION: (1)..(747)

<400> SEQUENCE: 1

atg acc gtc gat cac gac gca cgg atc agc ctg ctg ctg gcc gca gcc 48
Met Thr Val Asp His Asp Ala Arg Ile Ser Leu Leu Leu Ala Ala Ala
1 5 10 15

atc ggc gcc gcg tgg ctg gcg atc cat gtc ggg gcg atc gtg tgg tgg 96
Ile Gly Ala Ala Trp Leu Ala Ile His Val Gly Ala Ile Val Trp Trp
20 25 30

cga tgg agc ccg gcg acg gcg gtg ctc gcg atc ccc gtc gtg ctc gta 144
Arg Trp Ser Pro Ala Thr Ala Val Leu Ala Ile Pro Val Val Leu Val
35 40 45

cag gcg tgg ctg agc acc ggc ctg ttc atc gtc gcg cac gat tgc atg 192
Gln Ala Trp Leu Ser Thr Gly Leu Phe Ile Val Ala His Asp Cys Met
50 55 60

cac gga tcg ttc gtg ccc ggc cgg ccc gcg gtc aac cgg acc gtc ggg 240
His Gly Ser Phe Val Pro Gly Arg Pro Ala Val Asn Arg Thr Val Gly
65 70 75 80

acg ctg tgc ctc ggc gcc tat gcg gga ctg tcc tat ggc cag ctc cat 288
Thr Leu Cys Leu Gly Ala Tyr Ala Gly Leu Ser Tyr Gly Gln Leu His
85 90 95

ccc aag cat cat gcg cat cac gat gcg ccg ggc acc gcc gcc gac ccc 336
Pro Lys His His Ala His His Asp Ala Pro Gly Thr Ala Ala Asp Pro
100 105 110

gat ttc cat gcc ggc gcg ccg cga tcc gca ctg ccg tgg ttc gcg cgc 384
Asp Phe His Ala Gly Ala Pro Arg Ser Ala Leu Pro Trp Phe Ala Arg
115 120 125

ttc ttc acc agc tat tac acg cac ggc cag atc ctc cgg atc acc gcg 432
Phe Phe Thr Ser Tyr Tyr Thr His Gly Gln Ile Leu Arg Ile Thr Ala
130 135 140

gcg gcg gtg ctg tac atg ctg ctc ggt gtg tcg ctg ctc aac atc gtc 480
Ala Ala Val Leu Tyr Met Leu Leu Gly Val Ser Leu Leu Asn Ile Val
145 150 155 160

gtg ttc tgg gcg ttg ccg gcg ctg atc gcg ctg gcg cag ctg ttc gtc 528
Val Phe Trp Ala Leu Pro Ala Leu Ile Ala Leu Ala Gln Leu Phe Val
165 170 175

ttc ggc acc ttc ctg ccg cat cgc cac ggc gac acg ccg ttc gcg gac 576
Phe Gly Thr Phe Leu Pro His Arg His Gly Asp Thr Pro Phe Ala Asp
180 185 190

gcg cac aat gcc cgc agc aac ggc tgg cca cgg ctg gcg tcg ctg gcg 624
Ala His Asn Ala Arg Ser Asn Gly Trp Pro Arg Leu Ala Ser Leu Ala
195 200 205

acc tgc ttc cac ttc ggc gcc tat cat cac gaa cat cac ctg agc ccg 672
Thr Cys Phe His Phe Gly Ala Tyr His His Glu His His Leu Ser Pro
210 215 220

tgg acg ccc tgg tgg cag ttg ccg cgc gtc ggc cag cct gcc gcc gga 720
Trp Thr Pro Trp Trp Gln Leu Pro Arg Val Gly Gln Pro Ala Ala Gly
225 230 235 240

cac cgg tcg tta agc aaa gac cgg tag 747
His Arg Ser Leu Ser Lys Asp Arg
245

<210> SEQ ID NO 2
<211> LENGTH: 248
<212> TYPE: PRT
<213> ORGANISM: Sphingomonas melonis DC18

<400> SEQUENCE: 2

Met Thr Val Asp His Asp Ala Arg Ile Ser Leu Leu Leu Ala Ala Ala
1 5 10 15

Ile Gly Ala Ala Trp Leu Ala Ile His Val Gly Ala Ile Val Trp Trp
20 25 30

Arg Trp Ser Pro Ala Thr Ala Val Leu Ala Ile Pro Val Val Leu Val
35 40 45

Gln Ala Trp Leu Ser Thr Gly Leu Phe Ile Val Ala His Asp Cys Met
50 55 60

His Gly Ser Phe Val Pro Gly Arg Pro Ala Val Asn Arg Thr Val Gly
65 70 75 80

Thr Leu Cys Leu Gly Ala Tyr Ala Gly Leu Ser Tyr Gly Gln Leu His
85 90 95

Pro Lys His His Ala His His Asp Ala Pro Gly Thr Ala Ala Asp Pro
100 105 110

Asp Phe His Ala Gly Ala Pro Arg Ser Ala Leu Pro Trp Phe Ala Arg
115 120 125

Phe Phe Thr Ser Tyr Tyr Thr His Gly Gln Ile Leu Arg Ile Thr Ala
130 135 140

Ala Ala Val Leu Tyr Met Leu Leu Gly Val Ser Leu Leu Asn Ile Val
145 150 155 160

Val Phe Trp Ala Leu Pro Ala Leu Ile Ala Leu Ala Gln Leu Phe Val
165 170 175

Phe Gly Thr Phe Leu Pro His Arg His Gly Asp Thr Pro Phe Ala Asp
180 185 190

Ala His Asn Ala Arg Ser Asn Gly Trp Pro Arg Leu Ala Ser Leu Ala
195 200 205

Thr Cys Phe His Phe Gly Ala Tyr His His Glu His His Leu Ser Pro
210 215 220

Trp Thr Pro Trp Trp Gln Leu Pro Arg Val Gly Gln Pro Ala Ala Gly
225 230 235 240

His Arg Ser Leu Ser Lys Asp Arg
245

<210> SEQ ID NO 3
<211> LENGTH: 780
<212> TYPE: DNA


<400> SEQUENCE: 4

Met Arg Gln Ala Asn Arg Met Leu Thr Gly Pro Arg Cys Ala Lys Cys
1 5 10 15

Arg Ala Met Ser Ala Val Thr Pro Met Ser Arg Val Val Pro Asn Gln
20 25 30

Ala Leu Ile Gly Leu Thr Leu Ala Gly Leu Ile Ala Ala Ala Trp Leu
35 40 45

Thr Leu His Ile Tyr Gly Val Tyr Phe His Arg Trp Thr Ile Trp Ser
50 55 60

Val Leu Thr Val Pro Leu Ile Val Ala Gly Gln Thr Trp Leu Ser Val
65 70 75 80

Gly Leu Phe Ile Val Ala His Asp Ala Met His Gly Ser Leu Ala Pro
85 90 95

Ala Arg Pro Arg Leu Asn Thr Ala Ile Gly Ser Leu Ala Leu Ala Leu
100 105 110

Tyr Ala Gly Phe Arg Phe Thr Pro Leu Lys Thr Ala His His Ala His
115 120 125

His Ala Ala Pro Gly Thr Ala Asp Asp Pro Asp Phe His Ala Asp Ala
130 135 140

Pro Arg Ala Phe Leu Pro Trp Phe Tyr Gly Phe Phe Arg Thr Tyr Phe
145 150 155 160

Gly Trp Arg Glu Leu Ala Val Leu Thr Val Leu Val Ala Val Ala Val
165 170 175

Leu Ile Leu Gly Ala Arg Met Pro Asn Leu Leu Val Phe Trp Ala Ala
180 185 190

Pro Ala Leu Leu Ser Ala Leu Gln Leu Phe Thr Phe Gly Thr Trp Leu
195 200 205

Pro His Arg His Thr Asp Asp Ala Phe Pro Asp Asn His Asn Ala Arg
210 215 220

Thr Ser Pro Phe Gly Pro Val Leu Ser Leu Leu Thr Cys Phe His Phe
225 230 235 240

Gly Arg His His Glu His His Leu Thr Pro Trp Lys Pro Trp Trp Ser
245 250 255

Leu Phe Ser

<210> SEQ ID NO 5
<211> LENGTH: 771
<212> TYPE: DNA
<213> ORGANISM: Flavobacterium sp. K1-202C

<400> SEQUENCE: 5

gtggctgatg gaggaagtga aggaaaggat tctgactttt tgagaaaaca ttcccaactt 60
gctgaaatga aggcggaaat cacatccatg tctgttgatc ctaaaggaat ctttattgct 120
gtggcgataa taggtttgtg gttttccagt ttggtttttc ttctgaacta tgagatttct 180
tggagtgatc cgctagttta tttaggtatc ttggttcaga tgcatcttta taccggactt 240
ttcatcaccg cgcatgatgc tatgcatggt ttagtagctt ccaacaaaag attgaacact 300
agcattggtt gggtttcggc attgctattt tcttacaatt tttatagtaa gctattccca 360
aagcatcacg aacatcaccg gtttgtagcc actgaccaag acccggattt ccatacttca 420
gataatttct tcgtatggta ttttagcttt atcaagcaat acattaccct ctggcaaatc 480
attcttatgg cgatcacttt taatgtactt aagctttttc tgcctgtgga taacttgatt 540
attttctgga tgcttccggc agttttatcc acatttcagc ttttttactt tggtacatat 600
ctgcctcata agggtgtaaa cgacaataag caccattcta ctacccagtc aaaaaaccat 660
ttttcggctt ttattacctg ttatttcttc ggttatcact atgagcatca tgattctccg 720
ggtacaccct ggtggaggct ttggagggta aaagaatctc aatcaaacta a 771

<210> SEQ ID NO 6
<211> LENGTH: 256
<212> TYPE: PRT
<213> ORGANISM: Flavobacterium sp. K1-202C

<400> SEQUENCE: 6

Met Ala Asp Gly Gly Ser Glu Gly Lys Asp Ser Asp Phe Leu Arg Lys
1 5 10 15

His Ser Gln Leu Ala Glu Met Lys Ala Glu Ile Thr Ser Met Ser Val
20 25 30

Asp Pro Lys Gly Ile Phe Ile Ala Val Ala Ile Ile Gly Leu Trp Phe
35 40 45

Ser Ser Leu Val Phe Leu Leu Asn Tyr Glu Ile Ser Trp Ser Asp Pro
50 55 60

Leu Val Tyr Leu Gly Ile Leu Val Gln Met His Leu Tyr Thr Gly Leu
65 70 75 80

Phe Ile Thr Ala His Asp Ala Met His Gly Leu Val Ala Ser Asn Lys
85 90 95

Arg Leu Asn Thr Ser Ile Gly Trp Val Ser Ala Leu Leu Phe Ser Tyr
100 105 110

Asn Phe Tyr Ser Lys Leu Phe Pro Lys His His Glu His His Arg Phe
115 120 125

Val Ala Thr Asp Gln Asp Pro Asp Phe His Thr Ser Asp Asn Phe Phe
130 135 140

Val Trp Tyr Phe Ser Phe Ile Lys Gln Tyr Ile Thr Leu Trp Gln Ile
145 150 155 160

Ile Leu Met Ala Ile Thr Phe Asn Val Leu Lys Leu Phe Leu Pro Val
165 170 175

Asp Asn Leu Ile Ile Phe Trp Met Leu Pro Ala Val Leu Ser Thr Phe
180 185 190

Gln Leu Phe Tyr Phe Gly Thr Tyr Leu Pro His Lys Gly Val Asn Asp
195 200 205

Asn Lys His His Ser Thr Thr Gln Ser Lys Asn His Phe Ser Ala Phe
210 215 220

Ile Thr Cys Tyr Phe Phe Gly Tyr His Tyr Glu His His Asp Ser Pro
225 230 235 240

Gly Thr Pro Trp Trp Arg Leu Trp Arg Val Lys Glu Ser Gln Ser Asn
245 250 255

<210> SEQ ID NO 7
<211> LENGTH: 19
<212> TYPE: DNA
<213> ORGANISM: Artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Primer HK12

<400> SEQUENCE: 7

gagtttgatc ctggctcag 19

<210> SEQ ID NO 8
<211> LENGTH: 15


-continued cgcgggtaag to Cacgctgg toggcgatgct C gg cagogac gcggtgcgcg agcggcticga. 528 O cacccatctg. c.gc.cgc.gcag acgcc cattt ttcacgc.gcc toc ggaaaaa accaggccac 5340 gc gacgctitt atgcacgcct g gttittcaaa acagotggcc gcgtttagct gag Caacgga 5 400 tacaccc.cgg taatatttgt ggagatcaca toaaggacgc gcatctggitt cagogtaaaa 546 O atgaccacct ggatat cqtg citgcaccctg accgggc gat gagtaccatt cgcaccggat 552O ttgacgc.ctg gcgttittgaa cactd.cgc.cc toccggagct ggatctogac ggitat.cgatc 558 O totccaccac cct gttitt.co C goccgctgaaag.ccc.cggit gct gatcago to 5 632

<210> SEQ ID NO 16
<211> LENGTH: 43
<212> TYPE: DNA
<213> ORGANISM: Artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Primer crtW-18 F

<400> SEQUENCE: 16

actagtaagg aggaataaac catgaccgtc gatcacgacg cac 43

<210> SEQ ID NO 17
<211> LENGTH: 28
<212> TYPE: DNA
<213> ORGANISM: Artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Primer crtW-18 R

<400> SEQUENCE: 17

tctagactac cggtctttgc ttaacgac 28

<210> SEQ ID NO 18
<211> LENGTH: 42
<212> TYPE: DNA
<213> ORGANISM: Artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Primer crtW-263 F

<400> SEQUENCE: 18

actagtaagg aggaataaac catgcggcaa gcgaacagga tg 42

<210> SEQ ID NO 19
<211> LENGTH: 28
<212> TYPE: DNA
<213> ORGANISM: Artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Primer crtW-263 R

<400> SEQUENCE: 19

tctagactag ctgaacaaac tccaccag 28

<210> SEQ ID NO 20
<211> LENGTH: 44
<212> TYPE: DNA
<213> ORGANISM: Artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Primer crtW/K1-202C F

<400> SEQUENCE: 20

actagtaagg aggaataaac catggctgat ggaggaagtg aagg 44
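The forward crtW primers listed above share a common 5' portion followed by the first codons of the target gene. As a hedged reading of the listing, not a statement from the specification, the sketch below locates an SpeI recognition site (ACTAGT), a Shine-Dalgarno-like AGGAGG motif, and the ATG start codon in primer crtW-18 F (SEQ ID NO: 16), then translates the gene-specific tail with the standard genetic code for comparison with the start of SEQ ID NO: 2. The function names and the interpretation of the motifs are illustrative assumptions.

```python
# Annotating primer crtW-18 F (SEQ ID NO: 16) as transcribed from the listing.
# Motif interpretation (SpeI site, ribosome binding site) is an assumption of this
# sketch; the standard genetic code table is built in TCAG order.

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

CRTW_18_F = "actagtaaggaggaataaaccatgaccgtcgatcacgacgcac"


def translate(dna):
    """Translate complete codons from position 0 until a stop codon or the end."""
    dna = dna.upper()
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def describe_primer(primer):
    """Report motif positions (0-based) and the peptide encoded after the first ATG."""
    seq = primer.upper()
    start = seq.find("ATG")
    return {
        "has_spei_site": seq.startswith("ACTAGT"),
        "rbs_index": seq.find("AGGAGG"),
        "start_codon_index": start,
        "encoded_peptide": translate(seq[start:]),
    }


if __name__ == "__main__":
    print(describe_primer(CRTW_18_F))
```

The encoded peptide corresponds to Met Thr Val Asp His Asp Ala, the first seven residues of SEQ ID NO: 2.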

<210> SEQ ID NO 21
<211> LENGTH: 26
<212> TYPE: DNA


<210> SEQ ID NO 25
<211> LENGTH: 59
<212> TYPE: DNA
<213> ORGANISM: Artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Primer crt2W soe R

<400> SEQUENCE: 25

agggcatggg cgctcatggt atattcctcc tttctagatt aggtgcgttc ttgggcttc 59

<210> SEQ ID NO 26
<211> LENGTH: 59
<212> TYPE: DNA
<213> ORGANISM: Artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Primer crt2W soe F

<400> SEQUENCE: 26

gaagcccaag aacgcaccta atctagaaag gaggaatata ccatgagcgc ccatgccct 59

<210> SEQ ID NO 27
<211> LENGTH: 32
<212> TYPE: DNA
<213> ORGANISM: Artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Primer crt2W R

<400 SEQUENCE: 27 gctagotgta Catcacgcgg togtogcctitt gg 32

<210> SEQ ID NO 28
<211> LENGTH: 35
<212> TYPE: DNA
<213> ORGANISM: artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Primer crt-260 F

<400> SEQUENCE: 28

gaattcacta gtaccaacca tggatagcca ttatg 35

<210> SEQ ID NO 29
<211> LENGTH: 51
<212> TYPE: DNA
<213> ORGANISM: artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Primer crt-260SOE R

<400> SEQUENCE: 29

atcaggtcgc ctccgccagc acgactttca gttgaatatc gctagctgtt g 51

<210> SEQ ID NO 30
<211> LENGTH: 51
<212> TYPE: DNA
<213> ORGANISM: artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Primer crt-260SOE F

<400> SEQUENCE: 30

caacagctag cgatattcaa ctgaaagtcg tgctggcgga ggcgacctga t 51

<210> SEQ ID NO 31
<211> LENGTH: 49
<212> TYPE: DNA
<213> ORGANISM: artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Primer crt-260R1 R

<400> SEQUENCE: 31

cattttttct tccctggttc gacagagttc aacagcgcgc gcagcgctt 49

<210> SEQ ID NO 32
<211> LENGTH: 49
<212> TYPE: DNA
<213> ORGANISM: artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Primer crt-260R1 F

<400> SEQUENCE: 32

aagcgctgcg cgcgctgttg aactctgtcg aaccagggaa gaaaaaatg 49

<210> SEQ ID NO 33
<211> LENGTH: 26
<212> TYPE: DNA
<213> ORGANISM: artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Primer crt-260 R

<400> SEQUENCE: 33

gaattcaacg aggacgctgc cacaga 26
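In Example 7 the crtE, crtY, and crtIB fragments are joined by SOEing PCR, which relies on the junction primers of adjacent fragments being complementary to each other. The sketch below checks that property for the primer sequences as given in Example 7 (SEQ ID NOs: 29 to 32). It is an illustration under the assumption that each junction pair overlaps over its full length; the dictionary keys and function names are hypothetical.

```python
# Check that the SOE junction primers from Example 7 are exact reverse complements.
# Sequences are copied from the primer text of Example 7; names are illustrative.

COMPLEMENT = str.maketrans("ACGT", "TGCA")

PRIMERS = {
    "crt-260 F": "GAATTCACTAGTACCAACCATGGATAGCCATTATG",                      # SEQ ID NO: 28
    "crt-260SOE R": "ATCAGGTCGCCTCCGCCAGCACGACTTTCAGTTGAATATCGCTAGCTGTTG",   # SEQ ID NO: 29
    "crt-260SOE F": "CAACAGCTAGCGATATTCAACTGAAAGTCGTGCTGGCGGAGGCGACCTGAT",   # SEQ ID NO: 30
    "crt-260R1 R": "CATTTTTTCTTCCCTGGTTCGACAGAGTTCAACAGCGCGCGCAGCGCTT",      # SEQ ID NO: 31
    "crt-260R1 F": "AAGCGCTGCGCGCGCTGTTGAACTCTGTCGAACCAGGGAAGAAAAAATG",      # SEQ ID NO: 32
    "crt-260 R": "GAATTCAACGAGGACGCTGCCACAGA",                               # SEQ ID NO: 33
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_soe_pair(forward_name, reverse_name):
    """True when the two named primers are exact reverse complements of each other."""
    return reverse_complement(PRIMERS[forward_name]) == PRIMERS[reverse_name].upper()


if __name__ == "__main__":
    print("crtE/crtY junction:", is_soe_pair("crt-260SOE F", "crt-260SOE R"))
    print("crtY/crtIB junction:", is_soe_pair("crt-260R1 F", "crt-260R1 R"))
```

The outer primers crt-260 F and crt-260 R are not expected to pair with each other, since they prime the two ends of the assembled crtEYIB fragment.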

What is claimed is:

1. An isolated nucleic acid molecule encoding a carotenoid ketolase enzyme, selected from the group consisting of:
(a) an isolated nucleic acid molecule encoding an amino acid as set forth in SEQ ID NO:4;
(b) an isolated nucleic acid molecule that hybridizes with (a) under the following wash conditions: 0.1×SSC, 0.1% SDS, 65° C.; or
an isolated nucleic acid molecule that is fully complementary to (a), or (b).

2. An isolated nucleic acid molecule according to claim 1 as set forth in SEQ ID NO:3.

3. A polypeptide encoded by the isolated nucleic acid molecule of claim 1.

4. An isolated nucleic acid molecule encoding a carotenoid ketolase enzyme, the enzyme of 259 amino acid that has at least 95% identity based on the Smith-Waterman method of alignment when compared to a polypeptide having the sequence as set forth in SEQ ID NO:4; or a second nucleotide sequence comprising the complement of the first nucleotide sequence.

5. A chimeric gene comprising the isolated nucleic acid molecule of any one of claim 1, or 2, operably linked to suitable regulatory sequences.

6. A transformed host cell comprising the isolated nucleic acid molecule of claim 1.

7. The transformed host cell of claim 6 wherein the host cell is selected from the group consisting of bacteria, yeast, filamentous fungi, algae, and green plants.

8. The transformed host cell of claim 7 wherein the host cell is selected from the group consisting of Aspergillus, Trichoderma, Saccharomyces, Pichia, Candida, Hansenula, or Salmonella, Bacillus, Acinetobacter, Zymomonas, Agrobacterium, Erythrobacter, Chlorobium, Chromatium, Flavobacterium, Cytophaga, Rhodobacter, Rhodococcus, Streptomyces, Brevibacterium, Corynebacteria, Mycobacterium, Deinococcus, Escherichia, Erwinia, Pantoea, Pseudomonas, Sphingomonas, Methylomonas, Methylobacter, Methylococcus, Methylosinus, Methylomicrobium, Methylocystis, Alcaligenes, Synechocystis, Synechococcus, Anabaena, Thiobacillus, Methanobacterium, Klebsiella, and Myxococcus.

9. The transformed host cell of claim 7 wherein the host cell is a C1 metabolizing bacteria.

10. The transformed host cell of claim 7 wherein the host cell is selected from the group consisting of soybean, rapeseed, sunflower, cotton, corn, tobacco, alfalfa, wheat, barley, oats, sorghum, rice, Arabidopsis, cruciferous vegetables, melons, carrots, celery, parsley, tomatoes, potatoes, strawberries, peanuts, grapes, grass seed crops, sugar beets, sugar cane, beans, peas, rye, flax, hardwood trees, softwood trees, and forage grasses.