<<

US 20080038379 A1 (19) United States (12) Patent Application Publication (10) Pub. No.: US 2008/0038379 A1 Metz et al. (43) Pub. Date: Feb. 14, 2008

(54) PUFA POLYKETIDE SYNTHASE SYSTEMS (60) Provisional application No. 60/457.979, filed on Mar. AND USES THEREOF 26, 2003. Provisional application No. 60/284,066, 75 filed on Apr. 16, 2001. Provisional application No. (75) Inventors: SMt. so S. 60/298.796, filed on Jun. 15, 2001. Provisional appli Willi? ity. Era 8) ); cation No. 60/323,269, filed on Sep. 18, 2001. Siles It flat Colorado Publication Classification Correspondence Address: (51) Int. Cl. SHERIDAN ROSS PC A6IR 36/00 (2006.01) 1560 BROADWAY A23C II/00 (2006.01) SUTE 12OO C07C 53/00 (2006.01) DENVER, CO 80202 A23D 7700 (2006.01) (52) U.S. Cl...... 424/725; 426/580: 426/601; (73) Assignee: MARTEK BIOSCIENCES CORPO- 554/1 RATION, Columbia, MD (US) (21) Appl. No.: 11/778,570 (57) ABSTRACT (22) Filed: Jul. 16, 2007 O O The invention generally relates to polyunsaturated fatty acid Related U.S. Application Data (PUFA) polyketide synthase (PKS) systems, to homologues (60) Continuation of application No. 1 1/676,971, filed on thereof, to isolated nucleic acid molecules and recombinant Feb. 20, 2007, which is a division of application No. nucleic acid molecules encoding biologically active 10/810,352, filed on Mar. 26, 2004, now Pat. No. domains of such a PUFA PKS system, to genetically modi 7.211,418, and which is a continuation-in-part of fied organisms comprising PUFA PKS systems, to methods application No. 10/124,800, filed on Apr. 16, 2002, of making and using such systems for the production of now Pat. No. 7,247.461, and which is a continuation bioactive molecules of interest, and to novel methods for in-part of application No. 09/231,899, filed on Jan. identifying new bacterial and non-bacterial microorganisms 14, 1999, now Pat. No. 6,566,583. having such a PUFA PKS system. Comparison of PKS Orfs/domains: VS Shewanella

Orf 5 Orf 6 Orf 7 Of 8 KS MAT 6X ACP KR

Amino acid sequence homology

- 20-35% = 35-50% Patent Application Publication Feb. 14, 2008 Sheet 1 of 3 US 2008/0038379 A1

i

s s

2 2 Patent Application Publication Feb. 14, 2008 Sheet 2 of 3 US 2008/0038379 A1

Z*OIH

US 2008/0038379 A1 Feb. 14, 2008

PUFA POLYKETDE SYNTHASE SYSTEMIS AND detail in U.S. Pat. No. 6,566,583. Finally, the PKS pathways USES THEREOF for PUFA synthesis in such as members of Thraustochytriales, including the complete structural CROSS-REFERENCE TO RELATED description of the PUFA PKS pathway in Schizochytrium APPLICATIONS and the identification of the PUFA PKS pathway in Thraus tochytrium, including details regarding uses of these path 0001. This application is a Continuation of U.S. applica ways, are described in detail in U.S. Patent Application tion Ser. No. 1 1/676,971, filed Feb. 20, 2007, which is a Publication No. 20020194641, published Dec. 19, 2002 Continuation of U.S. application Ser. No. 10/810,352, filed (corresponding to U.S. patent application Ser. No. 10/124, Mar. 26, 2004, now U.S. Pat. No. 7,211,418, which claims 800, filed Apr. 16, 2002). the benefit of priority under 35 U.S.C. S 119(e) from U.S. Provisional Application Ser. No. 60/457.979, filed Mar. 26, 0005 Researchers have attempted to exploit polyketide 2003, entitled “Modification of a Schizochytrium PKS Sys synthase (PKS) systems that have been described in the tem to Facilitate Production of Lipids Rich in Polyunsatu literature as falling into one of three basic types, typically rated Fatty Acids”. U.S. application Ser. No. 10/810,352 is referred to as: Type II, Type I and modular. The Type II also a continuation-in-part of U.S. patent application Ser. system is characterized by separable proteins, each of which No. 10/124,800, filed Apr. 16, 2002, which Claims the carries out a distinct enzymatic reaction. The enzymes work benefit of priority under 35 U.S.C. S 119(e) to: U.S. Provi in concert to produce the end product and each individual sional Application Ser. No. 60/284,066, filed Apr. 16, 2001; enzyme of the system typically participates several times in U.S. Provisional Application Ser. No. 60/298.796, filed Jun. the production of the end product. This type of system 15, 2001; and U.S. Provisional Application Ser. No. 60/323, operates in a manner analogous to the fatty acid synthase 269, filed Sep. 18, 2001. U.S. patent application Ser. No. (FAS) systems found in plants and . Type I PKS 10/124,800, supra, is also a continuation-in-part of U.S. systems are similar to the Type II system in that the enzymes application Ser. No. 09/231,899, filed Jan. 14, 1999, now are used in an iterative fashion to produce the end product. U.S. Pat. No. 6,566,583. Each of the above-identified patent The Type I differs from Type II in that enzymatic activities, applications is incorporated herein by reference in its instead of being associated with separable proteins, occur as entirety. domains of larger proteins. This system is analogous to the 0002 This application does not claim the benefit of Type I FAS systems found in animals and fungi. priority from U.S. application Ser. No. 09/090,793, filed Jun. 0006. In contrast to the Type I and II systems, in modular 4, 1998, now U.S. Pat. No. 6,140,486, although U.S. appli PKS systems, each enzyme domain is used only once in the cation Ser. No. 09/090,793 is incorporated herein by refer production of the end product. The domains are found in ence in its entirety. very large proteins and the product of each reaction is passed on to another domain in the PKS protein. Additionally, in all FIELD OF THE INVENTION of the PKS systems described above, if a carbon-carbon 0003. This invention relates to polyunsaturated fatty acid double bond is incorporated into the end product, it is always (PUFA) polyketide synthase (PKS) systems from microor in the trans configuration. ganisms, including eukaryotic organisms, such as Thraus 0007. In the Type I and Type II PKS systems described tochytrid microorganisms. More particularly, this invention above, the same set of reactions is carried out in each cycle relates to nucleic acids encoding non-bacterial PUFA PKS until the end product is obtained. There is no allowance for systems, to non-bacterial PUFA PKS systems, to genetically the introduction of unique reactions during the biosynthetic modified organisms comprising non-bacterial PUFA PKS procedure. The modular PKS systems require huge proteins systems, and to methods of making and using the non that do not utilize the economy of iterative reactions (i.e., a bacterial PUFA PKS systems disclosed herein. This inven distinct domain is required for each reaction). Additionally, tion also relates to genetically modified microorganisms and as stated above, carbon-carbon double bonds are introduced methods to efficiently produce lipids (triacylglyerols (TAG), in the trans configuration in all of the previously described as well as membrane-associated phospholipids (PL)) PKS systems. enriched in various polyunsaturated fatty acids (PUFAs) and particularly, eicosapentaenoic acid (C20:5, co-3; EPA) by 0008 Polyunsaturated fatty acids (PUFAs) are critical manipulation of a PUFA polyketide synthase (PKS) system. components of membrane lipids in most eukaryotes (Lau ritzen et al., Prog. Lipid Res. 401 (2001); McConn et al., BACKGROUND OF THE INVENTION Plant J. 15, 521 (1998)) and are precursors of certain hormones and signaling molecules (Heller et al., Drugs 55. 0004 Polyketide synthase (PKS) systems are generally 487 (1998); Creelman et al., Annu. Rev. Plant Physiol. Plant known in the art as enzyme complexes derived from fatty Mol. Biol. 48, 355 (1997)). Known pathways of PUFA acid synthase (FAS) systems, but which are often highly synthesis involve the processing of saturated 16:0 or 18:0 modified to produce specialized products that typically show fatty acids (the abbreviation X;Y indicates an acyl group little resemblance to fatty acids. It has now been shown, containing X carbon atoms and Y double bonds (usually cis however, that polyketide synthase systems exist in marine in PUFAs); double-bond positions of PUFAs are indicated bacteria and certain microalgae that are capable of synthe relative to the methyl carbon of the fatty acid chain (c)3 or sizing PUFAs from malonyl-CoA. The PKS pathways for (O6) with systematic methylene interruption of the double PUFA synthesis in Shewanella and another marine bacteria, bonds) derived from fatty acid synthase (FAS) by elongation Vibrio marinus, are described in detail in U.S. Pat. No. and aerobic desaturation reactions (Sprecher, Curr. Opin. 6,140,486. The PKS pathways for PUFA synthesis in the Clin. Nutr. Metab. Care 2, 135 (1999); Parker-Barnes et al., eukaryotic Thraustochytrid, Schizochytrium is described in Proc. Natl. Acad. Sci. USA97, 8284 (2000); Shanklin et al., US 2008/0038379 A1 Feb. 14, 2008

Annu. Rev. Plant Physiol. Plant Nol. Biol. 49, 611 (1998)). ciated with fish oils. However, the PUFAs found in com Starting from acetyl-CoA, the synthesis of docosahexaenoic mercially developed plant oils are typically limited to acid (DHA) requires approximately 30 distinct enzyme linoleic acid (eighteen carbons with 2 double bonds, in the activities and nearly 70 reactions including the four repeti delta 9 and 12 positions—18:2 delta 9, 12) and linolenic acid tive steps of the fatty acid synthesis cycle. Polyketide (18:3 delta 9, 12, 15). In the conventional pathway for PUFA synthases (PKSS) carry out some of the same reactions as synthesis, medium chain-length saturated fatty acids (prod FAS (Hopwood et al., Annu. Rev. Genet. 24, 37 (1990); ucts of a fatty acid synthase (FAS) system) are modified by Bentley et al., Annu. Rev. Microbiol. 53, 411 (1999)) and use a series of elongation and desaturation reactions. Because a the same Small protein (or domain), acyl carrier protein number of separate desaturase and elongase enzymes are (ACP), as a covalent attachment site for the growing carbon required for fatty acid synthesis from linoleic and linolenic chain. However, in these enzyme systems, the complete acids to produce the more Saturated and longer chain cycle of reduction, dehydration and reduction seen in FAS is PUFAs, engineering plant host cells for the expression of often abbreviated so that a highly derivatized carbon chain PUFAs such as EPA and docosahexaenoic acid (DHA) may is produced, typically containing many keto- and hydroxy require expression of several separate enzymes to achieve groups as well as carbon-carbon double bonds in the trans synthesis. Additionally, for production of useable quantities configuration. The linear products of PKSs are often of such PUFAs, additional engineering efforts may be cyclized to form complex biochemicals that include antibi required, for example, engineering the down regulation of otics and many other secondary products (Hopwood et al., enzymes that compete for Substrate, engineering of higher (1990) supra; Bentley et al., (1999), Supra; Keating et al., enzyme activities such as by mutagenesis or targeting of Curr. Opin. Chem. Biol. 3, 598 (1999)). enzymes to plastid organelles. Therefore it is of interest to 0009 Very long chain PUFAs such as docosahexaenoic obtain genetic material involved in PUFA biosynthesis from acid (DHA; 22:6c)3) and eicosapentaenoic acid (EPA: species that naturally produce these fatty acids and to 20:5 (03) have been reported from several species of marine express the isolated material alone or in combination in a bacteria, including Shewanella sp (Nichols et al., Curr. Op. heterologous system which can be manipulated to allow Biotechnol. 10, 240 (1999); Yazawa, Lipids 31, S (1996): production of commercial quantities of PUFAs. DeLong et al., Appl. Environ. Microbiol. 51, 730 (1986)). 0011. The discovery of a PUFA PKS system in marine Analysis of a genomic fragment (cloned as plasmid pEPA) bacteria such as Shewanella and Vibrio marinus (see U.S. from Shewanella sp. strain SCRC2738 led to the identifi Pat. No. 6,140,486, ibid.) provides a resource for new cation of five open reading frames (Orfs), totaling 20 Kb, methods of commercial PUFA production. However, these that are necessary and sufficient for EPA production in E. marine bacteria have limitations which may ultimately coli (Yazawa, (1996), supra). Several of the predicted pro restrict their usefulness on a commercial level. First, tein domains were homologues of FAS enzymes, while other although U.S. Pat. No. 6,140,486 discloses that these marine regions showed no homology to proteins of known function. bacteria PUFA PKS systems can be used to genetically At least 11 regions within the five Orfs were identifiable as modify plants, the marine bacteria naturally live and grow in putative enzyme domains (See Metz et al., Science 293:290 cold marine environments and the enzyme systems of these 293 (2001)). When compared with sequences in the gene bacteria do not function well above 22°C. In contrast, many databases, seven of these were more strongly related to PKS crop plants, which are attractive targets for genetic manipu proteins than to FAS proteins. Included in this group were lation using the PUFA PKS system, have normal growth domains putatively encoding malonyl-CoA:ACP acyltrans conditions at temperatures above 22° C. and ranging to ferase (MAT), B-ketoacyl-ACP synthase (KS), B-ketoacyl higher than 40°C. Therefore, the PUFA PKS systems from ACP reductase (KR), acyltransferase (AT), phosphopanteth these marine bacteria are not predicted to be readily adapt eine transferase, chain length (or chain initiation) factor able to plant expression under normal growth conditions. (CLF) and a highly unusual cluster of six ACP domains (i.e., Additionally, the known marine bacteria PUFAPKS systems the presence of more than two clustered ACP domains had do not directly produce triacylglyerols (TAG), whereas not previously been reported in PKS or FAS sequences). It direct production of TAG would be desirable because TAG is likely that the PKS pathway for PUFA synthesis that has are a lipid storage product, and as a result, can be accumu been identified in Shewanella is widespread in marine lated at very high levels in cells, as opposed to a “structural bacteria. Genes with high homology to the Shewanella gene lipid product (e.g. phospholipids), which can generally only cluster have been identified in Photobacterium profiundum accumulate at low levels. (Allen et al., Appli. Environ. Microbiol. 65:1710 (1999)) and in Moritella marina (Vibrio marinus) (see U.S. Pat. No. 0012. With regard to the production of eicosapentaenoic 6,140,486, ibid., and Tanaka et al., Biotechnol. Lett. 21:939 acid (EPA) in particular, researchers have tried to produce (1999)). EPA with microbes by growing them in both photosynthetic and heterotrophic cultures. They have also used both clas 0010 Polyunsaturated fatty acids (PUFAs) are consid sical and directed genetic approaches in attempts to increase ered to be useful for nutritional, pharmaceutical, industrial, the productively of the organisms under culture conditions. and other purposes. An expansive supply of PUFAs from Other researchers have attempted to produce EPA in oil-seed natural sources and from chemical synthesis are not suffi crop plants by introduction of genes encoding various cient for commercial needs. A major current Source for desaturase and elongase enzymes. PUFAs is from marine fish; however, fish stocks are declin ing, and this may not be a Sustainable resource. Additionally, 0013 Researchers have attempted to use cultures of red contamination, both heavy metal and toxic organic mol microalgae (Monodus), (e.g. Phaeodactylum), other ecules, is a serious issue with oil derived from marine fish. microalgae and fungi (e.g. Mortierella cultivated at low Vegetable oils derived from oil seed crops are relatively temperatures). However, in all cases, productivity was low inexpensive and do not have the contamination issues asso compared to existing commercial microbial production sys US 2008/0038379 A1 Feb. 14, 2008

tems for other long chain PUFAs such as DHA. In many more preferably at least about 90% identical, to an amino cases, the EPA occurred primarily in the phospholipids (PL) acid sequence selected from the group consisting of: SEQID rather than the triacylglycerols (TAG). Since productivity of NO:41, SEQID NO:66, SEQID NO:68, wherein the amino microalgae under heterotrophic growth conditions can be acid sequence has a biological activity of at least one domain much higher than under phototrophic conditions, researchers of a polyunsaturated fatty acid (PUFA) polyketide synthase have attempted, and achieved, trophic conversion by intro (PKS) system; and/or (f) a nucleic acid sequence that is fully duction of genes encoding specific Sugar transporters. How complementary to the nucleic acid sequence of (a), (b), (c), ever, even with the newly acquired heterotrophic capability, (d), or (e). In one aspect, the nucleic acid sequence encodes productivity in terms of oil remained relatively low. an amino acid sequence selected from: SEQID NO:39, SEQ 0014) Efforts to produce EPA in oil-seed crop plants by ID NO:41, SEQID NO:43, SEQID NO:45, SEQID NO:48, modification of the endogenous fatty acid biosynthesis path SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID NO:62, way have only yielded plants with very low levels of the SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, and PUFA in their oils. As discussed above, several marine biologically active fragments thereof. In one aspect, the bacteria have been shown to produce PUFAs (EPA as well nucleic acid sequence is selected from the group consisting as DHA). However, these bacteria do not produce TAG and of: SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ the EPA is found primarily in the PL membranes. The levels ID NO:44, SEQID NO:47, SEQID NO:49, SEQID NO:51, of EPA produced as well as the growth characteristics of SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID these particular marine bacteria (discussed above) limit their NO:59, SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65, utility for commercial production of EPA. and SEQ ID NO:67. 0015 Therefore, there is a need in the art for other PUFA 0017 Another embodiment of the present invention PKS systems having greater flexibility for commercial use, relates to a recombinant nucleic acid molecule comprising and for a biological system that efficiently produces quan any of the above-described nucleic acid molecules, opera tities of lipids (PL and TAG) enriched in desired PUFAs, tively linked to at least one transcription control sequence. Such as EPA, in a commercially useful production process. 0018 Yet another embodiment of the present invention SUMMARY OF THE INVENTION relates to a recombinant cell transfected with any of the above-described recombinant nucleic acid molecules. 0016 One embodiment of the present invention relates to Another embodiment of the present invention relates to a an isolated nucleic acid molecule. The nucleic acid molecule genetically modified microorganism, wherein the microor comprises a nucleic acid sequence selected from: (a) a ganism expresses a PKS system comprising at least one nucleic acid sequence encoding an amino acid sequence biologically active domain of a polyunsaturated fatty acid selected from the group consisting of: SEQID NO:39, SEQ (PUFA) polyketide synthase (PKS) system, wherein the at ID NO:41, SEQID NO:43, SEQID NO:45, SEQID NO:48, least one domain of the PUFA PKS system comprises an SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID amino acid sequence selected from: (a) an amino acid NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID NO:62, sequence selected from the group consisting of SEQ ID SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68 and NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, biologically active fragments thereof; (b) a nucleic acid SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID sequence encoding an amino acid sequence that is at least NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, about 60% identical, and more preferably at least about 70% SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID identical, and more preferably at least about 80% identical, NO:68 and biologically active fragments thereof; (b) an and more preferably at least about 90% identical, to an amino acid sequence that is at least about 60% identical, and amino acid sequence selected from the group consisting of more preferably at least about 70% identical, and more SEQ ID NO:39, SEQ ID NO:43, SEQ ID NO:50, SEQ ID preferably at least about 80% identical, and more preferably NO:52, SEQ ID NO:56 and SEQ ID NO:58, wherein the at least about 90% identical, to an amino acid sequence amino acid sequence has a biological activity of at least one selected from the group consisting of: SEQID NO:39, SEQ domain of a polyunsaturated fatty acid (PUFA) polyketide ID NO:43, SEQID NO:50, SEQID NO:52, SEQID NO:56 synthase (PKS) system; (c) a nucleic acid sequence encod and SEQID NO:58, wherein the amino acid sequence has a ing an amino acid sequence that is at least about 65% biological activity of at least one domain of a polyunsatu identical, and more preferably at least about 70% identical, rated fatty acid (PUFA) polyketide synthase (PKS) system; and more preferably at least about 80% identical, and more (c) an amino acid sequence that is at least about 65% preferably at least about 90% identical, to SEQ ID NO:54, identical, and more preferably at least about 70% identical, wherein the amino acid sequence has a biological activity of and more preferably at least about 80% identical, and more at least one domain of a polyunsaturated fatty acid (PUFA) preferably at least about 90% identical, to SEQ ID NO:54, polyketide synthase (PKS) system; (d) a nucleic acid wherein the amino acid sequence has a biological activity of sequence encoding an amino acid sequence that is at least at least one domain of a polyunsaturated fatty acid (PUFA) about 70% identical, and more preferably at least about 80% polyketide synthase (PKS) system; (d) an amino acid identical, and more preferably at least about 90% identical, sequence that is at least about 70% identical, and more to an amino acid sequence selected from the group consist preferably at least about 80% identical, and more preferably ing of: SEQ ID NO:45, SEQ ID NO:48, SEQ ID NO:60, at least about 90% identical, to an amino acid sequence SEQ ID NO:62 and SEQID NO:64, wherein the amino acid selected from the group consisting of: SEQID NO:45, SEQ sequence has a biological activity of at least one domain of ID NO:48, SEQ ID NO:60, SEQ ID NO:62 and SEQ ID a polyunsaturated fatty acid (PUFA) polyketide synthase NO:64, wherein the amino acid sequence has a biological (PKS) system; (e) a nucleic acid sequence encoding an activity of at least one domain of a polyunsaturated fatty acid amino acid sequence that is at least about 80% identical, and (PUFA) polyketide synthase (PKS) system; and/or (e) an US 2008/0038379 A1 Feb. 14, 2008

amino acid sequence that is at least about 80% identical, and NO:28, SEQ ID NO:30, SEQ ID NO:32, and biologically more preferably at least about 90% identical, to an amino active fragments thereof, and (d) a domain comprising an acid sequence selected from the group consisting of: SEQID amino acid sequence that is at least about 60% identical, and NO:41, SEQID NO:66, SEQID NO:68, wherein the amino more preferably at least about 70% identical, and more acid sequence has a biological activity of at least one domain preferably at least about 80% identical, and more preferably of a polyunsaturated fatty acid (PUFA) polyketide synthase at least about 90% identical, to the amino acid sequence of (PKS) system. The microorganism is genetically modified to (c), wherein the amino acid sequence has a biological affect the activity of the PKS system. activity of at least one domain of a polyunsaturated fatty acid 0019. In one aspect, the microorganism is genetically (PUFA) polyketide synthase (PKS) system. In one aspect, modified by transfection with a recombinant nucleic acid recombinant nucleic acid molecule encodes a phosphopan molecule encoding the at least one domain of a polyunsatu tetheline transferase. In one aspect, the second PKS system rated fatty acid (PUFA) polyketide synthase (PKS) system. is selected from the group consisting of a bacterial PUFA For example, the microorganism can include a Thraus PKS system, a type I PKS system, a type II PKS system, a tochytrid, such as a Schizochytrium. In one aspect, such a modular PKS system, and a non-bacterial PUFAPKS system microorganism has been further genetically modified to (e.g., a eukaryotic PUFA PKS system, such as a Thraus recombinantly express at least one nucleic acid molecule tochytrid PUFA PKS system, including, but not limited to a encoding at least one biologically active domain from a PKS Schizochytrium PUFA PKS system). system selected from the group consisting of a bacterial 0022. Yet another embodiment of the present invention PUFA PKS system, a Type I PKS system, a Type II PKS relates to a genetically modified plant, wherein the plant has system, a modular PKS system, and a non-bacterial PUFA been genetically modified to recombinantly express a PKS PKS system. The non-bacterial PUFA PKS system can system comprising at least one biologically active domain of include a Thraustochytrid PUFA PKS system and in one a polyunsaturated fatty acid (PUFA) polyketide synthase aspect, a Schizochytrium PUFA PKS system. (PKS) system, wherein the domain comprises an amino acid 0020. In another aspect, the microorganism endog sequence selected from the group consisting of: (a) an amino enously expresses a PKS system comprising the at least one acid sequence selected from the group consisting of: SEQID domain of the PUFA PKS system, and wherein the genetic NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, modification is in a nucleic acid sequence encoding at least SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID one domain of the PUFA PKS system. In another aspect, NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, Such a microorganism has been further genetically modified SEQ ID NO:62, SEQ ID NO:64, SEQID NO:66, SEQ ID to recombinantly express at least one nucleic acid molecule NO:68 and biologically active fragments thereof; (b) an encoding at least one biologically active domain from a PKS amino acid sequence that is at least about 60% identical, and system selected from the group consisting of a bacterial more preferably at least about 70% identical, and more PUFA PKS system, a Type I PKS system, a Type II PKS preferably at least about 80% identical, and more preferably system, a modular PKS system, and a non-bacterial PUFA at least about 90% identical, to an amino acid sequence PKS system (e.g., a Thraustochytrid PUFA PKS system, selected from the group consisting of: SEQID NO:39, SEQ such as a Schizochytrium PUFA PKS system). ID NO:43, SEQID NO:50, SEQID NO:52, SEQID NO:56 and SEQID NO:58, wherein the amino acid sequence has a 0021. In another aspect, the microorganism endog biological activity of at least one domain of a polyunsatu enously expresses a PUFA PKS system comprising the at rated fatty acid (PUFA) polyketide synthase (PKS) system; least one biologically active domain of a PUFAPKS system, (c) an amino acid sequence that is at least about 65% and wherein the genetic modification comprises expression identical, and more preferably at least about 70% identical, of a recombinant nucleic acid molecule selected from the and more preferably at least about 80% identical, and more group consisting of a recombinant nucleic acid molecule preferably at least about 90% identical, to SEQ ID NO:54, encoding at least one biologically active domain from a wherein the amino acid sequence has a biological activity of second PKS system and a recombinant nucleic acid mol at least one domain of a polyunsaturated fatty acid (PUFA) ecule encoding a protein that affects the activity of the polyketide synthase (PKS) system; (d) an amino acid endogenous PUFA PKS system. The biologically active sequence that is at least about 70% identical, and more domain from a second PKS system can include, but is not preferably at least about 80% identical, and more preferably limited to: (a) a domain of a polyunsaturated fatty acid at least about 90% identical, to an amino acid sequence (PUFA) polyketide synthase (PKS) system from a Thraus selected from the group consisting of: SEQID NO:45, SEQ tochytrid microorganism; (b) a domain of a PUFA PKS ID NO:48, SEQ ID NO:60, SEQ ID NO:62 and SEQ ID system from a microorganism identified by the following NO:64, wherein the amino acid sequence has a biological method: (i) selecting a microorganism that produces at least activity of at least one domain of a polyunsaturated fatty acid one PUFA; and, (ii) identifying a microorganism from (i) (PUFA) polyketide synthase (PKS) system; and/or (e) an that has an ability to produce increased PUFAs under amino acid sequence that is at least about 80% identical, and dissolved oxygen conditions of less than about 5% of more preferably at least about 90% identical, to an amino saturation in the fermentation medium, as compared to acid sequence selected from the group consisting of: SEQID production of PUFAs by the microorganism under dissolved NO:41, SEQID NO:66, SEQID NO:68, wherein the amino oxygen conditions of greater than about 5% of Saturation in acid sequence has a biological activity of at least one domain the fermentation medium; (c) a domain comprising an amino of a polyunsaturated fatty acid (PUFA) polyketide synthase acid sequence selected from the group consisting of: SEQID (PKS) system. In one aspect, the at least one domain of the NO:2, SEQID NO:4, SEQID NO:6, SEQID NO:8, SEQID PUFA PKS system comprises an amino acid sequence NO:10, SEQ ID NO:13, SEQ ID NO:18, SEQ ID NO:20, selected from the group consisting of: SEQID NO:39, SEQ SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID ID NO:41, SEQID NO:43, SEQID NO:45, SEQID NO:48, US 2008/0038379 A1 Feb. 14, 2008

SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID modification changes at least one product produced by the NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID NO:62, endogenous PKS system, as compared to an organism SEQ ID NO:64, SEQ ID NO:66 and SEQ ID NO:68 and wherein the PUFA PKS system has not been genetically biologically active fragments thereof. In one aspect, the modified. plant has been further genetically modified to recombinantly 0025. In another aspect, the organism endogenously express at least one nucleic acid molecule encoding at least expresses a PKS system comprising the at least one biologi one biologically active domain from a PKS system selected cally active domain of the PUFA PKS system, and the from the group consisting of: a bacterial PUFA PKS system, genetic modification comprises transfection of the organism a Type IPKS system, a Type II PKS system, a modular PKS with a recombinant nucleic acid molecule selected from the system, and a non-bacterial PUFA PKS system (e.g., a group consisting of a recombinant nucleic acid molecule Thraustochytrid PUFA PKS system, such as a encoding at least one biologically active domain from a Schizochytrium PUFA PKS system). second PKS system and a recombinant nucleic acid mol 0023 Yet another embodiment of the present invention ecule encoding a protein that affects the activity of the PUFA relates to a method to produce a bioactive molecule that is PKS system. In one aspect, the genetic modification changes produced by a polyketide synthase system, comprising cul at least one product produced by the endogenous PKS turing under conditions effective to produce the bioactive system, as compared to an organism that has not been molecule a genetically modified organism that expresses a genetically modified to affect PUFA production. PKS system comprising at least one biologically active domain of a polyunsaturated fatty acid (PUFA) polyketide 0026. In another aspect, the organism is genetically synthase (PKS) system, wherein the at least one domain of modified by transfection with a recombinant nucleic acid the PUFA PKS system comprises an amino acid sequence molecule encoding the at least one domain of the polyun selected from the group consisting of: (a) an amino acid saturated fatty acid (PUFA) polyketide synthase (PKS) sequence selected from the group consisting of SEQ ID system. NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, 0027. In another aspect, the organism produces a poly SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID unsaturated fatty acid (PUFA) profile that differs from the NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, naturally occurring organism without a genetic modification. SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68 and biologically active fragments thereof; (b) an 0028. In another aspect, the organism endogenously amino acid sequence that is at least about 60% identical, and expresses a non-bacterial PUFA PKS system, and wherein more preferably at least about 70% identical, and more the genetic modification comprises Substitution of a domain preferably at least about 80% identical, and more preferably from a different PKS system for a nucleic acid sequence at least about 90% identical, to an amino acid sequence encoding at least one domain of the non-bacterial PUFA selected from the group consisting of: SEQID NO:39, SEQ PKS system. ID NO:43, SEQID NO:50, SEQID NO:52, SEQ ID NO:56 0029. In yet another aspect, the organism endogenously and SEQID NO:58, wherein the amino acid sequence has a expresses a non-bacterial PUFA PKS system that has been biological activity of at least one domain of a polyunsatu modified by transfecting the organism with a recombinant rated fatty acid (PUFA) polyketide synthase (PKS) system; nucleic acid molecule encoding a protein that regulates the (c) an amino acid sequence that is at least about 65% chain length of fatty acids produced by the PUFA PKS identical, and more preferably at least about 70% identical, system. and more preferably at least about 80% identical, and more preferably at least about 90% identical, to SEQ ID NO:54, 0030. In another aspect, the bioactive molecule is wherein the amino acid sequence has a biological activity of selected from: an anti-inflammatory formulation, a chemo at least one domain of a polyunsaturated fatty acid (PUFA) therapeutic agent, an active excipient, an osteoporosis drug, polyketide synthase (PKS) system; (d) an amino acid an anti-depressant, an anti-convulsant, an anti-Heliobactor sequence that is at least about 70% identical, and more pylori drug, a drug for treatment of neurodegenerative preferably at least about 80% identical, and more preferably disease, a drug for treatment of degenerative liver disease, at least about 90% identical, to an amino acid sequence an antibiotic, and/or a cholesterol lowering formulation. In selected from the group consisting of: SEQID NO:45, SEQ one aspect, the bioactive molecule is an antibiotic. In ID NO:48, SEQ ID NO:60, SEQ ID NO:62 and SEQ ID another aspect, the bioactive molecule is a polyunsaturated NO:64, wherein the amino acid sequence has a biological fatty acid (PUFA). In yet another aspect, the bioactive activity of at least one domain of a polyunsaturated fatty acid molecule is a molecule including carbon-carbon double (PUFA) polyketide synthase (PKS) system; and/or (e) an bonds in the cis configuration. In one aspect, the bioactive amino acid sequence that is at least about 80% identical, and molecule is a molecule including a double bond at every more preferably at least about 90% identical, to an amino third carbon. In one aspect, the organism is a microorganism. acid sequence selected from the group consisting of: SEQID In another aspect, the organism is a plant. NO:41, SEQID NO:66, SEQID NO:68, wherein the amino 0031. Another embodiment of the present invention acid sequence has a biological activity of at least one domain relates to a method to produce a plant that has a polyun of a polyunsaturated fatty acid (PUFA) polyketide synthase saturated fatty acid (PUFA) profile that differs from the (PKS) system. naturally occurring plant, comprising genetically modifying 0024. In one aspect, the organism endogenously cells of the plant to express a PKS system comprising at least expresses a PKS system comprising the at least one domain one recombinant nucleic acid molecule comprising a nucleic of the PUFA PKS system, and wherein the genetic modifi acid sequence encoding at least one biologically active cation is in a nucleic acid sequence encoding the at least one domain of a PUFA PKS system, wherein the at least one domain of the PUFA PKS system. In one aspect, the genetic domain of the PUFA PKS system comprises an amino acid US 2008/0038379 A1 Feb. 14, 2008

sequence selected from the group consisting of: (a) an amino and more preferably at least about 80% identical, and more acid sequence selected from the group consisting of: SEQID preferably at least about 90% identical, to SEQ ID NO:54, NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, wherein the amino acid sequence has a biological activity of SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID at least one domain of a polyunsaturated fatty acid (PUFA) NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, polyketide synthase (PKS) system; (d) an amino acid SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID sequence that is at least about 70% identical, and more NO:68 and biologically active fragments thereof; (b) an preferably at least about 80% identical, and more preferably amino acid sequence that is at least about 60% identical, and at least about 90% identical, to an amino acid sequence more preferably at least about 70% identical, and more selected from the group consisting of: SEQID NO:45, SEQ preferably at least about 80% identical, and more preferably ID NO:48, SEQ ID NO:60, SEQ ID NO:62 and SEQ ID at least about 90% identical, to an amino acid sequence NO:64, wherein the amino acid sequence has a biological selected from the group consisting of: SEQID NO:39, SEQ activity of at least one domain of a polyunsaturated fatty acid ID NO:43, SEQID NO:50, SEQID NO:52, SEQ ID NO:56 (PUFA) polyketide synthase (PKS) system; and (e) an amino and SEQID NO:58, wherein the amino acid sequence has a acid sequence that is at least about 80% identical, and more biological activity of at least one domain of a polyunsatu preferably at least about 90% identical, to an amino acid rated fatty acid (PUFA) polyketide synthase (PKS) system; sequence selected from the group consisting of SEQ ID (c) an amino acid sequence that is at least about 65% NO:41, SEQID NO:66, SEQID NO:68, wherein the amino identical, and more preferably at least about 70% identical, acid sequence has a biological activity of at least one domain and more preferably at least about 80% identical, and more of a polyunsaturated fatty acid (PUFA) polyketide synthase preferably at least about 90% identical, to SEQ ID NO:54, (PKS) system. In one aspect, the endproduct is selected wherein the amino acid sequence has a biological activity of from: a dietary Supplement, a food product, a pharmaceuti at least one domain of a polyunsaturated fatty acid (PUFA) cal formulation, a humanized animal milk, and an infant polyketide synthase (PKS) system; (d) an amino acid formula. In one aspect, the pharmaceutical formulation is sequence that is at least about 70% identical, and more selected from the group consisting of an anti-inflammatory preferably at least about 80% identical, and more preferably formulation, a chemotherapeutic agent, an active excipient, at least about 90% identical, to an amino acid sequence an osteoporosis drug, an anti-depressant, an anti-convulsant, selected from the group consisting of: SEQID NO:45, SEQ an anti-Heliobactor pylon drug, a drug for treatment of ID NO:48, SEQ ID NO:60, SEQ ID NO:62 and SEQ ID neurodegenerative disease, a drug for treatment of degen NO:64, wherein the amino acid sequence has a biological erative liver disease, an antibiotic, and a cholesterol lower activity of at least one domain of a polyunsaturated fatty acid ing formulation. In one aspect, the endproduct is used to (PUFA) polyketide synthase (PKS) system; and (e) an amino treat a condition selected from the group consisting of acid sequence that is at least about 80% identical, and more chronic inflammation, acute inflammation, gastrointestinal preferably at least about 90% identical, to an amino acid disorder, cancer, cachexia, cardiac restenosis, neurodegen sequence selected from the group consisting of SEQ ID erative disorder, degenerative disorder of the liver, blood NO:41, SEQID NO:66, SEQID NO:68, wherein the amino lipid disorder, osteoporosis, osteoarthritis, autoimmune dis acid sequence has a biological activity of at least one domain ease, preeclampsia, preterm birth, age related maculopathy, of a polyunsaturated fatty acid (PUFA) polyketide synthase pulmonary disorder, and peroxisomal disorder. (PKS) system. 0033 Yet another embodiment of the present invention 0032) Another embodiment of the present invention relates to a method to produce a humanized animal milk, relates to a method to modify an endproduct containing at comprising genetically modifying milk-producing cells of a least one fatty acid, comprising adding to the endproduct an milk-producing animal with at least one recombinant nucleic oil produced by a recombinant host cell that expresses at acid molecule comprising a nucleic acid sequence encoding least one recombinant nucleic acid molecule comprising a at least one biologically active domain of a PUFA PKS nucleic acid sequence encoding at least one biologically system, wherein the at least one domain of the PUFA PKS active domain of a PUFA PKS system, wherein the at least system comprises an amino acid sequence selected from the one domain of a PUFAPKS system comprises an amino acid group consisting of: (a) an amino acid sequence selected sequence selected from the group consisting of: (a) an amino from the group consisting of: SEQ ID NO:39, SEQ ID acid sequence selected from the group consisting of: SEQID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:48, NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID NO:62, NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68 and SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID biologically active fragments thereof; (b) an amino acid NO:68 and biologically active fragments thereof; (b) an sequence that is at least about 60% identical, and more amino acid sequence that is at least about 60% identical, and preferably at least about 70% identical, and more preferably more preferably at least about 70% identical, and more at least about 80% identical, and more preferably at least preferably at least about 80% identical, and more preferably about 90% identical, to an amino acid sequence selected at least about 90% identical, to an amino acid sequence from the group consisting of: SEQ ID NO:39, SEQ ID selected from the group consisting of: SEQID NO:39, SEQ NO:43, SEQID NO:50, SEQID NO:52, SEQID NO:56 and ID NO:43, SEQID NO:50, SEQID NO:52, SEQ ID NO:56 SEQ ID NO:58, wherein the amino acid sequence has a and SEQID NO:58, wherein the amino acid sequence has a biological activity of at least one domain of a polyunsatu biological activity of at least one domain of a polyunsatu rated fatty acid (PUFA) polyketide synthase (PKS) system; rated fatty acid (PUFA) polyketide synthase (PKS) system; (c) an amino acid sequence that is at least about 65% (c) an amino acid sequence that is at least about 65% identical, and more preferably at least about 70% identical, identical, and more preferably at least about 70% identical, and more preferably at least about 80% identical, and more US 2008/0038379 A1 Feb. 14, 2008

preferably at least about 90% identical, to SEQ ID NO:54, nucleic acid sequence encoding at least one domain of a PKS wherein the amino acid sequence has a biological activity of system from a different microorganism is from a bacterial at least one domain of a polyunsaturated fatty acid (PUFA) PUFA PKS system. For example, the different microorgan polyketide synthase (PKS) system; (d) an amino acid ism can be a marine bacteria having a PUFA PKS system sequence that is at least about 70% identical, and more that naturally produces PUFAs at a temperature of about 25° preferably at least about 80% identical, and more preferably C. or greater. In one aspect, the marine bacteria is selected at least about 90% identical, to an amino acid sequence from the group consisting of Shewanella Olleyana and selected from the group consisting of: SEQID NO:45, SEQ Shewanella japonica. In one aspect, the domain of a PKS ID NO:48, SEQ ID NO:60, SEQ ID NO:62 and SEQ ID system from a different microorganism is from a PKS NO:64, wherein the amino acid sequence has a biological system selected from the group consisting of a Type IPKS activity of at least one domain of a polyunsaturated fatty acid system, a Type II PKS system, a modular PKS system, and (PUFA) polyketide synthase (PKS) system; and (e) an amino a PUFAPKS system from a different Thraustochytrid micro acid sequence that is at least about 80% identical, and more organism. preferably at least about 90% identical, to an amino acid 0038. In any of the above aspects, the domain of the sequence selected from the group consisting of SEQ ID endogenous PUFA PKS system can include, but is not NO:41, SEQID NO:66, SEQID NO:68, wherein the amino limited to, a domain having a biological activity of at least acid sequence has a biological activity of at least one domain one of the following proteins: malonyl-CoA:ACP acyltrans of a polyunsaturated fatty acid (PUFA) polyketide synthase ferase (MAT), B-ketoacyl-ACP synthase (KS), ketoreduc (PKS) system. tase (KR), acyltransferase (AT), FabA-like B-hydroxyacyl 0034. Another embodiment of the present invention ACP dehydrase (DH), phosphopantetheine transferase, relates to a genetically modified Thraustochytrid microor chain length factor (CLF), acyl carrier protein (ACP), enoyl ganism, wherein the microorganism has an endogenous ACP-reductase (ER), an enzyme that catalyzes the synthesis polyunsaturated fatty acid (PUFA) polyketide synthase of trans-2-acyl-ACP, an enzyme that catalyzes the reversible (PKS) system, and wherein the endogenous PUFA PKS isomerization of trans-2-acyl-ACP to cis-3-acyl-ACP, and system has been genetically modified to alter the expression an enzyme that catalyzes the elongation of cis-3-acyl-ACP profile of a polyunsaturated fatty acid (PUFA) by the to cis-5-3-keto-acyl-ACP. In any of the above aspects, the Thraustochytrid microorganism as compared to the Thraus domain of the endogenous PUFAPKS system can include an tochytrid microorganism in the absence of the genetic modi amino acid sequence selected from the group consisting of fication. (a) an amino acid sequence selected from the group con sisting of: SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, 0035) In one aspect, the endogenous PUFA PKS system SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:13, SEQ ID has been modified by mutagenesis of a nucleic acid NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, sequence that encodes at least one domain of the endog SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID enous PUFA PKS system. In one aspect, the modification is NO:32, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43, produced by targeted mutagenesis. In another aspect, the SEQ ID NO:45, SEQ ID NO:48, SEQ ID NO:50, SEQ ID modification is produced by classical mutagenesis and NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, Screening. SEQ ID NO:60, SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68 and biologically active fragments 0036). In another aspect, the endogenous PUFA PKS thereof, and (b) an amino acid sequence that is at least about system has been modified by deleting at least one nucleic 60% identical, and more preferably at least about 70% acid sequence that encodes at least one domain of the identical, and more preferably at least about 80% identical, endogenous PUFA PKS system and inserting therefore a and more preferably at least about 90% identical, to an nucleic acid sequence encoding a homologue of the endog amino acid sequence of (a), wherein the amino acid enous domain to alter the PUFA production profile of the sequence has a biological activity of at least one domain of Thraustochytrid microorganism, wherein the homologue has a polyunsaturated fatty acid (PUFA) polyketide synthase a biological activity of at least one domain of a PKS system. (PKS) system. In one aspect, the homologue of the endogenous domain comprises a modification, as compared to the endogenous 0039. In one aspect, the PUFA production profile is domain, selected from the group consisting of at least one altered to initiate, increase or decrease production of eicosa deletion, insertion or Substitution that results in an alteration pentaenoic acid (EPA) by the microorganism. In another of PUFA production profile by the microorganism. In aspect, the PUFA production profile is altered to initiate, another aspect, the amino acid sequence of the homologue increase or decrease production of docosahexaenoic acid is at least about 60% identical, and more preferably about (DHA) by the microorganism. In another aspect, the PUFA 70% identical, and more preferably about 80% identical, and production profile is altered to initiate, increase or decrease more preferably about 90% identical to the amino acid production of one or both isomers of docosapentaenoic acid sequence of the endogenous domain. In one aspect, homo (DPA) by the microorganism. In another aspect, the PUFA logue of the endogenous domain is a domain from a PUFA production profile is altered to initiate, increase or decrease PKS system of another Thraustochytrid microorganism. production of arachidonic acid (ARA) by the microorgan ism. In another aspect, the Thraustochytrid is from a genus 0037. In another aspect, the endogenous PUFA PKS selected from the group consisting of Schizochytrium, system has been modified by deleting at least one nucleic Thraustochytrium, and Japonochytrium. In another aspect, acid sequence that encodes at least one domain of the the Thraustochytrid is from the genus Schizochytrium. In endogenous PUFA PKS system and inserting therefore a another aspect, the Thraustochytrid is from a Schizochytrium nucleic acid sequence encoding at least one domain of a PKS species selected from the group consisting of system from a different microorganism. In one aspect, the Schizochytrium aggregatum, Schizochytrium limacinum, US 2008/0038379 A1 Feb. 14, 2008 and Schizochytrium minutum. In another aspect, the Thraus selected from the group consisting of Shewanella Olleyana tochytrid is from the genus Thraustochytrium. and Shewanella japonica. In another aspect, the nucleic acid sequence encoding at least one domain of a PKS system is 0040 Yet another embodiment of the present invention relates to a genetically modified Schizochytrium that pro selected from the group consisting of a Type I PKS system, duces eicosapentaenoic acid (EPA), wherein the a Type II PKS system, a modular PKS system, and a Schizochytrium has an endogenous polyunsaturated fatty non-bacterial PUFA PKS system (e.g., a eukaryotic PUFA acid (PUFA) polyketide synthase (PKS) system comprising PKS system, such as a Thraustochytrid PUFA PKS system). a genetic modification in at least one nucleic acid sequence 0041 Another embodiment of the present invention that encodes at least one domain of the endogenous PUFA relates to a genetically modified Schizochytrium that pro PKS system that results in the production of EPA by the duces increased amounts of docosahexaenoic acid (DHA) as Schizochytrium. In one aspect, the Schizochytrium com compared to a non-genetically modified Schizochytrium, prises a genetic modification in at least one nucleic acid wherein the Schizochytrium has an endogenous polyunsatu sequence encoding at least one domain having a biological rated fatty acid (PUFA) polyketide synthase (PKS) system activity of at least one of the following proteins: malonyl comprising a genetic modification in at least one nucleic CoA:ACP acyltransferase (MAT), B-keto acyl-ACP syn sequence that encodes at least one domain of the endog thase (KS), ketoreductase (KR), acyltransferase (AT), FabA enous PUFA PKS system that results in increased the like B-hydroxy acyl-ACP dehydrase (DH), production of DHA by the Schizochytrium. In one aspect, at phosphopantetheline transferase, chain length factor (CLF), least one domain of the endogenous PUFA PKS system has acyl carrier protein (ACP), enoyl ACP-reductase (ER), an been modified by substitution for at least one domain of a enzyme that catalyzes the synthesis of trans-2-acyl-ACP, an PUFA PKS system from Thraustochytrium. In one aspect, enzyme that catalyzes the reversible isomerization of trans the ratio of DHA to DPA produced by the Schizochytrium is 2-acyl-ACP to cis-3-acyl-ACP, and an enzyme that catalyzes increased as compared to a non-genetically modified the elongation of cis-3-acyl-ACP to cis-5-3-keto-acyl-ACP. Schizochytrium. In one aspect, the Schizochytrium comprises a genetic modi 0042 Another embodiment of the present invention fication in at least one nucleic acid sequence encoding at relates to a method to produce lipids enriched for at least one least one domain from the open reading frame encoding selected polyunsaturated fatty acid (PUFA), comprising cul SEQ ID NO:2 of the endogenous PUFA PKS system. In one turing under conditions effective to produce the lipids a aspect, the Schizochytrium comprises a genetic modification genetically modified Thraustochytrid microorganism as in at least one nucleic acid sequence encoding at least one domain from the open reading frame encoding SEQ ID described above or a genetically modified Schizochytrium as NO:4 of the endogenous PUFA PKS system. In one aspect, described above. In one aspect, the selected PUFA is eicosa the Schizochytrium comprises a genetic modification in at pentaenoic acid (EPA). least one nucleic acid sequence encoding at least one domain 0043. Yet another embodiment of the present invention from the open reading frame encoding SEQID NO:6 of the relates to a method to produce eicosapentaenoic acid (EPA)- endogenous PUFA PKS system. In one aspect, the enriched lipids, comprising culturing under conditions effec Schizochytrium comprises a genetic modification in at least tive to produce the EPA-enriched lipids a genetically modi one nucleic acid sequence encoding at least one domain fied Thraustochytrid microorganism, wherein the having a biological activity of at least one of the following microorganism has an endogenous polyunsaturated fatty proteins: B-keto acyl-ACP synthase (KS), Fab A-like B-hy acid (PUFA) polyketide synthase (PKS) system, and droxyacyl-ACP dehydrase (DH), chain length factor (CLF), wherein the endogenous PUFAPKS system has been geneti an enzyme that catalyzes the synthesis of trans-2-acyl-ACP, cally modified in at least one domain to initiate or increase an enzyme that catalyzes the reversible isomerization of the production of EPA in the lipids of the microorganism as trans-2-acyl-ACP to cis-3-acyl-ACP, and an enzyme that compared to in the absence of the modification. catalyzes the elongation of cis-3-acyl-ACP to cis-5-B-keto acyl-ACP. In one aspect, the Schizochytrium comprises a BRIEF DESCRIPTION OF THE FIGURES genetic modification in at least one nucleic acid sequence 0044 FIG. 1 is a graphical representation of the domain encoding an amino acid sequence selected from the group structure of the Schizochytrium PUFA PKS system. consisting of SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:28 and SEQ ID NO:30 of the endogenous PUFA PKS 0045 FIG. 2 shows a comparison of domains of PUFA system. In one aspect, the Schizochytrium has been modified PKS systems from Schizochytrium and Shewanella. by deleting at least one nucleic acid sequence that encodes 0046 FIG. 3 shows a comparison of domains of PUFA at least one domain of the endogenous PUFA PKS system PKS systems from Schizochytrium and a related PKS system and inserting therefore a nucleic acid sequence encoding at from Nostoc whose product is a long chain fatty acid that least one domain of a PKS system from a non Schizochytrium microorganism. In one aspect, the non does not contain any double bonds. Schizochytrium microorganism grows and produces PUFAS DETAILED DESCRIPTION OF THE attemperature of at least about 15° C., and more preferably INVENTION at least about 20°C., and more preferably at least about 25° C., and more preferably at least about 30° C., and more 0047 The present invention generally relates to polyun preferably between about 20° C. and about 40° C. In one saturated fatty acid (PUFA) polyketide synthase (PKS) aspect, the nucleic acid sequence encoding at least one systems, to genetically modified organisms comprising Such domain of a PKS system from a non-Schizochytrium micro PUFA PKS systems, to methods of making and using such organism is from a bacterial PUFA PKS system. In one systems for the production of products of interest, including aspect, the bacterial PUFA PKS system is from a bacterium bioactive molecules and particularly, PUFAs, such as DHA. US 2008/0038379 A1 Feb. 14, 2008

DPA and EPA. As used herein, a PUFA PKS system gener were not known to possess this combination of iterative and ally has the following identifying features: (1) it produces selective enzymatic reactions, and they were not thought of PUFAs as a natural product of the system; and (2) it as being able to produce carbon-carbon double bonds in the comprises several multifunctional proteins assembled into a cis configuration. However, the PUFA PKS system complex that conducts both iterative processing of the fatty described by the present invention has the capacity to acid chain as well non-iterative processing, including trans introduce cis double bonds and the capacity to vary the cis isomerization and enoyl reduction reactions in selected reaction sequence in the cycle. cycles (See FIG. 1, for example). Reference to a PUFA PKS 0051. The present inventors propose to use these features system refers collectively to all of the genes and their of the PUFA PKS system to produce a range of bioactive encoded products that work in a complex to produce PUFAs molecules that could not be produced by the previously in an organism. Therefore, the PUFA PKS system refers described (Type II, Type I and modular) PKS systems. These specifically to a PKS system for which the natural products bioactive molecules include, but are not limited to, polyun are PUFAS. saturated fatty acids (PUFAs), antibiotics or other bioactive 0048 More specifically, first, a PUFA PKS system that compounds, many of which will be discussed below. For forms the basis of this invention produces polyunsaturated example, using the knowledge of the PUFA PKS gene fatty acids (PUFAs) as products (i.e., an organism that structures described herein, any of a number of methods can endogenously (naturally) contains such a PKS system makes be used to alter the PUFA PKS genes, or combine portions PUFAs using this system). The PUFAs referred to herein are of these genes with other synthesis systems, including other preferably polyunsaturated fatty acids with a carbon chain PKS systems, such that new products are produced. The length of at least 16 carbons, and more preferably at least 18 inherent ability of this particular type of system to do both carbons, and more preferably at least 20 carbons, and more iterative and selective reactions will enable this system to preferably 22 or more carbons, with at least 3 or more double yield products that would not be found if similar methods bonds, and preferably 4 or more, and more preferably 5 or were applied to other types of PKS systems. more, and even more preferably 6 or more double bonds, 0.052 Much of the structure of the PKS system for PUFA wherein all double bonds are in the cis configuration. It is an synthesis in the eukaryotic Thraustochytrid, Schizochytrium object of the present invention to find or create via genetic has been described in detail in U.S. Pat. No. 6,566,583. manipulation or manipulation of the endproduct, PKS sys Complete sequencing of cDNA and genomic clones in tems which produce polyunsaturated fatty acids of desired Schizochytrium by the present inventors allowed the iden chain length and with desired numbers of double bonds. tification of the full-length genomic sequence of each of Examples of PUFAs include, but are not limited to, DHA OrfA, OrfB and Orf and the complete identification of the (docosahexaenoic acid (C22:6, ()-3)), ARA (eicosatet specific domains in these Schizochytrium Orfs with homol raenoic acid or arachidonic acid (C20:4, n-6)), DPA (docosa ogy to those in Shewanella (see FIG. 2 and U.S. patent pentaenoic acid (C22:5, ()-6 or (O-3)), and EPA (eicosapen application Ser. No. 10/124,800, supra). In U.S. patent taenoic acid (C20:5, c)-3)). application Ser. No. 10/124,800, the inventors also identified 0049 Second, the PUFA PKS system described herein a Thraustochytrium species as meeting the criteria for hav incorporates both iterative and non-iterative reactions, ing a PUFA PKS system and then demonstrated that this which distinguish the system from previously described organism was likely to contain genes with homology to PKS systems (e.g., type I, type II or modular). More Schizochytrium PUFA PKS genes by Southern blot analysis. particularly, the PUFA PKS system described herein con However, the isolation and determination of the structure of tains domains that appear to function during each cycle as Such genes and the domain organization of the genes was not well as those which appear to function during only some of described in U.S. patent application Ser. No. 10/124.800. In the cycles. A key aspect of this functionality may be related the present invention, the inventors have now cloned and to the domains showing homology to the bacterial Fab-A sequenced the full-length genomic sequence of homologous enzymes. For example, the Fab-A enzyme of E. coli has open reading frames (Orfs) in this Thraustochytrid of the been shown to possess two enzymatic activities. It possesses genus Thraustochytrium (specifically, Thraustochytrium sp. a dehydration activity in which a water molecule (HO) is 23B (ATCC 20892)), and have identified the domains com abstracted from a carbon chain containing a hydroxy group, prising the PUFA PKS system in this Thraustochytrium. leaving a trans double bond in that carbon chain. In addition, Therefore, the present invention solves the above-mentioned it has an isomerase activity in which the trans double bond problem of providing additional PUFA PKS systems that is converted to the cis configuration. This isomerization is have the flexibility for commercial use. The Thraus accomplished in conjunction with a migration of the double tochytrium PUFA PKS system is described in detail below. bond position to adjacent carbons. In PKS (and FAS) 0053. The present invention also solves the above-iden systems, the main carbon chain is extended in 2 carbon tified problem for production of commercially valuable increments. One can therefore predict the number of exten lipids enriched in a desired PUFA, such as EPA, by the sion reactions required to produce the PUFA products of present inventors development of genetically modified these PKS systems. For example, to produce DHA (C22:6. microorganisms and methods for efficiently producing lipids all cis) requires 10 extension reactions. Since there are only (triacylglyerols (TAG) as well as membrane-associated 6 double bonds in the end product, it means that during some phospholipids (PL)) enriched in PUFAs by manipulation of of the reaction cycles, a double bond is retained (as a cis the polyketide synthase-like system that produces PUFAs in isomer), and in others, the double bond is reduced prior to eukaryotes, including members of the order Thraustochytri the next extension. ales such as Schizochytrium and Thraustochytrium. Specifi 0050. Before the discovery of a PUFA PKS system in cally, and by way of example, the present inventors describe marine bacteria (see U.S. Pat. No. 6,140.486), PKS systems herein a strain of Schizochytrium that has previously been US 2008/0038379 A1 Feb. 14, 2008 optimized for commercial production of oils enriched in endoplasmic reticulum, chloroplasts (in algae), lysosomes PUFA, primarily docosahexaenoic acid (DHA; C22:6 n-3) and Golgi apparatus, and their flagella (if present) consists and docosapentaenoic acid (DPA; C22:5 n-6), and that will of many fibrils. In general, bacteria are prokaryotes, while now be genetically modified such that EPA (C20:5 n-3) algae, fungi, protist, protozoa and higher plants are eukary production (or other PUFA production) replaces the DHA OteS. production, without sacrificing the oil productivity charac 0057 The PUFA PKS systems of the marine bacteria teristics of the organism. In addition, the present inventors (e.g., Shewanella sp. strain SCRC2738 and Vibrio marinus) describe herein the genetic modification of Schizochytrium are not the basis of the present invention, although the with PUFA PKS genes from Thraustochytrium to improve present invention does contemplate the use of domains from the DHA production by the Schizochytrium organism, spe these bacterial PUFA PKS systems in conjunction with cifically by altering the ratio of DHA to DPA produced by domains from the non-bacterial PUFA PKS systems of the the microorganism through the modification of the PUFA present invention. In addition, the present invention does PKS system. These are only a few examples of the technol contemplate the isolation and use of PUFA PKS gene sets ogy encompassed by the invention, as the concepts of the (and proteins and domains encoded thereby) isolated from invention can readily be applied to other production organ other bacteria (e.g. Shewanella Olleyana and Shewanella isms and other desired PUFAs as described in detail below. japonica) that will be particularly Suitable for use as sources 0054) In one embodiment, a PUFA PKS system accord of PUFA PKS genes for modifying or combining with the ing to the present invention comprises at least the following non-bacterial PUFA PKS genes described herein to produce biologically active domains: (a) at least two enoyl-ACP hybrid constructs and genetically modified microorganisms reductase (ER) domains; (b) at least six acyl carrier protein and plants. For example, according to the present invention, (ACP) domains; (c) at least two B-ketoacyl-ACP synthase genetically modified organisms can be produced which (KS) domains; (d) at least one acyltransferase (AT) domain; incorporate non-bacterial PUFA PKS functional domains (e) at least one f-ketoacyl-ACP reductase (KR) domain; (f) with bacterial PUFA PKS functional domains, as well as at least two FabA-like B-hydroxyacyl-ACP dehydrase (DH) PKS functional domains or proteins from other PKS systems domains; (g) at least one chain length factor (CLF) domain; (type I, type II, modular) or FAS systems. As discussed in and (h) at least one malonyl-CoA:ACP acyltransferase more detail below, PUFA PKS genes from two species of (MAT) domain. The functions of these domains are gener Shewanella, namely Shewanella Olleyana or Shewanella ally individually known in the art and will be described in japonica, are exemplary bacterial genes that are preferred detail below with regard to the PUFA PKS system of the for use in genetically modified microorganisms, plants, and present invention. methods of the invention. PUFAPKS systems (genes and the 0055. In another embodiment, the PUFA PKS system proteins and domains encoded thereby) from Such marine comprises at least the following biologically active domains: bacteria (e.g., Shewanella Olleyana or Shewanella japonica) (a) at least one enoyl-ACP reductase (ER) domain; (b) are encompassed by the present invention as novel PUFA multiple acyl carrier protein (ACP) domains (at least from PKS sequences. one to four, and preferably at least five, and more preferably 0058 According to the present invention, the terms/ at least six, and even more preferably seven, eight, nine, or phrases “Thraustochytrid”, “Thraustochytriales microorgan more than nine); (c) at least two B-ketoacyl-ACP synthase ism” and “microorganism of the order Thraustochytriales' (KS) domains; (d) at least one acyltransferase (AT) domain; can be used interchangeably and refer to any members of the (e) at least one f-ketoacyl-ACP reductase (KR) domain; (f) order Thraustochytriales, which includes both the family at least two FabA-like B-hydroxyacyl-ACP dehydrase (DH) Thraustochytriaceae and the family Labyrinthulaceae. The domains; (g) at least one chain length factor (CLF) domain; terms "Labyrinthulid’ and “Labyrinthulaceae are used and (h) at least one malonyl-CoA:ACP acyltransferase herein to specifically refer to members of the family Laby (MAT) domain. Preferably, such a PUFA PKS system is a rinthulaceae. To specifically reference Thraustochytrids that non-bacterial PUFA-PKS system. are members of the family Thraustochytriaceae, the term “Thraustochytriaceae is used herein. Thus, for the present 0056. In one embodiment, a PUFA PKS system of the invention, members of the Labyrinthulids are considered to present invention is a non-bacterial PUFA PKS system. In be included in the Thraustochytrids. other words, in one embodiment, the PUFA PKS system of the present invention is isolated from an organism that is not 0059) Developments have resulted in frequent revision of a bacterium, or is a homologue of, or derived from, a PUFA the of the Thraustochytrids. Taxonomic theorists PKS system from an organism that is not a bacterium, Such generally place Thraustochytrids with the algae or algae-like as a or an archaebacterium. Eukaryotes are sepa protists. However, because of taxonomic uncertainty, it rated from prokaryotes based on the degree of differentiation would be best for the purposes of the present invention to of the cells, with eukaryotes having more highly differenti consider the strains described in the present invention as ated cells and prokaryotes having less differentiated cells. In Thraustochytrids to include the following organisms: Order: general, prokaryotes do not possess a nuclear membrane, do Thraustochytriales; Family: Thraustochytriaceae (Genera: not exhibit mitosis during cell division, have only one Thraustochytrium, Schizochytrium, Japonochytrium, Apl.- chromosome, their cytoplasm contains 70S ribosomes, they anochytrium, or Elina) or Labyrinthulaceae (Genera Laby do not possess any mitochondria, endoplasmic reticulum, rinthula, Labyrinthuloides, or Labyrinthomyxa). Also, the chloroplasts, lysosomes or Golgi apparatus, their flagella (if following genera are sometimes included in either family present) consists of a single fibril. In contrast, eukaryotes Thraustochytriaceae or Labyrinthulaceae: Althornia, Coral have a nuclear membrane, they do exhibit mitosis during cell lochytrium, Diplophyry's, and Pyrrhosorus), and for the division, they have many chromosomes, their cytoplasm purposes of this invention are encompassed by reference to contains 80S ribosomes, they do possess mitochondria, a Thraustochytrid or a member of the order Thraustochytri US 2008/0038379 A1 Feb. 14, 2008

ales. It is recognized that at the time of this invention, (Perkins, 1976, pp. 279-312 in “Recent Advances in Aquatic revision in the taxonomy of Thraustochytrids places the Mycology’ (ed. E. B. G. Jones), John Wiley & Sons, New genus Labyrinthuloides in the family of Labyrinthulaceae York; Kazama, 1980, Can. J. Bot. 58:2434-2446; Barr, 1981, and confirms the placement of the two families Thraustochy Biosystems 14:359-370) have provided good evidence that triaceae and Labyrinthulaceae within the lin the Thraustochytriaceae are only distantly related to the eage. It is noted that the Labyrinthulaceae are sometimes . Additionally, genetic data representing a corre commonly called labyrinthulids or labyrinthula, or labyrin spondence analysis (a form of multivariate statistics) of 5-S thuloides and the Thraustochytriaceae are commonly called ribosomal RNA sequences indicate that Thraustochytriales thraustochytrids, although, as discussed above, for the pur are clearly a unique group of eukaryotes, completely sepa poses of clarity of this invention, reference to Thraus rate from the fungi, and most closely related to the red and tochytrids encompasses any member of the order Thraus , and to members of the Oomycetes (Mannella, tochytriales and/or includes members of both et al., 1987, Mol. Evol. 24:228-235). Most taxonomists have Thraustochytriaceae and Labyrinthulaceae. Recent taxo agreed to remove the Thraustochytrids from the Oomycetes nomic changes are summarized below. (Bartnicki-Garcia, 1987, pp. 389-403 in “Evolutionary Biol ogy of the Fungi' (eds. Rayner, A. D. M., Brasier, C. M. & 0060 Strains of certain unicellular microorganisms dis Moore, D.), Cambridge University Press, Cambridge). closed herein are members of the order Thraustochytriales. Thraustochytrids are marine eukaryotes with an evolving 0064. In summary, employing the taxonomic system of taxonomic history. Problems with the taxonomic placement Cavalier-Smith (Cavalier-Smith, 1981, BioSystems 14:461 of the Thraustochytrids have been reviewed by Moss (1986), 481, 1983; Cavalier-Smith, 1993, Microbiol Rev. 57:953 Bahnweb and Jackle (1986) and Chamberlain and Moss 994), the Thraustochytrids are classified with the chro (1988). mophyte algae in the Chromophyta (). This taxonomic placement has been more 0061 For convenience purposes, the Thraustochytrids recently reaffirmed by Cavalier-Smith et al. using the 18s were first placed by taxonomists with other colorless rRNA signatures of the Heterokonta to demonstrate that Zoosporic eukaryotes in the Phycomycetes (algae-like Thraustochytrids are chromists not Fungi (Cavalier-Smith et fungi). The name Phycomycetes, however, was eventually al., 1994, Phil. Tran. Roy. Soc. London Series BioSciences dropped from taxonomic status, and the Thraustochytrids 346:387-397). This places the Thraustochytrids in a com were retained in the Oomycetes (the biflagellate Zoosporic pletely different kingdom from the fungi, which are all fungi). It was initially assumed that the Oomycetes were placed in the kingdom Eufungi. related to the algae, and eventually a wide range of ultrastructural and biochemical studies, Summarized by 0065 Currently, there are 71 distinct groups of eukary Barr (Barr, 1981, Biosystems 14:359-370) supported this otic organisms (Patterson 1999) and within these groups assumption. The Oomycetes were in fact accepted by four major lineages have been identified with some confi Leedale (Leedale, 1974, Taxon 23:261-270) and other phy dence: (1) Alveolates, (2) Stramenopiles, (3) a Land Plant cologists as part of the heterokont algae. However, as a green algae-Rhodophyte Glaucophyte ("plant') clade and matter of convenience resulting from their heterotrophic (4) an Opisthokont clade (Fungi and Animals). Formerly nature, the Oomycetes and Thraustochytrids have been these four major lineages would have been labeled King largely studied by mycologists (scientists who study fungi) doms but use of the "kingdom’ concept is no longer con rather than phycologists (scientists who study algae). sidered useful by some researchers. 0062 From another taxonomic perspective, evolutionary 0066. As noted by Armstrong, Stramenopile refers to biologists have developed two general Schools of thought as three-parted tubular hairs, and most members of this lineage to how eukaryotes evolved. One theory proposes an exog have flagella bearing such hairs. Motile cells of the Stra enous origin of membrane-bound organelles through a series menopiles (unicellular organisms, sperm, Zoopores) are of endosymbioses (Margulis, 1970, Origin of Eukaryotic asymmetrical having two laterally inserted flagella, one Cells. Yale University Press, New Haven); e.g., mitochon long, bearing three-parted tubular hairs that reverse the dria were derived from bacterial endosymbionts, chloro thrust of the flagellum, and one short and smooth. Formerly, plasts from cyanophytes, and flagella from Spirochaetes. The when the group was less broad, the Stramenopiles were other theory suggests a gradual evolution of the membrane called Kingdom Chromista or the heterokont (=different bound organelles from the non-membrane-bounded systems flagella) algae because those groups consisted of the Brown of the prokaryote ancestor via an autogenous process (Cava Algae or Phaeophytes, along with the yellow-green Algae, lier-Smith, 1975, Nature (Lond.) 256:462-468). Both groups Golden-brown Algae, and Diatoms. Sub of evolutionary biologists however, have removed the sequently some heterotrophic, fungal-like organisms, the Oomycetes and Thraustochytrids from the fungi and place water molds, and labyrinthulids (slime net amoebas), were them either with the chromophyte algae in the kingdom found to possess similar motile cells, so a group name Chromophyta (Cavalier-Smith, 1981, BioSystems 14:461 referring to photosynthetic pigments or algae became inap 481) (this kingdom has been more recently expanded to propriate. Currently, two of the families within the Stra include other protists and members of this kingdom are now menopile lineage are the Labyrinthulaceae and the Thraus called Stramenopiles) or with all algae in the kingdom tochytriaceae. Historically, there have been numerous Protoctista (Margulis and Sagen, 1985, Biosystems 18:141 classification strategies for these unique microorganisms and they are often classified under the same order (i.e., 147). Thraustochytriales). Relationships of the members in these 0063 With the development of electron microscopy, groups are still developing. Porter and Leander have devel studies on the ultrastructure of the Zoospores of two genera oped data based on 18S small subunit ribosomal DNA of Thraustochytrids, Thraustochytrium and Schizochytrium, indicating the thraustochytrid-labyrinthulid clade in mono US 2008/0038379 A1 Feb. 14, 2008 phyletic. However, the clade is supported by two branches: elongation enzymes like those described for other eukary the first contains three species of Thraustochytrium and otes (Parker-Barnes et al., 2000, supra; Shanklin et al., 1998, Ulkenia profilinda, and the second includes three species of Supra). These fractionation data contrast with those obtained Labyrinthula, two species of Labyrinthuloides and from the Shewanella enzymes (See Metz et al., 2001, Supra) Schizochytrium aggregatum. and may indicate use of a different (Soluble) acyl acceptor 0067. The taxonomic placement of the Thraustochytrids molecule. Such as CoA, by the Schizochytrium enzyme. It is as used in the present invention is therefore Summarized expected that Thraustochytrium will have a similar bio below: chemistry. 0071. In U.S. Pat. No. 6,566,583, a cDNA library from Kingdom: Chromophyta (Stramenopiles) Schizochytrium was constructed and approximately 8500 Phylum: Heterokonta random clones (ESTs) were sequenced. Sequences that exhibited homology to 8 of the 11 domains of the Order: Thraustochytriales (Thraustochytrids) Shewanella PKS genes shown in FIG. 2 were all identified Family: Thraustochytriaceae or Labyrinthulaceae at frequencies of 0.2-0.5%. In U.S. Pat. No. 6,566,583, several cDNA clones from Schizochytrium showing homol Genera: Thraustochytrium, Schizochytrium, ogy to the Shewanella PKS genes were sequenced, and Japonochytrium, , Elina, Labyrinthula, various clones were assembled into nucleic acid sequences Labyrinthuloides, or Labyrinthulomyxa representing two partial open reading frames and one com 0068. Some early taxonomists separated a few original plete open reading frame. members of the genus Thraustochytrium (those with an 0072 Further sequencing of cDNA and genomic clones amoeboid life stage) into a separate genus called Ulkenia. by the present inventors allowed the identification of the However it is now known that most, if not all, Thraus full-length genomic sequence of each of OrfA, OrfB and tochytrids (including Thraustochytrium and OrfQ in Schizochytrium and the complete identification of Schizochytrium), exhibit amoeboid stages and as such, Ulk the domains in Schizochytrium with homology to those in enia is not considered by Some to be a valid genus. As used Shewanella (see FIG. 2). These genes are described in detail herein, the genus Thraustochytrium will include Ulkenia. in U.S. patent application Ser. No. 10/124,800, supra and are 0069. Despite the uncertainty of taxonomic placement described in some detail below. within higher classifications of Phylum and Kingdom, the 0073. The present inventors have now identified, cloned, Thraustochytrids remain a distinctive and characteristic and sequenced the full-length genomic sequence of homolo grouping whose members remain classifiable within the gous Orfs in a Thraustochytrid of the genus Thraus order Thraustochytriales. tochytrium (specifically, Thraustochytrium sp. 23B (ATCC 20892)) and have identified the domains comprising the 0070 Schizochytrium is a Thraustochytrid marine micro PUFA PKS system in this Thraustochytrium. organism that accumulates large quantities of triacylglycer ols rich in DHA and docosapentaenoic acid (DPA; 22:5 0074 Based on the comparison of the domains of the (O-6); e.g., 30% DHA+DPA by dry weight (Barclay et al., J. PUFA PKS system of Schizochytrium with the domains of Appl. Phycol. 6, 123 (1994)). In eukaryotes that synthesize the PUFA PKS system of Shewanella, clearly, the 20- and 22-carbon PUFAs by an elongation/desaturation Schizochytrium genome encodes proteins that are highly pathway, the pools of 18-, 20- and 22-carbon intermediates similar to the proteins in Shewanella that are capable of are relatively large so that in Vivo labeling experiments using catalyzing EPA synthesis. The proteins in Schizochytrium 'C-acetate reveal clear precursor-product kinetics for the constitute a PUFA PKS system that catalyzes DHA and DPA predicted intermediates (Gellerman et al., Biochim. Biophys. synthesis. Simple modification of the reaction scheme iden Acta 573:23 (1979)). Furthermore, radiolabeled intermedi tified for Shewanella will allow for DHA synthesis in ates provided exogenously to Such organisms are converted Schizochytrium. The homology between the prokaryotic to the final PUFA products. The present inventors have Shewanella and eukaryotic Schizochytrium genes suggests shown that 1-'C-acetate was rapidly taken up by that the PUFA PKS has undergone lateral gene transfer. Schizochytrium cells and incorporated into fatty acids, but at 0075) A similar comparison can be made for Thraus the shortest labeling time (1 min), DHA contained 31% of tochytrium. In all cases, comparison of the Thraus the label recovered in fatty acids, and this percentage tochytrium 23B (Th. 23B) PUFA PKS proteins or domains remained essentially unchanged during the 10-15 min of to other known sequences revealed that the closest match '''C-acetate incorporation and the subsequent 24 hours of was one of the Schizochytrium PUFA PKS proteins (OrfA, culture growth. Similarly, DPA represented 10% of the label B or C, or a domain therefrom) as described in U.S. patent throughout the experiment. There is no evidence for a application Ser. No. 10/124,800, supra. The next closest precursor-product relationship between 16- or 18-carbon matches in all cases were to one of the PUFA PKS proteins fatty acids and the 22-carbon polyunsaturated fatty acids. from marine bacteria (Shewanella SCRC-2738, Shewanella These results are consistent with rapid synthesis of DHA Oneidensis, Photobacter profindum and Moritella marina) from '''C-acetate involving very small (possibly enzyme or from a related system found in nitrogen fixing cyanobac bound) pools of intermediates. A cell-free homogenate teria (e.g., Nostoc punctiforme and Nostoc sp. PCC 7120). derived from Schizochytrium cultures incorporated 1-''C- The products of the cyanobacterial enzyme systems lack malonyl-CoA into DHA, DPA, and saturated fatty acids. The double bonds and the proteins lack domains related to the same biosynthetic activities were retained by a 100,000xg DH domains implicated in cis double bond formation (i.e., Supernatant fraction but were not present in the membrane the FabA related DH domains). pellet. Thus, DHA and DPA synthesis in Schizochytrium 0076 According to the present invention, the phrase does not involve membrane-bound desaturases or fatty acid “open reading frame' is denoted by the abbreviation "Orf. US 2008/0038379 A1 Feb. 14, 2008

It is noted that the protein encoded by an open reading frame in the physiology of the organism (Magnuson et al., Micro can also be denoted in all upper case letters as “ORF and biol. Rev. 57, 522 (1993)). The two KS domains of the a nucleic acid sequence for an open reading frame can also PUFA-PKS systems could have distinct roles in the PUFA be denoted in all lower case letters as “orf, but for the sake biosynthetic reaction sequence. of consistency, the spelling "Orf is preferentially used 0081. As a class of enzymes, KS's have been well herein to describe either the nucleic acid sequence or the characterized. The sequences of many verified KS genes are protein encoded thereby. It will be obvious from the context known, the active site motifs have been identified and the of the usage of the term whether a protein or nucleic acid crystal structures of several have been determined. Proteins sequence is referenced. (or domains of proteins) can be readily identified as belong Schizochytrium PUFA PKS ing to the KS family of enzymes by homology to known KS Sequences. 0.077 FIG. 1 is a graphical representation of the three open reading frames from the Schizochytrium PUFA PKS 0082 The second domain in OrfA is a malonyl system, and includes the domain structure of this PUFAPKS CoA:ACP acyltransferase (MAT) domain, also referred to system. As described in detail in U.S. patent application Ser. herein as OrfA-MAT. This domain is contained within the No. 10/124,800, the domain structure of each open reading nucleotide sequence spanning from a starting point of frame is as follows: between about positions 1723 and 1798 of SEQ ID NO:1 (OrfA) to an ending point of between about positions 2805 Open Reading Frame A (OrfA): and 3000 of SEQ ID NO:1. The nucleotide sequence con taining the sequence encoding the OrfA-MAT domain is 0078. The complete nucleotide sequence for OrfA is represented herein as SEQID NO:9 (positions 1723-3000 of represented herein as SEQ ID NO:1. OrfA is a 8730 nucle SEQ ID NO:1). The amino acid sequence containing the otide sequence (not including the stop codon) which encodes MAT domain spans from a starting point of between about a 2910 amino acid sequence, represented herein as SEQ ID positions 575 and 600 of SEQID NO:2 (OrfA) to an ending NO:2. Within OrfA are twelve domains: (a) one B-ketoacyl point of between about positions 935 and 1000 of SEQ ID ACP synthase (KS) domain; (b) one malonyl-CoA:ACP NO:2. The amino acid sequence containing the OrfA-MAT acyltransferase (MAT) domain; (c) nine acyl carrier protein domain is represented herein as SEQ ID NO:10 (positions (ACP) domains; and (d) one B-ketoacyl-ACP reductase 575-1000 of SEQ ID NO:2). It is noted that the OrfA-MAT (KR) domain. The nucleotide sequence for OrfA has been domain contains an active site motif: GHS*XG (acyl deposited with GenBank as Accession No. AF378327 binding site S7), represented herein as SEQ ID NO:11. (amino acid sequence Accession No. AAK728879). 0083. According to the present invention, a domain or 0079. The first domain in Schizochytrium OrfA is a protein having malonyl-CoA:ACP acyltransferase (MAT) B-ketoacyl-ACP synthase (KS) domain, also referred to biological activity (function) is characterized as one that herein as OrfA-KS. This domain is contained within the transfers the malonyl moiety from malonyl-CoA to ACP. nucleotide sequence spanning from a starting point of The term “malonyl-CoA:ACP acyltransferase' can be used between about positions 1 and 40 of SEQID NO:1 (OrfA) interchangeably with “malonyl acyltransferase' and similar to an ending point of between about positions 1428 and 1500 derivatives. In addition to the active site motif (GXSXG), of SEQ ID NO:1. The nucleotide sequence containing the these enzymes possess an extended motif (R and Q amino sequence encoding the OrfA-KS domain is represented acids in key positions) that identifies them as MAT enzymes herein as SEQID NO:7 (positions 1-1500 of SEQID NO:1). (in contrast to the AT domain of Schizochytrium Orf B). In The amino acid sequence containing the KS domain spans some PKS systems (but not the PUFA PKS domain) MAT from a starting point of between about positions 1 and 14 of domains will preferentially load methyl- or ethyl-malonate SEQ ID NO:2 (OrfA) to an ending point of between about on to the ACP group (from the corresponding CoA ester), positions 476 and 500 of SEQ ID NO:2. The amino acid thereby introducing branches into the linear carbon chain. sequence containing the OrfA-KS domain is represented MAT domains can be recognized by their homology to herein as SEQID NO:8 (positions 1-500 of SEQID NO:2). known MAT sequences and by their extended motif struc It is noted that the OrfA-KS domain contains an active site ture. motif: DXAC* (*acyl binding site Cs). 0084 Domains 3-11 of OrfA are nine tandem acyl carrier 0080 According to the present invention, a domain or protein (ACP) domains, also referred to hereinas OrfA-ACP protein having fB-ketoacyl-ACP synthase (KS) biological (the first domain in the sequence is OrfA-ACP1, the second activity (function) is characterized as the enzyme that carries domain is OrfA-ACP2, the third domain is OrfA-ACP3, out the initial step of the FAS (and PKS) elongation reaction etc.). The first ACP domain, OrfA-ACP1, is contained cycle. The term “B-ketoacyl-ACP synthase' can be used within the nucleotide sequence spanning from about position interchangeably with the terms “3-ketoacyl-ACP synthase'. 3343 to about position 3600 of SEQ ID NO:1 (OrfA). The “B-ketoacyl-ACP synthase', and “keto-acyl ACP synthase'. nucleotide sequence containing the sequence encoding the and similar derivatives. The acyl group destined for elon OrfA-ACP1 domain is represented herein as SEQID NO:12 gation is linked to a cysteine residue at the active site of the (positions 3343-3600 of SEQ ID NO:1). The amino acid enzyme by a thioester bond. In the multi-step reaction, the sequence containing the first ACP domain spans from about acyl-enzyme undergoes condensation with malonyl-ACP to position 1115 to about position 1200 of SEQID NO:2. The form -ketoacyl-ACP, CO, and free enzyme. The KS plays a amino acid sequence containing the OrfA-ACP1 domain is key role in the elongation cycle and in many systems has represented herein as SEQ ID NO:13 (positions 1115-1200 been shown to possess greater Substrate specificity than of SEQ ID NO:2). It is noted that the OrfA-ACP1 domain other enzymes of the reaction cycle. For example, E. coli has contains an active site motif: LGIDS* (pantetheline binding three distinct KS enzymes—each with its own particular role motif Ss.), represented herein by SEQ ID NO:14. US 2008/0038379 A1 Feb. 14, 2008

0085. The nucleotide and amino acid sequences of all ID NO:17 (positions 6598-8730 of SEQ ID NO:1). The nine ACP domains are highly conserved and therefore, the amino acid sequence containing the KR domain spans from sequence for each domain is not represented herein by an a starting point of about position 2200 of SEQ ID NO:2 individual sequence identifier. However, based on the infor (OrfA) to an ending point of about position 2910 of SEQID mation disclosed herein, one of skill in the art can readily NO:2. The amino acid sequence containing the OrfA-KR determine the sequence containing each of the other eight domain is represented herein as SEQ ID NO:18 (positions ACP domains (see discussion below). 2200-2910 of SEQ ID NO:2). Within the KR domain is a 0.086 All nine ACP domains together span a region of core region with homology to short chain aldehyde-dehy OrfA of from about position 3283 to about position 6288 of drogenases (KR is a member of this family). This core SEQID NO:1, which corresponds to amino acid positions of region spans from about position 7198 to about position from about 1095 to about 2096 of SEQ ID NO:2. The 7500 of SEQ ID NO:1, which corresponds to amino acid nucleotide sequence for the entire ACP region containing all positions 2400-2500 of SEQ ID NO:2. nine domains is represented herein as SEQ ID NO:16. The 0089. According to the present invention, a domain or region represented by SEQ ID NO:16 includes the linker protein having fB-ketoacyl-ACP reductase (KR) activity is segments between individual ACP domains. The repeat characterized as one that catalyzes the pyridine-nucleotide interval for the nine domains is approximately every 330 dependent reduction of 3-ketoacyl forms of ACP. The term nucleotides of SEQ ID NO:16 (the actual number of amino “B-ketoacyl-ACP reductase' can be used interchangeably acids measured between adjacent active site serines ranges with the terms “ketoreductase”, “3-ketoacyl-ACP reduc from 104 to 116 amino acids). Each of the nine ACP tase, “keto-acyl ACP reductase' and similar derivatives of domains contains a pantetheline binding motif LGIDS* the term. It is the first reductive step in the de novo fatty acid (represented herein by SEQ ID NO:14), wherein S* is the biosynthesis elongation cycle and a reaction often performed pantetheline binding site serine (S). The pantetheline binding in polyketide biosynthesis. Significant sequence similarity is site serine (S) is located near the center of each ACP domain observed with one family of enoyl-ACP reductases (ER), the sequence. At each end of the ACP domain region and other reductase of FAS (but not the ER family present in the between each ACP domain is a region that is highly enriched PUFA PKS system), and the short-chain alcohol dehydro for proline (P) and alanine (A), which is believed to be a genase family. Pfam analysis of the PUFA PKS region linker region. For example, between ACP domains 1 and 2 indicated above reveals the homology to the short-chain is the sequence: APAPVKAAAPAAPVASAPAPA, repre alcohol dehydrogenase family in the core region. Blast sented herein as SEQID NO:15. The locations of the active analysis of the same region reveals matches in the core area site serine residues (i.e., the pantetheline binding site) for to known KR enzymes as well as an extended region of each of the nine ACP domains, with respect to the amino homology to domains from the other characterized PUFA acid sequence of SEQ ID NO:2, are as follows: ACP1 = PKS systems. Sls; ACP2=See: ACP3=S77, ACP4=Sass, ACP5= Open Reading Frame B (OrfB): So, ACP6=Szs: ACP7=Ss: ACP8=So; and ACP9= Sosa. Given that the average size of an ACP domain is about 0090 The complete nucleotide sequence for OrfB is 85 amino acids, excluding the linker, and about 110 amino represented herein as SEQ ID NO:3. OrfB is a 6177 nucle acids including the linker, with the active site serine being otide sequence (not including the stop codon) which encodes approximately in the center of the domain, one of skill in the a 2059 amino acid sequence, represented herein as SEQ ID art can readily determine the positions of each of the nine NO:4. Within OrfB are four domains: (a) one B-ketoacyl ACP domains in OrfA. ACP synthase (KS) domain; (b) one chain length factor (CLF) domain; (c) one acyltransferase (AT) domain; and, (d) 0087. According to the present invention, a domain or one enoyl-ACP reductase (ER) domain. The nucleotide protein having acyl carrier protein (ACP) biological activity sequence for OrfB has been deposited with GenBank as (function) is characterized as being Small polypeptides (typi Accession No. AF378328 (amino acid sequence Accession cally, 80 to 100 amino acids long), that function as carriers No. AAK728880). for growing fatty acyl chains via a thioester linkage to a covalently bound co-factor of the protein. They occur as 0091. The first domain in OrfB is a B-ketoacyl-ACP separate units or as domains within larger proteins. ACPs are synthase (KS) domain, also referred to herein as OrfB-KS. converted from inactive apo-forms to functional holo-forms This domain is contained within the nucleotide sequence by transfer of the phosphopantetheinyl moeity of CoA to a spanning from a starting point of between about positions 1 highly conserved serine residue of the ACP. Acyl groups are and 43 of SEQ ID NO:3 (OrfB) to an ending point of attached to ACP by a thioester linkage at the free terminus between about positions 1332 and 1350 of SEQ ID NO:3. of the phosphopantetheinyl moiety. ACPs can be identified The nucleotide sequence containing the sequence encoding by labeling with radioactive pantetheline and by sequence the OrfB-KS domain is represented hereinas SEQID NO:19 (positions 1-1350 of SEQ ID NO:3). The amino acid homology to known ACPs. The presence of variations of the sequence containing the KS domain spans from a starting above mentioned motif (LGIDS) is also a signature of an point of between about positions 1 and 15 of SEQID NO:4 ACP. (OrfB) to an ending point of between about positions 444 0088 Domain 12 in OrfA is a B-ketoacyl-ACP reductase and 450 of SEQ ID NO:4. The amino acid sequence con (KR) domain, also referred to herein as OrfA-KR. This taining the OrfB-KS domain is represented herein as SEQ domain is contained within the nucleotide sequence Span ID NO:20 (positions 1-450 of SEQID NO:4). It is noted that ning from a starting point of about position 6598 of SEQID the OrfB-KS domain contains an active site motif: DXAC NO:1 to an ending point of about position 8730 of SEQ ID (acyl binding site Co.). KS biological activity and meth NO:1. The nucleotide sequence containing the sequence ods of identifying proteins or domains having Such activity encoding the OrfA-KR domain is represented herein as SEQ is described above. US 2008/0038379 A1 Feb. 14, 2008

0092. The second domain in OrfB is a chain length factor examined and very weak homology to some acyltransferases (CLF) domain, also referred to herein as OrfB-CLF. This whose specific functions have been identified (e.g. to malo domain is contained within the nucleotide sequence Span nyl-CoA:ACP acyltransferase, MAT). In spite of the weak ning from a starting point of between about positions 1378 homology to MAT, this AT domain is not believed to and 1402 of SEQ ID NO:3 (OrfB) to an ending point of function as a MAT because it does not possess an extended between about positions 2682 and 2700 of SEQ ID NO:3. motif structure characteristic of such enzymes (see MAT The nucleotide sequence containing the sequence encoding domain description, above). For the purposes of this disclo the OrfB-CLF domain is represented herein as SEQ ID sure, the functions of the AT domain in a PUFA PKS system NO:21 (positions 1378-2700 of SEQ ID NO:3). The amino include, but are not limited to: transfer of the fatty acyl group acid sequence containing the CLF domain spans from a from the OrfAACP domain(s) to water (i.e. a thioesterase— starting point of between about positions 460 and 468 of releasing the fatty acyl group as a free fatty acid), transfer of SEQ ID NO:4 (OrfB) to an ending point of between about a fatty acyl group to an acceptor Such as CoA, transfer of the positions 894 and 900 of SEQ ID NO:4. The amino acid acyl group among the various ACP domains, or transfer of sequence containing the OrfB-CLF domain is represented the fatty acyl group to a lipophilic acceptor molecule (e.g. to herein as SEQ ID NO:22 (positions 460-900 of SEQ ID lysophosphadic acid). NO:4). It is noted that the OrfB-CLF domain contains a KS 0096) The fourth domain in OrfB is an ER domain, also active site motif without the acyl-binding cysteine. referred to herein as OrfB-ER. This domain is contained 0093. According to the present invention, a domain or within the nucleotide sequence spanning from a starting protein is referred to as a chain length factor (CLF) based on point of about position 4648 of SEQ ID NO:3 (OrfB) to an the following rationale. The CLF was originally described as ending point of about position 6177 of SEQ ID NO:3. The characteristic of Type II (dissociated enzymes) PKS systems nucleotide sequence containing the sequence encoding the and was hypothesized to play a role in determining the OrfB-ER domain is represented herein as SEQ ID NO:25 number of elongation cycles, and hence the chain length, of (positions 4648-6177 of SEQ ID NO:3). The amino acid the end product. CLF amino acid sequences show homology sequence containing the ER domain spans from a starting to KS domains (and are thought to form heterodimers with point of about position 1550 of SEQ ID NO:4 (OrfB) to an a KS protein), but they lack the active site cysteine. CLF's ending point of about position 2059 of SEQ ID NO:4. The role in PKS systems is currently controversial. New evi amino acid sequence containing the OrfB-ER domain is dence (C. Bisang et al., Nature 401, 502 (1999)) suggests a represented herein as SEQ ID NO:26 (positions 1550-2059 role in priming (providing the initial acyl group to be of SEQ ID NO:4). elongated) the PKS systems. In this role the CLF domain is 0097 According to the present invention, this domain has thought to decarboxylate malonate (as malonyl-ACP), thus enoyl-ACP reductase (ER) biological activity. According to forming an acetate group that can be transferred to the KS the present invention, the term “enoyl-ACP reductase' can active site. This acetate therefore acts as the priming be used interchangeably with "enoyl reductase”, “enoyl molecule that can undergo the initial elongation (condensa ACP-reductase' and “enoyl acyl-ACP reductase'. The ER tion) reaction. Homologues of the Type II CLF have been enzyme reduces the trans-double bond (introduced by the identified as loading domains in some modular PKS sys DH activity) in the fatty acyl-ACP, resulting in fully satu tems. A domain with the sequence features of the CLF is rating those carbons. The ER domain in the PUFA-PKS found in all currently identified PUFA PKS systems and in shows homology to a newly characterized family of ER each case is found as part of a multidomain protein. enzymes (Heath et al., Nature 406, 145 (2000)). Heath and 0094. The third domain in OrfB is an AT domain, also Rock identified this new class of ER enzymes by cloning a referred to herein as OrfB-AT. This domain is contained gene of interest from Streptococcus pneumoniae, purifying within the nucleotide sequence spanning from a starting a protein expressed from that gene, and showing that it had point of between about positions 2701 and 3598 of SEQ ID ER activity in an in vitro assay. The sequence of the NO:3 (OrfB) to an ending point of between about positions Schizochytrium ER domain of OrfB shows homology to the 3975 and 4200 of SEQ ID NO:3. The nucleotide sequence S. pneumoniae ER protein. All of the PUFA PKS systems containing the sequence encoding the OrfB-AT domain is currently examined contain at least one domain with very represented herein as SEQ ID NO:23 (positions 2701-4200 high sequence homology to the Schizochytrium ER domain. of SEQ ID NO:3). The amino acid sequence containing the The Schizochytrium PUFA PKS system contains two ER AT domain spans from a starting point of between about domains (one on OrfB and one on Orfc). positions 901 and 1200 of SEQID NO:4 (OrfB) to an ending point of between about positions 1325 and 1400 of SEQID Open Reading Frame C (Orf(r): NO:4. The amino acid sequence containing the OrfB-AT 0098. The complete nucleotide sequence for Orfc is domain is represented herein as SEQ ID NO:24 (positions represented herein as SEQ ID NO:5. Orf is a 4509 nucle 901-1400 of SEQ ID NO:4). It is noted that the OrfB-AT otide sequence (not including the stop codon) which encodes domain contains an active site motif of GXS*XG (acyl a 1503 amino acid sequence, represented herein as SEQ ID binding site Sao) that is characteristic of acyltransferase NO:6. Within OrfQ are three domains: (a) two FabA-like (AT) proteins. B-hydroxyacyl-ACP dehydrase (DH) domains; and (b) one 0095) An “acyltransferase” or “AT” refers to a general enoyl-ACP reductase (ER) domain. The nucleotide sequence class of enzymes that can carry out a number of distinct acyl for Orf has been deposited with GenBank as Accession No. transfer reactions. The term “acyltransferase' can be used AF3783.29 (amino acid sequence Accession No. interchangeably with the term “acyl transferase'. The AAK728881). Schizochytrium domain shows good homology to a domain 0099] The first domain in Orf is a DH domain, also present in all of the other PUFA PKS systems currently referred to herein as Orf-DH1. This is one of two DH US 2008/0038379 A1 Feb. 14, 2008

domains in Orfc, and therefore is designated DH1. This represented herein as SEQ ID NO:32 (positions 1000-1502 domain is contained within the nucleotide sequence Span of SEQID NO:6). ER biological activity has been described ning from a starting point of between about positions 1 and above. 778 of SEQ ID NO:5 (Orf) to an ending point of between Thraustochytrium 23B PUFA PKS about positions 1233 and 1350 of SEQ ID NO:5. The nucleotide sequence containing the sequence encoding the Th. 23B Open Reading Frame A (OrfA): Orf-DH1 domain is represented herein as SEQ ID NO:27 (positions 1-1350 of SEQ ID NO:5). The amino acid 0103) The complete nucleotide sequence for Th. 23B sequence containing the DH1 domain spans from a starting OrfA is represented herein as SEQ ID NO:38. SEQ ID NO:38 encodes the following domains in Th. 23B OrfA: (a) point of between about positions 1 and 260 of SEQID NO:6 one B-ketoacyl-ACP synthase (KS) domain; (b) one malo (Orf') to an ending point of between about positions 411 nyl-CoA:ACP acyltransferase (MAT) domain; (c) eight acyl and 450 of SEQ ID NO:6. The amino acid sequence con carrier protein (ACP) domains; and (d) one f3-ketoacyl-ACP taining the Orf’-DH1 domain is represented herein as SEQ reductase (KR) domain. This domain organization is the ID NO:28 (positions 1-450 of SEQ ID NO:6). same as is present in Schizochytrium Orf A (SEQ ID NO:1) 0100. According to the present invention, this domain has with the exception that the Th. 23B Orf A has 8 adjacent FabA-like B-hydroxyacyl-ACP dehydrase (DH) biological ACP domains, while Schizochytrium Orf A has 9 adjacent activity. The term “FabA-like B-hydroxyacyl-ACP dehy ACP domains. Th. 23B OrfA is a 8433 nucleotide sequence drase' can be used interchangeably with the terms “FabA (not including the stop codon) which encodes a 2811 amino like B-hydroxyacyl-ACP dehydrase”, “B-hydroxyacyl-ACP acid sequence, represented herein as SEQ ID NO:39. The dehydrase”, “dehydrase' and similar derivatives. The char Th. 23B OrfA amino acid sequence (SEQ ID NO:39) was acteristics of both the DH domains (see below for DH2) in compared with known sequences in a standard BLAST the PUFAPKS systems have been described in the preceding search (BLAST parameters: Blastp, low complexity filter sections. This class of enzyme removes HOH from a B-ke Off, program-BLOSUM62, Gap cost-Existence: 11, Exten toacyl-ACP and leaves a trans double bond in the carbon sion 1: (BLAST described in Altschul, S. F., Madden, T. L., chain. The DH domains of the PUFA PKS systems show Schääffer, A. A., Zhang, J., Zhang, Z. Miller, W. & Lipman, homology to bacterial DH enzymes associated with their D. J. (1997) “Gapped BLAST and PSI-BLAST: a new FAS systems (rather than to the DH domains of other PKS generation of protein database search programs. Nucleic systems). A subset of bacterial DH's, the FabA-like DH's, Acids Res. 25:3389-3402, incorporated herein by reference possesses cis-trans isomerase activity (Heath et al., J. Biol. in its entirety))). At the amino acid level, the sequences with Chem., 271, 27795 (1996)). It is the homologies to the the greatest degree of homology to Th. 23B OrfA was FabA-like DHS that indicate that one or both of the DH Schizochytrium Orf A (gb AAK72879.1) (SEQ ID NO:2). domains is responsible for insertion of the cis double bonds The alignment extends over the entire query but is broken in the PUFA PKS products. into 2 pieces (due to the difference in numbers of ACP repeats). SEQ ID NO:39 first aligns at positions 6 through 0101 The second domain in Orf is a DH domain, also 1985 (including 8 ACP domains) with SEQ ID NO:2 and referred to herein as Orf-DH2. This is the second of two shows a sequence identity to SEQ ID NO:2 of 54% over DH domains in Orf , and therefore is designated DH2. This 2017 amino acids. SEQ ID NO:39 also aligns at positions domain is contained within the nucleotide sequence Span 980 through 2811 with SEQID NO:2 and shows a sequence ning from a starting point of between about positions 1351 identity to SEQID NO:2 of 43% over 1861 amino acids. In and 2437 of SEQ ID NO:5 (Orf() to an ending point of this second alignment, the match is evident for the Th. 23B between about positions 2607 and 2850 of SEQ ID NO:5. 8x ACPs in the regions of the conserved pantetheline attach The nucleotide sequence containing the sequence encoding ment site motif, but is very poor over the 1st Schizochytrium the Orf’-DH2 domain is represented herein as SEQ ID ACP domain (i.e., there is not a 9" ACP domain in the Th. NO:29 (positions 1351-2850 of SEQ ID NO:5). The amino 23B query sequence, but the Blastp output under theses acid sequence containing the DH2 domain spans from a conditions attempts to align them anyway). SEQ ID NO:39 starting point of between about positions 451 and 813 of shows the next closest identity with sequences from SEQ ID NO:6 (Orf(r) to an ending point of between about (Accession No. NP 717214) and positions 869 and 950 of SEQ ID NO:6. The amino acid Photobacter profindum (Accession No. AAL01060). sequence containing the OrfQ-DH2 domain is represented herein as SEQ ID NO:30 (positions 451-950 of SEQ ID 0.104) The first domain in Th. 23B OrfA is a KS domain, NO:6). DH biological activity has been described above. also referred to herein as Th. 23B OrfA-KS. KS domain function has been described in detail above. This domain is 0102) The third domain in Orf is an ER domain, also contained within the nucleotide sequence spanning from referred to herein as Orf-ER. This domain is contained about position 1 to about position 1500 of SEQ ID NO:38, within the nucleotide sequence spanning from a starting represented herein as SEQ ID NO:40. The amino acid point of about position 2998 of SEQ ID NO:5 (Orf) to an sequence containing the Th. 23B KS domain is a region of ending point of about position 4509 of SEQ ID NO:5. The SEQ ID NO:39 spanning from about position 1 to about nucleotide sequence containing the sequence encoding the position 500 of SEQ ID NO:39, represented herein as SEQ Orf-ER domain is represented herein as SEQ ID NO:31 ID NO:41. This region of SEQID NO:39 has a Pfam match (positions 2998-4509 of SEQ ID NO:5). The amino acid to FabB (B-ketoacyl-ACP synthase) spanning from position sequence containing the ER domain spans from a starting 1 to about position 450 of SEQ ID NO:39 (also positions 1 point of about position 1000 of SEQ ID NO:6 (Orf) to an to about 450 of SEQID NO:41). It is noted that the Th. 23B ending point of about position 1502 of SEQ ID NO:6. The OrfA-KS domain contains an active site motif: DXAC amino acid sequence containing the OrfQ-ER domain is (acyl binding site Co.,). Also, a characteristic motif at the US 2008/0038379 A1 Feb. 14, 2008 end of the Th. 23B KS region, GFGG, is present in positions can readily determine the sequence containing each of the 453-456 of SEQID NO:39 (also positions 453-456 of SEQ other seven ACP domains in SEQ ID NO:38 and SEQ ID ID NO:41). The amino acid sequence spanning positions NO:39. 1-500 of SEQ ID NO:39 is about 79% identical to 0.108 All eight Th. 23B ACP domains together span a Schizochytrium OrfA (SEQID NO:2) over 496 amino acids. region of Th. 23B OrfA of from about position 3205 to about The amino acid sequence spanning positions 1-450 of SEQ position 5994 of SEQ ID NO:38, which corresponds to ID NO:39 is about 81% identical to Schizochytrium OrfA amino acid positions of from about 1069 to about 1998 of (SEQ ID NO:2) over 446 amino acids. SEQ ID NO:39. The nucleotide sequence for the entire ACP region containing all eight domains is represented herein as 0105. The second domain in Th. 23B OrfA is a MAT SEQ ID NO:47. SEQ ID NO.47 encodes an amino acid domain, also referred to herein as Th. 23B OrfA-MAT. MAT sequence represented herein by SEQ ID NO:48. SEQ ID domain function has been described in detail above. This NO:48 includes the linker segments between individual ACP domain is contained within the nucleotide sequence Span domains. The repeat interval for the eight domains is ning from between about position 1503 and about position approximately every 116 amino acids of SEQID NO:48, and 3000 of SEQ ID NO:38, represented herein as SEQ ID each domain can be considered to consist of about 116 NO:42. The amino acid sequence containing the Th. 23B amino acids centered on the active site motif (described MAT domain is a region of SEQ ID NO:39 spanning from above). It is noted that the linker regions between the nine about position 501 to about position 1000, represented adjacent ACP domains in OrfA in Schizochytrium are highly herein by SEQID NO:43. This region of SEQID NO:39 has enriched in proline and alanine residues, while the linker a Pfam match to Fab) (malonyl-CoA:ACP acyltransferase) regions between the eight adjacent ACP domains in OrfA of spanning from about position 580 to about position 900 of Thraustochytrium are highly enriched in serine residues (and SEQ ID NO:39 (positions 80-400 of SEQ ID NO:43). It is not proline or alanine residues). noted that the Th. 23B OrfA-MAT domain contains an active 0109) The last domain in Th. 23B OrfA is a KR domain, site motif: GHS*XG (acyl binding site S7), represented also referred to herein as Th. 23B OrfA-KR. KR domain by positions 695-699 of SEQ ID NO:39. The amino acid function has been discussed in detail above. This domain is sequence spanning positions 501-1000 of SEQID NO:39 is contained within the nucleotide sequence spanning from about 46% identical to Schizochytrium OrfA(SEQID NO:2) between about position 6001 to about position 8433 of SEQ over 481 amino acids. The amino acid sequence spanning ID NO:38, represented herein by SEQID NO:49. The amino acid sequence containing the Th. 23B KR domain is a region positions 580-900 of SEQID NO:39 is about 50% identical of SEQ ID NO:39 spanning from about position 2001 to to Schizochytrium OrfA (SEQ ID NO:2) over 333 amino about position 2811 of SEQ ID NO:39, represented herein acids. by SEQ ID NO:50. This region of SEQ ID NO:39 has a 0106 Domains 3-10 of Th. 23B OrfA are eight tandem Pfam match to FabG (B-ketoacyl-ACP reductase) spanning ACP domains, also referred to herein as Th. 23B OrfA-ACP from about position 2300 to about 2550 of SEQ ID NO:39 (the first domain in the sequence is OrfA-ACP1, the second (positions 300-550 of SEQ ID NO:50). The amino acid domain is OrfA-ACP2, the third domain is OrfA-ACP3, sequence spanning positions 2001-2811 of SEQ ID NO:39 etc.). The function of ACP domains has been described in is about 40% identical to Schizochytrium OrfA (SEQ ID detail above. The first Th. 23B ACP domain, Th. 23B NO:2) over 831 amino acids. The amino acid sequence OrfA-ACP1, is contained within the nucleotide sequence spanning positions 2300-2550 of SEQ ID NO:39 is about spanning from about position 3205 to about position 3555 of 51% identical to Schizochytrium OrfA (SEQID NO:2) over SEQ ID NO:38 (OrfA), represented herein as SEQ ID 235 amino acids. NO:44. The amino acid sequence containing the first Th. Th. 23B Open Reading Frame B (Or/B): 23B ACP domain is a region of SEQ ID NO:39 spanning 0110. The complete nucleotide sequence for Th. 23B from about position 1069 to about position 1185 of SEQID OrfB is represented herein as SEQ ID NO:51. SEQ ID NO:39, represented herein by SEQ ID NO:45. The amino NO:51 encodes the following domains in Th. 23B OrfB: (a) acid sequence spanning positions 1069-1185 of SEQ ID one f3-ketoacyl-ACP synthase (KS) domain; (b) one chain NO:39 is about 65% identical to Schizochytrium OrfA (SEQ length factor (CLF) domain; (c) one acyltransferase (AT) ID NO:2) over 85 amino acids. Th. 23B OrfA-ACP1 has a domain; and, (d) one enoyl-ACP reductase (ER) domain. similar identity to any one of the nine ACP domains in This domain organization is the same as in Schizochytrium Schizochytrium OrfA. Orf B (SEQ ID NO:3) with the exception that the linker 0107 The eight ACP domains in Th. 23B OrfA are region between the AT and ER domains of the adjacent to one another and can be identified by the presence Schizochytrium protein is longer than that of Th. 23B by of the phosphopantetheline binding site motif, LGXDS* about 50-60 amino acids. Also, this linker region in (represented by SEQ ID NO:46), wherein the S* is the Schizochytrium has a specific area that is highly enriched in phosphopantetheline attachment site. The amino acid posi serine residues (it contains 15 adjacent serine residues, in tion of each of the eight S* sites, with reference to SEQ ID addition to other serines in the region), whereas the corre NO:39, are 1128 (ACP1), 1244 (ACP2), 1360 (ACP3), 1476 sponding linker region in Th. 23B OrfB is not enriched in (ACP4), 1592 (ACP5), 1708 (ACP6), 1824 (ACP7) and serine residues. This difference in the AT/ER linker region 1940 (ACP8). The nucleotide and amino acid sequences of most likely accounts for a break in the alignment between all eight Th. 23B ACP domains are highly conserved and Schizochytrium OrfB and Th. 23B OrfB at the start of this therefore, the sequence for each domain is not represented region. herein by an individual sequence identifier. However, based 0111. Th. 23B OrfB is a 5805 nucleotide sequence (not on the information disclosed herein, one of skill in the art including the stop codon) which encodes a 1935 amino acid US 2008/0038379 A1 Feb. 14, 2008

sequence, represented herein as SEQ ID NO:52. The Th. (SEQ ID NO:4) over 517 amino acids. The amino acid 23B OrfB amino acid sequence (SEQ ID NO:52) was sequence spanning positions 550-910 of SEQ ID NO:52 is compared with known sequences in a standard BLAST about 54% identical to Schizochytrium OrfB (SEQID NO:4) search (BLAST parameters: Blastp, low complexity filter over 360 amino acids. Off, program-BLOSUM62, Gap cost-Existence: 11, Exten 0114. The third domain in Th. 23B OrfB is an AT domain, sion 1: (BLAST described in Altschul, S. F., Madden, T. L., also referred to herein as Th. 23B OrfB-AT. AT domain Schääffer, A. A., Zhang, J., Zhang, Z. Miller, W. & Lipman, function has been described in detail above. This domain is D. J. (1997) “Gapped BLAST and PSI-BLAST: a new contained within the nucleotide sequence spanning from generation of protein database search programs. Nucleic between about position 3001 and about position 4500 of Acids Res. 25:3389-3402, incorporated herein by reference SEQ ID NO:51 (Th. 23B OrfB), represented herein as SEQ in its entirety))). At the amino acid level, the sequences with ID NO:58. The amino acid sequence containing the Th. 23B the greatest degree of homology to Th. 23B OrfB were AT domain is a region of SEQ ID NO: 52 spanning from Schizochytrium Orf B (gb AAK72880.1) (SEQ ID NO:4), about position 1001 to about position 1500 of SEQ ID over most of OrfB; and Schizochytrium Orf (gb NO:52, represented herein as SEQID NO:58. This region of AAK728881.1) (SEQ ID NO:6), over the last domain (the SEQ ID NO:52 has a Pfam match to FabD (malonyl alignment is broken into 2 pieces, as mentioned above). SEQ CoA:ACP acyltransferase) spanning from about position ID NO:52 first aligns at positions 10 through about 1479 1100 to about position 1375 (positions 100-375 of SEQ ID (including the KS, CLF and AT domains) with SEQID NO:4 NO:58). Although this AT domain of the PUFA synthases and shows a sequence identity to SEQID NO:4 of 52% over has homology to MAT proteins, it lacks the extended motif 1483 amino acids. SEQ ID NO:52 also aligns at positions of the MAT (key arginine and glutamine residues) and it is 1491 through 1935 (including the ER domain) with SEQID not thought to be involved in malonyl-CoA transfers. The NO:6 and shows a sequence identity to SEQ ID NO:4 of GXS*XG motif of acyltransferases is present, with the S* 64% over 448 amino acids. being the site of acyl attachment and located at position 1123 0112) The first domain in the Th. 23B OrfB is a KS with respect to SEQ ID NO:52. The amino acid sequence domain, also referred to herein as Th. 23B OrfB-KS. KS spanning positions 1001-1500 of SEQ ID NO:52 is about domain function has been described in detail above. This 44% identical to Schizochytrium OrfB (SEQ ID NO:4) over domain is contained within the nucleotide sequence Span 459 amino acids. The amino acid sequence spanning posi ning from between about position 1 and about position 1500 tions 1100-1375 of SEQID NO:52 is about 45% identical to of SEQ ID NO:51 (Th. 23B OrfB), represented herein as Schizochytrium OrfB (SEQID NO:4) over 283 amino acids. SEQID NO:53. The amino acid sequence containing the Th. 0115 The fourth domain in Th. 23B OrfB is an ER 23B KS domain is a region of SEQ ID NO: 52 spanning domain, also referred to herein as Th. 23B OrfB-ER. ER from about position 1 to about position 500 of SEQ ID domain function has been described in detail above. This NO:52, represented herein as SEQID NO:54. This region of domain is contained within the nucleotide sequence Span SEQ ID NO:52 has a Pfam match to FabB (B-ketoacyl-ACP ning from between about position 4501 and about position synthase) spanning from about position 1 to about position 5805 of SEQ ID NO:51 (OrfB), represented herein as SEQ 450 (positions 1-450 of SEQID NO:54). It is noted that the ID NO:59. The amino acid sequence containing the Th. 23B Th. 23B OrfB-KS domain contains an active site motif: ER domain is a region of SEQ ID NO: 52 spanning from DXAC, where C* is the site of acyl group attachment and about position 1501 to about position 1935 of SEQ ID wherein the C* is at position 201 of SEQ ID NO:52. Also, NO:52, represented herein as SEQID NO:60. This region of a characteristic motif at the end of the KS region, GFGG is SEQ ID NO:52 has a Pfam match to a family of dioxyge present in amino acid positions 434–437 of SEQID NO:52. nases related to 2-nitropropane dioxygenases spanning from The amino acid sequence spanning positions 1-500 of SEQ about position 1501 to about position 1810 (positions 1-310 ID NO:52 is about 64% identical to Schizochytrium OrfB of SEQID NO:60). That this domain functions as an ER can (SEQ ID NO:4) over 500 amino acids. The amino acid be further predicted due to homology to a newly character sequence spanning positions 1-450 of SEQ ID NO:52 is ized ER enzyme from Streptococcus pneumoniae. The about 67% identical to Schizochytrium OrfB (SEQID NO:4) amino acid sequence spanning positions 1501-1935 of SEQ over 442 amino acids. ID NO:52 is about 66% identical to Schizochytrium OrfB 0113. The second domain in Th. 23B OrfB is a CLF (SEQ ID NO:4) over 433 amino acids. The amino acid domain, also referred to herein as Th. 23B OrfB-CLF. CLF sequence spanning positions 1501-1810 of SEQ ID NO:52 domain function has been described in detail above. This is about 70% identical to Schizochytrium OrfB (SEQ ID domain is contained within the nucleotide sequence Span NO:4) over 305 amino acids. ning from between about position 1501 and about position 3000 of SEQ ID NO:51 (OrfB), represented herein as SEQ Th. 23B Open Reading Frame C (OrfQ): ID NO:55. The amino acid sequence containing the CLF 0.116) The complete nucleotide sequence for Th. 23B domain is a region of SEQID NO: 52 spanning from about Orf is represented herein as SEQ ID NO:61. SEQ ID position 501 to about position 1000 of SEQ ID NO:52, NO:61 encodes the following domains in Th. 23B Orfor: (a) represented herein as SEQID NO:56. This region of SEQID two FabA-like B-hydroxyacyl-ACP dehydrase (DH) NO:52 has a Pfam match to FabB (B-ketoacyl-ACP syn domains, both with homology to the FabA protein (an thase) spanning from about position 550 to about position enzyme that catalyzes the synthesis of trans-2-decenoyl 910 (positions 50-410 of SEQ ID NO:56). Although CLF ACP and the reversible isomerization of this product to has homology to KS proteins, it lacks an active site cysteine cis-3-decenoyl-ACP); and (b) one enoyl-ACP reductase to which the acyl group is attached in KS proteins. The (ER) domain with high homology to the ER domain of amino acid sequence spanning positions 501-1000 of SEQ Schizochytrium OrfB. This domain organization is the same ID NO:52 is about 49% identical to Schizochytrium OrfB as in Schizochytrium Orf C (SEQ ID NO:5). US 2008/0038379 A1 Feb. 14, 2008

0117 Th. 23B Orf is a 4410 nucleotide sequence (not NO:62, represented herein as SEQID NO:68. This region of including the stop codon) which encodes a 1470 amino acid SEQ ID NO:62 has a Pfam match to the dioxygenases sequence, represented herein as SEQ ID NO:62. The Th. related to 2-nitropropane dioxygenases, as mentioned above, 23B Orf amino acid sequence (SEQ ID NO:62) was spanning from about position 1025 to about position 1320 compared with known sequences in a standard BLAST (positions 25-320 of SEQID NO:68). This domain function search (BLAST parameters: Blastp, low complexity filter as an ER can also be predicted due to homology to a newly Off, program-BLOSUM62, Gap cost-Existence: 11, Exten characterized ER enzyme from Streptococcus pneumoniae. sion 1: (BLAST described in Altschul, S. F., Madden, T. L., The amino acid sequence spanning positions 1001-1470 of Schääffer, A. A., Zhang, J., Zhang, Z. Miller, W. & Lipman, SEQ ID NO:62 is about 75% identical to Schizochytrium D. J. (1997) “Gapped BLAST and PSI-BLAST: a new OrfB (SEQID NO:4) over 474 amino acids. The amino acid generation of protein database search programs. Nucleic sequence spanning positions 1025-1320 of SEQ ID NO:62 Acids Res. 25:3389-3402, incorporated herein by reference is about 81% identical to Schizochytrium OrfB (SEQ ID in its entirety))). At the amino acid level, the sequences with NO:4) over 296 amino acids. the greatest degree of homology to Th. 23B Orf was 0121 One embodiment of the present invention relates to Schizochytrium Orf (gb AAK728881.1) (SEQ ID NO:6). an isolated protein or domain from a non-bacterial PUFA SEQ ID NO:52 is 66% identical to Schizochytrium Orf PKS system, a homologue thereof, and/or a fragment (SEQ ID NO:6). thereof. Also included in the invention are isolated nucleic 0118. The first domain in Th. 23B Orf is a DH domain, acid molecules encoding any of the proteins, domains or also referred to herein as Th. 23B Orf-DH1. DH domain peptides described herein (discussed in detail below). function has been described in detail above. This domain is According to the present invention, an isolated protein or contained within the nucleotide sequence spanning from peptide, such as a protein or peptide from a PUFA PKS between about position 1 to about position 1500 of SEQID system, is a protein or a fragment thereof (including a NO:61 (Orf), represented herein as SEQ ID NO:63. The polypeptide or peptide) that has been removed from its amino acid sequence containing the Th. 23B DH1 domain is natural milieu (i.e., that has been Subject to human manipu a region of SEQID NO: 62 spanning from about position 1 lation) and can include purified proteins, partially purified to about position 500 of SEQID NO:62, represented herein proteins, recombinantly produced proteins, and synthetically as SEQID NO:64. This region of SEQID NO:62 has a Pfam produced proteins, for example. As such, "isolated' does not match to FabA, as mentioned above, spanning from about reflect the extent to which the protein has been purified. position 275 to about position 400 (positions 275-400 of Preferably, an isolated protein of the present invention is SEQ ID NO:64). The amino acid sequence spanning posi produced recombinantly. An isolated peptide can be pro tions 1-500 of SEQ ID NO:62 is about 66% identical to duced synthetically (e.g., chemically, Such as by peptide Schizochytrium Orf (SEQID NO:6) over 526 amino acids. synthesis) or recombinantly. In addition, and by way of The amino acid sequence spanning positions 275-400 of example, a “Thraustochytrium PUFA PKS protein’ refers to SEQ ID NO:62 is about 81% identical to Schizochytrium a PUFA PKS protein (generally including a homologue of a Orf (SEQ ID NO:6) over 126 amino acids. naturally occurring PUFA PKS protein) from a Thraus tochytrium microorganism, or to a PUFA PKS protein that 0119) The second domain in Th. 23B Orf is also a DH has been otherwise produced from the knowledge of the domain, also referred to herein as Th. 23B Orf-DH2. This structure (e.g., sequence), and perhaps the function, of a is the second of two DH domains in Orf, and therefore is naturally occurring PUFA PKS protein from Thraus designated DH2. This domain is contained within the nucle tochytrium. In other words, general reference to a Thraus otide sequence spanning from between about position 1501 tochytrium PUFA PKS protein includes any PUFA PKS to about 3000 of SEQID NO:61 (Orf), represented herein protein that has Substantially similar structure and function as SEQID NO:65. The amino acid sequence containing the of a naturally occurring PUFA PKS protein from Thraus Th. 23B DH2 domain is a region of SEQ ID NO: 62 tochytrium or that is a biologically active (i.e., has biological spanning from about position 501 to about position 1000 of activity) homologue of a naturally occurring PUFA PKS SEQ ID NO:62, represented herein as SEQID NO:66. This protein from Thraustochytrium as described in detail herein. region of SEQ ID NO:62 has a Pfam match to FabA, as As such, a Thraustochytrium PUFA PKS protein can include mentioned above, spanning from about position 800 to about purified, partially purified, recombinant, mutated/modified position 925 (positions 300-425 of SEQ ID NO:66). The and synthetic proteins. The same description applies to amino acid sequence spanning positions 501-1000 of SEQ reference to other proteins or peptides described herein, such ID NO:62 is about 56% identical to Schizochytrium Orf as the PUFA PKS proteins and domains from (SEQ ID NO:6) over 518 amino acids. The amino acid Schizochytrium or from other microorganisms. sequence spanning positions 800-925 of SEQ ID NO:62 is 0122) According to the present invention, the terms about 58% identical to Schizochytrium Orf (SEQID NO:6) “modification' and “mutation can be used interchangeably, over 124 amino acids. particularly with regard to the modifications/mutations to the 0120) The third domain in Th. 23B Orf is an ER primary amino acid sequences of a protein or peptide (or domain, also referred to herein as Th. 23B Orf-ER. ER nucleic acid sequences) described herein. The term “modi domain function has been described in detail above. This fication' can also be used to describe post-translational domain is contained within the nucleotide sequence Span modifications to a protein or peptide including, but not ning from between about position 3001 to about position limited to, methylation, farnesylation, carboxymethylation, 4410 of SEQ ID NO:61 (OrfC), represented herein as SEQ geranylgeranylation, glycosylation, phosphorylation, acety ID NO:67. The amino acid sequence containing the Th. 23B lation, myristoylation, prenylation, palmitation, and/or ami ER domain is a region of SEQ ID NO: 62 spanning from dation. Modifications can also include, for example, com about position 1001 to about position 1470 of SEQ ID plexing a protein or peptide with another compound. Such US 2008/0038379 A1 Feb. 14, 2008 20 modifications can be considered to be mutations, for 0127. Modifications or mutations in protein homologues, example, if the modification is different than the post as compared to the wild-type protein, either increase, translational modification that occurs in the natural, wild decrease, or do not Substantially change, the basic biological type protein or peptide. activity of the homologue as compared to the naturally occurring (wild-type) protein. In general, the biological 0123. As used herein, the term “homologue' is used to activity or biological action of a protein refers to any refer to a protein or peptide which differs from a naturally function(s) exhibited or performed by the protein that is occurring protein or peptide (i.e., the “prototype' or “wild ascribed to the naturally occurring form of the protein as type' protein) by one or more minor modifications or measured or observed in vivo (i.e., in the natural physiologi mutations to the naturally occurring protein or peptide, but cal environment of the protein) or in vitro (i.e., under which maintains the overall basic protein and side chain laboratory conditions). Biological activities of PUFA PKS structure of the naturally occurring form (i.e., such that the systems and the individual proteins/domains that make up a homologue is identifiable as being related to the wild-type PUFA PKS system have been described in detail elsewhere protein). Such changes include, but are not limited to: herein. Modifications of a protein, such as in a homologue changes in one or a few amino acid side chains; changes one or mimetic (discussed below), may result in proteins having or a few amino acids, including deletions (e.g., a truncated the same biological activity as the naturally occurring pro version of the protein or peptide) insertions and/or substi tein, or in proteins having decreased or increased biological tutions; changes in Stereochemistry of one or a few atoms; activity as compared to the naturally occurring protein. and/or minor derivatizations, including but not limited to: Modifications which result in a decrease in protein expres methylation, farnesylation, geranylgeranylation, glycosyla sion or a decrease in the activity of the protein, can be tion, carboxymethylation, phosphorylation, acetylation, referred to as inactivation (complete or partial), down myristoylation, prenylation, palmitation, and/or amidation. regulation, or decreased action (or activity) of a protein. A homologue can have either enhanced, decreased, or Sub Similarly, modifications which result in an increase in pro stantially similar properties as compared to the naturally tein expression or an increase in the activity of the protein, occurring protein or peptide. Preferred homologues of a can be referred to as amplification, overproduction, activa PUFA PKS protein or domain are described in detail below. tion, enhancement, up-regulation or increased action (or It is noted that homologues can include synthetically pro activity) of a protein. It is noted that general reference to a duced homologues, naturally occurring allelic variants of a homologue having the biological activity of the wild-type given protein or domain, or homologous sequences from protein does not necessarily mean that the homologue has organisms other than the organism from which the reference identical biological activity as the wild-type protein, par sequence was derived. ticularly with regard to the level of biological activity. 0124 Conservative substitutions typically include substi Rather, a homologue can perform the same biological activ tutions within the following groups: glycine and alanine; ity as the wild-type protein, but at a reduced or increased valine, isoleucine and leucine; aspartic acid, glutamic acid, level of activity as compared to the wild-type protein. A asparagine, and glutamine; serine and threonine; lysine and functional domain of a PUFA PKS system is a domain (i.e., arginine; and phenylalanine and tyrosine. Substitutions may a domain can be a portion of a protein) that is capable of also be made on the basis of conserved hydrophobicity or performing a biological function (i.e., has biological activ hydrophilicity (Kyte and Doolittle, J. Mol. Biol. (1982) 157: ity). 105-132), or on the basis of the ability to assume similar 0128 Methods of detecting and measuring PUFA PKS polypeptide secondary structure (Chou and Fasman, Adv. protein or domain biological activity include, but are not Enzymol. (1978) 47: 45-148, 1978). limited to, measurement of transcription of a PUFA PKS protein or domain, measurement of translation of a PUFA 0125 Homologues can be the result of natural allelic PKS protein or domain, measurement of posttranslational variation or natural mutation. A naturally occurring allelic modification of a PUFA PKS protein or domain, measure variant of a nucleic acid encoding a protein is a gene that ment of enzymatic activity of a PUFA PKS protein or occurs at essentially the same locus (or loci) in the genome domain, and/or measurement production of one or more as the gene which encodes such protein, but which, due to products of a PUFA PKS system (e.g., PUFA production). It natural variations caused by, for example, mutation or is noted that an isolated protein of the present invention recombination, has a similar but not identical sequence. (including a homologue) is not necessarily required to have Allelic variants typically encode proteins having similar the biological activity of the wild-type protein. For example, activity to that of the protein encoded by the gene to which a PUFA PKS protein or domain can be a truncated, mutated they are being compared. One class of allelic variants can or inactive protein, for example. Such proteins are useful in encode the same protein but have different nucleic acid screening assays, for example, or for other purposes Such as sequences due to the degeneracy of the genetic code. Allelic antibody production. In a preferred embodiment, the isolated variants can also comprise alterations in the 5' or 3' untrans proteins of the present invention have biological activity that lated regions of the gene (e.g., in regulatory control regions). is similar to that of the wild-type protein (although not Allelic variants are well known to those skilled in the art. necessarily equivalent, as discussed above). 0126 Homologues can be produced using techniques 0.129 Methods to measure protein expression levels gen known in the art for the production of proteins including, but erally include, but are not limited to: Western blot, immu not limited to, direct modifications to the isolated, naturally noblot, enzyme-linked immunosorbant assay (ELISA), occurring protein, direct protein synthesis, or modifications radioimmunoassay (RIA), immunoprecipitation, Surface to the nucleic acid sequence encoding the protein using, for plasmon resonance, chemiluminescence, fluorescent polar example, classic or recombinant DNA techniques to effect ization, phosphorescence, immunohistochemical analysis, random or targeted mutagenesis. matrix-assisted laser desorption/ionization time-of-flight US 2008/0038379 A1 Feb. 14, 2008

(MALDI-TOF) mass spectrometry, microcytometry, consecutive amino acids, and more preferably to at least microarray, microscopy, fluorescence activated cell sorting about 1000 consecutive amino acids, and more preferably to (FACS), and flow cytometry, as well as assays based on a at least about 1100 consecutive amino acids, and more property of the protein including but not limited to enzy preferably to at least about 1200 consecutive amino acids, matic activity or interaction with other protein partners. and more preferably to at least about 1300 consecutive Binding assays are also well known in the art. For example, amino acids, and more preferably to at least about 1400 a BIAcore machine can be used to determine the binding consecutive amino acids of any of SEQID NO:39, SEQ ID constant of a complex between two proteins. The dissocia NO:52, or SEQ ID NO:62, or to the full length of SEQ ID tion constant for the complex can be determined by moni NO:62. In a further aspect, the amino acid sequence of the toring changes in the refractive index with respect to time as protein is at least about 60% identical to at least about 1500 buffer is passed over the chip (O’Shannessy et al. Anal. consecutive amino acids, and more preferably to at least Biochem. 212:457-468 (1993); Schuster et al., Nature about 1600 consecutive amino acids, and more preferably to 365:343-347 (1993)). Other suitable assays for measuring at least about 1700 consecutive amino acids, and more the binding of one protein to another include, for example, preferably to at least about 1800 consecutive amino acids, immunoassays such as enzyme linked immunoabsorbent and more preferably to at least about 1900 consecutive assays (ELISA) and radioimmunoassays (RIA); or determi amino acids, of any of SEQID NO:39 or SEQID NO:52, or nation of binding by monitoring the change in the spectro to the full length of SEQ ID NO:52. In a further aspect, the scopic or optical properties of the proteins through fluores amino acid sequence of the protein is at least about 60% cence, UV absorption, circular dichrosim, or nuclear identical to at least about 2000 consecutive amino acids, and magnetic resonance (NMR). more preferably to at least about 2100 consecutive amino 0130. In one embodiment, the present invention relates to acids, and more preferably to at least about 2200 consecutive an isolated protein comprising an amino acid sequence amino acids, and more preferably to at least about 2300 selected from the group consisting of: (a) an amino acid consecutive amino acids, and more preferably to at least sequence selected from the group consisting of SEQ ID about 2400 consecutive amino acids, and more preferably to NO:39, SEQ ID NO:52, SEQ ID NO:62, and biologically at least about 2500 consecutive amino acids, and more active fragments thereof; (b) an amino acid sequence preferably to at least about 2600 consecutive amino acids, selected from the group consisting of: SEQID NO:41, SEQ and more preferably to at least about 2700 consecutive ID NO:43, SEQID NO:45, SEQID NO:48, SEQID NO:50, amino acids, and more preferably to at least about 2800 SEQID NO:54, SEQ ID NO:56, SEQID NO:58, SEQ ID consecutive amino acids, and even more preferably, to the NO:60, SEQID NO:64, SEQID NO:66, SEQID NO:68 and full length of SEQID NO:39. In one embodiment, the amino biologically active fragments thereof (c) an amino acid acid sequence described above does not include any of the sequence that is at least about 60% identical to at least 500 following amino acid sequences: SEQ ID NO:2, SEQ ID consecutive amino acids of the amino acid sequence of (a), NO:4, SEQID NO:6, SEQ ID NO:8, SEQID NO:10, SEQ wherein the amino acid sequence has a biological activity of ID NO:13, SEQID NO:18, SEQID NO:20, SEQID NO:22, at least one domain of a polyunsaturated fatty acid (PUFA) SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID polyketide synthase (PKS) system; and/or (d) an amino acid NO:30, SEQ ID NO:32. sequence that is at least about 60% identical to the amino 0.132. In another aspect, a PUFA PKS protein or domain acid sequence of (b), wherein the amino acid sequence has encompassed by the present invention, including homo a biological activity of at least one domain of a polyunsatu logues as described above, comprises an amino acid rated fatty acid (PUFA) polyketide synthase (PKS) system. sequence that is at least about 65% identical, and more In a further embodiment, an amino acid sequence including preferably at least about 70% identical, and more preferably the active site domains or other functional motifs described at least about 75% identical, and more preferably at least above for several of the PUFA PKS domains are encom about 80% identical, and more preferably at least about 85% passed by the invention. In one embodiment, the amino acid identical, and more preferably at least about 90% identical, sequence described above does not include any of the and more preferably at least about 95% identical, and more following amino acid sequences: SEQ ID NO:2, SEQ ID preferably at least about 96% identical, and more preferably NO:4, SEQID NO:6, SEQ ID NO:8, SEQID NO:10, SEQ at least about 97% identical, and more preferably at least ID NO:13, SEQID NO:18, SEQID NO:20, SEQID NO:22, about 98% identical, and more preferably at least about 99% SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID identical to an amino acid sequence chosen from: SEQ ID NO:30, SEQ ID NO:32. NO:39, SEQ ID NO:52, or SEQID NO:62, over any of the 0131). In one aspect of the invention, a PUFAPKS protein consecutive amino acid lengths described in the paragraph or domain encompassed by the present invention, including above, wherein the amino acid sequence has a biological a homologue of a particular PUFA PKS protein or domain activity of at least one domain of a PUFA PKS system. In described herein, comprises an amino acid sequence that is one embodiment, the amino acid sequence described above at least about 60% identical to at least 500 consecutive does not include any of the following amino acid sequences: amino acids of an amino acid sequence chosen from: SEQ SEQID NO:2, SEQID NO:4, SEQID NO:6, SEQID NO:8, ID NO:39, SEQ ID NO:52, or SEQ ID NO:62, wherein the SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:18, SEQ ID amino acid sequence has a biological activity of at least one NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, domain of a PUFA PKS system. In a further aspect, the SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32. amino acid sequence of the protein is at least about 60% 0133). In one aspect of the invention, a PUFAPKS protein identical to at least about 600 consecutive amino acids, and or domain encompassed by the present invention, including more preferably to at least about 700 consecutive amino a homologue as described above, comprises an amino acid acids, and more preferably to at least about 800 consecutive sequence that is at least about 60% identical to an amino acid amino acids, and more preferably to at least about 900 sequence chosen from: SEQ ID NO:39, SEQ ID NO:41, US 2008/0038379 A1 Feb. 14, 2008 22

SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:48, SEQ ID and more preferably at least about 99% identical, to an NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, amino acid sequence chosen from: SEQID NO:39, SEQID SEQ ID NO:58, SEQ ID NO:60, SEQ ID NO:62, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:48, NO:64, SEQID NO:66, SEQID NO:68, wherein the amino SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID acid sequence has a biological activity of at least one domain NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID NO:62, of a PUFA PKS system. In a further aspect, the amino acid SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, wherein sequence of the protein is at least about 65% identical, and the amino acid sequence has a biological activity of at least more preferably at least about 70% identical, and more one domain of a PUFA PKS system. In one embodiment, the preferably at least about 75% identical, and more preferably amino acid sequence described above does not include any at least about 80% identical, and more preferably at least of the following amino acid sequences: SEQID NO:2, SEQ about 85% identical, and more preferably at least about 90% ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, identical, and more preferably at least about 95% identical, SEQ ID NO:13, SEQ ID NO:18, SEQ ID NO:20, SEQ ID and more preferably at least about 96% identical, and more NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, preferably at least about 97% identical, and more preferably SEQ ID NO:30, SEQ ID NO:32. at least about 98% identical, and more preferably at least about 99% identical to an amino acid sequence chosen from: 0.135) In a preferred embodiment an isolated protein or SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID domain of the present invention comprises, consists essen NO:45, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, tially of, or consists of an amino acid sequence chosen from: SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:60, SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, NO:45, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:68, wherein the amino acid sequence has a SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID biological activity of at least one domain of a PUFA PKS NO:60, SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, system. In one embodiment, the amino acid sequence SEQ ID NO:68, or any biologically active fragments described above does not include any of the following amino thereof, including any fragments that have a biological acid sequences: SEQ ID NO:2, SEQ ID NO:4, SEQ ID activity of at least one domain of a PUFA PKS system. NO:6, SEQID NO:8, SEQID NO:10, SEQID NO:13, SEQ 0.136. In one aspect of the present invention, the follow ID NO:18, SEQID NO:20, SEQID NO:22, SEQID NO:24, ing Schizochytrium proteins and domains are useful in one SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID or more embodiments of the present invention, all of which NO:32. have been previously described in detail in U.S. patent 0134. In another aspect, a PUFA PKS protein or domain application Ser. No. 10/124,800, supra. In one aspect of the encompassed by the present invention, including a homo invention, a PUFA PKS protein or domain useful in the logue as described above, comprises an amino acid sequence present invention comprises an amino acid sequence that is that is at least about 50% identical to an amino acid sequence at least about 60% identical to at least 500 consecutive chosen from: SEQ ID NO:39, SEQ ID NO:43, SEQ ID amino acids of an amino acid sequence chosen from: SEQ NO:50, SEQ ID NO:52, and SEQ ID NO:58, wherein the amino acid sequence has a biological activity of at least one ID NO:2, SEQ ID NO:4, and SEQ ID NO:6; wherein the domain of a PUFAPKS system. In another aspect, the amino amino acid sequence has a biological activity of at least one acid sequence of the protein is at least about 55% identical, domain of a PUFA PKS system. In a further aspect, the and more preferably at least about 60% identical, to an amino acid sequence of the protein is at least about 60% amino acid sequence chosen from: SEQID NO:39, SEQID identical to at least about 600 consecutive amino acids, and NO:43, SEQID NO:50, SEQID NO:52, SEQID NO:56 and more preferably to at least about 700 consecutive amino SEQ ID NO:58, wherein the amino acid sequence has a acids, and more preferably to at least about 800 consecutive biological activity of at least one domain of a PUFA PKS amino acids, and more preferably to at least about 900 system. In a further aspect, the amino acid sequence of the consecutive amino acids, and more preferably to at least protein is at least about 65% identical to an amino acid about 1000 consecutive amino acids, and more preferably to sequence chosen from SEQID NO:39, SEQID NO:43, SEQ at least about 1100 consecutive amino acids, and more ID NO:50, SEQID NO:52, SEQID NO:54, SEQ ID NO:56 and SEQID NO:58, wherein the amino acid sequence has a preferably to at least about 1200 consecutive amino acids, biological activity of at least one domain of a PUFA PKS and more preferably to at least about 1300 consecutive system. In another aspect, the amino acid sequence of the amino acids, and more preferably to at least about 1400 protein is at least about 70% identical, and more preferably consecutive amino acids, and more preferably to at least at least about 75% identical, to an amino acid sequence about 1500 consecutive amino acids of any of SEQ ID chosen from: SEQ ID NO:39, SEQ ID NO:43, SEQ ID NO:2, SEQID NO.4 and SEQID NO:6, or to the full length NO:45, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, of SEQ ID NO:6. In a further aspect, the amino acid SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID sequence of the protein is at least about 60% identical to at NO:60, SEQ ID NO:62, and SEQ ID NO:64, wherein the least about 1600 consecutive amino acids, and more pref amino acid sequence has a biological activity of at least one domain of a PUFAPKS system. In another aspect, the amino erably to at least about 1700 consecutive amino acids, and acid sequence of the protein is at least about 80% identical, more preferably to at least about 1800 consecutive amino and more preferably at least about 85% identical, and more acids, and more preferably to at least about 1900 consecutive preferably at least about 90% identical, and more preferably amino acids, and more preferably to at least about 2000 at least about 95% identical, and more preferably at least consecutive amino acids of any of SEQID NO:2 or SEQID about 96% identical, and more preferably at least about 97% NO:4, or to the full length of SEQ ID NO:4. In a further identical, and more preferably at least about 98% identical, aspect, the amino acid sequence of the protein is at least US 2008/0038379 A1 Feb. 14, 2008

about 60% identical to at least about 2100 consecutive SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID amino acids, and more preferably to at least about 2200 NO:28, SEQID NO:30, SEQID NO:32 or any biologically consecutive amino acids, and more preferably to at least active fragments thereof, including any fragments that have about 2300 consecutive amino acids, and more preferably to a biological activity of at least one domain of a PUFA PKS at least about 2400 consecutive amino acids, and more system. preferably to at least about 2500 consecutive amino acids, 0140. According to the present invention, the term “con and more preferably to at least about 2600 consecutive tiguous” or “consecutive', with regard to nucleic acid or amino acids, and more preferably to at least about 2700 amino acid sequences described herein, means to be con consecutive amino acids, and more preferably to at least nected in an unbroken sequence. For example, for a first about 2800 consecutive amino acids, and even more pref sequence to comprise 30 contiguous (or consecutive) amino erably, to the full length of SEQ ID NO:2. acids of a second sequence, means that the first sequence 0137 In another aspect, a PUFA PKS protein or domain includes an unbroken sequence of 30 amino acid residues useful in one or more embodiments of the present invention that is 100% identical to an unbroken sequence of 30 amino comprises an amino acid sequence that is at least about 65% acid residues in the second sequence. Similarly, for a first identical, and more preferably at least about 70% identical, sequence to have “100% identity” with a second sequence and more preferably at least about 75% identical, and more means that the first sequence exactly matches the second preferably at least about 80% identical, and more preferably sequence with no gaps between nucleotides or amino acids. at least about 85% identical, and more preferably at least 0.141. As used herein, unless otherwise specified, refer about 90% identical, and more preferably at least about 95% ence to a percent (%) identity refers to an evaluation of identical, and more preferably at least about 96% identical, homology which is performed using: (1) a BLAST 2.0 Basic and more preferably at least about 97% identical, and more BLAST homology search using blastp for amino acid preferably at least about 98% identical, and more preferably searches, blastin for nucleic acid searches, and blastX for at least about 99% identical to an amino acid sequence nucleic acid searches and searches of translated amino acids chosen from: SEQ ID NO:2, SEQ ID NO:4, or SEQ ID in all 6 open reading frames, all with standard default NO:6, over any of the consecutive amino acid lengths parameters, wherein the query sequence is filtered for low described in the paragraph above, wherein the amino acid complexity regions by default (described in Altschul, S. F., sequence has a biological activity of at least one domain of Madden, T. L., Schäffer, A. A., Zhang, J., Zhang, Z. Miller, a PUFA PKS system. W. & Lipman, D. J. (1997) “Gapped BLAST and PSI 0.138. In another aspect of the invention, a PUFA PKS BLAST: a new generation of protein database search pro protein or domain useful in one or more embodiments of the grams. Nucleic Acids Res. 25:3389-3402, incorporated present invention comprises an amino acid sequence that is herein by reference in its entirety); (2) a BLAST 2 alignment (using the parameters described below); (3) and/or PSI at least about 60% identical to an amino acid sequence BLAST with the standard default parameters (Position chosen from: SEQ ID NO:8, SEQ ID NO:10, SEQ ID Specific Iterated BLAST). It is noted that due to some NO:13, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, differences in the standard parameters between BLAST 2.0 SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID Basic BLAST and BLAST 2, two specific sequences might NO:30, or SEQID NO:32, wherein the amino acid sequence be recognized as having significant homology using the has a biological activity of at least one domain of a PUFA BLAST 2 program, whereas a search performed in BLAST PKS system. In a further aspect, the amino acid sequence of 2.0 Basic BLAST using one of the sequences as the query the protein is at least about 65% identical, and more pref sequence may not identify the second sequence in the top erably at least about 70% identical, and more preferably at matches. In addition, PSI-BLAST provides an automated, easy-to-use version of a “profile' search, which is a sensitive least about 75% identical, and more preferably at least about way to look for sequence homologues. The program first 80% identical, and more preferably at least about 85% performs a gapped BLAST database search. The PSI identical, and more preferably at least about 90% identical, BLAST program uses the information from any significant and more preferably at least about 95% identical, and more alignments returned to construct a position-specific score preferably at least about 96% identical, and more preferably matrix, which replaces the query sequence for the next round at least about 97% identical, and more preferably at least of database searching. Therefore, it is to be understood that about 98% identical, and more preferably at least about 99% percent identity can be determined by using any one of these identical to an amino acid sequence chosen from: SEQ ID programs. NO:8, SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:18, 0142. Two specific sequences can be aligned to one SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID another using BLAST 2 sequence as described in Tatusova NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, and Madden, (1999), “Blast 2 sequences—a new tool for wherein the amino acid sequence has a biological activity of comparing protein and nucleotide sequences. FEMS Micro at least one domain of a PUFA PKS system. biol Lett. 174, 247, incorporated herein by reference in its entirety. BLAST 2 sequence alignment is performed in 0.139. In yet another aspect of the invention, a PUFA PKS blastp or blastin using the BLAST 2.0 algorithm to perform protein or domain useful in one or more embodiments of the a Gapped BLAST search (BLAST 2.0) between the two present invention comprises, consists essentially of, or con sequences allowing for the introduction of gaps (deletions sists of an amino acid sequence chosen from: SEQID NO:2. and insertions) in the resulting alignment. For purposes of SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID clarity herein, a BLAST 2 sequence alignment is performed NO:10, SEQ ID NO:13, SEQ ID NO:18, SEQ ID NO:20, using the standard default parameters as follows. US 2008/0038379 A1 Feb. 14, 2008 24

For blastin, using 0 BLOSUM62 matrix: calculate the appropriate hybridization and wash conditions to achieve hybridization permitting varying degrees of mis 0143 Reward for match=1 match of nucleotides are disclosed, for example, in 0144 Penalty for mismatch=-2 Meinkoth et al., 1984, Anal. Biochem. 138, 267-284; Meinkoth et al., ibid., is incorporated by reference herein in 0145 Open gap (5) and extension gap (2) penalties its entirety. site gap X dropoff (50) expect (10) word size (11) filter 0152 More particularly, moderate stringency hybridiza O tion and washing conditions, as referred to herein, refer to For blastp, using 0 BLOSUM62 matrix: conditions which permit isolation of nucleic acid molecules having at least about 70% nucleic acid sequence identity 0147 Open gap (11) and extension gap (1) penalties with the nucleic acid molecule being used to probe in the 0148 gap X dropoff (50) expect (10) word size (3) filter hybridization reaction (i.e., conditions permitting about 30% (on). or less mismatch of nucleotides). High Stringency hybrid ization and washing conditions, as referred to herein, refer to 0149 According to the present invention, an amino acid conditions which permit isolation of nucleic acid molecules sequence that has a biological activity of at least one domain having at least about 80% nucleic acid sequence identity of a PUFA PKS system is an amino acid sequence that has with the nucleic acid molecule being used to probe in the the biological activity of at least one domain of the PUFA hybridization reaction (i.e., conditions permitting about 20% PKS system described in detail herein, as previously exem or less mismatch of nucleotides). Very high Stringency plified by the Schizochytrium PUFA PKS system or as hybridization and washing conditions, as referred to herein, additionally exemplified herein by the Thraustochytrium refer to conditions which permit isolation of nucleic acid PUFA PKS system. The biological activities of the various molecules having at least about 90% nucleic acid sequence domains within the Schizochytrium or Thraustochytrium identity with the nucleic acid molecule being used to probe PUFA PKS systems have been described in detail above. in the hybridization reaction (i.e., conditions permitting Therefore, an isolated protein useful in the present invention about 10% or less mismatch of nucleotides). As discussed can include the translation product of any PUFA PKS open above, one of skill in the art can use the formulae in reading frame, any PUFA PKS domain, biologically active Meinkoth et al., ibid. to calculate the appropriate hybridiza fragment thereof, or any homologue of a naturally occurring tion and wash conditions to achieve these particular levels of PUFAPKS open reading frame product or domain which has nucleotide mismatch. Such conditions will vary, depending biological activity. on whether DNA:RNA or DNA:DNA hybrids are being 0150. In another embodiment of the invention, an amino formed. Calculated melting temperatures for DNA:DNA acid sequence having the biological activity of at least one hybrids are 10° C. less than for DNA:RNA hybrids. In domain of a PUFA PKS system of the present invention particular embodiments, stringent hybridization conditions includes an amino acid sequence that is sufficiently similar for DNA:DNA hybrids include hybridization at an ionic to a naturally occurring PUFA PKS protein or polypeptide strength of 6xSSC (0.9 M. Na") at a temperature of between that a nucleic acid sequence encoding the amino acid about 20° C. and about 35° C. (lower stringency), more sequence is capable of hybridizing under moderate, high, or preferably, between about 28° C. and about 40° C. (more very high stringency conditions (described below) to (i.e., stringent), and even more preferably, between about 35°C. with) a nucleic acid molecule encoding the naturally occur and about 45° C. (even more stringent), with appropriate ring PUFA PKS protein or polypeptide (i.e., to the comple wash conditions. In particular embodiments, stringent ment of the nucleic acid strand encoding the naturally hybridization conditions for DNA:RNA hybrids include occurring PUFA PKS protein or polypeptide). Preferably, an hybridization at an ionic strength of 6xSSC (0.9 M. Na") at amino acid sequence having the biological activity of at least a temperature of between about 30° C. and about 45° C. one domain of a PUFA PKS system of the present invention more preferably, between about 38°C. and about 50° C., and is encoded by a nucleic acid sequence that hybridizes under even more preferably, between about 45° C. and about 55° moderate, high or very high Stringency conditions to the C., with similarly stringent wash conditions. These values complement of a nucleic acid sequence that encodes any of are based on calculations of a melting temperature for the above-described amino acid sequences for a PUFA PKS molecules larger than about 100 nucleotides, 0% formamide protein or domain. Methods to deduce a complementary and a G+C content of about 40%. Alternatively, T. can be sequence are known to those skilled in the art. It should be calculated empirically as set forth in Sambrook et al., Supra, noted that since amino acid sequencing and nucleic acid pages 9.31 to 9.62. In general, the wash conditions should be sequencing technologies are not entirely error-free, the as stringent as possible, and should be appropriate for the sequences presented herein, at best, represent apparent chosen hybridization conditions. For example, hybridization sequences of PUFAPKS domains and proteins of the present conditions can include a combination of salt and temperature invention. conditions that are approximately 20-25° C. below the calculated T of a particular hybrid, and wash conditions 0151. As used herein, hybridization conditions refer to typically include a combination of salt and temperature standard hybridization conditions under which nucleic acid conditions that are approximately 12-20° C. below the molecules are used to identify similar nucleic acid mol calculated T of the particular hybrid. One example of ecules. Such standard conditions are disclosed, for example, hybridization conditions suitable for use with DNA:DNA in Sambrook et al., Molecular Cloning: A Laboratory hybrids includes a 2-24 hour hybridization in 6xSSC (50% Manual, Cold Spring Harbor Labs Press, 1989. Sambrook et formamide) at about 42°C., followed by washing steps that al., ibid., is incorporated by reference herein in its entirety include one or more washes at room temperature in about (see specifically, pages 9.31-9.62). In addition, formulae to 2xSSC, followed by additional washes at higher tempera US 2008/0038379 A1 Feb. 14, 2008

tures and lower ionic strength (e.g., at least one wash as in length, or at least about 300 amino acids in length, or at about 37° C. in about 0.1x-0.5xSSC, followed by at least least about 350 amino acids in length, or at least about 400 one wash at about 68° C. in about 0.1x-0.5xSSC). amino acids in length, or at least about 450 amino acids in 0153. The present invention also includes a fusion protein length, or at least about 500 amino acids in length, or at least that includes any PUFA PKS protein or domain or any about 750 amino acids in length, and so on, in any length homologue or fragment thereof attached to one or more between 8 amino acids and up to the full length of a protein fusion segments. Suitable fusion segments for use with the or domain of the invention or longer, in whole integers (e.g., present invention include, but are not limited to, segments 8,9, 10, ... 25, 26, ... 500, 501, ... 1234, 1235,...). There that can: enhance a protein's stability; provide other desir is no limit, other than a practical limit, on the maximum size able biological activity; and/or assist with the purification of of Such a protein in that the protein can include a portion of the protein (e.g., by affinity chromatography). A Suitable a PUFA PKS protein, domain, or biologically active or fusion segment can be a domain of any size that has the useful fragment thereof, or a full-length PUFA PKS protein desired function (e.g., imparts increased stability, Solubility, or domain, plus additional sequence (e.g., a fusion protein biological activity; and/or simplifies purification of a pro sequence), if desired. tein). Fusion segments can be joined to amino and/or car boxyl termini of the protein and can be susceptible to 0156 Further embodiments of the present invention cleavage in order to enable straight-forward recovery of the include isolated nucleic acid molecules comprising, consist desired protein. Fusion proteins are preferably produced by ing essentially of, or consisting of nucleic acid sequences culturing a recombinant cell transfected with a fusion that encode any of the above-identified proteins or domains, nucleic acid molecule that encodes a protein including the including a homologue or fragment thereof, as well as fusion segment attached to either the carboxyl and/or amino nucleic acid sequences that are fully complementary thereto. terminal end of the protein of the invention as discussed In accordance with the present invention, an isolated nucleic above. acid molecule is a nucleic acid molecule that has been 0154) In one embodiment of the present invention, any of removed from its natural milieu (i.e., that has been subject the above-described PUFA PKS amino acid sequences, as to human manipulation), its natural milieu being the genome well as homologues of Such sequences, can be produced or chromosome in which the nucleic acid molecule is found with from at least one, and up to about 20, additional in nature. As such, “isolated' does not necessarily reflect the heterologous amino acids flanking each of the C- and/or extent to which the nucleic acid molecule has been purified, N-terminal end of the given amino acid sequence. The but indicates that the molecule does not include an entire resulting protein or polypeptide can be referred to as "con genome or an entire chromosome in which the nucleic acid sisting essentially of a given amino acid sequence. Accord molecule is found in nature. An isolated nucleic acid mol ing to the present invention, the heterologous amino acids ecule can include a gene. An isolated nucleic acid molecule are a sequence of amino acids that are not naturally found that includes a gene is not a fragment of a chromosome that (i.e., not found in nature, in vivo) flanking the given amino includes such gene, but rather includes the coding region and acid sequence or which would not be encoded by the regulatory regions associated with the gene, but no addi nucleotides that flank the naturally occurring nucleic acid tional genes naturally found on the same chromosome. An sequence encoding the given amino acid sequence as it isolated nucleic acid molecule can also include a specified occurs in the gene, if such nucleotides in the naturally nucleic acid sequence flanked by (i.e., at the 5' and/or the 3' occurring sequence were translated using standard codon end of the sequence) additional nucleic acids that do not usage for the organism from which the given amino acid normally flank the specified nucleic acid sequence in nature sequence is derived. Similarly, the phrase “consisting essen (i.e., heterologous sequences). Isolated nucleic acid mol tially of, when used with reference to a nucleic acid ecule can include DNA, RNA (e.g., mRNA), or derivatives sequence herein, refers to a nucleic acid sequence encoding of either DNA or RNA (e.g., cDNA). Although the phrase a given amino acid sequence that can be flanked by from at “nucleic acid molecule' primarily refers to the physical least one, and up to as many as about 60, additional nucleic acid molecule and the phrase “nucleic acid heterologous nucleotides at each of the 5' and/or the 3' end sequence' primarily refers to the sequence of nucleotides on of the nucleic acid sequence encoding the given amino acid the nucleic acid molecule, the two phrases can be used sequence. The heterologous nucleotides are not naturally interchangeably, especially with respect to a nucleic acid found (i.e., not found in nature, in vivo) flanking the nucleic molecule, or a nucleic acid sequence, being capable of acid sequence encoding the given amino acid sequence as it encoding a protein or domain of a protein. occurs in the natural gene. 0157 Preferably, an isolated nucleic acid molecule of the 0155 The minimum size of a protein or domain and/or a present invention is produced using recombinant DNA tech homologue or fragment thereof of the present invention is, nology (e.g., polymerase chain reaction (PCR) amplifica in one aspect, a size sufficient to have the requisite biological tion, cloning) or chemical synthesis. Isolated nucleic acid activity, or sufficient to serve as an antigen for the generation molecules include natural nucleic acid molecules and homo of an antibody or as a target in an in vitro assay. In one logues thereof, including, but not limited to, natural allelic embodiment, a protein of the present invention is at least variants and modified nucleic acid molecules in which about 8 amino acids in length (e.g., Suitable for an antibody nucleotides have been inserted, deleted, substituted, and/or epitope or as a detectable peptide in an assay), or at least inverted in Such a manner that Such modifications provide about 25 amino acids in length, or at least about 50 amino the desired effect on PUFA PKS system biological activity acids in length, or at least about 100 amino acids in length, as described herein. Protein homologues (e.g., proteins or at least about 150 amino acids in length, or at least about encoded by nucleic acid homologues) have been discussed 200 amino acids in length, or at least about 250 amino acids in detail above. US 2008/0038379 A1 Feb. 14, 2008 26

0158. A nucleic acid molecule homologue can be pro 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical to such duced using a number of methods known to those skilled in sequences), or fragments thereof, or any complementary the art (see, for example, Sambrook et al., Molecular Clon sequences thereof. ing: A Laboratory Manual, Cold Spring Harbor Labs Press, 0.161 Another embodiment of the present invention 1989). For example, nucleic acid molecules can be modified includes a recombinant nucleic acid molecule comprising a using a variety of techniques including, but not limited to, recombinant vector and a nucleic acid sequence encoding classic mutagenesis techniques and recombinant DNA tech protein or peptide having a biological activity of at least one niques. Such as site-directed mutagenesis, chemical treat domain (or homologue or fragment thereof) of a PUFA PKS ment of a nucleic acid molecule to induce mutations, restric system as described herein. Such nucleic acid sequences are tion enzyme cleavage of a nucleic acid fragment, ligation of described in detail above. According to the present inven nucleic acid fragments, PCR amplification and/or mutagen tion, a recombinant vector is an engineered (i.e., artificially esis of selected regions of a nucleic acid sequence, synthesis produced) nucleic acid molecule that is used as a tool for of oligonucleotide mixtures and ligation of mixture groups manipulating a nucleic acid sequence of choice and for to “build a mixture of nucleic acid molecules and combi introducing Such a nucleic acid sequence into a host cell. The nations thereof. Nucleic acid molecule homologues can be recombinant vector is therefore Suitable for use in cloning, selected from a mixture of modified nucleic acids by screen sequencing, and/or otherwise manipulating the nucleic acid ing for the function of the protein encoded by the nucleic sequence of choice. Such as by expressing and/or delivering acid and/or by hybridization with a wild-type gene. the nucleic acid sequence of choice into a host cell to form a recombinant cell. Such a vector typically contains heter 0159. The minimum size of a nucleic acid molecule of ologous nucleic acid sequences, that is nucleic acid the present invention is a size sufficient to form a probe or sequences that are not naturally found adjacent to nucleic oligonucleotide primer that is capable of forming a stable acid sequence to be cloned or delivered, although the vector hybrid (e.g., under moderate, high or very high Stringency can also contain regulatory nucleic acid sequences (e.g., conditions) with the complementary sequence of a nucleic promoters, untranslated regions) which are naturally found acid molecule useful in the present invention, or of a size adjacent to nucleic acid molecules of the present invention Sufficient to encode an amino acid sequence having a bio or which are useful for expression of the nucleic acid logical activity of at least one domain of a PUFA PKS molecules of the present invention (discussed in detail system according to the present invention. As such, the size below). The vector can be either RNA or DNA, either of the nucleic acid molecule encoding Such a protein can be prokaryotic or eukaryotic, and typically is a plasmid. The dependent on nucleic acid composition and percent homol vector can be maintained as an extrachromosomal element ogy or identity between the nucleic acid molecule and (e.g., a plasmid) or it can be integrated into the chromosome complementary sequence as well as upon hybridization of a recombinant organism (e.g., a microbe or a plant). The conditions per se (e.g., temperature, salt concentration, and entire vector can remain in place within a host cell, or under formamide concentration). The minimal size of a nucleic certain conditions, the plasmid DNA can be deleted, leaving acid molecule that is used as an oligonucleotide primer or as behind the nucleic acid molecule of the present invention. a probe is typically at least about 12 to about 15 nucleotides The integrated nucleic acid molecule can be under chromo in length if the nucleic acid molecules are GC-rich and at Somal promoter control, under native or plasmid promoter least about 15 to about 18 bases in length if they are AT-rich. control, or under a combination of several promoter con There is no limit, other than a practical limit, on the maximal trols. Single or multiple copies of the nucleic acid molecule size of a nucleic acid molecule of the present invention, in can be integrated into the chromosome. A recombinant that the nucleic acid molecule can include a sequence vector of the present invention can contain at least one Sufficient to encode a biologically active fragment of a selectable marker. domain of a PUFAPKS system, an entire domain of a PUFA 0162. In one embodiment, a recombinant vector used in PKS system, several domains within an open reading frame a recombinant nucleic acid molecule of the present invention (Orf) of a PUFA PKS system, an entire Orf of a PUFA PKS is an expression vector. As used herein, the phrase “expres system, or more than one Orf of a PUFA PKS system. sion vector” is used to refer to a vector that is suitable for production of an encoded product (e.g., a protein of interest). 0160 In one embodiment of the present invention, an In this embodiment, a nucleic acid sequence encoding the isolated nucleic acid molecule comprises, consists essen product to be produced (e.g., a PUFA PKS domain) is tially of, or consists of a nucleic acid sequence encoding any inserted into the recombinant vector to produce a recombi of the above-described amino acid sequences, including any nant nucleic acid molecule. The nucleic acid sequence of the amino acid sequences, or homologues thereof, from a encoding the protein to be produced is inserted into the Schizochytrium or Thraustochytrium described herein. In vector in a manner that operatively links the nucleic acid one aspect, the nucleic acid sequence is selected from the sequence to regulatory sequences in the vector which enable group of: SEQID NO:1, SEQID NO:3, SEQID NO:5, SEQ the transcription and translation of the nucleic acid sequence ID NO:7, SEQ ID NO:9, SEQ ID NO:12, SEQ ID NO:17, within the recombinant host cell. SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, 0163. In another embodiment, a recombinant vector used SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID in a recombinant nucleic acid molecule of the present NO:44, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, invention is a targeting vector. As used herein, the phrase SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID “targeting vector is used to refer to a vector that is used to NO:59, SEQID NO:61, SEQID NO:63, SEQID NO:65, or deliver a particular nucleic acid molecule into a recombinant SEQ ID NO:67, or homologues (including sequences that host cell, wherein the nucleic acid molecule is used to delete are at least about 50%, 55%, 60%, 65%, 70%, 75%, 80%, or inactivate an endogenous gene within the host cell or US 2008/0038379 A1 Feb. 14, 2008 27 microorganism (i.e., used for targeted gene disruption or sequence that is naturally associated with the protein, or any knock-out technology). Such a vector may also be known in heterologous leader sequence capable of directing the deliv the art as a “knock-out vector. In one aspect of this ery and insertion of the protein to the membrane of a cell. embodiment, a portion of the vector, but more typically, the nucleic acid molecule inserted into the vector (i.e., the 0166 The present inventors have found that the insert), has a nucleic acid sequence that is homologous to a Schizochytrium PUFA PKS Orfs A and B are closely linked nucleic acid sequence of a target gene in the host cell (i.e., in the genome and region between the Orfs has been a gene which is targeted to be deleted or inactivated). The sequenced. The Orfs are oriented in opposite directions and nucleic acid sequence of the vector insert is designed to bind 4244 base pairs separate the start (ATG) codons (i.e. they are to the target gene Such that the target gene and the insert arranged as follows: 3'OrfA5'-4244 bp-5'OrfB3'). Examina undergo homologous recombination, whereby the endog tion of the 4244 bp intergenic region did not reveal any enous target gene is deleted, inactivated or attenuated (i.e., obvious Orfs (no significant matches were found on a by at least a portion of the endogenous target gene being BlastX search). Both Orfs A and B are highly expressed in mutated or deleted). The use of this type of recombinant Schizochytrium, at least during the time of oil production, vector to replace an endogenous Schizochytrium gene with implying that active promoter elements are embedded in this a recombinant gene is described in the Examples section, intergenic region. These genetic elements are believed to and the general technique for genetic transformation of have utility as a bi-directional promoter sequence for trans Thraustochytrids is described in detail in U.S. patent appli genic applications. For example, in a preferred embodiment, cation Ser. No. 10/124,807, published as U.S. Patent Appli one could clone this region, place any genes of interest at cation Publication No. 20030166207, published Sep. 4, each end and introduce the construct into Schizochytrium (or 2003. some other host in which the promoters can be shown to function). It is predicted that the regulatory elements, under 0164. Typically, a recombinant nucleic acid molecule the appropriate conditions, would provide for coordinated, includes at least one nucleic acid molecule of the present high level expression of the two introduced genes. The invention operatively linked to one or more expression complete nucleotide sequence for the regulatory region control sequences. As used herein, the phrase “recombinant containing Schizochytrium PUFA PKS regulatory elements molecule' or “recombinant nucleic acid molecule' primarily (e.g., a promoter) is represented herein as SEQ ID NO:36. refers to a nucleic acid molecule or nucleic acid sequence operatively linked to a expression control sequence, but can 0.167 In a similar manner, Orf is highly expressed in be used interchangeably with the phrase “nucleic acid mol Schizochytrium during the time of oil production and regu ecule', when Such nucleic acid molecule is a recombinant latory elements are expected to reside in the region upstream molecule as discussed herein. According to the present of its start codon. A region of genomic DNA upstream of invention, the phrase “operatively linked’ refers to linking a OrfQ has been cloned and sequenced and is represented nucleic acid molecule to an expression control sequence hereinas (SEQ ID NO:37). This sequence contains the 3886 (e.g., a transcription control sequence and/or a translation nt immediately upstream of the Orf start codon. Exami control sequence) in a manner Such that the molecule is able nation of this region did not reveal any obvious Orfs (i.e., no to be expressed when transfected (i.e., transformed, trans significant matches were found on a BlastX search). It is duced, transfected, conjugated or conduced) into a host cell. believed that regulatory elements contained in this region, Transcription control sequences are sequences which control under the appropriate conditions, will provide for high-level the initiation, elongation, or termination of transcription. expression of a gene placed behind them. Additionally, Particularly important transcription control sequences are under the appropriate conditions, the level of expression those which control transcription initiation, such as pro may be coordinated with genes under control of the A-B moter, enhancer, operator and repressor sequences. Suitable intergenic region (SEQ ID NO:36). transcription control sequences include any transcription control sequence that can function in a host cell or organism 0.168. Therefore, in one embodiment, a recombinant into which the recombinant nucleic acid molecule is to be nucleic acid molecule useful in the present invention, as introduced. disclosed herein, can include a PUFA PKS regulatory region contained within SEQ ID NO:36 and/or SEQ ID NO:37. 0165 Recombinant nucleic acid molecules of the present Such a regulatory region can include any portion (fragment) invention can also contain additional regulatory sequences, of SEQ ID NO:36 and/or SEQ ID NO:37 that has at least Such as translation regulatory sequences, origins of replica basal PUFA PKS transcriptional activity. tion, and other regulatory sequences that are compatible with the recombinant cell. In one embodiment, a recombi 0169. One or more recombinant molecules of the present nant molecule of the present invention, including those invention can be used to produce an encoded product (e.g., which are integrated into the host cell chromosome, also a PUFA PKS domain, protein, or system) of the present contains secretory signals (i.e., signal segment nucleic acid invention. In one embodiment, an encoded product is pro sequences) to enable an expressed protein to be secreted duced by expressing a nucleic acid molecule as described from the cell that produces the protein. Suitable signal herein under conditions effective to produce the protein. A segments include a signal segment that is naturally associ preferred method to produce an encoded protein is by ated with the protein to be expressed or any heterologous transfecting a host cell with one or more recombinant signal segment capable of directing the secretion of the molecules to form a recombinant cell. Suitable host cells to protein according to the present invention. In another transfect include, but are not limited to, any bacterial, fungal embodiment, a recombinant molecule of the present inven (e.g., yeast), insect, plant or animal cell that can be trans tion comprises a leader sequence to enable an expressed fected. In one embodiment of the invention, a preferred host protein to be delivered to and inserted into the membrane of cell is a Thraustochytrid host cell (described in detail below) a host cell. Suitable leader sequences include a leader or a plant host cell. Host cells can be either untransfected US 2008/0038379 A1 Feb. 14, 2008 28 cells or cells that are already transfected with at least one gation of Saturated fatty acids. The pathways catalyzed by other recombinant nucleic acid molecule. PUFA PKSs that are distinct from previously recognized 0170 According to the present invention, the term “trans PKSs in both structure and mechanism. Generation of cis fection' is used to refer to any method by which an exog double bonds is suggested to involve position-specific enous nucleic acid molecule (i.e., a recombinant nucleic acid isomerases; these enzymes are believed to be useful in the molecule) can be inserted into a cell. The term “transfor production of new families of antibiotics. mation' can be used interchangeably with the term “trans 0.174. To produce significantly high yields of one or more fection' when such term is used to refer to the introduction desired polyunsaturated fatty acids or other bioactive mol of nucleic acid molecules into microbial cells, such as algae, ecules, an organism, preferably a microorganism or a plant, bacteria and yeast, or into plants. In microbial systems, the term “transformation' is used to describe an inherited and most preferably a Thraustochytrid microorganism, can change due to the acquisition of exogenous nucleic acids by be genetically modified to alter the activity and particularly, the microorganism or plant and is essentially synonymous the end product, of the PUFA PKS system in the microor with the term “transfection.” However, in animal cells, ganism or plant. transformation has acquired a second meaning which can refer to changes in the growth properties of cells in culture 0.175. Therefore, one embodiment of the present inven after they become cancerous, for example. Therefore, to tion relates to a genetically modified microorganism, avoid confusion, the term “transfection' is preferably used wherein the microorganism expresses a PKS system com with regard to the introduction of exogenous nucleic acids prising at least one biologically active domain of a polyun into animal cells, and the term “transfection' will be used saturated fatty acid (PUFA) polyketide synthase (PKS) herein to generally encompass transfection of animal cells, system. The domain of the PUFA PKS system can include and transformation of microbial cells or plant cells, to the any of the domains, including homologues thereof, for extent that the terms pertain to the introduction of exogenous PUFA PKS systems as described above (e.g., for nucleic acids into a cell. Therefore, transfection techniques Schizochytrium and Thraustochytrium), and can also include include, but are not limited to, transformation, particle any domain of a PUFA PKS system from any other non bombardment, diffusion, active transport, bath Sonication, bacterial microorganism, including any eukaryotic microor electroporation, microinjection, lipofection, adsorption, ganism, including any Thraustochytrid microorganism or infection and protoplast fusion. any domain of a PUFA PKS system from a microorganism 0171 It will be appreciated by one skilled in the art that identified by a screening method as described in U.S. patent use of recombinant DNA technologies can improve control application Ser. No. 10/124,800, supra. The genetic modi of expression of transfected nucleic acid molecules by fication affects the activity of the PKS system in the organ manipulating, for example, the number of copies of the ism. The screening process described in U.S. patent appli nucleic acid molecules within the host cell, the efficiency cation Ser. No. 10/124,800 includes the steps of: (a) with which those nucleic acid molecules are transcribed, the selecting a microorganism that produces at least one PUFA; efficiency with which the resultant transcripts are translated, and, (b) identifying a microorganism from (a) that has an and the efficiency of post-translational modifications. Addi ability to produce increased PUFAs under dissolved oxygen tionally, the promoter sequence might be genetically engi conditions of less than about 5% of saturation in the fer neered to improve the level of expression as compared to the mentation medium, as compared to production of PUFAs by native promoter. Recombinant techniques useful for control the microorganism under dissolved oxygen conditions of ling the expression of nucleic acid molecules include, but are greater than about 5% of saturation, and preferably about not limited to, integration of the nucleic acid molecules into 10%, and more preferably about 15%, and more preferably one or more host cell chromosomes, addition of vector about 20% of saturation in the fermentation medium. stability sequences to plasmids, Substitutions or modifica 0176). In one aspect, such an organism can endogenously tions of transcription control signals (e.g., promoters, opera contain and express a PUFA PKS system, and the genetic tors, enhancers), Substitutions or modifications of transla modification can be a genetic modification of one or more of tional control signals (e.g., ribosome binding sites, Shine the functional domains of the endogenous PUFA PKS sys Dalgarno sequences), modification of nucleic acid tem, whereby the modification has some effect on the molecules to correspond to the codon usage of the host cell, activity of the PUFAPKS system. In another aspect, such an and deletion of sequences that destabilize transcripts. organism can endogenously contain and express a PUFA 0172 General discussion above with regard to recombi PKS system, and the genetic modification can be an intro nant nucleic acid molecules and transfection of host cells is duction of at least one exogenous nucleic acid sequence intended to be applied to any recombinant nucleic acid (e.g., a recombinant nucleic acid molecule), wherein the molecule discussed herein, including those encoding any exogenous nucleic acid sequence encodes at least one bio amino acid sequence having a biological activity of at least logically active domain or protein from a second PKS one domain from a PUFA PKS, those encoding amino acid system and/or a protein that affects the activity of the PUFA sequences from other PKS systems, and those encoding PKS system (e.g., a phosphopantetheinyl transferases (PPTase), discussed below). In yet another aspect, the organ other proteins or domains. ism does not necessarily endogenously (naturally) contain a 0173 Polyunsaturated fatty acids (PUFAs) are essential PUFA PKS system, but is genetically modified to introduce membrane components in higher eukaryotes and the precur at least one recombinant nucleic acid molecule encoding an sors of many lipid-derived signaling molecules. The PUFA amino acid sequence having the biological activity of at least PKS system of the present invention uses pathways for one domain of a PUFA PKS system. In this aspect, PUFA PUFA synthesis that do not require desaturation and elon PKS activity is affected by introducing or increasing PUFA US 2008/0038379 A1 Feb. 14, 2008 29

PKS activity in the organism. Various embodiments associ tis, Labyrinthula macrocystis, Labyrinthula macrocystis ated with each of these aspects will be discussed in greater atlantica, Labyrinthula macrocystis macrocystis, Labyrin detail below. thula magnifica, Labyrinthula minuta, Labyrinthula roscoffensis, Labyrinthula valkanovii, Labyrinthula 0177. It is to be understood that a genetic modification of vitellina, Labyrinthula vitellina pacifica, Labyrinthula a PUFA PKS system or an organism comprising a PUFA vitellina vitellina, Labyrinthula Zopfii; any Labyrinthuloides PKS system can involve the modification of at least one species, including Labyrinthuloides sp., Labyrinthuloides domain of a PUFA PKS system (including a portion of a minuta, Labyrinthuloides schizochytrops; any Labyrinth domain), more than one or several domains of a PUFA PKS omyxa species, including Labyrinthomyxa sp., Labyrinth system (including adjacent domains, non-contiguous omyxa pohlia, Labyrinthomyxa sauvageaui, any Apl.- domains, or domains on different proteins in the PUFA PKS anochytrium species, including Aplanochytrium sp. and system), entire proteins of the PUFA PKS system, and the Aplanochytrium kerguelensis; any Elina species, including entire PUFAPKS system (e.g., all of the proteins encoded by Elina sp., Elina marisalba, Elina Sinorifica; any the PUFA PKS genes). As such, modifications can include a Japanochytrium species, including Japanochytrium sp., Small modification to a single domain of an endogenous Japanochytrium marinum; any Schizochytrium species, PUFA PKS system; to substitution, deletion or addition to including Schizochytrium sp., Schizochytrium aggregatum, one or more domains or proteins of a given PUFA PKS Schizochytrium limacinum, Schizochytrium minutum, system; up to replacement of the entire PUFA PKS system Schizochytrium Octosporum; and any Thraustochytrium spe in an organism with the PUFA PKS system from a different cies, including Thraustochytrium sp., Thraustochytrium organism. One of skill in the art will understand that any aggregatum, Thraustochytrium arudimentale, Thraus genetic modification to a PUFAPKS system is encompassed tochytrium aureum, Thraustochytrium benthicola, Thraus by the invention. tochytrium globosum, Thraustochytrium kinnei, Thraus 0178 As used herein, a genetically modified microorgan tochytrium motivum, Thraustochytrium pachydermum, ism can include a genetically modified bacterium, protist, Thraustochytrium proliferum, Thraustochytrium roseum, microalgae, fungus, or other microbe, and particularly, any Thraustochytrium striatum, Ulkenia sp., Ulkenia minuta, of the genera of the order Thraustochytriales (e.g., a Thraus Ulkenia profinda, Ulkenia radiate, Ulkenia Sarkariana, and tochytrid) described herein (e.g., Schizochytrium, Thraus Ulkenia visurgensis. Particularly preferred species within tochytrium, Japonochytrium, Labyrinthula, Labyrinthu these genera include, but are not limited to: any loides, etc.). Such a genetically modified microorganism has Schizochytrium species, including Schizochytrium aggrega a genome which is modified (i.e., mutated or changed) from tum, Schizochytrium limacinum, Schizochytrium minutum; its normal (i.e., wild-type or naturally occurring) form Such any Thraustochytrium species (including former Ulkenia that the desired result is achieved (i.e., increased or modified species such as U. visurgensis, U. amoeboida, U. Sarkari PUFA PKS activity and/or production of a desired product ana, U. profunda, U. radiata, U. minuta and Ulkenia sp. using the PKS system). Genetic modification of a microor BP-5601), and including Thraustochytrium striatum, ganism can be accomplished using classical strain develop Thraustochytrium aureum, Thraustochytrium roseum; and ment and/or molecular genetic techniques. Such techniques any Japonochytrium species. Particularly preferred strains known in the art and are generally disclosed for microor of Thraustochytriales include, but are not limited to: ganisms, for example, in Sambrook et al., 1989, Molecular Schizochytrium sp. (S31)(ATCC 20888); Schizochytrium sp. Cloning: A Laboratory Manual, Cold Spring Harbor Labs (S8)(ATCC 20889); Schizochytrium sp. (LC-RM)(ATCC Press. The reference Sambrook et al., ibid., is incorporated 18915); Schizochytrium sp. (SR21); Schizochytrium aggre by reference herein in its entirety. A genetically modified gatum (Goldstein et Belsky)(ATCC 28209); Schizochytrium microorganism can include a microorganism in which limacinum (Honda et Yokochi) (IFO 32693); Thraus nucleic acid molecules have been inserted, deleted or modi tochytrium sp. (23B)(ATCC 20891); Thraustochytrium fied (i.e., mutated; e.g., by insertion, deletion, Substitution, striatum (Schneider)(ATCC 24473); Thraustochytrium and/or inversion of nucleotides), in Such a manner that Such aureum (Goldstein)(ATCC 34304); Thraustochytrium modifications provide the desired effect within the microor roseum (Goldstein)(ATCC 28210); and Japonochytrium sp. ganism. (L1)(ATCC 28207). Other examples of suitable host micro organisms for genetic modification include, but are not 0179 Preferred microorganism host cells to modify limited to, yeast including Saccharomyces cerevisiae, Sac according to the present invention include, but are not charomyces Carlsbergensis, or other yeast Such as Candida, limited to, any bacteria, protist, microalga, fungus, or pro Kluyveromyces, or other fungi, for example, filamentous tozoa. In one aspect, preferred microorganisms to geneti fungi such as Aspergillus, Neurospora, Penicillium, etc. cally modify include, but are not limited to, any microor ganism of the order Thraustochytriales, including any Bacterial cells also may be used as hosts. These include, but microorganism in the families Thraustochytriaceae and are not limited to, Escherichia coli, which can be useful in Labyrinthulaceae. Particularly preferred host cells for use in fermentation processes. Alternatively, and only by way of the present invention could include microorganisms from a example, a host Such as a Lactobacillus species or Bacillus genus including, but not limited to: Thraustochytrium, species can be used as a host. Japonochytrium, Aplanochytrium, Elina and Schizochytrium 0180 Another embodiment of the present invention within the Thraustochytriaceae and Labyrinthula, Labyrin relates to a genetically modified plant, wherein the plant has thuloides, and Labyrinthomyxa within the Labyrinthulaceae. been genetically modified to recombinantly express a PKS Preferred species within these genera include, but are not system comprising at least one biologically active domain of limited to: any species within Labyrinthula, including a polyunsaturated fatty acid (PUFA) polyketide synthase Labrinthula sp., Labyrinthula algeriensis, Labyrinthula (PKS) system. The domain of the PUFA PKS system can cienkowski, Labyrinthula chattonii, Labyrinthula Coenocys include any of the domains, including homologues thereof, US 2008/0038379 A1 Feb. 14, 2008 30 for PUFA PKS systems as described above (e.g., for no enzymatic activity or action). Genetic modifications that Schizochytrium and/or Thraustochytrium), and can also result in an increase in gene expression or function can be include any domain of a PUFA PKS system from any referred to as amplification, overproduction, overexpression, non-bacterial microorganism (including any eukaryotic activation, enhancement, addition, or up-regulation of a microorganism and any other Thraustochytrid microorgan gene. ism) or any domain of a PUFA PKS system from a micro 0.184 The genetic modification of a microorganism or organism identified by a screening method as described in plant according to the present invention preferably affects U.S. patent application Ser. No. 10/124,800, supra. The plant the activity of the PKS system expressed by the microor can also be further modified with at least one domain or ganism or plant, whether the PKS system is endogenous and biologically active fragment thereof of another PKS system, genetically modified, endogenous with the introduction of including, but not limited to, bacterial PUFA PKS or PKS recombinant nucleic acid molecules into the organism (with systems, Type IPKS systems, Type II PKS systems, modular the option of modifying the endogenous system or not), or PKS systems, and/or any non-bacterial PUFA PKS system provided completely by recombinant technology. To alter (e.g., eukaryotic, Thraustochytrid, Thraustochytriaceae or the PUFA production profile of a PUFA PKS system or Labyrinthulaceae, Schizochytrium, etc.). organism expressing Such system includes causing any 0181. As used herein, a genetically modified plant can detectable or measurable change in the production of any include any genetically modified plant including higher one or more PUFAs by the host microorganism or plant as plants and particularly, any consumable plants or plants compared to in the absence of the genetic modification (i.e., useful for producing a desired bioactive molecule of the as compared to the unmodified, wild-type microorganism or present invention. Such a genetically modified plant has a plant or the microorganism or plant that is unmodified at genome which is modified (i.e., mutated or changed) from least with respect to PUFA synthesis—i.e., the organism its normal (i.e., wild-type or naturally occurring) form Such might have other modifications not related to PUFA synthe that the desired result is achieved (i.e., increased or modified sis). To affect the activity of a PKS system includes any PUFA PKS activity and/or production of a desired product genetic modification that causes any detectable or measur using the PKS system). Genetic modification of a plant can able change or modification in the PKS system expressed by be accomplished using classical strain development and/or the organism as compared to in the absence of the genetic molecular genetic techniques. Methods for producing a modification. A detectable change or modification in the transgenic plant, wherein a recombinant nucleic acid mol PKS system can include, but is not limited to: a change or ecule encoding a desired amino acid sequence is incorpo modification (introduction of, increase or decrease) of the rated into the genome of the plant, are known in the art. A expression and/or biological activity of any one or more of preferred plant to genetically modify according to the the domains in a modified PUFA PKS system as compared present invention is preferably a plant Suitable for consump to the endogenous PUFA PKS system in the absence of tion by animals, including humans. genetic modification, the introduction of PKS system activ ity into an organism such that the organism now has mea 0182 Preferred plants to genetically modify according to Surable/detectable PKS system activity (i.e., the organism the present invention (i.e., plant host cells) include, but are did not contain a PKS system prior to the genetic modifi not limited to any higher plants, and particularly consumable cation), the introduction into the organism of a functional plants, including crop plants and especially plants used for domain from a different PKS system than a PKS system their oils. Such plants can include, for example: canola, endogenously expressed by the organism Such that the PKS Soybeans, rapeseed, linseed, corn, Safflowers, Sunflowers system activity is modified (e.g., a bacterial PUFA PKS and tobacco. Other preferred plants include those plants that domain or a type I PKS domain is introduced into an are known to produce compounds used as pharmaceutical organism that endogenously expresses a non-bacterial PUFA agents, flavoring agents, neutraceutical agents, functional PKS system), a change in the amount of a bioactive mol food ingredients or cosmetically active agents or plants that ecule (e.g., a PUFA) produced by the PKS system (e.g., the are genetically engineered to produce these compounds/ system produces more (increased amount) or less (decreased agents. amount) of a given product as compared to in the absence of 0183 According to the present invention, a genetically the genetic modification), a change in the type of a bioactive modified microorganism or plant includes a microorganism molecule (e.g., a change in the type of PUFA) produced by or plant that has been modified using recombinant technol the PKS system (e.g., the system produces an additional or ogy or by classical mutagenesis and screening techniques. different PUFA, a new or different product, or a variant of a As used herein, genetic modifications which result in a PUFA or other product that is naturally produced by the decrease in gene expression, in the function of the gene, or system), and/or a change in the ratio of multiple bioactive in the function of the gene product (i.e., the protein encoded molecules produced by the PKS system (e.g., the system by the gene) can be referred to as inactivation (complete or produces a different ratio of one PUFA to another PUFA, partial), deletion, interruption, blockage or down-regulation produces a completely different lipid profile as compared to of a gene. For example, a genetic modification in a gene in the absence of the genetic modification, or places various which results in a decrease in the function of the protein PUFAs in different positions in a triacylglycerol as com encoded by Such gene, can be the result of a complete pared to the natural configuration). Such a genetic modifi deletion of the gene (i.e., the gene does not exist, and cation includes any type of genetic modification and spe therefore the protein does not exist), a mutation in the gene cifically includes modifications made by recombinant which results in incomplete or no translation of the protein technology and by classical mutagenesis. (e.g., the protein is not expressed), or a mutation in the gene 0185. It should be noted that reference to increasing the which decreases or abolishes the natural function of the activity of a functional domain or protein in a PUFA PKS protein (e.g., a protein is expressed which has decreased or system refers to any genetic modification in the organism US 2008/0038379 A1 Feb. 14, 2008 containing the domain or protein (or into which the domain gene with a heterologous sequence (demonstrated in the or protein is to be introduced) which results in increased Examples). Examples of heterologous sequences that could functionality of the domain or protein system and can be introduced into a host genome include sequences encod include higher activity of the domain or protein (e.g., ing at least one functional domain from another PKS system, specific activity or in vivo enzymatic activity), reduced such as a different non-bacterial PUFA PKS system (e.g., inhibition or degradation of the domain or protein system, from a eukaryote, including another Thraustochytrid), a and overexpression of the domain or protein. For example, bacterial PUFA PKS system, a type IPKS system, a type II gene copy number can be increased, expression levels can be PKS system, or a modular PKS system. A heterologous increased by use of a promoter that gives higher levels of sequence can also include an entire PUFA PKS system (e.g., expression than that of the native promoter, or a gene can be all genes associated with the PUFAPKS system) that is used altered by genetic engineering or classical mutagenesis to to replace the entire endogenous PUFAPKS system (e.g., all increase the activity of the domain or protein encoded by the genes of the endogenous PUFA PKS system) in a host. A gene. heterologous sequence can also include a sequence encoding 0186 Similarly, reference to decreasing the activity of a a modified functional domain (a homologue) of a natural functional domain or protein in a PUFA PKS system refers domain from a PUFA PKS system of a host Thraustochytrid to any genetic modification in the organism containing Such (e.g., a nucleic acid sequence encoding a modified domain domain or protein (or into which the domain or protein is to from OrfB of a Schizochytrium, wherein the modified be introduced) which results in decreased functionality of domain will, when used to replace the naturally occurring the domain or protein and includes decreased activity of the domain expressed in the Schizochytrium, alter the PUFA domain or protein, increased inhibition or degradation of the production profile by the Schizochytrium). Other heterolo domain or protein and a reduction or elimination of expres gous sequences to introduce into the genome of a host sion of the domain or protein. For example, the action of includes a sequence encoding a protein or functional domain domain or protein of the present invention can be decreased that is not a domain of a PKS system, but which will affect by blocking or reducing the production of the domain or the activity of the endogenous PKS system. For example, protein, "knocking out the gene or portion thereof encoding one could introduce into the host genome a nucleic acid the domain or protein, reducing domain or protein activity, molecule encoding a phosphopantetheinyl transferase (dis or inhibiting the activity of the domain or protein. Blocking cussed below). Specific modifications that could be made to or reducing the production of a domain or protein can an endogenous PUFA PKS system are discussed in detail include placing the gene encoding the domain or protein herein. under the control of a promoter that requires the presence of 0188 In another aspect of this embodiment of the inven an inducing compound in the growth medium. By establish tion, the genetic modification can include: (1) the introduc ing conditions such that the inducer becomes depleted from tion of a recombinant nucleic acid molecule encoding an the medium, the expression of the gene encoding the domain amino acid sequence having a biological activity of at least or protein (and therefore, of protein synthesis) could be one domain of a PUFA PKS system; and/or (2) the intro turned off. The present inventors demonstrate the ability to duction of a recombinant nucleic acid molecule encoding a delete (knock out) targeted genes in a Thraustochytrid protein or functional domain that affects the activity of a microorganism in the Examples section. Blocking or reduc PUFA PKS system, into a host. The host can include: (1) a ing the activity of domain or protein could also include using host cell that does not express any PKS system, wherein all an excision technology approach similar to that described in functional domains of a PKS system are introduced into the U.S. Pat. No. 4,743,546, incorporated herein by reference. host cell, and wherein at least one functional domain is from To use this approach, the gene encoding the protein of a non-bacterial PUFA PKS system; (2) a host cell that interest is cloned between specific genetic sequences that expresses a PKS system (endogenous or recombinant) hav allow specific, controlled excision of the gene from the ing at least one functional domain of a non-bacterial PUFA genome. Excision could be prompted by, for example, a shift PKS system, wherein the introduced recombinant nucleic in the cultivation temperature of the culture, as in U.S. Pat. acid molecule can encode at least one additional non No. 4,743,546, or by some other physical or nutritional bacterial PUFA PKS domain function or another protein or signal. domain that affects the activity of the host PKS system; and 0187. In one embodiment of the present invention, a (3) a host cell that expresses a PKS system (endogenous or genetic modification includes a modification of a nucleic recombinant) which does not necessarily include a domain acid sequence encoding an amino acid sequence that has a function from a non-bacterial PUFA PKS, and wherein the biological activity of at least one domain of a non-bacterial introduced recombinant nucleic acid molecule includes a PUFAPKS system as described herein (e.g., a domain, more nucleic acid sequence encoding at least one functional than one domain, a protein, or the entire PUFA PKS system, domain of a non-bacterial PUFA PKS system. In other of an endogenous PUFA PKS system of a Thraustochytrid words, the present invention intends to encompass any host). Such a modification can be made to an amino acid genetically modified organism (e.g., microorganism or sequence within an endogenously (naturally) expressed non plant), wherein the organism comprises at least one non bacterial PUFA PKS system, whereby a microorganism that bacterial PUFA PKS domain function (either endogenously naturally contains such a system is genetically modified by, or introduced by recombinant modification), and wherein for example, classical mutagenesis and selection techniques the genetic modification has a measurable effect on the and/or molecular genetic techniques, include genetic engi non-bacterial PUFA PKS domain function or on the PKS neering techniques. Genetic engineering techniques can system when the organism comprises a functional PKS include, for example, using a targeting recombinant vector system. to delete a portion of an endogenous gene (demonstrated in 0189 The present invention encompasses many possible the Examples), or to replace a portion of an endogenous non-bacterial and bacterial microorganisms as either pos US 2008/0038379 A1 Feb. 14, 2008 32 sible host cells for the PUFA PKS systems described herein thula magnifica, Labyrinthula minuta, Labyrinthula and/or as sources for additional genetic material encoding roscoffensis, Labyrinthula valkanovii, Labyrinthula PUFA PKS system proteins and domains for use in the vitellina, Labyrinthula vitellina pacifica, Labyrinthula genetic modifications and methods described herein. For vitellina vitellina, Labyrinthula Zopfii; any Labyrinthuloides example, microbial organisms with a PUFA PKS system species, including Labyrinthuloides sp., Labyrinthuloides similar to that found in Schizochytrium, such as the Thraus minuta, Labyrinthuloides schizochytrops; any Labyrinth tochytrium microorganism discovered by the present inven omyxa species, including Labyrinthomyxa sp., Labyrinth tors and described in Example 1, can be readily identified/ omyxa pohlia, Labyrinthomyxa sauvageaui, any Apl.- isolated/screened by methods to identify other non-bacterial anochytrium species, including Aplanochytrium sp. and microorganisms that have a polyunsaturated fatty acid Aplanochytrium kerguelensis; any Elina species, including (PUFA) polyketide synthase (PKS) system that are Elina sp., Elina marisalba, Elina Sinorifica; any described in detail in U.S. Patent Application Publication Japanochytrium species, including Japanochytrium sp., No. 20020194641, supra (corresponding to U.S. patent Japanochytrium marinum; any Schizochytrium species, application Ser. No. 10/124,800). including Schizochytrium sp., Schizochytrium aggregatum, 0190. Locations for collection of the preferred types of Schizochytrium limacinum, Schizochytrium minutum, microbes for screening for a PUFA PKS system according to Schizochytrium Octosporum; and any Thraustochytrium spe the present invention include any of the following: low cies, including Thraustochytrium sp., Thraustochytrium oxygen environments (or locations near these types of low aggregatum, Thraustochytrium arudimentale, Thraus oxygen environments including in the guts of animals tochytrium aureum, Thraustochytrium benthicola, Thraus including invertebrates that consume microbes or microbe tochytrium globosum, Thraustochytrium kinnei, Thraus containing foods (including types of filter feeding organ tochytrium motivum, Thraustochytrium pachydermum, isms), low or non-oxygen containing aquatic habitats Thraustochytrium proliferum, Thraustochytrium roseum, (including freshwater, Saline and marine), and especially Thraustochytrium striatum, Ulkenia sp., Ulkenia minuta, at-or near-low oxygen environments (regions) in the oceans. Ulkenia profinda, Ulkenia radiate, Ulkenia Sarkariana, and The microbial strains would preferably not be obligate Ulkenia visurgensis. anaerobes but be adapted to live in both aerobic and low or 0.193. It is noted that, without being bound by theory, the anoxic environments. Soil environments containing both present inventors consider Labyrinthula and other Labyrin aerobic and low oxygen or anoxic environments would also thulaceae as sources of PUFA PKS genes because the excellent environments to find these organisms in and espe Labyrinthulaceae are closely related to the Thraustochytri cially in these types of Soil in aquatic habitats or temporary aceae which are known to possess PUFA PKS genes, the aquatic habitats. Labyrinthulaceae are known to be bactivorous/phagocytotic, and some members of the Labyrinthulaceae have fatty 0191) A particularly preferred non-bacterial microbial acid/PUFA profiles consistent with having a PUFA PKS strain to screen for use as a host and/or a source of PUFA system. PKS genes according to the present invention would be a strain (selected from the group consisting of algae, fungi 0194 Strains of microbes (other than the members of the (including yeast), protozoa or protists) that, during a portion Thraustochytrids) capable of bacterivory (especially by of its life cycle, is capable of consuming whole bacterial phagocytosis or endocytosis) can be found in the following cells (bacterivory) by mechanisms such as phagocytosis, microbial classes (including but not limited to example phagotrophic or endocytic capability and/or has a stage of its genera): life cycle in which it exists as an amoeboid stage or naked 0.195. In the algae and algae-like microbes (including protoplast. This method of nutrition would greatly increase Stramenopiles): of the class Euglenophyceae (for example the potential for transfer of a bacterial PKS system into a genera Euglena, and Peranema), the class Chrysophyceae eukaryotic cell if a mistake occurred and the bacterial cell (for example the genus Ochromonas), the class Dinobry (or its DNA) did not get digested and instead are function aceae (for example the genera Dinobryon, Platychrysis, and ally incorporated into the eukaryotic cell. Chrysochromulina), the Dinophyceae (including the genera 0192 Included in the present invention as sources of Crypthecodinium, Gymnodinium, Peridinium, Ceratium, PUFA PKS genes (and proteins and domains encoded Gyrodinium, and Oxyrrhis), the class Cryptophyceae (for thereby) are any Thraustochytrids other than those specifi example the genera Cryptomonas, and Rhodomonas), the cally described herein that contain a PUFA PKS system. class Xanthophyceae (for example the genus Olisthodiscus) Such Thraustochytrids include, but are not limited to, but are (and including forms of algae in which an amoeboid stage not limited to, any microorganism of the order Thraustochy occurs as in the flagellates Rhizochloridaceae, and triales, including any microorganism in the families Thraus Zoospores/gametes of Aphanochaete pascheri, Bumilleria tochytriaceae and Labyrinthulaceae, which further comprise stigeoclonium and Vaucheria geminata), the class Eustig a genus including, but not limited to: Thraustochytrium, matophyceae, and the class Prymnesiopyceae (including the Japonochytrium, Aplanochytrium, Elina and Schizochytrium genera Prymnesium and Diacronema). within the Thraustochytriaceae and Labyrinthula, Labyrin 0196. In the Stramenopiles including the: Protero thuloides, and Labyrinthomyxa within the Labyrinthulaceae. monads, Opalines, Developavella, Diplophorys, Labyrin Preferred species within these genera include, but are not thulids, Thraustochytrids, Bicosecids, Oomycetes, limited to: any species within Labyrinthula, including Hypochytridiomycetes, Commation, Reticulosphaera, Pel Labrinthula sp., Labyrinthula algeriensis, Labyrinthula agomonas, Pelapococcus, Ollicola, Aureococcus, , cienkowski, Labyrinthula chattonii, Labyrinthula Coenocys Raphidiophytes, Synurids, Rhizochromulinaales, Pedinella tis, Labyrinthula macrocystis, Labyrinthula macrocystis les, , Chrysomeridales, Sarcinochrysidales, atlantica, Labyrinthula macrocystis macrocystis, Labyrin , , and . US 2008/0038379 A1 Feb. 14, 2008

0197). In the Fungi. Class Myxomycetes (form myxam isms (e.g., genetically modified Thraustochytrids) and/or oebae)—slime molds, class Acrasieae including the orders novel hybrid constructs encoding PUFA PKS systems for Acrasiceae (for example the genus Sappinia), class Guttuli use in the methods and genetically modified microorganisms naceae (for example the genera Guttulinopsis, and Gut and plants of the present invention. In one aspect, bacteria tulina), class Dicty Steliaceae (for example the genera Acra that may be particularly useful in the embodiments of the sis, Dictyostelium, Polysphondylium, and Coenonia), and present invention have PUFA PKS systems, wherein the class Phycomyceae including the orders Chytridiales, PUFA PKS system is capable of producing PUFAs at Ancylistales, Blastocladiales, Monoblepharidales, Saproleg temperatures exceeding about 20°C., preferably exceeding niales, . Mucorales, and Entomophthorales. about 25°C. and even more preferably exceeding about 30° 0198 In the Protozoa: Protozoa strains with life stages C. As described previously herein, the marine bacteria, capable of bacterivory (including by phageocytosis) can be Shewanella and Vibrio marinus, described in U.S. Pat. No. selected from the types classified as ciliates, flagellates or 6,140,486, do not produce PUFAs at higher temperatures, amoebae. Protozoan ciliates include the groups: Chonot which limits the usefulness of PUFA PKS systems derived richs, Colpodids, Cyrtophores, Haptorids, Karyorelicts, Oli from these bacteria, particularly in plant applications under gohymenophora, Polyhymenophora (spirotrichs), Prostomes field conditions. Therefore, in one embodiment, the screen and Suctoria. Protozoan flagellates include the Biosoecids, ing method of the present invention can be used to identify Bodonids, Cercomonads, Chrysophytes (for example the bacteria that have a PUFAPKS system, wherein the bacteria genera Anthophysa, Chrysanoemba, Chrysosphaerella, are capable of growth and PUFA production at higher Dendromonas, Dinobryon, Mallomonas, Ochromonas, temperatures (e.g., above about 15° C., 20° C., 25°C., or 30° Paraphysomonas, Poteriodchromonas, Spumella, Syn C. or even higher). However, even if the bacteria sources do crypta, Synura, and Uroglena), Collar flagellates, Crypto not grow well and/or produce PUFAs at the higher tempera phytes (for example the genera Chilomonas, Cryptomonas, tures, the present invention encompasses the identification, Cyanomonas, and Goniomonas), Dinoflagellates, isolation and use of the PUFA PKS systems (genes and Diplomonads, Euglenoids, Heterolobosea, Pedinellids, proteins/domains encoded thereby), wherein the PUFA PKS Pelobionts, Phalansteriids, Pseudodendromonads, Spon systems from the bacteria have enzymatic/biological activity gomonads and Volvocales (and other flagellates including at temperatures above about 15° C., 20° C., 25° C., or 30° the unassigned flagellate genera of Artodiscus, Clautriavia, C. or even higher. In one aspect of this embodiment, Helkesimastix, Kathablepharis and Multicilia). Amoeboid inhibitors of eukaryotic growth Such as nystatin (antifungal) protozoans include the groups: Actinophryids, Centrohelids, or cycloheximide (inhibitor of eukaryotic protein synthesis) Desmothoricids, Diplophryids, Eumamoebae, Heterolobo can be added to agar plates used to culture/select initial sea, Leptomyxids, Nucleariid filose amoebae, Pelebionts, strains from water samples/soil samples collected from the Testate amoebae and Vampyrellids (and including the unas types of habitats/niches Such as marine or estuarian habits, signed amoebid genera Gymnophrys, Biomyxa, Microcom or any other habitat where such bacteria can be found. This etes, Reticulomyxa, Belonocystis, Elaeorhanis, Allelogro process would help select for enrichment of bacterial strains mia, Gromia or Lieberkuhnia). The protozoan orders include without (or minimal) contamination of eukaryotic strains. the following: Percolomonadeae, Heterolobosea, Lyromon This selection process, in combination with culturing the adea, Pseudociliata, Trichomonadea, Hypermastigea, Heter plates at elevated temperatures (e.g. 30° C.), and then omiteae, Telonemea, Cyathobodonea, Ebridea, Pyytomyxea, selecting strains that produce at least one PUFA would Opalinea, Kinetomonadea, Hemimastigea, Protostelea, initially identify candidate bacterial strains with a PUFA Myxagastrea, Dictyostelea, Choanomonadea, Apicomon PKS system that is operative at elevated temperatures (as adea, Eogregarinea, Neogregarinea, Coelotrolphea, Eucoc opposed to those bacterial strains in the prior art which only cidea, Haemosporea, Piroplasmea, Spirotrichea, Prosto exhibit PUFA production attemperatures less than about 20° matea, Litostomatea, Phyllopharyngea, Nassophorea, C. and more preferably below about 5° C.). Oligohymenophorea, Colpodea, Karyorelicta, Nucleohelea, 0201 However, even in bacteria that do not grow well (or Centrohelea, Acantharea, Sticholonchea, Polycystinea, at all) at higher temperatures, or that do not produce at least Phaeodarea, Lobosea, Filosea, Athalamea, Monothalamea, one PUFA at higher temperatures, such strains can be Polythalamea, Xenophyophorea, Schizocladea, Holosea, identified and selected as comprising a PUFA PKS system Entamoebea, Myxosporea, Actinomyxea, Halosporea, by the identification of the ability of the bacterium to Paramyxea, Rhombozoa and Orthonectea. produce PUFAs under any conditions and/or by screening the genome of the bacterium for genes that are homologous 0199 A preferred embodiment of the present invention to other known PUFA PKS genes from bacteria or non includes strains of the microorganisms listed above that have bacterial organisms (e.g., see Example 7). To evaluate PUFA been collected from one of the preferred habitats listed PKS function at higher temperatures for genes from any above. bacterial source, one can produce cell-free extracts and test 0200. In some embodiments of this method of the present for PUFA production at various temperatures, followed by invention, PUFA PKS systems from bacteria, including selection of microorganisms that contain PUFA PKS genes genes and portions thereof (encoding entire PUFA PKS that have enzymatic/biological activity at higher tempera systems, proteins thereof and/or domains thereof) can be ture ranges (e.g., 15° C., 20° C., 25° C., or 30° C. or even used to genetically modify other PUFA PKS systems (e.g., higher). any non-bacterial PUFA PKS system) and/or microorgan 0202 Suitable bacteria to use as hosts for genetic modi isms containing the same (or vice versa) in the embodiments fication include any bacterial strain as discussed above. of the invention. In one aspect, novel PUFA PKS systems Particularly suitable bacteria to use as a source of PUFA can be identified in bacteria that are expected to be particu PKS genes (and proteins and domains encoded thereby) for larly useful for creating genetically modified microorgan the production of genetically modified sequences and organ US 2008/0038379 A1 Feb. 14, 2008 34 isms according to the present invention include any bacte produce a variety of bioactive molecules. Although mixing rium that comprises a PUFA PKS system. Such bacteria are and modification of any PKS domains and related genes are typically isolated from marine or estuarian habitats and can contemplated by the present inventors, by way of example, be readily identified by their ability to product PUFAs and/or various possible manipulations of the PUFA-PKS system are by the presence of one or more genes having homology to discussed below with regard to genetic modification and known PUFA PKS genes in the organism. Such bacteria can bioactive molecule production. include, but are not limited to, bacteria of the genera Shewanella and Vibrio. Preferred bacteria for use in the 0205 Accordingly, encompassed by the present inven present invention include those with PUFA PKS systems tion are methods to genetically modify microbial or plant that are biologically active at higher temperatures (e.g., cells by: genetically modifying at least one nucleic acid above about 15° C., 20°C., 25°C., or 30° C. or even higher). sequence in the organism that encodes an amino acid The present inventors have identified two exemplary bacte sequence having the biological activity of at least one ria (e.g. Shewanella Olleyana and Shewanella japonica; see functional domain of a non-bacterial PUFA PKS system Examples 7 and 8) that will be particularly suitable for use according to the present invention, and/or expressing at least as sources of PUFA PKS genes, and others can be readily one recombinant nucleic acid molecule comprising a nucleic identified or are known to comprise PUFA PKS genes and acid sequence encoding Such amino acid sequence. Various may be useful in an embodiment of the present invention embodiments of Such sequences, methods to genetically (e.g., ). modify an organism, and specific modifications have been described in detail above. Typically, the method is used to 0203 Furthermore, it is recognized that not all bacterial produce a particular genetically modified organism that or non-bacterial microorganisms can be readily cultured produces a particular bioactive molecule or molecules. from natural habitats. However, genetic characteristics of Such un-culturable microorganisms can be evaluated by 0206. One embodiment of the present invention relates to isolating genes from DNA prepared en mass from mixed or a genetically modified Thraustochytrid microorganism, crude environmental samples. Particularly suitable to the wherein the microorganism has an endogenous polyunsatu present invention, PUFA PKS genes derived from un-cul rated fatty acid (PUFA) polyketide synthase (PKS) system, turable microorganisms can be isolated from environmental and wherein the endogenous PUFA PKS system has been DNA samples by degenerate PCR using primers designed to genetically modified to alter the expression profile of a generally match regions of high similarity in known PUFA polyunsaturated fatty acid (PUFA) by the microorganism as PKS genes (e.g., see Example 7). Alternatively, whole DNA compared to the Thraustochytrid microorganism in the fragments can be cloned directly from purified environmen absence of the modification. Thraustochytrid microorgan tal DNA by any of several methods known to the art. isms useful as host organisms in the present invention Sequence of the DNA fragments thus obtained can reveal endogenously contain and express a PUFAPKS system. The homologs to known genes such as PUFA PKS genes. genetic modification can be a genetic modification of one or Homologs of OrfB and Orf (referring to the domain more of the functional domains of the endogenous PUFA structure of Schizochytrium and Thraustochytrium, for PKS system, whereby the modification alters the PUFA example) may be particularly useful in defining the PUFA production profile of the endogenous PUFA PKS system. In PKS end product. Whole coding regions of PUFA PKS addition, or as an alternative, the genetic modification can be genes can then be expressed in host organisms (such as an introduction of at least one exogenous nucleic acid Escherichia coli or yeast) in combination with each other or sequence (e.g., a recombinant nucleic acid molecule) to the with known PUFAPKS gene or gene fragment combinations microorganism, wherein the exogenous nucleic acid to evaluate their effect on PUFA production. As described sequence encodes at least one biologically active domain or above, activity in cell-free extracts can be used to determine protein from a second PKS system and/or a protein that function at desired temperatures. Isolated PUFA PKS genes affects the activity of the PUFA PKS system (e.g., a phos can also be transformed directly into appropriate phopantetheinyl transferases (PPTase)). The second PKS Schizochytrium or other suitable strains to measure function. system can be any PKS system, including other PUFA PKS PUFA PKS system-encoding constructs identified or pro systems and including homologues of genes from the duced in Such a manner, including hybrid constructs, can Thraustochytrid PUFA PKS system to be genetically modi also be used to transform other organisms, such as plants. fied. 0204 Therefore, using the non-bacterial PUFA PKS sys 0207. This embodiment of the invention is particularly tems of the present invention, which, for example, makes useful for the production of commercially valuable lipids use of genes from Thraustochytrid PUFA PKS systems, as enriched in a desired PUFA, such as EPA, via the present well as PUFA PKS systems and PKS systems from bacteria, inventors development of genetically modified microorgan gene mixing can be used to extend the range of PUFA isms and methods for efficiently producing lipids (triacylg products to include EPA, DHA, ARA, GLA, SDA and others lyerols (TAG) as well as membrane-associated phospholip (described in detail below), as well as to produce a wide variety of bioactive molecules, including antibiotics, other ids (PL)) enriched in PUFAs. pharmaceutical compounds, and other desirable products. 0208. This particular embodiment of the present inven The method to obtain these bioactive molecules includes not tion is derived in part from the following knowledge: (1) only the mixing of genes from various organisms but also utilization of the inherent TAG production capabilities of various methods of genetically modifying the non-bacterial selected microorganisms, and particularly, of Thraus PUFA PKS genes disclosed herein. Knowledge of the tochytrids. Such as the commercially developed genetic basis and domain structure of the non-bacterial Schizochytrium strain described herein; (2) the present PUFA PKS system of the present invention provides a basis inventors detailed understanding of PUFA PKS biosyn for designing novel genetically modified organisms which thetic pathways (i.e., PUFA PKS systems) in eukaryotes and US 2008/0038379 A1 Feb. 14, 2008 in particular, in members of the order Thraustochytriales: sequenced PUFA PKS genes in at least one other marine and, (3) utilization of a homologous genetic recombination protist (Thraustochytrium strain 23B) (described in detail system in Schizochytrium. Based on the inventors knowl below). edge of the systems involved, the same general approach may be exploited to produce PUFAs other than EPA. 0213) The PUFA products of marine bacteria include EPA (e.g., produced by Shewanella SRC2738 and Photobacter 0209. In one embodiment of the invention, the endog profiindum) as well as DHA (Vibrio marinus, now known as enous Thraustochytrid PUFA PKS genes, such as the Moritella marina) (described in U.S. Pat. No. 6,140,486, Schizochytrium genes encoding PUFA PKS enzymes that supra; and in U.S. Pat. No. 6,566,583, supra). It is an normally produce DHA and DPA, are modified by random embodiment of the invention that any PUFA PKS gene set or targeted mutagenesis, replaced with genes from other could be envisioned to substitute for the Schizochytrium organisms that encode homologous PKS proteins (e.g., from genes described in the example herein, as long as the bacteria or other sources), or replaced with genetically physiological growth requirements of the production organ modified Schizochytrium, Thraustochytrium or other ism (e.g., Schizochytrium) in fermentation conditions were Thraustochytrid PUFA PKS genes. The product of the satisfied. In particular, the PUFA-producing bacterial strains enzymes encoded by these introduced and/or modified genes described above grow only at relatively low temperatures can be EPA, for example, or it could be some other related (typically less than 20°C.) which further indicates that their molecule, including other PUFAs. One feature of this PUFA PKS gene products will not function at standard method is the utilization of endogenous components of growth temperatures for Schizochytrium (25-30°C.). How Thraustochytrid PUFA synthesis and accumulation machin ever, the inventors have recently identified at least two other ery that is essential for efficient production and incorpora marine bacteria that grow and produce EPA at standard tion of the PUFA into PL and TAG. In particular, this growth temperatures for Schizochytrium and other Thraus embodiment of the invention is directed to the modification tochytrids (see Example 7). These alternate marine bacteria of the type of PUFA produced by the organism, while have been shown to possess PUFA-PKS-like genes that will retaining the high oil productivity of the parent strain. serve as material for modification of Schizochytrium and 0210 Although some of the following discussion uses other Thraustochytrids by methods described herein. It will the organism Schizochytrium as an exemplary host organ be apparent to those skilled in the art from this disclosure ism, any Thraustochytrid can be modified according to the that other currently unstudied or unidentified PUFA-produc present invention, including members of the genera Thraus ing bacteria could also contain PUFA PKS genes useful for tochytrium, Labyrinthuloides, and Japonochytrium. For modification of Thraustochytrids. example, the genes encoding the PUFA PKS system for a species of Thraustochytrium have been identified (see 0214 Second, in addition to the genes that encode the Example 6), and this organism can also serve as a host enzymes directly involved in PUFA synthesis, an “acces organism for genetic modification using the methods sory enzyme is required. The gene encodes a phosphopan described herein, although it is more likely that the Thraus tetheline transferase (PPTase) that activates the acyl-carrier tochytrium PKS genes will be used to modify the endog protein (ACP) domains present in the PUFA PKS complex. enous PUFA PKS genes of another Thraustochytrid, such as Activation of the ACP domains by addition of this co-factor Schizochytrium. Furthermore, using methods for screening is required for the PUFA PKS enzyme complex to function. organisms as set forth in U.S. application Ser. No. 10/124, All of the ACP domains of the PUFAPKS systems identified 800, Supra, one can identify other organisms useful in the So far show a high degree of amino acid sequence conser present method and all Such organisms are encompassed vation and, without being bound by theory, the present herein. inventors believe that the PPTase of Schizochytrium and other Thraustochytrids will recognize and activate ACP 0211 This embodiment of the present invention can be domains from other PUFA PKS systems. As proof of prin illustrated as follows. By way of example, based on the ciple that heterologous PPTases and PUFA PKS genes can present inventors current understanding of PUFA synthesis function together to produce a PUFA product, the present and accumulation in Schizochytrium, the overall biochemi inventors demonstrate herein the use of two different heter cal process can be divided into three parts. ologous PPTases with the PUFA PKS genes from Schizochytrium to produce a PUFA in a bacterial host cell. 0212 First, the PUFAs that accumulate in Schizochytrium oil (DHA and DPA) are the product of a 0215. Third, in Schizochytrium, the products of the PUFA PUFA PKS system as discussed above. The PUFA PKS PKS system are efficiently channeled into both the phos system in Schizochytrium converts malonyl-CoA into the pholipids (PL) and triacylglycerols (TAG). The present end product PUFA without release of significant amounts of inventors’ data suggest that the PUFA is transferred from the intermediate compounds. In Schizochytrium, three genes ACP domains of the PKS complex to coenzyme A (CoA). As have been identified (Orfs A, B and C; also represented by in other eukaryotic organisms, this acyl-CoA would then SEQ ID NO:1, SEQ ID NO:3 and SEQ ID NO:5, respec serve as the substrate for the various acyl-transferases that tively) that encode all of the enzymatic domains known to be form the PL and TAG molecules. In contrast, the data required for actual synthesis of PUFAs. Similar sets of genes indicate that in bacteria, transfer to CoA does not occur; (encoding proteins containing homologous sets of enzy rather, there is a direct transfer from the ACP domains of the matic domains) have been cloned and characterized from PKS complex to the acyl-transferases that form PL. The several other non-eukaryotic organisms that produce enzymatic system in Schizochytrium that transfers PUFA PUFAs, namely, several strains of marine bacteria. In addi from ACP to CoA clearly can recognize both DHA and DPA tion, the present inventors have identified and now and therefore, the present inventors believe that it is pre US 2008/0038379 A1 Feb. 14, 2008 36 dictable that any PUFA product of the PUFA PKS system (as Bacillus subtilus (see Example 2). The present inventors attached to the PUFA PKS ACP domains) will serve as a have also demonstrated activation (pantetheinylation) of substrate. ACP domains from Schizochytrium Orf Aby a PPTase called Het I from Nostoc (see Example 2). The HetI enzyme was 0216) Therefore, in one embodiment of the present inven additionally used as the PPTase in the experiments discussed tion, the present inventors propose to alter the genes encod above for the production of DHA and DPA in E. coli using ing the components of the PUFAPKS enzyme complex (part the recombinant Schizochytrium PUFA PKS genes 1) while utilizing the endogenous PPTase from (Example 2). Schizochytrium or another Thraustochytrid host (part 2) and PUFA-ACP to PUFA-CoA transferase activity and TAG/PL 0220) Third, data indicate that DHA-CoA and DPA-CoA synthesis systems (or other endogenous PUFA ACP to may be metabolic intermediates in the Schizochytrium TAG TAG/PL mechanism) (part 3). These methods of the present and PL synthesis pathway. Published biochemical data Sug invention are Supported by experimental data, Some of gest that in bacteria, the newly synthesized PUFAs are transferred directly from the PUFA PKS ACP domains to the which are presented in the Examples section in detail. phospholipid synthesis enzymes. In contrast, the present 0217 First, the present inventors have found that the inventors data indicate that in Schizochytrium, a eukaryotic PUFA PKS system can be transferred between organisms, organism, there may be an intermediate between the PUFA and that some parts are interchangeable. More particularly, on the PUFA PKS ACP domains and the target TAG and PL it has been previously shown that the PUFA PKS pathways molecules. The typical carrier of fatty acids in the eukaryotic of the marine bacteria, Shewanella SCR2738 (Yazawa, cytoplasm is CoA. The inventors examined extracts of 1996, Lipids 31:S297-300) and Vibrio marinus (along with Schizochytrium cells and found significant levels of com the PPTase from Shewanella) (U.S. Pat. No. 6,140.486), can pounds that co-migrated during HPLC fractionation with be successfully transferred to a heterologous host (i.e., to E. authentic standards of DHA-CoA, DPA-CoA, 16:0-CoA and coli). Additionally, the degree of structural homology 18:1-CoA. The identity of the putative DHA-CoA and between the subunits of the PUFA PKS enzymes from these DPA-CoA peaks were confirmed using mass spectroscopy. two organisms (Shewanella SCRC2738 and Vibrio marinus) In contrast, the inventors were not able to detect DHA-CoA is such that it has been possible to mix and match genes from in extracts of Vibrio marinus, again suggesting that a dif the two systems (U.S. Pat. No. 6,140,486, supra). The PUFA ferent mechanism exists in bacteria for transfer of the PUFA end product of the mixed sets of genes varied depending on to its final target (e.g., direct transfer to PL). The data the origins of the specific gene homologues. At least one indicate a mechanism likely exists in Schizochytrium for open reading frame (Shewanella's Orf 7 and its Vibrio transfer of the newly synthesized PUFA to CoA (probably marinus homologue; see FIG. 13 of U.S. Pat. No. 6,140,486; via a direct transfer from the ACP to CoA). Both TAG and note that the nomenclature for this Orf has changed; it is PL synthesis enzymes could then access this PUFA-CoA. labeled as Orf 8 in the patent, but was submitted to Genbank The observation that both DHA and DPA CoA are produced as Orf7, and is now referred to by its GenBank designation) Suggests that the enzymatic transfer machinery may recog could be associated with determination of whether DHA or nize a range of PUFAs. EPA would be the product of the composite system. The 0221 Fourth, the present inventors have now created functional domains of all of the PUFA PKS enzymes iden knockouts of OrfA, Orf B, and Orf C in Schizochytrium (see tified so far show sequence homology to one another. Example 3). The knockout strategy relies on the homologous Similarly, these data indicated that PUFA PKS systems, recombination that has been demonstrated to occur in including those from the marine bacteria, can be transferred Schizochytrium (see U.S. patent application Ser. No. 10/124, to, and will function in, Schizochytrium and other Thraus 807, Supra). Several strategies can be employed in the design tochytrids. of knockout constructs. The specific strategy used to inac 0218. The present inventors have now expressed the tivate these three genes utilized insertion of a ZeocinTM PUFA PKS genes (Orfs A, B and C) from Schizochytrium in resistance gene coupled to a tubulin promoter (derived from an E. coli host and have demonstrated that the cells made pMON50000, see U.S. patent application Ser. No. 10/124, DHA and DPA in about the same ratio as the endogenous 807) into a cloned portion of the Orf. The new construct production of these PUFAs in Schizochytrium (see Example containing the interrupted coding region was then used for 2). Therefore, it has been demonstrated that the recombinant the transformation of wild type Schizochytrium cells via Schizochytrium PUFAPKS genes encode a functional PUFA particle bombardment (see U.S. patent application Ser. No. synthesis system. Additionally, all or portions of the Thraus 10/124.807). Bombarded cells were spread on plates con tochytrium 23B OrfA and Orf genes have been shown to taining both ZeocinTM and a supply of PUFA (see below). function in Schizochytrium (see Example 6). Colonies that grew on these plates were then streaked onto ZeocinTM plates that were not supplemented with PUFAs. 0219 Second, the present inventors have previously Those colonies that required PUFA supplementation for found that PPTases can activate heterologous PUFA PKS growth were candidates for having had the PUFA PKS Orf ACP domains. Production of DHA in E. coli transformed inactivated via homologous recombination. In all three with the PUFA PKS genes from Vibrio marinus occurred cases, this presumption was confirmed by rescuing the only when an appropriate PPTase gene (in this case, from knockout by transforming the cells with a full-length Shewanella SCRC2738) was also present (see U.S. Pat. No. genomic DNA clones of the respective Schizochytrium Orfs. 6,140,486, Supra). This demonstrated that the Shewanella Furthermore, in some cases, it was found that the Zeocin TM PPTase was able to activate the Vibrio PUFA PKS ACP resistance gene had been removed (see Example 5), indi domains. Additionally, the present inventors have now dem cating that the introduced functional gene had integrated into onstrated the activation (pantetheinylation) of ACP domains the original site by double homologous recombination (i.e. from Schizochytrium Orf A using a PPTase (sfp) from deleting the resistance marker). One key to the Success of US 2008/0038379 A1 Feb. 14, 2008 37 this strategy was Supplementation of the growth medium 0224. The present invention can make use of genes and with PUFAs. In the present case, an effective means of nucleic acid sequences which encode proteins or domains supplementation was found to be sequestration of the PUFA from PKS systems other than the PUFA PKS system by mixing with partially methylated beta-cyclodextrin prior described herein and in U.S. patent application Ser. No. to adding to the growth medium (see Example 5). Together, 10/124.800, and include genes and nucleic acid sequences these experiments demonstrate the principle that one of skill from bacterial and non-bacterial PKS systems, including in the art, given the guidance provided herein, can inactivate PKS systems of Type II, Type I and modular, described one or more of the PUFA PKS genes in a PUFA PKS above. Organisms which express each of these types of PKS containing microorganism Such as Schizochytrium, and cre systems are known in the art and can serve as sources for ate a PUFA auxotroph which can then be used for further nucleic acids useful in the genetic modification process of genetic modification (e.g., by introducing other PKS genes) the present invention. according to the present invention (e.g., to alter the fatty acid 0225. In a preferred embodiment, genes and nucleic acid profile of the recombinant organism). sequences which encode proteins or domains from PKS 0222 One important element of the genetic modification systems other than the PUFA PKS system or from other of the organisms of the present invention is the ability to PUFA PKS systems are isolated or derived from organisms directly transform a Thraustochytridgenome. In U.S. appli which have preferred growth characteristics for production cation Ser. No. 10/124.807, supra, transformation of of PUFAs. In particular, it is desirable to be able to culture Schizochytrium via single crossover homologous recombi the genetically modified Thraustochytrid microorganism at nation and targeted gene replacement via double crossover temperatures greater than about 15° C., greater than 20° C. homologous recombination were demonstrated. As dis greater than 25°C., greater than 30°C., greater than 35°C., cussed above, the present inventors have now used this greater than 40°C., or in one embodiment, at any tempera technique for homologous recombination to inactivate Orf ture between about 20° C. and 40° C. Therefore, PKS A, Orf B and Orf of the PUFA-PKA system in proteins or domains having functional enzymatic activity at Schizochytrium. The resulting mutants are dependent on these temperatures are preferred. For example, the present supplementation of the media with PUFA. Several markers inventors describe herein the use of PKS genes from of transformation, promoter elements for high level expres Shewanella Olleyana or Shewanella japonica, which are sion of introduced genes and methods for delivery of exog marine bacteria that naturally produce EPA and grow at enous genetic material have been developed and are avail temperatures up to 30° C. and 35° C. respectively (see able. Therefore, the tools are in place for knocking out Example 7). PKS proteins or domains from these organisms endogenous PUFAPKS genes in Thraustochytrids and other are examples of proteins and domains that can be mixed with eukaryotes having similar PUFA PKS systems and replacing Thraustochytrid PUFA PKS proteins and domains as them with genes from other organisms (or with modified described herein to produce a genetically modified organism Schizochytrium genes) as proposed above. that has a specifically designed or modified PUFA produc 0223) In one approach for production of EPA-rich TAG, tion profile. the PUFA PKS system of Schizochytrium can be altered by 0226. In another preferred embodiment, the genes and the addition of heterologous genes encoding a PUFA PKS nucleic acid sequences that encode proteins or domains from system whose product is EPA. It is anticipated that the a PUFA PKS system that produces one fatty acid profile are endogenous PPTase will activate the ACP domains of that used to modify another PUFA PKS system and thereby alter heterologous PUFA PKS system. Additionally, it is antici the fatty acid profile of the host. For example, Thraus pated that the EPA will be converted to EPA-CoA and will tochytrium 23B (ATCC 20892) is significantly different readily be incorporated into Schizochytrium TAG and PL from Schizochytrium sp. (ATCC 20888) in its fatty acid membranes. In one modification of this approach, tech profile. Thraustochytrium 23B can have DHA: DPA(n-6) niques can be used to modify the relevant domains of the ratios as high as 40:1 compared to only 2-3:1 in endogenous Schizochytrium system (either by introduction Schizochytrium (ATCC 20888). Thraustochytrium 23B can of specific regions of heterologous genes or by mutagenesis also have higher levels of C20:5(n-3). However, of the Schizochytrium genes themselves) such that its end Schizochytrium (ATCC 20888) is an excellent oil producer product is EPA rather than DHA and DPA. This is an as compared to Thraustochytrium 23B. Schizochytrium exemplary approach, as this technology can be applied to the accumulates large quantities of triacylglycerols rich in DHA production of other PUFA end products and to any eukary and docosapentaenoic acid (DPA; 22:5c)6); e.g., 30% DHA+ otic microorganism that comprises a PUFA PKS system and DPA by dry weight. Therefore, the present inventors that has the ability to efficiently channel the products of the describe herein the modification of the Schizochytrium PUFA PKS system into both the phospholipids (PL) and endogenous PUFA PKS system with Thraustochytrium 23B triacylglycerols (TAG). In particular, the invention is appli PUFA PKS genes to create a genetically modified cable to any Thraustochytrid microorganism or any other Schizochytrium with a DHA: DPA profile more similar to eukaryote that has an endogenous PUFAPKS system, which Thraustochytrium 23B (i.e., a “super-DHA-producer is described in detail below by way of example. In addition, Schizochytrium, wherein the production capabilities of the the invention is applicable to any Suitable host organism, Schizochytrium combine with the DHA: DPA ratio of into which the modified genetic material for production of Thraustochytrium). various PUFA profiles as described herein can be trans formed. For example, in the Examples, the PUFA PKS 0227. Therefore, the present invention makes use of system from Schizochytrium is transformed into an E. coli. genes from Thraustochytrid PUFA PKS systems, and further Such a transformed organism could then be further modified utilizes gene mixing to extend and/or alter the range of to alter the PUFA production profile using the methods PUFA products to include EPA, DHA, DPA, ARA, GLA, described herein. SDA and others. The method to obtain these altered PUFA US 2008/0038379 A1 Feb. 14, 2008 production profiles includes not only the mixing of genes gene set. The fatty acids of the resulting transformants can from various organisms into the Thrasustochytrid PUFA then be analyzed for alterations in profiles to identify the PKS genes, but also various methods of genetically modi transformants producing EPA and/or ARA. fying the endogenous Thraustochytrid PUFA PKS genes disclosed herein. Knowledge of the genetic basis and 0230 By way of example, in this aspect of the invention, domain structure of the Thraustochytrid PUFA PKS system one could construct a clone with the CLF of OrfB replaced of the present invention (e.g., described in detail for with a CLF from a C20 PUFA-PKS system. A marker gene Schizochytrium above) provides a basis for designing novel could be inserted downstream of the coding region. More genetically modified organisms which produce a variety of specifically, one can use the homologous recombination PUFA profiles. Novel PUFA PKS constructs prepared in system for transformation of Thraustochytrids as described microorganisms such as a Thraustochytrid can be isolated herein and in detail in U.S. patent application Ser. No. and used to transform plants to impart similar PUFA pro 10/124.807, Supra. One can then transform the wild type duction properties onto the plants. Thraustochytrid cells (e.g., Schizochytrium cells), select for 0228) Any one or more of the endogenous Thraus the marker phenotype, and then screen for those that had tochytrid PUFA PKS domains can be altered or replaced incorporated the new CLF. Again, one would analyze these according to the present invention, provided that the modi transformants for any effects on fatty acid profiles to identify fication produces the desired result (i.e., alteration of the transformants producing EPA and/or ARA. If some factor PUFA production profile of the microorganism). Particularly other than those associated with the CLF is found to influ preferred domains to alter or replace include, but are not ence the chain length of the end product, a similar strategy limited to, any of the domains corresponding to the domains could be employed to alter those factors. in Schizochytrium OrfB or Orf (B-ketoacyl-ACP synthase (KS), acyltransferase (AT), FabA-like B-hydroxy acyl-ACP 0231. In another aspect of the invention, modification or dehydrase (DH), chain length factor (CLF), enoyl ACP substitution of the B-hydroxyacyl-ACP dehydrase/keto syn reductase (ER), an enzyme that catalyzes the synthesis of thase pairs is contemplated. During cis-vaccenic acid trans-2-acyl-ACP, an enzyme that catalyzes the reversible (C18:1, A11) synthesis in E. coli, creation of the cis double isomerization of trans-2-acyl-ACP to cis-3-acyl-ACP, and bond is believed to depend on a specific DH enzyme, an enzyme that catalyzes the elongation of cis-3-acyl-ACP B-hydroxy acyl-ACP dehydrase, the product of the fabA to cis-5-f3-keto-acyl-ACP). In one embodiment, preferred gene. This enzyme removes HOH from a B-keto acyl-ACP domains to alter or replace include, but are not limited to, and leaves a trans double bond in the carbon chain. A subset B-ketoacyl-ACP synthase (KS), FabA-like B-hydroxy acyl of DH's, FabA-like, possess cis-trans isomerase activity ACP dehydrase (DH), and chain length factor (CLF). (Heath et al., 1996, Supra). A novel aspect of bacterial and 0229. In one aspect of the invention, Thraustochytrid non-bacterial PUFA-PKS systems is the presence of two PUFA-PKS PUFA production is altered by modifying the FabA-like DH domains. Without being bound by theory, the CLF (chain length factor) domain. This domain is charac present inventors believe that one or both of these DH teristic of Type II (dissociated enzymes) PKS systems. Its domains will possess cis-trans isomerase activity (manipu amino acid sequence shows homology to KS (keto synthase lation of the DH domains is discussed in greater detail pairs) domains, but it lacks the active site cysteine. CLF may below). function to determine the number of elongation cycles, and hence the chain length, of the end product. In this embodi 0232 Another aspect of the unsaturated fatty acid syn ment of the invention, using the current state of knowledge thesis in E. coli is the requirement for a particular KS of FAS and PKS synthesis, a rational strategy for production enzyme. B-ketoacyl-ACP synthase, the product of the fabB of ARA by directed modification of the non-bacterial PUFA gene. This is the enzyme that carries out condensation of a PKS system is provided. There is controversy in the litera fatty acid, linked to a cysteine residue at the active site (by ture concerning the function of the CLF in PKS systems a thio-ester bond), with a malonyl-ACP. In the multi-step (Bisang et al., Nature 401, 502 (1999); Yi et al., J. Am. reaction, CO is released and the linear chain is extended by Chem. Soc. 125, 12708 (2003)) and it is realized that other two carbons. It is believed that only this KS can extend a domains may be involved in determination of the chain carbon chain that contains a double bond. This extension length of the end product. However, it is significant that occurs only when the double bond is in the cis configuration; Schizochytrium produces both DHA (C22:6, ()-3) and DPA if it is in the trans configuration, the double bond is reduced (C22:5, co-6). In the PUFA-PKS system the cis double bonds by enoyl-ACP reductase (ER) prior to elongation (Heath et are introduced during synthesis of the growing carbon chain. al., 1996, supra). All of the PUFA-PKS systems character Since placement of the co-3 and co-6 double bonds occurs ized so far have two KS domains, one of which shows early in the synthesis of the molecules, one would not expect greater homology to the FabB-like KS of E. coli than the that they would affect Subsequent end-product chain length other. Again, without being bound by theory, the present determination. Thus, without being bound by theory, the inventors believe that in PUFA-PKS systems, the specifici present inventors believe that introduction of a factor (e.g. ties and interactions of the DH (FabA-like) and KS (FabB CLF) that directs synthesis of C20 units (instead of C22 like) enzymatic domains determine the number and place units) into the Schizochytrium PUFA-PKS system will result ment of cis double bonds in the end products. Because the in the production of EPA (C20:5, co-3) and ARA (C20:4, number of 2-carbon elongation reactions is greater than the (O-6). For example, in heterologous systems, one could number of double bonds present in the PUFA-PKS end exploit the CLF by directly substituting a CLF from an EPA products, it can be determined that in some extension cycles producing system (such as one from Photobacterium, or complete reduction occurs. Thus the DH and KS domains preferably from a microorganism with the preferred growth can be used as targets for alteration of the DHA/DPA ratio requirements as described below) into the Schizochytrium or ratios of other long chain fatty acids. These can be US 2008/0038379 A1 Feb. 14, 2008 39 modified and/or evaluated by introduction of homologous example is that production of a wide range of unsaturated domains from other systems or by mutagenesis of these gene fatty acid will suffice (even unsaturated fatty acid substitutes fragments. such as branched chain fatty acids). Therefore, in another preferred embodiment of the invention, one could create a 0233. In another embodiment, the ER (enoyl-ACP reduc large number of mutations in one or more of the PUFAPKS tase—an enzyme which reduces the trans-double bond in the genes disclosed herein, and then transform the appropriately fatty acyl-ACP resulting in fully saturated carbons) domains modified FabB-strain (e.g. create mutations in an expression can be modified or substituted to change the type of product construct containing an ER domain and transform a FabB made by the PKS system. For example, the present inventors strain having the other essential domains on a separate know that Schizochytrium PUFA-PKS system differs from plasmid-or integrated into the chromosome) and select the previously described bacterial systems in that it has two only for those transformants that grow without Supplemen (rather than one) ER domains. Without being bound by tation of the medium (i.e., that still possessed an ability to theory, the present inventors believe these ER domains can strongly influence the resulting PKS production product. produce a molecule that could complement the FabB-de The resulting PKS product could be changed by separately fect). knocking out the individual domains or by modifying their 0237. One test system for genetic modification of a PUFA nucleotide sequence or by substitution of ER domains from PKS is exemplified in the Examples section. Briefly, a host other organisms. microorganism Such as E. coli is transformed with genes encoding a PUFA PKS system including all or a portion of 0234. In another aspect of the invention, substitution of a Thraustochytrid PUFA PKS system (e.g., Orfs A, B and C one of the DH (FabA-like) domains of the PUFA-PKS of Schizochytrium) and a gene encoding a phosphopanteth system for a DH domain that does not posses isomerization einyl transferases (PPTase), which is required for the attach activity is contemplated, potentially creating a molecule ment of a phosphopantetheline cofactor to produce the with a mix of cis- and trans-double bonds. The current active, holo-ACP in the PKS system. The genes encoding products of the Schizochytrium PUFA PKS system are DHA the PKS system can be genetically engineered to introduce and DPA (C22:5 (p6). If one manipulated the system to one or more modifications to the Thraustochytrid PUFAPKS produce C20 fatty acids, one would expect the products to genes and/or to introduce nucleic acids encoding domains be EPA and ARA (C20:4 (06). This could provide a new from other PKS systems into the Thraustochytrid genes source for ARA. One could also substitute domains from (including genes from non-Thraustochytrid microorganisms related PUFA-PKS systems that produced a different DHA and genes from different Thraustochytrid microorganisms). to DPA ratio—for example by using genes from Thraus The PUFA PKS system can be expressed in the E. coli and tochytrium 23B (the PUFA PKS system of which is identi the PUFA production profile measured. In this manner, fied in U.S. patent application Ser. No. 10/124,800, supra). potential genetic modifications can be evaluated prior to 0235 Additionally, in one embodiment, one of the ER manipulation of the Thraustochytrid PUFA production domains is altered in the Thraustochytrid PUFAPKS system organism. (e.g. by removing or inactivating) to alter the end product 0238. The present invention includes the manipulation of profile. Similar strategies could be attempted in a directed endogenous nucleic acid molecules and/or the use of iso manner for each of the distinct domains of the PUFA-PKS lated nucleic acid molecules comprising a nucleic acid proteins using more or less Sophisticated approaches. Of sequence from a Thraustochytrid PUFA PKS system or a course one would not be limited to the manipulation of homologue thereof. In one aspect, the present invention single domains. Finally, one could extend the approach by relates to the modification and/or use of a nucleic acid mixing domains from the PUFA-PKS system and other PKS molecule comprising a nucleic acid sequence encoding a or FAS systems (e.g., type I, type II, modular) to create an domain from a PUFA PKS system having a biological entire range of new PUFA end products. activity of at least one of the following proteins: malonyl CoA:ACP acyltransferase (MAT), B-keto acyl-ACP syn 0236. It is recognized that many genetic alterations, thase (KS), ketoreductase (KR), acyltransferase (AT), FabA either random or directed, which one may introduce into a like B-hydroxy acyl-ACP dehydrase (DH), native (endogenous, natural) PKS system, will result in an phosphopantetheline transferase, chain length factor (CLF), inactivation of enzymatic functions. Therefore, in order to acyl carrier protein (ACP), enoyl ACP-reductase (ER), an test for the effects of genetic manipulation of a Thraus enzyme that catalyzes the synthesis of trans-2-acyl-ACP, an tochytrid PUFA PKS system in a controlled environment, enzyme that catalyzes the reversible isomerization of trans one could first use a recombinant system in another host, 2-acyl-ACP to cis-3-acyl-ACP, and/or an enzyme that cata Such as E. coli, to manipulate various aspects of the system lyzes the elongation of cis-3-acyl-ACP to cis-5-f-keto-acyl and evaluate the results. For example, the FabB-strain of E. coli is incapable of synthesizing unsaturated fatty acids and ACP. Preferred domains to modify in order to alter the PUFA requires Supplementation of the medium with fatty acids that production profile of a host Thraustochytrid have been can Substitute for its normal unsaturated fatty acids in order discussed previously herein. to grow (see Metz et al., 2001, Supra). However, this 0239). The genetic modification of a Thraustochytrid requirement (for Supplementation of the medium) can be microorganism according to the present invention preferably removed when the strain is transformed with a functional affects the type, amounts, and/or activity of the PUFAs PUFA-PKS system (i.e. one that produces a PUFA product produced by the microorganism, whether the endogenous in the E. coli host see (Metz et al., 2001, supra, FIG. 2A). PUFA PKS system is genetically modified and/or whether The transformed FabB-strain now requires a functional recombinant nucleic acid molecules are introduced into the PUFA-PKS system (to produce the unsaturated fatty acids) organism. According to the present invention, to affect an for growth without supplementation. The key element in this activity of a PUFA PKS system, such as to affect the PUFA US 2008/0038379 A1 Feb. 14, 2008 40 production profile, includes any genetic modification in the tum, Schizochytrium limacinum, Schizochytrium minutum; PUFAPKS system or genes that interact with the PUFAPKS any Thraustochytrium species (including former Ulkenia system that causes any detectable or measurable change or species such as U. visurgensis, U. amoeboida, U. Sarkari modification in any biological activity the PUFA PKS sys ana, U. profunda, U. radiata, U. minuta and Ulkenia sp. tem expressed by the organism as compared to in the BP-5601), and including Thraustochytrium striatum, absence of the genetic modification. According to the Thraustochytrium aureum, Thraustochytrium roseum; and present invention, the phrases “PUFA profile”, “PUFA any Japonochytrium species. Particularly preferred strains expression profile' and “PUFA production profile' can be of Thraustochytriales include, but are not limited to: used interchangeably and describe the overall profile of Schizochytrium sp. (S31)(ATCC 20888); Schizochytrium sp. PUFAs expressed/produced by a microorganism. The PUFA (S8)(ATCC 20889); Schizochytrium sp. (LC-RM)(ATCC expression profile can include the types of PUFAs expressed 18915); Schizochytrium sp. (SR21); Schizochytrium aggre by the microorganism, as well as the absolute and relative gatum (Goldstein et Belsky)(ATCC 28209); Schizochytrium amounts of the PUFAs produced. Therefore, a PUFA profile limacinum (Honda et Yokochi) (IFO 32693); Thraus can be described in terms of the ratios of PUFAs to one tochytrium sp. (23B)(ATCC 20891); Thraustochytrium another as produced by the microorganism, in terms of the striatum (Schneider)(ATCC 24473); Thraustochytrium types of PUFAs produced by the microorganism, and/or in aureum (Goldstein)(ATCC 34304); Thraustochytrium terms of the types and absolute or relative amounts of roseum (Goldstein)(ATCC 28210); and Japonochytrium sp. PUFAs produced by the microorganism. (L1)(ATCC 28207). 0240. As discussed above, while the host microorganism 0241. In one embodiment of the present invention, it is can include any eukaryotic microorganism with an endog contemplated that a mutagenesis program could be com enous PUFA PKS system and the ability to efficiently bined with a selective screening process to obtain a Thraus channel the products of the PUFA PKS system into both the tochytrid microorganism with the PUFA production profile phospholipids (PL) and triacylglycerols (TAG), the pre of interest. The mutagenesis methods could include, but are ferred host microorganism is any member of the order not limited to: chemical mutagenesis, gene shuffling, Switch Thraustochytriales, including the families Thraustochytri ing regions of the genes encoding specific enzymatic aceae and Labyrinthulaceae. Particularly preferred host cells domains, or mutagenesis restricted to specific regions of for use in the present invention could include microorgan those genes, as well as other methods. isms from a genus including, but not limited to: Thraus tochytrium, Japonochytrium, Aplanochytrium, Elina, and 0242 For example, high throughput mutagenesis meth Schizochytrium within the Thraustochytriaceae, and Laby ods could be used to influence or optimize production of the rinthula, Labyrinthuloides, and Labyrinthomyxa within the desired PUFA profile. Once an effective model system has Labyrinthulaceae. Preferred species within these genera been developed, one could modify these genes in a high include, but are not limited to: any species within Labyrin throughput manner. Utilization of these technologies can be thula, including Labrinthula sp., Labyrinthula algeriensis, envisioned on two levels. First, if a sufficiently selective Labyrinthula cienkowski, Labyrinthula chattonii, Labyrin screen for production of a product of interest (e.g., EPA) can thula coenocystis, Labyrinthula macrocystis, Labyrinthula be devised, it could be used to attempt to alter the system to macrocystis atlantica, Labyrinthula macrocystis macrocys produce this product (e.g., in lieu of or in concert with, other tis, Labyrinthula magnifica, Labyrinthula minuta, Labyrin strategies such as those discussed above). Additionally, if the thula roscoffensis, Labyrinthula valkanovii, Labyrinthula strategies outlined above resulted in a set of genes that did vitellina, Labyrinthula vitellina pacifica, Labyrinthula produce the PUFA profile of interest, the high throughput vitellina vitellina, Labyrinthula Zopfii; any Labyrinthuloides technologies could then be used to optimize the system. For species, including Labyrinthuloides sp., Labyrinthuloides example, if the introduced domain only functioned at rela minuta, Labyrinthuloides schizochytrops; any Labyrinth tively low temperatures, selection methods could be devised omyxa species, including Labyrinthomyxa sp., Labyrinth to permit removing that limitation. omyxa pohlia, Labyrinthomyxa sauvageaui, any Apl.- 0243 In one embodiment of the present invention, a anochytrium species, including Aplanochytrium sp. and genetically modified Thraustochytrid microorganism has an Aplanochytrium kerguelensis; any Elina species, including enhanced ability to synthesize desired PUFAs and/or has a Elina sp., Elina marisalba, Elina Sinorifica; any newly introduced ability to synthesize a different profile of Japanochytrium species, including Japanochytrium sp., PUFAs. According to the present invention, “an enhanced Japanochytrium marinum; any Schizochytrium species, ability to synthesize' a product refers to any enhancement, including Schizochytrium sp., Schizochytrium aggregatum, or up-regulation, in a pathway related to the synthesis of the Schizochytrium limacinum, Schizochytrium minutum, product such that the microorganism produces an increased Schizochytrium Octosporum; and any Thraustochytrium spe amount of the product (including any production of a cies, including Thraustochytrium sp., Thraustochytrium product where there was none before) as compared to the aggregatum, Thraustochytrium arudimentale, Thraus wild-type microorganism, cultured or grown, under the same tochytrium aureum, Thraustochytrium benthicola, Thraus tochytrium globosum, Thraustochytrium kinnei, Thraus conditions. Methods to produce Such genetically modified tochytrium motivum, Thraustochytrium pachydermum, organisms have been described in detail above. Thraustochytrium proliferum, Thraustochytrium roseum, 0244 As described above, in one embodiment of the Thraustochytrium striatum, Ulkenia sp., Ulkenia minuta, present invention, a genetically modified microorganism or Ulkenia profinda, Ulkenia radiate, Ulkenia Sarkariana, and plant includes a microorganism or plant which has an Ulkenia visurgensis. Particularly preferred species within enhanced ability to synthesize desired bioactive molecules these genera include, but are not limited to: any (products) or which has a newly introduced ability to Schizochytrium species, including Schizochytrium aggrega synthesize specific products (e.g., to synthesize a specific US 2008/0038379 A1 Feb. 14, 2008

antibiotic). According to the present invention, “an enhanced example, the fermentation medium can be filtered or cen ability to synthesize' a product refers to any enhancement, trifuged to remove microorganisms, cell debris and other or up-regulation, in a pathway related to the synthesis of the particulate matter, and the product can be recovered from the product such that the microorganism or plant produces an cell-free Supernatant by conventional methods, such as, for increased amount of the product (including any production example, ion exchange, chromatography, extraction, Solvent of a product where there was none before) as compared to extraction, phase separation, membrane separation, elec the wild-type microorganism or plant, cultured or grown, trodialysis, reverse osmosis, distillation, chemical derivati under the same conditions. Methods to produce Such geneti Zation and crystallization. Alternatively, microorganisms cally modified organisms have been described in detail producing the PUFA(s), or extracts and various fractions above. thereof, can be used without removal of the microorganism components from the product. 0245 One embodiment of the present invention is a method to produce desired bioactive molecules (also 0249 Preferably, a genetically modified Thraustochytrid referred to as products or compounds) by growing or cul microorganism of the invention produces one or more poly turing a genetically modified microorganism or plant of the unsaturated fatty acids including, but not limited to, EPA present invention (described in detail above). Such a method (C20:5, co-3), DHA (C22:6, co-3), DPA (C22:5, (D-6), ARA includes the step of culturing in a fermentation medium or (C20:4. (O-6), GLA (C18:3, n-6), and SDA (C18:4, n-3)). In growing in a Suitable environment, Such as soil, a microor one preferred embodiment, a Schizochytrium that, in wild ganism or plant, respectively, that has a genetic modification type form, produces high levels of DHA and DPA, is as described previously herein and in accordance with the genetically modified according to the invention to produce present invention. Preferred host cells for genetic modifica high levels of EPA. As discussed above, one advantage of tion related to the PUFA PKS system of the invention are using genetically modified Thraustochytrid microorganisms described above. to produce PUFAs is that the PUFAs are directly incorpo rated into both the phospholipids (PL) and triacylglycerides 0246. One embodiment of the present invention is a (TAG). method to produce desired PUFAs by culturing a genetically 0250 Preferably, PUFAs are produced in an amount that modified Thraustochytrid microorganism of the present is greater than about 5% of the dry weight of the microor invention (described in detail above). Such a method ganism, and in one aspect, in an amount that is greater than includes the step of culturing in a fermentation medium and 6%, and in another aspect, in an amount that is greater than under conditions effective to produce the PUFA(s) a Thraus 7%, and in another aspect, in an amount that is greater than tochytrid microorganism that has a genetic modification as 8%, and in another aspect, in an amount that is greater than described previously herein and in accordance with the 9%, and in another aspect, in an amount that is greater than present invention. An appropriate, or effective, medium 10%, and so on in whole integer percentages, up to greater refers to any medium in which a genetically modified microorganism of the present invention, including Thraus than 90% dry weight of the microorganism (e.g., 15%, 20%, tochytrids and other microorganisms, when cultured, is 30%, 40%, 50%, and any percentage in between). capable of producing the desired PUFA product(s). Such a 0251. In the method for production of desired bioactive medium is typically an aqueous medium comprising assimi compounds of the present invention, a genetically modified lable carbon, nitrogen and phosphate sources. Such a plant is cultured in a fermentation medium or grown in a medium can also include appropriate salts, minerals, metals Suitable medium Such as soil. An appropriate, or effective, and other nutrients. Any microorganisms of the present fermentation medium has been discussed in detail above. A invention can be cultured in conventional fermentation Suitable growth medium for higher plants includes any bioreactors. The microorganisms can be cultured by any growth medium for plants, including, but not limited to, Soil, fermentation process which includes, but is not limited to, sand, any other particulate media that Support root growth batch, fed-batch, cell recycle, and continuous fermentation. (e.g. Vermiculite, perlite, etc.) or hydroponic culture, as well Preferred growth conditions for Thraustochytrid microor as Suitable light, water and nutritional Supplements which ganisms according to the present invention are well known optimize the growth of the higher plant. The genetically in the art and are described in detail, for example, in U.S. modified plants of the present invention are engineered to Pat. No. 5,130,242, U.S. Pat. No. 5,340,742, and U.S. Pat. produce significant quantities of the desired product through No. 5,698.244, each of which is incorporated herein by the activity of the PKS system that is genetically modified reference in its entirety. according to the present invention. The compounds can be recovered through purification processes which extract the 0247. In one embodiment, the genetically modified compounds from the plant. In a preferred embodiment, the microorganism is cultured at a temperature of greater than compound is recovered by harvesting the plant. In this about 15°C., and in another embodiment, greater than about embodiment, the plant can be consumed in its natural state 20°C., and in another embodiment, greater than about 25° or further processed into consumable products. C., and in another embodiment, greater than about 30° C. 0252) Many genetic modifications useful for producing and in another embodiment, greater than about 35° C., and bioactive molecules will be apparent to those of skill in the in another embodiment, greater than about 40°C., and in one art, given the present disclosure, and various other modifi embodiment, at any temperature between about 20° C. and cations have been discussed previously herein. The present 40° C. invention contemplates any genetic modification related to a 0248. The desired PUFA(s) and/or other bioactive mol PUFA PKS system as described herein which results in the ecules produced by the genetically modified microorganism production of a desired bioactive molecule. can be recovered from the fermentation medium using 0253 Bioactive molecules, according to the present conventional separation and purification techniques. For invention, include any molecules (compounds, products, US 2008/0038379 A1 Feb. 14, 2008 42 etc.) that have a biological activity, and that can be produced anti-Heliobactor pylori drug, a drug for treatment of neu by a PKS system that comprises at least one amino acid rodegenerative disease, a drug for treatment of degenerative sequence having a biological activity of at least one func liver disease, an antibiotic, and a cholesterol lowering for tional domain of a non-bacterial PUFA PKS system as mulation. In one embodiment, the endproduct is used to treat described herein. Such bioactive molecules can include, but a condition selected from the group consisting of chronic are not limited to: a polyunsaturated fatty acid (PUFA), an inflammation, acute inflammation, gastrointestinal disorder, anti-inflammatory formulation, a chemotherapeutic agent, cancer, cachexia, cardiac restenosis, neurodegenerative dis an active excipient, an osteoporosis drug, an anti-depressant, order, degenerative disorder of the liver, blood lipid disorder, an anti-convulsant, an anti-Heliobactor pylori drug, a drug osteoporosis, osteoarthritis, autoimmune disease, preec for treatment of neurodegenerative disease, a drug for treat lampsia, preterm birth, age related maculopathy, pulmonary ment of degenerative liver disease, an antibiotic, and a disorder, and peroxisomal disorder. cholesterol lowering formulation. One advantage of the 0257 Suitable food products include, but are not limited non-bacterial PUFA PKS system of the present invention is to, fine bakery wares, bread and rolls, breakfast cereals, the ability of such a system to introduce carbon-carbon processed and unprocessed cheese, condiments (ketchup, double bonds in the cis configuration, and molecules includ mayonnaise, etc.), dairy products (milk, yogurt), puddings ing a double bond at every third carbon. This ability can be and gelatin desserts, carbonated drinks, teas, powdered utilized to produce a variety of compounds. beverage mixes, processed fish products, fruit-based drinks, 0254 Preferably, bioactive compounds of interest are chewing gum, hard confectionery, frozen dairy products, produced by the genetically modified microorganism in an processed meat products, nut and nut-based spreads, pasta, amount that is greater than about 0.05%, and preferably processed poultry products, gravies and sauces, potato chips greater than about 0.1%, and more preferably greater than and other chips or crisps, chocolate and other confectionery, about 0.25%, and more preferably greater than about 0.5%, Soups and Soup mixes, soya based products (milks, drinks, and more preferably greater than about 0.75%, and more creams, whiteners), vegetable oil-based spreads, and Veg preferably greater than about 1%, and more preferably etable-based drinks. greater than about 2.5%, and more preferably greater than about 5%, and more preferably greater than about 10%, and 0258 Yet another embodiment of the present invention more preferably greater than about 15%, and even more relates to a method to produce a humanized animal milk. preferably greater than about 20% of the dry weight of the This method includes the steps of genetically modifying microorganism. For lipid compounds, preferably, such com milk-producing cells of a milk-producing animal with at pounds are produced in an amount that is greater than about least one recombinant nucleic acid molecule comprising a 5% of the dry weight of the microorganism. For other nucleic acid sequence encoding at least one biologically bioactive compounds, such as antibiotics or compounds that active domain of a PUFA PKS system as described herein. are synthesized in Smaller amounts, those strains possessing 0259 Methods to genetically modify a host cell and to Such compounds at of the dry weight of the microorganism produce a genetically modified non-human, milk-producing are identified as predictably containing a novel PKS system animal, are known in the art. Examples of host animals to of the type described above. In some embodiments, particu modify include cattle, sheep, pigs, goats, yaks, etc., which lar bioactive molecules (compounds) are secreted by the are amenable to genetic manipulation and cloning for rapid microorganism, rather than accumulating. Therefore, Such expansion of a transgene expressing population. For ani bioactive molecules are generally recovered from the culture mals, PKS-like transgenes can be adapted for expression in medium and the concentration of molecule produced will target organelles, tissues and body fluids through modifica vary depending on the microorganism and the size of the tion of the gene regulatory regions. Of particular interest is culture. the production of PUFAs in the breast milk of the host 0255 One embodiment of the present invention relates to animal. a method to modify an endproduct containing at least one 0260 The following examples are provided for the pur fatty acid, comprising adding to the endproduct an oil pose of illustration and are not intended to limit the scope of produced by a recombinant host cell that expresses at least the present invention. one recombinant nucleic acid molecule comprising a nucleic acid sequence encoding at least one biologically active EXAMPLES domain of a PUFA PKS system. The PUFA PKS system includes any suitable bacterial or non-bacterial PUFA PKS Example 1 system described herein, including the PUFA PKS systems from Thraustochytrium and Schizochytrium, or any PUFA 0261) The following example, from U.S. patent applica PKS system from bacteria that normally (i.e., under normal tion Ser. No. 10/124,800, describes the use of the screening or natural conditions) are capable of growing and producing process of the present invention to identify other non PUFAs at temperatures above 22°C., such as Shewanella bacterial organisms comprising a PUFAPKS system accord Olleyana or Shewanella japonica. ing to the present invention. 0256 Preferably, the endproduct is selected from the 0262 Thraustochytrium sp. 23B (ATCC 20892) was cul group consisting of a food, a dietary Supplement, a pharma tured as described in detail herein. ceutical formulation, a humanized animal milk, and an 0263. A frozen vial of Thraustochytrium sp. 23B (ATCC infant formula. Suitable pharmaceutical formulations 20892) was used to inoculate a 250 mL shake flask contain include, but are not limited to, an anti-inflammatory formu ing 50 mL of RCA medium. The culture was shaken on a lation, a chemotherapeutic agent, an active excipient, an shaker table (200 rpm) for 72 hr at 25° C. RCA medium osteoporosis drug, an anti-depressant, an anti-convulsant, an contains the following: US 2008/0038379 A1 Feb. 14, 2008 43

ment in the Thraustochytrium 23B genomic DNA. The Orf probe detected an approximately 7.5 kb ClaI fragment and an approximately 4.6 kb KpnI fragment in the Thraus RCA Medium tochytrium 23B genomic DNA. Deionized water 1000 mL. Reef Crystals (R) sea salts 40 g/L 0269 Finally, a recombinant genomic library, consisting Glucose 20 g/L of DNA fragments from Thraustochytrium 23B genomic Monosodium glutamate (MSG) 20 g/L DNA inserted into vector lambda FIX II (Stratagene), was Yeast extract 1 g/L screened using digoxigenin labeled probes corresponding to PII metals* 5 mLL Vitamin mix 1 mL.L the following segments of Schizochytrium 20888 PUFA pH 7.0 PKS genes: nucleotides 7385-7879 of OrfA(SEQID NO:1), nucleotides 5012-5511 of Orf B (SEQ ID NO:3), and *PII metal mix and vitamin mix are same as those outlined in U.S. Pat. nucleotides 76-549 of Orf C (SEQ ID NO:5). Each of these No. 5,130,742, incorporated herein by reference in its entirety. probes detected positive plaques from the Thraustochytrium 23B library, indicating extensive homology between the 0264. 25 mL of the 72 hr old culture was then used to Schizochytrium PUFA-PKS genes and the genes of Thraus inoculate another 250 mL shake flask containing 50 mL of tochytrium 23B. low nitrogen RCA medium (10 g/L MSG instead of 20 g/L) and the other 25 mL of culture was used to inoculate a 250 0270. These results demonstrate that Thraustochytrium mL shake flask containing 175 mL of low-nitrogen RCA 23B genomic DNA contains sequences that are homologous medium. The two flasks were then placed on a shaker table to PKS genes from Schizochytrium 20888. (200 rpm) for 72 hr at 25°C. The cells were then harvested via centrifugation and dried by lyophilization. The dried Example 2 cells were analyzed for fat content and fatty acid profile and 0271 The following example demonstrates that content using standard gas chromatograph procedures. Schizochytrium Orfs A, B and C encode a functional DHA/ 0265. The screening results for Thraustochytrium 23B DPA synthesis enzyme via functional expression in E. coli. under low oxygen conditions relative to high oxygen con ditions were as follows: 0272 General Preparation of E. coli Transformants 0273. The three genes encoding the Schizochytrium PUFA PKS system that produces DHA and DPA in Schizochytrium (Orfs A, B & C; SEQ ID NO:1, SEQ ID Did DHA as % FAME increase? Yes (38 -> 44%) NO:3 and SEQ ID NO:5, respectively) were cloned into a C14: O + C16: O + C16: 1 greater than Yes (44%) about 40% TFA2 single E. coli expression vector (derived from plT21c No C18:3(n - 3) or C18:3(n - 6)? Yes (0%) (Novagen)). The genes are transcribed as a single message Did fat content increase? Yes (2-fold increase) (by the T7 RNA-polymerase), and a ribosome-binding site Did DHA (or other HUFA content increase)? Yes (2.3-fold increase) cloned in front of each of the genes initiates translation. Modification of the Orf B coding sequence was needed to obtain production of a full-length Orf B protein in E. coli 0266 The results, especially the significant increase in (see below). An accessory gene, encoding a PPTase (see DHA content (as % FAME) under low oxygen conditions, below) was cloned into a second plasmid (derived from conditions, strongly indicates the presence of a PUFA pro pACYC184, New England Biolabs). ducing PKS system in this strain of Thraustochytrium. 0274) OrfB 0267 In order to provide additional data confirming the presence of a PUFAPKS system, a Southern blot of Thraus 0275. The Orf B gene is predicted to encode a protein tochytrium 23B was conducted using PKS probes from with a mass of ~224 kDa. Initial attempts at expression of Schizochytrium strain 20888, a strain which has already the gene in E. coli resulted in accumulation of a protein with been determined to contain a PUFA producing PKS system an apparent molecular mass of ~165 kDa (as judged by (i.e., SEQ ID Nos: 1-32 described above). Fragments of comparison to proteins of known mass during SDS-PAGE). Thraustochytrium 23B genomic DNA which are homolo Examination of the Orf B nucleotide sequence revealed a gous to hybridization probes from PKS PUFA synthesis region containing 15 sequential serine codons—all of them genes were detected using the Southern blot technique. being the TCT codon. The genetic code contains 6 different Thraustochytrium 23B genomic DNA was digested with serine codons, and three of these are used frequently in E. either Clal or KpnI restriction endonucleases, separated by coli. The present inventors used four overlapping oligo agarose gel electrophoresis (0.7% agarose, in standard tris nucleotides in combination with a polymerase chain reaction acetate-EDTA buffer), and blotted to a Schleicher & Schuell protocol to resynthesize a small portion of the OrfB gene (a Nytran Supercharge membrane by capillary transfer. Two ~195 base pair, BspHI to SacII restriction enzyme fragment) digoxigenin labeled hybridization probes were used—one that contained the serine codon repeat region. In the Syn specific for the enoyl-ACP reductase (ER) region of thetic Orf B fragment, a random mixture of the 3 serine codons commonly used by E. coli was used, and some other Schizochytrium PKS OrfB (nucleotides 5012-5511 of OrfB: potentially problematic codons were changed as well (i.e., SEQID NO:3), and the other specific for a conserved region other codons rarely used by E. coli). The BspHI to SacII at the beginning of Schizochytrium PKS Orf C (nucleotides fragment present in the original Orf B was replaced by the 76-549 of Orf'; SEQ ID NO:5). resynthesized fragment (to yield Orf B*) and the modified 0268. The OrfB-ER probe detected an approximately 13 gene was cloned into the relevant expression vectors. The kb Clal fragment and an approximately 3.6 kb KpnI frag modified OrfB still encodes the amino acid sequence of US 2008/0038379 A1 Feb. 14, 2008 44

SEQ ID NO:4. Expression of the modified Orf B* clone in enzyme site in this region is Ndel (CATATG: SEQ ID E. coli resulted in the appearance of a ~224 kDa protein, NO:79). The ATG sequence embedded in this site is utilized indicating that the full-length product of OrfB was pro as the initiation methionine codon for introduced genes. The duced. The sequence of the resynthesized Orf B* BspHI to additional restriction sites (BglLL, NotI, SmaI, PmelI, Hin SacII fragment is shown in SEQ ID NO:80. Referring to dIII, Spel and XhoI) were included to facilitate the cloning SEQ ID NO:80, the nucleotide sequence of the resynthe process. The functionality of this expression vector cassette sized BspHI to SacII region of Orf B is shown. The BspHI was tested by using PCR to generate a version of sfp with a restriction site and the SacII restriction site are identified. Ndel site at the 5' end and an XhoI site at the 3' end. This The BspHI site starts at nucleotide 4415 of the Orf B CDS fragment was cloned into the expression cassette and trans (SEQ ID NO:3) (note: there are a total of three BspHI sites ferred into E. coli along with the OrfA, B and C expression in the Orf B CDS, while the SacII site is unique). The vector. Under appropriate conditions, these cells accumu sequence of the unmodified Orf B CDS is given in GenBank lated DHA, demonstrating that a functional sfp had been Accession number AF378328 and in SEQ ID NO:3. produced. 0276 PPTase 0279. To the present inventors knowledge, Het I has not been tested previously in a heterologous situation. Het I is 0277. The ACP domains of the Orf A protein (SEQ ID present in a cluster of genes in Nostoc known to be respon NO:2 in Schizochytrium) must be activated by addition of sible for the synthesis of long chain hydroxy-fatty acids that phosphopantetheline group in order to function. The are a component of a glyco-lipid layer present in heterocysts enzymes that catalyze this general type of reaction are called of that organism. The present inventors, without being phosphopantetheine transferases (PPTases). E. coli contains bound by theory, believe that Het I activates the ACP two endogenous PPTases, but it was anticipated that they domains of a protein, Hgl E. present in that cluster. The two would not recognize the Orf A ACP domains from ACP domains of Hgl E have a high degree of sequence Schizochytrium. This was confirmed by expressing Orfs A, homology to the ACP domains found in Schizochytrium Orf B* (see above) and C in E. coli without an additional A. The endogenous start codon of Het I has not been PPTase. In this transformant, no DHA production was identified (there is no methionine present in the putative detected. The inventors tested two heterologous PPTases in protein). There are several potential alternative start codons the E. coli PUFA PKS expression system: (1) sfp (derived (e.g., TTG and ATT) near the 5' end of the open reading from Bacillus subtilis) and (2) Het I (from the cyanobacte frame. The sequence of the region of Nostoc DNA encoding rium Nostoc strain 7120). the HetI gene is shown in SEQ ID NO:81. SEQ ID NO:82 0278. The sfp PPTase has been well characterized and is represents the amino acid sequence encoded by SEQ ID widely used due to its ability to recognize a broad range of NO:81. Referring to SEQ ID NO:81, limit to the upstream Substrates. Based on published sequence information coding region indicated by the inframe nonsense triplet (Nakana, et al., 1992, Molecular and General Genetics 232: (TAA) at positions 1-3 of SEQID NO: 81 and ends with the 313-321), an expression vector for sfp was built by cloning stop codon (TGA) at positions 715–717 of SEQ ID NO:81. the coding region, along with defined up- and downstream No methionine codons (ATG) are present in the sequence. flanking DNA sequences, into a pACYC-184 cloning vector. Potential alternative initiation codons are: 3 TTG codons The oligonucleotides: (positions 4-6, 7-9 and 49-51 of SEQ ID NO:81), ATT (positions 76-78 of SEQ ID NO:81) and GTG (positions 235-237 of SEQ ID NO:81). A Het I expression construct CGGGGTACCCGGGAGCCGCCTTGGCTTTGT (forward; was made by using PCR to replace the furthest 5' potential SEO ID NO: 73) alternative start codon (TTG) with a methionine codon and (ATG, as part of the above described Ndel restriction AAACTGCAGCCCGGGTCCAGCTGGCAGGCACCCTG (reverse; enzyme recognition site), and introducing an XhoI site at the SEO ID NO: 74) 3' end of the coding sequence. The modified HetI coding sequence was then inserted into the Ndel and XhoI sites of were used to amplify the region of interest from genomic B. the paCYC184 vector construct containing the sfp regula subtilus DNA. Convenient restriction enzyme sites were tory elements. Expression of this Het I construct in E. coli included in the oligonucleotides to facilitate cloning in an resulted in the appearance of a new protein of the size intermediate, high copy number vector and finally into the expected from the sequence data. Co-expression of Het I EcoRV site of p ACYC184 to create the plasmid: pBR301. with Schizochytrium Orfs A, B, C in E. coli under several Examination of extracts of E. coli transformed with this conditions resulted in the accumulation of DHA and DPA in plasmid revealed the presence of a novel protein with the those cells. In all of the experiments in which sfp and Het I mobility expected for sfp. Co-expression of the sfp construct were compared, more DHA and DPA accumulated in the in cells expressing the Orf A, B, C proteins, under certain cells containing the Het I construct than in cells containing conditions, resulted in DHA production. This experiment the Sfp construct. demonstrated that s?p was able to activate the Schizochytrium Orf AACP domains. In addition, the regu 0280 Production of DHA and DPA in E. coli Transfor latory elements associated with the Sfp gene were used to mants create an expression cassette into which other genes could 0281. The two plasmids encoding: (1) the Schizochytrium be inserted. Specifically, the Sfp coding region (along with PUFA PKS genes (Orfs A, B* and C) and (2) the PPTase three nucleotides immediately upstream of the ATG) in (from sfp or from Het I) were transformed into E. coli strain pBR301 was replaced with a 53 base pair section of DNA BL21 which contains an inducible T7 RNA polymerase designed so that it contains several unique (for this con gene. Synthesis of the Schizochytrium proteins was induced struct) restriction enzyme sites. The initial restriction by addition of IPTG to the medium, while PPTase expres US 2008/0038379 A1 Feb. 14, 2008

sion was controlled by a separate regulatory element (see outs. Northern blot analysis of RNA extracted from several above). Cells were grown under various defined conditions of these mutants confirmed that a full-length OrfA message and using either of the two heterologous PPTase genes. The was not produced in these mutants. cells were harvested and the fatty acids were converted to methyl-esters (FAME) and analyzed using gas-liquid chro 0287. These experiments demonstrate that a matography. Schizochytrium gene (e.g., Orf A) can be inactivated via homologous recombination, that inactivation of Orf A 0282 Under several conditions, DHA and DPA were results in a lethal phenotype, and that those mutants can be detected in E. coli cells expressing the Schizochytrium PUFA rescued by supplementation of the media with PUFA. PKS genes, plus either of the two heterologous PPTases. No DHA or DPA was detected in FAMEs prepared from control 0288 Similar sets of experiments directed to the inacti cells (i.e., cells transformed with a plasmid lacking one of vation of Schizochytrium Orf B (SEQ ID NO:3) and Orf C the Orfs). The ratio of DHA to DPA observed in E. coli (SEQ ID NO:5) have yielded similar results. That is, Orf B approximates that of the endogenous DHA and DPA pro and Orf C can be individually inactivated by homologous duction observed in Schizochytrium. The highest level of recombination and those cells require PUFA supplementa PUFA (DHA plus DPA), representing ~17% of the total tion for growth. FAME, was found in cells grown at 32° C. in 765 medium Example 4 (recipe available from the American Type Culture Collec 0289. The following example shows that PUFA auxotro tion) supplemented with 10% (by weight) glycerol. Note phs can be maintained on medium Supplemented with EPA, that PUFA accumulation was also observed when cells were demonstrating that EPA can substitute for DHA in grown in Luria Broth supplemented with 5 or 10% glycerol, Schizochytrium. and when grown at 20° C. Selection for the presence of the respective plasmids was maintained by inclusion of the 0290. As indicated in Example 3, Schizochytrium cells in appropriate antibiotics during the growth and IPTG (to a which the PUFAPKS complex has been inactivated required final concentration of 0.5 mM) was used to induce expres supplementation with PUFA to survive. Aside from demon sion of Orfs A B* and C. strating that Schizochytrium is dependent on the products of this system for growth, this experimental system permits the 0283 FIG. 4 shows an example chromatogram from testing of various fatty acids for their ability to rescue the gas-liquid chromatographic analysis of FAMEs derived mutants. It was discovered that the mutant cells (in which from control cells and from cells expressing the any of the three genes have been inactivated) grew as well Schizochytrium PUFAPKS genes plus a PPTase (in this case on media supplemented with EPA as they did on media Het I). Identity of the labeled FAMEs has been confirmed supplemented with DHA. This result indicates that, if the using mass spectroscopy. endogenous PUFAPKS complex which produces DHA were replaced with one whose product was EPA, the cells would Example 3 be viable. Additionally, these mutant cells could be rescued 0284. The following example shows demonstrates that by supplementation with either ARA or GLA, demonstrating genes encoding the Schizochytrium PUFA PKS enzyme the feasibility of producing genetically modified complex can be selectively inactivated (knocked out), and Schizochytrium that produce these products. It is noted that that it is a lethal phenotype unless the medium is Supple a preferred method for supplementation with PUFAs mented with polyunsaturated fatty acids. involves combining the free fatty acids with partially methy lated beta-cyclodextrin prior to addition of the PUFAs to the 0285 Homologous recombination has been demon medium. strated in Schizochytrium (see copending U.S. patent appli cation Ser. No. 10/124.807, incorporated herein by reference Example 5 in its entirety). A plasmid designed to inactivate 0291. The following example shows that inactivated Schizochytrium Orf A (SEQ ID NO: 1) was made by insert PUFA genes can be replaced at the same site with active ing a ZeocinTM resistance marker into the Sma I site of a forms of the genes in order to restore PUFA synthesis. clone containing the Orf A coding sequence. The ZeocinTM 0292 Double homologous recombination at the acetolac resistance marker was obtained from the plasmid tate synthase gene site has been demonstrated in pMON50000 expression of the ZeocinTM resistance gene Schizochytrium (see U.S. patent application Ser. No. 10/124, is driven by a Schizochytrium derived tubulin promoter 807, Supra). The present inventors tested this concept for element (see U.S. patent application Ser. No. 10/124,807, replacement of the Schizochytrium PUFA PKS genes by ibid.). The knock-out construct thus consists of 5'Schizochytrium Orf A coding sequence, the tub-ZeocinTM transformation of a Schizochytrium Orf A knockout strain resistance element and 3'Schizochytrium Orf A coding (described in Example 2) with a full-length Schizochytrium sequence, all cloned into pBluescript II SK (+) vector Orf A genomic clone. The transformants were selected by (Stratagene). their ability to grow on media without supplemental PUFAs. These PUFA prototrophs were then tested for resistance to 0286 The plasmid was introduced into Schizochytrium Zeocin TM and several were found that were sensitive to the cells by particle bombardment and transformants were antibiotic. These results indicate that the introduced selected on plates containing Zeocin TM and Supplemented Schizochytrium Orf A has replaced the Zeocin TM resistance with polyunsaturated fatty acids (PUFA) (see Example 4). gene in the knockout Strain via double homologous recom Colonies that grew on the ZeocinTM plus PUFA plates were bination. This experiment demonstrates the proof of concept tested for ability to grow on plates without the PUFA for gene replacement within the PUFA PKS genes. Similar Supplementation and several were found that required the experiments for Schizochytrium Orf Band Orf C knock-outs PUFA. These PUFA auxotrophs are putative Orf A knock have given identical results. US 2008/0038379 A1 Feb. 14, 2008 46

Example 6 0293. This example shows that all or some portions of the prRZ23 Thraustochytrium 23B PUFA PKS genes can function in GGYATGMTGRTTGGTGAAGG (forward; SEQ ID NO: 69) Schizochytrium. prRZ24 0294 As described in U.S. patent application Ser. No. TRTTSASRTAYTGYGAACCTTG (reverse; SEQ ID NO: 70 10/124,800 (supra), the DHA-producing protist Thraus tochytrium 23B (Th. 23B) has been shown to contain orfA, 0299 Primers for the DH region; based on the following orfB, and orfc homologs. Complete genomic clones of the published sequences: Shewanella sp. GA-22; Shewanella sp. three Th. 23B genes were used to transform the SCRC-2738; Photobacter profindum, Moritella marina: Schizochytrium strain containing the cognate orf "knock out”. Direct selection for complemented transformants was prRZ28 carried out in the absence of PUFA supplementation. By this ATGKCNGAAGGTTGTGGCCA (forward; SEQ ID NO: 71.) method, it was shown that the Th. 23B orfA and orfc genes could complement the Schizochytrium orfA and orfc knock prRZ29 out strains, respectively, to PUFA prototrophy. Comple CCWGARATRAAGCCRTTDGGTTG (reverse; SEQ ID NO: 72) mented transformants were found that either retained or lost ZeocinTM resistance (the marker inserted into the The PCR conditions (with bacterial chromosomal DNA as Schizochytrium genes thereby defining the knock-outs). The templates) were as follows: ZeocinTM-resistant complemented transformants are likely 0300 Reaction Mixture: to have arisen by a single cross-over integration of the entire Thraustochytrium gene into the Schizochytrium genome 0301 02 uM dNTPs outside of the respective orf region. This result Suggests that the entire Thraustochytrium gene is functioning in 0302) 0.1 uM each primer Schizochytrium. The ZeocinTM-sensitive complemented 0303) 8% DMSO transformants are likely to have arisen by double cross-over events in which portions (or conceivably all) of the Thraus 0304 250 ng chromosomal DNA tochytrium genes functionally replaced the cognate regions 2.5 U Herculase R DNA polymerase (Stratagene) of the Schizochytrium genes that had contained the disrup 0305 tive ZeocinTM resistance marker. This result suggests that a 0306 1x Herculase(R) buffer fraction of the Thraustochytrium gene is functioning in Schizochytrium. 0307 50 uL total volume 0308) PCR Protocol: (1) 98° C. for 3 min.; (2) 98° C. for Example 7 40 sec.; (3) 56° C. for 30 sec.; (4) 72° C. for 90 sec.; (5) 0295) The following example shows that certain EPA Repeat steps 2-4 for 29 cycles; (6) 72° C. for 10 min. (7) producing bacteria contain PUFA PKS-like genes that Hold at 6° C. appear to be suitable for modification of Schizochytrium. 0309 For both primer pairs, PCR gave distinct products with expected sizes using chromosomal DNA templates 0296 Two EPA-producing marine bacterial strains of the from either Shewanella Olleyana or Shewanella japonica. genus Shewanella have been shown to grow at temperatures The four respective PCR products were cloned into pCR typical of Schizochytrium fermentations and to possess BLUNT II-TOPO (Invitrogen) and insert sequences were PUFA PKS-like genes. Shewanella Olleyana (Australian determined using the M13 forward and reverse primers. In Collection of Antarctic Microorganisms (ACAM) strain all cases, the DNA sequences thus obtained were highly number 644; Skerratt et al., Int. J. Syst. Evol. Microbiol. 52, homologous to known bacterial PUFA PKS gene regions. 2101 (2002)) produces EPA and grows up to 30° C. 0310. The DNA sequences obtained from the bacterial Shewanella japonica (American Type Culture Collection PCR products were compared with known sequences and (ATCC) strain number BAA-316; Ivanova et al., Int. J. Syst. with PUFA PKS genes from Schizochytrium ATCC 20888 in Evol. Microbiol. 51, 1027 (2001)) produces EPA and grows a standard Blastx search (BLAST parameters: Low Com up to 35° C. plexity filter. On: Matrix: BLOSUM62: Word Size: 3; Gap 0297 To identify and isolate the PUFA-PKS genes from Costs: Existance 11, Extension 1 (BLAST described in Alts these bacterial strains, degenerate PCR primer pairs for the chul, S. F., Madden, T. L., Schäffer, A. A., Zhang, J., Zhang, KS-MAT region of bacterial orf5/pfaA genes and the DH Z., Miller, W. & Lipman, D.J. (1997) “Gapped BLAST and DH region of bacterial orf7/pfaC genes were designed based PSI-BLAST: a new generation of protein database search on published gene sequences for Shewanella SCRC-2738, programs. Nucleic Acids Res. 25:3389-3402, incorporated Shewanella Oneidensis MR-1; Shewanella sp. GA-22; Pho herein by reference in its entirety)). tobacter profiindum, and Moritella marina (see discussion 0311. At the amino acid level, the sequences with the above). Specifically, the primers and PCR conditions were greatest degree of homology to the Shewanella Olleyana designed as follows: ACAM644 ketoacyl synthase/acyl transferase (KS-AT) deduced amino acid sequence encoded by SEQ ID NO:76 0298 Primers for the KS/AT region; based on the fol were: Photobacter profindum pfaA (identity=70%; posi lowing published sequences: Shewanella sp. SCRC-2738: tives=81%); Shewanella Oneidensis MR-1 “multi-domain Shewanella Oneidensis MR-1; Photobacter profiundum, B-ketoacyl synthase” (identity=66%; positives=77%); and Moritella marina: Moritella marina ORF8 (identity=56%; positives=71%). US 2008/0038379 A1 Feb. 14, 2008 47

The Schizochytrium sp. ATCC20888 orfA was 41% identical Example 8 and 56% positive to the deduced amino acid sequence 0316. This example demonstrates how the bacterial encoded by SEQ ID NO:76. PUFA PKS gene fragments described in Example 7 can be 0312. At the amino acid level, the sequences with the used to modify PUFA production in Schizochytrium. greatest degree of homology to the Shewanella japonica 0317 All presently-known examples of PUFA PKS ATCC BAA-316 ketoacyl synthase/acyl transferase (KS genes from bacteria exist as four closely linked genes that AT) deduced amino acid sequence encoded by SEQ ID contain the same domains as in the three-gene NO:78 were: Shewanella Oneidensis MR-1 “multi-domain Schizochytrium set. It is anticipated that the PUFA PKS B-ketoacyl synthase” (identity=67%; positives=79%); genes from Shewanella Olleyana and Shewanella japonica Shewanella sp. SCRC-2738 orf5 (identity=69%; positives= will likewise be found in this tightly clustered arrangement. 77%); and Moritella marina ORF8 (identity=56%; posi The homologous regions identified in Example 7 are used to isolate the PUFA PKS gene clusters from clone banks of Sh. tives=70%). The Schizochytrium sp. ATCC20888 orfA was Olleyana and Sh. japonica DNAS. Clone banks can be 41% identical and 55% positive to the deduced amino acid constructed in bacteriophage lambda vectors, cosmid vec sequence encoded by SEQ ID NO:78. tors, bacterial artificial chromosome (“BAC) vectors, or by 0313 At the amino acid level, the sequences with the other methods known in the art. Desired clones containing greatest degree of homology to the Shewanella Olleyana bacterial PUFA PKS genes can be identified by colony or ACAM644 dehydrogenase (DH) deduced amino acid plaque hybridization (as described in Example 1) using sequence encoded by SEQID NO:75 were: Shewanella sp. probes generated by PCR of the partial gene sequences of SCRC-2738 orf7 (identity=77%; positives=86%); Photo Example 7 employing primers designed from these bacter profiundum pfaC (identity=72%; positives 81%); and sequences. The complete DNA sequence of the new bacte Shewanella Oneidensis MR-1 “multi-domain B-ketoacyl rial PUFA PKS gene sets are then used to design vectors for synthase” (identity 75%: positives=83%). The transformation of Schizochytrium strains defective in the Schizochytrium sp. ATCC20888 orfc was 26% identical and endogenous PUFA PKS genes (e.g., see Examples 3, 5, and 42% positive to the deduced amino acid sequence encoded 6). Whole bacterial genes (coding sequences) may be used by SEQ ID NO:75. to replace whole Schizochytrium genes (coding sequences), thus utilizing the Schizochytrium gene expression regions, 0314. At the amino acid level, the sequences with the and the fourth bacterial gene may be targeted to a different greatest degree of homology to the Shewanella japonica location within the genome. Alternatively, individual bacte ATCC BAA-316 dehydrogenase (DH) deduced amino acid rial PUFA PKS functional domains may be “swapped’ or sequence encoded by SEQID NO:77 were: Shewanella sp. exchanged with the analogous Schizochytrium domains by SCRC-2738 orf7 (identity=77%; positives=86%); Photo similar techniques of homologous recombination. It is bacter profiundum pfaC (identity=73%; positives=83%) and understood that the sequence of the bacterial PUFA PKS Shewanella Oneidensis MR-1 “multi-domain B-ketoacyl genes or domains may have to be modified to accommodate synthase' (identity=74%: positives=81%). The details of Schizochytrium codon usage, but this is within the Schizochytrium sp. ATCC20888 orfc was 27% identical and ability of those of skill in the art. 42% positive to the deduced amino acid sequence encoded 0318. Each publication cited or discussed herein is incor by SEQ ID NO:77. porated herein by reference in its entirety. 0315) It is expected that the PUFA PKS gene sets from 0319 While various embodiments of the present inven these two Shewanella strains will provide beneficial sources tion have been described in detail, it is apparent that modi of whole genes or individual domains for the modification of fications and adaptations of those embodiments will occur to Schizochytrium PUFA production. PUFA PKS genes and the those skilled in the art. It is to be expressly understood, proteins and domains encoded thereby from either of however, that such modifications and adaptations are within Shewanella Olleyana or Shewanella japonica are explicitly the scope of the present invention, as set forth in the encompassed by the present invention. following claims.

SEQUENCE LISTING

<16 Oc NUMBER OF SEO ID NOS: 82

<210 SEQ ID NO 1 <211 LENGTH: 873 O &212> TYPE: DNA <213> ORGANISM: Schizochytrium sp. &220s FEATURE: <221 NAME/KEY: CDS <222> LOCATION: (1) . . (8730)

US 2008/0038379 A1 Feb. 14, 2008 58

- Continued Glin Ala Ser Val Ile Ala Thr Asp Ser Lieu Ala Phe 29 OO 29 OS 291. O

<210 SEQ ID NO 2 <211 LENGTH: 291 O &212> TYPE: PRT <213> ORGANISM: Schizochytrium sp.

<4 OO SEQUENCE: 2 Met Ala Ala Arg Lieu. Glin Glu Glin Lys Gly Gly Glu Met Asp Thr Arg 1. 5 1O 15 Ile Ala Ile Ile Gly Met Ser Ala Ile Leu Pro Cys Gly. Thir Thr Val 2O 25 3O Arg Glu Ser Trp Glu Thir Ile Arg Ala Gly Ile Asp Cys Lieu. Ser Asp 35 4 O 45 Lieu Pro Glu Asp Arg Val Asp Val Thir Ala Tyr Phe Asp Pro Val Lys SO 55 6 O Thir Thr Lys Asp Lys Ile Tyr Cys Lys Arg Gly Gly Phe Ile Pro Glu 65 70 7s 8O Tyr Asp Phe Asp Ala Arg Glu Phe Gly Lieu. Asn Met Phe Gln Met Glu 85 90 95 Asp Ser Asp Ala Asn Glin Thir Ile Ser Lieu. Lieu Lys Wall Lys Glu Ala 1OO 105 11 O Lieu. Glin Asp Ala Gly Ile Asp Ala Lieu. Gly Lys Glu Lys Lys ASn Ile 115 12 O 125 Gly Cys Val Lieu. Gly Ile Gly Gly Gly Glin Lys Ser Ser His Glu Phe 13 O 135 14 O Tyr Ser Arg Lieu. Asn Tyr Val Val Val Glu Lys Val Lieu. Arg Llys Met 145 150 155 160 Gly Met Pro Glu Glu Asp Wall Lys Val Ala Val Glu Lys Tyr Lys Ala 1.65 17O 17s Asn Phe Pro Glu Trp Arg Lieu. Asp Ser Phe Pro Gly Phe Leu Gly Asn 18O 185 19 O Val Thir Ala Gly Arg Cys Thr Asn. Thir Phe Asn Lieu. Asp Gly Met Asn 195 2OO 2O5 Cys Val Val Asp Ala Ala Cys Ala Ser Ser Lieu. Ile Ala Val Llys Val 21 O 215 22O Ala Ile Asp Glu Lieu. Lieu. Tyr Gly Asp Cys Asp Met Met Val Thr Gly 225 23 O 235 24 O Ala Thr Cys Thr Asp Asn Ser Ile Gly Met Tyr Met Ala Phe Ser Lys 245 250 255 Thr Pro Val Phe Ser Thr Asp Pro Ser Val Arg Ala Tyr Asp Glu Lys 26 O 265 27 O Thir Lys Gly Met Lieu. Ile Gly Glu Gly Ser Ala Met Lieu Val Lieu Lys 27s 28O 285 Arg Tyr Ala Asp Ala Val Arg Asp Gly Asp Glu Ile His Ala Val Ile 29 O 295 3 OO Arg Gly Cys Ala Ser Ser Ser Asp Gly Lys Ala Ala Gly Ile Tyr Thr 3. OS 310 315 32O Pro Thir Ile Ser Gly Glin Glu Glu Ala Lieu. Arg Arg Ala Tyr Asn Arg 3.25 330 335 Ala Cys Val Asp Pro Ala Thr Val Thr Lieu Val Glu Gly His Gly Thr 34 O 345 35. O US 2008/0038379 A1 Feb. 14, 2008 59

- Continued

Gly Thr Pro Val Gly Asp Arg Ile Glu Lieu. Thir Ala Lieu. Arg Asn Lieu. 355 360 365 Phe Asp Lys Ala Tyr Gly Glu Gly Asn Thr Glu, Llys Val Ala Val Gly 37 O 375 38O Ser Ile Llys Ser Ser Ile Gly His Lieu Lys Ala Val Ala Gly Lieu Ala 385 390 395 4 OO Gly Met Ile Llys Val Ile Met Ala Lieu Lys His Llys Thr Lieu Pro Gly 4 OS 41O 415 Thir Ile Asin Val Asp Asn Pro Pro Asn Lieu. Tyr Asp Asn Thr Pro Ile 42O 425 43 O Asn Glu Ser Ser Lieu. Tyr Ile Asn Thr Met Asn Arg Pro Trp Phe Pro 435 44 O 445 Pro Pro Gly Val Pro Arg Arg Ala Gly Ile Ser Ser Phe Gly Phe Gly 450 45.5 460 Gly Ala Asn Tyr His Ala Val Lieu. Glu Glu Ala Glu Pro Glu. His Thr 465 470 47s 48O Thir Ala Tyr Arg Lieu. Asn Lys Arg Pro Glin Pro Val Lieu Met Met Ala 485 490 495 Ala Thr Pro Ala Ala Lieu. Glin Ser Lieu. Cys Glu Ala Glin Lieu Lys Glu SOO 505 51O Phe Glu Ala Ala Ile Lys Glu Asn. Glu Thr Val Lys Asn. Thir Ala Tyr 515 52O 525 Ile Lys Cys Val Lys Phe Gly Glu Glin Phe Llys Phe Pro Gly Ser Ile 53 O 535 54 O Pro Ala Thr Asn Ala Arg Lieu. Gly Phe Lieu Val Lys Asp Ala Glu Asp 5.45 550 555 560 Ala Cys Ser Thr Lieu. Arg Ala Ile Cys Ala Glin Phe Ala Lys Asp Wall 565 st O sts Thir Lys Glu Ala Trp Arg Lieu Pro Arg Glu Gly Val Ser Phe Arg Ala 58O 585 59 O Lys Gly Ile Ala Thr Asn Gly Ala Val Ala Ala Lieu. Phe Ser Gly Glin 595 6OO 605 Gly Ala Glin Tyr Thr His Met Phe Ser Glu Val Ala Met Asn Trp Pro 610 615 62O Glin Phe Arg Glin Ser Ile Ala Ala Met Asp Ala Ala Glin Ser Llys Val 625 630 635 64 O Ala Gly Ser Asp Lys Asp Phe Glu Arg Val Ser Glin Val Lieu. Tyr Pro 645 650 655 Arg Llys Pro Tyr Glu Arg Glu Pro Glu Glin Asn Pro Llys Lys Ile Ser 660 665 67 O Lieu. Thir Ala Tyr Ser Glin Pro Ser Thr Lieu Ala Cys Ala Lieu. Gly Ala 675 68O 685 Phe Glu Ile Phe Lys Glu Ala Gly Phe Thr Pro Asp Phe Ala Ala Gly 69 O. 695 7 OO His Ser Lieu. Gly Glu Phe Ala Ala Lieu. Tyr Ala Ala Gly Cys Val Asp 7 Os 71O 71s 72O Arg Asp Glu Lieu. Phe Glu Lieu Val Cys Arg Arg Ala Arg Ile Met Gly 72 73 O 73 Gly Lys Asp Ala Pro Ala Thr Pro Lys Gly Cys Met Ala Ala Val Ile 740 74. 7 O US 2008/0038379 A1 Feb. 14, 2008 60

- Continued Gly Pro Asn Ala Glu Asn. Ile Llys Val Glin Ala Ala Asn Val Trp Lieu. 7ss 760 765 Gly Asn Ser Asn Ser Pro Ser Glin Thr Val Ile Thr Gly Ser Val Glu 770 775 78O Gly Ile Glin Ala Glu Ser Ala Arg Lieu Gln Lys Glu Gly Phe Arg Val 78s 79 O 79. 8OO Val Pro Leu Ala Cys Glu Ser Ala Phe His Ser Pro Gln Met Glu Asn 805 810 815 Ala Ser Ser Ala Phe Lys Asp Val Ile Ser Llys Val Ser Phe Arg Thr 82O 825 83 O Pro Lys Ala Glu Thir Lys Lieu Phe Ser Asn Val Ser Gly Glu. Thir Tyr 835 84 O 845 Pro Thr Asp Ala Arg Glu Met Lieu. Thr Glin His Met Thr Ser Ser Val 850 855 860 Llys Phe Lieu. Thr Glin Val Arg Asn Met His Glin Ala Gly Ala Arg Ile 865 87O 87s 88O Phe Val Glu Phe Gly Pro Lys Glin Val Leu Ser Lys Lieu Val Ser Glu 885 890 895 Thr Lieu Lys Asp Asp Pro Ser Val Val Thr Val Ser Val Asin Pro Ala 9 OO 905 91 O Ser Gly Thr Asp Ser Asp Ile Glin Lieu. Arg Asp Ala Ala Val Glin Lieu. 915 92 O 925 Val Val Ala Gly Val Asn Lieu. Glin Gly Phe Asp Llys Trp Asp Ala Pro 93 O 935 94 O Asp Ala Thir Arg Met Glin Ala Ile Llys Llys Lys Arg Thir Thr Lieu. Arg 945 950 955 96.O Lieu. Ser Ala Ala Thr Tyr Val Ser Asp Llys Thr Llys Llys Val Arg Asp 965 97O 97. Ala Ala Met Asn Asp Gly Arg Cys Val Thr Tyr Lieu Lys Gly Ala Ala 98O 985 99 O Pro Lieu. Ile Lys Ala Pro Glu Pro Val Val Asp Glu Ala Ala Lys Arg 995 1OOO 1005 Glu Ala Glu Arg Lieu Gln Lys Glu Lieu. Glin Asp Ala Glin Arg Glin O1O O15 O2O Lieu. Asp Asp Ala Lys Arg Ala Ala Ala Glu Ala Asn. Ser Llys Lieu. O25 O3 O O35 Ala Ala Ala Lys Glu Glu Ala Lys Thr Ala Ala Ala Ser Ala Lys O4 O O45 OSO Pro Ala Val Asp Thir Ala Val Val Glu Lys His Arg Ala Ile Lieu.

Llys Ser Met Lieu Ala Glu Lieu. Asp Gly Tyr Gly Ser Val Asp Ala Of O O7 O8O

Ser Ser Leul Glin Glin Glin Glin Glin Glin Glin. Thir Ala Pro Ala Pro O85 O9 O O95 Val Lys Ala Ala Ala Pro Ala Ala Pro Val Ala Ser Ala Pro Ala 1 OO 105 11 O Pro Ala Val Ser Asn. Glu Lieu. Lieu. Glu Lys Ala Glu Thr Val Val 115 12 O 125 Met Glu Val Lieu Ala Ala Lys Thr Gly Tyr Glu Thr Asp Met Ile 13 O 135 14 O Glu Ala Asp Met Glu Lieu. Glu Thr Glu Lieu. Gly Ile Asp Ser Ile US 2008/0038379 A1 Feb. 14, 2008 61

- Continued

145 15 O 155 Lys Arg Val Glu Ile Lieu. Ser Glu Val Glin Ala Met Lieu. Asn Val

Glu Ala Lys Asp Wall Asp Ala Lieu. Ser Arg Thr Arg Thr Val Gly

Glu Val Val Asn Ala Met Lys Ala Glu Ile Ala Gly Ser Ser Ala

Pro Ala Pro Ala Ala Ala Ala Pro Ala Pro Ala Lys Ala Ala Pro

Ala Ala Ala Ala Pro Ala Val Ser Asn. Glu Lieu. Lieu. Glu Lys Ala 22O 225 23 O Glu Thr Val Val Met Glu Val Lieu Ala Ala Lys Thr Gly Tyr Glu 235 24 O 245 Thr Asp Met Ile Glu Ser Asp Met Glu Lieu. Glu Thr Glu Lieu. Gly 250 255 26 O Ile Asp Ser Ile Lys Arg Val Glu Ile Lieu. Ser Glu Val Glin Ala 265 27 O 27s Met Lieu. Asn Val Glu Ala Lys Asp Val Asp Ala Lieu. Ser Arg Thr 28O 285 29 O Arg Thr Val Gly Glu Val Val Asn Ala Met Lys Ala Glu Ile Ala 295 3OO 305 Gly Gly Ser Ala Pro Ala Pro Ala Ala Ala Ala Pro Gly Pro Ala 310 315 32O

Ala Ala Ala Pro Ala Pro Ala Ala Ala Ala Pro Ala Wal Ser Asn 3.25 33 O 335 Glu Lieu. Lieu. Glu Lys Ala Glu Thr Val Val Met Glu Val Lieu Ala 34 O 345 350 Ala Lys Thr Gly Tyr Glu Thir Asp Met Ile Glu Ser Asp Met Glu 355 360 365 Lieu. Glu Thr Glu Lieu. Gly Ile Asp Ser Ile Lys Arg Val Glu Ile 37O 375 38O Lieu. Ser Glu Val Glin Ala Met Lieu. Asn Val Glu Ala Lys Asp Val 385 390 395 Asp Ala Lieu. Ser Arg Thr Arg Thr Val Gly Glu Val Val Asp Ala 4 OO 405 41 O Met Lys Ala Glu Ile Ala Gly Gly Ser Ala Pro Ala Pro Ala Ala 415 42O 425

Ala Ala Pro Ala Pro Ala Ala Ala Ala Pro Ala Pro Ala Ala Pro 43 O 435 44 O Ala Pro Ala Val Ser Ser Glu Lieu. Lieu. Glu Lys Ala Glu Thr Val 445 450 45.5 Val Met Glu Val Lieu Ala Ala Lys Thr Gly Tyr Glu Thr Asp Met 460 465 47 O Ile Glu Ser Asp Met Glu Lieu. Glu Thr Glu Lieu. Gly Ile Asp Ser 47s 48O 485 Ile Lys Arg Val Glu Ile Lieu. Ser Glu Val Glin Ala Met Lieu. Asn 490 495 SOO Val Glu Ala Lys Asp Val Asp Ala Lieu. Ser Arg Thr Arg Thr Val 5 OS 510 515 Gly Glu Val Val Asp Ala Met Lys Ala Glu Ile Ala Gly Gly Ser 52O 525 53 O US 2008/0038379 A1 Feb. 14, 2008 62

- Continued

Ala Pro Ala Pro Ala Ala Ala Ala Pro Ala Pro Ala Ala Ala Ala 535 54 O 545

Pro Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Pro Ala Wall 550 555 560 Ser Ser Glu Lieu. Lieu. Glu Lys Ala Glu Thr Val Val Met Glu Val 565 st O sts Lieu Ala Ala Lys Thr Gly Tyr Glu Thir Asp Met Ile Glu Ser Asp 58O 585 590 Met Glu Lieu. Glu Thr Glu Lieu. Gly Ile Asp Ser Ile Lys Arg Val 595 6OO 605 Glu Ile Lieu. Ser Glu Val Glin Ala Met Lieu. Asn Val Glu Ala Lys 610 615 62O Asp Wall Asp Ala Lieu. Ser Arg Thr Arg Thr Val Gly Glu Val Val 625 63 O 635 Asp Ala Met Lys Ala Glu Ile Ala Gly Ser Ser Ala Ser Ala Pro 64 O 645 650

Ala Ala Ala Ala Pro Ala Pro Ala Ala Ala Ala Pro Ala Pro Ala 655 660 665 Ala Ala Ala Pro Ala Val Ser Asn. Glu Lieu. Lieu. Glu Lys Ala Glu 670 675 68O Thr Val Val Met Glu Val Lieu Ala Ala Lys Thr Gly Tyr Glu Thr 685 69 O. 695 Asp Met Ile Glu Ser Asp Met Glu Lieu. Glu Thr Glu Lieu. Gly Ile 7 OO 7Os 71O Asp Ser Ile Lys Arg Val Glu Ile Lieu. Ser Glu Val Glin Ala Met 71s 72 O 72 Lieu. Asn Val Glu Ala Lys Asp Wall Asp Ala Lieu. Ser Arg Thr Arg 73 O 73 74 O Thr Val Gly Glu Val Val Asp Ala Met Lys Ala Glu Ile Ala Gly 74. 7 O 7ss Gly Ser Ala Pro Ala Pro Ala Ala Ala Ala Pro Ala Pro Ala Ala 760 765 770 Ala Ala Pro Ala Val Ser Asn. Glu Lieu. Lieu. Glu Lys Ala Glu Thr 775 78O 78s Val Val Met Glu Val Lieu Ala Ala Lys Thr Gly Tyr Glu Thir Asp 79 O 79. 8OO Met Ile Glu Ser Asp Met Glu Lieu. Glu Thr Glu Lieu. Gly Ile Asp 805 810 815 Ser Ile Lys Arg Val Glu Ile Lieu. Ser Glu Val Glin Ala Met Lieu. 82O 825 83 O Asn Val Glu Ala Lys Asp Val Asp Ala Lieu. Ser Arg Thr Arg Thr 835 84 O 845 Val Gly Glu Val Val Asp Ala Met Lys Ala Glu Ile Ala Gly Ser 850 855 86 O

Ser Ala Pro Ala Pro Ala Ala Ala Ala Pro Ala Pro Ala Ala Ala 865 87 O 87s

Ala Pro Ala Pro Ala Ala Ala Ala Pro Ala Wal Ser Ser Glu Lieu. 88O 885 890 Lieu. Glu Lys Ala Glu Thr Val Val Met Glu Val Lieu Ala Ala Lys 895 9 OO 905 US 2008/0038379 A1 Feb. 14, 2008 63

- Continued Thr Gly Tyr Glu Thir Asp Met Ile Glu Ser Asp Met Glu Lieu. Glu 910 915 92 O Thr Glu Lieu. Gly Ile Asp Ser Ile Lys Arg Val Glu Ile Lieu. Ser 925 93 O 935 Glu Val Glin Ala Met Lieu. Asn Val Glu Ala Lys Asp Wall Asp Ala 94 O 945 950 Lieu. Ser Arg Thr Arg Thr Val Gly Glu Val Val Asp Ala Met Lys 955 96.O 965 Ala Glu Ile Ala Gly Gly Ser Ala Pro Ala Pro Ala Ala Ala Ala 97O 97. 98 O

Pro Ala Pro Ala Ala Ala Ala Pro Ala Wal Ser Asn. Glu Lieu. Lieu 985 990 995 Glu Lys Ala Glu Thr Val Val Met Glu Val Lieu Ala Ala Lys Thr 2OOO 2005 2010 Gly Tyr Glu Thr Asp Met Ile Glu Ser Asp Met Glu Lieu. Glu Thr 2015 2O2O 2O25 Glu Lieu. Gly Ile Asp Ser Ile Lys Arg Val Glu Ile Lieu. Ser Glu 2O3O 2O35 2O4. O Val Glin Ala Met Lieu. Asn Val Glu Ala Lys Asp Val Asp Ala Lieu. 2O45 2OSO 2O55 Ser Arg Thr Arg Thr Val Gly Glu Val Val Asp Ala Met Lys Ala 2O60 2O65 2. Of O Glu Ile Ala Gly Gly Ser Ala Pro Ala Pro Ala Ala Ala Ala Pro 2O75 2O8 O 2O85 Ala Ser Ala Gly Ala Ala Pro Ala Wall Lys Ile Asp Ser Val His 2O90 2095 21OO Gly Ala Asp Cys Asp Asp Lieu. Ser Lieu Met His Ala Lys Val Val 2 O 5 2 1. O 2 1. 5 Asp Ile Arg Arg Pro Asp Glu Lieu. Ile Lieu. Glu Arg Pro Glu Asn

Arg Pro Val Lieu Val Val Asp Asp Gly Ser Glu Lieu. Thir Lieu Ala

Lieu Val Arg Val Lieu. Gly Ala Cys Ala Val Val Lieu. Thir Phe Glu

Gly Lieu Gln Lieu Ala Glin Arg Ala Gly Ala Ala Ala Ile Arg His

Val Lieu Ala Lys Asp Lieu. Ser Ala Glu Ser Ala Glu Lys Ala Ile

Lys Glu Ala Glu Glin Arg Phe Gly Ala Lieu. Gly Gly Phe Ile Ser 21.95 22 OO 22O5 Gln Glin Ala Glu Arg Phe Glu Pro Ala Glu Ile Leu Gly Phe Thr 221 O 2215 222 O Lieu Met Cys Ala Lys Phe Ala Lys Ala Ser Lieu. Cys Thr Ala Val 2225 223 O 2235 Ala Gly Gly Arg Pro Ala Phe Ile Gly Val Ala Arg Lieu. Asp Gly 224 O 2.245 225 O Arg Lieu. Gly Phe Thir Ser Glin Gly. Thir Ser Asp Ala Lieu Lys Arg 2255 226 O 2265 Ala Glin Arg Gly Ala Ile Phe Gly Lieu. Cys Llys Thir Ile Gly Lieu. 2270 2275 228O Glu Trp Ser Glu Ser Asp Val Phe Ser Arg Gly Val Asp Ile Ala US 2008/0038379 A1 Feb. 14, 2008 64

- Continued

2285 229 O 2295 Glin Gly Met His Pro Glu Asp Ala Ala Val Ala Ile Val Arg Glu 23 OO 23 OS 2310 Met Ala Cys Ala Asp Ile Arg Ile Arg Glu Val Gly Ile Gly Ala 2315 232O 2325 Asn Glin Glin Arg Cys Thir Ile Arg Ala Ala Lys Lieu. Glu Thr Gly 233 O 2335 234 O Asn Pro Glin Arg Glin Ile Ala Lys Asp Asp Val Lieu. Lieu Val Ser 2345 2350 2355 Gly Gly Ala Arg Gly Ile Thr Pro Lieu. Cys Ile Arg Glu Ile Thr 2360 23.65 2370 Arg Glin Ile Ala Gly Gly Llys Tyr Ile Lieu. Lieu. Gly Arg Ser Lys 2375 238O 23.85 Val Ser Ala Ser Glu Pro Ala Trp Cys Ala Gly Ile Thr Asp Glu 23.90 23.95 24 OO Lys Ala Val Glin Lys Ala Ala Thr Glin Glu Lieu Lys Arg Ala Phe 24 O5 241. O 24.15 Ser Ala Gly Glu Gly Pro Llys Pro Thr Pro Arg Ala Val Thr Lys 242O 24.25 243 O Lieu Val Gly Ser Val Lieu. Gly Ala Arg Glu Val Arg Ser Ser Ile 2435 244 O 2445 Ala Ala Ile Glu Ala Lieu. Gly Gly Lys Ala Ile Tyr Ser Ser Cys 2450 2455 246 O Asp Wall Asn. Ser Ala Ala Asp Wall Ala Lys Ala Val Arg Asp Ala 24 65 2470 2475 Glu Ser Glin Lieu. Gly Ala Arg Val Ser Gly Ile Val His Ala Ser 248O 2485 249 O Gly Val Lieu. Arg Asp Arg Lieu. Ile Glu Lys Llys Lieu Pro Asp Glu 2495 25 OO 2505 Phe Asp Ala Val Phe Gly Thr Llys Val Thr Gly Lieu. Glu Asn Lieu. 251O 25.15 252O Lieu Ala Ala Val Asp Arg Ala Asn Lieu Lys His Met Val Lieu. Phe 2525 253 O 25.35 Ser Ser Lieu Ala Gly Phe His Gly Asn Val Gly Glin Ser Asp Tyr 254 O 25.45 2550 Ala Met Ala Asn. Glu Ala Lieu. Asn Llys Met Gly Lieu. Glu Lieu Ala 2555 2560 25.65 Lys Asp Wal Ser Val Lys Ser Ile Cys Phe Gly Pro Trp Asp Gly 2570 27s 2580 Gly Met Val Thr Pro Glin Leu Lys Lys Glin Phe Glin Glu Met Gly 2585 2590 2595 Val Glin Ile Ile Pro Arg Glu Gly Gly Ala Asp Thr Val Ala Arg 26 OO 2605 261 O Ile Val Lieu. Gly Ser Ser Pro Ala Glu Ile Lieu Val Gly Asn Trp 2615 262O 262s Arg Thr Pro Ser Lys Llys Val Gly Ser Asp Thir Ile Thr Lieu. His 263 O 2635 264 O Arg Lys Ile Ser Ala Lys Ser Asn Pro Phe Lieu. Glu Asp His Val 2645 2650 2655 Ile Glin Gly Arg Arg Val Lieu Pro Met Thr Lieu Ala Ile Gly Ser 266 O 2665 2670 US 2008/0038379 A1 Feb. 14, 2008 65

- Continued

Lieu Ala Glu Thr Cys Lieu. Gly Lieu Phe Pro Gly Tyr Ser Leu Trp 2675 268O 2685 Ala Ile Asp Asp Ala Glin Lieu. Phe Lys Gly Val Thr Val Asp Gly 2690 2695 27 OO Asp Val Asn Cys Glu Val Thr Lieu. Thr Pro Ser Thr Ala Pro Ser 27 OS 271 O 2715 Gly Arg Val Asn Val Glin Ala Thr Lieu Lys Thr Phe Ser Ser Gly 272O 2725 273 O Llys Lieu Val Pro Ala Tyr Arg Ala Val Ile Val Lieu. Ser Asn. Glin 2735 274 O 2745 Gly Ala Pro Pro Ala Asn Ala Thr Met Gln Pro Pro Ser Lieu. Asp 2750 27s 276 O Ala Asp Pro Ala Lieu. Glin Gly Ser Val Tyr Asp Gly Lys Thr Lieu. 2765 2770 2775 Phe His Gly Pro Ala Phe Arg Gly Ile Asp Asp Val Lieu. Ser Cys 2780 2785 279 O Thir Lys Ser Glin Lieu Val Ala Lys Cys Ser Ala Val Pro Gly Ser 2.79s 28OO 2805 Asp Ala Ala Arg Gly Glu Phe Ala Thr Asp Thr Asp Ala His Asp 2810 2815 282O Pro Phe Val Asn Asp Lieu Ala Phe Glin Ala Met Lieu Val Trp Val 2825 283 O 2835 Arg Arg Thr Lieu. Gly Glin Ala Ala Lieu Pro Asn. Ser Ile Glin Arg 284 O 284.5 285 O Ile Val Gln His Arg Pro Val Pro Glin Asp Llys Pro Phe Tyr Ile 2855 286 O 2865 Thr Lieu. Arg Ser Asn Glin Ser Gly Gly His Ser Gln His Llys His 2870 2875 288O Ala Lieu. Glin Phe His Asn. Glu Glin Gly Asp Lieu. Phe Ile Asp Wall 2.885 289 O 2.895 Glin Ala Ser Val Ile Ala Thr Asp Ser Lieu Ala Phe 29 OO 29 OS 291. O

<210 SEQ ID NO 3 &2 11s LENGTH: 6.177 &212> TYPE: DNA <213> ORGANISM: Schizochytrium sp. &220s FEATURE: <221 NAME/KEY: CDS <222> LOCATION: (1) . . (61.77)

<4 OO SEQUENCE: 3 atg gcc gct cqg aat gtg agc gcc gcg cat gag atg cac gat gala aag 48 Met Ala Ala Arg Asn. Wal Ser Ala Ala His Glu Met His Asp Glu Lys 1. 5 1O 15 cgc at C gcc gtC gtc. g.gc atg gcc gt C cag tac gcc gga tigc aaa acc 96 Arg Ile Ala Val Val Gly Met Ala Val Glin Tyr Ala Gly Cys Llys Thr 2O 25 3O aag gac gag titc tig gag gtg Ct c atgaac ggc aag gtc gag to C aag 144 Lys Asp Glu Phe Trp Glu Val Lieu Met Asn Gly Llys Val Glu Ser Lys 35 4 O 45 gtg at C agc gac alaa C9a ctic ggc tic C aac tac cqc gcc gag cac tac 192 Val Ile Ser Asp Lys Arg Lieu. Gly Ser Asn Tyr Arg Ala Glu. His Tyr SO 55 6 O

US 2008/0038379 A1 Feb. 14, 2008 73

- Continued

Wall.

<4 OO SEQUENCE: 4 Met Ala Ala Arg Asn. Wal Ser Ala Ala His Glu Met His Asp Glu Lys 1. 5 1O 15 Arg Ile Ala Val Val Gly Met Ala Val Glin Tyr Ala Gly Cys Llys Thr 2O 25 3O Lys Asp Glu Phe Trp Glu Val Lieu Met Asn Gly Llys Val Glu Ser Lys 35 4 O 45 Val Ile Ser Asp Lys Arg Lieu. Gly Ser Asn Tyr Arg Ala Glu. His Tyr SO 55 6 O Lys Ala Glu Arg Ser Llys Tyr Ala Asp Thir Phe Cys Asn. Glu Thir Tyr 65 70 7s 8O Gly. Thir Lieu. Asp Glu Asn. Glu Ile Asp Asn. Glu. His Glu Lieu. Lieu. Lieu. 85 90 95 Asn Lieu Ala Lys Glin Ala Lieu Ala Glu Thir Ser Wall Lys Asp Ser Thr 1OO 105 11 O Arg Cys Gly Ile Val Ser Gly Cys Lieu. Ser Phe Pro Met Asp Asn Lieu 115 12 O 125 Glin Gly Glu Lieu. Lieu. Asn Val Tyr Glin Asn His Val Glu Lys Llys Lieu. 13 O 135 14 O Gly Ala Arg Val Phe Lys Asp Ala Ser His Trp Ser Glu Arg Glu Glin 145 150 155 160 Ser Asn Llys Pro Glu Ala Gly Asp Arg Arg Ile Phe Met Asp Pro Ala 1.65 17O 17s Ser Phe Val Ala Glu Glu Lieu. Asn Lieu. Gly Ala Lieu. His Tyr Ser Val 18O 185 19 O Asp Ala Ala Cys Ala Thr Ala Lieu. Tyr Val Lieu. Arg Lieu Ala Glin Asp 195 2OO 2O5 His Lieu Val Ser Gly Ala Ala Asp Wal Met Lieu. Cys Gly Ala Thr Cys 21 O 215 22O Lieu Pro Glu Pro Phe Phe Ile Leu Ser Gly Phe Ser Thr Phe Glin Ala 225 23 O 235 24 O Met Pro Val Gly Thr Gly Glin Asn Val Ser Met Pro Leu. His Lys Asp 245 250 255 Ser Glin Gly Lieu. Thr Pro Gly Glu Gly Gly Ser Ile Met Val Lieu Lys 26 O 265 27 O Arg Lieu. Asp Asp Ala Ile Arg Asp Gly Asp His Ile Tyr Gly. Thir Lieu 27s 28O 285 Lieu. Gly Ala Asn Val Ser Asn. Ser Gly Thr Gly Lieu Pro Lieu Lys Pro 29 O 295 3 OO Lieu. Lieu Pro Ser Glu Lys Lys Cys Lieu Met Asp Thr Tyr Thr Arg Ile 3. OS 310 315 32O Asn Val His Pro His Lys Ile Glin Tyr Val Glu. Cys His Ala Thr Gly 3.25 330 335 Thr Pro Glin Gly Asp Arg Val Glu Ile Asp Ala Val Lys Ala Cys Phe 34 O 345 35. O Glu Gly Llys Val Pro Arg Phe Gly Thr Thr Lys Gly Asn Phe Gly His 355 360 365 Thr Xaa Xaa Ala Ala Gly Phe Ala Gly Met Cys Llys Val Lieu. Lieu. Ser 37 O 375 38O US 2008/0038379 A1 Feb. 14, 2008 74

- Continued Met Lys His Gly Ile Ile Pro Pro Thr Pro Gly Ile Asp Asp Glu Thr 385 390 395 4 OO Lys Met Asp Pro Leu Val Val Ser Gly Glu Ala Ile Pro Trp Pro Glu 4 OS 41O 415 Thir Asn Gly Glu Pro Lys Arg Ala Gly Lieu. Ser Ala Phe Gly Phe Gly 42O 425 43 O Gly. Thir Asn Ala His Ala Val Phe Glu Glu. His Asp Pro Ser Asn Ala 435 44 O 445 Ala Cys Thr Gly His Asp Ser Ile Ser Ala Lieu. Ser Ala Arg Cys Gly 450 45.5 460 Gly Glu Ser Asn Met Arg Ile Ala Ile Thr Gly Met Asp Ala Thr Phe 465 470 47s 48O Gly Ala Lieu Lys Gly Lieu. Asp Ala Phe Glu Arg Ala Ile Tyr Thr Gly 485 490 495 Ala His Gly Ala Ile Pro Lieu Pro Glu Lys Arg Trp Arg Phe Lieu. Gly SOO 505 51O Lys Asp Lys Asp Phe Lieu. Asp Lieu. Cys Gly Val Lys Ala Thr Pro His 515 52O 525 Gly Cys Tyr Ile Glu Asp Val Glu Val Asp Phe Glin Arg Lieu. Arg Thr 53 O 535 54 O Pro Met Thr Pro Glu Asp Met Leu Lleu Pro Glin Gln Lieu. Leu Ala Val 5.45 550 555 560 Thir Thir Ile Asp Arg Ala Ile Lieu. Asp Ser Gly Met Llys Lys Gly Gly 565 st O sts Asn Val Ala Val Phe Val Gly Lieu. Gly Thr Asp Lieu. Glu Lieu. Tyr Arg 58O 585 59 O His Arg Ala Arg Val Ala Lieu Lys Glu Arg Val Arg Pro Glu Ala Ser 595 6OO 605 Llys Llys Lieu. Asn Asp Met Met Glin Tyr Ile Asn Asp Cys Gly. Thir Ser 610 615 62O Thir Ser Tyr Thr Ser Tyr Ile Gly Asn Lieu Val Ala Thr Arg Val Ser 625 630 635 64 O Ser Gln Trp Gly Phe Thr Gly Pro Ser Phe Thr Ile Thr Glu Gly Asn 645 650 655 Asn Ser Val Tyr Arg Cys Ala Glu Lieu. Gly Lys Tyr Lieu. Lieu. Glu Thr 660 665 67 O Gly Glu Val Asp Gly Val Val Val Ala Gly Val Asp Lieu. Cys Gly Ser 675 68O 685 Ala Glu Asn Lieu. Tyr Val Llys Ser Arg Arg Phe Llys Val Ser Thir Ser 69 O. 695 7 OO Asp Thr Pro Arg Ala Ser Phe Asp Ala Ala Ala Asp Gly Tyr Phe Val 7 Os 71O 71s 72O Gly Glu Gly Cys Gly Ala Phe Val Lieu Lys Arg Glu Thir Ser Cys Thr 72 73 O 73 Lys Asp Asp Arg Ile Tyr Ala Cys Met Asp Ala Ile Val Pro Gly Asn 740 74. 7 O Val Pro Ser Ala Cys Lieu. Arg Glu Ala Lieu. Asp Glin Ala Arg Val Lys 7ss 760 765 Pro Gly Asp Ile Glu Met Lieu. Glu Lieu. Ser Ala Asp Ser Ala Arg His 770 775 78O Lieu Lys Asp Pro Ser Val Lieu Pro Lys Glu Lieu. Thir Ala Glu Glu Glu US 2008/0038379 A1 Feb. 14, 2008 75

- Continued

79 O 79. 8OO

Ile Gly Gly Lieu. Glin Thir Ile Lieu. Arg Asp Asp Asp Llys Lieu Pro Arg 805 810 815

Asn Wall Ala Thr Gly Ser Val Lys Ala Thr Val Gly Asp Thr Gly Tyr 82O 825 83 O

Ala Ser Gly Ala Ala Ser Lieu. Ile Lys Ala Ala Lieu. Cys Ile Tyr Asn 835 84 O 845

Arg Tyr Luell Pro Ser Asn Gly Asp Asp Trp Asp Glu Pro Ala Pro Glu 850 855 860

Ala Pro Trp Asp Ser Thr Lieu. Phe Ala Cys Glin Thir Ser Arg Ala Trip 865 87O 87s 88O

Lell Asn Pro Gly Glu Arg Arg Tyr Ala Ala Val Ser Gly Val Ser 885 890 895

Glu Thir Arg Ser Cys Tyr Ser Val Lieu. Lieu. Ser Glu Ala Glu Gly His 9 OO 905 91 O

Glu Arg Glu Asn Arg Ile Ser Lieu. Asp Glu Glu Ala Pro Llys Lieu. 915 92 O 925

Ile Wall Luell Arg Ala Asp Ser His Glu Glu Ile Lieu. Gly Arg Lieu. Asp 93 O 935 94 O

Lys Ile Arg Glu Arg Phe Lieu. Glin Pro Thr Gly Ala Ala Pro Arg Glu 945 950 955 96.O

Ser Glu Luell Lys Ala Glin Ala Arg Arg Ile Phe Lieu. Glu Lieu. Lieu. Gly 965 97O 97.

Glu Thir Luell Ala Glin Asp Ala Ala Ser Ser Gly Ser Glin Llys Pro Lieu 98O 985 99 O

Ala Luell Ser Lieu Val Ser Thr Pro Ser Llys Lieu. Glin Arg Glu Val Glu 995 1OOO OOS

Lell Ala Ala Lys Gly Ile Pro Arg Cys Lieu Lys Met Arg Arg Asp O15 O2O

Trp Ser Ser Pro Ala Gly Ser Arg Tyr Ala Pro Glu Pro Lieu Ala O25 O3 O O35

Ser Arg Val Ala Phe Met Tyr Gly Glu Gly Arg Ser Pro Tyr O45 OSO

Ile Thr Glin Asp Ile His Arg Ile Trp Pro Glu Lieu. His O6 O O65

Glu Ile Asn Glu Lys Thr Asn Arg Lieu. Trp Ala Glu Gly Asp O7 O8O

Arg Wall Met Pro Arg Ala Ser Phe Llys Ser Glu Lieu. Glu Ser O9 O O95

Glin Glin Glu Phe Asp Arg Asn Met Ile Glu Met Phe Arg Lieu. O5 10

Gly Lell Thir Ser Ile Ala Phe Thr Asn Lieu Ala Arg Asp Wall 2O 25

Lell Ile Thr Pro Lys Ala Ala Phe Gly Lieu. Ser Lieu. Gly Glu 35 4 O

Ile Met Ile Phe Ala Phe Ser Lys Lys Asn Gly Lieu. Ile Ser SO 55

Asp Lell Thir Lys Asp Lieu. Arg Glu Ser Asp Val Trp Asn Llys 65 70

Ala Ala Val Glu Phe Asn Ala Lieu. Arg Glu Ala Trp Gly Ile 8O 85 US 2008/0038379 A1 Feb. 14, 2008 76

- Continued

Pro Glin Ser Val Pro Lys Asp Glu Phe Trp Gln Gly Tyr Ile Val 190 195 2OO Arg Gly. Thir Lys Glin Asp Ile Glu Ala Ala Ile Ala Pro Asp Ser 2O5 21 O 215 yr Val Arg Lieu. Thir Ile Ile Asin Asp Ala Asn. Thir Ala Lieu 22O 225 23 O Ile Ser Gly Llys Pro Asp Ala Cys Lys Ala Ala Ile Ala Arg Lieu 235 24 O 245 Gly Gly Asn Ile Pro Ala Leu Pro Val Thr Glin Gly Met Cys Gly 250 255 26 O His Cys Pro Glu Val Gly Pro Tyr Thr Lys Asp Ile Ala Lys Ile 265 27 O 27s His Ala Asn Lieu. Glu Phe Pro Val Val Asp Gly Lieu. Asp Lieu. Trp 28O 285 29 O Thir Thir Ile Asn Gln Lys Arg Lieu Val Pro Arg Ala Thr Gly Ala 295 3OO 305 Lys Asp Glu Trp Ala Pro Ser Ser Phe Gly Glu Tyr Ala Gly Glin 310 315 32O Lieu. Tyr Glu Lys Glin Ala Asn Phe Pro Glin Ile Val Glu Thir Ile 3.25 33 O 335 Tyr Lys Glin Asn Tyr Asp Val Phe Val Glu Val Gly Pro Asn. Asn

His Arg Ser Thr Ala Val Arg Thr Thr Lieu. Gly Pro Glin Arg Asn

His Lieu Ala Gly Ala Ile Asp Llys Glin Asn. Glu Asp Ala Trp Thr

Thir Ile Val Llys Lieu Val Ala Ser Lieu Lys Ala His Lieu Val Pro

Gly Val Thr Ile Ser Pro Leu Tyr His Ser Lys Lieu Val Ala Glu 4 OO 405 41 O Ala Glin Ala Cys Tyr Ala Ala Lieu. Cys Lys Gly Glu Lys Pro Llys 415 42O 425 Lys Asn Llys Phe Val Arg Lys Ile Glin Lieu. Asn Gly Arg Phe Asn 43 O 435 44 O Ser Lys Ala Asp Pro Ile Ser Ser Ala Asp Lieu Ala Ser Phe Pro 445 450 45.5 Pro Ala Asp Pro Ala Ile Glu Ala Ala Ile Ser Ser Arg Ile Met 460 465 47 O Llys Pro Val Ala Pro Llys Phe Tyr Ala Arg Lieu. Asn. Ile Asp Glu 47s 48O 485 Glin Asp Glu Thir Arg Asp Pro Ile Lieu. Asn Lys Asp Asn Ala Pro 490 495 SOO

Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser 5 OS 510 515 Pro Ser Pro Ala Pro Ser Ala Pro Val Glin Lys Lys Ala Ala Pro 52O 525 53 O Ala Ala Glu Thir Lys Ala Val Ala Ser Ala Asp Ala Lieu. Arg Ser 535 54 O 545 Ala Lieu. Lieu. Asp Lieu. Asp Ser Met Lieu Ala Lieu. Ser Ser Ala Ser 550 555 560 US 2008/0038379 A1 Feb. 14, 2008 77

- Continued Ala Ser Gly Asn Lieu Val Glu Thir Ala Pro Ser Asp Ala Ser Val 565 st O sts Ile Val Pro Pro Cys Asn. Ile Ala Asp Lieu. Gly Ser Arg Ala Phe 58O 585 590 Met Lys Thr Tyr Gly Val Ser Ala Pro Leu Tyr Thr Gly Ala Met 595 6OO 605 Ala Lys Gly Ile Ala Ser Ala Asp Lieu Val Ile Ala Ala Gly Arg 610 615 62O Glin Gly Ile Lieu Ala Ser Phe Gly Ala Gly Gly Lieu Pro Met Glin 625 63 O 635 Val Val Arg Glu Ser Ile Glu Lys Ile Glin Ala Ala Lieu Pro Asn 64 O 645 650 Gly Pro Tyr Ala Val Asn Lieu. Ile His Ser Pro Phe Asp Ser Asn 655 660 665 Lieu. Glu Lys Gly Asn. Wall Asp Lieu. Phe Lieu. Glu Lys Gly Val Thr 670 675 68O

Phe Wall Glu Ala Ser Ala Phe Met Thir Lieu. Thir Pro Glin Wal Wall 685 69 O. 695 Arg Tyr Arg Ala Ala Gly Lieu. Thir Arg Asn Ala Asp Gly Ser Val 7 OO 7Os 71O Asn. Ile Arg Asn Arg Ile Ile Gly Llys Val Ser Arg Thr Glu Lieu.

Ala Glu Met Phe Met Arg Pro Ala Pro Glu. His Lieu. Lieu. Glin Lys 73 O 73 74 O Lieu. Ile Ala Ser Gly Glu Ile Asin Glin Glu Glin Ala Glu Lieu Ala 74. 7 O 7ss Arg Arg Val Pro Val Ala Asp Asp Ile Ala Val Glu Ala Asp Ser 760 765 770 Gly Gly. His Thr Asp Asn Arg Pro Ile His Val Ile Lieu Pro Lieu. 775 78O 78s Ile Ile Asn Lieu. Arg Asp Arg Lieu. His Arg Glu. Cys Gly Tyr Pro

Ala Asn Lieu. Arg Val Arg Val Gly Ala Gly Gly Gly Ile Gly Cys 805 810 815 Pro Glin Ala Ala Leu Ala Thr Phe Asn Met Gly Ala Ser Phe Ile 82O 825 83 O Val Thr Gly Thr Val Asn Glin Val Ala Lys Glin Ser Gly Thr Cys 835 84 O 845 Asp Asn Val Arg Lys Glin Lieu Ala Lys Ala Thr Tyr Ser Asp Wall 850 855 86 O Cys Met Ala Pro Ala Ala Asp Met Phe Glu Glu Gly Wall Lys Lieu. 865 87 O 87s Glin Val Lieu Lys Lys Gly Thr Met Phe Pro Ser Arg Ala Asn Llys 88O 885 890 Lieu. Tyr Glu Lieu Phe Cys Llys Tyr Asp Ser Phe Glu Ser Met Pro 895 9 OO 905 Pro Ala Glu Lieu Ala Arg Val Glu Lys Arg Ile Phe Ser Arg Ala 910 915 92 O Lieu. Glu Glu Val Trp Asp Glu Thir Lys Asn. Phe Tyr Ile Asn Arg 925 93 O 935 Lieu. His Asn. Pro Glu Lys Ile Glin Arg Ala Glu Arg Asp Pro Llys

US 2008/0038379 A1 Feb. 14, 2008 84

- Continued

145 150 155 160 Arg Val Thr Gly Phe Ala Lys Arg Lieu. Asp Gly Gly Ile Ser Met Phe 1.65 17O 17s Phe Phe Glu Tyr Asp Cys Tyr Val Asn Gly Arg Lieu Lleu. Ile Glu Met 18O 185 19 O Arg Asp Gly Cys Ala Gly Phe Phe Thr Asn. Glu Glu Lieu. Asp Ala Gly 195 2OO 2O5 Lys Gly Val Val Phe Thr Arg Gly Asp Lieu Ala Ala Arg Ala Lys Ile 21 O 215 22O Pro Lys Glin Asp Val Ser Pro Tyr Ala Val Ala Pro Cys Lieu. His Lys 225 23 O 235 24 O Thir Lys Lieu. Asn. Glu Lys Glu Met Glin Thir Lieu Val Asp Lys Asp Trp 245 250 255 Ala Ser Val Phe Gly Ser Lys Asn Gly Met Pro Glu Ile Asn Tyr Lys 26 O 265 27 O Lieu. Cys Ala Arg Llys Met Lieu Met Ile Asp Arg Val Thir Ser Ile Asp 27s 28O 285 His Lys Gly Gly Val Tyr Gly Lieu. Gly Glin Lieu Val Gly Glu Lys Ile 29 O 295 3 OO Lieu. Glu Arg Asp His Trp Tyr Phe Pro Cys His Phe Val Lys Asp Glin 3. OS 310 315 32O Val Met Ala Gly Ser Lieu Val Ser Asp Gly Cys Ser Gln Met Lieu Lys 3.25 330 335 Met Tyr Met Ile Trp Lieu. Gly Lieu. His Lieu. Thir Thr Gly Pro Phe Asp 34 O 345 35. O Phe Arg Pro Val Asn Gly His Pro Asn Llys Val Arg Cys Arg Gly Glin 355 360 365 Ile Ser Pro His Lys Gly Lys Lieu Val Tyr Val Met Glu Ile Lys Glu 37 O 375 38O Met Gly Phe Asp Glu Asp Asn Asp Pro Tyr Ala Ile Ala Asp Wall Asn 385 390 395 4 OO Ile Ile Asp Wall Asp Phe Glu Lys Gly Glin Asp Phe Ser Lieu. Asp Arg 4 OS 41O 415 Ile Ser Asp Tyr Gly Lys Gly Asp Lieu. Asn Llys Lys Ile Val Val Asp 42O 425 43 O Phe Lys Gly Ile Ala Lieu Lys Met Glin Lys Arg Ser Thr Asn Lys Asn 435 44 O 445 Pro Ser Llys Val Glin Pro Val Phe Ala Asn Gly Ala Ala Thr Val Gly 450 45.5 460 Pro Glu Ala Ser Lys Ala Ser Ser Gly Ala Ser Ala Ser Ala Ser Ala 465 470 47s 48O Ala Pro Ala Lys Pro Ala Phe Ser Ala Asp Val Lieu Ala Pro Llys Pro 485 490 495 Val Ala Lieu Pro Glu. His Ile Lieu Lys Gly Asp Ala Lieu Ala Pro Llys SOO 505 51O Glu Met Ser Trp His Pro Met Ala Arg Ile Pro Gly Asn Pro Thr Pro 515 52O 525 Ser Phe Ala Pro Ser Ala Tyr Llys Pro Arg Asn Ile Ala Phe Thr Pro 53 O 535 54 O Phe Pro Gly Asn Pro Asn Asp Asn Asp His Thr Pro Gly Lys Met Pro 5.45 550 555 560 US 2008/0038379 A1 Feb. 14, 2008 85

- Continued

Lieu. Thir Trp Phe Asn Met Ala Glu Phe Met Ala Gly Llys Val Ser Met 565 st O sts Cys Lieu. Gly Pro Glu Phe Ala Lys Phe Asp Asp Ser Asn. Thir Ser Arg 58O 585 59 O Ser Pro Ala Trp Asp Lieu Ala Lieu Val Thr Arg Ala Val Ser Val Ser 595 6OO 605 Asp Lieu Lys His Val Asn Tyr Arg Asn. Ile Asp Lieu. Asp Pro Ser Lys 610 615 62O Gly Thr Met Val Gly Glu Phe Asp Cys Pro Ala Asp Ala Trp Phe Tyr 625 630 635 64 O Lys Gly Ala Cys Asn Asp Ala His Met Pro Tyr Ser Ile Lieu Met Glu 645 650 655 Ile Ala Lieu. Glin Thir Ser Gly Val Lieu. Thir Ser Val Lieu Lys Ala Pro 660 665 67 O Lieu. Thir Met Glu Lys Asp Asp Ile Lieu. Phe Arg Asn Lieu. Asp Ala Asn 675 68O 685 Ala Glu Phe Val Arg Ala Asp Lieu. Asp Tyr Arg Gly Llys Thir Ile Arg 69 O. 695 7 OO Asn Val Thir Lys Cys Thr Gly Tyr Ser Met Leu Gly Glu Met Gly Val 7 Os 71O 71s 72O His Arg Phe Thr Phe Glu Lieu. Tyr Val Asp Asp Val Lieu Phe Tyr Lys 72 73 O 73 Gly Ser Thr Ser Phe Gly Trp Phe Val Pro Glu Val Phe Ala Ala Glin 740 74. 7 O Ala Gly Lieu. Asp Asn Gly Arg Llys Ser Glu Pro Trp Phe Ile Glu Asn 7ss 760 765 Llys Val Pro Ala Ser Glin Val Ser Ser Phe Asp Val Arg Pro Asn Gly 770 775 78O Ser Gly Arg Thir Ala Ile Phe Ala Asn Ala Pro Ser Gly Ala Glin Lieu. 78s 79 O 79. 8OO Asn Arg Arg Thr Asp Glin Gly Glin Tyr Lieu. Asp Ala Val Asp Ile Val 805 810 815 Ser Gly Ser Gly Llys Llys Ser Lieu. Gly Tyr Ala His Gly Ser Lys Thr 82O 825 83 O Val Asn Pro Asn Asp Trp Phe Phe Ser Cys His Phe Trp Phe Asp Ser 835 84 O 845 Val Met Pro Gly Ser Lieu. Gly Val Glu Ser Met Phe Gln Leu Val Glu 850 855 860 Ala Ile Ala Ala His Glu Asp Lieu Ala Gly Lys Ala Arg His Cys Glin 865 87O 87s 88O Pro His Lieu. Cys Ala Arg Pro Arg Ala Arg Ser Ser Trp Llys Tyr Arg 885 890 895 Gly Glin Lieu. Thr Pro Llys Ser Lys Llys Met Asp Ser Glu Val His Ile 9 OO 905 91 O Val Ser Val Asp Ala His Asp Gly Val Val Asp Lieu Val Ala Asp Gly 915 92 O 925 Phe Lieu. Trp Ala Asp Ser Lieu. Arg Val Tyr Ser Val Ser Asn. Ile Arg 93 O 935 94 O Val Arg Ile Ala Ser Gly Glu Ala Pro Ala Ala Ala Ser Ser Ala Ala 945 950 955 96.O US 2008/0038379 A1 Feb. 14, 2008 86

- Continued Ser Val Gly Ser Ser Ala Ser Ser Val Glu Arg Thr Arg Ser Ser Pro 965 97O 97. Ala Val Ala Ser Gly Pro Ala Glin Thir Ile Asp Lieu Lys Glin Lieu Lys 98O 985 99 O Thr Glu Lieu. Lieu. Glu Lieu. Asp Ala Pro Lieu. Tyr Lieu. Ser Glin Asp Pro 995 1OOO 1005 Thir Ser Gly Glin Lieu Lys Llys His Thr Asp Val Ala Ser Gly Glin O1O O15 O2O Ala Thir Ile Val Glin Pro Cys Thr Lieu. Gly Asp Lieu. Gly Asp Arg O25 O3 O O35 Ser Phe Met Glu Thr Tyr Gly Val Val Ala Pro Leu Tyr Thr Gly O4 O O45 OSO Ala Met Ala Lys Gly Ile Ala Ser Ala Asp Lieu Val Ile Ala Ala O55 O6 O O65 Gly Lys Arg Lys Ile Lieu. Gly Ser Phe Gly Ala Gly Gly Lieu Pro

Met His His Val Arg Ala Ala Lieu. Glu Lys Ile Glin Ala Ala Lieu.

Pro Glin Gly Pro Tyr Ala Val Asn Lieu. Ile His Ser Pro Phe Asp

Ser Asn Lieu. Glu Lys Gly Asn Val Asp Lieu. Phe Lieu. Glu Lys Gly

Wall. Thir Wal Wall Glu Ala Ser Ala Phe Met Thir Lieu. Thir Pro Glin

Val Val Arg Tyr Arg Ala Ala Gly Lieu. Ser Arg Asn Ala Asp Gly

Ser Val Asn. Ile Arg Asn Arg Ile Ile Gly Llys Val Ser Arg Thr

Glu Lieu. Ala Glu Met Phe e Arg Pro Ala Pro Glu. His Lieu. Lieu.

Glu Lys Lieu. Ile Ala Ser Gly Glu Ile Thr Glin Glu Glin Ala Glu

Lieu Ala Arg Arg Val Pro Val Ala Asp Asp Ile Ala Val Glu Ala 2O5 21 O 215 Asp Ser Gly Gly. His Thr Asp Asin Arg Pro Ile His Val Ile Lieu. 22O 225 23 O Pro Lieu. Ile Ile Asn Lieu. Arg Asn Arg Lieu. His Arg Glu. Cys Gly 235 24 O 245 Tyr Pro Ala His Lieu. Arg Val Arg Val Gly Ala Gly Gly Gly Val

Gly Cys Pro Glin Ala Ala Ala Ala Ala Lieu. Thir Met Gly Ala Ala

Phe Ile Val Thr Gly Thr Val Asn Glin Val Ala Lys Glin Ser Gly

Thir Cys Asp Asn Val Arg Lys Glin Lieu. Ser Glin Ala Thr Tyr Ser 295 3OO 305 Asp Ile Cys Met Ala Pro Ala Ala Asp Met Phe Glu Glu Gly Val 310 315 32O Llys Lieu. Glin Val Lieu Lys Lys Gly Thr Met Phe Pro Ser Arg Ala 3.25 33 O 335 Asn Llys Lieu. Tyr Glu Lieu. Phe Cys Llys Tyr Asp Ser Phe Asp Ser

US 2008/0038379 A1 Feb. 14, 2008 88

- Continued Gly Cys Val Lieu. Gly Ile Gly Gly Gly Glin Lys Ser Ser His Glu Phe 13 O 135 14 O tac tog cqc ctit aat tat gtt gt C gtg gag aag gtc. Ct c cqc aag atg 48O Tyr Ser Arg Lieu. Asn Tyr Val Val Val Glu Lys Val Lieu. Arg Llys Met 145 150 155 160 ggc atg ccc gag gag gac gtc. aag gtC goc gtc. gala aag tac aag gCC 528 Gly Met Pro Glu Glu Asp Wall Lys Val Ala Val Glu Lys Tyr Lys Ala 1.65 17O 17s aac ttic ccc gag togg cqc ct c gac toc ttic cct ggc titc ct c ggc aac 576 Asn Phe Pro Glu Trp Arg Lieu. Asp Ser Phe Pro Gly Phe Leu Gly Asn 18O 185 19 O gtc. acc gcc ggit cqc to acc aac 6OO Val Thr Ala Gly Arg Cys Thr Asn 195 2OO

<210 SEQ ID NO 8 <211 LENGTH: 2OO &212> TYPE: PRT <213> ORGANISM: Schizochytrium sp.

<4 OO SEQUENCE: 8 Met Ala Ala Arg Lieu. Glin Glu Glin Lys Gly Gly Glu Met Asp Thr Arg 1. 5 1O 15 Ile Ala Ile Ile Gly Met Ser Ala Ile Leu Pro Cys Gly. Thir Thr Val 2O 25 3O Arg Glu Ser Trp Glu Thir Ile Arg Ala Gly Ile Asp Cys Lieu. Ser Asp 35 4 O 45 Lieu Pro Glu Asp Arg Val Asp Val Thir Ala Tyr Phe Asp Pro Val Lys SO 55 6 O Thir Thr Lys Asp Lys Ile Tyr Cys Lys Arg Gly Gly Phe Ile Pro Glu 65 70 7s 8O Tyr Asp Phe Asp Ala Arg Glu Phe Gly Lieu. Asn Met Phe Gln Met Glu 85 90 95 Asp Ser Asp Ala Asn Glin Thir Ile Ser Lieu. Lieu Lys Wall Lys Glu Ala 1OO 105 11 O Lieu. Glin Asp Ala Gly Ile Asp Ala Lieu. Gly Lys Glu Lys Lys Asn. Ile 115 12 O 125 Gly Cys Val Lieu. Gly Ile Gly Gly Gly Glin Lys Ser Ser His Glu Phe 13 O 135 14 O Tyr Ser Arg Lieu. Asn Tyr Val Val Val Glu Lys Val Lieu. Arg Llys Met 145 150 155 160 Gly Met Pro Glu Glu Asp Wall Lys Val Ala Val Glu Lys Tyr Lys Ala 1.65 17O 17s Asn Phe Pro Glu Trp Arg Lieu. Asp Ser Phe Pro Gly Phe Leu Gly Asn 18O 185 19 O Val Thr Ala Gly Arg Cys Thr Asn 195 2OO

<210 SEQ ID NO 9 <211 LENGTH: 1278 &212> TYPE: DNA <213> ORGANISM: Schizochytrium sp. &220s FEATURE: <221 NAME/KEY: CDS <222> LOCATION: (1) ... (1278)

<4 OO SEQUENCE: 9

US 2008/0038379 A1 Feb. 14, 2008 90

- Continued cgc at C titt gtC gag titc gga ccc aag cag gtg Ctc. tcc aag ctt gt C 96.O Arg Ile Phe Val Glu Phe Gly Pro Lys Glin Val Lieu. Ser Lys Lieu Val 3. OS 310 315 32O tcc gag acc ct c aag gat gac ccc tog gtt gtc acc gtc. tct gtc. aac OO8 Ser Glu Thir Lieu Lys Asp Asp Pro Ser Val Val Thr Val Ser Val Asn 3.25 330 335 ccg gcc togggc acg gat t c gaC at C cag ctic cqc gac gog gCC gt C O56 Pro Ala Ser Gly Thr Asp Ser Asp Ile Glin Lieu. Arg Asp Ala Ala Val 34 O 345 35. O

Cag Ct c gtt gtC gct ggc gtc. aac Ctt cag ggc titt gac aag tig gac 104 Glin Lieu Val Val Ala Gly Val Asn Lieu. Glin Gly Phe Asp Llys Trp Asp 355 360 365 gcc ccc gat gcc acc cc atg cag gcc at C aag aag aag cqc act acc 152 Ala Pro Asp Ala Thr Arg Met Glin Ala Ile Llys Llys Lys Arg Thir Thr 37 O 375 38O

Ctc. c9c Ctt tog gcc gcc acc tac gtc. tcg gac aag acc aag aag gt C 2OO Lieu. Arg Lieu. Ser Ala Ala Thr Tyr Val Ser Asp Llys Thr Lys Llys Val 385 390 395 4 OO cgc gaC gcc gcc atgaac gat ggc cgc tigc gtc acc tac Ct c aag ggc 248 Arg Asp Ala Ala Met Asn Asp Gly Arg Cys Val Thr Tyr Lieu Lys Gly 4 OS 41O 415 gcc gca ccg ct c atc aag gcc ccg gag ccc 278 Ala Ala Pro Lieu. Ile Lys Ala Pro Glu Pro 42O 425

<210 SEQ ID NO 10 <211 LENGTH: 426 &212> TYPE: PRT <213> ORGANISM: Schizochytrium sp.

<4 OO SEQUENCE: 10 Asp Val Thir Lys Glu Ala Trp Arg Lieu Pro Arg Glu Gly Val Ser Phe 1. 5 1O 15 Arg Ala Lys Gly Ile Ala Thr Asn Gly Ala Val Ala Ala Lieu. Phe Ser 2O 25 3O Gly Glin Gly Ala Glin Tyr Thr His Met Phe Ser Glu Val Ala Met Asn 35 4 O 45 Trp Pro Glin Phe Arg Glin Ser Ile Ala Ala Met Asp Ala Ala Glin Ser SO 55 6 O Llys Val Ala Gly Ser Asp Lys Asp Phe Glu Arg Val Ser Glin Val Lieu. 65 70 7s 8O Tyr Pro Arg Llys Pro Tyr Glu Arg Glu Pro Glu Glin Asn Pro Llys Llys 85 90 95 Ile Ser Lieu. Thir Ala Tyr Ser Glin Pro Ser Thr Lieu Ala Cys Ala Leu 1OO 105 11 O Gly Ala Phe Glu Ile Phe Lys Glu Ala Gly Phe Thr Pro Asp Phe Ala 115 12 O 125 Ala Gly His Ser Lieu. Gly Glu Phe Ala Ala Lieu. Tyr Ala Ala Gly Cys 13 O 135 14 O Val Asp Arg Asp Glu Lieu. Phe Glu Lieu Val Cys Arg Arg Ala Arg Ile 145 150 155 160 Met Gly Gly Lys Asp Ala Pro Ala Thr Pro Lys Gly Cys Met Ala Ala 1.65 17O 17s Val Ile Gly Pro Asn Ala Glu Asn. Ile Llys Val Glin Ala Ala Asn Val 18O 185 19 O US 2008/0038379 A1 Feb. 14, 2008 91

- Continued

Trp Leu Gly Asn Ser Asn Ser Pro Ser Glin Thr Val Ile Thr Gly Ser 195 2OO 2O5 Val Glu Gly Ile Glin Ala Glu Ser Ala Arg Lieu Gln Lys Glu Gly Phe 21 O 215 22O Arg Val Val Pro Leu Ala Cys Glu Ser Ala Phe His Ser Pro Glin Met 225 23 O 235 24 O Glu Asn Ala Ser Ser Ala Phe Lys Asp Val Ile Ser Llys Val Ser Phe 245 250 255 Arg Thr Pro Lys Ala Glu Thir Lys Lieu. Phe Ser Asn. Wal Ser Gly Glu 26 O 265 27 O Thr Tyr Pro Thr Asp Ala Arg Glu Met Lieu. Thr Gln His Met Thir Ser 27s 28O 285 Ser Val Llys Phe Lieu. Thr Glin Val Arg Asn Met His Glin Ala Gly Ala 29 O 295 3 OO Arg Ile Phe Val Glu Phe Gly Pro Lys Glin Val Lieu. Ser Lys Lieu Val 3. OS 310 315 32O Ser Glu Thir Lieu Lys Asp Asp Pro Ser Val Val Thr Val Ser Val Asn 3.25 330 335 Pro Ala Ser Gly Thr Asp Ser Asp Ile Glin Lieu. Arg Asp Ala Ala Val 34 O 345 35. O Glin Lieu Val Val Ala Gly Val Asn Lieu. Glin Gly Phe Asp Llys Trp Asp 355 360 365 Ala Pro Asp Ala Thr Arg Met Glin Ala Ile Llys Llys Lys Arg Thir Thr 37 O 375 38O Lieu. Arg Lieu. Ser Ala Ala Thr Tyr Val Ser Asp Llys Thr Lys Llys Val 385 390 395 4 OO Arg Asp Ala Ala Met Asn Asp Gly Arg Cys Val Thr Tyr Lieu Lys Gly 4 OS 41O 415 Ala Ala Pro Lieu. Ile Lys Ala Pro Glu Pro 42O 425

<210 SEQ ID NO 1 <211 LENGTH: 5 &212> TYPE: PRT <213> ORGANISM: Schizochytrium sp. &220s FEATURE: <221 NAME/KEY: MISC FEATURE <222> LOCATION: (1) ... (5) <223> OTHER INFORMATION: Xaa = any amino acid <4 OO SEQUENCE: 1 Gly His Ser Xaa Gly 1. 5

<210 SEQ ID NO 12 <211 LENGTH: 258 &212> TYPE: DNA <213> ORGANISM: Schizochytrium sp. &220s FEATURE: <221 NAME/KEY: CDS <222> LOCATION: (1) ... (258)

<4 OO SEQUENCE: 12 gct gtC tog aac gag Ctt Ctt gag aag gcc gag act gtc gtC atg gag 48 Ala Val Ser Asn. Glu Lieu. Lieu. Glu Lys Ala Glu Thr Val Val Met Glu 1. 5 1O 15 US 2008/0038379 A1 Feb. 14, 2008 92

- Continued gtc. Ct c goc goc aag acc ggc tac gag acc gac atg atc gag gCt gac 96 Val Lieu Ala Ala Lys Thr Gly Tyr Glu Thir Asp Met Ile Glu Ala Asp 2O 25 3O atg gag ct c gag acc gag ctic ggc att gac toc atc aag cqt gt C gag 144 Met Glu Lieu. Glu Thr Glu Lieu. Gly Ile Asp Ser Ile Lys Arg Val Glu 35 4 O 45 atc Ct c to C gag gtc. cag gcc atg Ct c aat gtc gag gcc aag gat gt C 192 Ile Lieu. Ser Glu Val Glin Ala Met Lieu. Asn Val Glu Ala Lys Asp Val SO 55 6 O gat gcc ct c agc cqC act cqc act gtt ggit gag gtt gtc. aac goc atg 24 O Asp Ala Lieu. Ser Arg Thr Arg Thr Val Gly Glu Val Val Asn Ala Met 65 70 7s 8O aag gcc gag at C gct ggc 258 Lys Ala Glu Ile Ala Gly 85

<210 SEQ ID NO 13 <211 LENGTH: 86 &212> TYPE: PRT <213> ORGANISM: Schizochytrium sp.

<4 OO SEQUENCE: 13 Ala Val Ser Asn. Glu Lieu. Lieu. Glu Lys Ala Glu Thr Val Val Met Glu 1. 5 1O 15 Val Lieu Ala Ala Lys Thr Gly Tyr Glu Thir Asp Met Ile Glu Ala Asp 2O 25 3O Met Glu Lieu. Glu Thr Glu Lieu. Gly Ile Asp Ser Ile Lys Arg Val Glu 35 4 O 45 Ile Lieu. Ser Glu Val Glin Ala Met Lieu. Asn Val Glu Ala Lys Asp Val SO 55 6 O Asp Ala Lieu. Ser Arg Thr Arg Thr Val Gly Glu Val Val Asn Ala Met 65 70 7s 8O Lys Ala Glu Ile Ala Gly 85

<210 SEQ ID NO 14 <211 LENGTH: 5 &212> TYPE: PRT <213> ORGANISM: Schizochytrium sp.

<4 OO SEQUENCE: 14 Lieu. Gly Ile Asp Ser 1. 5

<210 SEQ ID NO 15 <211 LENGTH: 21 &212> TYPE: PRT <213> ORGANISM: Schizochytrium sp.

<4 OO SEQUENCE: 15 Ala Pro Ala Pro Val Lys Ala Ala Ala Pro Ala Ala Pro Val Ala Ser 1. 5 1O 15

Ala Pro Ala Pro Ala 2O

<210 SEQ ID NO 16 <211 LENGTH: 3 OO6 &212> TYPE: DNA <213> ORGANISM: Schizochytrium sp.