US007112717B2

(12) United States Patent
Valentin et al.
(10) Patent No.: US 7,112,717 B2
(45) Date of Patent: Sep. 26, 2006

(54) HOMOGENTISATE PRENYLTRANSFERASE GENE (HPT2) FROM ARABIDOPSIS AND USES THEREOF

(75) Inventors: Henry E. Valentin, Chesterfield, MO (US); Tyamagondlu V. Venkatesh, St. Louis, MO (US); Karunanandaa Balasulojini, Creve Coeur, MO (US)

(73) Assignee: Monsanto Technology LLC, St. Louis, MO (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 404 days.

(21) Appl. No.: 10/391,363

(22) Filed: Mar. 18, 2003

(65) Prior Publication Data
US 2003/0213017 A1 Nov. 13, 2003

Related U.S. Application Data
(60) Provisional application No. 60/365,202, filed on Mar. 19, 2002.

(51) Int. Cl.
C12N 5/82 (2006.01)
C12N 15/29 (2006.01)
C12N 5/14 (2006.01)

(52) U.S. Cl.: 800/278; 536/23.2; 536/23.6; 435/419; 435/468

(58) Field of Classification Search: None
See application file for complete search history.

Primary Examiner: Ashwin D. Mehta
Assistant Examiner: Cathy Kingdon Worley
(74) Attorney, Agent, or Firm: Fulbright & Jaworski L.L.P.

(57) ABSTRACT

The present invention is in the field of plant genetics and biochemistry. More specifically, the present invention relates to genes and polypeptides associated with the tocopherol biosynthesis pathway, namely those encoding homogentisate prenyl transferase activity, and uses thereof. In particular, the sequence of the HPT2 gene from Arabidopsis thaliana is disclosed for expression in any plant species to increase the levels of tocopherol.

31 Claims, 47 Drawing Sheets


U.S. Patent Sep. 26, 2006 Sheet 2 of 47 US 7,112,717 B2


Fig. 4: plasmid map of pCGN10800 (restriction-site positions not reproduced). Labeled elements include the Turbo polylinker, ATPT2, the 35S promoter, and the Tn5 kan gene.

Fig. 5: plasmid map of pCGN10801, 19788 bp (restriction-site positions not reproduced). Labeled elements include the napin promoter, aacC1, the Turbo linker, ATPT2, napin 3', the 35S promoter, and the Tn5 kan gene.

Fig. 6: plasmid map (restriction-site positions not reproduced). Labeled elements include aacC1, the Turbo polylinker, ATPT2, the 35S promoter, the Tn5 kan gene, and LB.

Fig. 8: chart; x-axis: Plant line number.

Fig. 9: chart; x-axis: Plant line number.

Fig. 14: chart; x-axis: Construct / line; comparison: All Pairs, Tukey-Kramer, 0.05.

Fig. 15: plasmid map of pMON69960 (HPT2 in pSPORT), 5461 bp (restriction-site positions not reproduced).

Fig. 16: plasmid map, 11700 bp (restriction-site positions not reproduced). Labeled elements include the left border, nptII-Tn5, P-CaMV.35S, napin 3', napin 5', and the right border.

Fig. 17: plasmid map of pMON69963 (napin::HPT2 sense binary), 11505 bp (restriction-site positions not reproduced).

Fig. 18: plasmid map of pMON69965 (napin::HPT2 antisense binary), 11505 bp (restriction-site positions not reproduced).

Fig. 22: chart; x-axis: Construct; comparison: All Pairs, Tukey-Kramer, 0.05.

Fig. 23: chart; x-axis: Construct; comparison: All Pairs, Tukey-Kramer, 0.05.

Multiple sequence alignment of the sequences listed below (residue-level alignment not reproduced).

NoHPT: SEQ ID NO: 1
AnHPT: SEQ ID NO: 2
SyHPT: SEQ ID NO: 3
ZmHPT1: SEQ ID NO: 4
TaHPT1: SEQ ID NO: 10
ApHPT1: SEQ ID NO: 9
GmHPT1-1: SEQ ID NO: 6
CpHPT1: SEQ ID NO: 11
AtHPT1: SEQ ID NO: 7
AtHPT2: SEQ ID NO: 57
GmHPT2: SEQ ID NO: 91
OsHPT2: SEQ ID NO: 58

Fig. 24b

Fig. 25a: alignment panels for Motif V (SEQ ID NO: 46) and two further motif panels whose labels did not survive, across NoHPT, AnHPT, SyHPT, ZmHPT1, TaHPT1, ApHPT1, GmHPT1-1, CpHPT1, AtHPT1, AtHPT2, GmHPT2, and OsHPT2 (residue-level alignment not reproduced).

Fig. 25b: alignment of Motif VIII (SEQ ID NO: 49) across NoHPT, AnHPT, SyHPT, ZmHPT1, TaHPT1, ApHPT1, GmHPT1-1, CpHPT1, AtHPT1, AtHPT2, GmHPT2, and OsHPT2 (residue-level alignment not reproduced).

Fig. 26: diagram with sequence labels ApHPT1, AtHPT1, TaHPT1, GmHPT1-1, CpHPT1, SyHPT, GmHPT2, Tricho, AtHPT2, OsHPT2, and Chloro.

Fig. 28: plasmid map of pMON81023 (pNapin:HPT2, Bsp120I/NotI) (restriction-site positions not reproduced).

Fig. 29: plasmid map of pMON36596 (Napin-AtHPPD-tyrA in binary), 16155 bp (restriction-site positions not reproduced).

Fig. 30: vector map of pET-30a(+) (restriction-site positions not reproduced).

Fig. 31: plasmid map of pMON69993 (pET30a(+):HPT2 without His tag), 6216 bp (restriction-site positions not reproduced).

Fig. 32: plasmid map of pMON69992 (pET30a(+):HPT2 with His tag), 6348 bp (restriction-site positions not reproduced).


Fig. 34a: alignments of Motif IX (SEQ ID NO: 92; 12 residues) and Motif X (SEQ ID NO: 93; 26 residues) across NoHPT, AnHPT, Tricho, SyHPT, ZmHPT1, TaHPT1, ApHPT1, GmHPT1-1, CpHPT1, AtHPT1, Chloro, AtHPT2, GmHPT2, and OsHPT2 (residue-level alignment not reproduced).

Fig. 34b: alignments of Motif XI (SEQ ID NO: 94; 14 residues) and Motif XII (SEQ ID NO: 95; 23 residues) across NoHPT, AnHPT, Tricho, SyHPT, ZmHPT1, TaHPT1, ApHPT1, GmHPT1-1, CpHPT1, AtHPT1, Chloro, AtHPT2, GmHPT2, and OsHPT2 (residue-level alignment not reproduced).

Fig. 35a: alignments of Motif I (14 residues), Motif II (26 residues), and Motif III (16 residues) across the corn, wheat, leek, soy-ppt2, soy-ppt1, Cuphea, Arabidopsis, Nostoc, Anabaena, and Synechocystis sequences (residue-level alignment not reproduced).

Fig. 35b: alignment of Motif IV (17 residues) across the corn, wheat, leek, soy-ppt2, soy-ppt1, Cuphea, Arabidopsis, Nostoc, Anabaena, and Synechocystis sequences (residue-level alignment not reproduced).

Alignments of Motif A (14 residues), Motif B (26 residues), Motif C (16 residues), and Motif D across the Nostoc, Anabaena, Synechocystis, corn, soy-ppt2, soy-ppt1, Arabidopsis, and Cuphea sequences (residue-level alignment not reproduced).

HOMOGENTISATE PRENYL TRANSFERASE GENE (HPT2) FROM ARABIDOPSIS AND USES THEREOF

This application claims priority to U.S. No. 60/365,202 filed Mar. 19, 2002, the disclosure of which is incorporated herein by reference in its entirety.

The present invention is in the field of plant genetics and biochemistry. More specifically, the present invention relates to genes and polypeptides associated with the tocopherol biosynthesis pathway, namely those encoding homogentisate prenyl transferase activity, and uses thereof.

Isoprenoids are ubiquitous compounds found in all living organisms. Plants synthesize a diverse array of greater than 22,000 isoprenoids (Connolly and Hill, Dictionary of Terpenoids, Chapman and Hall, New York, N.Y. (1992)). In plants, isoprenoids play essential roles in particular cell functions such as production of sterols, contributing to eukaryotic membrane architecture, acyclic polyprenoids found in the side chain of ubiquinone and plastoquinone, growth regulators like abscisic acid, gibberellins, brassinosteroids or the photosynthetic pigments chlorophylls and carotenoids. Although the physiological role of other plant isoprenoids is less evident, like that of the vast array of secondary metabolites, some are known to play key roles mediating the adaptative responses to different environmental challenges. In spite of the remarkable diversity of structure and function, all isoprenoids originate from a single metabolic precursor, isopentenyl diphosphate (IPP) (Wright, (1961) Annu. Rev. Biochem., 20:525–548; and Spurgeon and Porter, In: Biosynthesis of Isoprenoid Compounds, Porter and Spurgeon (eds.) John Wiley, NY, Vol. 1, pp. 1–46 (1981)).

A number of unique and interconnected biochemical pathways derived from the isoprenoid pathway leading to secondary metabolites, including tocopherols, exist in chloroplasts of higher plants. Tocopherols not only perform vital functions in plants, but are also important from mammalian nutritional perspectives. In plastids, tocopherols account for up to 40% of the total quinone pool.

Tocopherols are an important component of mammalian diets. Epidemiological evidence indicates that tocopherol supplementation can result in decreased risk for cardiovascular disease and cancer, can aid in immune function, and is associated with prevention or retardation of a number of degenerative disease processes in humans (Traber and Sies, Annu. Rev. Nutr., 16:321–347 (1996)). Tocopherol functions, in part, by stabilizing the lipid bilayer of biological membranes (Skrypin and Kagan, Biochim. Biophys. Acta, 815:209 (1995); Kagan, N.Y. Acad. Sci., p. 121 (1989); Gomez-Fernandez et al., Ann. N.Y. Acad. Sci., p. 109 (1989)), reducing polyunsaturated fatty acid (PUFA) free radicals generated by lipid oxidation (Fukuzawa et al., Lipids, 17:511–513 (1982)), and scavenging oxygen free radicals, lipid peroxy radicals and singlet oxygen species (Diplock et al., Ann. NY Acad. Sci., 570:72 (1989); Fryer, Plant Cell Environ., 15(4):381–392 (1992)).

The compound α-tocopherol, which is often referred to as vitamin E, belongs to a class of lipid-soluble antioxidants that includes α-, β-, γ-, and δ-tocopherols and α-, β-, γ-, and δ-tocotrienols. Although α-, β-, γ-, and δ-tocopherols and α-, β-, γ-, and δ-tocotrienols are sometimes referred to collectively as "vitamin E," vitamin E is more appropriately defined chemically as α-tocopherol. Vitamin E, or α-tocopherol, is significant for human health, in part because it is readily absorbed and retained by the body, and therefore has a higher degree of bioactivity than other tocopherol species (Traber and Sies, Annu. Rev. Nutr., 16:321–347 (1996)). Other tocopherols, however, such as β-, γ-, and δ-tocopherols also have significant health and nutritional benefits.

Tocopherols are primarily synthesized only by plants and certain other photosynthetic organisms, including cyanobacteria. As a result, mammalian dietary tocopherols are obtained almost exclusively from these sources. Plant tissues vary considerably in total tocopherol content and tocopherol composition, with α-tocopherol the predominant tocopherol species found in green, photosynthetic plant tissues. Leaf tissue can contain from 10–50 μg of total tocopherols per gram fresh weight, but most of the world's major staple crops (e.g., rice, corn, wheat, potato) produce low to extremely low levels of total tocopherols, of which only a small percentage is α-tocopherol (Hess, Vitamin E, α-tocopherol, In: Antioxidants in Higher Plants, R. Alscher and J. Hess, (eds.), CRC Press, Boca Raton, pp. 111–134 (1993)). Oil seed crops generally contain much higher levels of total tocopherols, but α-tocopherol is present only as a minor component in most oilseeds (Taylor and Barnes, Chemy Ind., October:722–726 (1981)).

The recommended daily dietary intake of 15–30 mg of vitamin E is quite difficult to achieve from the average American diet. For example, it would take over 750 grams of spinach leaves, in which α-tocopherol comprises 60% of total tocopherols, or 200–400 grams of soybean oil to satisfy this recommended daily vitamin E intake. While it is possible to augment the diet with supplements, most of these supplements contain primarily synthetic vitamin E, having eight stereoisomers, whereas natural vitamin E is predominantly composed of only a single isomer. Furthermore, supplements tend to be relatively expensive, and the general population is disinclined to take vitamin supplements on a regular basis. Therefore, there is a need in the art for compositions and methods that either increase the total tocopherol production or increase the relative percentage of α-tocopherol produced by plants.

In addition to the health benefits of tocopherols, increased α-tocopherol levels in crops have been associated with enhanced stability and extended shelf life of plant products (Peterson, Cereal-Chem., 72(1):21–24 (1995); Ball, Fat-soluble vitamin assays in food analysis. A comprehensive review, London, Elsevier Science Publishers Ltd. (1988)). Further, tocopherol supplementation of swine, beef, and poultry feeds has been shown to significantly increase meat quality and extend the shelf life of post-processed meat products by retarding post-processing lipid oxidation, which contributes to the undesirable flavor components (Sante and Lacourt, J. Sci. Food Agric., 65(4):503–507 (1994); Buckley et al., J. of Animal Science, 73:3122–3130 (1995)).

Tocopherol Biosynthesis

The plastids of higher plants exhibit interconnected biochemical pathways leading to secondary metabolites including tocopherols. The tocopherol biosynthetic pathway in higher plants involves condensation of homogentisic acid and phytylpyrophosphate to form 2-methylphytylplastoquinol (Fiedler et al., Planta, 155:511–515 (1982); Soll et al., Arch. Biochem. Biophys., 204:544–550 (1980); Marshall et al., Phytochem., 24:1705–1711 (1985)). This plant tocopherol pathway can be divided into four parts: 1) synthesis of homogentisic acid (HGA), which contributes to the aromatic ring of tocopherol; 2) synthesis of phytylpyrophosphate, which contributes to the side chain of tocopherol; 3) joining of HGA and phytylpyrophosphate via a homogentisate prenyl transferase followed by a subsequent cyclization; and 4) S-adenosyl methionine dependent methylation of an aromatic ring, which affects the relative abundance of each of the tocopherol species. See FIG. 1.
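
As a reading aid only, the four-part division of the pathway described above can be sketched as a small dependency graph. The sketch below is not part of the disclosure: the short identifiers are hypothetical labels chosen for illustration (they are not gene names), and the dependencies encode only what the text states, namely that the joining step uses the products of parts 1 and 2 and that methylation follows the joining and cyclization step.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each key is one of the four parts named in the text; each value is the set
# of parts whose products it consumes. Identifiers are hypothetical labels.
PATHWAY_DEPENDENCIES = {
    "hga_synthesis": set(),                        # part 1: aromatic ring
    "phytylpyrophosphate_synthesis": set(),        # part 2: side chain
    "prenyl_transfer_and_cyclization": {           # part 3: joining of 1 and 2
        "hga_synthesis",
        "phytylpyrophosphate_synthesis",
    },
    "ring_methylation": {"prenyl_transfer_and_cyclization"},  # part 4
}


def pathway_order(dependencies=PATHWAY_DEPENDENCIES):
    """Return one valid ordering of the parts, prerequisites first."""
    return list(TopologicalSorter(dependencies).static_order())


def upstream_of(part, dependencies=PATHWAY_DEPENDENCIES):
    """Return every part that precedes `part`, directly or indirectly."""
    seen = set()
    stack = list(dependencies[part])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(dependencies[current])
    return seen


if __name__ == "__main__":
    print(pathway_order())
    print(sorted(upstream_of("ring_methylation")))
```

Read this way, a change to either precursor branch (part 1 or part 2) sits upstream of the prenyl transferase step, while the methylation step sits downstream of it and, per the text, affects only the relative abundance of the tocopherol species.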
Various genes and their encoded proteins that are involved in tocopherol biosynthesis include those listed in the table below:

Gene ID    Enzyme name
tyrA       Bifunctional prephenate dehydrogenase
slr1736    Homogentisate prenyl transferase from Synechocystis
ATPT2      Homogentisate prenyl transferase from Arabidopsis thaliana
DXS        1-Deoxyxylulose-5-phosphate synthase
DXR        1-Deoxyxylulose-5-phosphate reductoisomerase
GGPPS      Geranylgeranyl pyrophosphate synthase
HPPD       p-Hydroxyphenylpyruvate dioxygenase
AANT1      Adenylate transporter
slr1737    Tocopherol cyclase
IDI        Isopentenyl diphosphate isomerase
GGH        Geranylgeranyl diphosphate reductase
GMT        Gamma Methyl Transferase
tMT2       Tocopherol methyl transferase 2
MT1        Methyl transferase 1
gcpE       (E)-4-hydroxy-3-methylbut-2-enyl diphosphate synthase

The "Gene IDs" given in the table above identify the gene associated with the listed enzyme. Any of the Gene IDs listed in the table appearing herein in the present disclosure refer to the gene encoding the enzyme with which the Gene ID is associated in the table.

As used herein, HPT, HPT2, PPT, slr1736, and ATPT2 each refer to proteins or genes encoding proteins that have the same activity.

Synthesis of Homogentisic Acid

Homogentisic acid is the common precursor to both tocopherols and plastoquinones. In at least some bacteria the synthesis of homogentisic acid is reported to occur via the conversion of chorismate to prephenate and then to p-hydroxyphenylpyruvate via a bifunctional prephenate dehydrogenase. Examples of bifunctional bacterial prephenate dehydrogenase enzymes include the proteins encoded by the tyrA genes of Erwinia herbicola and Escherichia coli. The tyrA gene product catalyzes the production of prephenate from chorismate, as well as the subsequent dehydrogenation of prephenate to form p-hydroxyphenylpyruvate (p-HPP), the immediate precursor to homogentisic acid. p-HPP is then converted to homogentisic acid by hydroxyphenylpyruvate dioxygenase (HPPD). In contrast, plants are believed to lack prephenate dehydrogenase activity, and it is generally believed that the synthesis of homogentisic acid from chorismate occurs via the synthesis and conversion of the intermediate arogenate. Since pathways involved in homogentisic acid synthesis are also responsible for tyrosine formation, any alterations in these pathways can also result in the alteration in tyrosine synthesis and the synthesis of other aromatic amino acids.

Synthesis of Phytylpyrophosphate

Tocopherols are a member of the class of compounds referred to as the isoprenoids. Other isoprenoids include carotenoids, gibberellins, terpenes, chlorophyll and abscisic acid. A central intermediate in the production of isoprenoids is isopentenyl diphosphate (IPP). Cytoplasmic and plastid based pathways to generate IPP have been reported. The cytoplasmic based pathway involves the enzymes acetoacetyl CoA thiolase, HMGCoA synthase, HMGCoA reductase, mevalonate kinase, phosphomevalonate kinase, and mevalonate pyrophosphate decarboxylase.

Recently, evidence for the existence of an alternative, plastid based, isoprenoid biosynthetic pathway emerged from studies in the research groups of Rohmer and Arigoni (Eisenreich et al., Chem. Biol., 5:R221–R233 (1998); Rohmer, Prog. Drug. Res., 50:135–154 (1998); Rohmer, Comprehensive Natural Products Chemistry, Vol. 2, pp. 45–68, Barton and Nakanishi (eds.), Pergamon Press, Oxford, England (1999)), who found that the isotope labeling patterns observed in studies on certain eubacterial and plant terpenoids could not be explained in terms of the mevalonate pathway. Arigoni and coworkers subsequently showed that 1-deoxyxylulose, or a derivative thereof, serves as an intermediate of the novel pathway, now referred to as the MEP pathway (Rohmer et al., Biochem. J., 295:517–524 (1993); Schwarz, Ph.D. thesis, Eidgenössische Technische Hochschule, Zurich, Switzerland (1994)). Recent studies showed the formation of 1-deoxyxylulose 5-phosphate (Broers, Ph.D. thesis, Eidgenössische Technische Hochschule, Zurich, Switzerland (1994)) from one molecule each of glyceraldehyde 3-phosphate (Rohmer, Comprehensive Natural Products Chemistry, Vol. 2, pp. 45–68, Barton and Nakanishi (eds.), Pergamon Press, Oxford, England (1999)) and pyruvate (Eisenreich et al., Chem. Biol., 5:R223–R233 (1998); Schwarz supra; Rohmer et al., J. Am. Chem. Soc., 118:2564–2566 (1996); and Sprenger et al., Proc. Natl. Acad. Sci. (U.S.A.), 94:12857–12862 (1997)) by an enzyme encoded by the dxs gene (Lois et al., Proc. Natl. Acad. Sci. (U.S.A.), 95:2105–2110 (1997); and Lange et al., Proc. Natl. Acad. Sci. (U.S.A.), 95:2100–2104 (1998)). 1-Deoxyxylulose 5-phosphate can be further converted into 2-C-methylerythritol 4-phosphate (Arigoni et al., Proc. Natl. Acad. Sci. (U.S.A.), 94:10600–10605 (1997)) by a reductoisomerase encoded by the dxr gene (Bouvier et al., Plant Physiol., 117:1421–1431 (1998); and Rohdich et al., Proc. Natl. Acad. Sci. (U.S.A.), 96:11758–11763 (1999)).

Reported genes in the MEP pathway also include ygbP, which catalyzes the conversion of 2-C-methylerythritol 4-phosphate into its respective cytidyl pyrophosphate derivative, and ygbB, which catalyzes the conversion of 4-phosphocytidyl-2-C-methyl-D-erythritol into 2-C-methyl-D-erythritol, 3,4-cyclophosphate. These genes are tightly linked on the E. coli genome (Herz et al., Proc. Natl. Acad. Sci. (U.S.A.), 97(6):2485–2490 (2000)).

Once IPP is formed by the MEP pathway, it is converted to GGDP by GGDP synthase, and then to phytylpyrophosphate, which is the central constituent of the tocopherol side chain.

Combination and Cyclization

Homogentisic acid is combined with either phytyl-pyrophosphate or solanyl-pyrophosphate by homogentisate prenyl transferase (HPT) forming 2-methylphytyl plastoquinol or 2-methylsolanyl plastoquinol, respectively. 2-methylsolanyl plastoquinol is a precursor to the biosynthesis of plastoquinones, while 2-methylphytyl plastoquinol is ultimately converted to tocopherol.

Methylation of the Aromatic Ring

The major structural difference between each of the tocopherol subtypes is the position of the methyl groups around the phenyl ring. Both 2-methylphytyl plastoquinol and 2-methylsolanyl plastoquinol serve as substrates for the plant enzyme 2-methylphytylplastoquinol/2-methylsolanylplastoquinol methyltransferase (Tocopherol Methyl Transferase 2; Methyl Transferase 2; MT2; tMT2), which is capable of methylating a tocopherol precursor. Subsequent methylation at the 5 position of γ-tocopherol by γ-tocopherol methyl-transferase (GMT) generates the biologically active α-tocopherol.

Some plants, e.g., soy, produce substantial amounts of delta and subsequently beta-tocopherol in their seed. The formation of δ-tocopherol or β-tocopherol can be prevented by the overexpression of tMT2, resulting in the methylation of the δ-tocopherol precursor, 2-methyl phytyl plastoquinone, to form 2,3-dimethyl-5-phytyl plastoquinone followed by cyclization with tocopherol cyclase to form γ-tocopherol and a subsequent methylation by GMT to form α-tocopherol. In a possible alternative pathway, β-tocopherol is directly converted to α-tocopherol by tMT2 via the methylation of the 3 position (see, for example, Biochemical Society Transactions, 11:504–510 (1983); Introduction to Plant Biochemistry, 2nd edition, Chapter 11 (1983); Vitamin Hormone, 29:153–200 (1971); Biochemical Journal, 109:577 (1968); and, Biochemical and Biophysical Research Communication, 28(3):295 (1967)). Since all potential mechanisms for the generation of α-tocopherol involve catalysis by tMT2, plants that are deficient in this activity accumulate δ-tocopherol and β-tocopherol. Plants which have increased tMT2 activity tend to accumulate γ-tocopherol and α-tocopherol. Since there is limited GMT activity in the seeds of many plants, these plants tend to accumulate γ-tocopherol.

There is a need in the art for nucleic acid molecules encoding enzymes involved in tocopherol biosynthesis, as well as related enzymes and antibodies for the enhancement or alteration of tocopherol production in plants. There is a further need for transgenic organisms expressing those nucleic acid molecules involved in tocopherol biosynthesis, which are capable of nutritionally enhancing food and feed sources.

SUMMARY OF THE INVENTION

The present invention includes and provides a substantially purified nucleic acid molecule encoding an amino acid sequence selected from the group consisting of SEQ ID NOs: 5, 9–11, 57–58, and 90.

The present invention includes and provides a substantially purified polypeptide molecule comprising an amino acid sequence selected from the group consisting of SEQ ID NOs: 5, 9–11, 57–58, and 90.

The present invention includes and provides an antibody capable of specifically binding a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NOs: 5, 9–11, 57–58, and 90.

The present invention includes and provides a transformed plant comprising a nucleic acid molecule comprising an introduced promoter region which functions in plant cells to cause the production of an mRNA molecule, wherein said introduced promoter region is linked to a transcribed nucleic acid molecule having a transcribed strand and a non-transcribed strand, wherein said transcribed strand is complementary to a nucleic acid molecule encoding a polypeptide selected from the group consisting of SEQ ID NOs: 5, 9–11, 43–44, 57–58, and 90, and wherein said transcribed nucleic acid molecule is linked to a 3' non-translated sequence that functions in the plant cells to cause termination of transcription and addition of polyadenylated ribonucleotides to a 3' end of the mRNA sequence.

The present invention includes and provides a method of producing a plant having a seed with an increased total tocopherol level comprising: (A) transforming said plant with an introduced nucleic acid molecule encoding a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NOs: 5, 9–11, 43–44, 57–58, and 90; and (B) growing said transformed plant.

The present invention includes and provides a method of producing a plant having a seed with an increased total tocopherol level comprising: (A) transforming said plant with an introduced first nucleic acid molecule, wherein said first nucleic acid molecule encodes a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NOs: 5, 9–11, 43–44, 57–58, and 90, and an introduced second nucleic acid molecule encoding an enzyme selected from the group consisting of tyrA, prephenate dehydrogenase, tocopherol cyclase, dxs, dxr, GMT, MT1, tMT2, GGPPS, GCPE, HPPD, AANT1, IDI, GGH, and complements thereof, and (B) growing said transformed plant.

The present invention includes and provides a seed derived from a transformed plant comprising an introduced nucleic acid molecule encoding a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NOs: 5, 9–11, 43–44, 57–58, and 90.

The present invention includes and provides a seed
derived from a transformed plant comprising an introduced The present invention includes and provides a Substan first nucleic acid molecule encoding an introduced polypep tially purified nucleic acid molecule encoding a polypeptide tide comprising an amino acid sequence selected from the having homogentisate prenyl transferase activity comprising group consisting of SEQ ID NOs: 5, 9–11, 43, 44, 57, 58. an amino acid sequence selected from the group consisting 45 and 90 and an introduced second nucleic acid encoding an of SEQ ID NOs: 43 and 44. enzyme selected from the group consisting of tyra, prephen The present invention includes and provides a Substan ate dehydrogenase, tocopherol cyclase, dxs, dxr, GMT, tially purified polypeptide having homogentisate prenyl MT1, GCPE, tMT2, GGPPS, HPPD, AANT1, IDI, GGH, transferase activity comprising an amino acid sequence and complements thereof. selected from the group consisting of SEQID NOs: 43 and 50 The present invention includes and provides a Substan 44. tially purified polypeptide comprising an amino acid The present invention includes and provides a trans sequence selected from the group consisting of SEQ ID formed plant comprising an introduced nucleic acid mol NOs: 39–42, 46–49, and 92–95 wherein said amino acid ecule encoding a polypeptide comprising an amino acid sequence is not derived from a nucleic acid molecule that is sequence selected from the group consisting of SEQ ID 55 derived from Nostoc punctiforme, Anabaena, Synechocystis, NOs: 5, 9–11, 43–44, 57–58, and 90, and complements Zea mays, Glycine max, Arabidopsis thaliana, Oryza sativa, thereof. Trichodesmium erythraeum, Chloroflexus aurantiacus, The present invention includes and provides a trans wheat, leek, canola, cotton, or tomato. 
The present invention formed plant comprising an introduced first nucleic acid includes and provides said Substantially purified polypeptide molecule encoding a polypeptide comprising an amino acid 60 wherein more than one more amino acid sequence is sequence selected from the group consisting of SEQ ID selected from the group consisting of SEQID NOs: 39–42, NOs: 5, 9–11, 43–44, 57–58, and 90, and complements 46–49, and 92–95. thereof, and an introduced second nucleic acid molecule The present invention includes and provides a Substan encoding an enzyme selected from the group consisting of tially purified nucleic acid molecule encoding a polypeptide tyr A, prephenate dehyrogenase, tocopherol cyclase, dxS, 65 comprising an amino acid sequence selected from the group dxr, GMT, MT1, tMT2, GCPE, GGPPS, HPPD, AANT1, consisting of SEQ ID NOs: 39–42, 46–49, and 92–95 IDI, GGH, and complements thereof. wherein said nucleic acid molecule is not derived from US 7,112,717 B2 7 8 Nostoc punctiforme, Anabaena, Synechocystis, Zea mays, polypeptide comprising an amino acid sequence selected Glycine max, Arabidopsis thaliana, Oryza sativa, Trichodes from the group consisting of SEQ ID NOs: 39–42, 46–49, mium erythraeum, Chloroflexus aurantiacus, wheat, leek, and 92–95 wherein said nucleic acid molecule does not canola, cotton, or tomato. The present invention includes comprise any of the nucleic acid sequences set forth in and provides said nucleic acid molecule wherein the 5 sequence listings in WO 00/68393: WO 00/63391; WO polypeptide further comprises more than one amino acid 01/62781; or WO 02/33060; and does not comprise SEQID sequence selected from the group consisting of SEQ ID NOS: 27–36; 59–60, 88–89, and 91 from the present appli NOs: 39–42, 46–49, and 92–95. cation, or the gene with Genebank Accession Nos. AI The present invention includes and provides a Substan 897027 or AW 563431. 
The present invention includes and tially purified nucleic acid molecule encoding a polypeptide 10 provides said nucleic acid molecule wherein the polypeptide comprising an amino acid sequence selected from the group further comprises more than one amino acid sequence consisting of SEQ ID NOs: 39–42, 46–49, and 92–95 selected from the group consisting of SEQID NOs: 39–42, wherein said nucleic acid molecule is not derived from 46–49, and 92–95. Nostoc punctiforme, Anabaena, Synechocystis, Zea mays, The present invention includes and provides a Substan Glycine max, Arabidopsis thaliana, Oryza sativa, Sulfolo 15 tially purified nucleic acid molecule comprising a nucleic bus, Aeropyum, Trichodesmium erythraeum, Chloroflexus acid sequence selected from the group consisting of SEQID aurantiacus, Sorghum, wheat, tomato, or leek. The present NOs: 31, 34–36, 59-60, and 91. invention includes and provides said nucleic acid molecule The present invention includes and provides for wherein the polypeptide further comprises more than one homogentisate prenyl transferases discovered using one or amino acid sequence selected from the group consisting of more of the alignments of FIGS. 2a-2c, 3a–3c, 24a–24b, SEQ ID NOs: 39–42, 46–49 and 92–95. 25a-25b, 33a–33c, 34a–34b, 35a–35b and 36. The present invention includes and provides a plant transformed with a nucleic acid molecule encoding a DESCRIPTION OF THE NUCLEIC AND AMINO polypeptide comprising an amino acid sequence selected ACID SEQUENCES from the group consisting of SEQ ID NOs: 39–42, 46–49, 25 and 92–95 wherein said nucleic acid molecule is not derived SEQID NO: 1 sets forth a Nostoc punctiforme homogen from Nostoc punctiforme, Anabaena, Synechocystis, Zea tisate prenyl transferase polypeptide. mays, Glycine max, Arabidopsis thaliana, Oryza sativa, SEQ ID NO: 2 sets forth an Anabaena homogentisate Sulfolobus, Aeropyum, Trichodesmium erythraeum, Chlo prenyl transferase polypeptide. 
roflexus aurantiacus, Sorghum, wheat, tomato, or leek. The 30 present invention includes and provides said nucleic acid SEQ ID NO: 3 sets forth a Synechocystis homogentisate molecule wherein the polypeptide further comprises more prenyl transferase polypeptide. than one amino acid sequence selected from the group SEQID NO: 4 sets forth a Zea mays homogentisate prenyl consisting of SEQ ID NOs: 39–42, 46–49, and 92–95. transferase polypeptide (HPT1). The present invention includes and provides a Substan 35 SEQ ID NO: 5 sets forth a Glycine max homogentisate tially purified polypeptide comprising an amino acid prenyl transferase polypeptide (HPT1-2). sequence selected from the group consisting of SEQ ID SEQ ID NO: 6 sets forth a Glycine max homogentisate NOs: 39–42, 46–49, and 92–95 wherein said polypeptide prenyl transferase polypeptide (HPT1-1). does not comprise any of the amino acid sequences set forth SEQ ID NO: 7 sets forth an Arabidopsis thaliana in sequence listings in WO 00/68393 (which sequences are 40 homogentisate prenyl transferase polypeptide (HPT1). incorporated herein by reference); WO 00/63391 (which SEQ ID NO: 8 sets forth a partial Cuphea pulcherrima sequences are incorporated herein by reference); WO homogentisate prenyl transferase polypeptide. 01/62781 (which sequences are incorporated herein by ref SEQ ID NO: 9 sets forth a leek homogentisate prenyl erence); or WO 02/33060 (which sequences are incorporated transferase polypeptide (HPT1). herein by reference); and does not comprise SEQ ID NOs: 45 SEQ ID NO: 10 sets forth a wheat homogentisate prenyl 1–11, 43–45, 57–58, 61–62, or 90 from the present appli transferase polypeptide (HPT1). cation. SEQ ID NO: 11 sets forth a Cuphea pulcherrima The present invention includes and provides a Substan homogentisate prenyl transferase polypeptide (HPT1). 
tially purified polypeptide comprising more than onean SEQ ID NOs: 12–15 represent domains from SEQ ID amino acid sequence selected from the group consisting of 50 NOS: 1-8. SEQ ID NOs: 39–42, 46–49, and 92–95. The present invention includes and provides a Substan SEQ ID NOs: 16–26 set forth primer sequences. SEQ ID NO: 27 sets forth a nucleic acid molecule tially purified nucleic acid molecule encoding a polypeptide encoding a Nostoc punctiforme homogentisate prenyl trans comprising an amino acid sequence selected from the group ferase polypeptide. consisting of SEQ ID NOs: 39–42, 46–49, and 92–95 55 wherein said nucleic acid molecule does not comprise any of SEQ ID NO: 28 sets forth a nucleic acid molecule the nucleic acid sequences set forth in sequence listings in encoding an Anabaena homogentisate prenyl transferase WO 00/68393: WO 00/63391; WO 01/62781; or WO polypeptide. 02/33060; and does not comprise SEQ ID NOS: 27-36, SEQ ID NO: 29 sets forth a nucleic acid molecule 59–60, 88–89, and 91 from the present application, or the 60 encoding a Synechocystis homogentisate prenyl transferase gene with Genebank Accession Nos. AI 897027 or AW polypeptide. 563431 The present invention includes and provides said SEQ ID NO: 30 sets forth a nucleic acid molecule nucleic acid molecule wherein the polypeptide further com encoding a Zea mays homogentisate prenyl transferase prises more than one amino acid sequence selected from the polypeptide (HPT1). group consisting of SEQID NOs: 39–42, 46–49, and 92–95. 65 SEQ ID NO: 31 sets forth a nucleic acid molecule The present invention includes and provides a plant encoding a Glycine max homogentisate prenyl transferase transformed with a nucleic acid molecule encoding a polypeptide (HPT1-2). 
SEQ ID NO: 32 sets forth a nucleic acid molecule encoding a Glycine max homogentisate prenyl transferase polypeptide (HPT1-1).
SEQ ID NO: 33 sets forth a nucleic acid molecule encoding an Arabidopsis thaliana homogentisate prenyl transferase polypeptide (HPT1).
SEQ ID NO: 34 sets forth a nucleic acid molecule encoding a Cuphea pulcherrima homogentisate prenyl transferase polypeptide (HPT1).
SEQ ID NO: 35 sets forth a nucleic acid molecule encoding a leek homogentisate prenyl transferase polypeptide (HPT1).
SEQ ID NO: 36 sets forth a nucleic acid molecule encoding a wheat homogentisate prenyl transferase polypeptide (HPT1).
SEQ ID NOs: 37–38 set forth primer sequences.
SEQ ID NOs: 39–42 set forth domains from SEQ ID NOs: 1–7 and 9–11.
SEQ ID NO: 43 sets forth a homogentisate prenyl transferase polypeptide from Trichodesmium erythraeum.
SEQ ID NO: 44 sets forth a homogentisate prenyl transferase polypeptide from Chloroflexus aurantiacus.
SEQ ID NO: 45 sets forth a putative sequence for an Arabidopsis thaliana homogentisate prenyl transferase polypeptide (HPT2).
SEQ ID NOs: 46–49 represent domains from SEQ ID NOs: 1–4, 6–7, 9–11, 57–58 and 91.
SEQ ID NOs: 50–56 set forth primer sequences.
SEQ ID NO: 57 sets forth an Arabidopsis thaliana homogentisate prenyl transferase polypeptide (HPT2).
SEQ ID NO: 58 sets forth an Oryza sativa homogentisate prenyl transferase polypeptide (HPT2).
SEQ ID NO: 59 sets forth a nucleic acid molecule encoding an Arabidopsis thaliana homogentisate prenyl transferase polypeptide (HPT2).
SEQ ID NO: 60 sets forth a nucleic acid molecule encoding an Oryza sativa homogentisate prenyl transferase polypeptide (HPT2).
SEQ ID NO: 61 sets forth a putative homogentisate prenyl transferase polypeptide from Arabidopsis thaliana (HPT2).
SEQ ID NO: 62 sets forth a putative homogentisate prenyl transferase polypeptide from Arabidopsis thaliana (HPT2).
SEQ ID NO: 63 sets forth an EST from Arabidopsis thaliana.
SEQ ID NO: 64 sets forth an EST from Medicago truncatula.
SEQ ID NO: 65 sets forth an EST from Medicago truncatula developing stem.
SEQ ID NO: 66 sets forth an EST from Medicago truncatula developing stem.
SEQ ID NO: 67 sets forth an EST from Medicago truncatula developing stem.
SEQ ID NO: 68 sets forth an EST from mixed potato tissues.
SEQ ID NO: 69 sets forth an EST from Arabidopsis thaliana, Columbia ecotype flower buds.
SEQ ID NO: 70 sets forth an EST from Arabidopsis thaliana.
SEQ ID NO: 71 sets forth an EST from Medicago truncatula.
SEQ ID NO: 72 sets forth an EST from Glycine max.
SEQ ID NOs: 73–83 and 84–87 set forth primer sequences.
SEQ ID NO: 88 sets forth a nucleic acid molecule encoding a homogentisate prenyl transferase polypeptide from cyanobacteria Trichodesmium erythraeum.
SEQ ID NO: 89 sets forth a nucleic acid molecule encoding a homogentisate prenyl transferase polypeptide from photobacteria Chloroflexus aurantiacus.
SEQ ID NO: 90 sets forth a Glycine max homogentisate prenyl transferase polypeptide (HPT2).
SEQ ID NO: 91 sets forth a nucleic acid molecule encoding a homogentisate prenyl transferase polypeptide from Glycine max (HPT2).
SEQ ID NOs: 92–95 represent domains from SEQ ID NOs: 1–4, 6–7, 9–11, 43–44, 57–58, and 90.
Note: cyanobacteria and photobacteria have one HPT. Plants have both HPT1 and HPT2. In soy, there are two variations of HPT1, HPT1-1 and HPT1-2, as well as HPT2.

BRIEF DESCRIPTION OF THE FIGURES
FIG. 1 is a schematic diagram of the tocopherol biosynthetic pathway.
FIGS. 2a–2c depict a sequence alignment for several homogentisate prenyl transferase polypeptides (SEQ ID NOs: 1–8).
FIGS. 3a–3c depict a sequence alignment for several homogentisate prenyl transferase polypeptides (SEQ ID NOs: 1–7, and 9–11).
FIG. 4 provides a schematic of the expression construct pCGN10800.
FIG. 5 provides a schematic of the expression construct pCGN10801.
FIG. 6 provides a schematic of the expression construct pCGN10803.
FIG. 7 provides a schematic of the expression construct pCGN10822.
FIG. 8 provides bar graphs of HPLC data obtained from seed extracts of transgenic Arabidopsis containing pCGN10822, which provides for the expression of the ATPT2 sequence (SEQ ID NO: 33), in the sense orientation, from the napin promoter. Provided are graphs for α-, β-, and δ-tocopherols, as well as total tocopherol for 22 transformed lines, as well as a nontransformed (wild-type) control.
FIG. 9 provides a bar graph of HPLC analysis of seed extracts from Arabidopsis plants transformed with pCGN10803 (lines 1387 through 1624, enhanced 35S ATPT2, in the antisense orientation), a nontransformed (wt) control, and an empty vector transformed control.
FIG. 10 provides a schematic of the expression construct pMON36581.
FIG. 11 provides a schematic of the expression construct pMON69933.
FIG. 12 provides a schematic of the expression construct pMON69924.
FIG. 13 provides a schematic of the expression construct pMON69943.
FIG. 14 depicts results of total tocopherol levels in recombinant soy lines.
FIG. 15 depicts pMON69960.
FIG. 16 depicts pMON36525.
FIG. 17 depicts pMON69963.
FIG. 18 depicts pMON69965.
FIG. 19 depicts pMON10098.
FIG. 20 depicts pMON69964.
FIG. 21 depicts pMON69966.
FIG. 22 depicts results of seed total tocopherol analysis.
FIG. 23 depicts results of seed total tocopherol analysis.
FIG. 24 depicts the alignments of SEQ ID NOs: 1–4, 6–7, 9–11, 57–58, and 91.
FIG. 25 depicts motifs V through VIII, SEQ ID NOs: 46–49.
FIG.
26 depicts a sequence tree derived from a multiple An example of a more preferred homogentisate prenyl alignment shown from SEQ ID NOs: 1-7, 9–11, 43, 44. transferase is a polypeptide with the amino acid sequence 57–58, and 90. selected from the group consisting of SEQID NOs: 5, 9–11, FIG. 27 depicts pMON81028. 43–44, 55, 58, and 90. In a more preferred embodiment, the FIG. 28 depicts pMON81023. homogentisate prenyl transferase is encoded by any nucleic FIG. 29 depicts pMON36596. acid molecule encoding an amino acid sequence selected FIG. 30 depicts pET30a(+) vector. from the group consisting of SEQID NOs: 5, 9–11, 43–44, FIG. 31 depicts pMON69993. 55, 58, and 90. FIG. 32 depicts pMON69992. In another preferred aspect of the present invention the FIGS. 33a–33d depicts a sequence alignment for several 10 nucleic acid molecule of the present invention comprises a homogentisate prenyl transferase polypeptide SEQID NOs: nucleic acid sequence encoding a polypeptide selected from 1–4, 6–7, 9–11, 43–44, 57–58, and 90. the group consisting of SEQID NOs: 5, 9–11, 43–44, 55, 58. FIG. 34 depicts motifs IX through XII, SEQ ID NOs: and 90, and complements thereof and fragments of either. 92 95. In another preferred aspect of the present invention the FIG. 35 depicts motifs I-IV, SEQ ID NOs: 39–42. 15 nucleic acid molecule of the present invention comprises a FIG. 36 depicts motifs A-D, (SEQ ID NOs: 1–8, FIGS. nucleic acid sequence selected from the group consisting of 2A-C). SEQ ID NOs: 31, 34–36, 59-60, and 91. In another embodiment, the present invention includes DETAILED DESCRIPTION nucleic acid molecules encoding polypeptides having a region of conserved amino acid sequence shown in any of The present invention provides a number of agents, for Figures FIGS. 
2a-2c, 3a–3c, 24a–24b, 25a–25b, 33a–33c, example, nucleic acid molecules and polypeptides associ 34a–34b, 35a–b and 36, and complements of those nucleic ated with the synthesis of tocopherol, and provides uses of acid molecules. In a preferred embodiment, the present Such agents. invention includes nucleic acid molecules encoding Agents 25 polypeptides comprising a sequence selected from the group The agents of the present invention will preferably be consisting of SEQ ID NOs: 39–42, 46–49, and 92–95, and “biologically active' with respect to either a structural complements of those nucleic acid molecules. The present attribute. Such as the capacity of a nucleic acid to hybridize invention includes and provides said nucleic acid molecule to another nucleic acid molecule, or the ability of a protein wherein the polypeptide further comprises more than one to be bound by an antibody (or to compete with another 30 amino acid sequence selected from the group consisting of molecule for such binding). Alternatively, such an attribute SEQ ID NOs: 39–42, 46–49, and 92–95. may be catalytic and thus involve the capacity of the agent In a further preferred embodiment the present invention to mediate a chemical reaction or response. The agents will includes nucleic acid molecules encoding polypeptides com preferably be “substantially purified’. The term “substan prising two or more, three or more, or four sequences tially purified, as used herein, refers to a molecule sepa 35 selected from the group consisting of SEQID NOs: 39–42, rated from substantially all other molecules normally asso 46–49, and 92–95, and complements of those nucleic acid ciated with it in its native environmental conditions. More molecules. In another embodiment, the present invention preferably a substantially purified molecule is the predomi includes nucleic acid molecules encoding polypeptides hav nant species present in a preparation. 
A Substantially purified ing homogentisate prenyl transferase activity and a region of molecule may be greater than about 60% free, preferably 40 conserved amino acid sequence shown in any of FIGS. about 75% free, more preferably about 90% free, and most 2a-2c, 3a–3c, 24a–24b, 25a–25b, 33a–33c, 34a–34b, preferably about 95% free from the other molecules (exclu 35a–35b and 36, and complements of those nucleic acid sive of solvent) present in the natural mixture. The term molecules. In a preferred embodiment, the present invention “substantially purified' is not intended to encompass mol includes nucleic acid molecules encoding polypeptides hav ecules present in their native environmental conditions. 45 ing homogentisate prenyl transferase activity and compris The agents of the present invention may also be recom ing a sequence selected from the group consisting of SEQID binant. As used herein, the term recombinant means any NOs: 39–42, 46–49, and 92–95 and complements of those agent (e.g., DNA, peptide etc.), that is, or results, however nucleic acid molecules. The present invention includes and indirectly, from human manipulation of a nucleic acid mol provides said nucleic acid molecule wherein the polypeptide ecule. 50 further comprises more than one amino acid sequence It is understood that the agents of the present invention selected from the group consisting of SEQID NOs: 39–42, may be labeled with reagents that facilitate detection of the 46–49, and 92–95. agent (e.g., fluorescent labels, Prober et al., Science, 238: In a further preferred embodiment the present invention 336–340 (1987); Albarella et al., EP 144 914; chemical includes nucleic acid molecules encoding polypeptides hav labels, Sheldon et al., U.S. Pat. No. 4,582,789; Albarella et 55 ing homogentisate prenyl transferase activity and compris al., U.S. Pat. No. 4,563,417; modified bases, Miyoshi et al., ing two or more, three or more, or four sequences selected EP 119 448). 
from the group consisting of SEQ ID NOs: 39–42, 46–49, Nucleic Acid Molecules and 92–95, and complements of those nucleic acid mol Agents of the present invention include nucleic acid ecules. In another embodiment, the present invention molecules. In a preferred aspect of the present invention the 60 includes nucleic acid molecules, excluding nucleic acid nucleic acid molecule comprises a nucleic acid sequence molecules derived from Nostoc punctiforme, Anabaena, that encodes a homogentisate prenyl transferase. As used Synechocystis, Zea mays, Glycine max, Arabidopsis herein, a homogentisate prenyl transferase is any plant thaliana, Oryza sativa, Trichodesmium erythraeum, Chlo protein that is capable of specifically catalyzing the forma roflexus aurantiacus, wheat, leek, canola, cotton, or tomato, tion of 2-methyl-6-phytylbenzoquinol (2-methyl-6-gera 65 encoding polypeptides having a region of conserved amino nylgeranylbenzoquinol) from phytyl-DP (GGDP) and acid sequence shown in any of FIGS. 2a-2c, 3a–3c. homogentisate. 24a–24b, 25a–25b, 33a–33c, 34a–34b,35a–35b and 36, and US 7,112,717 B2 13 14 complements of hose nucleic acid molecules. In a preferred regions of conserved sequence shown in FIGS. 2a-2c, embodiment, the present invention includes nucleic acid 3a–3c, 24a–24b, 25a–251), 33a–33c, 34a–34b, 35a–35b molecules, excluding nucleic acid molecules derived from and 36 are used to search for related amino acid sequences. Nostoc punctiforme, Anabaena, Synechocystis, Zea mays, In a preferred embodiment, a member selected from the Glycine max, Arabidopsis thaliana, Oryza sativa, Trichodes group consisting of SEQID NOs: 39–42 and 46–49 is used mium erythraeum, Chloroflexus aurantiacus, wheat, leek, to search for related sequences. In one embodiment, one or canola, cotton, or tomato, encoding polypeptides comprising more of SEQ ID NOs: 39–42, 46–49, and 92–95 is used to a sequence selected from the group consisting of SEQ ID search for related sequences. 
As used herein, "search for NOs: 39–42, 46–49, and 92–95 and complements of those related sequences' means any method of determining relat nucleic acid molecules. The present invention includes and 10 edness between two sequences, including, but not limited to, provides said nucleic acid molecule wherein the polypeptide searches that compare sequence homology: for example, a further comprises more than one amino acid sequence PBLAST search of a database for relatedness to a single selected from the group consisting of SEQID NOs: 39–42, amino acid sequence. Other searches may be counted using 46–49, and 92–95. profile based methods, such as the HMM (Hidden Markov In a further preferred embodiment the present invention 15 model) META-MEME (http://metameme.sdsc.edu/mhmm includes nucleic acid molecules, excluding nucleic acid links.html), PSI-BLAST (http://www.ncbi.nlm.nih.gov/ molecules derived from Nostoc punctiforme, Anabaena, BLAST7). (http://followed by metameme.sdsc.edu/mhmm Synechocystis, Zea mays, Glycine max, Arabidopsis links.html), PSI-BLAST (on the world wide web at thaliana, Oryza sativa, Trichodesmium erythraeum, Chlo incbi.nlm.nih.gov/BLAST/). roflexus aurantiacus, wheat, leek, canola, cotton, or tomato, As used herein, a nucleic acid molecule is said to be encoding polypeptides comprising two or more, three or "derived from a particular organism, species, ecotype, etc., more, or four sequences selected from the group consisting when the sequence of the nucleic acid molecule originated of SEQ ID NOs: 39–42, 46–49, and 92–95. from that organism, species, ecotype, etc. 
“Derived from In another embodiment, the present invention includes therefore includes copies of nucleic acid molecules derived nucleic acid molecules, excluding nucleic acid molecules 25 through, for example, PCR, as well as Synthetically gener derived from Nostoc punctiforme, Anabaena, Synechocystis, ated nucleic acid molecules having the same nucleic acid Zea mays, Glycine max, Arabidopsis thaliana, Oryza sativa, sequence as the original organism, species, ecotype, etc. Trichodesmium erythraeum, Chloroflexus aurantiacus, Likewise, a polypeptide is said to be "derived from a wheat, leek, canola, cotton, Sulfolobus, Aeropyum, Sor nucleic acid molecule when that nucleic acid molecule is ghum, or tomato, encoding polypeptides having homogen 30 used to code for the polypeptide, whether the polypeptide is tisate prenyl transferase activity and a region of conserved enzymatically generated from the nucleic acid molecule or amino acid sequence shown in any of FIGS. 2a-2c, 3a–3c. synthesized based on the sequence information inherent in 24a–24b, 25a–25b, 33a–33c, 34a–34b, 35a–35b and 36 and the nucleic acid molecule. complements of those nucleic acid molecules. In a preferred The present invention includes the use of the above embodiment, the present invention includes nucleic acid 35 described conserved sequences and fragments thereof in molecules, excluding nucleic acid molecules derived from transgenic plants, other organisms, and for other uses, Nostoc punctiforme, Anabaena, Synechocystis, Zea mays, including, without limitation, as described below. 
Glycine max, Arabidopsis thaliana, Oryza sativa, Trichodes In another preferred aspect of the present invention a mium erythraeum, Chloroflexus aurantiacus, wheat, leek, nucleic acid molecule comprises nucleotide sequences canola, cotton, or tomato, encoding polypeptides having 40 encoding a plastid transit peptide operably fused to a nucleic homogentisate prenyl transferase activity and comprising a acid molecule that encodes a protein or fragment of the sequence selected from the group consisting of SEQ ID present invention. NOs: 39–42, 46–49, and 92–95. The present invention In another preferred embodiment of the present invention, includes and provides said nucleic acid molecule wherein the nucleic acid molecules of the present invention encode the polypeptide further comprises more than one amino acid 45 mutant tocopherol homogentisate prenyl transferase sequence selected from the group consisting of SEQ ID enzymes. As used herein, a mutant enzyme is any enzyme NOs: 39–42, 46–49, and 92–95. that contains an amino acid that is different from the amino In a further preferred embodiment the present invention acid in the same position of a wild type enzyme of the same includes nucleic acid molecules, excluding nucleic acid type. molecules derived from Nostoc punctiforme, Anabaena, 50 Synechocystis, Zea mays, Glycine max, Arabidopsis It is understood that in a further aspect of nucleic acid thaliana, Oryza sativa, Trichodesmium erythraeum, Chlo sequences of the present invention, the nucleic acids can roflexus aurantiacus, wheat, leek, canola, cotton, or tomato, encode a protein that differs from any of the proteins in that encoding polypeptides having homogentisate prenyl trans one or more amino acids have been deleted, substituted or ferase activity and comprising two or more, three or more, 55 added without altering the function. 
For example, it is or four sequences selected from the group consisting of SEQ understood that codons capable of coding for Such conser ID NOs: 39-42, 46–49, and 92-95. Vative amino acid Substitutions are known in the art. In one embodiment of a method of the present invention, In one aspect of the present invention the nucleic acids of any of the nucleic acid sequences or polypeptide sequences, the present invention are said to be introduced nucleic acid or fragments of either, of the present invention can be used 60 molecules. A nucleic acid molecule is said to be “intro to search for related sequences. In a preferred embodiment, duced if it is inserted into a cell or organism as a result of a member selected from the group consisting of SEQ ID human manipulation, no matter how indirect. Examples of NOs: 5, 9–11, 43–44, 57–58, and 90 is used to search for introduced nucleic acid molecules include, without limita related sequences. In a preferred embodiment, a member tion, nucleic acids that have been introduced into cells via selected from the group consisting of SEQ ID NOs: 31, 65 transformation, transfection, injection, and projection, and 34–36, 59–60, 88–89, and 91 is used to search for related those that have been introduced into an organism via con sequences. In another embodiment, any of the motifs or jugation, endocytosis, phagocytosis, etc. US 7,112,717 B2 15 16 One subset of the nucleic acid molecules of the present Appropriate Stringency conditions which promote DNA invention is fragment nucleic acids molecules. Fragment hybridization are, for example, 6.0x sodium chloride/so nucleic acid molecules may consist of significant portion(s) dium citrate (SSC) at about 45° C., followed by a wash of of, or indeed most of the nucleic acid molecules of the 2.0xSSC at 20–25° C., are known to those skilled in the art present invention, such as those specifically disclosed. 
or can be found in Current Protocols in Molecular Biology, John Wiley & Sons, NY (1989), 6.3.1–6.3.6. For example, the salt concentration in the wash step can be selected from a low stringency of about 2.0×SSC at 50° C. to a high stringency of about 0.2×SSC at 65° C. In addition, the temperature in the wash step can be increased from low stringency conditions at room temperature, about 22° C., to high stringency conditions at about 65° C. Both temperature and salt may be varied, or either the temperature or the salt concentration may be held constant while the other variable is changed.

In a preferred embodiment, a nucleic acid of the present invention will specifically hybridize to one or more of the nucleic acid molecules described herein and complements thereof, such as those encoding any of SEQ ID NOs: 5, 9–11, 43–44, 57–58, and 90, under moderately stringent conditions, for example at about 2.0×SSC and about 65° C.

Alternatively, the fragments may comprise smaller oligonucleotides (having from about 15 to about 400 nucleotide residues and more preferably, about 15 to about 30 nucleotide residues, or about 50 to about 100 nucleotide residues, or about 100 to about 200 nucleotide residues, or about 200 to about 400 nucleotide residues, or about 275 to about 350 nucleotide residues).

A fragment of one or more of the nucleic acid molecules of the present invention may be a probe and specifically a PCR probe. A PCR probe is a nucleic acid molecule capable of initiating a polymerase activity while in a double-stranded structure with another nucleic acid. Various methods for determining the structure of PCR probes and PCR techniques exist in the art. Computer generated searches using programs such as Primer3 (www-genome.wi.mit.edu/cgi-bin/primer/primer3.cgi), STSPipeline (www-genome.wi.mit.edu/cgi-bin/www-STS_Pipeline), or GeneUp (Pesole et
al., BioTechniques, 25:112–123 (1998)), for example, can be used to identify potential PCR primers.

Nucleic acid molecules or fragments thereof of the present invention are capable of specifically hybridizing to other nucleic acid molecules under certain circumstances. Nucleic acid molecules of the present invention include those that specifically hybridize to those nucleic acid molecules disclosed herein, such as those encoding any of SEQ ID NOs: 5, 9–11, 43–44, 57–58, and 90, and complements thereof. Nucleic acid molecules of the present invention include those that specifically hybridize to a nucleic acid molecule comprising a member selected from the group consisting of SEQ ID NOs: 31, 34–36, 59–60, and 91, and complements thereof.

As used herein, two nucleic acid molecules are said to be capable of specifically hybridizing to one another if the two molecules are capable of forming an anti-parallel, double-stranded nucleic acid structure.

In a particularly preferred embodiment, a nucleic acid of the present invention will include those nucleic acid molecules that specifically hybridize to one or more nucleic acid molecules encoding any of SEQ ID NOs: 5, 9–11, 43–44, 57–58, and 90, and complements thereof, under high stringency conditions such as 0.2×SSC and about 65° C.

In one aspect of the present invention, the nucleic acid molecules of the present invention have one or more nucleic acid sequences encoding SEQ ID NOs: 5, 9–11, 43–44, 57–58, and 90, or complements thereof. In another aspect of the present invention, one or more of the nucleic acid molecules of the present invention share between about 100% and about 90% sequence identity with one or more of the nucleic acid sequences encoding SEQ ID NOs: 5, 9–11, 43–44, 57–58, and 90, and complements thereof, and fragments of either. In a further aspect of the present invention, one or more of the nucleic acid molecules of the present
invention share between about 100% and about 95% sequence identity with one or more of the nucleic acid sequences encoding SEQ ID NOs: 5, 9–11, 43–44, 57–58, and 90, and complements thereof, and fragments of either. In a more preferred aspect of the present invention, one or more of the nucleic acid molecules of the present invention share between about 100% and about 98% sequence identity with one or more of the nucleic acid sequences encoding SEQ ID NOs: 5, 9–11, 43–44, 57–58, and 90, and complements thereof, and fragments of either. In an even more preferred aspect of the present invention, one or more of the nucleic acid molecules of the present invention share between about 100% and about 99% sequence identity with one or more of the sequences encoding SEQ ID NOs: 5, 9–11, 43–44, 57–58, and 90, and complements thereof, and fragments of either.

In a preferred embodiment the percent identity calculations are performed using BLASTN or BLASTP (default,

A nucleic acid molecule is said to be the “complement” of another nucleic acid molecule if they exhibit complete complementarity. As used herein, molecules are said to exhibit “complete complementarity” when every nucleotide of one of the molecules is complementary to a nucleotide of the other. Two molecules are said to be “minimally complementary” if they can hybridize to one another with sufficient stability to permit them to remain annealed to one another under at least conventional “low-stringency” conditions. Similarly, the molecules are said to be “complementary” if they can hybridize to one another with sufficient stability to permit them to remain annealed to one another under conventional “high-stringency” conditions. Conventional stringency conditions are described by Sambrook et al., Molecular Cloning, A Laboratory Manual, 2nd Ed., Cold Spring Harbor Press, Cold Spring Harbor, N.Y. (1989), and by Haymes et al., Nucleic Acid Hybridization, A Practical Approach, IRL Press, Washington, D.C. (1985).
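By way of illustration only, the definition of complete complementarity given above can be sketched in Python. The function names below are hypothetical and are not part of the disclosure; the sketch considers only Watson-Crick pairing of A, C, G, and T in anti-parallel orientation and does not model hybridization stability or stringency.

```python
# Illustrative sketch only; function names are assumptions, not part of the disclosure.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the anti-parallel (reverse) complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_complete_complement(a: str, b: str) -> bool:
    """True when every nucleotide of `a` pairs with a nucleotide of `b`
    when the two strands are aligned in anti-parallel orientation."""
    return len(a) == len(b) and reverse_complement(a) == b.upper()


def mismatch_count(a: str, b: str) -> int:
    """Count non-pairing positions between two equal-length strands
    aligned in anti-parallel orientation."""
    if len(a) != len(b):
        raise ValueError("strands must have the same length")
    return sum(x != y for x, y in zip(reverse_complement(a), b.upper()))
```

Under these assumptions, `is_complete_complement("ATGC", "GCAT")` is true, while a strand with one non-pairing position gives a mismatch count of 1 and so falls short of complete complementarity.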
Departures from complete complementarity are therefore permissible, as long as such departures do not completely preclude the capacity of the molecules to form a double-stranded structure. Thus, in order for a nucleic acid molecule to serve as a primer or probe it need only be sufficiently complementary in sequence to be able to form a stable double-stranded structure under the particular solvent and salt concentrations employed.

parameters, version 2.0.8, Altschul et al., Nucleic Acids Res., 25:3389–3402 (1997)).

A nucleic acid molecule of the present invention can also encode a homolog polypeptide. As used herein, a homolog polypeptide molecule or fragment thereof is a counterpart protein molecule or fragment thereof in a second species (e.g., corn rubisco small subunit is a homolog of Arabidopsis rubisco small subunit). A homolog can also be generated by molecular evolution or DNA shuffling techniques, so that the molecule retains at least one functional or structure characteristic of the original polypeptide (see, for example, U.S. Pat. No. 5,811,238).

In another embodiment, the homolog is selected from the group consisting of alfalfa, Arabidopsis, barley, Brassica

five or fewer conservative amino acid changes. The encoding nucleotide sequence will thus have corresponding base substitutions, permitting it to encode biologically functional equivalent forms of the polypeptides of the present invention.
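The sequence identity ranges recited above (about 90%, 95%, 98%, or 99% up to about 100%) can be illustrated with a simplified calculation. The following Python sketch is not BLASTN or BLASTP and performs no alignment; it assumes the two sequences have already been aligned to equal length with "-" marking gaps, and the function names are hypothetical.

```python
from typing import Optional

# Lower bounds (percent) of the identity ranges recited in the text.
IDENTITY_TIERS = (99.0, 98.0, 95.0, 90.0)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Columns that are gaps in both sequences are ignored; a gap opposite a
    residue counts as a compared, non-identical column.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def highest_tier_met(identity: float) -> Optional[float]:
    """Return the most stringent recited lower bound that `identity` meets, or None."""
    for tier in IDENTITY_TIERS:
        if identity >= tier:
            return tier
    return None
```

A ten-residue alignment with a single mismatch gives 90% identity under this simplified measure, which meets only the broadest recited range.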
It is understood that certain amino acids may be substituted for other amino acids in a protein structure without appreciable loss of interactive binding capacity with structures such as, for example, antigen-binding regions of antibodies or binding sites on substrate molecules. Because it is the interactive capacity and nature of a protein that defines that protein's biological functional activity, certain amino acid sequence substitutions can be made in a protein sequence and, of course, its underlying DNA coding sequence and, nevertheless, a protein with like properties can still be obtained. It is thus contemplated by the inventors that various changes may be made in the peptide sequences of the proteins or fragments of the present invention, or corresponding DNA sequences that encode said peptides, without appreciable loss of their biological utility or activity. It is understood that codons capable of coding for such amino acid changes are known in the art.

campestris, Brassica napus, oilseed rape, broccoli, cabbage, canola, citrus, cotton, garlic, oat, Allium, flax, an ornamental plant, peanut, pepper, potato, rapeseed, rice, rye, sorghum, strawberry, sugarcane, sugarbeet, tomato, wheat, poplar, pine, fir, eucalyptus, apple, lettuce, lentils, grape, banana, tea, turf grasses, sunflower, soybean, corn, Phaseolus, crambe, mustard, castor bean, sesame, cottonseed, linseed, safflower, and oil palm. More particularly, preferred homologs are selected from canola, corn, Brassica campestris, Brassica napus, oilseed rape, soybean, crambe, mustard, castor bean, peanut, sesame, cottonseed, linseed, rapeseed, safflower, oil palm, flax, and sunflower. In an even more preferred embodiment, the homolog is selected from the group consisting of canola, rapeseed, corn, Brassica campestris, Brassica napus, oilseed rape, soybean, sunflower, safflower, oil palms, and peanut. In a particularly preferred embodiment, the homolog is soybean. In a par
ticularly preferred embodiment, the homolog is canola. In a particularly preferred embodiment, the homolog is oilseed rape.

In a preferred embodiment, nucleic acid molecules encoding SEQ ID NOs: 5, 9–11, 43–44, 57–58, and 90, and complements thereof, and fragments of either, or more preferably encoding SEQ ID NOs: 5, 9–11, 43–44, 57–58, and 90, and complements thereof, can be utilized to obtain such homologs.

In another further aspect of the present invention, nucleic acid molecules of the present invention can comprise sequences that differ from those encoding a polypeptide or fragment thereof due to the fact that a polypeptide can have one or more conservative amino acid changes, and nucleic acid sequences coding for the polypeptide can therefore have sequence differences. It is understood that codons capable of coding for such conservative amino acid substitutions are known in the art.

It is well known in the art that one or more amino acids in a native sequence can be substituted with other amino acid(s), the charge and polarity of which are similar to that of the native amino acid, i.e., a conservative amino acid substitution. Conservative substitutes for an amino acid within the native polypeptide sequence can be selected from other members of the class to which the amino acid belongs. Amino acids can be divided into the following four groups: (1) acidic amino acids; (2) basic amino acids; (3) neutral polar amino acids; and (4) neutral nonpolar amino acids. Representative amino acids within these various groups include, but are not limited to, (1) acidic (negatively charged) amino acids such as aspartic acid and glutamic acid; (2) basic (positively charged) amino acids such as arginine, histidine, and lysine; (3) neutral polar amino acids such as glycine, serine, threonine, cysteine, cystine, tyrosine, asparagine, and glutamine; and (4) neutral nonpolar (hydrophobic) amino acids such as alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine.

Conservative amino acid substitution within the native polypeptide sequence can be made by replacing one amino acid from within one of these groups with another amino acid from within the same group. In a preferred aspect, biologically functional equivalents of the proteins or fragments thereof of the present invention can have ten or fewer conservative amino acid changes, more preferably seven or fewer conservative amino acid changes, and most preferably

are known to inhibit transcription and/or translation in some organisms. In most cases, the adverse effects may be minimized by using sequences which do not contain more than five consecutive A+T or G+C.

In making such changes, the hydropathic index of amino acids may be considered. The importance of the hydropathic amino acid index in conferring interactive biological function on a protein is generally understood in the art (Kyte and Doolittle, J. Mol. Biol., 157:105–132 (1982)). It is accepted that the relative hydropathic character of the amino acid contributes to the secondary structure of the resultant polypeptide, which in turn defines the interaction of the protein with other molecules, for example, enzymes, substrates, receptors, DNA, antibodies, antigens, and the like.

Each amino acid has been assigned a hydropathic index on the basis of its hydrophobicity and charge characteristics (Kyte and Doolittle, J. Mol. Biol., 157:105–132 (1982)); these are isoleucine (+4.5), valine (+4.2), leucine (+3.8), phenylalanine (+2.8), cysteine/cystine (+2.5), methionine (+1.9), alanine (+1.8), glycine (-0.4), threonine (-0.7), serine (-0.8), tryptophan (-0.9), tyrosine (-1.3), proline (-1.6), histidine (-3.2), glutamate (-3.5), glutamine (-3.5), aspartate (-3.5), asparagine (-3.5), lysine (-3.9), and arginine (-4.5).

In making such changes, the substitution of amino acids whose hydropathic indices are within ±2 is preferred, those that are within ±1 are particularly preferred, and those within ±0.5 are even more particularly preferred.

It is also understood in the art that the substitution of like amino acids can be made effectively on the basis of hydrophilicity. U.S. Pat. No. 4,554,101 states that the greatest local average hydrophilicity of a protein, as governed by the hydrophilicity of its adjacent amino acids, correlates with a biological property of the protein.

As detailed in U.S. Pat. No. 4,554,101, the following hydrophilicity values have been assigned to amino acid residues: arginine (+3.0), lysine (+3.0), aspartate (+3.0±1), glutamate (+3.0±1), serine (+0.3), asparagine (+0.2), glutamine (+0.2), glycine (0), threonine (-0.4), proline (-0.5±1), alanine (-0.5), histidine (-0.5), cysteine (-1.0), methionine (-1.3), valine (-1.5), leucine (-1.8), isoleucine (-1.8), tyrosine (-2.3), phenylalanine (-2.5), and tryptophan (-3.4).

In making such changes, the substitution of amino acids whose hydrophilicity values are within ±2 is preferred, those that are within ±1 are particularly preferred, and those within ±0.5 are even more particularly preferred.

In a further aspect of the present invention, one or more of the nucleic acid molecules of the present invention differ in nucleic acid sequence from those for which a specific sequence is provided herein because one or more codons has been replaced with a codon that encodes a conservative substitution of the amino acid originally encoded.
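The substitution criteria described above (membership in the same charge and polarity group, and hydropathic indices within ±2, ±1, or ±0.5) can be illustrated with a short Python sketch. It is offered only as an example under stated assumptions: the function names are hypothetical, residues are given as one-letter codes, the hydropathic indices are the Kyte and Doolittle values listed in the text, and cystine is omitted from the groups because it has no one-letter code of its own.

```python
from typing import Optional

# Kyte-Doolittle hydropathic indices as listed in the text (one-letter codes).
HYDROPATHY = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}

# The four representative groups named in the text.
GROUPS = {
    "acidic": set("DE"),
    "basic": set("RHK"),
    "neutral polar": set("GSTCYNQ"),
    "neutral nonpolar": set("ALIVPFWM"),
}


def group_of(residue: str) -> str:
    """Return the name of the group containing the residue."""
    residue = residue.upper()
    for name, members in GROUPS.items():
        if residue in members:
            return name
    raise ValueError(f"unknown residue: {residue!r}")


def same_group(a: str, b: str) -> bool:
    """True when both residues belong to the same group (a conservative change)."""
    return group_of(a) == group_of(b)


def hydropathy_window(a: str, b: str) -> Optional[float]:
    """Return the tightest window (0.5, 1.0, or 2.0) that contains the absolute
    difference in hydropathic index, or None if the difference exceeds 2."""
    delta = abs(HYDROPATHY[a.upper()] - HYDROPATHY[b.upper()])
    for window in (0.5, 1.0, 2.0):
        if delta <= window:
            return window
    return None
```

For instance, isoleucine and valine share the neutral nonpolar group and differ by 0.3 in hydropathic index, so the sketch reports the ±0.5 window, whereas isoleucine and arginine differ by 9.0 and fall outside every recited window.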
Protein and Peptide Molecules

A class of agents includes one or more of the polypeptide molecules encoded by a nucleic acid agent of the present invention. A particular preferred class of proteins is that having an amino acid sequence selected from the group consisting of SEQ ID NOs: 5, 9–11, 43–44, 57–58, and 90, and fragments thereof.

In another embodiment, the present invention includes polypeptides having a region of conserved amino acid sequence shown in any of FIGS. 2a–2c, 3a–3c, 24a–24b, 25a–25b, 33a–33c, 34a–34b, 35a–35b and 36. In an embodiment, the present invention includes polypeptides comprising a sequence selected from the group consisting of SEQ ID NOs: 39–42, 46–49, and 92–95. The present invention includes and provides said substantially purified polypeptide wherein more than one amino acid sequence is selected from the group consisting of SEQ ID NOs: 39–42, 46–49, and 92–95. In a further preferred embodiment the present invention includes polypeptides comprising two or more, three or more, or four sequences selected from the group consisting of SEQ ID NOs: 39–42, 46–49, and 92–95.

In another embodiment, the present invention includes polypeptides having homogentisate prenyl transferase activity and a region of conserved amino acid sequence shown in any of FIGS. 2a–2c, 3a–3c, 25a–25c, 33a–33c, 34a–34b, 35a–35b or 36. In a preferred embodiment, the present invention includes polypeptides having homogentisate prenyl transferase activity and comprising a sequence selected from the group consisting of SEQ ID NOs: 39–42, 46–49, and 92–95. The present invention includes and provides said substantially purified polypeptide wherein more than one amino acid sequence is selected from the group consisting of SEQ ID NOs: 39–42, 46–49, and 92–95.

In a further preferred embodiment the present invention includes polypeptides having homogentisate prenyl transferase activity and comprising two or more, three or more, or four sequences selected from the group consisting of SEQ ID NOs: 39–42, 46–49, and 92–95.

In another embodiment, the present invention includes polypeptides having a region of conserved amino acid sequence shown in any of FIGS. 2a–2c, 3a–3c, 25a–25c, 33a–33c, 34a–34b, 35a–35b or 36, excluding polypeptides derived from nucleic acid molecules derived from Nostoc punctiforme, Anabaena, Synechocystis, Zea mays, Glycine max, Arabidopsis thaliana, Oryza sativa, Trichodesmium erythraeum, Chloroflexus aurantiacus, wheat, leek, canola, cotton, Sulfolobus, Aeropyrum, sorghum, or tomato. In a preferred embodiment, the present invention includes polypeptides comprising a sequence selected from the group

Agents of the present invention include nucleic acid molecules that encode at least about a contiguous 10 amino acid region of a polypeptide of the present invention, more preferably at least about a contiguous 25, 40, 50, 100, or 125 amino acid region of a polypeptide of the present invention.

In a preferred embodiment, any of the nucleic acid molecules of the present invention can be operably linked to a promoter region that functions in a plant cell to cause the production of an mRNA molecule, where the nucleic acid molecule that is linked to the promoter is heterologous with respect to that promoter. As used herein, “heterologous” means not naturally occurring together.

The nature of the coding sequences of non-plant genes can distinguish them from plant genes as well as many other heterologous genes expressed in plants. For example, the average A+T content of bacteria can be higher than that for plants. The A+T content of the genomes (and thus the genes) of any organism are features of that organism and reflect its evolutionary history. While within any one organism genes have similar A+T content, the A+T content can vary tremendously from organism to organism. For example, some Bacillus species have among the most A+T rich genomes while some Streptomyces species are among the least A+T rich genomes (about 30 to 35% A+T).

Due to the degeneracy of the genetic code and the limited number of codon choices for any amino acid, most of the “excess” A+T of the structural coding sequences of some Bacillus species, for example, are found in the third position of the codons. That is, genes of some Bacillus species have A or T as the third nucleotide in many codons. Thus A+T content in part can determine codon usage bias. In addition, it is clear that genes evolve for maximum function in the organism in which they evolve. This means that particular nucleotide sequences found in a gene from one organism, where they may play no role except to code for a particular stretch of amino acids, have the potential to be recognized as gene control elements in another organism (such as transcriptional promoters or terminators, polyA addition sites, intron splice sites, or specific mRNA degradation signals). It is perhaps surprising that such misread signals are not a more common feature of heterologous gene expression, but this can be explained in part by the relatively homogeneous A+T content (about 50%) of many organisms. This A+T content plus the nature of the genetic code put clear constraints on the likelihood of occurrence of any particular oligonucleotide sequence. Thus, a gene from E. coli with a 50% A+T content is much less likely to contain any particular A+T rich segment than a gene from B. thuringiensis. The same can be true between genes in a bacterium and genes in a plant, for example.
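The A+T observations above lend themselves to a simple calculation. The Python sketch below is illustrative only, with hypothetical function names: it computes overall A+T content, A+T content at the third codon position of an in-frame coding sequence, and the longest run made up solely of A/T bases or solely of G/C bases, which may be compared with the "not more than five consecutive A+T or G+C" guideline stated earlier.

```python
def at_fraction(seq: str) -> float:
    """Fraction of bases in the sequence that are A or T."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "AT" for base in seq) / len(seq)


def third_position_at_fraction(cds: str) -> float:
    """Fraction of third-codon-position bases that are A or T.

    Assumes `cds` starts in frame; a trailing partial codon is ignored.
    """
    thirds = cds.upper()[2::3]
    if not thirds:
        raise ValueError("sequence shorter than one codon")
    return sum(base in "AT" for base in thirds) / len(thirds)


def longest_homogeneous_run(seq: str) -> int:
    """Length of the longest run consisting only of A/T bases or only of G/C bases."""
    longest = 0
    current = 0
    previous_class = None
    for base in seq.upper():
        if base in "AT":
            base_class = "AT"
        elif base in "GC":
            base_class = "GC"
        else:
            base_class = None  # ambiguous or unknown symbol breaks any run
        if base_class is not None and base_class == previous_class:
            current += 1
        else:
            current = 1 if base_class is not None else 0
        previous_class = base_class
        longest = max(longest, current)
    return longest
```

A sequence for which `longest_homogeneous_run` returns a value greater than five would, under the guideline quoted above, be a candidate for codon changes that interrupt the run without altering the encoded amino acids.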
Any of the nucleic acid molecules of the present invention can be altered via any methods known in the art in order to make the codons within the nucleic acid molecule more appropriate for the organism in which the nucleic acid molecule is located. That is, the present invention includes the modification of any of the nucleic acid molecules disclosed herein to improve codon usage in a host organism.

It is preferred that regions comprising many consecutive A+T bases or G+C bases are disrupted since these regions are predicted to have a higher likelihood to form hairpin structure due to self-complementarity. Therefore, insertion of heterogeneous base pairs would reduce the likelihood of self-complementary secondary structure formation which

not limited to, disulfide bond formation, glycosylation, phosphorylation, or oligomerization.

consisting of SEQ ID NOs: 39–42, 46–49, and 92–95 excluding polypeptides derived from nucleic acid molecules derived from Nostoc punctiforme, Anabaena, Synechocystis, Zea mays, Glycine max, Arabidopsis thaliana, Oryza sativa, Trichodesmium erythraeum, Chloroflexus aurantiacus, wheat, leek, canola, cotton, or tomato. The present invention includes and provides said substantially purified polypeptide wherein more than one amino acid sequence is selected from the group consisting of SEQ ID NOs: 39–42, 46–49, and 92–95.

In a further preferred embodiment the present invention includes polypeptides comprising two or more, three or more, or four sequences selected from the group consisting of SEQ ID NOs: 39–42, 46–49, and 92–95, excluding polypeptides derived from nucleic acid molecules derived from Nostoc punctiforme, Anabaena, Synechocystis, Zea
mays, Glycine max, Arabidopsis thaliana, Oryza sativa, Trichodesmium erythraeum, Chloroflexus aurantiacus, wheat, leek, canola, cotton, or tomato.

In another embodiment, the present invention includes polypeptides having homogentisate prenyl transferase activity and a region of conserved amino acid sequence shown in any of FIGS. 2a–2c, 3a–3c, 25a–25c, 33a–33c, 34a–34b, 35a–35b or 36, excluding polypeptides derived from nucleic acid molecules derived from Nostoc punctiforme, Anabaena, Synechocystis, Zea mays, Glycine max, Arabidopsis thaliana, Oryza sativa, Trichodesmium erythraeum, Chloroflexus aurantiacus, wheat, leek, canola, cotton, or tomato. In a preferred embodiment, the present invention includes polypeptides having homogentisate prenyl transferase activity and comprising a sequence selected from the group consisting of SEQ ID NOs: 39–42, 46–49, and 92–95, excluding polypeptides derived from nucleic acid molecules derived from Nostoc punctiforme, Anabaena, Synechocystis, Zea mays, Glycine max, Arabidopsis thaliana, Oryza sativa, Trichodesmium erythraeum, Chloroflexus aurantiacus, wheat, leek, canola, cotton, or tomato. The present invention includes and provides said substantially purified polypeptide wherein more than one amino acid sequence is selected from the group consisting of SEQ ID NOs: 39–42, 46–49, and 92–95.

In a further preferred embodiment the present invention includes polypeptides having homogentisate prenyl transferase activity and comprising two or more, three or more, or four sequences selected from the group consisting of SEQ ID NOs: 39–42, 46–49, and 92–95, excluding polypeptides derived from nucleic acid molecules derived from Nostoc punctiforme, Anabaena, Synechocystis, Zea mays, Glycine max, Arabidopsis thaliana, Oryza sativa, Trichodesmium erythraeum, Chloroflexus aurantiacus, wheat, leek, canola, cotton, or tomato.

Polypeptide agents may have C-terminal or N-terminal amino acid sequence extensions. One class of N-terminal extensions employed in a preferred embodiment are plastid transit peptides. When employed, plastid transit peptides can be operatively linked to the N-terminal sequence, thereby permitting the localization of the agent polypeptides to plastids. In an embodiment of the present invention, any suitable plastid targeting sequence can be used. Where suitable, a plastid targeting sequence can be substituted for a native plastid targeting sequence, for example, for the CTP occurring natively in the tocopherol homogentisate prenyl transferase protein. In a further embodiment, a plastid targeting sequence that is heterologous to any homogentisate prenyl transferase protein or fragment described herein can be used. In a further embodiment, any suitable, modified plastid targeting sequence can be used. In another embodiment, the plastid targeting sequence is a CTP1 sequence (see WO 00/61771).

In a preferred aspect a protein of the present invention is targeted to a plastid using either a native transit peptide sequence or a heterologous transit peptide sequence. In the case of nucleic acid sequences corresponding to nucleic acid sequences of non-higher plant organisms such as cyanobacteria, such nucleic acid sequences can be modified to attach the coding sequence of the protein to a nucleic acid sequence of a plastid targeting peptide.

As used herein, the terms “protein”, “peptide molecule”, or “polypeptide” include any molecule that comprises five or more amino acids.

Thus, as used herein, the terms “protein”, “peptide molecule”, or “polypeptide” include any protein that is modified by any biological or non-biological process. The terms “amino acid” and “amino acids” refer to all naturally occurring L-amino acids. This definition is meant to include norleucine, norvaline, ornithine, homocysteine, and homoserine.

One or more of the protein or fragments thereof, peptide molecules, or polypeptide molecules may be produced via chemical synthesis, or more preferably, by expression in a suitable bacterial or eukaryotic host. Suitable methods for expression are described by Sambrook et al., In: Molecular Cloning, A Laboratory Manual, 2nd Edition, Cold Spring Harbor Press, Cold Spring Harbor, N.Y. (1989) or similar texts.

A “protein fragment” is a peptide or polypeptide molecule whose amino acid sequence comprises a subset of the amino acid sequence of that protein. A protein or fragment thereof that comprises one or more additional peptide regions not derived from that protein is a “fusion” protein. Such molecules may be derivatized to contain carbohydrate or other moieties (such as keyhole limpet hemocyanin). Fusion protein or peptide molecules of the present invention are preferably produced via recombinant means.

Another class of agents comprises protein, peptide molecules, or polypeptide molecules, or fragments or fusions thereof comprising SEQ ID NOs: 5, 9–11, 43–44, 57–58, and 90, and fragments thereof in which conservative, non-essential, or non-relevant amino acid residues have been added, replaced, or deleted. Computerized means for designing modifications in protein structure are known in the art (Dahiyat and Mayo, Science, 278:82–87 (1997)).

A protein, peptide, or polypeptide of the present invention can also be a homolog protein, peptide, or polypeptide. As used herein, a homolog protein, peptide, or polypeptide or fragment thereof is a counterpart protein, peptide, or polypeptide or fragment thereof in a second species. A homolog can also be generated by molecular evolution or DNA shuffling techniques, so that the molecule retains at least one functional or structure characteristic of the original (see, for example, U.S. Pat. No. 5,811,238).

In another embodiment, the homolog is selected from the group consisting of alfalfa, Arabidopsis, barley, broccoli, cabbage, canola, citrus, cotton, garlic, oat, Allium, flax, an ornamental plant, peanut, pepper, potato, rapeseed, rice, rye, sorghum, strawberry, sugarcane, sugarbeet, tomato, wheat, poplar, pine, fir, eucalyptus, apple, lettuce, lentils, grape, banana, tea, turf grasses, sunflower, soybean, corn, and Phaseolus. More particularly, preferred homologs are selected from canola, rapeseed, corn, Brassica campestris, Brassica napus, oilseed rape, soybean, crambe, mustard, castor bean, peanut, sesame, cottonseed, linseed, safflower, oil palm, flax, and sunflower. In an even more preferred embodiment, the homolog is selected from the group consisting of canola, rapeseed, corn, Brassica campestris, Brassica napus, oilseed rape, soybean, sunflower, safflower, oil palms, and peanut. In a preferred embodiment, the homolog is soybean. In a preferred embodiment, the homolog is canola. In a preferred embodiment, the homolog is oilseed rape.

In a preferred embodiment, the nucleic acid molecules of the present invention or complements and fragments of either can be utilized to obtain such homologs.
Agents of the present invention include proteins and fragments thereof comprising at least about a contiguous 10 amino acid region preferably comprising at least about a contiguous 20 amino acid region, even more preferably comprising at least about a contiguous 25, 35, 50, 75, or 100 amino acid region of a protein of the present invention. In another preferred embodiment, the proteins of the present invention include between about 10 and about 25 contiguous amino acid region, more preferably between about 20 and about 50 contiguous amino acid region, and even more preferably between about 40 and about 80 contiguous amino acid region.

Plant Constructs and Plant Transformants

One or more of the nucleic acid molecules of the present invention may be used in plant transformation or transfection. Exogenous genetic material may be transferred into a plant cell and the plant cell regenerated into a whole, fertile, or sterile plant.

It is well known in the art that protein, peptide, or polypeptide molecules may undergo modification, including post-translational modifications, such as, but

nucleic acid molecule encoding any of the following enzymes: tyra, prephenate dehydrogenase, tocopherol cyclase, dxs, dxr, GGPPS, HPPD, tMT2, MT1, GCPE, AANT1, IDI, GGH, GMT, or a plant ortholog and an antisense construct for homogentisic acid dioxygenase are introduced into a plant.

For any of the above combinations, a nucleic acid molecule encoding a homogentisate prenyl transferase polypeptide encodes a polypeptide comprising a sequence selected from the group consisting of SEQ ID NOs: 5, 9–11, 43–44, 57–58, and 90. In another preferred embodiment, a nucleic acid molecule encoding a homogentisate prenyl transferase polypeptide encodes a polypeptide comprising one or more of SEQ ID NOs: 39–42, 46–49, and 92–95. In a preferred
embodiment, the homogentisate prenyl transferase polypeptide does not have an amino acid sequence that is derived from a nucleic acid derived from Nostoc punctiforme, Anabaena, Synechocystis, Zea mays, Glycine max, Arabidopsis thaliana, Oryza sativa, wheat, leek, canola, cotton, or tomato.

Such genetic material may be transferred into either monocotyledons or dicotyledons including, but not limited to canola, corn, soybean, Arabidopsis, Phaseolus, peanut, alfalfa, wheat, rice, oat, sorghum, rapeseed, rye, tritordeum, millet, fescue, perennial ryegrass, sugarcane, cranberry, papaya, banana, safflower, oil palms, flax, muskmelon, apple, cucumber, dendrobium, gladiolus, chrysanthemum, liliacea, cotton, eucalyptus, sunflower, Brassica campestris, Brassica napus, oilseed rape, turfgrass, sugarbeet, coffee and dioscorea (Christou, In: Particle Bombardment for Genetic Engineering of Plants, Biotechnology Intelligence Unit. Academic Press, San Diego, Calif. (1996)), with canola, corn, Brassica campestris, Brassica napus, oilseed rape, rapeseed, soybean, crambe, mustard, castor bean, peanut, sesame, cottonseed, linseed, safflower, oil palm, flax, and sunflower preferred, and canola, rapeseed, corn, Brassica campestris, Brassica napus, oilseed rape, soybean, sunflower, safflower, oil palms, and peanut preferred. In a more preferred embodiment, the genetic material is transferred into canola. In another more preferred embodiment, the genetic material is transferred into oilseed rape. In another particularly preferred embodiment, the genetic material is transferred into soybean.

Transfer of a nucleic acid molecule that encodes a protein can result in expression or overexpression of that polypeptide in a transformed cell or transgenic plant. One or more of the proteins or fragments thereof encoded by nucleic acid molecules of the present invention may be overexpressed in a transformed cell or transformed plant. Such expression or overexpression may be the result of transient or stable transfer of the exogenous genetic material.

In a preferred embodiment, expression or overexpression of a polypeptide of the present invention in a plant provides in that plant, relative to an untransformed plant with a similar genetic background, an increased level of tocopherols.

In a preferred embodiment, expression or overexpression of a polypeptide of the present invention in a plant provides in that plant, relative to an untransformed plant with a similar genetic background, an increased level of α-tocopherols.

In a preferred embodiment, expression or overexpression of a polypeptide of the present invention in a plant provides in that plant, relative to an untransformed plant with a similar genetic background, an increased level of γ-tocopherols.

Exogenous genetic material is any genetic material, whether naturally occurring or otherwise, from any source that is capable of being inserted into any organism.

In a preferred aspect of the present invention the exogenous genetic material comprises a nucleic acid sequence of the present invention, more preferably one that encodes homogentisate prenyl transferase. In another preferred aspect of the present invention the exogenous genetic material of the present invention comprises a nucleic acid sequence encoding an amino acid sequence selected from the group consisting of SEQ ID NOs: 5, 9–11, 43–44, 57–58, and 90, and complements thereof and fragments of either. In a further aspect of the present invention the exogenous genetic material comprises a nucleic acid sequence encoding an amino acid sequence selected from the group consisting of SEQ ID NOs: 5, 9–11, 43–44, 57–58, and 90, and fragments of SEQ ID NOs: 5, 9–11, 43–44, 57–58, and 90.

In an embodiment of the present invention, exogenous genetic material encoding a homogentisate prenyl transferase enzyme or fragment thereof is introduced into a plant with one or more additional genes. In one embodiment, preferred combinations of genes include a nucleic acid molecule of the present invention and one or more of the following genes: tyra (e.g., WO 02/089561 and Xia et al., J. Gen. Microbiol., 138:1309–1316 (1992)), tocopherol cyclase (e.g., WO 01/79472), prephenate dehydrogenase, dxs (e.g., Lois et al., Proc. Natl. Acad. Sci. (U.S.A.), 95(5):2105–2110 (1998)), dxr (e.g., U.S. Pub. 2002/0108814A and Takahashi et al., Proc. Natl. Acad. Sci. (U.S.A.), 95(17):9879–9884 (1998)), GGPPS (e.g., Bartley and Scolnik, Plant Physiol., 104:1469–1470 (1994)), HPPD (e.g., Norris et al., Plant Physiol., 117:1317–1323 (1998)), GMT (e.g., U.S. application Ser. No. 10/219,810, filed Aug. 16, 2002), tMT2 (e.g., U.S. application Ser. No. 10/279,029, filed Oct. 24, 2002), AANT1 (e.g., WO 02/090506), IDI (E.C.:5.3.3.2; Blanc et al., In: Plant Gene Register, PRG 96-036; and Sato et al., DNA Res., 4:215–230 (1997)), GGH (Graßes et al., Planta, 213:620 (2001)), or a plant ortholog and an antisense construct for homogentisic acid dioxygenase (Kridl et al., Seed Sci. Res., 1:209–219 (1991); Keegstra, Cell, 56(2):247–53 (1989); Nawrath, et al., Proc. Natl. Acad. Sci. (U.S.A.), 91:12760–12764 (1994); Cyanobase, www.kazusa.or.jp/cyanobase; Smith et al., Plant J., 11:83–92 (1997); WO 00/32757; ExPASy Molecular Biology Server, http://us.expasy.org/enzyme; MT1, WO 00/10380; gcpE, WO 02/12478; Saint Guily et al., Plant Physiol., 100(2):1069–1071 (1992); Sato et al., J. DNA Res., 7(1):31–63 (2000)). In such combinations, in some crop plants, e.g., canola, a preferred promoter is a napin promoter and a preferred plastid targeting sequence is a CTP1 sequence. It is preferred that gene products are targeted to the plastid.

In a preferred combination a nucleic acid molecule encoding a homogentisate prenyl transferase polypeptide and a
In a preferred embodiment, expression or overexpression of a polypeptide of the present invention in a plant provides in that plant, relative to an untransformed plant with a similar genetic background, an increased level of δ-tocopherols.

In a preferred embodiment, expression or overexpression of a polypeptide of the present invention in a plant provides in that plant, relative to an untransformed plant with a similar genetic background, an increased level of β-tocopherols.

In a preferred embodiment, expression or overexpression of a polypeptide of the present invention in a plant provides in that plant, relative to an untransformed plant with a similar genetic background, an increased level of tocotrienols.

In a preferred embodiment, expression or overexpression of a polypeptide of the present invention in a plant provides in that plant, relative to an untransformed plant with a similar genetic background, an increased level of α-tocotrienols.

In a preferred embodiment, expression or overexpression of a polypeptide of the present invention in a plant provides in that plant, relative to an untransformed plant with a similar genetic background, an increased level of γ-tocotrienols.

In a preferred embodiment, expression or overexpression of a polypeptide of the present invention in a plant provides in that plant, relative to an untransformed plant with a similar genetic background, an increased level of δ-tocotrienols.

In a preferred embodiment, expression or overexpression of a polypeptide of the present invention in a plant provides in that plant, relative to an untransformed plant with a similar genetic background, an increased level of β-tocotrienols.

In a preferred embodiment, expression or overexpression of a polypeptide of the present invention in a plant provides in that plant, relative to an untransformed plant with a similar genetic background, an increased level of plastoquinols.

In any of the embodiments described herein, an increase in γ-tocopherol, α-tocopherol, or both can lead to a decrease in the relative proportion of β-tocopherol, δ-tocopherol, or both. Similarly, an increase in γ-tocotrienol, α-tocotrienol, or both can lead to a decrease in the relative proportion of β-tocotrienol, δ-tocotrienol, or both.

In another embodiment, expression or overexpression of a polypeptide of the present invention in a plant provides in that plant, or a tissue of that plant, relative to an untransformed plant or plant tissue with a similar genetic background, an increased level of a homogentisate prenyl transferase protein or fragment thereof.

In some embodiments, the levels of one or more products of the tocopherol biosynthesis pathway, including any one or more of tocopherols, α-tocopherols, γ-tocopherols, δ-tocopherols, β-tocopherols, tocotrienols, α-tocotrienols, γ-tocotrienols, δ-tocotrienols, β-tocotrienols are increased by greater than about 10%, or more preferably greater than about 25%, 35%, 50%, 75%, 80%, 90%, 100%, 150%, 200%, 1,000%, 2,000%, or 2,500%. The levels of products may be increased throughout an organism such as a plant or localized in one or more specific organs or tissues of the organism. For example, the levels of products may be increased in one or more of the tissues and organs of a plant including without limitation: roots, tubers, stems, leaves, stalks, fruit, berries, nuts, bark, pods, seeds and flowers. A preferred organ is a seed.

In some embodiments, the levels of one or more products of the tocopherol biosynthesis pathway, including any one or more of tocopherols, α-tocopherols, γ-tocopherols, δ-tocopherols, β-tocopherols, tocotrienols, α-tocotrienols, γ-tocotrienols, δ-tocotrienols, β-tocotrienols are increased so that they constitute greater than about 10%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% of the total tocopherol content of the organism or tissue. The levels of products may be increased throughout an organism such as a plant or localized in one or more specific organs or tissues of the organism. For example, the levels of products may be increased in one or more of the tissues and organs of a plant including without limitation: roots, tubers, stems, leaves, stalks, fruit, berries, nuts, bark, pods, seeds and flowers. A preferred organ is a seed.

In a preferred embodiment, expression of enzymes involved in tocopherol, tocotrienol or plastoquinol synthesis in the seed will result in an increase in γ-tocopherol levels due to the absence of significant levels of GMT activity in those tissues. In another preferred embodiment, expression of enzymes involved in tocopherol, tocotrienol, or plastoquinol synthesis in photosynthetic tissues will result in an increase in α-tocopherol due to the higher levels of GMT activity in those tissues relative to the same activity in seed tissue.

In another preferred embodiment, the expression of enzymes involved in tocopherol, tocotrienol, or plastoquinol synthesis in the seed will result in an increase in the total tocopherol, tocotrienol, or plastoquinol level in the plant.

In some embodiments, the levels of tocopherols or a species such as α-tocopherol may be altered. In some embodiments, the levels of tocotrienols may be altered. Such alteration can be compared to a plant with a similar background.

In another embodiment, either the α-tocopherol level, α-tocotrienol level, or both of plants that natively produce high levels of either α-tocopherol, α-tocotrienol or both (e.g., sunflowers), can be increased by the introduction of a gene coding for a homogentisate prenyl transferase enzyme.

In a preferred aspect, a similar genetic background is a background where the organisms being compared share about 50% or greater of their nuclear genetic material. In a more preferred aspect a similar genetic background is a background where the organisms being compared share about 75% or greater, even more preferably about 90% or greater of their nuclear genetic material. In another even more preferable aspect, a similar genetic background is a background where the organisms being compared are plants, and the plants are isogenic except for any genetic material originally introduced using plant transformation techniques.

In another preferred embodiment, expression or overexpression of a polypeptide of the present invention in a transformed plant may provide tolerance to a variety of stresses, e.g., oxidative stress tolerance such as to oxygen or ozone, UV tolerance, cold tolerance, or fungal/microbial pathogen tolerance.

As used herein in a preferred aspect, a tolerance or resistance to stress is determined by the ability of a plant, when challenged by a stress such as cold, to produce a plant having a higher yield than one without such tolerance or resistance to stress. In a particularly preferred aspect of the present invention, the tolerance or resistance to stress is measured relative to a plant with a similar genetic background to the tolerant or resistant plant except that the plant reduces the expression, expresses, or overexpresses a protein or fragment thereof of the present invention.
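The percentage language used above (a product level increased by greater than about 10%, 25%, and so on, or a product constituting greater than a stated percentage of the total tocopherol content) reduces to simple arithmetic. The following is a minimal sketch of that arithmetic only; the function names and all measurement values are hypothetical and invented for illustration, and are not data from this specification.

```python
from typing import Dict


def percent_increase(baseline: float, transformed: float) -> float:
    """Percent increase of a product level relative to an untransformed control."""
    if baseline <= 0:
        raise ValueError("baseline level must be positive")
    return (transformed - baseline) / baseline * 100.0


def share_of_total(levels: Dict[str, float], species: str) -> float:
    """Percentage of the total tocopherol content contributed by one species."""
    total = sum(levels.values())
    if total <= 0:
        raise ValueError("total content must be positive")
    return levels[species] / total * 100.0


# Hypothetical seed tocopherol levels (arbitrary units), invented for illustration.
control = {"alpha": 10.0, "beta": 2.0, "gamma": 300.0, "delta": 8.0}
transgenic = {"alpha": 12.0, "beta": 2.0, "gamma": 450.0, "delta": 16.0}

# Increase in total tocopherol relative to the control with a similar background.
total_increase = percent_increase(sum(control.values()), sum(transgenic.values()))

# Share of the transgenic total contributed by gamma-tocopherol.
gamma_share = share_of_total(transgenic, "gamma")

# Compare against one of the recited thresholds, e.g. "greater than about 25%".
exceeds_25_percent = total_increase > 25.0
```

With these invented numbers the total rises from 320 to 480 units, so the sketch reports an increase above the 25% threshold, with γ-tocopherol making up most of the total.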
Exogenous genetic material may be transferred into a host cell by the use of a DNA vector or construct designed for such a purpose. Design of such a vector is generally within the skill of the art (see, Plant Molecular Biology: A Laboratory Manual, Clark (ed.), Springer, N.Y. (1997)).

A construct or vector may include a plant promoter to express the polypeptide of choice. In a preferred embodiment, any nucleic acid molecules described herein can be operably linked to a promoter region which functions in a plant cell to cause the production of an mRNA molecule. For example, any promoter that functions in a plant cell to cause the production of an mRNA molecule, such as those promoters described herein, without limitation, can be used. In a preferred embodiment, the promoter is a plant promoter.

A number of promoters that are active in plant cells have been described in the literature. These include the nopaline synthase (NOS) promoter (Ebert et al., Proc. Natl. Acad. Sci. (U.S.A.), 84:5745–5749 (1987)), the octopine synthase (OCS) promoter (which is carried on tumor-inducing plasmids of Agrobacterium tumefaciens), the caulimovirus promoters such as the cauliflower mosaic virus (CaMV) 19S promoter (Lawton et al., Plant Mol. Biol., 9:315–324 (1987)) and the CaMV 35S promoter (Odell et al., Nature, 313:810–812 (1985)), the figwort mosaic virus 35S-promoter, the light-inducible promoter from the small subunit of ribulose-1,5-bis-phosphate carboxylase (ssRUBISCO), the Adh promoter (Walker et al., Proc. Natl. Acad. Sci. (U.S.A.), 84:6624–6628 (1987)), the sucrose synthase promoter (Yang et al., Proc. Natl. Acad. Sci. (U.S.A.), 87:4144–4148 (1990)), the R gene complex promoter (Chandler et al., The Plant Cell, 1:1175–1183 (1989)) and the chlorophyll a/b binding protein gene promoter, etc. These promoters have been used to create DNA constructs that have been expressed in plants; see, e.g., WO 84/02913. The CaMV 35S promoters are preferred for use in plants. Promoters known or found to cause transcription of DNA in plant cells can be used in the present invention.

For the purpose of expression in source tissues of the plant, such as the leaf, seed, root or stem, it is preferred that the promoters utilized have relatively high expression in these specific tissues. Tissue-specific expression of a protein of the present invention is a particularly preferred embodiment. For this purpose, one may choose from a number of promoters for genes with tissue- or cell-specific or enhanced expression. Examples of such promoters reported in the literature include the chloroplast glutamine synthetase GS2 promoter from pea (Edwards et al., Proc. Natl. Acad. Sci. (U.S.A.), 87:3459–3463 (1990)), the chloroplast fructose-1,6-biphosphatase (FBPase) promoter from wheat (Lloyd et al., Mol. Gen. Genet., 225:209–216 (1991)), the nuclear photosynthetic ST-LS1 promoter from potato (Stockhaus et al., EMBO J., 8:2445–2451 (1989)), the serine/threonine kinase (PAL) promoter and the glucoamylase (CHS) promoter from Arabidopsis thaliana. Also reported to be active in photosynthetically active tissues are the ribulose-1,5-bisphosphate carboxylase (RbcS) promoter from eastern larch (Larix laricina), the promoter for the cab gene, cab6, from pine (Yamamoto et al., Plant Cell Physiol., 35:773–778 (1994)), the promoter for the Cab-1 gene from wheat (Fejes et al., Plant Mol. Biol., 15:921–932 (1990)), the promoter for the CAB-1 gene from spinach (Lubberstedt et al., Plant Physiol., 104:997–1006 (1994)), the promoter for the cab1R gene from rice (Luan et al., Plant Cell, 4:971–981 (1992)), the pyruvate, orthophosphate dikinase (PPDK) promoter from corn (Matsuoka et al., Proc. Natl. Acad. Sci. (U.S.A.), 90:9586–9590 (1993)), the promoter for the tobacco Lhcb1*2 gene (Cerdan et al., Plant Mol. Biol., 33:245–255 (1997)), the Arabidopsis thaliana SUC2 sucrose-H+ symporter promoter (Truernit et al., Planta, 196:564–570 (1995)) and the promoter for the thylakoid membrane proteins from spinach (psaD, psaF, psaE, PC, FNR, atpC, atpD, cab, rbcS). Other promoters for the chlorophyll a/b-binding proteins may also be utilized in the present invention, such as the promoters for LhcB gene and PsbP gene from white mustard (Sinapis alba; Kretsch et al., Plant Mol. Biol., 28:219–229 (1995)).

For the purpose of expression in sink tissues of the plant, such as the tuber of the potato plant, the fruit of tomato, or the seed of corn, wheat, rice and barley, it is preferred that the promoters utilized in the present invention have relatively high expression in these specific tissues. A number of promoters for genes with tuber-specific or tuber-enhanced expression are known, including the class I patatin promoter (Bevan et al., EMBO J., 8:1899–1906 (1986); Jefferson et al., Plant Mol. Biol., 14:995–1006 (1990)), the promoter for the potato tuber ADPGPP genes, both the large and small subunits, the sucrose synthase promoter (Salanoubat and Belliard, Gene, 60:47–56 (1987), Salanoubat and Belliard, Gene, 84:181–185 (1989)), the promoter for the major tuber proteins including the 22 kd protein complexes and protease inhibitors (Hannapel, Plant Physiol., 101:703–704 (1993)), the promoter for the granule-bound starch synthase gene (GBSS) (Visser et al., Plant Mol. Biol., 17:691–699 (1991)) and other class I and II patatins promoters (Koster-Topfer et al., Mol. Gen. Genet., 219:390–396 (1989); Mignery et al., Gene, 62:27–44 (1988)).

Other promoters can also be used to express a polypeptide in specific tissues, such as seeds or fruits. Indeed, in a preferred embodiment, the promoter used is a seed specific promoter. Examples of such promoters include the 5' regulatory regions from such genes as napin (Kridl et al., Seed Sci. Res., 1:209–219 (1991)), phaseolin (Bustos et al., Plant Cell, 1(9):839–853 (1989)), soybean trypsin inhibitor (Riggs et al., Plant Cell, 1(6):609–621 (1989)), ACP (Baerson et al., Plant Mol. Biol., 22(2):255–267 (1993)), stearoyl-ACP desaturase (Slocombe et al., Plant Physiol., 104(4):167–176 (1994)), soybean α′ subunit of β-conglycinin (soy 7s, (Chen et al., Proc. Natl. Acad. Sci., 83:8560–8564 (1986))), and oleosin (see, for example, Hong et al., Plant Mol. Biol., 34(3):549–555 (1997)). Further examples include the promoter for β-conglycinin (Chen et al., Dev. Genet., 10:112–122 (1989)). Also included are the zeins, which are a group of storage proteins found in corn endosperm. Genomic clones for zein genes have been isolated (Pedersen et al., Cell, 29:1015–1026 (1982), and Russell et al., Transgenic Res., 6(2):157–168) and the promoters from these clones, including the 15 kD, 16 kD, 19 kD, 22 kD, 27 kD and genes, could also be used. Other promoters known to function, for example, in corn include the promoters for the following genes: waxy, Brittle, Shrunken 2, Branching enzymes I and II, starch synthases, debranching enzymes, oleosins, glutelins and sucrose synthases. A particularly preferred promoter for corn endosperm expression is the promoter for the glutelin gene from rice, more particularly the Osgt-1 promoter (Zheng et al., Mol. Cell Biol., 13:5829–5842 (1993)). Examples of promoters suitable for expression in wheat include those promoters for the ADP glucose pyrosynthase (ADPGPP) subunits, the granule bound and other starch synthase, the branching and debranching enzymes, the embryogenesis-abundant proteins, the gliadins and the glutenins. Examples of such promoters in rice include those promoters for the ADPGPP subunits, the granule bound and other starch synthase, the branching enzymes, the debranching enzymes, sucrose synthases and the glutelins.
A particularly preferred promoter is the promoter for rice glutelin, Osgt-1. Examples of such promoters for barley include those for the ADPGPP subunits, the granule bound and other starch synthase, the branching enzymes, the debranching enzymes, sucrose synthases, the hordeins, the embryo globulins and the aleurone specific proteins. A preferred promoter for expression in the seed is a napin promoter. Another preferred promoter for expression is an Arcelin 5 promoter.

Root specific promoters may also be used. An example of such a promoter is the promoter for the acid chitinase gene (Samac et al., Plant Mol. Biol., 25:587–596 (1994)). Expression in root tissue could also be accomplished by utilizing the root specific subdomains of the CaMV 35S promoter that have been identified (Lam et al., Proc. Natl. Acad. Sci. (U.S.A.), 86:7890–7894 (1989)). Other root cell specific promoters include those reported by Conkling et al., Plant Physiol., 93:1203–1211 (1990).

Other preferred promoters include 7Sα′ (Beachy et al., EMBO J., 4:3047 (1985); Schuler et al., Nucleic Acid Res., 10(24):8225–8244 (1982)); USP 88 and enhanced USP 88 (U.S. Patent Application No. 60/377,236, filed May 3, 2002, incorporated herein by reference); and 7Sα (U.S. patent application Ser. No. 10/235,618).

Additional promoters that may be utilized are described, for example, in U.S. Pat. Nos. 5,378,619; 5,391,725; 5,428,147; 5,447,858; 5,608,144; 5,614,399; 5,633,441; 5,633,435; and 4,633,436. In addition, a tissue specific enhancer may be used (Fromm et al., The Plant Cell, 1:977–984 (1989)).

Constructs or vectors may also include, with the coding region of interest, a nucleic acid sequence that acts, in whole or in part, to terminate transcription of that region. A number of such sequences have been isolated, including the Tr7 3′ sequence and the NOS 3′ sequence (Ingelbrecht et al., The Plant Cell, 1:671–680 (1989); Bevan et al., Nucleic Acids Res., 11:369–385 (1983)). Regulatory transcript termination regions can be provided in plant expression constructs of this present invention as well. Transcript termination regions can be provided by the DNA sequence encoding the gene of interest or a convenient transcription termination region derived from a different gene source, for example, the transcript termination region that is naturally associated with the transcript initiation region. The skilled artisan will recognize that any convenient transcript termination region that is capable of terminating transcription in a plant cell can be employed in the constructs of the present invention.

A vector or construct may also include regulatory elements. Examples of such include the Adh intron 1 (Callis et al., Genes and Develop., 1:1183–1200 (1987)), the sucrose synthase intron (Vasil et al., Plant Physiol., 91:1575–1579 (1989)) and the TMV omega element (Gallie et al., The Plant Cell, 1:301–311 (1989)). These and other regulatory elements may be included when appropriate.

A vector or construct may also include a selectable marker. Selectable markers may also be used to select for plants or plant cells that contain the exogenous genetic material. Examples of such include, but are not limited to: a neo gene (Potrykus et al., Mol. Gen. Genet., 199:183–188 (1985)), which codes for kanamycin resistance and can be selected for using kanamycin, Rpt. G418, hpt etc.; a bar gene which codes for bialaphos resistance; a mutant EPSP synthase gene (Hinchee et al., Bio/Technology, 6:915–922 (1988); Reynaerts et al., Selectable and Screenable Markers. In: Gelvin and Schilperoort, Plant Molecular Biology Manual, Kluwer, Dordrecht (1988)), aadA (Jones et al., Mol. Gen. Genet. (1987)), which encodes glyphosate resistance; a nitrilase gene which confers resistance to bromoxynil (Stalker et al., J. Biol. Chem., 263:6310–6314 (1988)); a mutant acetolactate synthase gene (ALS) which confers imidazolinone or sulphonylurea resistance (EP 0 154 204 (Sep. 11, 1985)), ALS (D'Halluin et al., Bio/Technology, 10:309–314 (1992)), and a methotrexate resistant DHFR gene (Thillet et al., J. Biol. Chem., 263:12500–12508 (1988)).

A vector or construct may also include a transit peptide. Incorporation of a suitable chloroplast transit peptide may also be employed (EP 0 218 571). Translational enhancers may also be incorporated as part of the vector DNA. DNA constructs could contain one or more 5' non-translated leader sequences, which may serve to enhance expression of the gene products from the resulting mRNA transcripts. Such sequences may be derived from the promoter selected to express the gene or can be specifically modified to increase translation of the mRNA. Such regions may also be obtained from viral RNAs, from suitable eukaryotic genes, or from a synthetic gene sequence. For a review of optimizing expression of transgenes, see Koziel et al., Plant Mol. Biol., 32:393–405 (1996). A preferred transit peptide is CTP1.

A vector or construct may also include a screenable marker. Screenable markers may be used to monitor expression. Exemplary screenable markers include: a β-glucuronidase or uidA gene (GUS) which encodes an enzyme for which various chromogenic substrates are known (Jefferson, Plant Mol. Biol. Rep., 5:387–405 (1987); Jefferson et al., EMBO J., 6:3901–3907 (1987)); an R-locus gene, which encodes a product that regulates the production of anthocyanin pigments (red color) in plant tissues (Dellaporta et al., Stadler Symposium, 11:263–282 (1988)); a β-lactamase gene (Sutcliffe et al., Proc. Natl. Acad. Sci. (U.S.A.), 75:3737–3741 (1978)), a gene which encodes an enzyme for which various chromogenic substrates are known (e.g., PADAC, a chromogenic cephalosporin); a luciferase gene (Ow et al., Science, 234:856–859 (1986)); a xylE gene (Zukowsky et al., Proc. Natl. Acad. Sci. (U.S.A.), 80:1101–1105 (1983)) which encodes a catechol dioxygenase that can convert chromogenic catechols; an α-amylase gene (Ikatu et al., Bio/Technol., 8:241–242 (1990)); a tyrosinase gene (Katz et al., J. Gen. Microbiol., 129:2703–2714 (1983)) which encodes an enzyme capable of oxidizing tyrosine to DOPA and dopaquinone which in turn condenses to melanin; an α-galactosidase, which will turn a chromogenic α-galactose substrate.

Included within the terms "selectable or screenable marker genes" are also genes that encode a secretable marker whose secretion can be detected as a means of identifying or selecting for transformed cells. Examples include markers that encode a secretable antigen that can be identified by antibody interaction, or even secretable enzymes that can be detected catalytically. Secretable proteins fall into a number of classes, including small, diffusible proteins that are detectable (e.g., by ELISA), small active enzymes that are detectable in extracellular solution (e.g., α-amylase, β-lactamase, phosphinothricin transferase), or proteins that are inserted or trapped in the cell wall (such as proteins that include a leader sequence such as that found in the expression unit of extensin or tobacco PR-S). Other possible selectable and/or screenable marker genes will be apparent to those of skill in the art.

There are many methods for introducing transforming nucleic acid molecules into plant cells. Suitable methods are believed to include virtually any method by which nucleic acid molecules may be introduced into a cell, such as by Agrobacterium infection or direct delivery of nucleic acid molecules such as, for example, by PEG-mediated transformation, by electroporation or by acceleration of DNA coated particles, and the like (Potrykus, Ann. Rev. Plant Physiol. Plant Mol. Biol., 42:205–225 (1991); Vasil, Plant Mol. Biol., 25:925–937 (1994)). For example, electroporation has been used to transform corn protoplasts (Fromm et al., Nature, 312:791–793 (1986)).

Other vector systems suitable for introducing transforming DNA into a host plant cell include but are not limited to binary artificial chromosome (BIBAC) vectors (Hamilton et al., Gene, 200:107–116 (1997)); and transfection with RNA viral vectors (Della-Cioppa et al., Ann. N.Y. Acad. Sci. (1996), 792 (Engineering Plants for Commercial Products and Applications), 57–61). Additional vector systems also include plant selectable YAC vectors such as those described in Mullen et al., Molecular Breeding, 4:449–457 (1988).

Technology for introduction of DNA into cells is well known to those of skill in the art. Four general methods for delivering a gene into cells have been described: (1) chemical methods (Graham and van der Eb, Virology, 54:536–539 (1973)); (2) physical methods such as microinjection (Capecchi, Cell, 22:479–488 (1980)), electroporation (Wong and Neumann, Biochem. Biophys. Res. Commun., 107:584–587 (1982); Fromm et al., Proc. Natl. Acad. Sci. (U.S.A.), 82:5824–5828 (1985); U.S. Pat. No. 5,384,253); the gene gun (Johnston and Tang, Methods Cell Biol., 43:353–365 (1994)); and vacuum infiltration (Bechtold et al., C.R. Acad. Sci. Paris, Life Sci., 316:1194–1199 (1993)); (3) viral vectors (Clapp, Clin. Perinatol., 20:155–168 (1993); Lu et al., J. Exp. Med., 178:2089–2096 (1993); Eglitis and Anderson, Biotechniques, 6:608–614 (1988)); and (4) receptor-mediated mechanisms (Curiel et al., Hum. Gen. Ther., 3:147–154 (1992), Wagner et al., Proc. Natl. Acad. Sci. (U.S.A.), 89:6099–6103 (1992)).

Acceleration methods that may be used include, for example, microprojectile bombardment and the like. One example of a method for delivering transforming nucleic acid molecules into plant cells is microprojectile bombardment. This method has been reviewed by Yang and Christou (eds.), Particle Bombardment Technology for Gene Transfer, Oxford Press, Oxford, England (1994). Non-biological particles (microprojectiles) may be coated with nucleic acids and delivered into cells by a propelling force. Exemplary particles include those comprised of tungsten, gold, platinum and the like.

A particular advantage of microprojectile bombardment, in addition to it being an effective means of reproducibly transforming monocots, is that neither the isolation of protoplasts (Cristou et al., Plant Physiol., 87:671–674 (1988)) nor the susceptibility to Agrobacterium infection is required. An illustrative embodiment of a method for delivering DNA into corn cells by acceleration is a biolistics α-particle delivery system, which can be used to propel particles coated with DNA through a screen, such as a stainless steel or Nytex screen, onto a filter surface covered with corn cells cultured in suspension. Gordon-Kamm et al. describes the basic procedure for coating tungsten particles with DNA (Gordon-Kamm et al., Plant Cell, 2:603–618 (1990)). The screen disperses the tungsten nucleic acid particles so that they are not delivered to the recipient cells in large aggregates. A particle delivery system suitable for use with the present invention is the helium acceleration PDS-1000/He gun, which is available from Bio-Rad Laboratories (Bio-Rad, Hercules, Calif.) (Sanford et al., Technique, 3:3–16 (1991)).

For the bombardment, cells in suspension may be concentrated on filters. Filters containing the cells to be bombarded are positioned at an appropriate distance below the microprojectile stopping plate. If desired, one or more screens are also positioned between the gun and the cells to be bombarded.

Alternatively, immature embryos or other target cells may be arranged on solid culture medium. The cells to be bombarded are positioned at an appropriate distance below the microprojectile stopping plate. If desired, one or more screens are also positioned between the acceleration device and the cells to be bombarded. Through the use of techniques set forth herein one may obtain 1000 or more foci of cells transiently expressing a marker gene. The number of cells in a focus that express the exogenous gene product 48 hours post-bombardment often ranges from one to ten, and average one to three.

In bombardment transformation, one may optimize the pre-bombardment culturing conditions and the bombardment parameters to yield the maximum numbers of stable transformants. Both the physical and biological parameters for bombardment are important in this technology. Physical factors are those that involve manipulating the DNA/microprojectile precipitate or those that affect the flight and velocity of either the macro- or microprojectiles. Biological factors include all steps involved in manipulation of cells before and immediately after bombardment, the osmotic adjustment of target cells to help alleviate the trauma associated with bombardment and also the nature of the transforming DNA, such as linearized DNA or intact supercoiled plasmids. It is believed that pre-bombardment manipulations are especially important for successful transformation of immature embryos.

In another alternative embodiment, plastids can be stably transformed. Methods disclosed for plastid transformation in higher plants include the particle gun delivery of DNA containing a selectable marker and targeting of the DNA to the plastid genome through homologous recombination (Svab et al., Proc. Natl. Acad. Sci. (U.S.A.), 87:8526–8530 (1990); Svab and Maliga, Proc. Natl. Acad. Sci. (U.S.A.), 90:913–917 (1993); Staub and Maliga, EMBO J., 12:601–606 (1993); U.S. Pat. Nos. 5,451,513 and 5,545,818).

Accordingly, it is contemplated that one may wish to adjust various aspects of the bombardment parameters in small scale studies to fully optimize the conditions. One may particularly wish to adjust physical parameters such as gap distance, flight distance, tissue distance and helium pressure. One may also minimize the trauma reduction factors by modifying conditions that influence the physiological state of the recipient cells and which may therefore influence transformation and integration efficiencies. For example, the osmotic state, tissue hydration and the subculture stage or cell cycle of the recipient cells may be adjusted for optimum transformation. The execution of other routine adjustments will be known to those of skill in the art in light of the present disclosure.

Agrobacterium-mediated transfer is a widely applicable system for introducing genes into plant cells because the DNA can be introduced into whole plant tissues, thereby bypassing the need for regeneration of an intact plant from a protoplast. The use of Agrobacterium-mediated plant integrating vectors to introduce DNA into plant cells is well known in the art. See, for example, the methods described by Fraley et al., Bio/Technology, 3:629–635 (1985) and Rogers et al., Methods Enzymol., 153:253–277 (1987). Further, the integration of the Ti-DNA is a relatively precise process resulting in few rearrangements. The region of DNA to be transferred is defined by the border sequences and intervening DNA is usually inserted into the plant genome as described (Spielmann et al., Mol. Gen. Genet., 205:34 (1986)).

Modern Agrobacterium transformation vectors are capable of replication in E. coli as well as Agrobacterium, allowing for convenient manipulations as described (Klee et al., In: Plant DNA Infectious Agents, Hohn and Schell (eds.), Springer-Verlag, N.Y., pp. 179–203 (1985)). Moreover, technological advances in vectors for Agrobacterium-mediated gene transfer have improved the arrangement of genes and restriction sites in the vectors to facilitate construction of vectors capable of expressing various polypeptide coding genes. The vectors described have convenient multi-linker regions flanked by a promoter and a polyadenylation site for direct expression of inserted polypeptide coding genes and are suitable for present purposes (Rogers et al., Methods Enzymol., 153:253–277 (1987)).

(1988)). In addition, "particle gun" or high-velocity microprojectile technology can be utilized (Vasil et al., Bio/Technology, 10:667 (1992)).

Using the latter technology, DNA is carried through the cell wall and into the cytoplasm on the surface of small metal particles as described (Klein et al., Nature, 328:70 (1987); Klein et al., Proc. Natl. Acad. Sci. (U.S.A.), 85:8502–8505 (1988); McCabe et al., Bio/Technology, 6:923 (1988)). The metal particles penetrate through several layers of cells and thus allow the transformation of cells within tissue explants.

Other methods of cell transformation can also be used and include but are not limited to introduction of DNA into plants by direct DNA transfer into pollen (Hess et al., Intern Rev. Cytol., 107:367 (1987); Luo et al., Plant Mol. Biol. Reporter, 6:165 (1988)), by direct injection of DNA into reproductive organs of a plant (Pena et al., Nature, 325:274 (1987)), or by direct injection of DNA into the cells of immature embryos followed by the rehydration of desic
In addition, Agrobacterium cated embryos (Neuhaus et al., Theor: Appl. Genet., 75:30 containing both armed and disarmed Tigenes can be used (1987)). for the transformations. In those plant strains where Agro The regeneration, development and cultivation of plants bacterium-mediated transformation is efficient, it is the from single plant protoplast transformants or from various method of choice because of the facile and defined nature of transformed explants is well known in the art (Weissbach the gene transfer. 25 and Weissbach, In: Methods for Plant Molecular Biology, A transgenic plant formed using Agrobacterium transfor Academic Press, San Diego, Calif., (1988)). This regenera mation methods typically contains a single gene on one tion and growth process typically includes the steps of chromosome. Such transgenic plants can be referred to as selection of transformed cells, culturing those individualized being heterozygous for the added gene. More preferred is a cells through the usual stages of embryonic development transgenic plant that is homozygous for the added structural 30 through the rooted plantlet stage. Transgenic embryos and gene; i.e., a transgenic plant that contains two added genes, seeds are similarly regenerated. The resulting transgenic one gene at the same locus on each chromosome of a rooted shoots are thereafter planted in an appropriate plant chromosome pair. A homozygous transgenic plant can be growth medium such as soil. obtained by sexually mating (selfing) an independent seg The development or regeneration of plants containing the regant, transgenic plant that contains a single added gene, 35 foreign, exogenous gene that encodes a protein of interest is germinating some of the seed produced and analyzing the well known in the art. Preferably, the regenerated plants are resulting plants produced for the gene of interest. self-pollinated to provide homozygous transgenic plants. 
It is also to be understood that two different transgenic Otherwise, pollen obtained from the regenerated plants is plants can also be mated to produce offspring that contain crossed to seed-grown plants of agronomically important two independently segregating, exogenous genes. Selfing of 40 lines. Conversely, pollen from plants of these important lines appropriate progeny can produce plants that are homozy is used to pollinate regenerated plants. A transgenic plant of gous for both added, exogenous genes that encode a the present invention containing a desired polypeptide is polypeptide of interest. Back-crossing to a parental plant and cultivated using methods well known to one skilled in the out-crossing with a non-transgenic plant are also contem art. plated, as is vegetative propagation. 45 There are a variety of methods for the regeneration of plants from plant tissue. The particular method of regenera Transformation of plant protoplasts can be achieved using tion will depend on the starting plant tissue and the particular methods based on calcium phosphate precipitation, polyeth plant species to be regenerated. ylene glycol treatment, electroporation and combinations of Methods for transforming dicots, primarily by use of these treatments (see, for example, Potrykus et al., Mol. Gen. 50 Agrobacterium tumefaciens and obtaining transgenic plants Genet., 205:193–200 (1986); Lorz et al., Mol. Gen. Genet., have been published for cotton (U.S. Pat. Nos. 5,004,863: 199: 1780(1985): Fromm et al., Nature, 319:791 (1986); 5,159,135; and 5.518,908); soybean (U.S. Pat. Nos. 5,569, Uchimiya et al., Mol. Gen. Genet., 204:204 (1986); Marcotte 834 and 5,416,011; McCabe et al., Biotechnology, 6:923 et al., Nature, 335:454–457 (1988)). (1988); Christou et al., Plant Physiol., 87:671-674 (1988)); Application of these systems to different plant Strains 55 Brassica (U.S. Pat. No. 
5,463,174); peanut (Cheng et al., depends upon the ability to regenerate that particular plant Plant Cell Rep., 15:653–657 (1996), McKently et al., Plant strain from protoplasts. Illustrative methods for the regen Cell Rep., 14:699-703 (1995)); papaya; pea (Grant et al., eration of cereals from protoplasts are described (Fujimura Plant Cell Rep., 15:254–258 (1995)); and Arabidopsis et al., Plant Tissue Culture Letters, 2:74 (1985); Toriyama et thaliana (Bechtold et al., C.R. Acad. Sci. Paris, Life Sci., al., Theor: Appl. Genet., 205:34 (1986); Yamada et al., Plant 60 3.16:1194-1199 (1993)). The latter method for transforming Cell Rep., 4:85 (1986); Abdullah et al., Biotechnology, Arabidopsis thaliana is commonly called 'dipping” or 4:1087 (1986)). vacuum infiltration or germplasm transformation. To transform plant strains that cannot be successfully Transformation of monocotyledons using electroporation, regenerated from protoplasts, other ways to introduce DNA particle bombardment and Agrobacterium have also been into intact cells or tissues can be utilized. For example, 65 reported. Transformation and plant regeneration have been regeneration of cereals from immature embryos or explants achieved in asparagus (Bytebier et al., Proc. Natl. Acad. Sci. can be effected as described (Vasil, Biotechnology, 6:397 (U.S.A.), 84:5354 (1987)); barley (Wan and Lemaux, Plant US 7,112,717 B2 35 36 Physiol, 104:37 (1994)); corn (Rhodes et al., Science, 240: antisense approach is to use a sequence complementary to 204 (1988); Gordon-Kamm et al., Plant Cell. 2:603–618 the target gene to block its expression and create a mutant (1990); Fromm et al., Bio/Technology, 8:833 (1990): Koziel cell line or organism in which the level of a single chosen et al., Bio/Technology, 11:194 (1993); Armstrong et al., Crop protein is selectively reduced or abolished. 
Antisense tech Science, 35:550–557 (1995)); oat (Somers et al., Bio/Tech niques have several advantages over other “reverse genetic' nology, 10:1589 (1992)); orchard grass (Horn et al., Plant approaches. The site of inactivation and its developmental Cell Rep., 7:469 (1988)); rice (Toriyama et al., Theor Appl. effect can be manipulated by the choice of promoter for Genet., 205:34 (1986); Part et al., Plant Mol. Biol., antisense genes or by the timing of external application or 32:1135-1148 (1996); Abedinia et al., Aust. J. Plant microinjection. Antisense can manipulate its specificity by Physiol., 24:133-141 (1997); Zhang and Wu, Theor: Appl. 10 selecting either unique regions of the target gene or regions Genet., 76:835 (1988); Zhang et al., Plant Cell Rep., 7:379 where it shares homology to other related genes (Hiatt et al., (1988); Battraw and Hall, Plant Sci., 86:191-202 (1992); In: Genetic Engineering, Setlow (ed.), Vol. 11, New York: Christou et al., Bio/Technology, 9:957 (1991)): rye (De la Plenum 49-63 (1989)). Pena et al., Nature, 325:274 (1987)); sugarcane (Bower and Antisense RNA techniques involve introduction of RNA Birch, Plant J., 2:409 (1992)); tall fescue (Wang et al., 15 that is complementary to the target mRNA into cells, which Bio/Technology, 10:691 (1992)); and wheat (Vasil et al., results in specific RNA:RNA duplexes being formed by base Bio/Technology, 10:667 (1992); U.S. Pat. No. 5,631,152). pairing between the antisense substrate and the target mRNA Assays for gene expression based on the transient expres (Green et al., Annu. Rev. Biochem., 55:569–597 (1986)). sion of cloned nucleic acid constructs have been developed Under one embodiment, the process involves the introduc by introducing the nucleic acid molecules into plant cells by tion and expression of an antisense gene sequence. 
Such a polyethylene glycol treatment, electroporation, or particle sequence is one in which part or all of the normal gene bombardment (Marcotte et al., Nature, 335:454–457 (1988); sequences are placed under a promoter in inverted orienta Marcotte et al., Plant Cell, 1:523-532 (1989); McCarty et tion so that the “wrong’ or complementary Strand is tran al., Cell, 66:895–905 (1991); Hattori et al., Genes Dev, scribed into a noncoding antisense RNA that hybridizes with 6:609-618 (1992); Goff et al., EMBO J., 9:25.17 2522 25 the target mRNA and interferes with its expression (1990)). Transient expression systems may be used to func (Takayama and Inouye, Crit. Rev. Biochem. Mol. Biol., tionally dissect gene constructs (see generally, Mailga et al., 25:155–184 (1990)). An antisense vector is constructed by Methods in Plant Molecular Biology, Cold Spring Harbor standard procedures and introduced into cells by transfor Press, NY (1995)). mation, transfection, electroporation, microinjection, infec Any of the nucleic acid molecules of the present invention 30 tion, etc. The type of transformation and choice of vector may be introduced into a plant cell in a permanent or will determine whether expression is transient or stable. The transient manner in combination with other genetic elements promoter used for the antisense gene may influence the Such as vectors, promoters, enhancers, etc. Further, any of level, timing, tissue, specificity, or inducibility of the anti the nucleic acid molecules of the present invention may be sense inhibition. introduced into a plant cell in a manner that allows for 35 expression or overexpression of the protein or fragment It is understood that the activity of a protein in a plant cell thereof encoded by the nucleic acid molecule. 
may be reduced or depressed by growing a transformed CoSuppression is the reduction in expression levels, usu plant cell containing a nucleic acid molecule whose non ally at the level of RNA, of a particular endogenous gene or transcribed strand encodes a protein or fragment thereof. A gene family by the expression of a homologous sense 40 preferred protein whose activity can be reduced or construct that is capable of transcribing mRNA of the same depressed, by any method, is a homogentisate prenyl trans Strandedness as the transcript of the endogenous gene ferase. (Napoliet al., Plant Cell, 2:279-289 (1990); van der Krolet Posttranscriptional gene silencing (PTGS) can result in al., Plant Cell, 2:291–299 (1990)). Cosuppression may virus immunity or gene silencing in plants. PTGS is induced result from stable transformation with a single copy nucleic 45 by dsRNA and is mediated by an RNA-dependent RNA acid molecule that is homologous to a nucleic acid sequence polymerase, present in the cytoplasm, which requires a found with the cell (Prolls and Meyer, Plant J., 2:465–475 dsRNA template. The dsRNA is formed by hybridization of (1992)) or with multiple copies of a nucleic acid molecule complementary transgene mRNAS or complementary that is homologous to a nucleic acid sequence found with the regions of the same transcript. Duplex formation can be cell (Mittlesten et al., Mol. Gen. Genet., 244:325–330 50 accomplished by using transcripts from one sense gene and (1994)). Genes, even though different, linked to homologous one antisense gene colocated in the plant genome, a single promoters may result in the coSuppression of the linked transcript that has self-complementarity, or sense and anti genes (Vaucheret, C.R. Acad. Sci. III, 316:1471–1483 sense transcripts from genes brought together by crossing. (1993); Flavell, Proc. Natl. Acad. Sci. 
(U.S.A.), The dsRNA-dependent RNA polymerase makes a comple 91:3490–3496 (1994)); van Blokland et al., Plant J., 55 mentary strand from the transgene mRNA and RNAse 6:861–877 (1994); Jorgensen, Trends Biotechnol., molecules attach to this complementary strand (cRNA). 8:340–344 (1990); Meins and Kunz. In: Gene Inactivation These crNA-RNase molecules hybridize to the endogene and Homologous Recombination in Plants, Paszkowski mRNA and cleave the single-stranded RNA adjacent to the (ed.), pp. 335-348, Kluwer Academic, Netherlands (1994)). hybrid. The cleaved single-stranded RNAs are further It is understood that one or more of the nucleic acids of 60 degraded by other host RNases because one will lack a the present invention may be introduced into a plant cell and capped 5' end and the other will lack a poly (A) tail transcribed using an appropriate promoter with Such tran (Waterhouse et al., PNAS, 95:13959–13964 (1998)). Scription resulting in the coSuppression of an endogenous It is understood that one or more of the nucleic acids of protein. the present invention may be introduced into a plant cell and Antisense approaches area way of preventing or reducing 65 transcribed using an appropriate promoter with Such tran gene function by targeting the genetic material (Mol et al., Scription resulting in the posttranscriptional gene silencing FEBS Lett., 268:427–430 (1990)). The objective of the of an endogenous transcript. US 7,112,717 B2 37 38 Antibodies have been expressed in plants (Hiatt et al., preferably 15% w/v. In a preferred oil preparation, the oil Nature, 342:76–78 (1989); Conrad and Fielder, Plant Mol. preparation is a high oil preparation with an oil content Biol., 26:1023-1030 (1994)). Cytoplasmic expression of a derived from a plant or part thereof of the present invention schv (single-chain Fv antibody) has been reported to delay of greater than about 5% w/v. more preferably 10% w/v. and infection by artichoke mottled crinkle virus. 
Transgenic even more preferably 15% w/v. In a preferred embodiment plants that express antibodies directed against endogenous the oil preparation is a liquid and of a Volume greater than proteins may exhibit a physiological effect (Philips et al., about 1, 5, 10, or 50 liters. The present invention provides EMBO J., 16:4489–4496 (1997); Marion-Poll, Trends in for oil produced from plants of the present invention or Plant Science, 2:447–448 (1997)). For example, expressed generated by a method of the present invention. Such an oil anti-abscisic antibodies have been reported to result in a 10 may exhibit enhanced oxidative stability. Also, Such oil may general perturbation of seed development (Philips et al., be a minor or major component of any resultant product. EMBO.J., 16:4489–4496 (1997)). Moreover, such oil may be blended with other oils. In a Antibodies that are catalytic may also be expressed in preferred embodiment, the oil produced from plants of the plants (abzymes). The principle behind abzymes is that since present invention or generated by a method of the present antibodies may be raised against many molecules, this 15 invention constitutes greater than about 0.5%, 1%. 5%, 10%, recognition ability can be directed toward generating anti 25%, 50%, 75%, or 90% by volume or weight of the oil bodies that bind transition states to force a chemical reaction component of any product. In another embodiment, the oil forward (Persidas, Nature Biotechnology, 15:1313–1315 preparation may be blended and can constitute greater than (1997); Baca et al., Ann. Rev. Biophys. Biomol. Struct., about 10%, 25%, 35%, 50%, or 75% of the blend by volume. 26:461–493 (1997)). The catalytic abilities of abzymes may Oil produced from a plant of the present invention can be be enhanced by site directed mutagenesis. Examples of admixed with one or more organic solvents or petroleum abzymes are, for example, set forth in U.S. Pat. Nos.: distillates. 
5,658,753: 5,632,990; 5,631,137; 5,602,015; 5,559,538; Plants of the present invention can be part of or generated 5,576, 174: 5,500,358; 5,318,897: 5,298,409; 5,258,289; and from a breeding program. The choice of breeding method 5,194,585. 25 depends on the mode of plant reproduction, the heritability It is understood that any of the antibodies of the present of the trait(s) being improved, and the type of cultivar used invention may be expressed in plants and that such expres commercially (e.g., F hybrid cultivar, pureline cultivar, sion can result in a physiological effect. It is also understood etc.). Selected, non-limiting approaches, for breeding the that any of the expressed antibodies may be catalytic. plants of the present invention are set forth below. A The present invention also provides for parts of the plants, 30 breeding program can be enhanced using marker assisted particularly reproductive or storage parts, of the present selection of the progeny of any cross. It is further understood invention. Plant parts, without limitation, include seed, that any commercial and non-commercial cultivars can be endosperm, ovule and pollen. In a particularly preferred utilized in a breeding program. Factors such as, for example, embodiment of the present invention, the plant part is a seed. emergence vigor, vegetative vigor, stress tolerance, disease In one embodiment the seed is a constituent of animal feed. 35 resistance, branching, flowering, seed set, seed size, seed In another embodiment, the plant part is a fruit, more density, standability, and threshability etc. will generally preferably a fruit with enhanced shelf life. In another pre dictate the choice. ferred embodiment, the fruit has increased levels of a For highly heritable traits, a choice of superior individual tocopherol. In another preferred embodiment, the fruit has plants evaluated at a single location will be effective, increased levels of a tocotrienol. 
40 whereas for traits with low heritability, selection should be The present invention also provides a container of over based on mean values obtained from replicated evaluations about 10,000, more preferably about 20,000, and even more of families of related plants. Popular selection methods preferably about 40,000 seeds where over about 10%, more commonly include pedigree selection, modified pedigree preferably 25%, more preferably 50%, and even more selection, mass selection, and recurrent selection. In a pre preferably 75% or 90% of the seeds are seeds derived from 45 ferred embodiment a backcross or recurrent breeding pro a plant of the present invention. gram is undertaken. The present invention also provides a container of over The complexity of inheritance influences choice of the about 10 kg, more preferably 25 kg, and even more prefer breeding method. Backcross breeding can be used to transfer ably 50 kg seeds where over about 10%, more preferably one or a few favorable genes for a highly heritable trait into 25%, more preferably 50%, and even more preferably 75% 50 a desirable cultivar. This approach has been used extensively or 90% of the seeds are seeds derived from a plant of the for breeding disease-resistant cultivars. Various recurrent present invention. selection techniques are used to improve quantitatively Any of the plants or parts thereof of the present invention inherited traits controlled by numerous genes. The use of may be processed to produce a feed, meal, protein, or oil recurrent selection in self-pollinating crops depends on the preparation, including oil preparations high in total toco 55 ease of pollination, the frequency of Successful hybrids from pherol content and oil preparations high in any one or more each pollination, and the number of hybrid offspring from of each tocopherol component listed herein. A particularly each Successful cross. preferred plant part for this purpose is a seed. 
In a preferred Breeding lines can be tested and compared to appropriate embodiment the feed, meal, protein or oil preparation is standards in environments representative of the commercial designed for livestock animals or humans, or both. Methods 60 target area(s) for two or more generations. The best lines are to produce feed, meal, protein and oil preparations are candidates for new commercial cultivars; those still deficient known in the art. See, for example, U.S. Pat. Nos. 4,957,748; in traits may be used as parents to produce new populations 5,100,679; 5,219,596; 5,936,069; 6,005,076; 6,146,669; and for further selection. 6,156.227. In a preferred embodiment, the protein prepara One method of identifying a superior plant is to observe tion is a high protein preparation. Such a high protein 65 its performance relative to other experimental plants and to preparation preferably has a protein content of greater than a widely grown standard cultivar. If a single observation is about 5% w/v., more preferably 10% w/v. and even more inconclusive, replicated observations can provide a better US 7,112,717 B2 39 40 estimate of its genetic worth. A breeder can select and cross makes it possible to plant the same number of seeds of a two or more parental lines, followed by repeated selfing and population each generation of inbreeding. selection, producing many new genetic combinations. Descriptions of other breeding methods that are com The development of new cultivars requires the develop monly used for different traits and crops can be found in one ment and selection of varieties, the crossing of these vari of several reference books (e.g., Fehr, Principles of Cultivar eties and the selection of superior hybrid crosses. The hybrid Development, Vol. 1, pp. 2-3 (1987)). seed can be produced by manual crosses between selected A transgenic plant of the present invention may also be male-fertile parents or by using male sterility systems. reproduced using apomixis. 
Apomixis is a genetically con Hybrids are selected for certain single gene traits such as pod trolled method of reproduction in plants where the embryo color, flower color, seed yield, pubescence color, or herbi 10 is formed without union of an egg and a sperm. There are cide resistance, which indicate that the seed is truly a hybrid. three basic types of apomictic reproduction: 1) apospory Additional data on parental lines, as well as the phenotype where the embryo develops from a chromosomally unre of the hybrid, influence the breeder's decision whether to duced egg in an embryo Sac derived from the nucleus; 2) continue with the specific hybrid cross. diplospory where the embryo develops from an unreduced Pedigree breeding and recurrent selection breeding meth 15 egg in an embryo Sac derived from the megaspore mother ods can be used to develop cultivars from breeding popu cell; and 3) adventitious embryony where the embryo devel lations. Breeding programs combine desirable traits from ops directly from a Somatic cell. In most forms of apomixis, two or more cultivars or various broad-based sources into pseudogamy, or fertilization of the polar nuclei to produce breeding pools from which cultivars are developed by endosperm is necessary for seed viability. In apospory, a selfing and selection of desired phenotypes. New cultivars nurse cultivar can be used as a pollen Source for endosperm can be evaluated to determine which have commercial formation in seeds. The nurse cultivar does not affect the potential. genetics of the aposporous apomictic cultivar since the Pedigree breeding is used commonly for the improvement unreduced egg of the cultivar develops parthenogenetically, of self-pollinating crops. Two parents who possess favor but makes possible endosperm production. Apomixis is able, complementary traits are crossed to produce an F. A 25 economically important, especially in transgenic plants, F. population is produced by selfing one or several Fis. 
because it causes any genotype, no matter how heterozy Selection of the best individuals from the best families is gous, to breed true. Thus, with apomictic reproduction, carried out. Replicated testing of families can begin in the Fa heterozygous transgenic plants can maintain their genetic generation to improve the effectiveness of selection for traits fidelity throughout repeated life cycles. Methods for the with low heritability. At an advanced stage of inbreeding 30 production of apomictic plants are known in the art. See, (i.e., F and F7), the best lines or mixtures of phenotypically U.S. Pat. No. 5,811,636. similar lines are tested for potential release as new cultivars. Other Organisms Backcross breeding has been used to transfer genes for a A nucleic acid of the present invention may be introduced simply inherited, highly heritable trait into a desirable into any cell or organism such as a mammalian cell, mam homozygous cultivar or inbred line, which is the recurrent 35 mal, fish cell, fish, bird cell, bird, algae cell, algae, fungal parent. The source of the trait to be transferred is called the cell, fungi, or bacterial cell. A protein of the present inven donor parent. The resulting plant is expected to have the tion may be produced in an appropriate cell or organism. attributes of the recurrent parent (e.g., cultivar) and the Preferred host and transformants include: fungal cells such desirable trait transferred from the donor parent. After the as Aspergillus, yeasts, mammals, particularly bovine and initial cross, individuals possessing the phenotype of the 40 porcine, insects, bacteria, and algae. Particularly preferred donor parent are selected and repeatedly crossed (back bacteria are Agrobacteruin tumefaciens and E. coli. crossed) to the recurrent parent. The resulting parent is Methods to transform such cells or organisms are known expected to have the attributes of the recurrent parent (e.g., in the art (EPO 238 023; Yelton et al., Proc. Natl. 
Acad. Sci. cultivar) and the desirable trait transferred from the donor (U.S.A.), 81: 1470–1474 (1984); Malardier et al., Gene, parent. 45 78:147–156 (1989); Becker and Guarente, In: Abelson and The single-seed descent procedure in the strict sense Simon (eds.), Guide to Yeast Genetics and Molecular Biol refers to planting a segregating population, harvesting a ogy, Method Enzymol. Vol. 194, pp. 182–187, Academic sample of one seed per plant, and using the one-seed sample Press, Inc., NY: Ito et al., J. Bacteriology, 153:163 (1983); to plant the next generation. When the population has been Hinnen et al., Proc. Natl. Acad. Sci. (U.S.A.), 75:1920 advanced from the F to the desired level of inbreeding, the 50 (1978); Bennett and LaSure (eds.), More Gene Manipual plants from which lines are derived will each trace to tionins in fingi, Academic Press, CA (1991)). Methods to different F individuals. The number of plants in a popula produce proteins of the present invention are also known tion declines each generation due to failure of some seeds to (Kudla et al., EMBO, 9:1355–1364 (1990); Jarai and Bux germinate or some plants to produce at least one seed. As a ton, Current Genetics, 26:2238-2244 (1994); Verdier, Yeast, result, not all of the F plants originally sampled in the 55 6:271-297 (1990); MacKenzie et al., Journal of Gen. Micro population will be represented by a progeny when genera biol., 139:2295 2307 (1993); Hartlet al., TIBS, 19:20–25 tion advance is completed. (1994); Bergenron et al., TIBS, 19:124-128 (1994): In a multiple-seed procedure, breeders commonly harvest Demolder et al., J. Biotechnology, 32:179–189 (1994): one or more pods from each plant in a population and thresh Craig, Science, 260:1902–1903 (1993); Gething and Sam them together to form a bulk. Part of the bulk is used to plant 60 brook, Nature, 355:33–45 (1992); Puig and Gilbert, J., Biol. the next generation and part is put in reserve. 
The procedure Chem., 269:7764-7771 (1994); Wang and Tsou, FASEB has been referred to as modified single-seed descent or the Journal, 7:1515–1517 (1993); Robinson et al., Bio/Technol pod-bulk technique. ogy, 1:381-384 (1994); Enderlin and Ogrydziak, Yeast, The multiple-seed procedure has been used to save labor 10:67–79 (1994); Fuller et al., Proc. Natl. Acad. Sci. at harvest. It is considerably faster to thresh pods with a 65 (U.S.A.), 86:1434–1438 (1989); Julius et al., Cell, machine than to remove one seed from each by hand for the 37: 1075–1089 (1984); Julius et al., Cell, 32:839-852 single-seed procedure. The multiple-seed procedure also (1983)). US 7,112,717 B2 41 42 In a preferred embodiment, overexpression of a protein or As discussed below, such antibody molecules or their fragment thereof of the present invention in a cell or fragments may be used for diagnostic purposes. Where the organism provides in that cell or organism, relative to an antibodies are intended for diagnostic purposes, it may be untransformed cell or organism with a similar genetic back desirable to derivatize them, for example with a ligand group ground, an increased level of tocopherols. (such as biotin) or a detectable marker group (Such as a In a preferred embodiment, overexpression of a protein or fluorescent group, a radioisotope or an enzyme). fragment thereof of the present invention in a cell or The ability to produce antibodies that bind the protein or organism provides in that cell or organism, relative to an peptide molecules of the present invention permits the untransformed cell or organism with a similar genetic back identification of mimetic compounds derived from those ground, an increased level of C-tocopherols. 10 molecules. 
These mimetic compounds may contain a frag In a preferred embodiment, overexpression of a protein or ment of the protein or peptide or merely a structurally fragment thereof of the present invention in a cell or similar region and nonetheless exhibits an ability to specifi organism provides in that cell or organism, relative to an cally bind to antibodies directed against that compound. untransformed cell or organism with a similar genetic back Exemplary Uses ground, an increased level of Y-tocopherols. 15 Nucleic acid molecules and fragments thereof of the In another preferred embodiment, overexpression of a present invention may be employed to obtain other nucleic protein or fragment thereof of the present invention in a cell acid molecules from the same species (nucleic acid mol or organism provides in that cell or organism, relative to an ecules from corn may be utilized to obtain other nucleic acid untransformed cell or organism with a similar genetic back molecules from corn). Such nucleic acid molecules include ground, an increased level of C-tocotrienols. the nucleic acid molecules that encode the complete coding In another preferred embodiment, overexpression of a sequence of a protein and promoters and flanking sequences protein or fragment thereof of the present invention in a cell of Such molecules. In addition, such nucleic acid molecules or organism provides in that cell or organism, relative to an include nucleic acid molecules that encode for other untransformed cell or organism with a similar genetic back isozymes or gene family members. Such molecules can be ground, an increased level of Y-tocotrienols. 25 readily obtained by using the above-described nucleic acid Antibodes molecules or fragments thereof to screen cDNA or genomic One aspect of the present invention concerns antibodies, libraries. Methods for forming such libraries are well known single-chain antigen binding molecules, or other proteins in the art. 
that specifically bind to one or more of the protein or peptide 30 Nucleic acid molecules and fragments thereof of the molecules of the present invention and their homologs, present invention may also be employed to obtain nucleic fusions or fragments. In a particularly preferred embodi acid homologs. Such homologs include the nucleic acid ment, the antibody specifically binds to a protein having the molecules of plants and other organisms, including bacteria amino acid sequence set forth in SEQ ID NOs: 5, 9–11, and fungi, including the nucleic acid molecules that encode, 43–44, 57–58, and 90, or fragments thereof. Antibodies of 35 in whole or in part, protein homologues of other plant the present invention may be used to quantitatively or species or other organisms, sequences of genetic elements, qualitatively detect the protein or peptide molecules of the Such as promoters and transcriptional regulatory elements. present invention, or to detect post translational modifica Such molecules can be readily obtained by using the above tions of the proteins. As used herein, an antibody or peptide described nucleic acid molecules or fragments thereof to is said to “specifically bind' to a protein or peptide molecule 40 screen cINA or genomic libraries obtained from such plant of the present invention if Such binding is not competitively species. Methods for forming such libraries are well known inhibited by the presence of non-related molecules. in the art. Such homolog molecules may differ in their Nucleic acid molecules that encode all or part of the nucleotide sequences from those coding for one or more of protein of the present invention can be expressed, via SEQ ID NOs: 5, 9–11, 43–44, 57–58, and 90, and comple recombinant means, to yield protein or peptides that can in 45 ments thereof because complete complementarity is not turn be used to elicit antibodies that are capable of binding needed for stable hybridization. 
The nucleic acid molecules the expressed protein or peptide. Such antibodies may be of the present invention therefore also include molecules used in immunoassays for that protein. Such protein-encod that, although capable of specifically hybridizing with the ing molecules, or their fragments may be a “fusion' mol nucleic acid molecules may lack "complete complementar -- ecule (i.e., a part of a larger nucleic acid molecule) Such that, 50 1ty”. upon expression, a fusion protein is produced. It is under Any of a variety of methods may be used to obtain one or stood that any of the nucleic acid molecules of the present more of the above-described nucleic acid molecules (Za invention may be expressed, via recombinant means, to mechik et al., Proc. Natl. Acad. Sci. (U.S.A.), 83:4143–4146 yield proteins or peptides encoded by these nucleic acid (1986); Goodchild et al., Proc. Natl. Acad. Sci. (U.S.A.), molecules. 55 85:5507–5511 (1988); Wickstrom et al., Proc. Natl. Acad. The antibodies that specifically bind proteins and protein Sci. (U.S.A.), 85:1028–1032 (1988); Holt et al., Molec. Cell. fragments of the present invention may be polyclonal or Biol., 8:963–973 (1988); Gerwirtz et al., Science, 242: monoclonal and may comprise intact immunoglobulins, or 1303–1306 (1988); Anfossi et al., Proc. Natl. Acad. Sci. antigen binding portions of immunoglobulins fragments (U.S.A.), 86:3379–3383 (1989); Becker et al., EMBO J., (such as (F(ab'), F(ab'))), or single-chain immunoglobulins 60 8:3685–3691 (1989)). Automated nucleic acid synthesizers producible, for example, via recombinant means. It is under may be employed for this purpose. 
In lieu of Such synthesis, stood that practitioners are familiar with the standard the disclosed nucleic acid molecules may be used to define resource materials that describe specific conditions and a pair of primers that can be used with the polymerase chain procedures for the construction, manipulation and isolation reaction (Mullis et al., Cold Spring Harbor Symp. Ouant. of antibodies (see, for example, Harlow and Lane. In: 65 Biol. 51:263. 273 (1986); Erlich et al., EP 50 424; EP 84 Antibodies: A Laboratory Manual, Cold Spring Harbor 796; EP 258 017; EP 237 362; Mullis, EP201 184: Mullis Press, Cold Spring Harbor, N.Y. (1988)). et al., U.S. Pat. No. 4,683,202: Erlich, U.S. Pat. No. 4,582, US 7,112,717 B2 43 44 788; and Saiki et al., U.S. Pat. No. 4,683,194) to amplify and for example, capable of detecting polymorphisms such as obtain any desired nucleic acid molecule or fragment. single nucleotide polymorphisms (SNPs). Promoter sequences and other genetic elements, including The genomes of animals and plants naturally undergo but not limited to transcriptional regulatory flanking spontaneous mutation in the course of their continuing sequences, associated with one or more of the disclosed evolution (Gusella, Ann. Rev. Biochem., 55:831–854 nucleic acid sequences can also be obtained using the (1986)). A “polymorphism' is a variation or difference in the disclosed nucleic acid sequence provided herein. In one sequence of the gene or its flanking regions that arises in embodiment, such sequences are obtained by incubating Some of the members of a species. The variant sequence and nucleic acid molecules of the present invention with mem the “original” sequence co-exist in the species population. bers of genomic libraries and recovering clones that hybrid 10 In some instances, such co-existence is in stable or quasi ize to Such nucleic acid molecules thereof. In a second stable equilibrium. 
embodiment, methods of "chromosome walking', or inverse A polymorphism is thus said to be “allelic', in that, due PCR may be used to obtain such sequences (Frohman et al., to the existence of the polymorphism, some members of a Proc. Natl. Acad. Sci. (U.S.A.), 85:8998–9002 (1988); Ohara population may have the original sequence (i.e., the original et al., Proc. Natl. Acad. Sci. (U.S.A.), 86:5673–5677 (1989); 15 “allele') whereas other members may have the variant Pang et al., Biotechniques, 22:1046-1048 (1977); Huang et sequence (i.e., the variant “allele). In the simplest case, al., Methods Mol. Biol., 69:89–96 (1997); Huang et al., only one variant sequence may exist and the polymorphism Method Mol. Biol., 67:287–294 (1997); Benkel et al., Genet. is thus said to be di-allelic. In other cases, the species Anal., 13:123–127 (1996); Hartlet al., Methods Mol. Biol., population may contain multiple alleles and the polymor 58:293–301 (1996)). The term “chromosome walking” phism is termed tri-allelic, etc. A single gene may have means a process of extending a genetic map by Successive multiple different unrelated polymorphisms. For example, it hybridization steps. may have a di-allelic polymorphism at one site and a multi-allelic polymorphism at another site. The nucleic acid molecules of the present invention may The variation that defines the polymorphism may range be used to isolate promoters of cell enhanced, cell specific, 25 from a single nucleotide variation to the insertion or deletion tissue enhanced, tissue specific, developmentally or envi of extended regions within a gene. In some cases, the DNA ronmentally regulated expression profiles. Isolation and sequence variations are in regions of the genome that are functional analysis of the 5' flanking promoter sequences of characterized by short tandem repeats (STRs) that include these genes from genomic libraries, for example, using tandem di- or tri-nucleotide repeated motifs of nucleotides. 
genomic screening methods and PCR techniques would 30 Polymorphisms characterized by Such tandem repeats are result in the isolation of useful promoters and transcriptional referred to as “variable number tandem repeat” (“VNTR’) regulatory elements. These methods are known to those of polymorphisms. VNTRs have been used in identity analysis skill in the art and have been described (see, for example, (Weber, U.S. Pat. No. 5,075,217: Armour et al., FEBS Lett., Birren et al., Genome Analysis: Analyzing DNA, 1. (1997), 307:113-115 (1992); Jones et al., Eur: J. Haematol., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, 35 39: 144–147 (1987); Horn et al., PCT Application WO N.Y.). Promoters obtained utilizing the nucleic acid mol 91/14003; Jeffreys, EP 370 719; Jeffreys, U.S. Pat. No. ecules of the present invention could also be modified to 5,175,082; Jeffreys et al., Amer: J. Hum. Genet., 39:11-24 affect their control characteristics. Examples of such modi (1986); Jeffreys et al., Nature, 316:76–79 (1985); Gray et al., fications would include but are not limited to enhancer Proc. R. Acad. Soc. Lond, 243:241–253 (1991); Moore et sequences. Such genetic elements could be used to enhance 40 al., Genomics, 10:654–660 (1991); Jeffreys et al., Anim. gene expression of new and existing traits for crop improve Genet., 18:1–15 (1987); Hillel et al., Anim. Genet., ment. 20:145–155 (1989); Hillel et al., Genet., 124:783–789 Another subset of the nucleic acid molecules of the (1990)). present invention includes nucleic acid molecules that are The detection of polymorphic sites in a sample of DNA markers. The markers can be used in a number of conven 45 may be facilitated through the use of nucleic acid amplifi tional ways in the field of molecular genetics. Such markers cation methods. 
Such methods specifically increase the include nucleic acid molecules encoding SEQ ID NOs: 5, concentration of polynucleotides that span the polymorphic 9–11, 43–44, 57–58, and 90, and complements thereof, and site, or include that site and sequences located either distal fragments of either that can act as markers and other nucleic or proximal to it. Such amplified molecules can be readily acid molecules of the present invention that can act as 50 detected by gel electrophoresis or other means. markers. In an alternative embodiment, Such polymorphisms can Genetic markers of the present invention include “domi be detected through the use of a marker nucleic acid mol nant’ or “codominant markers. “Codominant markers' ecule that is physically linked to Such polymorphism(s). For reveal the presence of two or more alleles (two per diploid this purpose, marker nucleic acid molecules comprising a individual) at a locus. “Dominant markers' reveal the pres 55 nucleotide sequence of a polynucleotide located within 1 mb ence of only a single allele per locus. The presence of the of the polymorphism(s) and more preferably within 100 kb dominant marker phenotype (e.g., a band of DNA) is an of the polymorphism(s) and most preferably within 10 kb of indication that one allele is in either the homozygous or the polymorphism(s) can be employed. heterozygous condition. The absence of the dominant The identification of a polymorphism can be determined marker phenotype (e.g., absence of a DNA band) is merely 60 in a variety of ways. By correlating the presence or absence evidence that “some other undefined allele is present. In the of it in a plant with the presence or absence of a phenotype, case of populations where individuals are predominantly it is possible to predict the phenotype of that plant. If a homozygous and loci are predominately dimorphic, domi polymorphism creates or destroys a restriction endonuclease nant and codominant markers can be equally valuable. 
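As an illustrative sketch of the distinction drawn above between dominant and codominant markers, the following Python fragment shows what each marker type would report for the three genotypes at a di-allelic locus. The function names and the allele labels "A" and "a" are hypothetical and are not taken from the disclosure.

```python
def codominant_readout(genotype: str) -> str:
    """Report every distinct allele present, as a codominant marker would."""
    return "/".join(sorted(set(genotype)))


def dominant_readout(genotype: str, detected_allele: str = "A") -> str:
    """Report only presence or absence of one allele (e.g., a DNA band)."""
    return "band" if detected_allele in genotype else "no band"


# Hypothetical di-allelic locus with alleles "A" and "a".
for genotype in ("AA", "Aa", "aa"):
    print(genotype, codominant_readout(genotype), dominant_readout(genotype))
```

In this sketch the dominant readout is identical for the homozygous "AA" and heterozygous "Aa" genotypes, whereas the codominant readout separates all three, which is why codominant markers become more informative as populations become more heterozygous.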
As cleavage site, or if it results in the loss or insertion of DNA populations become more heterozygous and multi-allelic, 65 (e.g., a VNTR polymorphism), it will alter the size or profile codominant markers often become more informative of the of the DNA fragments that are generated by digestion with genotype than dominant markers. Marker molecules can be, that restriction endonuclease. As such, organisms that pos US 7,112,717 B2 45 46 sess a variant sequence can be distinguished from those phism analysis (Labrune et al., Am. J. Hum. Genet., having the original sequence by restriction fragment analy 48: 1115-1120 (1991)), single base primer extension (Kup sis. Polymorphisms that can be identified in this manner are puswamy et al., Proc. Natl. Acad. Sci. (U.S.A.), termed “restriction fragment length polymorphisms' 88: 1143–1147 (1991), Goelet, U.S. Pat. No. 6,004,744; (RFLPs) (Glassberg, UK Patent Application 2135774; Goelet, U.S. Pat. No. 5,888.819), solid-phase ELISA-based Skolnick et al., Cytogen. Cell Genet., 32:58–67 (1982); oligonucleotide ligation assays (Nikiforov et al., Nucl. Acids Botstein et al., Ann. J. Hum. Genet., 32:314–331 (1980); Res., 22:4167 4175 (1994)), dideoxy fingerprinting (Sarkar Fischer et al., PCT Application WO 90/13668; Uhlen, PCT et al., Genomics, 13:441–443 (1992)), oligonucleotide fluo Application WO 90/11369). rescence-quenching assays (Livak et al., PCR Methods Polymorphisms can also be identified by Single Strand 10 Appl., 4:357-362 (1995a)), 5'-nuclease allele-specific Conformation Polymorphism (SSCP) analysis (Elles, Meth hybridization TaqManTM assay (Livak et al., Nature Genet., ods in Molecular Medicine. Molecular Diagnosis of Genetic 9:341–342 (1995)), template-directed dye-terminator incor Diseases, Humana Press (1996)); Orita et al., Genomics, poration (TDI) assay (Chen and Kwok, Nucl. Acids Res., 5:874-879 (1989)). 
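The restriction fragment analysis described above can be sketched, under simplifying assumptions, as follows. The toy sequences, the function name, and the treatment of the DNA as a single linear strand are hypothetical. The recognition site GAATTC with cleavage after the first G corresponds to the well-known enzyme EcoRI and is used only as an example.

```python
def fragment_lengths(seq: str, site: str = "GAATTC", cut_offset: int = 1) -> list[int]:
    """Return the lengths of fragments produced by cutting a linear
    sequence at every occurrence of `site`, `cut_offset` bases into the site."""
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Hypothetical "original" allele with two cleavage sites.
original = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TTTT"
# Hypothetical variant allele: a single-base change destroys the second site.
variant = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAGTTC" + "TTTT"

print(fragment_lengths(original))  # three fragments
print(fragment_lengths(variant))   # two fragments
```

Because the variant allele lacks one cleavage site, digestion yields a different set of fragment sizes, which is the difference that restriction fragment analysis detects.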
A number of protocols have been 25:347–353 (1997)), allele-specific molecular beacon assay described for SSCP including, but not limited to, Lee et al., 15 (Tyagi et al., Nature Biotech., 16:49–53 (1998)), PinPoint Anal. Biochem., 205:289-293 (1992); Suzuki et al., Anal. assay (Haff and Smirnimov, Genome Res., 7:378-388 Biochem., 192:82–84 (1991); Lo et al., Nucleic Acids (1997)), dCAPS analysis (Neff et al., Plant J., 14:387–392 Research, 20:1005–1009 (1992); Sarkar et al., Genomics, (1998)), pyrosequencing (Ronaghi et al., Analytical Bio 13:441–443 (1992). It is understood that one or more of the chemistry, 267:65-71 (1999); Ronaghi et al., WO 98/13523; nucleic acids of the present invention, may be utilized as Nyren et al., WO 98/28440; www.pyrosequencing.com), markers or probes to detect polymorphisms by SSCP analy using mass spectrometry, e.g. the Masscode M System (How S1S. bert et al., WO 99/05319; Howbert et al., WO 97/27331; Polymorphisms may also be found using a DNA finger www.rapigene.com: Becker et al., WO 98/26095; Becker et printing technique called amplified fragment length poly al., WO 98/12355; Becker et al., WO 97/33000; Monforte et morphism (AFLP), which is based on the selective PCR 25 al., U.S. Pat. No. 5,965.363), invasive cleavage of oligo amplification of restriction fragments from a total digest of nucleotide probes (Lyamichev et al., Nature Biotechnology, genomic DNA to profile that DNA (Vos et al., Nucleic Acids 17:292–296; www.twit.com), and using high density oligo Res., 23:4407-4414 (1995)). This method allows for the nucleotide arrays (Hacia et al., Nature Genetics, specific co-amplification of high numbers of restriction 22:164-167: www.affymetrix.com). fragments, which can be visualized by PCR without knowl 30 Polymorphisms may also be detected using allele-specific edge of the nucleic acid sequence. 
It is understood that one oligonucleotides (ASO), which, can be for example, used in or more of the nucleic acids of the present invention may be combination with hybridization based technology including utilized as markers or probes to detect polymorphisms by Southern, Northern, and dot blot hybridizations, reverse dot AFLP analysis or for fingerprinting RNA. blot hybridizations and hybridizations performed on Polymorphisms may also be found using random ampli 35 microarray and related technology. fied polymorphic DNA (RAPD) (Williams et al., Nucl. Acids The stringency of hybridization for polymorphism detec Res., 18:6531–6535 (1990)) and cleaveable amplified poly tion is highly dependent upon a variety of factors, including morphic sequences (CAPS) (Lyamichev et al., Science, length of the allele-specific oligonucleotide, sequence com 260:778–783 (1993)). It is understood that one or more of position, degree of complementarity (i.e., presence or the nucleic acid molecules of the present invention, may be 40 absence of base mismatches), concentration of salts and utilized as markers or probes to detect polymorphisms by other factors such as formamide and temperature. These RAPD or CAPS analysis. factors are important both during the hybridization itself and Single Nucleotide Polymorphisms (SNPs) generally during Subsequent washes performed to remove target poly occur at greater frequency than other polymorphic markers nucleotide that is not specifically hybridized. In practice, the and are spaced with a greater uniformity throughout a 45 conditions of the final, most stringent wash are most critical. genome than other reported forms of polymorphism. 
The greater frequency and uniformity of SNPs means that there is greater probability that such a polymorphism will be found near or in a genetic locus of interest than would be the case for other polymorphisms. SNPs are located in protein coding regions and noncoding regions of a genome. Some of these SNPs may result in defective or variant protein expression (e.g., as a result of mutations or defective splicing). Analysis (genotyping) of characterized SNPs can require only a plus/minus assay rather than a lengthy measurement, permitting easier automation.

SNPs can be characterized using any of a variety of methods. Such methods include the direct or indirect sequencing of the site, the use of restriction enzymes (Botstein et al., Am. J. Hum. Genet., 32:314–331 (1980); Konieczny and Ausubel, Plant J., 4:403–410 (1993)), enzymatic and chemical mismatch assays (Myers et al., Nature, 313:495–498 (1985)), allele-specific PCR (Newton et al., Nucl.

In addition, the amount of target polynucleotide that is able to hybridize to the allele-specific oligonucleotide is also governed by such factors as the concentration of both the ASO and the target polynucleotide, the presence and concentration of factors that act to "tie up" water molecules, so as to effectively concentrate the reagents (e.g., PEG, dextran, dextran sulfate, etc.), whether the nucleic acids are immobilized or in solution, and the duration of hybridization and washing steps.

Hybridizations are preferably performed below the melting temperature ($T_m$) of the ASO. The closer the hybridization and/or washing step is to the $T_m$, the higher the stringency. $T_m$ for an oligonucleotide may be approximated, for example, according to the following formula: $T_m = 81.5 + 16.6 \times \log_{10}[\mathrm{Na}^+] + 0.41 \times (\%\,\mathrm{G{+}C}) - 675/n$; where $[\mathrm{Na}^+]$ is the molar salt concentration of Na+ or any other suitable cation and $n$ = number of bases in the oligonucleotide. Other
formulas for approximating T are available and are known Acids Res., 17:2503-2516 (1989); Wu et al., Proc. Natl. to those of ordinary skill in the art. Acad. Sci. (U.S.A.), 86:2757 2760 (1989)), ligase chain 65 Stringency is preferably adjusted so as to allow a given reaction (Barany, Proc. Natl. Acad. Sci. (U.S.A.), ASO to differentially hybridize to a target polynucleotide of 88:189-193 (1991)), single-strand conformation polymor the correct allele and a target polynucleotide of the incorrect US 7,112,717 B2 47 48 allele. Preferably, there will be at least a two-fold differential In a preferred embodiment of the present invention the between the signal produced by the ASO hybridizing to a nucleic acid marker exhibits a LOD score of greater than 2.0, target polynucleotide of the correct allele and the level of the more preferably 2.5, even more preferably greater than 3.0 signal produced by the ASO cross-hybridizing to a target or 4.0 with the trait or phenotype of interest. In a preferred polynucleotide of the incorrect allele (e.g., an ASO specific 5 embodiment, the trait of interest is altered tocopherol levels for a mutant allele cross-hybridizing to a wild-type allele). or compositions or altered tocotrienol levels or composi In more preferred embodiments of the present invention, tions. there is at least a five-fold signal differential. In highly Additional models can be used. Many modifications and preferred embodiments of the present invention, there is at alternative approaches to interval mapping have been least an order of magnitude signal differential between the 10 reported, including the use of non-parametric methods ASO hybridizing to a target polynucleotide of the correct (Kruglyak and Lander, Genetics, 139: 1421–1428 (1995)). 
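The melting-temperature approximation given above for an allele-specific oligonucleotide, Tm = 81.5 + 16.6 × log10[Na+] + 0.41 × (% G+C) − 675/n, can be sketched in Python as follows. The function name, the example oligonucleotide, and the salt concentrations are hypothetical. The result is read here in degrees Celsius, which is the conventional unit for this approximation but is an assumption, since the text does not state a unit.

```python
import math


def approx_tm(oligo: str, na_molar: float) -> float:
    """Approximate Tm as 81.5 + 16.6*log10([Na+]) + 0.41*(%G+C) - 675/n."""
    oligo = oligo.upper()
    n = len(oligo)
    if n == 0:
        raise ValueError("oligonucleotide must not be empty")
    if na_molar <= 0:
        raise ValueError("salt concentration must be positive")
    gc_percent = 100.0 * sum(base in "GC" for base in oligo) / n
    return 81.5 + 16.6 * math.log10(na_molar) + 0.41 * gc_percent - 675.0 / n


# Hypothetical 20-base ASO with 50% G+C content.
aso = "ATGCATGCATGCATGCATGC"
print(round(approx_tm(aso, 1.0), 2))   # 1 M monovalent cation
print(round(approx_tm(aso, 0.1), 2))   # 0.1 M monovalent cation
```

Lowering the salt concentration lowers the approximated Tm, so at a fixed wash temperature a lower-salt wash is closer to the Tm and therefore more stringent, consistent with the relationship between stringency and Tm stated above.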
allele and the level of the signal produced by the ASO Multiple regression methods or models can also be used, in cross-hybridizing to a target polynucleotide of the incorrect which the trait is regressed on a large number of markers allele. (Jansen, Biometrics in Plant Breeding, van Oijen and Jansen While certain methods for detecting polymorphisms are 15 (eds.), Proceedings of the Ninth Meeting of the Eucarpia described herein, other detection methodologies may be Section Biometrics in Plant Breeding. The Netherlands, pp. utilized. For example, additional methodologies are known 116–124 (1994); Weber and Wricke, Advances in Plant and set forth, in Birren et al., Genome Analysis, 4:135-186. Breeding, Blackwell, Berlin, 16 (1994)). Procedures com A Laboratory Manual. Mapping Genomes, Cold Spring bining interval mapping with regression analysis, whereby Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1999): 20 the phenotype is regressed onto a single putative QTL at a Maliga et al., Methods in Plant Molecular Biology. A Labo given marker interval and at the same time onto a number of ratory Course Manual, Cold Spring Harbor Laboratory markers that serve as “cofactors', have been reported by Press, Cold Spring Harbor, N.Y. (1995); Paterson, Biotech Jansen and Stam, Genetics, 136:1447–1455 (1994); and nology Intelligence Unit. Genome Mapping in Plants, R. G. Zeng, Genetics, 136:1457–1468 (1994). Generally, the use Landes Co., Georgetown, Tex., and Academic Press, San 25 of cofactors reduces the bias and sampling error of the Diego, Calif. (1996); The Corn Handbook, Freeling and estimated QTL positions (Utz and Melchinger, Biometrics in Walbot, (eds.), Springer-Verlag, New York, N.Y. (1994): Plant Breeding, van Oijen and Jansen (eds.), Proceedings of Methods in Molecular Medicine. Molecular Diagnosis of the Ninth Meeting of the Eucarpia Section Biometrics in Genetic Diseases, Elles, (ed.), Humana Press, Totowa, N.J. Plant Breeding. The Netherlands, pp. 
195-204 (1994), (1996); Clark, (ed.), Plant Molecular Biology: A Laboratory 30 thereby improving the precision and efficiency of QTL Manual, Springer-Verlag, Berlin, Germany (1997). mapping (Zeng, Genetics, 136:1457–1468 (1994)). These Factors for marker-assisted selection in a plant breeding models can be extended to multi-environment experiments program are: (1) the marker(s) should co-segregate or be to analyze genotype-environment interactions (Jansen et al., closely linked with the desired trait; (2) an efficient means of Theo. Appl. Genet., 91:33–37 (1995)). screening large populations for the molecular marker(s) 35 It is understood that one or more of the nucleic acid should be available; and (3) the screening technique should molecules of the present invention may be used as molecular have high reproducibility across laboratories and preferably markers. It is also understood that one or more of the protein be economical to use and be user-friendly. molecules of the present invention may be used as molecular The genetic linkage of marker molecules can be estab markers. lished by a gene mapping model Such as, without limitation, 40 In a preferred embodiment, the polymorphism is present the flanking marker model reported by Lander and Botstein, and Screened for in a mapping population, e.g. a collection Genetics, 121:185-199 (1989) and the interval mapping, of plants capable of being used with markers such as based on maximum likelihood methods described by Lander polymorphic markers to map genetic position of traits. The and Botstein, Genetics, 121:185-199 (1989) and imple choice of appropriate mapping population often depends on mented in the software package MAPMAKER/QTL (Lin- 45 the type of marker systems employed (Tanksley et al., J. P. coln and Lander, Mapping Genes Controlling Ouantitative Gustafson and R. Appels (eds.). Plenum Press, NY. pp. Traits Using MAPMAKER/OTL, Whitehead Institute for 157–173 (1988)). 
Biomedical Research, MA (1990). Additional software includes Qgene, Version 2.23 (1996), Department of Plant Breeding and Biometry, 266 Emerson Hall, Cornell University, Ithaca, N.Y. Use of Qgene software is a particularly preferred approach.

A maximum likelihood estimate (MLE) for the presence of a marker is calculated, together with an MLE assuming no QTL effect, to avoid false positives. A $\log_{10}$ of an odds ratio (LOD) is then calculated as: $\mathrm{LOD} = \log_{10}(\text{MLE for the presence of a QTL} / \text{MLE given no linked QTL})$.

The LOD score essentially indicates how much more likely the data are to have arisen assuming the presence of a QTL than in its absence. The LOD threshold value for avoiding a false positive with a given confidence, say 95%, depends on the number of markers and the length of the genome. Graphs indicating LOD thresholds are set forth in Lander and Botstein, Genetics, 121:185–199 (1989) and further described by Arus and Moreno-González, Plant

Consideration must be given to the source of parents (adapted vs. exotic) used in the mapping population. Chromosome pairing and recombination rates can be severely disturbed (suppressed) in wide crosses (adapted × exotic) and generally yield greatly reduced linkage distances. Wide crosses will usually provide segregating populations with a relatively large number of polymorphisms when compared to progeny in a narrow cross (adapted × adapted).

An F2 population is the first generation of selfing (self-pollinating) after the hybrid seed is produced. Usually a single F1 plant is selfed to generate a population segregating for all the genes in Mendelian (1:2:1) pattern. Maximum genetic information is obtained from a completely classified F2 population using a codominant marker system (Mather, Measurement of Linkage in Heredity: Methuen and Co., (1938)). In the case of dominant markers, progeny tests (e.g., F3, BCF2) are required to identify the heterozygotes, in order to classify the population.
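The LOD calculation described above, the base-10 logarithm of the ratio of the maximum likelihood assuming a QTL to the maximum likelihood assuming no linked QTL, can be sketched as follows. The function name and the two likelihood values are hypothetical. In practice these likelihoods would come from mapping software rather than being entered by hand.

```python
import math


def lod_score(mle_with_qtl: float, mle_without_qtl: float) -> float:
    """LOD = log10(MLE for the presence of a QTL / MLE given no linked QTL)."""
    if mle_with_qtl <= 0 or mle_without_qtl <= 0:
        raise ValueError("likelihoods must be positive")
    return math.log10(mle_with_qtl / mle_without_qtl)


# Hypothetical likelihoods: the data are 1000 times more likely with a QTL.
score = lod_score(1e-20, 1e-23)
print(round(score, 2))
```

A score near 3 in this sketch means the data are about a thousand times more likely under the QTL model than without it. Whether that exceeds the false-positive threshold still depends on the number of markers and the genome length, as noted above.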
However, this procedure is often Breeding, Hayward et al., (eds.) Chapman & Hall, London, prohibitive because of the cost and time involved in progeny pp. 314331 (1993). testing. Progeny testing of F individuals is often used in US 7,112,717 B2 49 50 map construction where phenotypes do not consistently sample, etc.) in a plant (preferably canola, corn, Brassica reflect genotype (e.g., disease resistance) or where trait campestris, oilseed rape, rapeseed, soybean, crambe, mus expression is controlled by a QTL. Segregation data from tard, castor bean, peanut, Sesame, cottonseed, linseed, saf progeny test populations (e.g., F. or BCF) can be used in flower, oil palm, flax or sunflower) or pattern (i.e., the map construction. Marker-assisted selection can then be kinetics of expression, rate of decomposition, stability pro applied to cross progeny based on marker-trait map asso file, etc.) of the expression of a protein encoded in part or ciations (F, F), where linkage groups have not been whole by one or more of the nucleic acid molecule of the completely disassociated by recombination events (i.e., present invention (collectively, the “Expression Response' maximum disequilibrium). of a cell or tissue). Recombinant inbred lines (RIL) (genetically related lines: 10 As used herein, the Expression Response manifested by a usually >Fs, developed from continuously selfing F lines cell or tissue is said to be “altered” if it differs from the towards homozygosity) can be used as a mapping popula Expression Response of cells or tissues of plants not exhib tion. Information obtained from dominant markers can be iting the phenotype. To determine whether a Expression maximized by using RIL because all loci are homozygous or Response is altered, the Expression Response manifested by nearly so. 
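The statement that continued selfing drives recombinant inbred lines toward homozygosity can be illustrated with a short calculation. The sketch assumes a single locus that is fully heterozygous in the F1, strict self-pollination in every later generation, and no selection. Under those assumptions the expected heterozygous fraction halves with each generation of selfing. The function name is hypothetical.

```python
def expected_heterozygosity(generation: int) -> float:
    """Expected heterozygous fraction at one locus in generation F<n>,
    starting from a fully heterozygous F1 and selfing every generation."""
    if generation < 1:
        raise ValueError("generation must be 1 (F1) or later")
    return 0.5 ** (generation - 1)


# Expected heterozygous and homozygous fractions from F1 through F8.
for n in range(1, 9):
    het = expected_heterozygosity(n)
    print(f"F{n}: heterozygous {het:.4f}, homozygous {1 - het:.4f}")
```

Under these assumptions fewer than 1% of loci are expected to remain heterozygous by F8, which is why information from dominant markers is close to maximal in such lines.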
Under conditions of tight linkage (i.e., about 15 the cell or tissue of the plant exhibiting the phenotype is <10% recombination), dominant and co-dominant markers compared with that of a similar cell or tissue sample of a evaluated in RIL populations provide more information per plant not exhibiting the phenotype. As will be appreciated, individual than either marker type in backcross populations it is not necessary to re-determine the Expression Response (Reiter et al., Proc. Natl. Acad. Sci. (U.S.A.), 89:1477–1481 of the cell or tissue sample of plants not exhibiting the (1992)). However, as the distance between markers becomes phenotype each time such a comparison is made; rather, the larger (i.e., loci become more independent), the information Expression Response of a particular plant may be compared in RIL populations decreases dramatically when compared with previously obtained values of normal plants. As used to codominant markers. herein, the phenotype of the organism is any of one or more Backcross populations e.g., generated from a cross characteristics of an organism (e.g., disease resistance, pest between a Successful variety (recurrent parent) and another 25 tolerance, environmental tolerance such as tolerance to variety (donor parent) carrying a trait not present in the abiotic stress, male sterility, quality improvement or yield former) can be utilized as a mapping population. A series of etc.). A change in genotype or phenotype may be transient or backcrosses to the recurrent parent can be made to recover permanent. Also as used herein, a tissue sample is any most of its desirable traits. Thus a population is created sample that comprises more than one cell. In a preferred consisting of individuals nearly like the recurrent parent but 30 aspect, a tissue sample comprises cells that share a common each individual carries varying amounts or mosaic of characteristic (e.g., Derived from root, seed, flower, leaf. genomic regions from the donor parent. 
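The recovery of the recurrent parent genome through repeated backcrossing can be approximated with a simple expectation. The sketch assumes no selection and no linkage drag, so that each backcross halves the expected donor contribution. The function name is hypothetical, and actual individuals vary around this expectation, carrying the mosaic of donor regions described above.

```python
def expected_recurrent_fraction(backcrosses: int) -> float:
    """Expected fraction of the genome from the recurrent parent after the
    given number of backcrosses (0 corresponds to the F1 hybrid)."""
    if backcrosses < 0:
        raise ValueError("number of backcrosses must be non-negative")
    return 1.0 - 0.5 ** (backcrosses + 1)


# Expected recurrent-parent fraction for the F1 and the first five backcrosses.
for k in range(0, 6):
    print(f"after {k} backcross(es): {expected_recurrent_fraction(k):.4f}")
```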
Backcross popula stem or pollen etc.). tions can be useful for mapping dominant markers if all loci In one aspect of the present invention, an evaluation can in the recurrent parent are homozygous and the donor and be conducted to determine whether a particular mRNA recurrent parent have contrasting polymorphic marker alle 35 molecule is present. One or more of the nucleic acid les (Reiter et al., Proc. Natl. Acad. Sci. (U.S.A.), molecules of the present invention are utilized to detect the 89:1477–1481 (1992)). Information obtained from back presence or quantity of the mRNA species. Such molecules cross populations using either codominant or dominant are then incubated with cell or tissue extracts of a plant markers is less than that obtained from F populations under conditions sufficient to permit nucleic acid hybridiza because one, rather than two, recombinant gamete is 40 tion. The detection of double-stranded probe-mRNA hybrid sampled per plant. Backcross populations, however, are molecules is indicative of the presence of the mRNA; the more informative (at low marker Saturation) when compared amount of Such hybrid formed is proportional to the amount to RILs as the distance between linked loci increases in RIL of mRNA. Thus, such probes may be used to ascertain the populations (i.e., about 0.15% recombination). Increased level and extent of the mRNA production in a plant's cells recombination can be beneficial for resolution of tight 45 or tissues. Such nucleic acid hybridization may be con linkages, but may be undesirable in the construction of maps ducted under quantitative conditions (thereby providing a with low marker saturation. numerical value of the amount of the mRNA present). 
Near-isogenic lines (NIL) (created by many backcrosses Alternatively, the assay may be conducted as a qualitative to produce a collection of individuals that is nearly identical assay that indicates either that the mRNA is present, or that in genetic composition except for the trait or genomic region 50 its level exceeds a user set, predefined value. under interrogation) can be used as a mapping population. In A number of methods can be used to compare the expres mapping with NILS, only a portion of the polymorphic loci sion response between two or more samples of cells or is expected to map to a selected region. tissue. These methods include hybridization assays, such as Bulk segregant analysis (BSA) is a method developed for northerns, RNAse protection assays, and in situ hybridiza the rapid identification of linkage between markers and traits 55 tion. Alternatively, the methods include PCR-type assays. In of interest (Michelmore et al., Proc. Natl. Acad. Sci. a preferred method, the expression response is compared by (U.S.A.), 88:9828–9832 (1991)). In BSA, two bulked DNA hybridizing nucleic acids from the two or more samples to samples are drawn from a segregating population originating an array of nucleic acids. The array contains a plurality of from a single cross. These bulks contain individuals that are Suspected sequences known or Suspected of being present in identical for a particular trait (resistant or susceptible to 60 the cells or tissue of the samples. particular disease) or genomic region but arbitrary at An advantage of in situ hybridization over more conven unlinked regions (i.e., heterozygous). Regions unlinked to tional techniques for the detection of nucleic acids is that it the target region will not differ between the bulked samples allows an investigator to determine the precise spatial popu of many individuals in BSA. lation (Angerer et al., Dev. 
Biol., 101:477 484 (1984); In an aspect of the present invention, one or more of the 65 Angerer et al., Dev. Biol., 112:157–166 (1985); Dixon et al., nucleic molecules of the present invention are used to EMBO.J., 10:1317–1324 (1991)). In situ hybridization may determine the level (i.e., the concentration of mRNA in a be used to measure the steady-state level of RNA accumu US 7,112,717 B2 51 52 lation (Hardin et al., J. Mol. Biol., 202:417-431 (1989)). A Verlag, Berlin, (1997); Methods in Plant Molecular Biology, number of protocols have been devised for in situ hybrid Maliga et al., Cold Spring Harbor Press, Cold Spring Har ization, each with tissue preparation, hybridization and bor, N.Y. (1995). These texts can, of course, also be referred washing conditions (Meyerowitz, Plant Mol. Biol. Rep., to in making or using an aspect of the present invention. It 5:242-250 (1987); Cox and Goldberg, In: Plant Molecular is understood that any of the agents of the present invention Biology: A Practical Approach, Shaw (ed.), pp. 1–35, IRL can be substantially purified and/or be biologically active Press, Oxford (1988); Raikhel et al., In situ RNA hybridiza and/or recombinant. tion in plant tissues, In: Plant Molecular Biology Manual, Having now generally described the present invention, the Vol. B9:1–32, Kluwer Academic Publisher, Dordrecht, Bel same will be more readily understood through reference to gium (1989)). 10 the following examples that are provided by way of illus In situ hybridization also allows for the localization of tration, and are not intended to be limiting of the present proteins within a tissue or cell (Wilkinson. In Situ Hybrid invention, unless specified. ization, Oxford University Press, Oxford (1992); Langdale, In Situ Hybridization In: The Corn Handbook, Freeling and EXAMPLE 1. Walbot (eds.), pp. 165-179, Springer-Verlag, NY (1994)). 
It 15 is understood that one or more of the molecules of the Identification of Homogentisate Prenyl Transferase present invention, preferably one or more of the nucleic acid Sequences molecules or fragments thereof of the present invention or one or more of the antibodies of the present invention may This example sets forth methods used to analyze be utilized to detect the level or pattern of a protein or homogentisate prenyl transferase sequences from various mRNA thereof by in situ hybridization. Sources in order to identify motifs common to homogenti Fluorescent in situ hybridization allows the localization of sate prenyl transferase that are contained therein. a particular DNA sequence along a chromosome, which is Homogentisate prenyl transferase sequences from Soy, useful, among other uses, for gene mapping, following Arabidopsis, Corn and Cuphea (partial) are cloned and chromosomes in hybrid lines, or detecting chromosomes 25 sequenced from EST sequences found in an EST database. with translocations, transversions or deletions. In situ Synechocystis, Nostoc, and Anabaena are obtained from hybridization has been used to identify chromosomes in Genbank. These sequences (representing SEQID NOs: 1-8) several plant species (Griffor et al., Plant Mol. Biol., are then aligned with respect to each other using the multiple 17:101–109 (1991); Gustafson et al., Proc. Natl. Acad. Sci. alignment software ClustalX, which is described by Thomp (U.S.A.), 87: 1899–1902 (1990); Mukai and Gill, Genome, 30 34:448–452 (1991); Schwarzacher and Heslop-Harrison, son et al., Nucleic Acids Research, 24:4876 4882 (1997). Genome, 34:317-323 (1991); Wang et al., Jpn. J. Genet., The multiple alignment of the protein sequences is visual 66:313–316 (1991); Parra and Windle, Nature Genetics, ized and edited using Genedoc, which is described by 5:17-21 (1993)). It is understood that the nucleic acid Nicholas et al., EMBNEW NEWS, 4:14 (1997). 
molecules of the present invention may be used as probes or 35 Using the aforementioned multiple alignment tool, four markers to localize sequences along a chromosome. motifs (A-D) are identified, as shown in FIGS. 2a-2c, Another method to localize the expression of a molecule wherein motifs A–D are set forth. These motifs are repre is tissue printing. Tissue printing provides a way to screen, sented by SEQ ID NOs: 12–15. The Cuphea sequence is at the same time on the same membrane many tissue sections removed from motif D because the sequence had multiple from different plants or different developmental stages 40 errors towards the 3' end that generated apparent frame shift (Yomo and Taylor, Planta, 112:35–43 (1973); Harris and COS. Chrispeels, Plant Physiol., 56:292–299 (1975); Cassab and The specificity of these motifs is demonstrated using a Varner, J. Cell. Biol., 105:2581–2588 (1987); Spruce et al., Hidden Markov Model (HMM) that is built using an Phytochemistry, 26:2901-2903 (1987); Barres et al., Neu HMMER(version 2.2 g) software package (Eddy, Bioinfor ron, 5:527–544 (1990); Reid and Pont-Lezica, Tissue Print 45 matics, 14:755–763 (1998)). A HMM search is performed on ing. Tools for the Study of Anatomy, Histochemistry and a cDNA sequence database containing full insert sequence Gene Expression, Academic Press, New York, N.Y. (1992); from different plant species. This search identifies two new Reid et al., Plant Physiol., 93:160-165 (1990); Ye et al., homogentisate prenyl transferase sequences (SEQ ID NOs: Plant J., 1:175-183 (1991)). 9–10) in addition to several partial homogentisate prenyl One skilled in the art can refer to general reference texts 50 transferase sequences. The two new homogentisate prenyl for detailed descriptions of known techniques discussed transferase sequences identified are from leek and wheat. herein or equivalent techniques. 
These texts include Current This search also identifies a complete Cuphea sequence Protocols in Molecular Biology, Ausubel et al., (eds.), John (SEQ ID NO: 11) with no errors. A second alignment is Wiley & Sons, NY (1989), and supplements through Sep generated using the aforementioned multiple alignment tool, tember (1998), Molecular Cloning, A Laboratory Manual, 55 as shown in FIGS. 3a–3c. This alignment has the leek, Sambrook et al., 2" Ed., Cold Spring Harbor Press, Cold wheat, and full Cuphea sequences incorporated. Motifs I-IV Spring Harbor, N.Y. (1989), Genome Analysis: A Laboratory (SEQ ID NOs: 39–42) are shown. Manual 1: Analyzing DNA, Birren et al., Cold Spring Harbor Specificity is also tested by using each motif sequence to Press, Cold Spring Harbor, N.Y. (1997); Genome Analysis. search the non-redundant amino acid database downloaded A Laboratory Manual 2: Detecting Genes, Birren et al., Cold 60 from Genbank available through NCBI. All four motifs Spring Harbor Press, Cold Spring Harbor, N.Y. (1998); identify three homogentisate prenyl transferase found in the Genome Analysis. A Laboratory Manual 3. Cloning Sys aforementioned non-redundant amino acid database, as fol tems, Birren et al., Cold Spring Harbor Press, Cold Spring lows: Nostoc, Synechocystis, Arabidopsis. Motifs II and IV Harbor, N.Y. (1999); Genome Analysis: A Laboratory also identified some genomic variants of an uncharacterized Manual 4. Mapping Genomes, Birren et al., Cold Spring 65 Arabidopsis protein. Motifs I and III only identified known Harbor Press, Cold Spring Harbor, N.Y. (1999); Plant homogentisate prenyl transferase at an E value of 0.001 or Molecular Biology: A Laboratory Manual. Clark, Springer lower. US 7,112,717 B2 53 54 EXAMPLE 2 napin 3' is closest to the blunted HindIII site is subjected to sequence analysis to confirm both the insert orientation and Preparation of Expression Constructs the integrity of cloning junctions. The resulting plasmid is designated pCGN8623. 
A plasmid containing the napin cassette derived from The plasmid pCONS620 is constructed by ligating oligo pCGN3223 (described in U.S. Pat. No. 5,639,790, the nucleotides 5'-TCGAGGATCCGCGGCCGCAAGCTTC entirety of which is incorporated herein by reference) is CTGCAGGAGCT-3' (SEQ ID NO: 21) and 5'-CCTGCAG modified to make it more useful for cloning large DNA GAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO: 22) fragments containing multiple restriction sites, and to allow into SalI/SacI-digested pCGN7787. A fragment containing the cloning of multiple napin fusion genes into plant binary 10 the d35S promoter, polylinker and trial 3' region is removed transformation vectors. An adapter comprised of the self from pCGN8620 by complete digestion with Asp718I and annealed oligonucleotide of sequence CGCGATTTAAATG partial digestion with NotI. The fragment is blunt-ended by GCGCGCCCTGCAGGCGGCCGCCTG filling in the 5' overhangs with Klenow fragment then ligated CAGGGCGCGCCATTTAAAT (SEQID NO: 16) is ligated into pCGN5139 that is digested with Asp718I and HindIII into the cloning vector pBC SK-- (Stratagene) after digestion 15 and blunt-ended by filling in the 5' overhangs with Klenow with the restriction endonuclease BSSHII to construct vector fragment. A plasmid containing the insert oriented so that the pCGN7765. Plasmids pCGN3223 and pCGN7765 are d35S promoter is closest to the blunted Asp718I site of digested with NotI and ligated together. The resultant vector, pCGN5139 and the timl 3' is closest to the blunted HindIII pCGN7770, contains the pCON7765 backbone with the site is subjected to sequence analysis to confirm both the napin seed specific expression cassette from pCGN3223. insert orientation and the integrity of cloning junctions. The The cloning cassette pCGN7787 comprises essentially the resulting plasmid is designated pCON8624. 
same regulatory elements as pCGN7770, with the exception The plasmid pCON8621 is constructed by ligating oligo that the napin regulatory regions of pCGN7770 have been nucleotides 5-TCGACCTGCAGGAAGCTTGCGGC replaced with the double CAMV 35S promoter and the CGCGGATCCAGCT-3' (SEQID NO. 23) and 5'-GGATC polyadenylation and transcriptional termination region. 25 CGCGGCCGCAAGCTTCCTGCAGG-3 (SEQID NO:24) A binary vector for plant transformation, pCGN5139, is into SalI/SacI-digested pCGN7787. A fragment containing constructed from pCGN1558 (McBride and Summerfelt, the d35S promoter, polylinker and timl 3' region is removed Plant Molecular Biology, 14:269-276 (1990)). The from pCGN8621 by complete digestion with Asp718I and polylinker of pCGN1558 is replaced as a HindIII/Asp718 partial digestion with NotI. fragment with a polylinker containing unique restriction 30 The fragment is blunt-ended by filling in the 5' overhangs endonuclease sites, AscI, PacI, Xbal, Swal, BamHI, and with Klenow fragment then ligated into pCGN5139 that had NotI. The Asp718 and HindIII restriction endonuclease sites been digested with Asp718I and HindIII and blunt-ended by are retained in pCGN5139. filling in the 5' overhangs with Klenow fragment. A plasmid A series of binary vectors are constructed to allow for the containing the insert oriented so that the d35S promoter is rapid cloning of DNA sequences into binary vectors con 35 closest to the blunted Asp718I site of pCGN5139 and the timl taining transcriptional initiation regions (promoters) and 3' is closest to the blunted HindIII site is subjected to transcriptional termination regions. sequence analysis to confirm both the insert orientation and The plasmid pCON8618 is constructed by ligating oligo the integrity of cloning junctions. The resulting plasmid is nucleotides 5'-TCGAGGATCCGCGGCCGCAAGCTTC designated pCGN8625. 
CTGCAGG-3 (SEQID NO: 17) and 5'-TCGACCTGCAG 40 GAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO: 18) The plasmid construct pCON8640 is a modification of into SalIl/XhoI-digested pCON7770. A fragment containing pCGN8624 described above. A 938 bp PstI fragment iso the napin promoter, polylinker and napin 3' region is excised lated from transposon TnT which encodes bacterial specti from pCGN8618 by digestion with Asp718I; the fragment is nomycin and streptomycin resistance (Fling et al., Nucleic blunt-ended by filling in the 5' overhangs with Klenow 45 Acids Research, 13(19):7095 7106 (1985)), a determinant fragment then ligated into pCGN5139 that is digested with for E. coli and Agrobacterium selection, is blunt-ended with Asp7181 and HindIII and blunt-ended by filling in the 5' Pfu polymerase. The blunt-ended fragment is ligated into overhangs with Klenow fragment. A plasmid containing the pCGN8624 that had been digested with Spel and blunt insert oriented so that the napin promoter is closest to the ended with Pfu polymerase. The region containing the PstI blunted Asp718I site of pCGN5139 and the napin 3' is 50 fragment is sequenced to confirm both the insert orientation closest to the blunted HindIII site is subjected to sequence and the integrity of cloning junctions. analysis to confirm both the insert orientation and the The spectinomycin resistance marker is introduced into integrity of cloning junctions. The resulting plasmid is pCGN8622 and pCGN8623 as follows. A 7.7 Kbp Avril designated pCGN8622. SnaBI fragment from pCGN8640 is ligated to a 10.9 Kbp The plasmid pCON8619 is constructed by ligating oligo 55 Avril-SnaBI fragment from pCGN8623 or pCGN8622, nucleotides 5-TCGACCTGCAGGAAGCTTGCGGC described above. The resulting plasmids are pCON8641 and CGCGGATCC-3' (SEQ ID NO: 19) and 5'-TCGAGGATC pCGN8643, respectively. CGCGGCCGCAAGCTTCCTGCAGG-3 (SEQID NO:20) The plasmid pCON8644 is constructed by ligating oligo into SalI/XhoI-digested pCGN7770. 
A fragment containing nucleotides 5'-GATCACCTGCAGGAAGCTTGCGGC the napin promoter, polylinker and napin 3' region is 60 CGCGGATCCAATGCA-3' (SEQID NO: 25) and 5' TTG removed from pCGN8619 by digestion with Asp718I; the GATCCGCGGCCGCAAGCTTCCTGCAGGT-3' (SEQ ID fragment is blunt-ended by filling in the 5' overhangs with NO: 26) into BamHI-PstI digested pCGN8640. Klenow fragment then ligated into pCGN5139 that is Synthetic oligonucleotides are designed for use in Poly digested with Asp7181 and HindIII and blunt-ended by merase Chain Reactions (PCR) to amplify the coding filling in the 5' overhangs with Klenow fragment. A plasmid 65 sequences of each of the nucleic acids that encode the containing the insert oriented so that the napin promoter is polypeptides of SEQID NOs: 1-7, 9–11, 43–44, 57–58, and closest to the blunted Asp7181 site of pCGN5139 and the 90 for the preparation of expression constructs. US 7,112,717 B2 55 56 The coding sequences of each of the nucleic acids that EXAMPLE 4 encode the polypeptides of SEQID NOs: 1-7, 9–11, 43–44, 57–58, and 90 are all amplified and cloned into the TopoTA Identification of Additional Homogentisate Prenyl vector (Invitrogen). Constructs containing the respective Transferase homogentisate prenyl transferase sequences are digested with NotI and Sse&3871 and cloned into the turbobinary In order to identify additional homogentisate prenyl trans vectors described above. ferase, motifs identified through sequence homology are used to search a database of cDNA sequences containing full Synthetic oligonucleotides were designed for use in Poly insert sequences. The cDNA database is first translated in all merase Chain Reactions (PCR) to amplify SEQ ID NO: 33 10 six frames and then a HMM search is done using a HMM for the preparation of expression constructs and are provided model built for the motifs. All HMM hits are annotated by in the table below: performing a blast search against a non-redundant amino acid database. 
All motifs are sensitive and identify homogentisate prenyl transferase sequences present in the 15 database. Novel homogentisate prenyl transferase sequences Restriction Site Sequence SEQ ID NO : are thereby discovered. 5' Not GGATCCGCGGCCGCACAATGG 37 AGTCTCTGCTCTCTAGTTCT EXAMPLE 5 3' SseI GGATCCTGCAGGTCACTTCAAA 38 AAAGGTAACAGCAAGT Transgenic Plant Analysis Arabidopsis plants transformed with constructs for the SEQ ID NO:33 was amplified using the respective PCR sense or antisense expression of the homogentisate prenyl primers shown in the table above and cloned into the transferase proteins are analyzed by High Performance TopoTA vector (Invitrogen). Constructs containing the 25 Liquid Chromatography (HPLC) for altered levels of total respective homogentisate prenyl transferase sequences were tocopherols and tocotrienols, as well as altered levels of digested with NotI and Sse33871 and cloned into the tur specific tocopherols and tocotrienols (e.g. C, B, Y, and bobinary vectors described above. 8-tocopherol/tocotrienol). SEQ ID NO: 33 was cloned in the sense orientation into Extracts of leaves and seeds are prepared for HPLC as pCGN8640 to produce the plant transformation construct 30 follows. For seed extracts, 1-mg of seed is added to 1 g of MICROBEADS (glass beads) (Biospec) in a sterile pCGN10800 (FIG. 4). SEQ ID NO:33 is under control of microfuge tube to which 500 ul 1% pyrogallol (Sigma the enhanced 35S promoter. Chem)/ethanol is added. The mixture is shaken for 3 minutes SEQ ID NO: 33 was also cloned in the antisense orien in a mini Beadbeater (Biospec) on “fast” speed. The extract tation into the construct pCGN8641 to create pCGN10801 35 is filtered through a 0.2 um filter into an autosampler tube. (FIG. 5). This construct provides for the antisense expres The filtered extracts are then used in HPLC analysis sion of SEQ ID NO: 33 from the napin promoter. described below. 
SEQID NO: 33 was also cloned in the sense orientation Leaf extracts are prepared by mixing 30–50 mg of leaf into the vector pCGN8643 to create the plant transformation tissue with 1 g microbeads and freezing in liquid nitrogen construct pCON10822 (FIG. 7). This construct provides for 40 until extraction. For extraction, 500 ul 1% pyrogallol in the sense expression of SEQ ID NO: 33 from the napin ethanol is added to the leafbead mixture and shaken for 1 promoter. minute on a Beadbeater (Biospec) on “fast” speed. The resulting mixture is centrifuged for 4 minutes at 14,000 rpm SEQ ID NO: 33 was also cloned in the antisense orien tation into the vector pCGN8644 to create the plant trans and filtered as described above prior to HPLC analysis. 45 HPLC is performed on a ZORBAX (material used for formation construct pCGN10803 (FIG. 6). This construct chromatography) silica HPLC column (4.6 mmx250 mm), provides for the antisense expression of SEQ ID NO: 33 using a fluorescent detection monitor, with excitation and from the enhanced 35S promoter. emission spectra set at 290 nm and 336 nm, respectively. Solvent A is hexane and solvent B is methyl-t-butyl ether. EXAMPLE 3 50 The injection volume is 20 ul, the flow rate is 1.5 ml/min, the run time is 12 mm (40°C.) using the table below: Plant Transformation Transgenic Brassica plants are obtained by Agrobacte Time Solvent A Solvent B rium-mediated transformation as described by Radke et al., 55 Theor: Appl. Genet., 75:685–694 (1988); Plant Cell Reports, O min. 90% 10% 11:499-505 (1992). Transgenic Arabidopsis thaliana plants 10 min. 90% 10% may be obtained by Agrobacterium-mediated transforma 11 min. 25% 75% tion as described by Valverkens et al., Proc. Nat. Acad. Sci., 12 min. 90% 10% 85:5536-5540 (1988), or as described by Bent et al., Sci 60 ence, 265:1856–1860 (1994), or Bechtold et al., C.R. Acad. Tocopherol standards in 1% pyrogallol/ethanol are also Sci. Life Sciences, 316:1194-1199 (1993). 
Other plant spe run for comparison (alpha tocopherol, gamma tocopherol, cies may be similarly transformed using related techniques. beta tocopherol, delta tocopherol, and tocopherol (tocol) (all Alternatively, microprojectile bombardment methods, from Matreya, State College, Pa., or Calbiochem, La Jolla, such as described by Klein et al., Bio/Technology, 65 Calif.)). 10:286-291 may also be used to obtain nuclear transformed Standard curves for alpha, beta, delta, and gamma toco plants. pherol are calculated using CHEMSTATION (software for US 7,112,717 B2 57 58 analyzing chromatography data) software. The absolute binary vector downstream of the tyr A, expression cassette amount of component X is: Absolute amount of resulting in the formation of pMON69924 (FIG. 12). x=Responses xRFixsilution factor where Response, is the A fourth plasmid was constructed by cloning the Arcelin area of peak X, RF, is the response factor for component X (Amount/Response) and the dilution factor is 500 ul. The 5-expression cassette for slr1736 (SEQ ID NO: 29), down ng/mg tissue is found by: total ng component/mg plant stream of the HPPD, and the tyra, expression cassettes, tissue. resulting in the formation of pMON69943 (FIG. 13). Results of the HPLC analysis of seed extracts of trans Each of these binary constructs was transformed into genic Arabidopsis lines containing pMON10822 for the Soybean. R1 seed pools from plants harboring these con expression of SEQ ID NO: 33 from the napin promoter are 10 structs were analyzed for tocopherol content and composi provided in FIG. 8. tion. For constructs pMON36581 and pMON69933, the HPLC analysis results of Arabidopsis seed tissue express seed for analysis were chosen at random. Seed from plants ing the SEQID NO: 33 sequence from the napin promoter transformed with pMON69924 and pMON69943 showed a (pMON10822) demonstrates an increased level of toco segregating dark phenotype. This phenotype has been asso pherols in the seed. 
Total tocopherol levels are increased as 15 ciated with the presence of increased levels of homogentisic much as 50 to 60% over the total tocopherol levels of acid as a result of the expression of trans genes HPPD and non-transformed (wild-type) Arabidopsis plants (FIG. 8). tyra. Seed without dark coloration did have wild-type Results of the HPLC analysis of seed extracts of trans tocopherol levels and were not transgenic. For this reason genic Arabidopsis lines 1387–1624 containing pMON10803 colored seed were chosen for analysis of plants transformed for the antisense expression of SEQ ID NO: 33 from the with pMON69924 or pMON69943. For the impact of the enhanced 35S promoter are provided in FIG. 9. Two lines, HPT expression on total tocopherol accumulation in a single 1393 and 1401, show a substantial reduction in overall gene vector, or in a multi gene vector, seed from non tocopherol levels, supporting the position that HPT is a transformed soy, or seed transformed with plMON69924 homogentisate prenyl transferase involved in the synthesis served as controls, respectively. FIG. 14 Summarizes the of tocopherol. 25 tocopherol data obtained from these experiments. While Results of the HPLC analysis of seed extracts of trans expression of ATPT2 or slr1736 increased total tocopherol genic Arabidopsis lines containing constructs for the expres and tocotrienol levels in soy moderately, the impact of HPT sion of SEQ ID NOs: 5, 9–11, 43–44, 57–58, and 90 are expression in the context of a multi gene vector was much obtained. 30 more pronounced. FIG. 14 demonstrates a significantly Results of the HPLC analysis of seed extracts of trans increased level of tocopherol and tocotrienol accumulation genic Arabidopsis lines containing constructsfor the expres for pMON69943 compared to pMON69924 lines. These sion of SEQ ID NOs: 5, 9–11, 43–44, 57–58, and 90 from data suggest that combination of an HPT with tyra, and the enhanced 35S promoter are obtained. 
HPPD can substantially enhance tocopherol biosynthesis in 35 Soy. EXAMPLE 6 Western analysis is carried out to detect the transgene expression in tissues harboring the gene of interest (GOI) Expression of a Homogentisate Prenyl Transferase expression cassette using the GOI protein specific antibody. as Single Gene, and in Combination with HPPD Northern analysis is done for detecting the mRNA level of and tyra in Soy 40 the transgene using the GOI sequence specific radiolabelled probe. The Arabidopsis homogentisate prenyl transferase (ATPT2) (SEQID NO:33) was cloned in a soy binary vector EXAMPLE 7 harboring an Arcelin 5 expression cassette. This expression cassette consisted of an Arcelin 5-promoter, a multi cloning 45 Identification of Additional Homogentisate Prenyl site, and the Arcelin 53'-untranslated sequence in the order Transferase Sequences as described. Vector construction for this construct and the following constructs was performed using standard cloning In an analysis of the non-redundant amino acid database, techniques well established in the art and described in lab Motifs II and IV (SEQ ID NOs: 40 and 42 identified in manuals such as (Sambrook et al. 2001). The resulting 50 addition to HPT sequences, two genomic variants of Ara binary vector for soy seed-specific expression of ATPT2 was bidopsis thaliana sequence related to HPTs (SEQ ID NOs: designated pMON36581 (FIG. 10). Similarly the Syn 61-62). These sequences are based on insillico prediction echocystis homogentisate prenyl transferase (slr1736) (SEQ from genomic sequence by gene prediction algorithms. ID NO: 29) was fused to a chloroplast target peptide (CTP1), Further bioinformatic analysis showed that these sequences and cloned into the Arcelin 5 soy seed-specific expression 55 encoded an additional homogentisate prenyl transferase cassette. The resulting binary plasmid was designated related to HPT. Both sequences (SEQID NOs: 61–62) were pMON69933 (FIG. 11). 
An additional binary plasmid for seed-specific co-expression of the Arabidopsis p-hydroxyphenylpyruvate dioxygenase (HPPD) and the bifunctional prephenate dehydrogenase from Erwinia herbicola (tyrA) (see WO 02/089561) was constructed by fusing the HPPD-gene and the tyrA-gene to the chloroplast target peptides, CTP2, and CTP1, respectively. These fusion genes were subsequently cloned into the multi cloning site of soy seed-specific expression cassettes consisting of the p7Sα'-promoter, a multi cloning site, and the E9 3'-untranslated region. The HPPD expression cassette was cloned into a

used to search the non-redundant amino acid database. The BLAST search results indicated that these sequences are most closely related to HPT sequences from cyanobacteria (SEQ ID NOs: 1–3) and Arabidopsis (SEQ ID NO: 7).

Alignment of gi15229898 (970 aa) (SEQ ID NO: 61) and gi10998133 (441 aa) (SEQ ID NO: 62) showed that:
a) the C terminal half of gi15229898 (SEQ ID NO: 61) overlaps with gi10998133 (SEQ ID NO: 62);
b) the last 40–50 aa in the C terminal portions of these two proteins do not align; and
c) the N terminal of gi15229898 also does not align with HPTs (SEQ ID NOs: 1–7, and 9–11).
These findings indicate a discrepancy in the coding sequence prediction reported in Genbank.

In order to verify the predicted sequence, the BAC sequence of the Arabidopsis genome corresponding to the region was downloaded from Genbank (gi|12408742|gb|AC016795.6|ATAC016795, 100835 bp). Coding sequences were predicted from this BAC clone using the FGENESH gene prediction program (Solovyev V. V. (2001) Statistical approaches in Eukaryotic gene prediction, in Handbook of Statistical Genetics (eds. Balding D. et al.), John Wiley & Sons, Ltd., p. 83-127). FGENESH predicted 28 proteins from this BAC clone. To identify new homogentisate prenyl transferase proteins among the 28 FGENESH predicted proteins, all 28 predicted proteins were searched by BLAST against the non-redundant amino acid database. FGENESH predicted protein No. 25 (402 aa) (SEQ ID NO: 45) was most similar to gi10998133 (441 aa) (SEQ ID NO: 62), the C terminal half of gi15229898 (970 aa) (SEQ ID NO: 61), and HPTs (SEQ ID NOs: 1–7, and 9–11).

To provide functional and transcriptional evidence and to confirm the coding sequence for this gene, a plant EST sequence database comprising proprietary and public sequences was searched. We found several ESTs (SEQ ID NOs: 63–72) which match the N terminal and C terminal portions of this gene. The new gene was named HPT2 (SEQ ID NO: 59) from Arabidopsis. The HPT2 (SEQ ID NO: 57) sequence is quite distinct from HPT1 (SEQ ID NO: 7).

HPT2 (SEQ ID NO: 57) from Arabidopsis is also known as Tocopherol Synthase (TS). Present data suggest that the overexpression of TS leads to a similar increase in the amount of overall tocopherol, over the wild type, as with HPT1 (SEQ ID NO: 33). However, the enzymes may have different biochemical characteristics because the overexpression of TS results in less production of the delta tocopherol than the overexpression of HPT1 (SEQ ID NO: 33).

The presence of a chloroplast transit peptide in the HPT2 Arabidopsis sequence (SEQ ID NOs: 45 and 57) was verified using the ChloroP program (Olof Emanuelsson, Henrik Nielsen, and Gunnar von Heijne, ChloroP, a neural network-based method for predicting chloroplast transit peptides and their cleavage sites, Protein Science 8:978-984, 1999).

In addition to SEQ ID NOs: 1, 7, and 9–11 (HPT), SEQ ID NOs: 57–58, and 90 (HPT2) were added to the alignment (see FIGS. 24–25), and the resulting motifs were analyzed. Motifs V (SEQ ID NO: 46), VII (SEQ ID NO: 48), and VIII (SEQ ID NO: 49) are specific to HPT and HPT2 sequences. An HMM search of the non-redundant amino acid database using these motifs identified only cyanobacterial (SEQ ID NOs: 1–3, and 43), photobacterial (SEQ ID NO: 44), and plant HPTs (SEQ ID NOs: 7, and 61–62). Motif VII (SEQ ID NO: 48) identified distantly related ubiA prenyl transferase from bacteria in addition to homogentisate prenyl transferase. However, the sensitivity of Motif VII to homogentisate prenyl transferase was higher. Homogentisate prenyl transferases had E values lower by several orders of magnitude and higher alignment scores (higher than 30). HPT2 sequences are distinct from HPT and cyanobacterial HPTs as demonstrated by the sequence dendrogram in FIG. 26.

SEQ ID NOs: 43–44 were added to an alignment with SEQ ID NOs: 1–4, 6–7, 9–11, 57–58, and 91 (see FIGS. 33–34), and the resulting motifs (SEQ ID NOs: 92–95, Motifs IX–XII) were analyzed. Specificity of these motifs to homogentisate prenyl transferases was confirmed by HMM search. A non-redundant database containing more than 1.34M sequences was searched using HMM models built from the alignments shown in FIG. 34 for Motifs IX–XII. The E value limit for the search was set at 1.0. All four motifs identified only homogentisate prenyl transferases from cyanobacteria, photobacteria and Arabidopsis. Upper E value limits for Motifs IX, X, XI, and XII were 0.9, 11E10', 0.03, 8E10 respectively. The small size of the motifs resulted in higher E values for Motifs IX and XI.

EXAMPLE 8

Transformation and Expression of a Wild Type Arabidopsis HPT2 Gene in Sense and Antisense Orientations with Respect to Seed-specific and Constitutive Promoters in Arabidopsis thaliana

The HPT2 full-length cDNA (SEQ ID NO: 59) is excised from an EST clone, CPR23005 (pMON69960, FIG. 15), with SalI and NotI enzymes, blunt-ended and cloned in between the napin promoter and napin 3' end at the blunt-ended SalI site in sense and antisense orientations with respect to the napin promoter in pMON36525 (FIG. 16) to generate recombinant binary vectors pMON69963 (FIG. 17) and pMON69965 (FIG. 18), respectively. The sequence of the HPT2 cDNA is confirmed by sequencing with napin 5'-sense (5'-GTGGCTCGGCTTCACTTTTTAC-3') (SEQ ID NO: 50) and napin 3'-antisense (5'-CCACACTCATATCACCGTGG-3') (SEQ ID NO: 51) primers using standard sequencing methodology. The HPT2 cDNA used to generate pMON69963 and pMON69965 is also cloned in between the enhanced 35S promoter and E9 3' end at the blunt-ended BglII and BamHI sites of pMON10098 (FIG. 19) to generate pMON69964 (FIG. 20) and pMON69966 (FIG. 21) in sense and antisense orientations with respect to the enhanced 35S promoter, respectively. Additional HPT2 internal primers synthesized to completely sequence the whole HPT2 cDNA are listed in the table below:

A list of primers used for confirming the HPT2 cDNA sequence.

Primer    Description               Sequence

BXK 169   HPT2/CPR23005/sense       5'-CAGTGCTGGATAGAATTGCCCGGTTCC-3' (SEQ ID NO: 52)
BXK 170   HPT2/CPR23005/sense       5'-GAGATCTATCAGTGCAGTCTGCTTGG-3' (SEQ ID NO: 53)
BXK 171   HPT2/CPR23005/antisense   5'-GGGACAAGCATTTTTATTGCAAG-3' (SEQ ID NO: 54)
BXK 172   HPT2/CPR23005/antisense   5'-GCCAAGATCACATGTGCAGGAATC-3' (SEQ ID NO: 55)
BXK 173   HPT2/CPR23005/sense       (SEQ ID NO: 56)
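As a quick sanity check on the sequencing primers listed in the table above, the following minimal sketch computes the length, GC fraction and reverse complement of each primer. It is an illustration only and not part of the original disclosure: the names `PRIMERS`, `reverse_complement` and `gc_fraction` are hypothetical, the sequences are copied from the table, and BXK 173 is omitted because its sequence is not given there.

```python
# Sequencing primers copied from the table above (written 5'->3').
# BXK 173 is left out because the table does not give its sequence.
PRIMERS = {
    "BXK169": "CAGTGCTGGATAGAATTGCCCGGTTCC",
    "BXK170": "GAGATCTATCAGTGCAGTCTGCTTGG",
    "BXK171": "GGGACAAGCATTTTTATTGCAAG",
    "BXK172": "GCCAAGATCACATGTGCAGGAATC",
}

# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(
            f"{name}: {len(seq)} nt, GC = {gc_fraction(seq):.2f}, "
            f"reverse complement = {reverse_complement(seq)}"
        )
```

The reverse complement of an antisense primer gives the corresponding stretch of the sense strand, which can be convenient when locating the primer on the cDNA.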

The plant binary vectors pMON69963 and pMON69965 are used in Arabidopsis thaliana plant transformation to direct the sense and antisense expression of HPT2 in the embryo. The binary vectors pMON69964 and pMON69966 are used in Arabidopsis thaliana plant transformation for sense and antisense expression of HPT2 in the whole plant. The binary vectors are transformed into ABI strain Agrobacterium cells by electroporation (Bio-Rad Electroprotocol Manual, Dower et al., Nucleic Acids Res., 16:6127-6145 (1988)). Transgenic Arabidopsis thaliana plants are obtained by Agrobacterium-mediated transformation as described by Valvekens et al., Proc. Nat. Acad. Sci., 85:5536-5540 (1988), Bent et al., Science, 265:1856–1860 (1994), and Bechtold et al., C.R. Acad. Sci., Life Sciences, 316:1194-1199 (1993). Transgenic plants are selected by sprinkling the transformed T seeds onto selection plates containing MS basal salts (4.3 g/L), Gamborg's B-5, 500x (2.0 g/L), sucrose (10 g/L), MES (0.5 g/L), phytagar (8 g/L), carbenicillin (250 mg/L), cefotaxime (100 mg/L), plant preservation medium (2 ml/L), and kanamycin (60 mg/L), and then vernalizing them at 4° C. in the absence of light for 2–4 days. The seeds are transferred to 23° C. and a 16/8 hour light/dark cycle for 5–10 days until seedlings emerge. Once one set of true leaves is formed on the kanamycin resistant seedlings, they are transferred to soil and grown to maturity. The transgenic lines generated through kanamycin selection are grown under two different light conditions. One set of the transgenic lines is grown under 16 hrs light and 8 hrs dark, and another set of the transgenic lines is grown under 24 hrs light, to study the effect of light on seed tocopherol levels. The T seed harvested from the transformants is analyzed for tocopherol content. The results from the seed total tocopherol analysis from lines grown under both normal and high light conditions are presented in FIGS. 22 and 23. Seed-specific overexpression of HPT2 under normal and high light conditions produced a significant 1.6- and 1.5-fold increase in total tocopherol levels, respectively (alpha=0.05; Tukey-Kramer HSD) (SAS Institute, 2002, JMP version 5.0).

Expression of HPT2 using the constitutive promoter, e35S, produced about a 20% increase in seed total tocopherol levels as compared to control under both light conditions. Maximum tocopherol level reduction in lines harboring the enhanced 35S::HPT2 antisense construct was 20%. Overall, the significant increase in seed total tocopherol level in the Arabidopsis thaliana lines harboring HPT2 driven by the napin promoter suggests that HPT2 plays a key role in tocopherol biosynthesis.

Western analysis is carried out to detect the transgene expression in tissues harboring the gene of interest (GOI) expression cassette using the GOI protein specific antibody. Northern analysis is done for detecting the mRNA level of the transgene using the GOI sequence specific radiolabelled probe.

EXAMPLE 9

Preparation of Plant Binary Vector for Expression of HPT2 from Arabidopsis in Combination with Tocopherol Pathway Genes

To investigate the combinatorial effect of HPT2 with other key enzymes in the pathway, a plant binary vector containing seed-specifically expressed hydroxyphenylpyruvate dioxygenase (HPPD), bifunctional prephenate dehydrogenase tyrA, and HPT2 (pMON81028, FIG. 27) is prepared. The pMON81028 is made by excising the pNapin::HPT2::Napin 3' expression cassette from pMON81023 (FIG. 28) with Bsp120I and NotI enzymes and ligating it to pMON36596 (FIG. 29) at the NotI site. The pMON36596 contains the pNapin::CTP2::HPPD::Napin 3' and pNapin::CTP1::TyrA::Napin 3' expression cassettes. The pMON81028 is transformed into Arabidopsis thaliana plants using the method described in Example 8.

EXAMPLE 10

Preparation of Construct for Bacterial Expression of HPT2 from Arabidopsis

The EST clone CPR23005 containing the HPT2 full length cDNA is used as a template for PCR to amplify the HPT2 cDNA fragment that codes for the mature form of the HPT2 protein. Two sets of PCR products are generated to clone into the pET30a(+) vector (Novagen, Inc.) (FIG. 30) to produce HPT2 protein with and without a his tag. The primer set BXK174 (5'-CACATATGGCATGTTCTCAGGTTGGTGCTGC-3') (SEQ ID NO: 84) and BXK176 (5'-GCGTCGACCTAGAGGAAGGGGAATAACAG-3') (SEQ ID NO: 85) is used for cloning HPT2 at the NdeI and SalI sites of pET30a(+), behind the T7 promoter, to generate mature HPT2 protein without the his tag. The resulting recombinant vector is named pMON69993 (FIG. 31). The primer set BXK175 (5'-CAACCATGGCATGTTCTCAGGTTGGTGCTGC-3') (SEQ ID NO: 86) and BXK176 (5'-GCGTCGACCTAGAGGAAGGGGAATAACAG-3') (SEQ ID NO: 87) is used to generate the HPT2 PCR product to clone at the NcoI and SalI sites of pET30a(+) to produce mature HPT2 with the his tag. The recombinant vector is named pMON69992 (FIG. 32). The pMON69993 and pMON69992 are used for producing bacterially expressed HPT2 to carry out enzyme assays to confirm its homogentisate prenyl transferase activity and specificity towards geranylgeranyl pyrophosphate, phytyl pyrophosphate and solanyl pyrophosphate substrates.
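The cloning strategy in Example 10 relies on restriction sites carried by the PCR primers. The following minimal sketch checks that each primer contains the recognition sequence of the enzyme named for it. It is an illustration under stated assumptions rather than part of the disclosed method: `SITES`, `PRIMERS` and `find_sites` are hypothetical names, the primer sequences are joined from the text of Example 10, and the recognition sequences used are the standard ones for NdeI (CATATG), NcoI (CCATGG) and SalI (GTCGAC).

```python
# Standard recognition sequences of the enzymes named in Example 10.
SITES = {
    "NdeI": "CATATG",
    "NcoI": "CCATGG",
    "SalI": "GTCGAC",
}

# PCR primers from Example 10 (written 5'->3').
PRIMERS = {
    "BXK174": "CACATATGGCATGTTCTCAGGTTGGTGCTGC",
    "BXK175": "CAACCATGGCATGTTCTCAGGTTGGTGCTGC",
    "BXK176": "GCGTCGACCTAGAGGAAGGGGAATAACAG",
}


def find_sites(primer: str) -> dict:
    """Map each enzyme whose site occurs in the primer to the 0-based
    index of the first occurrence of that site."""
    primer = primer.upper()
    return {
        enzyme: primer.find(site)
        for enzyme, site in SITES.items()
        if site in primer
    }


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, find_sites(seq))
```

In this sketch the forward primers BXK174 and BXK175 differ only in the 5' site (NdeI versus NcoI), while the shared reverse primer BXK176 carries the SalI site, which matches the site pairs described in the text.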

SEQUENCE LISTING

<160> NUMBER OF SEQ ID NOS: 95

<210> SEQ ID NO 1
<211> LENGTH: 322
<212> TYPE: PRT
<213> ORGANISM: Nostoc punctiforme

<400> SEQUENCE: 1

Met Ser Gln Ser Ser Gln Asn Ser Pro Leu Pro Arg Lys Pro Val Gln
1 5 10 15
Ser Tyr Phe His Trp Leu Tyr Ala Phe Trp Lys Phe Ser Arg Pro His
20 25 30
Thr Ile Ile Gly Thr Ser Leu Ser Val Leu Ser Leu Tyr Leu Ile Ala
35 40 45
Ile Ala Ile Ser Asn Asn Thr Ala Ser Leu Phe Thr Thr Pro Gly Ser
50 55 60
Leu Ser Pro Leu Phe Gly Ala Trp Ile Ala Cys Leu Cys Gly Asn Val
65 70 75 80
Tyr Ile Val Gly Leu Asn Gln Leu Glu Asp Val Asp Ile Asp Lys Ile
85 90 95
Asn Lys Pro His Leu Pro Leu Ala Ser Gly Glu Phe Ser Gln Gln Thr
100 105 110
Gly Gln Leu Ile Val Ala Ser Thr Gly Ile Leu Ala Leu Val Met Ala
115 120 125
Trp Leu Thr Gly Pro Phe Leu Phe Gly Met Val Thr Ile Ser Leu Ala
130 135 140
Ile Gly Thr Ala Tyr Ser Leu Pro Pro Ile Arg Leu Lys Gln Phe Pro
145 150 155 160
Phe Trp Ala Ala Leu Cys Ile Phe Ser Val Arg Gly Thr Ile Val Asn
165 170 175
Leu Gly Leu Tyr Leu His Tyr Ser Trp Ala Leu Lys Gln Ser Gln Thr
180 185 190
Ile Pro Pro Val Val Trp Val Leu Thr Leu Phe Ile Leu Val Phe Thr
195 200 205
Phe Ala Ile Ala Ile Phe Lys Asp Ile Pro Asp Ile Glu Gly Asp Arg
210 215 220
Leu Tyr Asn Ile Thr Thr Phe Thr Ile Lys Leu Gly Ser Gln Ala Val
225 230 235 240
Phe Asn Leu Ala Leu Trp Val Ile Thr Val Cys Tyr Leu Gly Ile Ile
245 250 255
Leu Val Gly Val Leu Arg Ile Ala Ser Val Asn Pro Ile Phe Leu Ile
260 265 270
Thr Ala His Leu Ala Leu Leu Val Trp Met Trp Trp Arg Ser Leu Ala
275 280 285
Val Asp Leu Gln Asp Lys Ser Ala Ile Ala Gln Phe Tyr Gln Phe Ile
290 295 300
Trp Lys Leu Phe Phe Ile Glu Tyr Leu Ile Phe Pro Ile Ala Cys Phe
305 310 315 320
Leu Ala

<210> SEQ ID NO 2
<211> LENGTH: 318
<212> TYPE: PRT
<213> ORGANISM: Anabaena sp.

<400> SEQUENCE: 2

Met Asn Glin Ser Ser Glin Pro Telu Arg Pro Pro Telu Glin 1 10 15

Ser Ser Phe Glin Trp Teu Tyr Ala Phe Trp Phe Ser Arg Pro His 25 30

Thr Ile Ile Gly Thr Ser Teu Ser Wall Telu Gly Teu Tyr Telu Ile Ser 35 40 45

Ile Ala Wall Ser Ser Thr Gly Phe Ala Telu Thr Glin Ile Asn Ser Wall 50 55 60

Teu Gly Ala Trp Teu Ala Cys Telu Cys Gly Asn Wall Ile Wall Gly 65 70 75

Teu Asn Glin Telu Glu Asp Ile Glu Ile Asp Wall Asn Pro His 85 90 95

Teu Pro Telu Ala Ser Gly Glu Phe Ser Arg Glin Gly Arg Ile Ile 100 105 110

Wall Ile Telu Thr Gly Ile Thr Ala Ile Wall Teu Ala Trp Telu Asn Gly 115 120 125

Pro Tyr Telu Phe Gly Met Wall Ala Wall Ser Teu Ala Ile Gly Thr Ala 130 135 1 4 0

Tyr Ser Telu Pro Pro Ile Arg Telu Lys Glin Phe Pro Phe Trp Ala Ala 145 15 O 155 160

Teu Cys Ile Phe Ser Wall Arg Gly Thr Ile Wall Asn Teu Gly Telu 1.65 170 175

Teu His Phe Ser Trp Teu Teu Glin Asn Glin Ser Ile Pro Telu Pro 18O 185 19 O

Wall Trp Ile Telu Thr Wall Phe Ile Telu Ile Phe Thr Phe Ala Ile Ala 195 200

Ile Phe Asp Ile Pro Asp Met Glu Gly Asp Arg Teu Asn Ile 210 215 220

Thr Thr Telu Thr Ile Glin Teu Gly Pro Glin Ala Wall Phe Asn Telu Ala 225 230 235 240

Met Trp Wall Telu Thr Wall Cys Tyr Telu Gly Met Wall Ile Ile Gly Wall 245 250 255

Teu Telu Gly Thr Ile Asn Ser Wall Phe Teu Wall Wall Thr His Telu 260 265 27 O

Wall Ile Telu Cys Trp Met Trp Met Glin Ser Teu Ala Wall Asp Ile His 275 280 285

Asp Lys Thr Ala Ile Ala Glin Phe Tyr Glin Phe Ile Trp Telu Phe 29 O 295

Phe Telu Glu Tyr Teu Met Phe Pro Ile Ala Cys Teu Teu Ala 305 310 315

<210> SEQ ID NO 3
<211> LENGTH: 308
<212> TYPE: PRT
<213> ORGANISM: Synechocystis sp.

<400> SEQUENCE: 3

Met Ala Thr Ile Gln Ala Phe Trp Arg Phe Ser Arg Pro His Thr Ile
1 5 10 15
Ile Gly Thr Thr Leu Ser Val Trp Ala Val Tyr Leu Leu Thr Ile Leu
20 25 30
Gly Asp Gly Asn Ser Val Asn Ser Pro Ala Ser Leu Asp Leu Val Phe
35 40 45
Gly Ala Trp Leu Ala Cys Leu Leu Gly Asn Val Tyr Ile Val Gly Leu
50 55 60
Asn Gln Leu Trp Asp Val Asp Ile Asp Arg Ile Asn Lys Pro Asn Leu
65 70 75 80
Pro Leu Ala Asn Gly Asp Phe Ser Ile Ala Gln Gly Arg Trp Ile Val
85 90 95
Gly Leu Cys Gly Val Ala Ser Leu Ala Ile Ala Trp Gly Leu Gly Leu
100 105 110
Trp Leu Gly Leu Thr Val Gly Ile Ser Leu Ile Ile Gly Thr Ala Tyr
115 120 125
Ser Val Pro Pro Val Arg Leu Lys Arg Phe Ser Leu Leu Ala Ala Leu
130 135 140
Cys Ile Leu Thr Val Arg Gly Ile Val Val Asn Leu Gly Leu Phe Leu
145 150 155 160
Phe Phe Arg Ile Gly Leu Gly Tyr Pro Pro Thr Leu Ile Thr Pro Ile
165 170 175
Trp Val Leu Thr Leu Phe Ile Leu Val Phe Thr Val Ala Ile Ala Ile
180 185 190
Phe Lys Asp Val Pro Asp Met Glu Gly Asp Arg Gln Phe Lys Ile Gln
195 200 205
Thr Leu Thr Leu Gln Ile Gly Lys Gln Asn Val Phe Arg Gly Thr Leu
210 215 220
Ile Leu Leu Thr Gly Cys Tyr Leu Ala Met Ala Ile Trp Gly Leu Trp
225 230 235 240
Ala Ala Met Pro Leu Asn Thr Ala Phe Leu Ile Val Ser His Leu Cys
245 250 255
Leu Leu Ala Leu Leu Trp Trp Arg Ser Arg Asp Val His Leu Glu Ser
260 265 270
Lys Thr Glu Ile Ala Ser Phe Tyr Gln Phe Ile Trp Lys Leu Phe Phe
275 280 285
Leu Glu Tyr Leu Leu Tyr Pro Leu Ala Leu Trp Leu Pro Asn Phe Ser
290 295 300
Asn Thr Ile Phe
305

<210> SEQ ID NO 4
<211> LENGTH: 399
<212> TYPE: PRT
<213> ORGANISM: Zea mays

<400> SEQUENCE: 4

Met Asp Ala Leu Arg Leu Arg Pro Ser Leu Leu Pro Val Arg Pro Gly
1 5 10 15
Ala Ala Arg Pro Arg Asp His Phe Leu Pro Pro Cys Cys Ser Ile Gln
20 25 30
Arg Asn Gly Glu Gly Arg Ile Cys Phe Ser Ser Gln Arg Thr Gln Gly
35 40 45
Pro Thr Leu His His His Gln Lys Phe Phe Glu Trp Lys Ser Ser Tyr
50 55 60
Cys Arg Ile Ser His Arg Ser Leu Asn Thr Ser Val Asn Ala Ser Gly
65 70 75 80
Gln Gln Leu Gln Ser Glu Pro Glu Thr His Asp Ser Thr Thr Ile Trp
85 90 95
Arg Ala Ile Ser Ser Ser Leu Asp Ala Phe Tyr Arg Phe Ser Arg Pro
100 105 110

His Thr Wall Ile Gly Thr Ala Leu Ser Ile Wall Ser Wall Ser Telu Telu 115 120 125

Ala Wall Glin Ser Teu Ser Asp Ile Ser Pro Teu Phe Teu Thr Gly Telu 130 135 1 4 0

Teu Glu Ala Wal Wall Ala Ala Leu Phe Met Asn Ile Ile Wall Gly 145 15 O 155 160

Teu Asn Glin Leu Phe Ile Glu Ile Asp Wall Asn Pro Thr 1.65 170 175

Teu Pro Telu Ala Ser Gly Glu Tyr Thr Telu Ala Thr Gly Wall Ala Ile 18O 185 19 O

Wall Ser Wall Phe Ala Ala Met Ser Phe Gly Teu Gly Trp Ala Wall Gly 195 200

Ser Glin Pro Leu Phe Trp Ala Leu Phe Ile Ser Phe Wall Telu Gly Thr 210 215 220

Ala Tyr Ser Ile Asn Teu Pro Tyr Telu Trp Arg Phe Ala Wall 225 230 235 240

Wall Ala Ala Lieu. Cys Ile Leu Ala Wall Arg Ala Wall Ile Wall Glin Telu 245 250 255

Ala Phe Phe Lieu. His Ile Glin Thr Phe Wall Phe Arg Arg Pro Ala Wall 260 265 27 O

Phe Ser Arg Pro Leu Teu Phe Ala Thr Gly Phe Met Thr Phe Phe Ser 275 280 285

Wall Wall Ile Ala Leu Phe Lys Asp Ile Pro Asp Ile Glu Gly Asp Arg 29 O 295

Ile Phe Gly Ile Arg Ser Phe Ser Wall Arg Teu Gly Glin Wall 305 310 315 320

Phe Trp Ile Cys Val Gly Teu Telu Glu Met Ala Ser Wall Ala Ile 325 330 335

Teu Met Gly Ala Thr Ser Ser Cys Telu Trp Ser Thr Ala Thr Ile 340 345 35 O

Ala Gly His Ser Ile Teu Ala Ala Ile Telu Trp Ser Cys Ala Arg Ser 355 360 365

Wall Asp Telu Thir Ser Lys Ala Ala Ile Thr Ser Phe Met Phe Ile 370 375 38O

Trp Lys Telu Phe Tyr Ala Glu Tyr Telu Telu Ile Pro Teu Wall Arg 385 390 395

<210> SEQ ID NO 5
<211> LENGTH: 411
<212> TYPE: PRT
<213> ORGANISM: Glycine max (ppt2)

<400> SEQUENCE: 5

Met Asp Ser Teu Lleu. Teu Arg Ser Phe Pro Asn Ile Asn Asn Ala Ser 1 5 10 15

Ser Telu Thr Th Thr Gly Ala Asn Phe Ser Arg Thr Ser Phe Ala 2O 25 30

Asn Ile Tyr His Ala Ser Ser Tyr Wall Pro Asn Ala Ser Trp His Asn 35 40 45

Arg Lys Ile Gln Lys Glu Tyr Asn Phe Telu Arg Phe Arg Trp Pro Ser 50 55 60

Teu Asn His His Tyr Lys Gly Ile Glu Gly Ala Thr 65 70 75 US 7,112,717 B2 71

-continued Cys Asn. Ile Llys Phe Val Val Lys Ala Thir Ser Glu Lys Ser Lieu Glu 85 90 95 Ser Glu Pro Glin Ala Phe Asp Pro Lys Ser Ile Leu Asp Ser Wall Lys 100 105 110 Asn Ser Leu Asp Ala Phe Tyr Arg Phe Ser Arg Pro His Thr Val Ile 115 120 125 Gly Thr Ala Leu Ser Ile Ile Ser Val Ser Lieu Lleu Ala Val Glu Lys 130 135 1 4 0 Ile Ser Asp Ile Ser Pro Leu Phe Phe Thr Gly Val Leu Glu Ala Val 145 15 O 155 160 Val Ala Ala Lieu Phe Met Asn. Ile Tyr Ile Val Gly Lieu. Asn Glin Lieu 1.65 170 175 Ser Asp Val Glu Ile Asp Lys Ile Asn Lys Pro Tyr Lieu Pro Leu Ala 18O 185 19 O Ser Gly Glu Tyr Ser Phe Glu Thr Gly Val Thr Ile Val Ala Ser Phe 195 200 2O5 Ser Ile Leu Ser Phe Trp Leu Gly Trp Val Val Gly Ser Trp Pro Leu 210 215 220 Phe Trp Ala Leu Phe Val Ser Phe Val Leu Gly Thr Ala Tyr Ser Ile 225 230 235 240 Asn Val Pro Leu Lleu Arg Trp Lys Arg Phe Ala Wall Leu Ala Ala Met 245 250 255 Cys Ile Leu Ala Val Arg Ala Val Ile Val Glin Leu Ala Phe Phe Lieu 260 265 27 O His Met Gln Thr His Val Tyr Lys Arg Pro Pro Val Phe Ser Arg Pro 275 280 285

Leu. Ile Phe Ala Thr Ala Phe Met Ser Phe Phe Ser Wal Wall Ile Ala 29 O 295 3OO Leu Phe Lys Asp Ile Pro Asp Ile Glu Gly Asp Llys Val Phe Gly Ile 305 310 315 320 Gln Ser Phe Ser Val Arg Leu Gly Gln Lys Pro Val Phe Trp Thr Cys 325 330 335 Val Thr Lieu Lieu Glu Ile Ala Tyr Gly Val Ala Lieu Lleu Val Gly Ala 340 345 35 O Ala Ser Pro Cys Lieu Trp Ser Lys Ile Phe Thr Gly Lieu Gly. His Ala 355 360 365 Val Lieu Ala Ser Ile Leu Trp Phe His Ala Lys Ser Val Asp Lieu Lys 370 375 38O Ser Lys Ala Ser Ile Thr Ser Phe Tyr Met Phe Ile Trp Llys Leu Phe 385 390 395 400 Tyr Ala Glu Tyr Leu Leu Ile Pro Phe Val Arg 405 410

<210> SEQ ID NO 6
<211> LENGTH: 395
<212> TYPE: PRT
<213> ORGANISM: Glycine max (ppt1)

<400> SEQUENCE: 6

Met Asp Ser Met Leu Leu Arg Ser Phe Pro Asn Ile Asn Asn Ala Ser
1 5 10 15
Ser Leu Ala Thr Thr Gly Ser Tyr Leu Pro Asn Ala Ser Trp His Asn
20 25 30
Arg Lys Ile Gln Lys Glu Tyr Asn Phe Leu Arg Phe Arg Trp Pro Ser
35 40 45
Leu Asn His His Tyr Lys Ser Ile Glu Gly Gly Cys Thr Cys Lys Lys
50 55 60
Cys Asn Ile Lys Phe Val Val Lys Ala Thr Ser Glu Lys Ser Phe Glu
65 70 75 80
Ser Glu Pro Gln Ala Phe Asp Pro Lys Ser Ile Leu Asp Ser Val Lys
85 90 95
Asn Ser Leu Asp Ala Phe Tyr Arg Phe Ser Arg Pro His Thr Val Ile
100 105 110
Gly Thr Ala Leu Ser Ile Ile Ser Val Ser Leu Leu Ala Val Glu Lys
115 120 125
Ile Ser Asp Ile Ser Pro Leu Phe Phe Thr Gly Val Leu Glu Ala Val
130 135 140
Val Ala Ala Leu Phe Met Asn Ile Tyr Ile Val Gly Leu Asn Gln Leu
145 150 155 160
Ser Asp Val Glu Ile Asp Lys Ile Asn Lys Pro Tyr Leu Pro Leu Ala
165 170 175
Ser Gly Glu Tyr Ser Phe Glu Thr Gly Val Thr Ile Val Ala Ser Phe
180 185 190
Ser Ile Leu Ser Phe Trp Leu Gly Trp Val Val Gly Ser Trp Pro Leu
195 200 205
Phe Trp Ala Leu Phe Val Ser Phe Val Leu Gly Thr Ala Tyr Ser Ile
210 215 220
Asn Val Pro Leu Leu Arg Trp Lys Arg Phe Ala Val Leu Ala Ala Met
225 230 235 240
Cys Ile Leu Ala Val Arg Ala Val Ile Val Gln Leu Ala Phe Phe Leu
245 250 255
His Ile Gln Thr His Val Tyr Lys Arg Pro Pro Val Phe Ser Arg Ser
260 265 270
Leu Ile Phe Ala Thr Ala Phe Met Ser Phe Phe Ser Val Val Ile Ala
275 280 285
Leu Phe Lys Asp Ile Pro Asp Ile Glu Gly Asp Lys Val Phe Gly Ile
290 295 300
Gln Ser Phe Ser Val Arg Leu Gly Gln Lys Pro Val Phe Trp Thr Cys
305 310 315 320
Val Ile Leu Leu Glu Ile Ala Tyr Gly Val Ala Leu Leu Val Gly Ala
325 330 335
Ala Ser Pro Cys Leu Trp Ser Lys Ile Val Thr Gly Leu Gly His Ala
340 345 350
Val Leu Ala Ser Ile Leu Trp Phe His Ala Lys Ser Val Asp Leu Lys
355 360 365
Ser Lys Ala Ser Ile Thr Ser Phe Tyr Met Phe Ile Trp Lys Leu Phe
370 375 380
Tyr Ala Glu Tyr Leu Leu Ile Pro Phe Val Arg
385 390 395

<210> SEQ ID NO 7
<211> LENGTH: 393
<212> TYPE: PRT
<213> ORGANISM: Arabidopsis thaliana

<400> SEQUENCE: 7

Met Glu Ser Leu Leu Ser Ser Ser Ser Leu Val Ser Ala Ala Gly Gly
1 5 10 15
Phe Cys Trp Lys Lys Gln Asn Leu Lys Leu His Ser Leu Ser Glu Ile
20 25 30

Arg Wall Telu Cys Ser Ser Lys Wall Wall Ala Lys Pro Phe 35 40 45

Arg Asn Asn Telu Wall Arg Pro Asp Gly Glin Gly Ser Ser Telu Telu Telu 50 55 60

Tyr Pro His Lys Ser Arg Phe Wall Asn Ala Thr Ala Gly Glin 65 70 75

Pro Glu Ala Phe Ser Asn Ser Lys Glin Ser Phe Arg Asp Ser 85 90 95

Teu Asp Ala Phe Tyr Arg Phe Ser Arg Pro His Thr Wall Ile Gly Thr 100 105 110

Wall Telu Ser Ile Teu Ser Wall Ser Phe Telu Ala Wall Glu Wall Ser 115 120 125

Asp Ile Ser Pro Teu Teu Phe Thr Gly Ile Teu Glu Ala Wall Wall Ala 130 135 1 4 0

Ala Telu Met Met Asn Ile Tyr Ile Wall Gly Teu Asn Glin Telu Ser Asp 145 15 O 155 160

Wall Glu Ile Asp Lys Wall Asn Lys Pro Tyr Teu Pro Teu Ala Ser Gly 1.65 170 175

Glu Ser Wall Asn Thr Gly Ile Ala Ile Wall Ala Ser Phe Ser Ile 18O 185 19 O

Met Ser Phe Trp Teu Gly Trp Ile Wall Gly Ser Trp Pro Telu Phe Trp 195 200

Ala Telu Phe Wall Ser Phe Met Telu Gly Thr Ala Tyr Ser Ile Asn Telu 210 215 220

Pro Telu Telu Arg Trp Lys Arg Phe Ala Telu Wall Ala Ala Met Ile 225 230 235 240

Teu Ala Wall Arg Ala Ile Ile Wall Glin Ile Ala Phe Telu His Ile 245 250 255

Glin Thr His Wall Phe Gly Pro Ile Telu Phe Thr Arg Pro Telu Ile 260 265 27 O

Phe Ala Thr Ala Phe Met Ser Phe Phe Ser Wall Wall Ile Ala Telu Phe 275 280 285

Asp Ile Pro Asp Ile Glu Gly Asp Ile Phe Gly Ile Arg Ser 29 O 295

Phe Ser Wall Thr Teu Gly Glin Lys Wall Phe Trp Thr Wall Thr 305 310 315 320

Teu Telu Glin Met Ala Ala Wall Ala Ile Teu Wall Gly Ala Thr Ser 325 330 335

Pro Phe Ile Trp Ser Lys Wall Ile Ser Wall Wall Gly His Wall Ile Telu 340 345 35 O

Ala Thr Thr Telu Trp Ala Ala Lys Ser Wall Asp Teu Ser Ser 355 360 365

Thr Glu Ile Thr Ser Cys Tyr Met Phe Ile Trp Lys Teu Phe Ala 370 375 38O

Glu Telu Telu Teu Pro Phe Telu Lys 385 390

<210> SEQ ID NO 8
<211> LENGTH: 361
<212> TYPE: PRT
<213> ORGANISM: Cuphea pulcherrima

<400> SEQUENCE: 8

Met Arg Met Glu Ser Teu Teu Telu Asn Ser Phe Ser Pro Ser Pro Ala 10 15

Gly Gly Lys Ile Cys Ala Thr Tyr Ala Tyr Phe Ala 25 30

Thr Ala Arg Asn Thr Teu Asn Ser Telu Asn Asn Thr Gly Glu 35 40 45

His Telu Ser Thr Glin Arg Phe Thr Phe His Glin Asn Gly 50 55 60

His Arg Thr Teu Wall Lys Ala Wall Ser Gly Glin Ser Telu Glu Ser 65 70 75

Pro Glu Ser Tyr Pro Asn Asn Arg Trp Asp Wall Ser Ala 85 90 95

Asp Ala Phe Tyr Arg Phe Ser Arg Pro His Thr Ile Ile Gly Thr 100 105 110

Telu Ser Ile Wall Ser Wall Ser Telu Telu Ala Wall Glu Telu Pro 115 120 125

Telu Asn Ser Met Phe Phe Thr Gly Telu Teu Glu Wall Ile Telu Ala 130 135 1 4 0

Telu Phe Met Asn Ile Tyr Ile Wall Gly Teu Asn Glin Telu Ser Asp 15 O 155 160

Asp Ile Asp Lys Wall Asn Lys Pro Tyr Teu Pro Teu Ala Ser Gly 1.65 170 175

Phe Ser Wall Gly Thr Gly Wall Thr Ile Wall Thr Ser Phe Telu Ile 18O 185 19 O

Met Ser Phe Trp Teu Gly Trp Wall Wall Gly Ser Trp Pro Telu Phe Trp 195 200

Ala Telu Phe Ile Ser Phe Wall Telu Gly Thr Ala Tyr Ser Ile Asp Met 210 215 220

Pro Met Telu Arg Trp Lys Arg Ser Ala Wall Wall Ala Ala Telu Ile 225 230 235 240

Teu Ala Wall Arg Ala Wall Ile Wall Glin Ile Ala Phe Phe Telu His Met 245 250 255

Glin Met His Wall Tyr Gly Ala Ala Ala Teu Ser Arg Pro Wall Ile 260 265 27 O

Phe Ala Thr Gly Phe Met Ser Phe Phe Ser Ile Wall Ile Ala Telu Phe 275 280 285

Asp Ile Pro Asp Ile Glu Gly Asp Lys Ile Phe Gly Ile Arg Ser 29 O 295

Phe Thr Wall Arg Teu Gly Glin Glu Arg Wall Phe Trp Ile Ile Ser 305 310 315 320

Teu Telu Glu Met Ala Ala Wall Ala Telu Trp Wall Teu Arg Ala Arg 325 330 335

Gly Arg Lys His Gly Wall Ser Ala Ser Glu Phe Phe Telu 340 345 35 O

Ser Ile Ser Gly Gly Lys Asn Telu 355 360

<210> SEQ ID NO 9
<211> LENGTH: 395
<212> TYPE: PRT
<213> ORGANISM: Allium porrum

<400> SEQUENCE: 9

Met Leu Ser Met Asp Ser Leu Leu Thr Lys Pro Val Val Ile Pro Leu
1 5 10 15
Pro Ser Pro Val Cys Ser Leu Pro Ile Leu Arg Gly Ser Ser Ala Pro
20 25 30
Gly Gln Tyr Ser Cys Arg Asn Tyr Asn Pro Ile Arg Ile Gln Arg Cys
35 40 45
Leu Val Asn Tyr Glu His Val Lys Pro Arg Phe Thr Thr Cys Ser Arg
50 55 60
Ser Gln Lys Leu Gly His Val Lys Ala Thr Ser Glu His Ser Leu Glu
65 70 75 80
Ser Gly Ser Glu Gly Tyr Thr Pro Arg Ser Ile Trp Glu Ala Val Leu
85 90 95
Ala Ser Leu Asn Val Leu Tyr Lys Phe Ser Arg Pro His Thr Ile Ile
100 105 110
Gly Thr Ala Met Gly Ile Met Ser Val Ser Leu Leu Val Val Glu Ser
115 120 125
Leu Ser Asp Ile Ser Pro Leu Phe Phe Val Gly Leu Leu Glu Ala Val
130 135 140
Val Ala Ala Leu Phe Met Asn Val Tyr Ile Val Gly Leu Asn Gln Leu
145 150 155 160
Phe Asp Ile Glu Ile Asp Lys Val Asn Lys Pro Asp Leu Pro Leu Ala
165 170 175
Ser Gly Glu Tyr Ser Pro Arg Ala Gly Thr Ala Ile Val Ile Ala Ser
180 185 190
Ala Ile Met Ser Phe Gly Ile Gly Trp Leu Val Gly Ser Trp Pro Leu
195 200 205
Phe Trp Ala Leu Phe Ile Ser Phe Val Leu Gly Thr Ala Tyr Ser Ile
210 215 220
Asn Leu Pro Phe Leu Arg Trp Lys Arg Ser Ala Val Val Ala Ala Ile
225 230 235 240
Cys Ile Leu Ala Val Arg Ala Val Ile Val Gln Leu Ala Phe Phe Leu
245 250 255
His Ile Gln Ser Phe Val Phe Lys Arg Pro Ala Ser Phe Thr Arg Pro
260 265 270
Leu Ile Phe Ala Thr Ala Phe Met Ser Phe Phe Ser Val Val Ile Ala
275 280 285
Leu Phe Lys Asp Ile Pro Asp Ile Asp Gly Asp Lys Ile Phe Gly Ile
290 295 300
His Ser Phe Ser Val Arg Leu Gly Gln Glu Arg Val Phe Trp Ile Cys
305 310 315 320
Ile Tyr Leu Leu Glu Met Ala Tyr Thr Val Val Met Val Val Gly Ala
325 330 335
Thr Ser Ser Cys Leu Trp Ser Lys Cys Leu Thr Val Ile Gly His Ala
340 345 350
Ile Leu Gly Ser Leu Leu Trp Asn Arg Ala Arg Ser His Gly Pro Met
355 360 365
Thr Lys Thr Thr Ile Thr Ser Phe Tyr Met Phe Val Trp Lys Leu Phe
370 375 380
Tyr Ala Glu Tyr Leu Leu Ile Pro Phe Val Arg
385 390 395

<210> SEQ ID NO 10
<211> LENGTH: 400
<212> TYPE: PRT
<213> ORGANISM: Triticum sp.

<400> SEQUENCE: 10

Met Asp Ser Telu Arg Teu Pro Ser Ser Teu Arg Ser Ala Pro Gly 1 5 10 15

Ala Ala Ala Ala Arg Asp His Ile Teu Pro Ser Phe Ser 2O 25 30

Ile Glin Arg Asn Gly Lys Gly Arg Wall Thr Teu Ser Ile Glin Ser 35 40 45

Gly Pro Thr Ile Asn His Cys Lys Phe Teu Asp Trp 50 55 60

Ser Asn His Ile Ser His Glin Ser Ile Asn Thr Ser Ala Ala 65 70 75

Gly Glin Ser Telu Glin Pro Glu Thr Glu Ala His Asp Pro Ala Ser Phe 85 90 95

Trp Pro Ile Ser Ser Ser Telu Asp Ala Phe Arg Phe Ser Arg 100 105 110

Pro His Thr Ile Ile Gly Thr Ala Telu Ser Ile Wall Ser Wall Ser Telu 115 120 125

Teu Ala Wall Glu Ser Teu Ser Asp Ile Ser Pro Teu Phe Telu Thr Gly 130 135 1 4 0

Teu Telu Glu Ala Wall Wall Ala Ala Telu Phe Met Asn Ile Ile Wall 145 15 O 155 160

Gly Telu Asn Glin Teu Phe Ile Glu Ile Asp Wall Asn Lys Pro 1.65 170 175

Thr Telu Pro Telu Ala Ser Gly Glu Tyr Ser Pro Ala Thr Gly Wall Ala 18O 185 19 O

Ile Wall Ser Wall Phe Ala Ala Met Ser Phe Gly Teu Gly Trp Wall Wall 195 200

Gly Ser Pro Pro Teu Phe Trp Ala Telu Phe Ile Ser Phe Wall Telu Gly 210 215 220

Thr Ala Ser Wall Asn Teu Pro Tyr Phe Arg Trp Arg Ser Ala 225 230 235 240

Wall Wall Ala Ala Teu Cys Ile Telu Ala Wall Arg Ala Wall Ile Wall Glin 245 250 255

Teu Ala Phe Phe Teu His Ile Glin Thr Phe Wall Phe Arg Arg Pro Ala 260 265 27 O

Wall Phe Ser Pro Teu Ile Phe Ala Thr Ala Phe Met Thr Phe Phe 275 280 285

Ser Wall Wall Ile Ala Teu Phe Lys Ile Pro Asp Ile Glu Gly Asp 29 O 295

Arg Ile Phe Gly Ile Glin Ser Phe Ser Wall Arg Teu Gly Glin Ser Lys 305 310 315 320

Wall Phe Trp Thr Cys Wall Gly Telu Telu Glu Wall Ala Gly Wall Ala 325 330 335

Ile Telu Met Gly Wall Thr Ser Ser Ser Telu Trp Ser Ser Telu Thr 340 345 35 O

Wall Wall Gly His Ala Ile Teu Ala Ser Ile Teu Trp Ser Ser Ala Arg 355 360 365

Ser Ile Asp Telu Thr Ser Lys Ala Ala Ile Thr Ser Phe Met Telu 370 375

Ile Trp Arg Telu Phe Tyr Ala Glu Tyr Telu Teu Ile Pro Telu Wall Arg 385 390 395 400

<210> SEQ ID NO 11
<211> LENGTH: 393
<212> TYPE: PRT
<213> ORGANISM: Cuphea pulcherrima

<400> SEQUENCE: 11

Met Arg Met Glu Ser Teu Teu Telu Asn Ser Phe Ser Pro Ser Pro Ala 1 5 10 15

Gly Gly Lys Ile Cys Ala Thr Ala Tyr Phe Ala 25 30

Thr Ala Arg Cys Asn Thr Teu Asn Ser Telu Asn Asn Thr Gly Glu 35 40 45

His Telu Ser Thr Glin Arg Phe Thr Phe His Glin Asn Gly 50 55 60

His Arg Thr Tyr Teu Wall Lys Ala Wall Ser Gly Glin Ser Telu Glu Ser 65 70 75

Pro Glu Ser Tyr Pro Asn Asn Trp Asp Wall Ser Ala 85 90 95

Asp Ala Phe Tyr Phe Ser Arg Pro His Thr Ile Ile Gly Thr 100 105 110

Telu Ser Ile Wall Ser Wall Ser Telu Telu Ala Wall Glu Telu Pro 115 120 125

Telu Asn Ser Met Phe Phe Thr Gly Telu Teu Glu Wall Ile Telu Ala 130 135 1 4 0

Telu Phe Met Asn Ile Tyr Ile Wall Gly Teu Asn Glin Telu Ser Asp 15 O 155 160

Asp Ile Asp Lys Wall Asn Lys Pro Tyr Teu Pro Teu Ala Ser Gly 1.65 170 175

Phe Ser Wall Gly Thr Gly Wall Thr Ile Wall Thr Ser Phe Telu Ile 18O 185 19 O

Met Ser Phe Trp Teu Gly Trp Wall Wall Gly Ser Trp Pro Telu Phe Trp 195 200

Ala Telu Phe Ile Ser Phe Wall Telu Gly Thr Ala Tyr Ser Ile Asp Met 210 215 220

Pro Met Telu Trp Lys Arg Ser Ala Wall Wall Ala Ala Telu Ile 225 230 235 240

Teu Ala Wall Arg Ala Wall Ile Wall Glin Ile Ala Phe Phe Telu His Met 245 250 255

Glin Met His Wall Tyr Gly Ala Ala Ala Teu Ser Arg Pro Wall Ile 260 265 27 O

Phe Ala Thr Gly Phe Met Ser Phe Phe Ser Ile Wall Ile Ala Telu Phe 275 280 285

Asp Ile Pro Asp Ile Glu Gly Asp Ile Phe Gly Ile Arg Ser 29 O 295

Phe Thr Wall Arg Teu Gly Glin Glu Arg Wall Phe Trp Ile Ile Ser 305 310 315 320

Teu Telu Glu Met Ala Tyr Ala Wall Ala Ile Teu Wall Gly Ser Thr Ser 325 330 335

Pro Telu Trp Ser Lys Wall Ile Thr Wall Ser Gly His Wall Wall Telu 340 345 35 O

Ala Ser Ile Telu Trp Gly Ala Lys Ser Ile Asp Phe Ser 355 360 365

Ala Ala Telu Thr Ser Phe Tyr Met Phe Ile Trp Lys Teu Phe Ala 370 375 38O

Glu Leu Leu Ile Pro Leu Val Arg
385 390

<210> SEQ ID NO 12
<211> LENGTH: 14
<212> TYPE: PRT
<213> ORGANISM: artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Conserved Motif
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (3)..(3)
<223> OTHER INFORMATION: x = w or y
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (4)..(4)
<223> OTHER INFORMATION: x = k or r
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (11)..(11)
<223> OTHER INFORMATION: x = i or v

<400> SEQUENCE: 12

Ala Phe Xaa Xaa Phe Ser Arg Pro His Thr Xaa Ile Gly Thr
1 5 10
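The degenerate positions of the conserved motif in SEQ ID NO: 12 can be read as a simple pattern over one-letter amino acid codes. The sketch below is an illustration only and is not the HMM-based search described in the specification: it expresses the motif as a regular expression and tests it against short fragments taken from SEQ ID NO: 1 and SEQ ID NO: 7. The names `THREE_TO_ONE`, `to_one_letter` and `MOTIF_SEQ_ID_12` are hypothetical, and the fragments are written with OCR-damaged residue names normalised to standard three-letter codes.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter_seq: str) -> str:
    """Convert a space-separated three-letter sequence to one-letter codes."""
    return "".join(THREE_TO_ONE[res] for res in three_letter_seq.split())


# SEQ ID NO: 12 with its three variable positions:
# position 3 = W or Y, position 4 = K or R, position 11 = I or V.
MOTIF_SEQ_ID_12 = re.compile(r"AF[WY][KR]FSRPHT[IV]IGT")

# Fragment of SEQ ID NO: 1 (Nostoc punctiforme) spanning the motif.
NOSTOC_FRAGMENT = (
    "Tyr Ala Phe Trp Lys Phe Ser Arg Pro His Thr Ile Ile Gly Thr Ser"
)
# Fragment of SEQ ID NO: 7 (Arabidopsis thaliana) spanning the motif.
ARABIDOPSIS_FRAGMENT = (
    "Asp Ala Phe Tyr Arg Phe Ser Arg Pro His Thr Val Ile Gly Thr"
)

if __name__ == "__main__":
    for label, frag in [("SEQ ID NO: 1", NOSTOC_FRAGMENT),
                        ("SEQ ID NO: 7", ARABIDOPSIS_FRAGMENT)]:
        hit = MOTIF_SEQ_ID_12.search(to_one_letter(frag))
        print(label, hit.group() if hit else "no match")
```

A regular expression treats every allowed residue at a variable position as equally likely, whereas a profile HMM weights them, so this pattern is only a coarse stand-in for the searches reported above.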

<210> SEQ ID NO 13
<211> LENGTH: 26
<212> TYPE: PRT
<213> ORGANISM: artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Conserved Motif
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (2)..(2)
<223> OTHER INFORMATION: x = v or i
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (11)..(11)
<223> OTHER INFORMATION: x = e, w, f, or s
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (13)..(13)
<223> OTHER INFORMATION: x = v or i
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (14)..(14)
<223> OTHER INFORMATION: x = d or e
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (17)..(17)
<223> OTHER INFORMATION: x = k or r
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (18)..(18)
<223> OTHER INFORMATION: x = i or v
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (22)..(22)
<223> OTHER INFORMATION: x = h, n, t or y

<400> SEQUENCE: 13

Asn Xaa Tyr Ile Val Gly Leu Asn Gln Leu Xaa Asp Xaa Xaa Ile Asp
1 5 10 15
Xaa Xaa Asn Lys Pro Xaa Leu Pro Leu Ala
20 25

<210> SEQ ID NO 14
<211> LENGTH: 16
<212> TYPE: PRT
<213> ORGANISM: artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Conserved Motif
<220> FEATURE:

NAME/KEY: MISC FEATU E LOCATION: (3) . . (3) OTHER INFORMATION: x O FEATURE NAME/KEY: MISC FEATU E LOCATION: (7) . . (7) OTHER INFORMATION: x O FEATURE NAME/KEY: MISC FEATU E LOCATION: (10 ) . . (10) OTHER INFORMATION: x O FEATURE NAME/KEY: MISC FEATU E LOCATION: (14) . . (14) OTHER INFORMATION: x O FEATURE NAME/KEY: MISC FEATU E LOCATION: (15) . . (15) OTHER INFORMATION: x C ir or w FEATURE NAME/KEY: MISC FEATU E LOCATION: (16) . . (16) OTHER INFORMATION: x y or f <400 SEQUENCE: 14 Ile Ala Xala Phe Lys Asp Xaa Pro Asp Xaa Glu Gly Asp Xala Xala Xala 1 5 10 15

<210> SEQ ID NO 15
<211> LENGTH: 17
<212> TYPE: PRT
<213> ORGANISM: artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Conserved Motif
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (1)..(1)
<223> OTHER INFORMATION: x = f or c
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (3)..(3)
<223> OTHER INFORMATION: x O
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (10)..(10)
<223> OTHER INFORMATION: x = f or y
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (11)..(11)
<223> OTHER INFORMATION: x l O
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (15)..(15)
<223> OTHER INFORMATION: x Il O
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (16)..(16)
<223> OTHER INFORMATION: x y ir or l

<400> SEQUENCE: 15

Xaa Tyr Xaa Phe Ile Trp Lys Leu Phe Xaa Xaa Glu Tyr Leu Xaa Xaa
1               5                   10                  15

Pro

<210> SEQ ID NO 16
<211> LENGTH: 56
<212> TYPE: DNA
<213> ORGANISM: artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Primer Sequence

<400> SEQUENCE: 16

cgcgatttaa atggcgcgcc ctgcaggcgg ccgcctgcag ggcgcgccat ttaaat        56

<210> SEQ ID NO 17
<211> LENGTH: 32
<212> TYPE: DNA
<213> ORGANISM: artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Primer Sequence

<400> SEQUENCE: 17

tcgaggatcc gcggccgcaa gcttcctgca gg                                  32

<210> SEQ ID NO 18
<211> LENGTH: 32
<212> TYPE: DNA
<213> ORGANISM: artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Primer Sequence

<400> SEQUENCE: 18

tcgacctgca ggaagcttgc ggccgcggat cc                                  32

<210> SEQ ID NO 19
<211> LENGTH: 32
<212> TYPE: DNA
<213> ORGANISM: artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Primer Sequence

<400> SEQUENCE: 19

tcgacctgca ggaagcttgc ggccgcggat cc                                  32

<210> SEQ ID NO 20
<211> LENGTH: 32
<212> TYPE: DNA
<213> ORGANISM: artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Primer Sequence

<400> SEQUENCE: 20

tcgaggatcc gcggccgcaa gcttcctgca gg                                  32

<210> SEQ ID NO 21
<211> LENGTH: 36
<212> TYPE: DNA
<213> ORGANISM: artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Primer Sequence

<400> SEQUENCE: 21

tcgaggatcc gcggccgcaa gcttcctgca ggagct                              36

<210> SEQ ID NO 22
<211> LENGTH: 28
<212> TYPE: DNA
<213> ORGANISM: artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Primer Sequence

<400> SEQUENCE: 22

cctgcaggaa gcttgcggcc gcggatcc                                       28

<210> SEQ ID NO 23
<211> LENGTH: 36
<212> TYPE: DNA
<213> ORGANISM: artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Primer Sequence

<400> SEQUENCE: 23

tcgacctgca ggaagcttgc ggccgcggat ccagct                              36

<210> SEQ ID NO 24
<211> LENGTH: 28
<212> TYPE: DNA
<213> ORGANISM: artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Primer Sequence

<400> SEQUENCE: 24

ggatccgcgg ccgcaagctt cctgcagg                                       28

<210> SEQ ID NO 25
<211> LENGTH: 39
<212> TYPE: DNA
<213> ORGANISM: artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Primer Sequence

<400> SEQUENCE: 25

gatcacctgc aggaagcttg cggccgcgga tccaatgca                           39

<210> SEQ ID NO 26
<211> LENGTH: 31
<212> TYPE: DNA
<213> ORGANISM: artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic Primer Sequence

<400> SEQUENCE: 26

ttggatccgc ggccgcaagc ttcctgcagg t                                   31

<210> SEQ ID NO 27
<211> LENGTH: 969
<212> TYPE: DNA
<213> ORGANISM: Nostoc punctiforme

<400> SEQUENCE: 27

atgagccaga gttctcaaaa cagccctttg ccacgcaaac ctgttcaatc atatttccat    60
tggttatacg ctttctggaa attctgtcgc cctcacacga ttattggtac aagtctgagt   120
gtgttgagtt tctatttaat tgctattgcc attagtaata ataccgcttc tttattcact   180
actcccggct ccctaagccc tctcttcggc gcatggattg cttgtctatg tggcaatgtt   240
tacattgtag ggctgaatca attagaagat gttgatattg acaagattaa taaacctcat   300
ttaccgttgg catcaggtga gttttctgaa cagacgggac aattaattgt tgcatctact   360
gggattttgg cactagttat ggcgtggcta actgggccat tcttgtttgg catggtaaca   420
attagtttgg ccattggtac tgcttattct ttaccgccaa ttcgcttaaa acagtttccc   480
ttttgggcag cgctgtgtat tttttcggta cgcggcacga ttgttaattt aggattgtat   540
ttgcactata gttgggcgct gaaacaaagc caaacaattc cgcctgtggt gtgggtgctg   600
acattgttta ttttggtgtt tacctttgcg atcgcaatct ttaaagatat cccagatata   660
gaaggcgatc gcctctacaa tattactact ttcacgatta aactagggtc ccaagctgtg   720
tttaatctag ctctttgggt gataactgtc tgttatctag ggataattct ggtaggagtg   780
ctacgcatcg cttcagttaa ccccattttt ctgataactg ctcatttggc gctgttggtt   840
tggatgtggt ggcggagttt ggcggtagac ttacaagata aaagtgcgat cgctcaattc   900