US007868229B2

(12) United States Patent (10) Patent No.: US 7,868,229 B2 Ratcliffe et al. (45) Date of Patent: Jan. 11, 2011

(54) EARLY FLOWERING IN GENETICALLY (58) Field of Classification Search ...... None MODIFIED See application file for complete search history. (56) References Cited (75) Inventors: Oliver Ratcliffe, Oakland, CA (US); Roderick W. Kumimoto, San Bruno, U.S. PATENT DOCUMENTS CA (US); Peter P. Repetti, Emeryville, 6,235,975 B1 5, 2001 Harada et al. 6,320,102 B1 1 1/2001 Harada et al. CA (US); T. Lynne Reuber, San Mateo, 6,476.212 B1 1 1/2002 Lalgudi et al. CA (US); Robert Creelman, Castro 6,495,742 B1 12/2002 Shinozaki et al. Valley, CA (US); Frederick D. Hempel, 6,545.201 B1 4/2003 Harada et al. Albany, CA (US) 6,677,504 B2 1/2004 da Costa e Silva et al. 6,781,035 B1 8/2004 Harada et al. (73) Assignee: Mendel Biotechnology, Inc., Hayward, 6.825,397 B1 1 1/2004 Lowe et al. 2001/0051335 A1 12/2001 Lalgudi et al. CA (US) 2002fOO23281 A1 2/2002 Gorlach et al. 2002fOO40489 A1 4/2002 Gorlach et al. (*) Notice: Subject to any disclaimer, the term of this 2003/0093837 A1 5/2003 Keddie et al. patent is extended or adjusted under 35 2003/O126638 A1 7/2003 Allen et al. U.S.C. 154(b) by 933 days. 2003. O188330 A1 10/2003 Heard et al. 2004/OOO9476 A9 1/2004 Harper et al. 2004, OO16022 A1 1/2004 Lowe et al. (21) Appl. No.: 11/705,903 2004/OO 19927 A1 1/2004 Sherman et al. 2004/0031072 A1 2/2004 La Rosa et al. (22) Filed: Feb. 12, 2007 2004.0034.888 A1 2/2004 Liu et al. 2004/O123338 A1 6/2004 Fincher et al. (65) Prior Publication Data 2004/O123339 A1 6/2004 Conner et al. 2004/O123343 A1 6/2004 La Rosa et al. US 2007/O1991 O7 A1 Aug. 23, 2007 2004/0172684 A1 9, 2004 Kovalic et al. 2004/O181830 A1 9, 2004 Kovalic et al. Related U.S. Application Data 2004/0214272 A1 10/2004 La Rosa et al. (63) Continuation-in-part of application No. 1 1/435.388, (Continued) filed on May 15, 2006, now Pat. No. 7,663,025, which is a continuation-in-part of application No. PCT/ FOREIGN PATENT DOCUMENTS AR PO70 102784 1, 2008 US2004/37584, filed on Nov. 12, 2004, application CA 2302828 10, 2000 No. 1 1/705.903, which is a continuation-in-part of CL 185107 2, 2008 application No. 10/714,887, filed on Nov. 13, 2003, (Continued) application No. 1 1/705.903, which is a continuation in-part of application No. 10/546,266, filed as applica OTHER PUBLICATIONS tion No. PCT/US2004/005654 on Feb. 25, 2004, now U.S. Appl. No. 09/533,030, filed Mar. 22, 2000, Keddie, James et al., entire document. Pat. No. 7,659,446, application No. 1 1/705,903, which U.S. Appl. No. 10/286.264, filed Jun. 25, 2008, Keddie, James, et al., is a continuation-in-part of application No. 10/374, office action. 780, filed on Feb. 25, 2003, application No. 1 1/705, U.S. Appl. No. 10/286.264, filed Mar. 10, 2009, Keddie, James, et al., 903, which is a continuation-in-part of application No. office action. U.S. Appl. No. 10/286.264, filed Aug. 11, 2006, Keddie, James, et al., 10/412,699, filed on Apr. 10, 2003, application No. office action. 11/705.903, which is a continuation-in-part of appli U.S. Appl. No. 10/286.264, filed Oct. 7, 2005, Keddie, James, et al., cation No. 10/374,780, filed on Feb. 25, 2003, appli office action. cation No. 1 1/705.903, which is a continuation-in-part (Continued) of application No. 10/675,852, filed on Sep. 30, 2003, Primary Examiner Stuart F. Baum application No. 1 1/705.903, which is a continuation (74) Attorney, Agent, or Firm Jeffrey M. Libby:Yifan Mao in-part of application No. 10/666,642, filed on Sep. 18, 2003. (57) ABSTRACT The present invention provides polynucleotides encoding (60) Provisional application No. 60/411,837, filed on Sep. CCAAT-binding transcription factor polypeptides that modu 18, 2002, provisional application No. 60/434,166, late the onset of reproductive development in plants. Poly filed on Dec. 17, 2002. nucleotides encoding functional CCAAT-binding transcrip tion factors were incorporated into expression vectors, (51) Int. Cl. introduced into plants, and ectopically expressed. The CI2N 15/82 (2006.01) encoded polypeptides of the invention significantly shortened AOIH 5/00 (2006.01) the time to flower development in the transgenic plants, as AOIH 5/10 (2006.01) compared to the flowering time of control plants. (52) U.S. Cl...... 800/298; 800/290,800/287: 800/320, 800/306; 800/278 19 Claims, 9 Drawing Sheets US 7,868,229 B2 Page 2

U.S. PATENT DOCUMENTS Asamizu, E., et al. (Apr. 28, 2000). Generation of 7137 non-redun dant expressed sequence tags from a legume, Lotus japonicus. DNA 2004/0216190 A1 10, 2004 Kovalic et al. Res. 7 (2), 127-130. 2004/0229367 A1 11/2004 Berka et al. Bucher, P., and Trifonov, E.N. (Jun. 1988). CCAAT box revisited: 2004/0259145 A1 12/2004 Wood et al. bidirectionality, location and context. J Biomol Struct Dyn 5, 1231 2005/0022266 A1 1/2005 Wu et al. 1236. 2005, OO86718 A1 4/2005 Heard et al. Bucher, P. (Apr. 20, 1990). Weight matrix descriptions of four 2005/0172364 A1 8, 2005 Heard et al. eukaryotic RNA polymerase II promoter elements derived from 502 2007. O184092 A1 8/2007 Meyer et al. unrelated promoter sequences. J Mol Biol 212, 563-578. 2008/0040973 A1 2/2008 Nelson et al. Chae, H.D., et al. (May 20, 2004). Cdk2-dependent phosphorylation 2008. O104730 A1 5, 2008 Wu et al. of the NF-Y transcription factor is essential for the expression of the 2008. O163397 A1 7/2008 Ratcliffe et al. cell cycle-regulatory genes and cell cycle G1/S and G2/M transitions. 2008/O172759 A1 7/2008 da Costa e Silva et al. Oncogene 23, 4084-4088. Clarke, B., et al. (Mar. 2003). Arabidopsis genomic information for FOREIGN PATENT DOCUMENTS interpreting wheat EST sequences. Funct. Integr. Genomics 3 (1-2), 33-38. DE 19503359 2, 1996 Gelinas, R., et al. (Jan. 1985). Sequences of G gamma, Agamma, and EP 1033 405 9, 2000 beta genes of the Greek (A gamma) HPFH mutant: evidence for a EP 1 42O 630 2, 2003 distal CCAAT box mutation in the Agamma gene. ProgClin Biol Res EP 1454993 9, 2004 191, 125-139. EP O8O3572 10/2007 Good, L.F., and Chen, K.Y. (May 1996). Cell cycle- and age-depen GB 2244272 11, 1991 dent transcriptional regulation of human thymidine kinase gene: the GB 2392444 3, 2004 role of NF-Y in the CBP/tk binding complex. Biol Signals 5, 163 WO WO 98.37184 8, 1998 169. WO WO 98/37755 9, 1998 WO WO 98.58069 12/1998 Hiei, Y., et al. Efficient transformation of rice (Oryza sativa L.) WO WO99,53016 A2 10, 1999 mediated by Agrobacterium and sequence analysis of the boundaries WO WO99,67405 12/1999 of the T-DNA. J. Aug. 1994; 6(2): 271-282. WO WOOO,28058 5, 2000 Ito, T., et al. (Oct. 1995). A far-upstream sequence of the wheat WO WOOO, 53724 A2 9, 2000 histone H3 promoter functions differently in rice and tobacco cul WO WOO1/36598 A1 5, 2001 tured cells. Plant Cell Physiol 36, 1281-1289. WO WOO1/45493 A2 6, 2001 Johnson, P.F., and McKnight, S.L. (Jul. 1989). Eukaryotic transcrip WO WOO1,64022 9, 2001 tional regulatory proteins. Annu Rev Biochem 58, 799-839. WO WOO1,773.11 10, 2001 Jones, P.G., et al. (epubNov. 26, 2002). Gene discovery and microar WO WOO2,06499 1, 2002 ray analysis of cacao varieties. Planta 216 (2), 255-264. WO WO O2, 16655 2, 2002 Maity, S.N., and de Crombrugghe, B. (May 1998). Role of the WO WO O2/46442 6, 2002 CCAAT-binding protein CBF/NF-Y in transcription. Trends WO WO O2/O57439 A2 * T 2002 Biochem Sci 23, 174-178. WO WO O2/O792.45 A2 10, 2002 Mazon, M.J., Gancedo, J.M., and Gancedo, C. (Oct. 1982). WO WOO3,OOO898 1, 2003 Phosphorylation and inactivation of yeast fructose-bisphosphatase in WO WOO3,OO1902 1, 2003 vivo by glucose and by proton ionophores. A possible role for cAMP. Eur J. Biochem 127, 605-608. WO WOO3,OO2751 1, 2003 McNabb, D.S., Xing, Y., and Guarente, L. (Jan. 1995). Cloning of WO WOO3,OO8540 1, 2003 yeast HAP5: a novel subunit of a heterotrimeric complex required for WO WOO3,O14327 A2 2, 2003 CCAAT binding. Genes Dev 9, 47-58. WO WO 03/020936 3, 2003 Olesen, J.T., and Guarente, L. (Oct. 1990). The HAP2 subunit of WO WOO3,083.042 10, 2003 yeast CCAAT transcriptional activator contains adjacent domains for WO WO 2004/OO982O 1, 2004 subunit association and DNA recognition: model for the HAP2/3/4 WO WO 2004/031349 A2 4/2004 complex. Genes Dev 4, 1714-1729. WO WO 2004/076638 A2 9, 2004 Rieping, M., and Schoffl, F. (Jan. 1992). Synergistic effect of WO WO 2004/079006 9, 2004 upstream sequences, CCAAT box elements, and HSE sequences for WO WO 2005/001050 1, 2005 enhanced expression of chimaeric heat shock genes in transgenic WO WO2005/033319 A2 4/2005 tobacco. Mol Gen Genet 231, 226-232. WO WO 2008.002480 A2 1, 2008 Edwards, D., et al. (1998). Multiple genes encoding the conserved CCAAT-Box transcription factor complex are expressed in OTHER PUBLICATIONS Arabidopsis. Plant Physiol. 117, pp. 1015-1022. Edwards, et al. (1997). Arabidopsis thaliana mRNA for Hap3b tran .S. Appl. No. 10/286.264, filed Jul. 26, 2007, Keddie, James, et al., scription factor. GenBank Accession No. Y 13724. ffice action. Gusmaroli (2002). Regulation of the Arabidopsis thaliana CCAAT .S. Appl. No. 10/675,852, filed Apr. 14, 2008, Heard, J., et al., office binding nuclear factor Y subunits. Gene 283, 41-48. C t O Gusmaroli, et al. (2001). Regulation of the CCAAT-Binding NF-Y .S. Appl. No. 1 1/069.255, filed Nov. 26, 2008, Heard, J., et al., office subunits in Arabidopsis thaliana. Gene 264, 173-185. C tiO Riechmann, et al. (2000). Arabidopsis transcription factors: genome .S. Appl. No. 1 1/069.255, filed Mar. 19, 2008, Heard, J., et al., office wide comparative analysis among eukaryotes. Science 290, 2105 2110. .S. Appl. No. 1 1/069.255, filed Mar. 21, 2007, Heard, J., et al., office Mantovani (Oct. 18, 1999). The molecular biology of the CCAAT C tiO binding factor. Gene 239, 15-27. .S. Appl. No. 10/112,887, filed Sep. 28, 2004, Heard, J., et al., office Li, etal. (1999). Transcription factor NF-Y. CCAAT-binding chain B C tiO maize, NCBI Sequence S22820. .S. Appl. No. 09/533,030, filed Nov. 23, 2001, Keddie, James, et al., Li, et al. (Mar. 1992). Evolutionary variation of the CCAAT-binding ffice action. transcription factor NF-Y. Nucl. Acids Res. 20, 1087-1091. .S. Appl. No. 09/533,030, filed May 3, 2002, Keddie, James, et al., Guo, et al. (2004). “Protein tolerance to random amino acid change”. office action. Proc. Natl. Acad. Sci. USA 101,9205-9210. US 7,868,229 B2 Page 3

Smolen, et al. (2002). Dominant Alleles of the basic Helix-Loop Chang, Z.F., and Liu, C.J. (Jul. 8, 1994). Human thymidine kinase Helix Transcription Factor ATR2 Activate Stress-Responsive Genes CCAAT-binding protein is NF-Y, whose A subunit expression is in Arabidopsis. Gen vol. 161, pp. 1235-1246. serum-dependent in human IMR-90 diploid fibroblasts. J Biol Chem Liu, Q., et al. (1998). Two transcription factors, DREB1 and DREB2. 269, 17893-17898. with an EREBPAP2 DNA binding domain separate two cellular Chen, W. etal. (Mar. 2002). Expression profile matrix of Arabidopsis signal ... Plant Cell, vol. 10, pp. 1391-1406. transcription factor genes Suggests their putative functions in Fourgoux-Nicol, et al. (1999). Isolation of rapeseed genes expressed response to environmental stresses. Plant Cell 14, 559-574. early and specifically during development of the male gametophyte. Ayele, M., et al. (Apr. 2005). Whole genome shotgun sequencing of Plant Mol Biol 40, 857-872. Brassica oleracea and its application to gene discovery and annota Eisen, et al. (1998). “Phylogenomics: Improving Functional Predic tion in Arabidopsis. Genome Res. 15 (4), 487-495. tions for Uncharacterized Genes by Evolutionary Analysis”. Genome Bezhani, S. Sherameti, I., Pfannschmidt, T., and Oelmuller, R. (Jun. Research 8, 163-167. 29, 2001). A repressor with similarities to prokaryotic and eukaryotic Rossini, et al. The maize golden2 gene defines a novel class of DNA helicases controls the assembly of the CAAT box binding transcriptional regulators in plants. The Plant Cell vol. 13, May 2001, complex ataphotosynthesis gene promoter, J Biol Chem 276,23785 pp. 1231-1244, XPO02962733. 23789. Hill, et al. (1998). Functional analysis of conserved histidines in ADP Bi, W., et al. (Oct. 17, 1997). DNA binding specificity of the CCAAT glucose pyrophosphorylase from Escherichia coli. Biochem. binding factor CBF/NF-Y. J Biol Chem 272, 26562-26572. Biophys. Res. Comm. 244, 573-577. Borrell, A.K., et al. (Jul. 2000). Does Maintaining Green Leaf Area in Rounsley, S.D., et al. (Aug. 1997) Database EMBL, Arabidopsis Sorghum Improve Yield under Drought? II. Dry Matter Production thaliana chromosome II BAC T13E 15 genomic seq. complete seq. and Yield. CropScience 40, 1037-1048. XPOO2303794. Acc No. ACOO2388. Positions 57340-58040. Bowler, C., and Fluhr, R. (Jun. 2000). The role of calcium and Rounsley, S.D., et al. (2000). Database EMBL Sep. 12, 1997. activated oxygens as signals for controlling cross-tolerance. Trends “Arabidopsis thaliana mRNA for Hap3a.” XP002315220 retrieved Plant Sci 5, 241-246. from EBI acc No. EM PRO: ATHAP3A, acc No. Y13723. Bowie, et al. Science 247: 1306-1310 (1990). Ohme-Takagi, M. and Singh, K. (Feb. 1995) “Ethylene-inducible Cooper, B., et al. (Oct. 2003). Identification of rice (Oryza sativa) DNA binding proteins that interact with an ...” Plant Cell, vol. 7, pp. proteins linked to the cyclin-mediated regulation of the cell cycle. 173-182, XPO02108954. *Figs 5, 6*. Plant Mol. Biol. 53(3):273-9. Buttner, M. and Singh, K. (May 27, 1997) 'Arabidopsis thaliana Coupland (Oct. 1995). Flower development. LEAFY blooms in ethylene-responsive.” Proc Of The Natl Acad Of Sciences, vol. 94. aspen. Nature 377:482-483. pp. 5961-5966, XPO02108953. Coustry, F. Maity, S.N., and de Crombrugghe, B. (Jan. 6, 1995). Winicov. I. (Dec. 1998) “New molecular approaches to improving Studies on transcription activation by the multimeric CCAAT-bind Salt tolerance in crop plants.” Annals of Botany, Acad Press, vol. 82. ing factor CBF. J. Biol Chem 270, 468-475. No. 6, pp. 705-710, XP00 1007288. Coustry, F. Maity, S.N. Sinha, S., and de Crombrugghe, B. (Jun. 14. Urao, T., et al. (Nov. 1993) “An Arabidopsis MYB homolog is 1996). The transcriptional activity of the CCAAT-binding factor CBF induced by dehydration stress and its geneproduct binds to...” Plant is mediated by two distinct activation domains, one in the CBF-B Cell, vol. 5, pp. 1529-1539, XPO02938159. subunit and the other in the CBF-C subunit. J Biol Chem 271, 14485 Kasuga Mie, et al. (Mar. 1999) “Improving plant drought, salt, and 14491. freezing tolerance by gene.” Nature Biotechnol., vol. 17. No. 3, pp. Coustry, F. Sinha, S., Maity, S.N., and Crombrugghe, B. (Apr. 1, 287-291, XP002 173128. 1998). The two activation domains of the CCAAT-binding factor Kaneko, t., et al. (Mar. 1, 2001) database trembl seqlib ebi, hinxton; CBF interact with the dTAFII 110 component of the Drosophila “transcription factor hap5a-like (at5g50480) . . .” xp002302644 TFIID complex. Biochem J 331 (Pt 1), 291-297. www.ebi.ac.UK Database accession No. Q9FGP7. Coustry, F. Hu, Q., de Crombrugghe, B., and Maity, S.N. (Nov. 2, Kaneko, T., et al. (Apr. 9, 1999) EMBL SEQ LIB EBI, 2001). CBF/NF-Y functions both in nucleosomal disruption and HINXTON:"Structural analysis of Arabidopsis thaliana chromo transcription activation of the chromatin-assembled topoisomerase some 5. XI.; P1 clone : MBA10' WWW.EBI. AC.UK acc. No. IIalpha promoter. Transcription activation by CBF/NF-Y in chroma ABO25619. tin is dependent on the promoter structure. J Biol Chem 276, 40621 Lotan, Tamar, et al. (Jun. 26, 1998). “Arabidopsis Leafy COTYLE 40630. DON1 is sufficient to induce embryo. . .” Cell, Cell Press, vol. 93, Covitz, P.A., et al. (Aug. 1998). Expressed sequence tags from a No. 7, pp. 1195-1205, XP002 136428. root-hair-enriched medicago truncatula cDNA library. Plant Physiol. Albani, Diego et al., (Dec. 29, 1995) Cloning and characterization of 117(4): 1325-32. a Brassica napus gene encoding a homologue of the B subunit of a Crevillen P. Ventriglia T. Pinto F, Orea A. Merida A. Romero JM heteromeric CCAAT-binding factor, Gene 167, pp. 209-213. (Mar. 4, 2005) Differential pattern of expression and Sugar regulation Arents, G., and Moudrianakis, E.N. (Nov. 21, 1995). The histone of Arabidopsis thaliana ADP-glucose pyrophosphorylase-encoding fold: a ubiquitous architectural motif utilized in DNA compaction genes. J Biol Chem 280: 8143-8149. and protein dimerization. Proc Natl AcadSci USA92, 11170-11174. Crookshanks, M., et al. (Sep. 18, 2001). The potato tuber Asamizu, E., et al. (Apr. 28, 2000). Generation of 7137 non-redun transcriptome: analysis of 6077 expressed sequence tags. FEBS Lett. dant expressed sequence tags from a legume, Lotus japonicus. DNA 506 (2), 123-126. Res. 7 (2), 127-130. Currie, R.A. (Dec. 5, 1997). Functional interaction between the DNA Caretti G. MottaMC, Mantovani R (Dec. 1999)NF-Yassociates with binding subunit trimerization domain of NF-Y and the high mobility H3-H4 tetramers and octamers by multiple mechanisms. Mol Cell group protein HMG-I(Y). J Biol Chem 272,30880-30888. Biol 19: 8591-8603. Daly et al. (Dec. 2001). Plant Systematics in the Age of Genomics. Caretti, G., Salsi, V., Vecchi, C., Imbriano, C., and Mantovani, R. Plant Physiology 127: 1328-1333. (Aug. 15, 2003). Dynamic recruitment of NF-Y and histone Dang, V.D., Bohn, C., Bolotin-Fukuhara, M., and Daignan-Fornier, acetyltransferases on cell-cycle promoters. J Biol Chem 278, 30435 B. (Apr. 1996). The CCAAT box-binding factor stimulates ammo 30440. nium assimilation in Saccharomyces cerevisiae, defining a new Carre, I.A., and Kay, S.A. (Dec. 1995). Multiple DNA-Protein com cross-pathway regulation between nitrogen and carbon metabolisms. plexes at a circadian-regulated promoter element. Plant Cell 7, 2039 J Bacteriol 178, 1842-1849. 2051. di Silvio A, Imbriano C. Mantovani R (Jul. 1, 1999) Dissection of the Chattopadhyay C, et al. (Jul. 8, 2004) Human p32, interacts with B NF-Y transcriptional activation potential. Nucleic Acids Res 27: subunit of the CCAAT-binding factor, CBF/NF-Y, and inhibits CBF 2578-2584. mediated transcription activation in vitro. Nucleic Acids Res 32: Faniello MC, Bevilacqua MA. Condorelli G. de Crombrugghe B. 3632-3641 Maity SN, Avvedimento VE, Cimino F. Costanzo F (Mar. 19, 1999) US 7,868,229 B2 Page 4

The B subunit of the CAAT-binding factor NFY binds the central Lazaretal. (Mar. 1988) Transforming Growth Factor alpha: Mutation segment of the Co-activator p300. J Biol Chem 274: 7623-7626. of Aspartic Acid 47 and Leucine 48 Results in Different Biological Forsburg, S.L., and Guarente, L. (Feb. 1988). Mutational analysis of Activities. Mol. Cell Biol. 8:1247-1252. upstream activation sequence 2 of the CYC1 gene of Saccharomyces Lee, H., et al. (Feb. 18, 2003). Arabidopsis Leafy COTYLEDON1 cerevisiae: a HAP2-HAP3-responsive site. Mol Cell Biol 8,647-654. represents a functionally specialized subunit of the CCAAT binding Forsburg, S.L., and Guarente, L. (Aug. 1989). Identification and transcription factor. Proc Natl Acad Sci USA 100, 2152-2156. characterization of HAP4: a third component of the CCAAT-bound Lee J. H. etal. (Oct. 1995), Derepression of the activity of genetically HAP2/HAP3 heteromer. Genes Dev 3, 1166-1178. engineered heat shock factor causes constitutive synthesis of heat Franchini A, Imbriano C. Peruzzi E. Mantovani R. Ottaviani E (Feb. shock proteins and increased themotolerance in transgenic 2005) Expression of the CCAAT-binding factor NF-Y in arabidopsis. Plant Journal, vol. 8, No. 4, pp. 603-612. Caenorhabditis elegans. J Mol Histol 36: 139-145. Levesque-Lemay M. et al. (Mar. 4, 2003) Expression of CCAAT FrontiniM, Imbriano C. Manni I. Mantovani R (Feb. 2004) Cell cycle binding factor antisense transcripts in reproductive tissues affects regulation of NF-YC nuclear localization. Cell Cycle 3: 217-222. plant fertility. Plant Cell Rep 21: 804-808. Fu, et al. (Aug. 2001). Expression of Arabidopsis GAI in Transgenic Li, Q, et al. (Nov. 2, 1998). Xenopus NF-Y pre-sets chromatin to Rice Represses Multiple Gibberellin Responses. Plant Cell 13:1791 potentiate p300 and acetylation-responsive transcription from the 1802. Xenopus hsp70 promoter in vivo. Embo J 17, 6300-6315. Gancedo, J.M. (Jun. 1998). Yeast carbon catabolite repression. Lin, J.F., and Wu, S.H. (Aug. 2004). Molecular events in senescing Microbiol Mol Biol Rev 62,334-361. Arabidopsis leaves. Plant J39, 612-628. Gardiner, J., et al. (Apr. 2004). Anchoring 9,371 Maize Expressed Luger, K., et al. (Sep. 18, 1997). Crystal structure of the nucleosome Sequence Tagged Unigenes to the Bacterial Artificial Chromosome core particle at 2.8 A resolution. Nature 389, 251-260. Contig Map by Two-Dimensional Overgo Hybridization. Plant Maity, S.N., and de Crombrugghe, B. (May 1998). Role of the Physiol. 134 (4), 1317-1326. CCAAT-binding protein CBF/NF-Y in transcription. Trends Gurtner A. etal. (Jul. 2003) Requirement for Down-Regulation of the Biochem Sci 23, 174-178. CCAAT-binding Activity of the NF-Y Transcription Factor during Mandel et al. (Oct. 1992). Manipulation of flower structure in Skeletal Muscle Differentiation. Mol Biol Cell 14: 2706-2715. transgenic tobacco, Cell 71-133-143. Guterman, I., et al. (Oct. 2002). Rose Scent: Genomics Approach to Mantovani, R. (Mar. 1998). A survey of 178 NF-Y binding CCAAT Discovering Novel Floral Fragrance-Related Genes. Plant Cell 14 boxes. Nucleic Acids Res 26, 1135-1143. (10), 2325-2338. Masiero, S., et al. (Jul 19, 2002). Ternary complex formation Haas, B., et al. (May 30, 2002). Full-length messenger RNA between MADS-box transcription factors and the histone fold pro sequences greatly improve genome annotation. Genome Biology tein NF-YB. J. Biol Chem 277,26429-26435. 2002, 3(6)research0029.1-0029.12. McNabb, D.S., et al. (Dec. 1997). The Saccharomyces cerevisiae Harmer, S., et al., (Dec. 15, 2000) Orchestrated Transcription of Key Hap5p homolog from fission yeast reveals two conserved domains Pathways in Arabidopsis by the Circadian Clock. Science vol. 290, p. that are essential for assembly of heterotetrameric CCAAT-binding 2110. factor. Mol Cell Biol 17, 7008-7018. Herwig, R., et al. (Aug. 2002). Construction of a ‘unigene cDNA McConnell, et al. (6838). Role of PHABULOSA and PHAVOLUTA clone set by oligonucleotide fingerprinting allows access to 25,000 in determining radial patterning in shoots; Nature 411, 709–713 potential Sugar beet genes. Plant J. 32 (5), 845-857. (2001). Hollung Kristin et al. (Jun. 1997) Developmental stress and ABA Meinke, D. (Dec. 4, 1992). A homeotic mutant of Arabidopsis modulation of mRNA levels for b/IP transcription factors and Vpl in thaliana with leafy cotyledons. Science 258, 1647-1650. barley embryos and embryo-derived Suspension cultures, Plant Meinke, D.W., et al. (Aug. 1994). Leafy Cotyledon Mutants of Molecular Biology, vol. 35. No. 5, pp. 561-571. Arabidopsis. Plant Cell 6, 1049-1064. Kahle J, et al. (Jul. 2005) Subunits of the heterotrimeric transcription Miyoshi, K., et al. (Nov. 2003). OshAP3 genes regulate chloroplast factor NF-Y are imported into the nucleus by distinct pathways biogenesis in rice. Plant J36,532-540. involving importin beta and importin 13. Mol Cell Biol 25: 5339 Myers, R.M., Tilly, K., and Maniatis, T. (May 2, 1986). Fine structure 5354. genetic analysis of a beta-globin promoter. Science 232, 613-618. Kater, et al. (Feb. 1998). Multiple AGAMOUS Homologs from Nakamura, Y., et al. (Dec. 31, 1997). Structural analysis of Cucumber and Petunia Differ in Their Ability to Induce Reproductive Arabidopsis thaliana chromosome 5. III. Sequence features of the Organ Fate. Plant Cell 10:171-182. regions of 1,191.918bp covered by seventeen physically assigned P1 Kehoe, D.M., et al. (Aug. 1994). Two 10-bp regions are critical for clones. DNA Res.4(6):401-414. phytochrome regulation of a Lemna gibba Lhcbgene promoter. Plant Nakshatri, H., et al. (Nov. 15, 1996). Subunit association and DNA Cell 6, 1123-1134. binding activity of the heterotrimeric transcription factor NF-Y is Kim, C.G., and Sheffery, M. (Aug. 1990). Physical characterization regulated by cellular redox. J Biol Chem 271, 28784-28791. of the purified CCAAT transcription factor, alpha-CP1. J Biol Chem Nandi et al. (Feb. 24, 2000). A conserved function for Arabidopsis 265, 13362-13369. SUPERMAN in regulating floral-whorl cell proliferation in rice, a Kim, I.S., et al. (Aug. 1996). Determination of functional domains in monocotyledonous plant. Curr. Biol. 10:215-218. the C subunit of the CCAAT-binding factor (CBF) necessary for Novillo, F., et al. (Mar. 16, 2004). CBF2/DREB1C is a negative formation of a CBF-DNA complex: CBF-B interacts simultaneously regulator of CBF1/DREB1.B and CBF3/DREB1A expression and with both the CBF-A and CBF-C subunits to form a heterotrimeric plays a central role in stress tolerance in Arabidopsis. Proc Natl Acad CBF molecule. Mol Cell Biol 16, 4003-4013. Sci USA 101,3985-3990. Kusnetsov, V., et al. (Dec. 10, 1999). The assembly of the CAAT-box Parcy, F., et al. (Aug. 1997). The ABCISIC Acid-INSENSITIVE3, binding complex at a photosynthesis gene promoter is regulated by FUSCA3, and LEAFY COTYLEDON1 loci act in concert to control light, cytokinin, and the stage of the plastids. J Biol Chem 274. multiple aspects of Arabidopsis seed development. Plant Cell 9, 36009-36014. 1265-1277. Kwong, R.W., et al. (Jan. 2003). LEAFY COTYLEDON1-Like Peng et al. (Dec. 1, 1997). The Arabidopsis GAI gene defines a defines a class of regulators essential for embryo development. Plant signaling pathway that negatively regulates gibberellin responses. Cell 15, 5-18. Genes and Development 11:3194-3205. Lapik.Y.R., and Kaufman, L.S. (Jul. 2003). The Arabidopsis cupin Peng et al. (Jul. 15, 1999). 'Green revolution genes encode mutant domain protein AtPirin1 interacts with the G protein alpha-subunit gibberellin response modulators, Nature 400:256-261. GPA1 and regulates seed germination and early seedling develop Pinkham, J.L., and Guarente, L. (Dec. 1985). Cloning and molecular ment. Plant Cell 15, 1578-1590. analysis of the HAP2 locus: a global regulator of respiratory genes in Lascaris R. et al. (Dec. 17, 2002) Hap4p overexpression in glucose Saccharomyces cerevisiae. Mol Cell Biol 5, 3410-3416. grown Saccharomyces cerevisiae induces cells to enter a novel meta Prandl R. et al., (May 1998) HSF3, a new heat shock factor from bolic state. Genome Biol 4: R3. Arabidopsis thaliana, derepresses the heat shock response and con US 7,868,229 B2 Page 5 fers thermotolerance when overexpressed in transgenic plants. Yu, J., et al. (Feb. 2005). The Genomes of Oryza sativa: A History of Molecular and General Genetics, vol. 258, pp. 269-278. Duplications. PloS Biol. 3 (2), E38: 266-281. Rizhsky, L., et al. (Nov. 2002). The combined effect of drought stress Yun, J., et al. (Sep. 19, 2003). Cdk2-dependent phosphorylation of and heat shock on gene expression in tobacco. Plant Physiol 130, the NF-Y transcription factor and its involvement in the p53-p21 1143-1151. signaling pathway, J Biol Chem 278,36966-36972. Romier, C., et al. (Jan. 10, 2003). The NF-YB/NF-YC structure gives Zhang, S., et al. (Jun. 2002). Similarity of expression patterns of insight into DNA binding and transcription regulation by CCAAT knotted 1 and ZmLEC1 during somatic and Zygotic embryogenesis in factor NF-Y. J Biol Chem 278, 1336-1345. maize (Zea mays L.). Planta 215, 191-194. Sabehat, A., et al. (Jun. 1998). Expression of Small heat-shock pro Zhou, Y., and Lee, A.S. (Mar. 4, 1998). Mechanism for the suppres teins at low temperatures. A possible role in protecting against chill sion of the mammalian stress response by genistein, an anticancer ing injuries. Plant Physiol 117,651-658. phytoestrogen from soy. J Natl Cancer Inst 90,381-388. Salmi, M.L., etal. (Jul. 2005). Profile and analysis of gene expression Shinn, et al. (Dec. 1, 2001). Database SPTREMBL Online changes during early development in germinating spores of XP002962732 Database accession No. (Q94F45). Ceratopteris richardii. Plant Physiol. 138 (3), 1734-1745. Nakamura, Y, et al. (Mar. 1, 2001). Database TREMBL SEQ LIB Salsi, V., et al. (Feb. 28, 2003). Interactions between p300 and mul EBI. HINXTON: "Structural analysis of Arabidopsis . . . ” tiple NF-Y trimers govern cyclin B2 promoter function. J Biol Chem XP002302645 WWWEBI.AC.UK Database acc No. Q9FMV5. 278,6642-6650. Database UniProt Online (May 1, 2000). “Putative CCAAT-bind Sasaki, T., et al. (Nov. 21, 2002). The genome sequence and structure ing transcription factor subunit.” retrieved from EBI accession No. of rice chromosome 1. Nature 420 (6913), 312-316. UNIPROT: Database acc No. Q9SLGO. Seki, M., et al. (Jan. 2001). Monitoring the expression pattern of 1300 NCBI accession No. AA660543 (gi:2604587) (Nov. 11, 1997); Arabidopsis genes under drought and cold stresses by using a full Covitz, P.A., et al. “00429 MtRHE Medicago truncatula cDNA 5' length cDNA microarray. Plant Cell 13, 61-72. similar to CCAAT box DNA binding transcription factor, mRNA Sinha S, et al. (Feb. 28, 1995) Recombinant rat CBF-C, the third sequence'; (Medicago truncatula) (publication: See Plant Physiol. subunit of CBF/NFY, allows formation of a protein-DNA complex 117 (4), 1325-1332, 1998). with CBF-A and CBF-B and with yeast HAP2 and HAP3. Proc Natl NCBI accession No. AAAA01000016 (gi: 19924325) (pos. 61435 AcadSci U S A92: 1624-1628. 62623) (Apr. 4, 2002); Yu, J., et al. “Oryza sativa (indica cultivar Sinha, S., et al. (Jan. 1996). Three classes of mutations in the A group) scaffoldO00016, whole genome shotgun sequence”. (Oryza. subunit of the CCAAT-binding factor CBF delineate functional sativa) (publication: see PloS Biol. 3 (2), E38, 2005. The Genomes of domains involved in the three-step assembly of the CBF-DNA com Oryza sativa: A History of Duplications). plex. Mol Cell Biol 16, 328-337. NCBI accession No. AAAA01001 199 (gi: 19925508) (pos. 24979 Surpin, M., et al. (May 2002). Signal transduction between the 25653) (Apr. 4, 2002); Yu, J., et al. “Oryza sativa (indica cultivar chloroplast and the nucleus. Plant Cell 14 Suppl. S327-338. group) scaffold001 199, whole genome shotgun sequence”. (Oryza Suzuki etal. (Nov. 2001). Maize VP1 complements Arabidopsis abi3 sativa) (publication: see PloS Biol. 3 (2), E38, 2005. The Genomes of and confers a novel ABA'auxin interaction in roots. Plant J. 28:409 Oryza sativa: A History of Duplications). 418. NCBI accession No. AAAA01003638 (gi: 19927947) (Apr. 4, 2002); Tasanen, K., et al. (Jun. 5, 1992). Promoter of the gene for the Yu, J., et al. "Oryza sativa (indica cultivar-group) scaffold003638, multifunctional protein disulfide isomerase polypeptide. Functional whole genome shotgun sequence'; (Oryza sativa) (publication: See significance of the six CCAAT boxes and other promoter elements. J PloS Biol. 3 (2), E38, 2005, The Genomes of Oryza sativa: A History Biol Chem 267, 11513-11519. of Duplications). Testa A. etal. (Jan. 11, 2005) Chromatin immunoprecipitation (ChIP) NCBI accession No. AAAA01006073 (gi: 19930383) (Apr. 4, 2002); on chip experiments uncover a widespread distribution of NF-Y Yu, J., et al. "Oryza sativa (indica cultivar-group) scaffold006073, binding CCAAT sites outside of core promoters. J Biol Chem 280: whole genome shotgun sequence'; (Oryza sativa) (publication: See 13606-13615. PloS Biol. 3 (2), E38, 2005, The Genomes of Oryza sativa: A History Thomas, H., and Howarth, C.J. (Feb. 2000). Five ways to stay green. of Duplications). J Exp Bot 51 Spec No. 329-337. NCBI accession No. AAAA01008870 (gi: 19933180) (Apr. 4, 2002); Vicient, C.M., et al. (Jun. 2000). Changes in gene expression in the Yu, J., et al. "Oryza sativa (indica cultivar-group) scaffold008870, leafy cotyledon 1 (lec 1) and fusca3 (fus3) mutants of Arabidopsis whole genome shotgun sequence'; (Oryza sativa) (publication: See thaliana L.J Exp. Bot 51,995-1003. PloS Biol. 3 (2), E38, 2005, The Genomes of Oryza sativa: A History Weigel and Nilsson (Oct. 12, 1995). A developmental switch suffi of Duplications). cient for flower initiation in diverse plants. Nature 377:482-500. NCBI accession No. AAAA01009782 (gi: 1993.4092) (Apr. 4, 2002); Wendler, W.M., et al. (Mar. 28, 1997). Identification of pirin, a novel Yu, J., et al. "Oryza sativa (indica cultivar-group) scaffold009782, highly conserved nuclear protein. J Biol Chem 272, 8482-8489. whole genome shotgun sequence'; (Oryza sativa) (publication: See West, M., et al. (Dec. 1994). Leafy COTYLEDON1 Is an essential PloS Biol. 3 (2), E38, 2005, The Genomes of Oryza sativa: A History regulator of late embryogenesis and cotyledon identity in of Duplications). Arabidopsis. Plant Cell 6, 1731-1745. NCBI acc. No. AAAAO 1015835 (gii: 1994,5865) (Apr. 42002);Yu.J., Winicov. Ilga et al., (Jun. 1999) Transgenic overexpression of the et al. “Oryza sativa(indica cultivar-group) scaffold015835, whole transcription factor Alfin enhances expression of the endogenous genome shotgun sequence'; source: Oryza sativa (indica cultivar MsPRP2 gene in alfalfa and improves salinity tolerance of the plants. group); Title: “The Genomes of Oryza sativa: A History of Duplica Plant Physiology vol. 120, No. 2, pp. 473-480. tions” (PLoS Biol. 3 (2), E38 (2005)). Xing, Y, Fikes, J.D., and Guarente, L. (Dec. 1993). Mutations in NCBI accession No. AAAA01015884 (gi: 1994.5953) (Apr. 4, 2002); yeast HAP2/HAP3 define a hybrid CCAAT box binding domain. Yu, J., et al. "Oryza sativa (indica cultivar-group) scaffold015884, Embo J 12, 4647-4655. whole genome shotgun sequence'; (Oryza sativa) (publication: See Xiong, L., et al. (Dec. 2001). Modulation of abscisic acid signal PloS Biol. 3 (2), E38, 2005, The Genomes of Oryza sativa: A History transduction and biosynthesis by an Sm-like protein in Arabidopsis. of Duplications). Dev Cell 1,771-781. NCBI accession No. AAK95562 (gii: 15321716) (Aug. 28, 2001); Xiong, L., et al. (Aug. 1, 2001). FIERY1 encoding an inositol Lowe, K.S. etal "Leafy cotyledonl” (Zea mays). polyphosphate 1-phosphatase is a negative regulator of abscisic acid NCBI accession No. AAL27657 (gii: 16902050) (Nov. 10, 2001); and stress signaling in Arabidopsis. Genes Dev 15, 1971-1984. Lowe, K.S., et al. “CCAAT-box binding factor HAP3 B domain' Yan, J., et al. (Aug. 2004). Overexpression of the Arabidopsis 14-3-3 (Glycine max.). protein GF14 lambda in cotton leads to a “stay-green” phenotype and NCBI accession No. AAL27658 (gii: 16902052) (Nov. 10, 2001); improves stress tolerance under moderate drought conditions. Plant Lowe, K.S., et al. “CCAAT-box binding factor HAP3 B domain' Cell Physiol 45, 1007-1014. (Glycine max.). US 7,868,229 B2 Page 6

NCBI accession No. AAL27659 (gii: 16902054) (Nov. 10, 2001); NCBI accession No. AC122165 (gi:21 104913) (pos. 113095-113208 Lowe, K.S., et al. “CCAAT-box binding factor HAP3 B domain' in latest version) (May 23, 2002); Shaull, S., et al. “Medicago ( galamensis). truncatula clone mth2-32m22, complete sequence'; (Medicago NCBI accession No. AAL27660 (gii: 16902056) (Nov. 10, 2001); truncatula) (note: two versions are presented for review, gi:21104913 Lowe, K.S. et al. "CCAAT-box binding factor HAP3 B domain' submitted May 23, 2002; and, gi:71274322 submitted on July 27, (Argemone mexicana). 2005). NCBI accession No. AAL27661 (gii: 16902058) (Nov. 10, 2001); NCBI accession No. AF193440 (gi:6289056) (Nov. 9, 1999); Gher Lowe, K.S. et al. "CCAAT-box binding factor HAP3 B domain' raby, W., et al., “Arabidopsis thaliana heme activated protein (Triticum aestivum). (HAP5c) mRNA, complete cods”. NCBI accession No. AAL49943 (gi: 17979253) (Dec. 26, 2001); NCBI accession No. AF410176 (gi:15321715) (Aug. 28, 2001); Shinn, P. et al., “At2g37060/T2N18.18 Arabidopsis thaliana”. Lowe, K.S., et al. “Zea mays leafy cotyledon1) (Lec1) mRNA, com NCBI accession No. AANO 1148 (gi:2253.6010) (Aug. 29, 2002); plete cols', (Zea mays). Kwong, R. W., et al. “LEC1-like protein' (Phaseolus coccineus) NCBI accession No. AF533650 (gi:22536009) (Aug. 29, 2002); (note: the original Submission and latest update are provided for Kwong, R.W., et al. "Phaseolus coccineus LEC1-like protein mRNA, examination) (publication: see Plant Cell 15(1), May 18, 2003, Leafy complete cols', (Phaseolus coccineus) (note: the original Submission COTYLEDON1-Like Defines a Class of Regulators Essential for and latest update are provided for examination). Embryo Development). NCBI accession No. AI442376 (gi:4295745) (Feb. 19, 1999); Shoe NCBI accession No. AAO33918 (gi:28274.147) (Feb. 8, 2003); maker, R., et al. “Sa26b07.y1 Gm-c1004 Glycine max cDNA clone Adams, K.L., et al. “Putative CCAAT-binding transcription factor': Genome systems clone ID: Gm-c1004-3985" similar to TR:023634 (Gossypium barbadense). 023634 transcription factor; mRNA sequence” (publication: see NCBI accession No. AAO33919 (gi:28274149) (Feb. 8, 2003); Plant Cell 15(1), May 18, 2003, Leafy COTYLEDON1-Like Defines Adams, K.L., et al. “Putative CCAAT-binding transcription factor” a Class of Regulators Essential for Embryo Development). Gossypium barbadense. NCBI accession No. AI442765 (gi:42.98466) (Feb. 19, 1999); Shoe NCBI accession No. AAO72650 (gi:29367577) (Mar. 30, 2003); maker, R., et al. “Sa26b07.x1 Gm-c1004 Glycine max cDNA clone Cooper, B., et al. “CCAAT-binding transcription factor-like protein' genome systems clone ID: Gm-c1004-398 3' similar to TR:023634 (Oryza sativa) (note: the original Submission and latest update are 023634 transcription factor; mRNA sequence”; (Glycine max). provided for examination) (publication: see Plant Mol. Biol. 53 (3), NCBI accession No. AI486503 (gi:43.81874) (Mar. 9, 1999); Alcala, 273-279, 2003, Identification of rice (Oryza sativa) proteins linked to J., et al. “EST244824 tomato ovary, TAMU Lycopersiconesculentum the cyclin-mediated regulation of the cell cycle). cDNA clone cIED6C8, mRNA sequence'; (Lycopersicon NCBI accession No. AB025628 (gi:4589434) (pos. 69357-69929) esculentum). (Apr. 20, 1999); Nakamura, Y., “Arabidopsis thaliana genomic DNA, NCBI accession No. AI495.007 (gi:4396010) (Mar. 11, 1999); Shoe chromosome 5, P1 clone: MNJ7, complete sequence” (note: the maker, R., et al. “Sa89f)3.y1 Gm-c1004 Glycine max cDNA clone original Submission and latest update are provided for examination, genome systems clone ID: Gm-c1004-64865" similar to TR:O23310 see gene id: MNJ7.23, protein id: BAB09090.1, gi:9758792 marked O23310 CCAAT-binding transcription factor subunit A.; mRNA on p. 7/29). sequence', (Glycine max). NCBI accession No. AC000 106 (gi:1785951) (Jan. 21 1997); NCBI accession No. AI725612 (gi:5044464) (Jun. 11, 1999); Osborne, B.I., et al., “Arabidopsis thaliana chromosome 1” (note: Blewitt, M., et al., “BNLGHi12445 Six-day Cotton fiber Gossypium two versions are presented for review, gi: 1785951 submitted Jan. 21. hirsutum cDNA 5" similar to CCAAT-binding transcription factor 1997; and, gi:2342673 submitted Sep. 17, 1997) (Delete for MBI subunit A (CBF-A) (NF-Y Protein Chain B) (NF-YB) (CAAT-Box 0047PCT Disclosure). DNA binding protein subunit B), mRNA sequence'; (Gossypium NCBI accession No. AC002388 (gi:2282009) (Jul. 28, 1997); hirsutum). Rounsley, S.D., et al., “Arabidopsis thaliana clone T13E15” (note: NCBI accession No. AI728916 (gi:5047768) (Jun. 11, 1999); two versions are presented for review, gi:2282009 submitted Jul. 28, Blewitt, M., et al., “BNLGHi12022 Six-day Cotton fiber Gossypium 1997; and, gi:20196917 submitted Apr. 18, 2002) (Delete for MBI hirsutum cDNA 5" similar to (Y13723) Transcription factor 0047PCT Disclosure). Arabidopsis thaliana, mRNA sequence', (Gossypium hirsutum). NCBI accession No. AC005770 (gi:3694645) (pos 56423-57963) (Oct. 3, 1998); Rounsley, S.D., et al., “Arabidopsis thaliana clone NCBI accession No. AI731250 (gi:505.0102) (Jun. 11, 1999); T7F6” (note: two versions are presented for review, gi:3694645 sub Blewitt, M., et al., “BNLGHi9010 Six-day Cotton fiber Gossypium mitted Oct. 3, 1998; and, gi:20 197440 submitted Apr. 18, 2002, see hirsutum cDNA 5' similar to (X59714) CAAT-box DNA binding gene At2g38870, gi:20 197447 marked on p. 2/35). protein subunit B (NF-YB) (Zea mays), mRNA sequence”; (Gos NCBI accession No. AC006260 (gi:4071012) (pos 40445 sypium hirsutum). 41633)(Dec. 29, 1998); Lin, X., et al., “Arabidopsis thaliana clone NCBI accession No. AI731275 (gi:5050127) (Jun. 11, 1999); T2N18” (note: two versions are presented for review, gi:4071012 Blewitt, M., et al., “BNLGHi9078 Six-day Cotton fiber Gossypium submitted Dec. 29, 1998; and, gi:20 197714 submitted Apr. 18, 2002, hirsutum cDNA 5' similar to (X59714) CAAT-box DNA binding see p. 6 of 28, gene At2g37070, gi:4371294). protein subunit B (NF-YB) (Zea mays), mRNA sequence”; (Gos NCBI acc. No. AC007063 (gi: 4389532) (Mar. 11, 1999); Lin.X., et sypium hirsutum). al. “Arabidopsis thaliana clone T10F5, *** Sequencing in NCBI accession No. AI782351 (gi:5280392) (Jun. 29, 1999); Progress ***, 4 unordered pieces'; source: Arabidopsis thaliana D’Ascenzo, M., et al., “EST263230 tomato susceptible, Cornell (thale cress); Title: “Arabidopsis thaliana 'TAMU' Lycopersicon esculentum cDNA clone cIES18L2, mRNA BAC: 'T10F5' genomic Sequence ea sequence', (Lycopersicon esculentum). marker 'GPC6'” (Unpublished). NCBI accession No. AI900024 (gi:5605926) (Jul 27, 1999); Shoe NCBI acc. No. AC104284 (gi: 17402732) (Dec. 7, 2001); Chow,T- maker, R., et al., “sb97g 11.y1 Gm-c 1012 Glycine max cDNA clone Y., et al. "Oryza sativa (japonica cultivar-group) chromosome 5 clone Genome Systems clone ID: Gm-c 1012-669 5" similar to OJ1735C10, *** Sequencing in Progress ***, 7 ordered pieces”; SW:CBFA Maize P25209 CCAAT-binding transcription factor Source: Oryza sativa (japonica cultivar-group); Title: "Oryza sativa Subunit A, mRNA sequence', (Glycine max). BAC OJ1735C10 genomic sequence” (Unpublished). NCBI accession No. AI965590 (gi:5760227) (Aug. 23, 1999); Shoe NCBI accession No. AC 120529 (gi:20503070) (pos. 90470-91129) maker, R., et al., "sc74b05.y1 Gm-c 1018 Glycine max cDNA clone (May 8, 2002); Buell, C., et al. “Oryza sativa (japonica cultivar Genome Systems clone ID: Gm-c 1018-586 5" similar to group) chromosome 3 clone OSJNBa0039N21, complete sequence': SW:CBFA Maize P25209 CCAAT-binding transcription factor (Oryza sativa) (note: two versions are presented for review, Subunit A, mRNA sequence', (Glycine max). gi:20503070 submitted May 8, 2002; and, gi:34447241 submitted NCBI accession No. AI966550(gi:5761187) (Aug. 23, 1999); Shoe Sep. 4, 2003, see protein id: 41469085, marked on p. 6/53). maker, R., et al., "scSlhO1.y1 Gm-c 1015 Glycine max cDNA clone US 7,868,229 B2 Page 7

Genome Systems Clone ID: Gm-c1015-1130 5" similar to NCBI accession No. AP003266 (gi: 13027296)(pos. 83090-83330) TR:023633 023633 Transcription factor; mRNA sequence': (Feb. 21, 2001); Sasaki, T., et al., “Oryza sativa (japonica cultivar (Glycine max). group) genomic DNA, chromosome 1, PAC clone:P0492G09”; NCBI accession No. AJ487398 (gi:22022152) (Jul 30, 2002); (Oryza sativa) (note: two versions are presented for review, Gebhardt, C., et al., “AJ487398 Solanum tuberosum cv. Provita gi: 13027296 submitted Feb. 21, 2001; and, gi:15408784 submitted Solanum tuberosum cDNA clone Ple4, mRNA sequence'. (Solanum Aug. 31, 2001, see protein id BAB54190.1 marked on p. 6/47) (pub tuberosum). lication: see Nature 420 (6913), 312-316, 2002. The genome NCBI accession No. AJ5.01023 (gi:22081956) (Aug. 1, 2002); sequence and structure of rice chromosome 1). Manthey, K., et al., “AJ5.01023 MTAMP Medicago truncatula cDNA NCBI accession No. AP003271 (gi: 13027301) (pos. 154362 clone mtgmadc 120001 h01, mRNA sequence”; (Medicago 155761) (Feb. 21, 2001); Sasaki, T., et al., “Oryza sativa (japonica truncatula) (note: the original Submission and latest update are pro cultivar-group) genomic DNA, chromosome 1, PAC clone: vided for examination). P0506B12: (Oryza sativa) ) (note: two versions are presented for NCBI accession No. AJ501814 (gi:22082742) (Aug. 1, 2002); review, gi: 13027301 submitted Feb. 21, 2001; and, gi:20160789 sub Manthey, K., et al., “AJ5201814 MTAMP Medicago truncatula mitted Apr. 16, 2002, see protein id: BAD73383.1, gi:56.201933 cDNA clone mtgmadc1200 12b08, mRNA sequence”; (Medicago marked on p. 3/45) (publication: see Nature 420 (6913), 312-316, truncatula) (note: the original Submission and latest update are pro 2002. The genome sequence and structure of rice chromosome 1). vided for examination). NCBI accession No. AP004 179 (gi: 15718436) (pos. 36798-37036) NCBI accession No. AL132966 (gi:6434215) (pos. 15052-16661) (Sep. 20, 2001); Sasaki, T., et al., “Oryza sativa (japonica cultivar (Nov. 15, 1999); Bloecker, H., et al., “Arabidopsis thaliana DNA group) genomic DNA, chromosome 2, BAC clone: OJ1124 G07": chromosome 3, BAC clone F4P12” (note: the original submission (Oryza sativa) ) (note: two versions are presented for review, and latest update are provided for examination, see gene F4P12 40, gi:157 18436 submitted Sep. 20, 2001; and, gi:45735881 submitted protein id gi:CAB6764.1 marked on p. 4/66). Mar. 25, 2004, see protein id BAD 12927.1, gi:45735894 marked on NCBI accession No. AL387357 (gi:9687108) (Aug. 3, 2000); p. 6/32). Journet, E.P. et al., “MtBC42A04E1 MtBC Medicago truncatula NCBI accession No. AP004366 (gi: 17046146) (pos. 72619-74018) cDNA clone MtBC42A04 T3, mRNA sequence”; (Medicago (Nov. 21, 2001); Sasaki, T., et al., “Oryza sativa (japonica cultivar truncatula) (note: the original Submission and latest update are pro group) genomic DNA, chromosome 1, PAC clone: P0460C04'; vided for examination). (Oryza sativa) ) (note: two versions are presented for review, NCBI accession No. AL506199 (gi: 12032414) (Jan. 4, 2001); gi: 17046146 submittedd Nov. 21, 2001; and, gi:20805242 submitted Michalek, W., et al., “AL506199 Hordeum vulgare Baarke develop May 15, 2002, see protein id: BAD73788.1, gi:56202329 marked on ing caryopsis (3.-15. DAP) Hordeum vulgare subsp. Vulgare cDNA p. 12/46) (publication: see Nature 420 (6913), 312-316, 2002, The clone HY02F18T 5", mRNA sequence': (Hordeum vulugare). genome sequence and structure of rice chromosome 1). NCBI accession No. AL509098 (gi: 12035601) (Jan. 4, 2001); NCBI accession No. AP005193 (gi:20975319) (pos. 56389-57063) Michalek, W., et al., “AL509098 Hordeum vulgare Barke developing (May 17, 2002); Sasaki, T., et al., “Oryza sativa (japonica cultivar caryopsis (3,-15. DAP)Hordeum vulgare subsp. Vulgare cDNA clone group) genomic DNA, chromosome 7, PAC clone: P0493C06': HY1OL07V 5", mRNA sequence': (Hordeum vulgare). (Oryza sativa) ) (note: two versions are presented for review, NCBI accession No. AL830693 (gi:21842473) (Jul 16, 2002); gi:20975319 submitted May 17, 2002; and, gi:38142450 submitted Sprunck, S., et al., “AL830693 q:242 Triticum aestivum cDNA clone Oct. 31, 2003, see protein id: BAD31 143.1, gi:50508657 marked on D05 q242 plate 10, mRNA sequence': (Triticum aestivum). p. 7/50). NCBI accession No. AP003246 (gii: 13027276) (pos. 28048-28351) NCBI accession No. AT002114 (gi:5724898) (Aug. 10, 1999); Ryu, (Feb. 21, 2001); Sasaki, T., et al., “Oryza sativa (japonica cultivar S.W., et al., “AT002114 Flower bud cDNA Brassica rapa Subsp. group) genomic DNA, chromosome 1, PAC clone: P0423A12'; Pekinesis cDNA clone RF0417, mRNA sequence”. (Oryza sativa) (note: two versions are presented for review, NCBI accession No. AU088581 (gi:7378310) (Mar. 31, 2000); gi:21104640 submitted Feb. 21, 2001; and, gi:21104640 submitted Sasaki, T., et al., “AU88581 Rice Callus Oryza sativa (japonica May 22, 2002, see protein id BAB93258.1 marked on p. 14/50) cultivar-group) cDNA clone C52742, mRNA sequence': (Oryza (publication: see Nature 420 (6913), 312-316, 2002. The genome sativa). sequence and structure office chromosome 1). NCBI accession No. AV411210 (gi:7740371) (May 9, 2000); NCBI acc. No. NP 030436 (gii: 18404885) (Jan. 29, 2002); , et al. Asamizu, E., et al., “AV41 1210 Lotus japonicus young plants (two “putative CCAAT-binding transcription factor subunit'; source: week old) Lotus corniculatus var. japonicus cDNA clone Unknown.; Title:. MWM203a04 r 5", mRNA sequence': (Lotus corniculatus) (publi NCBI acc. No. NP 199575 (gii: 15238156) (Aug. 21, 2001); cation: see DNA Res. 7 (2), 127-130, 2000, Generation of 7137 Tabata, S., et al. “putative protein Arabidopsis thaliana'; source: non-redundant expressed sequence tags from a legume, Lotus Arabidopsis thaliana (thale cress); Title: “Sequence and analysis of japonicus). chromosome 5 of the plant Arabidopsis thaliana” (Nature 408 NCBI accession No. AV420653 (gi:7749830) (May 9, 2000); (6814), 823-826 (2000)). Asamizo, E., et al., “AV420653 Lotus japonicus young plants (two NCBI acc. No. NP 193190 (gii: 15233475) (Aug. 21, 2001); week old) Lotus corniculatus var. japonicus cDNA clone Mayer.K., et al. "CCAAT-binding transcription factor subunit MWM184h05 r 5", mRNA sequence': (Lotus corniculatus) (publi A(CBF-A) Arabidopsis thaliana'; source: Arabidopsis thaliana cation: see DNA Res. 7 (2), 127-130, 2000, Generation of 7137 (thale cress); Title: “Sequence and analysis of chromosome 4 of the non-redundant expressed sequence tags from a legume, Lotus plant Arabidopsis thaliana” (Nature 402 (6763), 769-777 (1999)). japonicus). NCBI acc. No. NP 001031500 (gi: 79324546) (Nov. 3, 2005), et al. NCBI accession No. AV424305 (gi:7781090) (May 12, 2000); “unknown protein Arabidopsis thaliana'; source: Arabidopsis Asamizo, E., et al., “AV424305 Lotus japonicus young plants (two thaliana (thale cress); Title. week old) Lotus corniculatus var. japonicus cDNA clone NCBI acc. No. NP 178981 (gi: 15225440) (Aug. 21, 2001); Lin.X., MWMO38e02 r 5", mRNA sequence': (Lotus corniculatus) (publi et al. “putative CCAAT-box binding trancription factor Arabidopsis cation: see DNA Res. 7 (2), 127-130, 2000, Generation of 7137 thaliana'; source: Arabidopsis thaliana (thale cress); Title: non-redundant expressed sequence tags from a legume, Lotus “Sequence and analysis of chromosome 2 of the plant Arabidopsis japonicus). thaliana” (Nature 402 (6763), 761-768 (1999)). NCBI accession No. AV425835 (gi:7784165) (May 12, 2000); NCBI acc. No. NP 190902 (gii: 15231796) (Aug. 21, 2001); Asamizo, E., et al., “AV425835 Lotus japonicus young plants (two Salanoubat.M., et al. “transcription factor NF-Y. CCAAT week old) Lotus corniculatus var. japonicus cDNA clone binding—like protein Arabidopsis thaliana'; source: Arabidopsis MWMO59e05 r 5", mRNA sequence': (Lotus corniculatus) (publi thaliana (thale cress); Title: “Sequence and analysis of chromosome cation: see DNA Res. 7 (2), 127-130, 2000, Generation of 7137 3 of the plant Arabidopsis thaliana” (Nature 408 (6814), 820-822 non-redundant expressed sequence tags from a legume, Lotus (2000)). japonicus). US 7,868,229 B2 Page 8

NCBI accession No. AV632044 (gi:10775364) (Oct. 11, 2000); NCBI accession No. AW625817 (gi:7338844) (Mar. 28, 2000); Van Asamizo, E., et al., “AV632044 Chlamydomonas reinhardtii 5% CO2 der Hoeven, R.S., et al., “EST319724 tomato radicle, 5d post-imbi Chlamydomonas reinhardtii cDNA clone HC003b12 r 5", mRNA bition, Cornell University Lycopersicon esculentum cDNA clone sequence'; (Chlamydomonas reinhardtii) (publication: see DNA cLEZ17A35", mRNA sequence”; (Lycopersicon esculentum). Res. 7 (5), 305-307, 2000, Generation of expressed sequence tags NCBI accession No. AW648378 (gi:7409616) (Apr. 4, 2000); Alcala, from low-CO2 and high-CO2 adapted cells of Chlamydomonas J., et al., “EST326832 tomato germinating seedlings, TAMU reinhardtii). Lycopersicon esculentum cDNA clone cI.EI4I18 5", mRNA NCBI accession No. AV632945 (gi:10776265) (Oct. 11, 2000); sequence', (Lycopersicon esculentum). Asamizo, E., et al., “AV632945 Chlamydomonas reinhardtii 5% CO2 NCBI accession No. AW648379 (gi:7409617) (Apr. 4, 2000); Alcala, Chlamydomonas reinhardtii cDNA clone HC014f12 r 5", mRNA J., et al., “EST326833 tomato germinating seedlings, TAMU sequence'; (Chlamydomonas reinhardtii) (publication: see DNA Lycopersicon esculentum cDNA clone cI.EI4I22 5", mRNA Res. 7 (5), 305-307, 2000, Generation of expressed sequence tags sequence', (Lycopersicon esculentum). from low-CO2 and high-CO2 adapted cells of Chlamydomonas NCBI accession No. AW688588 (gi:7563324) (Apr. 14, 2000); He, reinhardtii). X.-Z. et al., “NFO09C11ST1F1000 Developing stem Medicago NCBI accession No. AW035570 (gi:5894326) (Sep. 15, 1999); truncatula cDNA clone NF009C11ST 5", mRNA sequence': Alcala, J., et al., “EST281308 tomato callus, TAMU Lycopersicon (Medicago truncatula). esculentum cDNA clone cLEC39F2 similar to CAAT-box DNA NCBI accession No. AW719547 (gi:7614059) (Apr. 19, 2000); binding protein subunit B (NF-YB), putative, mRNA sequence': Freund, S., et al., “LNEST6a3r Lotus japonicus nodule library, (Lycopersicon esculentum). mature and immature nodules Lotus corniculatus var. japonicus NCBI accession No. AWO43377 (gi:5903906) (Sep. 18, 1999); cDNA 5", mRNA sequence': (Lotus corniculatus). Whetten, R.W. “ST32F09 Pine TriplEx shoot tip library Pinus taeda NCBI accession No. AW720671 (gi:7615221) (Apr. 19, 2000); cDNA clone ST32F09, mRNA sequence': (Pinus taeda). Colebatch, G., et al., “LNEST6a3rc Lotus japonicus nodule library NCBI accession No. AW132359 (gi:6133966) (Oct. 27, 1999); Shoe 5 and 7 week-old Lotus corniculatus var. japonicus cDNA 5", mRNA maker, R., et al., “se03b02.y1 Gm-c 1013 Glycine max cDNA clone sequence', (Lotus corniculatus). Genome Systems clone ID: Gm-c 1013-2404 5" similar to NCBI accession No. AW733618 (gi:7639292) (Apr. 24, 2000); Shoe SW:CBFA Maize P25209 CCAAT-binding transcription factor maker, R., et al., “sk75h06.y1 Gm-c 1016 Glycine max cDNA clone Subunit A, mRNA sequence', (Glycine max). Genome systems clone ID: Gm-c 1016-9972 5" similar to NCBI accession No. AW200790 (gi:6481519) (Nov. 30, 1999); TR:Q9ZQC3 Q97.QC3 Putative CCAAT-binding transcription fac Shoemaker, R., et al., “se93e11.y1 Gm-c1027 Glycine max cDNA tor, mRNA sequence', (Glycine max). clone Genome Systems clone ID: Gm-c 1027-357 5' similar to NCBI accession No. AW738727 (gi:7647672) (Apr. 25, 2000); Van TR:O23310 O23310 CCAAT-binding transcription factor subunit A: der Hoeven, R.S., et al., “EST340154 tomato flower buds, anthesis, mRNA sequence', (Glycine max). Cornell University Lycopersicon esculentum cDNA clone NCBI accession No. AW201996 (gi:6482782) (Nov. 30, 1999); cTOD8G22 5", mRNA sequence'; (Lycopersicon esculentum). Shoemaker, R., et al., “sfo9g 11.y1 Gm-c1027 Glycine max cDNA clone Genome Systems clone ID: Gm-c 1027-1821 5" similar to NCBI accession No. AW754604 (gi:7676324) (May 1, 2000); TR:O23310 O23310 CCAAT-binding transcription factor subunit A: Whetten, R.W. et al., “PC04B12 Pine Tripl Ex pollen cone library mRNA sequence', (Glycine max). Pinustaeda cDNA clone PC04B12, mRNA sequence': (Pinustaeda). NCBI accession No. AW348.165 (gi:6845875) (Feb. 1, 2000); NCBI accession No. AW756413 (gi:7685765) (May 3, 2000); Shoe Vodkin, L., et al., “GM210001A21D7Gm-r1021 Glycinemax cDNA maker, R., et al., “s 121a12.y1 Gm-c 1036 Glycine max cDNA clone clone Gm-r1021-1583', mRNA sequence'; (Glycine max). Genome systems clone ID: Gm-c 1036-19435" similar to TR:081130 NCBI accession No. AW395227 (gi:6913697) (Feb. 7, 2000); Shoe 081130 CCAAT-box binding factor HAP3 homolog, mRNA maker, R., et al., “sh45e04.y1 Gm-c 1017 Glycine max cDNA clone sequence', (Glycine max). Genome Systems clone ID: Gm-c 1017-4663 5" similar to NCBI accession No. AW760103 (gi:769 1987) (May 4, 2000); Shoe TR:Q9ZQC3 Q97.QC3 Putative CCAAT-binding transcription fac maker, R., et al., “s 158b03.y1 Gm-c 1027 Glycine max cDNA clone tor; mRNA sequence'; (Glycine max). Genome systems clone ID: Gm-c1027-5478.5" similar to TR:O23310 NCBI accession No. AW397727 (gi:69 16197) (Feb. 7, 2000); Shoe O23310 CCAAT-binding transcription factor subunit A, mRNA maker, R., et al., “sg83ff)4.y1 Gm-c1026 Glycine max cDNA clone sequence', (Glycine max). Genome Systems clone ID: Gm-c1026-3445" similar to TR:023634 NCBI accession No. AW775623 (gi:7765436) (May 9, 2000); 023634 transcription factor; mRNA sequence'; (Glycine max). Fedorova, M., et al., “EST334688 DSIL Medicago truncatula cDNA NCBI accession No. AW432980 (gi:6964287) (Feb. 11, 2000); Shoe clone plDSIL-2K6, mRNA sequence'; (Medicago truncatula). maker, R., et al., “si03a01.y1 Gm-c 1029 Glycine max cDNA clone NCBI accession No. AW907348 (gi:807 1558) (May 24, 2000); Van Genome Systems clone ID: Gm-c1029-97 5" similar to TR:081130 der Hoeven, R.S., et al., “EST343471 potato stolon, Cornell Univer 081130 CCAAT-box binding factor HAP3 homolog; mRNA sity Solanum tuberosum cDNA clone cSTA6D11, mRNA sequence': sequence', (Glycine max). (Solanum tuberosum). NCBI accession No. AW459387 (gi:7029546) (Feb. 24, 2000); Shoe NCBI accession No. AW931376 (gi:8106777) (May 30, 2000); maker, R., et al., “sh23ff)3.y1 Gm-c1016 Glycine max cDNA clone Alcala, J., et al., “EST357219 tomato fruit mature green, TAMU Genome Systems clone ID: Gm-c 1016-5622 5" similar to Lycopersicon esculentum cDNA clone cI.EF44N20 5", mRNA TR:Q9ZQC3 Q97.QC3 Putative CCAAT-binding transcription fac sequence', (Lycopersicon esculentum). tor; mRNA sequence'; (Glycine max). NCBI accession No. AW931634 (gi:8107035) (May 30, 2000); NCBI accession No. AW570530 (gi:7235201) (Mar. 13, 2000); Alcala, J., et al., “EST357477 tomato fruit mature green, TAMU Shoemaker, R., et al., “sj63c01.y1 Gm-c1033 Glycine max cDNA Lycopersicon esculentum cDNA clone cI.EF45F24 5", mRNA clone Genome Systems clone ID: Gm-c 1033-1945 5" similar to sequence', (Lycopersicon esculentum). SW:CBFA Maize P25209 CCAAT-binding transcription factor NCBI accession No. AW980494 (gi:8172030) (Jun. 2, 2000); Subunit A, mRNA sequence', (Glycine max). Fedorova, M., et al., “EST391647 GVN Medicago truncatula cDNA NCBI accession No. AW597630 (gi:7285.143) (Mar. 22, 2000); clone pGVN-55A17, mRNA sequence”; (Medicago truncatula). Shoemaker, R., et al., “sj96g06.y1 Gm-c 1023 Glycine max cDNA NCBI accession No. AW981720 (gi:8173288) (Jun. 2, 2000); clone Genome Systems clone ID: Gm-c 1023-2483 5" similar to Whetten, R.W., et al., “PC15H07 pine TriplEx pollen cone library TR:Q9ZQC3 Q97.QC3 Putative CCAAT-binding transcription fac Pinustaeda cDNA clone PC15H07, mRNA sequence':(Pinus taeda). tor; mRNA sequence'; (Glycine max). NCBI accession No. AX180950 (gi:15132741)(Aug.9, 2001); Costa NCBI accession No. AW621652 (gi:7333299) (Mar. 28, 2000); Van e Silva, O.D., et al., “Sequence 1 from Patent WO0145493”; der Hoeven, R.S., et al., “EST312450 tomato root during/after fruit (Physcomitrella patens) (note: the original Submission and latest set, Cornell University Lycopersicon esculentum cDNA clone update are provided for examination) (publication: see WOO 145493 cLEX12N125", mRNA sequence”; (Lycopersicon esculentum). A1: Jun. 28, 2001). US 7,868,229 B2 Page 9

NCBI accession No. AX180957 (gi:15132748) (Aug.9, 2001); Costa NCBI accession No. AX584283 (gi:27655772) (Jan. 11, 2003); e Silva, O.D., et al., “Sequence 8 from Patent WO0145493”; Cahoon, R.E., et al., “Sequence 25 from patent WO02057439”: (Physcomitrella patens) (note: the original Submission and latest (Triticum aestivum) (note: the original Submission and latest update update are provided for examination) (publication: see WO 0145493 are provided for examination) (publication: see WO 02057439-A25; A1: Jun. 28, 2001). Jul. 25, 2002). NCBI accession No. AX288144 (gi:17049846) (Nov. 22, 2001); da NCBI accession No. AX584285 (gi:27655773) (Jan. 11, 2003); Costa Silva, O., et al., “Sequence 15 from Patent WOO177311”: Cahoon, R.E., et al., “Sequence 27 from patent WO02057439”: (Physcomitrella patens) (note: the original Submission and latest (Triticum aestivum) (note: the original Submission and latest update update are provided for examination) (publication: see WO are provided for examination) (publication: see WO 02057439-A27; 0.177311-A15; Oct. 18, 2001). Jul. 25, 2002). NCBI accession No. AX365282 (gii: 18697024) (Feb. 16, 2002); NCBI accession No. AX584287 (gi:27655774) (Jan. 11, 2003); Garnaat, C., et al., “Sequence 18 from Patent WO0206499”; (Zea Cahoon, R.E., et al., “Sequence 29 from patent WO02057439”: mays) (note: the original Submission and latest update are provided (Canna indica) (note: the original Submission and latest update are for examination) (publication: see WO 0206499-A 18; Jan. 24. provided for examination) (publication: see WO 02057439-A29; Jul. 2002). 25, 2002). NCBI accession No. AX584259 (gi:27655760) (Jan. 11, 2003); NCBI accession No. AY 1 12643 (gi:21217233) (May 26, 2002); Cahoon, R.E., et al., Sequence 1 from Patent WO02057439 Gardiner, J., et al., “Zea mays CL691. 1 mRNA sequence”; (Zea (Momordica charantia) (note: the original Submission and latest mays) (note: the original Submission and latest update are provided update are provided for examination) (publication: see WO for examination) (publication: see Plant Physiol. 134 (4), 1317-1326 02057439-A 1: Jul 25, 2002). 2004, Anchoring 9,371 maize expressed sequence tagged unigenes to NCBI accession No. AX584261 (gi:27655761) (Jan. 11, 2003); the bacterial artificial chromosome contig map by two-dimensional Cahoon, R.E., et al., “Sequence 3 from patent WO2057439”; (Euca overgo hybridization). lyptus grandis) (note: the original Submission and latest update are NCBI accession No. BAB64189 (gi: 15408793) (Aug. 31, 2001); provided for examination) (publication: see WO 02057439-A3; Jul. Sasaki, T. et al. “PO492G09.10” (Oryza sativa) (note: the original 25, 2002). Submission and latest update are provided for examination) (publi NCBI accession No. AX584263 (gi:27655762) (Jan. 11, 2003); cation: see Nature 420 (6913), 312-316, 2002. The genome sequence Cahoon, R.E., et al., “Sequence 5 from Patent WO02057439”; (Zea and structure of rice chromosome 1). mays) (note: the original Submission and latest update are provided NCBI accession No. BAB64190 (gi: 15408794) (Aug. 31, 2001); for examination) (publication: see WO 02057439-A5; Jul. 25, 2002). Sasaki, T., et al. “Putative HAP3-like transcriptional-activator' NCBI accession No. AX584265 (gi:27655763) (Jan. 11, 2003); (Oryza sativa) (note: the original Submission and latest update are Cahoon, R.E., et al., “Sequence 7 from Patent WO02057439”; (Zea provided for examination) (publication: see Nature 420 (6913), 312 mays) (note: the original Submission and latest update are provided 316, 2002. The genome sequence and structure of rice chromosome for examination) (publication: see WO 02057439-A7; Jul. 25, 2002). 1). NCBI accession No. AX584267 (gi:27655764) (Jan. 11, 2003); NCBI accession No. BAB89732 (gi:20 160792) (Apr. 16, 2002); Cahoon, R.E., et al., “Sequence 9 from patent WO02057439”; (Oryza Sasaki, et al. “Putative CAAT-box DNA binding protein' (Oryza sativa) (note: the original Submission and latest update are provided sativa) (note: the original Submission and latest update are provided for examination) (publication: see WO 02057439-A9; Jul. 25, 2002). for examination) (publication: see Nature 420 (6913),312-316, 2002, NCBI accession No. AX584269 (gi:27655765) (Jan. 11, 2003); The genome sequence and structure of rice chromosome 1). Cahoon, R.E., et al., “Sequence 11 from Patent WO02057439”: NCBI accession No. BAB92931 (gi:20805265) (May 15, 2002); (Oryza sativa) (note: the original Submission and latest update are Sasaki, T., et al. “Putative CAAT-box DNA binding protein' (Oryza provided for examination) (publication: see WOO2057439-A 11; Jul. sativa) (note: the original Submission and latest update are provided 25, 2002). for examination) (publication: see Nature 420 (6913),312-316, 2002, NCBI accession No. AX584271 (gi:27655766) (Jan. 11, 2003); The genome sequence and structure of rice chromosome 1). Cahoon, R.E. et al., “Sequence 13 from Patent WO02057439”: NCBI accession No. BAB93257 (gi:21 104666) (May 22, 2002); (Glycine max) (note: the original Submission and latest update are Sasaki, T., et al. “P0423A12.29” (Oryza sativa) (note: the original provided for examination) (publication: see WOO2057439-A 13; Jul. Submission and latest update are provided for examination) (publi 25, 2002). cation: see Nature 420 (6913), 312-316, 2002. The genome sequence NCBI accession No. AX584273 (gi:27655767) (Jan. 11, 2003); and structure of rice chromosome 1). Cahoon, R.E., et al. “Sequence 15 from Patent WO02057439”: NCBI accession No. BAB93258 (gi:21 104667) (May 22, 2002); (Glycine max) (note: the original Submission and latest update are Sasaki, T., et al. “Putative HAP3-like transcriptional-activator': provided for examination) (publication: see WOO2057439-A 15; Jul. (Oryza sativa) (note: the original Submission and latest update are 25, 2002). provided for examination) (publication: see Nature 420 (6913), 312 NCBI accession No. AX584275 (gi:27655768) (Jan. 11, 2003); 316, 2002. The genome sequence and structure of rice chromosome Cahoon, R.E., et al., “Sequence 17 from Patent WO02057439”: 1). (Glycine max) (note: the original Submission and latest update are NCBI accession No. BAC76331 (gi:30409459) (May 6, 2003); provided for examination) (publication: see WOO2057439-A 17; Jul. Miyoshi, K., et al., “NF-YBOryza sativa (japonica cultivar-group)” 25, 2002). (note: the original Submission and latest update are provided for NCBI accession No. AX584277 (gi:27655769) (Jan. 11, 2003); examination) (publication: see Plant J. 36,532-540, 2003, OshAP3 Cahoon, R.E., et al., “Sequence 19 from Patent WO02057439”: genes regulate chloroplast biogenesis in rice). (Glycine max) (note: the original Submission and latest update are NCBI accession No. BAC76332 (gi:30409461) (May 6, 2003); provided for examination) (publication: see WOO2057439-A 19; Jul. Miyoshi, K., et al., “NF-YB Oryza sativa (japonica cultivar 25, 2002). group); (note: the original Submission and latest update are pro NCBI accession No. AX584279 (gi:27655770) (Jan. 11, 2003); vided for examination) (publication: see Plant J. 36,532-540, 2003, Cahoon, R.E., et al., “Sequence 21 from patent WO02057439”: OSHAP3 genes regulate chloroplast biogenesis in rice). (Glycine max) (note: the original Submission and latest update are NCBI accession No. BAC76333 (gi:30409463) (May 6, 2003); provided for examination) (publication: see WOO2057439-A21; Jul. Miyoshi, K., et al., “NF-YB Oryza sativa (japonica cultivar 25, 2002). group); (note: the original Submission and latest update are pro NCBI accession No. AX584281 (gi:27655771) (Jan. 11, 2003); vided for examination) (publication: see Plant J. 36,532-540, 2003, Cahoon, R.E., et al., “Sequence 23 from patent WO02057439”: OSHAP3 genes regulate chloroplast biogenesis in rice). (Triticum aestivum) (note: the original Submission and latest update NCBI accession No. BE021941 (gi:8284382) (Jun. 6, 2000); Shoe are provided for examination) (publication: see WO 02057439-A23; maker, R., et al., “smó4d05.y1 Gm-c 1028 Glycinemax cDNA clone Jul. 25, 2002). Genome systems clone ID: Gm-c 1028-8674 5" similar to US 7,868,229 B2 Page 10

TT:Q9ZQC3 Q97.QC3 Putative CCAAT-binding transcription fac 138 (3), 1734-1745, 2005, Profile and analysis of gene expression tor; mRNA sequence'; (Glycine max). changes during early development in germinating spores of NCBI accession No. BE054369 (gi: 13243855) (Jun. 8, 2000); Wing, Ceratopteris richardii). R.A., et al., “GA EaO002 A05f Gossypium arboreum 7-10 dpa fiber NCBI accession No. BE726750 (gi:10 127934) (Sep. 14, 2000); library Gossypium arboreum cDNA clone GA Ea?)002 A05f. Grossman, A., et al., "894093C12.y3 C. Reinhardtii CC-1690, nor mRNA sequence', (Gossypium arboreum) (note: two versions are malized, Lambda Zap II Chlamydomonas reinhardtii cDNA, mRNA presented for review, gi: 13243855 submitted Jun. 8, 2000; and, sequence'; (Chlamydomonas reinhardtii). gi:8381425 submitted on Jun. 8, 2000). NCBI accession No. BE802539 (gi:10233651) (Sep. 20, 2000); NCBI accession No. BE060015 (gi:8404381) (Jun. 9, 2000); Shoe Shoemaker, R. et al., “sr32fO2.y1 Gm-c1050 Glycine max cDNA maker, R., et al., "sn39h 06.y1 Gm-c 1027 Glycine max cDNA clone clone Genome systems clone ID: Gm-c 1050-2068 5" similar to SW: Genome systems clone ID: Gm-c1027-96845" similar to TR:023634 CBFA Maize P25209 CCAAT-binding transcription factor subunit 023634 transcription factor; mRNA sequence'; (Glycine max). A, mRNA sequence', (Glycine max). NCBI accession No. BE121888 (gi:8513993) (Jun. 13, 2000); Gross NCBI accession No. BE803572 (gi:10234684) (Sep. 20, 2000); man, A., et al., "894.015G05.y1 C. reinhardtii cc-1690, normalized, Shoemaker, R. et al., “sró0el 1.y1 Gm-c 1052 Glycine max cDNA Lambda ZapII Chlamydomonas reinhardtii cDNA, mRNA clone Genome systems clone ID: Gm-c1052-165 5" similar to sequence'; (Chlamydomonas reinhardtii). TR:Q9ZQC3 Q97.QC3 Putative CCAAT-binding transcription fac NCBI accession No. BE196056 (gi: 13188584) (Jun. 26, 2000); tor; mRNA sequence'; (Glycine max). Wing, R. et al., “HVSMEh0091D23f Hordeum vulgare 5-45 DAP NCBI accession No. BE804236 (gi:10235348) (Sep. 20, 2000); spike EST library HVcDNAO009 (5 to 45 DAP) Hordeum vulgare Shoemaker, R. et al., “sr77b04.y1 Gm-c 1052 Glycine max cDNA subsp. Vulgare cDNA clone HVSMEh0091D23f, mRNA sequence': clone Genome systems clone ID: Gm-c 1052-1736 5" similar to (Hordeum vulgare) ) (note: two versions are presented for review, SW:CBFA Maize P25209 CCAAT-binding transcription factor gi: 13188584 submitted Jun. 26, 2000; and, gi:8708251 submitted Subunit A, mRNA sequence', (Glycine max). Jun. 26, 2000). NCBI accession No. BF065056 (gi: 1084.1695) (Oct. 17, 2000); NCBI accession No. BE210041 (gi:8826311) (Jun. 29, 2000); Shoe Wing, R., et al., “HV CEbO22M01f Hordeum vulgare seedling maker, R., et al., "so38b01.y1 Gm-c 1039 Glycine max cDNA clone green leaf EST library HvcDNA 0005 (Blumeria challenged) Genome systems clone ID: Gm-c 1039-1945" similar to TR:Q9ZQC3 Hordeum vulgare subsp. Vulgare cDNA clone HV CEb00212M01f. Q97.QC3 putative CCAAT-binding transcription factor; mRNA mRNA sequence', (Hordeum vulgare). sequence', (Glycine max). NCBI accession No. BF071234 (gi: 10845982) (Oct. 17, 2000); NCBI accession No. BE356560 (gi:9298117) (Jul. 20, 2000); Shoemaker, R. et al., “st06hO5.y1 Gm-c 1065 Glycine max cDNA Cordonnier-Pratt, M., et al. “DG1 126 D05.b1 AO02 Dark clone Genome systems clone ID: Gm-c1065-562 5" similar to Grown 1 (DG 1) Sorghum bicolor cDNA, mRNA sequence': (Sor TR:Q9ZQC3 Q97.QC3 Putative CCAAT-binding transcription fac ghum bicolor). tor; mRNA sequence'; (Glycine max). NCBI accession No. BE413647 (gi:941 1493) (Jul 24, 2000); Ander NCBI accession No. BF 169598 (gi: 11054215) (Oct. 30, 2000); son, O.A., et al., “SCU001.E10.R990714 ITEC SCU. Wheat Sederoff, R., “NXCI 125 B04 FNXCI (NsfXylem Compression Endosperm Library Triticum aestivum cDNA clone SCU001.E.10, wood Inclined) Pinus taeda cDNA clone NXCI 125 B045" similar mRNA sequence': (Triticum aestivum). to Arabidopsis thaliana sequence At2g37060 putative CCAAT-box NCBI accession No. BE418716 (gi:9416562) (Jul 24, 2000); Ander binding transcription factor see http://mips.gsfdef pro?thal/db/index. son, O.A., et al., “SCL074.B01 R990724 ITEC SCL Wheat Leaf html, mRNA sequence': (Pinus taeda). Library Triticum aestivum cDNA clone SCLO74.B01, mRNA NCBI accession No. BF263449 (gii: 13260832) (Nov. 17, 2000); sequence'. (Triticum aestivum). Wing, R. et al., “HV CEaO006M10f Hordeum vulgare seedling NCBI accession No. BE441 135 (gi:944.0635) (Jul 25, 2000); Alcala, green leaf EST library HvcDNA 0004 (Blumeria challenged) J., et al., “EST408405 tomato developing/immature green fruit Hordeum vulgare subsp. Vulgare cDNA clone HV CEa(0.006M10f. Lycopersicon esculentum cDNA clone cI.EM6C23 similar to Zea mRNA sequence', (Hordeum Vulgare) (note: two versions are pre mays CAAT-box DNA binding protein subunit B, mRNA sequence': sented for review, gi: 13260832 Submitted Nov. 17, 2000; and, (Lycopersicon esculentum). gi:11 194443 submitted Nov. 17, 2000). NCBI accession No. BE441739 (gi:9441376) (Jul 25, 2000); Gross NCBI accession No. BF263455 (gii: 13260837) (Nov. 17, 2000); man, A., et al., '925.009 A11.xl C. Reinhardtii CC-2290, normalized, Wing, R. et al., “HV CEaO006M16f Hordeum vulgare seedling Lambda Zap II Chlamydomonas reinhardtii cDNA, mRNA green leaf EST library HvcDNA 0004 (Blumeria challenged) sequence'; (Chlamydomonas reinhardtii). Hordeum vulgare subsp. Vulgare cDNA clone HV CEa(0.006M16f. NCBI accession No. BE496857 (gi:9695474) (Aug. 4, 2000); Ander mRNA sequence', (Hordeum Vulgare) (note: two versions are pre son, O.D. et al., “WHE0761. D09 H17ZS Wheat heat-stressed sented for review, gi: 13260837 submitted on Nov. 17, 2000; and seedling cDNA library Triticum aestivum cDNA clone gi:16334312 submitted Oct. 23, 2001). WHE0761 D09 H17, mRNA sequence”; (Triticum aestivum). NCBI accession No. BF270164 (gii: 13247369) (Nov. 17, 2000); NCBI accession No. BE516510 (gi:9740538) (Aug. 8, 2000); Ander Wing, R., et al., “GA Eb0007A21f Gossypium arboreum 7-10 dpa son, O.D., et al., “WHE611 D10 H19ZA Wheat ABA-treated fiber library Gossypium arboreum cDNA clone GA Eb0007A21f. embryo cDNA library Triticum aestivum cDNA clone mRNA sequence', (Gossypium arboreum) (note: two versions are WHE611 D10 H19, mRNA sequence': (Triticum aestivum). presented for review, gi: 13247369 submitted Nov. 17, 2000; and, NCBI accession No. BE603222 (gi: 13191083) (Aug. 21, 2000); gi:1120 1159 submitted Nov. 17, 2000). Wing, R., et al., “HVSMEh0102J16f Hordeum vulgare 5-45 DAP NCBI accession No. BF270944 (gi:1120 1939) (Nov. 17, 2000); spike EST library HVcDNAO009 (5 to 4 DAP) Hordeum vulgare Wing, R. et al., “GA Eb0010B11f Gossypium arboreum 7-10 dpa subsp. Vulgare cDNA clone HVSMEhQ102J16f mRNA sequence': fiber library Gossypium arboreum cDNA clone GA Eb0010B11f. (Hordeum vulgare) ) (note: two versions are presented for review, mRNA sequence', (Gossypium arboreum). gi: 13191083 submitted Aug. 21, 2000; and, gi:9860783 submitted NCBI accession No. BF291752 (gi:11222816) (Nov. 17, 2000); Aug. 21, 2000). Akhunov, E., et al., “WHE2205 F04 K07ZS Aegilops speltoides NCBI accession No. BE604847 (gi:9862117) (Aug. 21, 2000); anther cDNA library Aegilops speltoides cDNA clone Anderson, O.D., et al., “WHE 1713-1716 D19 D19ZS Wheat heat WHE2205 F04 K07, mRNA sequence”; (Aegilops speltoides). stressed spike cDNA library Triticum aestivum cDNA clone NCBI accession No. BF459554 (gi:1 1528.732) (Dec. 4, 2000); WHE 1713-1716 D19 D19, mRNA sequence”; (Triticum Crookshanks, M., et al., “061 A04 Mature tuber lambda ZAP aestivum). Solanum tuberosum cDNA 5" similar to (AL 132966) transcription NCBI accession No. BE641 101 (gi:9958761) (Sep. 1, 2000); Salmi, factor NF-Y. CCAAT-binding . . . emb CAB7641.1, mRNA M.L., et al., “Cri2 2 E11 SP6 Ceratopteris Spore Library sequence': (Solanum tuberosum) (publication: see FEBS Lett. 506 Ceratopteris richardii cDNA clone Cri2 2 E11 5", mRNA (2), 123-126, 2001. The potato tuber transcriptome: analysis of 6077 sequence”; (Ceratopteris richardii) (publication: see Plant Physiol. expressed sequence tags). US 7,868,229 B2 Page 11

NCBI accession No. BF460267 (gi: 11529424) (Dec. 4, 2000); NCBI accession No. BG314203 (gi: 13116006) (Feb. 23, 2001); Crookshanks, M., et al., “073E08 Mature tuber lambda ZAP Anderson, O.D., et al., “WHE2460 E 10 120ZS Triticum Solanum tuberosum cDNA 5" similar to CCAAT-binding transcrip monococcum early reproductive apex cDNA library Triticum tion factor subunit . . . sp. P25209, mRNA sequence': (Solanum monococcum cDNA clone WHE2460 E10 I20, mRNA tuberosum) (publication: see FEBS Lett. 506(2), 123-126, 2001. The sequence'. (Triticum monococcum). potato tuber transcriptome: analysis of 6077 expressed sequence NCBI accession No. BG318871 (gi: 13128301) (Feb. 26, 2001); tags). Sederoff, R., “NXPV 020 H08 F NXPV (Nsf Xylem Planings NCBI accession No. BF517889 (gi: 11606021) (Dec. 8, 2000); Wood Vertical) Pinustaeda cDNA clone NXPV 020 H08.5" similar Sederoff, R., “NXSI 029 D01 F NXSI (Nsf Xylem Side wood to Arabidopsis thaliana sequence Atág 14540 CCAAT-binding tran Inclined) Pinus taeda cDNA clone NXSI 029 D01 5" similar to scription factor subunit A (CBF-A) see http://mips.gsfde?projthal/ Arabidopsis thaliana sequence At2g37060 putative CCATT-box db/index.html, mRNA sequence': (Pinus taeda). binding transcription factor see http://mips.gsfdef pro?thal/db/index. NCBI accession No. BG350430 (gi: 13179172) (Mar. 1, 2001); html, mRNA sequence': (Pinus taeda). Crookshanks, M., et al., “O91D09 mature tuber lambda ZAP NCBI accession No. BF585526 (gi: 11677850) (Dec. 12, 2000); Solanum tuberosum cDNA, mRNA sequence'. (Solanum Cordonnier-Pratt, M., et al., “FM1 23 E09.g1 AO03 Floral-In tuberosum) (publication: see FEBS Lett. 506(2), 123-126, 2001. The duced Meristem 1 (FM1) Sorghum propinquum cDNA, mRNA potato tuber transcriptome: analysis of 6077 expressed sequence sequence'. (Sorghum propinquum). tags). NCBI accession No. BF585616 (gi: 11677940) (Dec. 12, 2000); NCBI accession No. BG350792 (gi: 13179534) (Mar. 1, 2001); Cordonnier-Pratt, M., et al., “FM1 23 E09.b1 AO03 Floral-In Crookshanks, M., et al., “O98CO7 mature tuberlambda ZAP Solanum duced Meristem 1 (FM1) Sorghum propinquum cDNA, mRNA tuberosum cDNA, mRNA sequence': (Solanum tuberosum) (publi sequence'. (Sorghum propinquum). cation: see FEBS Lett. 506 (2), 123-126, 2001, The potato tuber NCBI accession No. BF595304 (gi: 11687628) (Dec. 12, 2000); transcriptome: analysis of 6077 expressed sequence tags). Shoemaker, R., et al., “su76f03.y1 Gm-c 1055 Glycine max cDNA NCBI accession No. BG362898 (gii: 1325 1995) (Mar. 8, 2001); Shoe clone Genome Systems clone ID: gm-c 1055-653 5" similar to maker, R., et al., “sac13e07.y1 Gm-c 1040 glycine max cDNA clone TR:081130 081130 CCAAT-Box binding factor HAP3 homolog: Genome systems clone ID: Gm-c 1040-4453.5" similar to TR:023633 mRNA sequence', (Glycine max). 023633 Transcription factor; mRNA sequence”; (Glycine max). NCBI accession No. BF597252 (gi: 11689576) (Dec. 12, 2000); NCBI accession No. BG363233 (gii: 13252330) (Mar. 8, 2001); Shoe Shoemaker, R., et al., “su96co6.y1 Gm-c1056 Glycine soja cDNA maker, R., et al., “sac11h11.y1 Gm-c1040 Glycinemax cDNA clone clone Genome Systems clone ID: gm-c 1056-131 5" similar to Genome Systems clone ID: Gm-c 1040-4581 5" similar to TR: TR:Q97OC3 Q97.QC3 Putative CCAAT-Box binding transcription Q97.QC3 Q97.QC3 Putative CCAAT-binding transcription factor; factor; mRNA sequence', (Glycine Soja). mRNA sequence', (Glycine max). NCBI accession No. BF636140 (gi:1 1900298) (Dec. 19, 2000); Tor NCBI accession No. BG368375 (gi: 13257476) (Mar. 8, 2001); Wing, rez-Jerez, I., et al., “NFO60HO9DT1F1079 Drought Medicago R., et al., “HVSMEi0018C01f Hordeum vulgare 20 DAP spike EST truncaatula cDNA clone NF060HO9DT 5", mRNA sequence': library HvcDNA0010 (20 DAP) Hordeum vulgare subsp. vulgare (Medicago truncatula). cDNA clone HVSMEi0018C01f, mRNA sequence”; (Hordeum NCBI accession No. BF645376 (gi: 11910505) (Dec. 20, 2000); vulgare)) (note: two versions are presented for review, gi: 13257476 Torres-Jerez, I., et al., “NFO40B5EC1F1044 Elicted cell culture submitted Mar. 8, 2001; and, gi:16325230 submitted Oct. 22, 2001). Medicago truncatula cDNA clone NF040B05EC 5", mRNA NCBI accession No. BG440251 (gii: 13349902) (Mar. 15, 2001); sequence'; (Medicago truncatula). Wing, R.A., et al., “GA Ea?)06K20f Gossypium arboreum 7-10 dpa NCBI accession No. BF65.1151 (gi: 11916281) (Dec. 20, 2000); fiber library Gossypium arboreum cDNA clone GA Ea?)006K20f. Torres-Jerez, I., et al., “NF101H1OEC1F1090 Elicted cell culture mRNA sequence', (Gossypium arboreum). Medicago truncatula cDNA clone NF101H10EC 5", mRNA sequence'; (Medicago truncatula). NCBI accession No. BG445358 (gii: 13355010) (Mar. 15, 2001); NCBI accession No. BF715909 (gii: 12015181) (Jan. 2, 2001); Shoe Wing, R.A., et al., “GA Ea?)027N18f Gossypium arboreum 7-10 maker, R., et al., “saal le(08.y1 Gm-c 1058 Glycine soja cDNA clone dpa fiber library Gossypium arboreum cDNA clone Genome systems clone ID: Gm-c 1058-999.5" similar to TR:O23310 GA Ea?)027N18f mRNA sequence'; (Gossypium arboreum). O23310 CCAAT-binding transcription factor subunit A, mRNA NCBI accession No. BG526135 (gi: 16949604) (Nov. 16, 2001); sequence', (Glycine Soja). Brandle, J.E. et al., “57-6 Stevia field grown leaf cDNA Stevia NCBI accession No. BF777951 (gii: 12125851) (Jan. 12, 2001); rebaudiana cDNA 5", mRNA sequence'. (Stevia rebaudiana). Sederoff, R., “NXSI 079 C03 F NXSI (Nsf Xylem Side wood NCBI accession No. BG551755 (gi: 13563535) (Apr. 9, 2001); Shoe Inclined) Pinus taeda cDNA clone NXSI 079 C03 5' similar to maker, R., "sad42f11.y1 Gm-c 1075 Glycine max cDNA clone Arabidopsis thaliana sequence Atag14540 CCAAT-binding tran Genome systems clone ID: Gm-c1075-6695' similar to TR:081130 scription factor subunit A (CBF-A) see http://mips.gsfde?projthal/ 081130 CCAAT-Box binding factor HAP3 homolog; mRNA db/index.html, mRNA sequence': (Pinus taeda). sequence', (Glycine max). NCBI accession No. BG039303 (gi: 1248.1888) (Jan. 24, 2001); NCBI accession No. BG589029 (gii: 13607169) (Apr. 12, 2001); Har Sederoff, R., “NXSI 097 E11 F NXSI (Nsf Xylem Side Wood rison, M.J., et al., “EST490838 MHRP-Medicago truncatula cDNA Inclined) Pinus taeda cDNA clone NXSI 097 E11 5" similar to clone pMHRP-59024, mRNA sequence”; (Medicago truncatula). Arabidopsis thaliana sequence At2g37060 putative CCAAT-box NCBI accession No. BG594268 (gi: 13612408) (Apr. 12, 2001); Van binding transcription factor see http://mips.gsfdef proj?thal/db/ der Hoeven, R., et al., “EST49294.6 cSTS Solanum tuberosum cDNA index.html, mRNA sequence': (Pinus taeda). clone cSTS7A17 5" sequence, mRNA sequence'. (Solanum NCBI accession No. BG135204 (gi: 12635392) (Jan. 31, 2001); Van tuberosum). der Hoeven, R., et al., “EST468096 tomato crown gall Lycopersicon NCBI accession No. BG599785 (gi: 13616921) (Apr. 12, 2001); Van esculentum cDNA clone cTOE21D12 5" sequence, mRNA der Hoeven, R., et al., “EST504680 cSTS Solanum tuberosum cDNA sequence', (Lycopersicon esculentum). clone cSTS26G12 5" sequence, mRNA sequence"; (Solanum NCBI accession No. BG263362 (gi: 12865444) (Feb. 16, 2001); tuberosum). Anderson, O.D., et al., “WHE2341 B02 C03ZS Wheat pre NCBI accession No. BG642751 (gi: 13777673) (Apr. 24, 2001); Van anthesis spike cDNA library Triticum aestivum cdNA clone der Hoeven, R., et al., “EST510945 tomato shoot/meristem WHE2341 B02 C03, mRNA sequence': NCBI accession No. Lycopersicon esculentum cDNA clone cTOF25F23 5" sequence, BG263362 (gi: 12865444) (Triticum aestivum). mRNA sequence', (Lycopersicon esculentum). NCBI accession No. BG274786 (gi: 13067446) (Feb. 21, 2001); NCBI accession No. BG644353 (gii: 13779465) (Apr. 24, 2001); Akhunov, E., et al., “WHE2234 C03 E06ZS Aegilops speltoides VandenBosch, K., et al., “EST505972 KV3 Medicago truncatula anther cDNA library Aegilops speltoides cDNA clone cDNA clone pKV3-37C23 5' end, mRNA sequence”; (Medicago WHE2234 C03 E06, mRNA sequence”; (Aegilops speltoides). truncatula). US 7,868,229 B2 Page 12

NCBI accession No. BG662094 (gii: 13884016) (Apr. 30, 2001); NCBI accession No. BI311277 (gi: 14985604) (Jul. 20, 2001); Poulsen, C., et al., “Ljirnpest38-110-g8 Ljirnp Lambda HybriZap Grusak, M.A., et al., “EST5313027 GESD Medicago truncatula two-hybrid library Lotus corniculatus var. japonicus cDNA clone cDNA clone pGESD10M10 5' end, mRNA sequence': (Medicago LP110-38-g8 5" similar to homolog of transcription factor NF-Y. truncatula). CCAAT-binding-like protein, mRNA sequence': (Lotus NCBI accession No. BI316766 (gi:14991093) (Jul 20, 2001); Shoe corniculatus). maker, R., et al., "saf73a12.y1 Gm-c 1078 Glycinemax cDNA clone NCBI accession No. BG832836 (gi:14189478) (May 22, 2001); Genome systems clone ID: Gm-c 1078-1584 5" similar to TR: Sederoff, R., “NXPV 081 C10 F NXPV (Nsf Xylem Planings Q97.QC3 Q9ZQC3 Putative CCAAT-Binding transcription factor; wood Vertical) Pinus taeda cDNA clone NXPV 081 C105" similar mRNA sequence', (Glycine max). to Arabidopsis thaliana sequence Atág 14540 CCAAT-binding tran NCBI accession No. BI406257 (gi:15185671) (Aug. 14, 2001); scription factor subunit A (CBF-A) see http://mips.gsfde?projthal/ Crookshanks, M., et al., “158C 12 Mature tuber lambda ZAP db/index.html, mRNA sequence': (Pinus taeda). Solanum tubersum cDNA, mRNA sequence'. (Solanum tuberosum) NCBI accession No. BG846124 (gi: 14227308) (May 29, 2001); (publication: see FEBSLett. 506(2), 123-126, 2001. The potato tuber Grossman, A., et al., "1024012C11.y1 C. Reinhardtii CC-1690, nor transcriptome: analysis of 6077 expressed sequence tags). malized, Lambda ZAPIIChlamydomonas reinhardtii cDNA, mRNA sequence'; (Chlamydomonas reinhardtii). NCBI accession No. BI419749 (gi:15190772) (Aug. 15, 2001); NCBI accession No. BG847452 (gi: 14228.636) (May 29, 2001); Colebatch, G. et al., “LNEST14e12r Lotus japonicus nodule library Grossman, A., et al., "1024017D03.y1 C. Reinhardtii CC-1690, nor 5 and 7 week-old Lotus corniculatus var. japonicus cDNA 5", mRNA malized, Lambda ZAPIIChlamydomonas reinhardtii cDNA, mRNA sequence', (Lotus corniculatus var. japonicus). sequence'; (Chlamydomonas reinhardtii). NCBI accession No. BI423967 (gi:15199704) (Aug. 16, 2001); NCBI accession No. BG850688 (gi: 14231872) (May 29, 2001); Shoemaker, R., et al., "sahó4c11.y1 Gm-c 1049 Glycine max cDNA Grossman, A., et al., "1024029A11.y1 C. Reinhardtii CC-1690, nor clone Genome systems clone ID: Gm-c 1049-3189 5" similar to malized, Lambda ZAPIIChlamydomonas reinhardtii cDNA, mRNA TR:O23310 O23310 CCAAT-binding transcription factor subunit A: sequence'; (Chlamydomonas reinhardtii). mRNA sequence', (Glycine max). NCBI accession No. BG850689 (gi: 14231873) (May 29, 2001); NCBI accession No. BI469382 (gi:15285491) (Aug. 24, 2001); Grossman, A., et al., "1024029A11.y2 C. Reinhardtii CC-1690, nor Shoemaker, R., et al., "sail 1b10.y1 Gm-c 1053 Glycine max cDNA malized, Lambda ZAPIIChlamydomonas reinhardtii cDNA, mRNA clone Genome systems clone ID: Gm-c 1053-2779 5" similar to sequence'; (Chlamydomonas reinhardtii). TR:O23310O23310 CCAAT-Binding transcription factor subunit A: NCBI accession No. BG857007 (gi: 14238.191) (May 29, 2001); mRNA sequence', (Glycine max). Grossman, A., et al., "1024049D01.y1 C. Reinhardtii CC-1690, nor NCBI accession No. BI480208 (gi:15315976) (Aug. 27, 2001); malized, Lambda ZAPIIChlamydomonas reinhardtii cDNA, mRNA Akhunov, E., et al., “WHE2403 H07 P13ZS Wheat 3-6 DAP seed sequence'; (Chlamydomonas reinhardtii). cDNA library Triticum aestivum cDNA clone NCBI accession No. BG858372 (gi: 14239556) (May 29, 2001); WHE2403 H07 P13, mRNA sequence”; (Triticum aestivum). Grossman, A., et al., “1024057C11.y1 C. Reinhardtii CC-1690, nor NCBI accession No. BI531782 (gi:15372356) (Aug. 29, 2001); malized, Lambda ZAPIIChlamydomonas reinhardtii cDNA, mRNA Grossman, A., et al., "1024116E03.y1 C. Reinhardtii CC-1690, nor sequence'; (Chlamydomonas reinhardtii). malized, Lambda ZAPIIChlamydomonas reinhardtii cDNA, mRNA NCBI accession No. BG890447 (gii: 14267556) (May 30, 2001); Van sequence'; (Chlamydomonas reinhardtii). der Hoeven, R., et al., “EST516298 cSTD Solanum tuberosum cDNA clone cSTD 18M6 5' sequence, mRNA sequence'. (Solanum NCBI accession No. BI531808 (gi:15372382) (Aug. 29, 2001); tuberosum). Grossman, A., et al., "1024116G03.y1 C. Reinhardtii cc-1690, nor NCBI accession No. BH966788 (gi:23448014) (Oct. 1, 2002); malized, Lambda Zap II Chlamydomonas reinhardtii cDNA, mRNA Delehaunty, K., et al., “odi26h 12,b1 B.oleracea(002 Brassicaoleracea sequence'; (Chlamydomonas reinhardtii). genomic, genomic Survey sequence', (Brassica oleracea). NCBI accession No. BIT 18232 (gi:15693927) (Sep. 19, 2001); NCBI accession No. BI129814 (gi:18.013785) (Dec. 31, 2001); Grossman, A., et al., "1031024F10.y1 C. Reinhardtii cc-1690, stress Hertzberg, M., et al., “G095P88Y Populus cambium cDNA library II (normalized), Lambda Zap II Chlamydomonas reinhardtii cDNA, Populus tremula X Populus tremuloides cDNA, mRNA sequence': mRNA sequence'; (Chlamydomonas reinhardtii). (Populus tremula X Populus tremuloides). NCBI accession No. BIT 19728 (gi:15695423) (Sep. 19, 2001); NCBI accession No. BI176409 (gi:14642220) (Jul. 9, 2001); Grossman, A., et al., “1031045D08.y1 C. Reinhardtiicc-1690, stress Restrepo, S, et al., “EST521199P Infestans-challenged potato leaf, II (normalized), Lambda Zap II Chlamydomonas reinhardtii cDNA, compatible reaction Solanum tuberosum cDNA clone PPCAC88 5' mRNA sequence'; (Chlamydomonas reinhardtii). sequence similar to CCAAT-box binding transcription factor NCBI accession No. BI875221 (gi:16073225) (Oct. 11, 2001); gene id:MNJ7.26 (Arabdiopsis thaliana), mRNA sequence': Grossman, A., et al., “963122O10,y1 C. Reinhardtii cc-1690, stress (Solanum tuberosum). condition I, normalized, Lambda Zap II Chlamydomonas reinhardtii NCBI accession No. BI206716 (gi: 14684440) (Jul 11, 2001); Van cDNA, mRNA sequence”; (Chlamydomonas reinhardtii). der Hoeven, R., et al., “EST524756 cTOS Lycopersicon esculentum NCBI accession No. BI952722(gi: 16296768) (Oct. 19, 2001); Wing, cDNA clone cTOS11H10 5' end, mRNA sequence”; (Lycopersicon R. et al., “HVSMEm0007I19f Hordeum vulgaregreen seedling EST esculentum). library HvcDNA0014 (Blumeria infected) Hordeum vulgare subsp. NCBI accession No. BI207873 (gi: 14685597) (Jul 11, 2001); Van Vulgare cDNA clone HVSMEm0007I19f, mRNA sequence': der Hoeven, R., et al., “EST525913 cTOS Lycopersicon esculentum (Hordeum vulgare). cDNA clone cTOS15I16 5' end, mRNA sequence”; (Lycopersicon esculentum). NCBI accession No. BI953657 (gi: 16298505) (Oct. 19, 2001); Wing, NCBI accession No. BI268123 (gii: 14873755) (Jul.18, 2001); Korth, R., et al., “HVSMEm()013M03f Hordeum vulgare green seedling K., et al., “NF116D11 IN1F1094 Insect herbivory Medicago EST library HvcDNA0014 (Blumeria infected) Hordeum vulgare truncatula cDNA clone NF116D11 IN 5", mRNA sequence': subsp. Vulgare cDNA clone HVSMEm()013M03f, mRNA (Medicago truncatula). sequence', (Hordeum vulgare). NCBI accession No. BI271802 (gii: 14880590) (Jul. 18, 2001); NCBI accession No. BI967397 (gi:1634 1802) (Oct. 23, 2001); Torres-Jerez, I., et al., “NFO13D06FL1F1057 Developing flower Vodkin, L., et al., “GM830001B20E03 Gm-r1083 Glycine max Medicago truncatula cDNA clone NF013D06FL 5", mRNA cDNA clone Gm-r1083-222 3', mRNA sequence”; (Glycine max). sequence'; (Medicago truncatula). NCBI accession No. BI972318(gii: 16346723) (Oct. 23, 2001); Shoe NCBI accession No. BI309 186 (gii: 14983513) (Jul. 20, 2001); maker, R., et al., "sag90a01.y1 Gm-c1084 Glycinemax cDNA clone Grusak, M.A., et al., “EST530596 GPOD Medicago truncatula Genome Systems clone ID: Gm-c 1084-1178 5" similar to TR: cDNA clone pGPOD-10P10 5' end, mRNA sequence': (Medicago Q97.QC3 Q97.QC3 Putative CCAAT-Binding Transcription Factor; truncatula). mRNA sequence', (Glycine max). US 7,868,229 B2 Page 13

NCBI accession No. BJ208385 (gi: 1994.6503) (Apr. 4, 2002); NCBI accession No. BM139923 (gi:21638877) (Jul. 1, 2002); Wang, Ogihara, Y, et al., “BJ208385Y. Ogihara unpublished cDNA library, G. et al., “Gm-R115 Soybean root reverse subtractive cDNA library Wh Triticum aestivum cDNA clone wh8a18 5", mRNA sequence': Glycine max cDNA clone Gm-R115, mRNA sequence”; (Glycine (Triticum aestivum). max). NCBI accession No. BJ208815 (gi: 19947041) (Apr. 4, 2002); NCBI accession No. BM268414 (gi:1793.1454) (Dec. 18, 2001); Ogihara, Y, et al., “BJ208815Y. Ogihara unpublished cDNA library, Wen, T.J., et al., “MEST395-C12.univ ISUM5-RNZea mays cDNA Wh Triticum aestivum cDNA clone wh10p165", mRNA sequence': clone MEST395-C123', mRNA sequence”; (Zea mays). (Triticum aestivum). NCBI accession No. BM269434 (gi:17932474) (Dec. 18, 2001); NCBI accession No. BJ210400 (gi: 19949034) (Apr. 4, 2002); Wen, T.J., et al., “MEST409-G 11. Univ ISUM5-RNZea mays cDNA Ogihara, Y, et al., “BJ210400Y. Ogihara unpublished cDNA library, clone MEST409-G 11 3', mRNA sequence”; (Zea mays). Wh Triticum aestivum cDNA clone wh271225", mRNA sequence': NCBI accession No. BM308208 (gi: 18039914) (Jan. 2, 2002); Shoe (Triticum aestivum). maker, R., et al., “sak43a12.y1 Gm-c 1036 glycinemax cDNA clone NCBI accession No. BJ210722 (gi: 1994.9459) (Apr. 4, 2002); SOYBEAN Clone ID: Gm-c1036-5783 5" Similar to TR:081130 Ogihara, Y, et al., “BJ210722Y. Ogihara unpublished cDNA library, 081130 CCAAT-Box Binding Factor HAP3 Homolog: mRNA Wh Triticum aestivum cDNA clone wh29f245", mRNA sequence': sequence', (Glycine max). (Triticum aestivum). NCBI accession No. BM331836 (gi: 18161997) (Jan. 16, 2002); Wen, NCBI accession No. BJ215658 (gi: 1995.4285) (Apr. 4, 2002); T.J., et al., “MEST171-B1 1.T3 ISUM5-RNZea mays cDNA clone Ogihara, Y, et al., “BJ215658.Y. Ogihara unpublished cDNA library, MEST 171-B113', mRNA sequence”; (Zea mays). Wh Triticum aestivum cDNA clone wh8a18 3", mRNA sequence': NCBI accession No. BM337630 (gi: 18167790) (Jan. 16, 2002); Wen, (Triticum aestivum). T.J., et al., “MEST215-B12.T3 ISUM-RNZea mays cDNA clone NCBI accession No. BJ217966 (gi: 1995.7482) (Apr. 4, 2002); MEST215-B123', mRNA sequence”; (Zea mays). Ogihara, Y, et al., “BJ217966Y. Ogihara unpublished cDNA library, NCBI accession No. BM341 107 (gi: 18171267) (Jan. 16, 2002); Wen, Wh Triticum aestivum cDNA clone wh29f243, mRNA sequence': T.J., et al., “MEST330-D11.T3 ISUM-RN Zea mays cDNA clone (Triticum aestivum). MEST330-D113', mRNA sequence'; (Zea mays). NCBI accession No. BJ234039 (gi:20051078) (Apr. 5, 2002); NCBI accession No. BM341536 (gi: 18171696) (Jan. 16, 2002); Wen, Ogihara, Y, et al., “BJ234039Y. Ogihara unpublished cDNA library, T.J., et al., “MEST336-C11.T3 ISUM5-RN zea mays cDNA clone Wh Triticum aestivum cDNA clone whe8e 165", mRNA sequence': MEST336-C113', mRNA sequence”; (Zea mays). (Triticum aestivum). NCBI accession No. BM349646 (gi: 18174258)(Jan. 16, 2002); Wen, NCBI accession No. BJ236476 (gi:2005.2449) (Apr. 5, 2002); T.J., et al., “MEST253-D11.T3 ISUM5-RNZea mays cDNA clone Ogihara, Y, et al., “BJ236476Y. Ogihara unpublished cDNA library, MEST253-D113', mRNA sequence”. Wh Triticum aestivum cDNA clone whe21n095", mRNA sequence': NCBI accession No. BM525962 (gii: 1873.0588) (Feb. 19, 2002); (Triticum aestivum). Shoemaker, R., et al., “sak74b11.y1 gm-c 1036 Glycine max cDNA NCBI accession No. BJ248969 (gi:20059585) (Apr. 5, 2002); clone SOYBEAN clone ID: Gm-c1036-8542 5'similar to TR: Ogihara, Y, et al., “BJ248969 Y. Ogihara unpublished cDNA library, Q97.QC3 Q97.QC3 Putative CCAAT-Binding Transcription factor; Wh Triticum aestivum cDNA clone whf&no95", mRNA sequence': mRNA sequence', (Glycine max). (Triticum aestivum). NCBI accession No. BM528842 (gii: 18735577) (Feb. 19, 2002); NCBI accession No. BJ255219 (gi:20079277) (Apr. 8, 2002); Shoemaker, R., et al., “sak69b03.y1 Gm-c 1036 glycine max cDNA Ogihara, Y, et al., “BJ255219Y. Ogihara unpublished cDNA library, clone SOYBEAN clone ID: Gm-c1036-81415" similar to TR:081130 Wh Triticum aestivum cDNA clone whf&no93", mRNA sequence': 0811130 CCAAT-Box Binding Factor HAP3 Homolog, mRNA (Triticum aestivum). sequence', (Glycine max). NCBI accession No. BJ463462 (gi:21 141969) (May 23, 2002); Sato, NCBI accession No. BM887558 (gi: 1927.1302) (Mar. 8, 2002); K., et al., “BJ463462 K. Sato unpublished cDNA library, cv. Haruna Shoemaker, R., et al., “sam40cO9.y1 Gm-c 1068 Glycinemax cDNA Nijo germination shoots Hordeum vulgare subsp. Vulgare cDNA clone SOYBEAN clone ID: Gm-c1068-7050 5" similar to clone bags32a01 5", mRNA sequence': (Hordeum vulgare subsp. TR:Q9ZQC3 Q9ZQC3 Putative CCAAT-Box Binding Transcription Vulgare). Factor, mRNA sequence', (Glycine max). NCBI accession No. BM888735 (gi: 19272479) (Mar. 8, 2002); NCBI accession No. BJ482187(gi:21160648) (May 23, 2002); Sato, Walbot, V., “952068E04.y1 952. BMS tissue from Walbot Lab K., et al., “BJ48218.7 K. Sato unpublished cDNA library, strain H602 (reduced rRNA) Zea mays clDNA, mRNA sequence”; (Zea mays). adult, heading stage top three leaves Hordeum Vulgare subsp. NCBI accession No. BM892103 (gi: 19347223) (Mar. 11, 2002); Spontaneum cDNA clone bahó3025", mRNA sequence': (Hordeum Shoemaker, R., et al., “sam48d03.y1 Gm-c 1069 Glycinemax cDNA vulgare). clone SOYBEAN clone ID: Gm-c1069-2478 5" similar to NCBI accession No. BJ555744 (gi:27237564) (Dec. 18, 2002); TR:Q9ZQC3 Q9ZQC3 Putative CCAAT-Binding Factor, mRNA Hoshino, A., et al., “BJ555744 Ipomoea nil mixture of flower and sequence', (Glycine max). flower bud Ipomoea nil cDNA clonejm 18c025", mRNA sequence': NCBI accession No. BM896.169 (gi: 19351637) (Mar. 11, 2002); (Ipomoea nil). Walbot, V., “952067E04 xl 952. BMS tissue from Walbot Lab NCBI accession No. BJ571596 (gi:27253424) (Dec. 18, 2002); (reduced rRNA) Zea mays clDNA, mRNA sequence”; (Zea mays). Hoshino, A., et al., “BJ571596 Ipomoea nil mixture of flower and NCBI accession No. BQ046483 (gii: 19820469) (Mar. 29, 2002); flower bud Ipomoea nil cDNA clonejm 18c02 3', mRNA sequence': Zhang, P. et al., “EST595601 P. Infestans-challenged potato leaf, (Ipomoea nil). incompatible reaction Solanum tuberosum cDNA clone BPLI114J14 NCBI accession No. BJ575345 (gi:27257173) (Dec. 18, 2002); 5' end, mRNA sequence'. (Solanum tuberosum). Hoshino, A., et al., “BJ575345 Ipomoea nil mixture of flower and NCBI accession No. BQ104671 (gi:20154333) (Apr. 16, 2002); flower bud Ipomoea nil cDNA clonejm29, 193", mRNA sequence': Guterman, I., et al., “fc0546.e Rose Petals (Fragrant Cloud) Lambda (Ipomoea nil). Zap Express Library Rosa hybrid cultivar cDNA clone fe0546.e5". NCBI accession No. BM109471 (gi: 17070418) (Nov. 26, 2001); Van mRNA sequence”; (Rosa hybrid cultivar) (publication: see Plant Cell der Hoeven, R., et al., “EST557007 potato roots Solanum tuberosum 14 (10), 2325-2338, 2002, Rose Scent: Genomics Approach to Dis cDNA clone cFRO4E20 5' end, mRNA sequence"; (Solanum covering Novel Floral Fragrance-Related Genes). tuberosum). NCBI accession No. BQ105902 (gi:20 155564) (Apr. 16, 2002); NCBI accession No. BM134935 (gi: 17143059) (Nov. 28, 2001); Guterman, I., et al., “fc0632.e Rose Petals (Fragrant Cloud) Lambda Anderson, O.D., et al., “WHE0460 A02 A03ZS Wheat Fusarium Zap Express Library Rosa hybrid cultivar cDNA clone fe0632.e5". graminearum infected spike cDNA library Triticum aestivum cDNA mRNA sequence”; (Rosa hybrid cultivar) (publication: see Plant Cell clone WHE0460 A02 A03, mRNA sequence'. (Triticum 14 (10), 2325-2338, 2002, Rose Scent: Genomics Approach to Dis aestivum). covering Novel Floral Fragrance-Related Genes). US 7,868,229 B2 Page 14

NCBI accession No. BQ164458 (gi:2030.1515) (Apr. 24, 2002); sequence': (Beta vulgaris) (publication: see Plant J.32 (5), 845-857. Walbot, V., et al., “1091020E 11 y3 1091-Immature ear with common 2002, Construction of a ‘unigene cDNA clone set by oligonucleotide ESTs screened by Schmidt lab Zea mays clDNA, mRNA sequence': fingerprinting allows access to 25,000 potential Sugar beet genes). (Zea mays). NCBI accession No. BQ606328 (gi:21555594) (Jun. 25, 2002); NCBI accession No. BQ255423 (gi:20456176) (May 6, 2002); Clarke, B., et al., “BRY 2180 wheat EST endosperm library VandenBosch, K., “MTNAL55TKN KVKC Medicago truncatula Triticum aestivum cDNA 5", mRNA sequence': (Triticum aestivum) cDNA clone pKVKC-12E7, mRNA sequence”; (Medicago (publication: see Funct. Integr. Genomics 3 (1-2), 33-38, 2003, truncatula). Arabidopsis genomic information for interpreting wheat EST NCBI accession No. BQ296537 (gi:20812059) (May 16, 2002); sequences). Shoemaker, R., et al., "san93ff)1 y2 Gm-c 1052 glycine max cDNA NCBI accession No. BQ629472 (gi:21677121) (Jul. 2, 2002); Shoe clone SOYBEAN clone ID: Gm-c1052-7178 5" similar to TR: maker, R., et al., "saq02r06.Y 1 Gm-c 1045 Glycinemax cDNA clone Q97.QC3 Q97.QC3 putative CCAAT-binding transcription factor; SOYBEAN clone ID: Gm-c1045-3660 5" similar to TR: Q97.QC3 mRNA sequence', (Glycine max). Putative CCAAT-Binding Transcription Factor, mRNA sequence': NCBI accession No. BQ405785 (gi:21093472) (May 22, 2002); (Glycine max). Wing, R.A., et al., “GA EdO086G12f Gossypium arboreum 7-10 NCBI accession No. BQ667936 (gi:21809618) (Jul 15, 2002); dpa fiber library Gossypium arboreum cDNA clone Walbot, V., "946102B01.y1 946—tassel primordium prepared by GA Ed0086G12f mRNA sequence'; (Gossypium arboreum). Schmidt lab Zea mays clDNA, mRNA sequence”; (Zea mays). NCBI accession No. BQ488908 (gi:21333528) (Jun. 7, 2002); NCBI accession No. BQ744755 (gi:218.91542) (Jul 17, 2002); Bellin, D., et al., “95-E9134-006-006-M23-T3 Sugar beet MPIZ Walbot, V., "946 111A02.y1 946—tassel primordium prepared by ADIS-006 Lambda Zap II Library Beta vulgaris cDNA clone M-23 Schmidt lab Zea mays clDNA, mRNA sequence”; (Zea mays). 6, mRNA sequence', (Beta vulgaris). NCBI accession No. BQ757780 (gi:2196.6252) (Jul 26, 2002); NCBI accession No. BQ505705 (gi:21364,574) (Jun. 10, 2002); Hedley, P. et al., “EBem10 SQ005 D11 Rembryo. 2 Day germi Buell, C.R., et al., “EST613120 Generation of a set of potato cDNA nation, no treatment, cv Optic, EBem10 Hordeum vulgare subsp. clones for microarray analyses mixed potato tissues Solanum Vulgare cDNA clone EBeml.0 SQ005 D115", mRNA sequence': tuberosum cDNA clone STMGF80 5' end, mRNA sequence': (Hordeum vulgare). (Solanum tuberosum) (note: two versions are presented for review, NCBI accession No. BQ799965 (gi:220 14931) (Jul 30, 2002); gi:21364,574 submitted Jun. 10, 2002; and, gi:21921628 submitted Abbal. P. et al., “EST 2134 Green Grape berries Lambda Zap II Jul. 22, 2002). Library Vitis vinifera cDNA clone GT203H053", mRNA sequence': NCBI accession No. BQ505706 (gi:21364575) (Jun. 10, 2002); (Vitis vinifera). Buell, C.R., et al., “EST613121 Generation of a set of potato cDNA NCBI accession No. BQ816666 (gi:22065967) (Aug. 1, 2002); clones for microarray analyses mixed potato tissues Solanum Grossman, A., et al., “1030059C09.y1 C. Reinhardtii CC-1690, tuberosum cDNA clone STMGF80 3' end, mRNA sequence': Deflgellation (normalized), Lambda Zap II Chlamydomonas (Solanum tuberosum) (note: two versions are presented for review, reinhardtii cDNA, mRNA sequence”; (Chlamydomonas reinhardtii). gi:21364575 submitted Jun. 10, 2002; and, gi:21921629 submitted NCBI accession No. BQ838221 (gi:22142539) (Aug. 8, 2002); Jul. 22, 2002). Anderson, O.D., et al., “WHE2907 HO9 P17ZS Wheat aluminum NCBI accession No. BQ507074 (gi:21365943) (Jun. 10, 2002); stressed root tip cDNA library Triticum aestivum cDNA clone Buell, C.R., et al., “EST614489 Generation of a set of potato cDNA WHE2907 HO9 P17, mRNA sequence”; (Triticum aestivum). clones for microarray analyses mixed potato tissues Solanum NCBI accession No. BQ857127 (gi:22242592) (Aug. 14, 2002); tuberosum cDNA clone STMG023 3' end, mRNA sequence': Kozik, A., et al., “QGB6K24.yg.ab1 QG ABCDI lettuce Salinas (Solanum tuberosum) (note: two versions are presented for review, Lactuca sativa cDNA clone QGB6K24, mRNA sequence'; (Lactuca gi:21365943 submitted Jun. 10, 2002; and, gi:21922914 submitted sativa). Jul. 22, 2002). NCBI accession No. BQ862671 (gi:22248.136) (Aug. 14, 2002); NCBI accession No. BQ508506 (gi:21367375) (Jun. 10, 2002); Kozik, A., et al., “QGC21L19.yg.abl QG ABCDI lettuce salinas Buell, C.R., et al., “EST615921 Generation of a set of potato cDNA Lactuca sativa cDNA clone QGC21L19, mRNA sequence'; (Lactuca clones for microarray analyses mixed potato tissues Solanum sativa). tuberosum cDNA clone STMGW70 5' end, mRNA sequence': NCBI accession No. BQ875352 (gi:22264573) (Aug. 15, 2002); (Solanum tuberosum) (note: two versions are presented for review, Kozik, A., et al., “QGI7N22.ygab1 QG ABCDI lettuce salinas gi:21367375 submitted Jun. 10, 2002; and, gi:21924285 submitted Lactuca sativa cDNA clone QGI7N22, mRNA sequence”; (Lactuca Jul. 22, 2002). sativa). NCBI accession No. BQ508507 (gi:21367376) (Jun. 10, 2002); NCBI accession No. BQ911236 (gi:223 10015) (Aug. 19, 2002); Buell, C.R., et al., “EST615922 Generation of a set of potato cDNA Kozik, A., et al., “QHA16J13.yg.ab1 QG ABCDI sunflower clones for microarray analyses mixed potato tissues Solanum RHA801 Helianthus annus cDNA clone QHA16J13, mRNA tuberosum cDNA clone STMGW70 3' end, mRNA sequence': sequence'; (Helianthus annus). (Solanum tuberosum) (note: two versions are presented for review, NCBI accession No. BQ990941 (gi:22410476) (Aug. 21, 2002); gi:21367376 Submitted Jun. 10, 2002; and, gi:21924286 submitted Kozik, A., et al., “QGF21I11.yg.abl QG EFGHJ lettuce serriola Jul. 22, 2002). Lactuca serriola cDNA clone QGF21I11, mRNA sequence': NCBI accession No. BQ510555 (gi:21369424) (Jun. 10, 2002); (Lactuca Serriola). Buell, C.R., et al., “EST617970 Generation of a set of potato cDNA NCBI accession No. BQ996.905 (gi:22431301) (Aug. 22, 2002); clones for microarray analyses mixed potato tissues Solanum Kozik, A., et al., “QGG14C03.yg.ab1 QG EFGHJ lettuce serriola tuberosum cDNA clone STMHL 13 5' end, mRNA sequence': Lactuca serriola cDNA clone QGG14C03, mRNA sequence': (Solanum tuberosum) (note: two versions are presented for review, (Lactuca Serriola). gi:21369424 submitted Jun. 10, 2002; and, gi:21926247 submitted NCBI accession No. BU008142 (gi:22442537) (Aug. 22, 2002); Jul. 22, 2002). Kozik, A., et al., “QGH6K04.yg.abl QG EFGHJ lettuce serriola NCBI accession No. BQ511879 (gi:21370748) (Jun. 10, 2002); Lactuca serriola cDNA clone QGH6K04, mRNA sequence': Buell, C.R., et al., “EST619294 Generation of a set of potato cDNA (Lactuca Serriola). clones for microarray analyses mixed potato tissues Solanum NCBI accession No. BU013962 (gi:22448357) (Aug. 22, 2002); tuberosum cDNA clone STMHU61 5' end, mRNA sequence': Kozik, A., et al., “QGJ6B02.yg.ab1 QG EFGHJ lettuce serriola (Solanum tuberosum) (note: two versions are presented for review, Lactuca serriola cDNA clone QGJ6B02, mRNA sequence'; (Lactuca gi:21370748 submitted Jun. 10, 2002; and, gi:21927514 submitted Serriola). Jul. 22, 2002). NCBI accession No. BUO16847 (gi:224.52367) (Aug. 23, 2002); NCBI accession No. BQ592365 (gi:26121948) (Dec. 6, 2002); Kozik, A., et al., “QHE14C23.yg.ab1 QG EFGHJ Sunflower Herwig, R., et al., “EO 12681-024-020-H10-SP6 MPIZ-ADIS-024 RHA280 Helianthus annuus cDNA clone QHE14C23, mRNA developing root Beta vulgaris cDNA clone 024020-H105", mRNA sequence'; (Helianthus annuus). US 7,868,229 B2 Page 15

NCBI accession No. BU051043 (gi:22491 120) (Aug. 26, 2002); NCBI accession No. BZ733911 (gi:287 10244) (Mar. 3, 2003); Walbot, V., “1111037G04.y2 1111 Unigene III from Maize Whitelaw, C.A. et al., “OGFAN93TMZM 0.7 1.5 KBZea mays Genome Project Zea mays clDNA, mRNA sequence”; (Zea mays). genomic clone ZMMBMaO240O18, genomic Survey sequence': NCBI accession No. BUO90324 (gi:22540481) (Aug. 29, 2002); (Zea mays). Shoemaker, R., et al., “sr70c10,y1 Gm-c 1052 Glycine max cDNA NCBI accession No. CA026619 (gi:24303993) (Oct. 23, 2002); clone Genome systems clone ID: Gm-c 1052-1099 5" similar to SW: Radchuk, V., et al., “HZ56G24r HZ Hordeum vulgare subsp. Vulgare CBFA Maize P25209 CCAAT-binding transcription factor subunit cDNA clone HZ56G245", mRNA sequence': (Hordeum vulgare). A, mRNA sequence', (Glycine max). NCBI accession No. CA688065 (gi:25278629) (Nov. 25, 2002); NCBI accessino No. BU097938 (gi:22545579) (Aug. 29, 2002); Tingey, S.V., et al., “wlm96.pk037.k9 wilm96 Triticum aestivum Walbot, V., et al., "946122D01.y1 946—tassel primodium prepared cDNA clone wilm96.pk037.k.9 5' end, mRNA sequence”; (Triticum by Schmidt lab Zea mays clDNA, mRNA sequence”; (Zea mays). aestivum). NCBI accession No. BU238020 (gi:22749845) (Sep. 6, 2002); NCBI accession No. CA753815 (gi:25797854) (Nov. 27, 2002); Singh, J.A., et al, DSO 14a2. A Bohnert, H.J., et al., “BR04.0003000 PLATE A05 33 34.abl Ds01 AAFC ECORC cold stressed Flixweed seedlings OA Oryza sativa (japonica cultivar-group) cDNA clone Descurainia sophia cDNA clone Ds01 14a 12, mRNA sequence': BR040003000 PLATE A05 33 34.ab1 similar to putative (Descurainia Sophia). CAAT-box DNA binding protein Oryza sativa (japonica cultivar NCBI accession No. BU4994.57 (gi:22819367) (Sep. 12, 2002); group) gi:20805265:dbj:BAB92931.1 (AP004366) putative CAAT Walbot, V., "946 175D02.y1 946—tassel primordium prepared by box DNA binding protein Oryza sativa (japonica cultivar-group). Schmidt lab Zea mays clDNA, mRNA sequence”; (Zea mays). mRNA sequence': (Oryza sativa). NCBI accession No. BU551072 (gi:22933933) (Sep. 16, 2002); NCBI accession No. CA785249 (gi:26048796) (Dec. 4, 2002); Shoe Vodkin, L, et al., “GM880006B11G05 Gm-r1088 Glycine max maker, R., et al., "sau26hl 1.y1 Gm-c1062 Glycinemax cDNA clone cDNA clone Gm-r1088-2241 3", mRNA sequence”; (Glycine max). SOYBEAN Clone ID: Gm-c1062-9574 5" Similar to TR: O23310 NCBI accession No. BU653821 (gi:23366001) (Sep. 30, 2002); 023310 CCAAT-binding transcription factor subunit A. mRNA Grossman, A., et al., “1112109C03.y1 C. Reinhardtiicc-1690 (mt--), sequence', (Glycine max.). cc-1691 (mt-), Gamete (normalized), Lambda Zap II NCBI accession No. CA795.038 (gi:26052114) (Dec. 5, 2002); Chlamydomonas reinhardtii cDNA, mRNA sequence': Jones, P.G., et al., "Cac BL 208 Cac BL (Bean and Leaf from (Chlamydoonas reinhardtii). Amelonardo type Cacao) Theobroma cacao cDNA clone NCBI accession No. BU655174 (gi:23367356) (Sep. 30, 2002); Cac BL 208 5", mRNA sequence': (Theobroma cacao) (publica Grossman, A., et al., “11121 18C09.y1 C. Reinhardtiicc-1690 (mt--), tion: see Planta 216 (2), 255-264, 2002, Gene discovery and microar cc-1691 (mt-), Gamete (normalized), Lambda Zap II ray analysis of cacao varieties). Chlamydomonas reinhardtii cDNA, mRNA sequence': NCBI accession No. CA801742 (gi:26058828) (Dec. 5, 2002); Shoe (Chlamydomonas reinhardtii). maker, R., et al., “satl7b11.y1 Gm-c 1036 Glycinemax cDNA clone NCBI accession No. BU871529 (gi:24063053) (Oct. 16, 2002); SOYBEAN Clone ID: Gm-c1036-138945" similar to TR: 081130 Unneberg, P. et al., Q031E12 Populus flower cDNA library Populus 081130 CCAAT-Box binding transcription factor HAP3 Homolog, trichocarpa cDNA 5", mRNA sequence (Populus trichocarpa). mRNA sequence', (Glycine max.). NCBI accession No. CA802391 (gi:26059477) (Dec. 5, 2002); Shoe NCBI accession No. BU880488 (gi:24072012) (Oct. 16, 2002); maker, R., et al., "sau35cO3.y1 Gm-c1071 Glycinemax cDNA clone Unneberg, P. et al., “UM49TG09 Populus flower cDNA library SOYBEAN Clone ID: Gm-c1071-2813 5" similar to Populus trichocarpa cDNA 5", mRNA sequence”; (Populus SW:CBFA Maize P25209 CCAAT-binding transcription factor trichocarpa). Subunit A. mRNA sequence'; (Glycine max.). NCBI accession No. BU881483 (gi:24073007) (Oct. 16, 2002); NCBI accession No. CA802515 (gi:26059601) (Dec. 5, 2002); Shoe Unneberg, P. et al., “UM63TC12 Populus flower cDNA library maker, R., et al., "sau37e09.y1 Gm-c1071 Glycinemax cDNA clone Populus trichocarpa cDNA 5", mRNA sequence”; (Populus SOYBEAN Clone ID: Gm-c1071-3281 5" similar to TR: 023633 trichocarpa). 023633 transcription factor, mRNA sequence”; (Glycine max.). NCBI accession No. BU896236 (gi:24107443) (Oct. 17, 2002); NCBI accession No. CA820519 (gi:26269456) (Dec. 9, 2002); Shoe Unneberg, P. et al., “X037F04 Populus wood cDNA library Populus maker, R., et al., "sau.90cQ6.y1 Gm-c1048 Glycinemax cDNA clone tremulax Populus tremuloides cDNA 5", mRNA sequence”; (Populus SOYBEAN Clone ID: Gm-c1048-3180 5 Similar to TR: O23310 tremula X Populus tremuloides). O23310 CCAAT-binding transcription factor subunit A. mRNA NCBI accession No. BU997198 (gi:24274181) (Oct. 23, 2002); sequence', (Glycine max.). Zhang, H., et al., “H107D15r HI Hordeum vulgare subsp. Vulgare NCBI accession No. CA828.166 (gi:26456583) (Dec. 11, 2002); cDNA clone H107D 155", mRNA sequence': (Hordeum vulgare). Walbot, V., “11 14024G02.y2 1114 Unigene IV from Maize NCBI accession No. BZ505711 (gi:27026128) (Dec. 16, 2002); Genome Project Zea mays clDNA, mRNA sequence”; (Zea mays). Ayele, M., et al., “BONBC73TF BO 1y 2 KB tot Brassica NCBI accession No. CA82993.1 (gi:26557696) (Dec. 12, 2002); oleracea genomic clone BONBC73, genomic Survey sequence': Walbot, V., “3529 1 1 1 A10.y 13529–2 mm ear tissue from (Brassica oleracea) (publication: see Genome Res. 15 (4), 487-495, Schmidt and Hake labs Zea mays cDNA, mRNA sequence”; (Zea 2005, Whole genome shotgun sequencing of Brassica oleracea and mays). its application to gene discovery and annotation in Arabidopsis). NCBI accession No. CA832165 (gi:26559930) (Dec. 12, 2002); NCBI accession No. BZ635584 (gi:28084146) (Jan. 29, 2003); Walbot, V., “11 17028F06.y1 1117 Unigene V from Maize Genome Whitelaw, C.A. et al., “OGCAF48TCZM 0.7 1.5 KBZea mays Project Zea mays clDNA, mRNA sequence': (Zea mays). genomic clone ZMMBMaO126G24, genomic Survey sequence': NCBI accession No. CA897002 (gi:27383993) (Dec. 27, 2002); Bui, (Zea mays). A.Q., et al., “PCEP0404 Scarlet Runner Bean Embryo-Proper NCBI accession No. BZ635588 (gi:28084150) (Jan. 29, 2003); Region Phaseolus coccineus cDNA 5" similar to AB025628: contains Whitelaw, C.A. et al., “OGCAF48TMZM 0.7 1.5 KBZea mays similarity to CCAAT-box binding transcription factor genomic clone ZMMBMaO126G24, genomic Survey sequence': gene id:MNJ7.26 Arabidopsis thaliana: Identical to At5g47670: (Zea mays). LEC1-like AF533650), mRNA sequence'. (Phaseolus coccineus). NCBI accession No. BZ707 196 (gi:28427237) (Feb. 19, 2003); NCBI accession No. CA902393 (gi:27389385) (Dec. 27, 2002); Bui, Whitelaw, C.A. et al., “OGEBD41TCZM 0.7 1.5 KBZea mays A.Q., et al., “PCS03217F Scarlet Runner Bean Suspensor Region genomic clone ZMMBMaO222G10, genomic Survey sequence': TripIEx2 Phaseolus coccineus cDNA 5' similar to AB025628: con (Zea mays). tains similarity to CCAAT-box binding transcription factor NCBI accession No. BZ733904 (gi:287 10230) (Mar. 3, 2003); gene id:MNJ7.26 Arabidopsis thaliana: Identical to At5g47670: Whitelaw, C.A. et al., “OGFAN93TCZM 0.7 1.5 KBZea mays LEC1-like AF533650), mRNA sequence'. (Phaseolus coccineus). genomic clone ZMMBMaO240O18, genomic Survey sequence': NCBI accession No. CA902394 (gi:27389386) (Dec. 27, 2002); Bui, (Zea mays). A.Q., et al., “PCSC12889 Scarlet Runner Bean Suspensor Region US 7,868,229 B2 Page 16

TriplEx2 Phaseolus coccineus cDNA 5' similar to AB025628: con CAB2SG0003 IaF C07 5", mRNA sequence”; (Vitis vinifera) tains similarity to CCAAT-box binding transcription factor (note: two versions are presented for review, gi:28968653 submitted gene id:MNJ7.26 Arabidopsis thaliana: Identical to At5g47670: Mar. 14, 2003; and, gi:297824.05 submitted Apr. 10, 2003). LEC1-like AF533650), mRNA sequence'. (Phaseolus coccineus). NCBI accession No. CB347754 (gi:28968721) (Mar. 14, 2003); NCBI accession No. CA902402 (gi:27389394) (Dec. 27, 2002); Bui, Goes da Silva, F., et al., “CAB2SG0003 IaR CO7 Cabernet A.Q., et al. “PCSC08658 Scarlet Runner Bean Suspensor Region Sauvignon Berry—CAB2SG Vitis vinifera cDNA clone TriplEx2 Phaseolus coccineus cDNA 5' similar to P25209: CAB2SG0003 IaR C07 3', mRNA sequence”; (Vitis vinifera) CBFA Maize CCAAT-binding transcription factor subunit A (CBF (note: two versions are presented for review, gi:28968721 submitted A) Arabidopsis thaliana; similar to At3 g53340: Non-LEC1-type Mar. 14, 2003; and, gi:29782446 submitted Apr. 10, 2003). AHAP3, mRNA sequence': (Phaseolus coccineus). NCBI accession No. CB350611 (gi:28985378) (Mar. 17, 2003); NCBI accession No. CA902403 (gi:27389395) (Dec. 27, 2002); Bui, Wen, T.J., et al., “MEST253-D11.univ ISUM5-RNZea mays cDNA A.Q., et al., “PCSC08738 Scarlet Runner Bean Suspensor Region clone MEST253-D113', mRNA sequence”; (Zea mays). TriplEx2 Phaseolus coccineus cDNA 5' similar to P25209: NCBI accession No. CB351460 (gi:2898.7093) (Mar. 17, 2003); CBFA Maize CCAAT-binding transcription factor subunit A (CBF Walbot, V., “3529 1 39 1 B01y 13529–2 mm ear tissue from A) Arabidopsis thaliana; similar to At3 g53340: Non-LEC1-type Schmidt and Hake labs Zea mays cDNA, mRNA sequence”; (Zea AHAP3, mRNA sequence': (Phaseolus coccineus). mays). NCBI accession No. CA902404 (gi:273893.96) (Dec. 27, 2002); Bui, NCBI accession No. CB351633 (gi:28987436) (Mar. 17, 2003); A.Q., et al., “PCSC 14305 Scarlet Runner Bean Suspensor Region Walbot, V., “3529 1 42 1 E05.y 13529–2 mm ear tissue from TriplEx2 Phaseolus coccineus cDNA 5' similar to P25209: Schmidt and Hake labs Zea mays cDNA, mRNA sequence”; (Zea CBFA Maize CCAAT-binding transcription factor subunit A (CBF mays). A) Arabidopsis thaliana; similar to At3g53340: Non-LEC1 type NCBI accession No. P25209 (gii: 115840) (Apr. 23, 1993); Li, X.Y., et AHAP3, mRNA sequence': (Phaseolus coccineus). al. "CCAAT-binding transcription factor subunit A (CBF-A) (NF-Y NCBI accession No. CA902413 (gi:27389405) (Dec. 27, 2002); Bui, protein chain B) (NF-YB) (CAAT-box DNA binding protein subunit A.Q., et al. “PCSC08277 Scarlet Runner Bean Suspensor Region B)” (Zea mays) (note: the original Submission and latest update are TriplEx2 Phaseolus coccineus cDNA 5' similar to P25209: provided for examination) (publication: see Nucleic Acids Res. 20 CBFA Maize CCAAT-binding transcription factor subunit A (CBF (5), 1087-1091, 1992, Evolutionary variation of the CCAAT-binding A) Arabidopsis thaliana: Identical to At3 g53340: Non-LEC1-type transcription factor NF-Y). AHAP3, mRNA sequence': (Phaseolus coccineus). NCBI accession No. S22820 (gi:7443522) (Apr. 5, 2000); Li, X.Y., et NCBI accession No. CA9 19789 (gi:27406719) (Dec. 27, 2002); al. “Transcription factor NF-Y. CCAAT-binding, chain B—maize” VanderBosch, K., et al., “EST637507 MTUS Medicago truncatula (Zea mays) (note: the original Submission and latest update are pro cDNA clone MTUS-18B7, mRNA sequence”; (Medicago vided for examination) (publication: see Nucleic Acids Res. 20 (5), truncatula). 1087-1091, 1992, Evolutionary variation of the CCAAT-binding NCBI accession No. CA923822 (gi:274 10752) (Dec. 27, 2002); transcription factor NF-Y). Ranjan, P. et al., “MTU7CL. P15. D04 Aspen leaf cDNA Library NCBI accession No. X59714 (gi:22379) (Apr. 21, 1993); Benoist, C., Populus tremuloides cDNA, mRNA sequence'. (Populus "Z. mays mRNA for CAAT-box DNA binding protein subunit B (NF tremuloides). YB); (Zea mays) (note: the original Submission and latest update are NCBI accession No. CA935541 (gi:27424021) (Dec. 30, 2002); provided for examination) (publication: see Nucleic Acids Res. 20 Shoemaker, R., et al., "sau55g04.y1 Gm-c1071 Glycinemax cDNA (5), 1087-1091, 1992, Evolutionary variation of the CCAAT-binding clone SOYBEAN Clone ID: Gm-c1071-49275 Similar to TR:08113 transcription factor NF-Y). 081130 CCAAT-Box Binding Factor HAP3 Homolog, mRNA sequence', (Glycine max). NCBI accession No.Y13723 (gi:2398526)(Sep. 16, 1997); Edwards, NCBI accession No. CAA42234 (gi:22380) (Apr. 21, 1993); Li, D., “Arabidopsis thaliana mRNA for Hap3a transcription factor” X.Y., et al. “CAAT-box DNA binding protein subunit B (NF-YB)” (note: the original Submission and latest update are provided for (Zea mays) (note: the original Submission and latest update are pro examination). vided for examination) (publication: see Nucleic Acids Res. 20 (5), NCBI accession No.Y13724 (gi:2398528) (Sep. 16, 1997); Edwards, 1087-1091, 1992, Evolutionary variation of the CCAAT-binding D., “Arabidopsis thaliana mRNA for Hap3b transcription factor” transcription factor NF-Y). (note: the original Submission and latest update are provided for NCBI accession No. CAA74051 (gi:2398527) (Sep. 16, 1997); examination). Edwards, D., “Transcription factor Arabidopsis thaliana” (note: the NCBI accession No. Z97336 (gi:2244788) (pos. 102664-103149) original Submission and latest update are provided for examination). (Jul. 6, 1997); Bevan, M., et al., “Arabidopsis thaliana DNA chro NCBI accession No. CAC37695 (gi: 13928060) (May 2, 2001); mosome 4, ESSA I contig fragment No. 1'' (note: the original sub Masiero, S., et al. “NF-YB1 protein' (Oryza sativa) (note: the origi mission and latest update are provided for examination, see gene nal Submission and latest update are provided for examination). d13310w, protein id CAB10233.1 marked on p. 21-22/47). NCBI accession No. CB077569 (gi:2789 1006) (Jan. 24, 2003); Database TREMBL Accession No. AB025619; (Apr. 9, 1999) Levesque, M.P. et al., “hj56d09.g1 Hedyotis terminalis “Arabidopsis thaliana genomic DNA, chromosome 5, P1 clone: flower—Stage 2 (NYBG) Hedyotis terminalis cDNA clone hj56d09, MBA10'. mRNA sequence'; (Hedyotis terminalis). Database TREMBL Sequence Library TREMBL Accession No. NCBI accession No. CBO90548 (gi:27914740)(Jan. 27, 2003); Bren Q9FGJ3 (Mar. 1, 2001); protein: “Similarity to CCAAT-box binding ner, E.D., et al., "gy76g 12 gl Cycad Leaf Library (NYBG) Cycas transcription factor' (origin-gene: AT5.g47640). rumphii cDNA clone gy76g 12, mRNA sequence”; (Cycas rumphii). Database TREMBL Sequence Library TREMBL Accession No. NCBI accession No. CB290512 (gi:28615969) (Feb. 28, 2003); Q9FGP7 (Mar. 1, 2001); protein: “Transcription factor Hap5a-like” Close, T.J., et al., “UCRCS01 01bh 12 b1 Washington Navel (origin-gene: At5g50480). orange cold acclimated flavedo & albedo cDNA library Citrus Database TREMBL Sequence Library TREMBL Accession No. sinensis cDNA clone UCRCS01 01bh 12, mRNA sequence”; (Cit Q9FMV5 (Mar. 1, 2001); protein: “Transcription factor Hap5a-like rus sinensis). protein' (origin-gene: At5g63470) (publication: see DNA Res. NCBI accession No. CB290513 (gi:28615970) (Feb. 28, 2003); 4:401-414, 1997. Structural analysis of Arabidopsis thaliana chro Close, T.J., et al., “UCRCS01 01bh 12 gl Washington Navel mosome 5. . . ). orange cold acclimated flavedo & albedo cDNA library Citrus Database TREMBL Sequence Library TREMBL Accession No. sinensis cDNA clone UCRCS01 01bh 12, mRNA sequence”; (Cit Q9LFI3 (Oct. 1, 2000); protein: “Transcription factor NF-Y. rus sinensis). CCAAT-binding-like protein' (origin-gene: At3 g53340). NCBI accession No. CB347686 (gi:28968653) (Mar. 14, 2003); Database TREMBL Sequence Library TREMBL Accession No. Goes da Silva, F., et al., “CAB2SG0003 IaF CO7 Cabernet Q9SLGO (May 1, 2000); protein: “Putative CCAAT-binding tran Sauvignon Berry—CAB2SG Vitis vinifera cDNA clone Scription factor Subunit' (origin-gene: At2g38880) (publication: See US 7,868,229 B2 Page 17

Genome Biol. 3:RESEARCHOO29-RESEARCHO029, 2002, Full Database TREMBL Sequence Library TREMBL Accession No. length messenger RNA sequences greatly improve genome annota O81130 (Nov. 1, 1998); protein: “CCAAT-box binding factor HAP3 tion). homolog”; (publication: see Cell 93:1195-1205, 1998, “Arabidopsis LEAFY COTYLEDON1 is sufficient to induce embryo development Database TREMBL Sequence Library TREMBL Accession No. in vegetative cells”). Q97.QC3 (May 1, 1999); protein: “Putative CCAAT-box binding U.S. Appl. No. 09/733,089, filed Dec. 11, 2000, Lutfiyya, Entire transcription factor' (origin-gene: At2g37060). document. Database TREMBL Sequence Library TREMBL Accession No. U.S. Appl. No. 09/474,435, filed Dec. 29, 1999, Monsanto Co., et al., O23634 (Jan. 1, 1998); protein: “Transcription factor fragment” Entire document. (origin-gene: hap3b). U.S. Appl. No. 10/675,852, filed Feb. 25, 2009, Heard, J., et al., office Database TREMBL Sequence Library TREMBL Accession No. action. O23310 (Jan. 1, 1998); protein: “CCAAT-binding transcription fac tor subunit A (CBF-A)”. * cited by examiner U.S. Patent Jan. 11, 2011 Sheet 1 of 9 US 7,868,229 B2

Fagales Cucurbitales ROSales Fabales Oxalidales Malpighiales Sapindales Malvales Brassicales Myrtales Geraniales DipSaCales Apiales Aquifoliales Solanales Lamiales Gentianales Garryales Ericales Cornales Saxifradales Santalales Caryophyllales Proteales Ranunculales Zingiberales Commelinales Poales AreCales Pandanales Liliales DioSCOreales Asparagales Alismatales ACOrales Piperales Magnoliales Laurales Ceratophyllales Fig. 1

U.S. Patent Jan. 11, 2011 Sheet 3 of 9 US 7,868,229 B2

58 G3435 Zm (4) (not tested) 62 G3436 Zm (6) G3397 Os (2) G3398 Os (8) G482 At (26) G3476 Gm (18) 85 G3475 Gm (14) 52 74 G3478 Gm (12) 94 G3868 Pp (30) G3870 Pp (28) G485 At (16) (not tested) G3472 Gm (20) 54 G3474 Gm (10) (not tested) 99 (not tested) \ 79

99 G3876 Zm (22) 78

96 G3875 Gm (24)

87 99

Fig. 3 U.S. Patent US 7,868,229 B2

–(Z)5––?NISCH5O5DSCINCIS–CIGHIN –(8) –NISOEIÐÐSEINIGIS––CI?IINT –(†T)§––?XIÐVÌNHV75)ÐSCINCIS–CIVIIN– –(†)–NS&H5)ÐSCINCIS––CI?IINT”– –(9)_`––VZNISCHEDE)SEINIGIS–CIGHIN –(ZI)|-––?XIÐÐNHVÆÐIÐSCINCIS–CIVIIN– £––NIS?DSVNIÐI,–(OT)HÆDÆDSEINIGIS––EIVIN

–(OZ)–NSÆÐSVNIÐLH??SEINCIS––GIVIINºº– :––SVCIÕGIH?HÆÐÐSGINGISH?ISSEH?ISÐVÌNH–(0€)??SGIVIN– „––SVCIÕGI?IKH?ÐSGINGISH?ISSOEICIS?DVÌNH–(8Z)?ÃSCTVIN– ------SV/N??CIXIH5)ÐSCINCIS –(9T)––CIVIWN =TINEÐSNIÐVNÖVÆÐÐSCINCIS–GIVIN(8T)••=== -->ÕÐNÕNNIÐNÕÐ?ÐSCI?CIS–CIÐW(9Z) - _------?5D5D5D5D5DRICH–(ZZ)SÐSEIHS5D5D5D5)––?ôISVGIVETVIN •?H?ISHEI-?SGIHSÐ?Ж–?SV??CTVW(†;z)•=<=<==-->--~

US 7,868,229 B2 1. 2 EARLY FLOWERING IN GENETICALLY Breeding programs for the development of new varieties MODIFIED PLANTS can be limited by the seed-to-seed cycle. Thus, breeding new varieties of plants with multi-year cycles (such as biennials, RELATIONSHIP TO COPENDING e.g. carrot, or fruit trees, such as citrus) can be very slow. With APPLICATIONS respect to breeding programs, there would be a significant advantage in having commercially valuable plants that This application is a continuation-in-part under 35 U.S.C. exhibit controllable and modified periods to flowering S120 of International Application No. PCT/US2006/034615, (“flowering times'). For example, accelerated flowering filed Aug. 31, 2006 (expired), which claims the benefit of U.S. would shorten crop and tree breeding programs. provisional application 60/713.952, filed Aug. 31, 2005; and, 10 Improved flowering control allows more than one planting this application is a continuation-in-part of National Stage and harvest of a crop to be made within a single season. In a application Ser. No. 11/435,388, filed May 15, 2006 (issued number of species, for example, certain grain crops, fruits, as U.S. Pat. No. 7,663,025), which is a continuation-in-part and ornamentals such as cut flowers, where the reproductive under 35 U.S.C. S 120 of International Application No. PCT/ parts of the plants constitute the crop and the vegetative US04/37584, filed Nov. 12, 2004 (expired); and, this appli 15 tissues are discarded, it would be advantageous to accelerate cation is a continuation-in-part of U.S. non-provisional appli time to flowering. Accelerating flowering can shorten crop cation Ser. No. 10/714,887, filed Nov. 13, 2003 (pending): and tree breeding programs. Additionally, in some instances, and, this application is a continuation-in-part of National a faster generation time would allow additional harvests of a Stage application Ser. No. 10/546,266, filed under 35 U.S.C. crop to be made within a given growing season. A number of S371 on Aug. 19, 2005 (issued as U.S. Pat. No. 7,659,446) of Arabidopsis genes have already been shown to accelerate International Application No. PCT/US04/05654, filed Feb. flowering when constitutively expressed. These include 25, 2004 (expired); and, the present application is a continu LEAFY, APETALA1 and CONSTANS (Mandelet al., 1995; ation-in-part of U.S. non-provisional application Ser. No. Weigel and Nilsson, 1995; Simon et al., 1996). The floral 10/374,780, filed Feb. 25, 2003 (issued as U.S. Pat. No. 7,511, control gene LEAFY from Arabidopsis can dramatically 190); and, the present application is a continuation-in-part of 25 accelerate flowering in numerous dicotyledonous plants. U.S. non-provisional application Ser. No. 10/412,699, filed Constitutive expression of Arabidopsis LEAFY also caused Apr. 10, 2003 (issued as U.S. Pat. No. 7,345,217), which is a early flowering in transgenic rice (a monocot), with a heading continuation-in-part of U.S. non-provisional application Ser. date that was 26-34 days earlier than that of wild-type plants. No. 10/374,780, filed Feb. 25, 2003 (issued as U.S. Pat. No. These observations indicate that floral regulatory genes from 7.511,190); and, the present application is a continuation-in 30 Arabidopsis are useftil tools for heading date improvement in part of U.S. non-provisional application Ser. No. 10/675,852, cereal crops (He et al., 2000). filed Sep. 30, 2003 (pending); and, this application is a con Flowering time and other developmental characteristics tinuation-in-part of U.S. non-provisional application Ser. No. may be controlled by manipulating the expression of relevant 10/666,642, filed Sep. 18, 2003 (issued as U.S. Pat. No. transcription factors. Transcription factors can modulate gene 7,196,245), which claims the benefit of U.S. provisional 35 expression, either increasing or decreasing (inducing or applications 60/411,837, filed Sep. 18, 2002 and 60/434,166, repressing) the rate of transcription. This modulation results filed Dec. 17, 2002. The entire contents of each of these in differential levels of gene expression at various develop applications are hereby incorporated by reference. mental stages, in different tissues and cell types, and in response to different exogenous (e.g., environmental) and JOINT RESEARCH AGREEMENT 40 endogenous stimuli throughout the life cycle of the organism. Because transcription factors are key controlling elements The claimed invention, in the field of functional genomics of biological pathways, altering the expression levels of one and the characterization of plant genes for the improvement or more transcription factors can change entire biological of plants, was made by or on behalf of Mendel Biotechnology, pathways in an organism. This may include the alteration of Inc. and Monsanto Company as a result of activities under 45 development pathways in specific tissues and cell types. We taken within the scope of a joint research agreement in effect have, in fact, identified closely-related CCAAT-box family on or before the date the claimed invention was made. transcription factors, including G3397 (SEQ ID NO: 2), FIELD OF THE INVENTION G3476 (SEQID NO: 18) and other closely-related CCAAT box sequences that accelerate flowering time in plants. These The present invention relates to plant genomics and plant 50 discoveries were made by developing numerous transformed improvement, decreasing the time to flower development, or transgenic plant lines and analyzing the plants for an accel and increasing the yield that may be obtained from plants. erated time to flower development (i.e., an “early flowering phenotype). In so doing, we have identified important poly BACKGROUND OF THE INVENTION nucleotide and polypeptide sequences for producing com 55 mercially valuable plants and crops as well as the methods for Due to increasing food production needs for a burgeoning making them and using them. Other aspects and embodi global population, a significant amount of biotechnology ments of the invention are described below and can be derived research is being devoted to increasing the yield of crop from the teachings of this disclosure as a whole. plants. Timing of flowering can have a significant impact on production of agricultural products. For example, varieties 60 SUMMARY OF THE INVENTION with different flowering responses to environmental cues are necessary to adapt crops to different production regions or The present invention pertains to transformed plants that systems. Such a range of varieties have been developed for comprise and overexpress a G482 subclade polypeptide many crops, including wheat, corn, Soybean, and strawberry. sequence, including G3397 (SEQID NO: 2), G3476 (SEQID Improved methods for alteration of flowering time will facili 65 NO: 18) and closely and evolutionarily-related sequences. tate the development of new, geographically adapted variet The sequences of the invention are further characterized by 1CS a consensus subsequence comprising SEQ ID NO: 60. The US 7,868,229 B2 3 4 polypeptides are overexpressed in plants after target plants A FastA formatted alignment was then used to generate a are transformed with an expression vector that comprises a phylogenetic tree in MEGA2using the neighborjoining algo recombinant nucleic acid sequence encoding a polypeptide rithm and a p-distance model. A test of phylogeny was done having at least 93% or at least 95% amino acid identity with via bootstrap with 100 replications and Random Speed set to the conserved central B domain of SEQID NO: 18 or SEQID 5 default. Cut off values of the bootstrap tree were set to 50%. NO: 2, respectively. When the polypeptide is overexpressed G482 subclade transcription factors of the broader non-LEC in the transformed plant, the plant that is produced generally 1-like clade of transcription factors found in the L1 L-related flowers earlier than a control plant. CCAAT transcription factor family are derived from a com The invention is also directed to transformed seed pro mon single node (arrow). Of particular interest are the duced by any of the transformed plants of the invention, 10 sequences that are most closely related to rice G3397 (SEQ wherein the transformed seed comprises a transcription factor ID NO: 2) and soy G3476 (SEQID NO: 18) are found in the sequence of the invention. The presently disclosed subject box in this figure. The sequences within this box that have matter also provides methods for producing a transformed been introduced into plants, including sequences from both plant seed. In some embodiments, the method comprises (a) monocots and , have conferred an early flowering transforming a plant cell with an expression vector compris 15 phenotype. Other sequences within the G482 subclade, ing a polynucleotide sequence encoding a transcription factor including G3875 and G3876, have also shown the ability to polypeptide of the invention, or a fragment or derivative produce accelerated flowering when these sequences are thereof; (b) regenerating a plant from the transformed plant overexpressed. The sequences within the box of FIG. 3 that cell; and (c) isolating a transformed seed from the regenerated conferred early flowering in Arabidopsis plants have B plant. In some embodiments, the seed may be grown into a domains with at least 95% identity to the B domain of G3397, plant that has an early flowering phenotype relative to a con or at least 93% identity to the B domain of G3476. SEQ ID trol plant. NOs: of the sequences found in FIG. 3 are provided in the parentheses. BRIEF DESCRIPTION OF THE SEQUENCE In FIGS. 4A-4F, HAP3 polypeptides from Arabidopsis, LISTING AND DRAWINGS 25 Soybean, rice, corn and Physcomitrella are aligned with G3397 and G3476. The A domains of these proteins appear in The Sequence Listing provides exemplary polynucleotide FIGS. 4A-4B before the box in FIG. 4B (i.e., from the N-ter and polypeptide sequences of the invention. The traits asso mini to the box) and the C domains are shown in FIGS. 4C-4F ciated with the use of the sequences are included in the after the box in FIG. 4C (i.e., from the box to the C-termini). Examples. 30 SEQID NOs of sequences in FIGS. 4A-4F are found within CD-ROMs Copy 1 and Copy 2, and the CRF copy (Copy 3) the parentheses after the Gene Identification Numbers (GIDs: of the Sequence Listing under CFR Section 1.821 (e), are e.g., “G3397 or “G3476”). read-only memory computer-readable compact discs. Each contains a copy of the Sequence Listing in ASCII text format. DETAILED DESCRIPTION The Sequence Listing is named “MBI0073CIPST25.txt”, the 35 electronic file of the Sequence Listing contained on each of The present invention relates to polynucleotides and these CD-ROMs was created on Feb. 12, 2007, and is 98 polypeptides for modifying phenotypes of plants, particularly kilobytes in size. These copies of the Sequence Listing on the those associated with accelerating time to flowering with three CD-ROM discs submitted with this application are respect to a control plant (for example, a wild-type plant or a hereby incorporated by reference in their entirety. 40 plant transformed with an "empty' vector lacking a gene of FIG. 1 shows a conservative estimate of phylogenetic rela interest). Throughout this disclosure, various information tionships among the orders of flowering plants (modified Sources are referred to and/or are specifically incorporated. from Soltis et al., 1997). Those plants with a single cotyledon The information sources include scientific journal articles, (monocots) are a monophyletic clade nested within at least patent documents, textbooks, and World WideWeb browser two major lineages of dicots; the eudicots are further divided 45 inactive page addresses. While the reference to these infor into rosids and . Arabidopsis is a rosid eudicot clas mation Sources clearly indicates that they can be used by one sified within the order Brassicales; rice is a member of the of skill in the art, each and every one of the information monocot order Poales. FIG. 1 was adapted from Daly et al., Sources cited herein are specifically incorporated in their 2001. entirety, whether or not a specific mention of “incorporation FIG. 2 shows a phylogenic dendogram depicting phyloge 50 by reference' is noted. The contents and teachings of each and netic relationships of higher plant taxa, including clades con every one of the information sources can be relied on and used taining tomato and Arabidopsis; adapted from Ku et al., 2000; to make and use embodiments of the invention. and Chase et al., 1993. As used herein and in the appended claims, the singular FIG. 3 illustrates the phylogenic relationship of a number forms “a”, “an', and “the include the plural reference unless of sequences within the G482 subclade. The phylogenetic 55 the context clearly dictates otherwise. Thus, for example, a tree and multiple sequence alignments of G3397 and related reference to “a host cell includes a plurality of such host full length proteins were constructed using ClustalW cells, and a reference to “a stress' is a reference to one or more (CLUSTALW Multiple Sequence Alignment Program ver stresses and equivalents thereof known to those skilled in the sion 1.83, 2003) and MEGA2 (www.megasoftware.net) soft art, and so forth. ware. The ClustalW multiple alignment parameters were: 60 Gap Opening Penalty: 10.00 DEFINITIONS Gap Extension Penalty: 0.20 Delay divergent sequences: 30% “Polynucleotide' is a nucleic acid molecule comprising a DNA Transitions Weight: 0.50 plurality of polymerized nucleotides, e.g., at least about 15 Protein weight matrix: Gonnet series 65 consecutive polymerized nucleotides. A polynucleotide may DNA weight matrix: IUB be a nucleic acid, oligonucleotide, nucleotide, or any frag Use negative matrix OFF. ment thereof. In many instances, a polynucleotide comprises US 7,868,229 B2 5 6 a nucleotide sequence encoding a polypeptide (or protein) or ally, the polypeptide may comprise: (i) a localization domain; a domain or fragment thereof. Additionally, the polynucle (ii) an activation domain; (iii) a repression domain; (iv) an otide may comprise a promoter, an intron, an enhancer region, oligomerization domain; (v) a protein-protein interaction a polyadenylation site, a translation initiation site, 5' or 3' domain; (vi) a DNA-binding domain; or the like. The untranslated regions, a reporter gene, a selectable marker, or polypeptide optionally comprises modified amino acid resi the like. The polynucleotide can be single-stranded or double dues, naturally occurring amino acid residues not encoded by stranded DNA or RNA. The polynucleotide optionally com a codon, non-naturally occurring amino acid residues. prises modified bases or a modified backbone. The polynucle “Protein’ refers to an amino acid sequence, oligopeptide, otide can be, e.g., genomic DNA or RNA, a transcript (such as peptide, polypeptide or portions thereof whether naturally an mRNA), a cDNA, a PCR product, a cloned DNA, a syn 10 occurring or synthetic. thetic DNA or RNA, or the like. The polynucleotide can be “Portion', as used herein, refers to any part of a protein combined with carbohydrate, lipids, protein, or other materi used for any purpose, but especially for the screening of a als to perform a particular activity Such as transformation or library of molecules which specifically bind to that portion or form a useful composition Such as a peptide nucleic acid for the production of antibodies. (PNA). The polynucleotide can comprise a sequence in either 15 A “recombinant polypeptide' is a polypeptide produced by sense orantisense orientations. “Oligonucleotide' is Substan translation of a recombinant polynucleotide. A “synthetic tially equivalent to the terms amplimer, primer, oligomer, polypeptide' is a polypeptide created by consecutive poly element, target, and probe and is preferably single-stranded. merization of isolated amino acid residues using methods A “recombinant polynucleotide' is a polynucleotide that is well known in the art. An "isolated polypeptide, whether a not in its native state, e.g., the polynucleotide comprises a naturally occurring or a recombinant polypeptide, is more nucleotide sequence not found in nature, or the polynucle enriched in (or out of) a cell than the polypeptide in its natural otide is in a context other than that in which it is naturally state in a wild-type cell, e.g., more than about 5% enriched, found, e.g., separated from nucleotide sequences with which more than about 10% enriched, or more than about 20%, or it typically is in proximity in nature, or adjacent (or contigu more than about 50%, or more, enriched, i.e., alternatively ous with) nucleotide sequences with which it typically is not 25 denoted: 105%, 110%, 120%, 150% or more, enriched rela in proximity. For example, the sequence at issue can be tive to wildtype standardized at 100%. Such an enrichment is cloned into a vector, or otherwise recombined with one or not the result of a natural response of a wild-type plant. more additional nucleic acid. Alternatively, or additionally, the isolated polypeptide is An "isolated polynucleotide' is a polynucleotide, whether separated from other cellular components with which it is naturally occurring or recombinant, that is present outside the 30 typically associated, e.g., by any of the various protein puri cell in which it is typically found in nature, whether purified fication methods herein. or not. Optionally, an isolated polynucleotide is subject to one "Homology” refers to sequence similarity between a ref or more enrichment or purification procedures, e.g., cell lysis, erence sequence and at least a fragment of a newly sequenced extraction, centrifugation, precipitation, or the like. clone insert or its encoded amino acid sequence. "Gene' or “gene sequence” refers to the partial or complete 35 “Identity” or “similarity” refers to sequence similarity coding sequence of a gene, its complement, and its 5' or 3' between two polynucleotide sequences or between two untranslated regions. A gene is also a functional unit of inher polypeptide sequences, with identity being a more strict com itance, and in physical terms is a particular segment or parison. The phrases “percent identity” and “96 identity” refer sequence of nucleotides along a molecule of DNA (or RNA, to the percentage of sequence similarity found in a compari in the case of RNA viruses) involved in producing a polypep 40 Son of two or more polynucleotide sequences or two or more tide chain. The latter may be subjected to Subsequent process polypeptide sequences. “Sequence similarity refers to the ing such as chemical modification or folding to obtain a percent similarity in base pair sequence (as determined by any functional protein or polypeptide. A gene may be isolated, suitable method) between two or more polynucleotide partially isolated, or found with an organism's genome. By sequences. Two or more sequences can be anywhere from way of example, a transcription factor gene encodes a tran 45 0-100% similar, or any integer value therebetween. Identity Scription factor polypeptide, which may be functional or or similarity can be determined by comparing a position in require processing to function as an initiator of transcription. each sequence that may be aligned for purposes of compari Operationally, genes may be defined by the cis-trans test, a son. When a position in the compared sequence is occupied genetic test that determines whether two mutations occur in by the same nucleotide base oramino acid, then the molecules the same gene and that may be used to determine the limits of 50 are identical at that position. A degree of similarity or identity the genetically active unit (Rieger et al., 1976). A gene gen between polynucleotide sequences is a function of the num erally includes regions preceding (“leaders'; upstream) and ber of identical, matching or corresponding nucleotides at following (“trailers'; downstream) the coding region. A gene positions shared by the polynucleotide sequences. A degree may also include intervening, non-coding sequences, referred of identity of polypeptide sequences is a function of the to as “introns', located between individual coding segments, 55 number of identical amino acids at corresponding positions referred to as “exons'. Most genes have an associated pro shared by the polypeptide sequences. A degree of homology moter region, a regulatory sequence 5' of the transcription or similarity of polypeptide sequences is a function of the initiation codon (there are some genes that do not have an number of amino acids at corresponding positions shared by identifiable promoter). The function of a gene may also be the polypeptide sequences. regulated by enhancers, operators, and other regulatory ele 60 By “substantially identical' is meant an amino acid mentS. sequence which differs only by conservative amino acid Sub A "polypeptide' is an amino acid sequence comprising a stitutions, for example, Substitution of one amino acid for plurality of consecutive polymerized amino acid residues another of the same class (e.g., valine for glycine, arginine for e.g., at least about 15 consecutive polymerized amino acid lysine, etc.) or by one or more non-conservative Substitutions, residues. In many instances, a polypeptide comprises a poly 65 deletions, or insertions located at positions of the amino acid merized amino acid residue sequence that is a transcription sequence which do not destroy the function of the protein factor or a domain or portion or fragment thereof. Addition assayed. (e.g., as described herein). Preferably, such a US 7,868,229 B2 7 8 sequence has at least 93% or greater identity with a sequence example, the sequence A-C-G-T (5'->3') forms hydrogen of the invention, such as at least 93% or greater identity with bonds with its complements A-C-G-T (5'->3') or A-C-G-U the B domain of SEQ ID NO: 18 or at least 95% or greater (5'->3'). Two single-stranded molecules may be considered identity with the B domain of SEQID NO: 2. partially complementary, if only some of the nucleotides Alignment” refers to a number of nucleotide bases or 5 bond, or “completely complementary’ if all of the nucle amino acid residue sequences aligned by lengthwise com otides bond. The degree of complementarity between nucleic parison so that components in common (i.e., nucleotide bases acid strands affects the efficiency and strength of hybridiza or amino acid residues at corresponding positions) may be tion and amplification reactions. “Fully complementary visually and readily identified. The fraction or percentage of refers to the case where bonding occurs between every base components in common is related to the homology or identity 10 pair and its complement in a pair of sequences, and the two between the sequences. Alignments such as those of FIGS. sequences have the same number of nucleotides. 4A-4F may be used to identify conserved B domains and The terms “highly stringent' or “highly stringent condi relatedness within these domains. An alignment may suitably tion” refer to conditions that permit hybridization of DNA be determined by means of computer programs known in the Strands whose sequences are highly complementary, wherein art, such as MACVECTOR software (1999) (Accelrys, Inc., 15 these same conditions exclude hybridization of significantly San Diego, Calif.). mismatched DNAs. Polynucleotide sequences capable of A “conserved domain or “conserved region' as used hybridizing under stringent conditions with the polynucle herein refers to a region in heterologous polynucleotide or otides of the present invention may be, for example, variants polypeptide sequences where there is a relatively high degree of the disclosed polynucleotide sequences, including allelic of sequence identity between the distinct sequences. With or splice variants, or sequences that encode orthologs or para respect to polynucleotides encoding presently disclosed logs of presently disclosed polypeptides. Nucleic acid polypeptides, a conserved domain is preferably at least nine hybridization methods are disclosed in detail by Kashima et base pairs (bp) in length. Transcription factor sequences that al., 1985, Sambrook et al., 1989, and by Haymes et al., 1985, possess or encode for conserved domains that have a mini which references are incorporated herein by reference. mum percentage identity and have comparable biological 25 In general, stringency is determined by the temperature, activity to the present polypeptide sequences, thus being ionic strength, and concentration of denaturing agents (e.g., members of the same clade or subclade of transcription factor formamide) used in a hybridization and washing procedure polypeptides, are encompassed by the invention. Overexpres (for a more detailed description of establishing and determin sion in a transformed plant of a polypeptide that comprises, ing stringency, see the section “Identifying Polynucleotides for example, a conserved domain having DNA-binding, acti 30 or Nucleic Acids by Hybridization', below). The degree to Vation or nuclear localization activity results in the trans which two nucleic acids hybridize under various conditions formed plant having similar improved traits as other trans of stringency is correlated with the extent of their similarity. formed plants overexpressing other members of the same lade Thus, similar nucleic acid sequences from a variety of or Subclade of transcription factor polypeptides. Sources, such as within a plant's genome (as in the case of A fragment or domain can be referred to as outside a 35 paralogs) or from another plant (as in the case of orthologs) conserved domain, outside a consensus sequence, or outside that may perform similar functions can be isolated on the a consensus DNA-binding site that is known to exist or that basis of their ability to hybridize with known related poly exists for a particular polypeptide class, family, or Sub-family. nucleotide sequences. Numerous variations are possible in In this case, the fragment or domain will not include the exact the conditions and means by which nucleic acid hybridization amino acids of a consensus sequence or consensus DNA 40 can be performed to isolate related polynucleotide sequences binding site of a transcription factor class, family or Sub having similarity to sequences known in the art and are not family, or the exact amino acids of a particular transcription limited to those explicitly disclosed herein. Such an approach factor consensus sequence or consensus DNA-binding site. may be used to isolate polynucleotide sequences having vari Furthermore, a particular fragment, region, or domain of a ous degrees of similarity with disclosed polynucleotide polypeptide, or a polynucleotide encoding a polypeptide, can 45 be “outside a conserved domain if all the amino acids of the sequences, such as, for example, encoded transcription fac fragment, region, or domain fall outside of a defined con tors having 93% or 95% or greater identity with the conserved served domain(s) for a polypeptide or protein. Sequences B domain of disclosed sequences. having lesser degrees of identity but comparable biological The terms “paralog and “ortholog” are defined below in activity are considered to be equivalents. 50 the section entitled “Orthologs and Paralogs”. In brief, As one of ordinary skill in the art recognizes, conserved orthologs and paralogs are evolutionarily related genes that domains may be identified as regions or domains of identity to have similar sequences and functions. Orthologs are structur a specific consensus sequence (see, for example, Riechmann ally related genes in different species that are derived by a et al., 2000a, 2000b). Thus, by using alignment methods well speciation event. Paralogs are structurally related genes known in the art, the conserved domains of the plant polypep 55 within a single species that are derived by a duplication event. tides may be determined. The term “equivalog describes members of a set of The conserved B domains for many of the polypeptide homologous proteins that are conserved with respect to func sequences of the invention are listed in Tables 2 and 3. Also, tion since their last common ancestor. Related proteins are the polypeptides of FIGS. 4A-4F and Tables 2 and 3 have grouped into equivalog families, and otherwise into protein conserved B domains specifically indicated by amino acid 60 families with other hierarchically defined homology types. coordinate start and stop sites. A comparison of the regions of This definition is provided at the Institute for Genomic these polypeptides allows one of skill in the art (see, for Research (TIGR) WorldWideWeb (www) website, “tigr.org example, Reeves and Nissen, 1995) to identify domains or under the heading “Terms associated with TIGRFAMs. conserved B domains for any of the polypeptides listed or In general, the term “variant” refers to molecules with referred to in this disclosure. 65 Some differences, generated synthetically or naturally, in “Complementary” refers to the natural hydrogen bonding their base oramino acid sequences as compared to a reference by base pairing between purines and pyrimidines. For (native) polynucleotide or polypeptide, respectively. These US 7,868,229 B2 10 differences include Substitutions, insertions, deletions or any Differences between presently disclosed polypeptides and desired combinations of such changes in a native polynucle polypeptide variants are limited so that the sequences of the otide of amino acid sequence. former and the latter are closely similar overall and, in many With regard to polynucleotide variants, differences regions, identical. Presently disclosed polypeptide sequences between presently disclosed polynucleotides and polynucle 5 and similar polypeptide variants may differ in amino acid otide variants are limited so that the nucleotide sequences of sequence by one or more substitutions, additions, deletions, the former and the latter are closely similar overall and, in fusions and truncations, which may be present in any combi many regions, identical. Due to the degeneracy of the genetic nation. These differences may produce silent changes and code, differences between the former and latter nucleotide result in a functionally equivalent polypeptides. Thus, it will sequences may be silent (i.e., the amino acids encoded by the 10 be readily appreciated by those of skill in the art, that any of polynucleotide are the same, and the variant polynucleotide a variety of polynucleotide sequences is capable of encoding sequence encodes the same amino acid sequence as the pres the polypeptides and homolog polypeptides of the invention. ently disclosed polynucleotide. Variant nucleotide sequences A polypeptide sequence variant may have “conservative' may encode different amino acid sequences, in which case changes, wherein a Substituted amino acid has similar struc such nucleotide differences will result in amino acid substi 15 tural or chemical properties. Deliberate amino acid substitu tutions, additions, deletions, insertions, truncations or fusions tions may thus be made on the basis of similarity in polarity, with respect to the similar disclosed polynucleotide charge, solubility, hydrophobicity, hydrophilicity, and/or the sequences. These variations may result in polynucleotide amphipathic nature of the residues, as long as a significant variants encoding polypeptides that share at least one func amount of the functional or biological activity of the polypep tional characteristic. The degeneracy of the genetic code also tide is retained. For example, negatively charged amino acids dictates that many different variant polynucleotides can may include aspartic acid and glutamic acid, positively encode identical and/or Substantially similar polypeptides in charged amino acids may include lysine and arginine, and addition to those sequences illustrated in the Sequence List amino acids with uncharged polar head groups having similar ing. hydrophilicity values may include leucine, isoleucine, and Also within the scope of the invention is a variant of a 25 Valine; glycine and alanine; asparagine and glutamine; serine nucleic acid listed in the Sequence Listing, that is, one having and threonine; and phenylalanine and tyrosine. More rarely, a a sequence that differs from the one of the polynucleotide variant may have “non-conservative changes, e.g., replace sequences in the Sequence Listing, or a complementary ment of a glycine with a tryptophan. Similar minor variations sequence, that encodes a functionally equivalent polypeptide may also include amino acid deletions or insertions, or both. (i.e., a polypeptide having some degree of equivalent or simi 30 Related polypeptides may comprise, for example, additions lar biological activity) but differs in sequence from the and/or deletions of one or more N-linked or O-linked glyco sequence in the Sequence Listing, due to degeneracy in the sylation sites, or an addition and/or a deletion of one or more genetic code. Included within this definition are polymor cysteine residues. Guidance in determining which and how phisms that may or may not be readily detectable using a many amino acid residues may be substituted, inserted or particular oligonucleotide probe of the polynucleotide encod 35 deleted without abolishing functional or biological activity ing polypeptide, and improper or unexpected hybridization to may be found using computer programs well known in the art, allelic variants, with a locus other than the normal chromo for example, DNASTAR software (see U.S. Pat. No. 5,840, Somal locus for the polynucleotide sequence encoding 544). polypeptide. “Fragment', with respect to a polynucleotide, refers to a Allelic variant' or “polynucleotide allelic variant” refers 40 clone or any part of a polynucleotide molecule that retains a to any of two or more alternative forms of a gene occupying usable, functional characteristic. Useful fragments include the same chromosomal locus. Allelic variation arises natu oligonucleotides and polynucleotides that may be used in rally through mutation, and may result in phenotypic poly hybridization or amplification technologies or in the regula morphism within populations. Gene mutations may be tion of replication, transcription or translation. A polynucle 'silent” or may encode polypeptides having altered amino 45 otide fragment” refers to any Subsequence of a polynucle acid sequence. “Allelic variant' and “polypeptide allelic vari otide, typically, of at least about 9 consecutive nucleotides, ant’ may also be used with respect to polypeptides, and in this preferably at least about 30 nucleotides, more preferably at case the terms refer to a polypeptide encoded by an allelic least about 50 nucleotides, of any of the sequences provided variant of a gene. herein. Exemplary polynucleotide fragments are the first “Splice variant' or “polynucleotide splice variant as used 50 sixty consecutive nucleotides of the polynucleotides listed in herein refers to alternative forms of RNA transcribed from a the Sequence Listing. Exemplary fragments also include gene. Splice variation naturally occurs as a result of alterna fragments that comprise a region that encodes a conserved B tive sites being spliced within a single transcribed RNA mol domain of a polypeptide. Exemplary fragments also include ecule or between separately transcribed RNA molecules, and fragments that comprise a conserved domain of a polypep may result in several different forms of mRNA transcribed 55 tide. from the same gene. Thus, splice variants may encode Fragments may also include Subsequences of polypeptides polypeptides having different amino acid sequences, which and protein molecules, or a Subsequence of the polypeptide. may or may not have similar functions in the organism. Fragments may have uses in that they may have antigenic “Splice variant' or “polypeptide splice variant may also potential. In some cases, the fragment or domain is a Subse refer to a polypeptide encoded by a splice variant of a tran 60 quence of the polypeptide which performs at least one bio scribed mRNA. logical function of the intact polypeptide in Substantially the As used herein, “polynucleotide variants' may also refer to same manner, or to a similar extent, as does the intact polynucleotide sequences that encode paralogs and orthologs polypeptide. For example, a polypeptide fragment can com of the presently disclosed polypeptide sequences. “Polypep prise a recognizable structural motif or functional domain tide variants' may refer to polypeptide sequences that are 65 such as a DNA-binding site or domain that binds to a DNA paralogs and orthologs of the presently disclosed polypeptide promoter region, an activation domain, or a domain for pro Sequences. tein-protein interactions, and may initiate transcription. Frag US 7,868,229 B2 11 12 ments can vary in size from as few as 3 amino acid residues to wild-type plant of the same species, variety or cultivar. The the full length of the intact polypeptide, but are preferably at genetic material may include a regulatory element, a trans least about 30 amino acid residues in length and more pref gene (for example, a foreign transcription factor sequence), erably at least about 60 amino acid residues in length. an insertional mutagenesis event (such as by transposon or The invention also encompasses production of DNA T-DNA insertional mutagenesis), an activation tagging sequences that encode polypeptides and derivatives, or frag sequence, a mutated sequence, a homologous recombination ments thereof, entirely by synthetic chemistry. After produc event or a sequence modified by chimeraplasty. In some tion, the synthetic sequence may be inserted into any of the embodiments the regulatory and transcription factor many available expression vectors and cell Systems using sequence may be derived from the host plant, but by their reagents well known in the art. Moreover, synthetic chemistry 10 incorporation into an expression vector of cassette, represent may be used to introduce mutations into a sequence encoding an arrangement of the polynucleotide sequences not found a polypeptides or any fragment thereof. wild-type plant of the same species, variety or cultivar. “Derivative' refers to the chemical modification of a An “untransformed plant' is a plant that has not been nucleic acid molecule or amino acid sequence. Chemical through the transformation process. modifications can include replacement of hydrogen by an 15 A “stably transformed plant, plant cell or plant tissue has alkyl, acyl, or amino group or glycosylation, pegylation, or generally been selected and regenerated on a selection media any similar process that retains or enhances biological activ following transformation. ity or lifespan of the molecule or sequence. An expression vector or cassette typically comprises a The term “plant' includes whole plants, shoot vegetative polypeptide-encoding sequence operably linked (i.e., under organs/structures (for example, leaves, stems and tubers), regulatory control of) to appropriate inducible or constitutive roots, flowers and floral organs/structures (for example, regulatory sequences that allow for the controlled expression bracts, sepals, petals, stamens, carpels, anthers and ovules), of polypeptide. The expression vector or cassette can be intro seed (including embryo, endosperm, and seed coat) and fruit duced into a plant by transformation or by breeding after (the mature ovary), plant tissue (for example, vascular tissue, transformation of a parent plant. A plant refers to a whole ground tissue, and the like) and cells (for example, guard 25 plantas well as to a plant part, Such as seed, fruit, leaf, or root, cells, egg cells, and the like), and progeny of same. The class plant tissue, plant cells or any other plant material, e.g., a plant of plants that can be used in the method of the invention is explant, as well as to progeny thereof, and to in vitro systems generally as broad as the class of higher and lower plants that mimic biochemical or cellular components or processes amenable to transformation techniques, including in a cell. angiosperms (monocotyledonous and dicotyledonous 30 “Wildtype' or “wild-type', as used herein, refers to a plant plants), gymnosperms, ferns, horsetails, psilophytes, lyco cell, seed, plant component, plant tissue, plant organ or whole phytes, bryophytes, and multicellular algae (see for example, plant that has not been genetically modified or treated in an FIG. 1, adapted from Daly et al., 2001, FIG. 2, adapted from experimental sense. Wild-type cells, seed, components, tis Ku et al., 2000; and see also Tudge, 2000. Sue, organs or whole plants may be used as controls to com A “control plant” as used in the present invention refers to 35 pare levels of expression and the extent and nature of trait a plant cell, seed, plant component, plant tissue, plant organ or modification with cells, tissue or plants of the same species in whole plant used to compare against transformed, transgenic which a polypeptide's expression is altered, e.g., in that it has or genetically modified plant for the purpose of identifying an been knocked out, overexpressed, or ectopically expressed. enhanced phenotype in the transformed, transgenic or geneti A “trait” refers to a physiological, morphological, bio cally modified plant. A control plant may in some cases be a 40 chemical, or physical characteristic of a plant or particular transformed or transgenic plant line that comprises an empty plant material or cell. In some instances, this characteristic is vector or marker gene, but does not contain the recombinant visible to the human eye. Such as seed or plant size, or can be polynucleotide of the present invention that is expressed in measured by biochemical techniques, such as detecting the the transformed, transgenic or genetically modified plant protein, starch, or oil content of seed or leaves, or by obser being evaluated. In general, a control plant is a plant of the 45 Vation of a metabolic or physiological process, e.g. by mea same line or variety as the transformed, transgenic or geneti Suring tolerance to water deprivation or particular salt or cally modified plant being tested. A suitable control plant Sugar concentrations, or by the observation of the expression would include a genetically unaltered or non-transgenic plant level of agene or genes, e.g., by employing Northern analysis, of the parental line used to generate a transformed or trans RT-PCR, microarray gene expression assays, or reporter gene genic plant herein. 50 expression systems, or by agricultural observations such as a “Transformation” refers to the transfer of a foreign poly decreased time to flowering or an increased yield. Any tech nucleotide sequence into the genome of a host organism Such nique can be used to measure the amount of comparative as that of a plant or plant cell. Typically, the foreign genetic level of, or difference in any selected chemical compound or material has been introduced into the plant by human manipu macromolecule in the transformed or transgenic plants, how lation, but any method can be used as one of skill in the art 55 eVe. recognizes. Examples of methods of plant transformation “Trait modification” refers to a detectable difference in a include Agrobacterium-mediated transformation (De Blaere characteristic in a plant ectopically expressing a polynucle et al., 1987) and biolistic methodology (Klein et al., 1987: otide or polypeptide of the present invention relative to a plant U.S. Pat. No. 4,945,050). not doing so. Such as a wild-type plant. In some cases, the trait A “transformed plant, which may also be referred to as a 60 modification can be evaluated quantitatively. For example, “transgenic plant’ or “transformant', generally refers to a the trait modification can entail at least about a 2% increase or plant, a plant cell, plant tissue, seed or calli that has been decrease, or an even greater difference, in an observed trait as through, or is derived from a plant that has been through, a compared with a control or wild-type plant. It is known that transformation process in which an expression vector or cas there can be a natural variation in the modified trait. There sette that contains at least one foreign polynucleotide 65 fore, the trait modification observed entails a change of the sequence is introduced into the plant. The expression vector normal distribution and magnitude of the trait in the plants as or cassette contains genetic material that is not found in a compared to control or wild-type plants. US 7,868,229 B2 13 14 When two or more plants have “similar morphologies, tissue, at any developmental or temporal stage for the gene. “Substantially similar morphologies”, “a morphology that is Overexpression can occur when, for example, the genes substantially similar, or are “morphologically similar, the encoding one or more polypeptides are under the control of a plants have comparable forms or appearances, including strong promoter (e.g., the cauliflower mosaic virus 35S tran analogous features such as overall dimensions, height, width, scription initiation region (SEQID NO: 61). Overexpression mass, root mass, shape, glossiness, color, stem diameter, leaf may also under the control of an inducible or tissue specific size, leaf dimension, leaf density, internode distance, branch promoter. The choice of promoters may include, for example, ing, root branching, number and form of inflorescences, and the ARSK1 root-specific promoter, the RSI 1 root-specific other macroscopic characteristics, and the individual plants promoter, the RBCS3 leaf-specific or photosynthetic-tissue are not readily distinguishable based on morphological char 10 specific-promoter, the SUC2 vascular-specific promoter, the acteristics alone. CUTI epidermal-specific promoter, the LTP1 epidermal-spe “Modulates' refers to a change in activity (biological, cific promoter, the AS1 emergent leaf primordia-specific pro chemical, or immunological) or lifespan resulting from spe moter, or the RD29A stress inducible promoter (SEQID NO: cific binding between a molecule and either a nucleic acid 62-69, respectively). Many of these promoters have been used molecule or a protein. 15 with polynucleotide sequences of the invention to produce The term “transcript profile” refers to the expression levels transgenic plants. These or other inducible or tissue-specific of a set of genes in a cell in a particular state, particularly by promoters may be incorporated into an expression vector comparison with the expression levels of that same set of comprising a transcription factor polynucleotide of the inven genes in a cell of the same type in a reference state. For tion, where the promoter is operably linked to the transcrip example, the transcript profile of a particular polypeptide in a tion factor polynucleotide, can be envisioned and produced. Suspension cell is the expression levels of a set of genes in a Thus, overexpression may occur throughout a plant, in spe cell knocking out or overexpressing that polypeptide com cific tissues of the plant, or in the presence or absence of pared with the expression levels of that same set of genes in a particular environmental signals, depending on the promoter suspension cell that has normal levels of that polypeptide. The used. transcript profile can be presented as a list of those genes 25 Overexpression may take place in plant cells normally whose expression level is significantly different between the lacking expression of polypeptides functionally equivalent or two treatments, and the difference ratios. Differences and identical to the present polypeptides. Overexpression may similarities between expression levels may also be evaluated also occur in plant cells where endogenous expression of the and calculated using statistical and clustering methods. present polypeptides or functionally equivalent molecules With regard to gene knockouts as used herein, the term 30 normally occurs, but Such normal expression is at a lower “knockout” refers to a plant or plant cell having a disruption level. Overexpression thus results in a greater than normal in at least one gene in the plant or cell, where the disruption production, or "overproduction” of the polypeptide in the results in a reduced expression or activity of the polypeptide plant, cell or tissue. encoded by that gene compared to a control cell. The knock The term “transcription regulating region” refers to a DNA out can be the result of for example, genomic disruptions, 35 regulatory sequence that regulates expression of one or more including transposons, tilling, and homologous recombina genes in a plant when a transcription factor having one or tion, antisense constructs, sense constructs, RNA silencing more specific binding domains binds to the DNA regulatory constructs, or RNA interference. AT-DNA insertion within a sequence. Transcription factors possess at least one con gene is an example of a genotypic alteration that may abolish served domain. The transcription factors also comprise an expression of that gene. 40 amino acid Subsequence that forms a transcription activation “Ectopic expression' or “altered expression' in reference domain that regulates expression of one or more genes that to a polynucleotide indicates that the pattern of expression in, modulate flowering time in a plant when the transcription e.g., a transformed or transgenic plant or plant tissue, is dif factor binds to the regulating region. ferent from the expression pattern in a wild-type plant or a “Yield' or “plant yield’ refers to increased plant growth, reference plant of the same species. The pattern of expression 45 increased crop growth, increased biomass, and/or increased may also be compared with a reference expression pattern in plant product production, and is dependent to Some extent on a wild-type plant of the same species. For example, the poly temperature, plant size, organ size, planting density, light, nucleotide or polypeptide is expressed in a cell or tissue type water and nutrient availability, and how the plant copes with other than a cell or tissue type in which the sequence is various stresses, such as through temperature acclimation and expressed in the wild-type plant, or by expression at a time 50 water or nutrient use efficiency. other than at the time the sequence is expressed in the wild “Planting density” refers to the number of plants that can be type plant, or by a response to different inducible agents. Such grown per acre. For crop species, planting or population den as hormones or environmental signals, or at different expres sity varies from a crop to a crop, from one growing region to sion levels (either higher or lower) compared with those another, and from year to year. Using corn as an example, the found in a wild-type plant. The term also refers to altered 55 average prevailing density in 2000 was in the range of 20,000 expression patterns that are produced by lowering the levels 25,000 plants per acre in Missouri, USA. A desirable higher of expression to below the detection level or completely abol population density (a measure of yield) would be at least ishing expression. The resulting expression pattern can be 22,000 plants per acre, and a more desirable higher popula transient or stable, constitutive or inducible. In reference to a tion density would be at least 28,000 plants per acre, more polypeptide, the terms “ectopic expression” or “altered 60 preferably at least 34,000 plants per acre, and most preferably expression further may relate to altered activity levels result at least 40,000 plants per acre. The average prevailing densi ing from the interactions of the polypeptides with exogenous ties per acre of a few other examples of crop plants in the USA or endogenous modulators or from interactions with factors in the year 2000 were: wheat 1,000,000-1,500,000; rice 650, oras a result of the chemical modification of the polypeptides. 000-900,000; soybean 150,000-200,000, canola 260,000 The term “overexpression' as used herein refers to a 65 350,000, sunflower 17,000-23,000 and cotton 28,000-55,000 greater expression level of a gene in a plant, plant cell or plant plants per acre (Cheikh et al., 2003) U.S. Patent Application tissue, compared to expression in a wild-type plant, cell or No. 20030101479). A desirable higher population density for US 7,868,229 B2 15 16 each of these examples, as well as other valuable species of include Peng et al., 1997 and Peng et al., 1999. In addition, plants, would be at least 10% higher than the average prevail many others have demonstrated that an Arabidopsis tran ing density or yield. Scription factor expressed in an exogenous plant species elic its the same or very similar phenotypic response. See, for DESCRIPTION OF THE SPECIFIC example, Fu et al., 2001; Nandiet al., 2000; Coupland, 1995: EMBODIMENTS and Weigel and Nilsson, 1995). In another example, Mandelet al., 1992b, and Suzuki et al., Transcription Factors Modify Expression of Endogenous 2001, teach that a transcription factor expressed in another Genes plant species elicits the same or very similar phenotypic A transcription factor may include, but is not limited to, 10 response of the endogenous sequence, as often predicted in any polypeptide that can activate or repress transcription of a earlier studies of Arabidopsis transcription factors in Arabi single gene or a number of genes. As one of ordinary skill in dopsis (see Mandel et al., 1992a: Suzuki et al., 2001). Other the art recognizes, transcription factors can be identified by examples include Müller et al., 2001; Kim et al., 2001; Kyo the presence of a region or domain of structural similarity or Zuka and Shimamoto, 2002; Boss and Thomas, 2002; He et identity to a specific consensus sequence or the presence of a 15 al., 2000; and Robson et al., 2001. specific consensus DNA-binding motif (see, for example, In yet another example, Gilmour et al., 1998, teach an Riechmann et al., 2000a). The plant transcription factors of Arabidopsis AP2 transcription factor, CBF1, which, when the present invention are transcription factors. overexpressed in transgenic plants, increases plant freezing Generally, transcription factors are involved in cell differ tolerance. Jaglo et al., 2001, further identified sequences in entiation and proliferation and the regulation of growth. Brassica napus which encode CBF-like genes and that tran Accordingly, one skilled in the art would recognize that by Scripts for these genes accumulated rapidly in response to low expressing the present sequences in a plant, one may change temperature. Transcripts encoding CBF-like proteins were the expression of autologous genes or induce the expression also found to accumulate rapidly in response to low tempera of introduced genes. By affecting the expression of similar ture in wheat, as well as in tomato. An alignment of the CBF autologous sequences in a plant that have the biological activ 25 proteins from Arabidopsis, B. napus, wheat, rye, and tomato ity of the present sequences, or by introducing the present revealed the presence of conserved consecutive amino acid sequences into a plant, one may alter a plants phenotype to residues, PKK/RPAGRXKFXETRHP (SEQID NO: 70) and one with improved traits related to an accelerated or early DSAWR (SEQID NO: 71), which bracket the AP2/EREBP flowering time. The sequences of the invention may also be DNA binding domains of the proteins and distinguish them used to transform a plant and introduce desirable traits not 30 from other members of the AP2/EREBP protein family. (Ja found in the wild-type cultivar or strain. Plants may then be glo et al., 2001). selected for those that produce the most desirable degree of Transcription factors mediate cellular responses and con over- or under-expression of target genes of interest and coin trol traits through altered expression of genes containing cis cident trait improvement. acting nucleotide sequences that are targets of the introduced The sequences of the present invention may be from any 35 transcription factor. It is well appreciated in the art that the species, particularly plant species, in a naturally occurring effect of a transcription factor on cellular responses or a form or from any source whether natural, synthetic, semi cellular trait is determined by the particular genes whose synthetic or recombinant. The sequences of the invention may expression is either directly or indirectly (e.g., by a cascade of also include fragments of the present amino acid sequences. transcription factor binding events and transcriptional Where “amino acid sequence” is recited to refer to an amino 40 changes) altered by transcription factor binding. In a global acid sequence of a naturally occurring protein molecule, analysis of transcription comparing a standard condition with “amino acid sequence” and like terms are not meant to limit one in which a transcription factor is overexpressed, the the amino acid sequence to the complete native amino acid resulting transcript profile associated with transcription fac sequence associated with the recited protein molecule. tor overexpression is related to the trait or cellular process In addition to methods for modifying a plant phenotype by 45 controlled by that transcription factor. For example, the PAP2 employing one or more polynucleotides and polypeptides of gene and other genes in the MYB family have been shown to the invention described herein, the polynucleotides and control anthocyanin biosynthesis through regulation of the polypeptides of the invention have a variety of additional expression of genes known to be involved in the anthocyanin uses. These uses include their use in the recombinant produc biosynthetic pathway (Bruce et al., 2000; and Borevitz et al., tion (i.e., expression) of proteins; as regulators of plant gene 50 2000). Further, global transcript profiles have been used suc expression, as diagnostic probes for the presence of comple cessfully as diagnostic tools for specific cellular states (e.g., mentary or partially complementary nucleic acids (including cancerous vs. non-cancerous; Bhattacharjee et al., 2001); and for detection of natural coding nucleic acids); as Substrates Xu et al., 2001). Consequently, it is evident to one skilled in for further reactions, e.g., mutation reactions, PCR reactions, the art that similarity of transcript profile upon overexpres or the like; as Substrates for cloning e.g., including digestion 55 sion of different transcription factors would indicate similar or ligation reactions; and for identifying exogenous or endog ity of transcription factor function. enous modulators of the transcription factors. The polynucle Polypeptides and Polynucleotides of the Invention otide can be, e.g., genomic DNA or RNA, a transcript (such as The present invention includes transcription factors (TFs), an mRNA), a cDNA, a PCR product, a cloned DNA, a syn and isolated or recombinant polynucleotides encoding the thetic DNA or RNA, or the like. The polynucleotide can 60 polypeptides, or novel sequence variant polypeptides or poly comprise a sequence in either sense or antisense orientations. nucleotides encoding novel variants of polypeptides derived Expression of genes that encode polypeptides that modify from the specific sequences provided in the Sequence Listing: expression of endogenous genes, polynucleotides, and pro the recombinant polynucleotides of the invention may be teins are well known in the art. In addition, transformed or incorporated in expression vectors for the purpose of produc transgenic plants comprising isolated polynucleotides encod 65 ing transformed plants. Also provided are methods for accel ing transcription factors may also modify expression of erating the time for a plant to flower, as compared to a control endogenous genes, polynucleotides, and proteins. Examples plant. These methods are based on the ability to alter the US 7,868,229 B2 17 18 expression of critical regulatory molecules that may be con been confirmed in plants, although a recent publication impli served between diverse plant species. Related conserved cates CCT-domain containing proteins as having Such a role regulatory molecules may be originally discovered in a model (Ben-Naim et al., 2006). In yeast, however, the HAP4 protein system such as Arabidopsis and homologous, functional mol provides the primary activation domain (McNabb et al., ecules then discovered in other plant species. The latter may 1995); Olesen and Guarente, 1990). then be used to confer early flower development in diverse HAP2-, HAP3- and HAP5-like proteins have two highly plant species. conserved sub-domains, one that functions in subunit inter Exemplary polynucleotides encoding the polypeptides of action and the other that acts in a direct association with DNA. the invention were identified in the Arabidopsis thaliana Outside these two regions, non-paralogous Arabidopsis GenBank database using publicly available sequence analy 10 HAP-like proteins are quite divergent in sequence and in sis programs and parameters. Sequences initially identified overall length. were then further characterized to identify sequences com The general domain structure of HAP3 proteins is found in prising specified sequence strings corresponding to sequence FIG. 4. HAP3 proteins contain an amino-terminal A domain, motifs present infamilies of known polypeptides. In addition, a central B domain and a carboxy-terminal C domain. There further exemplary polynucleotides encoding the polypeptides 15 is very little sequence similarity between HAP3 proteins in of the invention were identified in the plant GenBank data the A and C domains; it is therefore reasonable to assume that base using publicly available sequence analysis programs and the A and C domains could provide a degree of functional parameters. Sequences initially identified were then further specificity to each member of the HAP3 subfamily. characterized to identify sequences comprising specified HAP3-like NF-YB proteins comprise a “conserved pro sequence Strings corresponding to sequence motifs present in tein-protein and DNA-binding interaction module” within families of known polypeptides. their “histone fold motif or “HFM” (Gusmaroliet al., 2002). Additional polynucleotides of the invention were identi The HFM, which is “specific and required for HAP function” fied by Screening Arabidopsis thaliana and/or other plant (Edwards et al., 1998), is comprised within the larger highly cDNA libraries with probes corresponding to known conserved B domain (Lee et al., 2003) which is responsible polypeptides under low stringency hybridization conditions. 25 for DNA binding and Subunit association and is necessary and Additional sequences, including full length coding sufficient for the activity of another HAP3 protein (LEC 1: sequences, were Subsequently recovered by the rapid ampli Lee et al. 2003). According to Gusmaroli et al., 2002, “all fication of cDNA ends (RACE) procedure using a commer residues that constitute the backbone structure of the HFMs cially available kit according to the manufacturers instruc are conserved, and residues such as AtNF-YB-10 N38, K58 tions. Where necessary, multiple rounds of RACE are 30 and Q62, involved in CCAAT-binding, and E67 and E75, performed to isolate 5' and 3' ends. The full-length cDNA was involved in NF-YA association (Maity and de Crombrugghe, then recovered by a routine end-to-end polymerase chain 1998: Zemzoumi et al., 1999), are maintained”. reaction (PCR) using primers specific to the isolated 5' and 3' Phylogenetic trees based on sequential relatedness of the ends. Exemplary sequences are provided in the Sequence HAP3 genes are shown in FIG. 3. The present invention Listing. 35 encompasses the G482 subclade within the non-LEC 1-like The CCAAT Family Members Under Study clade of HAP3 (NF-YB) proteins, for which a representative Transcriptional regulation of most eukaryotic genes occurs number of monocot and dicot species, including members through the binding of transcription factors to sequence spe from dicot and monocot species, have been shown to confer cific binding sites in their promoter regions. Many of these an early flowering time in plants when overexpressed (shown protein binding sites have been conserved through evolution 40 in Tables 2 and 3 in Example V). and are found in the promoters of diverse eukaryotic organ In FIGS. 4A-4F, HAP3 polypeptides from Arabidopsis, isms. One element that shows a high degree of conservation is Soybean, rice, corn and Physcomitrella are aligned with the CCAAT-box (Gelinas et al., 1985). The CCAAT family of G482, with the A, B and C domains and the DNA binding and transcription factors, also be referred to as the “CAAT. subunit interaction domains indicated. The B domains of the “CAAT-box” or “CCAAT-box” family, are characterized by 45 sequences in this non-LEC1-like G482 subclade appearing in their ability to bind to the CCAAT-box element located 80 to the box in FIGS. 4B-4C are generally distinguished by the 300 bp 5' from a transcription start site (Gelinas et al., 1985). conserved residues within the HFM and larger B domain This cis-acting regulatory element is found in all eukaryotic comprised within the subsequence SEQID NO: 60: species and present in the promoter and enhancer regions of ASn(Xaa).Lys(Xaa)-Gln(Xaa). GlucXaa),Glu approximately 30% of genes (Bucher and Trifonov, 1988: 50 where Xaa can be any amino acid residue. The A domains Bucher, 1990). The element can function in either orientation, of these proteins in FIGS. 4A-4F are located before the box and operates alone, or in possible cooperation with other cis (i.e., nearer the N-termini) in FIGS. 4A-4B and the C domains regulatory elements (Tasanen et al., 1992). are located after the box (i.e., nearer the C-termini) in FIGS. Plant CCAAT binding transcription factors potentially 4C-4F. Within the G482 subclade, the A and C domains are bind DNA as heterotrimers composed of HAP2-like, HAP3 55 more variable than the B domain in both length and sequence like and HAP5-like subunits. The heterotrimer is also refer identity. SEQID NOs of the sequences listed in FIGS. 4A-4F enced in the public literature as Nuclear Factor Y (NF-Y), are found with the parentheses. which comprises an NF-YA subunit (corresponding to the Overexpression of the G482 subclade polypeptides com HAP2-like subunit), an NF-YB subunit (corresponding to the prising a central conserved domain containing this Subse HAP3-like subunit) and an NF-YC subunit (corresponding to 60 quence have been shown to confer accelerated flowering in the HAP5-like subunit) (Mantovani, 1999; Gusmaroli et al., transgenic plants, as compared to a non-transformed plant 2001; Gusmaroli et al., 2002). All subunits contain regions that does not overexpress the polypeptide. that are required for DNA binding and subunit association. Orthologs and Paralogs The subunit proteins appear to lack activation domains; there Homologous sequences as described above can comprise fore, that function must come from proteins with which they 65 orthologous or paralogous sequences. Several different meth interact on target promoters. No proteins that provide the ods are known by those of skill in the art for identifying and activation domain function for CCAAT binding factors have defining these functionally homologous sequences. General US 7,868,229 B2 19 20 methods for identifying orthologs and paralogs, including changeable between species without losing function. phylogenetic methods, sequence similarity and hybridization Because plants have common ancestors, many genes in any methods, are described herein; an ortholog or paralog, includ plant species will have a corresponding orthologous gene in ing equivalogs, may be identified by one or more of the another plant species. Once a phylogenic tree for a gene methods described below. family of one species has been constructed using a program As described by Eisen, 1998, evolutionary information such as CLUSTAL (Thompson et al., 1994; Higgins et al., may be used to predict gene function. It is common for groups 1996) potential orthologous sequences can be placed into the of genes that are homologous in sequence to have diverse, phylogenetic tree and their relationship to genes from the although usually related, functions. However, in many cases, species of interest can be determined. Orthologous sequences the identification of homologs is not sufficient to make spe 10 can also be identified by a reciprocal BLAST strategy. Once cific predictions because not all homologs have the same an orthologous sequence has been identified, the function of function. Thus, an initial analysis of functional relatedness the ortholog can be deduced from the identified function of based on sequence similarity alone may not provide one with the reference sequence. a means to determine where similarity ends and functional By using a phylogenetic analysis, one skilled in the art relatedness begins. Fortunately, it is well known in the art that 15 would recognize that the ability to predict similar functions protein function can be classified using phylogenetic analysis conferred by closely-related polypeptides is predictable. This of gene trees combined with the corresponding species. Func predictability has been confirmed by our own many studies in tional predictions can be greatly improved by focusing on which we have found that a wide variety of polypeptides have how the genes became similar in sequence (i.e., by evolution orthologous or closely-related homologous sequences that ary processes) rather than on the sequence similarity itself function as does the first, closely-related reference sequence. (Eisen, 1998). In fact, many specific examples exist in which For example, distinct transcription factors, including: gene function has been shown to correlate well with gene (i) AP2 family Arabidopsis G47 (found in U.S. Pat. No. phylogeny (Eisen, 1998). Thus, “the first step in making 7,135.616), a phylogenetically-related sequence from Soy functional predictions is the generation of a phylogenetic tree bean, and two phylogenetically-related homologs from rice representing the evolutionary history of the gene of interest 25 all can confer greater tolerance to drought, hyperosmotic and its homologs. Such trees are distinct from clusters and stress, or delayed flowering as compared to control plants; other means of characterizing sequence similarity because (ii) CAAT family Arabidopsis G481 (found in PCT patent they are inferred by techniques that help convert patterns of publication WO2004076638), and numerous phylogeneti similarity into evolutionary relationships. . . . After the gene cally-related sequences from dicots and monocots can confer tree is inferred, biologically determined functions of the vari 30 greater tolerance to drought-related stress as compared to ous homologs are overlaid onto the tree. Finally, the structure control plants; of the tree and the relative phylogenetic positions of genes of (iii) Myb-related Arabidopsis G682 (found in PCT patent different functions are used to trace the history of functional publication WO2004076638) and numerous phylogeneti changes, which is then used to predict functions of as yet cally-related sequences from dicots and monocots can confer uncharacterized genes' (Eisen, 1998). 35 greater tolerance to heat, drought-related stress, cold, and salt Within a single plant species, gene duplication may cause as compared to control plants; two copies of a particular gene, giving rise to two or more (iv) WRKY family Arabidopsis G1274 (found in U.S. genes with similar sequence and often similar function known patent application Ser. No. 10/666,642) and numerous as paralogs. A paralog is therefore a similar gene formed by closely-related sequences from dicots and monocots have duplication within the same species. Paralogs typically clus 40 been shown to confer increased water deprivation tolerance, ter together or in the same lade (a group of similar genes) and when a gene family phylogeny is analyzed using programs (v) AT-hook family soy sequence G3456 (found in US such as CLUSTAL (Thompson et al., 1994); Higgins et al., patent publication 2004.0128712A1) and numerous phyloge 1996). Groups of similar genes can also be identified with netically-related sequences from dicots and monocots, pair-wise BLAST analysis (Feng and Doolittle, 1987). For 45 increased biomass compared to control plants when these example, a dade of very similar MADS domain transcription sequences are overexpressed in plants. factors from Arabidopsis all share a common function in The polypeptides sequences belong to distinct clades or flowering time (Ratcliffe et al., 2001), and a group of very subclades of polypeptides that include members from diverse similar AP2 domain transcription factors from Arabidopsis species. Many of the G482 subclade member sequences are involved in tolerance of plants to freezing (Gilmour et al., 50 derived from both dicots and monocots that have been intro 1998). Analysis of groups of similar genes with similar func duced into plants have been shown to confer an accelerated tion that fall within one lade can yield Sub-sequences that are flowering time relative to control plants when the sequences particular to the clade. These Sub-sequences, known as con were overexpressed, particularly those most closely related to sensus sequences, can not only be used to define the G3397 or G3476 (SEQ ID NOs: 2 and 18, respectively). sequences within each lade, but define the functions of these 55 These studies each demonstrate, in accord with the teachings genes; genes within a lade may contain paralogous of Goodrich et al., 1993, Lin et al., 1991, and Sadowski et al., sequences, or orthologous sequences that share the same 1988, that evolutionarily conserved genes from diverse spe function (see also, for example, Mount, 2001). cies are likely to function similarly (i.e., by regulating similar Transcription factor gene sequences are conserved across target sequences and controlling the same traits), and that diverse eukaryotic species lines (Goodrich et al., 1993; Linet 60 polynucleotides from one species may be transformed into al., 1991; Sadowski et al., 1988). Plants are no exception to closely-related or distantly-related plant species to confer or this observation; diverse plant species possess transcription improve traits. factors that have similar sequences and functions. Speciation, At the nucleotide level, the sequences of the invention will the production of new species from a parental species, gives typically share at least about 30% or 40% nucleotide rise to two or more genes with similar sequence and similar 65 sequence identity, preferably at least about 50%, about 60%, function. These genes, termed orthologs, often have an iden about 70% or about 80% sequence identity, and more prefer tical function within their host plants and are often inter ably about 85%, about 90%, about 95% or about 97% or more US 7,868,229 B2 21 22 sequence identity to one or more of the listed full-length expectation (E) of 10, a cutoff of 100, M-5, N=-4, and a sequences, or to a region of a listed sequence excluding or comparison of both Strands. For amino acid sequences, the outside of the region(s) encoding a known consensus BLASTP program uses as defaults a wordlength (W) of 3, an sequence or consensus DNA-binding site, or outside of the expectation (E) of 10, and the BLOSUM62 scoring matrix region(s) encoding one or all conserved domains. The degen (see Henikoff & Henikoff, 1989). Unless otherwise indicated eracy of the genetic code enables major variations in the for comparisons of predicted polynucleotides, “sequence nucleotide sequence of a polynucleotide while maintaining identity” refers to the % sequence identity generated from a the amino acid sequence of the encoded protein. tblastx using the NCBI version of the algorithm at the default At the polypeptide level, the sequences of the invention settings using gapped alignments with the filter"off (see, for will typically share at least about 50%, about 60%, about 70% 10 example, internet website at www.ncbi.nlm.nih.gov/). or about 80% sequence identity, and more preferably at least Other techniques for alignment are described by Doolittle, about 85%, at least about 90%, at least about 93%, at least 1996. Preferably, an alignment program that permits gaps in about 94%, at least about 95%, at least about 96%, at least the sequence is utilized to align the sequences. The Smith about 97%, at least about 98%, or at least about 99% or 100% Waterman is one type of algorithm that permits gaps in sequence identity to one or more of the listed full-length 15 sequence alignments (see Shpaer, 1997). Also, the GAP pro sequences, or to a listed sequence but excluding or outside of gram using the Needleman and Wunsch alignment method the known consensus sequence or consensus DNA-binding can be utilized to align sequences. An alternative search strat site, or to a B domain (e.g., SEQID NOs: 31-43) of a sequence egy uses MPSRCH software, which runs on a MASPAR of the invention, said B domain being required for DNA computer. MPSRCH uses a Smith-Waterrnan algorithm to binding and subunit association. score sequences on a massively parallel computer. This Percent identity can be determined electronically, e.g., by approach improves ability to pick up distantly related using the MEGALIGN program (DNASTAR, Inc. Madison, matches, and is especially tolerant of Small gaps and nucle Wis.). The MEGALIGN program can create alignments otide sequence errors. Nucleic acid-encoded amino acid between two or more sequences according to different meth sequences can be used to search both protein and DNA data ods, for example, the clustal method (see, for example, Hig 25 bases. gins and Sharp, 1988. The clustal algorithm groups sequences The percentage similarity between two polypeptide into clusters by examining the distances between all pairs. sequences, e.g., sequence A and sequence B, is calculated by The clusters are aligned pairwise and then in groups. Other dividing the length of sequence A, minus the number of gap alignment algorithms or programs may be used, including residues in sequence A, minus the number of gap residues in FASTA, BLAST, or ENTREZ, FASTA and BLAST, and 30 sequence B, into the Sum of the residue matches between which may be used to calculate percent similarity. These are sequence A and sequence B, times one hundred. Gaps of low available as a part of the GCG sequence analysis package or of no similarity between the two amino acid sequences are (University of Wisconsin, Madison, Wis.), and can be used not included in determining percentage similarity. Percent with or without default settings. ENTREZ is available identity between polynucleotide sequences can also be through the National Center for Biotechnology Information. 35 counted or calculated by other methods known in the art, e.g., In one embodiment, the percent identity of two sequences can the Jotun Hein method (see, for example, Hein, 1990) Identity be determined by the GCG program with a gap weight of 1. between sequences can also be determined by other methods e.g., each amino acid gap is weighted as if it were a single known in the art, e.g., by varying hybridization conditions amino acid or nucleotide mismatch between the two (see US Patent Application No. 20010010913). sequences (see U.S. Pat. No. 6,262.333). 40 Thus, the invention provides methods for identifying a Software for performing BLAST analyses is publicly sequence similar or paralogous or orthologous or homolo available, e.g., through the National Center for Biotechnol gous to one or more polynucleotides as noted herein, or one or ogy Information (see internet website at www.ncbi.nlm. more target polypeptides encoded by the polynucleotides, or nih.gov/). This algorithm involves first identifying high scor otherwise noted herein and may include linking or associat ing sequence pairs (HSPs) by identifying short words of 45 ing a given plant phenotype or gene function with a sequence. length W in the query sequence, which either match or satisfy In the methods, a sequence database is provided (locally or some positive-valued threshold score T when aligned with a across an internet or intranet) and a query is made against the word of the same length in a database sequence. T is referred sequence database using the relevant sequences herein and to as the neighborhood word score threshold (Altschul, associated plant phenotypes or gene functions. 1990); Altschulet al., 1993). These initial neighborhood word 50 In addition, one or more polynucleotide sequences or one hits act as seeds for initiating searches to find longer HSPs or more polypeptides encoded by the polynucleotide containing them. The word hits are then extended in both sequences may be used to search against a BLOCKS (Bairoch directions along each sequence for as far as the cumulative et al., 1997), PFAM, and other databases which contain pre alignment score can be increased. Cumulative scores are cal viously identified and annotated motifs, sequences and gene culated using, for nucleotide sequences, the parameters M 55 functions. Methods that search for primary sequence patterns (reward score for a pair of matching residues; always >0) and with secondary structure gap penalties (Smith et al., 1992) as N (penalty score for mismatching residues; always <0). For well as algorithms such as Basic Local Alignment Search amino acid sequences, a scoring matrix is used to calculate Tool (BLAST: Altschul, 1990: Altschul et al., 1993), the cumulative score. Extension of the word hits in each BLOCKS (Henikoff and Henikoff, 1991), Hidden Markov direction are halted when: the cumulative alignment score 60 Models (HMM; Eddy, 1996: Sonnhamrnmer et al., 1997), falls off by the quantity X from its maximum achieved value: and the like, can be used to manipulate and analyze poly the cumulative score goes to Zero or below, due to the accu nucleotide and polypeptide sequences encoded by polynucle mulation of one or more negative-scoring residue alignments; otides. These databases, algorithms and other methods are or the end of either sequence is reached. The BLAST algo well known in the art and are described in Ausubel et al., rithm parameters W. T. and X determine the sensitivity and 65 1997, and in Meyers, 1995. speed of the alignment. The BLASTN program (for nucle A further method for identifying or confirming that specific otide sequences) uses as defaults a wordlength (W) of 11, an homologous sequences control the same function is by com US 7,868,229 B2 23 24 parison of the transcript profile(s) obtained upon overexpres Identifying Polynucleotides or Nucleic Acids by Hybrid sion or knockout of two or more related polypeptides. Since ization transcript profiles are diagnostic for specific cellular states, Polynucleotides homologous to the sequences illustrated one skilled in the art will appreciate that genes that have a in the Sequence Listing and tables can be identified, e.g., by highly similar transcript profile (e.g., with greater than 50% hybridization to each other under Stringent or under highly regulated transcripts in common, or with greater than 70% stringent conditions. Single stranded polynucleotides hybrid regulated transcripts in common, or with greater than 90% ize when they associate based on a variety of well character regulated transcripts in common) will have highly similar ized physical-chemical forces, such as hydrogen bonding, functions. Fowler and Thomashow, 2002, have shown that Solvent exclusion, base stacking and the like. The stringency 10 of a hybridization reflects the degree of sequence identity of three paralogous AP2 family genes (CBF1, CBF2 and CBF3) the nucleic acids involved, such that the higher the Stringency, are induced upon cold treatment, and each of which can the more similar are the two polynucleotide strands. Strin condition improved freezing tolerance, and all have highly gency is influenced by a variety of factors, including tempera similar transcript profiles. Once a polypeptide has been ture, salt concentration and composition, organic and non shown to provide a specific function, its transcript profile 15 organic additives, solvents, etc. present in both the becomes a diagnostic tool to determine whether paralogs or hybridization and wash solutions and incubations (and num orthologs have the same function. berthereof), as described in more detail in the references cited Furthermore, methods using manual alignment of below (e.g., Sambrook et al., 1989; Berger and Kimmel, sequences similar or homologous to one or more polynucle 1987; and Anderson and Young, 1985). otide sequences or one or more polypeptides encoded by the Encompassed by the invention are polynucleotide polynucleotide sequences may be used to identify regions of sequences that are capable of hybridizing to the claimed similarity and conserved domains characteristic of a particu polynucleotide sequences, including any of the polynucle lar transcription factor family. Such manual methods are otides within the Sequence Listing, and fragments thereof well-known of those of skill in the art and can include, for under various conditions of stringency (see, for example, example, comparisons of tertiary structure between a 25 Wahl and Berger, 1987; and Kimmel, 1987). In addition to the polypeptide sequence encoded by a polynucleotide that com nucleotide sequences listed in the Sequence Listing, full prises a known function and a polypeptide sequence encoded length cDNA, orthologs, and paralogs of the present nucle by a polynucleotide sequence that has a function not yet otide sequences may be identified and isolated using well determined. Such examples of tertiary structure may com known methods. The cDNA libraries, orthologs, and paralogs 30 of the present nucleotide sequences may be screened using prise predicted alpha helices, beta-sheets, amphipathic heli hybridization methods to determine their utility as hybridiza ces, leucine Zipper motifs, Zinc finger motifs, proline-rich tion target or amplification probes. regions, cysteine repeat motifs, and the like. With regard to hybridization, conditions that are highly Orthologs and paralogs of presently disclosed polypep stringent, and means for achieving them, are well known in tides may be cloned using compositions provided by the 35 the art. See, for example, Sambrook et al., 1989; Berger, 1987, present invention according to methods well known in the art. pages 467-469; and Anderson and Young, 1985. cDNAs can be cloned using mRNA from a plant cell or tissue Stability of DNA duplexes is affected by such factors as that expresses one of the present sequences. Appropriate base composition, length, and degree of base pair mismatch. mRNA sources may be identified by interrogating Northern Hybridization conditions may be adjusted to allow DNAs of blots with probes designed from the present sequences, after 40 different sequence relatedness to hybridize. The melting tem which a library is prepared from the mRNA obtained from a perature (T,) is defined as the temperature when 50% of the positive cell or tissue. Polypeptide-encoding cDNA is then duplex molecules have dissociated into their constituent isolated using, for example, PCR, using primers designed single strands. The melting temperature of a perfectly from a presently disclosed gene sequence, or by probing with matched duplex, where the hybridization buffer contains for a partial or complete cDNA or with one or more sets of 45 mamide as a denaturing agent, may be estimated by the fol degenerate probes based on the disclosed sequences. The lowing equations: cDNA library may be used to transform plant cells. Expres sion of the cDNAs of interest is detected using, for example, microarrays, Northern blots, quantitative PCR, or any other 0.62(% formamide)-500/L (I) DNA-DNA technique for monitoring changes in expression. Genomic 50 clones may be isolated using similar techniques to those. (%G+C)?-0.5(% formamide)-820/L (II) DNA-RNA Examples of orthologs of the Arabidopsis polypeptide sequences and their functionally similar orthologs are listed in FIGS. 4A-4F and Tables 2 and 3, and the Sequence Listing. (%G+C)?–0.35(% formamide)-820/L (III) RNA-RNA In addition to the sequences in FIGS. 4A-4F and Tables 2 and 55 where L is the length of the duplex formed, Na+ is the 3 and the Sequence Listing, the invention encompasses iso molar concentration of the sodium ion in the hybridization or lated nucleotide sequences that are phylogenetically and washing solution, and % G+C is the percentage of (guanine-- structurally similar to sequences listed in the Sequence List cytosine) bases in the hybrid. For imperfectly matched ing) and can function in a plant by accelerating time to flow hybrids, approximately 1°C. is required to reduce the melting ering when ectopically expressed in a plant. 60 temperature for each 1% mismatch. Since a significant number of these sequences are phylo Hybridization experiments are generally conducted in a genetically and sequentially related to each other and have buffer of pH between 6.8 to 7.4, although the rate of hybrid been shown to accelerate flowering time, one skilled in the art ization is nearly independent of pH at ionic strengths likely to would predict that other similar, phylogenetically related be used in the hybridization buffer (Anderson and Young, sequences (found in the box in FIG. 3) derived from the same 65 1985). In addition, one or more of the following may be used ancestral sequence would also perform similar fitnctions to reduce non-specific hybridization: Sonicated Salmon sperm when ectopically expressed. DNA or another non-complementary DNA. bovine serum US 7,868,229 B2 25 26 albumin, Sodium pyrophosphate, Sodium dodecylsulfate sulfate (SDS) and ionic strength, are well known to those (SDS), polyvinyl-pyrrolidone, ficolland Denhardt’s solution. skilled in the art. Various levels of stringency are accom Dextran sulfate and polyethylene glycol 6000 act to exclude plished by combining these various conditions as needed. DNA from solution, thus raising the effective probe DNA The washing steps that follow hybridization may also vary concentration and the hybridization signal within a given unit in stringency; the post-hybridization wash steps primarily of time. In some instances, conditions of even greater Strin determine hybridization specificity, with the most critical gency may be desirable or required to reduce non-specific factors being temperature and the ionic strength of the final and/or background hybridization. These conditions may be wash Solution. Wash stringency can be increased by decreas created with the use of higher temperature, lower ionic ing salt concentration or by increasing temperature. Stringent strength and higher concentration of a denaturing agent Such 10 salt concentration for the wash steps will preferably be less as formamide. thanabout 30 mMNaCl and 3 mMtrisodium citrate, and most Stringency conditions can be adjusted to screen for mod preferably less than about 15 mM NaCl and 1.5 mM triso erately similar fragments such as homologous sequences dium citrate. from distantly related organisms, or to highly similar frag Thus, hybridization and wash conditions that may be used ments such as genes that duplicate functional enzymes from 15 to bind and remove polynucleotides with less than the desired closely related organisms. The stringency can be adjusted homology to the nucleic acid sequences or their complements either during the hybridization step or in the post-hybridiza that encode the present polypeptides include, for example: tion washes. Salt concentration, formamide concentration, 6xSSC at 65° C.: hybridization temperature and probe lengths are variables 50% formamide, 4xSSC at 42°C.; or that can be used to alter stringency (as described by the 0.5xSSC, 0.1% SDS at 65° C.; formula above). As a general guidelines high Stringency is with, for example, two wash steps of 10-30 minutes each. typically performed at T-5° C. to T-20° C. moderate Useful variations on these conditions will be readily apparent stringency at T-20°C. to T-35°C. and low stringency at to those skilled in the art. T-35°C. to T-50° C. for duplex >150 base pairs. Hybrid A person of skill in the art would not expect substantial ization may be performed at low to moderate stringency (25 25 variation among polynucleotide species encompassed within 50° C. below T), followed by post-hybridization washes at the scope of the present invention because the highly stringent increasing stringencies. Maximum rates of hybridization in conditions set forth in the above formulae yield structurally solution are determined empirically to occurat T-25°C. for similar polynucleotides. DNA-DNA duplex and T-15° C. for RNA-DNA duplex. If desired, one may employ wash steps of even greater Optionally, the degree of dissociation may be assessed after 30 stringency, including about 0.2xSSC, 0.1% SDS at 65° C. and each wash step to determine the need for Subsequent, higher washing twice, each wash step being about 30 minutes, or Stringency wash steps. about 0.1xSSC, 0.1% SDS at 65° C. and washing twice for 30 High stringency conditions may be used to select for minutes. The temperature for the wash solutions will ordi nucleic acid sequences with high degrees of identity to the narily be at least about 25°C., and for greater stringency at disclosed sequences. An example of stringent hybridization 35 least about 42°C. Hybridization stringency may be increased conditions obtained in a filter-based method such as a South further by using the same conditions as in the hybridization ern or Northern blot for hybridization of complementary steps, with the wash temperature raised about 3°C. to about nucleic acids that have more than 100 complementary resi 5°C., and stringency may be increased even further by using dues is about 5° C. to 20° C. lower than the thermal melting the same conditions except the wash temperature is raised point (T) for the specific sequence at a defined ionic strength 40 about 6° C. to about 9° C. For identification of less closely and pH. Conditions used for hybridization may include about related homologs, wash steps may be performed at a lower 0.02 M to about 0.15M sodium chloride, about 0.5% to about temperature, e.g., 50° C. 5% casein, about 0.02% SDS or about 0.1% N-laurylsar An example of a low stringency wash step employs a cosine, about 0.001 M to about 0.03 M sodium citrate, at solution and conditions of at least 25°C. in 30 mM NaCl, 3 hybridization temperatures between about 50° C. and about 45 mM trisodium citrate, and 0.1% SDS over 30 minutes. 70° C. More preferably, high stringency conditions are about Greater stringency may be obtained at 42°C. in 15 mMNaCl, 0.02 M sodium chloride, about 0.5% casein, about 0.02% with 1.5 mM trisodium citrate, and 0.1% SDS over 30 min SDS, about 0.001M sodium citrate, at a temperature of about utes. Even higher stringency wash conditions are obtained at 50° C. Nucleic acid molecules that hybridize under stringent 65°C.-68°C. in a solution of 15 mMNaCl, 1.5 mM trisodium conditions will typically hybridize to a probe based on either 50 citrate, and 0.1% SDS. Wash procedures will generally the entire DNA molecule or selected portions, e.g., to a employ at least two final wash steps. Additional variations on unique Subsequence, of the DNA. these conditions will be readily apparent to those skilled in the Stringent salt concentration will ordinarily be less than art (see, for example, US Patent Application No. about 750 mM NaCl and 75 mM trisodium citrate. Increas 20010010913). ingly stringent conditions may be obtained with less than 55 Stringency conditions can be selected Such that an oligo about 500 mM. NaCl and 50 mM trisodium citrate, to even nucleotide that is perfectly complementary to the coding oli greater stringency with less than about 250 mM NaCl and 25 gonucleotide hybridizes to the coding oligonucleotide with at mM trisodium citrate. Low stringency hybridization can be least about a 5-10x higher signal to noise ratio than the ratio obtained in the absence of organic solvent, e.g., formamide, for hybridization of the perfectly complementary oligonucle whereas high Stringency hybridization may be obtained in the 60 otide to a nucleic acid encoding a polypeptide known as of the presence of at least about 35% formamide, and more prefer filing date of the application. It may be desirable to select ably at least about 50% formamide. Stringent temperature conditions for a particular assay Such that a higher signal to conditions will ordinarily include temperatures of at least noise ratio, that is, about 15x or more, is obtained. Accord about 30° C., more preferably of at least about 37° C., and ingly, a Subject nucleic acid will hybridize to a unique coding most preferably of at least about 42° C. with formamide 65 oligonucleotide with at least a 2x or greater signal to noise present. Varying additional parameters, such as hybridization ratio as compared to hybridization of the coding oligonucle time, the concentration of detergent, e.g., Sodium dodecyl otide to a nucleic acid encoding known polypeptide. The US 7,868,229 B2 27 28 particular signal will depend on the label used in the relevant domain component, the transcription factors of the invention assay, e.g., a fluorescent label, a colorimetric label, a radio could be expressed by Super-transforming or crossing in a active label, or the like. Labeled hybridization or PCR probes second construct carrying a Sulphonamide resistance select for detecting related polynucleotide sequences may be pro able marker and the transcription factor polynucleotide of duced by oligolabeling, nick translation, end-labeling, or interest cloned behind a Lex A operator site (opLex A:TF). PCR amplification using a labeled nucleotide. For example, the two constructs P6506 (35S::LexA Encompassed by the invention are polynucleotide GAL4TA: SEQID NO. 59) and P5072 (opLex A::G482: SEQ sequences that are capable of hybridizing to the claimed ID NO. 58) together constituted a two-component system for polynucleotide sequences, including any of the polynucle expression of G482 from the 35S promoter. A kanamycin otides within the Sequence Listing, and fragments thereof 10 resistant transgenic line containing P6506 was established, under various conditions of stringency (see, for example, and this was then supertransformed with the P5072 construct Wahl and Berger, 1987, pages 399-407; and Kimmel, 1987). containing a genomic clone of G482 and a Sulfonamide resis In addition to the nucleotide sequences in the Sequence List tance marker. For each transcription factor that was overex ing, full length cDNA, orthologs, and paralogs of the present pressed with a two component system, the second construct nucleotide sequences may be identified and isolated using 15 carried a sulfonamide selectable marker and was contained well-known methods. The cDNA libraries, orthologs, and within vector backbone pMEN53. paralogs of the present nucleotide sequences may be screened For the present study, the opLex A::TF constructs prepared using hybridization methods to determine their utility as and used to supertransform plants are listed in Table 1. These hybridization target or amplification probes. constructs were used to generate lines of transgenic Arabi dopsis plants constitutively overexpressing the G482 Sub EXAMPLES clade polypeptides. Compilations of the sequences of pro moter fragments and the expressed transgene sequences It is to be understood that this invention is not limited to the within the PIDs are provided in the Sequence Listing. particular devices, machines, materials and methods described. Although particular embodiments are described, 25 TABLE 1 equivalent embodiments may be used to practice the inven tion. G482 Subclade polynucleotide constructs The invention, now being generally described, will be more SEQ ID readily understood by reference to the following examples, Gene Construct NO: Construct Projec which are included merely for purposes of illustration of 30 Identifier (PID) of PID components type certain aspects and embodiments of the present invention and OS.G.3397 P21265 46 35S::G3397 Direc are not intended to limit the invention. It will be recognized by OOe one of skill in the art that a polypeptide that is associated with fusion a particular first trait may also be associated with at least one Zm G3435 P21314 47 3SS::G343S Direc 35 OOe other, unrelated and inherent second trait which was not pre fusion dicted by the first trait. Zm G3436 P21315 48 3SS::G3436 Direc OOe Example I fusion OS.G.3398 P21252 49 3SS::G3398 Direc OOe Project Types and Vector and Cloning Information 40 fusion GmiG3474 P21344 50 3SS::G3474 Direc A number of constructs were used to modulate the activity OOe fusion of sequences of the invention. An individual project was GnfG3478 P21350 51 3SS::G3478 Direc defined as the analysis of lines for a particular construct (for OOe example, this might include plant lines that constitutively 45 fusion overexpress G482 or another subclade polypeptide). Gener GnfG3475 P21347 52 35S::G3475 Direc ally, a full-length wild-type version of a gene or its cDNA was OOe directly fused to a promoter that drove its expression in trans fusion genic plants, except as noted in Table 1. Such a promoter At G485 P1441 53 3SS::G48S Direc OOe could be the native promoter of that gene, or the CaMV 35S 50 fusion promoter which drives constitutive expression. Alternatively, GmiG3476 P21345 S4 3SS::G3476 Direc a promoter that drives tissue specific or conditional expres OOe sion could be used in similar studies. A direct fusion approach fusion has the advantage of allowing for simple genetic analysis if a GmiG3472 P21348 55 3SS::G3472 Direc given promoter-polynucleotide line is to be crossed into dif OOe ferent genetic backgrounds at a later date. 55 fusion Zn, G3876 P25657 56 3SS::G3876 Direc As an alternative to plant transformation with a direct OOe fusion construct, Some plant lines were transformed with a fusion two component expression system in which a kanamycin GnfG3875 P26.609 57 35S::G3875 Direc resistant 35S::LeXA-GAL4-TA driver line was established OOe fusion and then Supertransformed with an opLex A::transcription 60 At G482 P5072 58 and 59 opLex A::G482 Two factor construct carrying a Sulfonamide resistance gene for (with P6506) component each of the transcription factors of interest. Supertrans ormation The first component vector, the “driver vector or construct LexA-GALATA P6506 59 35S::LexA- Driver (P6506) contained a transgene carrying a 35S::Lex A-GAL4 in driver GAL4TA construct transactivation domain (TA) (SEQID NO: 59) along with a 65 construct kanamycin resistance selectable marker. Having established a driver line containing the 35S:LexA-GAL4-transactivation US 7,868,229 B2 29 30 Example II first two true leaves, and were easily distinguished from bleached kanamycin or Sulfonamide-susceptible seedlings. Transformation Resistant seedlings were then transferred onto soil (Sunshine potting mix). Following transfer to soil, trays of seedlings Transformation of Arabidopsis was performed by an Agro were covered with plastic lids for 2-3 days to maintain humid bacterium-mediated protocol based on the method of Bech ity while they became established. Plants were grown on soil told and Pelletier, 1998. Unless otherwise specified, all underfluorescent light at an intensity of 70-95 microEinsteins experimental work was done using the Columbia ecotype. and a temperature of 18–23°C. Light conditions consisted of Plant preparation. Arabidopsis seeds were sown on mesh a 24-hour photoperiod unless otherwise stated. In instances covered pots. The seedlings were thinned so that 6-10 evenly 10 where alterations in flowering time were apparent, flowering spaced plants remained on each pot 10 days after planting. time was re-examined under both 12-hour and 24-hour light The primary bolts were cut off a week before transformation to assess whether the phenotype was photoperiod dependent. to break apical dominance and encourage auxiliary shoots to Under our 24-hour light growth conditions, the typical gen form. Transformation was typically performed at 4-5 weeks eration time (seed to seed) was approximately 14 weeks. after sowing. 15 Because many aspects of Arabidopsis development are Bacterial culture preparation. Agrobacterium stocks were dependent on localized environmental conditions, in all cases inoculated from single colony plates or from glycerol Stocks plants were evaluated in comparison to controls in the same and grown with the appropriate antibiotics and grown until flat. Controls for transgenic lines were wild-type plants or saturation. On the morning of transformation, the Saturated transgenic plants harboring an empty transformation vector cultures were centrifuged and bacterial pellets were re-sus selected on kanamycin or Sulfonamide. Plants were macro pended in Infiltration Media (0.5xMS, 1x B5 Vitamins, 5% scopically evaluated while growing on soil. For a given sucrose, 1 mg/ml benzylaminopurine riboside, 200 ul/L Sil project (for example, a particular promoter-gene combina wet L77) until an A600 reading of 0.8 was reached. tion), ten transformed lines were typically examined. Transformation and seed harvest. The Agrobacterium solu tion was poured into dipping containers. All flower buds and 25 Example IV rosette leaves of the plants were immersed in this solution for 30 seconds. The plants were laid on their side and wrapped to Data Collection keep the humidity high. The plants were kept this way over night at 4° C. and then the pots were turned upright, Phenotypic Analysis: Flowering time. Flowering time unwrapped, and moved to the growth racks. 30 analysis was conducted with transformed or control Arabi The plants were maintained on the growth rack under dopsis plants grown in soil. Plants exhibiting modulated onset 24-hour light until seeds were ready to be harvested. Seeds of flower development relative to the controls were readily were harvested when 80% of the siliques of the transformed identifiable by visual observation and could be selected on plants were ripe (approximately 5 weeks after the initial that basis. Flowering time was determined based on either or transformation). This putatively transgenic seed was deemed 35 both of (i) number to days after planting at which the first TO seed, since it was obtained from the TO generation, and visible flower bud was observed; and (ii) the total number of was later plated on selection plates (either kanamycin or leaves (rosette or rosette plus cauline) produced by the pri sulfonamide). Resistant plants that were identified on such mary shoot meristem. selection plates comprised the T1 generation, and could be Measurement of yield. Yield of transformed crop species used to produce transgenic seed comprising the expression 40 and other non-Arabidopsis plants may be recorded as bushels vector encoding a G482 Subclade transcription factor per acre, or number of harvested fruit, or weight of harvested polypeptide. fruit, and compared to, for example, a non-transformed parental control line or a control line transformed with an Example III empty vector that does not comprise a G482 subclade poly 45 nucleotide. Yield data may be averaged across multiple loca Morphology tions. Thus, transgenic plants transformed with a member of the G482 subclade of polypeptides can show early flowering, Morphological analysis was performed to determine more harvests per growing season, and/or increased yield whether changes in polypeptide levels affect plant growth and relative to the yield exhibited by control plants. development. This was primarily carried out on the T1 gen 50 eration, when at least 10-20 independent lines were exam Example V ined. However, in cases where a phenotype required confir mation or detailed characterization, plants from Subsequent Transcription Factor Polynucleotide and Polypeptide generations were also analyzed. Sequences of the Invention, and Results Obtained Primary transformants were selected on MS medium with 55 with Plants Overexpressing these Sequences 0.3%. Sucrose and 50 mg/lkanamycin. T2 and later generation plants were selected in the same manner, except that kana Table 2 and Table 3 show the polypeptides identified by mycin was used at 35 mg/l. In cases where lines carry a SEQ ID NO; Gene ID (GID) No.: the transcription factor Sulfonamide marker (as in all lines generated by Super-trans family to which the polypeptide belongs, and conserved B formation), seeds were selected on MS medium with 0.3% 60 domains of the polypeptide. The first column shows the sucrose and 1.5 mg/l sulfonamide. KO lines were usually polypeptide SEQ ID NO; the second column the species germinated on plates without a selection. Seeds were cold (abbreviated) and identifier (GID or “Gene IDentifier); the treated (stratified) on plates for three days in the dark (in order third column shows percentage identity of each sequence to to increase germination efficiency) prior to transfer to growth the G3397 protein (the number of identical residues per the cabinets. Initially, plates were incubated at 22°C. under a 65 total number of residues in the subsequence used by the light intensity of approximately 100 microEinsteins for 7 BLASTp algorithm for comparison appears in parentheses), days. At this stage, transformants were green, possessed the the fourth column shows the B domain of each sequence; the US 7,868,229 B2 31 32 fifth column lists each SEQ ID NO: of the respective B domains in Tables 2 and 3, respectively. G3397 and G3476 domains, the six column shows the amino acid coordinates of are two of the sequences in the G482 subclade shown to the conserved B domains that were used to determine per confer accelerated flowering in overexpressing plants. centage identity of the conserved B domains to the G3397 and G3476 B domains (Tables 2 and 3, respectively); the seventh Homologies of sequences listed in Tables 2 and 3 were column shows the percentage identity of each of the B determined after aligning the sequences using the methods of domains to the G3397 and G3476 B domains (Tables 2 and 3. Smith and Waterman, 1981. After alignment, sequence com respectively; the number of identical residues per the total parisons between the polypeptides were performed by com number of residues in the subsequence used by the BLASTp parison over a comparison window to identify and compare algorithm for comparison appears in parentheses), and the 10 local regions of sequence similarity. A description of the eighth column identifies by a plus sign (+) sequences that, method is provided in Ausubel et al. (eds.) Current Protocols when overexpressed in plants, produced at least some lines in Molecular Biology, John Wiley & Sons (1997 and supple that were visibly earlier in their flower development than ments through 2001). Altschul et al., 1990, and Gish and control plants. The sequences are arranged in descending States, 1993. The percentage identity reported in these tables order of percentage identity to the G3397 and G3476 B is based on the comparison within these windows.

TABL 2 Conserved sequences and functions for TF's closely related to G3397

Col. 7 Col. 2 & ID to Co. 8 Species/ Col. 3 Col. 6 B OEs Col. 1 GID No. & ID to Co. 5 Amino domain flowered SEQ Accession G3397 SEO ID Acid of G3397 earlier ID No. or (SEQ ID Col. 4 NO: of B Coordinates (SEQ ID than NO: Identifier NO; 2) B Domain domain of B domain NO: 31) controls

2 Os/G3397 1 OO& REODRFLPIANVSRIMKKA 31 23-113 1 OO& -- (219/219) LPANAKISKDAKETVOEC (91/91) WSEFISFITGEASDKCOREK RKTINGDDLLWAMTTLGF EDYWDPLKHYLHKFRE

4. Zm/G3435 8O3. REODRFLPIANVSRIMKKA 32 22-112 982 -- (182/226) LPANAKISKDAKETVQEC (9 O/91) WSEFISFITGEASDKCOREK RKTINGDDLLWAMTTLGF EDYWEPLKHYLHKFRE

6 Zm/G3436 7 O3. REODRFLPIANVSRIMKKA 33 2O-110 973 -- (154/219) LPANAKISKDAKETVOEC (89/91) WSEFISFITGEASDKCOREK RKTINGDDLLWAMTTLGF EDYWEPLKLYLHKFRE

8 Os AG3398 683. REODRFLPIANVSRIMKRA 34 21-111 96.3 -- (158/229) LPANAKISKDAKETVOEC (88/91) WSEFISFITGEASDKCOREK RKTINGDDLLWAMTTLGF EDYIDPLKLYLLIKFRE

10 Gm/G3474 583 REODRFLPIANVSRIMKKA 35 25-115 953 -- (124/211) LPANAKISKEAKETVQECV (87/91) SEFISFITGEASDKCOKEKR KTINGDDLLWAMTTLGFE DYWDPLKIYLHKYRE

12 Gm/G3478 63& REODRFLPIANVSRIMKKA 36 23-113 953 -- (129/2O2) LPANAKISKDAKETVQEC (87/91) WSEFISFITGEASDKCOREK RKTINGDDLLWAMTTLGF EDYWEPLKGYLORFRE

14 Gm/G3475 62& REODRFLPIANVSRIMKKA 37 23-113 953 -- (127/2O2) LPANAKISKDAKETVOEC (87/91) WSEFISFITGEASDKCOREK RKTINGDDLLWAMTTLGF EDYWEPLKGYLORFRE

16 At ?c485 64& REODRFLPIANVSRIMKKA 38 2O-110 953 -- (119/185) LPANAKISKDAKETVOEC (87/91) WSEFISFITGEASDKCOREK RKTINGDDLLWAMTTLGF EDYWEPLKWYLOKYRE US 7,868,229 B2 33 34

TABLE 2 - continued

Conserved sequences and functions for TF's closely related to G3397

Col. 7 Col. 2 & ID to Co. 8 Species / Col. 3 Col. 6 B OEs Col. 1 GID No. & ID to Co. 5 Amino domain flowered SEQ Accession G3397 SEQ ID Acid of G3397 earlier ID No. or (SEQ ID Col. 4 NO: of B Coordinates (SEQ ID than NO: Identifier NO; 2) B Domain domain of B domain NO: 31) controls

28 Pp/G3870 62& REODRFLPIANVSRIMKKA 44 34 - 124 943. n/d (113/182) LPSNAKISKDAKETVOECW (86/91) SEFISFITGEASDKCOREKR KTINGDDLLWAMSTLGFE DYWEPLKWYLHKYRE

3O Pp/G3868 61& REODRFLPIANVSRIMKKA 45 34 - 124 943. n/d (113/184) LPSNAKISKDAKETVOECW (86/91) SEFISFITGEASDKCOREKR KTINGDDLLWAMSTLGFE DYWEPLKWYLHKYRE

18 Gm/G3476 773 REODRFLPIANVSRIMKKA 39 26 - 116 943. -- (106/137) LPANAKISKDAKETVOEC (86/91) WSEFISFITGEASDKCOREK RKTINGDDLLWAMTTLGF EEY WEPLKIYLORFRE

2O Gm/G3472 sf3. REODRFLPIANVSRIMKKA 4 O 25-115 933. +/- (124/216) LPANAKISKEAKETVQECV (85/91) SEFISFITGEASDKCOKEKR KTINGDDLLWAMTTLGFE EYWEPLKWYLHKYRE

22 Zm/G3876 61& REODRFLPIANISRIMKKAI 41 3 O - 12O 873 +/- (1O2/165) PANGKIAKDAKETVOECW (80A91) SEFISFITSEASDKCOREKR KTINGDDLLWAMATLGFE DYIEPLKWYLOKYRE

24 Gm/G3875 sf3. REODRYLPIANISRIMKKA 42 25-115 873 +/- (98/17O) LPANGKIAKDAKETVOEC (80A91) WSEFISFITSEASDKCOREK RKTINGDDLLWAMATLGF EDYIDPLKIYLTRYRE

TABLE 3 Conserved sequences and functions for TFs closely related to G3476

Col. 7 Col. 2 & ID to Co. 8 Species / Col. 3 Col. 6 B OEs Col. 1 GID No. & ID to Co. 5 Amino domain flowered SEQ Accession G3476 SEQ ID Acid of G3476 earlier ID No. or (SEQ ID Col. 4 NO: of B Coordinates (SEQ ID than NO: Identifier NO: 18) B Domain domain of B domain NO: 39) controls

18 Gm/G3476 10O3. REODRFLPIANVSRIMKKA 39 26 - 116 10O3. -- (165/165) LPANAKISKDAKETVOEC (91/91) WSEFISFITGEASDKCOREK RKTINGDDLLWAMTTLGF EEY WEPLKIYLORFRE

12 Gm/G3478 7 O3. REODRFLPIANVSRIMKKA 36 23-113 973 -- (137/194) LPANAKISKDAKETVOEC (89/91) WSEFISFITGEASDKCOREK RKTINGDDLLWAMTTLGF EDYWEPLKGYLORFRE

US 7,868,229 B2 37 38

TABLE 3 - continued Conserved sequences and functions for TFs closely related to G3476

Col. 7 Col. 2 & ID to Co. 8 Species / Col. 3 Col. 6 B OEs Col. 1 GID No. & ID to Co. 5 Amino domain flowered SEQ Accession G3476 SEQ ID Acid of G3476 earlier ID No. or (SEQ ID Col. 4 NO: of B Coordinates (SEQ ID than NO: Identifier NO: 18) B Domain domain of B domain NO: 39) controls 24 Gm/G3875 683. REODRYLPIANISRIMKKA 42 25-115 873 +/- (99/144) LPANGKIAKDAKETVOEC (80A91) WSEFISFITSEASDKCOREK RKTINGDDLLWAMATLGF EDYIDPLKIYLTRYRE

Abbreviations for Tables 2 and 3 At Arabidopsis thaliana Gim Glycine max Os Oryza sativa Pp Physcomitrella patens Z. Zea mays OEs transformed plants overexpressing the sequence in Column 1 n/d transformation and testing not yet performed +/- some lines flowered earlier, some later than Control plants

As seen in Tables 2 and 3, almost all of the G482subclade biological activity to the present polypeptide sequences, thus polypeptides that have been tested to date accelerated flow being members of the G482 subclade polypeptides, are ering time in transgenic plant lines overexpressing these encompassed by the invention. Conserved B domains of the sequences. These sequences were derived from diverse spe G482 subclade of transcription factor polypeptides are cies of monocots and dicots, indicating evolutionary conser examples of domains comprising Subunit association and vation of both structure and function. Since very diverse 30 DNA binding domains and are required for conferring similar species of plants have retained sequences that function simi functions in the transcription factors of the invention. Over larily in other species, it is highly likely that a great many expression in a transformed plant of a polypeptide that com sequences may be found in many less evolutionary distant prises a G482 subclade CCAAT-binding B domain of the plants that function similarly. Conservation of function also invention results in the transformed plant having an early strongly suggests that the pathways accelerating flowering 35 flowering time, as compared to a control plant. time are also conserved, and thus G482 subclade members are Exemplary fragments of the sequences of the invention expected to function in many diverse species. include central conserved B domains listed in Tables 2 and 3, SEQID NO: 31-43, including, for example, amino acid resi Sequences that fall within the scope of the present claims dues 23-113 of rice G3397 (SEQ ID NO: 31), amino acid but which have not yet been introduced into transgenic plants 40 and tested for their ability to accelerate flowering compared to residues 26-116 of soy G3476 (SEQ ID NO:39) or amino control plants are expected also shorten the time when to acid residues 20-110 of maize G3436 (SEQID NO:33). flowering when the sequences are overexpressed. Example VI Thus, transgenic plants that are transformed with an expression vector comprising a G482 subclade polynucle 45 Utilities of G482 Subclade Sequences otide the encodes a G482 subclade polypeptide, wherein the latter comprises a conserved B domain at least at least about Based on the data obtained in the above-disclosed 75% amino acid sequence identity, or at least about 78% Examples, accelerated flowering time of G482 subclade over amino acid sequence identity, or at least about 80% amino expressors indicates that G482-related sequence overexpres acid sequence identity, or at least about 81% amino acid 50 sion may allow more than one planting and harvest of a crop sequence identity, or at least about 82% amino acid sequence to be made within a single season. In commercial species identity, or at least about 83% amino acid sequence identity, where the reproductive parts of the plants constitute the crop or at least about 84% amino acid sequence identity, or at least and the vegetative tissues are discarded, it can be advanta about 85% amino acid sequence identity, or at least about geous to accelerate the time to flowering. Accelerating flow 86% amino acid sequence identity, or at least about 87% 55 ering can also shorten crop and tree breeding programs. Addi amino acid sequence identity, or at least about 91% amino tionally, in Some instances, a faster generation time would acid sequence identity, or at least about 93% amino acid allow additional harvests of a crop to be made within a given sequence identity, or at least about 94% amino acid sequence growing season, or allow a crop to be harvested Sooner, identity, or at least about 95% amino acid residue sequence thereby avoiding damage from drought or low temperature identity, or at least about 96% amino acid sequence identity, 60 later in the season. It is also envisaged that transcription or at least about 97% amino acid sequence identity, or at least factors of the G482 subclade will have utility as tools for about 98% amino acid sequence identity, or at least about regulated induction offlowering; for example, a crop contain 98% amino acid sequence identity, or at least about 99% ing a G481-related sequence controlled via a chemically or amino acid sequence identity, to a conserved B domain of a conditionally inducible expression system, could be synchro polypeptide of the invention (e.g., SEQ ID NOs: 31-43). 65 nized to flower at a desired time by inducing the expression of Sequences that possess or encode for B domains that meet the G482-related sequence. Furthermore, some winter vari these criteria of percentage identity, and that have comparable eties of crops require a long period of cold (vernalization) to US 7,868,229 B2 39 40 trigger the onset of flowering and fruit production; expression Koornneef et al., 1986, and in U.S. Pat. No. 6,613,962, the of G482-related sequences could obviate the need for such latter method described in briefhere. Eight day old cotyledon treatmentS. explants are precultured for 24 hours in Petri dishes contain ing a feeder layer of Petunia hybrida suspension cells plated Example VII on MS medium with 2% (w/v) sucrose and 0.8% agar supple Transformation of Dicots to Produce Improved Traits mented with 10 uM C.-naphthalene acetic acid and 4.4 uM such as Accelerated Flowering Time 6-benzylaminopurine. The explants are then infected with a diluted overnight culture of Agrobacterium tumefaciens con Crop species that overexpress polypeptides of the inven 10 taining an expression vector comprising a polynucleotide of tion may produce plants with earlier flowering time than the invention for 5-10 minutes, blotted dry on sterile filter plants not transformed with a sequence of the invention. Thus, paper and cocultured for 48 hours on the original feeder layer polynucleotide sequences listed in the Sequence Listing plates. Culture conditions are as described above. Overnight recombined into, for example, one of the expression vectors cultures of Agrobacterium tumefaciens are diluted in liquid of the invention, or another Suitable expression vector, may be 15 transformed into a plant for the purpose of modifying plant MS medium with 2% (w/v/) sucrose, pH 5.7) to an ODoo of traits for the purpose of improving yield and/or quality. The O8. expression vector may contain a constitutive, tissue-specific Following cocultivation, the cotyledon explants are trans or inducible promoter operably linked to the polynucleotide. ferred to Petri dishes with selective medium comprising MS The cloning vector may be introduced into a variety of plants medium with 4.56 uM Zeatin, 67.3 uM Vancomycin, 418.9 by means well known in the art such as, for example, direct uM cefotaxime and 171.6 uM kanamycin sulfate, and cul DNA transfer or Agrobacterium tumefaciens-mediated trans tured under the culture conditions described above. The formation. It is now routine to produce transgenic plants explants are subcultured every three weeks onto fresh using most dicot plants (see Weissbach and Weissbach, 1989; medium. Emerging shoots are dissected from the underlying Gelvin et al., 1990; Herrera-Estrella et al., 1983; Bevan, 1984; 25 and Klee, 1985). Methods for analysis of traits are routine in callus and transferred to glass jars with selective medium the art and examples are disclosed above. without Zeatin to form roots. The formation of roots in a Numerous protocols for the transformation of tomato and kanamycin Sulfate-containing medium is a positive indica soy plants have been previously described, and are well tion of a successful transformation. known in the art. Gruber et al., 1993, in Glick and Thompson, 30 Transformation of soybean plants may be conducted using 1993 describe several expression vectors and culture methods the methods found in, for example, U.S. Pat. No. 5,563,055 that may be used for cell or tissue transformation and subse (Townsendet al., issued Oct. 8, 1996), described inbriefhere. quent regeneration. For soybean transformation, methods are In this method soybean seed is surface sterilized by exposure described by Miki et al., 1993; and U.S. Pat. No. 5,563,055, to chlorine gas evolved in a glass bell jar. Seeds are germi (Townsend and Thomas), issued Oct. 8, 1996. 35 nated by plating on 1/10 strength agar Solidified medium There are a substantial number of alternatives to Agrobac without plant growth regulators and culturing at 28°C. with a terium-mediated transformation protocols, other methods for 16 hour day length. After three or four days, seed may be the purpose of transferring exogenous genes into Soybeans or prepared for cocultivation. The seedcoat is removed and the tomatoes. One Such method is microprojectile-mediated elongating radicle removed 3-4 mm below the cotyledons. transformation, in which DNA on the surface of microprojec 40 tile particles is driven into plant tissues with a biolistic device Overnight cultures of Agrobacterium tumefaciens harbor (see, for example, Sanford et al., 1987: Christou et al., 1992: ing the expression vector comprising a polynucleotide of the Sanford, 1993; Klein et al., 1987; U.S. Pat. No. 5,015,580 invention are grown to log phase, pooled, and concentrated by (Christou et al), issued May 14, 1991; and U.S. Pat. No. centrifugation. Inoculations are conducted in batches Such 5,322,783 (Tomes et al.), issued Jun. 21, 1994). 45 that each plate of seed was treated with a newly resuspended Alternatively, Sonication methods (see, for example, pellet of Agrobacterium. The pellets are resuspended in 20 ml Zhang et al., 1991); direct uptake of DNA into protoplasts inoculation medium. The inoculum is poured into a Petridish using CaCl2 precipitation, polyvinyl alcohol or poly-L-orni containing prepared seed and the cotyledonary nodes are thine (see, for example, Hain et al., 1985; Draper et al., 1982); macerated with a surgical blade. After 30 minutes the explants liposome or spheroplast fusion (see, for example, Deshayes et 50 are transferred to plates of the same medium that has been al., 1985; Christou et al., 1987); and electroporation of pro solidified. Explants are embedded with the adaxial side up toplasts and whole cells and tissues (see, for example, Donn and level with the surface of the medium and cultured at 22° et al., 1990; DHalluin et al., 1992; and Spencer et al., 1994) C. for three days under white fluorescent light. These plants have been used to introduce foreign DNA and expression may then be regenerated according to methods well estab vectors into plants. 55 lished in the art, such as by moving the explants after three After a plant or plant cell is transformed (and the latter days to a liquid counter-selection medium (see U.S. Pat. No. regenerated into a plant), the transformed plant may be 5,563,055). crossed with itself or a plant from the same line, a non transformed or wild-type plant, or another transformed plant The explants may then be picked, embedded and cultured in solidified selection medium. After one month on selective from a different transgenic line of plants to produce trans 60 genic seed comprising a G482 Subclade polynucleotide. media transformed tissue becomes visible as green sectors of Crossing provides the advantages of producing new and often regenerating tissue against a background of bleached, less stable transgenic varieties. Genes and the traits they confer healthy tissue. Explants with green sectors are transferred to that have been introduced into a tomato or soybean line may an elongation medium. Culture is continued on this medium be moved into distinct line of plants using traditional back 65 with transfers to fresh plates every two weeks. When shoots crossing techniques well known in the art. Transformation of are 0.5 cm in length they may be excised at the base and tomato plants may be conducted using the protocols of placed in a rooting medium. US 7,868,229 B2 41 42 Example VIII ated by Standard corn regeneration techniques (Fromm et al., 1990; Gordon-Kamm et al., 1990). Transformation of Monocots to Produce Improved Traits such as Accelerated Flowering Time Example IX Expression and Analysis of Improved Traits such as Cereal plants such as, but not limited to, corn, wheat, rice, Decreased Flowering Time in Non-Arabidopsis Sorghum, or barley, may be transformed with the present Species polynucleotide sequences, including monocot or dicot-de 10 As sequences of the invention have been shown to decrease rived sequences such as those found in the present Tables, the time to flowering in plant species, it is also expected that cloned into a vector Such as pGA643 and containing a kana these sequences will accelerate the time to flowering of orna mycin-resistance marker, and expressed constitutively under, mentals, crops or other commercially important plant species. for example, the CaMV 35S or COR15 promoters, or with Northern blot analysis, RT-PCR or microarray analysis of tissue-specific or inducible promoters. The expression vec 15 the regenerated, transformed plants may be used to show expression of a polypeptide or the invention and related genes tors may be one found in the Sequence Listing, or any other that are capable of accelerating flowering time. suitable expression vector may be similarly used. For After a dicot plant, monocot plant or plant cell has been example, pMENO20 may be modified to replace the NptII transformed (and the latter regenerated into a plant) and coding region with the BAR gene of Streptomyces hygro shown to flower earlier than a control plant, the transformed scopicus that confers resistance to phosphinothricin. The monocot plant may be crossed with itself or a plant from the KpnI and BglII sites of the Bar gene are removed by site same line, a non-transformed or wild-type monocot plant, or another transformed monocot plant from a different trans directed mutagenesis with silent codon changes. genic line of plants, to produce transgenic seed comprising a The cloning vector may be introduced into a variety of 25 G482 subclade polynucleotide. cereal plants by means well known in the art including direct The function of specific polypeptides of the invention, DNA transfer or Agrobacterium tumefaciens-mediated trans including closely-related orthologs, have been analyzed and formation. The latter approach may be accomplished by a may be further characterized and incorporated into crop variety of means, including, for example, that of U.S. Pat. No. plants. The ectopic overexpression of these sequences may be 30 regulated using constitutive, inducible, or tissue specific 5,591,616, in which monocotyledon callus is transformed by regulatory elements. Genes that have been examined and contacting dedifferentiating tissue with the Agrobacterium have been shown to modify plant traits (including, for containing the cloning vector. example, accelerated flowering time) encode polypeptides The sampletissues are immersed in a suspension of 3x10 found in the Sequence Listing. In addition to these sequences, cells of Agrobacterium containing the cloning vector for 3-10 35 it is expected that newly discovered polynucleotide and minutes. The callus material is cultured on Solid medium at polypeptide sequences closely related to polynucleotide and 25°C. in the dark for several days. The calli grown on this polypeptide sequences found in the Sequence Listing can also medium are transferred to Regeneration medium. Transfers confer alteration of traits in a similar manner to the sequences are continued every 2-3 weeks (2 or 3 times) until shoots found in the Sequence Listing, when transformed into any of 40 a considerable variety of plants of different species, and develop. Shoots are then transferred to Shoot-Elongation including dicots and monocots. The polynucleotide and medium every 2-3 weeks. Healthy looking shoots are trans polypeptide sequences derived from monocots (e.g., the rice ferred to rooting medium and after roots have developed, the sequences) may be used to transform both monocot and dicot plants are placed into moist potting soil. plants, and those derived from dicots (e.g., the Arabidopsis The transformed plants are then analyzed for the presence 45 and Soy genes) may be used to transform either group, of the NPTII gene/kanamycin resistance by ELISA, using the although it is expected that Some of these sequences will function best if the gene is transformed into a plant from the ELISA NPTII kit from SPrime-3Prime Inc. (Boulder, Colo.). same group as that from which the sequence is derived. It is also routine to use other methods to produce transgenic As an example of a first step to determine improved yield plants of most cereal crops (Vasil, 1994) such as corn, wheat, 50 related traits, seeds of these transgenic plants may be Sub rice, sorghum (Cassas et al., 1993), and barley (Wan and jected to germination or growth assays to measure flowering Lemeaux, 1994). DNA transfer methods such as the micro time. The decrease in flowering time conferred by G482 projectile method can be used for corn (Fromm et al., 1990; Subclade polypeptides may reduce time to market and can Gordon-Kamm et al., 1990; Ishida, 1990), wheat (Vasil et al., contribute to increased yield of commercially available 1992; Vasil et al., 1993: Weeks et al., 1993), and rice (Chris 55 plants, by, for example, providing improved pollination or tou, 1991; Hiei et al., 1994; Aldemita and Hodges, 1996; and additional plantings and harvests within a given growing SCaSO. Hiei et al., 1997). For most cereal plants, embryogenic cells It is expected that the same methods may be applied to derived from immature scutellum tissues are the preferred identify other useful and valuable sequences of the present cellular targets for transformation (Hiei et al., 1997: Vasil, 60 polypeptide clades and Subclades, and the sequences may be 1994). Fortransforming corn embryogenic cells derived from derived from a diverse range of species. immature scutellar tissue using microprojectile bombard ment, the A188XB73 genotype is the preferred genotype REFERENCES CITED (Fromm et al., 1990; Gordon-Kamm et al., 1990). After microprojectile bombardment the tissues are selected on 65 Aldemita and Hodges (1996) Planta 199: 612-617 phosphinothricinto identify the transgenic embryogenic cells Altschul (1990).J. Mol. Biol. 215: 403-410 (Gordon-Kamm et al., 1990). Transgenic plants are regener Altschul (1993).J. Mol. Evol. 36: 290-300 US 7,868,229 B2 43 44 Anderson and Young (1985) “Quantitative Filter Hybridisa Henikoff and Henikoff (1991) Nucleic Acids Res. 19: 6565 tion'. In: Hames and Higgins, ed., Nucleic Acid Hybridi 6572 sation A Practical Approach. Oxford, IRL Press, 73-111 Herrera-Estrella et al. (1983) Nature 303: 209 Ausubel et al. (1997) Short Protocols in Molecular Biology, Hiei et al. (1994) Plant J. 6:271-282 John Wiley & Sons, New York, N.Y., unit 7.7 Hiei et al. (1997) Plant Mol. Biol. 35:205-218 Bairochet al. (1997) Nucleic Acids Res. 25: 217-221 Higgins and Sharp (1988) Gene 73:237-244 Bechtold and Pelletier (1998) Methods Mol. Biol. 82: 259 Higgins et al. (1996) Methods Enzymol. 266: 383-402 266 Ishida (1990) Nature Biotechnol. 14:745-750 Ben-Naim et al. (2006) Plant J. 46: 462-476 Jaglo et al. (2001) Plant Physiol. 127: 910-917 Berger and Kimmel (1987), “Guide to Molecular Cloning 10 Kashima et al. (1985) Nature 313:402-404 Techniques', in Methods in Enzymology, Vol. 152, Aca Kim et al. (2001) Plant J. 25: 247-259 demic Press, Inc., San Diego, Calif. Kimmel (1987) Methods Enzymol. 152:507-511 Bevan (1984) Nucleic Acids Res. 12:8711-8721 Klee (1985) Bio/Technology 3: 637-642 Bhattacharjee et al. (2001) Proc. Natl. Acad. Sci. USA 98: Klein et al. (1987) Nature 327: 70-73 13790-13795 15 Koonmeef et al (1986) In Tomato Biotechnology: Alan R. Borevitz et al. (2000) Plant Cell 12: 2383-2393 Liss, Inc., 169-178 Boss and Thomas (2002) Nature, 416: 847-850 Ku et al. (2000) Proc. Natl. Acad. Sci. USA 97: 9121-9126 Bruce et al. (2000) Plant Cell 12: 65-79 Kyozuka and Shimamoto (2002) Plant Cell Physiol. 43: 130 Bucher (1988).J. Biomol. Struct. Dyn. 5: 1231-1236 135 Bucher (1990).J. Mol. Biol. 212:563-578 Lee et al. (2003) Proc. Natl. Acad. Sci. 100: 2152-2156 Cassas et al. (1993) Proc. Natl. Acad. Sci. USA 90: 11212 Lin et al. (1991) Nature 353: 569-571 11216 Maity and de Crombrugghe (1998) Trends Biochem. Sci. 23: Chase et al. (1993) Ann. Missouri Bot. Gard. 80: 528-580 174-178 Cheikh et al. (2003) U.S. Patent Application No. Mandel (1992a) Nature 360:273-277 2003O101479 25 Mandelet al. (1992b) Cell 71-133-143 Christou et al. (1987) Proc. Natl. Acad. Sci. USA 84: 3962 Mandelet al. (1995) Nature 377:522-524 3966 Mantovani (1999). Gene 239, 15-27 Christou (1991) Bio/Technol. 9:957-962 McNabb et al. (1995) Genes Dev. 9: 47-58 Christou et al. (1992) Plant. J. 2: 275-281 Meyers (1995) Molecular Biology and Biotechnology, Wiley Coupland (1995) Nature 377: 482-483 30 VCH, New York, N.Y., p856-853 Daly et al. (2001) Plant Physiol. 127: 1328-1333 Miki et al. (1993) in Methods in Plant Molecular Biology and De Blaere et al. (1987) Meth. Enzymol. 143:277) Biotechnology, p. 67-88, Glick and Thompson, eds., CRC Deshayes et al. (1985) EMBO.J., 4: 2731-2737 Press, Inc., Boca Raton DHalluin et al. (1992) Plant Cell 4: 1495-1505 Mount (2001), in Bioinformatics. Sequence and Genome Donnet al. (1990) in Abstracts of VIth International Congress 35 Analysis, Cold Spring Harbor Laboratory Press, Cold on Plant Cell and Tissue Culture LAPTC, A2-38: 53 Spring Harbor, N.Y., p. 543 Doolittle, ed. (1996) Methods in Enzymology, Vol. 266: Nandi et al. (2000) Curr. Biol. 10: 215-218 “Computer Methods for Macromolecular Sequence Olesen and Guarente (1990) Genes Dev. 4, 1714-1729 Analysis' Academic Press, Inc., San Diego, Calif., USA Peng et al. (1997) Genes Development 11:3194-3205 Draper et al. (1982) Plant Cell Physiol. 23: 451–458 40 Peng et al. (1999) Nature 400: 256-261 Eddy (1996) Curr. Opin. Str. Biol. 6: 361-365 Ratcliffe et al. (2001) Plant Physiol. 126: 122-132 Edwards et al. (1998) Plant Physiol. 117: 1015-1022 Reeves and Nissen (1995) Prog. Cell Cycle Res. 1: 339-349 Eisen (1998) Genome Res. 8: 163-167 Riechmann et al. (2000a) Science 290, 2105-2110 Feng and Doolittle (1987).J. Mol. Evol. 25: 351-360 Riechmann (2000b) Curr. Opin. Plant Biol. 3, 423-434 Fowler and Thomashow (2002) Plant Cell 14: 1675-1690 45 Rieger et al. (1976) Glossary of Genetics and Cytogenetics. Fromm et al. (1990) Bio/Technol. 8: 833-839 Classical and Molecular, 4th ed., Springer Verlag, Berlin Fu et al. (2001) Plant Cell 13: 1791-1802 Robson et al. (2001) Plant J28: 619-631 Gelinas et al. (1985) Nature 313: 323-325 Sadowski et al. (1988) Nature 335: 563-564 Gelvin et al. (1990) Plant Molecular Biology Manual, Klu Sambrook et al. (1989) Molecular Cloning. A Laboratory wer Academic Publishers 50 Manual, 2nd Ed., Cold Spring Harbor Laboratory, Cold Gilmour et al. (1998) Plant J. 16:433-442 Spring Harbor, N.Y. Gish and States (1993) Nature Genetics 3: 266-272 Sanford et al. (1987) Part. Sci. Technol. 5:27-37 Gruber et al., in Glick and Thompson (1993) Methods in Sanford (1993) Methods Enzymol. 217:483-509 Plant Molecular Biology and Biotechnolog. eds., CRC Shpaer (1997) Methods Mol. Biol. 70: 173-187 55 Simon et al. (1996) Nature 384: 59-62 Press, Inc., Boca Raton Smith and Waterman (1981) Adv. Appl. Math. 2: 482-489 Goodrich et al. (1993) Cell 75: 519-530 Smith et al. (1992) Protein Engineering 5:35-51 Gordon-Karnmetal. (1990) Plant Cell 2: 603-618 Soltis et al. (1997) Ann. Missouri Bot. Gard. 84: 1-49 Gusmaroliet al. (2001) Gene 264: 173-185 Sonnhammer et al. (1997) Proteins 28: 405-420 Gusmaroliet al. (2002) Gene 283: 4148 60 Spencer et al. (1994) Plant Mol. Biol. 24: 51-61 Hain et al. (1985) Mol. Gen. Genet. 199: 161-168 Suzuki et al. (2001) Plant J. 28: 409–418 Haymes et al. (1985) Nucleic Acid Hybridization: A Practical Tasanen et al. (1992) J Biol. Chem. 267: 11513-11519 Approach, IRL Press, Washington, D.C. Thompson et al. (1994) Nucleic Acids Res. 22:4673-4680 He et al. (2000) Transgenic Res. 9; 223-227 Tudge (2000) in The Variety of Life, Oxford University Press, Hein (1990) Methods Enzymol. 183: 626-645 65 New York, N.Y. pp. 547-606 Henikoff and Henikoff (1989) Proc. Natl. Acad. Sci. USA Vasil et al. (1992) Bio/Technol. 10:667-674 89:10915 Vasil et al. (1993) Bio/Technol. 11: 1553-1558 US 7,868,229 B2 45 46 Vasil (1994) Plant Mol. Biol. 25: 925-937 extent as if each individual publication or patent application Wahl and Berger (1987) Methods Enzymol. 152:399-407 was specifically and individually indicated to be incorporated Wan and Lemeaux (1994) Plant Physiol. 104:37-48 by reference. Weeks et al. (1993) Plant Physiol. 102: 1077-1084 Weigel and Nilsson (1995) Nature 377: 482-500 The present invention is not limited by the specific embodi Weissbach and Weissbach (1989) Methods for Plant Molecu ments described herein. The invention now being fully lar Biology, Academic Press described, it will be apparent to one of ordinary skill in the art Xu et al. (2001) Proc. Natl. Acad. Sci. USA 98: 15089-15094 that many changes and modifications can be made thereto Zemzoumi et al. (1999).J. Mol. Biol. 286: 327-337 without departing from the spirit or scope of the appended Zhang et al. (1991) Bio/Technology 9: 996-997 10 claims. Modifications that become apparent from the forego All publications and patent applications mentioned in this ing description and accompanying figures fall within the specification are herein incorporated by reference to the same Scope of the claims.

SEQUENCE LISTING

<16 Os NUMBER OF SEO ID NOS: 71

<21 Oc SEO ID NO 1 <211 LENGTH: 720 <212> TYPE: DNA <213> ORGANISM: Oryza sativa <22 Os FEATURE; OTHER INFORMATION: SEO ID NO: 1. G3397

<4 OOs SEQUENCE: 1

gcgtctgatt totgaagag gaggaggagg atgc.cggact cggacaacga Ct.ccggcggg 60

CC9agcaact acgcgggaggggagctgtc.g. tcgcc.gcggg agcaggacag gttcCtgc.cg 12O

atcgcgaacg tagcaggat catgaagaag gCdCtgc.cgg cgaacgc.cala gatcagcaag 18O

gacgc.calagg agacggtgca ggagtgcgtc. tcc.gagttca tot cott cat Caccggcgag 24 O

gcctic caca agtgc.ca.gcg cgaga agcgc aagaccatca acggcgacga Cctgct ctgg 3 OO

gcc atgacca cc ct cqgctt Caggactac git caccCCC toaa.gcacta cctic cacaag 360

titcc.gc.gaga to gagggcga gcdc.gc.cgcc gcct coacca Caccagcgc.c 42O

gccticcacca ccc.gc.cgca gCaggagcac accgc.calatg Ctacgc.cggg 48O

tacgc.cgc.cc cqggagc.cgg ccc.cggcggc atgatgatga tgatggggca gcc catgitac 54 O

ggctcgc.cgc caccgc.cgcc acago agcag cagcagdaac accaccacat ggcaatggga 6 OO

ggalagaggcg gctt.cggtca to atc.ccggc gg.cggcggcg gtcgt.cgt.cg 660

gggcacggit C ggcaaaa.ca.g gggcgcttga catcgct cog agacgagtag catgcaccat 72O

SEO ID NO 2 LENGTH: 219 TYPE PRT ORGANISM: Oryza sativa FEATURE; OTHER INFORMATION: SEO ID NO: 2 G3397 polypeptide

<4 OOs SEQUENCE: 2

Met Pro Asp Ser Asp Asn Asp Ser Gly Gly Pro Ser Asn Ala Gly 1. 5 1O 15

Gly Glu Lell Ser Ser Pro Arg Glu Glin Asp Arg Phe Luell Pro Ile Ala 25 3 O

ASn Wall Ser Arg Ile Met Lys Ala Lieu Pro Ala Asn Ala Ile 35 4 O 45

Ser Lys Asp Ala Glu Thir Wall Glin Glu Cys Wall Ser Glu Phe Ile SO 55 60

Ser Phe Ile Thr Gly Glu Ala Ser Asp Llys Cys Glin Arg Glu 65 70 US 7,868,229 B2 47 48

- Continued Thir Ile Asin Gly Asp Asp Lieu. Lieu. Trp Ala Met Thir Thr Lieu. Gly 85 90 95

Phe Glu Asp Tyr Val Asp Pro Lieu Llys His Tyr Lell His Lys Phe Arg 105 11 O

Glu Ile Glu Gly Glu Arg Ala Ala Ala Ser Thr Thir Gly Ala Gly Thr 115 12 O 125

Ser Ala Ala Ser Thir Thr Pro Pro Glin Glin Glin His Thir Ala Asn Ala 13 O 135 14 O

Ala Gly Gly Tyr Ala Gly Tyr Ala Ala Pro Gly Ala Gly Pro Gly Gly 145 150 155 160

Met Met Met Met Met Gly Glin Pro Met Tyr Gly Ser Pro Pro Pro Pro 1.65 17O 17s

Pro Glin Glin Glin Glin Glin Glin His His His Met Ala Met Gly Gly Arg 18O 185 19 O

Gly Gly Phe Gly His His Pro Gly Gly Gly Gly Gly Gly Ser Ser Ser 195 2O5

Ser Ser Gly His Gly Arg Glin Asn Arg Gly Ala 21 O 215

SEQ ID NO 3 LENGTH 735 TYPE: DNA ORGANISM: Zea mays FEATURE: OTHER INFORMATION: SEO ID NO : 3 G3 435 SEQUENCE: 3 cggcggtggc Cttgagctga ggcgg.cggag cgatgc.cgga Ctcggacaac gactic.cggcg 6 O ggcc.gagcaa. cgc.cgggggc gagctgtcgt. cgcc.gcggga gCaggaccgg titcct gcc.ca 12 O tcqccaacgt. gag.ccggatc atgaagaagg cgctic ccggc Caacgc.caag atcagcaagg 18O acgc.caagga gacggtgcag gagtgcgt.gt cc.gagttcat cit cott catc. accgg.cgagg 24 O cctic cqacaa gtgc.ca.gc.gc gagaa.gc.gca agaccatcaa cggcgacgac Ctgctgttggg 3OO c catgaccac gct cqgct tc gaggactacg tcgagcc.gct caa.gcactac ctgcacaagt 360 tcc.gc.gagat cgaggg.cgag cgt.ccgc.cgg tcgcagcagc agcagcagca ggg.cgagctg Cccagaggcg cc.gc.caatgc cgc.cggg tac gcc.gggtacg

Ctc.cggcggc atgatgatga tgatgatggg gcago Ccatg tacggcggct 54 O cgcagcc.gca cc.gcc.gc.ctic agcc.gc.caca gcago agcag caa.catcaac agcatcacat ggcaat agga ggcagaggag gatt.cggc.ca acaaggcggc ggcgg.cggct 660 cct cqtcqtc gtcagggctt acagggcgtg agttgcgacg atacgtcaga 72 O atcagaatcq citgat 73

SEQ ID NO 4 LENGTH: 222 TYPE : PRT ORGANISM: Zea mays FEATURE: OTHER INFORMATION: SEO ID NO: 4 G3435 polypeptide

<4 OOs, SEQUENCE: 4 Met Pro Asp Ser Asp Asn Asp Ser Gly Gly Pro Ser Asn Ala Gly Gly 1. 5 15 Glu Lieu. Ser Ser Pro Arg Glu Glin Asp Arg Phe Lieu Pro Ile Ala Asn 25

US 7,868,229 B2 51 52

- Continued

22 Os. FEATURE: <223> OTHER INFORMATION: SEO ID NO: 6 G343 6 polypeptide

<4 OOs, SEQUENCE: 6

Met Pro Asp Ser Asp Asn. Glu Ser Gly Gly Pro Ser Asn Ala Glu Phe 1. 5 1O 15

Ser Ser Pro Arg Glu Glin Asp Arg Phe Lieu Pro Ile Ala Asn Wall Ser 25 3O

Arg Ile Met Lys Lys Ala Lieu Pro Ala Asn Ala Ile Ser Lys Asp 35 4 O 45

Ala Lys Glu Thr Val Glin Glu. Cys Wall Ser Glu Ile Ser Phe Ile SO 55

Thr Gly Glu Ala Ser Asp Llys Cys Glin Arg Glu Arg Thir Ile 65 70 7s 8O

Asn Gly Asp Asp Lieu. Lieu. Trp Ala Met Thir Thir Lell Gly Phe Glu Asp 85 90 95

Tyr Val Glu Pro Leu Lys Lieu. Tyr Lieu. His Llys Phe Arg Glu Lieu. Glu 105 11 O

Gly Glu Lys Ala Ala Thir Thir Ser Ala Ser Ser Gly Pro Glin Pro Pro 115 12 O 125

Lieu. His Arg Glu Thir Thr Pro Ser Ser Ser Thr His Asn Gly Ala Gly 13 O 135 14 O

Gly Pro Val Gly Gly Tyr Gly Met Ala Gly Gly Gly Ser 145 150 155 160

Gly Met Ile Met Met Met Gly Glin Pro Met Tyr Gly Gly Ser Pro Pro 1.65 17O 17s

Ala Ala Ser Ser Gly Ser Tyr Pro His His Glin Met Ala Met Gly Gly 18O 185 19 O

Lys Gly Gly Ala Tyr Gly Tyr Gly Gly Gly Ser Ser Ser Ser Pro Ser 195 2OO 2O5 Gly Lieu. Gly Arg 21 O

<210s, SEQ ID NO 7 &211s LENGTH: 761 &212s. TYPE: DNA <213> ORGANISM: Oryza sativa 22 Os. FEATURE: <223> OTHER INFORMATION: SEO ID NO : 7 G3398

<4 OO > SEQUENCE: 7 cctcitcct ct tcgtct tcct cct cqcct tc gct tcgactg citt catcga gggagat.cga 6 O ggttgcgatg ccggatticgg acaacgagtic agggggg.ccg agcaacg.cgg gggagtacgc 12 O gtcggcgagg gag caggaca ggttcCtgcc gat.cgcgaac gtgagcagga t catgaagag 18O ggcgct cocg gcgaacgc.ca agat cagcaa. ggacgc.caag gaga.cggtgc aggagtgcgt 24 O citcggagttc atctoctitca t caccggcga ggcct Cogac gggagaa.gc.g 3OO

Caagac catc aacggcgacg acctic citctg ggcgatgacc tcgaggact a 360 catcgacccg citcaagct ct acctic Cacala gttcc.gc.gag Ctcgagggcg agaaggc cat cggcgcc.gcc ggcagogg.cg cgcct cotcc gct Coggctic cggctcgcac caccaccagg atgct tcc.cg gaacaatggc ggatacggca tgtacggcgg 54 O cggcggcggc atgat catga tgatgggaca gcc tatgtac cggcgt.cgt.c agctggg tac gcgcagcc.gc cgcc.gc.ccca CCaCCaC CaC Caccagatgg tgatgggagg 660 US 7,868,229 B2 53 54

- Continued

72 O cggc.cggcaa gacaggct at gagcttgctt t cttggttgg t 761

SEQ ID NO 8 LENGTH: 224 TYPE : PRT ORGANISM: Oryza sativa FEATURE: OTHER INFORMATION: SEO ID NO: 8 G3398 polypeptide

SEQUENCE: 8

Met Pro Asp Ser Asp Asn. Glu Ser Gly Gly Pro Ser Asn Ala Gly Glu 1. 5 1O 15

Tyr Ala Ser Ala Arg Glu Glin Asp Arg Phe Lieu. Pro Ile Ala Asn. Wall 25

Ser Arg Ile Met Lys Arg Ala Lieu. Pro Ala Asn Ala Lys Ile Ser Lys 35 4 O 45

Asp Ala Glu Thir Wall Glin Glu Cys Val Ser Glu Phe Ile Ser Phe SO 55 6 O

Ile Thir Gly Glu Ala Ser Asp Llys Cys Glin Arg Glu Arg Lys Thr 65 70 7s 8O

Ile Asn Gly Asp Asp Lieu Lleu Trp Ala Met Thr Thir Lell Gly Phe Glu 85 90 95

Asp Ile Asp Pro Lieu Lys Lieu. Tyr Lieu. His Phe Arg Glu Lieu. 105 11 O

Glu Gly Glu Lys Ala Ile Gly Ala Ala Gly Ser Gly Gly Gly Gly Ala 115 12 O 125

Ala Ser Ser Gly Gly Ser Gly Ser Gly Ser Gly Ser His His His Glin 13 O 135 14 O

Asp Ala Ser Arg Asn. Asn Gly Gly Tyr Gly Met Gly Gly Gly Gly 145 150 155 160

Gly Met Ile Met Met Met Gly Glin Pro Met Tyr Gly Ser Pro Pro Ala 1.65 17O 17s

Ser Ser Ala Gly Tyr Ala Gln Pro Pro Pro Pro His His His His His 18O 185 19 O

Glin Met Wall Met Gly Gly Lys Gly Ala Tyr Gly His Gly Gly Gly Gly 195

Gly Gly Gly Pro Ser Pro Ser Ser Gly Tyr Gly Arg Glin Asp Arg Lieu. 21 O 215 22O

SEO ID NO 9 LENGTH: 553 TYPE: DNA ORGANISM: Glycine max FEATURE: OTHER INFORMATION: SEO ID NO: 9 G3474

<4 OOs, SEQUENCE: 9 gatatc catg gctgagtc.cg acaacgagtic aggaggt cac acggggaacg cgagcgggag 6 O

Caacgagttg tcc.ggttgca gggagcaaga caggttcctic c caatagdaa acgtgagcag 12 O gat catgaag aaggcgttgc cggcgaacgc gaagatatcg aaggagg.cga aggaga.cggit 18O gCaggagtgc gtgtcggagt t catcagott Catalacagga gaggctt cog atalagtgc.ca 24 O galaggagaag aggalagacga t caacggcga cgatcttctic tgggc catga CtaccCtggg 3OO

Cttcgaggac tacgtggatc citct caagat ttacctgcac aagtataggg agatggaggg 360 ggagaaaacc gctatgatgg gaaggccaca tgagagggat gagggittatg gcc atggc.ca US 7,868,229 B2 55 56

- Continued tggit catgca act Cotatga tigacgatgat gatgggg.cat cagc.cccagc accagcacca 48O gcaccagcac Caggga cacg titatggatc tigat cagca tottctgcaa gaact agata 54 O gcatgtgtca tot 553

SEQ ID NO 10 LENGTH: 177 TYPE : PRT ORGANISM: Glycine max FEATURE: OTHER INFORMATION: SEO ID NO: 10 G3474 polypeptide

<4 OOs, SEQUENCE: 10

Met Ala Glu Ser Asp As in Glu Ser Gly Gly His Thir Gly Asn Ala Ser 1. 5 1O 15

Gly Ser Asn Glu Lieu. Se r Gly Cys Arg Glu Glin Asp Arg Phe Leul Pro 2O 25

Ile Ala Asn Wall Ser Airg Ile Met Llys Lys Ala Lell Pro Ala Asn Ala 35 4 O 45

Ile Ser Lys Glu Al a Lys Glu Thir Wall Glin Glu Wall Ser Glu SO 55 6 O

Ile Ser Phe Ile Thr Gly Glu Ala Ser Asp Glin Lys Glu 70 7s 8O

Arg Thir Ile As in Gly Asp Asp Lieu. Lieu. Trp Ala Met Th Thr 85 90 95

Lell Gly Phe Glu Asp Tyr Val Asp Pro Lieu Lys Ile Luell His Lys 105 11 O

Arg Glu Met Glu Gl y Glu Lys Thir Ala Met Met Gly Arg Pro His 115 12 O 125

Glu Arg Asp Glu Gly Tyr Gly His Gly His Gly His Ala Thir Pro Met 13 O 135 14 O

Met Thir Met Met Met Gl y His Glin Pro Glin His Glin His Glin His Glin 145 15 O 155 160

His Glin Gly His Val Tyr Gly Ser Gly Ser Ala Ser Ser Ala Arg Thr 1.65 17O 17s Arg

SEQ ID NO 11 LENGTH: 6O2 TYPE: DNA ORGANISM: Glycine max FEATURE: OTHER INFORMATION: SEO ID NO : 11. G3 478

<4 OOs, SEQUENCE: 11 titc.cgittagt catggcgga citc.cgacaac gactic.cggcg gcgc.gcacala cgg.cggcaa.g 6 O gggagcgaga tgtc.gc.cgcg ggagcaggac cggitttcticc cgatcgcgaa cgtgagcc.gc 12 O atcatgaaga aggcgctgcc aagat ct cqa aggacgcgaa ggaga.cggtg 18O

Caggagtgcg tgtcagagtt catcagottc at Caccggcg aggccticca Caagtgc.ca.g 24 O cgc.gagaa.gc gcaagacgat Caacggcgac gacctgctict ggg.cgatgac Cactctgggc 3OO titcgaggact acgtggagcc t ct caaaggc tacct coagc gct tcc.gaga aatggaagga 360 gagalagaccg tgg.cggcgc.g tgacaaggac gcqcctic ctic ttacgaatgc taccalacagt gcctacgaga gtgctaatta gctgctgttc Ctggtggaat catgatgcat

Caggga cacg tgtacggttc Cat Caagtgg Ctggcgggg.c tataaagggit 54 O US 7,868,229 B2 57 58

- Continued gggcctgctt atcctgggcc tigatccalat gcc.gg taggc ccagataaag agcct attat ta

SEQ ID NO 12 LENGTH: 191 TYPE : PRT ORGANISM: Glycine max FEATURE: OTHER INFORMATION: SEO ID NO: 12 G3478 polypeptide

<4 OOs, SEQUENCE: 12

Met Ala Asp Ser Asp Asn Asp Ser Gly Gly Ala His Asn Gly Gly Lys 1. 5 1O 15

Gly Ser Glu Met Ser Pro Arg Glu Glin Asp Arg Phe Lell Pro Ile Ala 25

Asn Wall Ser Arg Ile Met Lys Llys Ala Lieu Pro Ala Asn Ala Lys Ile 35 4 O 45

Ser Lys Asp Ala Lys Glu Thr Val Glin Glu. Cys Wall Ser Glu Phe Ile SO 55 6 O

Ser Phe Ile Thr Gly Glu Ala Ser Asp Llys Cys Glin Arg Glu 65 70

Thir Ile Asin Gly Asp Asp Lieu. Lieu. Trp Ala Met Thir Thir Lieu. Gly 85 90 95

Phe Glu Asp Tyr Val Glu Pro Leu Lys Gly Tyr Lell Glin Arg Phe Arg 105 11 O

Glu Met Glu Gly Glu Lys Thr Val Ala Ala Arg Asp Lys Asp Ala Pro 115 12 O 125

Pro Luell Thir Asn Ala Thir Asn. Ser Ala Tyr Glu Ser Ala Asn Tyr Ala 13 O 135 14 O

Ala Ala Ala Ala Val Pro Gly Gly Ile Met Met His Glin Gly His Wall 145 150 155 160

Gly Ser Ala Gly Phe His Glin Val Ala Gly Gly Ala Ile Lys Gly 1.65 17O 17s

Gly Pro Ala Tyr Pro Gly Pro Gly Ser Asn Ala Gly Arg Pro Arg 18O 185 19 O

SEQ ID NO 13 LENGTH: 6OO TYPE: DNA ORGANISM: Glycine max FEATURE: OTHER INFORMATION: SEO ID NO : 13 G3,475

SEQUENCE: 13 tcqattatcc gtttgtcgat ggcggacticg gacaacgact gcacaacgc.c 6 O gggaagggga gcgagatgtc. gcc.gcgggag Caggaccggit tcc tigc.cgat cgcgaacgtg 12 O agcc.gcatca tgaagaaggc gctg.ccggcg aacgc.galaga t ct caagga cgcgaaggag 18O acggtgcagg agtgcgtgtc. ggagttcatc agctt catca Ctc.cgacaag 24 O

agaa.gc.gcaa. gacgatcaac ggcgacgacc tgctctgggc gatgaccact 3OO

Ctcggctt.cg aggact acgt. cgagcct citc aagggct acc tccagcgctt cc.gagaaatg 360 galaggagaga agacagtggc ggcgc.gtgac aaggacgc.gc ctic ct cottac caatgct acc aacagtgc ct acgagagt cc tag titatgct gctgctic ctg gtggaat cat gatgcatcag ggacacgtgt acggttctgc cggct tcc at Caagtggctg gtggtgctat aaagggtggg 54 O US 7,868,229 B2 59 60

- Continued Cctgtttatic ccgggcctgg atccalatgcc gg taggcc.ca ggtagatggg cctatgttat

SEQ ID NO 14 LENGTH: 188 TYPE : PRT ORGANISM: Glycine max FEATURE: OTHER INFORMATION: SEO ID NO: 14 G3475 polypeptide

SEQUENCE: 14

Met Ala Asp Ser Asp Asn Asp Ser Gly Gly Ala His Asn Ala Gly Lys 1. 5 1O 15

Gly Ser Glu Met Ser Pro Arg Glu Glin Asp Arg Phe Lell Pro Ile Ala 25 3O

Asn Wall Ser Arg Ile Met Lys Llys Ala Lieu Pro Ala Asn Ala Lys Ile 35 4 O 45

Ser Lys Asp Ala Lys Glu Thr Val Glin Glu. Cys Wall Ser Glu Phe Ile SO 55 6 O

Ser Phe Ile Thr Gly Glu Ala Ser Asp Llys Cys Glin Arg Glu Lys Arg 65 70 8O

Thir Ile Asin Gly Asp Asp Lieu. Lieu. Trp Ala Met Thir Thr Lieu. Gly 85 90 95

Phe Glu Asp Tyr Val Glu Pro Leu Lys Gly Tyr Lell Glin Arg Phe Arg 105 11 O

Glu Met Glu Gly Glu Lys Thr Val Ala Ala Arg Asp Lys Asp Ala Pro 115 12 O 125

Pro Pro Thir Asn Ala Thir Asn. Ser Ala Tyr Glu Ser Pro Ser Tyr Ala 13 O 135 14 O

Ala Ala Pro Gly Gly Ile Met Met His Glin Gly His Wall Tyr Gly Ser 145 150 155 160

Ala Gly Phe His Glin Val Ala Gly Gly Ala Ile Gly Gly Pro Val 1.65 17O 17s

Pro Gly Pro Gly Ser Asn Ala Gly Arg Pro Arg 18O 185

SEO ID NO 15 LENGTH: 684 TYPE: DNA ORGANISM: Arabidopsis thal iana FEATURE: OTHER INFORMATION: SEO ID NO : 15 G485

<4 OOs, SEQUENCE: 15 cct citctgat coaacggacc Caaaa.cat Ct at citctic titt ctic gacctitt tdt ct cotcg 6 O atctaaagat ggcggatt.cg gacaacgatt Caggaggaca caaagacggt ggaaatgctt 12 O cgacacgtga gcaagatagg tittct accoga tcqctaacgt. tag caggat C atgaagaaag 18O cact tcctgc gaacgcaaaa atctotaagg atgctaaaga aacggttcaa gagtgttgtat 24 O cggaatt cat aagttt catc accggtgagg cittctgacaa gtgtcagaga gagaa.gagga 3OO agacaatcaa cggtgacgat cittctittggg cgatgacitac gct agggittt gaggact acg 360 tggagcct ct caaggttitat Ctgcaaaagt at agggaggt ggaaggagag aagacitact a cggCagggag acaaggcgat aaggalaggtg gaggaggagg cggtggagct ggaagttggaa gtggaggagc tcc gatgtac ggtggtggca tggtgactac gatgggacat caattitt coc 54 O at cattitt to ttaattgtaa aatgataaaa gcaaattitt c atttitt atta attaatgata tatatatata tatgtttaac ttittagtata atgtttacag aatttitttitt ttaaaac tag 660 US 7,868,229 B2 61 62

- Continued gttcaa.ccca ctaacgtaac agcg 684

SEQ ID NO 16 LENGTH: 161 TYPE : PRT ORGANISM: Arabidopsis thaliana FEATURE: OTHER INFORMATION: SEO ID NO: 16 G485 polypeptide

<4 OOs, SEQUENCE: 16 Met Ala Asp Ser Asp Asn Asp Ser Gly Gly His Asp Gly Gly Asn 1. 5 1O 15

Ala Ser Thir Arg Glu Glin Asp Arg Phe Lieu Pro Ile Ala Asn Wall Ser 25 3O

Arg Ile Met Llys Lys Ala Lieu Pro Ala Asn Ala Ile Ser Lys Asp 35 4 O 45

Ala Lys Glu Thr Val Glin Glu. Cys Wall Ser Glu Ile Ser Phe Ile SO 55

Thir Gly Glu Ala Ser Asp Llys Cys Glin Arg Glu Arg Thir Ile 65 70 7s 8O

Asn Gly Asp Asp Lieu. Lieu. Trp Ala Met Thir Thir Lell Gly Phe Glu Asp 85 90 95

Wall Glu Pro Leu Lys Val Tyr Lieu. Glin Lys Arg Glu Wall Glu 105 11 O

Gly Glu Lys Thir Thr Thr Ala Gly Arg Glin Gly Asp Lys Glu Gly Gly 115 12 O 125

Gly Gly Gly Gly Gly Ala Gly Ser Gly Ser Gly Gly Ala Pro Met Tyr 13 O 135 14 O

Gly Gly Gly Met Wall. Thir Thir Met Gly His Glin Phe Ser His His Phe 145 150 155 160

Ser

SEO ID NO 17 LENGTH: 528 TYPE: DNA ORGANISM: Glycine max FEATURE: OTHER INFORMATION: SEO ID NO : 17 G3476

SEQUENCE: 17 ggattgattg taagatggc tgagt cqgac aacgact cqg gagggg.cgca gaacg.cggga 6 O alacagtggaa actitgagcga cgggaac agg accoggitttct c cc catagog 12 O aacgtgagca ggat catgaa gaaggccttg ccggcgaacg cgaagat citc galaggacgc.g 18O aaggaga.cgg tgcaggaatg cgtgtcggag ttcat cagot t catalacggg tgagg.cgt.cg 24 O gacaagtgcc agagggagaa gcgcaagacic at Caacggcg acgat cittct Ctgggcc atg 3OO acaa.ccctgg gatt.cgalaga gtacgtggag cct ctdaaga tt tacct coa gcgct tcc.gc 360 gagatggagg gagagaagac cgcgact citt ctaaggactic ggcct cogCC t cott cotlatc. at Cagggaca cgtgtacggc toccctgcct acCat Catca agtgcCtggg

CCCaCttatc. Ctgc.ccctgg tag acccaga tgacgtgctic c to tatto 528

SEQ ID NO 18 LENGTH: 1.65 TYPE : PRT ORGANISM: Glycine max FEATURE: US 7,868,229 B2 63 64

- Continued <223> OTHER INFORMATION: SEO ID NO: 18 G3476 polypeptide

<4 OOs, SEQUENCE: 18

Met Ala Glu Ser Asp As in Asp Ser Gly Gly Ala Glin Asn Ala Gly Asn 1. 5 1O 15

Ser Gly Asn Lieu. Ser Gl ul Luell Ser Pro Arg Glu Glin Asp Arg Phe Lieu. 2O 25 3O

Pro Ile Ala Asn. Wal Se r Arg Ile Met Lys Llys Ala Lell Pro Ala Asn 35 4 O 45

Ala Lys Ile Ser Lys As p Ala Lys Glu Thir Wall Glin Glu Wall Ser SO 55 6 O

Glu Phe Ile Ser Phe Il e Thr Gly Glu Ala Ser Asp Glin Arg 65 70 7s 8O

Glu Arg Lys. Thir Il e Asn Gly Asp Asp Lieu. Lell Trp Ala Met Thr 85 90 95

Thir Lel Gly Phe Glu Gl u Tyr Val Glu Pro Lieu. Ile Tyr Lieu. Glin 1OO 105 11 O

Arg Phe Arg Glu Met Gl u Gly Glu Lys Thr Val Ala Ala Arg Asp Ser 115 12 O 125

Ser Lys Asp Ser Ala Se r Ala Ser Ser Tyr His Glin Gly His Val Tyr 13 O 135 14 O

Gly Ser Pro Ala Tyr Hi S His Glin Val Pro Gly Pro Thir Pro Ala 145 15 O 155 160

Pro Gly Arg Pro Arg 1.65

SEQ ID NO 19 LENGTH: 547 TYPE: DNA ORGANISM: Glycine max FEATURE: OTHER INFORMATION: SEO ID NO : 19 G3472

SEQUENCE: 19 taaggctago: tagctago Ca tggctgagtic gga caacgag tcc.ggaggit c acacggggaa 6 O cgcaa.gcgga agcaacgaat tct Coggttg Cagggagcaa. gacaggttcC titc.cgatagc 12 O gaacgtgagc aggat catga agaaggcgtt gcc.ggcgaac gcgaagatct cgaaggaggc 18O galaggaga.cg gtgcaggagt gcgtgtcgga gttcatcago tt cataa.ca.g gagaa.gcgt.c 24 O cgataagtgc Cagaaggaga agaggaagac gat Caacggc gatgatctgc tgtgggc cat 3OO gaccacgctg ggatt.cgagg agtacgtgga gcct citcaag gtttatctgc atalagtatag 360 ggagctggaa ggggagaaaa ctgctatoat gggaaggc.ca Catgagaggg atgagggitta tggit catgca act cotatga tgat catgat gggggat Caa Cagcagcagc at Cagggaca cgtgtatgga tctggalacta c tactggatc agcatcttct gcaagaact a gataa Caggit 54 O titatgca 547

SEQ ID NO 2 O LENGTH 171 TYPE : PRT ORGANISM: Glycine max FEATURE: OTHER INFORMATION: SEO ID NO: 2O G3472 polypeptide

<4 OOs, SEQUENCE: 2O Met Ala Glu Ser Asp Asn. Glu Ser Gly Gly His Thr Gly Asn Ala Ser 1. 5 15 US 7,868,229 B2 65 66

- Continued

Gly Ser Asn Glu Phe Ser Gly Cys Arg Glu Glin Asp Arg Phe Leul Pro 25

Ile Ala Asn Val Ser Arg Ile Met Llys Lys Ala Lell Pro Ala Asn Ala 35 4 O 45

Ile Ser Lys Glu Ala Lys Glu Thir Wall Glin Glu Wall Ser Glu SO 55 6 O

Ile Ser Phe Ile Thr Gly Glu Ala Ser Asp Glin Lys Glu 70 7s 8O

Arg Thir Ile Asn Gly Asp Asp Lieu. Lieu. Trp Ala Met Th Thr 90 95

Lell Gly Phe Glu Glu Tyr Val Glu Pro Lieu Lys Wall Luell His Lys 105 11 O

Arg Glu Lieu. Glu Gly Glu Lys Thir Ala Met Met Gly Arg Pro His 115 12 O 125

Glu Arg Asp Glu Gly Tyr Gly His Ala Thr Pro Met Met Ile Met Met 13 O 135 14 O

Gly His Glin Glin Glin Gln His Glin Gly His Val Gly Ser Gly Thr 145 150 155 160

Thir Thir Gly Ser Ala Ser Ser Ala Arg Thr Arg 17O

SEQ ID NO 2 LENGTH: 557 TYPE: DNA ORGANISM: Zea mays FEATURE: OTHER INFORMATION: SEO ID NO: 21. G3876

SEQUENCE: 21 tataagtgca ggaggagctic atggcggalag Ctc.cggcgag CCCtggcggc ggcgg.cggga 6 O gccacgagag cqggagcc cc aggggagg.cg gaggcggtgg Cagcgtcagg gag caggaca 12 O ggttcCtgcc catcgc.caac atcagt cqca t catgaagaa ggc catc.ccg gctaacggga 18O agat.cgc.caa ggacgctaag gag accgtgc aggagtgcgt. citc.cgagttc at ct cott ca. 24 O t cactagoga agcgagtgac aagtgccaga gggagaa.gc.g gaagaccatc aatggcgacg 3OO atctgctgtg ggc catggcc acgctggggt ttgaagacita cattgaaccc Ctcaaggtgt 360 acctgcagaa gtacagagag atggagggtg atagdaagtt aactgcaaaa tctagogatg gct caattaa aaaggatgcc Cttggtcatg tgggagcaa.g tagct cagot gcaca aggga tgggccaa.ca gggagcatac alaccalaggaa tgggittatat gcaa.ccc.ca.g taccatalacg 54 O gggatatcto aalactaa sist

SEQ ID NO 22 LENGTH: 178 TYPE : PRT ORGANISM: Zea mays FEATURE: OTHER INFORMATION: SEO ID NO: 22 G3876 polypeptide

<4 OOs, SEQUENCE: 22

Met Ala Glu Ala Pro Ala Ser Pro Gly Gly Gly Gly Gly Ser His Glu 1. 5 15 Ser Gly Ser Pro Arg Gly Gly Gly Gly Gly Gly Ser Val Arg Glu Glin 25 Asp Arg Phe Lieu Pro Ile Ala Asn Ile Ser Arg Ile Met Lys Lys Ala 35 4 O 45 US 7,868,229 B2 67 68

- Continued

Ile Pro Ala Asin Gly Lly s Ile Ala Lys Asp Ala Lys Glu Thir Wall Glin SO 55 6 O

Glu Wall Ser Glu Ph.e Ile Ser Phe Ile Thr Ser Glu Ala Ser Asp 65 7s 8O

Glin Arg Glu Lys Arg Llys Thir Ile Asn Gly Asp Asp Lieu. Luell 85 90 95

Trp Ala Met Ala Thir Leu Gly Phe Glu Asp Tyr Ile Glu Pro Lieu Lys 1OO 105 11 O

Wall Luell Glin Llys Tyr Arg Glu Met Glu Gly Asp Ser Lieu. Thir 115 12 O 125

Ala Lys Ser Ser Asp Gl y Ser Ile Ala Lell Gly His Wall 13 O 135 14 O

Gly Ala Ser Ser Ser Al a Ala Glin Gly Met Gly Glin Glin Gly Ala Tyr 145 15 O 155 160

Asn Glin Gly Met Gly Tyr Met Glin Pro Glin Tyr His Asn Gly Asp Ile 1.65 17O 17s

Ser Asn

SEQ ID NO 23 LENGTH: 81.8 TYPE: DNA ORGANISM: Glycine max FEATURE: OTHER INFORMATION: SEO ID NO: 23 G3875 < 4 OO > SEQUENCE: 23

Cttalacattt CCCtaaaCat aacticacaca cocotottct Ctagggittt C aattt Cacca 6 O tgtttccctt tot Calaatta gggttcCggc gag catggcc gacggtc.cgg cgagt cc agg 12 O cggcggtagc cacgagagcg gcgagcacag ccctc.gctct agcaggacag 18O gtacct cocc atcgctaa.ca taag.ccgcat Catgaagaag gcactacctg cgaacggtaa 24 O aatcgc.caag gacgc.caaag agaccgttca ggaatgcgta tcc.gagttca t cagttt cat 3OO

Caccagcgag gcct citgata agtgtcagag ggaaaagaga aag act atta acggtgatga 360 tittgctctgg gcc atggc.ca citc.ttggttt tgaggattat atcgatcctic ttaaaattta cct cactaga tacagagaga tggagggtga tacga agggit t cagccalagg gcggagactic atctitctaag aaagatgttc agccaagt cc taatgct cag cittgct catc aaggttctitt 54 O ct caca aggt gttagttaca Caatttct Ca gggtcaiacat atgatggttc Caatgcaagg cc.cggagtag gtat caagtt tattalacc ct cctgttgtaa cg tatgttitt ccacgc.cagt 660 taccaagtgc t cacgg cata ttgaatgtct titt tatgtta tgttgaatact gacat aggag 72 O atggttcttg tgtcctttitt ttittaaaaaa. aaaagta agg tttgtatatt atctttggag tcqaattatt atttgaaagt tattatattg taaat CCt 818

SEQ ID NO 24 LENGTH 171 TYPE : PRT ORGANISM: Glycine max FEATURE: OTHER INFORMATION: SEO ID NO: 24 G3875 polypeptide

<4 OOs, SEQUENCE: 24 Met Ala Asp Gly Pro Ala Ser Pro Gly Gly Gly Ser His Glu Ser Gly 1. 5 15 Glu. His Ser Pro Arg Ser Asn Val Arg Glu Glin Asp Arg Tyr Lieu Pro US 7,868,229 B2 69 70

- Continued

25

Ile Ala Asn Ile Ser Arg Ile Met Llys Lys Ala Lell Pro Ala Asin Gly 35 4 O 45

Lys Ile Ala Lys Asp Ala Lys Glu Thir Wall Glin Glu Wall Ser Glu SO 55 6 O

Phe Ile Ser Phe Ile Thir Ser Glu Ala Ser Asp Glin Arg Glu 70 7s

Thir Ile Asn Gly Asp Asp Lieu. Lieu. Trp Ala Met Ala Thr 85 90 95

Lieu. Gly Phe Glu Asp Tyr Ile Asp Pro Lieu Lys Ile Luell Thr Arg 105 11 O

Tyr Arg Glu Met Glu Gly Asp Thr Lys Gly Ser Ala Lys Gly Gly Asp 115 12 O 125

Ser Ser Ser Llys Lys Asp Val Glin Pro Ser Pro Asn Ala Glin Lieu Ala 13 O 135 14 O

His Glin Gly Ser Phe Ser Glin Gly Val Ser Tyr Thir Ile Ser Gln Gly 145 150 155 160

Gln His Met Met Wall Pro Met Glin Gly Pro Glu 1.65 17O

<210s, SEQ I D NO 25 &211s LENGT H: 1065 212. TYPE : DNA <213> ORGANISM: Arabidopsis thal iana 22 Os. FEATU RE: <223> OTHER INFORMATION: SEQ ID NO: 25 G482

<4 OOs, SEQUENCE: 25 tcqacccacg cgt.ccggaca Cttaacaatt Cacac Cttct ct t t t tactic titcCtaaaac 6 O

CCtaaattitc. citcgcttcag tott cocact caagt caacc accaattgaa titcgattitcg 12 O aatcattgat ggaaatgatt tgaaaaaaga gtaaagttta titt titt tatt ccttgtaatt 18O ttcagaaatg ggggattic.cg acagggattic cggtggaggg caaaacggga acaaccagaa 24 O cggacagt cc tocttgttcto Caagaga.gca agacaggttc ttgc.cgatcg ctaacgt cag 3OO ccggat catg aagaaggc ct cgc.caagatc tctaaagatg cCaaagagac 360 gatgcaggag tgttgtctc.cg agttcatcag citt cqt cacc ggagaag cat Ctgataagtg t cagaaggag aagaggaaga cgatcaacgg agacgatttg Ctctgggcta tgact actict aggttittgag gattatgttg agc cattgaa agttt acttg cagaggttta gggagat.cga 54 O aggggagagg actggactag ggaggccaca gactggtggt gaggtoggag agcatcagag agatgctgtc. ggagatggcg gtgggttcta cggtggtggit ggtgggatgc agitat cacca 660 acat catcag titt citt cacc agcagaacca tatgt atgga gcc acaggtg gcgg tag.cga 72 O Cagtggaggt ggagctgc ct ccgg taggac aaggact taa caaagattgg tgaagtggat citctictotgt atatagatac ataaataCat gtatacacat gcc tatttitt acgacccata 84 O taaggitat ct at catgtgat agaacgaa.ca tgatgtaaaa t cagatgtgc 9 OO attalagggitt tagattittga ggctgttgtaa aagaagat.ca agtgtgctitt gttggacaat 96.O aggatt cact aacgaatctg citt cattgga t cittgtatgt aactaaag.cc attgt attga atgcaaatgt titt catttgg gatgctittaa aaaaaaaaaa. aaaaa. 1065

<210s, SEQ I D NO 26 &211s LENGT H: 190 212. TYPE : PRT US 7,868, 229 B2 71 72

- Contin lued <213> ORGANISM: Arabidopsis thaliana 22 Os. FEATURE: <223> OTHER INFORMATION: SEO ID NO: 26 G482 polypeptide

<4 OOs, SEQUENCE: 26

Met Gly Asp Ser Asp Arg Asp Ser Gly Gly Gly Glin Asn Gly Asn. Asn 1. 5 1O 15

Glin Asn Gly Glin Ser Ser Leu Ser Pro Arg Glu Glin Asp Arg Phe Lieu. 25 3O

Pro Ile Ala Asn Val Ser Arg Ile Met Lys Llys Ala Lieu Pro Ala Asn 35 4 O 45

Ala Lys Ile Ser Lys Asp Ala Lys Glu Thir Met Glin Glu. Cys Wall Ser SO 55 6 O

Glu Phe Ile Ser Phe Val Thr Gly Glu Ala Ser Asp Llys Cys Gln Lys 65 70 7s 8O

Glu Arg Llys Thir Ile Asn Gly Asp Asp Lieu. Lieu. Trp Ala Met Thr 85 90 95

Thir Luell Gly Phe Glu Asp Tyr Val Glu Pro Lieu. Llys Val Tyr Lieu. Glin 105 11 O

Arg Phe Arg Glu Ile Glu Gly Glu Arg Thr Gly Lieu. Gly Arg Pro Glin 115 12 O 125

Thir Gly Gly Glu Val Gly Glu. His Glin Arg Asp Ala Val Gly Asp Gly 13 O 135 14 O

Gly Gly Phe Tyr Gly Gly Gly Gly Gly Met Gln Tyr His Glin His His 145 150 155 160

Glin Phe Luell His Glin Glin Asn His Met Tyr Gly Ala Thr Gly Gly Gly 1.65 17O 17s

Ser Asp Ser Gly Gly Gly Ala Ala Ser Gly Arg Thr Arg Thr 18O 185 19 O

SEO ID NO 27 LENGTH: 1347 TYPE: DNA ORGANISM: Physcomitrella patens FEATURE: OTHER INFORMATION: SEO ID NO: 27 G3870

<4 OOs, SEQUENCE: 27 ggcaccagcg aatc.cgt.ctic cgcct cogcc ttctgcacgc gtggttgttgg tcq acct ct c 6 O gccggagcaa. Caggaalacta atc. cottt to cagcactaaa cgattgaagc aattitt t t t t 12 O ttcttgttgaa ctgct cactic totctctgtt atgaggggat tcgaagcttg aaagttatga 18O gctgaaggitt gaggacacgt. aagagcgaag gacgatcata Ctaca attaa. CCCttgcggg 24 O gaaaag.ccca ggcaaaatag gacggatggc cgacagttac ggccacaacg caggttcacc 3OO cgaga.gcagc cc.gcattctg atalacgagtic cggcggc cat taccgtgat c aggacgctt C 360 tgtacgggag Caagaccggit ttittgcc cat cgcaaatgtg agc.cgaatca tgaagaaagc attgc.cat ct aatgcgaaga tat caaaga cgc.caaagag actgtgcagg agtgcgitatic cgagttcatc agttt catta Ctggtgaggc gtc.cgacaag tgtcagaggg aaaagaggaa 54 O gacgat Caac ggggatgact catgagtact cittggittittg aagattatgt ggaac Ctctg alaggtgtacc tacacaagta tcgtgaactg gagggggaga aggcct citat 660 ggcCaagggit ggtgat cagc agggaggaala agaga.gcaac Caaggaggta tgggg.tc.gat 72 O gggCatggca ggcggaat Ca acggcatgaa cggaacgatg aacgggalaca tgcatgggca tggaattic cc gt atcgatgc agatgatgca acagc.cgitac gcgcago agg Cacct Coggg 84 O US 7,868,229 B2 73 74

- Continued gatgatatat t ct cott catc. aaatgatgcc gcaataccag atgc.cgatgc agtctggtgg 9 OO aaac Caac Co. cgcggagtgt aagagtttitc actggcagga ggctittggaa gtggggatat 96.O tgtcgacagc gtgatggggt gttittggagc atgggCaggg Cattatggtg ctgttgaaac agtgatgggit gggtcatgttg aagtgttggc gactgttgaa tgatgaaaac atagaagtga 108 O tgtcgttgaa tittcaagtga aaggaggagc acttitttgtt tggaaaggag 114 O cgtaccgggit Ctggcagtgt acattctgaa tgatagittat Ctgtgctgat ttittcttggc 12 OO cittggcaata cgagggggtt gaatattittg ctittgaattic gttgacattt caa.cottt to 126 O tatgtgaaaa ggctctgtag gatgcaagat aaggaaagac atgcagattg ataaaaaaaa. 132O aaaaaaaaaa. aaaaaaaaaa. aaaaaaa. 1347

SEQ ID NO 28 LENGTH: 218 TYPE : PRT ORGANISM: Physcomitrella patens FEATURE: OTHER INFORMATION: SEO ID NO: 28 G3870 polypeptide

<4 OOs, SEQUENCE: 28

Met Ala Asp Ser Tyr Gly. His Asn Ala Gly Ser Pro Glu Ser Ser Pro 1. 5 1O 15

His Ser Asp Asin Glu Ser Gly Gly His Tyr Arg Asp Glin Asp Ala Ser 25

Wall Arg Glu Gln Asp Arg Phe Lieu. Pro Ile Ala Asn Wall Ser Arg Ile 35 4 O 45

Met Lys Ala Lieu. Pro Ser Asn Ala Lys Ile Ser Asp Ala Lys SO 55 6 O

Glu Thir Wall Gln Glu. Cys Val Ser Glu Phe Ile Ser Phe Ile Thr Gly 65 70 7s 8O

Glu Ala Ser Asp Llys Cys Glin Arg Glu Lys Arg Thir Ile Asin Gly 85 90 95

Asp Asp Luell Leu Trp Ala Met Ser Thr Lieu. Gly Phe Glu Asp Tyr Val 105 11 O

Glu Pro Luell Llys Val Tyr Lieu. His Glu Lell Glu Gly Glu 115 12 O 125

Ala Ser Met Ala Lys Gly Gly Asp Glin Glin Gly Gly Glu Ser 13 O 135 14 O

Asn Glin Gly Gly Met Gly Ser Met Gly Met Ala Gly Gly Ile Asin Gly 145 150 155 160

Met Asn Gly Thr Met Asn Gly Asn Met His Gly His Gly Ile Pro Wall 1.65 17O 17s

Ser Met Glin Met Met Glin Glin Pro Tyr Ala Glin Glin Ala Pro Pro Gly 18O 185 19 O

Met Ile Tyr Ser Pro His Gln Met Met Pro Glin Glin Met Pro Met 195 2O5

Glin Ser Gly Gly Asn Gln Pro Arg Gly Val 21 O 215

SEQ ID NO 29 LENGTH: 787 TYPE: DNA ORGANISM: Physcomitrella patens FEATURE: OTHER INFORMATION: SEO ID NO : 29 G3868 US 7,868,229 B2 75 76

- Continued <4 OOs, SEQUENCE: 29 atc.ccgggca gcgagcacac agctagdaac t ctitt.cggag aat actic cag gcgaaattgg 6 O tcggatggcc gatagotacg gtcacaacgc aggttcacca gaga.gcagcc cgcattctga 12 O taacgagt cc gggggt catt accgagacca ggatgct tct gtacgggaac aggat.cggitt 18O cctgcc catc gcgaacgtga gcc gaat cat galaga aggcg ttgc.cgt.cta atgcaaaaat 24 O titcgaaggac gcgaaagaga Ctgtgcagga gagttcatca gct tcat cac 3OO tggtgagg.cg t cagataagt gccagaggga gaagagaaag acgat Caacg gtgacgactt 360 gctgttgggCC atgagtacac ttggitttcga agattacgtg gag cctotga aggtttacct acacaaatac cgggagctag agggagagaa ggctt CC acg gccaagggtg gtgaccagca aggagggaala galagggagtic alaggtgtt at ggggtCcatg gg tatgtcgg gcggaatgaa 54 O cgg tatgaac ggtacgatga acgggaat at gcatgga cat ggaatcc.cgg tgtcgatgca gatgctgcag Cagtcgtacg gacaggaggc acctic Caggg atgatgtatt CCCCt catca 660 gatgatgc.cg gaataccaga tgc caatgca gtctggtgga aac cago ct c gtggagtgta 72 O ggaggttc.ca cgg.cgaggag aatttgaaat tggggagatt gtCaagggcg tgagggagtg agct cqc 787

SEQ ID NO 3 O LENGTH: 218 TYPE : PRT ORGANISM: Physcomitrella patens FEATURE: OTHER INFORMATION: SEO ID NO: 30 G3868 polypeptide

<4 OOs, SEQUENCE: 30

Met Ala Asp Ser Tyr Gly. His Asn Ala Gly Ser Pro Glu Ser Ser Pro 1. 5 1O 15

His Ser Asp Asin Glu Ser Gly Gly His Tyr Arg Asp Glin Asp Ala Ser 25

Wall Arg Glu Glin Asp Arg Phe Lieu. Pro Ile Ala Asn Wall Ser Arg Ile 35 4 O 45

Met Lys Ala Lieu. Pro Ser Asn Ala Lys Ile Ser Asp Ala Lys SO 55 6 O

Glu Thir Wall Gln Glu. Cys Val Ser Glu Phe Ile Ser Phe Ile Thr Gly 65 70 7s 8O

Glu Ala Ser Asp Llys Cys Glin Arg Glu Lys Arg Thir Ile Asin Gly 85 90 95

Asp Asp Luell Leu Trp Ala Met Ser Thr Lieu. Gly Phe Glu Asp Tyr Val 105 11 O

Glu Pro Luell Llys Val Tyr Lieu. His Glu Lell Glu Gly Glu 115 12 O 125

Ala Ser Thir Ala Lys Gly Gly Asp Glin Glin Gly Gly Glu Gly 13 O 135 14 O

Ser Glin Gly Val Met Gly Ser Met Gly Met Ser Gly Gly Met Asin Gly 145 150 155 160

Met Asn Gly Thr Met Asn Gly Asn Met His Gly His Gly Ile Pro Wall 1.65 17O 17s

Ser Met Glin Met Leul Glin Glin Ser Tyr Gly Glin Glu Ala Pro Pro Gly 18O 185 19 O

Met Met Tyr Ser Pro His Gln Met Met Pro Glu Glin Met Pro Met 195 2O5 US 7,868,229 B2 77 78

- Continued Glin Ser Gly Gly Asn Gln Pro Arg Gly Val 21 O 215

<210s, SEQ ID NO 31 &211s LENGTH: 91 212. TYPE: PRT ORGANISM: Oryza sativa 22 Os. FEATURE: OTHER INFORMATION: SEQ ID NO : 31 G3397 conserved B domain

<4 OOs, SEQUENCE: 31

Arg Glu Glin Asp Arg Phe Lell Pro Ile Ala ASn Wall Ser Arg Ile Met 1. 5 1O 15

Lys Lys Ala Lieu Pro Ala Asn Ala Lys Ile Ser Asp Ala Lys Glu 25 3O

Thir Val Glin Glu. Cys Wall Ser Glu Phe Ile Ser Phe Ile Thir Gly Glu 35 4 O 45

Ala Ser Asp Llys Cys Glin Arg Glu Thir Ile Asn Gly Asp SO 55 6 O

Asp Lieu. Lieu. Trp Ala Met Thir Thir Lieu. Gly Phe Glu Asp Val Asp 65 70 7s 8O

Pro Lieu Lys His Tyr Lell His Phe Arg Glu 85 90

<210s, SEQ ID NO 32 &211s LENGTH: 91 212. TYPE: PRT &213s ORGANISM: Zea mays 22 Os. FEATURE: OTHER INFORMATION: SEQ ID NO: 32 G3435 conserved B domain

<4 OOs, SEQUENCE: 32

Arg Glu Glin Asp Arg Phe Lell Pro Ile Ala ASn Wall Ser Arg Ile Met 1. 5 1O 15

Lys Lys Ala Lieu Pro Ala Asn Ala Lys Ile Ser Asp Ala Lys Glu 2O 25 3O

Thir Val Glin Glu. Cys Wall Ser Glu Phe Ile Ser Phe Ile Thir Gly Glu 35 4 O 45

Ala Ser Asp Llys Cys Glin Arg Glu Thir Ile Asn Gly Asp SO 55 6 O

Asp Lieu. Lieu. Trp Ala Met Thir Thir Lieu. Gly Phe Glu Asp Wall Glu 65 70 7s 8O

Pro Lieu Lys His Tyr Lell His Phe Arg Glu 85 90

<210s, SEQ ID NO 33 &211s LENGTH: 91 212. TYPE: PRT &213s ORGANISM: Zea mays 22 Os. FEATURE: OTHER INFORMATION: SEQ ID NO : 33 G3436 conserved B domain

<4 OOs, SEQUENCE: 33

Arg Glu Glin Asp Arg Phe Lell Pro Ile Ala ASn Wall Ser Arg Ile Met 1. 5 1O 15

Lys Lys Ala Lieu Pro Ala Asn Ala Lys Ile Ser Asp Ala Lys Glu 2O 25 3O

Thir Val Glin Glu. Cys Wall Ser Glu Phe Ile Ser Phe Ile Thir Gly Glu 35 4 O 45

Ala Ser Asp Llys Cys Glin Arg Glu Lys Arg Thir Ile Asn Gly Asp US 7,868,229 B2 79 80

- Continued

SO 55 6 O Asp Lieu. Lieu. Trp Ala Met Thir Thr Lieu. Gly Phe Glu Asp Tyr Val Glu 65 70 Pro Lieu Lys Lieu. Tyr Lieu. His Llys Phe Arg Glu 85 90

SEQ ID NO 34 LENGTH 91. TYPE PRT ORGANISM: Oryza sativa FEATURE: OTHER INFORMATION: SEQ ID NO: 34 G3398 conserved B domain

SEQUENCE: 34

Arg Glu Glin Asp Arg Phe Lell Pro Ile Ala Asn Val Ser Arg Ile Met 5 1O 15

Lys Arg Ala Lieu Pro Ala Asn Ala Lys Ile Ser Lys Asp Ala Lys Glu 25 3O

Thr Val Glin Glu. Cys Val Ser Glu Phe Ile Ser Phe Ile Thr Gly Glu 35 4 O 45

Ala Ser Asp Llys Cys Glin Arg Glu Lys Arg Llys Thir Ile Asn Gly Asp SO 55 6 O

Asp Lieu. Lieu. Trp Ala Met Thir Thir Lieu. Gly Phe Glu Asp Tyr Ile Asp 65 70 7s 8O

Pro Lieu Lys Lieu. Tyr Lieu. His Phe Arg Glu 85 90

SEO ID NO 35 LENGTH 91. TYPE PRT ORGANISM: Glycine ax FEATURE: OTHER INFORMATION: SEQ ID NO : 35 G3474 conserved B domain

SEQUENCE: 35

Arg Glu Glin Asp Arg Phe Lell Pro Ile Ala Asn Val Ser Arg Ile Met 1. 5 1O 15

Llys Lys Ala Lieu Pro Ala Asn Ala Lys Ile Ser Lys Glu Ala Lys Glu 25 3O

Thr Val Glin Glu. Cys Val Ser Glu Phe Ile Ser Phe Ile Thr Gly Glu 35 4 O 45

Ala Ser Asp Llys Cys Glin Lys Glu Lys Arg Llys Thir Ile Asn Gly Asp SO 55 6 O

Asp Lieu. Lieu. Trp Ala Met Thir Thir Lieu. Gly Phe Glu Asp Tyr Val Asp 65 70 7s 8O

Pro Lieu Lys Ile Tyr Lieu. His Tyr Arg Glu 85 90

SEQ ID NO 36 LENGTH 91. TYPE PRT ORGANISM: Glycine ax FEATURE: OTHER INFORMATION: SEQ ID NO: 36 G3478 conserved B domain

SEQUENCE: 36 Arg Glu Glin Asp Arg Phe Lieu Pro Ile Ala Asn Val Ser Arg Ile Met 1. 5 15 Llys Lys Ala Lieu Pro Ala Asn Ala Lys Ile Ser Lys Asp Ala Lys Glu 25 US 7,868,229 B2 81

- Continued

Thr Val Glin Glu. Cys Val Ser Glu Phe Ile Ser Phe Ile Thr Gly Glu 35 4 O 45 Ala Ser Asp Llys Cys Glin Arg Glu Lys Arg Llys Thir Ile Asin Gly Asp SO 55 6 O Asp Lieu. Lieu. Trp Ala Met Thir Thr Lieu. Gly Phe Glu Asp Tyr Val Glu 65 70 7s 8O Pro Lieu Lys Gly Tyr Lieu. Glin Arg Phe Arg Glu 85 90

<210s, SEQ ID NO 37 &211s LENGTH: 91 212. TYPE: PRT <213> ORGANISM: Glycine max 22 Os. FEATURE: <223> OTHER INFORMATION: SEQ ID NO : 37 G3475 conserved B domain <4 OO > SEQUENCE: 37 Arg Glu Glin Asp Arg Phe Lieu Pro Ile Ala Asn. Wal Ser Arg Ile Met 1. 5 1O 15 Llys Lys Ala Lieu Pro Ala Asn Ala Lys Ile Ser Lys Asp Ala Lys Glu 2O 25 3O Thr Val Glin Glu. Cys Val Ser Glu Phe Ile Ser Phe Ile Thr Gly Glu 35 4 O 45 Ala Ser Asp Llys Cys Glin Arg Glu Lys Arg Llys Thir Ile Asin Gly Asp SO 55 6 O Asp Lieu. Lieu. Trp Ala Met Thr Thr Lieu. Gly Phe Glu Asp Tyr Val Glu. 65 70 7s 8O Pro Lieu Lys Gly Tyr Lieu. Glin Arg Phe Arg Glu 85 90

<210s, SEQ ID NO 38 &211s LENGTH: 91 212. TYPE: PRT <213> ORGANISM: Arabidopsis thaliana 22 Os. FEATURE: <223> OTHER INFORMATION: SEQ ID NO: 38 G485 conserved B domain

<4 OOs, SEQUENCE: 38 Arg Glu Glin Asp Arg Phe Lieu Pro Ile Ala Asn. Wal Ser Arg Ile Met 1. 5 1O 15 Llys Lys Ala Lieu Pro Ala Asn Ala Lys Ile Ser Lys Asp Ala Lys Glu 2O 25 3O Thr Val Glin Glu. Cys Val Ser Glu Phe Ile Ser Phe Ile Thr Gly Glu 35 4 O 45 Ala Ser Asp Llys Cys Glin Arg Glu Lys Arg Llys Thir Ile Asin Gly Asp SO 55 6 O Asp Lieu. Lieu. Trp Ala Met Thir Thr Lieu. Gly Phe Glu Asp Tyr Val Glu 65 70 7s 8O Pro Lieu Lys Val Tyr Lieu Gln Lys Tyr Arg Glu 85 90

<210s, SEQ ID NO 39 &211s LENGTH: 91 212. TYPE: PRT <213> ORGANISM: Glycine max 22 Os. FEATURE: <223> OTHER INFORMATION: SEQ ID NO: 39 G3476 conserved B domain

<4 OOs, SEQUENCE: 39 US 7,868,229 B2 83

- Continued Arg Glu Glin Asp Arg Phe Lieu Pro Ile Ala Asn. Wal Ser Arg Ile Met 1. 5 1O 15 Llys Lys Ala Lieu Pro Ala Asn Ala Lys Ile Ser Lys Asp Ala Lys Glu 2O 25 3O Thr Val Glin Glu. Cys Val Ser Glu Phe Ile Ser Phe Ile Thr Gly Glu 35 4 O 45 Ala Ser Asp Llys Cys Glin Arg Glu Lys Arg Llys Thir Ile Asin Gly Asp SO 55 6 O Asp Lieu. Leu Trp Ala Met Thir Thr Lieu. Gly Phe Glu Glu Tyr Val Glu 65 70 7s 8O Pro Lieu Lys Ile Tyr Lieu. Glin Arg Phe Arg Glu 85 90

<210s, SEQ ID NO 4 O &211s LENGTH: 91 212. TYPE: PRT <213> ORGANISM: Glycine max 22 Os. FEATURE: <223> OTHER INFORMATION: SEQ ID NO: 40 G3472 conserved B domain

<4 OOs, SEQUENCE: 4 O Arg Glu Glin Asp Arg Phe Lieu Pro Ile Ala Asn. Wal Ser Arg Ile Met 1. 5 1O 15 Llys Lys Ala Lieu Pro Ala Asn Ala Lys Ile Ser Lys Glu Ala Lys Glu 2O 25 3O Thr Val Glin Glu. Cys Val Ser Glu Phe Ile Ser Phe Ile Thr Gly Glu 35 4 O 45 Ala Ser Asp Llys Cys Gln Lys Glu Lys Arg Llys Thir Ile Asin Gly Asp SO 55 6 O Asp Lieu. Leu Trp Ala Met Thir Thr Lieu. Gly Phe Glu Glu Tyr Val Glu 65 70 7s 8O Pro Lieu Lys Val Tyr Lieu. His Llys Tyr Arg Glu 85 90

<210s, SEQ ID NO 41 &211s LENGTH: 91 212. TYPE: PRT <213> ORGANISM: Zea mays 22 Os. FEATURE: <223> OTHER INFORMATION: SEQ ID NO: 41 G3876 conserved B domain

<4 OOs, SEQUENCE: 41 Arg Glu Glin Asp Arg Phe Lieu Pro Ile Ala Asn. Ile Ser Arg Ile Met 1. 5 1O 15 Llys Lys Ala Ile Pro Ala Asn Gly Lys Ile Ala Lys Asp Ala Lys Glu 2O 25 3O Thr Val Glin Glu. Cys Val Ser Glu Phe Ile Ser Phe Ile Thr Ser Glu 35 4 O 45 Ala Ser Asp Llys Cys Glin Arg Glu Lys Arg Llys Thir Ile Asin Gly Asp SO 55 6 O Asp Lieu. Lieu. Trp Ala Met Ala Thr Lieu. Gly Phe Glu Asp Tyr Ile Glu 65 70 7s 8O Pro Lieu Lys Val Tyr Lieu Gln Lys Tyr Arg Glu 85 90

<210s, SEQ ID NO 42 &211s LENGTH: 91 212. TYPE: PRT <213> ORGANISM: Glycine max US 7,868,229 B2 85 86

- Continued

FEATURE: OTHER INFORMATION: SEO ID NO : 42 G3875 conserved B domain

SEQUENCE: 42

Arg Glu Glin Asp Arg Tyr Lieu Pro Ile Ala ASn Ile Ser Arg Ile Met 1. 15

Llys Lys Ala Lieu Pro Ala Asn Gly Lys Ile Ala Asp Ala Lys Glu 2O 25 3O

Thr Val Glin Glu. Cys Val Ser Glu Phe Ile Ser Phe Ile Thir Ser Glu 35 4 O 45

Ala Ser Asp Llys Cys Glin Arg Glu Lys Arg Thir Ile Asn Gly Asp SO 55 6 O

Asp Lieu. Lieu. Trp Ala Met Ala Thr Lieu. Gly Phe Glu Asp Ile Asp 65 70 7s 8O

Pro Lieu Lys Ile Tyr Lieu. Thir Arg Tyr Arg Glu 85 90

SEQ ID NO 43 LENGTH 91. TYPE PRT ORGANISM: Arabidopsis thaliana FEATURE: OTHER INFORMATION: SEO ID NO : 43 G482 conserved B domain

SEQUENCE: 43

Arg Glu Glin Asp Arg Phe Lieu Pro Ile Ala ASn Wall Ser Arg Ile Met 1. 15

Llys Lys Ala Lieu Pro Ala Asn Ala Lys Ile Ser Asp Ala Lys Glu 2O 25 3O

Thr Met Glin Glu. Cys Val Ser Glu Phe Ile Ser Phe Wall Thir Gly Glu 35 4 O 45

Ala Ser Asp Llys Cys Gln Lys Glu Lys Arg Thir Ile Asn Gly Asp SO 55 6 O

Asp Lieu. Lieu. Trp Ala Met Thir Thr Lieu. Gly Phe Glu Asp Wall Glu 65 70 7s 8O

Pro Lieu Lys Val Tyr Lieu. Glin Arg Phe Arg Glu 85 90

SEQ ID NO 44 LENGTH 91. TYPE PRT ORGANISM: Physcomitrella patens FEATURE: OTHER INFORMATION: SEO ID NO : 44 G387O conserved B domain

SEQUENCE: 44

Arg Glu Glin Asp Arg Phe Lieu Pro Ile Ala ASn Wall Ser Arg Ile Met 15

Llys Lys Ala Lieu Pro Ser Asn Ala Lys Ile Ser Asp Ala Lys Glu 2O 25 3O

Thr Val Glin Glu. Cys Val Ser Glu Phe Ile Ser Phe Ile Thir Gly Glu 35 4 O 45

Ala Ser Asp Llys Cys Glin Arg Glu Lys Arg Thir Ile Asn Gly Asp SO 55 6 O

Asp Lieu. Lieu. Trp Ala Met Ser Thr Lieu. Gly Phe Glu Asp Wall Glu 65 70 7s 8O

Pro Lieu Lys Val Tyr Lieu. His Llys Tyr Arg Glu 85 90 US 7,868,229 B2 87 88

- Continued

SEO ID NO 45 LENGTH 91. TYPE PRT ORGANISM: Physcomitrella patens FEATURE: OTHER INFORMATION: SEQ ID NO: 45 G3868 conserved B domain

SEQUENCE: 45 Arg Glu Glin Asp Arg Phe Lieu Pro Ile Ala Asn. Wal Ser Arg Ile Met 1. 5 1O 15 Llys Lys Ala Lieu Pro Ser Asn Ala Lys Ile Ser Lys Asp Ala Lys Glu

Thr Val Glin Glu. Cys Val Ser Glu Phe Ile Ser Phe Ile Thr Gly Glu 35 4 O 45 Ala Ser Asp Llys Cys Glin Arg Glu Lys Arg Llys Thir Ile Asin Gly Asp SO 55 6 O Asp Lieu. Lieu. Trp Ala Met Ser Thr Lieu. Gly Phe Glu Asp Tyr Val Glu 65 Pro Lieu Lys Val Tyr Lieu. His Llys Tyr Arg Glu 85 90

SEQ ID NO 46 LENGTH: 720 TYPE: DNA ORGANISM: artificial sequence FEATURE: OTHER INFORMATION: SEQ ID NO: 46 P21265 expression vector (35S : : G3397)

SEQUENCE: 46 gcgtctgatt totgalagag gaggaggagg atgc.cggact cqgacaacga citc.cggcggg 6 O cc.gagcaact acgcgggagg ggagctgtcg tcgc.cgcggg agcagga cag gttcCtgcc.g 12 O atcgcgaacg taggaggat catgaagaag gcgctg.ccgg caacgc.cala gat cagcaa.g 18O gacgc.calagg agacggtgca ggagtgcgtc. tcc.gagttca totCctt cat Caccggc gag 24 O gcct Cogaca agtgc.cagcg cdagaa.gc.gc aagac catca acggcgacga cctgctctgg 3OO gccatgacca ccct cqgctt cqaggactac gtcgaccc.cc ticaag cact a cct coacaag 360 titcc.gc.gaga t cagggcga gcgc.gc.cgcc gcct C Cacca ccggcgc.cgg Caccagcgc.c gcct coacca cqc.cgc.cgca gcagcagdac accgc.caatig ccdc.cggcgg ctacgc.cggg tacgcc.gc.cc cqggagc.cgg ccc.cggcggc atgatgatga tigatggggga gcc catgitac 54 O ggct cqc.cgc caccgc.cgcc acagcaggag cagcagcaac accaccacat ggcaatggga ggaagaggcg gct tcggit catcatc.ccggc ggcggcggcg gcggg.tc.gt C gtcgt.cgt.cg 660 gggcacggtc. g.gcaaaac aggggcgcttga catcgct cog agacgagtag catgcac cat 72 O

SEO ID NO 47 LENGTH 735 TYPE: DNA ORGANISM: artificial sequence FEATURE: OTHER INFORMATION: SEQ ID NO: 47 P21314 expression vector (35S: : G3435)

SEQUENCE: 47 cggcggtggc Cttgagctgaggcggcggag catgc.cgga citcggacaac gactic.cggcg 6 O ggcc.gagcaa cqc.cgggggc gagctgtcgt. c9ccgcggga gcaggaccgg ttcctgcc.ca 12 O tcgc.caacgt gagccggat.c atgaagaagg cqctic ccggc caacgc.caag at Cagcaagg 18O

US 7,868,229 B2 103 104

- Continued

<210s, SEQ ID NO 60 &211s LENGTH: 38 212. TYPE: PRT <213> ORGANISM: Arabidopsis thaliana 22 Os. FEATURE: <221 > NAMEAKEY: misc feature <222s. LOCATION: (2) ... (20 <223 is OTHER INFORMATION: Xala can be any naturally occurring amino acid 22 Os. FEATURE: <221 > NAMEAKEY: misc feature <222s. LOCATION: (22) ... (24) <223 is OTHER INFORMATION: Xala can be any naturally occurring amino acid 22 Os. FEATURE: <221 > NAMEAKEY: misc feature <222s. LOCATION: (26) ... (29) <223 is OTHER INFORMATION: Xala can be any naturally occurring amino acid 22 Os. FEATURE: <221 > NAMEAKEY: misc feature <222s. LOCATION: (31) ... (37) <223 is OTHER INFORMATION: Xala can be any naturally occurring amino acid 22 Os. FEATURE: <223> OTHER INFORMATION: SEO ID NO : 60 highly conserved residues within HFM and B domain

<4 OOs, SEQUENCE: 60

ASn Xala Xala Xala Xala Xala Xala Xala Xala Xaa Xala Xala Xala Xala Xala 1. 5 15

Xaa Xala Xala Xala Lys Xaa Xaa Xala Glin Xaa Xaa Xala Xala Glu Xaa Xala 2O 25 3O

Xaa Xala Xala Xala Xala Glu 35

<210s, SEQ ID NO 61 &211s LENGTH: 574 &212s. TYPE: DNA <213> ORGANISM: cauliflower mosaic virus 22 Os. FEATURE: <223> OTHER INFORMATION: SEQ ID NO : 61 CaMV 35S constitutive promoter <4 OOs, SEQUENCE: 61 gcggattic cattgcc.cagct atctgtcact ttattgttgaa gat agtgaaa aagaaggtgg 6 O

Ctcc tacaaa to cat catt gcgataaagg aaaggc.catc gttgaagatg cct Ctgc.cga 12 O Cagtggtc.cc aaagatggac C cccacccac gaggagcatc gtggaaaaag alagacgttcC 18O alaccacgt.ct tcaaagcaag tattgatg tatggit cog attgagacitt ttcaacaaag 24 O ggtaatat co ggaaacct co toggattic cattgcc cagot atctgtcact ttattgttgaa 3OO gatagtggaa aaggaaggtg gct Cotacaa atgc.cat cat ticgataaag gaaaggc cat 360 cgttgaagat gcctctgc.cg acagtggit co caaagatgga CCC cc accca Caggagcat 42O cgtggaaaaa galagacgttc Caaccacgt.c ttcaaagcaa gtggattgat gtgat at ct c 48O cactgacgta agggatgacg cacaatcc.ca citatic ctitcg caaga ccct t c ct ctatata 54 O aggaagttca ttt catttgg agaggacacg Ctg a.

<210s, SEQ ID NO 62 &211s LENGTH: 2365 &212s. TYPE: DNA <213> ORGANISM: Arabidopsis thaliana 22 Os. FEATURE: <223> OTHER INFORMATION: SEO ID NO : 62 ARSK1 (Root-specific Kinase 1) root - specific promoter

<4 OOs, SEQUENCE: 62 gg.cgagtgat gigtatattta ttggttgggc titaaatatat titcagatgca aaaccatatt 6 O US 7,868,229 B2 105 106

- Continued gaat caataa attataaata catagct tcc ctaac cactt aaaccaccag ctacaaaacc 12 O aataaacccg atcaat catt atgttitt cat aggattt cott gaa catacat taaattattt 18O tt cattttct toggtgct citt ttctgtc.tta t t cacgttitt aatggacata atcggitttca 24 O tattgtaaat citctittaa.cc taacgaacaa tittaatgacc ctagtaatag gataagaagg 3OO tcqtgaaaaa tdaacgagaa aaaac ccacc aaaac act at ataagaaaga ccgaaaaagt 360 aaaaagggtg agc cataaac caaaaacctt accagatgtt gtcaaagaac aaaaatcatc 42O atcCatgatt aacctacgct t cact act aa gacaaggcga ttgttgtc.ccg gttgaaaagg 48O ttgtaaaa.ca gtttgaggat gct acaaaag tatgttaa gtatgaagcg gctalaggttt 54 O tggatttggt Ctaggagcac attggittaag caat atcttic ggtggagatt gagtttittag 6OO agatagtaga tactaattica totatggaga catgcaaatt catcaaaatg cittggatgaa 660 ttagaaaaac taggtggaga atacagtaaa aaaattcaaa aagtgcatat ttittggaca 72 O acattaatat gtacaaatag titt acattta aatgt attat titt actaatt aagtacatat 78O aaagttgcta aactaalacta atataattitt tdcataagta aattitat cqt taaaagttitt 84 O

Ctttctagoc actaaacaac aatacaaaat cqc coaagtc acccattaat taatttagaa 9 OO gtgaaaaa.ca aaatcttaat tatatggacg atc.ttgtcta ccatatttca agggctacag 96.O gcct acagcc gcc gaataaa tottaccago cittaalaccag aacaacggca aataagttca O2O tgtgg.cggct ggtgatgatt cacaatttico Cogacagttc tatgataatgaaactatata O8O attattgtac gtacatacat gcatgcgacg aacaacactt caatttaatt gttag tatta 14 O aattacattt at agtgaagt atgttgggac gattagacgg atacaatgca cittatgttct 2OO ccggaaaatgaat catttgt gttcagagca tact coaag agt caaaaaa gtt attaaat 26 O ttatttgaat ttaaaactta aaaatagtgt aatttittaac caccc.gctgc cqcaaacgtt 32O ggcggaagaa tacgcggtgt taaacaattt ttgttgat cqt ttcaaacat ttgta accgc 38O aatc to tact gcacaatctg ttacgtttac aatttacaag titagtataga agaacgttcg 44 O tacctgaaga cca accqacc tittagittatt gaataaatga ttatttagtt aagagta aca SOO aaat caatgg ttcaaatttgtttct ctitco ttact tc.tta aattittaatc atggaagaaa 560 caaagttcaac giga catccaa titatggccta atcatct cat t ct cotttca acaaggcgaa 62O tcaaatctitc tittatacgta at atttattt gccagotctgaaatgitat acc aaatcattitt 68O taaattaatt gcc taaatta ttagaacaaa aact attagt aaataactaa ttagt ctitat 74 O gaaact agaa atcgagatag toggaatatag agaga cacca ttaaatt cac aaaat cattt 8OO ttaaattacc taaatt atta caacaaaaac tattaga cag aactaagttct ataatgaaac 86 O gagagat.cgt atttggaatg tagagcgaga gacaattitt C aattic attga atatataagc 92 O aaaattatat agc.ccgtaga Ctttggtgag atgaagttcta agtacaaaca actgaatgaa 98 O tittataatca ataatattga ttatattgtg attagaaaaa gaaaacaact togcgittattt 2O4. O ttcaat atta ttgtgaggat taatgtgaac atggaatcgt gtttct c ct g aaaaaaatat 21OO cagdatagag cittagaacaa tataaatata t coaccaaaa ataacttcaa catttittata 216 O caactaatac aaaaaaaaaa aagcaaactt tttgtatata taaataaatt tdaaaactica 222 O alaggtoggtc agtacgaata agacacaiaca act actataa attagaggac tittgaagaca 228O agtaggittaa citagaa catc cittaatttct aaacctacgc actictacaaa agatt catca 234 O aaaggagtaa aagactaact ttct c 23.65 US 7,868,229 B2 107 108

- Continued <210s, SEQ ID NO 63 &211s LENGTH: 922 &212s. TYPE: DNA <213> ORGANISM: Solanum lycopersicum 22 Os. FEATURE: <223> OTHER INFORMATION: SEQ ID NO: 63 RSI1 (Root System Inducible 1) promoter

<4 OOs, SEQUENCE: 63 caat caacta aatggactitt tottgttgcat tdgtoccatt tttacgc cct aatatt cqct 6 O tacttgcttt tttgtattitt atttattitta gttittaattt tat citacct c caaattgata 12 O gaaataatta cacttatagt ccttittgaaa aattataatt at agcattca agtaaataaa 18O aatacg tatt tttagt cact ttgtaatgta taattittgag titgaaaatgt atcaaaagta 24 O aatttatatt cittaagatat ggataaagtt tacatataca ttatc.cgttt catacccitat 3OO ttatag tatt acattgcata agittattgta gat cittgatc gaaagtatgt gat attaata 360 ctatttittag aattatgtta ttct cagtta tdgagtgata tittaaaatca atatagtata 42O tcqataatca gatagtttaa ttct tattitt ct c catccaa tittatataat gat attataa 48O t caattittac gaatgagatg gat attittga aatttittagt ttaaaataaa ttittaaattit 54 O tttgtgggtc. tataaattat ctaattaaga gqtaaaatagaaagtttgaa attaattatt 6OO actitactaaa tatataaata tdt catttitt tottaaactg atttagaaga aaagagtgtc 660 atatacatgg acagaacgaa tataatttga taattaaatt totaaagatt catagittaat 72 O agggat caaa attgcacgta t cc attacta taaggtoata tittgctt cat aaaaatcatc 78O aggatcaaaa atcagaattt at attatatt tdagggacta aaaatgctaa tat cacaaat 84 O taaaattagt ctataaat at t cacactitta citcttctaat tccatcaaat atttic cattt 9 OO at cit to tott citt cittaa at at 922

<210s, SEQ ID NO 64 &211s LENGTH: 1.OO9 &212s. TYPE: DNA <213> ORGANISM: Solanum lycopersicum 22 Os. FEATURE: <223> OTHER INFORMATION: SEQ ID NO: 64 RBCS3 (Ribulose 1,5-bisphosphate carboxylase, small subunit 3) photosynthetic tissue-specific promoter

<4 OOs, SEQUENCE: 64 aaatggagta atatggataa toaacgcaac tatatagaga aaaaataata gcqctaccat 6 O atacgaaaaa tagtaaaaaa ttataataat gattcagaat aaatt attaa taactaaaaa 12 O gcqtaaagaa ataaattaga gaataagtga tacaaaattig gatgttaatg gat actt citt 18O ataattgctt aaaaggaata caagatggga aataatgtgt tatt attatt gatgtataaa 24 O gaatttgtac aattitttgta t caataaagt to caaaaata atctittaaaa aataaaagta 3OO c cctitt tatgaactttittat caaataaatgaaatccaata ttagcaaaac attgat atta 360 ttactaaata tttgttaa at taaaaaatat gtcatttitat tttittaa.cag at atttittta 42O aagtaaatgt tataaattac gaaaaaggga ttaatgagta t caaaac agc ctaaatggga 48O ggagacaata acagaaattit gctgtag taa gotggcttaa gtcat cattt aatttgatat 54 O tataaaaatt ctaattagtt tatagt ctitt cittitt cotct tttgtttgtc. ttg tatgcta 6OO aaaaaggitat attatat cita taaattatgt agcataatga cca catctgg catcatctitt 660 acacaattica cctaaatat c ticaag.cgaag titttgccaaa actgaagaaa agatttgaac 72 O aaccitat caa gtaacaaaaa toccaaacaa tatagt catc tat attaaat cittittcaatt 78O US 7,868,229 B2 109 110

- Continued gaagaaattig toaaagacac atacct citat gagtttitttc atcaatttitt ttittctttitt 84 O taaactgt at ttittaaaaaa at attgaata aaa catgtcc tatto attag tittgggaact 9 OO ttaagataag gag tdtgtaa titt cagaggc tattaattitt gaaatgtcaa gagccacata 96.O atcCaatggit tatggttgct Cttagatgag gtt attgctt taggtgaaa 1 OO9

<210s, SEQ ID NO 65 &211s LENGTH: 2244 &212s. TYPE: DNA <213> ORGANISM: Arabidopsis thaliana 22 Os. FEATURE: <223> OTHER INFORMATION: SEQ ID NO: 65 SUC2 (Sucrose-proton Symporter) vascular-specific promoter

<4 OOs, SEQUENCE: 65 aact aggggit gcataatgat ggaacaaag.c acaaatc.ttt taacgcaaac taact acaac 6 O cittcttittgg ggit coccatc ccc.gacccta atgttittgga attaataaaa ctacaat cac 12 O ttaccaaaaa ataaaagttcaaggccacta taatttct ca tatgaacct a catttataaa 18O taaaatctgg titt cat atta attt cacaca ccaagttact ttctattatt aactgttata 24 O atggac catgaaatcatttg catatgaact gcaatgatac ataatccact ttgttttgttg 3OO ggagacattt accagatttic ggtaaattgg tatt ccc cct tittatgtgat togg to attga 360 t cattgttag toggc.ca.gaca tttgaact co cqtttittttgtctataagaa titcggaaaca 42O tatagitat cotttgaaaacg gagaaacaaa taacaatgtg gacaaac tag atata atttic 48O aacacaagac tatgggaatg attitt accca ctaattataa toccatcaca aggtttcaac 54 O gaac tagttt to cagatat c aaccalaattt actittggaat taalactaact taaaactaat 6OO tggttgttcg taaatggtgc titttittttitt tdcggatgtt agtaaagggit tittatgtatt 660 titat attatt agittatctgt titt cagtgtt atgttgtct c atc cataaag tittatatgtt 72 O ttitt ctittgc tictata actt atatatatat atgagtttac agittatattt atacatttca 78O gatacttgat cqgcatttitt tttgg taaaa aatatatgca tdaaaaactic aagtgtttct 84 O tttittaagga atttittaaat gigtgattata tdaatataat catatgtata t cc.gtatata 9 OO tatgtagcca gatagittaat tatttggggg at atttgaat tattaatgtt ataat attct 96.O ttcttittgac togtotggitt aaattaaaga acaaaaaaaa cacatactitt tactgttitta O2O aaaggittaaa ttaa.cataat ttattgatta caagtgtcaa gtc catgaca ttgcatgtag O8O gttctgagact t cagagatala C9gaagagat cataattgt gatcgtaac a ticcagatatg 14 O tatgtttaat titt catttag atgtggat.ca gagaagataa gtcaaactgt citt cataatt 2OO taagacaa.cc ticttittaata ttitt.cccalaa acatgttitta totaact act ttgct tatgt 26 O gattgcctgaggatactatt attct ctdtc tittatt citct t cacaccaca tittaaatagt 32O ttaa.gagcat agaaattaat tattittcaaa aaggtgatta tatgcatgca aaatago aca 38O c catttatgt ttatattittcaaattattta atacatttca at attt cata agtgtgattit 44 O tttitttittitt tdt caattitc ataagtgtga tttgt cattt gtattaalaca attgtat cqc SOO gcagtacaaa taalacagtgg gagaggtgaa aatgcagtta taaaactgtc. caataattta 560 ctaacacatt taaatat cita aaaagagtgt ttcaaaaaaa attcttittga aataagaaaa 62O gtgatagata tttittacgct titcgtctgaa aataaaacaa taatagttta ttagaaaaat 68O gttatcaccg aaaattatto tagtgccact cqct cqgatc gaaattic gaa agittatatt c 74 O tittct ctitta cctaatataa aaatcacaag aaaaatcaat ccgaatatat citat caacat 8OO US 7,868,229 B2 111 112

- Continued agtatatgcc cittacatatt gtttctgact tttct citatic cqaatttctic gott catggit 1860 tttitttittaa catatt ct catttaattittc attact atta tataactaaa agatggaaat 1920 aaaataaagt gtctttgaga atcgaacgt.c catat cagta agatagtttgttgttgaaggta 198O aaatctaaaa gatttaagtt coaaaaacag aaaataatat attacgctaa aaaagaagaa 2O4. O aataattaaa tacaaaacag aaaaaaataa tatacga cag acacgtgtca cqaagatacc 21OO citacgctata gacacagotc tdttittct ct tttctatgcc ticaaggct ct cittaact tca 216 O ctgtct cotc titcggataat colt atcct tc. tct tcctata aatacct ct c cactict tcct 222 O

CttcCt CCaC Cactacaa.cc acca 2244

<210s, SEQ ID NO 66 &211s LENGTH: 1176 &212s. TYPE: DNA <213> ORGANISM: Arabidopsis thaliana 22 Os. FEATURE: <223> OTHER INFORMATION: SEQ ID NO: 66 CUT1 (Cuticular Wax Condensing Enzyme 1) epidermal-specific promoter

<4 OOs, SEQUENCE: 66 tgttgaattat attt tact ct tcgatat cqg ttgttgacga ttaac catgc aaaaaagaaa 6 O cattaattgc gaatgtaaat aacaaaacat gtaactic titg tagatataca totat caca 12 O tittaaacccg aatatatatg tatacctata atttctic tda ttitt cacgct acctgccacg 18O tacatgggtgataggit coaa act cacaagt aaaagtttac gtacagtgaa titcgt.cttitt 24 O tgggtataaa cqtacattta atttacacgt aagaaaggat tacca attct tt catttatg 3OO gtaccagaca gagtta aggc aaacaagaga aacatataga gttittgatat gttitt Cttgg 360 ataaat atta aattgatgca at atttaggg atggacacaa gotaatatat gcc ttittaag 42O gtatatgtgc tatatgaatc gtttcgcatg gg tactaaaa ttatttgtcc titactittata 48O taaacaaatt coaacaaaat caagtttittg ctaaaac tag tittatttgcg ggittatttaa 54 O ttacctatica tattacttgt aatat cattc g tatgttaac gigg taaacca aaccaaacco 6OO gatattgaac tattaaaaat cittgtaaatt tdacacaaac taatgaatat citaaattatg 660 ttactgctat gataacgacc attitttgttt ttgagaacca taatataaat tacaggtacg 72 O tgacaagtac taagtattta tat coacctt tagt cacagt accaatattg cqcct accogg 78O gcaacgtgaa cqtgat catc aaatcaaagt agttaccaaa cqctittgat c ticgataaaac 84 O taaaagctga cacgtc.ttgc tigttt cittaa tittatttct c ttacaacgac aattittgaga 9 OO aatatgaaat ttittatat cq aaagggaaca gtc.ctitat catttgctic cca toacttgctt 96.O ttgtctagtt acaactggaa atcgaagaga agtattacaa aaa catttitt citcgt cattt 1 O2O ataaaaaaat gacaaaaaat taaatagaga gcaaa.gcaag agcgttgggit gacgttggit c 108 O tott cattaa citcctcitcat citacccct tc citctgttcgc ctittatatoc titcacct tcc 114 O

Ctect ct catc titcattaact Cat Cttcaaa aatacc 1176

SEO ID NO 67 LENGTH: 847 TYPE: DNA ORGANISM: Arabidopsis thaliana FEATURE: OTHER INFORMATION: SEQ ID NO: 67 LTP1 (Lipid Transfer Protein 1) epidermal-specific promoter

SEQUENCE: 67 US 7,868,229 B2 113 114

- Continued tcgacccacg cgt.ccgagcg titt.cgtagaa aaatticgatt tctictaaagc cctaaaacta 6 O aaacgacitat coccaatticc aagttctagg gtttccatct tcc ccaatct agtataaatg 12 O gcggatacgc Ctt cago Cc agctggagat ggcggagaaa gcggcggttc C9ttagggag 18O caggat.cgat acct tcct at agctaatat c agcaggat.ca tgaagaaagc gttgcct cot 24 O aatggtaaga ttggaaaaga tigctaaggat acagttcagg aatgcgt.ctic tdagttcatc 3OO agct tcatca ctagdgaggc cagtgataag ttcaaaaag agaaaaggala aactgttgaat 360 ggtgatgatt tttgttgggc aatggcaa.ca ttaggatttg aggattacct ggalacct cta aagatatacc tagcgaggta Cagggagttg gagggtgata atalagggat.c aggaaagagt ggagatggat Caaatagaga tigctggtggc ggtgtttctg gtgaagaaat gcc.gagctgg 54 O taaaagaagt to aagtagt gattalagaac aatcgc.caaa tgatcaaggg aaattagaga t cagtgagtt gtttatagitt gagctgat cq acaactattt cgggitt tact ct caattitcg 660 gttatgttag tittgaacgtt toggtttattgttt coggittt agttggttgt atttaaagat 72 O ttct Ctgtta gatgttgaga acacttgaat gaaggaaaaa tttgtccaca toctdttgtt attitt.cgatt cactitt cqga attt catago taatttatto to atttaata CCaaatcCtt 84 O aaattaa. 847

SEQ ID NO 68 LENGTH: 48O1 TYPE: DNA ORGANISM: Arabidopsis thaliana FEATURE: OTHER INFORMATION: SEO ID NO : 68 AS1 (emergent leaf primordia- specific) promoter

SEQUENCE: 68 ggaccgtgta atgggc catt gggccaagtt ttcttgatat aaaatctgaa atact actaa 6 O attacaattt ttcttaaact cqattt cata att catgtgg gacticagttc. tcc.gc.gt citt 12 O atgact taag agittaa.gagt aaaga caatt gattgtagtt tgcatt atta aggttgttgat 18O tittaaaggct at attggc cc aggcaaagtg gtt atgaaag ttaaaaggta ttattaaatg 24 O tcgittatgga ctagot aaag aaaagagatg gatatagaaa cggatttgcc agtttgtgag 3OO gttacgtact cqttacttitc tattgcattt ttgttgttgtca ttgtgcttgt gatttctitta 360 gtatatgttt ttcttitttgt caaactictitt agtacatgtt atgctittatt ttcttgttta gcattgtt at tdttattittg atc catgttc titt tact taa tgttgtagagt gttcacgtac gact ctittat gatcgctata ctaatatact atgaaact cq aatgagaaca tdcatgtcat 54 O aaat caataa alacatalacat acgacactta acctaaatca tacatt catt gatt catact at catgat co toat cacatt agtat cattt gtc.tttattt attacttagc tactt cqtta 660 tott attata t ctittacctg ttctgctggit catttgc cat aaacaccaag tacaa.gcaac 72 O t ctittagt cc aatat cagac caaattaa.ca aacattt coc caatccaaaa cqgaaattta attataatta gcatttaaat aggttcgatt acaaaaaaaa atcaacaaag gaacaagtica 84 O attt cataat gigtttgttcaa ttgtcacaca acgaaatggc tagc.cggat C aag catgcat 9 OO gatccaaatt toaacatttic catgataacc tdaattataa cgt.ctacata aaccatattt 96.O aaataaatag gatggit caa agatat catt aaaagaacga ttcaatatt c tittattgttc aattgataca catgttatt c ticcittaacca gttatgaaca tgtcc tacaa gtttcttgac 108 O c caaact cat aattt catat accataatcc caagttaagt tttitttittitt toggggat cala 114 O US 7,868,229 B2 115 116

- Continued aatctoaagt taagttaagt toaattattt agctgtaatig citcggaaaaa agat.cggatg 2OO aatat coaat tdgttcaata tat accc.caa to cqgccaat citc cctatot ttatagotta 26 O attattagag aatggit caat t cacgc.catc agaac cagtt toatat ctitc atgaaccaaa 32O acgc.ctacaa ccct attatt caagaaatca citataattgt ccaagtaaaa ccattaatta 38O accoagt cqa tttittctato gtc.ctatagg catgttgtta citcaaactac tdattaatta 44 O ataagaagtt gtagtttgaa aaagaatcta gctgaaaaat act cotactic taagaattta SOO agittagaata aaa.cat atta atacaaatat aaaaatttag titattaaaaa agcgctact a 560 c caagacgt.c ctaaagaaaa act agctttgttcttctaaaa gaaaacctag cittaact acc 62O caaaaaaatc tagttttaca aac actaaag acaaattitta tttittcaa.ca aatttaccala 68O ttaaagaaaa titc catgtag gaatgitat co aaattgaaaa tat coctaca tattttgtag 74 O gaaaaaaggt ttittataaat attaaaaaaa Cagaaaaag aaaagagaaa agagaaaaaa 8OO aaaag.ccgga gagaatggag cacatgaggit aaaaggcaag agatggcaga gagaagat.ca 86 O gaga agggat citgcct caat ttgacaactic atatgtcatgtcatttic cct cactact att 92 O attitt.cct at ttcaaaaa.ca cctitt citctg ataccat cac ctitttacctt citc.tttittitt 98 O ttactgtc.tt togctctgttt cacatt.ccct tctatatata cagtatagta tattittatcc 2O4. O ttcttittatt gttittgctta ctaaaagttt tttitcct cog gaatcaaaat t ctaaaatgt 21OO at at catgtt aggtogcgag ggc catgcaa tatt atgaac tatgcatgat gattaatgtc. 216 O tgtggat.cca t cacaaat at tattgaaggit tat cagaga citatggacca aaatggit cog 222 O aatcgc.ctga taataaaaaa citatt cattt ttatttittta tttitttittat taalacatgtg 228O attaatgata gat cittacga titcgcaactg ggaaacatgc actaact caa acttaaaa.ca 234 O cacaatacta aaagttct at taaattittga atgtaaagag aaatatatta ggcaatcaaa 24 OO cggit caagta aat catacac atcgataatt tatttittitta t cottcaaag caggcc catc 246 O caaggcc.cac cactatt citc at atcaa.cat acttittcttgttittggittaa atcaacctac 252O catgttggct gttct citccg ctic ct ctdtg taagat caca cca acaccac togcataattit 2580 cittg tatt at tittgagacitt gagagtaaac tdattgacaa aaaaaaaaaa aaaaaaaaaa 264 O aattgagagt aaactagttt Cttgaatatt gattittittca gcttaatttgttggggaaag 27 OO at attactac tattgctgta aaaaaaaaaa aaaaaaaaaa agatattatt act at atttg 276 O tagtgattitt attittgaaaa ttct citt cac tttitttgtag ttaac attct aattttgttga 282O aaagaactitt taatgtcagg to atgtct ct taaaaagttt gcatgatgaa atgatttaca 288O aattacaata gaaaatggaa accattgcaa actaaattitt tat caaaaaa aatcgaaaat 294 O aaaatgtatt gacittagtaa togctgtgtct gctacgatta act attacac ataatgcaac 3 OOO actgaattat coaaatacat tattagaata atagt attac agitat cact a ttacaacaac 3 O 6 O aatgtcaa.ca ataatctitat tataataata tataaataga cct tagtgac atcat attat 312 O atagaaaa.ca ttggttgcc taatttgt at aagctagata cittgggggtg atgagtgact 318O agttgatgca atgataaaag agtgaaagtt ttgtctgcct gattataga C gtcggagaaa 324 O tact aaaata cqctatgaag attittggcgc atggtag cag aaaaaaaaaa C9gagggtgt 33 OO gagtgagtag titagt cqg atgtgatgga acaaagaaaa gtatttittgg tagggittatg 3360 ggagagagaa goggaccatt attacacact tacatgctitt coccaaaaga taccatt coc 342O attittctgac acgtgtcc cc ct catc.ccca attact cata cqtcaaatcc aatttittagc 3480 ctaaaagttt tttittatttgtttagccaaa totattittac taattaaagt tttcaaatgg 354 O US 7,868,229 B2 117 118

- Continued caaatagaaa gat cittctaa gottttataa aattacttga ttatttctag titttgct cat 36OO tttittaaata aaatttct ct ttitttittctt gcaac attat tdatttittitt tttgataggg 366 O agtaac atta gtgatgttct atctottct c attgcaaaaa ctittattitt c ticatctotat 372 O ttgat catca ttgcgaaatc titc catttitc aacaaatact titt coatgtt aatatgctgt 378 O ttcaaaatat aagttgtttgg aaaataaatc aacaagttta aatgttaact attitt tatgc 384 O tattataatt atttitt citta togggtaagtg gaaattaatgttact caaat tdgacataaa 3900 attctattgt ttgagtgaag gagtttataa atggagcatt attitt cittga atggittagtt 396 O tittcttctat cattttgaca agtaaatgac ttitt cagcca citaaagtaca acactttitt c 4 O2O atttaaattit aaa.gcatc.cc ctacattaga ttgtcattitt atttct cata atgttataga 4 O8O aaaatgaatt ttgagat.ccc aatgtagtaa atatatataa aaaaaggttt aatattgtca 414 O atgacaaaca acgaactitat ggaatttcaa cittitt cacct c cacgc.gc.ct c to tcagagt 42OO tttittttitt c cccacttgtg atgtaaaaag giggaaaacgt ctdtgtctica gtcgg taaac 426 O tttittct citc tittttitttitt taaagattitt attittaatta togcc.gtc.tct gtggtctaat 432O cgtgtacgt.c gtctggittitt aaaagcct ct ct cactittgg totttitcgtt ttct citcttic 438 O

Cattitt ct co alactatataa aaaaaaaaaa gtgagagaga gagcaaatct gtgttgatgga 4 44 O agttgct citt gagtttggga ttatttatct tttcaat atc atttggtaag catttittatt 4500 ttgttittata gtaataattt taact ct citt atctt cittaa taagttctittg cittaatagtg 456 O ttittgggg.tc ago attaatt t cc cctgttt ggttt coaga atataggttg tat agtgtga 462O taataacaaa ttatt coaag titttgct tca aac attgtca aagttitttgt cattitt catt 468O t cittgaaacg gaaatttitt c agacitttgta atttctaatt cqaaaatticg acagat cittg 474. O tagatttgtt togatcttitt agagttittga attggagaga tittatgaaac giggttgattit 48OO t 48O1

<210s, SEQ ID NO 69 &211s LENGTH: 1510 &212s. TYPE: DNA <213> ORGANISM: Arabidopsis thaliana 22 Os. FEATURE: <223> OTHER INFORMATION: SEQ ID NO: 69 RD29A (Desiccation-responsive 29a) stress inducible promoter

<4 OOs, SEQUENCE: 69 ggttgctatg gtagggacta tigggtttitc ggatt CC ggt ggaagtgagt ggggaggcag 6 O tggcggaggit aagggagttcaagattctgg aactgaagat ttggggttitt gcttittgaat 12 O gtttgcgttt ttg tatgatg cct citgtttgttgaactittga tigt atttitat citttgttgttga 18O aaaagagatt gggittaataa aatatttgct tttittggata agaaact citt ttagcggcc c 24 O attaataaag gttacaaatg caaaatcatgttagcgt cag at atttaatt atticgaagat 3OO gattgttgata gatttaaaat tat cotagtc. aaaaagaaag agtaggttga gcagaaa.ca.g 360 tgacatctgt tdtttgtaccatacaaatta gtttagatta ttggittaa.ca tottaaatgg 42O citatgcatgt gacatttaga ccttatcgga attaatttgt agaattatta attaagatgt 48O tgattagttcaaacaaaaat titt at attaa aaaatgtaaa cqaat attitt g tatgttcag 54 O tgaaagtaaa acaaattaaa ttaacaagaa acttatagaa gaaaatttitt act atttaag 6OO agaaagaaaa aaatctat catttaatctgagtic ctaaaaa citgttatact taacagttaa 660 cgcatgattt gatggaggag C catagatgc aattcaatca aactgaaatt totgcaagaa 72 O US 7,868,229 B2 119 120

- Continued t citcaaacac ggagat ct ca aagtttgaaa gaaaattitat ttctt cact caaaacaaac 78O ttacgaaatt tagg tagaac ttatatacat tatattgtaa tttitttgtaa caaaatgttt 84 O ttattatt at tatagaattt tactggittaa attaaaaatgaatagaaaag gtgaattaag 9 OO aggagagagg aggtaaac at titt Cttct at tttitt catat titt caggata aattattgta 96.O aaagtttaca agattt coat ttgactagtg taaatgagga at attcticta gtaagat cat O2O tattt catct acttcttitta t cittctacca gtagaggaat aaacaatatt tagct cottt O8O gtaaatacaa attaattitt.c citt cittgaca to attcaatt ttaattittac gtataaaata 14 O aaagat cata cct attagaa cattaagga gaaatacaat t caatgaga aggatgtgcc 2OO gtttgttata ataaacagcc acacgacgta aacgtaaaat gaccacatga tigggc caata 26 O gacatggacc gac tactaat aatagtaagt tacattt tag gatggaataa atat catacc 32O gacatcagtt ttgaaagaaa agggaaaaaa agaaaaaata aataaaagat atact accga 38O

Catgagttcc aaaaagcaaa aaaaaagat.c aag.ccgacac agacacgcgt agaga.gcaaa 44 O atgactittga cqt cacacca cqaaaacaga cqctt catac gtgtc. cctitt atctotctica SOO gtct ct citat 510

<210 SEO ID NO 70 &211s LENGTH: 16 212. TYPE: PRT <213> ORGANISM: Arabidopsis thaliana 22 Os. FEATURE: <221 > NAMEAKEY: misc feature <222s. LOCATION: (3) . . (3) 223 OTHER INFORMATION: Xaa can be Lys or Arg 22 Os. FEATURE: <221 > NAMEAKEY: misc feature <222s. LOCATION: (8) . . (8) 223 OTHER INFORMATION: Xaa can be any naturally occurring amino acid 22 Os. FEATURE: <221 > NAMEAKEY: misc feature <222s. LOCATION: (11). . (11) 223 OTHER INFORMATION: Xaa can be any naturally occurring amino acid 22 Os. FEATURE: 223 OTHER INFORMATION: CBF conserved consecutive amino acid residues

<4 OO > SEQUENCE: 7 O

Pro Lys Xaa Pro Ala Gly Arg Xaa Llys Phe Xaa Glu Thir Arg His Pro 1. 5 15

<210s, SEQ ID NO 71 &211s LENGTH: 5 212. TYPE: PRT <213> ORGANISM: Arabidopsis thaliana 22 Os. FEATURE: <223> OTHER INFORMATION: CBF conserved consecutive amino acid residues

<4 OOs, SEQUENCE: 71

Asp Ser Ala Trp Arg 1. 5 US 7,868,229 B2 121 122 What is claimed is: photosynthetic-tissue specific-promoter, a vascular-spe 1. A transformed plant comprising a recombinant poly cific promoter, an epidermal-specific promoter, an emer nucleotide that encodes a CCAAT-box polypeptide having a gent leaf primordia-specific promoter, and a stress B domain with at least 95% amino acid identity with SEQID inducible promoter, and NO:31: 5 as a result of said overexpression the transformed plant wherein said transformed plant has a phenotype of accel flowers earlier than the control plant which does not erated flower development, as compared to a control contain the recombinant polynucleotide. plant that does not contain the recombinant polynucle 11. The method of claim 10, wherein the CCAAT-box otide, when the polypeptide is overexpressed in the polypeptide has a B domain with at least 96% amino acid transformed plant; and 10 identity with the B domain of SEQID NO:31. wherein the overexpression of the polypeptide in the trans 12. The method of claim 10, wherein the CCAAT-box formed plant is regulated by a promoter selected from polypeptide has a B domain with at least 97% amino acid the group consisting of a root-specific promoter, a pho identity with the B domain of SEQID NO:31. tosynthetic-tissue specific-promoter, a vascular-specific 13. The method of claim 10, wherein the CCAAT-box promoter, an epidermal-specific promoter, an emergent 15 polypeptide has a B domain with at least 98% amino acid leaf primordia-specific promoter, and a stress inducible identity with the B domain of SEQID NO:31. promoter. 14. The method of claim 10, wherein the CCAAT-box 2. The transformed plant of claim 1, wherein the trans polypeptide has a B domain with at least 99% amino acid formed plant is a monocot. identity with the B domain of SEQID NO:31. 3. The transformed plant of claim 1, wherein the trans 15. The method of claim 10, wherein the overexpression of formed plant is a eudicot. the CCAAT-box polypeptide is regulated by an ARSK1 pro 4. The transformed plant of claim 1, wherein the CCAAT moter, an RSI1 promoter, an RBCS3 promoter, a SUC2 pro box polypeptide has a B domain with at least 96% amino acid moter, a CUT1 promoter, an LTP1 promoter, an AS1 pro identity with SEQID NO:31. moter, or an RD29A promoter. 5. The transformed plant of claim 1, wherein the CCAAT 25 16. The method of claim 10, wherein the CCAAT-box box polypeptide has a B domain with at least 97% amino acid polypeptide comprises a subsequence of SEQID NO: 60. identity with SEQID NO:31. 17. The method of claim 10, wherein the method steps 6. The transformed plant of claim 1, wherein the CCAAT further comprise selfing or crossing the transformed plant box polypeptide has at least 98% amino acid identity with the with itself or another plant, respectively, to produce trans B domain of SEQID NO:31. 30 formed seed. 7. The transformed plant of claim 1, wherein the CCAAT 18. A transformed plant comprising a recombinant poly box polypeptide comprises a subsequence of SEQID NO: 60. nucleotide encoding a CCAAT-box polypeptide having a B 8. The transformed plant of claim 1, wherein the overex domain with at least 90% amino acid identity with SEQ ID pression of the CCAAT-box polypeptide is regulated by an NO: 31, wherein the B domain comprises a recombinant ARSK1 promoter, an RSI1 promoter, an RBCS3 promoter, a 35 nucleic acid sequence encoding SEQID NO: 60: SUC2 promoter, a CUT1 promoter, an LTP1 promoter, an wherein the CCAAT-box polypeptide is overexpressed in AS1 promoter, or an RD29A promoter. the transformed plant; 9. The transformed plant of claim 1, wherein the trans and the overexpression of the CCAAT-box polypeptide in formed plant is a transformed seed comprising the recombi the transformed plant is regulated by a promoter selected nant polynucleotide. 40 from the group consisting of a root-specific promoter, a 10. A method for decreasing the time to flowering of a photosynthetic-tissue specific-promoter, a vascular-spe plant, as compared to the flowering time of a control plant, the cific promoter, an epidermal-specific promoter, an emer method comprising: gent leaf primordia-specific promoter, and a stress (a) providing a recombinant polynucleotide that encodes a inducible promoter, and CCAAT-box polypeptide that has a B domain having at 45 as a result of said overexpression the transformed plant least 95% amino acid identity with SEQID NO:31; and flowers earlier than a control plant that does not contain (b) transforming a target plant with the recombinant poly the recombinant polynucleotide. nucleotide to produce a transformed plant; wherein the 19. The transformed plant of claim 18, wherein the over CCAAT-box polypeptide is overexpressed in the trans expression of the CCAAT-box polypeptide is regulated by an formed plant; and 50 ARSK1 promoter, an RSI1 promoter, an RBCS3 promoter, a the overexpression of the CCAAT-box polypeptide in the SUC2 promoter, a CUT1 promoter, an LTP1 promoter, an transformed plant is regulated by a promoter selected AS1 promoter, or an RD29A promoter. from the group consisting of a root-specific promoter, a k k k k k