USOO8633353B2

(12) United States Patent (10) Patent No.: US 8,633,353 B2 Ratcliffe et al. (45) Date of Patent: Jan. 21, 2014

(54) WITH IMPROVED WATER DEFICIT PCT/US2004/037584, filed on Nov. 12, 2004, which is AND COLD TOLERANCE a continuation-in-part of application No. 10/714,887, application No. 1 1/981,813, which is a (75) Inventors: Oliver . Ratcliffe, Oakland, CA (US); continuation-in-part of application No. Jacqueline Heard, Webster Groves, MO PCT/US2006/034615, filed on Aug. 31, 2006, and a (US); Rodrick W. Kumimoto, Norman, continuation-in-part of application No. 1 1/705,903, OK (US); Peter P. Repetti, Emeryville, filed on Feb. 12, 2007, now Pat. No. 7,868,229, which CA (US); T. Lynne Reuber, San Mateo, is a continuation-in-part of application No. CA (US); Robert Creelman, Castro PCT/US2006/034615 Valley, CA (US); Frederick D. Hempel, Sunol, CA (US); Neal L. Gutterson, (60) Provisional application No. 60/961,403, filed on Jul. Oakland, CA (US); Roger Canales, 20, 2007, provisional application No. 60/713,952, Bernburg (DE) filed on Aug. 31, 2005. (73) Assignee: Mendel Biotechnology, Inc., Hayward, (51) Int. Cl. CA (US) AOIH I/O (2006.01) c - r - (52) U.S. Cl. (*) Notice: patentSubject is to extended any disclaimer, or adjusted the term under of this 35 USPC ...... 800/278; 800/287: 800/289: 800/290 U.S.C. 154(b) by 753 days. (58) Field of Classification Search None (21) Appl. No.: 11/981,813 See application file for complete search history. (22) Filed: Oct. 30, 2007 (56) References Cited (65) Prior Publication Data U.S. PATENT DOCUMENTS US 2008/O163397 A1 Jul. 3, 2008 6,121,513 A 9/2000 Zhang et al. 6,235,975 B1 5, 2001 Harada et al. Related U.S. ApplicationO O Data 6,476,2126,320,102 B1 11/20021 1/2001 HaradaLalgudi etet al. al. (60) Continuation-in-part of application No. 10/675,852, g: R 3: Sinces al. - I - aala Ca. filedcontinuation-in-part on Sep. 30, 2003,of application now abandoned, No. 1 1/479,226, and a 6,664,446 B2 12/2003 Heard et al. filed on Jun. 30, 2006, now Pat. No. 7,858,848, and a (Continued) continuation-in-part of application No. 1 1/725,235, filed on Mar. 16, 2007, now Pat. No. 7,601,893, and a FOREIGN PATENT DOCUMENTS continuation-in-part of application No. 1 1/728,567, AR PO70 102784 1, 2008 filed on Mar. 26, 2007, now Pat. No. 7,635,800, and a CA 2302828 10, 2000 continuation-in-part of application No. 1 1/069.255, (Continued) filed on Feb. 28, 2005, now Pat. No. 8,558,059, and a continuation-in-part of application No. 10/374,780, OTHER PUBLICATIONS filed on Feb. 25, 2003, now Pat. No. 7,511,190, and a continuation-in-part of application No. 10/546,266, Bowie et al, Science 247: 1306-1310, 1990.* filed as application No. PCT/US2004/005654 on Feb. 25, 2004, now Pat. No. 7,659,446, which is a (Continued) continuation-in-part of application No. 10/374,780, and a continuation-in-part of application No. Primary Examiner — Stuart F Baum 10/675,852, application No. 1 1/981,813, which is a (74) Attorney, Agent, or Firm — Jeffrey M. Libby: Yifan continuation-in-part of application No. 10/412,699, Mao filed on Apr. 10, 2003, now Pat. No. 7,345,217, which is a continuation-in-part of application No. (57) ABSTRACT 10/374,780, application No. 1 1/981,813, which is a The present invention provides nucleic acid constructs, continuation-in-part of application No. 1 1/642,814, including plasmids, expression vectors or expression cas filed on Dec. 20, 2006, now Pat. No. 7,825,296, which settes comprising polynucleotides encoding CCAAT-binding is a division of application No. 10/666,642, filed on transcription factor polypeptides that have the ability to Sep. 18, 2003, now Pat. No. 7,196,245, application No. increase a plants tolerance to abiotic stress. Polynucleotides 11/981,813, which is a continuation-in-part of encoding functional CCAAT-binding transcription factors application No. 10/714,887, filed on Nov. 13, 2003, were incorporated into expression vectors, introduced into now abandoned, which is a continuation-in-part of plants, and ectopically expressed. The encoded polypeptides application No. 10/666,642, application No. of the invention significantly increased the cold and water 11/981,813, which is a continuation-in-part of deficit tolerance of the transgenic plants, as compared to application No. 11/435,388, filed on May 15, 2006, tolerance to these stresses of control plants. now Pat. No. 7,663,025, which is a continuation-in-part of application No. 22 Claims, 5 Drawing Sheets US 8,633,353 B2 Page 2

(56) References Cited 2009,0183270 A1 7/2009 Adams 2009,0192305 A1 7/2009 Riechmann et al. U.S. PATENT DOCUMENTS 2009/0205063 A1 8/2009 Zhang et al. 2009,0265807 A1 10, 2009 Kumimoto et al. 2009,0265813 A1 10, 2009 Gutterson et al. 8%. R 58. A.C.E.tal. 2009/02769 12 Al 11/2009 Sherman et al. 6.77 O34 B2 4/2004 Jiang et al. 2010, 0071086 A1 3/2010 Repetti et al. 6,781.035 B1 8/2004 Harada et al. 2010, 0083395 A1 4/2010 Reuber et al. 6.825,397 B1 11/2004 Lowe et al. 2010, 00834O2 A1 4, 2010 Heard et al. 6.835 540 B2 12/2004 Broun et all 2010/01O7279 A1 4/2010 Ratcliffe et al. 6,946,586 B1 9, 2005 Fromm et al. 2010, 0162427 A1 6/2010 Riechmann et al. 7,109,393 B2 9, 2006 Gutterson et al. 2010/01751.45 A1 7/2010 Heard et al. 735 616 B2 11/2006 Heard et al. 2010, 0186105 A1 7/2010 Creelman et al. 79320 B2 3/2007 Reuber et al. 2010, 0186106 A1 7/2010 Creelman et al. 796.345 B2 3, 2007 Jiang et al. 2010, 0192249 A1 7/2010 Creelman et al. 7233,904 B2 5, 2007 Heard et al. 2010/0223689 A1 9/2010 Creelman et al. 7238.860 B2 7/2007 Ratcliffe et al. 2010/0281565 A1 1 1/2010 Engler et al. 7,345,217 B2 3/2008 Zhang et al. 7,511, 190 B2 3/2009 Sherman et al. FOREIGN PATENT DOCUMENTS 7,598.429 B2 10/2009 Heard et al. 7,601,893 B2 10/2009 Reuber et al. CL 185107 2, 2008 7,635,800 B2 12/2009 Ratcliffe et al. DE 19503359 2, 1996 7,659,446 B2 2/2010 Sherman et al. EP 1033 405 9, 2000 7,663,025 B2 2/2010 Heard et al. EP 123O345 8, 2002 7,692,067 B2 4/2010 Creelman et al. EP 1 42O 630 2, 2003 2001/0051335 Al 12/2001 Lalgudi et al. EP 1420630 2, 2003 2002fOO23281 A1 2/2002 Gorlach et al. EP 1454993 9, 2004 2002fOO40489 A1 4/2002 Gorlach et al. EP 16O1758 9, 2004 2003/0041356 A1 2/2003 Reuber et al. EP O8O3572 10/2007 2003, OO61637 A1 3/2003 Jiang et al. GB 2244272 11, 1991 2003/0093837 A1 5, 2003 Keddie GB 2392444 3, 2004 2003/0101481 A1 5/2003 Zhang et al. WO WO 98.37184 8, 1998 2003/O121070 A1 6, 2003 Adam et al. WO WO 98/37755 9, 1998 2003/O126638 A1 7/2003 Allen et al. WO WO 98.58069 12/1998 2003. O131386 A1 7/2003 Samaha et al. WO WO99,53016 A2 10, 1999 2003. O188330 A1 10, 2003 Heard WO WO99,67405 12/1999 2004.0009.476 A9 1/2004 Harper et al. WO WOOOf 28058 5, 2000 2004, OO16022 A1 1/2004 Lowe et al. WO WOOO, 53724 A2 9, 2000 2004/OO 19927 A1 1/2004 Sherman et al. WO WOO1/36598 A1 5, 2001 2004/0031072 A1 2, 2004 La Rosa et al. WO WOO1/45493 A2 6, 2001 2004.0034.888 A1 2/2004 Liu et al. WO WOO1,64022 9, 2001 2004/0098764 A1 5, 2004 Heard et al. WO WOO1,773.11 10, 2001 2004/O123338 A1 6, 2004 Fincher et al. WO WOO2,06499 1, 2002 2004/O123339 A1 6/2004 Conner et al. WO WO O2, 16655 2, 2002 2004/O123343 A1 6/2004 La Rosa et al. WO WO O2/46442 6, 2002 2004/O128712 A1 7/2004 Jiang et al. WO WO O2/O57439 A2 T 2002 2004/0172684 A1 9, 2004 Kovalic et al. WO WO O2/O79245 A2 10, 2002 2004/O181830 A1 9, 2004 Kovalic et al. WO WOO3,OOO898 1, 2003 2004/0214272 A1 10, 2004 La Rosa et al. WO WOO3,OO1902 1, 2003 2004/0216190 A1 10, 2004 Kovalic et al. WO WOO3,OO2751 1, 2003 2004/0229367 A1 11/2004 Berka et al. WO WOO3,OO8540 1, 2003 2004/0259145 A1 12, 2004 Wood et al. WO WOO3,O14327 A2 2, 2003 2005/0022266 A1 1/2005 Wu WO WOO3/020936 3, 2003 2005, OO86718 A1 4/2005 Heard et al. WO WOO3,083.042 10, 2003 2005/OO97638 A1 5/2005 Jiang et al. WO WO 2004/OO982O 1, 2004 2005/O155117 A1 7/2005 Century et al. WO WO 2004/031349 A2 4/2004 2005/0172364 A1 8, 2005 Heard et al. WO WO 2004/076638 A2 9, 2004 2006, OOO8874 A1 1/2006 Creelman et al. WO WO 2004/079006 9, 2004 2006, OO15972 A1 1/2006 Heard et al. WO WO 2005/001050 1, 2005 2006,0162018 A1 7/2006 Gutterson et al. WO WO2005/033319 A2 4/2005 2007/0022495 A1 1/2007 Reuber et al. WO WO 2008.002480 A2 1, 2008 2007/0101454 A1 5/2007 Jiang et al. 2007. O184092 A1 8/2007 Meyer et al. OTHER PUBLICATIONS 2007/0199.107 A1 8, 2007 Ratcliffe et al. 2007/02268.39 A1 9, 2007 Gutterson et al. McConnell et al. Nature 411 (6838):709–713, 2001.* 2008/0040973 A1 2/2008 Nelson et al. U.S. Appl. No. 09/533,030, filed Mar. 22, 2000, Keddie, James, et al. 2008/0104730 A1* 5/2008 Wu et al...... 800,278 U.S. Appl. No. 10/286,264, Jun. 25, 2008, Keddie, James, et al., 2008. O1557O6 A1 6/2008 Riechmann et al. 1 office action 2008/O172759 A1 7/2008 da Costa e Silva et al. 2008/0229448 A1 9/2008 Libby et al. S.E." 10/286,264, Mar. 10, 2009, Keddie, James, et al., 3888 EA 358 SMEs. U.S. Appl. No. 10/286,264. Aug. 11, 2006, Keddie, James, et al., 2008/030 1841 A1 12/2008 Ratcliffe et al. office action. 2008/0313756 Al 12/2008 Zhang et al. U.S. Appl. No. 10/286.264, Oct. 7, 2005, Keddie, James, et al., office 2009/0044297 A1 2/2009 Andersen et al. action. 2009.0049566 A1 2/2009 Zhang et al. U.S. Appl. No. 10/286.264, Jul. 26, 2007, Keddie, James, et al., office 2009.0049573 A1 2/2009 Dotson action. 2009/0138981 A1 5/2009 Repetti et al. U.S. Appl. No. 10/675,852, Apr. 14, 2008, Heard, J., et al., office 2009. O151015 A1 6, 2009 Adam et al. action. US 8,633,353 B2 Page 3

(56) References Cited Edwards, et al. (1997). Arabidopsis thaliana mRNA for Hap3b tran scription factor. GenBank Accession No. Y 13724. OTHER PUBLICATIONS Gusmaroli (2002). Regulation of the Arabidopsis thaliana CCAAT binding nuclear factor Y subunits. Gene 283, 41-48. U.S. Appl. No. 1 1/069,255, Nov. 26, 2008, Heard, J., et al., office Gusmaroli, et al. (2001). Regulation of the CCAAT-Binding NF-Y action. subunits in Arabidopsis thaliana. Gene 264, 173-185. U.S. Appl. No. 1 1/069.255, Mar. 19, 2008, Heard, J., et al., office Riechmann, et al. (2000). Arabidopsis transcription factors: genome action. wide comparative analysis among eukaryotes. Science 290, 2105 U.S. Appl. No. 1 1/069.255, Mar. 21, 2007, Heard, J., et al., office 2110. action. U.S. Appl. No. 10/112,887, Sep. 28, 2004, Heard, J., et al., office Mantovani (Oct. 18, 1999). The molecular biology of the CCAAT action. binding factor. Gene 239, 15-27. U.S. Appl. No. 09/533,030, Nov. 23, 2001, Keddie, James et al., Li, et al. (1999). Transcription factor NF-Y. CCAAT-binding chain office action. B-maize, NCBI Sequence S22820. U.S. Appl. No. 09/533,030, May 3, 2002, Keddie, James et al., office Li, et al. (Mar. 1992). Evolutionary variation of the CCAAT-binding action. transcription factor NF-Y. Nucl. Acids Res. 20, 1087-1091. U.S. Appl. No. 10/675,852, Feb. 25, 2009, Heard, J., et al., office Guo, et al. (2004). “Protein tolerance to random amino acid change”. action. Proc. Natl. Acad. Sci. USA 101,9205-9210. Asamizu, E., et al. (Apr. 28, 2000). Generation of 7137 non-redun Smolen, et al. (2002). Dominant Alleles of the basic Helix-Loop dant expressed sequence tags from a legume, Lotus japonicus. DNA Helix Transcription Factor ATR2 Activate Stress-Responsive Genes Res. 7 (2), 127-130. in Arabidopsis. Gen vol. 161, pp. 1235-1246. Bucher, P., and Trifonov, E.N. (Jun. 1988). CCAAT box revisited: Liu, Q., et al. (1998). Two transcription factors, DREB1 and DREB2. bidirectionality, location and context. J Biomol Struct Dyn 5, 1231 with an EREBPAP2 DNA binding domain separate two cellular 1236. signal ... Cell, vol. 10, pp. 1391-1406. Bucher, P. (Apr. 20, 1990). Weight matrix descriptions of four Fourgoux-Nicol, et al. (1999). Isolation of rapeseed genes expressed eukaryotic RNA polymerase II promoter elements derived from 502 early and specifically during development of the male gametophyte. unrelated promoter sequences. J Mol Biol 212, 563-578. Plant Mol Biol 40, 857-872. Chae, H.D., et al. (May 20, 2004). Cdk2-dependent phosphorylation Eisen, et al. (1998). “Phylogenomics: Improving Functional Predic of the NF-Y transcription factor is essential for the expression of the cell cycle-regulatory genes and cell cycle G1/S and G2/M transitions. tions for Uncharacterized Genes by Evolutionary Analysis”. Genome Oncogene 23, 4084-4088. Research 8, 163-167. Clarke, B., et al. (Mar. 2003). Arabidopsis genomic information for Rossini, et al. The maize golden2 gene defines a novel class of interpreting wheat EST sequences. Funct. Integr. Genomics 3 (1-2), transcriptional regulators in plants. The Plant Cell vol. 13, May 2001, 33-38. pp. 1231-1244, XP002962733. Gelinas, R., et al. (Jan. 1985). Sequences of G gamma, Agamma, and Hill, et al. (1998). Functional analysis of conserved histidines in ADP beta genes of the Greek (A gamma) HPFH mutant: evidence for a glucose pyrophosphorylase from Escherichia coli. Biochem. distal CCAAT box mutation in the Agamma gene. ProgClin Biol Res Biophys. Res. Comm. 244, 573-577. 191, 125-139. Rounsley, S.D., et al. (Aug. 1997) Database Embl, Arabidopsis Good, L.F., and Chen, K.Y. (May 1996). Cell cycle- and age-depen thaliana chromosome II BAC T13E 15 genomic seq. complete seq. dent transcriptional regulation of human thymidine kinase gene: the XPOO2303794. Acc No. ACOO2388. Positions 57340-58040. role of NF-Y in the CBP/tk binding complex. Biol Signals 5, 163 Rounsley, S.D., et al. (2000). Database EMBL Sep. 12, 1997. 169. “Arabidopsis thaliana mRNA for Hap3a.” XP002315220 retrieved Hiei, Y., et al. Efficient transformation of rice (Oryza sativa L.) from EBI acc No. EM PRO: ATHAP3A, acc No. Y 13723. mediated by Agrobacterium and sequence analysis of the boundaries Ohme-Takagi, M. and Singh, K. (Feb 1995) “Ethylene-inducible of the T-DNA. Plant J. Aug. 1994; 6(2): 271-282. DNA binding proteins that interact with an ...” Plant Cell, vol.7. pp. Ito, T., et al. (Oct. 1995). A far-upstream sequence of the wheat 173-182, XP002108954. *Figs 5, 6*. histone H3 promoter functions differently in rice and tobacco cul Buttner, M. and Singh, K. (May 27, 1997) 'Arabidopsis thaliana tured cells. Plant Cell Physiol 36, 1281-1289. ethylene-responsive.” Proc of the Natl Acad of Sciences, vol. 94, pp. Johnson, P.F., and McKnight, S.L. (Jul. 1989). Eukaryotic transcrip 5961-5966, XP002108953. tional regulatory proteins. Annu Rev Biochem 58 799-839. Winicov. I. (Dec. 1998) “New molecular approaches to improving Jones, P.G., et al. (epub Nov. 26, 2002). Gene discovery and microar salt tolerance in crop plants.” Annals of Botany, ACAD Press, vol.82. ray analysis of cacao varieties. Planta 216 (2), 255-264. No. 6, pp. 705-710, XPOO 1007288. Maity, S.N., and de Crombrugghe, B. (May 1998). Role of the Urao, T., et al. (Nov. 1993) “An Arabidopsis MYB homolog is CCAAT-binding protein CBF/NF-Y in transcription. Trends induced by dehydration stress and its geneproduct binds to...” Plant Biochem Sci 23, 174-178. Cell, vol. 5, pp. 1529-1539, XP002938159. Mazon, M.J., Gancedo, J.M., and Gancedo, C. (Oct. 1982). Kasuga Mie, et al. (Mar. 1999) “Improving plant drought, salt, and Phosphorylation and inactivation of yeast fructose-bisphosphatase in freezing tolerance by gene.” Nature Biotechnol vol. 17. No. 3, pp. vivo by glucose and by proton ionophores. A possible role for cAMP. 287-291, XP002 173128. Eur J. Biochem 127, 605-608. Kaneko, t., et al. (Mar. 1, 2001) database trembl seqlib ebi, hinxton; McNabb, D.S., Xing, Y., and Guarente, L. (Jan. 1995). Cloning of “transcription factor hap5a-like (at5g50480) . . .” xp002302644 yeast HAP5: a novel subunit of a heterotrimeric complex required for www.ebi.ac.UK Database accession No. Q9FGP7. CCAAT binding. Genes Dev 9, 47-58. Kaneko, T., et al. (Apr. 9, 1999) EMBL SEQ LIB EBI, Olesen, J.T., and Guarente, L. (Oct. 1990). The HAP2 subunit of Hinxton:"Structural analysis of Arabidopsis thaliana chromosome 5. yeast CCAAT transcriptional activator contains adjacent domains for XL.; P1 clone : MBA10' www.EBI. AC.UK acc. No. AB025619. subunit association and DNA recognition: model for the HAP2/3/4 Lotan, Tamar, et al. (Jun. 26, 1998). “Arabidopsis Leafy COTYLE complex. Genes Dev 4, 1714-1729. DON1 is sufficient to induce embryo. . .” Cell, Cell Press, vol. 93, Rieping, M., and Schoffl, F. (Jan. 1992). Synergistic effect of No. 7, pp. 1195-1205, XP002 136428. upstream sequences, CCAAT box elements, and HSE sequences for Albani, Diego et al., (Dec. 29, 1995) Cloning and characterization of enhanced expression of chimaeric heat shock genes in transgenic a Brassica napus gene encoding a homologue of the B subunit of a tobacco. Mol Gen Genet 231, 226-232. heteromeric CCAAT-binding factor, Gene 167, pp. 209-213. Edwards, D., et al. (1998). Multiple genes encoding the conserved Arents, G., and Moudrianakis, E.N. (Nov. 21, 1995). The histone fold: CCAAT-Box transcription factor complex are expressed in a ubiquitous architectural motif utilized in DNA compaction and Arabidopsis. Plant Physiol. 117, pp. 1015-1022. protein dimerization. Proc Natl Acad Sci U S A92, 11170-11174. US 8,633,353 B2 Page 4

(56) References Cited Crookshanks, M., et al. (Sep. 18, 2001). The potato tuber transcriptome: analysis of 6077 expressed sequence tags. FEBS Lett. OTHER PUBLICATIONS 506 (2), 123-126. Currie, R.A. (Dec. 5, 1997). Functional interaction between the DNA Caretti G. MottaMC, Mantovani R (Dec. 1999)NF-Yassociates with binding subunit trimerization domain of NF-Y and the high mobility H3-H4 tetramers and octamers by multiple mechanisms. Mol Cell group protein HMG-I(Y). J Biol Chem 272,30880-30888. Biol 19: 8591-8603. Daly et al. (Dec. 2001). Plant Systematics in the Age of Genomics. Caretti, G., Salsi, V., Vecchi, C. Imbriano, C., and Mantovani, R. Plant Physiology 127: 1328-1333. (Aug. 15, 2003). Dynamic recruitment of NF-Y and histone Dang, V.D., Bohn, C., Bolotin-Fukuhara, M., and Daignan-Fornier, acetyltransferases on cell-cycle promoters. J Biol Chem 278, 30435 B. (Apr. 1996). The CCAAT box-binding factor stimulates ammo 30440. nium assimilation in Saccharomyces cerevisiae, defining a new Cane, I.A., and Kay, S.A. (Dec. 1995). Multiple DNA-Protein com cross-pathway regulation between nitrogen and carbon metabolisms. plexes at a circadian-regulated promoter element. Plant Cell 7, 2039 J Bacteriol 178, 1842-1849. 2051. di Silvio A, Imbriano C. Mantovani R (Jul. 1, 1999) Dissection of the Chattopadhyay C, et al. (Jul. 8, 2004) Human p32, interacts with B NF-Y transcriptional activation potential. Nucleic Acids Res 27: subunit of the CCAAT-binding factor, CBF/NF-Y, and inhibits CBF 2578-2584. mediated transcription activation in vitro. Nucleic Acids Res 32: Faniello MC, Bevilacqua MA. Condorelli G. de Crombrugghe B. 3632-3641 Maity SN, Avvedimento VE, Cimino F. Costanzo F (Mar. 19, 1999) Chang, Z.F., and Liu, C.J. (Jul. 8, 1994). Human thymidine kinase The B subunit of the CAAT-binding factor NFY binds the central CCAAT-binding protein is NF-Y, whose A subunit expression is segment of the Co-activator p300. J Biol Chem 274: 7623-7626. serum-dependent in human IMR-90 diploid fibroblasts. J Biol Chem Forsburg, S.L., and Guarente, L. (Feb. 1988). Mutational analysis of 269, 17893-17898. upstream activation sequence 2 of the CYC1 gene of Saccharomyces Chen, W. etal. (Mar. 2002). Expression profile matrix of Arabidopsis cerevisiae: a HAP2-HAP3-responsive site. Mol Cell Biol 8,647-654. transcription factor genes Suggests their putative functions in Forsburg, S.L., and Guarente, L. (Aug. 1989). Identification and response to environmental stresses. Plant Cell 14, 559-574. characterization of HAP4: a third component of the CCAAT-bound Ayele, M., et al. (Apr. 2005). Whole genome shotgun sequencing of HAP2/HAP3 heteromer. Genes Dev 3, 1166-1178. Brassica oleracea and its application to gene discovery and annota Franchini A, Imbriano C. Peruzzi E. Mantovani R. Ottaviani E (Feb. tion in Arabidopsis. Genome Res. 15 (4), 487-495. 2005) Expression of the CCAAT-binding factor NF-Y in Bezhani, S. Sherameti, I., Pfannschmidt, T., and Oelmuller, R. (Jun. Caenorhabditis elegans. J Mol Histol 36: 139-145. 29, 2001). A repressor with similarities to prokaryotic and eukaryotic FrontiniM, Imbriano C. Manni I. Mantovani R (Feb. 2004) Cell cycle DNA helicases controls the assembly of the CAAT box binding regulation of NF-YC nuclear localization. Cell Cycle 3: 217-222. complex ataphotosynthesis gene promoter. J Biol Chem 276,23785 Fu, et al. (Aug. 2001). Expression of Arabidopsis GAI in Transgenic 23789. Rice Represses Multiple Gibberellin Responses. Plant Cell 13:1791 Bi, W., et al. (Oct. 17, 1997). DNA binding specificity of the CCAAT 1802. binding factor CBF/NF-Y. J Biol Chem 272, 26562-26572. Gancedo, J.M. (Jun. 1998). Yeast carbon catabolite repression. Borrell, A.K., et al. (Jul. 2000). Does Maintaining Green Leaf Area in Microbiol Mol Biol Rev 62,334-361. Sorghum Improve Yield under Drought? II. Dry Matter Production Gardiner, J., et al. (Apr. 2004). Anchoring 9,371 Maize Expressed and Yield. CropScience 40, 1037-1048. Sequence Tagged Unigenes to the Bacterial Artificial Chromosome Bowler, C., and Fluhr, R. (Jun. 2000). The role of calcium and Contig Map by Two-Dimensional Overgo Hybridization. Plant activated oxygens as signals for controlling cross-tolerance. Trends Physiol. 134 (4), 1317-1326. Plant Sci 5, 241-246. Gurtner A. etal. (Jul. 2003) Requirement for Down-Regulation of the Bowie, et al. Science 247: 1306-1310 (1990). CCAAT-binding Activity of the NF-Y Transcription Factor during Cooper, B., et al. (Oct. 2003). Identification of rice (Oryza sativa) Skeletal Muscle Differentiation. Mol Biol Cell 14: 2706-2715. proteins linked to the cyclin-mediated regulation of the cell cycle. Guterman, I., et al. (Oct. 2002). Rose Scent: Genomics Approach to Plant Mol. Biol. 53(3):273-9. Discovering Novel Floral Fragrance-Related Genes. Plant Cell 14 Coupland (Oct. 1995). Flower development. LEAFY blooms in (10), 2325-2338. aspen. Nature 377:482-483. Haas, B., et al. (May 30, 2002). Full-length messenger RNA Coustry, F. Maity, S.N., and de Crombrugghe, B. (Jan. 6, 1995). sequences greatly improve genome annotation. Genome Biology Studies on transcription activation by the multimeric CCAAT-bind 2002, 3(6)research0029.1-0029.12. ing factor CBF. J. Biol Chem 270, 468-475. Harmer, S., et al., (Dec. 15, 2000) Orchestrated Transcription of Key Coustry, F. Maity, S.N. Sinha, S., and de Crombrugghe, B. (Jun. 14. Pathways in Arabidopsis by the Circadian Clock. Science vol. 290, p. 1996). The transcriptional activity of the CCAAT-binding factor CBF 2110. is mediated by two distinct activation domains, one in the CBF-B Herwig, R., et al. (Aug. 2002). Construction of a ‘unigene cDNA subunit and the other in the CBF-C subunit. J Biol Chem 271, 14485 clone set by oligonucleotide fingerprinting allows access to 25,000 14491. potential Sugar beet genes. Plant J. 32 (5), 845-857. Coustry, F. Sinha, S., Maity, S.N., and Crombrugghe, B. (Apr. 1, Hollung Kristin et al. (Jun. 1997) Developmental stress and ABA 1998). The two activation domains of the CCAAT-binding factor modulation of mRNA levels for bZIP transcription factors and Vp 1 CBF interact with the dTAFII 110 component of the Drosophila in barley embryos and embryo-derived suspension cultures, Plant TFIID complex. Biochem J 331 (Pt 1), 291-297. Molecular Biology, vol. 35. No. 5, pp. 561-571. Coustry, F. Hu, Q., de Crombrugghe, B., and Maity, S.N. (Nov. 2, Kahle J, et al. (Jul. 2005) Subunits of the heterotrimeric transcription 2001). CBF/NF-Y functions both in nucleosomal disruption and factor NF-Y are imported into the nucleus by distinct pathways transcription activation of the chromatin-assembled topoisomerase involving importin beta and importin 13. Mol Cell Biol 25: 5339 IIalpha promoter. Transcription activation by CBF/NF-Y in chroma 5354. tin is dependent on the promoter structure. J Biol Chem 276, 40621 Kater, et al. (Feb. 1998). Multiple AGAMOUS Homologs from 40630. Cucumber and Petunia Differ in Their Ability to Induce Reproductive Covitz, P.A., et al. (Aug. 1998). Expressed sequence tags from a Organ Fate. Plant Cell 10:171-182. root-hair-enriched medicago truncatula cDNA library. Plant Physiol. Kehoe, D.M., et al. (Aug. 1994). Two 10-bp regions are critical for 117(4): 1325-32. phytochrome regulation of a Lemna gibba Lhcbgene promoter. Plant Crevillen P. Ventriglia T. Pinto F, Orea A. Merida A. Romero JM Cell 6, 1123-1134. (Mar. 4, 2005) Differential pattern of expression and Sugar regulation Kim, C.G., and Sheffery, M. (Aug. 1990). Physical characterization of Arabidopsis thaliana ADP-glucose pyrophosphorylase-encoding of the purified CCAAT transcription factor, alpha-CP1. J Biol Chem genes. J Biol Chem 280: 8143-8149. 265, 13362-13369. US 8,633,353 B2 Page 5

(56) References Cited Nandi et al. (Feb. 24, 2000). A conserved function for Arabidopsis Superman in regulating floral-whorl cell proliferation in rice, a OTHER PUBLICATIONS monocotyledonous plant. Curr. Biol. 10:215-218. Novillo, F., et al. (Mar. 16, 2004). CBF2/DREB1C is a negative Kim, I.S., et al. (Aug. 1996). Determination of functional domains in regulator of CBF1/DREB1.B and CBF3/DREB1A expression and the C subunit of the CCAAT-binding factor (CBF) necessary for plays a central role in stress tolerance in Arabidopsis. Proc Natl Acad formation of a CBF-DNA complex: CBF-B interacts simultaneously Sci USA 101,3985-3990. with both the CBF-A and CBF-C subunits to form a heterotrimeric Parcy, F., et al. (Aug. 1997). The Abscisic Acid-INSENSITIVE3, CBF molecule. Mol Cell Biol 16, 4003-4013. FUSCA3, and Leafy COTYLEDON1 loci act in concert to control Kusnetsov, V., et al. (Dec. 10, 1999). The assembly of the CAAT-box multiple aspects of Arabidopsis seed development. Plant Cell 9, binding complex at a photosynthesis gene promoter is regulated by 1265-1277. light, cytokinin, and the stage of the plastids. J Biol Chem 274. Peng et al. (Dec. 1, 1997). The Arabidopsis GAI gene defines a 36009-36014. signaling pathway that negatively regulates gibberellin responses. Kwong, R.W., et al. (Jan. 2003). LEAFY COTYLEDON1-Like Genes and Development 11:3194-3205. Peng et al. (Jul. 15, 1999). 'Green revolution genes encode mutant defines a class of regulators essential for embryo development. Plant gibberellin response modulators, Nature 400:256-261. Cell 15, 5-18. Pinkham, J.L., and Guarente, L. (Dec. 1985). Cloning and molecular Lapik.Y.R., and Kaufman, L.S. (Jul. 2003). The Arabidopsis cupin analysis of the HAP2 locus: a global regulator of respiratory genes in domain protein AtPirinl interacts with the G protein alpha-subunit Saccharomyces cerevisiae. Mol Cell Biol 5, 3410-3416. GPA1 and regulates seed germination and early seedling develop Prandl R. et al., (May 1998) HSF3, a new heat shock factor from ment. Plant Cell 15, 1578-1590. Arabidopsis thaliana, derepresses the heat shock response and con Lascaris R. et al. (Dec. 17, 2002) Hap4p overexpression in glucose fers thermotolerance when overexpressed in transgenic plants. grown Saccharomyces cerevisiae induces cells to enter a novel meta Molecular and General Genetics, vol. 258, pp. 269-278. bolic state. Genome Biol. 4: R3. Rizhsky, L., et al. (Nov. 2002). The combined effect of drought stress Lazaretal. (Mar. 1988) Transforming Growth Factor alpha: Mutation and heat shock on gene expression in tobacco. Plant Physiol 130, of Aspartic Acid 47 and Leucine 48 Results in Different Biological 1143-1151. Activities. Mol. Cell Biol. 8:1247-1252. Romier, C., et al. (Jan. 10, 2003). The NF-YB/NF-YC structure gives Lee, H., et al. (Feb. 18, 2003). Arabidopsis LEAFY COTYLEDON1 insight into DNA binding and transcription regulation by CCAAT represents a functionally specialized subunit of the CCAAT binding factor NF-Y. J Biol Chem 278, 1336-1345. transcription factor. Proc Natl Acad Sci USA 100, 2152-2156. Sabehat, A., et al. (Jun. 1998). Expression of Small heat-shock pro Lee J. H. etal. (Oct. 1995), Derepression of the activity of genetically teins at low temperatures. A possible role in protecting against chill engineered heat shock factor causes constitutive synthesis of heat ing injuries. Plant Physiol 117,651-658. shock proteins and increased themotolerance in transgenic Salmi, M.L., etal. (Jul. 2005). Profile and analysis of gene expression Arabidopsis. Plant Journal, vol. 8, No. 4, pp. 603-612. changes during early development in germinating spores of Levesque-Lemay M. et al. (Mar. 4, 2003) Expression of CCAAT Ceratopteris richardii. Plant Physiol. 138 (3), 1734-1745. binding factor antisense transcripts in reproductive tissues affects Salsi, V., et al. (Feb. 28, 2003). Interactions between p300 and mul plant fertility. Plant Cell Rep 21: 804-808. tiple NF-Y trimers govern cyclin B2 promoter function. J Biol Chem Li, Q, et al. (Nov. 2, 1998). Xenopus NF-Y pre-sets chromatin to 278,6642-6650. potentiate p300 and acetylation-responsive transcription from the Sasaki, T., et al. (Nov. 21, 2002). The genome sequence and structure Xenopus hsp70 promoter in vivo. Embo J 17, 6300-6315. of rice chromosome 1. Nature 420 (6913), 312-316. Lin, J.F., and Wu, S.H. (Aug. 2004). Molecular events in senescing Seki, M., etal. (Jan. 2001). Monitoring the expression pattern of 1300 Arabidopsis leaves. Plant J39, 612-628. Arabidopsis genes under drought and cold stresses by using a full Luger, K., et al. (Sep. 18, 1997). Crystal structure of the nucleosome length cDNA microarray. Plant Cell 13, 61-72. core particle at 2.8 A resolution. Nature 389, 251-260. Sinha S, et al. (Feb. 28, 1995) Recombinant rat CBF-C, the third Mandel et al. (Oct. 1992). Manipulation of flower structure in subunit of CBF/NFY, allows formation of a protein-DNA complex transgenic tobacco, Cell 71-133-143. with CBF-A and CBF-B and with yeast HAP2 and HAP3. Proc Natl Mantovani, R. (Mar. 1998). A survey of 178 NF-Y binding CCAAT AcadSci U S A92: 1624-1628. boxes. Nucleic Acids Res 26, 1135-1143. Sinha, S., et al. (Jan. 1996). Three classes of mutations in the A Masiero, S., et al. (Jul 19, 2002). Ternary complex formation subunit of the CCAAT-binding factor CBF delineate functional between MADS-box transcription factors and the histone fold pro domains involved in the three-step assembly of the CBF-DNA com tein NF-YB. J. Biol Chem 277,26429-26435. plex. Mol Cell Biol 16, 328-337. McNabb, D.S., et al. (Dec. 1997). The Saccharomyces cerevisiae Surpin, M., et al. (May 2002). Signal transduction between the Hap5p homolog from fission yeast reveals two conserved domains chloroplast and the nucleus. Plant Cell 14 Suppl. S327-338. that are essential for assembly of heterotetrameric CCAAT-binding Suzuki et al. (Nov. 2001). Maize VP1 complements Arabidopsis abi3 factor. Mol Cell Biol 17, 7008-7018. and confers a novel ABA'auxin interaction in roots. Plant J. 28:409 McConnell, et al. (6838). Role of PHABULOSA and PHAVOLUTA 418. in determining radial patterning in shoots; Nature 411, 709-713 Tasanen, K., et al. (Jun. 5, 1992). Promoter of the gene for the (2001). multifunctional protein disulfide isomerase polypeptide. Functional Meinke, D. (Dec. 4, 1992). A homeotic mutant of Arabidopsis significance of the six CCAAT boxes and other promoter elements. J thaliana with leafy cotyledons. Science 258, 1647-1650. Biol Chem 267, 11513-11519. Meinke, D.W., et al. (Aug. 1994). Leafy Cotyledon Mutants of Testa A. etal. (Jan. 11, 2005) Chromatin immunoprecipitation (ChIP) Arabidopsis. Plant Cell 6, 1049-1064. on chip experiments uncover a widespread distribution of NF-Y Miyoshi, K., et al. (Nov. 2003). OshAP3 genes regulate chloroplast binding CCAAT sites outside of core promoters. J Biol Chem 280: biogenesis in rice. Plant J36,532-540. 13606-13615. Myers, R.M., Tilly, K., and Maniatis, T. (May 2, 1986). Fine structure Thomas, H., and Howarth, C.J. (Feb. 2000). Five ways to stay green. genetic analysis of a beta-globin promoter. Science 232, 613-618. J Exp Bot 51 Spec No. 329-337. Nakamura, Y, et al. (Dec. 31, 1997). Structural analysis of Vicient, C.M., et al. (Jun. 2000). Changes in gene expression in the Arabidopsis thaliana chromosome 5. III. Sequence features of the leafy cotyledonil (lec 1) and fusca3 (fus3) mutants of Arabidopsis regions of 1,191.918 by covered by seventeen physically assigned P1 thaliana L.J Exp. Bot 51,995-1003. clones. DNA Res.4(6):401-414. Weigel and Nilsson (Oct. 12, 1995). A developmental switch suffi Nakshatri, H., et al. (Nov. 15, 1996). Subunit association and DNA cient for flower initiation in diverse plants. Nature 377:482-500. binding activity of the heterotrimeric transcription factor NF-Y is Wendler, W.M., et al. (Mar. 28, 1997). Identification of pirin, a novel regulated by cellular redox. J Biol Chem 271, 28784-28791. highly conserved nuclear protein. J Biol Chem 272, 8482-8489. US 8,633,353 B2 Page 6

(56) References Cited NCBI accession No. AAAA01009782 (gi: 1993.4092) (Apr. 4, 2002); Yu, J., et al. “Oryza sativa (indica cultivar-group) scaffold009782, OTHER PUBLICATIONS whole genome shotgun sequence”. (Oryza sativa) (publication: See PloS Biol. 3 (2), E38, 2005, The Genomes of Oryza sativa: A History West, M., et al. (Dec. 1994). Leafy COTYLEDON1 Is an essential of Duplications). regulator of late embryogenesis and cotyledon identity in NCBI acc. No. AAAAO 1015835 (gii: 1994,5865) (Apr 4 2002);Yu.J., Arabidopsis. Plant Cell 6, 1731-1745. et al. “Oryza sativa (indica cultivar-group) scaffold015835, whole Winicov. Ilga et al., (Jun. 1999) Transgenic overexpression of the genome shotgun sequence'; source: Oryza sativa (indica cultivar transcription factor Alfin enhances expression of the endogenous group); Title: “The Genomes of Oryza sativa : A History of Dupli MsPRP2 gene in alfalfa and improves salinity tolerance of the plants. cations” (PLoS Biol. 3 (2), E38 (2005)). Plant Physiology vol. 120, No. 2, pp. 473-480. NCBI accession No. AAAA01015884 (gi: 1994.5953) (Apr. 4, 2002); Xing, Y. Fikes, J.D., and Guarente, L. (Dec. 1993). Mutations in Yu, J., et al. “Oryza sativa (indica cultivar-group) scaffold015884, yeast HAP2/HAP3 define a hybrid CCAAT box binding domain. whole genome shotgun sequence”. (Oryza sativa) (publication: See Embo J 12, 4647-4655. PloS Biol. 3 (2), E38, 2005, The Genomes of Oryza sativa: A History Xiong, L., et al. (Dec. 2001). Modulation of abscisic acid signal of Duplications). transduction and biosynthesis by an Sm-like protein in Arabidopsis. NCBI accession No. AAK95562 (gii: 15321716) (Aug. 28, 2001); Dev Cell 1,771-781. Lowe, K.S. etal "Leafy cotyledonl” (Zea mays). Xiong, L., et al. (Aug. 1, 2001). FIERY1 encoding an inositol NCBI accession No. AAL27657 (gii: 16902050) (Nov. 10, 2001); polyphosphate 1-phosphatase is a negative regulator of abscisic acid Lowe, K.S., et al. “CCAAT-box binding factor HAP3 B domain' and stress signaling in Arabidopsis. Genes Dev 15, 1971-1984. (Glycine max.). Yan, J., et al. (Aug. 2004). Overexpression of the Arabidopsis 14-3-3 NCBI accession No. AAL27658 (gii: 16902052) (Nov. 10, 2001); protein GF14 lambda in cotton leads to a “stay-green” phenotype and Lowe, K.S., et al. “CCAAT-box binding factor HAP3 B domain' improves stress tolerance under moderate drought conditions. Plant (Glycine max.). Cell Physiol 45, 1007-1014. NCBI accession No. AAL27659 (gii: 16902054) (Nov. 10, 2001); Yu, J., et al. (Feb. 2005). The Genomes of Oryza sativa: A History of Lowe, K.S., et al. “CCAAT-box binding factor HAP3 B domain' Duplications. PloS Biol. 3 (2), E38: 266-281. ( galamensis). Yun, J., et al. (Sep. 19, 2003). Cdk2-dependent phosphorylation of NCBI accession No. AAL27660 (gii: 16902056) (Nov. 10, 2001); the NF-Y transcription factor and its involvement in the p. 53-p. 21 Lowe, K.S. et al. "CCAAT-box binding factor HAP3 B domain' signaling pathway. J Biol Chem 278,36966-36972. (Argemone mexicana). Zhang, S., et al. (Jun. 2002). Similarity of expression patterns of NCBI accession No. AAL27661 (gii: 16902058) (Nov. 10, 2001); knotted 1 and ZmLEC1 during somatic and Zygotic embryogenesis in Lowe, K.S. et al. "CCAAT-box binding factor HAP3 B domain' maize (Zea may’s L.). Planta 215, 191-194. (Triticum aestivum). Zhou, Y., and Lee, A.S. (Mar. 4, 1998). Mechanism for the Suppres NCBI accession No. AAL49943 (gi:17979253) (Dec. 26, 2001); Sion of the mammalian stress response by genistein, an anticancer Shinn, P. et al., “At2g37060/T2N18.18 Arabidopsis thaliana”. phytoestrogen from soy. J Natl Cancer Inst 90,381-388. NCBI accession No. AANO 1148 (gi:2253.6010) (Aug. 29, 2002); Shinn, et al. (Dec. 1, 2001). Database SPTREMBL Online Kwong, R.W., et al. “LEC1-like protein' (Phaseolus coccineus) XP002962732 Database accession No. (Q94F45). (note: the original Submission and latest update are provided for Nakamura, Y., et al. (Mar. 1, 2001). Database TREMBL SEQ LIB examination) (publication: see Plant Cell 15 (1), 5-18, 2003, Leafy EBI. Hinxton; "Structural analysis of Arabidopsis . . . ” COTYLEDON1-Like Defines a Class of Regulators Essential for XP002302645 WWWEBI.AC.UK Database acc No. Q9FMV5. Embryo Development). Database UniProt Online (May 1, 2000). “Putative CCAAT-bind NCBI accession No. AAO33918 (gi:28274.147) (Feb. 8, 2003); ing transcription factor subunit.” retrieved from EBI accession No. Adams, K.L., et al. “Putative CCAAT-binding transcription factor': UNIPROT: Database acc No. Q9SLGO. (Gossypium barbadense). NCBI accession No. AA660543 (gi:2604587) (Nov. 11, 1997); NCBI accession No. AAO33919 (gi:28274149) (Feb. 8, 2003); Covitz, P.A., et al. “00429 MtRHE Medicago truncatula cDNA 5' Adams, K.L., et al. “Putative CCAAT-binding transcription factor” similar to CCAAT box DNA binding transcription factor, mRNA Gossypium barbadense. sequence'; (Medicago truncatula) (publication: see Plant Physiol. NCBI accession No. AAO72650 (gi:29367577) (Mar. 30, 2003); 117 (4), 1325-1332, 1998). Cooper, B., et al. “CCAAT-binding transcription factor-like protein' NCBI accession No. AAAA01000016 (gi: 19924325) (pos. 61435 (Oryza sativa) (note: the original Submission and latest update are 62623) (Apr. 4, 2002); Yu, J., et al. “Oryza sativa (indica cultivar provided for examination) (publication: see Plant Mol. Biol. 53 (3), group) scaffoldO00016, whole genome shotgun sequence”. (Oryza 273-279, 2003, Identification of rice (Oryza sativa) proteins linked to sativa) (publication: see PloS Biol. 3 (2), E38, 2005, The Genomes of the cyclin-mediated regulation of the cell cycle). Oryza sativa: A History of Duplications). NCBI accession No. AB025628 (gi:4589434) (pos. 69357-69929) NCBI accession No. AAAA01001 199 (gi: 19925508) (pos. 24979 (Apr. 20, 1999); Nakamura, Y., “Arabidopsis thaliana genomic DNA, 25653) (Apr. 4, 2002); Yu, J., et al. “Oryza sativa (indica cultivar chromosome 5, P1 clone: MNJ7, complete sequence” (note: the group) scaffold001 199, whole genome shotgun sequence”. (Oryza original Submission and latest update are provided for examination, sativa) (publication: see PloS Biol. 3 (2), E38, 2005, The Genomes of see geneid: MNJ7.23, protein id: BAB09090.1, gi:9758792 marked Oryza sativa: A History of Duplications). on p. 7/29). NCBI accession No. AAAA01003638 (gi: 19927947) (Apr. 4, 2002); NCBI accession No. AC000 106 (gi:1785951) (Jan. 21 1997); Yu, J., et al. “Oryza sativa (indica cultivar-group) scaffold003638, Osborne, B.I., et al., “Arabidopsis thaliana chromosome 1” (note: whole genome shotgun sequence'; (Oryza sativa) (publication: See two versions are presented for review, gi:1785951 submitted Jan. 21. PloS Biol. 3 (2), E38, 2005, The Genomes of Oryza sativa: A History 1997; and, gi:2342673 submitted Sep. 17, 1997). of Duplications). NCBI accession No. AC002388 (gi:2282009) (Jul. 28, 1997); NCBI accession No. AAAA01006073 (gi: 19930383) (Apr. 4, 2002); Rounsley, S.D., et al., “Arabidopsis thaliana clone T13E15” (note: Yu, J., et al. “Oryza sativa (indica cultivar-group) scaffold006073, two versions are presented for review, gi:2282009 submitted Jul. 28, whole genome shotgun sequence'; (Oryza sativa) (publication: See 1997; and, gi:20196917 submitted Apr. 18, 2002). PloS Biol. 3 (2), E38, 2005, The Genomes of Oryza sativa: A History NCBI accession No. AC005770 (gi:3694645) (pos 56423-57963) of Duplications). (Oct. 3, 1998); Rounsley, S.D., et al., “Arabidopsis thaliana clone NCBI accession No. AAAA01008870 (gi: 19933 180) (Apr. 4, 2002); T7F6” (note: two versions are presented for review, gi:3694645 sub Yu, J., et al. “Oryza sativa (indica cultivar-group) scaffold008870, mitted Oct. 3, 1998; and, gi:20 197440 submitted Apr. 18, 2002, see whole genome shotgun sequence'; (Oryza sativa) (publication: See gene At2g38870, gi:20 197447 marked on p. 2/35). PloS Biol. 3 (2), E38, 2005, The Genomes of Oryza sativa: A History NCBI accession No. AC006260 (gi:4071012) (pos 40445 of Duplications). 41633)(Dec. 29, 1998); Lin, X., et al., “Arabidopsis thaliana clone US 8,633,353 B2 Page 7

(56) References Cited NCBI accession No. AI731275 (gi:5050127) (Jun. 11, 1999); Blewitt, M., et al., “BNLGHi9078 Six-day Cotton fiber Gossypium OTHER PUBLICATIONS hirsutum cDNA 5' similar to (X59714) CAAT-box DNA binding protein subunit B (NF-YB) Zea mays), mRNA sequence'; (Gos T2N18” (note: two versions are presented for review, gi:4071012 sypium hirsutum). submitted Dec. 29, 1998; and, gi:20 197714 submitted Apr. 18, 2002, NCBI accession No. AI782351 (gi:5280392) (Jun. 29, 1999); see p. 6 of 28, gene At2g37070, gi:4371294). D’Ascenzo, M., et al., “EST263230 tomato susceptible, Cornell NCBI acc. No. AC007063 (gi:4389532) (Mar 11 1999); Lin.X., et al. Lycopersicon esculentum cDNA clone cILES18L2, mRNA “Arabidopsis thaliana clone T10F5, *** Sequencing in Progress***, sequence', (Lycopersicon esculentum). 4 unordered pieces'; source: Arabidopsis thaliana (thale cress); Title: “Arabidopsis thaliana TAMU BAC T10F5 genomic NCBI accession No. AI900024 (gi:5605926) (Jul 27, 1999); Shoe sequence near marker GPC6' (Unpublished). maker, R., et al., “sb97g 11.y1 Gm-c 1012 Glycine may cDNA clone NCBI acc. No. AC104284 (gi: 17402732) (Dec 7, 2001); Chow, T-Y., Genome Systems clone ID: Gm-c 1012-6695'similar to SW:CBFA et al. “Oryza sativa (japonica cultivar-group) chromosome 5 clone Maize P25209 CCAAT-binding transcription factor subunit A: OJ1735C10, *** Sequencing in Progress ***, 7 ordered pieces”; mRNA sequence'; (Glycine max). Source: Oryza sativa (japonica cultivar-group); Title: "Oryza sativa NCBI accession No. AI965590 (gi:5760227) (Aug. 23, 1999); Shoe BAC OJ1735C10 genomic sequence” (Unpublished). maker, R., et al., "sc74b05.y1 Gm-c1018 Glycine may cDNA clone NCBI accession No. AC 120529 (gi:20503070) (pos. 90470-91129) Genome Systems clone ID: Gm-c 1018-5865" similar to SW:CBFA (May 8, 2002); Buell, C., et al. “Oryza sativa (japonica cultivar Maize P25209 CCAAT-binding transcription factor subunit A: group) chromosome 3 clone OSJNBa0039N21, complete sequence': mRNA sequence'; (Glycine max). (Oryza sativa ) (note: two versions are presented for review, NCBI accession No. AI966550(gi:5761187) (Aug. 23, 1999); Shoe gi:20503070 submitted May 8, 2002; and, gi:34447241 submitted maker, R., et al., "scSlhO1.y1 Gm-c1015 Glycine may cDNA clone Sep. 4, 2003, see protein id: 41469085, marked on p. 6/53). Genome Systems Clone ID: Gm-c1015-1130 5" similar to NCBI accession No. AC122165 (gi:21104913) (pos. 113095-113208 TR: 023633 023633 Transcription factor; mRNA sequence': in latest version) (May 23, 2002); Shaull, S., et al. “Medicago (Glycine max). truncatula clone mth2-32m22, complete sequence'; (Medicago NCBI accession No. AJ487398 (gi:22022152) (Jul 30, 2002); truncatula) (note: two versions arepresented for review, gi:21104913 Gebhardt, C., et al., “AJ487398 Solanum tuberosum cv. Provita submitted May 23, 2002; and, gi:71274322 submitted on Jul. 27. Solanum tuberosum cDNA clone Plea, mRNA sequence'. (Solanum 2005). tuberosum). NCBI accession No. AF193440 (gi:6289056) (Nov. 9, 1999); Gher NCBI accession No. AJ5.01023 (gi:22081956) (Aug. 1, 2002); raby, W., et al., “Arabidopsis thaliana heme activated protein Manthey, K., et al., “AJ5.01023 MTAMP Medicago truncatula cDNA (HAP5c) mRNA, complete cods'. clone mtgmadc120001 h01, mRNA sequence”; (Medicago NCBI accession No. AF410176 (gi:15321715) (Aug. 28, 2001); truncatula) (note: the original Submission and latest update are pro Lowe, K.S., et al. "Zea may’s leafy cotyledon) (Lec 1) mRNA, com vided for examination). plete cols', (Zea mays). NCBI accession No. AJ501814 (gi:22082742) (Aug. 1, 2002); NCBI accession No. AF533650 (gi:22536009) (Aug. 29, 2002); Manthey, K., et al., “AJ501814 MTAMP Medicago truncatula cDNA Kwong, R.W., et al. "Phaseolus coccineus LEC1-like protein mRNA, clone mtgmadc1200 12b08, mRNA sequence”; (Medicago complete cods”; (Phaseolus coccineus) (note: the original Submission truncatula) (note: the original Submission and latest update are pro and latest update are provided for examination). vided for examination). NCBI accession No. AI442376 (gi:4295745) (Feb. 19, 1999); Shoe NCBI accession No. AL132966 (gi:6434215) (pos. 15052-16661) maker, R., et al. “Sa26b07.y1 Gm-c1004 Glycine may cDNA clone (Nov. 15, 1999); Bloecker, H., et al., “Arabidopsis thaliana DNA Genome systems clone ID: Gm-c1004-398 5" similar to TR:023634 chromosome 3, BAC clone F4P12” (note: the original submission 023634 transcription factor; mRNA sequence” (publication: see and latest update are provided for examination, see gene F4P1240, Plant Cell 15 (1), 5-18, 2003, Leafy COTYLEDON1-Like Defines a protein id gi:CAB6764.1 marked on p. 4/66). Class of Regulators Essential for Embryo Development). NCBI accession No. AL387357 (gi:9687108) (Aug. 3, 2000); NCBI accession No. AI442765 (gi:42.98466) (Feb. 19, 1999); Shoe Journet, E.P. et al., “MtBC42A04F1 MtBC Medicago truncatula maker, R., et al. “Sa26b07.x1 Gm-c1004 Glycine may cDNA clone cDNA clone MtBC42A04 T3, mRNA sequence”; (Medicago genome systems clone ID: Gm-c1004-398 3' similar to TR:023634 truncatula) (note: the original Submission and latest update are pro 023634 transcription factor; mRNA sequence'; (Glycine max). vided for examination). NCBI accession No. AI486503 (gi:43.81874) (Mar. 9, 1999); Alcala, NCBI accession No. AL506199 (gii: 12032414) (Jan. 4, 2001); J., et al. “EST244824 tomato ovary, TAMU Lycopersiconesculentum Michalek, W., et al., “AL506199 Hordeum vulgare Baarke develop cDNA clone cIED6C8, mRNA sequence”; (Lycopersicon ing caryopsis (3.-15. DAP) Hordeum vulgare subsp. Vulgare cDNA esculentum). clone HY02F18T 5", mRNA sequence': (Hordeum vulugare). NCBI accession No. AI495.007 (gi:4396010) (Mar. 11, 1999); Shoe NCBI accession No. AL509098 (gii: 12035601) (Jan. 4, 2001); maker, R., et al. “Sa89f)3.y1 Gm-c1004 Glycine may cDNA clone Michalek, W., et al., “AL509098 Hordeum vulgare Barke developing genome systems clone ID: Gm-c1004-64865" similar to TR:023310 caryopsis (3.-15. DAP) Hordeum vulgare subsp. Vulgare cDNA clone 023310 CCAATt-binding transcription factor subunit A.; mRNA HY1OL07V 5", mRNA sequence': (Hordeum vulgare). sequence', (Glycine max). NCBI accession No. AL830693 (gi:21842473) (Jul 16, 2002); NCBI accession No. AI725612 (gi:5044464) (Jun. 11, 1999); Sprunck, S., et al., “AL830693 q:242 Triticum aestivum cDNA clone Blewitt, M., et al., “BNLGHi12445 Six-day Cotton fiber Gossypium D05 q242 plate 10, mRNA sequence': (Triticum aestivum). hirsutum cDNA 5" similar to CCAAT-binding transcription factor NCBI accession No. AP003246 (gi: 13027276) (pos. 28048-28351) subunit A (CBF-A) (NF-Y Protein Chain B) (NFYB) (CAAT-Box (Feb. 21, 2001); Sasaki, T., et al., “Oryza sativa (japonica cultivar DNA binding protein subunit B), mRNA sequence”; (Gossypium group) genomic DNA, chromosome 1, PAC clone: P0423A12'; hirsutum). (Oryza sativa) (note: two versions are presented for review, NCBI accession No. AI728916 (gi:5047768) (Jun. 11, 1999); gi:21104640 submitted Feb. 21, 2001; and, gi:21104640 submitted Blewitt, M., et al., “BNLGHi12022 Six-day Cotton fiber Gossypium May 22, 2002, see protein id BAB93258.1 marked on p. 1450) hirsutum cDNA 5" similar to (Y13723) Transcription factor (publication: see Nature 420 (6913), 312-316, 2002, the genome Arabidopsis thaliana, mRNA sequence'; (Gossypium hirsutum). sequence and structure office chromosome 1). NCBI accession No. AI731250 (gi:505.0102) (Jun. 11, 1999); NCBI acc. No. NP 030436 (gii: 18404885) (Jan 29 2002):, et al. Blewitt, M., et al., “BNLGHi9010 Six-day Cotton fiber Gossypium “putative CCAAT-binding transcription factor subunit'; source: hirsutum cDNA 5' similar to (X59714) CAAT-box DNA binding Unknown.; Title: “'. protein subunit B (NF-YB) Zea mays), mRNA sequence”; (Gos NCBI acc. No. NP 199575 (gii: 15238156) (Aug. 21 2001); sypium hirsutum). Tabata, S., et al. “putative protein Arabidopsis thaliana'; source: US 8,633,353 B2 Page 8

(56) References Cited MWM203a04 r 5’, mRNA sequence': (Lotus corniculatus) (publi cation: see DNA Res. 7 (2), 127-130, 2000, Generation of 7137 OTHER PUBLICATIONS non-redundant expressed sequence tags from a legume, Lotus japonicus). Arabidopsis thaliana (thale cress); Title: “Sequence and analysis of NCBI accession No. AV420653 (gi:7749830) (May 9, 2000); chromosome 5 of the plant Arabidopsis thaliana” (Nature 408 Asamizo, E., et al., “AV420653 Lotus japonicus young plants (two (6814), 823-826 (2000)). week old) Lotus corniculatus var. japonicus cDNA clone NCBI acc. No. NP 193190 (gii: 15233475) (Aug. 21 2001); MWM184h05 r 5", mRNA sequence': (Lotus corniculatus) (publi Mayer.K., et al. "CCAAT-binding transcription factor subunit cation: see DNA Res. 7 (2), 127-130, 2000, Generation of 7137 A(CBF-A) Arabidopsis thaliana'; source: Arabidopsis thaliana non-redundant expressed sequence tags from a legume, Lotus (thale cress); Title: “Sequence and analysis of chromosome 4 of the japonicus). plant Arabidopsis thaliana” (Nature 402 (6763), 769-777 (1999)). NCBI accession No. AV424305 (gi:7781090) (May 12, 2000); NCBI acc. No. NP 00103 1500 (gi: 79324546) (Nov. 3 2005), et al. Asamizo, E., et al., “AV424305 Lotus japonicus young plants (two “unknown protein Arabidopsis thaliana'; source: Arabidopsis week old) Lotus corniculatus var. japonicus cDNA clone thaliana (thale cress); Title: “” (). MWMO38e02 r 5", mRNA sequence': (Lotus corniculatus) (publi NCBI acc. No. NP 178981 (gi: 15225440) (Aug. 21 2001); Lin.X., cation: see Dna Res. 7 (2), 127-130, 2000, Generation of 7137 non et al. “putative CCAAT-box binding trancription factor Arabidopsis redundant expressed sequence tags from a legume, Lotus japonicus). thaliana'; source: Arabidopsis thaliana (thale cress); Title: NCBI accession No. AV425835 (gi:7784165) (May 12, 2000); “Sequence and analysis of chromosome 2 of the plant Arabidopsis Asamizo, E., et al., “AV425835 Lotus japonicus young plants (two thaliana” (Nature 402 (6763), 761-768 (1999)). week old) Lotus corniculatus var. japonicus cDNA clone NCBI acc. No. NP 190902 (gii: 15231796) (Aug. 21 2001); MWMO59e05 r 5", mRNA sequence': (Lotus corniculatus) (publi Salanoubat.M., et al. “transcription factor NF-Y. CCAAT-binding cation: see DNA Res. 7 (2), 127-130, 2000, Generation of 7137 like protein Arabidopsis thaliana'; source: Arabidopsis thaliana non-redundant expressed sequence tags from a legume, Lotus (thale cress); Title: “Sequence and analysis of chromosome 3 of the japonicus). plant Arabidopsis thaliana” (Nature 408 (6814), 820-822 (2000)). NCBI accession No. AV632044 (gii: 10775364) (Oct. 11, 2000); NCBI accession No. AP003266 (gi: 13027296)(pos. 83090-83330) Asamizo, E., etal, “AV632044 Chlamydomonas reinhardtii 5% CO2 (Feb. 21, 2001); Sasaki, T., et al., “Oryza sativa (japonica cultivar Chlamydomonas reinhardtii cDNA clone HC003b12 r 5", mRNA group) genomic DNA, chromosome 1, PAC clone:P0492G09": sequence”; (Chlamydomonas reinhardtii) (publication: see DNA (Oryza sativa ) (note: two versions are presented for review, Res. 7 (5), 305-307, 2000, Generation of expressed sequence tags gi: 13027296 submitted Feb. 21, 2001; and, gi:15408784 submitted from low-CO2 and high-0O2 adapted cells of Chlamydomonas Aug. 31, 2001, see proteinid BAB54190.1 marked on p. 6/47) (pub reinhardtii). lication: see Nature 420 (6913), 312-316, 2002. The genome NCBI accession No. AV632945 (gii: 10776265) (Oct. 11, 2000); sequence and structure office chromosome 1). Asamizo, E., etal, “AV632945 Chlamydomonas reinhardtii 5% CO2 NCBI accession No. AP003271 (gii: 13027301) (pos. 154362 Chlamydomonas reinhardtii cDNA clone HC014f12 r 5", mRNA 155761) (Feb. 21, 2001); Sasaki, T., et al., “Oryza sativa (japonica sequence”; (Chlamydomonas reinhardtii) (publication: see DNA cultivar-group) genomic DNA, chromosome 1, PAC clone: Res. 7 (5), 305-307, 2000, Generation of expressed sequence tags P0506B12: (Oryza sativa)) (note: two versions are presented for from low-CO2 and high-CO2 adapted cells of Chlamydomonas review, gi: 13027301 submitted Feb. 21, 2001; and, gi:20160789 sub reinhardtii). mitted Apr. 16, 2002, see protein id: BAD73383.1, gi:5620 1933 NCBI accession No. AW035570 (gi:5894326) (Sep. 15, 1999); marked on p. 3/45) (publication: see Nature 420 (6913), 312-316, Alcala, J., et al., “EST281308 tomato callus, TAMU Lycopersicon 2002. The genome sequence and structure of rice chromosome 1). esculentum cDNA clone cIEC39F2 similar to CAAT-box DNA bind NCBI accession No. AP004179 (gii: 15718436) (pos. 36798-37036) ing protein subunit B (NF-YB), putative, mRNA sequence': (Sep. 20, 2001); Sasaki, T., et al., “Oryza sativa (japonica cultivar (Lycopersicon esculentum). group) genomic DNA, chromosome 2, BAC clone: OJ1124 G07": NCBI accession No. AWO43377 (gi:5903906) (Sep. 18, 1999); (Oryza sativa ) ) (note: two versions are presented for review, Whetten, R.W. “ST32F09 Pine TriplEx shoot tip library Pinus taeda gi:157 18436 submitted Sep. 20, 2001; and, gi:45735881 submitted cDNA clone ST32F09, mRNA sequence': (Pinus taeda). Mar. 25, 2004, see protein id BAD 12927.1, gi:45735894 marked on NCBI accession No. AW132359 (gi:6133966) (Oct. 27, 1999); Shoe p. 6/32). maker, R., et al., “se03b02.y1 Gm-c 1013 Glycine may cDNA clone NCBI accession No. AP004366 (gii: 17046146) (pos. 72619-74018) Genome Systems clone ID: Gm-c1013-2404 5" similar to (Nov. 21, 2001); Sasaki, T., et al., “Oryza sativa (japonica cultivar SW:CBFA Maize P25209 CCAAT-binding transcription factor group) genomic DNA, chromosome 1, PAC clone: P0460004'; subunit A, mRNA sequence”; (Glycine max). (Oryza sativa ) ) (note: two versions are presented for review, NCBI accession No. AW200790 (gi:6481519) (Nov. 30, 1999); gi: 17046146 submittedd Nov. 21, 2001; and, gi:20805242 submitted Shoemaker, R., et al., “se93el 1.y1 Gm-c 1027 Glycine may cDNA May 15, 2002, see protein id: BAD73788.1, gi:56202329 marked on clone Genome Systems clone ID: Gm-c1027-357 5' similar to p. 12/46) (publication: see Nature 420 (6913), 312-316, 2002, The TR: 023310 023310 CCAAT-binding transcription factor subunit A: genome sequence and structure of rice chromosome 1). mRNA sequence'; (Glycine max). NCBI accession No. AP005193 (gi:20975319) (pos. 56389-57063) NCBI accession No. AW201996 (gi:6482782) (Nov. 30, 1999); (May 17, 2002); Sasaki, T., et al., “Oryza sativa (japonica cultivar Shoemaker, R., et al., “sf09g 11.y1 Gm-c 1027 Glycine may cDNA group) genomic DNA, chromosome 7, PAC clone: P0493C06'; clone Genome Systems clone ID: Gm-c 1027-1821 5" similar to (Oryza sativa ) ) (note: two versions are presented for review, TR: 023310 023310 CCAAT-binding transcription factor subunit A: gi:20975319 submitted May 17, 2002; and, gi:38142450 submitted mRNA sequence'; (Glycine max). Oct. 31, 2003, see protein id: BAD31 143.1, gi:50508657 marked on NCBI accession No. AW348.165 (gi:6845875) (Feb. 1, 2000); p. 7/50). Vodkin, L., et al., “GM210001A21D7Gm-r1021 Glycine may cDNA NCBI accession No. AT002114 (gi:5724898) (Aug. 10, 1999); Ryu, clone Gm-r1021-1583', mRNA sequence'; (Glycine max). S.W., et al., “AT002114 Flower bud cDNA Brassica rapa Subsp. NCBI accession No. AW395227 (gi:6913697) (Feb. 7, 2000); Shoe Pekinesis cDNA clone RF0417, mRNA sequence”. maker, R., et al., “sh45e04.y1 Gm-c1017 Glycine maxcDNA clone NCBI accession No. AU088581 (gi:7378310) (Mar. 31, 2000); Genome Systems clone ID: Gm-c1017-4663 5" similar to Sasaki, T., et al., “AU88581 Rice Callus Oryza sativa (japonica TR:Q9ZQC3 Q97.QC3 Putative CCAAT-binding transcription fac cultivar-group) cDNA clone C52742, mRNA sequence': (Oryza tor; mRNA sequence'; (Glycine max). sativa). NCBI accession No. AW397727 (gi:69 16197) (Feb. 7, 2000); Shoe NCBI accession No. AV411210 (gi:7740371) (May 9, 2000); maker, R., et al., “sg83ff)4.y1 Gm-c1026 Glycine maxcDNA clone Asamizu, E., et al., “AV41 1210 Lotus japonicus young plants (two Genome Systems clone ID: Gm-c1026-3445" similar to TR:023634 week old) Lotus corniculatus var. japonicus cDNA clone 023634 transcription factor; mRNA sequence”; (Glycine max). US 8,633,353 B2 Page 9

(56) References Cited Genome systems clone ID: Gm-c1027-5478.5" similar to TR:023310 023310 CCAAT-binding transcription factor subunit A, mRNA OTHER PUBLICATIONS sequence', (Glycine max). NCBI accession No. AW775623 (gi:7765436) (May 9, 2000); NCBI accession No. AW432980 (gi:6964287) (Feb. 11, 2000); Shoe Fedorova, M., et al., “EST334688 DSIL Medicago truncatula cDNA maker, R., et al., “si03a.01.y1 Gm-c 1029 Glycine maxcDNA clone clone plDSIL-2K6, mRNA sequence'; (Medicago truncatula). Genome Systems clone ID: Gm-c1029-97 5" similar to TR:081130 NCBI accession No. AW907348 (gi:807 1558) (May 24, 2000); Van 081130 CCAAT-box binding factor HAP3 homolog; mRNA der Hoeven, R.S., et al., “EST343471 potato stolon, Cornell Univer sequence', (Glycine max). sity Solanum tuberosum cDNA clone cSTA6D11, mRNA sequence': NCBI accession No. AW459387 (gi:7029546) (Feb. 24, 2000); Shoe (Solanum tuberosum). maker, R., et al., “sh23ff)3.y1 Gm-c1016 Glycine maxcDNA clone NCBI accession No. AW931376 (gi:8106777) (May 30, 2000); Genome Systems clone ID: Gm-c 1016-5622 5" similar to Alcala, J., et al., “EST357219 tomato fruit mature green, TAMU TR:Q9ZQC3 Q97.QC3 Putative CCAAT-binding transcription fac Lycopersicon esculentum cDNA clone cI.EF44N20 5", mRNA tor; mRNA sequence”; (Glycine max). sequence', (Lycopersicon esculentum). NCBI accession No. AW570530 (gi:7235201) (Mar. 13, 2000); NCBI accession No. AW931634 (gi:8107035) (May 30, 2000); Shoemaker, R., et al., “sj63c01.y1 Gm-c 1033 Glycine maxcDNA Alcala, J., et al., “EST357477 tomato fruit mature green, TAMU clone Genome Systems clone ID: Gm-c 1033-1945 5" similar to Lycopersicon esculentum cDNA clone cI.EF45F24 5", mRNA SW:CBFA Maize P25209 CCAAT-binding transcription factor sequence', (Lycopersicon esculentum). subunit A, mRNA sequence”; (Glycine max). NCBI accession No. AW980494 (gi:8172030) (Jun. 2, 2000); NCBI accession No. AW597630 (gi:7285.143) (Mar. 22, 2000); Fedorova, M., et al., “EST391647 GVN Medicago truncatula cDNA Shoemaker, R., et al., “sj96g06.y1 Gm-c 1023 Glycine maxcDNA clone pGVN-55A17, mRNA sequence”; (Medicago truncatula). clone Genome Systems clone ID: Gm-c 1023-2483 5" similar to NCBI accession No. AW981720 (gi:8173288) (Jun. 2, 2000); TR:Q9ZQC3 Q97.QC3 Putative CCAAT-binding transcription fac Whetten, R.W., et al., “PC15H07 pine TriplEx pollen cone library tor; mRNA sequence”; (Glycine max). Pinus taeda cDNA clone PC15H07, mRNA sequence', (Pinus NCBI accession No. AW621652 (gi:7333299) (Mar. 28, 2000); Van taeda). der Hoeven, R.S., et al., “EST312450 tomato root during/after fruit NCBI accession No. AX180950 (gi:15132741)(Aug.9, 2001); Costa set, Cornell University Lycopersicon esculentum cDNA clone e Silva, O.D., et al., “Sequence 1 from Patent WO0145493”; cLEX12N125", mRNA sequence”; (Lycopersicon esculentum). (Physcomitrella patens) (note: the original Submission and latest NCBI accession No. AW625817 (gi:7338844) (Mar. 28, 2000); Van update are provided for examination) (publication: see WOO 145493 der Hoeven, R.S., et al., “EST319724 tomato radicle, 5d post-imbi A1: Jun. 28, 2001). bition, Cornell University Lycopersicon esculentum cDNA clone NCBI accession No. AX180957 (gi:15132748) (Aug.9, 2001); Costa cLEZ17A35", mRNA sequence”; (Lycopersicon esculentum). e Silva, O.D., et al., “Sequence 8 from Patent WO0145493”; NCBI accession No. AW648378 (gi:7409616) (Apr. 4, 2000); Alcala, (Physcomitrella patens) (note: the original Submission and latest J., et al., “EST326832 tomato germinating seedlings, TAMU update are provided for examination) (publication: see WOO 145493 Lycopersicon esculentum cDNA clone cI.EI4I18 5", mRNA A1: Jun. 28, 2001). sequence', (Lycopersicon esculentum). NCBI accession No. AX288144 (gi:17049846) (Nov. 22, 2001); da NCBI accession No. AW648379 (gi:7409617) (Apr. 4, 2000); Alcala, Costa Silva, O., et al., “Sequence 15 from Patent WOO177311”: J., et al., “EST326833 tomato germinating seedlings, TAMU (Physcomitrella patens) (note: the original Submission and latest Lycopersicon esculentum cDNA clone cI.EI4I22 5", mRNA update are provided for examination) (publication: see WO sequence', (Lycopersicon esculentum). 0.1773 11-A 15; Oct. 18, 2001). NCBI accession No. AW688588 (gi:7563324) (Apr. 14, 2000); He, NCBI accession No. AX365282 (gi: 18697024) (Feb. 16, 2002); X.-Z. et al., “NFO09C11ST1F1000 Developing stem Medicago Garnaat, C., et al., “Sequence 18 from Patent WO0206499”; (Zea truncatula cDNA clone NF009C11ST 5", mRNA sequence': mays) (note: the original Submission and latest update are provided (Medicago truncatula). for examination) (publication: see WO 0206499-A 18; Jan. 24. NCBI accession No. AW719547 (gi:7614059) (Apr. 19, 2000); 2002). Freund, S., et al., “LNEST6a3r Lotus japonicus nodule library, NCBI accession No. AX584259 (gi:27655760) (Jan. 11, 2003); mature and immature nodules Lotus corniculatus var. japonicus Cahoon, R.E., et al., Sequence 1 from Patent WO02057439 cDNA 5", mRNA sequence': (Lotus corniculatus). (Momordica charantia) (note: the original Submission and latest NCBI accession No. AW720671 (gi:7615221) (Apr. 19, 2000); update are provided for examination) (publication: see WO Colebatch, G., et al., “LNEST6a3rc Lotus japonicus nodule library 02057439-A 1: Jul. 25, 2002). 5 and 7 week-old Lotus corniculatus var. japonicus cDNA 5", mRNA NCBI accession No. AX584261 (gi:27655761) (Jan. 11, 2003); sequence', (Lotus corniculatus). Cahoon, R.E., et al., “Sequence 3 from patent WO2057439”; (Euca NCBI accession No. AW733618 (gi:7639292) (Apr. 24, 2000); Shoe lyptus grandis) (note: the original Submission and latest update are maker, R., et al., “sk75h06.y1 Gm-c 1016 Glycine may cDNA clone provided for examination) (publication: see WO 02057439-A3; Jul. Genome systems clone ID: Gm-c 1016-9972 5" similar to 25, 2002). TR:Q9ZQC3 Q97.QC3 Putative CCAAT-binding transcription fac NCBI accession No. AX584263 (gi:27655762) (Jan. 11, 2003); tor, mRNA sequence”; (Glycine max). Cahoon, R.E., et al., “Sequence 5 from Patent WO02057439”; (Zea NCBI accession No. AW738727 (gi:7647672) (Apr. 25, 2000); Van mays) (note: the original Submission and latest update are provided der Hoeven, R.S., et al., “EST340154 tomato flower buds, anthesis, for examination) (publication: see WO 02057439-A5;Jul. 25, 2002). Cornell University Lycopersicon esculentum cDNA clone NCBI accession No. AX584265 (gi:27655763) (Jan. 11, 2003); cTOD8G22 5", mRNA sequence'; (Lycopersicon esculentum). Cahoon, R.E., et al., “Sequence 7 from Patent WO02057439”; (Zea NCBI accession No. AW754604 (gi:7676324) (May 1, 2000); mays) (note: the original Submission and latest update are provided Whetten, R.W. et al., “PC04B12 Pine TriplEx pollen cone library for examination) (publication: see WO 02057439-A7; Jul. 25, 2002). Pinus taeda cDNA clone PC04B12, mRNA sequence': (Pinus NCBI accession No. AX584267 (gi:27655764) (Jan. 11, 2003); taeda). Cahoon, R.E., et al., “Sequence 9 from patent WO02057439”; (Oryza NCBI accession No. AW756413 (gi:7685765) (May 3, 2000); Shoe sativa) (note: the original Submission and latest update are provided maker, R., et al., “s 121a12.y1 Gm-c1036 Glycine may cDNA clone for examination) (publication: see WO 02057439-A9; Jul. 25, 2002). Genome systems clone ID: Gm-c 1036-19435" similar to TR:081130 NCBI accession No. AX584269 (gi:27655765) (Jan. 11, 2003); 081130 CCAAT-box binding factor HAP3 homolog, mRNA Cahoon, R.E., et al., “Sequence 11 from Patent WO02057439”: sequence', (Glycine max). (Oryza sativa) (note: the original Submission and latest update are NCBI accession No. AW760103 (gi:769 1987) (May 4, 2000); Shoe provided for examination) (publication: see WO 02057439-A 11; Jul. maker, R., et al., “s 158b03.y1 Gm-c 1027 Glycine may cDNA clone 25, 2002). US 8,633,353 B2 Page 10

(56) References Cited NCBI accession No. BAB92931 (gi:20805265) (May 15, 2002); Sasaki, T., et al. “Putative CAAT-box DNA binding protein' (Oryza OTHER PUBLICATIONS sativa) (note: the original Submission and latest update are provided for examination) (publication: see Nature 420 (6913),312-316, 2002, NCBI accession No. AX584271 (gi:27655766) (Jan. 11, 2003); The genome sequence and structure of rice chromosome 1). Cahoon, R.E. et al., “Sequence 13 from Patent WO02057439”: NCBI accession No. BAB93257 (gi:21 104666) (May 22, 2002); (Glycine max) (note: the original Submission and latest update are Sasaki, T., et al. “P0423Al2.29” (Oryza sativa) (note: the original provided for examination) (publication: see WOO2057439-A 13; Jul. Submission and latest update are provided for examination) (publi 25, 2002). cation: see Nature 420 (6913), 312-316, 2002. The genome sequence NCBI accession No. AX584273 (gi:27655767) (Jan. 11, 2003); and structure of rice chromosome 1). Cahoon, R.E., et al. “Sequence 15 from Patent WO02057439”: NCBI accession No. BAB93258 (gi:21 104667) (May 22, 2002); (Glycine max) (note: the original Submission and latest update are Sasaki, T., et al. “Putative HAP3-like transcriptional-activator': provided for examination) (publication: see WOO2057439-A 15; Jul. (Oryza sativa) (note: the original Submission and latest update are 25, 2002). provided for examination) (publication: see Nature 420 (6913), 312 NCBI accession No. AX584275 (gi:27655768) (Jan. 11, 2003); 316, 2002. The genome sequence and structure of rice chromosome Cahoon, R.E., et al., “Sequence 17 from Patent WO02057439”: 1). (Glycine max) (note: the original Submission and latest update are NCBI accession No. BAC76331 (gi:30409459) (May 6, 2003); provided for examination) (publication: see WOO2057439-A 17; Jul. Miyoshi, K., et al., “NF-YB Oryza sativa (japonica cultivar 25, 2002). group)' (note: the original Submission and latest update are provided NCBI accession No. AX584277 (gi:27655769) (Jan. 11, 2003); for examination) (publication: see Plant J. 36, 532-540, 2003, Cahoon, R.E., et al., “Sequence 19 from Patent WO02057439”: OSHAP3 genes regulate chloroplast biogenesis in rice). (Glycine max) (note: the original Submission and latest update are NCBI accession No. BAC76332 (gi:30409461) (May 6, 2003); provided for examination) (publication: see WOO2057439-A 19; Jul. Miyoshi, K., et al., “NF-YB Oryza sativa (japonica cultivar 25, 2002). group); (note: the original Submission and latest update are pro NCBI accession No. AX584279 (gi:27655770) (Jan. 11, 2003); vided for examination) (publication: see Plant J. 36,532-540, 2003, Cahoon, R.E., et al., “Sequence 21 from patent WO02057439”: OSHAP3 genes regulate chloroplast biogenesis in rice). (Glycine max) (note: the original Submission and latest update are NCBI accession No. BAC76333 (gi:30409463) (May 6, 2003); provided for examination) (publication: see WOO2057439-A21; Jul. Miyoshi, K., et al., “NF-YB Oryza sativa (japonica cultivar 25, 2002). NCBI accession No. AX584281 (gi:27655771) (Jan. 11, 2003); group); (note: the original Submission and latest update are pro Cahoon, R.E., et al., “Sequence 23 from patent WO02057439”: vided for examination) (publication: see Plant J. 36,532-540, 2003, (Triticum aestivum) (note: the original Submission and latest update OSHAP3 genes regulate chloroplast biogenesis in rice). are provided for examination) (publication: see WO 02057439-A23; NCBI accession No. BE021941 (gi:8284382) (Jun. 6, 2000); Shoe Jul. 25, 2002). maker, R., et al., “smó4d05.y1 Gm-c 1028 Glycine may cDNA clone NCBI accession No. AX584283 (gi:27655772) (Jan. 11, 2003); Genome systems clone ID: Gm-c 1028-8674 5" similar to Cahoon, R.E., et al., “Sequence 25 from patent WO02057439”: TR:Q9ZQC3 Q97.QC3 Putative CCAAT-binding transcription fac (Triticum aestivum) (note: the original Submission and latest update tor; mRNA sequence'; (Glycine max). are provided for examination) (publication: see WO 02057439-A25; NCBI accession No. BE054369 (gi: 13243855) (Jun. 8, 2000); Wing, Jul. 25, 2002). R.A., et al., “GA EaO002AO5f Gossypium arboreum 7-10 dpa fiber NCBI accession No. AX584285 (gi:27655773) (Jan. 11, 2003); library Gossypium arboreum cDNA clone GA Ea?)002AO5f. Cahoon, R.E., et al., “Sequence 27 from patent WO02057439”: mRNA sequence', (Gossypium arboreum) (note: two versions are (Triticum aestivum) (note: the original Submission and latest update presented for review, gi: 13243855 submitted Jun. 8, 2000; and, are provided for examination) (publication: see WO 02057439-A27; gi:8381425 submitted on Jun. 8, 2000). Jul. 25, 2002). NCBI accession No. BE060015 (gi:8404381) (Jun. 9, 2000); Shoe NCBI accession No. AX584287 (gi:27655774) (Jan. 11, 2003); maker, R., et al., "sn39h 06.y1 Gm-c 1027 Glycine may cDNA clone Cahoon, R.E., et al., “Sequence 29 from patent WO02057439”: Genome systems clone ID: Gm-c1027-96845" similar to TR:023634 (Canna indica) (note: the original Submission and latest update are 023634 transcription factor; mRNA sequence”; (Glycine max). provided for examination) (publication: see WOO2057439-A29; Jul. 25, 2002). NCBI accession No. BE121888 (gi:8513993) (Jun. 13, 2000); Gross NCBI accession No. AY 1 12643 (gi:21217233) (May 26, 2002); man, A., et al., "894.015G05.y1 C. reinhardtii cc-1690, normalized, Gardiner, J., et al., “Zea may’s CL691 1 mRNA sequence”; (Zea Lambda ZapII Chlamydomonas reinhardtii cDNA, mRNA mays) (note: the original Submission and latest update are provided sequence”; (Chlamydomonas reinhardtii). for examination) (publication: see Plant Physiol. 134 (4), 1317-1326 NCBI accession No. BE196056 (gi: 13188584) (Jun. 26, 2000); 2004, Anchoring 9,371 maize expressed sequence tagged unigenes to Wing, R., et al., “HVSMEh0091D23f Hordeum vulgare 5-45 DAP the bacterial artificial chromosome contig map by two-dimensional spike EST library HVcDNAO009 (5 to 45 DAP) Hordeum vulgare overgo hybridization). subsp. Vulgare cDNA clone HVSMEh0091D23f, mRNA sequence': NCBI accession No. BAB64189 (gi: 15408793) (Aug. 31, 2001); (Hordeum vulgare) ) (note: two versions are presented for review, Sasaki, T., et al. “P0492G09.10” (Oryza sativa) (note: the original gi: 13188584 submitted Jun. 26, 2000; and, gi:8708251 submitted Submission and latest update are provided for examination) (publi Jun. 26, 2000). cation: see Nature 420 (6913), 312-316, 2002. The genome sequence NCBI accession No. BE210041 (gi:8826311) (Jun. 29, 2000); Shoe and structure of rice chromosome 1). maker, R., et al., "so38b01.y1 Gm-c 1039 Glycine may cDNA clone NCBI accession No. BAB64190 (gi: 15408794) (Aug. 31, 2001); Genome systems clone ID: Gm-c1039-1945" similar to TR:Q9ZQC3 Sasaki, T., et al. “Putative HAP3-like transcriptional-activator' Q97.QC3 putative CCAAT-binding transcription factor; mRNA (Oryza sativa) (note: the original Submission and latest update are sequence', (Glycine max). provided for examination) (publication: see Nature 420 (6913), 312 NCBI accession No. BE356560 (gi:9298117) (Jul. 20, 2000); 316, 2002. The genome sequence and structure of rice chromosome Cordonnier-Pratt, M., et al. “DG1 126 D05.b1 AO02 Dark 1). Grown 1 (DG 1) Sorghum bicolor cDNA, mRNA sequence': (Sor NCBI accession No. BAB89732 (gi:20160792) (Apr. 16, 2002); ghum bicolor). Sasaki, et al. “Putative CAAT-box DNA binding protein' (Oryza NCBI accession No. BE413647 (gi:941 1493) (Jul 24, 2000); Ander sativa) (note: the original Submission and latest update are provided son, O.A., et al., “SCU001.E10.R990714 ITEC SCU. Wheat for examination) (publication: see Nature 420 (6913),312-316, 2002, Endosperm Library Triticum aestivum cDNA clone SCU001.E.10, The genome sequence and structure of rice chromosome 1). mRNA sequence': (Triticum aestivum). US 8,633,353 B2 Page 11

(56) References Cited NCBI accession No. BF 169598 (gi: 11054215) (Oct. 30, 2000); Sederoff, R., “NXCI 125 B04 FNXCI (NsfXylem Compression OTHER PUBLICATIONS wood Inclined) Pinus taeda cDNA clone NXCI 125 B045" similar to Arabidopsis thaliana sequence At2g37060 putative CCAAT-box NCBI accession No. BE418716 (gi:9416562) (Jul 24, 2000); Ander binding transcription factor see http://mips.gsfdef pro?thal/db/index. son, O.A., et al., “SCL074.B01 R990724 ITEC SCL Wheat Leaf html, mRNA sequence': (Pinus taeda). Library Triticum aestivum cDNA clone SCLO74.B01, mRNA NCBI accession No. BF263449 (gii: 13260832) (Nov. 17, 2000); sequence'. (Triticum aestivum). Wing, R. et al., “HV CEaO006M10f Hordeum vulgare seedling NCBI accession No. BE441 135 (gi:944.0635) (Jul 25, 2000); Alcala, green leaf EST library HvcDNA 0004 (Blumeria challenged) J., et al., “EST408405 tomato developing/immature green fruit Hordeum vulgare subsp. Vulgare cDNA clone HV CEa(0.006M10f, Lycopersicon esculentum cDNA clone cI.EM6C23 similar to Zea may’s CAAT-box DNA binding protein subunit B, mRNA sequence': mRNA sequence', (Hordeum vulgare) (note: two versions are pre (Lycopersicon esculentum). sented for review, gi: 13260832 Submitted Nov. 17, 2000; and, NCBI accession No. BE441739 (gi:9441376) (Jul 25, 2000); Gross gi:11 194443 submitted Nov. 17, 2000). man, A., et al., '925.009 A11.xl C. Reinhardtii CC-2290, normalized, NCBI accession No. BF263455 (gii: 13260837) (Nov. 17, 2000); Lambda Zap II Chlamydomonas reinhardtii cDNA, mRNA Wing, R. et al., “HV CEaO006M16f Hordeum vulgare seedling sequence”; (Chlamydomonas reinhardtii). green leaf EST library HvcDNA 0004 (Blumeria challenged) NCBI accession No. BE496857 (gi:9695474) (Aug. 4, 2000); Ander Hordeum vulgare subsp. Vulgare cDNA clone HV CEa(0.006M16f, son, O.D. et al., “WHE0761 D09 H17ZS Wheat heat-stressed mRNA sequence', (Hordeum vulgare) (note: two versions are pre seedling cDNA library Triticum aestivum cDNA clone WHE0761 sented for review, gi: 13260837 submitted on Nov. 17, 2000; and D09 H17, mRNA sequence”; (Triticum aestivum). gi:16334312 submitted Oct. 23, 2001). NCBI accession No. BE516510 (gi:9740538) (Aug. 8, 2000); Ander NCBI accession No. BF270164 (gii: 13247369) (Nov. 17, 2000); son, O.D., et al., “WHE611 D10 H19ZA Wheat ABA-treated Wing, R. et al., “GA Eb0007A21f Gossypium arboreum 7-10 dpa embryo cDNA library Triticum aestivum cDNA clone WHE611 fiber library Gossypium arboreum cDNA clone GA Eb0007A21f. D10 H19, mRNA sequence”; (Triticum aestivum). mRNA sequence', (Gossypium arboreum) (note: two versions are NCBI accession No. BE603222 (gi: 13191083) (Aug. 21, 2000); presented for review, gi: 13247369 submitted Nov. 17, 2000; and, Wing, R., et al., “HVSMEh0102J16f Hordeum vulgare 5-45 DAP gi:1120 1159 submitted Nov. 17, 2000). spike EST library HVcDNAO009 (5 to 4 DAP) Hordeum vulgare subsp. Vulgare cDNA clone HVSMEhQ102J16f mRNA sequence': NCBI accession No. BF270944 (gi:1120 1939) (Nov. 17, 2000); (Hordeum vulgare) ) (note: two versions are presented for review, Wing, R. et al., “GA Eb0010B11f Gossypium arboreum 7-10 dpa gi: 13191083 submitted Aug. 21, 2000; and, gi:9860783 submitted fiber library Gossypium arboreum cDNA clone GA Eb0010B11f. Aug. 21, 2000). mRNA sequence', (Gossypium arboreum). NCBI accession No. BE604847 (gi:9862117) (Aug. 21, 2000); NCBI accession No. BF291752 (gi:11222816) (Nov. 17, 2000); Anderson, O.D., et al., “WHE 1713-1716 D19 D19ZS Wheat heat Akhunov, E., et al., “WHE2205 F04 K07ZS Aegilops speltoides stressed spike cDNA library Triticum aestivum cDNA clone anther cDNA library Aegilops speltoides cDNA clone WHE2205 WHE 1713-1716 D19 D19, mRNA sequence”; (Triticum F04 K07, mRNA sequence'; (Aegilops speltoides). aestivum). NCBI accession No. BF459554 (gi:1 1528.732) (Dec. 4, 2000); NCBI accession No. BE641 101 (gi:9958761) (Sep. 1, 2000); Salmi, Crookshanks, M., et al., “061 A04 Mature tuber lambda ZAP M.L., et al., “Cri2 2 E11 SP6 Ceratopteris Spore Library Solanum tuberosum cDNA 5" similar to (AL 132966) transcription Ceratopteris richardii cDNA clone Crit 2 E11 5", mRNA factor NF-Y. CCAAT-binding . . . emb CAB7641.1, mRNA sequence”; (Ceratopteris richardii) (publication: see Plant Physiol. sequence': (Solanum tuberosum) (publication: see FEBS Lett. 506 138 (3), 1734-1745, 2005, Profile and analysis of gene expression (2), 123-126, 2001. The potato tuber transcriptome: analysis of 6077 changes during early development in germinating spores of expressed sequence tags). Ceratopteris richardii). NCBI accession No. BF460267 (gi: 11529424) (Dec. 4, 2000); NCBI accession No. BE726750 (gi:10 127934) (Sep. 14, 2000); Crookshanks, M., et al., “073E08Mature tuberlambda ZAPSolanum Grossman, A., et al., "894093C12.y3 C. Reinhardtii CC-1690, nor tuberosum cDNA 5" similar to CCAAT-binding transcription factor malized, Lambda Zap II Chlamydomonas reinhardtii cDNA, mRNA subunit . . . sp. P25209, mRNA sequence"; (Solanum tuberosum) sequence”; (Chlamydomonas reinhardtii). (publication: see FEBSLett. 506(2), 123-126, 2001. The potato tuber NCBI accession No. BE802539 (gi:10233651) (Sep. 20, 2000); transcriptome: analysis of 6077 expressed sequence tags). Shoemaker, R. et al., “sr32fO2.y1 Gm-c 1050 Glycine may cDNA NCBI accession No. BF517889 (gi: 11606021) (Dec. 8, 2000); clone Genome systems clone ID: Gm-c 1050-2068 5" similar to SW: Sederoff, R., “NXSI 029 D01 F NXSI (Nsf Xylem Side wood CBFA Maize P25209 CCAAT-binding transcription factor subunit Inclined) Pinus taeda cDNA clone NXSI 029 D01 5" similar to A, mRNA sequence”; (Glycine max). Arabidopsis thaliana sequence At2g37060 putative CCAAT-box NCBI accession No. BE803572 (gi:10234684) (Sep. 20, 2000); binding transcription factor see http://mips.gsfdef pro?thal/db/index. Shoemaker, R. et al., “srÓ0e 1 1.y1 Gm-c 1052 Glycine may cDNA html, mRNA sequence': (Pinus taeda). clone Genome systems clone ID: Gm-c 1052-165 5" similar to NCBI accession No. BF585526 (gi:11677850) (Dec. 12, 2000); TR:Q9ZQC3 Q97.QC3 Putative CCAAT-binding transcription fac Cordonnier-Pratt, M., et al., “FM1 23 E09.g1 AO03 Floral-In tor; mRNA sequence”; (Glycine max). duced Meristem 1 (FM1) Sorghum propinquum cDNA, mRNA NCBI accession No. BE804236 (gi:10235348) (Sep. 20, 2000); sequence'. (Sorghum propinquium). Shoemaker, R. et al., “sr77b04.y1 Gm-c 1052 Glycine may cDNA NCBI accession No. BF585616 (gi:11677940) (Dec. 12, 2000); clone Genome systems clone ID: Gm-c 1052-1736 5" similar to Cordonnier-Pratt, M., et al., “FM1 23 E09.b1 AO03 Floral-In SW:CBFA Maize P25209 CCAAT-binding transcription factor duced Meristem 1 (FM1) Sorghum propinquum cDNA, mRNA subunit A, mRNA sequence”; (Glycine max). sequence'. (Sorghum propinquium). NCBI accession No. BF065056 (gi: 1084.1695) (Oct. 17, 2000); NCBI accession No. BF595304 (gi:1 1687628) (Dec. 12, 2000); Wing, R., et al., “HV CEb022MOlf Hordeum vulgare seedling Shoemaker, R., et al., “su76f(03.y1 Gm-c 1055 Glycine may cDNA green leaf EST library HvcDNA 0005 (Blumeria challenged) clone Genome Systems clone ID: gm-c 1055-653 5' similar to Hordeum vulgare subsp. Vulgare cDNA clone HV CEb0022M01f. TR:081130 081130 CCAAT-Box binding factor HAP3 homolog: mRNA sequence': (Hordeum vulgare). mRNA sequence'; (Glycine max). NCBI accession No. BF071234 (gi: 10845982) (Oct. 17, 2000); NCBI accession No. BF597252 (gi:1 1689576) (Dec. 12, 2000); Shoemaker, R. et al., “stO6hO5.y1 Gm-c 1065 Glycine may cDNA Shoemaker, R., et al. “Su96co6.y1 Gm-c 1056 Glycine soia cDNA clone Genome systems clone ID: Gm-c 1065-562 5" similar to clone Genome Systems clone ID: gm-c 1056-131 5' similar to TR:Q9ZQC3 Q97.QC3 Putative CCAAT-binding transcription fac TR:Q97OC3 Q9ZQC3 Putative CCAAT-Box binding transcription tor; mRNA sequence”; (Glycine max). factor; mRNA sequence', (Glycine soia). US 8,633,353 B2 Page 12

(56) References Cited Genome Systems clone ID: Gm-c 1040-4581 5" similar to TR: Q97.QC3 Q97.QC3 Putative CCAAT-binding transcription factor; OTHER PUBLICATIONS mRNA sequence'; (Glycine max). NCBI accession No. BG368375 (gi: 13257476) (Mar. 8, 2001); Wing, NCBI accession No. BF636140 (gi:1 1900298) (Dec. 19, 2000); Tor R., et al., “HVSMEi0018C01f Hordeum vulgare 20 DAP spike EST rez-Jerez, I., et al., “NFO60HO9DT1F1079 Drought Medicago library HvcDNA0010 (20 DAP) Hordeum vulgare subsp. vulgare truncatula cDNA clone NF060HO9DT 5", mRNA sequence': cDNA clone HVSMEi0018C01f, mRNA sequence”; (Hordeum (Medicago truncatula). vulgare) ) (note: two versions are presented for review, gi: 13257476 NCBI accession No. BF645376 (gi: 11910505) (Dec. 20, 2000); submitted Mar. 8, 2001; and, gi:16325230 submitted Oct. 22, 2001). Torres-Jerez, I., et al., “NFO40B5EC1F1044 Elicted cell culture Medicago truncatula cDNA clone NF040B05EC 5", mRNA NCBI accession No. BG440251 (gii: 13349902) (Mar. 15, 2001); sequence'; (Medicago truncatula). Wing, R.A., et al., “GA Ea?)06K2Of Gossypium arboreum 7-10 dpa NCBI accession No. BF65.1151 (gi: 11916281) (Dec. 20, 2000); fiber library Gossypium arboreum cDNA clone GA Ea?)006K20f. Torres-Jerez, I., et al., “NF101 H1 OEC1F1090 Elicted cell culture mRNA sequence', (Gossypium arboreum). Medicago truncatula cDNA clone NF101H10EC 5", mRNA NCBI accession No. BG445358 (gii: 13355010) (Mar. 15, 2001); sequence'; (Medicago truncatula). Wing, R.A., et al., “GA Ea?)027N18f Gossypium arboreum 7-10 NCBI accession No. BF715909 (gii: 12015181) (Jan. 2, 2001); Shoe dpa fiber library Gossypium arboreum cDNA clone maker, R., et al., “saal le(08.y1 Gm-c 1058 Glycine soia cDNA clone GA Ea?)027N18f mRNA sequence'; (Gossypium arboreum). Genome systems clone ID: Gm-c 1058-999.5" similar to TR:O23310 NCBI accession No. BG526135 (gi: 16949604) (Nov. 16, 2001); O23310 CCAAT-binding transcription factor subunit A, mRNA Brandle, J.E. et al., “57-6 Stevia field grown leaf cDNA Stevia sequence', (Glycine soia). rebaudiana cDNA 5", mRNA sequence"; (Stevia rebaudiana). NCBI accession No. BF777951 (gii: 12125851) (Jan. 12, 2001); NCBI accession No. BG551755 (gi: 13563535) (Apr. 9, 2001); Shoe Sederoff, R., “NXSI 079 C03 F NXSI (Nsf Xylem Side wood maker, R., "sad42f11.y1 Gm-c1075 Glycine may cDNA clone Inclined) Pinus taeda cDNA clone NXSI 079 C03 5' similar to Genome systems clone ID: Gm-c1075-6695' similar to TR:081130 Arabidopsis thaliana sequence Atag14540 CCAAT-binding tran 081130 CCAAT-Box binding factor HAP3 homolog; mRNA scription factor subunit A (CBF-A) see http://mips.gsfde?projthal/ sequence', (Glycine max). db/index.html, mRNA sequence': (Pinus taeda). NCBI accession No. BG589029 (gii: 13607169) (Apr. 12, 2001); Har NCBI accession No. BG039303 (gi: 1248.1888) (Jan. 24, 2001); rison, M.J., et al., “EST490838 MHRP-Medicago truncatula cDNA Sederoff, R., “NXSI 097 E11 F NXSI (Nsf Xylem Side Wood clone pMHRP-59024, mRNA sequence”; (Medicago truncatula). Inclined) Pinus taeda cDNA clone NXSI 097 E11 5" similar to NCBI accession No. BG594268 (gi: 13612408) (Apr. 12, 2001); Van Arabidopsis thaliana sequence At2g37060 putative CCAAT-box der Hoeven, R., et al., “EST49294.6 cSTS Solanum tuberosum cDNA binding transcription factor see http://mips.gsfdef pro?thal/db/index. clone cSTS7A17 5" sequence, mRNA sequence'. (Solanum html, mRNA sequence': (Pinus taeda). tuberosum). NCBI accession No. BG135204 (gii: 12635392) (Jan. 31, 2001); Van NCBI accession No. BG599785 (gi: 13616921) (Apr. 12, 2001); Van der Hoeven, R., et al., “EST468096 tomato crown gall Lycopersicon der Hoeven, R., et al., “EST504680 cSTS Solanum tuberosum cDNA esculentum cDNA clone cTOE21D12 5' sequence, mRNA clone cSTS26G12 5" sequence, mRNA sequence'. (Solanum sequence', (Lycopersicon esculentum). tuberosum). NCBI accession No. BG263362 (gi: 12865444) (Feb. 16, 2001); NCBI accession No. BG642751 (gi: 13777673) (Apr. 24, 2001); Van Anderson, O.D., et al., “WHE2341 B02 C03ZS Wheat pre der Hoeven, R., et al., “EST510945 tomato shoot/meristem anthesis spike cDNA library Triticum aestivum cdNA clone Lycopersicon esculentum cDNA clone cTOF25F23 5" sequence, WHE2341 B02 C03, mRNA sequence': NCBI accession No. mRNA sequence', (Lycopersicon esculentum). BG263362 (gi: 12865444) (Triticum aestivum). NCBI accession No. BG644353 (gii: 13779465) (Apr. 24, 2001); NCBI accession No. BG274786 (gi: 13067446) (Feb. 21, 2001); VandenBosch, K., et al., “EST505972 KV3 Medicago truncatula Akhunov, E., et al., “WHE2234 C03 E06ZS Aegilops speltoides cDNA clone pKV3-37C23 5' end, mRNA sequence'; (Medicago anther cDNA library Aegilops speltoides cDNA clone WHE2234 truncatula). C03 E06, mRNA sequence”; (Aegilops speltoides). NCBI accession No. BG662094 (gii: 13884016) (Apr. 30, 2001); NCBI accession No. BG314203 (gi: 13116006) (Feb. 23, 2001); Poulsen, C., et al., “Ljirnpest38-110-g8 Ljirnp Lambda HybriZap Anderson, O.D., et al., “WHE2460 E10 I2OZS Triticum monococ two-hybrid library Lotus corniculatus var. japonicus cDNA clone cum early reproductive apex cDNA library Triticum monococcum LP110-38-g8 5" similar to homolog of transcription factor NF-Y. cDNA clone WHE2460 E10 120, mRNA sequence”; (Triticum CCAAT-binding-like protein, mRNA sequence': (Lotus monococcum). corniculatus). NCBI accession No. BG318871 (gi: 13128301) (Feb. 26, 2001); NCBI accession No. BG832836 (gi:14189478) (May 22, 2001); Sederoff, R., “NXPV 020 H08 F NXPV (Nsf Xylem Planings Sederoff, R., “NXPV 081 C10 F NXPV (Nsf Xylem Planings Wood Vertical) Pinus taeda cDNA clone NXPV 020 H08 5" simi wood Vertical) Pinus taeda cDNA clone NXPV 081 C105" similar lar to Arabidopsis thaliana sequence Atag14540 CCAAT-binding to Arabidopsis thaliana sequence Atág 14540 CCAAT-binding tran transcription factor Subunit a (CBF-A) see http://mips.gsfdef proj/ scription factor subunit A (CBF-A) see http://mips.gsfde?projthal/ thal/db/index.html, mRNA sequence': (Pinus taeda). db/index.html, mRNA sequence': (Pinus taeda). NCBI accession No. BG350430 (gi: 13179172) (Mar. 1, 2001); NCBI accession No. BG846124 (gi: 14227308) (May 29, 2001); Crookshanks, M., et al., “091 D09 mature tuberlambda ZAPSolanum Grossman, A., et al., "1024012C11.y1 C. Reinhardtii CC-1690, nor tuberosum cDNA, mRNA sequence': (Solanum tuberosum) (publi malized, Lambda Zap II Chlamydomonas reinhardtii cDNA, mRNA cation: see FEBS Lett. 506 (2), 123-126, 2001, The potato tuber sequence”; (Chlamydomonas reinhardtii). transcriptome: analysis of 6077 expressed sequence tags). NCBI accession No. BG847452 (gi: 14228636) (May 29, 2001); NCBI accession No. BG350792 (gi: 13179534) (Mar. 1, 2001); Grossman, A., et al., "1024017D03.y1 C. Reinhardtii CC-1690, nor Crookshanks, M., et al., “O98CO7 mature tuber lambda ZAPSolanum malized, Lambda Zap II Chlamydomonas reinhardtii cDNA, mRNA tuberosum cDNA, mRNA sequence': (Solanum tuberosum) (publi sequence”; (Chlamydomonas reinhardtii). cation: see FEBS Lett. 506 (2), 123-126, 2001, The potato tuber NCBI accession No. BG850688 (gi: 14231872) (May 29, 2001); transcriptome: analysis of 6077 expressed sequence tags). Grossman, A., et al., "1024029A11.y1 C. Reinhardtii CC-1690, nor NCBI accession No. BG362898 (gii: 1325 1995) (Mar. 8, 2001); Shoe malized, Lambda Zap II Chlamydomonas reinhardtii cDNA, mRNA maker, R., et al., “sacl3e07.y1 Gm-c 1040 Glycine may cDNA clone sequence”; (Chlamydomonas reinhardtii). Genome systems clone ID: Gm-c 1040-4453.5" similar to TR:023633 NCBI accession No. BG850689 (gi: 14231873) (May 29, 2001); 023633 Transcription factor; mRNA sequence”; (Glycine max). Grossman, A., et al., "1024029A11 y2 C. Reinhardtii CC-1690, nor NCBI accession No. BG363233 (gii: 13252330) (Mar. 8, 2001); Shoe malized, Lambda Zap II Chlamydomonas reinhardtii cDNA, mRNA maker, R., et al., “sacllh11.y1 Gm-c1040 Glycine may cDNA clone sequence”; (Chlamydomonas reinhardtii). US 8,633,353 B2 Page 13

(56) References Cited NCBI accession No. BI469382 (gi:15285491) (Aug. 24, 2001); Shoemaker, R., et al., "saillb10,y1 Gm-c1053 Glycine may cDNA OTHER PUBLICATIONS clone Genome systems clone ID: Gm-c 1053-2779 5" similar to TR: 023310 023310 CCAAT-Binding transcription factor subunit A: NCBI accession No. BG857007 (gi: 14238.191) (May 29, 2001); mRNA sequence'; (Glycine max). Grossman, A., et al., "1024049D01.y1 C. Reinhardtii CC-1690, nor NCBI accession No. BI480208 (gi:15315976) (Aug. 27, 2001); malized, Lambda Zap II Chlamydomonas reinhardtii cDNA, mRNA Akhunov, E., et al., “WHE2403 H07 P13ZS Wheat 3-6 DAP seed sequence”; (Chlamydomonas reinhardtii). cDNA library Triticum aestivum cDNA clone WHE2403 H07 NCBI accession No. BG858372 (gi: 14239556) (May 29, 2001); P13, mRNA sequence': (Triticum aestivum). Grossman, A., et al., "1024057C11.y1 C. Reinhardtii CC-1690, nor NCBI accession No. BI531782 (gi:15372356) (Aug. 29, 2001); malized, Lambda Zap II Chlamydomonas reinhardtii cDNA, mRNA Grossman, A., et al., "1024116E03.y1 C. Reinhardtii CC-1690, nor sequence”; (Chlamydomonas reinhardtii). malized, Lambda Zap II Chlamydomonas reinhardtii cDNA, mRNA NCBI accession No. BG890447 (gii: 14267556) (May 30, 2001); Van sequence”; (Chlamydomonas reinhardtii). der Hoeven, R., et al., “EST516298 cSTD Solanum tuberosum cDNA NCBI accession No. BI531808 (gi:15372382) (Aug. 29, 2001); clone cSTD 18M6 5" sequence, mRNA sequence'. (Solanum Grossman, A., et al., "1024116G03.y1 C. Reinhardtii cc-1690, nor tuberosum). malized, Lambda Zap II Chlamydomonas reinhardtii cDNA, mRNA NCBI accession No. BH966788 (gi:23448014) (Oct. 1, 2002); sequence”; (Chlamydomonas reinhardtii). Delehaunty, K., et al., “odi26h 12,b1 Boleracea002 Brassica NCBI accession No. BIT 18232 (gi:15693927) (Sep. 19, 2001); oleracea genomic, genomic Survey sequence'. (Brassica oleracea). Grossman, A., et al., "1031024F10.y1 C. Reinhardtii cc-1690, stress NCBI accession No. BI129814 (gi:18.013785) (Dec. 31, 2001); II (normalized), Lambda Zap II Chlamydomonas reinhardtii cDNA, Hertzberg, M., et al., “G095P88Y Populus cambium cDNA library mRNA sequence'; (Chlamydomonas reinhardtii). Populus tremula X Populus tremuloides cDNA, mRNA sequence': NCBI accession No. BIT 19728 (gi:15695423) (Sep. 19, 2001); (Populus tremula X Populus tremuloides). Grossman, A., et al., “1031045D08.y1 C. Reinhardtiicc-1690, stress NCBI accession No. BI176409 (gi:14642220) (Jul. 9, 2001); II (normalized), Lambda Zap II Chlamydomonas reinhardtii cDNA, Restrepo, S, et al., “EST521199P Infestans-challenged potato leaf, mRNA sequence'; (Chlamydomonas reinhardtii). compatible reaction Solanum tuberosum cDNA clone PPCAC88 5' NCBI accession No. BI875221 (gi:16073225) (Oct. 11, 2001); sequence similar to CCAAT-box binding transcription factor gene Grossman, A., et al., “963122O10,y1 C. Reinhardtii cc-1690, stress id:MNJ7.26 (Arabdiopsis thaliana), mRNA sequence': (Solanum condition I, normalized, Lambda Zap II Chlamydomonas reinhardtii tuberosum). cDNA, mRNA sequence”; (Chlamydomonas reinhardtii). NCBI accession No. BI206716 (gi: 14684440) (Jul 11, 2001); Van der Hoeven, R., et al., “EST524756 cTOS Lycopersicon esculentum NCBI accession No. BI952722(gi: 16296768) (Oct. 19, 2001); Wing, cDNA clone cTOS11H10 5' end, mRNA sequence”; (Lycopersicon R., et al., “HVSMEm()007I19f Hordeum vulgaregreen seedling EST esculentum). library HvcDNA0014 (Blumeria infected) Hordeum vulgare subsp. NCBI accession No. BI207873 (gi: 14685597) (Jul 11, 2001); Van Vulgare cDNA clone HVSMEm0007I19f, mRNA sequence': der Hoeven, R., et al., “EST525913 cTOS Lycopersicon esculentum (Hordeum vulgare). cDNA clone cTOS15I16 5' end, mRNA sequence”; (Lycopersicon NCBI accession No. BI953657 (gi: 16298505) (Oct. 19, 2001); Wing, esculentum). R., et al., “HVSMEm()013M03f Hordeum vulgare green seedling NCBI accession No. BI268123 (gii: 14873755) (Jul.18, 2001); Korth, EST library HvcDNA0014 (Blumeriainfected) Hordeum vulgare K., et al., “NF116D11 IN1F1094 Insect herbivory Medicago subsp. Vulgare cDNA clone HVSMEm()013M03f, mRNA truncatula cDNA clone NF116D11 IN 5", mRNA sequence': sequence', (Hordeum vulgare). (Medicago truncatula). NCBI accession No. BI967397 (gi:1634 1802) (Oct. 23, 2001); NCBI accession No. BI271802 (gii: 14880590) (Jul. 18, 2001); Vodkin, L., et al., “GM830001B20E03 Gm-r1083 Glycine may Torres-Jerez, I., et al., “NFO13D06FL1F1057 Developing flower cDNA clone Gm-r1083-222 3', mRNA sequence”; (Glycine max). Medicago truncatula cDNA clone NF013D06FL 5", mRNA NCBI accession No. BI972318(gii: 16346723) (Oct. 23, 2001); Shoe sequence'; (Medicago truncatula). maker, R., et al., "sag90a01.y1 Gm-c 1084 Glycine may cDNA clone NCBI accession No. BI309 186 (gii: 14983513) (Jul. 20, 2001); Genome Systems clone ID: Gm-c 1084-1178 5" similar to TR: Grusak, M.A., et al., “EST530596 GPOD Medicago truncatula Q97.QC3 Q97.QC3 Putative CCAAT-Binding Transcription Factor; cDNA clone pGPOD-10P10 5' end, mRNA sequence”; (Medicago mRNA sequence'; (Glycine max). truncatula). NCBI accession No. BJ208385 (gi: 1994.6503) (Apr. 4, 2002); NCBI accession No. BI311277 (gii: 14985604) (Jul. 20, 2001); Ogihara, Y, et al., “BJ208385Y. Ogihara unpublished cDNA library, Grusak, M.A., et al., “EST5313027 GESD Medicago truncatula Wh Triticum aestivum cDNA clone wh8a18 5", mRNA sequence': cDNA clone pGESD10M10 5' end, mRNA sequence': (Medicago (Triticum aestivum). truncatula). NCBI accession No. BJ208815 (gi: 19947041) (Apr. 4, 2002); NCBI accession No. BI316766 (gi: 14991093) (Jul 20, 2001); Shoe Ogihara, Y, et al., “BJ208815Y. Ogihara unpublished cDNA library, maker, R., et al., "saf73a12.y1 Gm-c1078 Glycine may cDNA clone Wh Triticum aestivum cDNA clone wh10p165", mRNA sequence': Genome systems clone ID: Gm-c 1078-1584 5" similar to TR: (Triticum aestivum). Q97.QC3 Q9ZQC3 Putative CCAAT-Binding transcription factor; NCBI accession No. BJ210400 (gi: 19949034) (Apr. 4, 2002); mRNA sequence'; (Glycine max). Ogihara, Y, et al., “BJ210400Y. Ogihara unpublished cDNA library, NCBI accession No. BI406257 (gi:15185671) (Aug. 14, 2001); Wh Triticum aestivum cDNA clone wh271225", mRNA sequence': Crookshanks, M., et al., “158C12 Mature tuber lambda Zap Solanum (Triticum aestivum). tubersum cDNA, mRNA sequence'. (Solanum tuberosum) (publica NCBI accession No. BJ210722 (gi: 1994.9459) (Apr. 4, 2002); tion: see FEBS Lett. 506 (2), 123-126, 2001, The potato tuber Ogihara, Y, et al., “BJ210722Y. Ogihara unpublished cDNA library, transcriptome: analysis of 6077 expressed sequence tags). Wh Triticum aestivum cDNA clone wh29f245", mRNA sequence': NCBI accession No. BI419749 (gi:15190772) (Aug. 15, 2001); (Triticum aestivum). Colebatch, G., et al., “LNEST14e12r Lotus japonicus nodule library NCBI accession No. BJ215658 (gi: 1995.4285) (Apr. 4, 2002); 5 and 7 week-old Lotus corniculatus var. japonicus cDNA 5", mRNA Ogihara, Y, et al., “BJ215658.Y. Ogihara unpublished cDNA library, sequence', (Lotus corniculatus var. japonicus). Wh Triticum aestivum cDNA clone wh8a18 3", mRNA sequence': NCBI accession No. BI423967 (gi:15199704) (Aug. 16, 2001); (Triticum aestivum). Shoemaker, R., et al., "sahó4c11.y1 Gm-c 1049 Glycine may cDNA NCBI accession No. BJ217966 (gi: 1995.7482) (Apr. 4, 2002); clone Genome systems clone ID: Gm-c 1049-3189 5" similar to Ogihara, Y, et al., “BJ217966Y. Ogihara unpublished cDNA library, TR:023310 023310 CCAAT-binding transcription factor subunit A: Wh Triticum aestivum cDNA clone wh29f243', mRNA sequence': mRNA sequence'; (Glycine max). (Triticum aestivum). US 8,633,353 B2 Page 14

(56) References Cited NCBI accession No. BM341 107 (gi: 18171267) (Jan. 16, 2002); Wen, T.J., et al., “MEST330-D11.T3 ISUM-RN Zea may’s cDNA clone OTHER PUBLICATIONS MEST330-D113', mRNA sequence'; (Zea mays). NCBI accession No. BM341536 (gi: 18171696) (Jan. 16, 2002); Wen, NCBI accession No. BJ234039 (gi:20051078) (Apr. 5, 2002); T.J., et al., “MEST336-C11.T3 ISUM5-RN Zea may’s cDNA clone Ogihara, Y, et al., “BJ234039Y. Ogihara unpublished cDNA library, MEST336-C113', mRNA sequence”; (Zea mays). Wh Triticum aestivum cDNA clone whe8e 165", mRNA sequence': NCBI accession No. BM349646 (gi: 18174258)(Jan. 16, 2002); Wen, (Triticum aestivum). T.J., et al., “MEST253-D11.T3 ISUM5-RN Zea may’s cDNA clone NCBI accession No. BJ236476 (gi:2005.2449) (Apr. 5, 2002); MEST253-D113', mRNA sequence”. Ogihara, Y, et al., “BJ236476Y. Ogihara unpublished cDNA library, NCBI accession No. BM525962 (gii: 1873.0588) (Feb. 19, 2002); Wh Triticum aestivum cDNA clone whe21n09 5", mRNA sequence” Shoemaker, R., et al., “sak74b11.y1 gm-c 1036 Glycine may cDNA (Triticum aestivum). clone Soybean clone ID: Gm-c 1036-8542 5' similar to TR: Q97.QC3 NCBI accession No. BJ248969 (gi:20059585) (Apr. 5, 2002); Q97.QC3 Putative CCAAT-Binding Transcription factor; mRNA Ogihara, Y, et al., “BJ248969 Y. Ogihara unpublished cDNA library, sequence', (Glycine max). Wh Triticum aestivum cDNA clone whf&no9 5", mRNA sequence': NCBI accession No. BM528842 (gii: 18735577) (Feb. 19, 2002); (Triticum aestivum). Shoemaker, R., et al., “sak69b03.y1 Gm-c 1036 Glycine may cDNA NCBI accession No. BJ255219 (gi:20079277) (Apr. 8, 2002); clone Soybean clone ID: Gm-c10368141 5" similar to TR:081130 Ogihara, Y, et al., “BJ255219Y. Ogihara unpublished cDNA library, 0811130 CCAAT-Box Binding Factor HAP3 Homolog, mRNA Wh Triticum aestivum cDNA clone whf&no'93", mRNA sequence': sequence', (Glycine max). (Triticum aestivum). NCBI accession No. BM887558 (gi: 1927.1302) (Mar. 8, 2002); NCBI accession No. BJ463462 (gi:21 141969) (May 23, 2002); Sato, Shoemaker, R., et al., “sam40cO9.y1 Gm-c1068 Glycine may cDNA K., et al., “BJ463462 K. Sato unpublished cDNA library, cv. Haruna clone Soybean clone ID: Gm-c10687050 5' similar to TR:Q9ZQC3 Nijo germination shoots Hordeum vulgare subsp. Vulgare cDNA Q97.QC3 Putative CCAAT-Box Binding Transcription Factor, clone bags32a01 5", mRNA sequence': (Hordeum vulgare subsp. mRNA sequence'; (Glycine max). Vulgare). NCBI accession No. BM888735 (gi: 19272479) (Mar. 8, 2002); NCBI accession No. BJ482187(gi:21160648) (May 23, 2002); Sato, Walbot, V., “952068E04.y1 952. BMS tissue from Walbot Lab K., et al., “BJ48218.7 K. Sato unpublished cDNA library, strain H602 (reduced rRNA) Zea may’s cDNA, mRNA sequence”; (Zea mays). adult, heading stage top three leaves Hordeum vulgare subsp. NCBI accession No. BM892103 (gi: 19347223) (Mar. 11, 2002); Spontaneum cDNA clone bahó3025", mRNA sequence': (Hordeum Shoemaker, R., et al., “sam48d03.y1 Gm-c 1069 Glycine may cDNA vulgare). clone Soybean clone ID: Gm-c1069-2478.5" similar to TR:Q9ZQC3 NCBI accession No. BJ555744 (gi:27237564) (Dec. 18, 2002); Q97.QC3 Putative CCAAT-Binding Factor, mRNA sequence': Hoshino, A., et al., “BJ555744 Ipomoea nil mixture of flower and (Glycine max). flower bud Ipomoea nil cDNA clonejm 18c025", mRNA sequence': NCBI accession No. BM896.169 (gi: 19351637) (Mar. 11, 2002); (Ipomoea nil). Walbot, V., “952067E04 xl 952. BMS tissue from Walbot Lab NCBI accession No. BJ571596 (gi:27253424) (Dec. 18, 2002); (reduced rRNA) Zea may’s cDNA, mRNA sequence”; (Zea mays). Hoshino, A., et al., “BJ571596 Ipomoea nil mixture of flower and NCBI accession No. BQ046483 (gii: 19820469) (Mar. 29, 2002); flower bud Ipomoea nil cDNA clonejm 18c02 3', mRNA sequence': Zhang, P. et al., “EST595601 P. Infestans-challenged potato leaf, (Ipomoea nil). incompatible reaction Solanum tuberosum cDNA clone BPLI114J14 NCBI accession No. BJ575345 (gi:27257173) (Dec. 18, 2002); 5' end, mRNA sequence'. (Solanum tuberosum). Hoshino, A., et al., “BJ575345 Ipomoea nil mixture of flower and NCBI accession No. BQ104671 (gi:20154333) (Apr. 16, 2002); flower bud Ipomoea nil cDNA clonejm29, 193", mRNA sequence': Guterman, I., et al., “fc0546.e Rose Petals (Fragrant Cloud) Lambda (Ipomoea nil). Zap Express Library Rosa hybrid cultivar cDNA clone fe0546.e5". NCBI accession No. BM109471 (gi: 17070418) (Nov. 26, 2001); Van mRNA sequence”; (Rosa hybrid cultivar) (publication: see Plant Cell der Hoeven, R., et al., “EST557007 potato roots Solanum tuberosum 14 (10), 2325-2338, 2002, Rose Scent: Genomics Approach to Dis cDNA clone cFRO4E20 5' end, mRNA sequence”; (Solanum covering Novel Floral Fragrance-Related Genes). tuberosum). NCBI accession No. BQ105902 (gi:20 155564) (Apr. 16, 2002); NCBI accession No. BM134935 (gi: 17143059) (Nov. 28, 2001); Guterman, I., et al., “fc0632.e Rose Petals (Fragrant Cloud) Lambda Anderson, O.D., et al., “WHE0460 A02 A03ZS Wheat Fusarium Zap Express Library Rosa hybrid cultivar cDNA clone fe0632.e5". graminearum infected spike cDNA library Triticum aestivum cDNA mRNA sequence”; (Rosa hybrid cultivar) (publication: see Plant Cell clone WHE0460 A02 A03, mRNA sequence'. (Triticum 14 (10), 2325-2338, 2002, Rose Scent: Genomics Approach to Dis aestivum). covering Novel Floral Fragrance-Related Genes). NCBI accession No. BM139923 (gi:21638877) (Jul. 1, 2002); Wang, NCBI accession No. BQ164458 (gi:2030.1515) (Apr. 24, 2002); G. et al., “Gm-R115 Soybean root reverse subtractive cDNA library Walbot, V., et al., “1091020E 11 y3 1091–Immature ear with com Glycine may cDNA clone Gm-R115, mRNA sequence'; (Glycine mon ESTs screened by Schmidt lab Zea may’s cDNA, mRNA max). sequence', (Zea mays). NCBI accession No. BM268414 (gi:1793.1454) (Dec. 18, 2001); NCBI accession No. BQ255423 (gi:20456176) (May 6, 2002); Wen, T.J., et al., “MEST395-C12.univ ISUM5-RNZea may’s cDNA VandenBosch, K., “MTNAL55TKN KVKC Medicago truncatula clone MEST395-C123", mRNA sequence”; (Zea mays). cDNA clone pKVKC-12E7, mRNA sequence”; (Medicago NCBI accession No. BM269434 (gi:17932474) (Dec. 18, 2001); truncatula). Wen, T.J., et al., “MEST409-G 11. Univ ISUM5-RNZea may’s cDNA NCBI accession No. BQ296537 (gi:20812059) (May 16, 2002); clone MEST409-G 11 3', mRNA sequence”; (Zea mays). Shoemaker, R., et al., "san93ff)1 y2 Gm-c 1052 glycine max cDNA NCBI accession No. BM308208 (gi: 18039914) (Jan. 2, 2002); Shoe clone Soybean clone ID: Gm-c 105271785" similar to TR: Q97.QC3 maker, R., et al., “sak43a12.y1 Gm-c 1036 Glycine may cDNA clone Q97.QC3 putative CCAAT-binding transcription factor; mRNA Soybean Clone ID: Gm-c 1036-57835" similar to TR:081130 081130 sequence', (Glycine max). CCAAT-Box Binding Factor HAP3 Homolog; mRNA sequence': NCBI accession No. BQ405785 (gi:21093472) (May 22, 2002); (Glycine max). Wing, R.A., et al., “GA Ed0086G12f Gossypium arboreum 7-10 NCBI accession No. BM331836 (gi: 18161997) (Jan. 16, 2002); Wen, dpa fiber library Gossypium arboreum cDNA clone T.J., et al., “MEST171-B1 1.T3 ISUM5-RNZea may’s cDNA clone GA Ed0086G12f mRNA sequence'; (Gossypium arboreum). MEST 171-B113', mRNA sequence”; (Zea mays). NCBI accession No. BQ488908 (gi:21333528) (Jun. 7, 2002); NCBI accession No. BM337630 (gi: 18167790) (Jan. 16, 2002); Wen, Bellin, D., et al., “95-E9134-006-006-M23-T3 Sugar beet MPIZ T.J., et al., “MEST215-B12.T3 ISUM-RN Zea may’s cDNA clone ADIS-006 Lambda Zap II Library Beta vulgaris cDNA clone M-23 MEST215-B123', mRNA sequence”; (Zea mays). 6, mRNA sequence': (Beta vulgaris). US 8,633,353 B2 Page 15

(56) References Cited NCBI accession No. BQ744755 (gi:218.91542) (Jul 17, 2002); Walbot, V., "946 111A02.y1 946—tassel primordium prepared by OTHER PUBLICATIONS Schmidt lab Zea may’s cDNA, mRNA sequence”; (Zea mays). NCBI accession No. BQ757780 (gi:2196.6252) (Jul 26, 2002); NCBI accession No. BQ505705 (gi:21364,574) (Jun. 10, 2002); Hedley, P. et al., “EBem10 SQ005 D11 Rembryo. 2 Day germi Buell, C.R., et al., “EST613120 Generation of a set of potato cDNA nation, no treatment, cv Optic, EBem10 Hordeum vulgare subsp. clones for microarray analyses mixed potato tissues Solanum Vulgare cDNA clone EBeml.0 SQ005 D115", mRNA sequence': tuberosum cDNA clone STMGF80 5' end, mRNA sequence': (Hordeum vulgare). (Solanum tuberosum) (note: two versions are presented for review, NCBI accession No. BQ799965 (gi:220 14931) (Jul 30, 2002); gi:21364,574 submitted Jun. 10, 2002; and, gi:21921628 submitted Abbal. P. et al., “EST 2134 Green Grape berries Lambda Zap II Jul. 22, 2002). NCBI accession No. BQ505706 (gi:21364575) (Jun. 10, 2002); Library Vitis vinifera cDNA clone GT203H053", mRNA sequence': Buell, C.R., et al., “EST613121 Generation of a set of potato cDNA (Vitis vinifera). clones for microarray analyses mixed potato tissues Solanum NCBI accession No. BQ816666 (gi:22065967) (Aug. 1, 2002); tuberosum cDNA clone STMGF80 3' end, mRNA sequence': Grossman, A., et al., “1030059C09.y1 C. Reinhardtii CC-1690, (Solanum tuberosum) (note: two versions are presented for review, Deflgellation (normalized), Lambda Zap II Chlamydomonas gi:21364575 submitted Jun. 10, 2002; and, gi:21921629 submitted reinhardtii cDNA, mRNA sequence”; (Chlamydomonas reinhardtii). Jul. 22, 2002). NCBI accession No. BQ838221 (gi:22142539) (Aug. 8, 2002); NCBI accession No. BQ507074 (gi:21365943) (Jun. 10, 2002); Anderson, O.D., et al., “WHE2907 HO9 P17ZS Wheat aluminum Buell, C.R., et al., “EST614489 Generation of a set of potato cDNA stressed root tip cDNA library Triticum aestivum cDNA clone clones for microarray analyses mixed potato tissues Solanum WHE2907 HO9 P17, mRNA sequence”; (Triticum aestivum). tuberosum cDNA clone STMG023 3' end, mRNA sequence': NCBI accession No. BQ857127 (gi:22242592) (Aug. 14, 2002); (Solanum tuberosum) (note: two versions are presented for review, Kozik, A., et al., “QGB6K24.ygabl QG ABCDI lettuce salinas gi:21365943 submitted Jun. 10, 2002; and, gi:21922914 submitted Lactuca sativa cDNA clone QGB6K24, mRNA sequence”; (Lactuca Jul. 22, 2002). sativa). NCBI accession No. BQ508506 (gi:21367375) (Jun. 10, 2002); NCBI accession No. BQ862671 (gi:22248.136) (Aug. 14, 2002); Buell, C.R., et al., “EST615921 Generation of a set of potato cDNA Kozik, A., et al., “QGC21L19.yg.abl QG ABCDI lettuce Salinas clones for microarray analyses mixed potato tissues Solanum Lactuca sativa cDNA clone QGC21L19, mRNA sequence': tuberosum cDNA clone STMGW70 5' end, mRNA sequence': (Solanum tuberosum) (note: two versions are presented for review, (Lactuca sativa). gi:21367375 submitted Jun. 10, 2002; and, gi:21924285 submitted NCBI accession No. BQ875352 (gi:22264573) (Aug. 15, 2002); Jul. 22, 2002). Kozik, A., et al., “QGI7N22.yg.abl QG ABCDI lettuce salinas NCBI accession No. BQ508507 (gi:21367376) (Jun. 10, 2002); Lactuca sativa cDNA clone QGI7N22, mRNA sequence”; (Lactuca Buell, C.R., et al., “EST615922 Generation of a set of potato cDNA sativa). clones for microarray analyses mixed potato tissues Solanum NCBI accession No. BQ911236 (gi:223 10015) (Aug. 19, 2002); tuberosum cDNA clone STMGW70 3' end, mRNA sequence': Kozik, A., et al., “QHA16J13.yg.abl QG ABCDI sunflower (Solanum tuberosum) (note: two versions are presented for review, RHA801 Helianthus annus cDNA clone QHA16J13, mRNA gi:21367376 Submitted Jun. 10, 2002; and, gi:21924286 submitted sequence'; (Helianthus annus). Jul. 22, 2002). NCBI accession No. BQ990941 (gi:22410476) (Aug. 21, 2002); NCBI accession No. BQ510555 (gi:21369424) (Jun. 10, 2002); Kozik, A., et al., “QGF21I11.yg.abl QG EFGHJ lettuce serriola Buell, C.R., et al., “EST617970 Generation of a set of potato cDNA Lactuca serriola cDNA clone QGF21I11, mRNA sequence': clones for microarray analyses mixed potato tissues Solanum (Lactuca serriola). tuberosum cDNA clone STMHL 13 5' end, mRNA sequence': NCBI accession No. BQ996.905 (gi:22431301) (Aug. 22, 2002); (Solanum tuberosum) (note: two versions are presented for review, Kozik, A., et al., “QGG14CO3.yg.abl QG EFGHJ lettuce serriola gi:21369424 submitted Jun. 10, 2002; and, gi:21926247 submitted Lactuca serriola cDNA clone QGG14CO3, mRNA sequence': Jul. 22, 2002). (Lactuca serriola). NCBI accession No. BQ511879 (gi:21370748) (Jun. 10, 2002); NCBI accession No. BU008142 (gi:22442537) (Aug. 22, 2002); Buell, C.R., et al., “EST619294 Generation of a set of potato cDNA Kozik, A., et al., “QGH6K04.yg.abl QG EFGHJ lettuce serriola clones for microarray analyses mixed potato tissues Solanum Lactuca serriola cDNA clone QGH6K04, mRNA sequence': tuberosum cDNA clone STMHU61 5' end, mRNA sequence': (Lactuca serriola). (Solanum tuberosum) (note: two versions are presented for review, NCBI accession No. BU013962 (gi:22448357) (Aug. 22, 2002); gi:21370748 submitted Jun. 10, 2002; and, gi:21927514 submitted Kozik, A., et al., “QGJ6B02.yg.abl QG EFGHJ lettuce serriola Jul. 22, 2002). Lactuca serriola cDNA clone QGJ6B02, mRNA sequence': NCBI accession No. BQ592365 (gi:26121948) (Dec. 6, 2002); (Lactuca serriola). Herwig, R., et al., “EO 12681-024-020-H10-SP6 MPIZ-ADIS-024 NCBI accession No. BUO16847 (gi:224.52367) (Aug. 23, 2002); developing root Beta vulgaris cDNA clone 024020-H105", mRNA Kozik, A., et al., “QHE14C23.ygabl QG EFGHJ Sunflower sequence': (Beta vulgaris) (publication: see Plant J.32 (5), 845-857. RHA280 Helianthus annuus cDNA clone QHE14C23, mRNA 2002, Construction of a ‘unigene cDNA clone set by oligonucleotide sequence'; (Helianthus annuus). fingerprinting allows access to 25,000 potential Sugar beet genes). NCBI accession No. BU051043 (gi:2249 1120) (Aug. 26, 2002); NCBI accession No. BQ606328 (gi:21555594) (Jun. 25, 2002); Walbot, V., “1111037G04y2 1111 Unigene III from Maize Clarke, B., et al., “BRY 2180 wheat EST endosperm library Genome Project Zea may’s cDNA, mRNA sequence': (Zea mays). Triticum aestivum cDNA 5", mRNA sequence': (Triticum aestivum) NCBI accession No. BUO90324 (gi:2254.0481) (Aug. 29, 2002); (publication: see Funct. Integr. Genomics 3 (1-2), 33-38, 2003, Shoemaker, R., et al., “sr70c10,y1 Gm-c1052 Glycine may cDNA Arabidopsis genomic information for interpreting wheat EST clone Genome systems clone ID: Gm-c 1052-1099 5" similar to SW: sequences). CBFA Maize P25209 CCAAT-binding transcription factor subunit NCBI accession No. BQ629472 (gi:21677121) (Jul. 2, 2002); Shoe A, mRNA sequence'; (Glycine max). maker, R., et al., "saq02e06.y1 Gm-c 1045 Glycinemax cDNA clone NCBI accessino No. BU097938 (gi:22545579) (Aug. 29, 2002); Soybean clone ID: Gm-c1045-3660 5" similar to TR: Q97.QC3 Walbot, V., et al., “946122D01.y1 946—tassel primodium prepared Q97.QC3 Putative CCAAT-Binding Transcription Factor, mRNA by Schmidt lab Zea may’s cDNA, mRNA sequence”; (Zea mays). sequence', (Glycine max). NCBI accession No. BU238020 (gi:22749845) (Sep. 6, 2002); NCBI accession No. BQ667936 (gi:21809618) (Jul 15, 2002); Singh, J.A., et al., “Ds01 14a 12 A Ds01 AAFC ECORC Walbot, V., "946102B01.y1 946—tassel primordium prepared by cold stressed Flixweed seedlings Descurainia sophia cDNA Schmidt lab Zea may’s cDNA, mRNA sequence'; (Zea mays). clone Ds01 14a 12, mRNA sequence'. (Descurainia sophia). US 8,633,353 B2 Page 16

(56) References Cited OA Oryza sativa (japonica cultivar-group) cDNA clone BR040003000 PLATE A05 33 34.abl similar to putative OTHER PUBLICATIONS CAAT-box DNA binding protein Oryza sativa (japonica cultivar group) gi:20805265:dbj:BAB92931.1 (AP004366) putative CAAT NCBI accession No. BU4994.57 (gi:22819367) (Sep. 12, 2002); box DNA binding protein Oryza sativa (japonica cultivar-group). Walbot, V., "946 175D02.y1 946—tassel primordium prepared by mRNA sequence': (Oryza sativa). Schmidt lab Zea may’s cDNA, mRNA sequence'; (Zea mays). NCBI accession No. CA785249 (gi:26048796) (Dec. 4, 2002); Shoe NCBI accession No. BU551072 (gi:22933933) (Sep. 16, 2002); maker, R., et al., "sau26hl 1.y1 Gm-c1062 Glycinemax cDNA clone Vodkin, L, et al., “GM880006B11G05 Gm-r1088 Glycine may Soybean Clone ID: Gm-c1062-95745" similar to TR: 023310 023310 cDNA clone Gm-r1088-2241 3", mRNA sequence”; (Glycine max). CCAAT-binding transcription factor Subunit A. mRNA sequence': NCBI accession No. BU653821 (gi:23366001) (Sep. 30, 2002); (Glycine max.). Grossman, A., et al., “1112109C03.y1 C. Reinhardtii cc-1690 (mt--), NCBI accession No. CA795.038 (gi:26052114) (Dec. 5, 2002); cc-1691 (mt-), Gamete (normalized), Lambda Zap II Jones, P.G., et al., "Cac BL 208 Cac BL (Bean and Leaf from Chlamydomonas reinhardtii cDNA, mRNA sequence': Amelonardo type Cacao) Theobroma cacao cDNA clone Cac BL (Chlamydoonas reinhardtii). 208 5", mRNA sequence': (Theobroma cacao) (publication: see NCBI accession No. BU655174 (gi:23367356) (Sep. 30, 2002); Planta 216(2), 255-264, 2002, Gene discovery and microarray analy Grossman, A., et al., “1112118C09.y1 C. Reinhardtii cc-1690 (mt--), sis of cacao varieties). cc-1691 (mt-), Gamete (normalized), Lambda Zap II NCBI accession No. CA801742 (gi:26058828) (Dec. 5, 2002); Shoe Chlamydomonas reinhardtii cDNA, mRNA sequence': maker, R., et al., "satl7b11.y1 Gm-c 1036 Glycine max cDNA clone (Chlamydomonas reinhardtii). Soybean Clone ID: Gm-c 1036-13894 5" similar to TR: 081130 NCBI accession No. BU871529 (gi:24063053) (Oct. 16, 2002); 081130 CCAAT-Box binding transcription factor HAP3 Homolog, Unneberg, P. et al., Q031E12 Populus flower cDNA library Populus mRNA sequence'; (Glycine max.). trichocarpa cDNA 5. mRNA sequence (Populus trichocarpa). NCBI accession No. CA802391 (gi:26059477) (Dec. 5, 2002); Shoe NCBI accession No. BU880488 (gi:24072012) (Oct. 16, 2002); maker, R., et al., "sau35cO3.y1 Gm-c1071 Glycinemax cDNA clone Unneberg, P. et al., “UM49TGO9 Populus flower cDNA library Soybean Clone ID: Gm-c1071-28135" similar to SW:CBFA Maize Populus trichocarpa cDNA 5’, mRNA sequence”; (Populus P25209 CCAAT-binding transcription factor subunit A. mRNA trichocarpa). sequence', (Glycine max.). NCBI accession No. BU881483 (gi:24073007) (Oct. 16, 2002); NCBI accession No. CA802515 (gi:26059601) (Dec. 5, 2002); Shoe Unneberg, P. et al., “UM63TC12 Populus flower cDNA library maker, R., et al., "sau37e09.y1 Gm-c1071 Glycinemax cDNA clone Populus trichocarpa cDNA 5’, mRNA sequence”; (Populus Soybean Clone ID: Gm-c 1071-32815" similar to TR: 023633023633 trichocarpa). transcription factor, mRNA sequence'; (Glycine max.). NCBI accession No. BU896236 (gi:24107443) (Oct. 17, 2002); NCBI accession No. CA820519 (gi:26269456) (Dec. 9, 2002); Shoe Unneberg, P. et al., “X037F04 Populus wood cDNA library Populus maker, R., et al., "sau.90cQ6.y1 Gm-c1048 Glycinemax cDNA clone tremulay Populus tremuloides cDNA 5’, mRNA sequence'. (Populus Soybean Clone ID: Gm-c1048-3180 5" similar to TR: O23310 tremula y Populus tremuloides). O23310 CCAAT-binding transcription factor subunit A. mRNA NCBI accession No. BU997198 (gi:24274181) (Oct. 23, 2002); sequence', (Glycine max.). Zhang, H., et al., “HI07D15r HI Hordeum vulgare subsp. Vulgare NCBI accession No. CA828.166 (gi:26456583) (Dec. 11, 2002); cDNA clone HIO7D 155", mRNA sequence': (Hordeum vulgare). Walbot, V., “11 14024G02.y2 1114 Unigene IV from Maize NCBI accession No. BZ505711 (gi:27026128) (Dec. 16, 2002); Genome Project Zea may’s cDNA, mRNA sequence': (Zea mays). Ayele, M., et al., “BONBC73TF BO 1y 2 KB tot Brassica NCBI accession No. CA82993.1 (gi:26557696) (Dec. 12, 2002); oleracea genomic clone BONBC73, genomic Survey sequence': Walbot, V., “3529 1 1 1 A10.y 13529–2 mm ear tissue from (Brassica oleracea) (publication: see Genome Res. 15 (4), 487-495, Schmidt and Hake labs Zea may’s cDNA, mRNA sequence”; (Zea 2005, Whole genome shotgun sequencing of Brassica oleracea and mays). its application to gene discovery and annotation in Arabidopsis). NCBI accession No. CA832165 (gi:26559930) (Dec. 12, 2002); NCBI accession No. BZ635584 (gi:28084146) (Jan. 29, 2003); Walbot, V., “11 17028F06.y1 1117 Unigene V from Maize Genome Whitelaw, C.A. et al., “OGCAF48TCZM 0.7 1.5 KBZea mays Project Zea may’s cDNA, mRNA sequence': (Zea mays). genomic clone ZMMBMaO126G24, genomic Survey sequence': NCBI accession No. CA897002 (gi:27383993) (Dec. 27, 2002); Bui, (Zea mays). A.Q., et al., “PCEP0404 Scarlet Runner Bean Embryo-Proper NCBI accession No. BZ635588 (gi:28084150) (Jan. 29, 2003); Region Phaseolus coccineus cDNA 5" similar to AB025628: contains Whitelaw, C.A. et al., “OGCAF48TMZM 0.71 5 KB Zea mays similarity to CCAAT-box binding transcription factor gene genomic clone ZMMBMaO126G24, genomic Survey sequence': id:MNJ7.26 Arabidopsis thaliana: Identical to At5g47670: LEC1 (Zea mays). like AF533650), mRNA sequence': (Phaseolus coccineus). NCBI accession No. BZ707 196 (gi:28427237) (Feb. 19, 2003); NCBI accession No. CA902393 (gi:27389385) (Dec. 27, 2002); Bui, Whitelaw, C.A. et al., “OGEBD41TCZM 0.7 1.5 KB Zea mays A.Q., et al., “PCS03217F Scarlet Runner Bean Suspensor Region genomic clone ZMMBMaO222G10, genomic Survey sequence': TriplEx2 Phaseolus coccineus cDNA 5" similar to AB025628: con (Zea mays). tains similarity to CCAAT-box binding transcription factor gene NCBI accession No. BZ733904 (gi:287 10230) (Mar. 3, 2003); id:MNJ7.26 Arabidopsis thaliana: Identical to At5g47670: LEC1 Whitelaw, C.A. et al., “OGFAN93TCZM 0.7 1.5 KB Zea mays like AF533650), mRNA sequence': (Phaseolus coccineus). genomic clone ZMMBMaO240O18, genomic Survey sequence': NCBI accession No. CA902394 (gi:27389386) (Dec. 27, 2002); Bui, (Zea mays). A.Q., et al., “PCSC12889 Scarlet Runner Bean Suspensor Region NCBI accession No. BZ733911 (gi:287 10244) (Mar. 3, 2003); TriplEx2 Phaseolus coccineus cDNA 5" similar to AB025628: con Whitelaw, C.A. et al., “OGFAN93TMZM 0.71.5 KB Zea mays tains similarity to CCAAT-box binding transcription factor gene genomic clone ZMMBMaO240O18, genomic Survey sequence': id:MNJ7.26 Arabidopsis thaliana: Identical to At5g47670: LEC1 (Zea mays). like AF533650), mRNA sequence': (Phaseolus coccineus). NCBI accession No. CA026619 (gi:24303993) (Oct. 23, 2002); NCBI accession No. CA902402 (gi:27389394) (Dec. 27, 2002); Bui, Radchuk, V., et al., “HZ56G24r HZ Hordeum vulgare subsp. Vulgare A.Q., et al. “PCSC08658 Scarlet Runner Bean Suspensor Region cDNA clone HZ56G245", mRNA sequence': (Hordeum vulgare). TriplEx2 Phaseolus coccineus cDNA 5' similar to P25209: CBFA NCBI accession No. CA688065 (gi:25278629) (Nov. 25, 2002); Maize CCAAT-binding transcription factor subunit A (CBF-A) Tingey, S.V., et al., “wlm96.pk037.k9 wilm96 Triticum aestivum Arabidopsis thaliana; similar to At3 g53340: Non-LEC1-type cDNA clone wilm96.pk037.k.9 5' end, mRNA sequence': (Triticum AHAP3, mRNA sequence': (Phaseolus coccineus). aestivum). NCBI accession No. CA902403 (gi:27389395) (Dec. 27, 2002); Bui, NCBI accession No. CA753815 (gi:25797854) (Nov. 27, 2002); A.Q., et al., “PCSC08738 Scarlet Runner Bean Suspensor Region Bohnert, H.J., et al., “BR04.0003000 PLATE AO5 33 034.abl TriplEx2 Phaseolus coccineus cDNA 5' similar to P25209: CBFA US 8,633,353 B2 Page 17

(56) References Cited NCBI accession No. CB350611 (gi:28985378) (Mar. 17, 2003); Wen, T.J., et al., “MEST253-D11.univ ISUM5-RNZea may’s cDNA OTHER PUBLICATIONS clone MEST253-D113', mRNA sequence”; (Zea mays). NCBI accession No. CB351460 (gi:2898.7093) (Mar. 17, 2003); Maize CCAAT-binding transcription factor subunit A (CBF-A) Walbot, V., “3529 1 39 1 B01y 13529–2 mm ear tissue from Arabidopsis thaliana; similar to At3g53340: Non-LEC1-type Schmidt and Hake labs Zea may’s cDNA, mRNA sequence”; (Zea AHAP3, mRNA sequence': (Phaseolus coccineus). mays). NCBI accession No. CA902404 (gi:273893.96) (Dec. 27, 2002); Bui, NCBI accession No. CB351633 (gi:28987436) (Mar. 17, 2003); A.Q., et al., “PCSC 14305 Scarlet Runner Bean Suspensor Region TriplEx2 Phaseolus coccineus cDNA 5' similar to P25209: CBFA Walbot, V. “3529 1 42 1 E05.y 1 3529–2 mm ear tissue Maize CCAAT-binding transcription factor subunit A (CBF-A) from Schmidt and Hake labs Zea may’s cDNA, mRNA sequence': Arabidopsis thaliana; similar to At3 g53340: Non-LEC1 type (Zea mays). AHAP3, mRNA sequence': (Phaseolus coccineus). NCBI accession No. P25209 (gii: 115840) (Apr. 23, 1993); Li, X.Y., et NCBI accession No. CA902413 (gi:27389405) (Dec. 27, 2002); Bui, al. "CCAAT-binding transcription factor subunit A (CBF-A) (NF-Y A.Q., et al. “PCSC08277 Scarlet Runner Bean Suspensor Region protein chain B) (NF-YB) (CAAT-box DNA binding protein subunit TriplEx2 Phaseolus coccineus cDNA 5' similar to P25209: CBFA B)” (Zea mays) (note: the original Submission and latest update are Maize CCAAT-binding transcription factor subunit A (CBF-A) provided for examination) (publication: see Nucleic Acids Res. 20 Arabidopsis thaliana: Identical to At3 g53340: Non-LEC1-type (5), 1087-1091, 1992, Evolutionary variation of the CCAAT-binding AHAP3, mRNA sequence': (Phaseolus coccineus). transcription factor NF-Y). NCBI accession No. CA9 19789 (gi:27406719) (Dec. 27, 2002); NCBI accession No. S22820 (gi:7443522) (Apr. 5, 2000); Li, X.Y., et VanderBosch, K., et al., “EST637507 MTUS Medicago truncatula al. “Transcription factor NF-Y. CCAAT-binding, chain B—maize” cDNA clone MTUS-18B7, mRNA sequence”; (Medicago (Zea mays) (note: the original Submission and latest update are pro truncatula). vided for examination) (publication: see Nucleic Acids Res. 20 (5), NCBI accession No. CA923822 (gi:274 10752) (Dec. 27, 2002); 1087-1091, 1992, Evolutionary variation of the CCAAT-binding Ranjan, P. et al., “MTU7CL.P15. D04 Aspen leaf cDNA Library transcription factor NF-Y). Populus tremuloides cDNA, mRNA sequence'. (Populus NCBI accession No. X59714 (gi:22379) (Apr. 21, 1993); Benoist, C., tremuloides). "Z. may’s mRNA for CAAT-box DNA binding protein subunit B (NF NCBI accession No. CA935541 (gi:27424021) (Dec. 30, 2002); YB); (Zea mays) (note: the original Submission and latest update are Shoemaker, R., et al., "sau55g04.y1 Gm-c1071 Glycinemax cDNA provided for examination) (publication: see Nucleic Acids Res. 20 clone Soybean Clone ID: Gm-c1071-4927 5' similar to Tr:08113 (5), 1087-1091, 1992, Evolutionary variation of the CCAAT-binding 081130 CCAAT-Box Binding Factor HAP3 Homolog, mRNA transcription factor NF-Y). sequence', (Glycine max). NCBI accession No.Y13723 (gi:2398526)(Sep. 16, 1997); Edwards, NCBI accession No. CAA42234 (gi:22380) (Apr. 21, 1993); Li, D., “Arabidopsis thaliana mRNA for Hap3a transcription factor” X.Y., et al. “CA-box DNA binding protein subunit B (NF-YB)” (Zea (note: the original Submission and latest update are provided for mays) (note: the original Submission and latest update are provided examination). for examination) (publication: see Nucleic Acids Res. 20 (5), 1087 NCBI accession No.Y13724 (gi:2398528) (Sep. 16, 1997); Edwards, 1091, 1992, Evolutionary variation of the CCAAT-binding transcrip D., “Arabidopsis thaliana mRNA for Hap3b transcription factor” tion factor NF-Y). (note: the original Submission and latest update are provided for NCBI accession No. CAA74051 (gi:2398527) (Sep. 16, 1997); examination). Edwards, D., “Transcription factor Arabidopsis thaliana” (note: the NCBI accession No. Z97336 (gi:2244788) (pos. 102664-103149) original Submission and latest update are provided for examination). (Jul. 6, 1997); Bevan, M., et al., “Arabidopsis thaliana DNA chro NCBI accession No. CAC37695 (gi: 13928060) (May 2, 2001); mosome 4, ESSA I contig fragment No. 1'' (note: the original sub Masiero, S., et al. “NFYB1 protein' (Oryza sativa) (note: the original mission and latest update are provided for examination, see gene Submission and latest update are provided for examination). d13310w, protein id CAB10233.1 marked on p. 21-22/47). NCBI accession No. CB077569 (gi:2789 1006) (Jan. 24, 2003); Database TREMBL Accession No. AB025619; (Apr. 9, 1999) Levesque, M.P. et al., “hj56d09.g1 Hedyotis terminalis flower— “Arabidopsis thaliana genomic DNA, chromosome 5, P1 clone: Stage 2 (Nybg) Hedyotis terminalis cDNA clone hj56d09, mRNA MBA10'. sequence'; (Hedyotis terminalis). Database TREMBL Sequence Library TREMBL Accession No. NCBI accession No. CBO90548 (gi:27914740)(Jan. 27, 2003); Bren Q9FGJ3 (Mar. 1, 2001); protein: “Similarity to CCAAT-box binding ner, E.D., et al., “gy76g 12 gl Cycad Leaf Library (Nybg) Cycas transcription factor' (origin-gene: AT5.g47640). rumphii cDNA clone gy76g 12, mRNA sequence”; (Cycas rumphii). Database TREMBL Sequence Library TREMBL Accession No. NCBI accession No. CB290512 (gi:28615969) (Feb. 28, 2003); Q9FGP7 (Mar. 1, 2001); protein: “Transcription factor Hap5a-like” Close, T.J., et al., “UCRCS01 01bh 12 b1 Washington Navel orange (origin-gene: At5g50480). cold acclimated flavedo & albedo cDNA library Citrus sinensis Database TREMBL Sequence Library TREMBL Accession No. cDNA clone UCRCS01 01bh 12, mRNA sequence”; (Citrus Q9FMV5 (Mar. 1, 2001); protein: “Transcription factor Hap5a-like Sinensis). protein' (origin-gene: At5g63470) (publication: see DNA Res. NCBI accession No. CB290513 (gi:28615970) (Feb. 28, 2003); 4:401-414, 1997. Structural analysis of Arabidopsis thaliana chro Close, T.J., et al., “UCRCS01 01bh 12 gl Washington Navel mosome 5. . . ). orange cold acclimated flavedo & albedo cDNA library Citrus Database TREMBL Sequence Library TREMBL Accession No. sinensis cDNA clone UCRCS01 01bh 12, mRNA sequence”; (Cit Q9LFI3 (Oct. 1, 2000); protein: “Transcription factor NF-Y. rus sinensis). CCAAT-binding-like protein' (origin-gene: At3 g53340). NCBI accession No. CB347686 (gi:28968653) (Mar. 14, 2003); Database TREMBL Sequence Library TREMBL Accession No. Goes da Silva, F., et al., “CAB2SG0003 IaF CO7 Cabernet Q9SLGO (May 1, 2000); protein: “Putative CCAAT-binding tran Sauvignon Berry CAB2SG Vitis vinifera cDNA clone Scription factor Subunit' (origin-gene: At2g38880) (publication: See CAB2SG0003 IaF CO7 5", mRNA sequence”; (Vitis vinifera) Genome Biol. 3:RESEARCHOO29-RESEARCHO029, 2002, Full (note: two versions are presented for review, gi:28968653 submitted length messenger RNA sequences greatly improve genome annota Mar. 14, 2003; and, gi:297824.05 submitted Apr. 10, 2003). tion). NCBI accession No. CB347754 (gi:28968721) (Mar. 14, 2003); Database TREMBL Sequence Library TREMBL Accession No. Goes da Silva, F., et al., “CAB2SG0003 IaR CO7 Cabernet Q97.QC3 (May 1, 1999); protein: “Putative CCAAT-box binding Sauvignon Berry CAB2SG Vitis vinifera cDNA clone transcription factor' (origin-gene: At2g37060). CAB2SG0003 IaR C07 3', mRNA sequence”; (Vitis vinifera) Database TREMBL Sequence Library TREMBL Accession No. (note: two versions are presented for review, gi:28968721 submitted 023634 (Jan. 1, 1998); protein: “Transcription factor fragment” Mar. 14, 2003; and, gi:29782446 submitted Apr. 10, 2003). (origin-gene: hap3b). US 8,633,353 B2 Page 18

(56) References Cited U.S. Appl. No. 10/171.468, filed Jun. 14, 2002, Creelman, Robert et al. OTHER PUBLICATIONS U.S. Appl. No. 09/394,519, filed Sep. 13, 1999, Zhang, J. et al. U.S. Appl. No. 09/627.348, filed Jul. 28, 2000, Thomashow, Michael Database TREMBL Sequence Library TREMBL Accession No. O23310 (Jan. 1, 1998); protein: “CCAAT-binding transcription fac et al. tor subunit A (CBF-A)”. U.S. Appl. No. 09/489,376, filed Jan. 21, 2000, Heard, J. et al. Database TREMBL Sequence Library TREMBL Accession No. U.S. Appl. No. 09/489,230, filed Jan. 21, 2000, Broun, P. et al. O81130 (Nov. 1, 1998); protein: “CCAAT-box binding factor HAP3 U.S. Appl. No. 09/506,720, filed Feb. 17, 2000, Keddie, James et al. homolog”; (publication: see Cell 93:1195-1205, 1998, “Arabidopsis U.S. Appl. No. 09/533,030, filed Mar. 22, 2000, Keddie, James et al. Leafy Cotyledon1 is sufficient to induce embryo development in U.S. Appl. No. 09/533,392, filed Mar. 22, 2000, Jiang, C-Z. et al. vegetative cells”). U.S. Appl. No. 12/526,042, filed Feb. 7, 2008, Repetti, Peter P. et al. U.S. Appl. No. 09/733,089, filed Dec. 11, 2000, Lutfiyya. U.S. Appl. No. 12/721,304, filed Mar. 10, 2010, Creelman, Robert et U.S. Appl. No. 09/474,435, filed Dec. 28, 1999, Monsanto Co., et al. al 1 U.S. Appl. No. 09/532,591, filed Mar. 22, 2000, Samaha, R. et al. .S. Appl. No. 12/988,789, filed Oct. 20, 2010, Chen, Jianxin, et al. U.S. Appl. No. 09/533,648, filed Mar. 22, 2000, Riechmann, Jose .S. Appl. No. 12/689,010, filed Jan. 18, 2010, Powell, Ann L.T. et al. Luis et al. .S. Appl. No. 12/732,911, filed Mar. 26, 2010, Armstrong, Joshua I. U.S. Appl. No. 10/290,627, filed Nov. 7, 2002, Riechmann, Jose Luis e tal. et al. U.S. Appl. No. 12/917,303, filed Nov. 1, 2010, Jiang, C-Z. et al. U.S. Appl. No. 09/713,994, filed Nov. 16, 2000, Keddie, James et al. U.S. Appl. No. 12/922,834, filed Sep. 15, 2010, Khanna, Rajnish, et U.S. Appl. No. 09/837,944, filed Apr. 18, 2001, Creelman, Robert et al 1 al. U.S. Appl. No. 12/902,887, filed Oct. 12, 2010, Armstrong, Joshua, et U.S. Appl. No. 09/594,214, filed Jun. 14, 2000, Jones, J. et al. al. U.S. Appl. No. 10/456,882, filed Jun. 6, 2003, Riechmann, Jose Luis et al. * cited by examiner U.S. Patent Jan. 21, 2014 Sheet 1 of 5 US 8,633,353 B2

Fagales Cucurbitales Rosales Fabales Oxalidales Malpighiales Sapindales Malvales Brassicales Myrtales GeranialesDipsacales Apiales Aquifoliales Solanales Lamiales Gentianales Garryales Ericales Cornales Saxifradales Santalales Caryophyllales Proteales ZingiberalesRanunculales Commelinales Poales Arecales Pandanales Liliales Dioscoreales Asparagales Alismatales AcoralesPiperales Magnoliales Laurales Ceratophyllales Fig. 1

U.S. Patent Jan. 21, 2014 Sheet 3 of 5 US 8,633,353 B2

99 Zm G3436 (34) 95 (not tested) 97 Os G3398 (40) Os G3397 (36) 50 100 Zm G3435 (30) (not tested) 100 Gm G3472 (32) 100 96 Gm G3474 (24) At G482 (28) At G485 (18) \ Gm G3475 (16) 97 100 Gm G3476 (20) 90 Os G3395 (12) 100 Zm G3434 (12) 94 (not tested) Os G3396 (44) 96 100 At G1364 (14) At G2345 (22) 87 At G481 (2) 74 Gm G3470 (4) 100 Gm G3471 (6)

Fig. 3 U.S. Patent US 8,633,353 B2

US 8,633,353 B2 1. 2 PLANTS WITH IMPROVED WATER DEFCIT Aug. 2006 (expired), which claims the benefit of U.S. provi AND COLD TOLERANCE sional application 60/713.952, filed 31 Aug. 2005 (expired). The entire contents of each of these applications are hereby RELATIONSHIP TO COPENDING incorporated by reference. APPLICATIONS JOINT RESEARCH AGREEMENT This application (the “present application”) claims the ben efit of U.S. provisional application 60/961,403, filed 20 Jul. The claimed invention, in the field of functional genomics 2007 (expired); and, the present application is a continuation and the characterization of plant genes for the improvement in-part application of U.S. non-provisional application Ser. 10 of plants, was made by or on behalf of Mendel Biotechnology, No. 10/675,852, filed 30 Sep. 2003 (pending); and, the Inc. and Monsanto Company as a result of activities under present application is a continuation-in-part application of taken within the scope of a joint research agreement in effect .S. non-provisional application Ser. No. 1 1/479,226, filed on or before the date the claimed invention was made. 0 Jun. 2006 (issued as U.S. Pat. No. 7,858,848); and, the resent application is a continuation-in-part application of 15 FIELD OF THE INVENTION .S. non-provisional application Ser. No. 1 1/725,235, filed 6 Mar. 2007 (issued as U.S. Pat. No. 7,601,893); and, the The present invention relates to plant genomics and plant resent application is a continuation-in-part application of improvement, increasing tolerance to abiotic stresses, and .S. non-provisional application Ser. No. 1 1/728,567, filed improving the appearance and yield of plants. 6 Mar. 2007 (issued as U.S. Pat. No. 7,635,800); and, the resent application is a continuation-in-part application of BACKGROUND OF THE INVENTION .S. non-provisional application Ser. No. 1 1/069.255, filed 8 Feb. 2005 (pending); and, the present application is a Water deficit is a common component of many plant ontinuation-in-part application of U.S. non-provisional stresses. Water deficit occurs in plant cells when the whole pplication Ser. No. 10/374,780, filed 25 Feb. 2003 (issued as 25 plant transpiration rate exceeds the water uptake. In addition .S. Pat. No. 7,511,190); and, the present application is a to drought, other stresses, such as Salinity and low tempera ontinuation-in-part application of U.S. non-provisional ture, produce cellular dehydration (McCue and Hanson, pplication Ser. No. 10/546,266, filed 19 Aug. 2005 (issued as 1990). U.S. Pat. No. 7,659,446), which is a 371 National Stage Heat stress often accompanies conditions of low water filing of PCT application PCT/US2004.005654, filed 25 Feb. 30 availability. Heat itself is seen as an interacting stress and 2004 (expired), which is a continuation-in-part of both U.S. adds to the detrimental effects caused by water deficit condi non-provisional application Ser. No. 10/374,780, filed 25 tions. Evaporative demand exhibits near exponential Feb. 2003 (issued as U.S. Pat. No. 7,511,190) and U.S. non increases with increases in daytime temperatures and can provisional application Ser. No. 10/675,852, filed 30 Sep. result in high transpiration rates and low plant water poten 2003 (pending); and, the present application is a continua 35 tials (Hall et al., 2000). High-temperature damage to pollen tion-in-part application of U.S. non-provisional application almost always occurs in conjunction with drought stress, and Ser. No. 10/412,699, filed 10 Apr. 2003 (issued as U.S. Pat. rarely occurs under well-watered conditions. Thus, separat No. 7,345,217), and U.S. non-provisional application Ser. ing the effects of heat and drought stress on pollination is No. 10/412,699 also is a continuation-in-part application of difficult. Combined stress can alter plant metabolism in novel U.S. non-provisional application Ser. No. 10/374,780, filed 40 ways; therefore, understanding the interaction between dif 25 Feb. 2003 (issued as U.S. Pat. No. 7,511,190); and, the ferent stresses may be important for the development of strat present application is a continuation-in-part application of egies to enhance stress tolerance by genetic manipulation. U.S. non-provisional application Ser. No. 1 1/642,814, filed “Chilling sensitivity” describes many types of physiologi 20 Dec. 2006 (issued as U.S. Pat. No. 7,825.296), which is a cal damage produced at low, but above freezing, tempera divisional application of U.S. non-provisional application 45 tures. Typical chilling damage includes wilting, necrosis, Ser. No. 10/666,642, filed 18 Sep. 2003 (issued as U.S. Pat. chlorosis or leakage of ions from cell membranes. The under No. 7,196,245); and, the present application is a continuation lying mechanisms of chilling sensitivity are not completely in-part application of U.S. non-provisional application Ser. understood yet, but probably involve the level of membrane No. 10/714,887, filed 13 Nov. 2003 (pending), which is a saturation and other physiological deficiencies. By some esti continuation-in-part application of U.S. non-provisional 50 mates, chilling accounts for monetary losses in the United application Ser. No. 10/666,642, filed 18 Sep. 2003 (issued as States second only to drought and flooding. U.S. Pat. No. 7,196,245); and, the present application is a Based on the commonality of many aspects of cold, continuation-in-part application of U.S. non-provisional drought, and salt stress responses, genes that increase toler application Ser. No. 1 1/435,388, filed 15 May 2006 (issued as ance to cold or salt stress can also improve drought stress U.S. Pat. No. 7,663,025), which is a continuation-in-part 55 protection. In fact, this has already been demonstrated for application of PCT application PCT/US04/37584, filed 12 some transcription factors, such as AtCBF/DREB1, and for Nov. 2004 (expired), which is a continuation-in-part of U.S. other genes such as OsCDPK7 (Saijo et al., 2000), or AVP1 (a non-provisional application Ser. No. 10/714,887, filed 13 vacuolar pyrophosphatase-proton-pump, Gaxiola et al., Nov. 2003 (pending); and, the present application is a con 2001). tinuation-in-part of International Application No. PCT/ 60 This study identifies polynucleotides encoding another US2006/34615, filed 31 Aug. 2006 (expired), which claims group of transcription factors that can improve tolerance to priority from U.S. provisional application 60/713.952, filed cold and/or water deficit conditions. The protein sequences of 31 Aug. 2005 (expired); and, the present application is a the invention, which belong to the CCAAT-binding family of continuation-in-part application of U.S. non-provisional transcription factors, have been introduced into transgenic application Ser. No. 1 1/705.903, filed 12 Feb. 2007 (issued as 65 plants that were then found to have greater tolerance to cold U.S. Pat. No. 7,868,229), which is a continuation-in-part and water deficit stress than control plants. Thus, important application of PCT application PCT/US2006/34615, filed 31 polynucleotide and polypeptide sequences for producing US 8,633,353 B2 3 4 commercially valuable plants and crops as well as the meth electronic file of the Sequence Listing contained on each of ods for making them and using them were discovered. Other these CD-ROMs was created on 19 Oct. 2007, and each copy aspects and embodiments of the invention are described of the Sequence Listing is 159 kilobytes in size. These copies below and can be derived from the teachings of this disclosure of the Sequence Listing on the three CD-ROM discs submit as a whole. ted with this application are hereby incorporated by reference in their entirety. SUMMARY OF THE INVENTION FIG. 1 shows a conservative estimate of phylogenetic rela tionships among the orders of flowering plants (modified The present invention pertains to a nucleic acid construct, from Soltis et al., 1997). Those plants with a single cotyledon Such as an expression vector or cassette, a plasmid or another 10 (monocots) are a monophyletic clade nested within at least DNA preparation that comprises a recombinant polynucle two major lineages of dicots; the are further divided otide encoding a HAP3-like (or NF-YB) transcription factor into rosids and . Arabidopsis is a rosid eudicot clas found within the CCAAT binding-transcription factor family sified within the order Brassicales; rice is a member of the (also known as the NF-Y family; Mantovani, 1999). Com monocot order Poales. FIG. 1 was adapted from Daly et al., prised within each highly conserved central B domain of any 15 2001. of these proteins is found a conserved protein-protein and FIG. 2 shows a phylogenic dendogram depicting phyloge DNA-binding interaction module within the “histone fold netic relationships of higher plant taxa, including clades con motif or “HFM'. The conserved B domains of the present taining tomato and Arabidopsis; adapted from Ku et al., 2000; transcription factors responsible for these functions are and Chase et al., 1993. closely-related to the B domain of G481, SEQID NO: 2, in FIG. 3 illustrates the phylogenic relationship of a number that they are at least about 78%, 80%, 81, 82%, 83%, 84%, of sequences within the G482 subclade. The phylogenetic 85%. 86%, 87%, 91%, 93%, 95%, 97%, 98%, or 100% iden tree and multiple sequence alignments of G481 and related tical to the G481 B domain. When the expression vector or full length proteins were constructed using ClustalW cassette is introduced into a plant to produce a transgenic (CLUSTALW Multiple Sequence Alignment Program ver plant, the transgenic plant that results is capable of overex 25 sion 1.83, 2003) and MEGA2 (www.megasoftware.net) soft pressing the polypeptide, at which point the transgenic plant ware. The ClustalW multiple alignment parameters were: becomes more tolerant to cold or a water deficit condition Gap Opening Penalty: 10.00 than a control plant. Examples of water deficit conditions Gap Extension Penalty: 0.20 include heat, salt, drought, desiccation, dehydration, high Delay divergent sequences: 30% Sugar concentration, or freezing. Examples of control plants 30 DNA Transitions Weight: 0.50 include wild-type plants or plants transformed with an Protein weight matrix: Gonnet series “empty expression vector that does not comprise the recom DNA weight matrix: IUB binant polynucleotide. Use negative matrix: OFF. The invention also pertains to a transgenic plant compris A FastA formatted alignment was then used to generate a ing the nucleic acid construct comprising the recombinant 35 phylogenetic tree in MEGA2using the neighborjoining algo polynucleotide encoding the CCAAT family transcription rithm and a p-distance model. A test of phylogeny was done factor polypeptide having the conserved B domain that is at via bootstrap with 100 replications and Random Speed set to least about 78%, 80%, 81, 82%, 83%, 84%, 85%, 86%, 87%, default. Cut off values of the bootstrap tree were set to 50%. 91%, 93%, 95%, 97%, 98%, or 100% identical to a conserved G482 subclade transcription factors of the broader non B domain of SEQID NO: 2, wherein the transgenic plant has 40 LEC1-like clade of transcription factors found in the L1 L more tolerance to cold or a water deficit condition than a related CCAAT transcription factor family are derived from a control plant. common single strong node (arrow). The sequences shown in The invention is also directed to a method for increasing the FIG.3 have been introduced into plants, including sequences cold or water deficit tolerance of a plant, the method compris from both monocots and eudicots, most of these sequences ing the steps of introducing into a nucleic acid construct to 45 have conferred increased cold or water deficit tolerance when produce a transgenic plant. The nucleic acid construct com the sequences were overexpressed. The sequences of FIG. 3 prises the recombinant polynucleotide encoding the CCAAT that conferred improved cold or water deficit tolerance in family transcription factor polypeptide having the conserved Arabidopsis plants have B domains with at least 78% identity B domain that is at least 78%, 80%, 81, 82%, 83%, 84%, 85%, to the B domain of G481. SEQ ID NOs: of the sequences 86%, 87%, 91%, 93%, 95%, 97%, 98%, or 100% identical to 50 found in FIG. 3 are provided in the parentheses. a conserved B domain of SEQID NO: 2. When the transgenic FIGS. 4A and 4B provide a sequence alignment of the plant overexpresses the polypeptide, the transgenic plant will conserved B domains of HAP3 polypeptides from Arabidop then have more tolerance to cold or a water deficit condition sis, soybean, rice, and corn. SEQ ID NOs of sequences in than a control plant. FIGS. 4A to 4B are found within the parentheses after the 55 Gene Identification Numbers (GIDs; e.g., “G481”, “G482, BRIEF DESCRIPTION OF THE SEQUENCE etc.). Members of the G482 subclade are shown above the LISTING AND DRAWINGS horizontal lines in FIGS. 4A to 4B. Conserved residues con stituting the backbone structure (Maity and de Crombrugghe, The Sequence Listing provides exemplary polynucleotide 1998; Zemzoumi et al., 1999) of the histone fold matrix and polypeptide sequences of the invention. The traits asso 60 (Gusmaroliet al., 2002; Edwards et al., 1998) are shown in the ciated with the use of the sequences are included in the boxes in FIGS. 4A and 4B and are also found in SEQID NO: Examples. 114. CD-ROMs Copy 1 and Copy 2, and the CRF copy (Copy 3) of the Sequence Listing under CFR Section 1.821(e), are DETAILED DESCRIPTION read-only memory computer-readable compact discs. Each 65 contains a copy of the Sequence Listing in ASCII text format. The present invention relates to polynucleotides and The Sequence Listing is named “MBI0071CIPST25.txt”, the polypeptides for modifying phenotypes of plants, particularly US 8,633,353 B2 5 6 those associated with increased cold and/or water deficit tol itance, and in physical terms is a particular segment or erance with respect to a control plant (for example, a wild sequence of nucleotides along a molecule of DNA (or RNA, type plant or a plant transformed with an “empty' vector in the case of RNA viruses) involved in producing a polypep lacking a gene of interest). Throughout this disclosure, Vari tide chain. The latter may be subjected to Subsequent process ous information Sources are referred to and/or are specifically ing such as chemical modification or folding to obtain a incorporated. The information Sources include Scientific jour functional protein or polypeptide. A gene may be isolated, nal articles, patent documents, textbooks, and World Wide partially isolated, or found with an organism's genome. By Web browser-inactive page addresses. While the reference to way of example, a transcription factor gene encodes a tran these information sources clearly indicates that they can be Scription factor polypeptide, which may be functional or used by one of skill in the art, each and every one of the 10 require processing to function as an initiator of transcription. information sources cited herein are specifically incorporated Operationally, genes may be defined by the cis-trans test, a in their entirety, whether or not a specific mention of “incor genetic test that determines whether two mutations occur in poration by reference' is noted. The contents and teachings of the same gene and that may be used to determine the limits of each and every one of the information sources can be relied on the genetically active unit (Rieger et al., 1976). A gene gen and used to make and use embodiments of the invention. 15 erally includes regions preceding (“leaders'; upstream) and As used herein and in the appended claims, the singular following (“trailers'; downstream) the coding region. A gene forms “a”, “an', and “the include the plural reference unless may also include intervening, non-coding sequences, referred the context clearly dictates otherwise. Thus, for example, a to as “introns', located between individual coding segments, reference to “a host cell includes a plurality of such host referred to as “exons'. Most genes have an associated pro cells, and a reference to “a stress' is a reference to one or more moter region, a regulatory sequence 5' of the transcription stresses and equivalents thereof known to those skilled in the initiation codon (there are some genes that do not have an art, and so forth. identifiable promoter). The function of a gene may also be regulated by enhancers, operators, and other regulatory ele DEFINITIONS mentS. 25 A "polypeptide' is an amino acid sequence comprising a "Polynucleotide' is a nucleic acid molecule comprising a plurality of consecutive polymerized amino acid residues plurality of polymerized nucleotides, e.g., at least about 15 e.g., at least about 15 consecutive polymerized amino acid consecutive polymerized nucleotides. A polynucleotide may residues. In many instances, a polypeptide comprises a poly be a nucleic acid, oligonucleotide, nucleotide, or any frag merized amino acid residue sequence that is a transcription ment thereof. In many instances, a polynucleotide comprises 30 factor or a domain or portion or fragment thereof. Addition a nucleotide sequence encoding a polypeptide (or protein) or ally, the polypeptide may comprise: (i) a localization domain; a domain or fragment thereof. Additionally, the polynucle (ii) an activation domain; (iii) a repression domain; (iv) an otide may comprise a promoter, an intron, an enhancer region, oligomerization domain; (v) a protein-protein interaction a polyadenylation site, a translation initiation site, 5' or 3' domain; (vi) a DNA-binding domain; or the like. The untranslated regions, a reporter gene, a selectable marker, or 35 polypeptide optionally comprises modified amino acid resi the like. The polynucleotide can be single-stranded or double dues, naturally occurring amino acid residues not encoded by stranded DNA or RNA. The polynucleotide optionally com a codon, non-naturally occurring amino acid residues. prises modified bases or a modified backbone. The polynucle “Protein’ refers to an amino acid sequence, oligopeptide, otide can be, e.g., genomic DNA or RNA, a transcript (such as peptide, polypeptide or portions thereof whether naturally an mRNA), a cDNA, a PCR product, a cloned DNA, a syn 40 occurring or synthetic. thetic DNA or RNA, or the like. The polynucleotide can be “Portion', as used herein, refers to any part of a protein combined with carbohydrate, lipids, protein, or other materi used for any purpose, but especially for the screening of a als to perform a particular activity Such as transformation or library of molecules which specifically bind to that portion or form a useful composition Such as a peptide nucleic acid for the production of antibodies. (PNA). The polynucleotide can comprise a sequence in either 45 A “recombinant polypeptide' is a polypeptide produced by sense orantisense orientations. “Oligonucleotide' is Substan translation of a recombinant polynucleotide. A “synthetic tially equivalent to the terms amplimer, primer, oligomer, polypeptide' is a polypeptide created by consecutive poly element, target, and probe and is preferably single-stranded. merization of isolated amino acid residues using methods A “recombinant polynucleotide' is a polynucleotide that is well known in the art. An "isolated polypeptide, whether a not in its native state, e.g., the polynucleotide comprises a 50 naturally occurring or a recombinant polypeptide, is more nucleotide sequence not found in nature, or the polynucle enriched in (or out of) a cell than the polypeptide in its natural otide is in a context other than that in which it is naturally state in a wild-type cell, e.g., more than about 5% enriched, found, e.g., separated from nucleotide sequences with which more than about 10% enriched, or more than about 20%, or it typically is in proximity in nature, or adjacent (or contigu more than about 50%, or more, enriched, i.e., alternatively ous with) nucleotide sequences with which it typically is not 55 denoted: 105%, 110%, 120%, 150% or more, enriched rela in proximity. For example, the sequence at issue can be tive to wildtype standardized at 100%. Such an enrichment is cloned into a nucleic acid construct, or otherwise recombined not the result of a natural response of a wild-type plant. with one or more additional nucleic acid. Alternatively, or additionally, the isolated polypeptide is An "isolated polynucleotide' is a polynucleotide, whether separated from other cellular components with which it is naturally occurring or recombinant, that is present outside the 60 typically associated, e.g., by any of the various protein puri cell in which it is typically found in nature, whether purified fication methods herein. or not. Optionally, an isolated polynucleotide is subject to one “Homology” refers to sequence similarity between a ref or more enrichment or purification procedures, e.g., cell lysis, erence sequence and at least a fragment of a newly sequenced extraction, centrifugation, precipitation, or the like. clone insert or its encoded amino acid sequence. "Gene' or “gene sequence” refers to the partial or complete 65 “Identity” or “similarity” refers to sequence similarity coding sequence of a gene, its complement, and its 5' or 3' between two polynucleotide sequences or between two untranslated regions. A gene is also a functional unit of inher polypeptide sequences, with identity being a more strict com US 8,633,353 B2 7 8 parison. The phrases “percent identity” and “96 identity” refer formed plants overexpressing other members of the same to the percentage of sequence similarity found in a compari clade or Subclade of transcription factor polypeptides. Son of two or more polynucleotide sequences or two or more A fragment or domain can be referred to as outside a polypeptide sequences. “Sequence similarity refers to the conserved domain, outside a consensus sequence, or outside percent similarity in base pair sequence (as determined by any 5 a consensus DNA-binding site that is known to exist or that suitable method) between two or more polynucleotide exists for a particular polypeptide class, family, or Sub-family. sequences. Two or more sequences can be anywhere from 0 to In this case, the fragment or domain will not include the exact 100% similar. Identity or similarity can be determined by amino acids of a consensus sequence or consensus DNA comparingaposition in each sequence that may be aligned for binding site of a transcription factor class, family or Sub purposes of comparison. When a position in the compared 10 family, or the exact amino acids of a particular transcription sequence is occupied by the same nucleotide base or amino factor consensus sequence or consensus DNA-binding site. acid, then the molecules are identical at that position. A Furthermore, a particular fragment, region, or domain of a degree of similarity or identity between polynucleotide polypeptide, or a polynucleotide encoding a polypeptide, can sequences is a function of the number of identical, matching be “outside a conserved domain if all the amino acids of the or corresponding nucleotides at positions shared by the poly- 15 fragment, region, or domain fall outside of a defined con nucleotide sequences. A degree of identity of polypeptide served domain(s) for a polypeptide or protein. Sequences sequences is a function of the number of identical amino acids having lesser degrees of identity but comparable biological at corresponding positions shared by the polypeptide activity are considered to be equivalents. sequences. A degree of homology or similarity of polypeptide As one of ordinary skill in the art recognizes, conserved sequences is a function of the number of amino acids at 20 domains may be identified as regions or domains of identity to corresponding positions shared by the polypeptide a specific consensus sequence (see, for example, Riechmann Sequences. et al., 2000a, 2000b). Thus, by using alignment methods well By “substantially identical' is meant an amino acid known in the art, the conserved domains of the plant polypep sequence which differs only by conservative amino acid Sub tides may be determined. stitutions, for example, Substitution of one amino acid for 25 The conserved B domains for many of the polypeptide another of the same class (e.g., valine for glycine, arginine for sequences of the invention are listed in Table 2. Also, the lysine, etc.) or by one or more non-conservative Substitutions, polypeptides of FIGS. 4A to 4B and Table 2 have conserved deletions, or insertions located at positions of the amino acid B domains specifically indicated by amino acid coordinate sequence which do not destroy the function of the protein start and stop sites. A comparison, of the regions of these assayed. (e.g., as described herein). Preferably, such a 30 polypeptides allows one of skill in the art (see, for example, sequence has at least 78% or greater identity with a listed Reeves and Nissen, 1995) to identify domains or conserved B sequence of the invention, such as at least 78% or greater domains for any of the polypeptides listed or referred to in this identity with the B domain of SEQID NO: 2 or at least 80% disclosure. or greater identity with the B domain of SEQID NO: 2, or at “Complementary” refers to the natural hydrogen bonding least 83% or greater identity with the B domain of SEQID 35 by base pairing between purines and pyrimidines. For NO: 2, or at least 85% or greater identity with the B domain example, the sequence A-C-G-T (5'-->3') forms hydrogen of SEQID NO: 2, or at least 91% or greater identity with the bonds with its complements A-C-G-T (5'-->3') or A-C-G-U B domain of SEQID NO:2, or at least 93% or greater identity (5'-->3'). Two single-stranded molecules may be considered with the B domain of SEQID NO: 2. partially complementary, if only some of the nucleotides Alignment” refers to a number of nucleotide bases or 40 bond, or “completely complementary’ if all of the nucle amino acid residue sequences aligned by lengthwise com otides bond. The degree of complementarity between nucleic parison so that components in common (i.e., nucleotide bases acid strands affects the efficiency and strength of hybridiza or amino acid residues at corresponding positions) may be tion and amplification reactions. “Fully complementary visually and readily identified. The fraction or percentage of refers to the case where bonding occurs between every base components in common is related to the homology or identity 45 pair and its complement in a pair of sequences, and the two between the sequences. Alignments such as those of FIGS. sequences have the same number of nucleotides. 4A to 4B may be used to identify conserved B domains and The terms “highly stringent' or “highly stringent condi relatedness within these domains. An alignment may suitably tion” refer to conditions that permit hybridization of DNA be determined by means of computer programs known in the Strands whose sequences are highly complementary, wherein art, such as MACVECTOR software (1999) (Accelrys, Inc., 50 these same conditions exclude hybridization of significantly San Diego, Calif.). mismatched DNAs. Polynucleotide sequences capable of A “conserved domain or “conserved region' as used hybridizing under stringent conditions with the polynucle herein refers to a region in heterologous polynucleotide or otides of the present invention may be, for example, variants polypeptide sequences where there is a relatively high degree of the disclosed polynucleotide sequences, including allelic of sequence identity between the distinct sequences. With 55 or splice variants, or sequences that encode orthologs or para respect to polynucleotides encoding presently disclosed logs of presently disclosed polypeptides. Nucleic acid polypeptides, a conserved domain is preferably at least nine hybridization methods are disclosed in detail by Kashima et base pairs (bp) in length. Transcription factor sequences that al., 1985, Sambrook et al., 1989, and by Haymes et al., 1985, possess or encode for conserved domains that have a mini which references are incorporated herein by reference. mum percentage identity and have comparable biological 60 In general, stringency is determined by the temperature, activity to the present polypeptide sequences, thus being ionic strength, and concentration of denaturing agents (e.g., members of the same clade or subclade of transcription factor formamide) used in a hybridization and washing procedure polypeptides, are encompassed by the invention. Overexpres (for a more detailed description of establishing and determin sion in a transformed plant of a polypeptide that comprises, ing stringency, see the section “Identifying Polynucleotides for example, a conserved domain having DNA-binding, acti- 65 or Nucleic Acids by Hybridization', below). The degree to Vation or nuclear localization activity results in the trans which two nucleic acids hybridize under various conditions formed plant having similar improved traits as other trans of stringency is correlated with the extent of their similarity. US 8,633,353 B2 9 10 Thus, similar nucleic acid sequences from a variety of phisms that may or may not be readily detectable using a Sources. Such as within a plant's genome (as in the case of particular oligonucleotide probe of the polynucleotide encod paralogs) or from another plant (as in the case of orthologs) ing polypeptide, and improper or unexpected hybridization to that may perform similar functions can be isolated on the allelic variants, with a locus other than the normal chromo basis of their ability to hybridize with known related poly Somal locus for the polynucleotide sequence encoding nucleotide sequences. Numerous variations are possible in polypeptide. the conditions and means by which nucleic acid hybridization “Allelic variant' or “polynucleotide allelic variant” refers can be performed to isolate related polynucleotide sequences to any of two or more alternative forms of a gene occupying having similarity to sequences known in the art and are not the same chromosomal locus. Allelic variation arises natu limited to those explicitly disclosed herein. Such an approach 10 rally through mutation, and may result in phenotypic poly may be used to isolate polynucleotide sequences having Vari morphism within populations. Gene mutations may be ous degrees of similarity with disclosed polynucleotide 'silent” or may encode polypeptides having altered amino sequences, such as, for example, encoded transcription fac acid sequence. “Allelic variant' and “polypeptide allelic vari tors having 78% or greater identity with the conserved B ant’ may also be used with respect to polypeptides, and in this domain of disclosed sequences. 15 case the terms refer to a polypeptide encoded by an allelic The terms “paralog and "ortholog” are defined below in variant of a gene. the section entitled “Orthologs and Paralogs”. In brief, “Splice variant' or “polynucleotide splice variant” as used orthologs and paralogs are evolutionarily related genes that herein refers to alternative forms of RNA transcribed from a have similar sequences and functions. Orthologs are structur gene. Splice variation naturally occurs as a result of alterna ally related genes in different species that are derived by a tive sites being spliced within a single transcribed RNA mol speciation event. Paralogs are structurally related genes ecule or between separately transcribed RNA molecules, and within a single species that are derived by a duplication event. may result in several different forms of mRNA transcribed The term “equivalog describes members of a set of from the same gene. Thus, splice variants may encode homologous proteins that are conserved with respect to func polypeptides having different amino acid sequences, which tion since their last common ancestor. Related proteins are 25 may or may not have similar functions in the organism. grouped into equivalog families, and otherwise into protein “Splice variant' or “polypeptide splice variant may also families with other hierarchically defined homology types. refer to a polypeptide encoded by a splice variant of a tran This definition is provided at the Institute for Genomic scribed mRNA. Research (TIGR) WorldWideWeb (www) website, “tigr.org As used herein, “polynucleotide variants' may also refer to under the heading “Terms associated with TIGRFAMs”. 30 polynucleotide sequences that encode paralogs and orthologs In general, the term “variant” refers to molecules with of the presently disclosed polypeptide sequences. “Polypep some differences, generated synthetically or naturally, in tide variants’ may refer to polypeptide sequences that are their base oramino acid sequences as compared to a reference paralogs and orthologs of the presently disclosed polypeptide (native) polynucleotide or polypeptide, respectively. These Sequences. differences include Substitutions, insertions, deletions or any 35 Differences between presently disclosed polypeptides and desired combinations of such changes in a native polynucle polypeptide variants are limited so that the sequences of the otide of amino acid sequence. former and the latter are closely similar overall and, in many With regard to polynucleotide variants, differences regions, identical. Presently disclosed polypeptide sequences between presently disclosed polynucleotides and polynucle and similar polypeptide variants may differ in amino acid otide variants are limited so that the nucleotide sequences of 40 sequence by one or more substitutions, additions, deletions, the former and the latter are closely similar overall and, in fusions and truncations, which may be present in any combi many regions, identical. Due to the degeneracy of the genetic nation. These differences may produce silent changes and code, differences between the former and latter nucleotide result in a functionally equivalent polypeptides. Thus, it will sequences may be silent (i.e., the amino acids encoded by the be readily appreciated by those of skill in the art, that any of polynucleotide are the same, and the variant polynucleotide 45 a variety of polynucleotide sequences is capable of encoding sequence encodes the same amino acid sequence as the pres the polypeptides and homolog polypeptides of the invention. ently disclosed polynucleotide. Variant nucleotide sequences A polypeptide sequence variant may have “conservative' may encode different amino acid sequences, in which case changes, wherein a Substituted amino acid has similar struc such nucleotide differences will result in amino acid substi tural or chemical properties. Deliberate amino acid substitu tutions, additions, deletions, insertions, truncations or fusions 50 tions may thus be made on the basis of similarity in polarity, with respect to the similar disclosed polynucleotide charge, solubility, hydrophobicity, hydrophilicity, and/or the sequences. These variations may result in polynucleotide amphipathic nature of the residues, as long as a significant variants encoding polypeptides that share at least one func amount of the functional or biological activity of the polypep tional characteristic. The degeneracy of the genetic code also tide is retained. For example, negatively charged amino acids dictates that many different variant polynucleotides can 55 may include aspartic acid and glutamic acid, positively encode identical and/or Substantially similar polypeptides in charged amino acids may include lysine and arginine, and addition to those sequences illustrated in the Sequence List amino acids with uncharged polar head groups having similar ing. hydrophilicity values may include leucine, isoleucine, and Also within the scope of the invention is a variant of a Valine; glycine and alanine; asparagine and glutamine; serine nucleic acid listed in the Sequence Listing, that is, one having 60 and threonine; and phenylalanine and tyrosine. More rarely, a a sequence that differs from the one of the polynucleotide variant may have “non-conservative changes, e.g., replace sequences in the Sequence Listing, or a complementary ment of a glycine with a tryptophan. Similar minor variations sequence, that encodes a functionally equivalent polypeptide may also include amino acid deletions or insertions, or both. (i.e., a polypeptide having some degree of equivalent or simi Related polypeptides may comprise, for example, additions lar biological activity) but differs in sequence from the 65 and/or deletions of one or more N-linked or O-linked glyco sequence in the Sequence Listing, due to degeneracy in the Sylation sites, or an addition and/or a deletion of one or more genetic code. Included within this definition are polymor cysteine residues. Guidance in determining which and how US 8,633,353 B2 11 12 many amino acid residues may be substituted, inserted or FIG. 1, adapted from Daly et al., 2001, FIG. 2, adapted from deleted without abolishing functional or biological activity Ku et al., 2000; and see also Tudge, 2000. may be found using computer programs well known in the art, A “control plant’ as used in the present invention refers to for example, DNASTAR software (see U.S. Pat. No. 5,840, a plant cell, seed, plant component, plant tissue, plant organ or 544 to Hawkins, 1998). whole plant used to compare against transformed, transgenic “Fragment, with respect to a polynucleotide, refers to a or genetically modified plant for the purpose of identifying an clone or any part of a polynucleotide molecule that retains a enhanced phenotype in the transformed, transgenic or geneti usable, functional characteristic. Useful fragments include cally modified plant. A control plant may in some cases be a oligonucleotides and polynucleotides that may be used in transformed or transgenic plant line that comprises an empty 10 vector or marker gene, but does not contain the recombinant hybridization or amplification technologies or in the regula polynucleotide of the present invention that is expressed in tion of replication, transcription or translation. A polynucle the transformed, transgenic or genetically modified plant otide fragment” refers to any Subsequence of a polynucle being evaluated. In general, a control plant is a plant of the otide, typically, of at least about 9 consecutive nucleotides, same line or variety as the transformed, transgenic or geneti preferably at least about 30 nucleotides, more preferably at 15 cally modified plant being tested. A suitable control plant least about 50 nucleotides, of any of the sequences provided would include a genetically unaltered or non-transgenic plant herein. Exemplary polynucleotide fragments are the first of the parental line used to generate a transformed or trans sixty consecutive nucleotides of the polynucleotides listed in genic plant herein. the Sequence Listing. Exemplary fragments also include “Transformation” refers to the transfer of a foreign poly fragments that comprise a region that encodes a conserved B nucleotide sequence into the genome of a host organism Such domain of a polypeptide. Exemplary fragments also include as that of a plant or plant cell. Typically, the foreign genetic fragments that comprise a conserved domain of a polypep material has been introduced into the plant by human manipu tide. lation, but any method can be used as one of skill in the art Fragments may also include Subsequences of polypeptides recognizes. Examples of methods of plant transformation and protein molecules, or a Subsequence of the polypeptide. 25 include Agrobacterium-mediated transformation (De Blaere Fragments may have uses in that they may have antigenic et al., 1987) and biolistic methodology (Klein et al., 1987). potential. In some cases, the fragment or domain is a Subse A “transformed plant', which may also be referred to as a quence of the polypeptide which performs at least one bio “transgenic plant’ or “transformant', generally refers to a logical function of the intact polypeptide in Substantially the plant, a plant cell, plant tissue, seed or calli that has been same manner, or to a similar extent, as does the intact 30 through, or is derived from a plant that has been through, a polypeptide. For example, a polypeptide fragment can com transformation process in which a nucleic acid construct that prise a recognizable structural motif or functional domain contains at leastoneforeign polynucleotide sequence is intro such as a DNA-binding site or domain that binds to a DNA duced into the plant. The nucleic acid construct, which may promoter region, an activation domain, or a domain for pro be an expression vector or expression cassette, a plasmid, or tein-protein interactions, and may initiate transcription. Frag 35 a DNA preparation, contains genetic material that is not found ments can vary in size from as few as 3 amino acid residues to in a wild-type plant of the same species, variety or cultivar. the full length of the intact polypeptide, but are preferably at The genetic material may include a regulatory element, a least about 30 amino acid residues in length and more pref transgene (for example, a foreign transcription factor erably at least about 60 amino acid residues in length. sequence), an insertional mutagenesis event (such as by trans The invention also encompasses production of DNA 40 poson or T-DNA insertional mutagenesis), an activation tag sequences that encode polypeptides and derivatives, or frag ging sequence, a mutated sequence, a homologous recombi ments thereof, entirely by synthetic chemistry. After produc nation event or a sequence modified by chimeraplasty. In tion, the synthetic sequence may be inserted into any of the Some embodiments the regulatory and transcription factor many available expression vectors and cell Systems using sequence may be derived from the host plant, but by their reagents well known in the art. Moreover, synthetic chemistry 45 incorporation into an expression vector of cassette, represent may be used to introduce mutations into a sequence encoding an arrangement of the polynucleotide sequences not found a polypeptides or any fragment thereof. wild-type plant of the same species, variety or cultivar. “Derivative' refers to the chemical modification of a An “untransformed plant' is a plant that has not been nucleic acid molecule or amino acid sequence. Chemical through the transformation process. modifications can include replacement of hydrogen by an 50 A “stably transformed plant, plant cell or plant tissue has alkyl, acyl, or amino group or glycosylation, pegylation, or generally been selected and regenerated on a selection media any similar process that retains or enhances biological activ following transformation. ity or lifespan of the molecule or sequence. A nucleic acid construct such as a plasmid, an expression The term “plant' includes whole plants, shoot vegetative vector or expression cassette typically comprises a polypep organs/structures (for example, leaves, stems and tubers), 55 tide-encoding sequence operably linked (i.e., under regula roots, flowers and floral organs/structures (for example, tory control of) to appropriate inducible or constitutive regu bracts, sepals, petals, stamens, carpels, anthers and ovules), latory sequences that allow for the controlled expression of seed (including embryo, endosperm, and seed coat) and fruit polypeptide. The expression vector or cassette can be intro (the mature ovary), plant tissue (for example, vascular tissue, duced into a plant by transformation or by breeding after ground tissue, and the like) and cells (for example, guard 60 transformation of a parent plant. A plant refers to a whole cells, egg cells, and the like), and progeny of same. The class plantas well as to a plant part, Such as seed, fruit, leaf, or root, of plants that can be used in the method of the invention is plant tissue, plant cells or any other plant material, e.g., a plant generally as broad as the class of higher and lower plants explant, as well as to progeny thereof, and to in vitro systems amenable to transformation techniques, including that mimic biochemical or cellular components or processes angiosperms (monocotyledonous and dicotyledonous 65 in a cell. plants), gymnosperms, ferns, horsetails, psilophytes, lyco “Wildtype' or “wild-type', as used herein, refers to a plant phytes, bryophytes, and multicellular algae (see for example, cell, seed, plant component, plant tissue, plant organ or whole US 8,633,353 B2 13 14 plant that has not been genetically modified or treated in an in at least one gene in the plant or cell, where the disruption experimental sense. Wild-type cells, seed, components, tis results in a reduced expression or activity of the polypeptide Sue, organs or whole plants may be used as controls to com encoded by that gene compared to a control cell. The knock pare levels of expression and the extent and nature of trait out can be the result of for example, genomic disruptions, modification with cells, tissue or plants of the same species in 5 including transposons, tilling, and homologous recombina which a polypeptide's expression is altered, e.g., in that it has tion, antisense constructs, sense constructs, RNA silencing been knocked out, overexpressed, or ectopically expressed. constructs, or RNA interference. AT-DNA insertion within a A “trait” refers to a physiological, morphological, bio gene is an example of a genotypic alteration that may abolish chemical, or physical characteristic of a plant or particular expression of that gene. plant material or cell. In some instances, this characteristic is 10 “Ectopic expression' or “altered expression' in reference visible to the human eye, Such as seed or plant size, or can be to a polynucleotide indicates that the pattern of expression in, measured by biochemical techniques, such as detecting the e.g., a transformed or transgenic plant or plant tissue, is dif protein, starch, or oil content of seed or leaves, or by obser ferent from the expression pattern in a wild-type plant or a Vation of a metabolic or physiological process, e.g. by mea reference plant of the same species. The pattern of expression Suring tolerance to water deficit or particular salt or Sugar 15 may also be compared with a reference expression pattern in concentrations, or by the observation of the expression level a wild-type plant of the same species. For example, the poly of a gene or genes, e.g., by employing Northern analysis, nucleotide or polypeptide is expressed in a cell or tissue type RT-PCR, microarray gene expression assays, or reporter gene other than a cell or tissue type in which the sequence is expression systems, or by agricultural observations such as expressed in the wild-type plant, or by expression at a time increased cold or water deficit tolerance oran increased yield. 20 other than at the time the sequence is expressed in the wild Any technique can be used to measure the amount of com type plant, or by a response to different inducible agents. Such parative level of, or difference in any selected chemical com as hormones or environmental signals, or at different expres pound or macromolecule in the transformed or transgenic sion levels (either higher or lower) compared with those plants, however. found in a wild-type plant. The term also refers to altered “Trait modification” refers to a detectable difference in a 25 expression patterns that are produced by lowering the levels characteristic in a plant ectopically expressing a polynucle of expression to below the detection level or completely abol otide or polypeptide of the present invention relative to a plant ishing expression. The resulting expression pattern can be not doing so. Such as a wild-type plant. In some cases, the trait transient or stable, constitutive or inducible. In reference to a modification can be evaluated quantitatively. For example, polypeptide, the terms “ectopic expression” or “altered the trait modification can entail at least about a 2% increase or 30 expression further may relate to altered activity levels result decrease, or an even greater difference, in an observed trait as ing from the interactions of the polypeptides with exogenous compared with a control or wild-type plant. It is known that or endogenous modulators or from interactions with factors there can be a natural variation in the modified trait. There oras a result of the chemical modification of the polypeptides. fore, the trait modification observed entails a change of the The term “overexpression' as used herein refers to a normal distribution and magnitude of the trait in the plants as 35 greater expression level of a gene in a plant, plant cell or plant compared to control or wild-type plants. tissue, compared to expression in a wild-type plant, cell or When two or more plants have “similar morphologies, tissue, at any developmental or temporal stage for the gene. “Substantially similar morphologies”, “a morphology that is Overexpression can occur when, for example, the genes substantially similar, or are “morphologically similar, the encoding one or more polypeptides are under the control of a plants have comparable forms or appearances, including 40 strong promoter (e.g., the cauliflower mosaic virus 35S tran analogous features such as overall dimensions, height, width, scription initiation region (SEQ ID NO: 107). Overexpres mass, root mass, shape, glossiness, color, stem diameter, leaf sion may also under the control of an inducible or tissue size, leaf dimension, leaf density, internode distance, branch specific promoter. The choice of inducible or tissue-specific ing, root branching, number and form of inflorescences, and promoters may include, for example, the LTP1 epidermal other macroscopic characteristics, and the individual plants 45 specific promoter (SEQ ID NO: 108), the SUC2 vascular are not readily distinguishable based on morphological char specific promoter (SEQ ID NO: 109), the ARSK1 root-spe acteristics alone. cific promoter (SEQ ID NO: 110), the RD29A stress “Modulates' refers to a change in activity (biological, inducible promoter (SEQID NO: 111), the AS1 emergent leaf chemical, or immunological) or lifespan resulting from spe primordia-specific promoter (SEQ ID NO: 112), or the cific binding between a molecule and either a nucleic acid 50 RBCS3 leaf- or photosynthetic-tissue specific-promoter molecule or a protein. (SEQID NO: 113). Many of these promoters have been used The term “transcript profile” refers to the expression levels with polynucleotide sequences of the invention to produce of a set of genes in a cell in a particular state, particularly by transgenic plants. These or other inducible or tissue-specific comparison with the expression levels of that same set of promoters may be incorporated into a nucleic acid construct genes in a cell of the same type in a reference state. For 55 comprising a transcription factor polynucleotide of the inven example, the transcript profile of a particular polypeptide in a tion, where the promoter is operably linked to the transcrip Suspension cell is the expression levels of a set of genes in a tion factor polynucleotide, can be envisioned and produced. cell knocking out or overexpressing that polypeptide com Thus, overexpression may occur throughout a plant, in spe pared with the expression levels of that same set of genes in a cific tissues of the plant, or in the presence or absence of suspension cell that has normal levels of that polypeptide. The 60 particular environmental signals, depending on the promoter transcript profile can be presented as a list of those genes used. whose expression level is significantly different between the Overexpression may take place in plant cells normally two treatments, and the difference ratios. Differences and lacking expression of polypeptides functionally equivalent or similarities between expression levels may also be evaluated identical to the present polypeptides. Overexpression may and calculated using statistical and clustering methods. 65 also occur in plant cells where endogenous expression of the With regard to gene knockouts as used herein, the term present polypeptides or functionally equivalent molecules “knockout” refers to a plant or plant cell having a disruption normally occurs, but Such normal expression is at a lower US 8,633,353 B2 15 16 level. Overexpression thus results in a greater than normal found in the wild-type cultivar or strain. Plants may then be production, or “overproduction' of the polypeptide in the selected for those that produce the most desirable degree of plant, cell or tissue. over- or under-expression of target genes of interest and coin The term “transcription regulating region” refers to a DNA cident trait improvement. regulatory sequence that regulates expression of one or more The sequences of the present invention may be from any genes in a plant when a transcription factor having one or species, particularly plant species, in a naturally occurring more specific binding domains binds to the DNA regulatory form or from any source whether natural, synthetic, semi sequence. Transcription factors possess at least one con synthetic or recombinant. The sequences of the invention may served domain. The transcription factors also comprise an also include fragments of the present amino acid sequences. amino acid Subsequence that forms a transcription activation 10 Where “amino acid sequence” is recited to refer to an amino domain that regulates expression of one or more genes that acid sequence of a naturally occurring protein molecule, modulate tolerance to cold or water deficit in a plant when the transcription factor binds to the regulating region. “amino acid sequence' and like terms are not meant to limit “Yield' or “plant yield’ refers to increased plant growth, the amino acid sequence to the complete native amino acid increased crop growth, increased biomass, and/or increased 15 sequence associated with the recited protein molecule. plant product production, and is dependent to Some extent on In addition to methods for modifying a plant phenotype by temperature, plant size, organ size, planting density, light, employing one or more polynucleotides and polypeptides of water and nutrient availability, and how the plant copes with the invention described herein, the polynucleotides and various stresses, such as through temperature acclimation and polypeptides of the invention have a variety of additional water or nutrient use efficiency. uses. These uses include their use in the recombinant produc “Planting density' refers to the number of plants that can be tion (i.e., expression) of proteins; as regulators of plant gene grown per acre. For crop species, planting or population den expression, as diagnostic probes for the presence of comple sity varies from a crop to a crop, from one growing region to mentary or partially complementary nucleic acids (including another, and from year to year. Using corn as an example, the for detection of natural coding nucleic acids); as Substrates average prevailing density in 2000 was in the range of 20,000 25 for further reactions, e.g., mutation reactions, PCR reactions, 25,000 plants per acre in Missouri, USA. A desirable higher or the like; as Substrates for cloning e.g., including digestion population density (a measure of yield) would be at least or ligation reactions; and for identifying exogenous or endog 22,000 plants per acre, and a more desirable higher popula enous modulators of the transcription factors. The polynucle tion density would be at least 28,000 plants per acre, more otide can be, e.g., genomic DNA or RNA, a transcript (such as preferably at least 34,000 plants per acre, and most preferably 30 an mRNA), a cDNA, a PCR product, a cloned DNA, a syn at least 40,000 plants per acre. The average prevailing densi thetic DNA or RNA, or the like. The polynucleotide can ties per acre of a few other examples of crop plants in the USA comprise a sequence in either sense or antisense orientations. in the year 2000 were: wheat 1,000,000-1,500,000; rice 650, Expression of genes that encode polypeptides that modify 000-900,000; soybean 150,000-200,000, canola 260,000 expression of endogenous genes, polynucleotides, and pro 350,000, sunflower 17,000-23,000 and cotton 28,000-55,000 35 plants per acre (U.S. Pat. Appl. No. 20030101479, Cheikh et teins are well known in the art. In addition, transformed or al., 2003). A desirable higher population density for each of transgenic plants comprising isolated polynucleotides encod these examples, as well as other valuable species of plants, ing transcription factors may also modify expression of would be at least 10% higher than the average prevailing endogenous genes, polynucleotides, and proteins. Examples density or yield. 40 include Peng et al., 1997 and Peng et al., 1999. In addition, many others have demonstrated that an Arabidopsis tran DESCRIPTION OF THE SPECIFIC Scription factor expressed in an exogenous plant species elic EMBODIMENTS its the same or very similar phenotypic response. See, for example, Fu et al., 2001; Nandiet al., 2000; Coupland, 1995: Transcription Factors Modify Expression of Endogenous 45 and Weigel and Nilsson, 1995). Genes In another example, Mandelet al., 1992b, and Suzuki et al., A transcription factor may include, but is not limited to, 2001, teach that a transcription factor expressed in another any polypeptide that can activate or repress transcription of a plant species elicits the same or very similar phenotypic single gene or a number of genes. As one of ordinary skill in response of the endogenous sequence, as often predicted in the art recognizes, transcription factors can be identified by 50 earlier studies of Arabidopsis transcription factors in Arabi the presence of a region or domain of structural similarity or dopsis (see Mandel et al., 1992a: Suzuki et al., 2001). Other identity to a specific consensus sequence or the presence of a examples include Müller et al., 2001; Kim et al., 2001; Kyo specific consensus DNA-binding motif (see, for example, Zuka and Shimamoto, 2002; Boss and Thomas, 2002; He et Riechmann et al., 2000a). The plant transcription factors of al., 2000; and Robson et al., 2001. the present invention are transcription factors. 55 In yet another example, Gilmour et al., 1998, teach an Generally, transcription factors are involved in cell differ Arabidopsis AP2 transcription factor, CBF1, which, when entiation and proliferation and the regulation of growth. overexpressed in transgenic plants, increases plant freezing Accordingly, one skilled in the art would recognize that by tolerance. Jaglo et al., 2001, further identified sequences in expressing the present sequences in a plant, one may change Brassica napus which encode CBF-like genes and that tran the expression of autologous genes or induce the expression 60 Scripts for these genes accumulated rapidly in response to low of introduced genes. By affecting the expression of similar temperature. Transcripts encoding CBF-like proteins were autologous sequences in a plant that have the biological activ also found to accumulate rapidly in response to low tempera ity of the present sequences, or by introducing the present ture in wheat, as well as in tomato. An alignment of the CBF sequences into a plant, one may alter a plants phenotype to proteins from Arabidopsis, B. napus, wheat, rye, and tomato one with improved traits related to Improved cold or water 65 revealed the presence of conserved consecutive amino acid deficit tolerance. The sequences of the invention may also be residues, PKK/RPAGRXKFXETRHP (SEQID NO: 115) and used to transform a plant and introduce desirable traits not DSAWR (SEQID NO: 116), which bracket the AP2/EREBP US 8,633,353 B2 17 18 DNA binding domains of the proteins and distinguish them cially available kit according to the manufacturers instruc from other members of the AP2/EREBP protein family. (Ja tions. Where necessary, multiple rounds of RACE are glo et al., 2001). performed to isolate 5' and 3' ends. The full-length cDNA was Transcription factors mediate cellular responses and con then recovered by a routine end-to-end polymerase chain trol traits through altered expression of genes containing cis reaction (PCR) using primers specific to the isolated 5' and 3 acting nucleotide sequences that are targets of the introduced ends. Exemplary sequences are provided in the Sequence transcription factor. It is well appreciated in the art that the Listing. effect of a transcription factor on cellular responses or a The HAP3/NF-YB Sub-Group of the CCAATBinding Factor cellular trait is determined by the particular genes whose Family expression is either directly or indirectly (e.g., by a cascade of 10 G481 (AT2G38880; also known as HAP3A and NF-YB1) transcription factor binding events and transcriptional from Arabidopsis is a member of the HAP3/NF-YB sub changes) altered by transcription factor binding. In a global group of the CCAAT binding factor family (CCAAT) of analysis of transcription comparing a standard condition with transcription factors. This gene and related sequences were one in which a transcription factor is overexpressed, the selected for further study based on our original finding of resulting transcript profile associated with transcription fac 15 hyperosmotic stress conferred by 35S::G481 lines, (SEQ ID tor overexpression is related to the trait or cellular process NOs: 1-2), and salt tolerance conferred by constitutive over controlled by that transcription factor. For example, the PAP2 expression of its paralog G482 (SEQID NOS: 27-28). Thus, gene and other genes in the MYB family have been shown to the goal of the research program described herein was to control anthocyanin biosynthesis through regulation of the determine the extent to which other proteins from the CCAAT expression of genes known to be involved in the anthocyanin family, both in Arabidopsis and other plant species, have biosynthetic pathway (Bruce et al., 2000; and Borevitz et al., similar functions. 2000). Further, global transcript profiles have been used suc The CCAAT Family Members Under Study cessfully as diagnostic tools for specific cellular states (e.g., Transcriptional regulation of most eukaryotic genes occurs cancerous vs. non-cancerous; Bhattacharjee et al., 2001; and through the binding of transcription factors to sequence spe Xu et al., 2001). Consequently, it is evident to one skilled in 25 cific binding sites in their promoter regions. Many of these the art that similarity of transcript profile upon overexpres protein binding sites have been conserved through evolution sion of different transcription factors would indicate similar and are found in the promoters of diverse eukaryotic organ ity of transcription factor function. isms. One element that shows a high degree of conservation is Polypeptides and Polynucleotides of the Invention the CCAAT-box (Gelinas et al., 1985). The CCAAT family of The present invention includes transcription factors (TFs), 30 transcription factors, also be referred to as the “CAAT', and isolated or recombinant polynucleotides encoding the “CAAT-box” “CCAAT-box binding,” “HAP" or “NF-Y” polypeptides, or novel sequence variant polypeptides or poly family, are characterized by their ability to bind to the nucleotides encoding novel variants of polypeptides derived CCAAT-box element located 80 to 300 bp 5' from a transcrip from the specific sequences provided in the Sequence Listing: tion start site (Gelinas et al., 1985). This cis-acting regulatory the recombinant polynucleotides of the invention may be 35 element is found in all eukaryotic species and present in the incorporated in nucleic acid constructs for the purpose of promoter and enhancer regions of approximately 30% of producing transformed plants. Also provided are methods for genes (Bucher and Trifonov, 1988: Bucher, 1990). The ele accelerating the time for a plant to flower, as compared to a ment can function in either orientation, and operates alone, or control plant. These methods are based on the ability to alter in possible cooperation with other cis regulatory elements the expression of critical regulatory molecules that may be 40 (Tasanen et al., 1992). conserved between diverse plant species. Related conserved The present invention pertains to a nucleic acid construct regulatory molecules may be originally discovered in a model that comprises a recombinant polynucleotide encoding a system such as Arabidopsis and homologous, functional mol HAP3 (or NF-YB) protein found within the CCAAT family ecules then discovered in other plant species. The latter may transcription factor family. HAP3 proteins contain an amino then be used to confer improved water deficit and or/cold 45 terminal A domain, a central B domain and a carboxy-termi tolerance in diverse plant species. nal C domain. Outside of the B domain, non-paralogous Exemplary polynucleotides encoding the polypeptides of Arabidopsis HAP-3 like proteins are quite divergent in the invention were identified in the Arabidopsis thaliana sequence and in overall length. It is therefore reasonable to GenBank database using publicly available sequence analy assume that the A and C domains may provide a degree of sis programs and parameters. Sequences initially identified 50 functional specificity to each member of the HAP3 subfamily. were then further characterized to identify sequences com However, as shown in the presentanalysis, sequences that are prising specified sequence strings corresponding to sequence closely related to one of the transcription factors presently motifs present infamilies of known polypeptides. In addition, being studied, G481 or SEQID NO: 2, have similar functions further exemplary polynucleotides encoding the polypeptides in plants when the sequences are overexpressed. of the invention were identified in the plant GenBank data 55 Structural Features and Assembly of the NF-Y Subunits. base using publicly available sequence analysis programs and NF-Y is one of the most heavily studied transcription factor parameters. Sequences initially identified were then further complexes and an extensive literature has accumulated characterized to identify sequences comprising specified regarding its structure, regulation, and putative roles in Vari sequence Strings corresponding to sequence motifs present in ous different organisms. Each of the three subunits comprises families of known polypeptides. 60 a region which has been evolutionarily conserved (Li et al., Additional polynucleotides of the invention were identi 1992; Mantovani, 1999). In the NF-YA subunits, this con fied by Screening Arabidopsis thaliana and/or other plant served region is at the C-terminus, in the NF-YB proteins it is cDNA libraries with probes corresponding to known centrally located, and in the NF-YC subunits it is at the polypeptides under low stringency hybridization conditions. N-terminus The NF-YB and NF-YC Subunits bear Some simi Additional sequences, including full length coding 65 larity to histones; the conserved regions of both these subunits sequences, were Subsequently recovered by the rapid ampli contain a “histone fold motif (HFM), which is an ancient fication of cDNA ends (RACE) procedure using a commer domain of about 65 amino acids. The HFM has a high degree US 8,633,353 B2 19 20 of structural conservation across all histones and comprises ing in this field. These genes, LEAFY COTYLEDON 1 three or four C.-helices (four in the case of the NF-Y subunits) (LEC1, G620) and LEAFY COTYLEDON 1-LIKE (L1L, which are separated by short loops (L)/Strand regions (Arents G1821) have critical roles in embryo development and seed and Moudrianakis, 1995). In the histones, this HFM domain maturation (Lotan et al., 1998; Kwong et al., 2003) and mediates dimerization and formation of non sequence-spe 5 encode proteins of the HAP3 (NF-YB) class. LEC1 has mul cific interactions with DNA (Arents and Moudrianakis, tiple roles in and is critical for normal development during 1995). both the early and late phases of embryogenesis (Meinke, Comprised within the larger highly conserved B domain 1992: Meinkeet al., 1994; Westet al., 1994; Parcy et al., 1997: (Lee et al., 2003) of the HAP3-like NF-YB transcription Vicient et al., 2000). Mutant lec1 embryos have cotyledons factors is found a “conserved protein-protein and DNA-bind 10 that exhibit leaf-like characteristics such as trichomes. The ing interaction module' within their “histone fold motif or “HFM” (Gusmaroli et al., 2002: Romier et al. (2003). The gene is required to maintain suspensor cell fate and to specify HFM, is thus “specific and required for HAP function” (Ed cotyledon identity in the early morphogenesis phase. wards et al., 1998), comprising two highly conserved sub Through overexpression studies, LEC1 activity has been domains within the B domain that are responsible for both 15 shown sufficient to initiate embryo development in vegetative subunit interaction and binding DNA. The B domain of HAP3 cells (Lotan et al., 1998). Additionally, lec1 mutant embryos transcription factors is necessary and Sufficient for the activ are desiccation intolerant and cannot survive seed dry-down ity of another related HAP3 protein, LEC1 (Lee et al., 2003). (but can be artificially rescued in the laboratory). This phe According to Gusmaroli et al., 2002, “all residues that con notype reflects a role for LEC1 at later stages of seed matu stitute the backbone structure of the HFMs are conserved, and ration; the gene initiates and/or maintains the maturation residues such as AtNF-YB-10N38, K58 and Q62, involved in phase, prevents precocious germination, and is required for CCAAT-binding, and E67 and E75, involved in NF-YA asso acquisition of desiccation tolerance during seed maturation. ciation (Maity and de Crombrugghe, 1998; Zemzoumi et al., L1L appears to be a paralog of, and partially redundant with 1999), are maintained. These conserved residues constitut LEC 1. Like LEC1, L1L is expressed during embryogenesis, ing the backbone structure of the HFMs are shown in the 25 and genetic studies have demonstrated that L1L can comple boxes that appear in FIGS. 4A and 4B and are represented by ment lec1 mutants (Kwong et al., 2003). A number of groups SEQID NO: 114. have now begun to elaborate details of the mechanism of Plant CCAAT binding transcription factors potentially LEC1 action in embryonic cell fate control. For example, bind DNA as heterotrimers composed of HAP2-like, HAP3 Casson and Lindsey, 2006, proposed that LEC1 requires like and HAP5-like subunits. The heterotrimer is also refer 30 auxin and Sucrose to promote cell division and embryonic enced in the public literature as Nuclear Factor Y (NF-Y), differentiation. which comprises an NF-YA subunit (corresponding to the Role of Non-LEC1-Like Proteins: HAP3 (NF-YB) Genes HAP2-like subunit), an NF-YB subunit (corresponding to the Affecting Chloroplast Development in Rice HAP3-like subunit) and an NF-YC subunit (corresponding to The first functional analysis of non-LEC1 like HAP3 pro the HAP5-like subunit) (Mantovani, 1999; Gusmaroli et al., 35 teins has now also been published (Miyoshi et al., 2003). 2001; Gusmaroli et al., 2002). All subunits contain regions These authors used antisense and RNAi approaches to reduce that are required for DNA binding and subunit association. the activity of three non-LEC1-like HAP3 proteins from rice The subunit proteins appear to lack activation domains; there (OshAP3A, OshAP3B and OsHAP3C). These genes are fore, that function must come from proteins with which they ubiquitously expressed under normal conditions, and repres interact on target promoters. No proteins that provide the 40 sion of these transcripts using antisense and RNAi techniques activation domain function for CCAAT binding factors have resulted in plants that were pale in coloration. This phenotype been confirmed in plants, although a pair of recent publica was found to be associated with aberrant plastid development tions implicate CCT-domain containing proteins as having (both chloroplasts and amyloplasts were affected) and a such a role (Ben-Naim et al., 2006; Wenkel et al., 2006). In decrease in the expression of nuclear encoded photosynthesis yeast, however, the HAP4 protein provides the primary acti 45 related genes such as RBCS3 and CAB. However, it remains vation domain (McNabb et al., 1995; Olesen and Guarente, to be determined whether these phenotypes were due to the 1990). HAP3-like proteins having a direct role in the regulation of Like their mammalian counterparts, plant CCAAT binding Such genes and in chloroplast development, or whether the factors most likely bind DNA as heterotrimers composed of defects seen in the experimental plants were secondary HAP2-like, HAP3-like and HAP5-like subunits. All subunits 50 effects relating from deficiencies in some other process. contain regions that are required for DNA binding and Subunit HAP3 (NF-YB) proteins have a modular structure and are association. However, regions that might have an activation comprised of three distinct domains: an amino-terminal A domain function are less apparent than in the mammalian domain, a central B domain and a carboxy-terminal C proteins, where Q-rich regions within the HAP2 and HAP5 domain. There is very little sequence similarity between subunits are thought to fulfill such a role. Nonetheless, some 55 HAP3 proteins within the A and C domains suggesting that of the HAP2 and HAP5 class proteins that we have identified those regions could provide a degree of functional specificity do have Q-rich regions within the N and C-termini. However, to each member of the HAP3 subfamily. The B domain is a these regions have not been confirmed yet as having Such highly conserved region that specifies DNA binding and Sub activation domain properties. In yeast, the HAP4 protein pro unit association. Lee et al., 2003, performed an elegant series vides the primary activation domain (McNabb et al., 1995; 60 of domain swap experiments between the LEC1 and a non Olesen and Guarente, 1990). Although, HAP4 subunits do not LEC1 like HAP3 protein (Atag14540, G485) to demonstrate exist in plants, it is possible that the activation domain func that the B domain of LEC1 is necessary and sufficient, within tion is provided by some other factor that interacts with the the context of the rest of the protein, to confer its activity in HAP complex. embryogenesis. Furthermore, these authors identified a spe Role of LEC1-Like Proteins 65 cific defining residue within the B domain (Asp-55) that is The functions of at least two of the Arabidopsis CCAAT required for LEC1 activity and which is sufficient to confer box genes have been genetically determined by others work LEC1 function to a non-LEC1 like B domain. US 8,633,353 B2 21 22 There is very little sequence similarity between HAP3 and its homologs. Such trees are distinct from clusters and proteins in the A and C domains; it is therefore reasonable to other means of characterizing sequence similarity because assume that the A and C domains could provide a degree of they are inferred by techniques that help convert patterns of functional specificity to each member of the HAP3 subfamily. similarity into evolutionary relationships. . . . After the gene The B domain is the conserved region that specifies DNA tree is inferred, biologically determined functions of the vari binding and subunit association. ous homologs are overlaid onto the tree. Finally, the structure Phylogenetic trees based on sequential relatedness of the of the tree and the relative phylogenetic positions of genes of HAP3 genes are shown in FIG. 3. The present invention different functions are used to trace the history of functional encompasses the G482 subclade within the non-LEC1-like changes, which is then used to predict functions of as yet clade of HAP3 (NF-YB) proteins, for which a representative 10 uncharacterized genes' (Eisen, 1998). number of monocot and eudicot species, including members Within a single plant species, gene duplication may cause from eudicot and monocot species, have been shown to confer increased cold or water deficit tolerance in plants when over two copies of a particular gene, giving rise to two or more expressed (shown in Tables 1-3 in Example IV). genes with similar sequence and often similar function known In FIGS. 4A and 4B, the B domains of G482 subclade 15 as paralogs. A paralog is therefore a similar gene formed by members of the HAP3 polypeptides from Arabidopsis, soy duplication within the same species. Paralogs typically clus bean, rice, and corn are aligned with the similar domain of ter together or in the same clade (a group of similar genes) G481. The B domains of the sequences in this non-LEC1-like when a gene family phylogeny is analyzed using programs “G482 subclade are generally distinguished by the con such as CLUSTAL (Thompson et al., 1994; Higgins et al., served residues within the HFM and larger B domain com 1996). Groups of similar genes can also be identified with prised within the subsequence SEQID NO: 114: pair-wise BLAST analysis (Feng and Doolittle, 1987). For example, a clade of very similar MADS domain transcription factors from Arabidopsis all share a common function in Asn (Xaa) 1926 Llys (Xaa) Glin (Xaa) Glu (Xaa) 7 flowering time (Ratcliffe et al., 2001), and a group of very 25 similar AP2 domain transcription factors from Arabidopsis Glu are involved in tolerance of plants to freezing (Gilmour et al., where Xaa can be any amino acid residue. Within the G482 1998). Analysis of groups of similar genes with similar func subclade, the A and C domains (not shown in FIGS. 4A and tion that fall within one clade can yield Sub-sequences that are 4B) are more variable than the B domain in both length and particular to the clade. These Sub-sequences, known as con sequence identity. SEQ ID NOs of the sequences listed in 30 sensus sequences, can not only be used to define the FIGS. 4A and 4B are found with the parentheses. sequences within each clade, but define the functions of these Overexpression of the G482 subclade polypeptides com genes; genes within a clade may contain paralogous prising a central conserved domain containing this Subse sequences, or orthologous sequences that share the same quence have been shown to confer increased cold and/or function (see also, for example, Mount, 2001). water deficit tolerance in transgenic plants, as compared to a 35 Transcription factor gene sequences are conserved across non-transformed plant that does not overexpress the polypep diverse eukaryotic species lines (Goodrich et al., 1993: Linet tide. al., 1991; Sadowski et al., 1988). Plants are no exception to Orthologs and Paralogs this observation; diverse plant species possess transcription Homologous sequences as described above can comprise factors that have similar sequences and functions. Speciation, orthologous or paralogous sequences. Several different meth 40 the production of new species from a parental species, gives ods are known by those of skill in the art for identifying and rise to two or more genes with similar sequence and similar defining these functionally homologous sequences. General function. These genes, termed orthologs, often have an iden methods for identifying orthologs and paralogs, including tical function within their host plants and are often inter phylogenetic methods, sequence similarity and hybridization changeable between species without losing function. methods, are described herein; an ortholog or paralog, includ 45 Because plants have common ancestors, many genes in any ing equivalogs, may be identified by one or more of the plant species will have a corresponding orthologous gene in methods described below. another plant species. Once a phylogenic tree for a gene As described by Eisen, 1998, evolutionary information family of one species has been constructed using a program may be used to predict gene function. It is common for groups such as CLUSTAL (Thompson et al., 1994; Higgins et al., of genes that are homologous in sequence to have diverse, 50 1996) potential orthologous sequences can be placed into the although usually related, functions. However, in many cases, phylogenetic tree and their relationship to genes from the the identification of homologs is not sufficient to make spe species of interest can be determined. Orthologous sequences cific predictions because not all homologs have the same can also be identified by a reciprocal BLAST strategy. Once function. Thus, an initial analysis of functional relatedness an orthologous sequence has been identified, the function of based on sequence similarity alone may not provide one with 55 the ortholog can be deduced from the identified function of a means to determine where similarity ends and functional the reference sequence. relatedness begins. Fortunately, it is well known in the art that By using a phylogenetic analysis, one skilled in the art protein function can be classified using phylogenetic analysis would recognize that the ability to predict similar functions of gene trees combined with the corresponding species. Func conferred by closely-related polypeptides is predictable. This tional predictions can be greatly improved by focusing on 60 predictability has been confirmed by our own many studies in how the genes became similar in sequence (i.e., by evolution which we have found that a wide variety of polypeptides have ary processes) rather than on the sequence similarity itself orthologous or closely-related homologous sequences that (Eisen, 1998). In fact, many specific examples exist in which function as does the first, closely-related reference sequence. gene function has been shown to correlate well with gene For example, distinct transcription factors, including: phylogeny (Eisen, 1998). Thus, “the first step in making 65 (i) AP2 family Arabidopsis G47 (found in U.S. Pat. No. functional predictions is the generation of a phylogenetic tree 7,135,616 to Heard et al., 2006), a phylogenetically-related representing the evolutionary history of the gene of interest sequence from soybean, and two phylogenetically-related US 8,633,353 B2 23 24 homologs from rice all can confer greater tolerance to the known consensus sequence or consensus DNA-binding drought, hyperosmotic stress, or delayed flowering as com site, or to a B domain (e.g., SEQID NOs: 47-69) of a sequence pared to control plants; of the invention, said B domain being required for DNA (ii) Myb-related Arabidopsis G682 (found in U.S. Pat. No. binding and subunit association. 7,223,904 to Heard et al., 2006) and numerous phylogeneti Percent identity can be determined electronically, e.g., by cally-related sequences from eudicots and monocots can con using the MEGALIGN program (DNASTAR, Inc. Madison, fer greater tolerance to heat, drought-related stress, cold, and Wis.). The MEGALIGN program can create alignments salt as compared to control plants; between two or more sequences according to different meth (iii) WRKY family Arabidopsis G1274 (found in U.S. Pat. ods, for example, the clustal method (see, for example, Hig No. 7,196,245 to Jiang et al., 2006) and numerous closely 10 gins and Sharp, 1988. The clustal algorithm groups sequences related sequences from eudicots and monocots have been into clusters by examining the distances between all pairs. shown to confer increased water deprivation tolerance, and The clusters are aligned pairwise and then in groups. Other (iv) AT-hook family soy sequence G3456 (found in U.S. alignment algorithms or programs may be used, including Pat. Appl. No. 2004.0128712, Jiang et al., 2004) and numer FASTA, BLAST, or ENTREZ, FASTA and BLAST, and ous phylogenetically-related sequences from eudicots and 15 which may be used to calculate percent similarity. These are monocots, increased biomass compared to control plants available as a part of the GCG sequence analysis package when these sequences are overexpressed in plants. (University of Wisconsin, Madison, Wis.), and can be used The polypeptides sequences belong to distinct clades or with or without default settings. ENTREZ is available subclades of polypeptides that include members from diverse through the National Center for Biotechnology Information. species. Many of the G482 subclade member sequences In one embodiment, the percent identity of two sequences can derived from both eudicots and monocots that have been be determined by the GCG program with a gap weight of 1. introduced into plants have been shown to confer greater e.g., each amino acid gap is weighted as if it were a single tolerance to cold and/or water deficit conditions relative to amino acid or nucleotide mismatch between the two control plants when the sequences were overexpressed, sequences (see U.S. Pat. No. 6,262,333 to Endege, 2001). including most of the sequences shown in Table 2. These 25 Software for performing BLAST analyses is publicly studies each demonstrate, in accord with the teachings of available, e.g., through the National Center for Biotechnol Goodrich et al., 1993, Lin et al., 1991, and Sadowski et al., ogy Information (see internet website at www.ncbi.nlm.nih 1988, that evolutionarily conserved genes from diverse spe .gov/). This algorithm involves first identifying high scoring cies are likely to function similarly (i.e., by regulating similar sequence pairs (HSPs) by identifying short words of length W target sequences and controlling the same traits), and that 30 in the G3876 sequence, which either match or satisfy some polynucleotides from one species may be transformed into positive-valued threshold score T when aligned with a word closely-related or distantly-related plant species to confer or of the same length in a database sequence. T is referred to as improve traits. the neighborhood word score threshold (Altschul, 1990: Alts At the nucleotide level, the sequences of the invention will chulet al., 1993). These initial neighborhood word hits act as typically share at least about 30% or 40% nucleotide 35 seeds for initiating searches to find longer HSPs containing sequence identity, preferably at least about 50%, about 60%, them. The word hits are then extended in both directions about 70% or about 80% sequence identity, and more prefer along each sequence for as far as the cumulative alignment ably about 85%, about 90%, about 95% or about 97% or more score can be increased. Cumulative scores are calculated sequence identity to one or more of the listed full-length using, for nucleotide sequences, the parameters M (reward sequences, or to a region of a listed sequence excluding or 40 score for a pair of matching residues; always-0) and N (pen outside of the region(s) encoding a known consensus alty score for mismatching residues; always.<0). For amino sequence or consensus DNA-binding site, or outside of the acid sequences, a scoring matrix is used to calculate the region(s) encoding one or all conserved domains. The degen cumulative score. Extension of the word hits in each direction eracy of the genetic code enables major variations in the are halted when: the cumulative alignment score falls off by nucleotide sequence of a polynucleotide while maintaining 45 the quantity X from its maximum achieved value; the cumu the amino acid sequence of the encoded protein. lative score goes to Zero or below, due to the accumulation of At the polypeptide level, the sequences of the invention one or more negative-scoring residue alignments; or the end will typically share at least about 42%, at least about 50%, at of either sequence is reached. The BLAST algorithm param least about 51%, at least about 52%, at least about 53%, at eters W. T. and X determine the sensitivity and speed of the least about 54%, at least about 55%, at least about 56%, at 50 alignment. The BLASTN program (for nucleotide least about 57%, at least about 58%, at least about 59%, at sequences) uses as defaults a wordlength (W) of 11, an expec least about 60%, at least about 61%, at least about 62%, at tation (E) of 10, a cutoff of 100, M-5, N=-4, and a compari least about 63%, at least about 64%, at least about 65%, at son of both strands. For amino acid sequences, the BLASTP least about 66%, at least about 67%, at least about 68%, at program uses as defaults a wordlength (W) of 3, an expecta least about 69%, at least about 70%, at least about 71%, at 55 tion (E) of 10, and the BLOSUM62 scoring matrix (see Heni least about 72%, at least about 73%, at least about 74%, at koff & Henikoff, 1989). Unless otherwise indicated for com least about 75%, at least about 76%, at least about 77%, at parisons of predicted polynucleotides, “sequence identity” least about 78%, at least about 79%, at least about 80%, at refers to the '% sequence identity generated from a thiastx least about 81%, at least about 82%, at least about 83%, at using the NCBI version of the algorithm at the default settings least about 84% sequence identity, at least about 85%, at least 60 using gapped alignments with the filter "off (see, for about 86%, at least about 87%, at least about 88%, at least example, internet website at www.ncbi.nlm.nih.gov/). about 89%, at least about 90%, at least about 91%, at least Other techniques for alignment are described by Doolittle, about 92%, at least about 93%, at least about 94%, at least 1996. Preferably, an alignment program that permits gaps in about 95%, at least about 96%, at least about 97%, at least the sequence is utilized to align the sequences. The Smith about 98%, at least about 99% or about 100% amino acid 65 Waterman is one type of algorithm that permits gaps in sequence identity to one or more of the listed full-length sequence alignments (see Shpaer, 1997). Also, the GAP pro sequences, or to a listed sequence but excluding or outside of gram using the Needleman and Wunsch alignment method US 8,633,353 B2 25 26 can be utilized to align sequences. An alternative search strat Furthermore, methods using manual alignment of egy uses MPSRCH software, which runs on a MASPAR sequences similar or homologous to one or more polynucle computer. MPSRCH uses a Smith-Waterman algorithm to otide sequences or one or more polypeptides encoded by the score sequences on a massively parallel computer. This polynucleotide sequences may be used to identify regions of approach improves ability to pick up distantly related similarity and conserved domains characteristic of a particu matches, and is especially tolerant of Small gaps and nucle lar transcription factor family. Such manual methods are otide sequence errors. Nucleic acid-encoded amino acid well-known to those with skill in the art and can include, for sequences can be used to search both protein and DNA data example, comparisons of tertiary structure between a bases. polypeptide sequence encoded by a polynucleotide that com The percentage similarity between two polypeptide 10 prises a known function and a polypeptide sequence encoded sequences, e.g., sequence A and sequence B, is calculated by by a polynucleotide sequence that has a function not yet dividing the length of sequence A, minus the number of gap determined. Such examples of tertiary structure may com residues in sequence A, minus the number of gap residues in prise predicted alpha helices, beta-sheets, amphipathic heli sequence B, into the Sum of the residue matches between ces, leucine Zipper motifs, Zinc finger motifs, proline-rich sequence A and sequence B, times one hundred. Gaps of low 15 regions, cysteine repeat motifs, and the like. or of no similarity between the two amino acid sequences are Orthologs and paralogs of presently disclosed polypep not included in determining percentage similarity. Percent tides may be cloned using compositions provided by the identity between polynucleotide sequences can also be present invention according to methods well known in the art. counted or calculated by other methods known in the art, e.g., cDNAs can be cloned using mRNA from a plant cell or tissue the Jotun Hein method (see, for example, Hein, 1990). Iden that expresses one of the present sequences. Appropriate tity between sequences can also be determined by other meth mRNA sources may be identified by interrogating Northern ods known in the art, e.g., by varying hybridization conditions blots with probes designed from the present sequences, after (see U.S. Pat. Appl. No. 20010010913, Hillmann, 2001). which a library is prepared from the mRNA obtained from a Thus, the invention provides methods for identifying a positive cell or tissue. Polypeptide-encoding cDNA is then sequence similar or paralogous or orthologous or homolo 25 isolated using, for example, PCR, using primers designed gous to one or more polynucleotides as noted herein, or one or from a presently disclosed gene sequence, or by probing with more target polypeptides encoded by the polynucleotides, or a partial or complete cDNA or with one or more sets of otherwise noted herein and may include linking or associat degenerate probes based on the disclosed sequences. The ing a given plant phenotype or gene function with a sequence. cDNA library may be used to transform plant cells. Expres In the methods, a sequence database is provided (locally or 30 sion of the cDNAs of interest is detected using, for example, across an internet or intranet) and a query is made against the microarrays, Northern blots, quantitative PCR, or any other sequence database using the relevant sequences herein and technique for monitoring changes in expression. Genomic associated plant phenotypes or gene functions. clones may be isolated using similar techniques to those. In addition, one or more polynucleotide sequences or one Examples of orthologs of the Arabidopsis polypeptide or more polypeptides encoded by the polynucleotide 35 sequences and their functionally similar orthologs are listed sequences may be used to search against a BLOCKS (Bairoch in FIGS. 4A to 4B and Tables 1-3, and the Sequence Listing. et al., 1997), PFAM, and other databases which contain pre In addition to the sequences in FIGS. 4A to 4B and Tables 1-3 viously identified and annotated motifs, sequences and gene and the Sequence Listing, the invention encompasses isolated functions. Methods that search for primary sequence patterns nucleotide sequences that are phylogenetically and structur with secondary structure gap penalties (Smith et al., 1992) as 40 ally similar to sequences listed in the Sequence Listing) and well as algorithms such as Basic Local Alignment Search can function in a plant by improving water deficit tolerance Tool (BLAST: Altschul, 1990: Altschul et al., 1993), and/or increased tolerance to cold when ectopically BLOCKS (Henikoff and Henikoff, 1991), Hidden Markov expressed in a plant. Models (HMM; Eddy, 1996: Sonnhammer et al., 1997), and Since a significant number of these sequences are phylo the like, can be used to manipulate and analyze polynucle 45 genetically and sequentially related to each other and have otide and polypeptide sequences encoded by polynucle been shown to increase tolerance to cold and/or water deficit otides. These databases, algorithms and other methods are conditions, one skilled in the art would predict that other well known in the art and are described in Ausubel et al., similar, phylogenetically related sequences derived from the 1997, and in Meyers, 1995. same ancestral sequence would also perform similar func A further method for identifying or confirming that specific 50 tions when ectopically expressed. homologous sequences control the same function is by com Identifying Polynucleotides or Nucleic Acids by Hybridiza parison of the transcript profile(s) obtained upon overexpres tion sion or knockout of two or more related polypeptides. Since Polynucleotides homologous to the sequences illustrated transcript profiles are diagnostic for specific cellular states, in the Sequence Listing and tables can be identified, e.g., by one skilled in the art will appreciate that genes that have a 55 hybridization to each other under Stringent or under highly highly similar transcript profile (e.g., with greater than 50% stringent conditions. Single stranded polynucleotides hybrid regulated transcripts in common, or with greater than 70% ize when they associate based on a variety of well character regulated transcripts in common, or with greater than 90% ized physical-chemical forces, such as hydrogen bonding, regulated transcripts in common) will have highly similar Solvent exclusion, base stacking and the like. The stringency functions. Fowler and Thomashow, 2002, have shown that 60 of a hybridization reflects the degree of sequence identity of three paralogous AP2 family genes (CBF1, CBF2 and CBF3) the nucleic acids involved, such that the higher the Stringency, are induced upon cold treatment, and each of which can the more similar are the two polynucleotide strands. Strin condition improved freezing tolerance, and all have highly gency is influenced by a variety of factors, including tempera similar transcript profiles. Once a polypeptide has been ture, salt concentration and composition, organic and non shown to provide a specific function, its transcript profile 65 organic additives, solvents, etc. present in both the becomes a diagnostic tool to determine whether paralogs or hybridization and wash solutions and incubations (and num orthologs have the same function. berthereof), as described in more detail in the references cited US 8,633,353 B2 27 28 below (e.g., Sambrook et al., 1989; Berger and Kimmel, ments such as genes that duplicate functional enzymes from 1987; and Anderson and Young, 1985). closely related organisms. The stringency can be adjusted Encompassed by the invention are polynucleotide either during the hybridization step or in the post-hybridiza sequences that are capable of hybridizing to the claimed tion washes. Salt concentration, formamide concentration, polynucleotide sequences, including any of the polynucle 5 hybridization temperature and probe lengths are variables otides within the Sequence Listing, and fragments thereof that can be used to alter stringency (as described by the under various conditions of stringency (see, for example, formula above). As a general guideline, high Stringency is Wahl and Berger, 1987; and Kimmel, 1987). In addition to the typically performed at Tm-5° C. to Tm-20° C. moderate nucleotide sequences listed in the Sequence Listing, full stringency at Tm-20°C. to Tm-35°C. and low stringency at length cDNA, orthologs, and paralogs of the present nucle 10 otide sequences may be identified and isolated using well Tm-35°C. to Tm-50° C. for duplex >150 base pairs. Hybrid known methods. The cDNA libraries, orthologs, and paralogs ization may be performed at low to moderate stringency (25 of the present nucleotide sequences may be screened using 50° C. below Tm), followed by post-hybridization washes at hybridization methods to determine their utility as hybridiza increasing stringencies. Maximum rates of hybridization in tion target or amplification probes. 15 solution are determined empirically to occurat Tm-25°C. for With regard to hybridization, conditions that are highly DNA-DNA duplex and Tm-15° C. for RNA-DNA duplex. stringent, and means for achieving them, are well known in Optionally, the degree of dissociation may be assessed after the art. See, for example, Sambrook et al., 1989; Berger, 1987, each wash step to determine the need for Subsequent, higher pages 467-469; and Anderson and Young, 1985. stringency wash steps. Stability of DNA duplexes is affected by such factors as High stringency conditions may be used to select for base composition, length, and degree of base pair mismatch. nucleic acid sequences with high degrees of identity to the Hybridization conditions may be adjusted to allow DNAs of disclosed sequences. An example of stringent hybridization different sequence relatedness to hybridize. The melting tem conditions obtained in a filter-based method such as a South perature (Tm) is defined as the temperature when 50% of the ern or Northern blot for hybridization of complementary duplex molecules have dissociated into their constituent 25 nucleic acids that have more than 100 complementary resi single strands. The melting temperature of a perfectly dues is about 5° C. to 20° C. lower than the thermal melting matched duplex, where the hybridization buffer contains for point (Tm) for the specific sequence at a defined ionic mamide as a denaturing agent, may be estimated by the fol strength and pH. Conditions used for hybridization may lowing equations: include about 0.02 M to about 0.15M sodium chloride, about DNA-DNA: 30 0.5% to about 5% casein, about 0.02% SDS or about 0.1% N-laurylsarcosine, about 0.001 M to about 0.03 M sodium (I) citrate, at hybridization temperatures between about 50° C. and about 70° C. More preferably, high stringency conditions DNA-RNA: are about 0.02 M sodium chloride, about 0.5% casein, about 35 0.02% SDS, about 0.001M sodium citrate, at a temperature of (II) about 50° C. Nucleic acid molecules that hybridize under RNA-RNA: stringent conditions will typically hybridize to a probe based on either the entire DNA molecule or selected portions, e.g., to a unique Subsequence, of the DNA. (III) 40 Stringent salt concentration will ordinarily be less than where L is the length of the duplex formed, Na' is the about 750 mM NaCl and 75 mM trisodium citrate. Increas molar concentration of the sodium ion in the hybridization or ingly stringent conditions may be obtained with less than washing solution, and % G+C is the percentage of (guanine-- about 500 mM. NaCl and 50 mM trisodium citrate, to even cytosine) bases in the hybrid. For imperfectly matched greater stringency with less than about 250 mM NaCl and 25 hybrids, approximately 1°C. is required to reduce the melting 45 mM trisodium citrate. Low stringency hybridization can be temperature for each 1% mismatch. obtained in the absence of organic solvent, e.g., formamide, Hybridization experiments are generally conducted in a whereas high Stringency hybridization may be obtained in the buffer of pH between 6.8 to 7.4, although the rate of hybrid presence of at least about 35% formamide, and more prefer ization is nearly independent of pH at ionic strengths likely to ably at least about 50% formamide. Stringent temperature be used in the hybridization buffer (Anderson and Young, 50 conditions will ordinarily include temperatures of at least 1985). In addition, one or more of the following may be used about 30° C., more preferably of at least about 37° C., and to reduce non-specific hybridization: Sonicated Salmon sperm most preferably of at least about 42° C. with formamide DNA or another non-complementary DNA, bovine serum present. Varying additional parameters, such as hybridization albumin, Sodium pyrophosphate, Sodium dodecylsulfate time, the concentration of detergent, e.g., sodium dodecyl (SDS), polyvinyl-pyrrolidone, ficolland Denhardt’s solution. 55 sulfate (SDS) and ionic strength, are well known to those Dextran sulfate and polyethylene glycol 6000 act to exclude skilled in the art. Various levels of stringency are accom DNA from solution, thus raising the effective probe DNA plished by combining these various conditions as needed. concentration and the hybridization signal within a given unit The washing steps that follow hybridization may also vary of time. In some instances, conditions of even greater Strin in stringency; the post-hybridization wash steps primarily gency may be desirable or required to reduce non-specific 60 determine hybridization specificity, with the most critical and/or background hybridization. These conditions may be factors being temperature and the ionic strength of the final created with the use of higher temperature, lower ionic wash Solution. Wash stringency can be increased by decreas strength and higher concentration of a denaturing agent Such ing salt concentration or by increasing temperature. Stringent as formamide. salt concentration for the wash steps will preferably be less Stringency conditions can be adjusted to screen for mod 65 thanabout 30 mMNaCl and 3 mMtrisodium citrate, and most erately similar fragments such as homologous sequences preferably less than about 15 mM NaCl and 1.5 mM triso from distantly related organisms, or to highly similar frag dium citrate. US 8,633,353 B2 29 30 Thus, hybridization and wash conditions that may be used Wahl and Berger, 1987, pages 399-407; and Kimmel, 1987). to bind and remove polynucleotides with less than the desired In addition to the nucleotide sequences in the Sequence List homology to the nucleic acid sequences or their complements ing, full length cDNA, orthologs, and paralogs of the present that encode the present polypeptides include, for example: nucleotide sequences may be identified and isolated using 6xSSC at a temperature of 65° C.; well-known methods. The cDNA libraries, orthologs, and 50% formamide, 4xSSC at 42°C.; or paralogs of the present nucleotide sequences may be screened 0.1XSSC, 0.2xSSC, 0.5xSSC, 1XSSC, 2XSSC, or 0.2xSSC using hybridization methods to determine their utility as to 2xSSC, with 0.1% SDS, at a temperature of 50° C., 55°C., hybridization target or amplification probes. 60° C., 65° C., or 50° C. to 65° C.; Vectors, Promoters, and Expression Systems with, for example, two wash steps of 10-30 minutes each. 10 The present invention includes recombinant nucleic acid Useful variations on these conditions will be readily apparent constructs comprising one or more of the nucleic acid to those skilled in the art. sequences herein. The constructs typically comprise a vector, A person of skill in the art would not expect substantial Such as a plasmid, a cosmid, a phage, a virus (e.g., a plant variation among polynucleotide species encompassed within virus), a bacterial artificial chromosome (BAC), a yeast arti the scope of the present invention because the highly stringent 15 ficial chromosome (YAC), or the like, into which a nucleic conditions set forth in the above formulae yield structurally acid sequence of the invention has been inserted, in a forward similar polynucleotides. or reverse orientation. In a preferred aspect of this embodi If desired, one may employ wash steps of even greater ment, the construct further comprises regulatory sequences, stringency, including about 0.2xSSC, 0.1% SDS at 65° C. and including, for example, a promoter, operably linked to the washing twice, each wash step being about 30 minutes, or sequence. Large numbers of Suitable vectors, cassettes and about 0.1xSSC, 0.1% SDS at 65° C. and washing twice for 30 promoters are known to those of skill in the art, and are minutes. The temperature for the wash solutions will ordi commercially available. narily be at least about 25°C., and for greater stringency at General texts that describe molecular biological tech least about 42°C. Hybridization stringency may be increased niques useful herein, including the use and production of further by using the same conditions as in the hybridization 25 vectors, cassettes and promoters and many other relevant steps, with the wash temperature raised about 3°C. to about topics, include Berger and Kimmel, 1987, Sambrook et al., 5°C., and stringency may be increased even further by using 1989, and Ausubel, 1997-2001. Any of the identified the same conditions except the wash temperature is raised sequences can be incorporated into a cassette or vector, e.g., about 6° C. to about 9° C. For identification of less closely for expression in plants. A number of expression vectors or related homologs, wash steps may be performed at a lower 30 cassettes suitable for stable transformation of plant cells or temperature, e.g., 50° C. for the establishment of transgenic plants have been described An example of a low stringency wash step employs a including those described in Weissbach and Weissbach, 1989, solution and conditions of at least 25°C. in 30 mM NaCl, 3 and Gelvin et al., 1990. Specific examples include those mM trisodium citrate, and 0.1% SDS over 30 minutes. derived from a Tiplasmid of Agrobacterium tumefaciens, as Greater stringency may be obtained at 42°C. in 15 mMNaCl, 35 well as those disclosed by Herrera-Estrella et al., 1983, Bevan with 1.5 mM trisodium citrate, and 0.1% SDS over 30 min 1984, Klee 1985, for dicotyledonous plants. utes. Even higher stringency wash conditions are obtained at Alternatively, non-Ti vectors can be used to transfer the 65°C.-68°C. in a solution of 15 mMNaCl, 1.5 mM trisodium DNA into monocotyledonous plants and cells by using free citrate, and 0.1% SDS. Wash procedures will generally DNA delivery techniques. Such methods can involve, for employ at least two final wash steps. Additional variations on 40 example, the use of liposomes, electroporation, microprojec these conditions will be readily apparent to those skilled in the tile bombardment, silicon carbide whiskers, and viruses. By art (see, for example, U.S. Pat. Appl. No. 20010010913, using these methods transgenic plants such as wheat, rice Hillmann, 2001). (Christou, 1991, and corn (Gordon-Kamm, 1990) can be Stringency conditions can be selected Such that an oligo produced. An immature embryo can also be a good target nucleotide that is perfectly complementary to the coding oli 45 tissue for monocots for direct DNA delivery techniques by gonucleotide hybridizes to the coding oligonucleotide with at using the particle gun (Weeks et al., 1993; Vasil, 1993: Wan least about a 5-10x higher signal to noise ratio than the ratio and Lemeaux, 1994, and for Agrobacterium-mediated DNA for hybridization of the perfectly complementary oligonucle transfer (Ishida et al., 1996). otide to a nucleic acid encoding a polypeptide known as of the Typically, plant transformation vectors include one or filing date of the application. It may be desirable to select 50 more cloned plant coding sequence (genomic or cDNA) conditions for a particular assay Such that a higher signal to under the transcriptional control of 5' and 3' regulatory noise ratio, that is, about 15x or more, is obtained. Accord sequences and a dominant selectable marker. Such plant ingly, a Subject nucleic acid will hybridize to a unique coding transformation vectors typically also contain a promoter (e.g., oligonucleotide with at least a 2x or greater signal to noise a regulatory region controlling inducible or constitutive, envi ratio as compared to hybridization of the coding oligonucle 55 ronmentally- or developmentally-regulated, or cell- or tissue otide to a nucleic acid encoding known polypeptide. The specific expression), a transcription initiation start site, an particular signal will depend on the label used in the relevant RNA processing signal (such as intron splice sites), a tran assay, e.g., a fluorescent label, a colorimetric label, a radio Scription termination site, and/or a polyadenylation signal. active label, or the like. Labeled hybridization or PCR probes A potential utility for the transcription factor polynucle for detecting related polynucleotide sequences may be pro 60 otides disclosed herein is the isolation of promoter elements duced by oligolabeling, nick translation, end-labeling, or from these genes that can be used to program expression in PCR amplification using a labeled nucleotide. plants of any genes. Each transcription factor gene disclosed Encompassed by the invention are polynucleotide herein is expressed in a unique fashion, as determined by sequences that are capable of hybridizing to the claimed promoter elements located upstream of the start of transla polynucleotide sequences, including any of the polynucle 65 tion, and additionally within an intron of the transcription otides within the Sequence Listing, and fragments thereof factor gene or downstream of the termination codon of the under various conditions of stringency (see, for example, gene. As is well known in the art, for a significant portion of US 8,633,353 B2 31 32 genes, the promoter sequences are located entirely in the Expression Hosts region directly upstream of the start of translation. In Such The present invention also relates to host cells which are cases, typically the promotersequences are located within 2.0 transduced with vectors of the invention, and the production kb of the start of translation, or within 1.5 kb of the start of of polypeptides of the invention (including fragments translation, frequently within 1.0kb of the start of translation, 5 thereof) by recombinant techniques. Host cells are geneti and sometimes within 0.5 kb of the start of translation. cally engineered (i.e., nucleic acids are introduced, e.g., trans The promoter sequences can be isolated according to meth duced, transformed or transfected) with the vectors of this ods known to one skilled in the art. invention, which may be, for example, a cloning vector or an Examples of constitutive plant promoters which can be expression vector comprising the relevant nucleic acids useful for expressing the TF sequence include: the cauli 10 herein. The vector is optionally a plasmid, a viral particle, a phage, a naked nucleic acid, etc. The engineered host cells can flower mosaic virus (CaMV) 35S promoter, which confers be cultured in conventional nutrient media modified as appro constitutive, high-level expression in most plant tissues (see, priate for activating promoters, selecting transformants, or e.g., Odell et al., 1985); the nopaline synthase promoter (An amplifying the relevant gene. The culture conditions, such as et al., 1988); and the octopine synthase promoter (Fromm et 15 temperature, pH and the like, are those previously used with al., 1989). the host cell selected for expression, and will be apparent to A variety of plant gene promoters that regulate gene those skilled in the art and in the references cited herein, expression in response to environmental, hormonal, chemi including, Sambrook, 1989 and Ausubel, 1997-2001. cal, developmental signals, and in a tissue-active manner can The host cell can be a eukaryotic cell. Such as a yeast cell, be used for expression of a TF sequence in plants. Choice of or a plant cell, or the host cell can be a prokaryotic cell. Such a promoter is based largely on the phenotype of interest and is as a bacterial cell. Plant protoplasts are also suitable for some determined by Such factors as tissue (e.g., seed, fruit, root, applications. For example, the DNA fragments are introduced pollen, vascular tissue, flower, carpel, etc.), inducibility (e.g., into plant tissues, cultured plant cells or plant protoplasts by in response to wounding, heat, cold, drought, light, patho standard methods including electroporation (Fromm et al., gens, etc.), timing, developmental stage, and the like. Numer 25 1985, infection by viral vectors such as cauliflower mosaic ous known promoters have been characterized and can favor virus (CaMV) (Hohnet al., 1982; and U.S. Pat. No. 4,407,956 ably be employed to promote expression of a polynucleotide to Howell, 1983), high velocity ballistic penetration by small of the invention in a transgenic plant or cell of interest. For particles with the nucleic acid either within the matrix of example, tissue specific promoters include: seed-specific pro small beads or particles, or on the surface (Klein et al., 1987), moters (such as the napin, phaseolin or DC3 promoter 30 use of pollen as vector (PCT Pat. Publ. No. WO8501856, De described in U.S. Pat. No. 5,773,697 to Tomes, 1998), fruit Wet), or use of Agrobacterium tumefaciens or A. rhizogenes specific promoters that are active during fruit ripening (such carrying a T-DNA plasmid in which DNA fragments are as the dru 1 promoter (U.S. Pat. No. 5,783,393 to Kellogg, cloned. The T-DNA plasmid is transmitted to plant cells upon 1998), or the 2A11 promoter (U.S. Pat. No. 4,943,674 to infection by Agrobacterium tumefaciens, and a portion is Houck and Pear, 1990) and the tomato polygalacturonase 35 stably integrated into the plant genome (Horsch et al., 1984; promoter (Bird et al., 1988), root-specific promoters, such as Fraley et al., 1983). those disclosed in U.S. Pat. No. 5,618,988 to Hauptmann, et The cell can include a nucleic acid of the invention that al., 1997, U.S. Pat. No. 5,837,848 to Ely, 1998, and U.S. Pat. encodes a polypeptide, wherein the cell expresses a polypep No. 5,905,186 to Thomas et al., 1999, pollen-active promot tide of the invention. The cell can also include vector ers such as PTA29, PTA26 and PTA13 (U.S. Pat. No. 5,792, 40 sequences, or the like. Furthermore, cells and transgenic 929 to Mariani, 1998), promoters active in vascular tissue plants that include any polypeptide or nucleic acid above or (Ringli and Keller, 1998), flower-specific (Kaiser et al., throughout this specification, e.g., produced by transduction 1995), pollen (Baersonet al., 1994), carpels (Ohlet al., 1990), of a vector of the invention, are an additional feature of the pollen and ovules (Baerson et al., 1993), auxin-inducible invention. promoters (such as that described in van der Kop et al., 1999 45 For long-term, high-yield production of recombinant pro or Baumann et al., 1999), cytokinin-inducible promoter teins, stable expression can be used. Host cells transformed (Guevara-Garcia, 1998), promoters responsive to gibberellin with a nucleotide sequence encoding a polypeptide of the (Shietal., 1998, Willmottet al., 1998) and the like. Additional invention are optionally cultured under conditions Suitable promoters are those that elicit expression in response to heat for the expression and recovery of the encoded protein from (Ainley et al., 1993), light (e.g., the pea rbcS-3A promoter, 50 cell culture. The protein or fragment thereof produced by a Kuhlemeier et al., 1989, and the maize rbcS promoter, recombinant cell may be secreted, membrane-bound, or con Schaffner and Sheen, 1991); wounding (e.g., wunI, Siebertz tained intracellularly, depending on the sequence and/or the et al., 1989); pathogens (such as the PR-1 promoter described vector used. As will be understood by those of skill in the art, in Buchel et al., 1999, and the PDF1.2 promoter described in expression vectors containing polynucleotides encoding Manners et al., 1998), and chemicals such as methyl jas 55 mature proteins of the invention can be designed with signal monate or salicylic acid (Gatz, 1997). In addition, the timing sequences which direct secretion of the mature polypeptides of the expression can be controlled by using promoters such through a prokaryotic or eukaryotic cell membrane. as those acting at Senescence (Gan and Amasino, 1995); or late seed development (Odell et al., 1994). EXAMPLES Plant expression vectors or cassettes can also include RNA 60 processing signals that can be positioned within, upstream or It is to be understood that this invention is not limited to the downstream of the coding sequence. In addition, the expres particular devices, machines, materials and methods sion vectors can include additional regulatory sequences described. Although particular embodiments are described, from the 3'-untranslated region of plant genes, e.g., a 3' ter equivalent embodiments may be used to practice the inven minator region to increase mRNA stability of the mRNA, 65 tion. such as the PI-II terminator region of potato or the octopine or The invention, now being generally described, will be more nopaline synthase 3' terminator regions. readily understood by reference to the following examples, US 8,633,353 B2 33 34 which are included merely for purposes of illustration of library. P46 was used to prepare a 35S::G481 direct promoter certain aspects and embodiments of the present invention and fusion construct that carries a kanamycin resistance marker. are not intended to limit the invention. It will be recognized by The two constructs P5287 (LTP1::LexA-GAL4 TA: SEQ one of skill in the art that a polypeptide that is associated with ID NO: 71) and P6812 (opLexA::G481; SEQ ID NO: 72) a particular first trait may also be associated with at least one 5 together constitute a two-component system for expression of other, unrelated and inherent second trait which was not pre G481 from the LTP1 promoter. A kanamycin resistant trans dicted by the first trait. genic line containing P5287 was established, and this was then supertransformed with the P6812 construct containing a Example I cDNA clone of G481 and a sulfonamide resistance marker. 10 P21522 (SEQ ID NO: 73) contains a G481 cDNA clone Project Types and Vector and Cloning Information and was used to prepare SUC2::G481 direct promoter-fusion construct that carries a kanamycin resistance marker. A number of constructs were used to modulate the activity The two constructs P5311 (ARSK1:LexA-GAL4 TA; of sequences of the invention. An individual project was SEQID NO: 74) and P6812 (opLex A::G481; SEQ ID NO: 15 72) together constitute a two-component system for expres defined as the analysis of lines for a particular construct (for sion of G481 from the ARSK1 promoter. A kanamycin resis example, this might include plant lines that constitutively tant transgenic line containing P5311 was established, and overexpress G482 or another subclade polypeptide). Gener this was then supertransformed with the P6812 construct ally, a full-length wild-type version of a gene or its cDNA was containing a cDNA clone of G481 and a sulfonamide resis directly fused to a promoter that drove its expression in trans tance marker. genic plants, except as noted in Table 1. Such a promoter The two constructs P9002 (RD29A::LexA-GAL4 TA; could be the native promoter of that gene, or the CaMV 35S SEQID NO: 75) and P6812 (opLex A::G481; SEQ ID NO: promoter which drives constitutive expression. Alternatively, 72) together constitute a two-component system for expres a promoter that drives tissue specific or conditional expres sion of G481 from the RD29A promoter. A kanamycin resis sion could be used in similar studies. A direct fusion approach 25 tant transgenic line containing P9002 was established, and has the advantage of allowing for simple genetic analysis if a this was then supertransformed with the P6812 construct given promoter-polynucleotide line is to be crossed into dif containing a cDNA clone of G481 and a sulfonamide resis ferent genetic backgrounds at a later date. tance marker. As an alternative to plant transformation with a direct The two constructs P5319 (AS1:LeXA-GAL4TA: SEQID fusion construct, Some plant lines were transformed with a 30 NO: 76) and P6812 (opLexA::G481; SEQ ID NO: 72) two component expression system in which a kanamycin together constitute a two-component system for expression of resistant 35S:LeXA-GAL4-TA driver line was established G481 from the AS1 promoter. A kanamycin resistant trans and then Supertransformed with an opLex A::transcription genic line containing P5319 was established, and this was factor construct carrying a Sulfonamide resistance gene for then supertransformed with the P6812 construct containing a each of the transcription factors of interest. 35 cDNA clone of G481 and a sulfonamide resistance marker. The first component vector, the “driver vector or construct The two constructs P6506 (35S:LexA-GAL4TA: SEQID (P6506) contained a transgene carrying a 35S::Lex A-GAL4 NO: 77) and P6812 (opLexA::G481; SEQ ID NO: 72) transactivation domain (TA) (SEQID NO: 77) along with a together constitute a two-component system for expression of kanamycin resistance selectable marker. Having established a G481 from the 35S promoter. P6812 contains the same cDNA driver line containing the 35S:LexA-GAL4-transactivation 40 clone that is present in P46. A kanamycin resistant transgenic domain component, the transcription factors of the invention line containing P6506 was established, and this was then could be expressed by Super-transforming or crossing in a supertransformed with the P6812 construct containing a second construct carrying a Sulphonamide resistance select cDNA clone of G481 and a sulfonamide resistance marker. able marker and the transcription factor polynucleotide of P25281 (SEQ ID NO: 78) was used to prepare a 35S:: interest cloned behind a LexA operator site (opLex A:TF). 45 G481-GFP (Green Fluorescent Protein) fusion directly fused For example, the two constructs P6506 (35S:LexA-GAL4 to the 35S promoter and a KanR marker. TA: SEQ ID NO: 77) and P5072 (opLexA::G482: SEQ ID P47 (SEQ ID NO: 79) was used to prepare a 35S::G482 NO: 80) together constituted a two-component system for direct promoter fusion construct that carried a KanR marker. expression of G482 from the 35S promoter. A kanamycin The two constructs P6506 (35S:LexA-GAL4TA: SEQID resistant transgenic line containing P6506 was established, 50 NO: 77) and P5072 (opLexA::G482: SEQ ID NO: 80) and this was then supertransformed with the P5072 construct together constitute a two-component system for expression of containing a genomic clone of G482 and a Sulfonamide resis G482 from the 35S promoter. A kanamycin resistant trans tance marker. For each transcription factor that was overex genic line containing P6506 was established, and this was pressed with a two component system, the second construct then supertransformed with the P5072 construct containing a carried a sulfonamide selectable marker and was contained 55 cDNA clone of G482 and a sulfonamide resistance marker. within vector backbone pMEN53. P1441 (SEQID NO: 81) contains a G485 cDNA clone and A Summary of constructs that have been used in these was used to prepare a 35S::G485 direct promoter fusion con studies is provided as follows. Generally, a direct promoter struct that carried a KanR marker. fusion nucleic acid construct may be prepared by fusing a The two constructs P6506 (35S:LexA-GAL4TA: SEQID promoter sequence, such as constitutive, tissue-specific or 60 NO: 77) and P4190 (opLexA::G485; SEQ ID NO: 82) inducible promoter, include those found in the sequence list together constitute a two-component system for expression of ing, to the nucleic acid sequence found in the below listed G485 from the 35S promoter. A kanamycin resistant trans PIDS. Alternatively, a two component expression system may genic line containing P6506 was established, and this was be prepared to overexpress particular sequences of the inven then supertransformed with the P4190 construct containing a tion, as noted below. 65 cDNA clone of G485 and a sulfonamide resistance marker. P46 (SEQID NO: 70) contains a G481 cDNA clone which P26044 (SEQ ID NO: 83) was used to prepare a 35S:: incorporates native UTR and was derived from a cDNA G485-(9A)-CFP (9A=a linker comprising nine alanine resi US 8,633,353 B2 35 36 dues; CFP-cyan fluorescent protein) fusion directly fused to P21251 (SEQID NO:92) contains a G3429 cDNA clone, the 35S promoter and that carried a KanR marker. and was used to prepare a 35S::G3429 direct promoter-fusion The two constructs P6506 (35S:LexA-GAL4TA: SEQID construct carrying a kanamycin resistance marker. NO: 77) and P4357 (opLexA::G1364; SEQ ID NO: 84) P21466 (SEQID NO: 93) contains a G3434 cDNA clone, 5 and was used to prepare a 35S::G3434 direct promoter-fusion together constitute a two-component system for expression of construct carrying a kanamycin resistance marker. G1364 from the 35S promoter. A kanamycin resistant trans P21314 (SEQID NO:94) contains a G3435 cDNA clone, genic line containing P6506 was established, and this was and was used to prepare a 35S::G3435 direct promoter-fusion then supertransformed with the P4357 construct containing a construct carrying a kanamycin resistance marker. genomic clone of G 1364 and a Sulfonamide resistance P21381 (SEQID NO: 95) contains a G3436 cDNA clone, marker. 10 and was used to prepare a 35S::G3436 direct promoter-fusion The two constructs P5284 (RBCS3::LexA-GAL4 TA; construct carrying a kanamycin resistance marker. SEQID NO: 85) and P4357 (opLexA::G1364; SEQID NO: P21341 and P21471 (SEQIDNOs: 96 and 97) each contain 84) together constitute a two-component system for expres a G3470 cDNA clone, and are used to prepare 35S::G3470 sion of G1364 from the RBCS3 promoter. Akanamycin resis direct promoter-fusion constructs carrying a kanamycin tant transgenic line containing P5284 was established, and 15 resistance marker. this was then supertransformed with the P4357 construct P21342 (SEQID NO: 98) contains a G3471 cDNA clone, containing a genomic clone of G1364 and a Sulfonamide and was used to prepare a 35S::G3471 direct promoter-fusion resistance marker. construct carrying a kanamycin resistance marker. The two constructs P9002 (RD29A::LexA-GAL4 TA; P21348 (SEQID NO: 99) contains a G3472 cDNA clone, SEQID NO: 75) and P4357 (opLexA::G1364; SEQID NO: and was used to prepare a 35S::G3472 direct promoter-fusion 84) together constitute a two-component system for expres construct carrying a kanamycin resistance marker. sion of G1364 from the RD29A promoter. A kanamycin P21344 (SEQID NO: 100) contains a G3474 cDNA clone, resistant transgenic line containing P9002 (line 5) was estab and was used to prepare a 35S::G3474 direct promoter-fusion lished, and this was then supertransformed with the P4357 construct carrying a kanamycin resistance marker. construct containing a genomic clone of G 1364 and a Sul 25 P21347 (SEQID NO: 101) contains a G3475 cDNA clone, fonamide resistance marker. and was used to prepare a 35S::G3475 direct promoter-fusion P26.108 (SEQ ID NO: 86) was used to prepare a 35S:: construct carrying a kanamycin resistance marker. G1364-(9A)-CFP fusion directly fused to the 35S promoter, P21345 (SEQID NO: 102) contains a G3476 cDNA clone, and that carried a KanR marker. and was used to prepare a 35S::G3476 direct promoter-fusion The two constructs P6506 (35S:LexA-GAL4TA: SEQID 30 construct carrying a kanamycin resistance marker. NO: 77) and P8079 (opLexA::G2345; SEQ ID NO: 87) P21350 (SEQID NO: 103) contains a G3478 cDNA clone, together constitute a two-component system for expression of and was used to prepare a 35S::G3478 direct promoter-fusion G2345 from the 35S promoter. A kanamycin resistant trans construct carrying a kanamycin resistance marker. genic line containing P6506 was established, and this was P27818 (SEQID NO: 104) contains a G3866-GFP C-ter then supertransformed with the P8079 construct containing a 35 minal protein fusion and was fused to the 35S promoter. genomic clone of G2345 and a Sulfonamide resistance P26609 (SEQ ID NO: 105) was used to prepare a 35S:: marker. G3875 direct promoter fusion. The construct carries kanR P21253 (SEQ ID NO: 88) was used to prepare a G3395 and contains a cDNA clone of G3875. The clone within this cDNA clone 35S:G3395 direct promoter-fusion construct construct apparently encodes a slightly truncated form of the carrying a kanamycin resistance marker. 40 protein. P23304 (SEQ ID NO: 89) was used to prepare a 35S:: P25657 (SEQ ID NO: 106) was used to prepare a 35S:: G3396 cDNA direct promoter-fusion that carried a KanR G3876 direct fusion that carried a KanR marker. The con marker. struct contains a cDNA clone. P21265 (SEQID NO: 90) contains a G3397 clDNA clone, For the present study, the expression constructs used to and was used to prepare a 35S::G3397 direct promoter-fusion 45 generate lines of transgenic Arabidopsis plants constitutively construct that carried a KanR marker. overexpressing G482 subclade polypeptides are listed in P21252 (SEQID NO:91) contains a G3398 cDNA clone, Table 1. Compilations of the sequences of promoter frag and was used to prepare a 35S::G3398 direct promoter-fusion ments and the expressed transgene sequences within the PIDS construct that carried a KanR marker. are also provided in the Sequence Listing. TABLE 1. G482 subclade polynucleotide expression constructs Species Gene SEQ Promoter? SEQ Identifier ID NO: expression PID Construct 1 PID Construct 2 ID NO: (GID) of GID system (components) (components) of PID(s) Project type At G481 2 Constitutive P46 (for 70 Direct promoter 35S 35S:G481) fusion Epidermal P5287 P6812 71, 72 Two component specific (LTP1:LexA- (opLexA::G481) Supertransformation LTP1 GAL4TA) Vascular P21522 (for 73 Direct promoter specific SUC2:G481) fusion SUC2 Root P5311 P6812 74, 72 Two component specific (ARSK1::Lex A- (opLexA::G481) Supertransformation US 8,633,353 B2 37 38 TABLE 1-continued G482 Subclade polynucleotide expression constructs Species Gene SEQ Promoter? SEQ Identifier ID NO: expression PID Construct 1 PID Construct 2 ID NO: (GID) of GID system (components) (components) of PID(s) Project type ARSK1 GAL4TA) Stress P9002 P6812 75, 72 Two component inducible (RD29A::Lex A- (opLexA::G481) Supertransformation RD29A GAL4TA) Emergent P5319 P6812 76, 72 Two component leaf (AS1::LeXA- (opLexA::G481) Supertransformation primordial- GAL4TA) specific AS1 Constitutive P6506 P6812 77,72 Two component 35S (35S:LexA- (opLexA::G481) Supertransformation GAL4TA) Constitutive P25281 (for 78 Direct promoter 35S/GFP 3SS::G481-GFP protein-GFP fusion fusion fusion) At G482 28 Constitutive P47 (for 79 Direct promoter 35S 35S:G482) fusion Constitutive P6506 P5072 77, 80 Two component 35S (35S:LexA- (opLexA::G482) Supertransformation GAL4TA) At G485 18 Constitutive P1441 (for 81 Direct promoter 35S 35S:G485) fusion Constitutive P6506 P4190 77, 82 Two component 35S (35S:LexA- (opLexA::G485) Supertransformation GAL4TA) Emergent P5319 P4190 76, 82 Two component leaf (AS1::LeXA- (opLexA::G485) Supertransformation primordial- GAL4TA) specific AS1 Constitutive P26044 (for 83 Direct promoter 35S/CFP 35S:G485-(9A)- fusion fusion CFP) At G1364 14 Constitutive P6506 P4357 77, 84. Two component 35S (35S:LexA- (opLexA::G1364) Supertransformation GAL4TA) Leaf- P5284 P4357 85, 84. Two component specific (RBCS3::LexA- (opLexA::G1364) Supertransformation RBCS3 GAL4TA) Stress P9002 P4357 75, 84. Two component inducible (RD29A::Lex A- (opLexA::G1364) Supertransformation RD29A GAL4TA) Constitutive P26108 (for 86 Direct promoter 35S/CFP 3SS::G1364- fusion fusion (9A)-CFP fusion) At G2345 22 Constitutive P6506 P8079 77, 87 Two component 35S (35S:LexA- (opLexA::G2345) Supertransformation GAL4TA) OS.G.339S 38 Constitutive P21253 (for 88 Direct promoter 35S 35S:G3395) usion OS.G.3396 44 Constitutive P23304 (for 89 Direct promoter 35S 35S:G3396) usion OS.G.3397 36 Constitutive P21265 (for 90 Direct promoter 35S 35S:G3397) usion OS.G.3398 40 Constitutive P21252 (for 91 Direct promoter 35S 35S:G3398) usion OSG3429 46 Constitutive P21251 (for 92 Direct promoter 35S 35S:G3429) usion Zm G3434 12 Constitutive P21466 (for 93 Direct promoter 35S 35S:G3434) usion Zm G3435 30 Constitutive P21314(for 94 Direct promoter 35S 35S:G3435) usion Zm G3436 34 Constitutive P21381 (for 95 Direct promoter 35S 35S:G3436) usion GnfG3470 4 Constitutive P21341 (for 96 Direct promoter 35S 35S:G3470) usion GnfG3470 4 Constitutive P21471 (for 97 Direct promoter 35S 35S:G3470) usion GmiG3471 6 Constitutive P21342 (for 98 Direct promoter 35S 35S:G3471) usion GmiG3472 32 Constitutive P21348 (for 99 Direct promoter 35S 35S:G3472) usion US 8,633,353 B2 39 40 TABLE 1-continued G482 Subclade polynucleotide expression constructs Species Gene SEQ Promoter? SEQ Identifier ID NO: expression PID Construct 1 PID Construct 2 ID NO: (GID) of GID system (components) (components) of PID(s) Project type GmiG3474 24 Constitutive P21344 100 Direct promoter 35S (35S:G3474) usion GnfG3475 16 Constitutive P21347 (for 101 Direct promoter 35S 35S:G3475) usion GmiG3476 20 Constitutive P21345 (for 102 Direct promoter 35S 35S:G3476) usion GnfG3478 26 Constitutive P21350 (for 103 Direct promoter 35S 35S:G3478) usion Zm G3866 42 Constitutive P26587 (for 104 Direct promoter 35S 3SS::G3866 usion fused at the C erminus with he GAL4TA) GnfG3875 8 Constitutive P26609 (for 105 Direct promoter 35S 35S:G3875) usion Zn; G3876 10 Constitutive P25657 (for 106 Direct promoter 35S 35S:G3876) usion Lex A- - Constitutive P6506 77 Driver construct GAL4 TA 35S (35S:LexA in driver GAL4TA) construct Species abbreviations: At Arabidopsis thaiana; Gm—Glycine max; Os—Oryza Saiiva; Zm—Zea mayS

Example II obtained from the TO generation, and was later plated on selection plates (eitherkanamycin or Sulfonamide). Resistant Transformation 30 plants that were identified on Such selection plates comprised the T1 generation. Transformation of Arabidopsis was performed by an Agro bacterium-mediated protocol based on the method of Bech Example III told and Pelletier, 1998. Unless otherwise specified, all 35 experimental work was performed with the Arabidopsis Methods Used in Plate- and Soil-Based Physiology Columbia ecotype. Assays Plant Preparation: Arabidopsis seeds were sown on mesh covered pots. The Unless otherwise stated, all experiments are performed seedlings were thinned so that 6-10 evenly spaced plants 40 with the Arabidopsis thaliana ecotype Columbia (col-0). remained on each pot 10 days after planting. The primary Assays are usually performed on non-selected segregating T2 bolts were cut off a week before transformation to break populations (in order to avoid the extra stress of selection). apical dominance and encourage auxiliary shoots to form. Control plants for assays onlines containing direct promoter Transformation was typically performed at 4-5 weeks after fusion constructs are Col-O plants transformed an empty Sowing. 45 transformation vector (pMEN65). Controls for 2-component Bacterial Culture Preparation: lines (generated by Supertransformation) are the background Agrobacterium stocks were inoculated from single colony promoter-driver lines (i.e. promoter::LexA-GAL4TA lines), plates or from glycerol stocks and grown with the appropriate into which the supertransformations were initially per antibiotics until saturation. On the morning of transforma formed. All assays are performed in tissue culture. Growing tion, the Saturated cultures were centrifuged and bacterial 50 the plants under controlled temperature and humidity on Ster pellets were re-suspended in Infiltration Media (0.5x MS. 1 x ile medium produced uniform plant material that had not been B5 Vitamins, 5% sucrose, 1 mg/ml benzylaminopurine ribo exposed to additional stresses (such as water stress) which side, 200 ul/L Silwet L77) until an Asoo reading of 0.8 was could cause variability in the results obtained. All assays were reached. designed to detect plants that are more tolerant or less tolerant Transformation and Seed Harvest: 55 to the particular stress condition and were developed with the Agrobacterium Solution was poured into dipping con reference to the following publications: Jang et al., 1997, tainers. All flower buds and rosette leaves of the plants were Smeekens, 1998, Liu and Zhu, 1997, Saleki et al., 1993, Wu immersed in this solution for 30 seconds. The plants were laid et al., 1996, Zhu et al., 1998, Alia et al., 1998, Xin and on their side and wrapped to keep the humidity high. The Browse, 1998, Leon-Kloosterziel et al., 1996. Where pos plants were kept this way overnight at 4°C., after which the 60 sible, assay conditions were originally tested in a blind pots were turned upright, unwrapped, and moved to the experiment with controls that had phenotypes related to the growth racks. condition tested. The plants were maintained on the growth rack under Plate-based physiological assays representing a variety of 24-hour light until seeds were ready to be harvested. Seeds cold and water deficit-stress related conditions were used as a were harvested when 80% of the siliques of the transformed 65 pre-screen to identify abiotic stress tolerant plant lines from plants were ripe, approximately 5 weeks after the initial trans each project (i.e. lines from transformation with a particular formation. This seed was deemed TO seed, since it was construct), many of which were tested in Subsequent soil US 8,633,353 B2 41 42 based assays. Typically, ten or more lines were transformed bleach/0.01% Tween and five washes in distilled water. The with each construct and Subjected to plate assays, from which seeds were sown to MSagar in 0.1% agarose and stratified for the best three lines were selected for subsequent soil based three days at 4°C. before transfer to growth cabinets at 22°C. assayS. After seven days of growth on selection plates, the seedlings Seed Treatment. were transplanted to 3.5 inch diameterclay pots containing 80 Prior to plating, seed for all experiments were surface g of a 50:50 mix of vermiculite:perlite topped with 80 g of sterilized with: ProMix. Typically, each pot contained 14 seedlings, and (1) a five minute incubation with mixing in 70% ethanol: plants of the transgenic lines being tested and wild-type con trol plants were in separate pots. Pots containing the trans (2) a 20 minute incubation with mixing in 30% bleach, 10 genic line and control pots were randomly interspersed in a 0.01% triton-X 100; and growth room, maintained under 24-hour light conditions (18 (3) five rinses with sterile water. 23°C., and 90-100 uE m-2s-1) and watered for a period of After this treatment, the seeds were re-suspended in 0.1% 14 days. Water was then withheld and the pots were placed on sterile agarose and stratified at 4°C. for 3-4 days. absorbent diaper paper for a period of 8-10 days to apply a Germination Assays. 15 drought treatment. After this period, a visual qualitative All germination assays followed modifications of the same “drought score' from 0-6 was assigned to record the extent of basic protocol. Sterile seeds were sown on a conditional visible drought stress symptoms. A score of “6” corresponded medium that had a basal composition of 80% MS+Vitamins. to no visible symptoms, whereas a score of “0” corresponded Plates were incubated at 22°C. under 24-hour light (120-130 to extreme wilting and the leaves with a “crispy” texture. At uEm-2s-1) in a growth chamber. Evaluation of germination the end of the drought period, the pots were re-watered and and seedling vigor was performed five days after planting. For scored after 5-6 days; the number of surviving plants in each assessment of root development, seedlings germinated on pot was counted, and the proportion of the total plants in the 80% MS+Vitamins+1% sucrose were transferred to square pot that Survived was calculated. plates at seven days after planting. Evaluation was performed 25 Analysis of Results. five days after transfer following growth in a vertical position. In a given experiment, we typically compared six or more Qualitative differences were recorded including lateral and pots of a transgenic line with six or more pots of the appro primary root length, root hair number and length, and overall priate control. The mean drought score and mean proportion of plants surviving (survival rate) were calculated for pots of growth. 30 Water-deficit related plate-based germination assays were both the transgenic lines and wildtype. In each case a p-value conducted in the conditional basal medium with 80% was calculated that indicated the significance of the differ MS+Vitamins. Heat tolerance germination assays were con ence between the two mean values. The p-value was calcu ducted at 32° C. For the salt, mannitol and Sugar sensing lated with a Mann-Whitney rank-Sum test. assays, the conditional basal medium was modified by adding 35 Example IV one of the following for specific tolerance assays: NaCl (150 mM), mannitol (300 mM), or sucrose (9.4%). Transcription Factor Polynucleotide and Polypeptide Cold tolerance germination assays were conducted at 8°C. Sequences of the Invention, and Results Obtained Growth Assays. with Plants Overexpressing these Sequences Severe dehydration (a plate-based water deficit) assays 40 Table 2 shows the polypeptides identified by SEQID NO; were conducted by growing seedlings for 14 days on MS+Vi Gene ID (GID) No.; the transcription factor family to which tamins+1% Sucrose at 22°C. The plates were opened in a the polypeptide belongs, and conserved B domains of the sterile laminar flow hood for three hours for hardening, and polypeptide. The first column shows the polypeptide SEQID the seedlings were then removed from the media and dried for 45 NO; the second column the species (abbreviated) and identi 2 hours on absorbent paper in the flow hood. After this time fier (GID or “Gene IDentifier): they were transferred back to plates and incubated at 22°C. the third column shows percentage identity of each for recovery. The plants were evaluated after five days. sequence to the G481 protein (the number of identical resi For heat sensitivity growth assays, seeds were germinated dues per the total number of residues in the Subsequence used and grown for seven days on MS+Vitamins+1% sucrose at 50 by the BLASTp algorithm for comparison appears in paren 22°C., after which the seedlings were transferred to heat theses), the fourth column shows the amino acid coordinates stress conditions by growing the seedlings at 32° C. for five of the conserved B domains of each transcription factor listed, days. The plants were transferred back to 22°C. for recovery the fifth column shows the B domain of each sequence; the and evaluated after a further 5 days. 55 sixth column lists each SEQ ID NO: of the respective B Growth in cold conditions (chilling) was by germinating domains; and the seventh column shows the percentage iden seeds and growing them for seven days on MS+Vitamins+1% tity of each of the B domains to the G481 B domains (the sucrose at 22°C. The seedlings were then transferred to number of identical residues per the total number of residues chilling conditions at 8°C. and the plants were evaluated after in the subsequence used by the BLASTp algorithm for com 10 days and 17 days in these conditions. 60 parison appears in parentheses). The sequences are arranged Soil-Based Drought Assays in Clay Pots. in descending order of percentage identity to the G481 B The soil drought assay was based on the method of Haake domain. Percentage identities to the sequences listed in Table et al., 2002. In the current procedure, Arabidopsis seedlings 2 were determined using BLASTP analysis with defaults of were first germinated on selection plates containing either 65 wordlength (W) of 3, an expectation (E) of 10, and the BLO kanamycin or sulfonamide. The seeds were sterilized by a SUM62 scoring matrix Henikoff & Henikoff (1989) Proc. two minute ethanol treatment followed by 20 minutes in 30% Natl. Acad. Sci. USA 89:10915)*.

US 8,633,353 B2 47 48 TABLE 2- Continued Conserved domains of G481 and closely related sequences

Percent Percent identity to identity to the G481 G481 B protein: domaink (number of (number of identical identical Species / residues per residues per GID No. the total the total Accession number of Domain in B Domain number of SEQ ID No. or residues Amino Acid SEO ID residues NO: Identifier compared) Coordinates B Domain NO : compared) 46 Os/G3 429 42& 37-125 TNAELPMANLWRLIKKWLPGK 69 43.5% (41/97) AKIGGAAKGLTHDCAWEFWG (37/85) FWGDEASEKAKAEHRRTWAPE DYLGSFGDLGFDRYWDPMDA YIHGYRE

Species abbreviations: At-Arabidopsis thaliana; Gm-Glycine max; OS- Oryza sativa; Zm- Zea mays

The results of various water deficit-related assays are tolerance to dehydration treatment, or one line of several shown in Table 3. Of the twenty-three sequences that are 25 tested showed greater tolerance in Soil-based drought assays, phylogenetically and closely related to G482 and that have than controls. Ectopic expression with other promoter-gene been overexpressed in plants, seventeen have demonstrated combinations has not been tested at Mendel Biotechnology. the ability to confer increased tolerance to water deprivation Three sequences that are considered to fall within the G482 (water deficit). These positive results were obtained even subclade, when expressed under the control of the 35S pro though the expression levels for these sequences that would 30 moter, have not yet demonstrated the ability to confer best confer water deprivation tolerance have not yet been increased tolerance to water deprivation. These include Soy optimized. sequences G3474, G3475, and G3478 (SEQID NOs: 24, 16, Three additional polypeptide sequences, G3875 (SEQ ID and 26, respectively). However, ectopic expression with other NO: 8), G3397 (SEQID NO:36), and G3395, (SEQID NO: promoter-gene combinations, and Soil-based drought assays, 38), produced weaker, yet positive results in water deficit 35 have not yet been tested with the plants overexpressing any of assays as two of ten overexpressing lines showed greater these sequences. TABLE 3 G482-Subclade sequences overexpressed in plants that have shown improved water-deficit tolerance relative to control plants Greater tolerance than controls to: Hyper At least osmotic Gene SEQ One Water StreSS Identifier ID NO: Expression deficit (mannitol, Dehydration (Species) of GID system assay Salt PEG) Sucrose Heat or drought

G481 2 35S ------(At) LTP1 -- ind -- -- nic -- RBCS3 -- -- SUC2 -- -- ARSK1 -- -- RD29A -- -- GALA. C------terminal fusion GFP C-terminal ------nic -- fusion G3470 4 3SS ------(Gm) G3471 6 3SS -- -- ** (Gm) G3875 8 35S * ind nic * (Gm) G3475 16 3SS (Gm) G3876 10 3 SS -- ind nic nic -- (Zm) G3434 12 35S ------US 8,633,353 B2 49 50 TABLE 3-continued G482-Subclade sequences overexpressed in plants that have shown improved water-deficit tolerance relative to control plants Greater tolerance than controls to: Hyper At least osmotic Gene SEQ One water StreSS Identifier ID NO: Expression deficit (mannitol, Dehydration (Species) of GID system assay Salt PEG) Sucrose Heat or drought

G1364 14 3SS -- -- (At) RBCS3 -- -- RD29A -- -- CFP C- -- nic ind ind -- terminal fusion G485 18 3SS ------(At) AS1 -- nic ind ind ind -- CFP C- -- ind ind -- terminal fusion G3476 20 35S -- -- (Gm) G2345 22 35S -- -- (At) G3474 24 3SS (Gm) G3478 26 35S (Gm) G482 28 3SS ------(At) G343S 30 35S -- -- (Zm) G3472 32 35S -- -- (Gm) G3436 34 35S -- -- (Zm) G3397 36 3SS * * (Os) G3395 38 35S * * (Os) G3398 40 3SS -- -- (Os) G3866 42, GAL4 C- -- * * * (Zm) 115 terminal fusion G.3396 44 GAL4 C- -- -- (Os) terminal fusion G3429 46 35S -- -- (Os) Species abbreviations: At Arabidopsis thaiana; Gm—Glycine max; Os—Oryza Saiiva; Zm—Zea mayS + demonstrated greater water deficit tolerance than controls phenotype in more than two lines +* demonstrated greater water deficit tolerance than controls in two lines +** also reported in US patent application US2005.0022266 with constitutive overexpression in wilt, WUE and/or field assays (in US2005.0022266, G3866 = SEQID NO: 2, G3471 = SEQID NO: 6)

55 Sixteen of the G482 subclade sequences were found to TABLE 4 confer increased cold tolerance during the germination or growth of Arabidopsis plants when the sequences were over G482-Subclade sequences overexpressed in plants that have shown expressed (Table 4). Of these, G3471. G3435 and G3866 improved tolerance to cold conditions relative to control plants produced one or two lines that appeared to be more tolerant to 60 Gene More tolerant to cold cold. Five G482 subclade sequences, including Arabidopsis Identifier SEQID NO: during germination sequence G482, soy sequences G3472. G3474, G3478, and (Species) of GID Expression system and/or growth rice G3395 and G3398, have not yet demonstrated the ability G481 (At) 2 35S -- to confer increased tolerance to cold during growth or germi AS1 -- nation. The rice G3429 sequence that did not conferimproved 65 LTP1 -- cold tolerance, and while related to G481, is considered to be RBCS3 -- outside of the G482 subclade. US 8,633,353 B2 51 52 TABLE 4-continued promoter operably linked to the polynucleotide. The cloning vector may be introduced into a variety of plants by means G482-Subclade sequences overexpressed in plants that have shown well known in the art such as, for example, direct DNA improved tolerance to cold conditions relative to control plants transfer or Agrobacterium tumefaciens-mediated transforma Gene More tolerant to cold 5 tion. It is now routine to produce transgenic plants using most Identifier SEQID NO: during germination eudicot plants (see Weissbach and Weissbach, 1989, Gelvinet (Species) of GID Expression system and/or growth al., 1990; Herrera-Estrella et al., 1983; Bevan, 1984; and SUC2 -- Klee, 1985). Methods for analysis of traits are routine in the GFP C-term fusion -- art and examples are disclosed above. G3470 (Gm) 4 35S -- 10 Numerous protocols for the transformation of tomato and G3471 (Gm) 6 35S * G3875 (Gm) 8 35S -- soy plants have been previously described, and are well G3876 (Zm) 10 35S -- known in the art. Gruber et al., 1993, and Glick and Thomp G3434 (Zm) 12 35S -- son, 1993, describe several expression vectors and culture G1364 (At) 14 RD29A -- methods that may be used forcellor tissue transformation and RBCS3 -- 15 Subsequent regeneration. For soybean transformation, meth G3475 (Gm) 16 35S -- G485 (At) 18 35S -- ods are described by Miki et al., 1993; and U.S. Pat. No. AS1 -- 5,563,055 to Townsend and Thomas, 1996. CFP C-term fusion -- There are a substantial number of alternatives to Agrobac RBCS3 -- RD29A -- terium-mediated transformation protocols, other methods for G3476 (Gm) 2O 35S -- the purpose of transferring exogenous genes into Soybeans or G2345 (At) 22 35S -- tomatoes. One Such method is microprojectile-mediated G3474 (Gm) 24 35S transformation, in which DNA on the surface of microprojec G3478 (Gm) 26 35S G482 (At) 28 35S tile particles is driven into plant tissues with a biolistic device G3435 (Zm) 30 35S * (see, for example, Sanford et al., 1987: Christou et al., 1992: G3472 (Gm) 32 35S 25 Sanford, 1993; Klein et al., 1987; U.S. Pat. No. 5,015,580 to G3436 (Zm) 34 35S -- Christou et al., 1991; and U.S. Pat. No. 5,322,783 to Tomes et G3397 (Os) 36 35S -- G3395 (Os) 38 35S al., 1994). G3398 (Os) 40 35S Alternatively, Sonication methods (see, for example, G3866 (Zm) 42 35S * Zhang et al., 1991); direct uptake of DNA into protoplasts G3396 (Os) 44 35S -- 30 using CaCl precipitation, polyvinyl alcohol or poly-L-orni G3429 (Os) 46 35S thine (see, for example, Hain et al., 1985; Draper et al., 1982); Species abbreviations: At Arabidopsis thaiana; Gm—Glycine max; Os—Oryza Saiva; liposome or spheroplast fusion (see, for example, Deshayes et Zm—Zea mayS + demonstrated greater cold tolerance than controls in more than two lines al., 1985); Christou et al., 1987); and electroporation of pro + demonstrated greater cold tolerance in two lines than controls or substantially stronger toplasts and whole cells and tissues (see, for example, Donn cold tolerance than controls in at least one line 35 et al., 1990; DHalluin et al., 1992; and Spencer et al., 1994) +** greater cold tolerance than controls observed in one line have been used to introduce foreign DNA and expression vectors into plants. Example V After a plant or plant cell is transformed (and the latter regenerated into a plant), the transformed plant may be Utilities of G482 Subclade and 40 crossed with itself or a plant from the same line, a non Phylogenetically-Related Sequences transformed or wild-type plant, or another transformed plant from a different transgenic line of plants. Crossing provides The data obtained for water deprivation and cold tolerance the advantages of producing new and often stable transgenic in the above Examples indicate that G482 and related varieties. Genes and the traits they confer that have been sequence overexpression can directly result in improved tol 45 introduced into a tomato or soybean line may be moved into erance to cold, heat, Salinity, water deficit conditions and/or distinct line of plants using traditional backcrossing tech freezing, as well as improved yield, quality, appearance, niques well known in the art. Transformation of tomato plants growth range, and/or water usage of crop plants, ornamental may be conducted using the protocols of Koornneef et al., plants, and woody plants used in the food, ornamental, paper, 1986, and in U.S. Pat. No. 6,613,962 to Vos et al., 2003, the pulp, lumber or other industries (for example, industries 50 latter method described in briefhere. Eight day old cotyledon involved in bioremediation, or carbon sequestration). explants are precultured for 24 hours in Petri dishes contain ing a feeder layer of Petunia hybrida Suspension cells plated Example VI on MS medium with 2% (w/v) sucrose and 0.8% agar supple mented with 10 uM C.-naphthalene acetic acid and 4.4 uM Transformation of Eudicots to Produce Increased 55 6-benzylaminopurine. The explants are then infected with a Yield and/or Abiotic Stress Tolerance diluted overnight culture of Agrobacterium tumefaciens con taining an expression vector comprising a polynucleotide of Crop species that overexpress polypeptides of the inven the invention for 5-10 minutes, blotted dry on sterile filter tion may produce plants with increased water deprivation, paper and cocultured for 48 hours on the original feeder layer cold and/or nutrient tolerance and/or yield in both stressed 60 plates. Culture conditions are as described above. Overnight and non-stressed conditions. Thus, polynucleotide sequences cultures of Agrobacterium tumefaciens are diluted in liquid listed in the Sequence Listing recombined into, for example, MS medium with 2% (w/v/) sucrose, pH 5.7) to an ODoo of one of the expression vectors of the invention, or another O8. Suitable expression vector, may be transformed into a plant Following cocultivation, the cotyledon explants are trans for the purpose of modifying plant traits for the purpose of 65 ferred to Petri dishes with selective medium comprising MS improving yield, appearance, and/or quality. The expression medium with 4.56 uM Zeatin, 67.3 uM Vancomycin, 418.9 vector may contain a constitutive, tissue-specific or inducible uM cefotaxime and 171.6 uM kanamycin sulfate, and cul US 8,633,353 B2 53 54 tured under the culture conditions described above. The DNA transfer or Agrobacterium tumefaciens-mediated trans explants are subcultured every three weeks onto fresh formation. The latter approach may be accomplished by a medium. Emerging shoots are dissected from the underlying variety of means, including, for example, that of U.S. Pat. No. callus and transferred to glass jars with selective medium 5,591.616 to Hiei, 1997, in which monocotyledon callus is without Zeatin to form roots. The formation of roots in a transformed by contacting dedifferentiating tissue with the kanamycin Sulfate-containing medium is a positive indica Agrobacterium containing the cloning vector. tion of a successful transformation. The sample tissues are immersed in a suspension of 3x10 Transformation of soybean plants may be conducted using cells of Agrobacterium containing the cloning vector for 3-10 the methods foundin, for example, U.S. Pat. No. 5,563,055 to minutes. The callus material is cultured on Solid medium at Townsend et al., 1996), described in briefhere. In this method 10 25°C. in the dark for several days. The calli grown on this Soybean seed is Surface sterilized by exposure to chlorine gas medium are transferred to Regeneration medium. Transfers evolved in a glass belljar. Seeds are germinated by plating on are continued every 2-3 weeks (2 or 3 times) until shoots /10 strength agar Solidified medium without plant growth develop. Shoots are then transferred to Shoot-Elongation regulators and culturing at 28°C. with a 16 hour day length. medium every 2-3 weeks. Healthy looking shoots are trans After three or four days, seed may be prepared for cocultiva 15 ferred to rooting medium and after roots have developed, the tion. The Seedcoat is removed and the elongating radicle plants are placed into moist potting soil. removed 3-4 mm below the cotyledons. The transformed plants are then analyzed for the presence Overnight cultures of Agrobacterium tumefaciens harbor of the NPTII gene/kanamycin resistance by ELISA, using the ing the expression vector comprising a polynucleotide of the ELISA NPTII kit from 5Prime-3Prime Inc. (Boulder, Colo.). invention are grown to log phase, pooled, and concentrated by It is also routine to use other methods to produce transgenic centrifugation. Inoculations are conducted in batches Such plants of most cereal crops (Vasil, 1994) such as corn, wheat, that each plate of seed was treated with a newly resuspended rice, sorghum (Cassas et al., 1993), and barley (Wan and pellet of Agrobacterium. The pellets are resuspended in 20 ml Lemeaux, 1994). DNA transfer methods such as the micro inoculation medium. The inoculum is poured into a Petridish projectile method can be used for corn (Fromm et al., 1990; containing prepared seed and the cotyledonary nodes are 25 Gordon-Kamm et al., 1990; Ishida, 1996, wheat (Vasil et al., macerated with a surgical blade. After 30 minutes the explants 1992; Vasil et al., 1993: Weeks et al., 1993), and rice (Chris are transferred to plates of the same medium that has been tou, 1991; Hiei et al., 1994; Aldemita and Hodges, 1996; and solidified. Explants are embedded with the adaxial side up Hiei et al., 1997). For most cereal plants, embryogenic cells and level with the surface of the medium and cultured at 22 derived from immature scutellum tissues are the preferred C. for three days under white fluorescent light. These plants 30 cellular targets for transformation (Hiei et al., 1997: Vasil, may then be regenerated according to methods well estab 1994). Fortransforming cornembryogenic cells derived from lished in the art, such as by moving the explants after three immature scutellar tissue using microprojectile bombard days to a liquid counter-selection medium (see U.S. Pat. No. ment, the A188XB73 genotype is the preferred genotype 5,563,055 to Townsend et al., 1996). (Fromm et al., 1990; Gordon-Kamm et al., 1990). After The explants may then be picked, embedded and cultured 35 microprojectile bombardment the tissues are selected on in solidified selection medium. After one month on selective phosphinothricinto identify the transgenic embryogenic cells media transformed tissue becomes visible as green sectors of (Gordon-Kamm et al., 1990). Transgenic plants are regener regenerating tissue against a background of bleached, less ated by Standard corn regeneration techniques (Fromm et al., healthy tissue. Explants with green sectors are transferred to 1990; Gordon-Kamm et al., 1990). an elongation medium. Culture is continued on this medium 40 with transfers to fresh plates every two weeks. When shoots Example VIII are 0.5 cm in length they may be excised at the base and placed in a rooting medium. Expression and Analysis of Increased Yield or Abiotic Stress Tolerance in Non-Arabidopsis Species Example VII 45 It is expected that structurally similar orthologs of the Transformation of Monocots to Produce Increased G482 Subclade of polypeptide sequences, including those Yield or Abiotic Stress Tolerance found in the Sequence Listing, can confer increased yield or increased tolerance to water deprivation and cold during ger Cereal plants such as, but not limited to, corn, wheat, rice, 50 mination or growth relative to control plants. As sequences of Sorghum, or barley, or grasses such as, for example, Miscant the invention have been shown to improve stress tolerance in hus or Switchgrass, may be transformed with the present a variety of plant species, it is also expected that these polynucleotide sequences, including monocot or eudicot-de sequences will increase yield, appearance and/or quality of rived sequences such as those presented in the present Tables, crop or other commercially important plant species. cloned into a vector Such as pGA643 and containing a kana 55 Northern blot analysis, RT-PCR or microarray analysis of mycin-resistance marker, and expressed constitutively under, the regenerated, transformed plants may be used to show for example, the CaMV 35S or COR15 promoters, or with expression of a polypeptide or the invention and related genes tissue-specific or inducible promoters. The expression vec that are capable of inducing abiotic stress tolerance, and/or tors may be one found in the Sequence Listing, or any other larger size. suitable expression vector may be similarly used. For 60 After a eudicot plant, monocot plant or plant cell has been example, pMENO20 may be modified to replace the NptII transformed (and the latter regenerated into a plant) and coding region with the BAR gene of Streptomyces hygro shown to have greater size, improved planting density, that is, scopicus that confers resistance to phosphinothricin. The able to tolerate greater planting density with a coincident KpnI and BglII sites of the Bar gene are removed by site increase in yield, or tolerance to abiotic stress, or produce directed mutagenesis with silent codon changes. 65 greater yield relative to a control plant under the stress con The cloning vector may be introduced into a variety of ditions, the transformed monocot plant may be crossed with cereal plants by means well known in the art including direct itself or a plant from the same line, a non-transformed or US 8,633,353 B2 55 56 wild-type monocot plant, or another transformed monocot Bechtold and Pelletier (1998) Methods Mol. Biol. 82: 259 plant from a different transgenic line of plants. 266 The function of specific polypeptides of the invention, Ben-Naim et al. (2006) Plant J. 46: 462-476 including closely-related orthologs, have been analyzed and Berger and Kimmel (1987), “Guide to Molecular Cloning may be further characterized and incorporated into crop 5 Techniques', in Methods in Enzymology, Vol. 152, Aca plants. The ectopic overexpression of these sequences may be demic Press, Inc., San Diego, Calif. regulated using constitutive, inducible, or tissue specific Bevan (1984) Nucleic Acids Res. 12:8711-8721 regulatory elements. Genes that have been examined and Bhattachajee et al. (2001) Proc. Natl. Acad. Sci. USA 98: have been shown to modify plant traits (including increasing 13790-13795 yield and/or abiotic stress tolerance) encode polypeptides 10 Biet al. (1997).J. Biol. Chem. 272: 26562-26572 found in the Sequence Listing. In addition to these sequences, Bird et al. (1988) Plant Mol. Biol. 11: 651-662 it is expected that newly discovered polynucleotide and Borevitz et al. (2000) Plant Cell 12: 2383-2393 polypeptide sequences closely related to polynucleotide and Boss and Thomas (2002) Nature, 416: 847-850 polypeptide sequences found in the Sequence Listing can also Bruce et al. (2000) Plant Cell 12: 65-79 confer alteration of traits in a similar manner to the sequences 15 Buchel et al. (1999) Plant Mol. Biol. 40:387-396 found in the Sequence Listing, when transformed into any of Bucher and Trifonov (1988).J. Biomol. Struct. Dyn. 5: 1231 a considerable variety of plants of different species, and 1236 including eudicots and monocots. The polynucleotide and Bucher (1990).J. Mol. Biol. 212:563-578 polypeptide sequences derived from monocots (e.g., the rice Cassas et al. (1993) Proc. Natl. Acad. Sci. USA 90: 11212 sequences) may be used to transform both monocot and eud 11216 icot plants, and those derived from eudicots (e.g., the Arabi Casson and Lindsey (2006) Plant Physiol. 142: 526-541 dopsis and Soy genes) may be used to transform either group, Chase et al. (1993) Ann. Missouri Bot. Gard. 80: 528-580 although it is expected that Some of these sequences will Cheikh et al. (2003) U.S. Pat. Application No. 20030101479, function best if the gene is transformed into a plant from the published May 29, 2003 same group as that from which the sequence is derived. 25 Christou et al. (1987) Proc. Natl. Acad. Sci. USA 84: 3962 As an example of a first step to determine water depriva 3966 tion-related tolerance, seeds of these transgenic plants may be Christou (1991) Bio/Technol. 9:957-962 Subjected to germination assays to measure Sucrose sensing, Christou et al. U.S. Pat. No. 5,015,580, issued May 14, 1991 severe desiccation, freezing or drought. The methods for Christou et al. (1992) Plant. J. 2: 275-281 Sucrose sensing, severe desiccation or drought assays are 30 Coupland (1995) Nature 377: 482-483 described above. Plants overexpressing sequences of the Coustry et al. (1995).J. Biol. Chem. 270: 468-475 invention may be found to be more tolerant to high sucrose by Coustry et al. (1996).J. Biol. Chem. 271: 14485-14491 having better germination, longer radicles, and more cotyle Coustry et al. (1998) Biochem J. 331(Pt 1): 291-297 don expansion. Coustry et al. (2001).J. Biol. Chem. 276: 40621-40630 Plants that are more tolerant than controls to cold or water 35 DHalluin et al. (1992) Plant Cell 4: 1495-1505 deprivation assays will generally have better survival rates Daly et al. (2001) Plant Physiol. 127: 1328-1333 than controls, or will recover better from these treatments De Blaere et al. (1987) Meth. Enzymol. 143:277) than control plants. Therefore, the G482 subclade sequences De Wet, PCT Pat. Publication No. WO8501856, published of the invention may also be used to contribute to increased May 9, 1985 yield of commercially available plants. 40 Deshayes et al. (1985) EMBO.J. 4: 2731-2737 It is expected that the same methods may be applied to Doolittle, ed. (1996) Methods in Enzymology, Vol. 266: identify other useful and valuable sequences of the present “Computer Methods for Macromolecular Sequence polypeptide clades, and the sequences may be derived from a Analysis' Academic Press, Inc., San Diego, Calif., USA diverse range of eudicot and monocot species of plants. Donn et al. (1990) in Abstracts of VIIth International Con 45 gress on Plant Cell and Tissue Culture IAPTC, A2-38: 53 REFERENCES CITED Draper et al. (1982) Plant Cell Physiol. 23: 451–458 Eddy (1996) Curr. Opin. Str. Biol. 6: 361-365 Ainley et al. (1993) Plant Mol. Biol. 22:13–23 Edwards et al. (1998) Plant Physiol. 117: 1015-1022. Aldemita and Hodges (1996) Planta 199: 612-617 Ely, et al. U.S. Pat. No. 5,837,848, issued Nov. 17, 1998 Alia et al. (1998) Plant J. 16: 155-161 50 Eisen (1998) Genome Res. 8: 163-167 Altschul (1990).J. Mol. Biol. 215: 403-410 Endege, et al., U.S. Pat. No. 6,262,333, issued Jul. 17, 2001 Altschul (1993).J. Mol. Evol. 36: 290-300 Feng and Doolittle (1987).J. Mol. Evol. 25: 351-360 An et al. (1988) Plant Physiol. 88: 547-552 Fraley et al. (1983) Proc. Natl. Acad. Sci. USA 80: 4803-4807 Anderson and Young (1985) “Quantitative Filter Hybridisa Forsburg and Guarente (1989) Genes Dev. 3: 1166-1178 tion'. In: Hames and Higgins, ed., Nucleic Acid Hybridi 55 Fowler and Thomashow (2002) Plant Cell 14: 1675-1690 sation, A Practical Approach. Oxford, IRL Press, 73-111 Fromm et al. (1985) Proc. Natl. Acad. Sci. 82: 5824-5828 Arents and Moudrianakis (1995) Proc. Natl. Acad. Sci. USA Fromm et al. (1989) Plant Cell 1:977-984 92: 1117O-11174. Fromm et al. (1990) Bio/Technol. 8: 833-839 Ausubel et al. (1997-2001; eds.) Current Protocols in Fu et al. (2001) Plant Cell 13: 1791-1802 Molecular Biology, John Wiley & Sons (1997 and supple 60 Gan and Amasino (1995) Science 270: 1986-1988 ments through 2001 Gatz (1997) Annu. Rev. Plant Physiol. Plant Mol. Biol. 48: Ausubel et al. (1997) Short Protocols in Molecular Biology, 89-108 John Wiley & Sons, New York, N.Y., unit 7.7 Gaxiola et al. (2001) Proc. Natl. Acad. Sci. USA 98: 11444 Baerson et al. (1993) Plant Mol. Biol. 22:255-267 11449 Baerson et al. (1994) Plant Mol. Biol. 26: 1947-1959 65 Gelvin et al. (1990) Plant Molecular Biology Manual. Klu Bairochet al. (1997) Nucleic Acids Res. 25: 217-221 wer Academic Publishers Baumann et al., (1999) Plant Cell 11: 323-334 Gilmour et al. (1998) Plant J. 16:433-442 US 8,633,353 B2 57 58 Gish and States (1993) Nature Genetics 3: 266-272 Liu and Zhu (1997) Proc. Natl. Acad. Sci. USA 94: 14960 Glick and Thompson, eds. (1993) Methods in Plant Molecu 14964 lar Biology and Biotechnology. CRC Press. Boca Raton, Lotan et al. (1998) Cell 93: 1195-1205. Fla. Luger et al. (1997) Nature 389: 251-260 Goodrich et al. (1993) Cell 75: 519-530 Maity and de Crombrugghe (1998) Trends Biochem. Sci. 23: Gordon-Kamm et al. (1990) Plant Cell 2: 603-618 174-178 Gruber et al. (1993) in Methods in Plant Molecular Biology Mandel (1992a) Nature 360:273-277 and Biotechnology, p. 89-119 Mandelet al. (1992b) Cell 71-133-143 Guevara-Garcia (1998) Plant Mol. Biol. 38: 743-753 Manners et al. (1998) Plant Mol. Biol. 38: 1071-1080 Gusmaroliet al. (2001) Gene 264: 173-185 10 Mantovani (1999) Gene 239: 15-27 Gusmaroliet al. (2002) Gene 283: 41-48 Mariani et al., U.S. Pat. No. 5,792,929, issued Aug. 11, 1998 Haake et al. (2002) Plant Physiol. 130: 639-648 McCue and Hanson (1990) Trends Biotechnol. 8:358-362 Hain et al. (1985) Mol. Gen. Genet. 199: 161-168 McNabb et al. (1995) Genes Dev. 9: 47-58 Hall et al. (2000) Plant Physiol. 123: 1449-1458 McNabb et al. (1997) Mol. Cell. Biol. 17: 7008-7018 Hauptmann, et al., U.S. Pat. No. 5,618.988, issued Apr. 8, 15 Meinke (1992) Science 258: 1647-1650 1997 Meinke et al. (1994) Plant Cell 6: 1049-1064 Hawkins et al. U.S. Pat. No. 5,840,544, issued Nov. 24, 1998 Miki et al. (1993) in Methods in Plant Molecular Biology and Haymes et al., Nucleic Acid Hybridization: A Practical Biotechnology, p. 67-88, Glick and Approach, IRL Press, Washington, D.C. (1985) Thompson, eds. CRC Press, Inc., Boca Raton He et al. (2000) Transgenic Res. 9; 223-227 Miyoshi et al. (2003) Plant J. 36: 532-540 Heard et al., U.S. Pat. No. 7,135,616, issued Nov. 14, 2006 Mount (2001), in Bioinformatics. Sequence and Genome Hein (1990) Methods Enzymol. 183: 626-645 Analysis, Cold Spring Harbor Laboratory Press, Cold Henikoff & Henikoff (1989) Proc. Natl. Acad. Sci. USA Spring Harbor, N.Y., p. 543 89:10915 Müller et al. (2001) Plant J. 28: 169-179 Henikoff and Henikoff (1991) Nucleic Acids Res. 19: 6565 25 Meyers (1995) Molecular Biology and Biotechnology, Wiley 6572 VCH, New York, N.Y., p856-853 Herrera-Estrella et al. (1983) Nature 303: 209 Nandi et al. (2000) Curr. Biol. 10: 215-218 Hiei et al. (1994) Plant J. 6:271-282 Olesen and Guarente (1990) Genes Dev. 4: 1714-1729 Hiei et al. (1997) Plant Mol. Biol. 35:205-218 Ohlet al. (1990) Plant Cell 2: 837-848 Hiei et al., U.S. Pat. No. 5,591,616, issued Jan. 7, 1997 30 Odellet al. (1985) Nature 313: 810-812 Higgins et al. (1996) Methods Enzymol. 266: 383-402 Odellet al. (1994) Plant Physiol. 106: 447-458 Hillman et al., U.S. Pat. Appl. No. 20010010913, published Parcy et al. (1997) Plant Cell 9: 1265-1277 Aug. 2, 2001 Peng et al. (1997) Genes Development 11:3194-3205 Hohnet al. (1982) Molecular Biology of Plant Tumors, Aca Peng et al. (1999) Nature 400: 256-261 demic Press, New York, N.Y., pp. 549-560; US 35 Ratcliffe et al. (2001) Plant Physiol. 126: 122-132 Horsch et al. (1984) Science 233: 496-498 Reeves and Nissen (1995) Prog. Cell Cycle Res. 1: 339-349 U.S. Pat. No. 4,943,674 Riechmann et al. (2000a) Science 290: 2105-2110 Houck and Pear, U.S. Pat. No. 4,943,674, issued Jul. 24, 1990 Riechmann and Ratcliffe (2000b) Curr. Opin. Plant Biol. 3: Howell. U.S. Pat. No. 4,407,956, issued Oct. 4, 1983 423-434 Ishida et al. (1996) Nature Biotechnol. 14:745-750 40 Rieger et al. (1976) Glossary of Genetics and Cytogenetics. Jaglo et al. (2001) Plant Physiol. 127: 910-917 Classical and Molecular, 4th ed., Springer Verlag, Berlin Jang et al. (1997) Plant Cell 9:5-19 Ringli and Keller (1998) Plant Mol. Biol. 37: 977-988 Jiang et al., U.S. Pat. Appl. No. 2004.0128712, published Jul. Robson et al. (2001) Plant J. 28: 619-631 1, 2004 Romier et al. (2003).J. Biol. Chem. 278: 1336-1345 Jiang et al., U.S. Pat. Appl. No. 20070033671, published Feb. 45 Sadowski et al. (1988) Nature 335: 563-564 8, 2007 Kaiser et al. (1995) Plant Mol. Biol. 28: 231-243 Saijo et al. (2000) Plant J. 23:319-327 Kashima et al. (1985) Nature 313:402-404 Saleki et al. (1993) Plant Physiol. 101: 839-845 Kellogg, et al., U.S. Pat. No. 5,783,393, issued Jul. 21, 1998 Sambrook et al. (1989) Molecular Cloning. A Laboratory Kim et al. (1996) Mol. Cell. Biol. 16: 4003-4013 Manual, 2nd Ed., Cold Spring Harbor Laboratory, Cold Kim et al. (2001) Plant J. 25: 247-259 50 Spring Harbor, N.Y. Kim and Sheffrey (1990).J. Biol. Chem. 265: 13362-13369 Sanford et al. (1987) Part. Sci. Technol. 5:27-37 Kimmel (1987) Methods Enzymol. 152:507-511 Sanford (1993) Methods Enzymol. 217:483-509 Klee (1985) Bio/Technology 3: 637-642 Schaffner and Sheen (1991) Plant Cell 3: 997-1012 Klein et al. (1987) Nature 327: 70-73 Sherman et al., PCT publication WO2004076638, published Klein et al., 1987; U.S. Pat. No. 4,945,050, issued Jul. 31, 55 Sep. 10, 2004 1990 Shi et al. (1998) Plant Mol. Biol. 38: 1053-1060 Koornneef et al. (1986) In Tomato Biotechnology: Alan R. Shpaer (1997) Methods Mol. Biol. 70: 173-187 Liss, Inc., 169-178 Siebertz et al. (1989) Plant Cell 1: 961-968 Ku et al. (2000) Proc. Natl. Acad. Sci. USA 97: 9121-9126 Sinha et al. (1996) Mol. Cell. Biol. 16:328-337 Kuhlemeier et al. (1989) Plant Cell 1: 471-478 60 Smeekens (1998) Curr. Opin. Plant Biol. 1: 230-234 Kwong (2003) Plant Cell 15:5-18 Smith et al. (1992) Protein Engineering 5:35-51 Kyozuka and Shimamoto (2002) Plant Cell Physiol. 43: 130 Smith and Waterman (1981) Adv. Appl. Math. 2: 482-489 135 Soltis et al. (1997) Ann. Missouri Bot. Gard. 84: 1-49 Lee et al. (2003) Proc. Natl. Acad. Sci. USA 100: 2152-2156. Sonnhammer et al. (1997) Proteins 28: 405-420 Leon-Kloosterziel et al. (1996) Plant Physiol. 110: 233-240 65 Spencer et al. (1994) Plant Mol. Biol. 24: 51-61 Liet al. (1992) Nucleic Acids Res. 20: 1087-1091 Suzuki et al. (2001) Plant J. 28: 409–418 Lin et al. (1991) Nature 353: 569-571 Tasanen et al. (1992).J. Biol. Chem. 267: 11513-11519 US 8,633,353 B2 59 60 Thomas et al., U.S. Pat. No. 5,905,186, issued May 18, 1999 Weeks et al. (1993) Plant Physiol. 102: 1077-1084 Thompson et al. (1994) Nucleic Acids Res. 22:4673-4680 Weigel and Nilsson (1995) Nature 377: 482-500 Tomes et al., U.S. Pat. No. 5,322,783, issued Jun. 21, 1994 Weissbach and Weissbach (1989) Methods for Plant Molecu Tomes et al. U.S. Pat. No. 5,773,697, issued Jun. 30, 1998 lar Biology, Academic Press 5 Wenkel et al. (2006) Plant Cell. 18: 2971-2984 Townsend et al., U.S. Pat. No. 5,563,055, issued Oct. 8, 1996 West et al. (1994) Plant Cell 6: 1731-1745 Tudge (2000) in The Variety of Life, Oxford University Press, Willmott et al. (1998) Plant Molec. Biol. 38: 817-825 New York, N.Y. pp. 547-606 Wu et al. (1996) Plant Cell 8:617-627 van der Kop et al. (1999) Plant Mol. Biol. 39:979-990 Xin and Browse (1998) Proc. Natl. Acad. Sci. USA95: 7799 Vasil et al. (1992) Bio/Technol. 10:667-674 7804 Vasil et al. (1993) Bio/Technol. 11:1553-1558 10 Xing. et al. (1993) EMBO.J. 12: 4647-4655 Vasil (1994) Plant Mol. Biol. 25: 925-937 Xiong et al. (2001) Genes Dev. 15: 1971-1984. Vicient et al. (2000).J. Exp. Bot. 51:995-1003 Xu et al. (2001) Proc. Natl. Acad. Sci. USA 98: 15089-15094 Vos et al. U.S. Pat. No. 6,613,962, issued Sep. 2, 2003 Zemzoumi et al. (1999).J. Mol. Biol. 286: 327-337 Wahl and Berger (1987) Methods Enzymol. 152:399-407 Zhang et al. (1991) Bio/Technology 9: 996-997 Wan and Lemeaux (1994) Plant Physiol. 104:37-48 Zhu et al. (1998) Plant Cell 10: 1181-1191

SEQUENCE LISTING

<16 Os NUMBER OF SEO ID NOS : 116

<21 Os SEQ ID NO 1 &211s LENGTH: 832 &212s. TYPE: DNA <213> ORGANISM: Arabidopsis thaliana 22 Os. FEATURE: 223 OTHER INFORMATION: G481

<4 OOs SEQUENCE: 1 gag.cgtttcg tagaaaaatt cqatttct ct aaag.ccctaa aactaaaacg act atc.ccca 60 att CCaagtt Ctagggittt C Catct tcCCC aatctagtat aaatggcgga tacgc.ctt C9 120 agcc.cagctg gagatggcgg agaaag.cggc ggttc.cgitta gggagcagga togatacctt 18O

Cct at agcta at at Cagcag gat catgaag aaag.cgttgc ct cotaatgg talagattgga 24 O

aaagatgcta aggatacagt to aggaatgc git citctgagt to at cagott catcactago 3 OO gaggc.ca.gtg at aagtgtca aaaagagaaa aggaaaactg. taatggtga tigatttgttg 360

tgggcaatgg caacattagg atttgaggat tacctggaac ct ctaaagat at acct agcg 42O agg tacaggg agttggaggg tataataag ggat caggaia agagtggaga tiggatcaaat 48O

agagatgctg gtggcggtgt ttctggtgaa gaaatgc.cga gctggtaaaa gaagttgcaa. 54 O

gtagtgatta agaacaatcg ccaaatgat C aagggaaatt agagat cagt gagttgttta 6 OO

tagttgagct gatcgacaac tattt cqggit ttact ct caa titt.cggittat gttagtttga 660 acgtttggitt tattgtttcc ggitttagttg gttg tattta aagatttctic tdttagatgt 72O

tgagaac act togaatgaagg aaaaatttgt ccacatcct g ttgttattitt cattcactt 78O

toggaattitc at agctaatt tatt ct catt taataccaaa to cittaaatt aa 832

<21 Os SEQ ID NO 2 &211s LENGTH: 141 212s. TYPE: PRT <213> ORGANISM: Arabidopsis thaliana 22 Os. FEATURE: <223> OTHER INFORMATION: G481 polypeptide

<4 OOs SEQUENCE: 2 Met Ala Asp Thr Pro Ser Ser Pro Ala Gly Asp Gly Gly Glu Ser Gly 1. 5 1O 15

Gly Ser Val Arg Glu Glin Asp Arg Tyr Lieu Pro Ile Ala Asn. Ile Ser 2O 25 3 O

Arg Ile Met Lys Lys Ala Lieu Pro Pro Asn Gly Lys Ile Gly Lys Asp US 8,633,353 B2 61 62 - Continued

35 4 O 45

Ala Lys Asp Thr Val Glin Glu. Cys Val Ser Glu Phe Ile Ser Phe Ile SO 55 6 O

Thir Ser Glu Ala Ser Asp Llys Cys Glin Lys Glu Arg Thir Wall 65 70 7s 8O

Asn Gly Asp Asp Lieu. Lieu. Trp Ala Met Ala Thr Lell Gly Phe Glu Asp 85 90 95

Tyr Lieu. Glu Pro Lieu Lys Ile Tyr Lieu Ala Arg Arg Glu Lieu. Glu 1OO 105 11 O

Gly Asp Asn Lys Gly Ser Gly Lys Ser Gly Asp Gly Ser Asn Arg Asp 115 12 O 125

Ala Gly Gly Gly Val Ser Gly Glu Glu Met Pro Ser Trp 13 O 135 14 O

<210s, SEQ ID NO 3 &211s LENGTH: 555 &212s. TYPE: DNA <213> ORGANISM: Glycine max 22 Os. FEATURE: 223 OTHER INFORMATION: G3 470

<4 OOs, SEQUENCE: 3 t caccgggitt ttgagatgt cqgatgcacc ggcgagt cc.g agt cacgaga gtggtgg.cga 6 O gcagagcc ct cqcggctcgt titc.cggcgc ggctagagag Caggaccggit acct tcc Cat 12 O tgccaa.catc agcc.gcatca tdaagaaggc tictogcct coc aatggcaaga ttgcgaagga 18O tgcaaaagac acaatgcaag aatgcgttt C taattcatc agctt catta C Cagdgaggc 24 O gagtgaga aa to Cagaagg agaagaga aa gacaatcaat ggaga.cgatt tactatgggc 3OO

Catggcaact ttagggitttg aagactacat tagcc.gctt alaggtgt acc tggct aggta 360 cagagaggcg gagggtgaca Ctaaaggat.c tictagaagt ggtgatggat ctgctag acc agat Caagtt ggccttgcag gtcaaaatgc ticagottgtt Cat Cagggitt cgctgaacta tattggitttg caggtgcaac cacaa catct ggittatgcct t caatgcaa.g gcc atgaata 54 O gtttagatgc titcta 555

<210s, SEQ ID NO 4 &211s LENGTH: 174 212. TYPE: PRT <213> ORGANISM: Glycine max 22 Os. FEATURE: <223> OTHER INFORMATION: G3 470 polypeptide

<4 OOs, SEQUENCE: 4 Met Ser Asp Ala Pro Ala Ser Pro Ser His Glu Ser Gly Gly Glu Glin 1. 5 1O 15

Ser Pro Arg Gly Ser Lieu. Ser Gly Ala Ala Arg Glu Glin Asp Arg Tyr 2O 25 3O

Lieu Pro Ile Ala Asn. Ile Ser Arg Ile Met Lys Ala Luell Pro Pro 35 4 O 45

Asn Gly Lys Ile Ala Lys Asp Ala Lys Asp Thr Met Glin Glu Wall SO 55 6 O

Ser Glu Phe Ile Ser Phe Ile Thir Ser Glu Ala Ser Glu Glin 65 70 7s 8O

Lys Glu Lys Arg Llys Thir Ile Asin Gly Asp Asp Lell Lell Trp Met 85 90

Ala Thr Lieu. Gly Phe Glu Asp Tyr Ile Glu Pro Lell Wall Luell 1OO 105 11 O US 8,633,353 B2 63 64 - Continued

Ala Arg Tyr Arg Glu Ala Glu Gly Asp Thir Lys Gly Ser Ala Arg Ser 115 12 O 125

Gly Asp Gly Ser Ala Arg Pro Asp Glin Val Gly Lell Ala Gly Glin Asn 13 O 135 14 O

Ala Glin Luell Val His Glin Gly Ser Lieu. Asn Tyr Ile Gly Luell Glin Wall 145 150 155 160

Glin Pro Glin His Lieu. Wal Met Pro Ser Met Glin Gly His Glu 1.65 17O

SEO ID NO 5 LENGTH: 556 TYPE: DNA ORGANISM: Glycine max FEATURE: OTHER INFORMATION: G3 4.71

SEQUENCE: 5 gtagggitttgttgagatgtcg gatgcgc.cac cgagc.ccgac t catgagagt ggggg.cgagc 6 O agagc.ccg.cg cggttcgt.cg tcc.ggcgcga gggagcagga ccggtacctic ccgattgc.ca 12 O acat cagc.cg cattatgaag aaggctctgc ctic ccaacgg Caagattgca aaggatgc.ca 18O aaga caccat gCaggaatgc gtttctgagt t catcagctt cattaccagc gaggcgagtg 24 O agaaatgc.ca galaggagaag agaaagacaa t caatggaga cgatttgcta tgggcCatgg 3OO c cactittagg atttgaagac tacatagagc cgcttalaggt gtacctggct agg tacagag 360 aggcggaggg tgacactaaa ggatctgcta galagtggtga tggatctgct acaccagat c aagttggc ct tgcagg to aa aattct cago ttgttcatca gggttcgctg aactatattg gtttgcaggt gcaaccacaa catctggitta tgc ctitcaat gcaaa.gc.cat gaatagttta 54 O gatgcttcta cgcatc 556

SEQ ID NO 6 LENGTH: 173 TYPE : PRT ORGANISM: Glycine max FEATURE: OTHER INFORMATION: G3471 p olypeptide <4 OOs, SEQUENCE: 6

Met Ser Asp Ala Pro Pro Ser Pro Thir His Glu. Ser Gly Gly Glu Glin 1. 5 1O 15

Ser Pro Arg Gly Ser Ser Ser Gly Ala Arg Glu Glin Asp Arg Tyr Lieu. 25 3O

Pro Ile Ala Asn. Ile Ser Arg Ile Met Lys Llys Ala Lell Pro Pro Asn 35 4 O 45

Gly Lys Ile Ala Lys Asp Ala Lys Asp Thr Met Glin Glu Wall Ser SO 55 6 O

Glu Phe Ile Ser Phe Ile Thir Ser Glu Ala Ser Glu Gln Lys 65 70 7s

Glu Arg Llys Thir Ile Asn Gly Asp Asp Lieu. Lell Trp Ala Met Ala 85 90 95

Thir Luell Gly Phe Glu Asp Tyr Ile Glu Pro Lieu. Wall Tyr Lieu Ala 105 11 O

Arg Arg Glu Ala Glu Gly Asp Thr Lys Gly Ser Ala Arg Ser Gly 115 12 O 125

Asp Gly Ser Ala Thr Pro Asp Glin Val Gly Lieu. Ala Gly Glin Asn. Ser 13 O 135 14 O US 8,633,353 B2 65 66 - Continued Glin Lieu Val His Glin Gly Ser Lieu. Asn Tyr Ile Gly Lieu. Glin Val Glin 145 150 155 160

Pro Glin His Lieu. Wal Met Pro Ser Met Glin Ser His Glu. 1.65 17O

<210s, SEQ ID NO 7 &211s LENGTH: 818 &212s. TYPE: DNA <213> ORGANISM: Glycine max 22 Os. FEATURE: 223 OTHER INFORMATION: G3875

<4 OO > SEQUENCE: 7

Cttalacattt CCCtaalacat alacticacaca cocotottct Ctagggittt C aattt Cacca 6 O tgtttccctt t ct caaatta gggttc.cggc gag catggcc gacggtc.cgg cgagt cc agg 12 O cggcggtagc cacgagagcg gcgagcacag ccctc.gctct agcaggacag 18O gtacct cocc atcgctaa.ca taag.ccgcat Catgaagaag gcactacctg cgaacggtaa 24 O aatcgc.caag gacgc.caaag aga.ccgttca ggaatgcgta tcc.gagttca t cagttt cat 3OO

Caccagcgag gcct ctgata agtgtcagag ggaaaagaga aag act atta acggtgatga 360 tittgctctgg gcc atggcca citcttggttt tgaggattat atcgatcctic ttaaaattta cct cactaga tacagagaga tiggagggtga tacga agggit t cagccalagg gcggagactic atctitctaag aaagatgttc agccaagtico taatgct cag cittgct catc aaggttctitt 54 O ct caca aggt gttagttaca caatttctica gggtcaiacat atgatggttc Caatgcaagg

CCC9gagtag gtatcaagtt tattaaccct cctgttgtaa cg tatgtttt c cacgc.cagt 660 taccaagtgc t cacgg cata ttgaatgtct titt tatgtta tgttgaatact gacat aggag 72 O atggttcttg tgtc.ctttitt ttittaaaaaa aaaagta agg tttgtatatt atctttggag tcqaattatt atttgaaagt tattatattg taaat CCt 818

<210s, SEQ ID NO 8 &211s LENGTH: 171 212. TYPE : PRT <213> ORGANISM: Glycine max 22 Os. FEATURE: <223> OTHER INFORMATION: G3875 p olypeptide <4 OOs, SEQUENCE: 8

Met Ala Asp Gly Pro Ala Ser Pro Gly Gly Gly Ser His Glu Ser Gly 1. 5 1O 15

Glu. His Ser Pro Arg Ser Asn. Wall Arg Glu Glin Asp Arg Tyr Leul Pro 2O 25

Ile Ala Asn Ile Ser Arg Ile Met Llys Lys Ala Lell Pro Ala Asin Gly 35 4 O 45

Ala Lys Asp Ala Lys Glu Thir Wall Glin Glu Wall Ser Glu SO 55 6 O

Phe Ile Ser Phe Ile Thir Ser Glu Ala Ser Asp Glin Arg Glu 70 7s 8O

Thir Ile Asn Gly Asp Asp Lieu. Lieu. Trp Ala Met Ala Thr 85 90 95

Lieu. Gly Phe Glu Asp Tyr Ile Asp Pro Lieu Lys Ile Luell Thr Arg 1OO 105 11 O

Glu Met Glu Gly Asp Thr Lys Gly Ser Ala Lys Gly Gly Asp 115 12 O 125

Ser Ser Ser Llys Llys Asp Val Glin Pro Ser Pro Asn Ala Glin Lieu Ala 13 O 135 14 O US 8,633,353 B2 67 68 - Continued

His Glin Gly Ser Phe Ser Glin Gly Val Ser Tyr Thr Ile Ser Glin Gly 145 150 155 160

Gln His Met Met Wall Pro Met Glin Gly Pro Glu 1.65 17O

SEO ID NO 9 LENGTH: 557 TYPE: DNA ORGANISM: Zea mays FEATURE: OTHER INFORMATION: G3876

SEQUENCE: 9 tataagtgca ggaggagctic atggcggalag Ctc.cggcgag CCCtggcggc ggcgg.cggga 6 O gccacgagag cgggagcc cc aggggagg.cg gaggcggtgg Cagcgtcagg gag caggaca 12 O ggttcCtgcc catcgc.caac atcagt cqca t catgaagaa ggc catc.ccg gctaacggga 18O agat.cgc.caa ggacgctaag gag accgtgc aggagtgcgt. citc.cgagttc at ct cott ca. 24 O t cactagoga agcgagtgac aagtgccaga gggagaa.gc.g gaagaccatc aatggcgacg 3OO atctgctgtg ggc catggcc acgctggggt ttgaagacita cattgaaccc Ctcaaggtgt 360 acctgcagaa gtacagagag atggagggtg atagdaagtt aactgcaaaa tctagogatg gct caattaa aaaggatgcc Cttggtcatg tgggagcaa.g tagct cagot gcaca aggga tgggccaa.ca gggagcatac alaccalaggaa tgggittatat gcaa.ccc.ca.g taccatalacg 54 O gggatatcto aalactaa sist

SEQ ID NO 10 LENGTH: 178 TYPE : PRT ORGANISM: Zea mays FEATURE: OTHER INFORMATION: G3876 polypeptide <4 OOs, SEQUENCE: 10

Met Ala Glu Ala Pro Ala Ser Pro Gly Gly Gly Gly Gly Ser His Glu 1. 5 15

Ser Gly Ser Pro Arg Gly Gly Gly Gly Gly Gly Ser Wall Arg Glu Glin 25

Asp Arg Phe Lieu Pro Ile Ala Asn Ile Ser Arg Ile Met Lys Ala 35 4 O 45

Ile Pro Ala Asin Gly Lys Ile Ala Asp Ala Lys Glu Thir Wall Glin SO 55 6 O

Glu Wall Ser Glu Phe Ile Ser Phe Ile Thir Ser Glu Ala Ser Asp 65 8O

Glin Arg Glu Lys Arg Llys Thir Ile ASn Gly Asp Asp Lieu. Luell 85 90 95

Trp Ala Met Ala Thr Lieu. Gly Phe Glu Asp Tyr Ile Glu Pro Lieu Lys 105 11 O

Wall Luell Glin Llys Tyr Arg Glu Met Glu Gly Asp Ser Lieu. Thir 115 12 O 125

Ala Lys Ser Ser Asp Gly Ser Ile Asp Ala Lell Gly His Wall 13 O 135 14 O

Gly Ala Ser Ser Ser Ala Ala Glin Gly Met Gly Glin Glin Gly Ala Tyr 145 150 155 160

Asn Glin Gly Met Gly Tyr Met Glin Pro Glin Tyr His Asn Gly Asp Ile 1.65 17O 17s US 8,633,353 B2 69 - Continued

Ser Asn

SEQ ID NO 11 LENGTH: 819 TYPE: DNA ORGANISM: Zea mays FEATURE: OTHER INFORMATION: G3434

<4 OOs, SEQUENCE: 11

cgctggcgat ggc.cgacgac ggcgggagcc 6 O acgagggcag cgg.cggcggc ggagg.cgtCc gggagcagga ccggttcCtg cc catcgc.ca 12 O acat cagc.cg gat catgaag aaggc.cgt.cc cggccaacgg Caagat.cgc.c aaggacgcta 18O aggaga cc ct gCaggagtgc gtc.t.ccgagt toatlatcatt cgtgaccagc gaggc.ca.gc.g 24 O acaaatgc.ca gaaggaga aa caaagacala t caacgggga cgatttgctic tgggcgatgg 3OO c cactittagg att Caggag tacgt.cgagc citct caagat ttacctacala aagtacaaag 360 agatggaggg tgatagcaag ctgtctacaa aggctgg.cga gggct Ctgta aagaaggatg caattagt cc c catggtggc accagtagct caagtaatca gttggttcag Catggagtict acaaccalagg gatgggct at atgcagocac agtaccacaa tggggaalacc taacaaaggg 54 O ctaatacago agcaattitat gct agggaag t citctgcatt gcttaccatg tgt attggca gaaaac agga ggc acttaca aagggtgtta atctotgcga tggctgcct c t caggtgtaa 660 attggctt.cg gtttagcgct gcttttgtc.c gtatatttag gatgatttga ctgttgctac 72 O ttttggcaac Cttttacatt tacagatatg tattatt cag Catalaatata atatagtagt cctaggccta aataatggtg attaaaaaaa aaaaaaaaa. 819

SEQ ID NO 12 LENGTH: 164 TYPE : PRT ORGANISM: Zea mays FEATURE: OTHER INFORMATION: G3434 polypeptide

<4 OOs, SEQUENCE: 12 Met Ala Asp Asp Gly Gly Ser His Glu Gly Ser Gly Gly Gly Gly Gly 1. 5 1O 15

Wall Arg Glu Glin Asp Arg Phe Lieu. Pro Ile Ala Asn Ile Ser Arg Ile 2O 25

Met Lys Ala Wall Pro Ala Asn Gly Lys Ile Ala Lys Asp Ala Lys 35 4 O 45

Glu Thir Luell Glin Glu Cys Val Ser Glu Phe Ile Ser Phe Wall Thir Ser SO 55 6 O

Glu Ala Ser Asp Llys Cys Glin Lys Glu Lys Arg Thir Ile Asin Gly 65 70 7s

Asp Asp Luell Lieu. Trp Ala Met Ala Thr Lieu. Gly Phe Glu Glu Tyr Val 85 90 95

Glu Pro Luell Lys Ile Tyr Lieu. Glin Llys Tyr Lys Glu Met Glu Gly Asp 1OO 105 11 O

Ser Luell Ser Thir Lys Ala Gly Glu Gly Ser Wall Lys Asp Ala 115 12 O 125

Ile Ser Pro His Gly Gly. Thir Ser Ser Ser Ser Asn Glin Luell Wall Glin 13 O 135 14 O

His Gly Wall Tyr Asn Gln Gly Met Gly Tyr Met Glin Pro Glin Tyr His 145 150 155 160 US 8,633,353 B2 71 - Continued Asn Gly Glu Thr

SEQ ID NO 13 LENGTH: 823 TYPE: DNA ORGANISM: Arabidopsis thaliana FEATURE: OTHER INFORMATION: G1364

<4 OOs, SEQUENCE: 13 citcttctotc gatctotcto cct citctgcc tcc to totto CtcCaat Caa caaac acct C 6 O tctgttt cac toccg.ccctt titt cqaattic ttctgatggc ggagtC9Cag gcc aagagtic 12 O

tggaagcc at gaga.gtggtg gagat caaag tcc caggtog ttacatgttc 18O gtgagcaaga taggtttctt ccgattgcta acataagcc.g tat catgaaa agaggtott C 24 O

Ctgctaatgg gaaaatcgct aaagatgcta aggagattgt gCaggaatgt gtc.tctgaat 3OO t catcagttt cgt caccago gaa.gc.gagtg ataaatgtca aagagagaaa aggalagact a 360 ttaatggaga tgatttgctt tgggcaatgg c tactittagg atttgaagac tacatggaac citct caaggt ttacct gatg agatatagag agatggaggg tgacacaaag ggat Cagcaa. alaggtgggga tccaaatgca aagaaagatg ggcaatcaag cCaaaatggc cagttct cqc 54 O agcttgctica c caaggtoct tatgggaact citcaa.gctica gcagdatatg atggttccaa tgc.cgggaac agacitagtat gagaggagta ttcaactittg gttatgttcc accaaaagag 660 gttatctitat citgitat atta tttgttgttgt aagagttittg Ctatgcagaa cittgttgacta 72 O taat Catcto attgttttgt tttgtttttg titt cottgga tttgttctga atatgtcatc aagt cagt ca gtctattitat at atttggitt tgttgattat tto 823

SEQ ID NO 14 LENGTH: 173 TYPE : PRT ORGANISM: Arabidopsis thaliana FEATURE: OTHER INFORMATION: G1364 polypeptide

<4 OOs, SEQUENCE: 14 Met Ala Glu Ser Glin Ala Lys Ser Pro Gly Gly Gly Ser His Glu 1. 5 1O 15

Ser Gly Gly Asp Glin Ser Pro Arg Ser Lieu. His Wall Arg Glu Glin Asp 25

Arg Phe Luell Pro Ile Ala ASn Ile Ser Arg Ile Met Lys Arg Gly Lieu. 35 4 O 45

Pro Ala Asn Gly Lys Ile Ala Lys Asp Ala Lys Glu Ile Wall Glin Glu SO 55 6 O

Cys Wall Ser Glu Phe Ile Ser Phe Wall. Thir Ser Glu Ala Ser Asp Llys 65 70

Glin Arg Glu Lys Arg Llys Thr Ile Asin Gly Asp Asp Luell Lieu. Trp 85 90 95

Ala Met Ala Thr Lieu. Gly Phe Glu Asp Tyr Met Glu Pro Luell Llys Val 105 11 O

Luell Met Arg Tyr Arg Glu Met Glu Gly Asp Thir Lys Gly Ser Ala 115 12 O 125

Gly Gly Asp Pro Asn Ala Lys Glin Ser Ser Glin Asn 13 O 135 14 O

Gly Glin Phe Ser Glin Lieu Ala His Gln Gly Pro Gly Asn Ser Glin 145 150 155 160 US 8,633,353 B2 73 - Continued Ala Glin Gln His Met Met Val Pro Met Pro Gly Thr Asp 1.65 17O

SEO ID NO 15 LENGTH: 6OO TYPE: DNA ORGANISM: Glycine max FEATURE: OTHER INFORMATION: G3 475

<4 OOs, SEQUENCE: 15 tcqattatcc gtttgtcgat ggcggacticg gacaacgact gcacaacgc.c 6 O gggaagggga gcgagatgtc. gcc.gcgggag Caggaccggit tcc tigc.cgat cgcgaacgtg 12 O agcc.gcatca tgaagaaggc gctg.ccggcg aacgc.galaga t ct caagga cgcgaaggag 18O acggtgcagg agtgcgtgtc. ggagttcatc agctt catca Ctc.cgacaag 24 O

agaa.gc.gcaa. gacgatcaac ggcgacgacc tgctctgggc gatgaccact 3OO

Ctcggctt.cg aggact acgt. cgagcct citc aagggct acc tccagcgctt cc.gagaaatg 360 galaggagaga agacagtggc ggcgc.gtgac aaggacgc.gc ctic ct cottac caatgct acc aacagtgc ct acgagagt cc tag titatgct gctgctic ctg gtggaat cat gatgcatcag ggacacgtgt acggttctgc cggct tcc at Caagtggctg gtggtgctat aaagggtggg 54 O cctgtttatc atccaatgcc gg taggcc.ca ggtagatggg cctatgttat

SEQ ID NO 16 LENGTH: 188 TYPE : PRT ORGANISM: Glycine max FEATURE: OTHER INFORMATION: G3475 p olypeptide <4 OOs, SEQUENCE: 16

Met Ala Asp Ser Asp Asn Asp Ser Gly Gly Ala His Asn Ala Gly Lys 1. 5 1O 15

Gly Ser Glu Met Ser Pro Arg Glu Glin Asp Arg Phe Lell Pro Ile Ala 25

Asn Wall Ser Arg Ile Met Lys Llys Ala Lieu Pro Ala Asn Ala Lys Ile 35 4 O 45

Ser Lys Asp Ala Lys Glu Thr Val Glin Glu. Cys Wall Ser Glu Phe Ile SO 55 6 O

Ser Phe Ile Thr Gly Glu Ala Ser Asp Llys Cys Glin Arg Glu 65 70

Thir Ile Asin Gly Asp Asp Lieu. Lieu. Trp Ala Met Thir Thir Lieu. Gly 85 90 95

Phe Glu Asp Tyr Val Glu Pro Leu Lys Gly Tyr Lell Glin Arg Phe Arg 105 11 O

Glu Met Glu Gly Glu Lys Thr Val Ala Ala Arg Asp Lys Asp Ala Pro 115 12 O 125

Pro Pro Thir Asn Ala Thir Asn. Ser Ala Tyr Glu Ser Pro Ser Tyr Ala 13 O 135 14 O

Ala Ala Pro Gly Gly Ile Met Met His Glin Gly His Wall Gly Ser 145 150 155 160

Ala Gly Phe His Glin Val Ala Gly Gly Ala Ile Gly Gly Pro Wall 1.65 17O 17s

Pro Gly Pro Gly Ser Asn Ala Gly Arg Pro Arg 18O 185 US 8,633,353 B2 75 - Continued SEO ID NO 17 LENGTH: 684 TYPE: DNA ORGANISM: Arabidopsis thaliana FEATURE: OTHER INFORMATION: G485

<4 OOs, SEQUENCE: 17 cct citctgat coaacggacc Caaaa.cat Ct at citctic titt ctic gacctitt tgtct cotcg 6 O atctaaagat ggcggatt.cg gacaacgatt Caggaggaca caaagacggit ggaaatgctt 12 O cgacacgtga gcaagatagg tittct accoga tcqctaacgt. tag caggat C atgaagaaag 18O cact tcctgc gaacgcaaaa atctotaagg atgctaaaga aacggttcaa gagtgttgtat 24 O cggaatt cat aagttt catc accggtgagg cittctgacaa gtgtcagaga gagaa.gagga 3OO agacaatcaa cggtgacgat cittctittggg cgatgacitac gct agggittt gaggact acg 360 tggagcct ct caaggttitat Ctgcaaaagt at agggaggt ggaaggaga.g aagacitact a cggCagggag acaaggcgat aaggalaggtg gaggaggagg cggtggagct ggaagttggaa gtggaggagc tcc gatgtac ggtggtggca tggtgactac gatggga cat caattitt coc 54 O at cattitt to ttaattgtaa aatgataaaa gcaaattitt c atttitt atta attaatgata tatatatata tatgtttaac ttittagtata atgtttacag aattitt t t t t ttaaaac tag 660 gttcaa.ccca ctaacgtaac agcg 684

SEQ ID NO 18 LENGTH: 161 TYPE : PRT ORGANISM: Arabidopsis thaliana FEATURE: OTHER INFORMATION: G485 polypeptide <4 OOs, SEQUENCE: 18 Met Ala Asp Ser Asp Asn Asp Ser Gly Gly His Asp Gly Gly Asn 1. 5 1O 15

Ala Ser Thir Arg Glu Glin Asp Arg Phe Lieu Pro Ile Ala Asn Wall Ser 25 3O

Arg Ile Met Llys Lys Ala Lieu Pro Ala Asn Ala Ile Ser Lys Asp 35 4 O 45

Ala Lys Glu Thr Val Glin Glu. Cys Wall Ser Glu Ile Ser Phe Ile SO 55

Thir Gly Glu Ala Ser Asp Llys Cys Glin Arg Glu Arg Thir Ile 65 70 7s

Asn Gly Asp Asp Lieu. Lieu. Trp Ala Met Thir Thir Lell Gly Phe Glu Asp 85 90 95

Wall Glu Pro Leu Lys Val Tyr Lieu. Glin Lys Arg Glu Wall Glu 105 11 O

Gly Glu Lys Thir Thr Thr Ala Gly Arg Glin Gly Asp Lys Glu Gly Gly 115 12 O 125

Gly Gly Gly Gly Gly Ala Gly Ser Gly Ser Gly Gly Ala Pro Met Tyr 13 O 135 14 O

Gly Gly Gly Met Wall. Thir Thir Met Gly His Glin Phe Ser His His Phe 145 150 155 160

Ser

<210s, SEQ ID NO 19 &211s LENGTH: 528 &212s. TYPE: DNA <213> ORGANISM: Glycine max US 8,633,353 B2 77 - Continued

22 Os. FEATURE: 223 OTHER INFORMATION: G3476

<4 OOs, SEQUENCE: 19 ggattgattg taagatggc tgagt cqgac aacgact cqg gagggg.cgca gaacg.cggga 6 O alacagtggaa actitgagcga gttgtc.gc.ct cgggaac agg accoggitttct c cc catagog 12 O aacgtgagca ggat catgaa gaaggccttg ccggcgaacg cgaagat citc galaggacgc.g 18O aaggaga.cgg tgcaggaatg cgtgtcggag ttcat cagot t catalacggg tgagg.cgt.cg 24 O gacaagtgcc agagggagaa gcgcaagacic at Caacggcg acgat cittct Ctgggcc atg 3OO acaa.ccctgg gatt.cgalaga gtacgtggag cct ctdaaga tt tacct coa gcgct tcc.gc 360 gagatggagg gagagaagac cgtggcc.gc.c cgcgact citt ctaaggactic ggcct cogCC t cott cotlatc. at Cagggaca cgtgtacggc toccctgcct acCat Catca agtgcCtggg

CCCaCttatc. Ctgc.ccctgg tag acccaga tgacgtgctic c to tatto 528

SEQ ID NO 2 O LENGTH: 1.65 TYPE : PRT ORGANISM: Glycine max FEATURE: OTHER INFORMATION: G3476 p olypeptide SEQUENCE: 2O

Met Ala Glu Ser Asp Asn Asp Ser Gly Gly Ala Glin Asn Ala Gly Asn 1. 5 1O 15

Ser Gly Asn Lieu. Ser Glu Lieu. Ser Pro Arg Glu. Glin Asp Arg Phe Lieu. 25 3O

Pro Ile Ala Asn Val Ser Arg Ile Met Lys Llys Ala Lell Pro Ala Asn 35 4 O 45

Ala Lys Ile Ser Lys Asp Ala Lys Glu Thir Wall Glin Glu Wall Ser SO 55 6 O

Glu Phe Ile Ser Phe Ile Thr Gly Glu Ala Ser Asp Glin Arg 65 70 7s

Glu Arg Llys Thir Ile Asn Gly Asp Asp Lieu. Lell Trp Ala Met Thr 85 90 95

Thir Lel Gly Phe Glu Glu Tyr Val Glu Pro Lieu. Ile Tyr Lieu. Glin 105 11 O

Arg Phe Arg Glu Met Glu Gly Glu Lys Thr Val Ala Ala Arg Asp Ser 115 12 O 125

Ser Lys Asp Ser Ala Ser Ala Ser Ser Tyr His Glin Gly His Val Tyr 13 O 135 14 O

Gly Ser Pro Ala Tyr His His Glin Val Pro Gly Pro Thir Pro Ala 145 150 155 160

Pro Gly Arg Pro Arg 1.65

SEQ ID NO 21 LENGTH: 896 TYPE: DNA ORGANISM: Oryza sativa FEATURE: OTHER INFORMATION: G2345

SEQUENCE: 21 agat cagat.c aaatct ct ct ct c catct co gtctic ct cott ccaat catca tttct ct ct c 6 O cgt.ctic cc.gc tictttitt cqa attct cottc ttctt caggg tttctgagat gg.ccgaatcg 12 O US 8,633,353 B2 79 - Continued caaaccggtg gtggtggtgg tggaagcc at gaga.gtggcg gtgat cagag ccc.gaggtot 18O ttgaatgttc gtgagcagga caggtttctt ccgattgcta acataagc.cg tat catgaag 24 O agaggtttac Ctctaaatgg caaaatcgct aaagatgcta aagagacitat gcaggaatgt 3OO gtct citgaat t catcagott cgt caccago gaggctagtg atalagtgcca aagagagaala 360 aggalagacca t caatggaga tgatttgctt tgggctatgg c cactittagg attcgaagat tacatcgatc ccct caaggt ttacctgatg cgatatagag agatggaggg tacactaala ggat Caggaa aaggcgggga atcgagtgca aagagagatg gtcaaccaag ccaagtgtct 54 O cagttct cqc aggttcctica acaaggctica ttct cacagg gtcct tatgg aaact ct caa ggttctgaata tatggttca aatgc.cgggc acagagtagt act aggacac tatt cat cac 660 aatct tcc.ga gacticgacta tgcctgttgt gtctgagaat ctgagtgat c cactitt.ccat 72 O agatatggat ttggitat ct Cagtttittgg gaga.gagaag atttgtatat gattgttgttg tagggagaag agtttggttt gct tatgata agaattgaaa agcacct titt ttittcttittg 84 O actgttatgt ttcttaggitt ttggit cittaa gCagaagcta tttaccalaala aaaaaa. 896

SEQ ID NO 22 LENGTH: 176 TYPE PRT ORGANISM: Oryza sativa FEATURE: OTHER INFORMATION: G2345 polypeptide <4 OOs, SEQUENCE: 22

Met Ala Glu Ser Glin Thr Gly Gly Gly Gly Gly Gly Ser His Glu Ser 1. 5 1O 15

Gly Gly Asp Glin Ser Pro Arg Ser Lieu. Asn. Wall Arg Glu Glin Asp Arg 25 3O

Phe Luell Pro Ile Ala Asn. Ile Ser Arg Ile Met Arg Gly Lieu Pro 35 4 O 45

Lell Asn Gly Lys Ile Ala Lys Asp Ala Lys Glu Thir Met Gln Glu. Cys SO 55 6 O

Wall Ser Glu Phe Ile Ser Phe Wall Thir Ser Glu Ala Ser Asp Llys Cys 65 70 7s

Glin Arg Glu Lys Arg Llys Thir Ile Asin Gly Asp Asp Lell Lieu. Trp Ala 90 95

Met Ala Thir Lieu. Gly Phe Glu Asp Tyr Ile Asp Pro Lell Llys Val Tyr 105 11 O

Lell Met Arg Tyr Arg Glu Met Glu Gly Asp Thr Gly Ser Gly Lys 115 12 O 125

Gly Gly Glu Ser Ser Ala Lys Arg Asp Gly Glin Pro Ser Glin Wal Ser 13 O 135 14 O

Glin Phe Ser Glin Wall Pro Glin Glin Gly Ser Phe Ser Glin Gly Pro Tyr 145 150 155 160

Gly Asn Ser Glin Gly Ser Asn Met Met Wall Glin Met Pro Gly Thr Glu 1.65 17O 17s

SEQ ID NO 23 LENGTH: 553 TYPE: DNA ORGANISM: Glycine max FEATURE: OTHER INFORMATION: G3474

<4 OOs, SEQUENCE: 23 gatatc catg gctgagtc.cg acaacgagtic aggaggt cac acggggaacg cgagcgggag 6 O US 8,633,353 B2 81 82 - Continued

Caacgagttg tcc.ggttgca gggagcaaga caggttcctic c caatagdaa acgtgagcag 12 O gat catgaag aaggcgttgc cggcgaacgc gaagatatcg aaggagg.cga aggaga.cggit 18O gCaggagtgc gtgtcggagt t catcagott Catalacagga gaggctt cog atalagtgc.ca 24 O galaggagaag aggalagacga t caacggcga cgatcttctic tgggc catga CtaccCtggg 3OO

Cttcgaggac tacgtggatc citct caagat ttacctgcac aagtataggg agatggaggg 360 ggagaaaacc gctatgatgg gaaggccaca tgagagggat gagggittatg gcc atggc.ca tggit catgca act cotatga tgacgatgat gatggggcat Cagc.cccagc accagcacca gcaccagcac Caggga cacg tgt atggatc tggat cagca tottctgcaa gaact agata 54 O gcatgtgtca tot 553

<210s, SEQ I D NO 24 &211s LENGT H: 177 212. TYPE : PRT <213> ORGANISM: Glycine max 22 Os. FEATU RE: <223> OTHER INFORMATION: G3474 p. olypeptide <4 OOs, SEQUENCE: 24

Met Ala Glu Ser Asp As in Glu Ser Gly Gly His Thir Gly Asn Ala Ser 1. 5 1O 15

Gly Ser Asn Glu Lieu. Se r Gly Cys Arg Glu Glin Asp Arg Phe Leul Pro 2O 25

Ile Ala Asn Wall Ser Airg Ile Met Llys Lys Ala Lell Pro Ala Asn Ala 35 4 O 45

Lys Ile Ser Lys Glu Al a Lys Glu Thir Wall Glin Glu Wall Ser Glu SO 55 6 O

Phe Ile Ser Phe Ile Thr Gly Glu Ala Ser Asp Glin Lys Glu 70 7s 8O

Thir Ile As in Gly Asp Asp Lieu. Lieu. Trp Ala Met Th Thr 85 90 95

Lieu. Gly Phe Glu Asp Tyr Val Asp Pro Lieu Lys Ile Luell His Lys 105 11 O

Tyr Arg Glu Met Glu Gl y Glu Lys Thir Ala Met Met Gly Arg Pro His 115 12 O 125

Glu Arg Asp Glu Gly Tyr Gly His Gly His Gly His Ala Thir Pro Met 13 O 135 14 O

Met Thir Met Met Met Gl y His Glin Pro Glin His Glin His Glin His Glin 145 15 O 155 160

His Glin Gly His Val Tyr Gly Ser Gly Ser Ala Ser Ser Ala Arg Thr 1.65 17O 17s Arg

<210s, SEQ I D NO 25 &211s LENGT H: 602 212. TYPE : DNA <213> ORGANISM: Glycine max 22 Os. FEATU RE: 223 OTHER INFORMATION: G3478

<4 OOs, SEQUENCE: 25 titc.cgittagt catggcgga citc.cgacaac gactic.cggcg gcgc.gcacala C9gcggcaa.g 6 O gggagcgaga tigt cqc.cgcg ggagcaggac C9gtttct co catcgc.gala C9tgagcc.gc 12 O atcatgaaga aggcgctgcc ggcgaacg.cg aagat ct cqa aggacgc gala ggagacggtg 18O US 8,633,353 B2 83 - Continued

Caggagtgcg tgtcagagtt catcagottc at Caccggcg aggccticca Caagtgc.ca.g 24 O cgc.gagaa.gc gcaagacgat Caacggcgac gacctgctict ggg.cgatgac Cactctgggc 3OO titcgaggact acgtggagcc t ct caaaggc tacct coagc gct tcc.gaga aatggaagga 360 gagalagaccg tgacaaggac gcqcctic ctic ttacgaatgc taccalacagt gcctacgaga gtgctaatta tgctgctgct gctgctgttc Ctggtggaat catgatgcat

Caggga cacg tgtacggttc tgc.cggct tc Cat Caagtgg Ctggcgggg.c tataaagggit 54 O gggcctgctt atcCtgggcc tggat.ccaat gcc.gg taggc ccagataaag agcct attat ta

SEQ ID NO 26 LENGTH: 191 TYPE : PRT ORGANISM: Glycine max FEATURE: OTHER INFORMATION: G3478 p olypeptide <4 OOs, SEQUENCE: 26

Met Ala Asp Ser Asp Asn Asp Ser Gly Gly Ala His Asn Gly Gly Lys 1. 5 1O 15

Gly Ser Glu Met Ser Pro Arg Glu Glin Asp Arg Phe Lell Pro Ile Ala 25

Asn Wall Ser Arg Ile Met Lys Llys Ala Lieu Pro Ala Asn Ala Lys Ile 35 4 O 45

Ser Lys Asp Ala Lys Glu Thr Val Glin Glu. Cys Wall Ser Glu Phe Ile SO 55 6 O

Ser Phe Ile Thr Gly Glu Ala Ser Asp Llys Cys Glin Arg Glu 65 70

Thir Ile Asin Gly Asp Asp Lieu. Lieu. Trp Ala Met Thir Thir Lieu. Gly 85 90 95

Phe Glu Asp Tyr Val Glu Pro Leu Lys Gly Tyr Lell Glin Arg Phe Arg 105 11 O

Glu Met Glu Gly Glu Lys Thr Val Ala Ala Arg Asp Lys Asp Ala Pro 115 12 O 125

Pro Luell Thir Asn Ala Thir Asn. Ser Ala Tyr Glu Ser Ala Asn Tyr Ala 13 O 135 14 O

Ala Ala Ala Ala Val Pro Gly Gly Ile Met Met His Glin Gly His Wall 145 150 155 160

Gly Ser Ala Gly Phe His Glin Val Ala Gly Gly Ala Ile Lys Gly 1.65 17O 17s

Gly Pro Ala Tyr Pro Gly Pro Gly Ser Asn Ala Gly Arg Pro Arg 18O 185 19 O

SEO ID NO 27 LENGTH: 1065 TYPE: DNA ORGANISM: Arabidopsis thal iana FEATURE: OTHER INFORMATION: G482

SEQUENCE: 27 tcgacccacg cgt.ccggaca Cttaacaatt Cacac Cttct ct t t t tactic titcCtaaaac 6 O cctaaatttic ct cqcttcag tott cocact caagt caacc accaattgaa titcgattitcg 12 O aatcattgat ggaaatgatt tgaaaaaaga gtaaagttta titt titt tatt ccttgtaatt 18O ttcagaaatgggggatt.ccg acagggattic cggtggaggg caaaacggga acaaccagaa 24 O US 8,633,353 B2 85 - Continued cggacagt cc tocttgttcto Caagaga.gca agacaggttc ttgc.cgatcg ctaacgt cag ccggat catg aagaaggc ct tgc.ccgc.caa cgc.caagatc tctaaagatg cCaaagagac 360 gatgcaggag agttcatcag citt cqt cacc ggagaag cat Ctgataagtg t cagaaggag aagaggaaga cgatcaacgg agacgatttg Ctctgggcta tgact actict aggttittgag gattatgttg agc cattgaa agttt acttg cagaggttta gggagat.cga 54 O aggggagagg actggactag ggaggccaca gactggtggt gaggtoggag agcatcagag agatgctgtc. ggagatggcg gtgggttcta cggtggtggit ggtgggatgc agitat cacca 660 acat catcag titt citt cacc agcagaacca tatgt atgga gcc acaggtg gcgg tag.cga 72 O Cagtggaggt ggagctgc ct ccgg taggac aaggact taa caaagattgg tgaagtggat citctictotgt atatagatac ataaataCat gtatacacat gcc tatttitt acgacccata 84 O taaggitat ct at catgtgat agaacgaa.ca ttggtgttgg tgatgtaaaa t cagatgtgc 9 OO attalagggitt tagattittga ggctgttgtaa aagaagat.ca agtgtgctitt gttggacaat 96.O aggatt cact aacgaatctg citt cattgga t cittgtatgt aactaaag.cc attgt attga atgcaaatgt titt catttgg gatgctittaa aaaaaaaaaa. aaaaa. 1065

SEQ ID NO 28 LENGTH: 190 TYPE : PRT ORGANISM: Arabidopsis thaliana FEATURE: OTHER INFORMATION: G482 polypeptide

<4 OOs, SEQUENCE: 28

Met Gly Asp Ser Asp Arg Asp Ser Gly Gly Gly Glin Asn Gly Asn. Asn 1. 5 1O 15

Glin Asn Gly Glin Ser Ser Leu Ser Pro Arg Glu Glin Asp Arg Phe Lieu. 25 3O

Pro Ile Ala Asn Val Ser Arg Ile Met Lys Llys Ala Lell Pro Ala Asn 35 4 O 45

Ala Lys Ile Ser Lys Asp Ala Lys Glu Thir Met Glin Glu Wall Ser SO 55 6 O

Glu Phe Ile Ser Phe Val Thr Gly Glu Ala Ser Asp Gln Lys 65 70 7s 8O

Glu Arg Llys Thir Ile Asn Gly Asp Asp Lieu. Lell Trp Ala Met Thr 85 90 95

Thir Luell Gly Phe Glu Asp Tyr Val Glu Pro Lieu. Wall Tyr Lieu. Glin 105 11 O

Arg Phe Arg Glu Ile Glu Gly Glu Arg Thr Gly Lell Gly Arg Pro Glin 115 12 O 125

Thir Gly Gly Glu Val Gly Glu. His Glin Arg Asp Ala Wall Gly Asp Gly 13 O 135 14 O

Gly Gly Phe Tyr Gly Gly Gly Gly Gly Met Gln Tyr His Glin His His 145 150 155 160

Glin Phe Luell His Glin Glin Asn His Met Tyr Gly Ala Thir Gly Gly Gly 1.65 17O 17s

Ser Asp Ser Gly Gly Gly Ala Ala Ser Gly Arg Thir Arg Thir 18O 185 19 O

SEQ ID NO 29 LENGTH 735 TYPE: DNA ORGANISM: Zea mays FEATURE: US 8,633,353 B2 87 - Continued

223 OTHER INFORMATION: G3 435

<4 OOs, SEQUENCE: 29 cggcggtggc Cttgagctga ggcgg.cggag cgatgc.cgga Ctcggacaac gactic.cggcg 6 O

gagctgtcgt. cgcc.gcggga gCagg accgg titcCtgcc.ca 12 O tcgc.caacgt gagccggatc atgaagaagg cgctic ccggc Caacgc.caag at Cagcaagg 18O acgc.caagga gacggtgcag gagtgcgt.gt cc.gagttcat CtcCttcatC accggcgagg 24 O

Cctic.cgacaa gtgc.ca.gc.gc gagaa.gc.gca agaccatcaa cgg.cgacgac Ctgctgttggg 3OO c catgaccac gct cqgctitc gaggactacg tcgagcc.gct caa.gcactac ctdcacaagt 360 tCcgc.gagat Caggg.cgag cgt.ccgc.cgg cgc.ct cqggc ticgcago agc agcagcagca gigg.cgagctg Cccagaggcg cc.gc.caatgc cgc.cggg tac gcc.ggg tacg

atgatgatga tgatgatggg gcago Ccatg tacggcggct 54 O cgcagcc.gca gcaa.ca.gc.cg cc.gcc.gc.ctic agcc.gc.caca gcago agcag caa.catcaac agcatcacat ggcaat agga ggcagaggag gatt.cggc.ca acaaggcggc ggcgg.cggct 660

Cctcgt.cgt.c gtcagggctt acagggcgtg agttgcgacg atacgtcaga 72 O atcagaatcq citgat 73

SEQ ID NO 3 O LENGTH: 222 TYPE PRT ORGANISM: Zea mays FEATURE: OTHER INFORMATION: G3435 polypeptide <4 OOs, SEQUENCE: 30

Met Pro Asp Ser Asp Asn Asp Ser Gly Gly Pro Ser Asn Ala Gly Gly 1. 5 1O 15

Glu Luell Ser Ser Pro Arg Glu Gln Asp Arg Phe Lell Pro Ile Ala Asn 25 3O

Wall Ser Arg Ile Met Lys Lys Ala Lieu Pro Ala Asn Ala Lys Ile Ser 35 4 O 45

Asp Ala Lys Glu Thr Val Glin Glu Cys Val Ser Glu Phe Ile Ser SO 55 6 O

Phe Ile Thr Gly Glu Ala Ser Asp Lys Cys Glin Arg Glu 65 70

Thir Ile Asin Gly Asp Asp Lieu. Lieu. Trp Ala Met Thir Thir Leu Gly Phe 85 90 95

Glu Asp Tyr Val Glu Pro Leu Lys His Tyr Lieu. His Phe Arg Glu 105 11 O

Ile Glu Gly Glu Arg Ala Ala Ala Ser Ala Gly Ala Ser Gly Ser Glin 115 12 O 125

Glin Glin Glin Glin Glin Gly Glu Lieu. Pro Arg Gly Ala Ala Asn Ala Ala 13 O 135 14 O

Gly Ala Gly Tyr Gly Ala Pro Gly Ser Gly Gly Met Met Met Met 145 150 160

Met Met Gly Glin Pro Met Tyr Gly Gly Ser Glin Pro Glin Glin Glin Pro 1.65 17O 17s

Pro Pro Pro Glin Pro Pro Glin Glin Glin Glin Glin His Glin Gln His His 185 19 O

Met Ala Ile Gly Gly Arg Gly Gly Phe Gly Glin Glin Gly Gly Gly Gly 195 2O5

Gly Ser Ser Ser Ser Ser Gly Lieu. Gly Arg Glin Asp Arg Ala US 8,633,353 B2 89 90 - Continued

21 O 215 22O

SEQ ID NO 31 LENGTH: 547 TYPE: DNA ORGANISM: Glycine max FEATURE: OTHER INFORMATION: G3 4.72

<4 OOs, SEQUENCE: 31 taaggctago: tagctago Ca tggctgagtic gga caacgag tcc.ggaggit c acacggggaa 6 O cgcaa.gcgga agcaacgaat tct Coggttg Cagggagcaa. gacaggttcC titc.cgatagc 12 O gaacgtgagc aggat catga agaaggcgtt gcc.ggcgaac gcgaagatct cgaaggaggc 18O galaggaga.cg gtgcaggagt gcgtgtcgga gttcatcago tt cataa.ca.g gagaa.gcgt.c 24 O cgataagtgc Cagaaggaga agaggaagac gat Caacggc gatgatctgc tgtgggc cat 3OO gaccacgctg ggatt.cgagg agtacgtgga gcct citcaag gtttatctgc atalagtatag 360 ggagctggaa ggggagaaaa ctgctatoat gggaaggc.ca Catgagaggg atgagggitta tggit catgca act cotatga tgat catgat gggggat Caa Cagcagcagc at Cagggaca cgtgtatgga tctggalacta c tactggatc agcatcttct gcaagaact a gataa Caggit 54 O titatgca 547

SEQ ID NO 32 LENGTH 171 TYPE : PRT ORGANISM: Glycine max FEATURE: OTHER INFORMATION: G3472 p olypeptide <4 OOs, SEQUENCE: 32 Met Ala Glu Ser Asp Asn. Glu Ser Gly Gly His Thir Gly Asn Ala Ser 1. 5 1O 15

Gly Ser Asn Glu Phe Ser Gly Cys Arg Glu Glin Asp Arg Phe Leul Pro 25

Ile Ala Asn Val Ser Arg Ile Met Llys Lys Ala Lell Pro Ala Asn Ala 35 4 O 45

Ile Ser Lys Glu Ala Lys Glu Thir Wall Glin Glu Wall Ser Glu SO 55 6 O

Ile Ser Phe Ile Thr Gly Glu Ala Ser Asp Glin Lys Glu 70 7s

Arg Thir Ile Asn Gly Asp Asp Lieu. Lieu. Trp Ala Met Th Thr 90 95

Lell Gly Phe Glu Glu Tyr Val Glu Pro Lieu Lys Wall Luell His Lys 105 11 O

Arg Glu Lieu. Glu Gly Glu Lys Thir Ala Met Met Gly Arg Pro His 115 12 O 125

Glu Arg Asp Glu Gly Tyr Gly His Ala Thr Pro Met Met Ile Met Met 13 O 135 14 O

Gly His Glin Glin Glin Gln His Glin Gly His Val Gly Ser Gly Thr 145 150 155 160

Thir Thir Gly Ser Ala Ser Ser Ala Arg Thr Arg 17O

<210s, SEQ ID NO 33 &211s LENGTH: 8O3 &212s. TYPE: DNA &213s ORGANISM: Zea mays

US 8,633,353 B2 93 94 - Continued

195 2OO 2O5 Gly Lieu. Gly Arg 21 O

SEO ID NO 35 LENGTH: 720 TYPE: DNA ORGANISM: Oryza sativa FEATURE: OTHER INFORMATION: G3397

<4 OOs, SEQUENCE: 35 gcgtctgatt totgalagag gaggaggagg atgc.cggact cggacaacga citc.cggcggg 6 O cc.gagcaact acgcgggagg ggagctgtc.g tCgcc.gcggg agcaggacag gttcCt9.ccg 12 O atcgcgaacg taggaggat Catgaagaag gcgctg.ccgg cgaacgc.cala gat cagdaag 18O gacgc.calagg agacggtgca ggagtgcgtC tcc.gagttca t ct cott cat caccgg.cgag 24 O gcct Cogaca agtgc.cagcg cgaga agcgc aagac catca acggcgacga Cctgctctgg 3OO gccatgacca ccct cqgctt cgaggactac gtcgacc ccc t caag cact a cct coacaag 360 titcc.gc.gaga t cagggcga gcgc.gc.cgc.c gcc to cacca ccggcgc.cgg caccagogCC gcct coacca cqc.cgc.cgca gcago agcac accgc.caatg tacgcc.gc.cc cqggagc.cgg CCC cqgcggc atgatgatga tgatggggca gcc catgitac 54 O ggct cqc.cgc caccgc.cgc.c acagcagcag Cagcagcaac accaccacat ggcaatggga ggaagagg.cg gCttcggtCa t catc.ccggc ggcgg.cggcg 660 gggcacggtc. g.gcaaaacag ggg.cgcttga catcgct cog agacgagtag catgcac cat 72 O

SEQ ID NO 36 LENGTH: 219 TYPE PRT ORGANISM: Oryza sativa FEATURE: OTHER INFORMATION: G3397 polypeptide

<4 OOs, SEQUENCE: 36

Met Pro Asp Ser Asp Asn Asp Ser Gly Gly Pro Ser Asn Tyr Ala Gly 1. 5 1O 15

Gly Glu Lieu. Ser Ser Pro Arg Glu Glin Asp Arg Phe Lell Pro Ile Ala 25 3O

Asn Wall Ser Arg Ile Met Lys Llys Ala Lieu Pro Ala Asn Ala Lys Ile 35 4 O 45

Ser Lys Asp Ala Lys Glu Thr Val Glin Glu. Cys Wall Ser Glu Phe Ile SO 55 6 O

Ser Phe Ile Thr Gly Glu Ala Ser Asp Llys Cys Glin Arg Glu Lys Arg 65 70 7s 8O

Thir Ile Asin Gly Asp Asp Lieu. Lieu. Trp Ala Met Thir Thr Lieu. Gly 85 90 95

Phe Glu Asp Tyr Val Asp Pro Lieu. Llys His Tyr Lell His Llys Phe Arg 105 11 O

Glu Ile Glu Gly Glu Arg Ala Ala Ala Ser Thr Thir Gly Ala Gly Thr 115 12 O 125

Ser Ala Ala Ser Thir Thr Pro Pro Glin Glin Glin His Thir Ala Asn Ala 13 O 135 14 O

Ala Gly Gly Tyr Ala Gly Tyr Ala Ala Pro Gly Ala Gly Pro Gly Gly 145 150 155 160

Met Met Met Met Met Gly Glin Pro Met Tyr Gly Ser Pro Pro Pro Pro US 8,633,353 B2 95 96 - Continued

1.65 17O 17s Pro Glin Glin Glin Glin Glin Gln His His His Met Ala Met Gly Gly Arg 18O 185 19 O Gly Gly Phe Gly His His Pro Gly Gly Gly Gly Gly Gly Ser Ser Ser 195 2OO Ser Ser Gly. His Gly Arg Glin Asn Arg Gly Ala 21 O 215

<210s, SEQ ID NO 37 &211s LENGTH: 610 &212s. TYPE: DNA <213> ORGANISM: Oryza sativa 22 Os. FEATURE: 223 OTHER INFORMATION: G33.95

<4 OO > SEQUENCE: 37 tggat.ctagg gtttittggag gC9g.cgcgg gatggcgga cgcggggcac gacgaga.gc.g 6 O ggagc.ccgcc gaggagcggc ggggtgaggg agcagga cag gttcc tigcc c atcgc.caa.ca 12 O t cagcc.gcat catgaagaag gcc.gtc.ccgg caacggcaa. gat.cgc.caag gacgc.calagg 18O agaccctgca ggagtgcgtc. tcggagttca t ct cottct Caccagcgag gcgagcgaca 24 O aatgtcagaa ggagaa.gc.gc aagac catca acggggalaga tot cotctitt gcgatgggta 3OO cgcttggctt taggagtac gttgatcc.gt taagat cta tttacacaag tacagagaga 360 tggagggtga tagtaagctg. tcct caaagg Ctggtgatgg ttcagtaaag aaggatacaa ttggtc.cgca cagtggcgct agtagcticaa gtgcgcaagg gatggttggg gcttacaccc aagggatggg ttatatgcaa cct cagtatic ataatgggga cacctaaaga tgaggatagt 54 O gaaaattitt.c agtaactggit gtcct ctdtgagttatt atc catctgttaa ggaagaaccc acattagggc 610

<210s, SEQ ID NO 38 &211s LENGTH: 164 212. TYPE: PRT <213> ORGANISM: Oryza sativa 22 Os. FEATURE: <223> OTHER INFORMATION: G3395 polypeptide

<4 OOs, SEQUENCE: 38

Met Ala Asp Ala Gly His Asp Glu Ser Gly Ser Pro Pro Arg Ser Gly 1. 5 15

Gly Wall Arg Glu Glin Asp Arg Phe Luell Pro Ile Ala Asn Ile Ser Arg 2O 25

Ile Met Lys Lys Ala Wall Pro Ala Asn Gly Ile Ala Asp Ala 35 4 O 45

Glu Thir Lieu. Glin Glu Cys Wall Ser Glu Phe Ile Ser Phe Wall. Thir SO 55 6 O

Ser Glu Ala Ser Asp Lys Glin Glu Lys Arg Thir Ile Asn 65 70 7s 8O

Gly Glu Asp Lieu. Luell Phe Ala Met Gly Thir Luell Gly Phe Glu Glu Tyr 85 90 95

Wall Asp Pro Lieu Lys Ile Luell His Arg Glu Met Glu Gly 1OO 105 11 O

Asp Ser Lys Luell Ser Ser Ala Gly Asp Gly Ser Wall 115 12 O 125

Thir Ile Gly Pro His Ser Gly Ala Ser Ser Ser Ser Ala Glin Gly Met 13 O 135 14 O US 8,633,353 B2 97 98 - Continued Val Gly Ala Tyr Thr Glin Gly Met Gly Tyr Met Gln Pro Glin Tyr His 145 150 155 160 Asn Gly Asp Thr

SEO ID NO 39 LENGTH: 761 TYPE: DNA ORGANISM: Oryza sativa FEATURE: OTHER INFORMATION: G3398

<4 OOs, SEQUENCE: 39 cctcitcct ct tcgtct tcct cct cqcct tc gct tcgactg citt catcga gggagat.cga 6 O ggttgcgatg ccggatt.cgg acaacgagtic agggggg.ccg agcaacg.cgg gggagtacgc 12 O gtcggcgagg gag caggaca ggttcCtgcc gat.cgcgaac gtgagcagga t catgaagag 18O ggcgct cocg gcgaacgc.ca agat cagcaa. ggacgc.caag gaga.cggtgc aggagtgcgt 24 O

Ctcggagttc at ct cott ca. t caccggcga ggcct Cogac gggagaa.gc.g 3OO caagac catc aacggcgacg acctic citctg ggcgatgacc tcgaggact a 360 catcgaccc.g citcaagct ct acctic Cacala gttcc.gc.gag Ctcgagggcg agaaggc cat

ggcagogg.cg cgcct cotcc gct Coggctic cggctcgcac Caccaccagg atgct tcc.cg gaacaatggc ggatacggca tgtacggcgg 54 O cggcggcggc atgat catga tgatgggaca gcc tatgtac cggcgt.cgt.c agctggg tac gcgcagcc.gc cgcc.gc.ccca CCaCCaC CaC Caccagatgg tgatgggagg 660 gaaaggtgcg tatggc catg gcggcggcgg cgg.cggcggg c cct c ccc.gt cgt.cgggata 72 O cggc.cggcaa. gacaggct at gagcttgctt tcttggttgg t 761

SEQ ID NO 4 O LENGTH: 224 TYPE : PRT ORGANISM: Oryza sativa FEATURE: OTHER INFORMATION: G3398 polypeptide <4 OOs, SEQUENCE: 4 O

Met Pro Asp Ser Asp Asn. Glu Ser Gly Gly Pro Ser Asn Ala Gly Glu 1. 5 1O 15

Tyr Ala Ser Ala Arg Glu Glin Asp Arg Phe Lieu. Pro Ile Ala Asn. Wall 25

Ser Arg Ile Met Lys Arg Ala Lieu. Pro Ala Asn Ala Lys Ile Ser Lys 35 4 O 45

Asp Ala Glu Thir Wall Glin Glu Cys Val Ser Glu Phe Ile Ser Phe SO 55 6 O

Ile Thir Gly Glu Ala Ser Asp Llys Cys Glin Arg Glu Arg Lys Thr 65 70 7s

Ile Asn Gly Asp Asp Lieu Lleu Trp Ala Met Thr Thir Lell Gly Phe Glu 85 90 95

Asp Ile Asp Pro Lieu Lys Lieu. Tyr Lieu. His Phe Arg Glu Lieu. 105 11 O

Glu Gly Glu Lys Ala Ile Gly Ala Ala Gly Ser Gly Gly Gly Gly Ala 115 12 O 125

Ala Ser Ser Gly Gly Ser Gly Ser Gly Ser Gly Ser His His His Glin 13 O 135 14 O

Asp Ala Ser Arg Asn. Asn Gly Gly Tyr Gly Met Gly Gly Gly Gly 145 150 155 160

US 8,633,353 B2 101 102 - Continued

Asp Arg Phe Lieu Pro Ile Ala Asn Ile Ser Arg Ile Met Lys Lys Ala 35 4 O 45

Ile Pro Ala Asin Gly Llys Thir Ile Pro Ala Asn Gly Ile Ala Lys SO 55 6 O

Asp Ala Glu Thir Wall Glin Glu Cys Val Ser Glu Phe Ile Ser Phe 65 70 7s

Ile Thir Ser Glu Ala Ser Asp Llys Cys Glin Arg Glu Arg Lys Thr 85 90 95

Ile Asn Gly Asp Asp Lieu Lleu Trp Ala Met Ala Thir Lell Gly Phe Glu 105 11 O

Asp Ile Glu Pro Lieu Lys Val Tyr Lieu Gln Tyr Arg Glu Met 115 12 O 125

Glu Gly Asp Ser Llys Lieu. Thir Ala Llys Ser Ser Asp Gly Ser Ile Llys 13 O 135 14 O

Lys Asp Ala Lieu. Gly His Val Gly Ala Ser Ser Ser Ala Ala Gln Gly 145 150 155 160

Met Gly Glin Glin Gly Ala Tyr Asn Gln Gly Met Gly Met Gln Pro 1.65 17O 17s

Glin His Asin Gly Asp Ile Ser Asn 18O 185

SEQ ID NO 43 LENGTH; 837 TYPE: DNA ORGANISM: Oryza sativa FEATURE: OTHER INFORMATION: G3396

<4 OOs, SEQUENCE: 43

ggcgt.cct Co t cc ct ct coc to CtcCC Cala cCaacggcgc 6 O tgat cocctic cgc.cat ct co gtc. catct co gcc taaaaaa actaagcgat gtcggagggg 12 O titcgacggga cggagaacgg cgg.cggcggc ggcggaggcg gagtagggaa ggagcaggac 18O cggttcCtgc cgatcgc.caa catcggcc.gc at catgcgcc ggagaacggc 24 O aagat.cgc.ca aggact coaa ggagt ccgt.c Caggagtgcg t ct cogagtt catcagott c 3OO atcaccagcg alagcaa.gcga caagtgcctic aaggaga agc gcaagac cat Caatggggac 360 gacctgat ct ggit Caatggg cacgctcgga titcgaggact atgtcgagcc t ct caagctic tacct caggc tct accggga gacggagggit gacacaaagg gttcaagagc ttctgaactg

Ccagtaaaga aagatgttgt acttaatgga gatcctggat catcgtttga aggcatgtag 54 O gacgaggagt gtgatagcat Ctaggaagga gaac catcgt. titt tagggaa agaacgct cc agcatcctgt tatgttgtaa gCaggatgct tctaaagttc caataccttg ttaccacgaa 660 tgttagt ct cgttctttitt gaaatgttct tgttgttagcc aggatgtc.ca aatttgttgt 72 O aggttctagt t cagtcgtgt gttgttgttggit tgttgtctaac catatttggc cgttt Coggc tgtc.ctgcat atgctaaatt Cagaggggta aagagat cta agaaaaaaaa aaaaaaa. 837

SEQ ID NO 44 LENGTH: 143 TYPE : PRT ORGANISM: Oryza sativa FEATURE: OTHER INFORMATION: G3396 polypeptide <4 OOs, SEQUENCE: 44 Met Ser Glu Gly Phe Asp Gly Thr Glu Asn Gly Gly Gly Gly Gly Gly US 8,633,353 B2 103 104 - Continued

15

Gly Gly Wall Gly Lys Glu Glin Asp Arg Phe Lieu. Pro Ile Ala ASn Ile 25

Gly Arg Ile Met Arg Arg Ala Val Pro Glu Asn Gly Lys Ile Ala Lys 35 4 O 45

Asp Ser Glu Ser Wall Glin Glu Cys Val Ser Glu Phe Ile Ser Phe SO 55 6 O

Ile Thir Ser Glu Ala Ser Asp Llys Cys Lieu Lys Glu Lys Arg Lys Thr 65 70

Ile Asn Gly Asp Asp Lieu. Ile Trp Ser Met Gly Thir Lell Gly Phe Glu 85 90 95

Asp Wall Glu Pro Lieu Lys Lieu. Tyr Lieu. Arg Lell Tyr Arg Glu Thir 105 11 O

Glu Gly Asp Thir Lys Gly Ser Arg Ala Ser Glu Lell Pro Wall 115 12 O 125

Asp Wall Wall Lieu. Asn Gly Asp Pro Gly Ser Ser Phe Glu Gly Met 13 O 135 14 O

SEO ID NO 45 LENGTH: 920 TYPE: DNA ORGANISM: Oryza sativa FEATURE: OTHER INFORMATION: G3 429

<4 OOs, SEQUENCE: 45 cCaggit catt gagacittgag aga catggca gggaacaaaa agcgtggitat agtagtatag 6 O t cagtactag gattag tact tgataactag CaCaacaaaa. agctactatic atttittgcat 12 O tcaaagtggit aattaagaat cggatgat Ca ttittgattgt t cott citt cat ttgttittgga 18O tgtcaggtgg Caggaacatg gat Caggit Ca agaaggcggc agtgagat.cg gatggggtgg 24 O gagg tagtgc gaccalacgc.c gagctgc.cga tggccaacct cgitacgc.ctg ataaagaagg 3OO tgct Cocagg gaaag.cgaag atcgggggag Cagccaaggg tot CaccCat gattgcgcgg 360 tggagttcgt. cgggttcgtc ggcgacgagg Cct Cogagaa ggc.ca aggca gag caccgc.c gCaccgtagc gcc.ggalagac tacttgggct cattcggcga ccttggcttic gat.cgctacg tcqaccc.cat ggatgcctac atc catggitt accgtgagtt tgagagggct ggtgggaata 54 O ggagggtggc gcc.gc.ctic ct CC9gcggcag ctacaccgct gacgc.ccggit ggaccgacat t cactgacgc agagctgcag titt ct coggit cggtgat coc citccagaagt gatgatgaat 660 at agcggctic atcaccagcc at aggcggct atggctatgg atatggctat ggaaaaaata 72 O tgtgaacaac ttgatgcatg tgttgttgttgta cactgcatgc atgcgtggag ggtcaaacag agt caagact Cttgttggtgg tittcaataag tctictagtga caatataagt tgttgt atcgt. 84 O tgtttcctitt gtaaaaaaaa aaacgt.catg tat cqttgttg tgtcct catc cgg taatcct 9 OO ttacgtgggit gggittatggc 92 O

SEQ ID NO 46 LENGTH: 193 TYPE : PRT ORGANISM: Oryza sativa FEATURE: OTHER INFORMATION: G3429 polypeptide SEQUENCE: 46 Met Ile Ile Lieu. Ile Val Pro Leu. His Leu Phe Trp Met Ser Gly Gly 1. 5 15 US 8,633,353 B2 105 106 - Continued

Arg Asn Met Asp Glin Wall Ala Ala Wall Arg Ser Asp Gly Wall 2O 25

Gly Gly Ser Ala Thir Asn Ala Glu Leul Pro Met Ala Asn Luell Wall Arg 35 4 O 45

Lell Ile Wall Lell Pro Gly Lys Ala Lys Ile Gly Gly Ala Ala SO 55 6 O

Lys Gly Luell Thir His Asp Ala Wall Glu Phe Wall Gly Phe Wall Gly 65 70

Asp Glu Ala Ser Glu Ala Ala Glu His Arg Arg Thir Wall Ala 85 90 95

Pro Glu Asp Tyr Lell Gly Ser Phe Gly Asp Luell Gly Phe Asp Arg Tyr 105 11 O

Wall Asp Pro Met Asp Ala Ile His Gly Arg Glu Phe Glu Arg 115 12 O 125

Ala Gly Gly Asn Arg Arg Wall Ala Pro Pro Pro Pro Ala Ala Ala Thir 13 O 135 14 O

Pro Luell Thir Pro Gly Gly Pro Thir Phe Thr Asp Ala Glu Luell Glin Phe 145 150 155 160

Lell Arg Ser Wall Ile Pro Ser Arg Ser Asp Asp Glu Ser Gly Ser 1.65 17O 17s

Ser Pro Ala Ile Gly Gly Gly Tyr Gly Gly Gly Asn 18O 185 19 O

Met

SEO ID NO 47 LENGTH: 91 TYPE : PRT ORGANISM: Arabidopsis thaliana FEATURE: OTHER INFORMATION: G481 conserved B domain

<4 OOs, SEQUENCE: 47

Arg Glu Glin Asp Arg Lell Pro Ile Ala ASn Ile Ser Arg Ile Met 1. 5 15

Lys Lys Ala Luell Pro Pro Asn Gly Lys Ile Gly Asp Ala Asp 2O 25 3 O

Thir Wall Glin Glu Cys Wall Ser Glu Phe Ile Ser Phe Ile Thir Ser Glu 35 4 O 45

Ala Ser Asp Cys Glin Glu Lys Arg Lys Thir Wall Asn Gly Asp SO 55 6 O

Asp Luell Luell Trp Ala Met Ala Thir Lieu. Gly Phe Glu Asp Luell Glu 65 70 7s 8O

Pro Luell Ile Tyr Lell Ala Arg Tyr Arg Glu 85 9 O

SEQ ID NO 48 LENGTH: 91 TYPE : PRT ORGANISM: Glycine ax FEATURE: OTHER INFORMATION: G3 470 conserved B domain

SEQUENCE: 48 Arg Glu Glin Asp Arg Tyr Lieu Pro Ile Ala Asn. Ile Ser Arg Ile Met 1. 5 15 Llys Lys Ala Lieu Pro Pro Asn Gly Lys Ile Ala Lys Asp Ala Lys Asp 25 US 8,633,353 B2 107 108 - Continued Thir Met Glin Glu. Cys Wall Ser Glu Phe Ile Ser Phe Ile Thir Ser Glu 35 4 O 45

Ala Ser Glu Lys Cys Glin Lys Glu Lys Arg Lys Thir Ile Asin Gly Asp SO 55 6 O

Asp Luell Lieu. Trp Ala Met Ala Thir Lieu. Gly Phe Glu Asp Tyr Ile Glu 65 70 8O

Pro Luell Llys Val Tyr Lell Ala Arg Tyr Arg Glu 85 90

SEQ ID NO 49 LENGTH: 91 TYPE : PRT ORGANISM: Glycine ax FEATURE: OTHER INFORMATION: G3471 conserved B domain

<4 OOs, SEQUENCE: 49

Arg Glu Glin Asp Arg Lell Pro Ile Ala ASn Ile Ser Arg Ile Met 1. 5 1O 15

Lys Lys Ala Luell Pro Pro Asn Gly Lys Ile Ala Asp 2O 25

Thir Met Glin Glu Wall Ser Glu Phe Ile Ser Phe Ile Thir Ser Glu 35 4 O 45

Ala Ser Glu Cys Glin Lys Glu Lys Arg Lys Thir Ile Asin Gly Asp SO 55 6 O

Asp Luell Luell Trp Ala Met Ala Thir Lieu. Gly Phe Glu Asp Tyr Ile Glu 65 70 8O

Pro Luell Wall Tyr Lell Ala Arg Tyr Arg Glu 85 90

SEO ID NO 5 O LENGTH: 91 TYPE : PRT ORGANISM: Glycine ax FEATURE: OTHER INFORMATION: G3875 conserved B domain

<4 OOs, SEQUENCE: SO

Arg Glu Glin Asp Arg Lell Pro Ile Ala ASn Ile Ser Arg Ile Met 1. 5 1O 15

Lys Lys Ala Luell Pro Ala Asn Gly Lys Ile Ala Glu 2O 25

Thir Wall Glin Glu Wall Ser Glu Phe Ile Ser Phe Ile Thir Ser Glu 35 4 O 45

Ala Ser Asp Cys Glin Arg Glu Lys Arg Lys Thir Ile Asin Gly Asp SO 55 6 O

Asp Luell Luell Trp Ala Met Ala Thir Lieu. Gly Phe Glu Asp Tyr Ile Asp 65 70 8O

Pro Luell Ile Tyr Lell Thir Arg Tyr Arg Glu 85 90

SEO ID NO 51 LENGTH: 91 TYPE : PRT ORGANISM: Zea mays FEATURE: OTHER INFORMATION: G3876 conserved B domain

SEQUENCE: 51 Arg Glu Glin Asp Arg Phe Lieu Pro Ile Ala Asn. Ile Ser Arg Ile Met 1. 5 15 US 8,633,353 B2 109 110 - Continued

Ala Ile Pro Ala Asn Gly Lys Ile Ala Lys Asp Ala Lys Glu 2O 25 3O

Thir Wall Glin Glu Wall Ser Glu Phe Ile Ser Phe Ile Thir Ser Glu 35 4 O 45

Ala Ser Asp Cys Glin Arg Glu Lys Arg Lys Thir Ile Asn Gly Asp SO 55 6 O

Asp Luell Luell Trp Ala Met Ala Thir Luell Gly Phe Glu Asp Tyr Ile Glu 65 70 8O

Pro Luell Wall Tyr Lell Glin Lys Tyr Arg Glu 85 90

SEO ID NO 52 LENGTH: 91 TYPE : PRT ORGANISM: Zea mays FEATURE: OTHER INFORMATION: G3434 conserved B domain

<4 OOs, SEQUENCE: 52

Arg Glu Glin Asp Arg Phe Lell Pro Ile Ala ASn Ile Ser Arg Ile Met 1. 5 15

Lys Lys Ala Wall Pro Ala Asn Gly Lys Ile Ala Asp Ala Glu 2O 25

Thir Luell Glin Glu Wall Ser Glu Phe Ile Ser Phe Wall Thir Ser Glu 35 4 O 45

Ala Ser Asp Cys Glin Lys Glu Lys Arg Lys Thir Ile Asn Gly Asp SO 55 6 O

Asp Luell Luell Trp Ala Met Ala Thir Luell Gly Phe Glu Glu Tyr Wall Glu 65 70 8O

Pro Luell Ile Tyr Lell Glin Lys Tyr Lys Glu 85 90

SEO ID NO 53 LENGTH: 91 TYPE : PRT ORGANISM: Arabidopsis thaliana FEATURE: OTHER INFORMATION: G1364 conserved B domain

<4 OOs, SEQUENCE: 53

Arg Glu Glin Asp Arg Phe Lell Pro Ile Ala ASn Ile Ser Arg Ile Met 1. 5 15

Lys Arg Gly Luell Pro Ala Asn Gly Lys Ile Ala Asp Ala Glu 2O 25 3O

Ile Wall Glin Glu Wall Ser Glu Phe Ile Ser Phe Wall Thir Ser Glu 35 4 O 45

Ala Ser Asp Cys Glin Arg Glu Lys Arg Lys Thir Ile Asn Gly Asp SO 55 6 O

Asp Luell Luell Trp Ala Met Ala Thir Luell Gly Phe Glu Asp Met Glu 65 70 8O

Pro Luell Wall Tyr Lell Met Arg Tyr Arg Glu 85 90

SEO ID NO 54 LENGTH: 91 TYPE : PRT ORGANISM: Glycine ax FEATURE: OTHER INFORMATION: G3475 conserved B domain US 8,633,353 B2 111 112 - Continued <4 OOs, SEQUENCE: 54

Arg Glu Glin Asp Arg Phe Lieu Pro Ile Ala Asn. Wal Ser Arg Ile Met 1. 5 1O 15

Llys Lys Ala Lieu Pro Ala Asn Ala Lys Ile Ser Lys Asp Ala Lys Glu 2O 25 3O

Thr Val Glin Glu. Cys Val Ser Glu Phe Ile Ser Phe Ile Thr Gly Glu 35 4 O 45 Ala Ser Asp Llys Cys Glin Arg Glu Lys Arg Llys Thir Ile Asin Gly Asp SO 55 6 O

Asp Lieu. Lieu. Trp Ala Met Thir Thr Lieu. Gly Phe Glu Asp Tyr Val Glu 65 70 7s 8O Pro Lieu Lys Gly Tyr Lieu. Glin Arg Phe Arg Glu 85 90

<210s, SEQ ID NO 55 &211s LENGTH: 91 212. TYPE: PRT <213> ORGANISM: Arabidopsis thaliana 22 Os. FEATURE: <223> OTHER INFORMATION: G485 conserved B domain

<4 OO > SEQUENCE: 55

Arg Glu Glin Asp Arg Phe Lieu Pro Ile Ala Asn. Wal Ser Arg Ile Met 1. 5 1O 15

Llys Lys Ala Lieu Pro Ala Asn Ala Lys Ile Ser Lys Asp Ala Lys Glu 2O 25 3O

Thr Val Glin Glu. Cys Val Ser Glu Phe Ile Ser Phe Ile Thr Gly Glu 35 4 O 45 Ala Ser Asp Llys Cys Glin Arg Glu Lys Arg Llys Thir Ile Asin Gly Asp SO 55 6 O

Asp Lieu. Lieu. Trp Ala Met Thir Thr Lieu. Gly Phe Glu Asp Tyr Val Glu 65 70 7s 8O Pro Lieu Lys Val Tyr Lieu Gln Lys Tyr Arg Glu 85 90

<210s, SEQ ID NO 56 &211s LENGTH: 91 212. TYPE: PRT <213> ORGANISM: Glycine max 22 Os. FEATURE: <223> OTHER INFORMATION: G3476 conserved B domain

<4 OOs, SEQUENCE: 56

Arg Glu Glin Asp Arg Phe Lieu Pro Ile Ala Asn. Wal Ser Arg Ile Met 1. 5 1O 15

Llys Lys Ala Lieu Pro Ala Asn Ala Lys Ile Ser Lys Asp Ala Lys Glu 2O 25 3O

Thr Val Glin Glu. Cys Val Ser Glu Phe Ile Ser Phe Ile Thr Gly Glu 35 4 O 45 Ala Ser Asp Llys Cys Glin Arg Glu Lys Arg Llys Thir Ile Asin Gly Asp SO 55 6 O

Asp Lieu. Leu Trp Ala Met Thir Thr Lieu. Gly Phe Glu Glu Tyr Val Glu 65 70 7s 8O Pro Lieu Lys Ile Tyr Lieu. Glin Arg Phe Arg Glu 85 90

<210s, SEQ ID NO 57 &211s LENGTH: 91 212. TYPE: PRT US 8,633,353 B2 113 114 - Continued <213> ORGANISM: Arabidopsis thaliana 22 Os. FEATURE: <223> OTHER INFORMATION: G2345 conserved B domain

<4 OOs, SEQUENCE: f

Arg Glu Glin Asp Arg Phe Lell Pro Ile Ala ASn Ile Ser Arg Ile Met 5 1O 15

Lys Arg Gly Lieu. Pro Lell Asn Gly Lys Ile Ala Asp Ala Glu 2O 25 3O

Thir Met Glin Glu Wall Ser Glu Phe Ile Ser Phe Wall Thir Ser Glu 35 4 O 45

Ala Ser Asp Llys Cys Glin Arg Glu Lys Arg Lys Thir Ile Asn Gly Asp SO 55 6 O

Asp Lieu. Lieu. Trp Ala Met Ala Thir Lieu. Gly Phe Glu Asp Ile Asp 65 70 8O

Pro Lieu Lys Val Tyr Lell Met Arg Tyr Arg Glu 85 90

<210s, SEQ ID NO 58 &211s LENGTH: 91 212. TYPE: PRT <213> ORGANISM: Glycine ax 22 Os. FEATURE: 223 OTHER INFORMATION: G3474 conserved B domain

<4 OOs, SEQUENCE: 58

Arg Glu Glin Asp Arg Phe Lell Pro Ile Ala ASn Wall Ser Arg Ile Met 1. 5 1O 15

Llys Lys Ala Lieu. Pro Ala Asn Ala Lys Ile Ser Glu Ala Glu 2O 25 3O

Thir Wall Glin Glu Wall Ser Glu Phe Ile Ser Phe Ile Thir Gly Glu 35 4 O 45

Ala Ser Asp Llys Cys Glin Lys Glu Lys Arg Lys Thir Ile Asn Gly Asp SO 55 6 O

Asp Lieu. Lieu. Trp Ala Met Thir Thir Lieu. Gly Phe Glu Asp Wall Asp 65 70 8O

Pro Lieu Lys Ile Tyr Lell His Tyr Arg Glu 85 90

<210s, SEQ ID NO 59 &211s LENGTH: 91 212. TYPE: PRT <213> ORGANISM: Glycine ax 22 Os. FEATURE: 223 OTHER INFORMATION: G3478 conserved B domain

<4 OOs, SEQUENCE: 59

Arg Glu Glin Asp Arg Phe Lell Pro Ile Ala ASn Wall Ser Arg Ile Met 1. 5 1O 15

Llys Lys Ala Lieu. Pro Ala Asn Ala Lys Ile Ser Asp Ala Glu 2O 25 3O

Thir Wall Glin Glu Wall Ser Glu Phe Ile Ser Phe Ile Thir Gly Glu 35 4 O 45

Ala Ser Asp Llys Cys Glin Arg Glu Lys Arg Lys Thir Ile Asn Gly Asp SO 55 6 O

Asp Lieu. Lieu. Trp Ala Met Thir Thir Lieu. Gly Phe Glu Asp Wall Glu 65 70 8O

Pro Lieu Lys Gly Tyr Lell Glin Arg Phe Arg Glu 85 90 US 8,633,353 B2 115 116 - Continued

SEQ ID NO 60 LENGTH 91. TYPE : PRT ORGANISM: Arabidopsis thaliana FEATURE: OTHER INFORMATION: G482 conserved B domain

<4 OOs, SEQUENCE: 60

Glu Glin Asp Arg Phe Lell Pro Ile Ala ASn Wall Ser Arg Ile Met 5 1O 15

Lys Ala Leul Pro Ala Asn Ala Lys Ile Ser Asp Ala Glu 2O 25 3O

Thir Met Glin Glu Cys Wall Ser Glu Phe Ile Ser Phe Wall Thir Gly Glu 35 4 O 45

Ala Ser Asp Glin Lys Glu Lys Arg Lys Thir Ile Asn Gly Asp SO 55 6 O

Asp Luell Luell Trp Ala Met Thir Thir Lieu. Gly Phe Glu Asp Wall Glu 65 70 8O

Pro Luell Val Tyr Lell Glin Arg Phe Arg Glu 85 90

SEQ ID NO 61 LENGTH 91. TYPE : PRT ORGANISM: Zea mays FEATURE: OTHER INFORMATION: G3435 conserved B domain

<4 OOs, SEQUENCE: 61

Arg Glu Glin Asp Arg Phe Lell Pro Ile Ala ASn Wall Ser Arg Ile Met 1. 5 1O 15

Lys Lys Ala Leul Pro Ala Asn Ala Lys Ile Ser Asp Ala Glu 2O 25 3O

Thir Wall Glin Glu Cys Wall Ser Glu Phe Ile Ser Phe Ile Thir Gly Glu 35 4 O 45

Ala Ser Asp Glin Arg Glu Lys Arg Lys Thir Ile Asn Gly Asp SO 55 6 O

Asp Luell Luell Trp Ala Met Thir Thir Lieu. Gly Phe Glu Asp Wall Glu 65 70 8O

Pro Luell His Tyr Lell His Phe Arg Glu 85 90

SEQ ID NO 62 LENGTH 91. TYPE : PRT ORGANISM: Glycine ax FEATURE: OTHER INFORMATION: G3472 conserved B domain

SEQUENCE: 62

Glu Glin Asp Arg Phe Lell Pro Ile Ala ASn Wall Ser Arg Ile Met 5 1O 15

Lys Ala Leul Pro Ala Asn Ala Lys Ile Ser Glu Ala Glu 2O 25 3O

Thir Wall Glin Glu Cys Wall Ser Glu Phe Ile Ser Phe Ile Thir Gly Glu 35 4 O 45

Ala Ser Asp Lys Cys Glin Lys Glu Lys Arg Lys Thir Ile Asn Gly Asp SO 55 6 O

Asp Luell Luell Trp Ala Met Thir Thir Lieu. Gly Phe Glu Glu Wall Glu 65 70 7s 8O US 8,633,353 B2 117 118 - Continued

Pro Lieu Lys Val Tyr Lieu. His Llys Tyr Arg Glu 85 90

<210s, SEQ ID NO 63 &211s LENGTH: 91 212. TYPE: PRT <213> ORGANISM: Zea mays 22 Os. FEATURE: <223> OTHER INFORMATION: G3436 conserved B domain

<4 OOs, SEQUENCE: 63

Arg Glu Glin Asp Arg Phe Lieu Pro Ile Ala Asn Val Ser Arg Ile Met 1. 5 1O 15

Lys Lys Ala Lieu Pro Ala Asn Ala Lys Ile Ser Lys Asp Ala Glu 2O 25 3O

Thir Val Glin Glu. Cys Val Ser Glu Phe Ile Ser Phe Ile Thir Gly Glu 35 4 O 45

Ala Ser Asp Llys Cys Glin Arg Glu Lys Arg Llys Thir Ile Asn Gly Asp SO 55 6 O

Asp Lieu. Leu Trp Ala Met Thir Thr Lieu. Gly Phe Glu Asp Wall Glu 65 70 7s 8O

Pro Lieu Lys Lieu. Tyr Lieu. His Llys Phe Arg Glu 85 90

<210s, SEQ ID NO 64 &211s LENGTH: 91 212. TYPE: PRT <213> ORGANISM: Oryza sativa 22 Os. FEATURE: <223> OTHER INFORMATION: G3397 conserved B domain

<4 OOs, SEQUENCE: 64

Arg Glu Glin Asp Arg Phe Lieu Pro Ile Ala Asn Val Ser Arg Ile Met 1. 5 1O 15

Lys Lys Ala Lieu Pro Ala Asn Ala Lys Ile Ser Lys Asp Ala Glu 2O 25 3O

Thir Val Glin Glu. Cys Val Ser Glu Phe Ile Ser Phe Ile Thir Gly Glu 35 4 O 45

Ala Ser Asp Llys Cys Glin Arg Glu Lys Arg Llys Thir Ile Asn Gly Asp SO 55 6 O

Asp Lieu. Leu Trp Ala Met Thir Thr Lieu. Gly Phe Glu Asp Wall Asp 65 70 7s 8O

Pro Lieu Lys His Tyr Lieu. His Llys Phe Arg Glu 85 90

<210s, SEQ ID NO 65 &211s LENGTH: 91 212. TYPE: PRT <213> ORGANISM: Oryza sativa 22 Os. FEATURE: <223> OTHER INFORMATION: G3395 conserved B domain

<4 OOs, SEQUENCE: 65

Arg Glu Glin Asp Arg Phe Lieu Pro Ile Ala Asn. Ile Ser Arg Ile Met 1. 5 1O 15

Lys Lys Ala Val Pro Ala Asn Gly Lys Ile Ala Lys Asp Ala Glu 2O 25 3O

Thir Leu Gln Glu. Cys Val Ser Glu Phe Ile Ser Phe Val Thir Ser Glu 35 4 O 45

Ala Ser Asp Llys Cys Glin Lys Glu Lys Arg Llys Thir Ile Asn Gly Glu US 8,633,353 B2 119 120 - Continued

SO 55 6 O Asp Lieu. Lieu. Phe Ala Met Gly. Thir Lieu. Gly Phe Glu Glu Tyr Val Asp 65 70 7s 8O Pro Lieu Lys Ile Tyr Lieu. His Llys Tyr Arg Glu 85 90

<210s, SEQ ID NO 66 &211s LENGTH: 91 212. TYPE: PRT <213> ORGANISM: Oryza sativa 22 Os. FEATURE: <223> OTHER INFORMATION: G3398 conserved B domain

<4 OOs, SEQUENCE: 66 Arg Glu Glin Asp Arg Phe Lieu Pro Ile Ala Asn. Wal Ser Arg Ile Met 1. 5 1O 15 Lys Arg Ala Lieu Pro Ala Asn Ala Lys Ile Ser Lys Asp Ala Lys Glu 2O 25 3O Thr Val Glin Glu. Cys Val Ser Glu Phe Ile Ser Phe Ile Thr Gly Glu 35 4 O 45 Ala Ser Asp Llys Cys Glin Arg Glu Lys Arg Llys Thir Ile Asin Gly Asp SO 55 6 O Asp Lieu. Lieu. Trp Ala Met Thir Thr Lieu. Gly Phe Glu Asp Tyr Ile Asp 65 70 7s 8O Pro Lieu Lys Lieu. Tyr Lieu. His Llys Phe Arg Glu 85 90

<210s, SEQ ID NO 67 &211s LENGTH: 98 212. TYPE: PRT <213> ORGANISM: Zea mays 22 Os. FEATURE: <223> OTHER INFORMATION: G3866 conserved B domain

<4 OO > SEQUENCE: 67 Arg Glu Glin Asp Arg Phe Lieu Pro Ile Ala Asn. Ile Ser Arg Ile Met 1. 5 1O 15 Llys Lys Ala Ile Pro Ala Asn Gly Lys Thir Ile Pro Ala Asn Gly Lys 2O 25 3O Ile Ala Lys Asp Ala Lys Glu Thr Val Glin Glu. Cys Val Ser Glu Phe 35 4 O 45 Ile Ser Phe Ile Thir Ser Glu Ala Ser Asp Llys Cys Glin Arg Glu Lys SO 55 6 O Arg Llys Thir Ile Asn Gly Asp Asp Lieu. Lieu. Trp Ala Met Ala Thir Lieu 65 70 7s 8O Gly Phe Glu Asp Tyr Ile Glu Pro Lieu Lys Val Tyr Lieu. Glin Llys Tyr 85 90 95 Arg Glu

<210s, SEQ ID NO 68 &211s LENGTH: 91 212. TYPE: PRT <213> ORGANISM: Oryza sativa 22 Os. FEATURE: <223> OTHER INFORMATION: G3396 conserved B domain

<4 OOs, SEQUENCE: 68 Lys Glu Glin Asp Arg Phe Lieu Pro Ile Ala Asn. Ile Gly Arg Ile Met 1. 5 1O 15

Arg Arg Ala Val Pro Glu Asn Gly Lys Ile Ala Lys Asp Ser Lys Glu US 8,633,353 B2 121 122 - Continued

2O 25 3O Ser Val Glin Glu. Cys Val Ser Glu Phe Ile Ser Phe Ile Thr Ser Glu 35 4 O 45 Ala Ser Asp Llys Cys Lieu Lys Glu Lys Arg Llys Thir Ile Asin Gly Asp SO 55 6 O Asp Lieu. Ile Trp Ser Met Gly. Thir Lieu. Gly Phe Glu Asp Tyr Val Glu 65 70 7s 8O Pro Lieu Lys Lieu. Tyr Lieu. Arg Lieu. Tyr Arg Glu 85 90

<210s, SEQ ID NO 69 &211s LENGTH: 89 212. TYPE: PRT <213> ORGANISM: Oryza sativa 22 Os. FEATURE: <223> OTHER INFORMATION: G3429 conserved B domain

<4 OOs, SEQUENCE: 69 Thir Asn Ala Glu Lieu Pro Met Ala Asn Lieu Val Arg Lieu. Ile Llys Llys 1. 5 1O 15 Val Lieu Pro Gly Lys Ala Lys Ile Gly Gly Ala Ala Lys Gly Lieu. Thr 2O 25 3O His Asp Cys Ala Val Glu Phe Val Gly Phe Val Gly Asp Glu Ala Ser 35 4 O 45 Glu Lys Ala Lys Ala Glu. His Arg Arg Thr Val Ala Pro Glu Asp Tyr SO 55 6 O Lieu. Gly Ser Phe Gly Asp Lieu. Gly Phe Asp Arg Tyr Val Asp Pro Met 65 70 7s 8O Asp Ala Tyr Ile His Gly Tyr Arg Glu 85

<210s, SEQ ID NO 70 &211s LENGTH: 831 &212s. TYPE: DNA <213> ORGANISM: artificial sequence 22 Os. FEATURE: <223> OTHER INFORMATION: P46 expression vector (for preparing 35S: : G481) <4 OO > SEQUENCE: 7 O agcgtttcgt agaaaaattic gattt citcta aagcc ctaaa actaaaacga citatic cc caa 6 O titccaagttc tagggitttico atct tcc.cca atctagtata aatggcggat acgc.ctt cqa 12 O gcc.cagctgg agatggcgga gaaag.cggcg gttc.cgittag ggagcaggat catacctt C 18O citat agctaa tat cagcagg at catgaaga aag.cgttgcc ticcitaatggit aagattggaa 24 O aagatgctaa ggatacagtt Caggaatgcg tctctgagtt cat cagctt C at Cactagog 3OO aggc.ca.gtga taagtgtcaa aaa.gagaaaa ggaaaactgt gaatggtgat gatttgttgt 360 gggcaatggc aac attagga tittgaggatt acctggalacc tictaaagata tacct agcga 42O ggtacaggga gttggagggit gataataagg gat Caggaaa gagtggagat ggatcaaata 48O gagatgctgg toggtgtt totggtgaag aaatgcc.gag Ctggtaaaag aagttgcaa.g 54 O tagtgattaa gaacaatcgc caaatgat ca agggaaatta gagat cagtg agttgttitat 6OO agttgagctg atcgacaact attitcgggitt tact citcaat titcggittatgttagtttgaa 660 cgtttggttt attgtttic cq gtttagttgg ttg tatttaa agatttctict gttagatgtt 72 O gagaac actt gaatgaagga aaaatttgtc. cacat cotgt tdttattitt c gatt cactitt 78O cggaattit.ca tagctaattt attct cattt aataccalaat cct taaatta a 831 US 8,633,353 B2 123 124 - Continued

<210s, SEQ ID NO 71 &211s LENGTH: 11.83 &212s. TYPE: DNA <213> ORGANISM: artificial sequence 22 Os. FEATURE: <223> OTHER INFORMATION: P5287 expression vector (LTP1 : : G481) <4 OOs, SEQUENCE: 71 gatatgacca aaatgattaa cittgcattac agttgggaag tat caagtaa acaac attitt 6 O gttitttgttt gat atcggga atctoaaaac caaagtic cac act agtttitt ggactatata 12 O atgataaaag toagatat ct actaatact a gttgat cagt at attcaaa acatgactitt 18O c caaatgtaa gttatt tact tttitttittgc tattataatt aagat caata aaaatgtcta 24 O agttittaaat ctittat catt at atccaaac aat cataatc ttattgttaa tot ct catca 3OO acacacagtt tittaaaataa attaattacc ctittgcatga taccgaagag aaacgaattic 360 gttcaaataa ttittataa.ca ggaaataaaa tagataacco aaataaacga tagaatgatt 42O t cittag tact aact cittaac aac agittitta tittaaatgac titttgtaaaa aaaacaaagt 48O taacttatac acgtacacgt gttctgaaaata ttattgacaa toggatagcat gattct tatt 54 O agagt catgt aaaagataaa cacatgcaaa tatatatatgaataatatgt tdttalagata 6OO aact agacga ttagaatata tag cacat ct at agtttgta aaataactat ttct caacta 660 gacittaagtic titcgaaatac ataaataaac aaaactataa aaatt cagaa aaaaacatga 72 O gagtacgitta gtaaaatgta ttttitttggit aaaataatca cittitt catca ggit ctitttgt 78O aaag cagttt to atgttaga taaacgagat tittaattittt tittaaaaaaa gaagtaaact 84 O aactatgttc citat ct acac acctataatt ttgaacaatt acaaaacaac aatgaaatgc 9 OO aaagaagacg tagggcactg. t cacactaca atacgattaa taaatgtatt ttggit cqaat 96.O taataactitt coatacgata aagttgaatt alacatgtcaa acaaaagaga tigagtggtc.c 1 O2O tata catagt taggaattag gaacctictaa attaaatgag tacaiaccacc aac tact cot 108 O t ccctictata at citat cqca ttcacaccac atalacatata cqtacct act ctatataa.ca 114 O

Ctcact CCCC aaactictott Cat CatcCat Cactacacac atc. 11.83

<210s, SEQ ID NO 72 &211s LENGTH: 831 &212s. TYPE: DNA <213> ORGANISM: artificial sequence 22 Os. FEATURE: <223> OTHER INFORMATION: P6812 expression vector (opLexA. : : G481)

<4 OOs, SEQUENCE: 72 agcgtttcgt agaaaaattic gattt citcta aagcc ctaaa actaaaacga citatic cc caa 6 O titccaagttc tagggitttico atct tcc.cca atctagtata aatggcggat acgc.ctt cqa 12 O gcc.cagctgg agatggcgga gaaag.cggcg gttc.cgittag ggagcaggat catacctt C 18O citat agctaa tat cagcagg at catgaaga aag.cgttgcc ticcitaatggit aagattggaa 24 O aagatgctaa ggatacagtt Caggaatgcg tctctgagtt cat cagctt C at Cactagog 3OO aggc.ca.gtga taagtgtcaa aaa.gagaaaa ggaaaactgt gaatggtgat gatttgttgt 360 gggcaatggc aac attagga tittgaggatt acctggalacc tictaaagata tacct agcga 42O ggtacaggga gttggagggit gataataagg gat Caggaaa gagtggagat ggatcaaata 48O gagatgctgg toggtgtt totggtgaag aaatgcc.gag Ctggtaaaag aagttgcaa.g 54 O tagtgattaa gaacaatcgc caaatgat ca agggaaatta gagat cagtg agttgttitat 6OO

US 8,633,353 B2 127 128 - Continued ttagaaaaac taggtggaga atacagtaaa aaaattcaaa aagtgcatat ttittggaca 72 O acattaatat gtacaaatag titt acattta aatgt attat titt actaatt aagtacatat 78O aaagttgcta aactaalacta atataattitt tdcataagta aattitat cqt taaaagttitt 84 O

Ctttctagoc actaaacaac aatacaaaat cqc coaagtc acccattaat taatttagaa 9 OO gtgaaaaa.ca aaatcttaat tatatggacg atc.ttgtcta ccatatttca agggctacag 96.O gcct acagcc gcc gaataaa tottaccago cittaalaccag aacaacggca aataagttca O2O tgtgg.cggct ggtgatgatt cacaatttico Cogacagttc tatgataatgaaactatata O8O attattgtac gtacatacat gcatgcgacg aacaacactt caatttaatt gttag tatta 14 O aattacattt at agtgaagt atgttgggac gattagacgg atacaatgca cittatgttct 2OO ccggaaaatgaat catttgt gttcagagca tact coaag agt caaaaaa gtt attaaat 26 O ttatttgaat ttaaaactta aaaatagtgt aatttittaac caccc.gctgc cqcaaacgtt 32O ggcggaagaa tacgcggtgt taaacaattt ttgttgat cqt ttcaaacat ttgta accgc 38O aatc to tact gcacaatctg ttacgtttac aatttacaag titagtataga agaacgttcg 44 O tacctgaaga cca accqacc tittagittatt gaataaatga ttatttagtt aagagta aca SOO aaat caatgg ttcaaatttgtttct ctitco ttact tc.tta aattittaatc atggaagaaa 560 caaagttcaac giga catccaa titatggccta atcatct cat t ct cotttca acaaggcgaa 62O tcaaatctitc tittatacgta at atttattt gccagotctgaaatgitat acc aaatcattitt 68O taaattaatt gcc taaatta ttagaacaaa aact attagt aaataactaa ttagt ctitat 74 O gaaact agaa atcgagatag tiggaatatag agaga cacca ttaaatt cac aaaat cattt 8OO ttaaattacc taaatt atta caacaaaaac tattaga cag aactaagttct ataatgaaac 86 O gagagat.cgt atttggaatg tagagcgaga gacaattitt C aattic attga atatataagc 92 O aaaattatat agc.ccgtaga Ctttggtgag atgaagttcta agtacaaaca actgaatgaa 98 O tittataatca ataatattga ttatattgtg attagaaaaa gaaaacaact togcgittattt 2O4. O ttcaat atta ttgtgaggat taatgtgaac atggaatcgt gtttct c ct g aaaaaaatat 21OO cagdatagag cittagaacaa tataaatata t coaccaaaa ataacttcaa catttittata 216 O caactaatac aaaaaaaaaa aagcaaactt tttgtatata taaataaatt tdaaaactica 222 O alaggtoggtc agtacgaata agacacaiaca act actataa attagaggac tittgaagaca 228O agtaggittaa citagaa catc cittaatttct aaacctacgc actictacaaa agatt catca 234 O aaaggagtaa aagactaact ttct c 23.65

<210s, SEQ ID NO 75 &211s LENGTH: 1510 &212s. TYPE: DNA <213> ORGANISM: artificial sequence 22 Os. FEATURE: <223> OTHER INFORMATION: P9002 expression vector (RD29A: ; LexA-GAL.4TA) <4 OO > SEQUENCE: 75 ggttgctatg gtagggacta tigggtttitc ggatt CC ggt ggaagtgagt ggggaggcag 6 O tggcggaggit aagggagttcaagattctgg aactgaagat ttggggttitt gcttittgaat 12 O gtttgcgttt ttg tatgatg cct citgtttgttgaactittga tigt atttitat citttgttgttga 18O aaaagagatt gggittaataa aatatttgct tttittggata agaaact citt ttagcggcc c 24 O attaataaag gttacaaatg caaaatcatgttagcgt cag at atttaatt atticgaagat 3OO gattgttgata gatttaaaat tat cotagtc. aaaaagaaag agtaggttga gcagaaa.ca.g 360 US 8,633,353 B2 129 130 - Continued tgacatctgt tdtttgtaccatacaaatta gtttagatta ttggittaa.ca tottaaatgg 42O citatgcatgt gacatttaga ccttatcgga attaatttgt agaattatta attaagatgt 48O tgattagttcaaacaaaaat titt at attaa aaaatgtaaa cqaat attitt g tatgttcag 54 O tgaaagtaaa acaaattaaa ttaacaagaa acttatagaa gaaaatttitt act atttaag 6OO agaaagaaaa aaatctat catttaatctgagtic ctaaaaa citgttatact taacagttaa 660 cgcatgattt gatggaggag C catagatgc aattcaatca aactgaaatt totgcaagaa 72 O t citcaaacac ggagat ct ca aagtttgaaa gaaaattitat ttctt cact caaaacaaac 78O ttacgaaatt tagg tagaac ttatatacat tatattgtaa tttitttgtaa caaaatgttt 84 O ttattatt at tatagaattt tactggittaa attaaaaatgaatagaaaag gtgaattaag 9 OO aggagagagg aggtaaac at titt Cttct at tttitt catat titt caggata aattattgta 96.O aaagtttaca agattt coat ttgactagtg taaatgagga at attcticta gtaagat cat O2O tattt catct acttcttitta t cittctacca gtagaggaat aaacaatatt tagct cottt O8O gtaaatacaa attaattitt.c citt cittgaca to attcaatt ttaattittac gtataaaata 14 O aaagat cata cct attagaa cattaagga gaaatacaat t caatgaga aggatgtgcc 2OO gtttgttata ataaacagcc acacgacgta aacgtaaaat gaccacatga tigggc caata 26 O gacatggacc gac tactaat aatagtaagt tacattt tag gatggaataa atat catacc 32O gacatcagtt ttgaaagaaa agggaaaaaa agaaaaaata aataaaagat atact accga 38O

Catgagttcc aaaaagcaaa aaaaaagat.c aag.ccgacac agacacgcgt agaga.gcaaa 44 O atgactittga cqt cacacca cqaaaacaga cqctt catac gtgtc.ccttt atctotctoa SOO gtct ct citat 510

<210s, SEQ ID NO 76 &211s LENGTH: 48O1 &212s. TYPE: DNA <213> ORGANISM: artificial sequence 22 Os. FEATURE: <223> OTHER INFORMATION: P5319 expression vector (AS1 : : LexA-GAL.4TA) <4 OO > SEQUENCE: 76 ggaccotgta atgggc catt gggccaagtt ttcttgatat aaaatctgaa atact actaa 6 O attacaattt ttcttaaact cqattt cata att catgtgg gacticagttc. tcc.gc.gt citt 12 O atgact taag agittaa.gagt aaaga caatt gattgtagtt to attatta aggttgttgat 18O tittaaaggct at attggc cc aggcaaagtg gtt atgaaag ttaaaaggta t tatt aaatg 24 O tcgittatgga ctagot aaag aaaagagatg gatatagaala C9gatttgcc agtttgtgag 3OO gttacgtact cqttacttitc tattgcattt ttgttgttgtca ttgttgcttgt gatttctitta 360 gtatatgttt ttcttitttgt caaactictitt agtacatgtt atgctittatt ttcttgttta 42O gcattgtt at tdttattittg atc catgttc titt tacttaa totgtagagt gttcacgtac 48O gact ctittat gatcgctata ctaatatact atgaaact cq aatgagaaca togcatgtcat 54 O aaat caataa alacatalacat acgacactta acctaaatca tacatt catt gatt catact 6OO at catgat co toat cacatt agtat cattt gtc.tttattt attacttagc tactitcqtta 660 tott attata t ctittacctg ttctgctggit catttgc cat aaacaccaag tacaa.gcaac 72 O t ctittagt cc aatat cagac caaattaa.ca aacattt coc caatccaaaa cqgaaattta 78O attataatta gcatttaaat aggttcgatt acaaaaaaaa atcaacaaag gaacaagtica 84 O attt cataat gigtttgttcaa ttgtcacaca acgaaatggc tagcc.ggat.c aag catgcat 9 OO US 8,633,353 B2 131 132 - Continued gatccaaatt toaacatttic catgataacc tdaattataa cqtctacata aaccatattt 96.O aaataaatag gatggit cqaa agatat catt aaaagaacga ttcaatatt c tittattgttc O2O aattgataca catgttatt c ticcittaacca gttatgaaca tdtcc tacaa gtttcttgac O8O c caaact cat aattt catat accataatcc caagttaagt tttitttittitt toggggat caa 14 O aatctoaagt taagttaagt toaattattt agctgtaatig citcggaaaaa agat.cggatg 2OO aatat coaat tdgttcaata tat accc.caa to cqgccaat citc cctatot ttatagotta 26 O attattagag aatggit caat t cacgc.catc agaac cagtt toatat ctitc atgaaccaaa 32O acgc.ctacaa ccct attatt caagaaatca citataattgt ccaagtaaaa ccattaatta 38O accoagt cqa tttittctato gtc.ctatagg catgttgtta citcaaactac tdattaatta 44 O ataagaagtt gtagtttgaa aaagaatcta gctgaaaaat act cotactic taagaattta SOO agittagaata aaa.cat atta atacaaatat aaaaatttag titattaaaaa agcgctact a 560 c caagacgt.c ctaaagaaaa act agctttgttcttctaaaa gaaaacctag cittaact acc 62O caaaaaaatc tagttttaca aac actaaag acaaattitta tttittcaa.ca aatttaccala 68O ttaaagaaaa titc catgtag gaatgitat co aaattgaaaa tat coctaca tattttgtag 74 O gaaaaaaggt ttittataaat attaaaaaaa Cagaaaaag aaaagagaaa agagaaaaaa 8OO aaaag.ccgga gagaatggag cacatgaggit aaaaggcaag agatggcaga gagaagat.ca 86 O gaga agggat citgcct caat ttgacaactic atatgtcatgtcatttic cct cactact att 92 O attitt.cct at ttcaaaaa.ca cctitt citctg ataccat cac ctitttacctt citc.tttittitt 98 O ttactgtctt tdotctgttt Cacattcc ct tctatatata cagtatagta tattttatcc 2O4. O ttcttittatt gttittgctta ctaaaagttt tttitcct cog gaatcaaaat t ctaaaatgt 21OO at at catgtt aggtogcgag ggc catgcaa tatt atgaac tatgcatgat gattaatgtc. 216 O tgtggat.cca t cacaaat at tattgaaggit tat cagaga citatggacca aaatggit cog 222 O aatcgc.ctga taataaaaaa citatt cattt ttatttittta tttitttittat taalacatgtg 228O attaatgata gat cittacga titcgcaactg ggaaacatgc actaact caa acttaaaa.ca 234 O cacaatacta aaagttct at taaattittga atgtaaagag aaatatatta ggcaatcaaa 24 OO cggit caagta aat catacac atcgataatt tatttittitta t cottcaaag caggcc catc 246 O caaggcc.cac cactatt citc at atcaa.cat acttittcttgttittggittaa atcaacctac 252O catgttggct gttct citccg ctic ct ctdtg taagat caca cca acaccac togcataattit 2580 cittg tatt at tittgagacitt gagagtaaac tdattgacaa aaaaaaaaaa aaaaaaaaaa 264 O aattgagagt aaactagttt Cttgaatatt gattittittca gcttaatttgttggggaaag 27 OO at attactac tattgctgta aaaaaaaaaa aaaaaaaaaa agatattatt act at atttg 276 O tagtgattitt attittgaaaa ttct citt cac tttitttgtag ttaac attct aattttgttga 282O aaagaactitt taatgtcagg to atgtct ct taaaaagttt gcatgatgaa atgatttaca 288O aattacaata gaaaatggaa accattgcaa actaaattitt tat caaaaaa aatcgaaaat 294 O aaaatgtatt gacittagtaa togctgtgtct gctacgatta act attacac ataatgcaac 3 OOO actgaattat coaaatacat tattagaata atagt attac agitat cact a ttacaacaac 3 O 6 O aatgtcaa.ca ataatctitat tataataata tataaataga cct tagtgac atcat attat 312 O atagaaaa.ca ttggttgcc taatttgt at aagctagata cittgggggtg atgagtgact 318O agttgatgca atgataaaag agtgaaagtt ttgtctgcct gattataga C gtcggagaaa 324 O tact aaaata cqctatgaag attittggcgc atggtag cag aaaaaaaaaa C9gagggtgt 33 OO US 8,633,353 B2 133 134 - Continued gagtgagtag titagt cqg atgtgatgga acaaagaaaa gtatttittgg tagggittatg 3360 ggagagagaa goggaccatt attacacact tacatgctitt coccaaaaga taccatt coc 342O attittctgac acgtgtcc cc ct catc.ccca attact cata cqtcaaatcc aatttittagc 3480 ctaaaagttt tttittatttgtttagccaaa totattittac taattaaagt tttcaaatgg 354 O caaatagaaa gat cittctaa gottttataa aattacttga ttatttctag titttgct cat 36OO tttittaaata aaatttct ct ttitttittctt gcaac attat tdatttittitt tttgataggg 366 O agtaac atta gtgatgttct atctottct c attgcaaaaa ctittattitt c ticatctotat 372 O ttgat catca ttgcgaaatc titc catttitc aacaaatact titt coatgtt aatatgctgt 378 O ttcaaaatat aagttgtttgg aaaataaatc aacaagttta aatgttaact attitt tatgc 384 O tattataatt atttitt citta togggtaagtg gaaattaatgttact caaat tdgacataaa 3900 attctattgt ttgagtgaag gagtttataa atggagcatt attitt cittga atggittagtt 396 O tittcttctat cattttgaca agtaaatgac ttitt cagcca citaaagtaca acactttitt c 4 O2O atttaaattit aaa.gcatc.cc ctacattaga ttgtcattitt atttct cata atgttataga 4 O8O aaaatgaatt ttgagat.ccc aatgtagtaa atatatataa aaaaaggttt aatattgtca 414 O atgacaaaca acgaactitat ggaatttcaa cittitt cacct c cacgc.gc.ct c to tcagagt 42OO tttittttitt c cccacttgtg atgtaaaaag giggaaaacgt ctdtgtctica gtcgg taaac 426 O tttittct citc tittttitttitt taaagattitt attittaatta togcc.gtc.tct gtggtctaat 432O cgtgtacgt.c gtctggittitt aaaagcct ct ct cactittgg totttitcgtt ttct citcttic 438 O cattttct Co aactatataa aaaaaaaaaa gtgagagaga gag caaatct gtgttgatgga 4 44 O agttgct citt gagtttggga ttatttatct tttcaat atc atttggtaag catttittatt 4500 ttgttittata gtaataattt taact ct citt atctt cittaa taagttctittg cittaatagtg 456 O ttittgggg.tc agcattaatt toccctgttt ggttt coaga atataggttg tatagtgtga 462O taataacaaa ttatt coaag titttgct tca aac attgtca aagttitttgt cattitt catt 468O t cittgaaacg gaaatttitt c agacitttgta atttctaatt cqaaaatticg acagat cittg 474. O tagatttgtt togatcttitt agagttittga attggagaga tittatgaaac giggttgattit 48OO t 48O1

<210s, SEQ ID NO 77 &211s LENGTH: 4276 &212s. TYPE: DNA <213> ORGANISM: artificial sequence 22 Os. FEATURE: <223> OTHER INFORMATION: P6506 expression vector (35S : : LexA-GAL.4TA) <4 OO > SEQUENCE: 77 catgcc tigca ggit coccaga ttagocttitt caattitcaga aagaatgcta acccacagat 6 O ggittagagag gcttacgcag Caggit ct cat Caagacgatc taccc.gagca ataatct coa 12 O ggaaatcaaa tacct tcc.ca agaaggittaa agatgcagtic aaaagattica ggactaactg 18O catcaagaac acagagaaag atatatttct caagat caga agtact attic cagtatggac 24 O gattcaaggc titgctt caca aacca aggca agtaatagag attggagtict Ctaaaaaggit 3OO agttcc cact gaatcaaagg C catggagtic aaagattcaa atagaggacc taacagaact 360 cgc.cgtaaag actggcgaac agttcataca gag tdtctta cact caatg acaagaagaa 42O aatctt cqtc aacatggtgg agcacgacac acttgtctac tocaaaaata t caaagatac 48O agtict cagaa gaccalaaggg caattgagac ttittcaacaa agggtaatat ccggaalacct 54 O

US 8,633,353 B2 139 140 - Continued cgcaccatct tct tcaagga cacggcaac tacaaga ccc gcgc.cgaggit gaagttcgag ggcgacaccc tigtgaaccg catcgagctgaagggcatcg acttcaagga ggacggcaac 84 O atcCtggggg acaagctgga gtacalactac alacagccaca acgt.ctatat catggc.cgac 9 OO alagcagaaga acggcatcaa ggtgaact tc aagat.ccgcc acaac atcga ggacggcagc 96.O gtgcagct C ccgacCacta C cagcagaac acc cc catcg gcgacggc.cc cqtgctgctg cc.cgacaa.cc act acctgag cacccagt cc gcc ctgagca aagaccc.cala Cagaag.cgc 108 O gat cacatgg toctgctgga gttcgtgacc gcc.gc.cggga t cact ct cqg catggacgag 114 O

Ctgtacaagt ccggagggat cct ctag 1167

SEO ID NO 79 LENGTH: 1034 TYPE: DNA ORGANISM: artificial sequence FEATURE: OTHER INFORMATION: P47 expression vector (for preparing 35S ; : G482)

SEQUENCE: 79 acacttaa.ca attcacacct tct citttitta citct tcc taa aaccotaaat titcct cqctt 6 O cagt ct tccc act caagtica accaccaatt gaatt cqatt togaat catt gatggaaatg 12 O atttgaaaaa agagtaaagt ttatttittitt attcc ttgta attitt cagaa atgggggatt 18O ccgacaggga titc.cggtgga gggcaaaacg ggaacaacca gaacgga cag ticcitcCttgt 24 O

CtcCaagaga gcaaga Cagg ttcttgc.cga t cqctaacgt cagc.cggat C atgaagaagg 3OO cCttgcc.cgc caacgc.caag at Ctctaaag atgcc-aaaga gacgatgcag gagtgtgtct 360 cc.gagttcat cagctt.cgt.c accggagaag Catctgataa gtgtcagaag gagaagagga agacgatcaa C9gagacgat ttgctctggg ctatgactac totaggittitt gaggattatg ttgagc catt gaaagtttac ttgcagaggt ttagggagat Caaggggag aggactggac 54 O tagggaggcc acagactggt ggtgaggit cq gaga.gcatca gagagatgct gtcggagatg gcggtgggitt Ctacggtggit ggtggtggga tigcagtatica C caac at cat cagttt Ctt C 660 accagcagaa ccatatgt at ggagccacag gtggcggtag cacagtgga ggtggagctg 72 O Cctic.cggtag gaCaaggact taacaaagat ttgaagtg gat ct ct ct c titat at aga tacataaata catgtataca catgcctatt tttacgaccc atataaggta t citat catgt 84 O gatagaacga acattggtgt ttgatgta aaatcagatgtc.attalagg gtttagattt 9 OO tgaggctgtg taaaagaaga t caagtgtgc tittgttggac aataggattic act aacgaat 96.O ctgctt catt ggat.cttgta totaactaaa goc attgtat togaatgcaaa tdttitt catt tgggatgctt taala 1034

SEQ ID NO 8O LENGTH: 1034 TYPE: DNA ORGANISM: artificial sequence FEATURE: OTHER INFORMATION: P5072 expression vector (opLexA. : : G482)

SEQUENCE: 8O acacttaa.ca attcacacct tct citttitta citct tcc taa aaccotaaat titcct cqctt 6 O cagt ct tccc act caagtica accaccaatt gaatt cqatt togaat catt gatggaaatg 12 O atttgaaaaa agagtaaagt ttatttittitt attcc ttgta attitt cagaa atgggggatt 18O ccgacaggga titc.cggtgga gggcaaaacg ggaacaacca gaacgga cag ticcitcCttgt 24 O

US 8,633,353 B2 145 146 - Continued cCaagagt cc cqgaggctgt ggaagc.catg agagtggtgg agatcaaagt ccCaggit cqt 12 O tacatgttcg tdagcaagat aggtttctitc cqattgctaa cataag.ccgt at catgaaaa 18O gaggtott Co. tcc taatggg aaaatcgcta aagatgctaa ggagattgttg Caggaatgtg 24 O t citctgaatt cat cagtttc gttcaccagcg agtatgtc.tt ttitta acctt ttaacaatgc 3OO tatgttgttgt ttgct cittgc tigaatgttta gaggcttgat aggaactittg agttt tagtt 360 t caagttctta aactittatag tdatttittgg gtttittctta act agtgaaa gttgaggcct 42O ttittagt cct tacattggitt act ctittctt gttgttcatc tttgtttggc tittgatgttg 48O catatttgat gatgcaatgg tttittctatg acgttgatga gcagagc gag tataaatgt 54 O caaagagaga aaaggalagac tattaatgga gatgatttgc tittgggcaat ggctactitta 6OO ggatttgaag act acatgga acct Ctcaag gtttacctga tigagatatag agagg tatgt 660 tctgagtacg atgttt atta gttatgctgt tdtgttgaaga gagtttittga attt cactaa 72 O cittact ctdt ttcticcittitt cotttitcttg ctic tact acg togaactggaa tat cacagat 78O ggaggit cagt gigttcttctt attct attitc ttgttcattc ctaaaacttig caacgatcta 84 O tgat ctittat ggctt.ccgtgaat catct ct ttttgtcaac titatgtatica aac tattitat 9 OO gctitt CCtta t ccttittcaa tat caatag ggtgacacaa agggat.ca.gc aaaaggtggg 96.O gatcCaaatg caaagaaaga tigggcaatca agccaaaatg gcc aggtata ttct calaat 1 O2O ct catgcctt taccatttt titcgt.ctt cq cattcatctg. cacg tatt ct citct tcc.gct 108 O ttagctaagt tagt ctdaca tt catggggt ttgtttacca tttggcagtt citcgcagott 114 O gct caccalag gtcct tatgg gaact citcaa gtaacttitt c ct ct cittctic titcacactica 12 OO agcaatacgc at cattct ct tctaatttgt taagaattga aaacttgtaa agcatcactt 126 O atgttataat citatgaatac citcaaaatct gtgaactaat ttgacattgg act cqgtata 132O attgtaggct c 1331

<210s, SEQ ID NO 85 &211s LENGTH: 1.OO9 &212s. TYPE: DNA <213> ORGANISM: artificial sequence 22 Os. FEATURE: <223> OTHER INFORMATION: P5284 expression vector (RBCS3: ; LexA-GAL.4TA) <4 OOs, SEQUENCE: 85 aaatggagta atatggataa toaacgcaac tatatagaga aaaaataata gcqctaccat 6 O atacgaaaaa tagtaaaaaa ttataataat gattcagaat aaatt attaa taactaaaaa 12 O gcqtaaagaa ataaattaga gaataagtga tacaaaattig gatgttaatg gat actt citt 18O ataattgctt aaaaggaata caagatggga aataatgtgt tatt attatt gatgtataaa 24 O gaatttgtac aattitttgta t caataaagt to caaaaata atctittaaaa aataaaagta 3OO c cctitt tatgaactttittat caaataaatgaaatccaata ttagcaaaac attgat atta 360 ttactaaata tttgttaa at taaaaaatat gtcatttitat tttittaa.cag at atttittta 42O aagtaaatgt tataaattac gaaaaaggga ttaatgagta t caaaac agc ctaaatggga 48O ggagacaata acagaaattit gctgtag taa gotggcttaa gtcat cattt aatttgatat 54 O tataaaaatt ctaattagtt tatagt ctitt cittitt cotct tttgtttgtc. ttg tatgcta 6OO aaaaaggitat attatat cita taaattatgt agcataatga cca catctgg catcatctitt 660 acacaattica cctaaatat c ticaag.cgaag titttgccaaa actgaagaaa agatttgaac 72 O aaccitat caa gtaacaaaaa toccaaacaa tatagt catc tat attaaat cittittcaatt 78O