
US007973216 B2

(12) United States Patent
Espley et al.

(10) Patent No.: US 7,973,216 B2
(45) Date of Patent: Jul. 5, 2011

(54) COMPOSITIONS AND METHODS FOR MODULATING PIGMENT PRODUCTION IN PLANTS

(75) Inventors: Richard Espley, Auckland (NZ); Roger Hellens, Auckland (NZ); Andrew C. Allan, Auckland (NZ)

(73) Assignee: The New Zealand Institute for Plant and Food Research Limited, Auckland (NZ)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days.

(21) Appl. No.: 12/065,251

(22) PCT Filed: Aug. 30, 2006

(86) PCT No.: PCT/NZ2006/000221
§ 371 (c)(1), (2), (4) Date: Jul. 16, 2008

(87) PCT Pub. No.: WO2007/027105
PCT Pub. Date: Mar. 8, 2007

(65) Prior Publication Data
US 2011/0072539 A1 Mar. 24, 2011

(30) Foreign Application Priority Data
Aug. 30, 2005 (NZ) ...... 542110

(51) Int. Cl.
A01H 5/00 (2006.01)
C12N 15/82 (2006.01)
C07H 21/00 (2006.01)

(52) U.S. Cl. ...... 800/295; 800/278; 800/298; 435/320.1; 435/419; 435/468; 536/23.2; 536/23.6

(58) Field of Classification Search ...... None
See application file for complete search history.

(56) References Cited

U.S. PATENT DOCUMENTS

4,795,855 A 1/1989 Fillatti et al.
5,004,863 A 4/1991 Umbeck
5,159,135 A 10/1992 Umbeck
5,177,010 A 1/1993 Goldman et al.
5,187,073 A 2/1993 Goldman et al.
5,188,958 A 2/1993 Moloney et al.
5,416,011 A 5/1995 Hinchee et al.
5,463,174 A 10/1995 Moloney et al.
5,563,455 A 10/1996 Cheng
5,569,834 A 10/1996 Hinchee et al.
5,591,616 A 1/1997 Hiei et al.
5,750,871 A 5/1998 Moloney et al.
5,792,935 A 8/1998 Arntzen et al.
5,824,877 A 10/1998 Hinchee et al.
5,846,797 A 12/1998 Strickland
5,952,543 A 9/1999 Firoozabady et al.
5,968,830 A 10/1999 Dan et al.
5,981,840 A 11/1999 Zhao et al.
6,020,539 A 2/2000 Goldman et al.
6,037,522 A 3/2000 Dong et al.
6,074,877 A 6/2000 D'Halluin et al.
2004/0034888 A1 2/2004 Liu et al.

FOREIGN PATENT DOCUMENTS

WO WO 01/59103 8/2001
WO WO 02/00894 1/2002
WO WO 02/055658 7/2002
WO WO 03/084312 10/2003
WO WO 2004/096994 11/2004
WO WO 2005/001050 1/2005

OTHER PUBLICATIONS

Bovy et al. (Plant Cell, 14:2509-2526, Published 2002).*
Wells (Biochemistry 29:8509-8517, 1990).*
Guo et al. (PNAS, 101:9205-9210, 2004).*
Keskin et al. (Protein Science, 13:1043-1055, 2004).*
Thornton et al. (Nature Structural Biology, structural genomics supplement, Nov. 2000).*
Ngo et al. (The Protein Folding Problem and Tertiary Structure Prediction, K. Merz and S. Le Grand (eds.), pp. 492-495, 1994).*
Doerks et al. (TIG, 14:248-250, 1998).*
Smith et al. (Nature Biotechnology, 15:1222-1223, 1997).*
Bork et al. (TIG, 12:425-427, 1996).*
Vom Endt et al. (Phytochemistry 61:107-114, 2002).*
Korban et al. (NCBI, GenBank Sequence Accession No. CV628545, Published Oct. 25, 2004).*
Accession No. AJ554700, Jan. 5, 2004, "Gerbera Hybrid cv. 'Terra Regina' mRNA for MYB10 Protein," Elomaa et al.
Accession No. Q8L5P3, Oct. 1, 2002, "SubName: Full=Myb-Related Transcription Factor VlMYBA1-1," Kobayashi et al.
GenBank Accession No. DQ267896, Jul. 21, 2006, "Malus x domestica Red Field MYB10a mRNA, Partial cds," Espley et al.
GenBank Accession No. DQ267897, Jul. 21, 2006, "Malus x domestica Cultivar Pacific MYB10a mRNA, Partial cds," Espley et al.
GenBank Accession No. DQ267898, Jul. 21, 2006, "Malus x domestica Cultivar Granny Smith MYB10a mRNA, Partial cds," Allan et al.
GenBank Accession No. DQ886415, Nov. 18, 2006, "Malus x domestica MYB Transcription Factor (MYB1) Gene, MYB1-2 allele, Promoter Region and Complete cds," Takos et al.
GenBank Accession No. DQ267900, Jul. 21, 2006, "Malus x domestica Cultivar Royal Gala MYB9 mRNA, Complete cds," Allan et al.
GenBank Accession No. AF336284, Mar. 15, 2001, "Gossypium hirsutum GHMYB36 (ghmyb36) mRNA, Complete cds," Matz et al.
GenBank Accession No. DQ074463, Jan. 24, 2006, "Malus x domestica MYB11 mRNA, Complete cds," Hellens et al.

(Continued)

Primary Examiner: Vinod Kumar
(74) Attorney, Agent, or Firm: Greenlee Sullivan PC

(57) ABSTRACT

This invention relates to polynucleotides encoding novel transcription factors and to the encoded transcription factors, that are capable of regulating anthocyanin production in plants. The invention also relates to constructs and vectors comprising the polynucleotides, and to host cells, plant cells and plants transformed with the polynucleotides, constructs and vectors. The invention also relates to methods of producing plants with altered anthocyanin production and plants by the methods.

16 Claims, 11 Drawing Sheets

OTHER PUBLICATIONS

GenBank Accession No. CN938023, Jun. 7, 2004.
GenBank Accession No. CN934367, Jun. 7, 2004.
GenBank Accession No. AF117267, Apr. 20, 1999, "Malus domestica UDP Glucose: 3-O-Glucosyl Transferase (UFGT1) mRNA, Complete cds," Lee et al.
GenBank Accession No. AF117269, Apr. 20, 1999, "Malus domestica Anthocyanidin Synthase (ANS) mRNA, Complete cds," Lee et al.
GenBank Accession No. AY227729, Mar. 27, 2003, "Malus x domestica Cultivar Weirouge Dihydroflavonol 4-Reductase mRNA, Complete cds," Fischer et al.
GenBank Accession No. CN491664, Apr. 27, 2004, Korban et al.
GenBank Accession No. CN946541, Jun. 7, 2004, Beuning et al.
GenBank Accession No. CN944824, Jun. 7, 2004, Beuning et al.
GenBank Accession No. AF325123, Feb. 7, 2001, "Arabidopsis thaliana Production of Anthocyanin Pigment 1 Protein (PAP1) Gene, Complete cds," Borevitz et al.
GenBank Accession No. AF146702, May 1, 2000, "Petunia x hybrida An2 Protein (an2) mRNA, an2-V26 allele, Complete cds," Quattrocchio et al.
SwissProt Accession No. Q6V7V0, Jul. 5, 2004, "Anthocyanin 1," Mathews et al.
SwissProt Accession No. Q9ATD3, Jun. 1, 2001, "GHMYB36," Matz et al.
SwissProt Accession No. Q9ATD5, Jun. 1, 2001, "GHMYB10," Matz et al.
SwissProt Accession No. Q6OP46, Jul. 5, 2004, "Anthocyanin Biosynthesis Regulatory Protein P11 B73," Swigonova et al.
Abbott et al. (Mar. 2002) "Simultaneous Suppression of Multiple Genes by Single Transgenes. Down-Regulation of Three Unrelated Lignin Biosynthetic Genes in Tobacco." Plant Physiol. 128(3):844-853.
Aharoni et al. (2001) "The Strawberry FaMYB1 Transcription Factor Suppresses Anthocyanin and Flavonol Accumulation in Transgenic Tobacco." Plant J. 28:319-332.
Altschul et al. (1997) "Gapped BLAST and PSI-BLAST: A New Generation of Protein Database Search Programs." Nucleic Acids Res. 25(17):3389-3402.
Bairoch et al. (1994) "PROSITE: Recent Developments." Nuc. Acids Res. 22(17):3583-3589.
Bolton et al. (1962) "A General Method for the Isolation of RNA Complementary to DNA." Proc. Nat. Acad. Sci. USA 48:1390-1397.
Borevitz et al. (Dec. 2000) "Activation Tagging Identifies a Conserved MYB Regulator of Phenylpropanoid Biosynthesis." Plant Cell 12:2383-2394.
Boss et al. (1996) "Expression of Anthocyanin Biosynthesis Pathway Genes in Red and White Grapes." Plant Mol. Biol. 32:565-569.
Bovy et al. (Oct. 2002) "High-Flavonol Tomatoes Resulting from the Heterologous Expression of the Maize Transcription Factor Genes LC and C1." The Plant Cell 14:2509-2526.
Broun, P. (2005) "Transcriptional Control of Flavonoid Biosynthesis: a Complex Network of Conserved Regulators Involved in Multiple Aspects of Differentiation in Arabidopsis." Curr. Opin. Plant. Biol. 8:272-279.
de Carvalho Niebel et al. (Mar. 1995) "Post Transcriptional Cosuppression of β-1,3-Glucanase Genes Does Not Affect Accumulation of Transgene Nuclear mRNA." Plant Cell 7:347-358.
Elomaa et al. (Dec. 2003) "Activation of Anthocyanin Biosynthesis in Gerbera hybrida (Asteraceae) Suggests Conserved Protein-Protein and Protein-Promoter Interactions Between the Anciently Diverged Monocots and Eudicots." Plant Phys. Biochem. 133:1831-1842.
Espley et al. (2007) "Red Colouration in Apple Fruit is Due to the Activity of the MYB Transcription Factor, MdMYB10." Plant J. 49:414-427.
Falquet et al. (2002) "The PROSITE Database, its Status in 2002." Nuc. Acids Res. 30(1):235-238.
Giesen et al. (Nov. 1, 1998) "A Formula for Thermal Stability (Tm) Prediction of PNA/DNA Duplexes." Nuc. Acids Res. 26(21):5004-5006.
Grotewold et al. (Dec. 5, 2000) "Identification of the Residues in the Myb Domain of Maize C1 that Specify the Interaction with the bHLH Cofactor R." Proc. Nat. Acad. Sci. USA 97:13579-13584.
Heim et al. (2003) "The Basic Helix-Loop-Helix Transcription Factor Family in Plants: A Genome-Wide Study of Protein Structure and Functional Diversity." Mol. Biol. Evol. 20:735-747.
Hofmann et al. (1999) "The PROSITE Database, its Status in 1999." Nuc. Acids Res. 27(1):215-219.
Holton et al. (Jul. 1995) "Genetics and Biochemistry of Anthocyanin Biosynthesis." Plant Cell 7:1071-1083.
Honda et al. (2002) "Anthocyanin Biosynthetic Genes are Coordinately Expressed During Red Coloration in Apple Skin." Plant Physiol. Biochem. 40:955-962.
International Search Report, Corresponding to International Application No. PCT/NZ2006/000221, Mailed Feb. 22, 2007.
Jin et al. (1999) "Multifunctionality and Diversity within the Plant MYB-Gene Family." Plant Mol. Biol. 41:577-585.
Jouvenot et al. (2003) "Targeted Regulation of Imprinted Genes by Synthetic Zinc-Finger Transcription Factors." Gene Ther. 10:513-522.
Kim et al. (2003) "Molecular Cloning and Analysis of Anthocyanin Biosynthesis Genes Preferentially Expressed in Apple Skin." Plant Sci. 165:403-413.
Kobayashi et al. (2002) "Myb-Related Genes of the Kyoho Grape (Vitis labruscana) Regulate Anthocyanin Biosynthesis." Planta 215:924-933.
Kubo (Jul. 1999) "Anthocyaninless2, a Homeobox Gene Affecting Anthocyanin Distribution and Root Development in Arabidopsis." Plant Cell 11:1217-1226.
Lancaster J. (1992) "Regulation of Skin Color in Apples." Crit. Rev. Plant Sci. 10:487-502.
Mathews et al. (Aug. 2003) "Activation Tagging in Tomato Identifies a Transcriptional Regulator of Anthocyanin Biosynthesis, Modification, and Transport." The Plant Cell 15:1689-1703.
Mehrtens et al. (Jun. 2005) "The Arabidopsis Transcription Factor MYB12 is a Flavonol-Specific Regulator of Phenylpropanoid Biosynthesis." Plant Phys. 138:1083-1096.
Mol et al. (1996) "Signal Perception, Transduction, and Gene Expression Involved in Anthocyanin Biosynthesis." Crit. Rev. Plant. Sci. 15(5-6):525-557.
Napoli et al. (Apr. 1990) "Introduction of a Chimeric Chalcone Synthase Gene into Petunia Results in Reversible Co-Suppression of Homologous Genes In Trans." Plant Cell 2:279-289.
Nesi et al. (Sep. 2001) "The Arabidopsis TT2 Gene Encodes an R2R3 MYB Domain Protein that Acts as a Key Determinant for Proanthocyanidin Accumulation in Developing Seed." Plant Cell 13:2099-2114.
Page R. (1996) "TreeView: An Application to Display Phylogenetic Trees on Personal Computers." Comput. Applic. Biosci. 12(4):357-358.
Piazza et al. (Mar. 2002) "Members of the c1/pl1 Regulatory Gene Family Mediate the Response of Maize Aleurone and Mesocotyl to Different Light Qualities and Cytokinins." Plant Phys. 128:1077-1086.
Quattrocchio et al. (1998) "Analysis of bHLH and MYB Domain Proteins: Species-Specific Regulatory Differences are Caused by Divergent Evolution of Target Anthocyanin Genes." Plant J. 13(4):475-488.
Quattrocchio et al. (Aug. 1999) "Molecular Analysis of the anthocyanin2 Gene of Petunia and its Role in the Evolution of Flower Color." Plant Cell 11:1433-1444.
Saito et al. (2002) "Biochemistry and Molecular Biology of the Late-Stage of Biosynthesis of Anthocyanin: Lessons from Perilla frutescens as a Model Plant." New Phytologist 155:9-23.
Schwinn et al. (2004) "Flavonoids." In: Davies, K.M. ed. Plant Pigments and their Manipulation, vol. 14. Blackwell, Oxford, pp. 92-149.
Stracke et al. (2001) "The R2R3-MYB Gene Family in Arabidopsis thaliana." Curr. Opin. Plant Biol. 4:447-456.
Supplementary European Search Report, Corresponding to European Application No. EP 06 78 4027, Completed Sep. 23, 2008.
Triglia et al. (1998) "A Procedure for in vitro Amplification of DNA Segments that Lie Outside the Boundaries of Known Sequences." Nuc. Acids Res. 16(16):8186.
Walker et al. (Jul. 1999) "The Transparent Testa Glabra1 Locus, Which Regulates Trichome Differentiation and Anthocyanin Biosynthesis in Arabidopsis, Encodes a WD40 Repeat Protein." Plant Cell 11:1337-1350.
Goff et al., "Transactivation of Anthocyanin Biosynthetic Genes Following Transfer of B Regulatory Genes Into Maize Tissues," The EMBO Journal, 1990, pp. 2517-2522, vol. 9, No. 8, Oxford University Press.

* cited by examiner

U.S. Patent Jul. 5, 2011 Sheet 1 of 11 US 7,973,216 B2

[Figure 1: sequence alignment with regions labelled R2, R3, Helix 2 and Helix 3. Sequences shown: MdMYB10 (apple), AtPAP1/AtMYB75, AtPAP2/AtMYB90, VlMYBA2 (grape), CaA (pepper), PhAN2 (petunia), LeANT1 (tomato), GhMYB10 (Gerbera), PmMBF1 (spruce), ZmC1 (maize), AtTT2/AtMYB123, MdMYB9 (apple), MdMYB11 (apple), ZmP (maize), MdMYB8 (apple), AtGL1]

Figure 2


Figure 3

[Panel titles: Red Field OP; Pacific Rose]

Figure 4

[Panels (A) and (B). Column groups: Red Field OP and Pacific Rose, each with Cortex and Skin. Row labels: MdMYB10, MdbHLH33, MdbHLH3]

Figure 5

[Panel (A). Group labels: AtPAP1, MdMYB10, MdMYB9, MdMYB11, MdMYB8; axis ticks 0 to 3 within each group]

Figure 6

[Panels (A) and (B). Axis label: Ratio]

Figure 7


Figure 8

[Sequence alignment of AtMYB75 with sequences from Co (quince), Ej (loquat), Md (apple), Ms (crab apple), Pb (pear YAL), Pc (pear), Pcf (cherry plum), Ppr (peach), Ppy (pear Nashi), Ps (Japanese plum), Pav (sweet cherry), Pd (almond), Mg (medlar) and Pdm (European plum)]

Figure 9

                     AtMYB75   Co   Ej   Md   Ms   Pb   Pc  Pcf  Ppr  Ppy   Ps  Pav   Pd   Mg  Pdm
AtMYB75                  100   37   37   36   36   37   36   40   40   37   40   40   44   38   41
Co (quince)                   100   81   89   89   92   92   70   75   92   70   71   66   72   72
Ej (loquat)                        100   77   77   78   79   69   72   78   68   69   65   71   70
Md (apple)                              100   99   92   93   69   73   92   69   69   66   74   69
Ms (crab apple)                              100   94   94   69   73   94   69   69   66   74   69
Pb (pear YAL)                                     100   98   70   74  100   89   70   65   74   70
Pc (pear)                                              100   70   75   98   70   70   65   74   71
Pcf (cherry plum)                                           100   83   70   97   94   81   86   80
Ppr (peach)                                                      100   74   83   84   82   78   93
Ppy (pear Nashi)                                                      100   69   70   65   74   70
Ps (Japanese plum)                                                         100   92   82   86   80
Pav (sweet cherry)                                                              100   86   91   82
Pd (almond)                                                                          100   79   79
Mg (medlar)                                                                               100   77
Pdm (European plum)                                                                            100
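The values tabulated above read as pairwise percentage identities between the listed sequences (an assumption, since the figure legend is not part of this section). As a minimal sketch of how such a matrix could be computed from sequences that are already aligned to equal length, the Python below uses invented toy sequences and hypothetical helper names (`percent_identity`, `identity_matrix`). Skipping columns where both sequences have a gap is one convention among several and is not taken from the patent.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences carry a gap ('-') are skipped; every other
    column counts toward the denominator (an assumed convention).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" and y == "-":
            continue  # shared gap column: not informative
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def identity_matrix(aligned: dict[str, str]) -> dict[tuple[str, str], int]:
    """Upper-triangular matrix of rounded percent identities."""
    names = list(aligned)
    table = {(name, name): 100 for name in names}
    for n1, n2 in combinations(names, 2):
        table[(n1, n2)] = round(percent_identity(aligned[n1], aligned[n2]))
    return table


# Toy aligned sequences, invented for illustration only.
toy = {"seqA": "MKRG-EFKED", "seqB": "MKRGNEFAED", "seqC": "MRRG-DFKED"}
matrix = identity_matrix(toy)
for pair, value in matrix.items():
    print(pair, value)
```

Only the upper triangle is stored, mirroring the layout of the table, because identity under this definition is symmetric.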

Figure 10

Figure 11

[Peak labels: Cy-glu, Cy-pent. Sample label: 35S-MdMYB10]

[x-axis: Retention Time (minutes)]

COMPOSITIONS AND METHODS FOR MODULATING PIGMENT PRODUCTION IN PLANTS

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a United States national stage application under 35 U.S.C. §371 of International Application No. PCT/NZ2006/000221, filed Aug. 31, 2006, which claims benefit of New Zealand Patent Application No. 542110 filed Aug. 30, 2005; both of which are hereby incorporated by reference in their entireties to the extent not inconsistent with the disclosure herein.

TECHNICAL FIELD

The present invention is in the field of pigment development in plants.

BACKGROUND ART

The accumulation of anthocyanin pigments is an important determinant of fruit quality. Pigments provide essential cultivar differentiation for consumers and are implicated in the health attributes of apple fruit (Boyer and Liu, 2004).

Anthocyanin pigments belong to the diverse group of ubiquitous secondary metabolites, collectively known as flavonoids. In plants, flavonoids are implicated in numerous biological functions, including defence, whilst the pigmented anthocyanin compounds in particular play a vital physiological role as attractants in plant/animal interactions.

The predominant precursors for all flavonoids, including anthocyanins, are malonyl-CoA and p-coumaroyl-CoA. From these precursors the enzyme chalcone synthase (CHS) forms chalcone, the first committed step towards anthocyanin production and the establishment of the C15 backbone. Chalcone is then isomerised by chalcone isomerase (CHI) to produce chalcone naringenin and from there a hydroxylation step via flavanone 3β-hydroxylase (F3H) converts naringenin to dihydroflavonol. Reduction of dihydroflavonol by dihydroflavonol 4-reductase (DFR) produces leucoanthocyanin, which is converted into the coloured compound anthocyanidin by leucoanthocyanidin dioxygenase (LDOX), whilst the final glycosylation step is mediated by uridine diphosphate (UDP)-glucose:flavonoid 3-O-glucosyltransferase (UFGT). The difference in anthocyanin colour can be due to a number of factors including the molecular structure and the type and number of hydroxyl groups, sugars and acids attached and the cellular environment such as pH or ultrastructure. Of the many anthocyanin pigments it is cyanidin, in the form of cyanidin 3-O-galactoside, which is primarily responsible for the red colouration in apple skin, and the enzymes in this biosynthetic pathway for apple have been well described (Kim et al., 2003, Plant Science 165, 403-413; Honda et al., 2002, Plant Physiology and Biochemistry 40, 955-962). It has long been observed that anthocyanins are elevated in response to particular environmental, developmental and pathogenic stimuli. Research into apple fruit has demonstrated both the environmental and developmental regulation of anthocyanin accumulation. Pigment biosynthesis can be induced when fruit are subjected to white light, or more significantly, UV light, a phenomenon also observed in other species. Furthermore, anthocyanin levels can be elevated by cold temperature storage of the fruit. There is evidence for the coordinate induction of anthocyanin enzymes in a developmental manner in apple fruit with pronounced anthocyanin enzyme activity and correlated pigmentation increases in immature fruit and then again at ripening, which appears to depend on the cultivar.

Studies show that there is highly specific regulation of genes in the anthocyanin pathway by specific binding of transcription factors (TFs) as complexes with promoter elements (Holton and Cornish, 1995, Plant Cell 7, 1071-1083). This regulation may also extend to non-pathway genes such as anthocyanin transport proteins.

MYB TFs have been shown to play an important role in transcriptional regulation of anthocyanins. Plant MYBs have been implicated in controlling pathways as diverse as secondary metabolism (including the anthocyanin pathway), development, signal transduction and disease resistance (Jin and Martin, 1999, Plant Mol Biol, 41, 577-585). They are characterised by a structurally conserved DNA binding domain consisting of single or multiple imperfect repeats; those associated with the anthocyanin pathway tend to belong to the two-repeat (R2R3) class. Regulation can also be specific to discrete groups of genes, either early or late in the anthocyanin biosynthetic pathway. In perilla, Perilla frutescens, TF-driven regulation has been observed in virtually all stages of anthocyanin biosynthesis from CHS to the resultant anthocyanin protein transport genes, whilst in grape, Vitis vinifera, specific regulation by MybA is restricted to the end-point of protein production (UFGT).

There are approximately 140 R2R3 MYB TFs in Arabidopsis, divided into 24 subgroups (Stracke et al., 2001, Current Opinion in Plant Biology, 4, 447-456). The Production of Anthocyanin Pigment 1 (PAP1) MYB (Borevitz et al., 2000, Plant Cell, 12, 2383-2394) falls into subgroup 10 (when the phylogeny of Stracke et al., 2001 is used) and demonstrates a high degree of amino acid conservation with other known anthocyanin regulators. When PAP1 was overexpressed in transgenic Arabidopsis this led to up-regulation of a number of genes in the anthocyanin biosynthesis pathway from PAL to CHS and DFR (Borevitz et al., 2000, Plant Cell, 12, 2383-2394; Tohge et al., 2005, Plant Journal, 42, 218-235).

In general MYBs interact closely with basic Helix Loop Helix TFs (bHLH), and this has been extensively studied in relation to the production of flavonoids (Mol et al., 1996; Winkel-Shirley, 2001). Examples include the maize ZmC MYB and ZmB bHLH and the petunia AN2 MYB and AN1/JAF13 bHLHs (Goff et al., 1992, Genes Dev, 6, 864-875; Mol et al., 1998, Trends in Plant Science, 3, 212-217). Evidently there is a degree of conservation, in different species, for this co-ordination. However, a MYB-bHLH partnership is not always necessary. Results from the overexpression of PAP1 suggested that, like the maize P MYB (Grotewold et al., 2000, Proc Natl Acad Sci USA, 97, 13579-13584) and Arabidopsis MYB12 (Mehrtens et al., 2005, Plant Physiology, 138, 1083-1096), PAP1 did not require an over-expressed bHLH co-regulator to drive a massive increase in anthocyanin production. However, further studies showed that PAP1 does interact closely with bHLHs, leading to stronger promoter (DFR) activation in in vivo assays (Zimmermann et al., 2004, Plant J, 40, 22-34). More recently, integrated transcriptome and metabolome analysis of PAP1 over-expressing lines confirmed that PAP1 upregulates the bHLH TT8 (At4g09820) by 18-fold (Tohge et al., 2005, Plant J, 42, 218-235). This dependency on a co-regulator is linked to a small number of amino acid changes in the highly conserved R2R3 binding domain, as evident in the comparison between the bHLH-independent maize P and the bHLH-dependent maize C1 MYBs, and is sufficient to direct activation of distinct sets of target genes (Grotewold et al., 2000, Proc Natl Acad Sci USA, 97, 13579-13584). In this study substitution of just six amino acids from the R2R3 domain of C1 into the corresponding positions in P1 resulted in a mutant with bHLH-dependent behaviour similar to C1. More recently it was suggested that this may be a key mechanism which permits MYBs to discriminate between target genes (Hernandez et al., 2004, J. Biol. Chem, 279, 48205-48213). These key amino acids are marked on FIG. 1. In contrast to PAP1, FaMYB1 represses anthocyanin biosynthesis during the late development of strawberry fruit. Despite this alternative role FaMYB1 shares homology with activation MYBs and can interact with (activation) bHLHs such as the Petunia AN1 and JAF13 (Aharoni et al., 2001, Plant J. 28, 319-332). Despite key residues being the same for PAP-like activators and FaMYB-like repressors, activators tend to fall in subgroup 10 while repressors fall in subgroup 17 (according to Stracke et al.).

An additional level of anthocyanin regulation involves a separate class of proteins, containing WD40 domains, which form complexes with MYB and bHLH proteins (as reviewed in Ramsay and Glover, 2005, Trends in Plant Science, 10, 63-70). Examples include an11 in petunia (de Vetten et al., 1997, Genes Dev, 11, 1422-1434) and TTG1 in Arabidopsis (Walker et al., 1999, Plant Cell, 11, 1337-1350). The transcriptional control of anthocyanins may be further complicated by tissue specific regulation (Kubo et al., 1999, Plant Cell, 11, 1217-1226) and possibly different layers of regulation dependent on stimuli such as cold, light and developmental cues (Davuluri et al., 2005, Nature Biotechnology, 23, 890-895).

Although studies into the activation and repression of anthocyanin synthesis in apple fruit have shown developmental and environmental regulation, to date transcription factors regulating anthocyanin synthesis have not been identified in this species or any other fruit. The control of anthocyanin accumulation in apple is a key question in understanding and manipulating fruit colour. Identification of the factors that exert this control provides tools for moderating the extent and distribution of anthocyanin-derived pigmentation in fruit tissue.

It is therefore an object of the invention to provide transcription factor sequences which regulate anthocyanin production in apple species and/or at least to provide the public with a useful choice.

SUMMARY OF THE INVENTION

In the first aspect the invention provides an isolated polynucleotide comprising
a) a sequence encoding a polypeptide with any one of the amino acid sequences of SEQ ID NO: 1-4 and 9-21 or a variant thereof, wherein the polypeptide or variant thereof is a transcription factor capable of regulating anthocyanin production in a plant;
b) a fragment, of at least 15 nucleotides in length, of the sequence of a);
c) the complement of the sequence of a)
d) the complement of the sequence of b)
e) a sequence, of at least 15 nucleotides in length, capable of hybridising to the sequence of a) under stringent conditions.

In one embodiment the isolated polynucleotide comprises
a) a sequence encoding a polypeptide with at least 65% identity to any one of the amino acid sequences of SEQ ID NO: 1-4 and 9-21, wherein the polypeptide is a transcription factor capable of regulating anthocyanin production in a plant;
b) a fragment, of at least 15 nucleotides in length, of the sequence of a);
c) the complement of the sequence of a)
d) the complement of the sequence of b)
e) a sequence, of at least 15 nucleotides in length, capable of hybridising to the sequence of a) under stringent conditions.

In a further embodiment the polypeptide has at least 65% identity to the amino acid sequence of SEQ ID NO: 1. Preferably the polypeptide has the amino acid sequence of SEQ ID NO: 1.

In a further embodiment the polypeptide has at least 65% identity to the amino acid sequence of SEQ ID NO: 2. Preferably the polypeptide has the amino acid sequence of SEQ ID NO: 2.

In a further embodiment the polypeptide has at least 65% identity to the amino acid sequence of SEQ ID NO: 3. Preferably the polypeptide has the amino acid sequence of SEQ ID NO: 3.

In a further embodiment the polypeptide has at least 65% identity to the amino acid sequence of SEQ ID NO: 4. Preferably the polypeptide has the amino acid sequence of SEQ ID NO: 4.

In a further embodiment the sequence in a) has at least 70% identity to the sequence of any one of SEQ ID NO: 5-8, 22-47 and 102. Preferably the sequence in a) has at least 70% identity to the coding sequence of any one of SEQ ID NO: 5-8, 22-47 and 102.

In a further embodiment the sequence in a) has at least 70% identity to the sequence of SEQ ID NO: 5. Preferably the sequence in a) has at least 70% identity to the coding sequence of SEQ ID NO: 5. More preferably the sequence in a) has the sequence of SEQ ID NO: 5. More preferably the sequence in a) has the coding sequence of SEQ ID NO: 5.

In a further embodiment the sequence in a) has at least 70% identity to the sequence of SEQ ID NO: 6. Preferably the sequence in a) has at least 70% identity to the coding sequence of SEQ ID NO: 6. More preferably the sequence in a) has the sequence of SEQ ID NO: 6. More preferably the sequence in a) has the coding sequence of SEQ ID NO: 6.

In a further embodiment the sequence in a) has at least 70% identity to the sequence of SEQ ID NO: 7. Preferably the sequence in a) has at least 70% identity to the coding sequence of SEQ ID NO: 7. More preferably the sequence in a) has the sequence of SEQ ID NO: 7. More preferably the sequence in a) has the coding sequence of SEQ ID NO: 7.

In a further embodiment the sequence in a) has at least 70% identity to the sequence of SEQ ID NO: 8. Preferably the sequence in a) has at least 70% identity to the coding sequence of SEQ ID NO: 8. More preferably the sequence in a) has the sequence of SEQ ID NO: 8. More preferably the sequence in a) has the coding sequence of SEQ ID NO: 8.

In a further aspect the invention provides an isolated polynucleotide comprising:
a) a sequence with at least 70% identity to any one of the nucleotide sequences of SEQ ID NO: 5-8, 22-47 and 102, wherein the sequence encodes a transcription factor capable of regulating anthocyanin production in a plant;
b) a fragment, of at least 15 nucleotides in length, of the sequence of a);
c) the complement of the sequence of a)
d) the complement of the sequence of b)
e) a sequence, of at least 15 nucleotides in length, capable of hybridising to the sequence of a) under stringent conditions.

In one embodiment the sequence in a) has at least 70% identity to the sequence of SEQ ID NO: 5. Preferably the sequence in a) has at least 70% identity to the coding sequence of SEQ ID NO: 5. More preferably the sequence in a) has the sequence of SEQ ID NO: 5. More preferably the sequence in a) has the coding sequence of SEQ ID NO: 5.

In one embodiment the sequence in a) has at least 70% identity to the sequence of SEQ ID NO: 6. Preferably the sequence in a) has at least 70% identity to the coding sequence of SEQ ID NO: 6. More preferably the sequence in a) has the sequence of SEQ ID NO: 6. More preferably the sequence in a) has the coding sequence of SEQ ID NO: 6.

In one embodiment the sequence in a) has at least 70% identity to the sequence of SEQ ID NO: 7. Preferably the sequence in a) has at least 70% identity to the coding sequence of SEQ ID NO: 7. More preferably the sequence in a) has the sequence of SEQ ID NO: 7. More preferably the sequence in a) has the coding sequence of SEQ ID NO: 7.

In one embodiment the sequence in a) has at least 70% identity to the sequence of SEQ ID NO: 8. Preferably the sequence in a) has at least 70% identity to the coding sequence of SEQ ID NO: 8. More preferably the sequence in a) has the sequence of SEQ ID NO: 8. More preferably the sequence in a) has the coding sequence of SEQ ID NO: 8.

In the further aspect the invention provides an isolated polynucleotide having at least 70% sequence identity to a nucleotide sequence that encodes a polypeptide comprising an amino acid sequence selected from any one of SEQ ID NO: 1 to 4 and 9 to 21, wherein the polynucleotide encodes a transcription factor capable of regulating anthocyanin production in a plant.

In one embodiment the isolated polynucleotide has at least 70% sequence identity to a nucleotide sequence that encodes a polypeptide comprising the amino acid sequence of SEQ ID NO: 1.

In a further embodiment the nucleotide sequence comprises the nucleotide sequence of SEQ ID NO: 5. Preferably the nucleotide sequence comprises the coding sequence from SEQ ID NO: 5.

In a further aspect the invention provides an isolated polynucleotide comprising
a) a sequence encoding a polypeptide with at least 65% identity to any one of the amino acid sequences of SEQ ID NO: 1-4 and 9-21, wherein the polypeptide is a transcription factor capable of regulating the promoter of a gene in the anthocyanin biosynthetic pathway;
b) a fragment, of at least 15 nucleotides in length, of the sequence of a);
c) the complement of the sequence of a)
d) the complement of the sequence of b)
e) a sequence, of at least 15 nucleotides in length, capable of hybridising to the sequence of a) under stringent conditions.

In a further aspect the invention provides an isolated polynucleotide comprising:
a) a sequence with at least 70% identity to any one of the

In a further aspect the invention provides an isolated polynucleotide comprising:
a) a sequence encoding a polypeptide variant of any one of the amino acid sequences of SEQ ID NO: 1-4 and 9-21, wherein the polypeptide is a transcription factor capable of regulating anthocyanin production in a plant, and wherein the polypeptide comprises the sequence of SEQ ID NO: 101;
b) a fragment, of at least 15 nucleotides in length, of the sequence of a);
c) the complement of the sequence of a)
d) the complement of the sequence of b)
e) a sequence, of at least 15 nucleotides in length, capable of hybridising to the sequence of a) under stringent conditions.

Preferably the variant polypeptide is derived from a species.

In a further aspect the invention provides an isolated polypeptide comprising:
a) a sequence with at least 65% identity to an amino acid sequence selected from any one of SEQ ID NO: 1-4 and 9-21, wherein the polypeptide is a transcription factor capable of regulating anthocyanin production in a plant; or
b) a fragment, of at least 5 amino acids in length, of the sequence of a)

In one embodiment the sequence in a) has at least 65% sequence identity to the amino acid sequence of SEQ ID NO: 1. Preferably the sequence in a) has the sequence of SEQ ID NO: 1.

In one embodiment the sequence in a) has at least 65% sequence identity to the amino acid sequence of SEQ ID NO: 2. Preferably the sequence in a) has the sequence of SEQ ID NO: 2.

In one embodiment the sequence in a) has at least 65% sequence identity to the amino acid sequence of SEQ ID NO: 3. Preferably the sequence in a) has the sequence of SEQ ID NO: 3.

In one embodiment the sequence in a) has at least 65% sequence identity to the amino acid sequence of SEQ ID NO: 4. Preferably the sequence in a) has the sequence of SEQ ID NO: 4.

In a further aspect the invention provides a polynucleotide encoding a polypeptide of the invention.

In a further aspect the invention provides an antibody raised against a polypeptide of the invention.

In a further aspect the invention provides a genetic construct comprising a polynucleotide of the invention.

In a further aspect the invention provides a host cell comprising a genetic construct of the invention.
nucleotide sequences of SEQID NO: 5-8, 22-47 and 102, In a further aspect the invention provides a host cell geneti wherein the sequence encodes a transcription factor cally modified to express a polynucleotide of any one of the capable of regulating the promoter of a gene in the antho 55 cyanin biosynthetic pathway; invention. b) a fragment, of at least 15 nucleotides in length, of the In a further aspect the invention provides a plant cell com sequence of a); prising the genetic construct of the invention. c) the complement of the sequence of a) In a further aspect the invention provides a plant cell d) the complement of the sequence ofb) 60 genetically modified to express a polynucleotide of the inven e) a sequence, of at least 15 nucleotides in length, capable of tion. hybridising to the sequence of In a further aspect the invention provides a plant which a) under stringent conditions. comprises the plant cell of the invention. In one embodiment the gene to be regulated encodes dihy In a further aspect the invention provides a method for droflavolon 4-reductase (DFR). 65 producing a polypeptide of the invention, the method com In an alternative embodiment the gene to be regulated prising the step of culturing a host cell comprising an a encodes chalcone synthase (CHS). genetic construct of the invention. US 7,973,216 B2 7 8 In a further aspect the invention provides a plant cell or Preferably the transcription factors and variants of the plant with altered anthocyanin production, the method com invention, that are capable of regulating anthocyanin produc prising the step of transformation of a plant cell or plant with tion in plants, are capable of regulating the production of the a genetic construct including: anthocyanins selected from the group including but not lim a) at least one polynucleotide encoding of a polypeptide of the ited to: cyanidin-3-glucoside, cyanidin-3-0-rutinoside, cya invention; nadin-3-glucoside and cyanadin-3-pentoside. 
b) at least one polynucleotide comprising a fragment, of at Preferably the plants or plant cells with altered production least 15 nucleotides in length, of the polynucleotide of a); of anthocyanins, produced by or selected by the methods of O the invention, are altered in production of anthocyanins c) at least one polynucleotide comprising a complement, of at 10 selected from the group including but not limited to: cyana least 15 nucleotides in length, of the polynucleotide of a). din-3-glucosidase, cyaniding-3-0-rutinoside, cyanadin-3- In a further aspect the invention provides a method of producing a plant cell or plant with altered anthocyanin pro glucoside and cyanadin-3-pentoside. duction, the method comprising the step of transforming a The polynucleotides and polynucleotide variants, of the plant cell or plant with a genetic construct including: 15 invention may be derived from any species or may be pro a) at least one of the polynucleotides of any one of the inven duced by recombinant or synthetic means. tion; In one embodiment the polynucleotide or variant, is b) at least one polynucleotide comprising a fragment, of at derived from a plant species. least 15 nucleotides in length, of the polynucleotide of a), In a further embodiment the polynucleotide or variant, is O derived from a gymnosperm plant species. c) at least one polynucleotide comprising a complement, of at In a further embodiment the polynucleotide or variant, is least 15 nucleotides in length, of the polynucleotide of a) derived from an angiosperm plant species. d) at least one polynucleotide capable of hybridising under In a further embodiment the polynucleotide or variant, is stringent conditions to the polynucleotide of a) or b). derived from a from dicotyledonous plant species. 
In one embodiment of the method, the construct is 25 The polypeptides and polypeptide variants of the invention designed to express a pair of transcription factors, and the may be derived from any species, or may be produced by construct comprises: recombinant or synthetic means. i) a polynucleotide sequence encoding a MYB transcription In one embodiment the polypeptides or variants of the factor with at least 65% identity to the amino acid sequence invention are derived from plant species. of any one of SEQID NO: 1, 2 and 9 to 21; and 30 In a further embodiment the polypeptides or variants of the ii) a polynucleotide sequence encoding a bHLH transcription invention are derived from gymnosperm plant species. factor with at least 65% identity to the amino acid sequence In a further embodiment the polypeptides or variants of the of SEQID NO: 1 or 2. invention are derived from angiosperm plant species. In a further embodiment the polynucleotide sequence ini) In a further embodiment the polypeptides or variants of the has at least 70% sequence identity to the nucleotide sequence 35 invention are derived from dicotyledonous plant species. of any one of SEQ ID NO: 5, 6, 22 to 27 and 102; and the In a further embodiment polypeptide or variant is derived polynucleotide sequence in ii) has at least 70% sequence from a monocotyledonous plant species. identity to the nucleotide sequence of SEQID NO: 7 or 8. The plant cells and plants of the invention may be from any In a further embodiment the polynucleotide sequence ini) species. has at least 70% sequence identity to the nucleotide sequence 40 In one embodiment the plants cells and plants of the inven of any one of SEQ ID NO: 5, 6, 22 to 27 and 102; and the tion are from gymnosperm species. coding sequence in ii) has at least 70% sequence identity to In a further embodiment the plants cells and plants of the the coding sequence of SEQID NO: 7 or 8. invention are from angiosperm species. 
In a further aspect the invention provides a plant produced In a further embodiment the plants cells and plants of the by the method of the invention. 45 invention are from dicotyledonous species. In a further aspect the invention provides a method for In a further embodiment the plants cells and plants of the selecting a plant altered in anthocyanin production, the invention are from monocotyledonous species. method comprising testing of a plant for altered expression of Preferred plant species (for the polynucleotide and vari a polynucleotide of the invention. ants, polypeptides and variants and plant cells and plants of In a further aspect the invention provides a method for 50 the invention) include fruit plant species selected from a selecting a plant altered in anthocyanin production, the group comprising but not limited to the following genera: method comprising testing of a plant for altered expression of Malus, Pyrus Prunis, , Rosa, Fragaria, Actinidia, a polypeptide of the invention. Cydonia, Citrus, and Vaccinium. In a further aspect the invention provides a plant selected Particularly preferred fruit plant species are: Malus domes by the method of the invention. 55 tica, Actidinia deliciosa, A. Chinensis, A. eriantha, A. arguta In a further aspect the invention provides a method for and hybrids of the four Actinidia species, Prunis persica selecting a plant cell or plant that has been transformed, the Pyrus L., Rubus, Rosa, and Fragaria. 
method comprising the steps Preferred plants (for the polynucleotide and variants, a) transforming a plant cell or plant with a polynucleotide or polypeptides and variants and plant cells and plants of the polypeptide of the invention capable of regulating antho 60 invention) also include vegetable plant species selected from cyanin production in a plant; a group comprising but not limited to the following genera: b) expressing the polynucleotide or polypeptide in the plant Brassica, Lycopersicon and Solanum, cell or plant; and Particularly preferred vegetable plant species are: Lycoper c) selecting a plant cell or plant with increased anthocyanin sicon esculentum and Solanum tuberosum pigmentation relative to other plant cells or plants, the 65 Preferred plants (for the polynucleotide and variants, increased anthocyanin pigmentation indicating that the polypeptides and variants and plant cells and plants of the plant cell or plant has been transformed. invention) also include crop plant species selected from a US 7,973,216 B2 10 group comprising but not limited to the following genera: lucida, mantucketensis, Amelanchier pumila, Glycine, Zea, Hordeum and Oryza. Amelanchier quinti-marti, Amelanchier sanguinea, Amelan Particularly preferred crop plant species include Glycine chier stolonifera, Amelanchier utahensis, Amelanchier wie max, Zea mays and Oryza sativa. gandii, AmelanchierX neglecta, Amelanchier bartramiana X Preferred plants (for the polynucleotide and variants, Amelanchier sp. dentata , Amelanchier sp. dentata, polypeptides and variants and plant cells and plants of the Amelanchier sp. erecta , Amelanchier sp. erectax Amelan invention) also include those of the Rosaceae family. chier laevis, Amelanchier sp. 
serotina, Aria alnifolia, Aro Preferred Rosaceae genera include Exochorda, Maddenia, nia prunifolia, Chaenomeles cathayensis, Chaenomeles spe Oeinleria, Osmaronia, Prinsepia, , Maloideae, ciosa, Chamaeme spilus alpina, Cornus domestica, Amelanchier, Aria, Aronia, Chaenomeles, Chamaeme spilus, 10 Cotoneaster apiculatus, Cotoneaster lacteus, Cotoneaster Cornus, Cotoneaster; Osmaronia, Prinsepia, Pru pannosus, Crataegus azarolus, Crataegus columbiana, Cra nus, Maloideae, Amelanchier, Aria, Aronia, Chaenomeles, taegus crus-galli, Cirataegus curvisepala, Crataegus laevi Chamaemespilus, Cornus, Cotoneaster, Crataegu, Cydonia, gata, Crataegus mollis, Crataegus monogyna, Ciataegus Dichotomanthes, Docynia, Docyniopsis, Eriobotrya, Eriolo niga, Crataegus rivularis, Crataegus Sinaica, Cydonia bus, Heteromeles, Kagemeckia, Lindleya, Malacomeles, 15 oblonga, Dichotomanthes tristanicarpa, Docynia delavavi, Malus, Mespilus, Osteomeles, Peraphyllum, Photinia, Docyniopsis tschonoski, Eriobotrya japonica, Eriobotrya Pseudocydonia, Pyracantha, Pyrus, Rhaphiolepis, Sorbus, prinoides, Eriolobus trilobatus, Heteromeles arbutifolia, Stranvaesia, Torminalis, Vauquelinia, , , Kagemeckia angustifolia, Kagemeckia Oblonga, Lindleya Acomastylis, , , Aphanes, Aremonia, mespiloides, Malacomeles denticulata, , Bencomia, , Cliffortia, Coluria, Cowania, Dali– , , , Malus barda, Dendriopoterium, Dryas, Duchesnea, Erythrocoma, doumeri, Malus florentina, , Malus fisca, Fallugia, Filipendula, Fragaria, Geum, Hagenia, , , Malus homanensis, , Malus Ivesia, Kerria, Leucosidea, Marcetella, Margyricarpus, ioensis, Malus kansuensis, , Malus Novosieversia, Onco stylus, , Potentilla, Rosa, micromalus, , Malus Ombrophilia, Rubus, Sanguisorba, Sarcopoterium, Sibbaldia, Sieversia, 25 Malus Orientalis, Malus pratti, , Malus Taihangia, Tetraglochin, Waldsteinia, Rosaceae incertae punila, , , , sedis, Adenostoma, Aruncus, Cercocarpus, Chamaebatiaria, , , , Chamaerhodos, Gillenia, Holodiscus, Lyonothamnus, Neil , 
Malus tschonoski, Malus X domestica, lia, , , Purshia, Rhodotypos, Sorbaria, Malus X domestica X Malus sieversii, Malus X domestica X Spiraea and Stephanandra. 30 Pyrus communis, Malus xiaojinensis, , Preferred Rosaceae species include Exochorda giraldii, Malus sp., Mespilus germanica, Osteomeles anthyllidifolia, Exochorda racemosa, Exochorda, Exochorda giraldii, Exo Osteomeles Schwerinae, Peraphyllum ramosissimum, Pho chorda racemosa, Exochorda serratifolia, Maddenia hypo tinia fraseri, Photinia pyrifolia, Photinia serrulata, Photinia leuca, Oemleria cerasiformis, Osmaronia cerasiformis, Prin villosa, Pseudocydonia sinensis, Pyracantha coccinea, Pyra sepia sinensis, Prinsepia uniflora, Prunus alleghaniensis, 35 cantha fortuneana, Pyrus calleryana, Pyrus caucasica, , Prunus andersonii, , Pyrus communis, Pyrus elaeagrifolia, Pyrus hybrid cultivar, , Prunus argentea, , Prunus Pyrus pyrifolia, Pyrus salicifolia, Pyrus ussuriensis, Pyrus X avium, Prunus bifrons, Prunus brigantina, Prunus bucha bretschneideri, Rhaphiolepis indica, Sorbus americana, Sor rica, Prunus buergeriana, Prunus campanulata, Prunus bus aria, Sorbus aucuparia, Sorbus Californica, Sorbus com caroliniana, , Prunus cerasus, Prunus 40 mixta, Sorbus hupehensis, Sorbus scopulina, Sorbus Sibirica, choreiana, Prunus coconilia, Prunus cyclaimina, Prunus Sorbus torminalis, Stranvaesia davidiana, Torminalis clusii, davidiana, Prunus debilis, Prunus domestica, Prunus dulcis, Vauquelinia Californica, Vauquelinia corymbosa, Acaena Prunusemarginata, Prunus fasciculata, Prunus ferganensis, anserinifolia, Acaena argentea, Acaena Caesiglauca, Prunus fordiana, Prunusfieinontii, Prunus fruticosa, Prunus Acaena cylindristachya, Acaena digitata, Acaena echinata, geniculata, Prunus glandulosa, Prunus gracilis, Prunus 45 Acaena elongata, Acaena eupatoria, Acaena fissistipula, gravana, Prunus hortulana, Prunus ilicifolia, , Acaena inermis, Acaena laevigata, Acaena latebrosa, Prunus jacquemontii, Prunus japonica, Prunus 
kuramica, Acaena lucida, Acaena macrocephala, Acaena magellanica, Prunus laurocerasus, Prunus leveilleana, Prunus lusitanica, Acaena masafuerana, Acaena montana, Acaena multifida, Prunus maackii, Prunus mahaleb, Prunus mandshurica, Pru Acaena novaezelandiae, Acaena ovalifolia, Acaena pinnati nus maritima, Prunus maximowiczii, Prunus mexicana, Pru 50 fida, Acaena splendens, Acaena subincisa, Acaena X anse nus microcarpa, Prunus mira, Prunus mume, Prunus munso rovina, Acomastylis elata, Acomastylis rossii, Acomastylis niana, Prunus nigra, Prunus nipponica, Prunus padus, Sikkimensis, Agrimonia eupatoria, Agrimonia nipponica, Prunus pensylvanica, Prunus persica, Prunus petunnikowii, Agrimonia parviflora, Agrimonia pilosa, Alchemilla alpina, Prunus prostrata, Prunus pseudocerasus, Prunus pumila, Alchemilla erythropoda, Alchem illa japonica, Alchemilla , , , Prunus 55 mollis, Alchemilla vulgaris, Aphanes arvensis, Aremonia sellowii, Prunus serotina, , , agrimonioides, Bencomia brachystachya, Bencomia cau Prunus Simonii, Prunus spinosa, Prunus spinulosa, Prunus data, Bencomia existipulata, Bencomia sphaerocarpa, subcordata, Prunus Subhirtella, , Pru Chamaebatia foliolosa, Cliffortia burmeana, Cliffortia nus tenella, Prunus texana, Prunus tomentosa, Prunus cuneata, Cliffortia dentata, Cliffortia graminea, Cliffortia tSchonoskii, Prunus umbellata, Prunus verecunda, Prunus 60 heterophylla, Cliffortianitidula, Cliffortia odorata, Cliffortia virginiana, Prunus webbii, Prunus X yedoensis, Prunus Zip ruscifolia, Cliffortia sericea, Coluria elegans, Coluria peliana, Prunus sp. BSP-2004-1, Prunus sp. BSP-2004-2, geoides, Cowania Stansburiana, Dalibarda repens, Dendrio Prunus sp. 
EB-2002, , Amelanchier poteium menendezii, Dendriopoterium pulidoi, Dryas drum arborea, Amelanchier asiatica, Amelanchier bartramiana, mondii, Dryas Octopetala, Duchesnea chrysantha, Duch Amelanchier Canadensis, Amelanchier clusickii, Amelanchier 65 esnea indica, Erythrocoma triflora, Fallugia paradoxa, fernaldii, Amelanchier florida, Amelanchier humilis, Filipendula multijuga Filipendula purpurea, Filipendula Amelanchier intermedia, Amelanchier laevis, Amelanchier ulmaria, Filipendula vulgaris, Fragaria chiloensis, Fragaria US 7,973,216 B2 11 12 daltoniana, Fragaria gracilis, Fragaria grandiflora, Fia minusculus, Rubus moorei, Rubus multibracteatus, Rubus garia inunae, Fragaria moschata, Fragaria nilgerrensis, neomexicanus, Rubus nepalensis, Rubus messensis, Rubus Fragaria nipponica, Fragaria nubicola, Fragaria Orientalis, inivalis, Rubus niveus, Rubus nubigenus, Rubus Occidentalis, Fragaria pentaphylla, Fragaria vesca, Fragaria virginiana, Rubus odoratus, Rubus palmatus, Rubus parviflorus, Rubus Fragaria viridis, Fragaria X ananassa, Fragaria sp. CFRA 5 parvifolius, Rubus parvus, Rubus pectinellus, Rubus pedatus, 538, Fragaria sp., Geum andicola, Geum borisi, Geum bul Rubus pedemontanus, Rubus pensilvanicus, Rubus phoenico garicum, Geum calthifolium, Geum chiloense, Geumgenicu lasius, Rubus picticaulis, Rubus pubescens, Rubus rigidus, latum, Geum heterocarpum, Geum macrophyllum, Geum Rubus robustus, Rubus roseus, Rubus rosifolius, Rubus sanc montanum, Geum reptans, Geum rivale, Geum schofieldii, tus, Rubus sapidus, Rubus saxatilis, Rubus setosus, Rubus Geum speciosum, Geum urbanum, Geum vernum, Geum sp. 
10 'Chase 2507 K, Hagenia abyssinica, Horkelia cuneata, Hor spectabilis, Rubus sulcatus, Rubus tephrodes, Rubus trian kelia fisca, Ivesia gordoni, Kerria japonica, Leucosidea seri thus, Rubus tricolor, Rubus trifidus, Rubus trilobus, Rubus cea, Marcetella maderensis, Marcetella moquiniana, Margy trivialis, Rubus ulmifolius, Rubus ursinus, Rubus urticifolius, ricarpus pinnatus, Margyricarpus setosus, Novosieversia Rubus vigorosus, Rubus sp.JPM-2004, Sanguisorba albi glacialis, Oncostylus cockaynei, Oncostylus leiospermus, is flora, Sanguisorba alpina, Sanguisorba ancistroides, San Polylepis australis, Polylepis besseri, Polylepis crista-galli, guisorba annua, Sanguisorba Canadensis, Sanguisorba fili Polylepis hieronymi, Polylepis incana, Polylepis lanuginosa, formis, Sanguisorba hakusanensis, Sanguisorba japonensis, Polylepis multijuga, Polylepis neglecta, Polylepis pauta, Sanguisorba minor; Sanguisorba obtusa, Sanguisorba offici Polylepis pepei, Polylepis quadrijuga, Polylepis racemosa, nalis, Sanguisorba parviflora, Sanguisorba stipulata, San Polylepis reticulata, Polylepis rugulosa, Polylepis sericea, guisorba tenuifolia, Sarcopoterium spinosum, Sibbaldia Polylepis subsericans, Polylepis tarapacana, Polylepis procumbens, Sieversia pentapetala, Sieversia pusilla, tomentella, Polylepis weberbaueri, Potentilla anserina, Taihangia rupestris, Tetraglochin cristatum, Waldsteinia Potentilla arguta, Potentilla bifurca, Potentilla chinensis, fragarioides, Waldsteinia geoides, Adenostoma fasciculatum, Potentilla dickinsii, Potentilla erecta, Potentilla fragarioides, Adenostoma sparsifolium, Aruncus dioicus, Cercocarpus Potentilla fruticosa, Potentilla indica, Potentilla micrantha, 25 betuloides, Cercocarpus ledifolius, Chamaebatiaria millefo Potentilla multifida, Potentilla nivea, Potentilla norvegica, lium, Chamaerhodos erecta, Gillenia stipulata, Gillenia tri Potentilla palustris, Potentilla peduncularis, Potentilla rep foliata, , Holodiscus microphyllus, tans, 
Potentilla salesoviana, Potentilla Stenophylla, Poten Lyonothamnus floribundus, Neillia afinis, Neillia gracilis, tilla tridentata, Rosa abietina, Rosa abyssinica, Rosa acicu Neillia sinensis, Neillia sparsiflora, Neillia thibetica, Neillia laris, Rosa agrestis, Rosa alba, Rosa alba X Rosa 30 thyrsiflora, Neillia uekii, Neviusia alabamensis, Physocarpus corymbifera, Rosa altaica, Rosa arkansana, Rosa arvensis, alternans, Physocarpus amurensis, Physocaipus capitatus, , Rosa beggeriana, Rosa blanda, Rosa brac Physocarpus malvaceus, Physocapus monogynus, Physocar teata, , Rosa caesia, Rosa Californica, Rosa pus opulifolius, Purshia tridentata, Rhodotypos scandens, canina, , , Rosa cinnamomea, Sorbaria arborea, Sorbaria sorbifolia, Spiraea betulifolia, Rosa columnifera, Rosa corymbifera, Rosa cymosa, Rosa 35 Spiraea Cantoniensis, Spiraea densiflora, Spiraea japonica, davurica, , Rosa ecae, Rosa eglanteria, Rosa Spiraea nipponica, Spiraea X vanhouttei, Spiraea sp. Stepha elliptica, , , Rosa foliolosa, nandra chinensis, Stephanandra incisa and Stephanandra , Rosa gallica X Rosa dumetorum, Rosa tanakae. gigantea, , Rosa helenae, Rosa henryi, Rosa Particularly preferred Rosaceae genera include: Malus, hugonis, Rosa hybrid cultivar, Rosa inodora, Rosa jundzillii, 40 Pyrus, Cydonia, Prunus, Eriobotrya, and Mespilus. Rosa laevigata, Rosa laxa, Rosa luciae, , Rosa Particularly preferred Rosaceae species include: Malus marretii, Rosa maximowicziana, Rosa micrantha, Rosa mol domestica, Malus Sylvestris, Pyrus communis, Pyrus pyrifo lis, Rosa montana, , Rosa movesii, Rosa multi lia, Pyrus bretschneideri, Cydonia Oblonga, Prunus salicina, bracteata, Rosa multiflora, Rosa initida, Rosa Odorata, Rosa Prunus cerasifera, Prunus persica, Eriobotrya japonica, palustris, , , Rosa phoenicia, 45 Prunus dulcis, Prunus avium, Mespilus germanica and Pru Rosa platyacantha, , Rosa pseudoscabriuscula, nus domestica. 
Rosa roxburghii, , Rosa rugosa, Rosa Sam More particularly preferred Rosaceae genera include bucina, Rosa sempervirens, Rosa sericea, Rosa sertata, Rosa Malus and Prunus setigera, Rosa Sherardii, Rosa Sicula, Rosa spinosissima, Particularly preferred Rosaceae species include Malus Rosa stellata, Rosa stylosa, Rosa subcanina, Rosa subcol 50 domestica and Prunus cerasifera. lina, Rosa suffulta, Rosa tomentella, , Rosa The term “plant' is intended to include a whole plant, any tunquinensis, , Rosa virginiana, Rosa Wichurana, part of a plant, propagules and progeny of a plant. Rosa willmottiae, Rosa woodsii, Rosa X damascena, Rosa X The term propagule means any part of a plant that may be fortuniana, Rosa X macrantha, , Rosa sp. used in reproduction or propagation, either sexual or asexual, Rubus alceifolius, Rubus allegheniensis, Rubus alpinus, 55 including seeds and cuttings. Rubus amphidasys, Rubus arcticus, Rubus argutus, Rubus assamensis, Rubus australis, Rubus bifrons, Rubus caesius, DETAILED DESCRIPTION Rubus caesius X Rubus idaeus, Rubus Canadensis, , Rubus Caucasicus, Rubus chamaemorus, Rubus In this specification where reference has been made to corchorifolius, Rubus Crataegifolius, Rubus cuneifolius, 60 patent specifications, other external documents, or other Rubus deliciosus, Rubus divaricatus, Rubus ellipticus, Sources of information, this is generally for the purpose of Rubus flagellaris, Rubus fruticosus, Rubus geoides, Rubus providing a context for discussing the features of the inven glabratus, Rubus glaucus, Rubus gunnianus, Rubus hawaien tion. 
Unless specifically stated otherwise, reference to such sis, Rubus hawaiensis X Rubus rosifolius, Rubus hispidus, external documents is not to be construed as an admission that Rubus hochstetteroruni, Rubus humulifolius, Rubus idaeus, 65 Such documents, or Such sources of information, in any juris Rubus lambertianus, Rubus lasiococcus, Rubus leucodermis, diction, are prior art, or form part of the common general Rubus lineatus, Rubus macraei, Rubus maximiformis, Rubus knowledge in the art. US 7,973,216 B2 13 14 The term comprising, and grammatical equivalents A "recombinant polypeptide sequence is produced by thereof, is intended to mean "consisting at least in part translation from a “recombinant polynucleotide sequence. of . . . . The term "derived from with respect to polynucleotides or Polynucleotides and Fragments polypeptides of the invention being derived from a particular The term “polynucleotide(s), as used herein, means a genera or species, means that the polynucleotide or polypep single or double-stranded deoxyribonucleotide or ribonucle tide has the same sequence as a polynucleotide or polypeptide otide polymer of any length but preferably at least 15 nucle found naturally in that genera or species. The polynucleotide otides, and include as non-limiting examples, coding and or polypeptide, derived from a particular genera or species, non-coding sequences of a gene, sense and antisense may therefore be produced synthetically or recombinantly. sequences complements, exons, introns, genomic DNA, 10 cDNA, pre-mRNA, mRNA, rRNA, siRNA, miRNA, tRNA, Variants ribozymes, recombinant polypeptides, isolated and purified As used herein, the term “variant” refers to polynucleotide naturally occurring DNA or RNA sequences, synthetic RNA or polypeptide sequences different from the specifically iden and DNA sequences, nucleic acid probes, primers and frag tified sequences, wherein one or more nucleotides or amino mentS. 
15 acid residues is deleted, substituted, or added. Variants may A "fragment of a polynucleotide sequence provided be naturally occurring allelic variants, or non-naturally occur herein is a Subsequence of contiguous nucleotides that is ring variants. Variants may be from the same or from other capable of specific hybridization to a target of interest, e.g., a species and may encompass homologues, paralogues and sequence that is at least 15 nucleotides in length. The frag orthologues. In certain embodiments, variants of the inven ments of the invention comprise 15 nucleotides, preferably at tive polypeptides and polypeptides possess biological activi least 20 nucleotides, more preferably at least 30 nucleotides, ties that are the same or similar to those of the inventive more preferably at least 50 nucleotides, more preferably at polypeptides or polypeptides. The term “variant' with refer least 50 nucleotides and most preferably at least 60 nucle ence to polypeptides and polypeptides encompasses all forms otides of contiguous nucleotides of a polynucleotide of the of polypeptides and polypeptides as defined herein. invention. A fragment of a polynucleotide sequence can be 25 Polynucleotide Variants used in antisense, gene silencing, triple helix or ribozyme Variant polynucleotide sequences preferably exhibit at technology, or as a primer, a probe, included in a microarray, least 50%, more preferably at least 51%, more preferably at or used in polynucleotide-based selection methods of the least 52%, more preferably at least 53%, more preferably at invention. 
least 54%, more preferably at least 55%, more preferably at The term “primer' refers to a short polynucleotide, usually 30 least 56%, more preferably at least 57%, more preferably at having a free 3'OH group, that is hybridized to a template and least 58%, more preferably at least 59%, more preferably at used for priming polymerization of a polynucleotide comple least 60%, more preferably at least 61%, more preferably at mentary to the target. least 62%, more preferably at least 63%, more preferably at The term “probe' refers to a short polynucleotide that is least 64%, more preferably at least 65%, more preferably at used to detect a polynucleotide sequence, that is complemen 35 least 66%, more preferably at least 67%, more preferably at tary to the probe, in a hybridization-based assay. The probe least 68%, more preferably at least 69%, more preferably at may consist of a “fragment of a polynucleotide as defined least 70%, more preferably at least 71%, more preferably at herein. least 72%, more preferably at least 73%, more preferably at Polypeptides and Fragments least 74%, more preferably at least 75%, more preferably at The term “polypeptide', as used herein, encompasses 40 least 76%, more preferably at least 77%, more preferably at amino acid chains of any length but preferably at least 5 least 78%, more preferably at least 79%, more preferably at amino acids, including full-length proteins, in which amino least 80%, more preferably at least 81%, more preferably at acid residues are linked by covalent peptide bonds. Polypep least 82%, more preferably at least 83%, more preferably at tides of the present invention may be purified natural prod least 84%, more preferably at least 85%, more preferably at ucts, or may be produced partially or wholly using recombi 45 least 86%, more preferably at least 87%, more preferably at nant or synthetic techniques. 
The term may refer to a polypeptide, an aggregate of a polypeptide such as a dimer or other multimer, a fusion polypeptide, a polypeptide fragment, a polypeptide variant, or derivative thereof.

A “fragment” of a polypeptide is a subsequence of the polypeptide that performs a function that is required for the biological activity and/or provides three dimensional structure of the polypeptide. The term may refer to a polypeptide, an aggregate of a polypeptide such as a dimer or other multimer, a fusion polypeptide, a polypeptide fragment, a polypeptide variant, or derivative thereof capable of performing the above enzymatic activity.

The term “isolated” as applied to the polynucleotide or polypeptide sequences disclosed herein is used to refer to sequences that are removed from their natural cellular environment. An isolated molecule may be obtained by any method or combination of methods including biochemical, recombinant, and synthetic techniques.

The term “recombinant” refers to a polynucleotide sequence that is removed from sequences that surround it in its natural context and/or is recombined with sequences that are not present in its natural context.

least 88%, more preferably at least 89%, more preferably at least 90%, more preferably at least 91%, more preferably at least 92%, more preferably at least 93%, more preferably at least 94%, more preferably at least 95%, more preferably at least 96%, more preferably at least 97%, more preferably at least 98%, and most preferably at least 99% identity to a sequence of the present invention. Identity is found over a comparison window of at least 20 nucleotide positions, preferably at least 50 nucleotide positions, more preferably at least 100 nucleotide positions, and most preferably over the entire length of a polynucleotide of the invention.

Polynucleotide sequence identity can be determined in the following manner. The subject polynucleotide sequence is compared to a candidate polynucleotide sequence using BLASTN (from the BLAST suite of programs, version 2.2.5, November 2002) in bl2seq (Tatiana A. Tatusova, Thomas L. Madden (1999), “Blast 2 sequences - a new tool for comparing protein and nucleotide sequences”, FEMS Microbiol. Lett. 174:247-250), which is publicly available from NCBI (available on the world wide web at ftp.ncbi.nih.gov/blast/). The default parameters of bl2seq are utilized except that filtering of low complexity parts should be turned off.

The identity of polynucleotide sequences may be examined using the following unix command line parameters:

bl2seq -i nucleotideseq1 -j nucleotideseq2 -F F -p blastn

The parameter -F F turns off filtering of low complexity sections. The parameter -p selects the appropriate algorithm for the pair of sequences. The bl2seq program reports sequence identity as both the number and percentage of identical nucleotides in a line “Identities=”.

Polynucleotide sequence identity may also be calculated over the entire length of the overlap between a candidate and subject polynucleotide sequences using global sequence alignment programs (e.g. Needleman, S. B. and Wunsch, C. D. (1970) J. Mol. Biol. 48, 443-453). A full implementation of the Needleman-Wunsch global alignment algorithm is found in the needle program in the EMBOSS package (Rice, P., Longden, I. and Bleasby, A. EMBOSS: The European Molecular Biology Open Software Suite, Trends in Genetics June 2000, vol. 16, No. 6, pp. 276-277) which is available on the world wide web at www.hgmp.mrc.ac.uk/Software/EMBOSS/. The European Bioinformatics Institute server also provides the facility to perform EMBOSS-needle global alignments between two sequences on the world wide web at www.ebi.ac.uk/emboss/align/.

Alternatively the GAP program may be used which computes an optimal global alignment of two sequences without penalizing terminal gaps. GAP is described in the following paper: Huang, X. (1994) On Global Sequence Alignment. Computer Applications in the Biosciences 10, 227-235.

The term “hybridize under stringent conditions”, and grammatical equivalents thereof, refers to the ability of a polynucleotide molecule to hybridize to a target polynucleotide molecule (such as a target polynucleotide molecule immobilized on a DNA or RNA blot, such as a Southern blot or Northern blot) under defined conditions of temperature and salt concentration. The ability to hybridize under stringent hybridization conditions can be determined by initially hybridizing under less stringent conditions then increasing the stringency to the desired stringency.

With respect to polynucleotide molecules greater than about 100 bases in length, typical stringent hybridization conditions are no more than 25 to 30° C. (for example, 10° C.) below the melting temperature (Tm) of the native duplex (see generally, Sambrook et al., Eds, 1987, Molecular Cloning, A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press; Ausubel et al., 1987, Current Protocols in Molecular Biology, Greene Publishing). Tm for polynucleotide molecules greater than about 100 bases can be calculated by the formula Tm = 81.5 + 0.41% (G+C) - log(Na+) (Sambrook et al., Eds, 1987, Molecular Cloning, A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press; Bolton and McCarthy, 1962, PNAS 84:1390). Typical stringent conditions for polynucleotide of greater than 100 bases in length would be hybridization conditions such as prewashing in a solution of 6×SSC, 0.2% SDS; hybridizing at 65° C., 6×SSC, 0.2% SDS overnight; followed by two washes of 30 minutes each in 1×SSC, 0.1% SDS at 65° C. and two washes of 30 minutes each in 0.2×SSC, 0.1% SDS at 65° C.
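As a rough illustration of the melting-temperature rule given for molecules longer than about 100 bases, the following Python sketch transcribes the formula as it is written in the text. The function names, the choice of a base-10 logarithm and the use of a molar sodium concentration are assumptions made for the example, not details stated in the specification. Other published forms of this relation weight the salt term differently, so the numbers are illustrative only.

```python
import math


def melting_temperature(gc_percent: float, sodium_molar: float) -> float:
    """Tm in degrees C for a duplex longer than about 100 bases.

    Transcribes the formula as written in the text:
        Tm = 81.5 + 0.41 * (%G+C) - log(Na+)
    Assumptions (not stated in the text): the logarithm is base 10 and the
    sodium concentration is expressed in mol/L.
    """
    if not 0.0 <= gc_percent <= 100.0:
        raise ValueError("gc_percent must be between 0 and 100")
    if sodium_molar <= 0.0:
        raise ValueError("sodium_molar must be positive")
    return 81.5 + 0.41 * gc_percent - math.log10(sodium_molar)


def hybridization_temperature(tm: float, degrees_below_tm: float = 10.0) -> float:
    """Temperature a chosen number of degrees below Tm.

    The text describes stringent conditions as no more than 25 to 30 degrees C
    below Tm and gives 10 degrees C as an example, which is the default here.
    """
    return tm - degrees_below_tm


if __name__ == "__main__":
    tm = melting_temperature(gc_percent=50.0, sodium_molar=1.0)
    print(f"Tm = {tm:.1f} C, hybridize near {hybridization_temperature(tm):.1f} C")
```

With 50% G+C and 1 M sodium the logarithm term is zero, so the sketch gives a Tm of 102.0 and a hybridization temperature of 92.0 for the 10° C. example offset.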
Polynucleotide variants of the present invention also encompass those which exhibit a similarity to one or more of the specifically identified sequences that is likely to preserve the functional equivalence of those sequences and which could not reasonably be expected to have occurred by random chance. Such sequence similarity with respect to polypeptides may be determined using the publicly available bl2seq program from the BLAST suite of programs (version 2.2.5, November 2002) from NCBI (available on the world wide web at ftp.ncbi.nih.gov/blast/).

The similarity of polynucleotide sequences may be examined using the following unix command line parameters:

bl2seq -i nucleotideseq1 -j nucleotideseq2 -F F -p tblastx

The parameter -F F turns off filtering of low complexity sections. The parameter -p selects the appropriate algorithm for the pair of sequences. This program finds regions of similarity between the sequences and for each such region reports an “E value” which is the expected number of times one could expect to see such a match by chance in a database of a fixed reference size containing random sequences. The size of this database is set by default in the bl2seq program. For small E values, much less than one, the E value is approximately the probability of such a random match.

Variant polynucleotide sequences preferably exhibit an E value of less than 1×10⁻⁶, more preferably less than 1×10⁻⁹, more preferably less than 1×10⁻¹², more preferably less than 1×10⁻¹⁵, more preferably less than 1×10⁻¹⁸, more preferably less than 1×10⁻²¹, more preferably less than 1×10⁻³⁰, more preferably less than 1×10⁻⁴⁰, more preferably less than 1×10⁻⁵⁰, more preferably less than 1×10⁻⁶⁰, more preferably less than 1×10⁻⁷⁰, more preferably less than 1×10⁻⁸⁰, more preferably less than 1×10⁻⁹⁰ and most preferably less than 1×10⁻¹⁰⁰ when compared with any one of the specifically identified sequences.

Alternatively, variant polynucleotides of the present invention hybridize to the specified polynucleotide sequences, or complements thereof, under stringent conditions.

With respect to polynucleotide molecules having a length less than 100 bases, exemplary stringent hybridization conditions are 5 to 10° C. below Tm. On average, the Tm of a polynucleotide molecule of length less than 100 bp is reduced by approximately (500/oligonucleotide length)° C.

With respect to the DNA mimics known as peptide nucleic acids (PNAs) (Nielsen et al., Science. 1991 Dec. 6; 254(5037):1497-500) Tm values are higher than those for DNA-DNA or DNA-RNA hybrids, and can be calculated using the formula described in Giesen et al., Nucleic Acids Res. 1998 Nov. 1; 26(21):5004-6. Exemplary stringent hybridization conditions for a DNA-PNA hybrid having a length less than 100 bases are 5 to 10° C. below the Tm.

Variant polynucleotides of the present invention also encompass polynucleotides that differ from the sequences of the invention but that, as a consequence of the degeneracy of the genetic code, encode a polypeptide having similar activity to a polypeptide encoded by a polynucleotide of the present invention. A sequence alteration that does not change the amino acid sequence of the polypeptide is a “silent variation”. Except for ATG (methionine) and TGG (tryptophan), other codons for the same amino acid may be changed by art recognized techniques, e.g., to optimize codon expression in a particular host organism.

Polynucleotide sequence alterations resulting in conservative substitutions of one or several amino acids in the encoded polypeptide sequence without significantly altering its biological activity are also included in the invention. A skilled artisan will be aware of methods for making phenotypically silent amino acid substitutions (see, e.g., Bowie et al., 1990, Science 247, 1306).

Variant polynucleotides due to silent variations and conservative substitutions in the encoded polypeptide sequence may be determined using the publicly available bl2seq program from the BLAST suite of programs (version 2.2.5, November 2002) from NCBI (available on the world wide web at ftp.ncbi.nih.gov/blast/) via the tblastx algorithm as previously described.

Polypeptide Variants

The term “variant” with reference to polypeptides encompasses naturally occurring, recombinantly and synthetically produced polypeptides. Variant polypeptide sequences preferably exhibit at least 50%, more preferably at least 51%, more preferably at least 52%, more preferably at least 53%, more preferably at least 54%, more preferably at least 55%, more preferably at least 56%, more preferably at least 57%, more preferably at least 58%, more preferably at least 59%, more preferably at least 60%, more preferably at least 61%, more preferably at least 62%, more preferably at least 63%, more preferably at least 64%, more preferably at least 65%, more preferably at least 66%, more preferably at least 67%, more preferably at least 68%, more preferably at least 69%, more preferably at least 70%, more preferably at least 71%, more preferably at least 72%, more preferably at least 73%, more preferably at least 74%, more preferably at least 75%, more preferably at least 76%, more preferably at least 77%, more preferably at least 78%, more preferably at least 79%, more preferably at least 80%, more preferably at least 81%, more preferably at least 82%, more preferably at least 83%, more preferably at least 84%, more preferably at least 85%, more preferably at least 86%, more preferably at least 87%, more preferably at least 88%, more preferably at least 89%, more preferably at least 90%, more preferably at least 91%, more preferably at least 92%, more preferably at least 93%, more preferably at least 94%, more preferably at least 95%, more preferably at least 96%, more preferably at least 97%, more preferably at least 98%, and most preferably at least 99% identity to a sequence of the present invention. Identity is found over a comparison window of at least 20 amino acid positions, preferably at least 50 amino acid positions, more preferably at least 100 amino acid positions, and most preferably over the entire length of a polypeptide of the invention.

Polypeptide sequence identity can be determined in the following manner. The subject polypeptide sequence is compared to a candidate polypeptide sequence using BLASTP (from the BLAST suite of programs, version 2.2.5, November 2002) in bl2seq, which is publicly available from NCBI (available on the world wide web at ftp.ncbi.nih.gov/blast/). The default parameters of bl2seq are utilized except that filtering of low complexity regions should be turned off.

Polypeptide sequence identity may also be calculated over the entire length of the overlap between a candidate and subject polypeptide sequences using global sequence alignment programs. EMBOSS-needle (available on the world wide web at www.ebi.ac.uk/emboss/align/) and GAP (Huang, X. (1994) On Global Sequence Alignment. Computer Applications in the Biosciences 10, 227-235) as discussed above are also suitable global sequence alignment programs for calculating polypeptide sequence identity.

A preferred method for calculating polypeptide sequence identity is based on aligning sequences to be compared using ClustalW (Thompson et al 1994, Nucleic Acid Res 11 (22) 4673-4680).

Polypeptide variants of the present invention also encompass those which exhibit a similarity to one or more of the specifically identified sequences that is likely to preserve the functional equivalence of those sequences and which could not reasonably be expected to have occurred by random chance. Such sequence similarity with respect to polypeptides may be determined using the publicly available bl2seq program from the BLAST suite of programs (version 2.2.5, November 2002) from NCBI (available on the world wide web at ftp.ncbi.nih.gov/blast/). The similarity of polypeptide sequences may be examined using the following unix command line parameters:

bl2seq -i peptideseq1 -j peptideseq2 -F F -p blastp

Variant polypeptide sequences preferably exhibit an E value of less than 1×10⁻⁶, more preferably less than 1×10⁻⁹, more preferably less than 1×10⁻¹², more preferably less than 1×10⁻¹⁵, more preferably less than 1×10⁻¹⁸, more preferably less than 1×10⁻²¹, more preferably less than 1×10⁻³⁰, more preferably less than 1×10⁻⁴⁰, more preferably less than 1×10⁻⁵⁰, more preferably less than 1×10⁻⁶⁰, more preferably less than 1×10⁻⁷⁰, more preferably less than 1×10⁻⁸⁰, more preferably less than 1×10⁻⁹⁰ and most preferably 1×10⁻¹⁰⁰ when compared with any one of the specifically identified sequences.

The parameter -F F turns off filtering of low complexity sections. The parameter -p selects the appropriate algorithm for the pair of sequences. This program finds regions of similarity between the sequences and for each such region reports an “E value” which is the expected number of times one could expect to see such a match by chance in a database of a fixed reference size containing random sequences. For small E values, much less than one, this is approximately the probability of such a random match.

Conservative substitutions of one or several amino acids of a described polypeptide sequence without significantly altering its biological activity are also included in the invention. A skilled artisan will be aware of methods for making phenotypically silent amino acid substitutions (see, e.g., Bowie et al., 1990, Science 247, 1306).

Constructs, Vectors and Components Thereof

The term “genetic construct” refers to a polynucleotide molecule, usually double-stranded DNA, which may have inserted into it another polynucleotide molecule (the insert polynucleotide molecule) such as, but not limited to, a cDNA molecule. A genetic construct may contain the necessary elements that permit transcribing the insert polynucleotide molecule, and, optionally, translating the transcript into a polypeptide. The insert polynucleotide molecule may be derived from the host cell, or may be derived from a different cell or organism and/or may be a recombinant polynucleotide. Once inside the host cell the genetic construct may become integrated in the host chromosomal DNA. The genetic construct may be linked to a vector.

The term “vector” refers to a polynucleotide molecule, usually double stranded DNA, which is used to transport the genetic construct into a host cell. The vector may be capable of replication in at least one additional host system, such as E. coli.

The term “expression construct” refers to a genetic construct that includes the necessary elements that permit transcribing the insert polynucleotide molecule, and, optionally, translating the transcript into a polypeptide. An expression construct typically comprises in a 5' to 3' direction:
a) a promoter functional in the host cell into which the construct will be transformed,
b) the polynucleotide to be expressed, and
c) a terminator functional in the host cell into which the construct will be transformed.

The term “coding region” or “open reading frame” (ORF) refers to the sense strand of a genomic DNA sequence or a cDNA sequence that is capable of producing a transcription product and/or a polypeptide under the control of appropriate regulatory sequences. The coding sequence is identified by the presence of a 5' translation start codon and a 3' translation stop codon. When inserted into a genetic construct, a “coding sequence” is capable of being expressed when it is operably linked to promoter and terminator sequences.

“Operably-linked” means that the sequence to be expressed is placed under the control of regulatory elements that include promoters, tissue-specific regulatory elements, temporal regulatory elements, enhancers, repressors and terminators.

The term “noncoding region” refers to untranslated sequences that are upstream of the translational start site and downstream of the translational stop site. These sequences are also referred to respectively as the 5' UTR and the 3' UTR. These regions include elements required for transcription initiation and termination and for regulation of translation efficiency.

Terminators are sequences, which terminate transcription, and are found in the 3' untranslated ends of genes downstream of the translated sequence. Terminators are important determinants of mRNA stability and in some cases have been found to have spatial regulatory functions.

The term “promoter” refers to nontranscribed cis-regulatory elements upstream of the coding region that regulate gene transcription. Promoters comprise cis-initiator elements which specify the transcription initiation site and conserved boxes such as the TATA box, and motifs that are bound by transcription factors.

A “transgene” is a polynucleotide that is taken from one organism and introduced into a different organism by transformation. The transgene may be derived from the same species or from a different species as the species of the organism into which the transgene is introduced.

A “transgenic plant” refers to a plant which contains new genetic material as a result of genetic manipulation or transformation. The new genetic material may be derived from a plant of the same species as the resulting transgenic plant or from a different species.

An “inverted repeat” is a sequence that is repeated, where the second half of the repeat is in the complementary strand, e.g.

(5') GATCTA......TAGATC (3')
(3') CTAGAT......ATCTAG (5')

Read-through transcription will produce a transcript that undergoes complementary base-pairing to form a hairpin structure provided that there is a 3-5 bp spacer between the repeated regions.

The term “regulating anthocyanin production” is intended to include both increasing and decreasing anthocyanin production. Preferably the term refers to increasing anthocyanin production. Anthocyanins that may be regulated include but are not limited to cyanidin-3-glucoside, cyanidin-3-O-rutinoside, cyanidin-3-galactoside and cyanidin-3-pentoside.

The terms “to alter expression of” and “altered expression” of a polynucleotide or polypeptide of the invention, are intended to encompass the situation where genomic DNA corresponding to a polynucleotide of the invention is modified thus leading to altered expression of a polynucleotide or polypeptide of the invention. Modification of the genomic DNA may be through genetic transformation or other methods known in the art for inducing mutations. The “altered expression” can be related to an increase or decrease in the amount of messenger RNA and/or polypeptide produced and may also result in altered activity of a polypeptide due to alterations in the sequence of a polynucleotide and polypeptide produced.

The applicants have identified polynucleotide sequences (SEQ ID NO: 5 to 8) which encode polypeptides (SEQ ID NO: 1 to 4) respectively from apple, which are transcription factors capable of regulating anthocyanin production in plants. The applicants have also identified polynucleotide variants (SEQ ID NO: 22 to 47) of SEQ ID NO: 5 that encode polypeptide variants (SEQ ID NO: 9 to 21) of SEQ ID NO: 1. A summary of the relationship between the polynucleotides and polypeptides is found in Table 3 (Summary of Sequences).

The invention provides genetic constructs, vectors and plants comprising the polynucleotide sequences. The invention also provides plants comprising the genetic constructs and vectors of the invention.

The invention provides plants altered, relative to suitable control plants, in production of anthocyanin pigments. The invention provides both plants with increased and decreased production of anthocyanin pigments. The invention also provides methods for the production of such plants and methods of selection of such plants.

Suitable control plants may include non-transformed plants of the same species and variety, or plants of the same species or variety transformed with a control construct.

Uses of the compositions of the invention include the production of fruit, or other plant parts, with increased levels of anthocyanin pigmentation, for example production of apples with red skin and/or red flesh.

The invention also provides methods for selecting transformed plant cells and plants by selecting plant cells and plants which have increased anthocyanin pigment, the increased anthocyanic pigment indicating that the plants are transformed to express a polynucleotide or polypeptide of the invention.

Methods for Isolating or Producing Polynucleotides

The polynucleotide molecules of the invention can be isolated by using a variety of techniques known to those of ordinary skill in the art. By way of example, such polynucleotides can be isolated through use of the polymerase chain reaction (PCR) described in Mullis et al., Eds. 1994 The Polymerase Chain Reaction, Birkhauser, incorporated herein by reference. The polynucleotides of the invention can be amplified using primers, as defined herein, derived from the polynucleotide sequences of the invention.

Further methods for isolating polynucleotides of the invention include use of all, or portions of, the polynucleotides having the sequence set forth herein as hybridization probes. The technique of hybridizing labelled polynucleotide probes to polynucleotides immobilized on solid supports such as nitrocellulose filters or nylon membranes, can be used to screen the genomic or cDNA libraries. Exemplary hybridization and wash conditions are: hybridization for 20 hours at 65° C. in 5.0×SSC, 0.5% sodium dodecyl sulfate, 1× Denhardt's solution; washing (three washes of twenty minutes each at 55° C.) in 1.0×SSC, 1% (w/v) sodium dodecyl sulfate, and optionally one wash (for twenty minutes) in 0.5×SSC, 1% (w/v) sodium dodecyl sulfate, at 60° C. An optional further wash (for twenty minutes) can be conducted under conditions of 0.1×SSC, 1% (w/v) sodium dodecyl sulfate, at 60° C.

The polynucleotide fragments of the invention may be produced by techniques well-known in the art such as restriction endonuclease digestion, oligonucleotide synthesis and PCR amplification.

A partial polynucleotide sequence may be used, in methods well-known in the art, to identify the corresponding full length polynucleotide sequence. Such methods include PCR-based methods, 5'RACE (Frohman MA, 1993, Methods Enzymol. 218: 340-56), hybridization-based methods and computer/database-based methods. Further, by way of example, inverse PCR permits acquisition of unknown sequences, flanking the polynucleotide sequences disclosed herein, starting with primers based on a known region (Triglia et al., 1998, Nucleic Acids Res 16, 8186, incorporated herein by reference). The method uses several restriction enzymes to generate a suitable fragment in the known region of a gene. The fragment is then circularized by intramolecular ligation and used as a PCR template. Divergent primers are designed from the known region. In order to physically assemble full-length clones, standard molecular biology approaches can be utilized (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987).

It may be beneficial, when producing a transgenic plant from a particular species, to transform such a plant with a sequence or sequences derived from that species. The benefit may be to alleviate public concerns regarding cross-species transformation in generating transgenic organisms. Additionally when down-regulation of a gene is the desired result, it may be necessary to utilise a sequence identical (or at least highly similar) to that in the plant, for which reduced expression is desired. For these reasons among others, it is desirable to be able to identify and isolate orthologues of a particular gene in several different plant species. Variants (including orthologues) may be identified by the methods described.

Methods for Identifying Variants

Physical Methods

Variant polynucleotides may be identified using PCR-based methods (Mullis et al., Eds. 1994 The Polymerase Chain Reaction, Birkhauser). Typically, the polynucleotide sequence of a primer, useful to amplify variants of polynucleotide molecules of the invention by PCR, may be based on a sequence encoding a conserved region of the corresponding amino acid sequence.

Alternatively library screening methods, well known to those skilled in the art, may be employed (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987). When identifying variants of the probe sequence, hybridization and/or wash stringency will typically be reduced relative to when exact sequence matches are sought.

Polypeptide variants may also be identified by physical methods, for example by screening expression libraries using antibodies raised against polypeptides of the invention (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987) or by identifying polypeptides from natural sources with the aid of such antibodies.

Computer Based Methods

The variant sequences of the invention, including both polynucleotide and polypeptide variants, may also be identified by computer-based methods well-known to those skilled in the art, using public domain sequence alignment algorithms and sequence similarity search tools to search sequence databases (public domain databases include Genbank, EMBL, Swiss-Prot, PIR and others). See, e.g., Nucleic Acids Res. 29: 1-10 and 11-16, 2001 for examples of online resources. Similarity searches retrieve and align target sequences for comparison with a sequence to be analyzed (i.e., a query sequence). Sequence comparison algorithms use scoring matrices to assign an overall score to each of the alignments.

An exemplary family of programs useful for identifying variants in sequence databases is the BLAST suite of programs (version 2.2.5, November 2002) including BLASTN, BLASTP, BLASTX, tBLASTN and tBLASTX (which are publicly available on the world wide web at ftp.ncbi.nih.gov/blast/ or from the National Center for Biotechnology Information (NCBI), National Library of Medicine, Building 38A, Room 8N805, Bethesda, Md. 20894 USA). The NCBI server also provides the facility to use the programs to screen a number of publicly available sequence databases. BLASTN compares a nucleotide query sequence against a nucleotide sequence database. BLASTP compares an amino acid query sequence against a protein sequence database. BLASTX compares a nucleotide query sequence translated in all reading frames against a protein sequence database. tBLASTN compares a protein query sequence against a nucleotide sequence database dynamically translated in all reading frames. tBLASTX compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database. The BLAST programs may be used with default parameters or the parameters may be altered as required to refine the screen.

The use of the BLAST family of algorithms, including BLASTN, BLASTP, and BLASTX, is described in the publication of Altschul et al., Nucleic Acids Res. 25: 3389-3402, 1997.

The “hits” to one or more database sequences by a queried sequence produced by BLASTN, BLASTP, BLASTX, tBLASTN, tBLASTX, or a similar algorithm, align and identify similar portions of sequences. The hits are arranged in order of the degree of similarity and the length of sequence overlap. Hits to a database sequence generally represent an overlap over only a fraction of the sequence length of the queried sequence.

The BLASTN, BLASTP, BLASTX, tBLASTN and tBLASTX algorithms also produce “Expect” values for alignments. The Expect value (E) indicates the number of hits one can “expect” to see by chance when searching a database of the same size containing random contiguous sequences. The Expect value is used as a significance threshold for determining whether the hit to a database indicates true similarity. For example, an E value of 0.1 assigned to a polynucleotide hit is interpreted as meaning that in a database of the size of the database screened, one might expect to see 0.1 matches over the aligned portion of the sequence with a similar score simply by chance. For sequences having an E value of 0.01 or less over aligned and matched portions, the probability of finding a match by chance in that database is 1% or less using the BLASTN, BLASTP, BLASTX, tBLASTN or tBLASTX algorithm.

Multiple sequence alignments of a group of related sequences can be carried out with CLUSTALW (Thompson, J. D., Higgins, D. G. and Gibson, T. J. (1994) CLUSTALW: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, positions-specific gap penalties and weight matrix choice. Nucleic Acids Research, 22:4673-4680, available on the world wide web at www-igbmc.u-strasbg.fr/BioInfo/ClustalW/Top.html) or TCOFFEE (Cedric Notredame, Desmond G. Higgins, Jaap Heringa, T-Coffee: A novel method for fast and accurate multiple sequence alignment, J. Mol. Biol. (2000) 302: 205-217) or PILEUP, which uses progressive, pairwise alignments (Feng and Doolittle, 1987, J. Mol. Evol. 25, 351).

Pattern recognition software applications are available for finding motifs or signature sequences. For example, MEME (Multiple EM for Motif Elicitation) finds motifs and signature sequences in a set of sequences, and MAST (Motif Alignment and Search Tool) uses these motifs to identify similar or the same motifs in query sequences. The MAST results are provided as a series of alignments with appropriate statistical data and a visual overview of the motifs found. MEME and MAST were developed at the University of California, San Diego.

PROSITE (Bairoch and Bucher, 1994, Nucleic Acids Res. 22, 3583; Hofmann et al., 1999, Nucleic Acids Res. 27, 215) is a method of identifying the functions of uncharacterized proteins translated from genomic or cDNA sequences.
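PROSITE expresses many of its signatures as short residue patterns. As a hedged illustration of how such a pattern might be scanned against a protein sequence, the Python sketch below converts a pattern to a regular expression and reports each match. The helper names (prosite_to_regex, find_motifs) and the example sequence are hypothetical, and only a subset of the pattern syntax is handled: residue letters, x wildcards, [..] and {..} groups, repeat counts and terminus anchors. This is a sketch under those assumptions, not a replacement for the PROSITE tools themselves.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern into a Python regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        if not element:
            raise ValueError("empty pattern element")
        prefix = suffix = ""
        if element.startswith("<"):      # '<' anchors the match at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # '>' anchors the match at the C-terminus
            suffix, element = "$", element[:-1]
        # Split off an optional repeat count such as x(2,4) or A(3).
        m = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, repeat = m.group(1), m.group(2)
        if core == "x":                  # any residue
            core = "."
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"   # residues that must not occur
        # [..] groups and single residue letters are already valid regex.
        if repeat:
            core += "{" + repeat + "}"
        parts.append(prefix + core + suffix)
    return "".join(parts)


def find_motifs(pattern: str, sequence: str) -> list:
    """Return (1-based position, matched residues) for every occurrence."""
    # The lookahead lets overlapping occurrences be reported as well.
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


if __name__ == "__main__":
    # N-glycosylation site pattern; the sequence is invented for the example.
    print(find_motifs("N-{P}-[ST]-{P}", "MKNGTALNPSANASG"))
```

For the invented sequence shown, the sketch reports matches at positions 3 (NGTA) and 12 (NASG); the asparagine at position 8 is skipped because it is followed by proline.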
The PROSITE database (available on the world wide web at www.expasy.org/prosite) contains biologically significant patterns and profiles and is designed so that it can be used with appropriate computational tools to assign a new sequence to a known family of proteins or to determine which known domain(s) are present in the sequence (Falquet et al., 2002, Nucleic Acids Res. 30, 235). Prosearch is a tool that can search SWISS-PROT and EMBL databases with a given sequence pattern or signature.

The function of a variant polynucleotide of the invention as encoding a transcription factor capable of regulating pigment production in a plant can be tested. Such transcription factors can be tested for their ability to regulate expression of known anthocyanin biosynthesis genes (e.g. Example 4) or can be tested for their capability to regulate pigment production (e.g. Examples 5 and 6).

Methods for Isolating Polypeptides

The polypeptides of the invention, including variant polypeptides, may be prepared using peptide synthesis methods well known in the art such as direct peptide synthesis using solid phase techniques (e.g. Stewart et al., 1969, in Solid-Phase Peptide Synthesis, WH Freeman Co., San Francisco, Calif.), or automated synthesis, for example using an Applied Biosystems 431A Peptide Synthesizer (Foster City, Calif.). Mutated forms of the polypeptides may also be produced during such syntheses.

The polypeptides and variant polypeptides of the invention may also be purified from natural sources using a variety of techniques that are well known in the art (e.g. Deutscher, 1990, Ed, Methods in Enzymology, Vol. 182, Guide to Protein Purification).

Alternatively the polypeptides and variant polypeptides of the invention may be expressed recombinantly in suitable host cells and separated from the cells as discussed below.

Methods for Producing Constructs and Vectors

The genetic constructs of the present invention comprise one or more polynucleotide sequences of the invention and/or polynucleotides encoding polypeptides of the invention, and may be useful for transforming, for example, bacterial, fungal, insect, mammalian or plant organisms. The genetic constructs of the invention are intended to include expression constructs as herein defined.

Methods for producing and using genetic constructs and vectors are well known in the art and are described generally in Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987; Ausubel et al., Current Protocols in Molecular Biology, Greene Publishing, 1987.

Methods for Producing Host Cells Comprising Polynucleotides, Constructs or Vectors

The invention provides a host cell which comprises a genetic construct or vector of the invention. Host cells may be derived from, for example, bacterial, fungal, insect, mammalian or plant organisms.

Host cells comprising genetic constructs, such as expression constructs, of the invention are useful in methods well known in the art (e.g. Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987; Ausubel et al., Current Protocols in Molecular Biology, Greene Publishing, 1987) for recombinant production of polypeptides of the invention. Such methods may involve the culture of host cells in an appropriate medium in conditions suitable for or conducive to expression of a polypeptide of the invention. The expressed recombinant polypeptide, which may optionally be secreted into the culture, may then be separated from the medium, host cells or culture medium by methods well known in the art (e.g. Deutscher, Ed, 1990, Methods in Enzymology, Vol 182, Guide to Protein Purification).

Methods for Producing Plant Cells and Plants Comprising Constructs and Vectors

The invention further provides plant cells which comprise a genetic construct of the invention, and plant cells modified to alter expression of a polynucleotide or polypeptide of the invention. Plants comprising such cells also form an aspect of the invention.

Production of plants altered in pigment production may be achieved through methods of the invention. Such methods may involve the transformation of plant cells and plants, with a construct of the invention designed to alter expression of a polynucleotide or polypeptide capable of regulating pigment production in such plant cells and plants. Such methods also include the transformation of plant cells and plants with a combination of the construct of the invention and one or more other constructs designed to alter expression of one or more polynucleotides or polypeptides capable of regulating pigment production in such plant cells and plants.

Methods for transforming plant cells, plants and portions thereof with polynucleotides are described in Draper et al., 1988, Plant Genetic Transformation and Gene Expression. A Laboratory Manual. Blackwell Sci. Pub. Oxford, p. 365; Potrykus and Spangenburg, 1995, Gene Transfer to Plants. Springer-Verlag, Berlin; and Gelvin et al., 1993, Plant Molecular Biol. Manual. Kluwer Acad. Pub. Dordrecht. A review of transgenic plants, including transformation techniques, is provided in Galun and Breiman, 1997, Transgenic Plants. Imperial College Press, London.

Methods for Genetic Manipulation of Plants

A number of plant transformation strategies are available (e.g. Birch, 1997, Ann Rev Plant Phys Plant Mol Biol, 48, 297). For example, strategies may be designed to increase expression of a polynucleotide/polypeptide in a plant cell, organ and/or at a particular developmental stage where/when it is normally expressed or to ectopically express a polynucleotide/polypeptide in a cell, tissue, organ and/or at a particular developmental stage which/when it is not normally expressed. The expressed polynucleotide/polypeptide may be derived from the plant species to be transformed or may be derived from a different plant species.

Transformation strategies may be designed to reduce expression of a polynucleotide/polypeptide in a plant cell, tissue, organ or at a particular developmental stage which/when it is normally expressed. Such strategies are known as gene silencing strategies.

Genetic constructs for expression of genes in transgenic plants typically include promoters for driving the expression of one or more cloned polynucleotides, terminators and selectable marker sequences to detect presence of the genetic construct in the transformed plant.

The promoters suitable for use in the constructs of this invention are functional in a cell, tissue or organ of a monocot or dicot plant and include cell-, tissue- and organ-specific promoters, cell cycle specific promoters, temporal promoters, inducible promoters, constitutive promoters that are active in most plant tissues, and recombinant promoters. Choice of promoter will depend upon the temporal and spatial expression of the cloned polynucleotide, so desired. The promoters may be those normally associated with a transgene of interest, or promoters which are derived from genes of other plants, viruses, and plant pathogenic bacteria and fungi. Those skilled in the art will, without undue experimentation, be able to select promoters that are suitable for use in modifying and modulating plant traits using genetic constructs comprising the polynucleotide sequences of the invention. Examples of constitutive plant promoters include the CaMV 35S promoter, the nopaline synthase promoter and the octopine synthase promoter, and the Ubi 1 promoter from maize. Plant promoters which are active in specific tissues, respond to
internal developmental signals or external abiotic or biotic stresses are described in the scientific literature. Exemplary promoters are described, e.g., in WO 02/00894, which is herein incorporated by reference.
Exemplary terminators that are commonly used in plant transformation genetic constructs include, e.g., the cauliflower mosaic virus (CaMV) 35S terminator, the Agrobacterium tumefaciens nopaline synthase or octopine synthase terminators, the Zea mays zein gene terminator, the Oryza sativa ADP-glucose pyrophosphorylase terminator and the Solanum tuberosum PI-II terminator.
Selectable markers commonly used in plant transformation include the neomycin phosphotransferase II gene (NPT II) which confers kanamycin resistance, the aadA gene, which confers spectinomycin and streptomycin resistance, the phosphinothricin acetyltransferase (bar gene) for Ignite (AgrEvo) and Basta (Hoechst) resistance, and the hygromycin phosphotransferase gene (hpt) for hygromycin resistance.
Use of genetic constructs comprising reporter genes (coding sequences which express an activity that is foreign to the host, usually an enzymatic activity and/or a visible signal (e.g., luciferase, GUS, GFP)) which may be used for promoter expression analysis in plants and plant tissues is also contemplated. The reporter gene literature is reviewed in Herrera-Estrella et al., 1993, Nature 303, 209, and Schrott, 1995, In: Gene Transfer to Plants (Potrykus, T., Spangenberg, Eds) Springer Verlag, Berlin, pp. 325-336.
Gene silencing strategies may be focused on the gene itself or regulatory elements which effect expression of the encoded polypeptide. "Regulatory elements" is used here in the widest possible sense and includes other genes which interact with the gene of interest.
Genetic constructs designed to decrease or silence the expression of a polynucleotide/polypeptide of the invention may include an antisense copy of a polynucleotide of the invention. In such constructs the polynucleotide is placed in an antisense orientation with respect to the promoter and terminator.
An "antisense" polynucleotide is obtained by inverting a polynucleotide or a segment of the polynucleotide so that the transcript produced will be complementary to the mRNA transcript of the gene, e.g.,
5' GATCTA 3' (coding strand)    3' CTAGAT 5' (antisense strand)
3' CUAGAU 5' mRNA    5' GAUCUCG 3' antisense RNA
Genetic constructs designed for gene silencing may also include an inverted repeat. An inverted repeat is a sequence that is repeated where the second half of the repeat is in the complementary strand, e.g.,
5'-GATCTA......TAGATC-3'
3'-CTAGAT......ATCTAG-5'
The transcript formed may undergo complementary base pairing to form a hairpin structure. Usually a spacer of at least 3-5 bp between the repeated region is required to allow hairpin formation.
Another silencing approach involves the use of a small antisense RNA targeted to the transcript equivalent to an miRNA (Llave et al., 2002, Science 297, 2053). Use of such small antisense RNA corresponding to polynucleotide of the invention is expressly contemplated.
The term genetic construct as used herein also includes small antisense RNAs and other such polypeptides effecting gene silencing.
Transformation with an expression construct, as herein defined, may also result in gene silencing through a process known as sense suppression (e.g. Napoli et al., 1990, Plant Cell 2, 279; de Carvalho Niebel et al., 1995, Plant Cell, 7, 347). In some cases sense suppression may involve over-expression of the whole or a partial coding sequence but may also involve expression of non-coding region of the gene, such as an intron or a 5' or 3' untranslated region (UTR). Chimeric partial sense constructs can be used to coordinately silence multiple genes (Abbott et al., 2002, Plant Physiol. 128(3): 844-53; Jones et al., 1998, Planta 204: 499-505). The use of such sense suppression strategies to silence the expression of a polynucleotide of the invention is also contemplated.
The polynucleotide inserts in genetic constructs designed for gene silencing may correspond to coding sequence and/or non-coding sequence, such as promoter and/or intron and/or 5' or 3' UTR sequence, or the corresponding gene.
Other gene silencing strategies include dominant negative approaches and the use of ribozyme constructs (McIntyre, 1996, Transgenic Res, 5, 257).
Pre-transcriptional silencing may be brought about through mutation of the gene itself or its regulatory elements. Such mutations may include point mutations, frameshifts, insertions, deletions and substitutions.
The following are representative publications disclosing genetic transformation protocols that can be used to genetically transform the following plant species: rice (Alam et al., 1999, Plant Cell Rep. 18, 572); apple (Yao et al., 1995, Plant Cell Reports 14, 407-412); maize (U.S. Pat. Nos. 5,177,010 and 5,981,840); wheat (Ortiz et al., 1996, Plant Cell Rep. 15, 1996, 877); tomato (U.S. Pat. No. 5,159,135); potato (Kumar et al., 1996 Plant J. 9: 821); cassava (Li et al., 1996 Nat. Biotechnology 14, 736); lettuce (Michelmore et al., 1987, Plant Cell Rep. 6, 439); tobacco (Horsch et al., 1985, Science 227, 1229); cotton (U.S. Pat. Nos. 5,846,797 and 5,004,863); grasses (U.S. Pat. Nos. 5,187,073 and 6,020,539); peppermint (Niu et al., 1998, Plant Cell Rep. 17, 165); citrus plants (Pena et al., 1995, Plant Sci. 104, 183); caraway (Krens et al., 1997, Plant Cell Rep. 17, 39); banana (U.S. Pat. No. 5,792,935); soybean (U.S. Pat. Nos. 5,416,011; 5,569,834; 5,824,877; 5,563,055 and 5,968,830); pineapple (U.S. Pat. No. 5,952,543); poplar (U.S. Pat. No. 4,795,855); monocots in general (U.S. Pat. Nos. 5,591,616 and 6,037,522); brassica (U.S. Pat. Nos. 5,188,958; 5,463,174 and 5,750,871); cereals (U.S. Pat. No. 6,074,877); pear (Matsuda et al., 2005); Prunus (Ramesh et al., 2006; Song and Sink 2005; Gonzalez Padilla et al., 2003); strawberry (Oosumi et al., 2006; Folta et al., 2006); rose (Li et al., 2003); and Rubus (Graham et al., 1995). Transformation of other species is also contemplated by the invention. Suitable methods and protocols are available in the scientific literature.
Several further methods known in the art may be employed to alter expression of a nucleotide and/or polypeptide of the invention. Such methods include but are not limited to Tilling (Till et al., 2003, Methods Mol Biol, 236, 205), so called "Deletagene" technology (Li et al., 2001, Plant Journal 27(3), 235) and the use of artificial transcription factors such as synthetic zinc finger transcription factors (e.g. Jouvenot et al., 2003, Gene Therapy 10, 513). Additionally antibodies or fragments thereof, targeted to a particular polypeptide may also be expressed in plants to modulate the activity of that polypeptide (Jobling et al., 2003, Nat. Biotechnol., 21(1), 35). Transposon tagging approaches may also be applied. Additionally peptides interacting with a polypeptide of the invention may be identified through technologies such as phage display (Dyax Corporation). Such interacting peptides may be expressed in or applied to a plant to affect activity of a polypeptide of the invention. Use of each of the above approaches in alteration of expression of a nucleotide and/or polypeptide of the invention is specifically contemplated.
Methods of Selecting Plants
Methods are also provided for selecting plants with altered pigment production. Such methods involve testing of plants for altered expression of a polynucleotide or polypeptide of the invention. Such methods may be applied at a young age or early developmental stage when the altered pigment production may not necessarily be visible, to accelerate breeding programs directed toward improving anthocyanin content.
The expression of a polynucleotide, such as a messenger RNA, is often used as an indicator of expression of a corresponding polypeptide. Exemplary methods for measuring the expression of a polynucleotide include but are not limited to Northern analysis, RT-PCR and dot-blot analysis (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987). Polynucleotides or portions of the polynucleotides of the invention are thus useful as probes or primers, as herein defined, in methods for the identification of plants with altered levels of anthocyanin. The polypeptides of the invention may be used as probes in hybridization experiments, or as primers in PCR based experiments, designed to identify such plants.
Alternatively antibodies may be raised against polypeptides of the invention. Methods for raising and using antibodies are standard in the art (see for example: Antibodies, A Laboratory Manual, Harlow & Lane, Eds, Cold Spring Harbour Laboratory, 1998). Such antibodies may be used in methods to detect altered expression of polypeptides which modulate flower size in plants. Such methods may include ELISA (Kemeny, 1991, A Practical Guide to ELISA, NY Pergamon Press) and Western analysis (Towbin & Gordon, 1994, J Immunol Methods, 72, 313).
These approaches for analysis of polynucleotide or polypeptide expression and the selection of plants with altered expression are useful in conventional breeding programs designed to produce varieties with altered pigment production.
Plants
The plants of the invention may be grown and either self-ed or crossed with a different plant strain and the resulting hybrids, with the desired phenotypic characteristics, may be identified. Two or more generations may be grown to ensure that the subject phenotypic characteristics are stably maintained and inherited. Plants resulting from such standard breeding approaches also form an aspect of the present invention.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention will be better understood with reference to the accompanying drawings in which:
FIG. 1 shows comparison of the MdMYB10 (SEQ ID NO: 1) and MdMYB9 (SEQ ID NO: 2) with known anthocyanin MYB regulators from various species at the R2R3 binding domain. Arrows indicate specific residues that contribute to a motif implicated in bHLH cofactor interaction in Arabidopsis (Zimmermann et al., 2000); these same residues are evident in MdMYB10 and MdMYB9, suggesting a similar protein-protein interaction. The sequences are: MdMYB10 apple (SEQ ID NO: 103), AtPAP1 (SEQ ID NO: 104), AtMYB90 (SEQ ID NO: 105), VlMYBA2 (SEQ ID NO: 106), Ca A pepper (SEQ ID NO: 107), PhAN2 (SEQ ID NO: 108), LeANT1 (SEQ ID NO: 109), GhMYB10 Gerbera (SEQ ID NO: 110), PmBF1 spruce (SEQ ID NO: 111), ZmC1 maize (SEQ ID NO: 112), AtTT2 AtMYB123 (SEQ ID NO: 113), MdMYB9 apple (SEQ ID NO: 114), MdMYB11 apple (SEQ ID NO: 115), ZmP maize (SEQ ID NO: 116), MdMYB8 apple (SEQ ID NO: 117) and AtGL1 (SEQ ID NO: 118).
FIG. 2 shows a phylogenetic analysis showing relationship between Arabidopsis and apple MYB TFs. Arrow shows position of MdMYB10 which falls next to AtPAP1 in the Anthocyanin MYB regulator subgroup 10. Known anthocyanin regulators are denoted by a grey dot; other genes included in the figure with a black dot are negative controls showing MdMYB10 action is specific for this MYB clade and not MYBs in general.
FIG. 3(A) shows data from qPCR analysis of the apple anthocyanin biosynthetic genes from CHS to UFGT as listed on the right hand side in the cortex, skin and leaf of Red Field and Pacific Rose. X axis numbers refer as follows: 1, 40 DAFB, 2, 67 DAFB, 3, 102 DAFB, 4, 130 DAFB, 5, 146 DAFB, 6, Red Field OP leaf and 7, Pacific Rose leaf.
FIG. 3(B) shows sections through apple fruit (Red Field OP in upper row, Pacific Rose in lower row) at developmental stages 1 to 5 as in 3(A). Increased pigmentation in Red Field OP versus Pacific Rose is visibly apparent.
FIG. 4 shows an expression analysis of MdMYB10, MdbHLH3 and MdbHLH33. (A) RT-PCR analysis of MdMYB10 in Red Field (cortex, skin and leaf) and Pacific Rose (cortex, skin and leaf) and (B) corresponding qPCR data of MdMYB10, MdbHLH3 and MdbHLH33. Gel lane and x axis number as follows: 1, 40 DAFB, 2, 67 DAFB, 3, 102 DAFB, 4, 130 DAFB, 5, 146 DAFB, 6, Red Field leaf and 7, Pacific Rose leaf.
FIG. 5 shows the dual luciferase assay, with promoter activity expressed as a ratio of LUC to REN where an increase in activity equates to an increase in LUC relative to REN, for a combination of MYB TFs with/without bHLH TFs; (A) Arabidopsis TT4 (CHS)-Luc promoter. (B) Arabidopsis TT3 (DFR)-Luc promoter. 0, MYB alone, 1, MYB+AtTT8, 2, MYB+MdbHLH3, 3, MYB+MdbHLH33.
FIG. 6 shows data from transient assay in Nicotiana tabacum. (A) shows colour measurement by Minolta chromameter as shown as a/b ratio. A shift towards positive indicates a colour change from green towards red. (i) MdMYB10+MdbHLH3, (ii) MdMYB10 alone. (B) shows microscope images showing pattern of anthocyanin (darker grey) accumulation in tobacco leaf tissue infiltrated with MdMYB10+MdbHLH3 at 20x (left) and 40x (right). Scale bars represent 50 microns.
FIG. 7 shows HPLC traces showing (A) Nicotiana tabacum and (B) Nicotiana tabacum leaf infiltrated with MdMYB10+MdbHLH3. 1, cyanidin-3-glucoside, 2, petunidin-3-galactoside, 3, cyanidin-3-O-rutinoside. No peaks were observed in control tobacco leaf (data not shown).
FIG. 8 shows protein sequence alignment of the MdMYB10 polypeptide sequence with polypeptide variants of MdMYB10 and AtPAP1 (also called AtMYB75) for reference. The accession number of AtMYB75 in the GenBank database is CAB09230. The alignment was created using the ClustalW algorithm (Thompson et al., 1994). The sequences are: AtMYB75 (SEQ ID NO: 119), Co (quince SEQ ID NO: 13), E (loquat SEQ ID NO: 17), Md (apple SEQ ID NO: 1), Ms (crab a SEQ ID NO: 9), Pb (pear Y SEQ ID NO: 12), Pc (pear SEQ ID NO: 10), Pcf (cherry SEQ ID NO: 15), Ppr (peach SEQ ID NO: 16), Pc (pear SEQ ID NO: 11), Ps (Japane SEQ ID NO: 14), Pav (sweet SEQ ID NO: 19), Pd (almond SEQ ID NO: 18), Mg (medlar SEQ ID NO: 20) and Pdm (Europ. SEQ ID NO: 21).
FIG. 9 shows % sequence identity between the MdMYB10 polypeptide sequence, polypeptide variants of MdMYB10 and AtPAP1 for reference. The table shows % identity values for all possible sequence combinations for the sequences that are included in FIG. 8.
FIG. 10 shows that activation of the At-DFR gene promoter by MdMYB10 and PcfMYB10 (a variant of PefMYB10) in combination with apple bHLH TFs in tobacco transient transformation assays affects the activity of the At-DFR gene promoter. The dual luciferase assay shows promoter activity as expressed as a ratio of DFR promoter luciferase (LUC) to 35S Renilla (REN) where an increase in activity equates to an increase in LUC relative to REN. The effects of combinations of MYB transcription factors (MdMYB10, PcfMYB10 and MdMYB8 (-ve control)) with bHLH transcription factors MdbHLH3 and MdbHLH33 are shown. Error bars shown are means ± S.E. of 6 replicate reactions.
FIG. 11 shows that over-expression of MdMYB10 in apple cells and/or plants elevates anthocyanin production. (A)(i) shows pigmented callus cells. (A)(ii) shows an apple plant transformed with 35S-MdMYB10 (left) and empty vector control plant (right). The plant transformed with 35S-MdMYB10 clearly shows strong pigmentation compared to the empty vector control. (B) shows anthocyanin profiles of extracts of 35S-MdMYB10 apple leaf (top line) and empty vector control (bottom line). Peaks identified from HPLC traces at 520 nm: cy-gal, cyanidin-3-galactoside, with minor traces of cy-glu, cyanidin-3-glucoside and cy-pent, cyanidin-3-pentoside.

EXAMPLES

The invention will now be illustrated with reference to the following non-limiting examples.

Example 1

Identification of an Appropriate Apple Tissue and Developmental Stage useful for the Isolation of Polynucleotides Encoding Transcription Factors which Regulate Pigment Production

Materials and Methods
Real Time (qPCR) Expression Analysis
Apple fruit were collected at 6 time points during the apple fruit season from Spring (October) through Summer to (March) 2003-2004: October (7 days after full bloom, DAFB), November (40 DAFB), December (67 DAFB), January (102 DAFB), February (130 DAFB) and March (146 DAFB) from trees at the HortResearch orchard (Nelson, New Zealand). RNA was isolated (adapted from Chang et al., 1993) from the fruit (six fruit from the same tree, skin and cortex separately) and the leaves of 2 genotypes; the white-fleshed commercial cultivar Malus domestica var. Sciros (Pacific Rose™, derived from a cross between Gala and Splendour) and the red-fleshed cultivar Malus domestica var. Red Field, an open-pollinated seedling of the cultivar Redfield (a cross between Wolf River and Malus domestica var. Niedzwetzkyana (Brooks and Olmo, 1972)). For the first developmental fruit time point, October (7 DAFB), successful excision of skin from cortex was not possible and data from this sample has been excluded. First strand cDNA synthesis (three replicates for each sample which were subsequently pooled) was preceded by DNase treatment and performed using oligo dT according to the manufacturer's instructions (Transcriptor, Roche Applied Science).
Genes encoding apple anthocyanin pathway enzymes and regulators were identified by homology in the HortResearch EST database and, in the case of possible isoforms, selection was made according to the expression profile and library tissue. Gene specific primers corresponding to these genes were designed using Vector NTI version 9.0.0 (available on the worldwide web at www.invitrogen.com) to a stringent set of criteria, enabling application of universal reaction conditions. To check reaction specificity, RT-PCR was used according to the manufacturer's instructions (Platinum Taq, Invitrogen). The sequence of each primer pair and the relevant accession number are shown in Supplementary Table 1 below.

TABLE 1.

Gene identifier   Gene name   Forward primer (SEQ ID NO:)   Reverse primer (SEQ ID NO:)

CN944824 MdCHS GGAGACAACTGGAGAAGGACTGGAACGACATTGATACTGGTGTCTTCA (81) (82)

CN946541 MdCHI GGGATAACCTCGCGGCCAAA GCATCCATGCCGGAAGCTACAA (83) (84)

CN491664 MdF3H TGGAAGCTTGTGAGGACTGGGGT CTCCTCCGATGGCAAATCAAAGA (85) (86)

AY227729 MdDFR GATAGGGTTTGAGTTCAAGTA TCTCC TCAGCAGCCTCAGTTTTCT (87) (88)

AF117269 MdLDOX CCAAGTGAAGCGGGTTGTGCT CAAAGCAGGCGGACAGGAGTAGC (89) (90)

AF117267 MdUFGT CCACCGCCCTTCCAAACACTCT CACCCTTATGTTACGCGGCATGT (91) (92)

MdMYB10 TGCCTGGACTCGAGAGGAAGACA CCTGTTTCCCAAAAGCCTGTGAA (93) (94)

MdbHLH33 ATGTTTTTGCGACGGAGAGAGCA TAGGCGAGTGAACACCATACATTAAAGG (95) (96)

CN934367 MdbHLH3 AGGGTTCCAGAAGACCACGCCT TTGGATGTGGAGTGCTCGGAGA (97) (98)

CN938 O23 cActin TGACCGAATGAGCAAGGAAATTACTTACTCAGCTTTGGCAATCCACATC (99) (100) US 7,973,216 B2 31 32 DNA amplification and analysis was carried out using the (with a 32 fold degeneracy) designed at the R2R3 binding LightCycler System (Roche LightCycler 1.5). All reactions domain based on the sequence of anthocyanin regulators in were performed with the LightCycler FastStart SYBR Green diverse species. Numerous clNAs encoding R2R3 MYB Master Mix (Roche Applied Science) following the manufac domains were obtained. Results from sequencing data turer's method. Reactions were performed in triplicate using revealed one cDNA with high identity to anthocyanin regu 2 ul 5x Master Mix, 0.5 M each primer, 1 ul diluted cDNA lators and full length sequence was obtained using 5 RACE and nuclease-free water (Roche Applied Science) to a final (GeneRacer, Invitrogen). The complete sequence for the volume of 10 ul. A negative water control was included in MdMYB10 cDNA was compiled from overlapping frag each run. The following thermal profile was used for all qPCR ments. To compare the transcript from Red Field, full length reactions: a pre-incubation step at 95°C. for 5 minutes fol cDNAs were subsequently isolated from Malus domestica lowed by 40 cycles of 95°C. (5 seconds), 60° C. (5 seconds) 10 vars. Pacific Rose and Granny Smith. MdMYB11 and 72° C. (10 seconds). Fluorescence was measured at the (DQ074463), a subgroup 11 MYB (according to Stracke et al. end of each annealing step. Amplification was followed by a 2001) was also isolated and sequenced by the same process. melting curve analysis with continual fluorescence data Other transcription factor candidates were isolated from the acquisition during the 65°C. to 95°C. melt. 
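By way of illustration only, the qPCR thermal profile stated above (pre-incubation at 95° C. for 5 minutes, then 40 cycles of 95° C. for 5 seconds, 60° C. for 5 seconds and 72° C. for 10 seconds) and the dilution of the 5x Master Mix (2 ul in a 10 ul reaction) can be written down as data and checked arithmetically. The dictionary layout and function names below are assumptions made for this sketch and do not represent the LightCycler instrument's own programming interface; ramp times and the melting curve step are ignored.

```python
# Thermal profile values are taken from the text; the data layout is an
# illustrative assumption, not an instrument file format.
PRE_INCUBATION = {"temp_c": 95, "seconds": 5 * 60}
CYCLE_STEPS = [
    {"name": "denaturation", "temp_c": 95, "seconds": 5},
    {"name": "annealing", "temp_c": 60, "seconds": 5},
    {"name": "extension", "temp_c": 72, "seconds": 10},
]
N_CYCLES = 40


def total_hold_seconds(pre_incubation, cycle_steps, n_cycles):
    """Sum the programmed hold times (ramping and melt curve excluded)."""
    per_cycle = sum(step["seconds"] for step in cycle_steps)
    return pre_incubation["seconds"] + n_cycles * per_cycle


def final_fold_concentration(stock_fold, stock_volume_ul, final_volume_ul):
    """Working concentration (in 'x') after diluting a concentrated stock."""
    return stock_fold * stock_volume_ul / final_volume_ul


hold_seconds = total_hold_seconds(PRE_INCUBATION, CYCLE_STEPS, N_CYCLES)
master_mix_fold = final_fold_concentration(5, 2, 10)
print(hold_seconds)      # 1100 seconds of programmed holds
print(master_mix_fold)   # 1.0, i.e. the 5x mix is used at 1x
```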
The raw data was HortResearch EST collection: MdMYB9 an apple homo analysed with the LightCycler software version 4 and expres 15 logue of Arabidopsis TT2 (Nesi et al., 2001, AJ299452), and sion was normalised to Malus domestica Actin (Md.Actin, MdMYB8, an apple MYB bearing little sequence homology accession CN938023) with the Pacific Rose leaf sample act to known anthocyanin regulators. ing as calibrator with a nominal value of 1. For each gene a Previous studies in other species have shown that a sub standard curve was generated with a cDNA serial dilution and group 10 MYB may be the key determinant of pigmentation. the resultant PCR efficiency calculations (ranging between Within publicly available apple EST databases (185,000 1.839 and 1.945) were imported into relative expression data nucleotide sequences as at August 2005), there is no MYBTF analysis. showing high homology via sequence blasts to Arabidopsis Results PAP1 and subgroup 10 MYBs from other species. qPCR Expression Analysis of Biosynthetic Enzymes Overlapping sequence alignments of cDNAS cloned after In order to identify the stage of fruit development, where 25 PCR show that the best candidate, Md MYB10, shares a high degree of homology with other MYBTFs at the R2R3 domain transcriptional regulation of anthocyanin synthesis is great and, in particular, with anthocyanin regulators from other est, the analysis of expression of the major biosynthetic genes species (FIG. 1). Md MYB10 is closely related to the Arabi was performed. A comparison of transcript levels encoding dopsis subgroup 10 MYB, PAP 1, with a 77% amino acid anthocyanin biosynthetic genes between Pacific Rose and identity at the R2R3 binding domain and 58% overall. For Red Field shows striking differences. 
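The normalisation described above (expression relative to MdActin, a calibrator sample assigned a nominal value of 1, and gene specific PCR efficiencies taken from standard curves) can be sketched as follows. This is a minimal sketch of the widely used efficiency-corrected, calibrator-normalised ratio; it is an assumption that the LightCycler software applies this model in exactly this form, and the function names and crossing point (Cq) values are hypothetical.

```python
import math


def efficiency_from_slope(slope):
    """PCR efficiency (amplification factor per cycle) from the slope of a
    standard curve of Cq against log10(template amount)."""
    return 10 ** (-1.0 / slope)


def relative_expression(e_target, e_ref,
                        cq_target_sample, cq_target_calibrator,
                        cq_ref_sample, cq_ref_calibrator):
    """Efficiency-corrected expression of a target gene in a sample,
    normalised to a reference gene and expressed relative to a calibrator."""
    target_term = e_target ** (cq_target_calibrator - cq_target_sample)
    ref_term = e_ref ** (cq_ref_calibrator - cq_ref_sample)
    return target_term / ref_term


# The calibrator compared with itself has a value of exactly 1.
calibrator_value = relative_expression(1.9, 1.85, 24.0, 24.0, 18.0, 18.0)

# Hypothetical sample: target crosses 3 cycles earlier than in the calibrator,
# reference gene unchanged, both efficiencies assumed to be a perfect 2.0.
example_value = relative_expression(2.0, 2.0, 20.0, 23.0, 18.0, 18.0)
print(calibrator_value, example_value)
```

An efficiency of 2.0 corresponds to perfect doubling per cycle; the efficiencies reported in the text (1.839 to 1.945) would be passed in place of the idealised values used here.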
In all the genes assayed, 30 Arabidopsis PAP2 these amino acid percentage identities are representing the majority of the enzymatic steps in the path 75% and 57% respectively whilst for other species figures for way, transcript levels in Red Field showed significant eleva overall identity are as follows: Petunia AN2 60%, Tomato tion during all stages of fruit development in comparison to ANT1 57%, Maize C1 5%, and Maize P 26%. the levels found in Pacific Rose (FIG. 3). All these MYB TFs have the amino acid residues that Transcript abundance of the biosynthetic genes in Red 35 specify interaction with bHLHs (Grotewold et al., 2000). Field was enhanced throughout fruit development in both Candidates for these cofactors were therefore selected from skin and cortex, with a general pattern indicating the highest the HortResearch EST database. In the large phylogenetic transcript levels at the January time point. This configuration family with constitutes the bHLH type TF, there is a smaller mimics the degree of pigmentation observed during tissue lade termed IIIf(Heim et al., 2003) that appears to be involved sampling with the most intense pigmentation being observed 40 in the regulation of flavonoid biosynthesis. Two apple TFs early in development (40 DAFB) and then again in mid from the HortResearch EST database clustered within this summer (102 DAFB), a level that is subsequently sustained lade (data not shown). These were sequenced to full length through to fruit maturation in late summer (FIG. 3 b). In and given the identifiers MdbHLH3 (CN934367), a putative Pacific Rose cortex tissue there was comparatively low tran homologue of the Arabidopsis TT8 gene and MdbHLH33, a Script level for all the anthocyanin biosynthesis genes, with a 45 putative homologue of Delila (from Antirrhinum, Goodrichet al., 1992). general decline in expression during fruit development. 
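For illustration, a percentage amino acid identity of the kind quoted for MdMYB10 against PAP1, PAP2 and the other MYBs can be computed from a pairwise alignment as sketched below. The denominator convention used here (all alignment columns except those where both sequences carry a gap) is an assumption; alignment packages differ on this point and the text does not state which convention was applied. The two short aligned strings are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical positions between two aligned sequences.

    Columns in which both sequences have a gap are ignored; a residue
    aligned against a gap counts as a compared, non-identical column.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / compared


# Hypothetical toy alignment: 5 identical columns out of 6 compared.
toy_identity = percent_identity("MEG-SL", "MEGASL")
print(round(toy_identity, 1))
```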
Mod Phylogeny erate activity was observed in the skin with a peak of expres Apple EST sequences were trimmed of vector, adapter and sion midway through development at 102 DAFB, concomi low quality sequence regions and uploaded to Vector NTI tant with enhanced levels of pigmentation during fruit version 9.0.0 (available on the worldwide web at www.invit maturation. The level of expression in the leaves of both 50 rogen.com). The EST clustering phase was performed using varieties was evident but relatively low with little to distin Vector NTIAlignX program. Alignments were then exported guish between Red Field and Pacific Rose. to GeneDoc version 2.6.002 (available on the worldwide web Results from qPCR analysis of anthocyanin biosynthetic at www.psc.edu/biomed/genedoc/) as MSF format files. enzyme transcript levels were analysed to determine the most Trees were generated by re-aligning exported files in CLUST suitable tissue/time point for relevant MYB transcription fac 55 ALX (v1.81) using the default settings (Thompson et al., tor isolation. We chose tissue from Red Field cortex, that 1997). Phylogenetic analysis was carried out using the showed the highest expression level for the anthocyanin bio PHYLIP software package (Felsenstein, 1993). TreeView synthesis genes. (v. 1.6.5) was used to display resulting trees (Page, 1996) or circular trees were generated using MEGA version 2.1 (Ku Example 2 60 mar et al., 2001). 
Isolation of Polynucleotides Encoding Transcription Example 3 Factors Potentially Regulating Pigment Production in Apple Identification of Variants of the Md Myb10 65 PCR was performed using cDNA from the cortex sample Tissue was collected from Malus domestica, Malus Sylves of Red Field (January time point) using degenerate primers tris (MS, European crab apple), Pyrus communis (Pc, pear), US 7,973,216 B2 33 Pyrus pyrifolia (Ppy pear, Nashi), Pyrus bretschneideri (Pb, TABLE 2 - continued pear, YALI), Cydonia Oblonga (Co, quince), Prunus Salicina (Ps, Japanese plum, prune), Prunus cerasifera (Pcf. cherry SEQ ID plum), Prunus persica (Ppr, peach), Eriobotrya japonica (E. Primer Sequence (5' to 3') NO : loquat), Prunus dulcis (Pd, almond), Prunus avium (Pav, Sweet cherry), Mespilus germanica (Mg, medlar), Prunus KL FW6R AATATGCACCAGGAAGTCTTAAAGA 73 domestica (Pdm, European plum) Rubus idaeus (Ri, red rasp KL Fv7 AAATCTGCTTAATTTTCATGGAGGG 74 berry), Prunus armeniaca (par, ), and Prunus insititia (Pi, Damson) all of which are rosaceae species. KL Rh1F TCAGAGAGAGAGAGATGGGTGGTATTCC 7s 10 Genomic DNA (gDNA) was extracted, using DNeasy KL. Rh2R CTTCCTCTTGTTCAAAGCTCCCTCTC 76 Plant Mini Kit (QIAGEN, catalogue 69104) according to manufactures instructions, from leaves of each species, KL Rh3F AGAACTATTGGAATTGTCACTTGAG 77 except for Pyrus pyrifolia (Ppy pear, Nashi), Pyrus bretschneideri (Pb, pear, YALI) where genomic DNA was KL. Rh4R AGAATAAAATCACTTTCATAACCAC 78 15 isolated from fruit peel. KL. Rosa AGACTTCCRGGAAGRACWGCNAAT 79 PCR was performed on g|DNA from the above species (by deg 1F GMTGTG (degenerate standard techniques) using combinations of the primers primer with shown in Table 2 below. a 64 fold degeneracy) TABLE 2 KL. Rosa CCARTAATTTTTCACAKCATTNGC 8O SEQ ID deg 2R Primer Sequence (5' to 3') NO : (degenerate primer with RE73 AAAAGTTGCAGACTTAGATGGTTGAATT 48 a 16 fold degenerate ATTTGAAGCC 25 degeneracy) primer) F

RE77R GAGAATCGATCCGCAATCGAOTGTTCC 49 Genomic PCR products were sequenced by standard proce dures. RE78R ACCACCTGTTTCCCAAAAGCCTGTGAAGTCT SO 30 From sequenced genomic DNA, intron and exons were RE79R CACAAGCTAGATGGTACCACAGAAGTGAGAATC 51 predicted by known methods of comparison with MidMYB10 EST data, known intron/exon boundaries, and open reading RE95F TAAGAGATGGAGGGATATAACG 52 frames. From these deduced cDNAs, translated protein was RE96R CTAGCTATTCTTCTTTTGAATGATTC 53 generated. A summary of the variant gldNA, predicted cDNA 35 and predicted polypeptide sequences identified is included in RE108F GATCGATTCTCGCATGAAAACGGT 54 Table X. Polypeptide variants of MdMyb10 are listed in the RE109R GACGACGTTTGTGGTGGCGTACT 55 sequence listing as SEQ ID NO: 9-21. Polynucleotide vari ants of MdMyb10 are listed in the sequence listing as SEQID RE12 OF TGCCTGGACTCGAGAGGAAGACA 56 NO: 22-47. SEQ ID NO: 102 is a MdMyb10 genomic 40 Sequence. RE121R CCTGTTTCCCAAAAGCCTGTGAA st The variant polypeptide sequences (together with KL Ms1F CTTATAATTAGACTTCACAGGC 58 MdMyb 10 and AtPAP1 for reference) were aligned using Vector NTI version 9.0, which uses a Clustal W algorithm KL Ms2R CACCGTTTTCATGCGAGAAT 59 (Thompson et al., 1994). Results are shown in FIG. 8 KL . Mc GCAGATAAGAGATGGAGGGATATAACGA 60 45 Percentage sequence identity between the aligned PAP1F AAACCTGAG polypeptide sequences was also calculated using Vector NTI KL TACACAAGCTAGATGGTACCACAGAAGT 61 version 9.0 (Sep. 2, 2003 (C.1994-2003 InforMax, now McPAP1R GAGAATC licenced to Invitrogen) Results are shown in FIG. 9. These data show that the applicants have identified a dis KL PcfE GACTTTATGGAAGATGAAGTAGATC 62 50 tinct group MidMYB10 variants from rosaceae species. The rosaceae sequences share a significant degree of sequence KL PcfR AAGCGATAGTATATTATTGATGAAC 63 conservation, and each rosaceae sequence is more similar to KL Pct2F CTTGGGTGTGAGAAAAGGAG 64 another rosaceae sequence than it is to AtPAP1.
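The fold degeneracy figures quoted for the degenerate primers in Table 2 can be checked by multiplying, over every position of a primer, the number of bases that the IUPAC ambiguity code at that position allows. The sketch below assumes that the two fragments of each degenerate primer that are split across table lines in this text belong to a single oligonucleotide and joins them accordingly; the function name is illustrative.

```python
# Number of bases represented by each IUPAC nucleotide code.
IUPAC_CHOICES = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def fold_degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    total = 1
    for base in primer.upper():
        if base.isspace():
            continue  # tolerate spaces introduced by table line breaks
        total *= IUPAC_CHOICES[base]
    return total


# Primer sequences as printed in Table 2, with the split fragments joined.
forward_deg = fold_degeneracy("AGACTTCCRGGAAGRACWGCNAAT GMTGTG")
reverse_deg = fold_degeneracy("CCARTAATTTTTCACAKCATTNGC")
print(forward_deg, reverse_deg)  # 64 16
```

Both values agree with the 64 fold and 16 fold degeneracies stated in the table.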

KL Pcf.3R CACGCTAAAAGAGAAATCAC 65 55 Example 4 KL Pcf.4R GCTTGTGAAGCCTAATTATT 66 Activation of Pigment Promoters by Expression of KL Ppr1F GAAAGATAAAGCCCAAGAAA 67 Transcription Factor Polynucleotides of the Invention in Plants KL Ppr2R TTTGAACTCTTGATGAAGCT 68 60 KL Ppr3F CTGCGAATTTGTATTGTATGTC 69 Dual Luciferase Assay Promoter sequences were inserted into the cloning site of KL Ppr4R TTCCCACCAATCATTTCCAT 70 pGreen 0800-LUC (Hellens et al., 2005) and modified to KL FW1F AAGAGAGGAGAGTTTGCAGAGG 71. introduce an NcoI site at the 3' end of the sequence, allowing 65 the promoter to be cloned as a transcriptional fusion with the KL FW2R TAGTTCTTCACATCATTGGCAG 72 firefly luciferase gene (LUC). Thus, TFs that bind the pro moter and increase the rate of transcription could be identified US 7,973,216 B2 35 36 by an increase in luminescence activity. Arabidopsis CHS Results from transientanalysis based on the CHS promoter (TT4) (AT5g 13930) and Arabidopsis DFR (TT3) showed that activity of the Arabidopsis PAP1 MYB was (AT5g42800) were isolated from genomic DNA. In the same greatest in co-transformation with an apple bHLH, but was construct, a luciferase gene from Renilla (REN) under the unexpectedly not affected by co-transformation with the Ara control of a 35S promoter provided an estimate of the extent 5 bidopsis TT8 bHLH. In contrast, results for Md MYB10 indi of transient expression. Activity is expressed as a ratio of cates activity that may be independent of a bHLH with the LUC to REN activity so that where the interaction between a highest activity observed with the MYB alone (FIG. 5). Co TF (+/-bHLH) and the promoter occurred, a significant transformation of MdMYB10 with the Arabidopsis bHLH increase in the LUC activity relative to REN would be appeared to inhibit activity. Md MYB9 also showed enhanced observed. 
10 activity when in partnership with either of the apple bHLHs, Nicotiana benthamiana were grown underglasshouse con consistent with its sequence similarity to TT2-like genes. ditions, using natural light with daylight extension to 16 hrs, Significant activity for the remaining MYBs was not until at least 6 leaves (of 2-3 cm in length) were available for observed and this degree of activity presumably represents infiltration with Agobacterium. Plants were maintained in the basal levels. glasshouse for the duration of the experiment. Agrobacterium 15 strain GV3101 (MP90) was cultured on Lennox agar (Invit Results from the DFR promoter assay show a different rogen) Supplemented with selection antibiotics and incubated pattern indicating a significant increase in activity when at 28°C. A 10 ul loop of confluent bacterium were re-sus MdMYB10 (and AtPAP1) was co-transformed with an apple pended in 10 ml of infiltration media (10 mM MgCl, 0.5uM bHLH. In the case of the MdMYB10 the highest activity was acetosyringone), to an ODoo of 0.2, and incubated at room observed when infiltrated with MdbHLH3. This contrasted temperature without shaking for 2 h before infiltration. Infil with AtPAP1 where activity was highest when infiltrated with trations were performed according to the methods of Voinnet the apple Delila homologue, MdbHLH33. These results et al. (2003). Approximately 150 ul of this Agrobacterium reflect previous work in a transient protoplast transfection mixture was infiltrated at six points into a young leaf of N. system where in an Arabidopsis DFR promoter:Gus fusion benthamiana and transient expression was assayed 3 days 25 was only activated by PAP1 in the presence of a bHLH (Zim after inoculation. mermann et al., 2004), although it should be noted that we did The promoter-LUC fusions (CHS and DFR) in pGreen|I not see such large increases in activity when AtPAP1 was 0800-LUC were used in transient transformation by mixing infiltrated with AtTT8. 
MdMYB9 performed in a similar but 100 ul of Agrobacterium transformed with the reporter cas reduced manner, whilst the LUC to REN ratio for Md MYB11 sette with two other Agrobacterium strains (450 ul each) 30 and Md MYB8 was low under all conditions. transformed with cassettes containing a MYBTF gene fused When genomic cherry plum MYB10 (PcfMYB10) was to the 35S promoter and a bHLH TF gene in either p ART27 cloned into a pCREEN plasmid vector and assayed as (Gleave, 1992) or pGreen II 62-SK binary vectors (Hellens et described above, activation of the DFR promoter results. al., 2000). Highest activity is shown when PcfMYB10 is infiltrated with Firefly luciferase and renilla luciferase were assayed using 35 MdbHLH3 and MdbHLH33 (FIG. 10). This data shows that the dual luciferase assay reagents (Promega, Madison, USA). a MYB10 sequence from the or Prunoideae Three days after inoculation, 2 cm leaf discs (6 technical Sub-family is also effective at driving anthocyanin gene activ replicates from each plant) were removed and ground in 500 ity in a similar mechanism to MdMYB10 (of the Malus sub ul of passive lysis buffer (PLB). Tenul of a 1/100 dilution of family of Rosaceae). this crude extract was assayed in 40 ul of luciferase assay 40 buffer, and the chemiluminescence measured. 40 ul of Stop Example 5 and GlowTM buffer was then added and a second chemilumi nescence measurement made. Absolute relative lumines Activation of Pigment Biosynthesis by Expression of cence units (RLU) were measured in a Turner 20/20 lumi Transcription Factors of the Invention in Plants nometer, with a 5 S delay and 15 S measurement. 45 Dual Luciferase Assay Colour Assay The dual luciferase system has been demonstrated to pro Nicotiana tabacum var. Samsun were grown in a glass vide a rapid method of transient gene expression analysis house at 22°C., using natural light with daylight extension to (Hellens et al., 2005). 
It requires no selectable marker and 16 hrs, until at least 3 leaves (of 10-15 cm in length) were results can be quantified with a simple enzymatic assay. In 50 available for infiltration with Agrobacterium. Plants were this study the system was used to quantify the activity of the maintained in the glasshouse for the duration of the experi promoters of anthocyanin biosynthetic genes when chal ment. Agrobacterium cultures were incubated as for the dual lenged with TFs which putatively bind the promoters. We luciferase assay and separate strains containing the MYBTF used N. benthamiana for the dual luciferase transient assay to gene and the bHLH TF gene fused to the 35S promoter in test the interaction of our candidate TFs with two Arabidopsis 55 pART27 binary vector were mixed (500 ul each) and infil anthocyanin biosynthesis gene promoters, AtCHS (TT4, trated into the lower leaf surface as for the assay with N. AT5g13930) and AtlDFR (TT3, AT5g42800), that are known benthamiana. Six separate infiltrations were performed into to be regulated by Arabidopsis PAP1 and PAP2 MYB TFs N. tabacum leaves (two plants per treatment) and changes in (Tohge et al., 2005, Zimmermann et al., 2004). Several apple colour were measured daily using a Minolta CR-300 chroma MYB TFs were selected to probe the specificity of 60 metre (calibrated to D65 light) using the L*a*b* system (CIE, Md MYB10: Md MYB9, Md MYB11, Md MYB8 and, from 1986). Infiltrations comprising MdMYB10 together with an Arabidopsis, AtPAP1. These MYBs fall into clades represent apple bHLH resulted in visible pigmentation after four days. ing subgroups 10, 9, 11 and 7 respectively (FIG. 2). To inter The level of pigmentation increased throughout the experi rogate the interaction between MYB and bhLH TFs co mental period; digital photographs and microscope images transformation was performed with bhLH class putative 65 were taken eight days after infiltration. 
Anthocyanin pigmen regulators from apple: MdbHLH3 and MdbHLH33 and from tation did not develop when N. benthamiana was used in Arabidopsis the bHLH, TT8 (AtbHLH042; Atago9820). parallel assays (data not shown). US 7,973,216 B2 37 38 HPLC din-3-galactoside in apple skin (data not shown) but as pre N. tabaccum leaf discs were excised around the infiltration viously described (Tsou et al. 2003). Cyanidin-3-glucoside sites, freeze-dried and coarsely ground before re-suspension and petunidin-3-galactoside was observed in tobacco petal in 5 ml methanol and 0.1% HCL, extracted at room tempera (FIG. 7). Petunidin-3-galactoside is not seen in the profile ture for 2 hours and centrifuged at 3500 rpm. Aliquots of 1 ml 5 generated in a tobacco leaf by the action of MdMYB10 and were dried down to completion in a Labconco Centrivap MdbHLH3(FIG. 7). Concentrator. Samples were re-suspended in 20% methanol qPCR Expression Analysis of Transcription Factors (250 ul). Anthocyanins were characterized by HPLC on a Knowledge of the abundance, and pattern of accumulation, 250x4.6 mm, Synergi, 4 m particle size, Polar-RP. 80 A pore of biosynthetic gene transcripts provided information as to size, ether-linked phenyl column (Phenomenex, Auckland, 10 the most appropriate tissue with which to perform degenerate New Zealand). This was fitted to a Shimadzu analytical HPLC with a column oven, auto-sampler, vacuum solvent PCR for the isolation of a putative transcriptional regulator. degasser and diode-array detector. Solvents were (A) aceto qPCR of the TFs in this same development series reveals nitrile+0.1% formic acid and (B) acetonitrile/water/formic increases in the relative transcript levels of MdMYB10 in the acid, 5:92:3. Flow rate was 1.5 ml/min and column tempera- 15 fruit tissues of Red Field compared to Pacific Rose. In cortex ture 45° C. 
The content of solvent A was 0% at 0 time and tissue, transcript levels in Pacific Rose were barely detect ramped linearly to 17% at 17 min, 20% at 20 min, 30% at 26 able, whilst in the skin Pacific Rose transcript was evident and min, 50% at 28.5 min, 95% between 32-35 min and back to levels of the MYB transcript correlate with the biosynthetic 0% between 36-42 min. Quantification of reaction products enzymes particularly at the January time point and in relation was at 520 nm for anthocyanins and 280 nm for other pheno- 20 to UFGT. Expression levels of MdMYB10 in Red Field lics. Spectra were recorded from 240-600 nm in 4 nm steps. appear to largely follow the transcript pattern of the enzymes Sample injection Volume was 40 uL. assayed, with highly elevated levels throughout fruit tissues, Colour Assay particularly at the November time point and then again at We have established a simple method to reveal anthocyanin January, February and March (FIG. 4B). Transcript levels in pigment accumulation in N. tabacum via Agrobacterium 25 Red Field leaf were similarly elevated in comparison to infiltration. Accumulation of pigmentation in N. tabacum Pacific Rose. Results were similar for RT-PCR (FIG. 4A) and infiltrated leaves was examined visually. Pigmentation was to further confirm specificity, qPCR amplicons were evident at infiltration points as early as four days post-infil sequenced and analysed and found to encode MdMYB10. tration for Md MYB10 when co-infiltrated with an apple Transcript levels of MdbHLH3 and MdbHLH33 did not bHLH (FIG. 6A). The degree of pigmentation increased over 30 appear to follow the pattern displayed for the biosynthetic the experimental period (of up to ten days). Pigmentation was genes, or for Md MYB10 with a more consistent level of also observed but at reduced levels in treatments comprising expression both throughout the development series and in co-infiltration of AtPAP1 and an apple bHLH (MdbHLH3 or both varieties (FIG. 4B). 
Transcript levels of the Md MYB8, MdbHLH33), AtPAP1 and AtTT8, and, to a lesser extent, MdMYB9 and Md MYB11 genes were also assayed but did with infiltration of MdMYB10 alone. No pigmentation was 35 not show a correlative pattern with the anthocyanin enzyme visible in other combinations. Results demonstrate the effi transcript levels (data not shown). cacy of this assay as a useful reporter system to study the regulation of the pigmentation process. Example 6 Colour was quantified by measurement with a Minolta chromameter using the L*a*b* system confirmed the visible 40 Over-Expression of MdMyb10 in Transgenic Apple transition from green to red. The data is shown as a ratio of Plants Results in Elevated Anthocyanin Production a/b (FIG. 6B), where the change from negative towards positive indicates a shift from green to red. There was vari Transformation of Apple ability between replicates of a given treatment as to the extent The binary vector pSAK277-MdMYB10 containing the of pigmentation as apparent in the depth of error bars (FIG. 45 MdMYB10 cDNA driven by the Cauliflower mosaic virus 6B). 35S promoter produced by Standard techniques was trans To verify cellular build-up of anthocyanin compounds, ferred into Agrobacterium tumefaciens strain GV3101 by the microscope images were obtained from epidermal peels 1 freeze-thaw method well-known to those skilled in the art. week after inoculation (FIG. 6C). This illustrates the trans Transgenic Malus domestica Royal Gala plants were gen formation of individual cells with the candidate genes and 50 erated by Agrobacterium-mediated transformation of leaf activation of the accumulation of anthocyanin pigments pieces, using a method previously reported (Yao et al., 1995). within the vacuoles. Control plants transformed with an equivalent empty vector Analysis of HPLC Data were also produced in the same way. To confirm the identity of the anthocyanins synthesised The results are shown in FIG. 11. 
during tobacco transient expression of selected MYBs, 55 Highly pigmented callus cells are shown in A(i). A(ii) samples were extracted and the soluble anthocyanins analy shows a highly pigmented 35-5 Mym10 plant (left) and an sed by HPLC. The results indicate that when Md MYB10 and empty vector control plant for comparison (right). MdbHLH3 are co-overexpressed in tobacco leaves, two Panel B shows anthocyanin profiles (generated as major peaks are observed, representing cyanidin-3-glucoside described in Example 5) of extracts from 35S-MdMyB10 and and cyanidin-3-0-rutinoside (FIG. 7). These compound iden- 60 control plants. Results shows levels of anthocyanin pigments tities were confirmed by LC-MS (data not shown). No are clearly detectable in the 35S-Md MYB10 plants but not in observable anthocyanin peaks were found in the extracts of the control plants. Apple tissue was extracted in acidified tobacco leaf transformed with empty vector control (data not methanol and peaks identified from HPLC traces at 520 nm, shown). To compare this with compounds naturally occurring cy-gal, cyaniding-3-galactoside, with minor traces of cy-glu, in apple and tobacco, anthocyanins from the of tobacco 65 cyaniding-3-glucoside and cy-pent, cyaniding-3-pentoside. and skin of apple (Pacific Rose, mature fruit) were also It is not the intention to limit the scope of the invention to extracted and results confirmed the predominance of cyani the above mentioned examples only. As would be appreciated US 7,973,216 B2 39 40 by a skilled person in the art, many variations are possible Goff SA, Cone KC, ChandlerVL (1992) Functional analysis without departing from the scope of the invention. of the transcriptional activator encoded by the maize B gene: evidence for a direct functional interaction between REFERENCES two classes of regulatory proteins. Genes Dev 6: 864-875 5 Gonzalez Padilla I M, Webb K, Scorza R. 
(2003) Early anti Aharoni A, De Vos C H., Wein M, Sun Z, Greco R, Kroon A, biotic selection and efficient rooting and acclimatization Mol J N, O'Connell A P (2001) The strawberry FaMYB1 improve the production of transgenic plum plants (Prunus transcription factor Suppresses anthocyanin and flavonol domestica L.). Plant Cell Rep. 22(1):38-45. accumulation in transgenic tobacco. Plant J28: 319-332 Goodrich J. Carpenter R. Coen E S (1992) A common gene Borevitz J O, Xia Y. Blount J, Dixon RA, Lamb C (2000) 10 regulates pigmentation pattern in diverse plant species. Activation tagging identifies a conserved MYB regulator Ce11 68: 955-964 of phenylpropanoid biosynthesis. Plant Cell 12: 2383 Graham J. McNicol R J. Kumar A. (1995) Agrobacterium 2394 mediated transformation of soft fruit Rubus, Ribes, and Boss P K, Davies C, Robinson S P (1996). Expression of 15 Fragaria. Methods Mol Biol. 1995; 44:129-33. anthocyanin biosynthesis pathway genes in red and white Grotewold E. Sainz MB, Tagliani L, Hernandez J. M. Bowen grapes. Plant Mol Biol 32: 565-569 B, Chandler VL (2000) Identification of the residues in the Boyer J. Liu R (2004) Apple phytochemicals and their health Myb domain of maize C1 that specify the interaction with benefits. Nutrition Journal 3: 5 the b cofactor R. Proc Natl Acad Sci USA 97: 13579 Brooks, RM, Olmo, H. P. (1972). Register of New Fruit and 13584 Nut Varieties. University of California Press, London. Harborne J. B. Grayer RJ (1994) Flavonoids and insects. In J BrounP (2005). Transcriptional control of flavonoid biosyn B Harborne, ed., The Flavonoids: Advances in Research thesis: a complex network of conserved regulators Since 1986. Chapman & Hall, London, p 589-618 involved in multiple aspects of differentiation in Arabidop Heim MA, Jakoby M, Werber M, Martin C, Bailey P C, sis. Curr Opin Plant Biol 8: 272-279 25 Weisshaar B (2003) The basic helix-loop-helix transcrip Brouillard R (1988). Flavonoids and flower colour. 
In J B tion factor family in plants: a genome-wide study of protein Harborne, ed., The Flavonoids: Advances in Research since structure and functional diversity. Molecular Biology and 1980. Chapman & Hall, London, pp 525-538 Evolution 20: 735-747 Chang, S. Puryear, J. and Cairney, J. (1993). A simple and Hellens RP. Edwards EA, Leyland N R, Bean S, Mullineaux efficient method for isolating RNA from pine trees. Plant 30 PM (2000) pgreen: a versatile and flexible binary Tivector Mol. Biol. Rep. 11: 113-116. for Agrobacterium-mediated plant transformation. Plant CIE (1986) Colorimetry, 2" edin. Publication CIE No. 15.2, Mol. Biol. 42: 819-832 Central Bureau of the Commission Internationale de Hellens R P Allan A C, Friel E N, Bolitho K, Grafton K, L'Eclairage, Viena. Templeton M D. Karunairetnam S. Laing W A (2005) Davies KM, Schwinn K E (2003) Transcriptional regulation 35 Transient plant expression vectors for functional genom of secondary metabolism. Functional Plant Biology 30: ics, quantification of promoteractivity and RNA silencing. 913-925 Plant Methods In press Davuluri G. R. van Tuinen A, Fraser PD, Manfredonia A, Hernandez J. M., Heine GF, Irani N G, Feller A, Kim M-G, Newman R. Burgess D. Brummell DA, King S R, Palys.J. Matulnik T, Chandler V L. Grotewold E (2004) Different Uhlig J, Bramley PM, Pennings HMJ. Bowler C (2005) 40 mechanisms participate in the R-dependent activity of the Fruit-specific RNAi-mediated suppression of DET1 R2R3 MYB transcription factor C1. J. Biol. Chem. 279: enhances carotenoid and flavonoid content in tomatoes. 482O5-48213 Nature Biotechnology 23: 890-895 HoltonTA, Cornish EC (1995) Genetics and biochemistry of de Vetten N, Quattrocchio F, Mol J, Koes R (1997) The an 11 anthocyanin biosynthesis. Plant Cell 7: 1071-1083 locus controlling flower pigmentation in petunia encodes a 45 Honda C. Kotoda N. Wada M. Kondo S. Kobayashi S, Soe novel WD-repeat protein conserved in yeast, plants, and jima J, Zhang Z, Tsuda T. 
Moriguchi T (2002) Alithocya animals. Genes Dev 11: 1422-1434 nin biosynthetic genes are coordinately expressed during Dixon R A, Steele C L (1999) Flavonoids and isofla red coloration in apple skin. Plant Physiology and Bio Vonoids—a gold mine for metabolic engineering. Trends chemistry 40: 955-962 Plant Sci 4: 394-400 50 Li X, Gasic K, Cammue B, Broekaert W. Korban SS (2003) Dong Y H. Mitra D. Kootstra A. Lister C. Lancaster J (1998) Transgenic rose lines harboring an antimicrobial protein Postharvest stimulation of skin colour in royal gala apple. gene, Ace-AMP1, demonstrate enhanced resistance to JAm Soc Hortic Sci 120: 95-100 powdery mildew (Sphaerotheca pannosa). Planta. 218(2): Elomaa P, Uimari A, Mehto M, Albert V, Laitinen R, TeeriT 226-32. (2003) Activation of anthocyanin biosynthesis in Gerbera 55 Jin H, Martin C (1999) Multifunctionality and diversity hybrida (Asteraceae) suggests conserved protein-protein within the plant MYB-gene family. Plant Mol Biol 41: and protein-promoter interactions between the anciently 577-585 diverged monocots and eudicots. Plant Physiology and Kim S-H, Lee J-R, Hong S-TYoo Y-K, An G, KimS-R (2003) Biochemistry 133: 1831-1842 Molecular cloning and analysis of anthocyanin biosynthe Folta KM, Dhingra A. Howard L. Stewart P.J. Chandler CK. 60 sis genes preferentially expressed in apple skin. Plant Sci (2006) Characterization of LF9, an octoploid strawberry ence 165: 403-413 genotype selected for rapid regeneration and transforma Kobayashi S, Ishimaru M, Hiraoka K, Honda C (2002) Myb tion. Planta. 2006 Apr. 14; PMID: 16614818 related genes of the Kyoho grape (Vitis labruscana) regu Gleave A (1992) A versatile binary vector system with a late anthocyanin biosynthesis. Planta 215: 924-933 T-DNA organisational structure conducive to efficient inte 65 Kobayashi S. Goto-Yamamoto N, Hirochika H (2004) Ret gration of cloned DNA into the plant genome. Plant rotransposon-induced mutations in grape skin color. 
Sci Molecular Biology 20: 1203-1207 ence 304: 982 US 7,973,216 B2 41 42 Koes RE, Quattrocchio F, Mol J N M (1994) The flavonoid SchwinnKE, Davies KM (2004) Flavonoids. In KM Davies, biosynthetic pathway in plants: Function and evolution. ed, Plant pigments and their manipulation, Vol 14. Black BioEssays 16: 123-132 well, Oxford Koes R, Verweij W. Quattrocchio F (2005) Flavonoids: a Song GQ, Sink K.C. (2005) Transformation of Montmorency colorful model for the regulation and evolution of bio- 5 Sour cherry (Prunus cerasus L.) and Gisela 6 (P. cerasus X chemical pathways. Trends in Plant Science 10:236-242 P canescens) cherry rootstock mediated by Agrobacterium Kumar S, Tamura K, Jakobsen I B. M. N. (2001) MEGA2: tumefaciens. Plant Cell Rep. 2006: 25(2): 117-23 molecular evolutionary genetics analysis Software. Bioin Stracke R, Werber M, Weisshaar B (2001) The R2R3-MYB formatics 17: 1244-1245 gene family in Arabidopsis thaliana. Current Opinion in Kubo H, Peeters AJM, Aarts MGM, Pereira A, Koornneef 10 M (1999) Anthocyanless2, a homeobox gene affecting Plant Biology 4: 447-456 anthocyanin distribution and root development in Arabi J D Thompson, D G Higgins, and T J Gibson (1994) dopsis. Plant Cell 11: 1217-1226 CLUSTAL W: improving the sensitivity of progressive Lancaster J (1992) Regulation of skin color in apples. Crit. multiple sequence alignment through sequence weighting, Rev. Plant Sci. 10: 487-502 15 position-specific gap penalties and weight matrix choice. Lister C E. J. E. Lancaster (1996) Developmental changes in Nucleic Acids Res. 1; 22(22): 4673-4680. enzymes of flavonoid biosynthesis in the skins of red and Thompson J D. Gibson T.J. Plewniak F. Jeanmougin F, Hig green apple . J Sci Food Agric: 313-320 gins D G (1997) The ClustaiX windows interface: flexible Matsuda N.Gao M, Isuzugawa K. Takashina T. Nishimura K. strategies for multiple sequence alignment aided by quality (2005) Development of an Agrobacterium-mediated trans- 20 analysis tools. 
Nucleic Acids Research 24: 4876-4882 formation method for pear (Pyrus communis L.) with leaf Tohge T. Nishiyama Y. Hirai MY.Yano M, Nakajima J. Awa section and axillary shoot-meristem explants. Plant Cell Zuhara M, Inoue E, Takahashi H, Goodenowe D B, Rep. 24(1):45-51. Kitayama M, Noji M. Yamazaki M. Saito K (2005) Func Mol J. Grotewold E. Koes R (1998) How genes paint tional genomics by integrated analysis of metabolome and and seeds. Trends in Plant Science 3: 212-217 25 transcriptome of Arabidopsis plants over-expressing an Mol J. J. G. Schafer, E: Weiss, D (1996) Signal perception, MYB transcription factor. Plant J 42: 218-235 transduction, and gene expression involved in anthocyanin Tsao R. Yang R. Young J C, Zhu H (2003) Polyphenolic biosynthesis. Critical Reviews in Plant Sciences.: 525-557 profiles in eight apple cultivars using high-performance Mehrtens F. Kranz H, Bednarek P. Weisshaar B (2005) The liquid chromatography (HPLC). J Agric Food Chem 51: Arabidopsis transcription factor MYB12 is a flavonol-spe- 30 634.7-6353 cific regulator of phenylpropanoid biosynthesis. Plant Voinnet O, Rivas S. Mestre P. Baulcombe D (2003) An Physiology 138: 1083-1096 enhanced transient expression system in plants based on Nesi N. Jond C, Debeaujon I, Caboche M. L. L (2001) The Suppression of gene silencing by the p19 protein of tomato Arabidopsis TT2 gene encodes an R2R3 MYB domain bushy stunt virus. Plant J 33:949-956 protein that acts as a key determainant for proanthocyani- 35 Vom Endt D. Kijne J. W. Memelink J. (2002) Transcription din accumulation in developing seed. Plant Cell 13: 2099 factors controlling plant secondary metabolism: what 2114 regulates the regulators? Phytochemistry 61: 107-114 Noda K-i, Glover BJ, Linstead P. Martin C (1994) Flower Walker A. R. Davison PA, Bolognesi-Winfield AC, James C colour intensity depends on specialized cell shape con M, Srinivasan N, Blundell TL, Esch JJ, Marks MD, Gray trolled by a Myb-related transcription factor. 
369: 661-664 40 J C (1999) The TRANSPARENT TESTA GLABRA1 Oosumi T. Gruszewski HA, Blischak LA, Baxter A.J. Wadl locus, which regulates trichome differentiation and antho PA, Shuman J. L. Veilleux RE, Shulaev V. (2006) High cyanin biosynthesis in Arabidopsis, encodes a WD40 efficiency transformation of the diploid strawberry repeat protein. Plant Cell 11: 1337-1350 (Fragaria vesca) for functional genomics. Planta.; 223(6): Winkel-Shirley B (2001) Flavonoid biosynthesis. A colorful 1219-30. 45 model for genetics, biochemistry, cell biology, and bio Page R (1996) Comput. Applic. Biosci 12: 357-358 technology. Plant Physiol 126: 485-493 Piazza P. Procissi A, Jenkins GI, Tonelli C (2002) Members Yao, J.-L., Cohen, D., Atkinson, R., Richardson, K. and Mor of the c1/pl1 regulatory gene family mediate the response ris, B. (1995) Regeneration of transgenic plants from the of maize aleurone and mesocotyl to different light qualities commercial apple cultivar Royal Gala. Plant Cell Reports, and cytokinins. Plant Physiol 128: 1077-1086 50 14, 407-412 Quattrocchio F, Wing J. F. van der Woude K, Mol J N, Koes R Zimmermann IM, Heim MA, Weisshaar B. Uhrig JF (2004) (1998) Analysis of bHLH and MYB domain proteins: spe Comprehensive identification of Arabidopsis thaliana cies-specific regulatory differences are caused by diver MYB transcription factors interacting with R/B-like gent evolution of target anthocyanin genes. Plant J 13: BHLH proteins. Plant J 40:22-34 475-488 55 Ramesh SA, Kaiser B. N. Franks T. Collins G. Sedgley M. TABLE X (2006) Improved methods in Agrobacterium-mediated transformation of almond using positive (mannose/pmi) or SUMMARY OF SEQUENCES negative (kanamycin resistance) selection-based proto SEQ cols. Plant Cell Rep. 25(8):821-8. 60 ID Ramsay NA, Glover BJ (2005) MYB-bHLH-WD40 protein NO SPECIES REF SEQUENCE TYPE complex and the evolution of cellular diversity. 
Trends in 1 Maius domestica Md MYB10 Polypeptide Plant Science 10: 63-70 2 Maius domestica Md MYB9 Polypeptide 3 Maius domestica Md bELH3 Polypeptide Saito K, Yamazaki M. (2002) Biochemistry and molecular 4 Maius domestica Md bELH33 Polypeptide biology of the late-stage of biosynthesis of anthocyanin: 65 5 Maius domestica Md MYB10 Polynucleotide (cDNA) lessons from Perilla frutescens as a model plant. New 6 Maius domestica Md MYB9 Polynucleotide (cDNA) Phytologist 155: 9-23 US 7,973,216 B2 43 44 TABLE X-continued TABLE X-continued SUMMARY OF SEQUENCES SUMMARY OF SEQUENCES SEQ SEQ ID 5 ID NO SPECIES REF SEQUENCE TYPE NO SPECIES REF SEQUENCE TYPE 7 Maius domestica Md bELH3 Polynucleotide (cDNA) 67 Artificial Primer KL Ppr1F Polynucleotide 8 Maius domesica Md bELH33 Polynucleotide (cDNA) 68 Artificial Primer KL Ppr2R Polynucleotide 9 Maius Sylvestris MS MYB10 Polypeptide 69 Artificial Primer KL Ppr3F Polynucleotide 10 Pyrus communis PC MYB10 Polypeptide 10 70 Artificial Primer KL Ppr4R Polynucleotide 11 Pyrus pyrifolia Ppy MYB10 Polypeptide 71 Artificial Primer KLFv1F Polynucleotide 12 Pyrus bretschneideri Pb MYB10 Polypeptide 72 Artificial Primer KLFw2R Polynucleotide 13 Cydonia oblonga CO MYB10 Polypeptide 73 Artificial Primer KLFw 6R Polynucleotide 14 Prunus saicina PSMBY1O Polypeptide 74 Artificial Primer KLFv7F Polynucleotide 15 Prunus cerasifera Pcf MYB10 Polypeptide 75 Artificial Primer KLRh1F Polynucleotide 16 Prunus persica PprMYB10 Polypeptide 15 76 Artificial Primer KL. 
Rh2R Polynucleotide 17 Eriobotrya japonica E MYB10 Polypeptide 77 Artificial Primer KLRh3F Polynucleotide 18 Prunus duticis PdMYB10 Polypeptide 78 Artificial Primer KLRh4R Polynucleotide 19 Prints avium Pay MYB10 Polypeptide 79 Artificial Primer KL Rosa deg 1F Polynucleotide 20 Mespilus germanica Mg MYB10 Polypeptide (degenerate primer 21 Prunus domestica Pd MYB10 Polypeptide with a 64 fold 22 Malus Sylvestris MS MYB10 Polynucleotide (cDNA) degeneracy) 23 Maius Sylvestris MS MYB10 Polynucleotide (gDNA) 2O 80 Artificial Primer KL Rosa deg 2R Polynucleotide 24 Pyrus communis PC MYB10 Polynucleotide (cDNA) (degenerate primer 25 Pyrus communis PC MYB10 Polynucleotide (cDNA) with a 16 fold 26 Pyrus communis PC MYB10 Polynucleotide (gDNA) degeneracy) 27 Pyrus pyrifolia Ppy MYB10 Polynucleotide (gDNA) 81 Artificial Primer CN944824 mdCHS Polynucleotide 28 Pyrus bretschneideri Pb MYB10 Polynucleotide (cDNA) OW8 29 Pyrus bretschneideri PB MYB10 Polynucleotide (gDNA) 25 82 Artificial Primer CN944824 MdCHS Polynucleotide 30 Cydonia oblonga CO MYB10 Polynucleotide (cDNA) reWese 31 Cydonia oblonga CO MYB10 Polynucleotide (gDNA) 83 Artificial Primer CN946541 McCHI Polynucleotide 32 Prunus saicina PS MYB10 Polynucleotide (cDNA) OW8 33 Prunus saicina PS MYB10 Polynucleotide (gDNA) 84 Artificial Primer CN 946541 McCHI Polynucleotide 34 Prunus cerasifera Pcf MYB10 Polynucleotide (cDNA) reWese 35 Prunus cerasifera Pcf MYB10 Polynucleotide (gDNA) 30 85 Artificial Primer CN491664 Md F3H Polynucleotide 36 Prunus persica PprMYB10 Polynucleotide (cDNA) OW8 37 Prunus persica PprMYB10 Polynucleotide (gDNA) 86 Artificial Primer CN491664 Md F3H Polynucleotide 38 Eriobotrya japonica E MYB10 Polynucleotide (cDNA) reWese 39 Eriobotrya japonica E MYB10 Polynucleotide (gDNA) 87 Artificial Primer AY227729 MdDFR Polynucleotide 40 Prunus dulcis PdMYB10 Polynucleotide (cDNA) OW8 41 Prunus dulcis PdMYB10 Polynucleotide (gDNA) 35 88 Artificial Primer AY227729 MdDFR Polynucleotide 42 Prints avium Pay 
MYB10 Polynucleotide (cDNA) reWese 43 Prints avium Pay MYB10 Polynucleotide (gDNA) 89 Artificial Primer AF117269MdLDOX Polynucleotide 44 Mespilus germanica Mg MYB10 Polynucleotide (cDNA) OW8 45 Mespilus germanica Mg MYB10 Polynucleotide (gDNA) 90 Artificial Primer AF117269MdLDOX Polynucleotide 46 Prunus domestica Pd MYB10 Polynucleotide (cDNA) reWese 47 Prunus domestica Pd MYB10 Polynucleotide (gDNA) 91 Artificial Primer AF117267MdUFGT Polynucleotide 48 Artificial Primer RE73 Polynucleotide 40 OW8 49 Artificial Primer RE77R Polynucleotide 92 Artificial Primer AF117267MdUFGT Polynucleotide 50 Artificial Primer RE78R Polynucleotide reWese 51 Artificial Primer RE79R Polynucleotide 93 Artificial Primer Md MYB10 forward Polynucleotide 52 Artificial Primer RL95F Polynucleotide 94 Artificial Primer Md MYB10 reverse Polynucleotide 53 Arti ficia f Primer RE96R Polynucleo i e 95 Artificial Primer MdbHLH33 forward Polynucleotide S4 Arti ucia f Primer RE108F o ynucleotide 45 96 Artificial Primer MdbHLH33 reverse Polynucleotide 55 Arti ucia f Timer RB199R o ynucleotide 97 Artificial Primer CN934367 Polynucleotide 56 Arti ficial rimer RE120F olynucleotide MdbHLH3 forward 57 Artificial Primer RE121R Polynucleotide - 58 Artificial- Primer KLMS1F Polynucleotide 98 Artificial Primer CN934367 Polynucleotide 59 Artificial Primer KLMS2R Polynucleotide Ali MdbHLH3 reverse C 60 Artificial Primer KLMPAP1F Polynucleotide 50 99 Artificial Primer CN938023 Md.Actin Polynucleotide 61 Artificial Primer KLMPAP1R Polynucleotide forward 62 Artificial Primer KL Peff Polynucleotide 100 Artificial Primer CN938023 Md.Actin Polynucleotide 63 Artificial Primer KLPCfR Polynucleotide reWese 64 Artificial Primer KLPCf2F Polynucleotide 101 Artificia Consensus Polypeptide 65 Artificial Primer KL Pcf.3R Polynucleotide 102 Maius domestica MMYB10 Polynucleotide (gDNA 66 Artificial Primer KL Pcf/R Polynucleotide
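
By way of illustration only, the fold degeneracy stated for the two degenerate primers (SEQ ID NO: 79 and 80) can be checked by multiplying the number of bases represented at each position. The following is a minimal sketch and forms no part of the described method: it assumes the primer sequences as read from Table 2 and the standard IUPAC nucleotide ambiguity codes, and the function name `degeneracy` is hypothetical.

```python
# IUPAC nucleotide codes mapped to the number of bases each one stands for.
IUPAC_CHOICES = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def degeneracy(primer: str) -> int:
    """Return how many distinct sequences a degenerate primer represents."""
    fold = 1
    for base in primer.upper():
        fold *= IUPAC_CHOICES[base]  # KeyError on any non-IUPAC character
    return fold


# Sequences as read from Table 2 (assumed to be transcribed correctly).
KL_ROSA_DEG_1F = "AGACTTCCRGGAAGRACWGCNAATGMTGTG"
KL_ROSA_DEG_2R = "CCARTAATTTTTCACAKCATTNGC"

print(degeneracy(KL_ROSA_DEG_1F))  # 64 (R, R, W, N, M: 2*2*2*4*2)
print(degeneracy(KL_ROSA_DEG_2R))  # 16 (R, K, N: 2*2*4)
```

Under these assumptions the computed values agree with the 64 fold and 16 fold degeneracy given in Table 2 and Table X.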

SEQUENCE LISTING

<160> NUMBER OF SEQ ID NOS: 119

<210> SEQ ID NO 1 <211> LENGTH: 243 <212> TYPE: PRT <213> ORGANISM: Malus domestica

<400> SEQUENCE: 1

Met Glu Gly Tyr Asn Glu Asn Leu Ser Val Arg Gly Ala Trp Thr 1 10 15

Arg Glu Glu Asp Asn Leu Leu Arg Gln Val Glu Ile His Gly Glu 25

Gly Trp Asn Gln Val Ser Tyr Ala Gly Leu Asn Arg Arg 35 40 45

Ser Arg Leu Arg Trp Leu Asn Leu Lys Pro Asn Ile 50 55 60

Arg Gly Asp Phe Lys Glu Asp Glu Val Asp Leu Ile Ile Arg Leu His 65 70 80

Arg Leu Leu Gly Asn Arg Trp Ser Leu Ile Ala Arg Arg Leu Pro Gly 85 90 95

Arg Thr Ala Asn Ala Val Asn Tyr Trp Asn Thr Arg Leu Arg Ile 105 110

Asp Ser Arg Met Lys Thr Val Lys Asn Ser Gln Glu Met Arg Glu 115 120 125

Thr Asn Val Ile Arg Pro Gln Pro Gln Phe Asn Arg Ser Ser 130 135 140

Tyr Leu Ser Ser Lys Glu Pro Ile Leu Asp His Ile Gln Ser Ala Glu 145 150 155 160

Asp Leu Ser Thr Pro Pro Gln Thr Ser Ser Ser Thr Asn Gly Asn 165 170 175

Asp Trp Trp Glu Thr Leu Leu Glu Gly Glu Asp Thr Phe Glu Arg Ala 180 185 190

Ala Pro Ser Ile Glu Leu Glu Glu Glu Leu Phe Thr Ser Phe Trp 195 205

Phe Asp Asp Arg Leu Ser Pro Arg Ser Ala Asn Phe Pro Glu Gly 210 215 220

His Ser Arg Ser Glu Phe Ser Phe Ser Thr Asp Leu Trp Asn His Ser 225 230 235 240

Lys Glu Glu

<210> SEQ ID NO 2 <211> LENGTH: 290 <212> TYPE: PRT <213> ORGANISM: Malus domestica

<400> SEQUENCE: 2

Met Gly Arg Ser Pro Ser Glu Gly Leu Asn Arg Gly Ala 1 5 10 15

Trp Thr Ala Leu Glu Asp Ile Leu Thr Ala Ile Lys Ala His 20 25

Gly Glu Gly Trp Arg Ser Leu Pro Arg Ala Gly Leu Arg 35 40 45

Gly Ser Cys Arg Leu Arg Trp Leu Asn Tyr Leu Arg Pro Asp 50 55 60

Ile Arg Gly Asn Ile Ser Gly Asp Glu Glu Glu Leu Ile Val Arg 65 70

Leu His Asn Leu Leu Gly Asn Arg Trp Ser Leu Ile Ala Gly Arg Leu 85 90 95

Pro Gly Arg Thr Asp Asn Glu Ile Lys Asn Trp Asn Thr Thr Leu 100 105 110

Gly Ser Lys Val Asp Ser Phe Ser Gly Ser Ser Glu Thr 115 120 125

Ser Leu Asn Pro Ser Ile Ala Lys Asp Val Glu Ser 130 135 140

Lys Thr Ser Thr Ala Ala Ala Gln Pro Leu Val Ile Arg Thr Ala 145 150 155 160

Thr Arg Leu Thr Lys Ile Leu Val Pro Gln Asn Ile Pro Ser Asp Glu 165 170 175

Asn Tyr Thr Ala Ala Ala Ala Asn Pro Leu Glu Leu Gln Thr Gln Ser 180 185 190

Ala Glu Lys Gly Gly Ser Thr Glu Glu Phe Pro Arg Thr Asn Ala Gly 195

Asp Cys Ser Asn Ile Leu Lys Asn Phe Gly Cys Asp Asp Asp Asp Ile 210 215 220

Asp Ala Gly Asp Gln Tyr Asn Glu Phe Gln Leu Leu Asn Ser 225 230 235 240

Ile Pro Leu Asp Glu Ala Met Ile Asn Asp Gly Trp Thr Gly Gly 245 250 255

Asn Gly Asp Leu Glu Asp Gly Ala Ser Leu Asp Leu Asp Ser 260 265 270

Leu Ala Phe Leu Leu Asp Ser Glu Glu Trp Pro Ser Gln Glu Asn Val 275 280 285

Val Val 290

<210> SEQ ID NO 3 <211> LENGTH: <212> TYPE: PRT <213> ORGANISM: Malus domestica

<400> SEQUENCE: 3

Met Ala Ala Pro Pro Pro Ser Ser Ser Arg Leu Arg Gly Met Leu Gln 1 5 15

Ala Ser Val Gln Tyr Val Gln Trp Thr Ser Leu Phe Trp Gln Ile 25

Pro Gln Gln Gly Ile Leu Val Trp Ser Asp Gly Tyr Asn Gly 35 40 45

Ala Ile Thr Thr Val Gln Pro Met Glu Val Ser Ala Asp 50 55 60

Glu Ala Ser Leu Gln Arg Ser Gln Gln Leu Arg Glu Leu Tyr Asp Ser 65 70

Leu Ser Ala Gly Glu Thr Asn Gln Pro Pro Ala Arg Arg Pro Cys Ala 85 90 95

Ser Leu Ser Pro Glu Asp Leu Thr Glu Ser Glu Trp Phe Tyr Leu Met 105 110

Val Ser Phe Ser Phe Pro Pro Gly Val Gly Leu Pro Gly Ala 115 120 125

Ala Arg Arg Gln His Val Trp Leu Thr Gly Ala Asn Glu Val Asp 130 135 140

Ser Thr Phe Ser Arg Ala Ile Leu Ala Lys Ser Ala Arg Ile Gln 145 150 155 160

Thr Val Val Ile Pro Leu Leu Asp Gly Val Val Glu Phe Gly Thr 165 170 175

Thr Glu Arg Val Pro Glu Asp His Ala Phe Val Glu His Val Thr 180 185 190

Phe Phe Val Asp His His His Pro Pro Pro Pro Pro Ala Leu Ser 195 200 205

Glu His Ser Thr Ser Asn Pro Ala Ala Ser Ser Asp His Pro His Phe 210 215

His Ser Pro His Leu Leu Gln Ala Met Cys Thr Asn Pro Pro Leu Asn 225 230 235 240

Ala Ala Gln Glu Asp Glu Glu Asp Glu Glu Glu Asp Asp Asn Gln Glu 245 250 255

Glu Asp Asp Gly Gly Ala Glu Ser Asp Ser Glu Ala Glu Thr Gly Arg 260 265 270

Asn Gly Gly Ala Val Val Pro Ala Ala Asn Pro Pro Gln Val Leu Ala 285

Ala Val Ala Glu Pro Ser Glu Leu Met Gln Leu Glu Met Ser Glu Asp 290 295 300

Ile Arg Leu Gly Ser Pro Asp Asp Ala Ser Asn Asn Leu Asp Ser Asp 305 310 315

Phe His Leu Leu Ala Val Ser Gln Ser Arg Asn Pro Ala Asp Gln Gln 325 330 335

Arg Gln Ala Asp Ser Arg Ala Glu Ser Thr Arg Arg Arg Pro Ser 340 345 350

Val Gln Glu Pro Leu Ser Ser Gly Leu Gln Pro Pro His Thr Gly Pro 355 360 365

Leu Ala Leu Glu Glu Leu Thr His Asp Asp Asp Thr His Ser Glu 370 375

Thr Val Ser Thr Ile Leu Gln Gly Gln Val Thr Gln Leu Met Asp Ser 385 390 395 400

Ser Ser Thr Asp Tyr Thr Ala Leu Thr Gln Ser Ala Phe Ala 405 415

Trp Ser Ser Arg Val Asp His His Phe Leu Met Pro Val Glu Gly Thr 425 430

Ser Gln Trp Leu Leu Ile Leu Phe Ser Val Pro Phe Leu His 435 440 445

Ser Lys Arg Asp Glu Asn Ser Pro Phe Gln Glu Gly Glu Gly 450 455 460

Ser Thr Arg Leu Arg Lys Gly Thr Pro Gln Asp Glu Leu Ser Ala Asn 465 470

His Val Leu Ala Glu Arg Arg Arg Arg Glu Leu Asn Glu Arg Phe 485 490 495

Ile Ile Leu Arg Ser Leu Val Pro Phe Val Thr Met Asp Ala 500 505

Ser Ile Leu Gly Asp Thr Ile Glu Val Gln Leu Arg Asn 515 525

Ile Gln Asp Leu Glu Ala Arg Asn Met Leu Val Glu Glu Asp Gln Arg 530 535 540

Ser Arg Ser Ser Gly Glu Met Gln Arg Ser Asn Ser Glu Leu 545 550 555 560

Arg Ser Gly Leu Thr Leu Val Glu Arg Thr Gln Gly Gly Pro Pro Gly 565 570 575

Ser Asp Arg Lys Leu Arg Ile Val Glu Gly Ser Gly Gly Val Ala 585 590

Ile Gly Lys Ala Lys Val Met Glu Asp Ser Pro Pro Ser Pro Pro Pro 595 605

Pro Pro Pro Gln Pro Glu Pro Leu Pro Thr Pro Met Val Thr Gly Thr 610 615 620

Ser Leu Glu Val Ser Ile Ile Glu Ser Asp Gly Leu Leu Glu Leu Gln 625 630 635 640

Pro Arg Glu Gly Leu Leu Leu Asp Val Met Arg Thr Leu Arg 645 650 655

Glu Leu Arg Ile Glu Thr Thr Val Val Gln Ser Ser Leu Asn Asn Gly 660 665 670

Phe Phe Val Ala Glu Leu Arg Ala Val Asp Asn Val Ser Gly 675 685

Lys Val Ser Ile Thr Glu Val Arg Val Ile Asn Gln Ile Ile 690 695 700

Pro Gln Ser Asp Ser 705

<210> SEQ ID NO 4 <211> LENGTH: 651 <212> TYPE: PRT <213> ORGANISM: Malus domestica

<400> SEQUENCE: 4

Met Ala Gln Asn His Glu Arg Val Pro Gly Asn Leu Arg Gln Phe 1 15

Ala Val Ala Val Arg Ser Ile Trp Ser Ala Ile Phe Trp Ser 25 30

Leu Ser Thr Thr Gln Gln Gly Leu Glu Trp Gly Glu Tyr 35 45

Asn Gly Asp Ile Thr Arg Val Glu Gly Val Leu 50 55 60

Thr Asp Met Gly Leu Gln Arg Asn Val Gln Leu Arg Leu Tyr 65 70

Ser Leu Leu Glu Gly Glu Thr Glu Thr Glu Gln Gln Lys Ala 85 90 95

Pro Ser Ala Val Leu Ser Pro Glu Asp Leu Thr Asp Ala Trp Tyr 105

Leu Leu Met Ser Phe Ile Phe Asn Pro Gly Glu Leu Pro 115 120 125

Gly Arg Ala Leu Ala Thr Gly Gln Thr Ile Trp Leu Asn Ala Gln 130 135 140

His Thr Asp Ser Val Phe Ser Arg Ser Leu Leu Ala Ser Ala 145 150 155 160

Ser Val Gln Thr Val Val Phe Pro Tyr Leu Gly Gly Val Val Glu 165 170 175

Leu Gly Val Thr Glu Leu Val Ser Glu Asp Leu Asn Leu Ile Gln His 180 185 190

Ile Ala Ser Leu Leu Asp Phe Ser Pro Asp Cys Glu 195

Ser Ser Ser Ala Pro His Lys Pro Asp Asp Asp Ser Glu Gln Ile Val 210 215 220

Ala Val Asp His Asp Val Val Asp Thr Leu Pro Leu Glu Asn Leu 225 230 235 240

Ser Pro Ser Glu Glu Ile Phe Asp Gln Arg Gly Ile Asn Gly 245 250 255

Leu Leu Gly Asn His Glu Glu Val Asn Met Asp Ser Ser Asp Glu 260 265 270

Ser Asn Gly Cys Asp His Asn His Pro Thr Glu Asp Ser Met Met Leu 275 280 285

Glu Gly Thr Asn Ala Val Ala Ser Gln Val Gln Ser Trp His Phe Met 290 295 300

Asp Glu Asp Phe Ser Ser Gly Val Gln Asp Ser Met Asn Ser Ser Asp 305 310 315 320

Ser Ile Ser Glu Ala Phe Val Asn Gln Gly Lys Ala His Ser Phe Ala 325 330 335

Lys His Glu Asn Ala Asn His Ile His Leu Lys Glu Leu Gln Asn Phe 340 345 350

Asn Asp Thr Lys Leu Ser Ser Leu Tyr Leu Gly Ser Val Asp Glu His 355 360 365

Val His Tyr Lys Arg Thr Leu Cys Thr Leu Leu Gly Ser Ser Met Lys 370 375 380

Leu Ile Glu Asn Pro Cys Phe Cys Asp Gly Glu Ser Lys Ser Ser Phe 385 390 395 400

Val Lys Trp Lys Lys Glu Val Val Gly Ser Cys Arg Pro Thr Val His 405 410 415

Gln Lys Thr Leu Lys Lys Ile Leu Phe Thr Val Pro Leu Met Tyr Gly 420 425 430

Val His Ser Pro Met Ala Thr Gly Lys Glu Asn Thr Gly Lys Asp Leu 435 440 445

Leu Pro Asn Leu Gln Gly Asp Asp Ile Asn Arg Glu His Asp Lys Met 450 455 460

Arg Glu Asn Ala Lys Leu Leu Val Leu Arg Ser Met Val Pro Ser Ile 465 470 475 480

Thr Glu Val Asp Lys Ala Ser Ile Leu Asp Asp Thr Ile Lys Tyr Leu 485 490 495

Lys Glu Leu Glu Ala Arg Ala Glu Glu Met Glu Ser Cys Met Asp Thr 500 505 510

Val Glu Ala Ile Ser Arg Gly Lys Phe Leu Asn Arg Val Glu Lys Thr 515 520 525

Ser Asp Asn Tyr Asp Lys Thr Lys Lys Asn Asn Val Lys Lys Ser Leu 530 535 540

Val Lys Lys Arg Lys Ala Cys Asp Ile Asp Glu Thr Asp Pro Tyr Pro 545 550 555 560

Asn Met Leu Val Ser Gly Glu Ser Leu Pro Leu Asp Val Lys Val Cys 565 570 575

Val Lys Glu Gln Glu Val Leu Ile Glu Met Arg Cys Pro Tyr Arg Glu 580 585 590

Tyr Ile Leu Leu Asp Ile Met Asp Ala Ile Asn Asn Leu Tyr Leu Asp 595 600 605

Ala His Ser Val Gln Ser Ser Ile Leu Asp Gly Val Leu Thr Leu Ser 610 615 620

Leu Lys Ser Lys Phe Arg Gly Ala Ala Ile Ser Pro Val Gly Met Ile 625 630 635 640

Lys Gln Val Leu Trp Lys Ile Ala Gly Lys Cys 645 650

<210> SEQ ID NO 5 <211> LENGTH: 729 <212> TYPE: DNA <213> ORGANISM: Malus domestica

<400> SEQUENCE: 5

atggagggat atalacgaaaa cctgagtgtg agaaaaggtg cctggacticg agaggalagac 60

aatcttctica ggcagtgcgt tagattcat ggagagggala agtggalacca agttt catac 120


<400> SEQUENCE: 9

Met Glu Gly Tyr Asn Glu Asn Leu Ser Val Arg Gly Ala Trp Thr 1 10 15

Arg Glu Glu Asp Asn Leu Leu Arg Gln Val Glu Ile His Gly Glu 25

Gly Trp Asn Gln Val Ser Tyr Ala Gly Leu Asn Arg Arg 35 40 45

Ser Arg Gln Arg Trp Leu Asn Leu Lys Pro Asn Ile 50 55 60

Arg Gly Asp Phe Lys Glu Asp Glu Val Asp Leu Ile Ile Arg Leu His 65 70 80

Arg Leu Leu Gly Asn Arg Trp Ser Leu Ile Ala Arg Arg Leu Pro Gly 85 90 95

Arg Thr Ala Asn Ala Val Asn Tyr Trp Asn Thr Arg Leu Arg Ile 105 110

Asp Ser Arg Met Lys Thr Val Lys Asn Ser Gln Glu Met Arg 115 120 125

Thr Asn Val Ile Arg Pro Gln Pro Gln Phe Asn Arg Ser Ser 130 135 140

Tyr Leu Ser Ser Lys Glu Pro Ile Leu Asp His Ile Gln Ser Ala Glu 145 150 155 160

Asp Leu Ser Thr Pro Pro Gln Thr Ser Ser Ser Thr Asn Gly Asn 165 170 175

Asp Trp Trp Glu Thr Leu Leu Glu Gly Glu Asp Thr Phe Glu Arg Ala 180 185 190

Ala Pro Ser Ile Glu Leu Glu Glu Glu Leu Phe Thr Ser Phe Trp 195 205

Phe Asp Asp Arg Leu Ser Pro Arg Ser Ala Asn Phe Pro Glu Gly 210 215 220

Gln Ser Arg Ser Glu Phe Ser Phe Ser Thr Asp Leu Trp Asn His Ser 225 230 235 240

Lys Glu Glu

<210> SEQ ID NO 10 <211> LENGTH: 244 <212> TYPE: PRT <213> ORGANISM: Pyrus communis

<400> SEQUENCE: 10

Met Glu Gly Tyr Asn Val Asn Leu Ser Val Arg Gly Ala Trp Thr 1 5 10 15

Arg Glu Glu Asp Asn Leu Leu Arg Gln Ile Glu Ile His Gly Glu 20 25

Gly Trp Asn Gln Val Ser Tyr Ala Gly Leu Asn Arg Arg 35 40 45

Ser Arg Gln Arg Trp Leu Asn Leu Lys Pro Asn Ile 50 55 60

Arg Gly Asp Phe Lys Glu Asp Glu Val Asp Leu Ile Leu Arg Leu His 65 70 80

Arg Leu Leu Gly Asn Arg Trp Ser Leu Ile Ala Arg Arg Leu Pro Gly 85 90 95

Arg Thr Ala Asn Asp Val Asn Tyr Trp Thr Arg Leu Arg Ile 100 105 110

Asp Ser Arg Met Lys Thr Val Asn Ser Gln Glu Thr Arg 115 120 125

Thr Asn Val Ile Arg Pro Gln Pro Gln Lys Phe Ile Lys Ser Ser 130 135 140

Tyr Leu Ser Ser Lys Glu Pro Ile Leu Glu His Ile Gln Ser Ala Glu 145 150 155 160

Asp Leu Ser Thr Pro Ser Gln Thr Ser Ser Ser Thr Asn Gly Asn 165 170 175

Asp Trp Trp Glu Thr Leu Phe Glu Gly Glu Asp Thr Phe Glu Arg Ala 180 185 190

Ala Pro Ser Ile Glu Leu Glu Glu Glu Leu Phe Thr Ser Phe Trp 195 205

Phe Asp Asp Arg Leu Ser Ala Arg Ser Ala Asn Phe Pro Glu Glu 210 215 220

Gly Gln Ser Arg Ser Glu Phe Ser Phe Ser Met Asp Leu Trp Asn His 225 230 235 240

Ser Lys Glu Glu

<210> SEQ ID NO 11 <211> LENGTH: 244 <212> TYPE: PRT <213> ORGANISM: Pyrus pyrifolia

<400> SEQUENCE: 11

Met Glu Gly Tyr Asn Val Asn Leu Ser Val Arg Gly Ala Trp Thr 1 5 10 15

Arg Glu Glu Asp Asn Leu Leu Arg Gln Ile Glu Ile His Gly Glu 25

Gly Trp Asn Gln Val Ser Tyr Ala Gly Leu Asn Arg Arg 35 40 45

Ser Arg Gln Arg Trp Leu Asn Leu Lys Pro Asn Ile 50 55 60

Arg Gly Asp Phe Lys Glu Asp Glu Val Asp Leu Ile Leu Arg Leu His 65 70 80

Arg Leu Leu Gly Asn Arg Trp Ser Leu Ile Ala Arg Arg Leu Pro Gly 85 90 95

Arg Thr Ala Asn Asp Val Asn Tyr Trp Asn Thr Arg Leu Gly Ile 105 110

Asp Ser Arg Met Lys Thr Leu Lys Asn Ser Gln Glu Thr Arg 115 120 125

Thr Asn Val Ile Arg Pro Gln Pro Gln Phe Ile Ser Ser 130 135 140

Tyr Leu Ser Ser Lys Glu Pro Ile Leu Glu His Ile Gln Ser Ala Glu 145 150 155 160

Asp Leu Ser Thr Pro Ser Gln Thr Ser Ser Ser Thr Asn Gly Asn 165 170 175

Asp Trp Trp Glu Thr Leu Phe Glu Gly Glu Asp Thr Phe Glu Arg Ala 180 185 190

Ala Pro Ser Ile Glu Leu Glu Glu Glu Leu Phe Thr Thr Phe Trp 195 205

Phe Asp Asp Arg Leu Ser Ala Arg Ser Ala Asn Phe Pro Glu Glu 210 215 220

Gly Gln Ser Arg Ser Glu Phe Ser Phe Ser Met Asp Leu Trp Asn His 225 230 235 240

Ser Glu Glu

<210> SEQ ID NO 12 <211> LENGTH: 244 <212> TYPE: PRT <213> ORGANISM: Pyrus bretschneideri

<400> SEQUENCE: 12

Met Glu Gly Tyr Asn Val Asn Leu Ser Val Arg Lys Gly Ala Trp Thr 1 5 10 15

Arg Glu Glu Asp Asn Leu Leu Arg Gln Cys Ile Glu Ile His Gly Glu 20 25 30

Gly Lys Trp Asn Gln Val Ser Tyr Lys Ala Gly Leu Asn Arg Cys Arg 35 40 45

Lys Ser Cys Arg Gln Arg Trp Leu Asn Tyr Leu Lys Pro Asn Ile Lys 50 55 60

Arg Gly Asp Phe Lys Glu Asp Glu Val Asp Leu Ile Leu Arg Leu His 65 70 75 80

Arg Leu Leu Gly Asn Arg Trp Ser Leu Ile Ala Arg Arg Leu Pro Gly 85 90 95

Arg Thr Ala Asn Asp Val Lys Asn Tyr Trp Asn Thr Arg Leu Gly Ile 100 105 110

Asp Ser Arg Met Lys Thr Leu Lys Asn Lys Ser Gln Glu Thr Arg Lys 115 120 125

Thr Asn Val Ile Arg Pro Gln Pro Gln Lys Phe Ile Lys Ser Ser Tyr 130 135 140

Tyr Leu Ser Ser Lys Glu Pro Ile Leu Glu His Ile Gln Ser Ala Glu 145 150 155 160

Asp Leu Ser Thr Pro Ser Gln Thr Ser Ser Ser Thr Lys Asn Gly Asn 165 170 175

Asp Trp Trp Glu Thr Leu Phe Glu Gly Glu Asp Thr Phe Glu Arg Ala 180 185 190

Ala Cys Pro Ser Ile Glu Leu Glu Glu Glu Leu Phe Thr Thr Phe Trp 195 200 205

Phe Asp Asp Arg Leu Ser Ala Arg Ser Cys Ala Asn Phe Pro Glu Glu 210 215 220

Gly Gln Ser Arg Ser Glu Phe Ser Phe Ser Met Asp Leu Trp Asn His 225 230 235 240

Ser Lys Glu Glu

<210> SEQ ID NO 13 <211> LENGTH: 245 <212> TYPE: PRT <213> ORGANISM: Cydonia oblonga

<400> SEQUENCE: 13

Met Glu Gly Tyr Asn Val Asn Leu Ser Val Met Arg Lys Gly Ala Trp 1 5 10 15

Thr Arg Glu Glu Asp Asp Leu Leu Arg Gln Cys Ile Gly Ile Leu Gly 20 25 30

Glu Gly Lys Trp His Gln Val Pro Tyr Lys Thr Gly Leu Asn Arg Cys 35 40 45

Arg Lys Ser Cys Arg Leu Arg Trp Leu Asn Tyr Leu Lys Pro Asn Ile 50 55 60

Lys Arg Gly Asp Phe Thr Glu Asp Glu Val Asp Leu Ile Ile Arg Leu 65 70 75 80

His Lys Leu Leu Gly Asn Arg Trp Ser Leu Ile Ala Gly Arg Leu Pro 85 90 95

Gly Arg Thr Ala Asn Asp Val Asn Tyr Trp Asn Thr Arg Leu Arg 100 105 110

Ile Asn Ser Arg Met Thr Leu Asn Lys Ser Gln Glu Thr Arg 115 120 125

Thr Asn Val Ile Arg Pro Gln Pro Arg Phe Ile Lys Ser Ser 130 135 140

Tyr Leu Ser Ser Lys Gly Pro Ile Leu Asp His Ile Gln Ser Ala 145 150 155 160

Glu Asp Leu Ser Thr Pro Pro Gln Thr Ser Ser Ser Asn Gly 165 170 175

Asn Asp Trp Trp Glu Thr Leu Phe Glu Gly Glu Phe Glu Arg 180 185 190

Ala Ala Cys Pro Ser Ile Glu Leu Glu Glu Glu Leu Phe Thr Ser Phe 195 205

Trp Phe Asp Asp Arg Leu Ser Ala Arg Ser Ala Asn Phe Pro Glu 210 215 220

Glu Gly Gln Ser Arg Ser Glu Phe Ser Val Ser Met Asp Leu Trp Asn 225 230 235 240

His Ser Glu Glu 245

<210> SEQ ID NO 14 <211> LENGTH: 243 <212> TYPE: PRT <213> ORGANISM: Prunus salicina

<400> SEQUENCE: 14

Met Glu Gly Tyr Asn Leu Gly Val Arg Lys Gly Ala Trp Thr Arg 1 5 10 15

Glu Asp Asp Leu Leu Arg Gln Ile Glu Lys His Gly Glu Gly 20 25 30

Trp His Gln Val Pro Ala Gly Leu Ser Arg Cys Arg Ser 35 40 45

Arg Leu Arg Trp Leu Asn Tyr Leu Pro Asn Ile Gly 50 55 60

Asp Phe Met Glu Asp Glu Val Asp Leu Ile Ile Arg Leu His Leu 65 70

Leu Gly Asn Arg Trp Ser Leu Ile Ala Arg Arg Leu Pro Gly Arg Thr 85 90 95

Ala Asn Asp Val Lys Asn Tyr Trp Asn Thr Arg Leu Arg Thr Asp Tyr 100 105 110

Met Lys Lys Met Asp Lys Ser Gln Glu Thr Ile Thr Ile 115 120 125

Ile Arg Pro Gln Pro Arg Arg Phe Thr Ser Ser Asn Leu Ser 130 135 140

Phe Glu Pro Ile Leu Asp His Thr Gln Leu Glu Glu Asn Phe Ser 145 150 155 160

Thr Thr Ser Gln Ile Ser Thr Ser Thr Arg Ile Gly Ser Asp Trp Trp 165 175

Glu Thr Phe Leu Asp Asp Asp Ala Thr Glu Thr Ala Thr Gly Ser 180 185 190

Gly Leu Gly Leu Asp Glu Glu Leu Leu Ala Ser Phe Trp Val Asp Asp 195 200

Asp Met Pro Gln Ser Thr Arg Thr Val Asn Phe Ser Glu Glu Gly 210 215 220

Leu Ser Arg Gly Asp Phe Ser Phe Ser Val Asp Leu Trp Asn His Ser 225 230 235 240

Lys Glu Glu

<210> SEQ ID NO 15 <211> LENGTH: 243 <212> TYPE: PRT <213> ORGANISM: Prunus cerasifera

<400> SEQUENCE: 15

Met Glu Gly Tyr Asn Leu Gly Val Arg Lys Gly Ala Trp Thr Arg Lys 1 5 10 15

Glu Asp Asp Leu Leu Arg Gln Cys Ile Glu Lys His Gly Glu Gly Lys 20 25 30

Trp His Gln Val Pro Tyr Lys Ala Gly Leu Ser Arg Cys Arg Arg Ser 35 40 45

Cys Arg Leu Arg Trp Leu Asn Tyr Leu Lys Pro Asn Ile Lys Arg Gly 50 55 60

Asp Phe Met Glu Asp Glu Val Asp Leu Ile Ile Arg Leu His Lys Leu 65 70 75 80

Leu Gly Asn Arg Trp Ser Leu Ile Ala Arg Arg Leu Pro Gly Arg Thr 85 90 95

Ala Asn Asp Val Lys Asn Tyr Trp Asn Thr Arg Leu Arg Lys Asp Tyr 100 105 110

Cys Met Lys Lys Met Lys Asp Lys Ser Gln Glu Thr Ile Lys Thr Ile 115 120 125

Ile Arg Pro Gln Pro Arg Ser Phe Thr Lys Ser Ser Asn Cys Leu Ser 130 135 140

Phe Lys Glu Pro Ile Leu Asp His Thr Gln Leu Glu Glu Asn Phe Ser 145 150 155 160

Thr Pro Ser Gln Thr Ser Thr Ser Thr Arg Ile Gly Ser Asp Trp Trp 165 170 175

Glu Thr Phe Leu Asp Asp Lys Asp Ala Thr Glu Arg Asp Thr Gly Ser 180 185 190

Gly Leu Gly Leu Asp Glu Glu Leu Leu Ala Ser Phe Trp Val Asp Asp 195 200 205

Asp Met Pro Gln Ser Thr Arg Thr Cys Val Asn Phe Ser Glu Glu Gly 210 215 220

Leu Ser Arg Gly Asp Phe Ser Phe Ser Val Asp Leu Trp Asn His Ser 225 230 235 240

Lys Glu Glu

<210> SEQ ID NO 16 <211> LENGTH: 243 <212> TYPE: PRT <213> ORGANISM: Prunus persica

<400> SEQUENCE: 16

Met Glu Gly Tyr Asn Leu Gly Val Arg Lys Gly Ala Trp Thr Arg Glu 1 5 10 15

Glu Asp Asp Leu Leu Arg Gln Cys Ile Glu Asn His Gly Glu Gly Lys 20 25 30

Trp His Gln Val Pro Asn Lys Ala Gly Leu Asn Arg Cys Arg Lys Ser 35 40 45

Cys Arg Leu Arg Trp Met Asn Tyr Leu Lys Pro Asn Ile Lys Arg Gly 50 55 60

Glu Phe Ala Glu Asp Glu Val Asp Leu Ile Ile Arg Leu His Lys Leu 65 70 75

Leu Gly Asn Arg Trp Ser Leu Ile Ala Gly Arg Leu Pro Gly Arg Thr 85 90 95

Ala Asn Asp Val Lys Asn Trp Asn Thr Arg Leu Arg Thr Asp Ser 105 110

Arg Leu Lys Val Asp Lys Pro Gln Glu Thr Ile Thr Ile 115 120 125

Val Ile Arg Pro Gln Pro Arg Ser Phe Ile Ser Ser Asn Leu 130 135 140

Ser Ser Glu Pro Ile Leu Asp His Ile Gln Thr Val Glu Asn Phe 145 150 155 160

Ser Thr Pro Ser Gln Thr Ser Pro Ser Thr Asn Gly Asn Asp Trp 165 175

Trp Glu Thr Phe Leu Asp Asp Glu Asp Val Phe Glu Arg Ala Thr 180 185 190

Gly Leu Ala Leu Glu Glu Glu Glu Phe Thr Ser Phe Trp Val Asp 195

Asp Met Pro Gln Ser Arg Gln Thr Asn Val Ser Glu Glu Gly 210 215 220

Leu Gly Arg Gly Asp Phe Ser Phe Ser Val Asp Leu Trp Asn His Ser 225 230 235 240

Lys Glu Glu

<210> SEQ ID NO 17 <211> LENGTH: 246 <212> TYPE: PRT <213> ORGANISM: Eriobotrya japonica

<400> SEQUENCE: 17

Met Glu Gly Tyr Asn Val Asn Leu Arg Val Arg Gly Ala Trp 1 5 10 15

Thr Arg Glu Glu Asp Asn Leu Leu Arg Gln Ile Glu Ile Leu Gly 20 25

Glu Gly Lys Trp His Gln Val Pro Ala Gly Leu Asn Arg 35 40 45

Arg Lys Ser Leu Arg Trp Leu Asn Val Pro Asn Ile 50 55 60

Lys Arg Gly Asp Phe Thr Glu Asp Glu Val Asp Leu Ile Ile Arg Leu 65 70 75

His Leu Leu Gly Asn Arg Trp Ser Leu Ile Ala Gly Arg Leu Gln 85 90 95

Gly Arg Thr Ala Asn Asp Val Asn Asn Thr Arg Leu Arg 105 110

Ile Asn Ser Arg Met Thr Ser Gln Asn Ser Gln Glu Thr Arg 115 120 125

Thr Ile Val Ile Arg Pro Gln Pro Arg Ser Phe Ile Ser Ser 130 135 140

Asn Leu Ser Ser Lys Glu Pro Ile Leu Asp His Ile Gln Ser Glu 145 150 155 160

Glu Asp Ser Ser Thr Pro Ser Gln Thr Ser Leu Thr Asn Gly Asn 165 170 175

Asp Arg Trp Glu Thr Leu Leu Asp Glu Gly Thr Phe Glu Arg Thr 180 185 190

Ala Pro Ser Phe Glu Leu Glu Glu Glu Leu Phe Thr Ser Phe Trp 195 200 205

Ala Asp Glu Met Gln Gln Ser Ala Arg Ser Cys Thr Val Ser Phe Pro 210 215 220

Glu Glu Gly Pro Ser Lys Ser Asn Leu Ser Phe Asn Met Glu Leu Trp 225 230 235 240

Asn His Ser Glu Glu 245

<210> SEQ ID NO 18 <211> LENGTH: 225 <212> TYPE: PRT <213> ORGANISM: Prunus dulcis

<400> SEQUENCE: 18

Met Glu Gly Tyr Asn Leu Gly Val Arg Lys Gly Ala Trp Thr Arg Glu 1 5 10 15

Glu Asp Asp Leu Leu Arg Gln Ile Glu Asn Gln Gly Glu Gly 25 30

Trp His Gln Val Pro Ala Gly Leu Lys Arg Cys Arg Ser 35 40 45

Arg Leu Arg Trp Val Asn Tyr Leu Pro Asn Ile Gly 50 55 60

Glu Phe Ala Glu Asp Glu Val Asp Leu Ile Ile Arg Leu His Leu 65 70

Leu Gly Asn Arg Trp Ser Leu Ile Ala Gly Arg Leu Pro Gly Arg Thr 85 90 95

Ala Asn Asp Val Lys Asn Tyr Trp Asn Thr Arg Leu Arg Thr Asp Ser 105 110

Arg Leu Lys Val Asp Lys Pro Gln Glu Thr Ile Thr Ile 115 120 125

Val Ile Arg Pro Gln Pro Arg Arg Phe Thr Lys Ser Ser Asn Leu 130 135 140

Ser Phe Glu Pro Ile Leu Asp His Thr Gln Arg Asp Trp Trp Glu 145 150 155 160

Thr Phe Leu Asp Asp Asp Ala Thr Glu Arg Ala Thr Gly Ser Gly 165 170 175

Leu Gly Leu Asp Glu Glu Leu Leu Ala Ser Phe Trp Val Asp Asp Asp 180 185 190

Met Pro Gln Ser Thr Arg Cys Ile Asn Phe Ser Glu Gly Leu Ile 195

Arg Gly Asp Phe Ser Phe Ser Val Asp Pro Trp Asn His Ser Glu 210 215 220

Glu 225

<210> SEQ ID NO 19 <211> LENGTH: 244 <212> TYPE: PRT <213> ORGANISM: Prunus avium

<400> SEQUENCE: 19

Met Glu Gly Tyr Asn Leu Gly Val Arg Lys Gly Ala Trp Thr Arg Gln 1 5 15

Glu Asp Asp Leu Leu Arg Gln Cys Ile Glu Asn Gln Gly Glu Gly Lys 25

Trp His Gln Val Pro Tyr Lys Ala Gly Leu Asn Arg Cys Arg Arg Ser 35 40 45

Arg Leu Arg Trp Leu Asn Leu Lys Pro Asn Ile Arg Gly 50 55 60

Asp Phe Met Glu Asp Glu Val Asp Leu Ile Ile Arg Leu His Leu 65 70 80

Leu Gly Asn Arg Trp Ser Leu Ile Ala Gln Arg Leu Pro Gly Arg Thr 85 90 95

Ala Asn Asp Val Lys Asn Trp Asn Thr Arg Leu Arg Met Asp 105 110

Ser Leu Lys Met Asp Lys Ser Gln Glu Thr Ile Thr Ile 115 120 125

Ile Ile Arg Pro Gln Pro Arg Ser Phe Thr Ser Ser Asn Leu 130 135 140

Ser Phe Glu Pro Ile Leu Asp His Thr Gln Leu Glu Glu Asn Phe 145 150 155 160

Ser Thr Pro Ser Gln Thr Ser Thr Ser Thr Arg Ile Gly Ser Asp Trp 165 175

Trp Glu Thr Phe Leu Asp Asp Asp Ala Thr Glu Arg Ala Thr Gly 180 185 190

Ser Gly Leu Gly Leu Asp Glu Glu Leu Leu Ala Ser Phe Trp Val Asp 195 200

Asp Asp Met Pro Gln Ser Thr Arg Thr Ile Asn Phe Ser Glu Glu 210 215 220

Gly Leu Ser Arg Gly Asp Phe Ser Phe Ser Val Asp Leu Trp Asn His 225 230 235 240

Ser Lys Glu Glu

<210> SEQ ID NO 20 <211> LENGTH: 247 <212> TYPE: PRT <213> ORGANISM: Mespilus germanica

<400> SEQUENCE: 20

Met Glu Gly Tyr Asn Val Asn Leu Ser Val Arg Gly Ala Trp Thr 1 5 10 15

Arg Glu Glu Asp Asn Leu Leu Arg Gln Ile Glu Ile His Gly Glu 25

Gly Trp Asn Gln Val Ser Tyr Ala Gly Leu Asn Arg Arg 35 40 45

Ser Arg Leu Arg Trp Leu Asn Leu Lys Pro Ser Ile 50 55 60

Gly Asp Phe Lys Glu Asp Glu Val Asp Leu Ile Ile Arg Leu His 70 80

Leu Leu Gly Asn Arg Trp Ser Leu Ile Ala Gln Arg Leu Pro Gly 85 90 95

Arg Thr Ala Asn Asp Val Asn Tyr Trp Asn Thr Arg Leu Arg Met 105 110

Asp Ser Leu Lys Met Lys Asp Ser Gln Glu Thr Ile 115 120 125

Thr Ile Ile Ile Arg Pro Gln Pro Arg Ser Phe Thr Ser Ser Asn 130 135 140

Cys Leu Ser Phe Lys Glu Pro Ile Leu Asp His Thr Gln Leu Glu Glu 145 150 155 160

Asn Phe Ser Thr Pro Ser Gln Thr Ser Thr Ser Thr Arg Ile Gly Ser 165 170 175

Asp Trp Trp Glu Thr Phe Leu Asp Asp Lys Asp Ala Thr Glu Arg Ala 180 185 190

Thr Gly Ser Gly Leu Gly Leu Asp Glu Glu Leu Leu Ala Ser Phe Trp 195 200 205

Val Asp Asp Asp Met Pro Gln Ser Thr Arg Thr Cys Ile Asn Phe Ser 210 215 220

Glu Glu Gly Leu Ser Arg Gly Glu Leu Ser Phe Ser Thr Asp Leu Trp 225 230 235 240

Asn His Ser Lys Lys Asn Ser 245

<210> SEQ ID NO 21 <211> LENGTH: 237 <212> TYPE: PRT <213> ORGANISM: Prunus domestica

<400> SEQUENCE: 21

Met Glu Gly Tyr Asn Val Gly Val Arg Lys Gly Ala Trp Thr Arg Glu 1 5 10 15

Glu Asp Asp Leu Leu Arg Gln Cys Ile Glu Asn His Gly Glu Gly Lys 20 25 30

Trp His Gln Val Pro Asn Lys Ala Gly Leu Asn Arg Cys Arg Lys Ser 35 40 45

Cys Arg Leu Arg Trp Leu Asn Tyr Leu Lys Pro Asn Ile Lys Arg Gly 50 55 60

Glu Phe Ala Glu Asp Glu Val Asp Leu Ile Ile Arg Leu His Lys Leu 65 70 75 80

Leu Gly Asn Arg Trp Ser Leu Ile Ala Gly Arg Leu Pro Gly Arg Thr 85 90 95

Ala Asn Asp Val Lys Asn Tyr Trp Asn Thr Arg Leu Arg Lys Val Lys 100 105 110

Asp Lys Pro Gln Glu Thr Ile Lys Thr Ile Val Ile Arg Pro Gln Pro 115 120 125

Arg Ser Phe Ile Lys Ser Ser Asn Cys Leu Ser Ser Lys Glu Pro Ile 130 135 140

Leu Asp His Ile Gln Thr Val Glu Asn Phe Ser Thr Pro Ser Gln Ser 145 150 155 160

Ser Pro Ser Thr Lys Asn Gly Asn Asp Trp Trp Glu Thr Phe Leu Asp 165 170 175

Asp Glu Asp Val Phe Glu Lys Ala Thr Cys Tyr Gly Leu Ala Leu Glu 180 185 190

Glu Glu Glu Phe Thr Ser Phe Trp Val Asp Asp Met Pro Gln Ser Lys 195 200 205

Arg Gln Cys Thr Asn Val Thr Glu Glu Gly Leu Gly Thr Gly Asp Phe 210 215 220

Ser Phe Asn Val Asp Leu Trp Asn His Ser Lys Glu Glu 225 230 235

<210> SEQ ID NO 22 <211> LENGTH: 729 <212> TYPE: DNA <213> ORGANISM: Malus sylvestris

<400> SEQUENCE: 22

atggagggat atalacgaaaa cctgagtgtg agaaaaggtg cctggacticg agaggalagac 60

aatcttctica ggcagtgcgt tagattcat ggagagggala agtggalacca agttt catac 120

aaag caggct taalacaggtg Caggalaga.gc tigcagacaaa gatggittaala Ctatctgaag 180

cCaaat at ca agagaggaga ctittaaagag gatgaagtag atcttataat tag actt cac 240

aggcttittgg gaalacaggtg gt cattgatt gctagaagac titcCaggaag alacagcaaat 300

gctgttgaaaa attattggaa cacticgattg C9gat.cgatt Ctc.gcatgala aacggtgaaa 360

aataaatcto aagaaatgag aaagaccalat gtgataagac ct cagc.ccca aaaattcaac 420

agaagttcat attact taag cagtaaagaa ccaattic tag accatatt ca atcagcagaa 480

gatttalagta cqccaccaca aacgt.cgt.cg tcaacaaaga atggaaatga ttggtgggag 540

accttgttag aaggcgagga tacttittgaa agagctgcat atcc.ca.gcat tagttagag 600

gaagaact ct tca caagttt ttggitttgat gatcgactgt cqc caagat c atgcgc.caat 660

titt.cctgaag gacaaagtag aagtgaattic ticcitttagca cqgacctittg gaat cattca 720

aaagaagaa 729

<210> SEQ ID NO 23 <211> LENGTH: 4088 <212> TYPE: DNA <213> ORGANISM: Malus sylvestris

<400> SEQUENCE: 23

talagagatgg agggatataa caaaacctg agtgtgagaa alaggtgcct g g actic gaga.g 60

gaagacaatc ttct caggca gtgcgttgag att catggag agggaaagtg galaccaagtt 120

t catacaaag caggtatata tdttaatgtg tatatttaac tdtgaaagat ggatatgtgt 180

attattittaa agcattt cac tag tattt cattctaagacic titttgttaaa tagtttcaag 240

tittcaagttt tacttittatt aatgttt tag alacatgttaa tatgtctaac gigt catactt 300

gct citcacct cact catcta ttgttgtttac atatatggct aaaatgacct atgcgtgttgt 360

gaggagggcc atgttgagag act tagtic cc ticatalaat at ttgttgttca cqtagaaaga 420

tgttatgtga atgtaaactt taattatgt atgcaggctt aaa.caggtgc aggaagagct 480

gCagacaaag atggittaaac tatctgaagc caaat at Caa gagaggaga C tittaaagagg 540

atgaagtaga t cittataatt agacitt caca ggcttittggg aaa.cagg tac taataaataa 600

gtgt catttt caatt catgt cqt cqtttitc attgtacgga aattggacct attaa.ca.gtg 660

agattataat catagacct c aaattactitt titccact citt ttaat attitt aatgtttitt c 720

aatgaagt at tagtggtgtg tagaatataa aaaaaaataa alaggtgttgt gtaagtaatt 780

tggagtgtgt gaatataatc titt Cttgtta atatatt cto got coccata ttitt cagtat 840

tittctaatac titcctaattt atatgtcatt ttatttitt catttaga catc aagcaaaaag 900

ttittcaattt togtag tattt tttittagatt tattaaaa.ca attattt coc aaattitttitt 960

tgtgggccaa tigcctacca cat cattgtt tatggagaac ttaaaggct a gagtacgaag 1020

tatgattitta gagtaatcgt atttaggaat aaagtgagag gagaaaaact aaggg tagca 1080

acttgcaaat tt catgatac ttgagtatag taaagtgagg attact citta ttttittagct 1140

atagt ctago atgagaatct aaact acaaa at cattagag agggcaa.gcg ttataaacat 1200

t cattittaaa ttttittaata ttataatatt ctaccttaag giggcagagtt gtttitttggit 1260

taagcaaaac aaaaaatcat t cqtatcaaa tdtggitat cattgaaaatca aactt cagac 1320

tittagt citta atttittaaaa tdaagacaaa tat cqagtgc taatagdaat aaaccaaaaa 1380

ttittaggaac tdtttggitat cittatttgaa atttitttatc atttct caaa acatttittta 1440

aaaacatttic titgaaaacaa ttittctittaa gactcaaaaa cittgatgggt atgcaaatta 1500


<212> TYPE: DNA <213> ORGANISM: Prunus persica

<400> SEQUENCE: 36

atggagggat atalacttggg ttgagaaaa ggagcttgga ctagagagga agatgat Ctt 60

ttgaggcagt gcattgagaa t catggagaa ggaaagtggc accaagttcc taacaaagca 120

gggttgaaca ggtgcaggaia gagctgtaga Ctaaggtgga tigaactattt gaa.gc.calaat 180

atcaagagag gagagtttgc agaggatgaa gtagatctaa t cattaggct t cacaagctt 240

ttaggaaa.ca ggtggit catt gattgctgga aggct tccag gaaggacagc gaatgatgtg 300

aaaaattatt ggalacactic actg.cggacg gattct cqcc taaaaaggit gaaagataaa 360

c cccaagaaa caataaagac catcgtaata agacct caac ccc.gaagctt catcaagagt 420

tcaaattgtt tdagcagtaa agaac caatt ttggat cata ttcaaacagt cdagaattitt 480

agtacgc.cgt cacaaacatc accatcaiaca aagaatggala atgattggtg ggaalacctitt 540

ttagatgacg aggatgttitt taaagagct acatgctatg gtctaggatt agaggaagaa 600

gagttcacaa gtttittgggit tatgatatg C cacaatcga aaaga cagtg taccalatgtt 660

t cagaagaag gaCtaggtag aggtgatttic ticttittagcg tdgacct ttg gaatcattca 720

aaagaagaa 729

<210> SEQ ID NO 37 <211> LENGTH: 1920 <212> TYPE: DNA <213> ORGANISM: Prunus persica

<400> SEQUENCE: 37

talagagatgg agggatataa Cttgggtgtg agaaaaggag Cttggactag agaggaagat 60

gatcttittga ggcagtgcat tagaatcat ggaga aggaa agtggcacca agttcctaac 120

aaag caggta ttaatgtaaa tataacticag agagatatat gatatagatg gttaataaat 180

agctagagct taattaatag goagtgaagc cittaaattag tdatgtttac aaggc cittaa 240

acttct coac ttattgccaa ttggttgctt ttaattttgt ctittcaacgg c taggcc caa 300

tagt cactag togcticc cata t tact catta tittatgttgt ccacgtgctt agctcaaatg 360

at at cactic tiggcc.gaggggtgctaaga gactitat coa acatctggala tttitttgcat 420

tatgcatgga tigcagggttgaac aggtgca ggaagagctg. tag actalagg tdgatgaact 480

atttgaagcc aaatat Caag agaggaga.gt ttgcagagga tigaagtagat Ctaat catta 540

ggct tcacaa gottt tagga aac aggtacc aataaatgtc. tctitt cotta t coca catgg 600

ttctitt catc acataccatt caaaaaaacc taaaatccac aattgc.cgac atgcatc.ccg 660

tgttgtttitc tat attatat cittct cittgt titat citcagt acgcatgcac aaccacaaaa 720

alagcactaga agggc.catga agc catgcat gatgt at Ctt agt ct ctgttgaatcgtaaaa 780

catagtgatc atatatgtta caa.catctac aaaaagcttt titat cacaac tag.ccct gag 840

gttittgcgaa attatcgt.ct titcgt.ctitta atgtttittta tdtgat atta atggit cotta 900

aggittatcat coacaaatca aaatggtctic tat cqtcagt titc.cgittaaa ttittct atta 960

aaatgatgat gtggtatata tatgggggaa cacat citaat aatatagtgc cacgtagctt 1020

taataaaaga tittaaatc.cc aaccoggttc tittgcttgcc tdaccat cac cct caaatct 1080

gatgagagag agtgagtgag tigagggagg taggtgcCac gatgggg.cga Caggtttgaa 1140

ggaagagggc gagagagttt ttittcaaaaa aaagaaaatt taagttittaa tottatctitt 1200

tittaaatata ttaatactitt atttaaaatc acgtggctat at attgttgg atgtgtggct 1260

aaatgtatta actttctcat gctatgtgtg gaaggtggtc attgattgct caaagacttc 1800
caggaaggac to gaatgat gtgaaaaatt actggaacac ccgattgcgg atggattatt 1860
ccctgaaaaa gatgaaagac aaatcccaag aaacaataaa gaccatcata ataaggccac 1920
aaccaaggag cttcaccaaa agttcaaatt gtttgagttt taaagaacca attttggacc 1980
atactcaact agaagagaat tttagtacgc catcacaaac atcaacatca acaaggattg 2040
gaagtgattg gtgggagacc tttttagatg acaaggatgc tactgaaaga gctacaggtt 2100
ctggtcttgg gttagatgaa gaattgctcg caagtttttg ggttgatgat gatatgccac 2160
aatcgacaag aacgtgcatc aatttttctg aagaaggact aagtagaggt gatttctctt 2220
ttagcgtgga cctttggaat cattcaaaag aagaatagct ag 2262

<210> SEQ ID NO 44
<211> LENGTH: 744
<212> TYPE: DNA
<213> ORGANISM: Mespilus germanica

<4 OOs, SEQUENCE: 44 atggagggat atalacgittaa Cttgagtgtg agaaaaggtg Cctggacticg agaggaagac 6 O aatc.ttctica ggcagtgcat tgagatt cat ggagagggaa agtggaacca agttt catac 12 O aaag caggct taalacaggtg Caggaagagc tgcagactaa gatggittaala Ctacctgaag 18O cCaagtatica agagagggga ctittaaagag gatgaagtag atc.ttataat tag actt cac 24 O aagcttittag gaalacaggtg gtcattgatt gct caaagac titcCaggaag gactg.cgaat 3OO gatgtgaaaa attactggaa caccc.gattg cggatggatt att coctdaa aaagatgaaa 360 gacaaatc.cc aagaaacaat aaagaccatc ataataaggc cacaaccaag gagct tcacc aaaagttcaa attgtttgag ttittaaagaa c caattittgg accatactica act agaagag aattittagta cqc cat caca aaCat Caa.ca t caacaagga ttggaagttga ttggtgggag 54 O acct ttittag atgacaagga tgctactgaa agagctacag gttctggtct tottagat gaagaattgc ticgcaagttt ttgggttgat gatgatatgc Cacaatcgac aagaacgtgc 660 atcaatttitt ctdaagaagg actalagtaga ggtgaattat CCtttagcac ggacctittgg 72 O aatcattcaa agaagaatag Ctag 744

<210> SEQ ID NO 45
<211> LENGTH: 2287
<212> TYPE: DNA
<213> ORGANISM: Mespilus germanica

<4 OOs, SEQUENCE: 45 talagagatgg agggatataa cgittaacttg agtgtgagaa alaggtgcct g g actic gaga.g 6 O gaagacaatc ttct caggca gtgcattgag att catggag agggaaagtg galaccaagtt 12 O t catacaaag caggtatata tgttaatgat gtgtatattt aactgttgaaa gatgtatatg 18O tg tatt attt taaagcattt cactag tatt t cattctaag accttttgtt aattagttt c 24 O aagttgaatgttt tacttitt attagtgttt tagaacatgt taatgtgtct aacggccata 3OO cctgcc ct ca cct cactaat Ctatggtgtt tacatatatg gctaaaatga CCtttgcgtg 360 tgtgagcagg gcc atgttga gagacittagt c ctitt acaag tatgtgttgt toacgtagaa agatgttata tdaatataaa ctittgaatta tgtatgcagg Cttaaac agg to aggalaga gctgcagact aagatggitta aactacctga agccaagtat Caagagaggg gactittaaag 54 O aggatgaagt agatcttata attagactitc acaagcttitt aggaaac agg taccalatata

US 7,973,216 B2 113 114 - Continued aaagaaccaa ttittggacca tattoaaa.ca gtc.gaga att ttagtacgc.c gtcacaatca t cac catcaa caaagaacgg aaatgattgg tggga aacct ttittagatga cgaggatgtt 54 O tittgaaaaag ctacatgcta tggtctagog ttagaggaag aagagttcac aagtttittgg gttgatgata tdccacaatc gaaaagacag tgtaccalatg ttacagaaga aggactaggit 660 acaggtgatt tot cittittaa cgtggacctt tggaat catt caaaagaaga atag 714.

<210> SEQ ID NO 47
<211> LENGTH: 1916
<212> TYPE: DNA
<213> ORGANISM: Prunus domestica
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (849)..(849)
<223> OTHER INFORMATION: n is a
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (857)..(857)
<223> OTHER INFORMATION: n is a
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (867)..(867)
<223> OTHER INFORMATION: n is a
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (1151)..(1151)
<223> OTHER INFORMATION: n is a
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (1160)..(1160)
<223> OTHER INFORMATION: n is a

<4 OOs, SEQUENCE: 47 talagagatgg agggatataa cgtgggtgttg agaaaaggag Cttggactag agaggaagat 6 O gatcttittga ggcagtgcat tgagaat cat ggaga aggaa agtggcacca agttcctaac 12 O aaag caggta ttaatgtaaa tataacticag agagatatat gatatataga tggittaataa 18O aaagctagag cittaattaat aggcagtgaa gcc ttaaatt agtgatgttt acaaggc citt 24 O aaactt. Ct. CC act tatto CC aattggttgc ttttgattitt atgttt cact ggctaggccC 3OO aatagt cact agtgct coca tatt acticac tatt tatgtt gtctatgtgt ttagotcaaa 360 tgat attgct Cttgagc.cga ggggtgctaa gag actitatic cca catcggg aattittittgc attatgcatg gatgcagggit tgaac aggtg Caggaagagc tgtag actaa ggtggttgaa ctatttgaag ccaaatatica agagaggaga gtttgcagag gatgaagtag atctaataat 54 O taggctt cac aagcttittag gaaac aggta c caataaacg totottt cost tat CC catat ggct ctitt catca catacca ttaaaaaaaa. aaaaaactaa aat CCaCaat cgc.cgacatg 660 tatic ctdtgt tdttitt citat attatatott ctgttgttta t ct cagtacg catgcacaac 72 O Cacaaaaagc act agaaggg c catgitaagc atgcacgatg tat cittagt c tctgtgaatc gtaaaatata gtgat catat atgttagaac atttacagaa grwittittatc acaaatagt c 84 O ctgaggittna taaaatnatc gtc.ttting to tittaatatat at attattitt atttitttittg 9 OO tgatactaat gigt ctittaag titt at cotto acacatcaaa. atggit ct cog tcqtcaatitt 96.O cc.gtcaaatt ttctgttaaa atgctgatgt ggcatatatg tagggccaca Catctaataa. tatagtgcca catagottta ataaaagatt taaatcc.cga cctggttctt tgcttgcctg 108 O accatcatcc ticaaatctga tgttgaga.gag agagaga gag agaga.gaga.g agaga.gagaa 114 O agagagaggg intggagggan ggtggggtgc Cacggtgggg cgacaggttt galaggalaga.g 12 OO gacgagagag agtttitttitt aaaaaaaaaa. tittaagttitt aat Cttatct tttittaaata 126 O US 7,973,216 B2 115 116 - Continued tattalacatt ttatttaaaa. ccacgtggct at at attatt ggatttgttgg cccacatata 32O tgctatat catctttittaac agaaaatttg acggaagttg acggCagggit ttgatttatt ttcttcttitt tttgcgtctg tat Cataa.ca caaatgtatt gatttattitt citcatgctat 44 O

Ctgtcgaagg tdt cattga ttgctggaag gct tccaggg aggacagcga atgatgtgaa SOO aaattattgg aac act coac tgcgaaaggt gaaagataaa c cc caagaaa caataaagac 560 catcgtaata agacct caac cc.cgaagctt Cat Caagagt tcaaattgtt tdagcagtaa agaaccaatt ttggaccata ttcaaacagt cgagaattitt agtacgc.cgt cacaatcatc accatcaa.ca aagaacggaa atgattggtg ggaaacctitt ttagatgacg aggatgttitt 74 O tgaaaaagct acatgctato gtctagogtt agaggaagaa gagttcacaa gtttittgggit tgatgatatgccacaatcga aaaga cagtg taccalatgtt acagaagaag gaCtagg tac 86 O aggtgatttic ticttittaacg tggacctttg gaat cattca aaagaagaat agctag 916

<210> SEQ ID NO 48
<211> LENGTH: 38
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 48
aaaagttgca gacttagatg gttgaattat ttgaagcc 38

<210> SEQ ID NO 49
<211> LENGTH: 27
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 49
gagaatcgat ccgcaatcga gtgttcc 27

<210> SEQ ID NO 50
<211> LENGTH: 31
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 50
accacctgtt tcccaaaagc ctgtgaagtc t 31

<210> SEQ ID NO 51
<211> LENGTH: 33
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 51
cacaagctag atggtaccac agaagtgaga atc 33

<210> SEQ ID NO 52
<211> LENGTH: 22
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 52
taagagatgg agggatataa cg 22

<210> SEQ ID NO 53
<211> LENGTH: 26
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 53
ctagctattc ttcttttgaa tgattc 26

<210> SEQ ID NO 54
<211> LENGTH: 24
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 54
gatcgattct cgcatgaaaa cggt 24

<210> SEQ ID NO 55
<211> LENGTH: 23
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 55
gacgacgttt gtggtggcgt act 23

<210> SEQ ID NO 56
<211> LENGTH: 23
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 56
tgcctggact cgagaggaag aca 23

<210> SEQ ID NO 57
<211> LENGTH: 23
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 57
cctgtttccc aaaagcctgt gaa 23

<210> SEQ ID NO 58
<211> LENGTH: 22
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 58
cttataatta gacttcacag gc 22

<210> SEQ ID NO 59
<211> LENGTH: 20
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 59
caccgttttc atgcgagaat 20

<210> SEQ ID NO 60
<211> LENGTH: 37
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 60
gcagataaga gatggaggga tataacgaaa acctgag 37

<210> SEQ ID NO 61
<211> LENGTH: 35
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 61
tacacaagct agatggtacc acagaagtga gaatc 35

<210> SEQ ID NO 62
<211> LENGTH: 25
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 62
gactttatgg aagatgaagt agatc 25

<210> SEQ ID NO 63
<211> LENGTH: 25
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 63
aagcgatagt atattattga tgaac 25

<210> SEQ ID NO 64
<211> LENGTH: 20
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 64
cttgggtgtg agaaaaggag 20

<210> SEQ ID NO 65
<211> LENGTH: 20
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 65
cacgctaaaa gagaaatcac 20

<210> SEQ ID NO 66
<211> LENGTH: 20
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 66
gcttgtgaag cctaattatt 20

<210> SEQ ID NO 67
<211> LENGTH: 20
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 67
gaaagataaa ccccaagaaa 20

<210> SEQ ID NO 68
<211> LENGTH: 20
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 68
tttgaactct tgatgaagct 20

<210> SEQ ID NO 69
<211> LENGTH: 22
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 69
ctgcgaattt gtattgtatg tc 22

<210> SEQ ID NO 70
<211> LENGTH: 20
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 70
ttcccaccaa tcatttccat 20

<210> SEQ ID NO 71
<211> LENGTH: 22
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 71
aagagaggag agttctgcaga gig 22

<210> SEQ ID NO 72
<211> LENGTH: 22
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 72
tagttcttca catcattggc ag 22

<210> SEQ ID NO 73
<211> LENGTH: 25
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 73
aatatgcacc aggaagttctt aaaga 25

<210> SEQ ID NO 74
<211> LENGTH: 25
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 74
aaatctgctt aattttcatg gaggg 25

<210> SEQ ID NO 75
<211> LENGTH: 28
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 75
tcagagagag agagatgggt ggtattcc 28

<210> SEQ ID NO 76
<211> LENGTH: 26
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 76
cttcctcttg ttcaaagctc cctctc 26

<210> SEQ ID NO 77
<211> LENGTH: 25
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 77
agaactattg gaattgtcac ttgag 25

<210> SEQ ID NO 78
<211> LENGTH: 25
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 78
agaataaaat cactttcata accac 25

<210> SEQ ID NO 79
<211> LENGTH: 30
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (21)..(21)
<223> OTHER INFORMATION: n is a, c, g, or t

<400> SEQUENCE: 79
agacttccrg gaagracwgc naatgmtgtg 30

<210> SEQ ID NO 80
<211> LENGTH: 24
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (22)..(22)
<223> OTHER INFORMATION: n is a, c, g, or t

<400> SEQUENCE: 80
ccartaattt ttcacakcat tngc 24
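SEQ ID NOs 79 and 80 are degenerate primers: besides a, c, g and t they contain IUPAC ambiguity codes (r, w, m, k and n), so each entry stands for a pool of related oligonucleotides rather than a single sequence. The sketch below is illustrative only and is not part of the listing. It assumes the standard IUPAC code table, and the `SEQ_79` literal is a transcription of SEQ ID NO 79 from this listing, so it should be treated as an assumption if the official listing differs. The helper names `degeneracy` and `expand` are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes, lowercase as in the listing.
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "s": "cg", "w": "at",
    "k": "gt", "m": "ac",
    "b": "cgt", "d": "agt", "h": "act", "v": "acg",
    "n": "acgt",
}


def degeneracy(primer):
    """Return how many unambiguous sequences a degenerate primer stands for."""
    total = 1
    for base in primer.lower():
        total *= len(IUPAC[base])
    return total


def expand(primer):
    """List every unambiguous sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.lower()]
    return ["".join(combo) for combo in product(*choices)]


# Assumption: SEQ ID NO 79 as transcribed from this listing (30 nt, n at position 21).
SEQ_79 = "agacttccrggaagracwgcnaatgmtgtg"

if __name__ == "__main__":
    print(len(SEQ_79), degeneracy(SEQ_79))
```

Calling `expand` on a long, highly degenerate primer grows multiplicatively, so `degeneracy` is the cheaper check when only the pool size is of interest.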

<210> SEQ ID NO 81
<211> LENGTH: 25
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 81
ggagacaact ggagaaggac tigaa 25

<210> SEQ ID NO 82
<211> LENGTH: 23
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 82
cgacattgat actggtgtct tca 23

<210> SEQ ID NO 83
<211> LENGTH: 20
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 83
gggataacct cgcggccaaa 20

<210> SEQ ID NO 84
<211> LENGTH: 22
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 84
gcatccatgc cggaagctac aa 22

<210> SEQ ID NO 85
<211> LENGTH: 23
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 85
tggaagcttg taggactgg gat 23

<210> SEQ ID NO 86
<211> LENGTH: 23
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 86
ctcctccgat ggcaaatcaa aga 23

<210> SEQ ID NO 87
<211> LENGTH: 21
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 87
gatagggttt gagttcaagt a 21

<210> SEQ ID NO 88
<211> LENGTH: 24
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 88
tctcctcagc agcctcagtt ttct 24

<210> SEQ ID NO 89
<211> LENGTH: 21
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 89
cCaagtgaag C9ggttgttgc t 21

<210> SEQ ID NO 90
<211> LENGTH: 23
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 90
caaagcaggc ggacaggagt agc 23

<210> SEQ ID NO 91
<211> LENGTH: 22
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 91
ccaccgccct tccaaacact ct 22

<210> SEQ ID NO 92
<211> LENGTH: 23
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 92
cacccttatg ttacgcggca tgt 23

<210> SEQ ID NO 93
<211> LENGTH: 23
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 93
tgcctggact cgagaggaag aca 23

<210> SEQ ID NO 94
<211> LENGTH: 23
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 94
cctgtttccc aaaagcctgt gaa 23

<210> SEQ ID NO 95
<211> LENGTH: 23
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 95
atgtttttgc gacggagaga gca 23

<210> SEQ ID NO 96
<211> LENGTH: 28
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 96
taggcgagtg aacaccatac attaaagg 28

<210> SEQ ID NO 97
<211> LENGTH: 22
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 97
agggttccag aagaccacgc ct 22

<210> SEQ ID NO 98
<211> LENGTH: 22
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 98
ttggatgtgg agtgct ciga ga 22

<210> SEQ ID NO 99
<211> LENGTH: 25
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 99
tgaccgaatg agcaaggaaa ttact 25

<210> SEQ ID NO 100
<211> LENGTH: 24
<212> TYPE: DNA
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct useful as a primer

<400> SEQUENCE: 100
tactcagctt tggcaatcca catc 24

<210> SEQ ID NO 101
<211> LENGTH: 249
<212> TYPE: PRT
<213> ORGANISM: Artificial
<220> FEATURE:
<223> OTHER INFORMATION: Synthetic construct
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (6)..(6)
<223> OTHER INFORMATION: W, L OR E
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (7)..(7)
<223> OTHER INFORMATION: N OR G
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (8)..(8)
<223> OTHER INFORMATION: L OR W
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (9)..(9)
<223> OTHER INFORMATION: R OR S
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (10)..(10)
<223> OTHER INFORMATION: W OR K
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (11)..(11)
<223> OTHER INFORMATION: G, R, K OR M
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (12)..(12)
<223> OTHER INFORMATION: R OR IS ABSENT
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (13)..(13)
<223> OTHER INFORMATION: K OR IS ABSENT
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (14)..(14)
<223> OTHER INFORMATION: G OR IS ABSENT
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (19)..(19)
<223> OTHER INFORMATION: E, K OR O
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE
<222> LOCATION: (22)..(22)
<223> OTHER INFORMATION: D OR N
<220> FEATURE:
<221> NAME/KEY: MISC_FEATURE


O CATION: ... (96) OT HER IN ON: P OR O FEATURE: NA E FEATURE A. ... (102) E ON: D OR A E A.T E FEATURE A. ... (107) E ON: W OR C T E FEATURE A ... (108) E ON: N OR Y T E FEATURE A . . (112) E ON : R OR G T E FEATURE A ... (113) E ON: I, T, M OR K T E FEATURE A ... (114) E ON: D OR N T E FEATURE A ... (115) E ON: S OR Y T E FEATURE A ... (116) E ON: R C OR S T E FEATURE A ... (117) E ON : M OR L. T E FEATURE A ON ... (118) E ON: K OR IS ABSENT T E FEATURE A ... (119) E ON: K OR T T E FEATURE A ... (120) E ON: W M L OR S T E FEATURE A . . (121) E ON: K OR O T E FEATURE A . . (122) E ON: D OR N T E FEATURE A. ... (124) E ON: S OR P T E FEATURE A. ... (127) E ON : T OR M T E FEATURE A. ... (128) E ON: I OR. R. T E FEATURE A. ... (129) E ON: K OR E E A.T E FEATURE CA. I ON ... (131) ER IN ON: I OR N FEATURE: NA E/KEY: FEATURE US 7,973,216 B2 137 138 - Continued

O CATION: . . . 3 2) OT HER IN ON: W OR FEATURE: NA E FEA U RE A. E ON: OR E A.T E FEA RE A. E on: OR IS ABSENT T E FEA RE A. E on: OR T E FEA A. E on: OR. R. T E FEA A. E ON: OR IN T E FEA A. E ON: K OR T E FEA A. E ON: OR T E FEA A. E on: T E FEA A. E ON: T E FEA A. ON E on: T E FEA A. E on: T E FEA A. E on: T E FEA A. E on: or T T E FEA A. E on: or W T E FEA A. E ON: is Absent T E FEA A. E ON: T E FEA A. E ON: OR S T E FEA A. E ON: IS ABSENT E A.T E FEA CA. I ON ER IN ON: IS ABSENT FEATURE: NA E/KEY: FEA

US 7,973,216 B2 147 148 - Continued caagttt cat acaaag cagg tatatatgtt aatgtgtata tittaactgttgaaagatggat 18O atgtgt atta ttittaaag.ca ttt cactagt attt cattct aag acct titt gttaaatagt 24 O ttcaagtttic aagttt tact titt attaatgttittagaaca tottaatatgtctaacggit c 3OO at acttgctic ticacct cact catctattgt gtttacatat atggctaaaa tdaccitatgc 360 gtgttgtgagg agggc.catgt tagagacitt agt cc ct cat aaatatttgt titt Cacgta 42O gaaagatgtt atgtgaatgt aaactittgaa ttatgtatgc aggcttaaac aggtgcagga 48O agagctgcag acaaagatgg ttaaactatc taagccaaa tat caagaga ggagactitta 54 O aagaggatga agtagatctt ataattagac titcac aggct tttgggaaac agg tactaat 6OO aaataagtgt cattttcaat t catgtcgt.c gttitt cattg tacggaaatt gggcc tatta 660 acagtgagat tataat cata gacct caaat tactttitt co act cittittaa tatttaatgt 72 O ttitt caatga agt attagtg gtgtgtagaa tataaaaaaa aataaaaggt gttgttgtaag 78O taatttggag tdtgtgaata taatc.tttct tdttaatata ttctggcticc ccatattitt c 84 O agtattitt ct aatact tcct aatttatatgtcattittatt titt catttag acatcaa.gca 9 OO aaaagttittcaattittgtag tatttitttitt agatttatta aaaacaatta titt cocaaat 96.O tttittttgttg ggccaatggc ctaccacatc attgtttaat ggagaactta aaggctagag O2O taacgaagta tattittaga gtaaatcgta attittaggaa taaaagtgag aggagaaaaa O8O ctaagggtag caacttgcaa actt catgat acttgagtat agtaaagtga gqattactict 14 O tatttitt tag ctatagt cta gcatgagaat ctaaactaca aaatcattag agagggcaa.g 2OO cgittataaac att catttta aattittittaa tattataata ttctacctta aggggcagag 26 O ttgttttittg gttaa.gcaaa acaaaaaatc attcgitatica aatgtggitat cattgaaaat 32O caaact tcag actittagt ct taatttittaa aatgaagaca aatat coagt gctaatagoa 38O ataaac caaa aattittagga actgtttggt at cittatttgaaattittitta t catttctica 44 O aaacatttitt taaaaacatt tottgaaaac aattittctitt aagacitcaaa aacttgatgg SOO gtatgcaaat taaaaattitc aaatc.ttacg actittaacta gacaaagaga tictaaacaga 560 gggggit cacg gtagagggag aaagaagaga gaggaggaaa aaaagtaaga gatgattgaa 62O tgaaagagaa agaagagaga tigatcgggat gaaatagaga ggaatatgag tagagaaag 68O aactgagagg ataaaaagag 
cagattggag aaaagacgala aaggtaataa aaaaaggaga 74 O gagaaagaag aaaagagaga gatagatttg gagagagaga gagaagaaga gaaaataaat 8OO aagagagttt aagtttaaaa actictaaaac toactttitta tdttittagat aatagacitat 86 O atttittgagt tagt cittgag ttcaatttitt aaaaatagtic ctaccaaaaa agtttittaag 92 O gcctaaaact tdaaaattgt ttittgagttt aaaaagttgg attcaaataa agtat caaac 98 O aggit cottag titttitt ctitg accqaaaaaa taaaaatctt tat coggaag gigcattagta 2O4. O aact caaaca acctitt ctitc gtaatgattt ttg tatgtaa agt catttitc atgcttittaa 21OO t ccctagt c actgagcaaa cct ttaagat ttgt attcg gccaacagag gcttggat.ca 216 O gaatagataa gaagttatac attcaaaatt to acaattaa tagaattaga agggagaact 222 O ttgg taataa atacacgcag taattittatt ttttgtatta aaactaatgt togggcaagg 228O atttggcctt ggcacagctic ccttggagtg ttggc acttg gcgttgatgt tdgttgttgg 234 O tcgagttctt gct acttggt gtgct acaag aagagtacaa agittagttitt gaaatgtgcc 24 OO tttgttggggg. Cttagatgta ggt ctitgagg ct cacaatca aaactaacaa aaagttaggc 246 O gtgccactgt tat ct caata taatagatgt tdaatatatt tdatctaagt cqattacttg 252O US 7,973,216 B2 149 150 - Continued cgc.cccalaga tgcttaaact tott attatc. aaaggattgt Cacaaatggg ttaagtctgc 2580 attaagttct tttitcttgcc ttgttgaccaa ggacttittitc ttatagttitt gtatgtttag 264 O atgaagggag to CatalaccC ttittaatcat tcqcttgatt acaatticgac gcttaaagaa 27 OO gtgaagittaa cittgcttagc caaatttgat Cataaatggg tcttagagcg agaaacaatt 276 O gtaagtgagt taaaaCatala cataatatgc atttalaacaa. Caaat Ctaca aactgtaagt 282O acaa acacala ggaagttggg caagattt ca tcttgtcggg Caaggttcaa acct cittgca 288O acCact totc. 
Cttgagtttg tagaagagtt gtggg tactg caaaatggaa atgtaagcag 294 O gaaatacaaa caaggttt co taaagggaat ctatt citacg aatctaggat alaggtag cag 3 OOO agctittgc.ca aaagatggct ttctg.cgtgg aatctatggc taalaaggtgc ataatctgga 3 O 6 O cgaagtgcag Ctttittggitt atttgcttgt Ctgagaggca agtttgt atg tottttgttt 312 O gattggttga citctgttgtc tottt ct cost tittatagacg atttggc.ccg 318O actgcttittg gctictatt ct tgtc.cgaaag Ctcttggagg gcaatgagtic at catct titt 324 O tacttgtagt gcc attagaa agtgttttitt ggctaat agt gagttgatct ctgtcacttg 33 OO t cact t cact cct acacatg tggccitat at tittaattgag gaggcaccala tttgttc.ca.g 3360 gcttgtcgac tgggCCtcgg gcaagttctitc acttaatttg a catc catgg gtc.ttggcta 342O totacaaaac cittatgttaa at attalactic aaacaac tag to CactC Cat ttaattictaa 3480 agaagaaaat cgitt tatgca atctotgttc cittitt tttitt citt tatt cat cittattitt to 354 O aggcaaatgt att Cat Catt titt tott cat gcatgitaatg tacttaggtg gtcattgatt 36OO gctagaagac titcCaggaag aac agcaaat gctgttgaaaa attattggaa cacticgattg 366 O cggat.cgatt citcgcatgaa aacggtgaaa aataa at Ctc. aagaaatgag agagaccaat 372 O gtgatalagac cticagoccca aaaattcaac agaagttcat attacttaag Cagtaaagaa 378 O c caattictag accatatt Ca atcagcagaa gatttaagta cgccaccaca aacgt.cgt.cg 384 O t caacaaaga atggaaatga ttggtgggag accttgttag aaggcgagga tacttittgaa 3900 agagctgcat atcc.ca.gcat tgagttagag gaagaactict t cacaagttt ttggtttgat 396 O gatcgactgt cgc.caagatc atgcgc.caat titt cotgaag gacaaagtag aagggaattic t cctittagca cggacctttg gaat cattca aaagaagaat agctagagaa aatgattic 4. Of 8

<210> SEQ ID NO 103
<211> LENGTH: 104
<212> TYPE: PRT
<213> ORGANISM: Malus domestica

<400> SEQUENCE: 103

Val Arg Lys Gly Ala Trp Thr Arg Glu Glu Asp Asn Leu Leu Arg Gln
1 5 10 15

Cys Val Glu Ile His Gly Glu Gly Lys Trp Asn Gln Val Ser
25 30

Ala Gly Leu Asn Arg Cys Arg Lys Ser Cys Arg Gln Arg Trp Leu Asn
35 40 45

Leu Pro Asn Ile Lys Arg Gly Asp Phe Lys Glu Asp Glu Val
50 55 60

Asp Leu Ile Ile Arg Leu His Arg Leu Leu Gly Asn Arg Trp Ser Leu
65 70 75 80

Ile Ala Arg Arg Leu Pro Gly Arg Thr Ala Asn Ala Val Lys Asn Tyr
85 90 95

Trp Asn Thr Arg Leu Arg Ile Asp

<210> SEQ ID NO 104
<211> LENGTH: 104
<212> TYPE: PRT
<213> ORGANISM: Arabidopsis thaliana

<400> SEQUENCE: 104

Leu Arg Lys Gly Ala Trp Thr Thr Glu Glu Asp Ser Leu Leu Arg Gln
1 5 10 15

Cys Ile Asn Lys Tyr Gly Glu Gly Lys Trp His Gln Val Pro Val Arg
25

Ala Gly Leu Asn Arg Arg Lys Ser Arg Leu Arg Trp Leu Asn
35 40 45

Leu Pro Ser Ile Lys Arg Gly Leu Ser Ser Asp Glu Val
50 55 60

Asp Leu Leu Leu Arg Leu His Arg Leu Leu Gly Asn Arg Trp Ser Leu
65 70

Ile Ala Gly Arg Leu Pro Gly Arg Thr Ala Asn Asp Val Asn Tyr
85 90 95

Trp Asn Thr His Leu Ser
100

<210> SEQ ID NO 105
<211> LENGTH: 104
<212> TYPE: PRT
<213> ORGANISM: Arabidopsis thaliana

<400> SEQUENCE: 105

Leu Arg Lys Gly Ala Trp Thr Ala Glu Glu Asp Ser Leu Leu Arg Leu
1 5 10 15

Cys Ile Asp Lys Tyr Gly Glu Gly Lys Trp His Gln Val Pro Leu Arg
25

Ala Gly Leu Asn Arg Arg Lys Ser Arg Leu Arg Trp Leu Asn
35 40 45

Leu Pro Ser Ile Lys Arg Gly Arg Leu Ser Asn Asp Glu Val
50 55 60

Asp Leu Leu Leu Arg Leu His Leu Leu Gly Asn Arg Trp Ser Leu
65 70

Ile Ala Gly Arg Leu Pro Gly Thr Ala Asn Asp Val Asn Tyr
85 90 95

Trp Asn Thr His Leu Ser

<210> SEQ ID NO 106
<211> LENGTH: 104
<212> TYPE: PRT
<213> ORGANISM: Vitis labrusca x Vitis vinifera

<400> SEQUENCE: 106

Val Arg Lys Gly Ala Trp Ile Gln Glu Glu Asp Val Leu Leu Arg
1 5 15

Cys Ile Glu Lys Tyr Gly Glu Gly Lys Trp His Leu Val Pro Leu Arg
25 30

Ala Gly Leu Asn Arg Cys Arg Lys Ser Cys Arg Leu Arg Trp Leu Asn
35 40 45

Tyr Leu Lys Pro Asp Ile Lys Arg Gly Glu Phe Ala Leu Asp Glu Val
50 55 60

Asp Leu Met Ile Arg Leu His Asn Leu Leu Gly Asn Arg Trp Ser Leu
65 70 75 80

Ile Ala Gly Arg Leu Pro Gly Arg Thr Ala Asn Asp Val Lys Asn Tyr
85 90 95

Trp His Gly His His Leu Lys Lys
100

<210> SEQ ID NO 107
<211> LENGTH: 104
<212> TYPE: PRT
<213> ORGANISM: Capsicum annuum

<400> SEQUENCE: 107

Val Arg Lys Gly Ala Trp Thr Glu Glu Glu Asp Phe Leu Leu Arg Lys
1 5 10 15

Cys Ile Gln Asn Tyr Gly Glu Gly Lys Trp His Leu Val Pro Ile Arg
20 25 30

Ala Gly Leu Asn Arg Cys Arg Lys Ser Cys Arg Leu Arg Trp Leu Asn
35 40 45

Tyr Leu Arg Pro His Ile Lys Arg Gly Asp Phe Gly Trp Asp Glu Ile
50 55 60

Asp Leu Ile Leu Arg Leu His Lys Leu Leu Gly Asn Arg Trp Ser Leu
65 70 75 80

Ile Ala Gly Arg Leu Pro Gly Arg Thr Ala Asn Asp Val Lys Asn Tyr
85 90 95

Trp Asn Ser His Leu Gln
100

<210> SEQ ID NO 108
<211> LENGTH: 104
<212> TYPE: PRT
<213> ORGANISM: Petunia hybrida

<400> SEQUENCE: 108

Val Arg Lys Gly Ala Trp Thr Glu Glu Glu Asp Leu Leu Leu Arg Glu
1 5 10 15

Cys Ile Asp Lys Tyr Gly Glu Gly Lys Trp His Leu Val Pro Val Arg
20 25 30

Ala Gly Leu Asn Arg Cys Arg Lys Ser Cys Arg Leu Arg Trp Leu Asn
35 40 45

Tyr Leu Arg Pro His Ile Lys Arg Gly Asp Phe Ser Leu Asp Glu Val
50 55 60

Asp Leu Ile Leu Arg Leu His Lys Leu Leu Gly Asn Arg Trp Ser Leu
65 70 75 80

Ile Ala Gly Arg Leu Pro Gly Arg Thr Ala Asn Asp Val Lys Asn Tyr
85 90 95

Trp Asn Thr His Leu Arg
100

<210> SEQ ID NO 109
<211> LENGTH: 104
<212> TYPE: PRT
<213> ORGANISM: Solanum lycopersicum

<400> SEQUENCE: 109

Val Arg Lys Gly Ser Trp Thr Asp Glu Glu Asp Phe Leu Leu Arg Lys
1 5 10 15

Cys Ile Asp Lys Tyr Gly Glu Gly Lys Trp His Leu Val Pro Ile Arg
20 25 30

Ala Gly Leu Asn Arg Lys Ser Arg Leu Arg Trp Leu Asn
35 40 45

Leu Arg Pro His Ile Lys Arg Gly Asp Phe Glu Gln Asp Glu Val
50 55 60

Asp Leu Ile Leu Arg Leu His Leu Leu Gly Asn Arg Trp Ser Leu
65 70

Ile Ala Gly Arg Leu Pro Gly Ala Asn Asp Val Lys Asn Tyr
85 90 95

Trp Asn Thr Asn Leu Leu Arg
100

<210> SEQ ID NO 110
<211> LENGTH: 104
<212> TYPE: PRT
<213> ORGANISM: Gerbera hybrid cv.

<400> SEQUENCE: 110

Leu Arg Lys Gly Ala Trp Thr Ala Glu Glu Asp Met Leu Leu Lys Asn
1 5 10 15

Cys Ile Glu Arg Tyr Gly Glu Gly Lys Trp His Leu Val Pro Leu
25

Ala Gly Leu Asn Arg Arg Lys Ser Arg Leu Arg Trp Leu Asn
35 40 45

Leu Arg Pro Asn Ile Lys Arg Gly Asp Phe Gly Glu Asp Glu Ile
50 55 60

Asp Leu Ile Ile Arg Leu His Leu Leu Gly Asn Arg Trp Ser Leu
65 70 80

Ile Ala Gly Arg Ile Pro Gly Arg Thr Ala Asn Asp Val Asn Trp
85 90 95

Trp Asn Thr His Leu Arg Ser Arg
100

<210> SEQ ID NO 111
<211> LENGTH: 104
<212> TYPE: PRT
<213> ORGANISM: Picea mariana

<400> SEQUENCE: 111

Leu Asn Lys Gly Ala Trp Ser Ala Glu Glu Asp Ser Leu Leu Gly
1 5 10 15

Tyr Ile Gln Thr His Gly Glu Gly Asn Trp Arg Ser Leu Pro
25

Ala Gly Leu Arg Arg Gly Lys Ser Arg Leu Arg Trp Leu Asn
35 40 45

Leu Arg Pro Ile Lys Arg Gly Asn Ile Thr Ala Asp Glu Glu
50 55 60

Glu Leu Ile Ile Arg Met His Ala Leu Leu Gly Asn Arg Trp Ser Ile
65 70 80

Ile Ala Gly Arg Val Pro Gly Arg Thr Asp Asn Glu Ile Asn Tyr
85 90 95

Trp Asn Thr Asn Leu Ser
100

<210> SEQ ID NO 112
<211> LENGTH:
<212> TYPE: PRT
<213> ORGANISM: Zea mays

<400> SEQUENCE: 112

Val Lys Arg Gly Ala Trp Thr Ser Lys Glu Asp Asp Ala Leu Ala Ala
1 5 10 15

Tyr Val Lys Ala His Gly Glu Gly Lys Trp Arg Glu Val Pro Gln
20 25 30

Ala Gly Leu Arg Arg Cys Gly Lys Ser Cys Arg Leu Arg Trp Leu Asn
35 40 45

Tyr Leu Arg Pro Asn Ile Arg Arg Gly Asn Ile Ser Tyr Asp Glu Glu
50 55 60

Asp Leu Ile Ile Arg Leu His Arg Leu Leu Gly Asn Arg Trp Ser Leu
65 70 75 80

Ile Ala Gly Arg Leu Pro Gly Arg Thr Asp Asn Glu Ile Lys Asn
85 90 95

Trp Asn Ser Thr Leu Val
100

<210> SEQ ID NO 113
<211> LENGTH: 104
<212> TYPE: PRT
<213> ORGANISM: Arabidopsis thaliana

<400> SEQUENCE: 113

Leu Asn Arg Gly Ala Trp Thr Asp His Glu Asp Lys Ile Leu Arg Asp
1 5 10 15

Tyr Ile Thr Thr His Gly Glu Gly Lys Trp Ser Thr Leu Pro Asn Gln
20 25 30

Ala Gly Leu Lys Arg Cys Gly Lys Ser Cys Arg Leu Arg Trp Lys Asn
35 40 45

Tyr Leu Arg Pro Gly Ile Lys Arg Gly Asn Ile Ser Ser Asp Glu Glu
50 55 60

Glu Leu Ile Ile Arg Leu His Asn Leu Leu Gly Asn Arg Trp Ser Leu
65 70 75

Ile Ala Gly Arg Leu Pro Gly Arg Thr Asp Asn Glu Ile Lys Asn His
85 90 95

Trp Asn Ser Asn Leu Arg Lys Arg
100

<210> SEQ ID NO 114
<211> LENGTH: 104
<212> TYPE: PRT
<213> ORGANISM: Malus domestica

<400> SEQUENCE: 114

Leu Asn Arg Gly Ala Trp Thr Ala Leu Glu Asp Lys Ile Leu Ser Ser
1 5 10 15

Tyr Ile Lys Ala His Gly Glu Gly Lys Trp Arg Ser Leu Pro Lys Arg
20 25 30

Ala Gly Leu Lys Arg Cys Gly Lys Ser Cys Arg Leu Arg Trp Leu Asn
35 40 45

Tyr Leu Arg Pro Asp Ile Lys Arg Gly Asn Ile Ser Gly Asp Glu Glu
50 55 60

Glu Leu Ile Val Arg Leu His Asn Leu Leu Gly Asn Arg Trp Ser Leu
65 70 75 80

Ile Ala Gly Arg Leu Pro Gly Arg Thr Asp Asn Glu Ile Lys Asn
85 90 95

Trp Asn Thr Thr Leu Gly Lys Lys
100

<210> SEQ ID NO 115 <211> LENGTH: 104 <212> TYPE: PRT <213> ORGANISM: Malus domestica

<400> SEQUENCE: 115

Leu Asn Arg Gly Ala Trp Thr Ala Met Glu Asp Val Leu Thr Glu 1 5 10 15

Tyr Ile Gly Asn Gly Glu Gly Lys Trp Arg Asn Leu Pro 20 25

Ala Gly Leu Ser Arg Leu Arg Trp Leu Asn 35 40 45

Leu Arg Pro Asp Ile Lys Arg Gly Asn Ile Thr Arg Asp Glu Glu 50 55 60

Glu Leu Ile Ile Arg Leu His Lys Leu Leu Gly Asn Arg Trp Ser Leu 65 70

Ile Ala Gly Arg Leu Pro Gly Arg Thr Asp Asn Glu Ile Asn Tyr 85 90 95

Trp Asn Thr Thr Ile Gly Lys Arg

<210> SEQ ID NO 116 <211> LENGTH: 104 <212> TYPE: PRT <213> ORGANISM: Zea mays

<400> SEQUENCE: 116

Leu Lys Arg Gly Arg Trp Thr Ala Glu Glu Asp Gln Leu Leu Ala Asn 1 5 10 15

Tyr Ile Ala Glu His Gly Glu Gly Ser Trp Arg Ser Leu Pro Asn 25

Ala Gly Leu Leu Arg Cys Gly Lys Ser Arg Leu Arg Trp Ile Asn 35 40 45

Leu Arg Ala Asp Val Lys Arg Gly Asn Ile Ser Glu Glu Glu 50 55 60

Asp Ile Ile Ile Leu His Ala Thr Leu Gly Asn Arg Trp Ser Leu 65 70

Ile Ala Ser His Leu Pro Gly Arg Thr Asp Asn Glu Ile Asn Tyr 85 90 95

Trp Asn Ser His Leu Ser Arg Gln 100

<210> SEQ ID NO 117 <211> LENGTH: 104 <212> TYPE: PRT <213> ORGANISM: Malus domestica

<400> SEQUENCE: 117

Arg Arg Gly Gln Trp Ile Leu Glu Glu Asp Ser Leu Leu Ile Gln 5 10 15

Ile Glu Arg Gly Glu Gly Gln Trp Asn Leu Leu Ala 20 25 30

Ser Gly Leu Arg Thr Gly Lys Ser Arg Leu Arg Trp Leu Asn 35 40 45

Leu Lys Pro Asp Val Lys Arg Gly Asn Leu Ser Pro Glu Glu Gln 50 55 60

Leu Leu Ile Leu Asp Leu His Ser Met Gly Asn Arg Trp Ser

Ile Ala Arg Tyr Leu Pro Gly Arg Thr Asp Asn Glu Ile Lys Asn Tyr 85 90 95

Trp Arg Thr Arg Val His Lys Gln 100

<210> SEQ ID NO 118 <211> LENGTH: 103 <212> TYPE: PRT <213> ORGANISM: Arabidopsis thaliana

<400> SEQUENCE: 118

Lys Lys Gly Leu Trp Thr Val Glu Glu Asp Asn Ile Leu Met Asp Tyr 1 5 10 15

Val Leu Asn His Gly Thr Gly Gln Trp Asn Arg Ile Val Arg Lys Thr 20 25 30

Gly Leu Lys Arg Cys Gly Lys Ser Cys Arg Leu Arg Trp Met Asn Tyr 35 40 45

Leu Ser Pro Asn Val Asn Lys Gly Asn Phe Thr Glu Gln Glu Glu Asp 50 55 60

Leu Ile Ile Arg Leu His Lys Leu Leu Gly Asn Arg Trp Ser Leu Ile 65 70 75 80

Ala Lys Arg Val Pro Gly Arg Thr Asp Asn Gln Val Lys Asn Tyr Trp 85 90 95

Asn Thr His Leu Ser Lys Lys 100

<210> SEQ ID NO 119 <211> LENGTH: 248 <212> TYPE: PRT <213> ORGANISM: Arabidopsis thaliana

<400> SEQUENCE: 119

Met Glu Gly Ser Ser Lys Gly Leu Arg Lys Gly Ala Trp Thr Thr Glu 1 5 10 15

Glu Asp Ser Leu Leu Arg Gln Cys Ile Asn Lys Tyr Gly Glu Gly Lys 20 25 30

Trp His Gln Val Pro Val Arg Ala Gly Leu Asn Arg Cys Arg Lys Ser 35 40 45

Cys Arg Leu Arg Trp Leu Asn Tyr Leu Lys Pro Ser Ile Lys Arg Gly 50 55 60

Lys Leu Ser Ser Asp Glu Val Asp Leu Leu Leu Arg Leu His Arg Leu 65 70 75 80

Leu Gly Asn Arg Trp Ser Leu Ile Ala Gly Arg Leu Pro Gly Arg Thr 85 90 95

Ala Asn Asp Val Lys Asn Tyr Trp Asn Thr His Leu Ser Lys Lys His 100 105 110

Glu Pro Cys Cys Lys Ile Lys Met Lys Lys Arg Asp Ile Thr Pro Ile 115 120 125

Pro Thr Thr Pro Ala Leu Lys Asn Asn Val Tyr Lys Pro Arg Pro Arg 130 135 140

Ser Phe Thr Val Asn Asn Asp Cys Asn His Leu Asn Ala Pro Pro Lys 145 150 155 160

Val Asp Val Asn Pro Pro Cys Leu Gly Leu Asn Ile Asn Asn Val Cys 165 170 175

Asp Asn Ser Ile Ile Tyr Asn Lys Asp Lys Lys Lys Asp Gln Leu Val 180 185 190

Asn Asn Leu Ile Asp Gly Asp Asn Met Trp Leu Glu Lys Phe Leu Glu 195 200 205

Glu Ser Gln Glu Val Asp Ile Leu Val Pro Glu Ala Thr Thr Thr Glu 210 215 220

Lys Gly Asp Thr Leu Ala Phe Asp Val Asp Gln Leu Trp Ser Leu Phe 225 230 235 240

Asp Gly Glu Thr Val Lys Phe Asp 245

The invention claimed is:

1. An isolated polynucleotide comprising a nucleotide sequence encoding a polypeptide having at least 90% sequence identity to the amino acid sequence of SEQ ID NO:1, wherein % identity is calculated over the whole length of the amino acid sequence, and wherein the polypeptide is a transcription factor that increases anthocyanin production upon expression in a plant.

2. The isolated polynucleotide of claim 1, wherein the polypeptide has the amino acid sequence of SEQ ID NO:1.

3. The isolated polynucleotide of claim 1, wherein the nucleotide sequence encoding the polypeptide has at least 90% sequence identity to the coding nucleotide sequence of SEQ ID NO:5 or SEQ ID NO:102.

4. The isolated polynucleotide of claim 1, wherein the nucleotide sequence encoding the polypeptide has at least 90% sequence identity to the coding nucleotide sequence of SEQ ID NO:5.

5. The isolated polynucleotide of claim 1, wherein the nucleotide sequence encoding the polypeptide has the coding nucleotide sequence of SEQ ID NO:5.

6. A genetic construct comprising the polynucleotide of claim 1.

7. A host cell transformed with the polynucleotide of claim 1 to express said polypeptide.

8. A plant cell or plant transformed with the polynucleotide of claim 1 to express said polypeptide, and wherein expression of said polypeptide increases anthocyanin production in said transformed plant cell or plant.

9. An isolated polynucleotide comprising a nucleotide sequence encoding a polypeptide having at least 90% sequence identity to the amino acid sequence of SEQ ID NO:1, wherein % identity is calculated over the whole length of the amino acid sequence, and wherein the polypeptide is a transcription factor that up-regulates promoter of a gene involved in the anthocyanin biosynthetic pathway.

10. The polynucleotide of claim 9 wherein the gene encodes dihydroflavonol 4-reductase (DFR).

11. The polynucleotide of claim 9 wherein the gene encodes chalcone synthase (CHS).

12. A genetic construct comprising the polynucleotide of claim 9.

13. A host cell transformed with the polynucleotide of claim 9 to express said polypeptide.

14. A plant cell or plant transformed with the polynucleotide of claim 9 to express said polypeptide, and wherein expression of said polypeptide increases anthocyanin production in said transformed plant cell or plant.

15. A method for producing a plant cell or plant with increased anthocyanin production, the method comprising the steps of transformation of a plant cell or plant with the polynucleotide of claim 1 and expression of the polypeptide encoded by said polynucleotide in the transformed plant cell or plant, and wherein expression of said polypeptide increases anthocyanin production in said transformed plant cell or plant.

16. A plant produced by the method of claim 15.

* * * * *
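By way of illustration only, the "% identity ... calculated over the whole length of the amino acid sequence" recited in claims 1 and 9 can be sketched as a simple count. The sketch below assumes the two sequences have already been aligned by some external tool and are supplied as equal-length one-letter strings with `-` marking gaps; the function name `percent_identity` and the toy input strings are hypothetical and are not taken from the specification, which may define the alignment procedure differently.

```python
def percent_identity(aligned_query: str, aligned_reference: str) -> float:
    """Percent identity over the whole length of the reference sequence.

    Assumption: both inputs are rows of one pairwise alignment, so they
    have equal length and use '-' for gap positions.
    """
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")

    # Whole length of the reference = its residues, ignoring gap columns.
    reference_length = sum(1 for r in aligned_reference if r != "-")
    if reference_length == 0:
        raise ValueError("reference sequence is empty")

    # A column counts as identical only if both rows carry the same residue.
    matches = sum(
        1
        for q, r in zip(aligned_query, aligned_reference)
        if q == r and r != "-"
    )
    return 100.0 * matches / reference_length


# Toy example: one mismatch in ten reference residues.
print(percent_identity("LNRGAWTALE", "LNRGAWTAME"))
```

Dividing by the full reference length, rather than by the length of the aligned region, means that unaligned or gapped reference positions lower the score, which is one plausible reading of "over the whole length".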